Compare commits

...

17 Commits

Author SHA1 Message Date
Eric Eastwood
48aeb269fd Remove stray newline 2025-09-22 14:55:52 -05:00
Eric Eastwood
c6070388d4 Merge branch 'develop' into madlittlemods/clarify-log-context-interaction-with-deferred
Conflicts:
	tests/util/test_logcontext.py
2025-09-22 14:55:25 -05:00
Eric Eastwood
777979ef13 Merge branch 'develop' into madlittlemods/clarify-log-context-interaction-with-deferred 2025-09-17 09:58:51 -05:00
Eric Eastwood
74f4140f8e Better flow 2025-09-13 11:18:02 -05:00
Eric Eastwood
7fd588d7a9 Add note about other deferreds that you may have stored 2025-09-12 20:17:53 -05:00
Eric Eastwood
c21697c194 Fix lints 2025-09-12 20:17:28 -05:00
Eric Eastwood
d30d92a0c7 Describe what context explicitly 2025-09-12 18:12:25 -05:00
Eric Eastwood
ec113e091c Add @logcontext_clean 2025-09-12 18:12:12 -05:00
Eric Eastwood
98952e1549 Fix lints 2025-09-12 18:02:35 -05:00
Eric Eastwood
bd0516b71f Add tests 2025-09-12 17:48:31 -05:00
Eric Eastwood
56c723f597 Add changelog 2025-09-12 16:49:47 -05:00
Eric Eastwood
271fe4ee17 Clarify why we have an empty pass block in the context manager 2025-09-12 16:47:13 -05:00
Eric Eastwood
d39cc96681 Clarify "exit" 2025-09-12 16:37:34 -05:00
Eric Eastwood
8b50ed3821 Solve the deferred callback in current context problem 2025-09-12 16:34:37 -05:00
Eric Eastwood
9ea87da317 Structure docs 2025-09-12 15:15:55 -05:00
Eric Eastwood
9069fd170e Iterate on docs 2025-09-12 14:32:56 -05:00
Eric Eastwood
275c4aa4aa WIP: Explain how Deferred callbacks interact with logcontexts
Spawning from https://github.com/matrix-org/synapse/pull/12588#discussion_r865843321
2025-09-12 09:40:35 -05:00
3 changed files with 365 additions and 2 deletions

1
changelog.d/18914.doc Normal file
View File

@@ -0,0 +1 @@
Explain how Deferred callbacks interact with logcontexts.

View File

@@ -143,8 +143,7 @@ cares about.
The following sections describe pitfalls and helpful patterns when
implementing these rules.
Always await your awaitables
----------------------------
## Always await your awaitables
Whenever you get an awaitable back from a function, you should `await` on
it as soon as possible. Do not pass go; do not do any logging; do not
@@ -203,6 +202,171 @@ async def sleep(seconds):
return await context.make_deferred_yieldable(get_sleep_deferred(seconds))
```
## Deferred callbacks
When a deferred callback is called, it inherits the current logcontext. The deferred
callback chain can resume a coroutine, which if following our logcontext rules, will
restore its own logcontext, then run:
- until it yields control back to the reactor, setting the sentinel logcontext
- or until it finishes, restoring the logcontext it was started with (calling context)
This behavior creates two specific issues:
**Issue 1:** The first issue is that the callback may have reset the logcontext to the
sentinel before returning. This means our calling function will continue with the
sentinel logcontext instead of the logcontext it was started with (bad).
**Issue 2:** The second issue is that the current logcontext that called the deferred
callback could finish before the callback finishes (bad).
In the following example, the deferred callback is called with the "main" logcontext and
runs until we yield control back to the reactor in the `await` inside `clock.sleep(0)`.
Since `clock.sleep(0)` follows our logcontext rules, it sets the logcontext to the
sentinel before yielding control back to the reactor. Our `main` function continues with
the sentinel logcontext (first bad thing) instead of the "main" logcontext. Then the
`with LoggingContext("main")` block exits, finishing the "main" logcontext and yielding
control back to the reactor again. Finally, later on when `clock.sleep(0)` completes,
our `with LoggingContext("competing")` block exits, and restores the previous "main"
logcontext which has already finished, resulting in `WARNING: Re-starting finished log
context main` and leaking the `main` logcontext into the reactor which will then
erronously be associated with the next task the reactor picks up.
```python
async def competing_callback():
# Since this is run with the "main" logcontext, when the "competing"
# logcontext exits, it will restore the previous "main" logcontext which has
# already finished and results in "WARNING: Re-starting finished log context main"
# and leaking the `main` logcontext into the reactor.
with LoggingContext("competing"):
await clock.sleep(0)
def main():
with LoggingContext("main"):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
# Call the callback within the "main" logcontext.
d.callback(None)
# Bad: This will be logged against sentinel logcontext
logger.debug("ugh")
main()
```
**Solution 1:** We could of course fix this by following the general rule of "always
await your awaitables":
```python
async def main():
with LoggingContext("main"):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
d.callback(None)
# Wait for `d` to finish before continuing so the "main" logcontext is
# still active. This works because `d` already follows our logcontext
# rules. If not, we would also have to use `make_deferred_yieldable(d)`.
await d
# Good: This will be logged against the "main" logcontext
logger.debug("phew")
```
**Solution 2:** We could also fix this by surrounding the call to `d.callback` with a
`PreserveLoggingContext`, which will reset the logcontext to the sentinel before calling
the callback, and restore the "main" logcontext afterwards before continuing the `main`
function. This solves the problem because when the "competing" logcontext exits, it will
restore the sentinel logcontext which is never finished by its nature, so there is no
warning and no leakage into the reactor.
```python
async def main():
with LoggingContext("main"):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
d.callback(None)
with PreserveLoggingContext():
# Call the callback with the sentinel logcontext.
d.callback(None)
# Good: This will be logged against the "main" logcontext
logger.debug("phew")
```
**Solution 3:** But let's say you *do* want to run (fire-and-forget) the deferred
callback in the current context without running into issues:
We can solve the first issue by using `run_in_background(...)` to run the callback in
the current logcontext and it handles the magic behind the scenes of a) restoring the
calling logcontext before returning to the caller and b) resetting the logcontext to the
sentinel after the deferred completes and we yield control back to the reactor to avoid
leaking the logcontext into the reactor.
To solve the second issue, we can extend the lifetime of the "main" logcontext by
avoiding the `LoggingContext`'s context manager lifetime methods
(`__enter__`/`__exit__`). We can still set "main" as the current logcontext by using
`PreserveLoggingContext` and passing in the "main" logcontext.
```python
async def main():
main_context = LoggingContext("main")
with PreserveLoggingContext(main_context):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
# The whole lambda will be run in the "main" logcontext. But we're using
# a trick to return the deferred `d` itself so that `run_in_background`
# will wait on that to complete and reset the logcontext to the sentinel
# when it does to avoid leaking the "main" logcontext into the reactor.
run_in_background(lambda: (d.callback(None), d)[1])
# Good: This will be logged against the "main" logcontext
logger.debug("phew")
...
# Wherever possible, it's best to finish the logcontext by calling `__exit__` at some
# point. This allows us to catch bugs if we later try to erroneously restart a finished
# logcontext.
#
# Since the "main" logcontext stores the `LoggingContext.previous_context` when it is
# created, we can wrap this call in `PreserveLoggingContext()` to restore the correct
# previous logcontext. Our goal is to have the calling context remain unchanged after
# finishing the "main" logcontext.
with PreserveLoggingContext():
# Finish the "main" logcontext
with main_context:
# Empty block - We're just trying to call `__exit__` on the "main" context
# manager to finish it. We can't call `__exit__` directly as the code expects us
# to `__enter__` before calling `__exit__` to `start`/`stop` things
# appropriately. And in any case, it's probably best not to call the internal
# methods directly.
pass
```
The same thing applies if you have some deferreds stored somewhere which you want to
callback in the current logcontext.
### Deferred errbacks and cancellations
The same care should be taken when calling errbacks on deferreds. An errback and
callback act the same in this regard (see section above).
```python
d = defer.Deferred()
d.addErrback(some_other_function)
d.errback(failure)
```
Additionally, cancellation is the same as directly calling the errback with a
`twisted.internet.defer.CancelledError`:
```python
d = defer.Deferred()
d.addErrback(some_other_function)
d.cancel()
```
## Fire-and-forget
Sometimes you want to fire off a chain of execution, but not wait for

View File

@@ -249,6 +249,204 @@ class LoggingContextTestCase(unittest.TestCase):
# Back to the sentinel context
self._check_test_key("sentinel")
@logcontext_clean
async def test_deferred_callback_await_in_current_logcontext(self) -> None:
"""
Test that calling the deferred callback in the current logcontext ("foo") and
waiting for it to finish in a logcontext blocks works as expected.
Works because "always await your awaitables".
Demonstrates one pattern that we can use fix the naive case where we just call
`d.callback(None)` without anything else. See the *Deferred callbacks* section
of docs/log_contexts.md for more details.
"""
clock = Clock(reactor)
# Sanity check that we start in the sentinel context
self._check_test_key("sentinel")
callback_finished = False
async def competing_callback() -> None:
nonlocal callback_finished
try:
# The deferred callback should have the same logcontext as the caller
self._check_test_key("foo")
with LoggingContext("competing"):
await clock.sleep(0)
self._check_test_key("competing")
self._check_test_key("foo")
finally:
# When exceptions happen, we still want to mark the callback as finished
# so that the test can complete and we see the underlying error.
callback_finished = True
with LoggingContext("foo"):
d: defer.Deferred[None] = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
self._check_test_key("foo")
d.callback(None)
# The fix for the naive case is here (i.e. things don't work correctly if we
# don't await here).
#
# Wait for `d` to finish before continuing so the "main" logcontext is
# still active. This works because `d` already follows our logcontext
# rules. If not, we would also have to use `make_deferred_yieldable(d)`.
await d
self._check_test_key("foo")
await clock.sleep(0)
self.assertTrue(
callback_finished,
"Callback never finished which means the test probably didn't wait long enough",
)
# Back to the sentinel context
self._check_test_key("sentinel")
@logcontext_clean
async def test_deferred_callback_preserve_logging_context(self) -> None:
"""
Test that calling the deferred callback inside `PreserveLoggingContext()` (in
the sentinel context) works as expected.
Demonstrates one pattern that we can use fix the naive case where we just call
`d.callback(None)` without anything else. See the *Deferred callbacks* section
of docs/log_contexts.md for more details.
"""
clock = Clock(reactor)
# Sanity check that we start in the sentinel context
self._check_test_key("sentinel")
callback_finished = False
async def competing_callback() -> None:
nonlocal callback_finished
try:
# The deferred callback should have the same logcontext as the caller
self._check_test_key("sentinel")
with LoggingContext("competing"):
await clock.sleep(0)
self._check_test_key("competing")
self._check_test_key("sentinel")
finally:
# When exceptions happen, we still want to mark the callback as finished
# so that the test can complete and we see the underlying error.
callback_finished = True
with LoggingContext("foo"):
d: defer.Deferred[None] = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
self._check_test_key("foo")
# The fix for the naive case is here (i.e. things don't work correctly if we
# don't `PreserveLoggingContext()` here).
#
# `PreserveLoggingContext` will reset the logcontext to the sentinel before
# calling the callback, and restore the "foo" logcontext afterwards before
# continuing the foo block. This solves the problem because when the
# "competing" logcontext exits, it will restore the sentinel logcontext
# which is never finished by its nature, so there is no warning and no
# leakage into the reactor.
with PreserveLoggingContext():
d.callback(None)
self._check_test_key("foo")
await clock.sleep(0)
self.assertTrue(
callback_finished,
"Callback never finished which means the test probably didn't wait long enough",
)
# Back to the sentinel context
self._check_test_key("sentinel")
@logcontext_clean
async def test_deferred_callback_fire_and_forget_with_current_context(self) -> None:
"""
Test that it's possible to call the deferred callback with the current context
while fire-and-forgetting the callback (no adverse effects like leaking the
logcontext into the reactor or restarting an already finished logcontext).
Demonstrates one pattern that we can use fix the naive case where we just call
`d.callback(None)` without anything else. See the *Deferred callbacks* section
of docs/log_contexts.md for more details.
"""
clock = Clock(reactor)
# Sanity check that we start in the sentinel context
self._check_test_key("sentinel")
callback_finished = False
async def competing_callback() -> None:
nonlocal callback_finished
try:
# The deferred callback should have the same logcontext as the caller
self._check_test_key("foo")
with LoggingContext("competing"):
await clock.sleep(0)
self._check_test_key("competing")
self._check_test_key("foo")
finally:
# When exceptions happen, we still want to mark the callback as finished
# so that the test can complete and we see the underlying error.
callback_finished = True
# Part of fix for the naive case is here (i.e. things don't work correctly if we
# don't `PreserveLoggingContext(...)` here).
#
# We can extend the lifetime of the "foo" logcontext is to avoid calling the
# context manager lifetime methods of `LoggingContext` (`__enter__`/`__exit__`).
# And we can still set the current logcontext by using `PreserveLoggingContext`
# and passing in the "foo" logcontext.
with PreserveLoggingContext(LoggingContext("foo")):
d: defer.Deferred[None] = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
self._check_test_key("foo")
# Other part of fix for the naive case is here (i.e. things don't work
# correctly if we don't `run_in_background(...)` here).
#
# `run_in_background(...)` will run the whole lambda in the current
# logcontext and it handles the magic behind the scenes of a) restoring the
# calling logcontext before returning to the caller and b) resetting the
# logcontext to the sentinel after the deferred completes and we yield
# control back to the reactor to avoid leaking the logcontext into the
# reactor.
#
# We're using a lambda here as a little trick so we can still get everything
# to run in the "foo" logcontext, but return the deferred `d` itself so that
# `run_in_background` will wait on that to complete before resetting the
# logcontext to the sentinel.
#
# type-ignore[call-overload]: This appears like a mypy type inference bug. A
# function that returns a deferred is exactly what `run_in_background`
# expects.
#
# type-ignore[func-returns-value]: This appears like a mypy type inference
# bug. We're always returning the deferred `d`.
run_in_background(lambda: (d.callback(None), d)[1]) # type: ignore[call-overload, func-returns-value]
self._check_test_key("foo")
await clock.sleep(0)
self.assertTrue(
callback_finished,
"Callback never finished which means the test probably didn't wait long enough",
)
# Back to the sentinel context
self._check_test_key("sentinel")
def _test_run_in_background(self, function: Callable[[], object]) -> defer.Deferred:
sentinel_context = current_context()