Compare commits

...

215 Commits

Author SHA1 Message Date
Devon Hudson
f39e2125a9 Don't prevent module_api from cancelling background processes on shutdown 2025-09-29 12:35:06 -06:00
Devon Hudson
3e92bf70d4 Remove unnecessary comments 2025-09-29 12:32:04 -06:00
Devon Hudson
3ca4e98e78 Remove unnecessary comment 2025-09-29 12:04:34 -06:00
Devon Hudson
f2f0baea84 Address synmark review comments 2025-09-29 12:02:33 -06:00
Devon Hudson
46e112b238 Make test comments clearer 2025-09-29 12:00:49 -06:00
Devon Hudson
baa066c588 Apply wording to other mypy lint codes 2025-09-29 11:59:02 -06:00
Devon Hudson
89a133e03b Update lint ignore comment 2025-09-29 11:56:30 -06:00
Devon Hudson
54d2fe4d90 Address mypy plugin review comments 2025-09-29 11:54:53 -06:00
Devon Hudson
751cacd34a Fix registered shutdown handlers 2025-09-28 08:43:32 -06:00
Devon Hudson
8d05718dda Fix appservice tests with wrap_as_background_process changes 2025-09-27 21:09:01 -06:00
Devon Hudson
cb699a7eb5 Make call_later cancellation bool optional with default logic 2025-09-27 20:33:33 -06:00
Devon Hudson
92dba449f8 Switch to using hs.run_as_background_process 2025-09-27 19:34:36 -06:00
Devon Hudson
25c4ba8ec8 Merge branch 'develop' into devon/clean-shutdown 2025-09-26 19:33:16 -06:00
Devon Hudson
50a3cd1fb6 Lint new Clock creation 2025-09-26 17:39:48 -06:00
Devon Hudson
28fdf12ecf Add lint for new Clock creation 2025-09-26 16:42:46 -06:00
Eric Eastwood
5143f93dc9 Fix server_name in logging context for multiple Synapse instances in one process (#18868)
### Background

As part of Element's plan to support a light form of vhosting (virtual
hosting), i.e. multiple instances of Synapse in the same Python process,
we're currently diving into the details and implications of running
Synapse that way.

"Per-tenant logging" tracked internally by
https://github.com/element-hq/synapse-small-hosts/issues/48

### Prior art

Previously, we exposed `server_name` by providing a static logging
`MetadataFilter` that injected the values:


205d9e4fc4/synapse/config/logger.py (L216)

While this works fine for the normal case of one Synapse instance per
Python process, it configures things globally and breaks when we try to
start multiple Synapse instances, because each subsequent tenant
overwrites the previous one.


### What does this PR do?

We remove the `MetadataFilter` and instead track the `server_name` in the
`LoggingContext`, exposing it with our existing
[`LoggingContextFilter`](205d9e4fc4/synapse/logging/context.py (L584-L622))
that we already use to expose information about the `request`.

This means the `server_name` value follows wherever we log, as expected,
even when multiple Synapse instances are running in the same process.
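
As a rough illustration of the difference (a minimal sketch, not
Synapse's actual code: a plain `ContextVar` stands in for the
`LoggingContext`, and `ContextServerNameFilter` is a hypothetical name):

```python
import logging
from contextvars import ContextVar

# Stand-in for Synapse's LoggingContext tracking of the current server.
current_server_name: ContextVar[str] = ContextVar(
    "server_name", default="unknown_server_from_sentinel_context"
)

class ContextServerNameFilter(logging.Filter):
    """Resolve `server_name` at log time, from the current context."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Each Synapse instance in the process stamps its own value,
        # instead of a static filter overwriting one global value.
        record.server_name = current_server_name.get()
        return True
```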


### A note on logcontext

Anywhere Synapse mistakenly uses the `sentinel` logcontext to log
something, we won't know which server sent the log. We've been fixing up
`sentinel` logcontext usage, as tracked by
https://github.com/element-hq/synapse/issues/18905

Any further `sentinel` logcontext usage we find in the future can be
fixed piecemeal as normal.


d2a966f922/docs/log_contexts.md (L71-L81)


### Testing strategy

1. Adjust your logging config to include `%(server_name)s` in the format
    ```yaml
    formatters:
        precise:
            format: '%(asctime)s - %(server_name)s - %(name)s - %(lineno)d - %(levelname)s - %(request)s - %(message)s'
    ```
1. Start Synapse: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Make some requests (`curl
http://localhost:8008/_matrix/client/versions`, etc)
1. Open the homeserver logs and notice the `server_name` in the logs as
expected. `unknown_server_from_sentinel_context` is expected for the
`sentinel` logcontext (things outside of Synapse).
2025-09-26 17:10:48 -05:00
Devon Hudson
527bd48b8f Fix postgres shutdown tests 2025-09-26 16:09:56 -06:00
Devon Hudson
34d314beca Fix removeSystemEventTrigger in tests 2025-09-26 15:08:54 -06:00
Devon Hudson
d6d4780eee Can't subscript Deferred in python 3.9 2025-09-26 12:01:15 -06:00
Devon Hudson
b1f887c799 Can't subscript Deferred in python 3.9 2025-09-26 11:51:43 -06:00
Devon Hudson
8184e9b599 Merge branch 'develop' into devon/clean-shutdown 2025-09-26 11:37:46 -06:00
Devon Hudson
d7f8c1cf62 Initial conversion run_as_background_process to use hs 2025-09-26 11:33:13 -06:00
Eric Eastwood
2f2b854ac1 Fix logcontext handling in timeout_deferred tests (#18974)
Related to https://github.com/element-hq/synapse/issues/18905

These fixes were split off from
https://github.com/element-hq/synapse/pull/18828 where @devonh was
seeing some test failures because `timeout_deferred(...)` is being
updated to use `Clock` utilities instead of raw `reactor` methods. This
test was failing in that branch/PR until we made this new version that
handles the logcontexts properly.

While the previous version of this test does pass on `develop`, it was
using what appear to be completely wrong assertions, flawed assumptions,
and bad patterns to make that happen (see diff comments below)

---

Test originally introduced in https://github.com/matrix-org/synapse/pull/4407
2025-09-26 11:10:02 -05:00
Andrew Morgan
8f61bdb470 Note optional Element Commercial License in SPDX specifiers (#18973) 2025-09-26 12:43:07 +01:00
Andrew Morgan
7c32988f6b Update URLs in dockerfile metadata (#18971) 2025-09-26 12:40:50 +01:00
Hammy Havoc
688f635b59 Updated providers.json to use X instead of Twitter following rebrand and schema change (#18767) 2025-09-26 11:06:50 +01:00
Eric Eastwood
04721c85e6 Disconnect background process work from request trace (#18932)
Before https://github.com/element-hq/synapse/pull/18849, we were using
our own custom `LogContextScopeManager` which tied the tracing scope to
the `LoggingContext`. Since we created a new
`BackgroundProcessLoggingContext` any time we
`run_as_background_process(...)`, the trace for the background work was
separate from the trace that kicked off the work, as expected (e.g. a
request trace is separate from the background process we kicked off to
fetch more messages over federation).

Since we've now switched to the `ContextVarsScopeManager` (in
https://github.com/element-hq/synapse/pull/18849), the tracing scope now
crosses the `LoggingContext` boundaries (and thread boundaries) without
a problem. This means we end up with request traces that include all of
the background work that we've kicked off, bloating the trace and making
it hard to understand what's going on.

This PR separates the traces again, restoring the old behavior.
Additionally, things are even better now: I added cross-link references
between the traces so it's easy to jump between them.
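
For a rough idea of the pattern (a sketch against the plain
`opentracing` API with made-up function names; Synapse's actual
implementation lives in `run_as_background_process(...)` and may differ
in detail):

```python
import opentracing

def start_background_trace(
    tracer: opentracing.Tracer, request_span: opentracing.Span
) -> opentracing.Span:
    # Leave only a small marker span in the request trace...
    with tracer.start_span("start_bgproc.qwer", child_of=request_span):
        pass
    # ...and start the background work with only a follows-from reference
    # back to the request, rather than as a child span inside it.
    return tracer.start_span(
        "bgproc.qwer",
        references=[opentracing.follows_from(request_span.context)],
    )
```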

Follow-up to https://github.com/element-hq/synapse/pull/18849

---

In the before image, you can see that the trace is blown up by the
background process (`bgproc.qwer`).

In the after image, we now only have a little cross-link marker span
(`start_bgproc.qwer`) that jumps to the background process trace.

Before | After
---  | ---
<some image> | <some image>



### Testing strategy

1. Run a Jaeger instance
(https://www.jaegertracing.io/docs/1.6/getting-started/)
    ```shell
    $ docker run -d --name jaeger \
      -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
      -p 5775:5775/udp \
      -p 6831:6831/udp \
      -p 6832:6832/udp \
      -p 5778:5778 \
      -p 16686:16686 \
      -p 14268:14268 \
      -p 9411:9411 \
      jaegertracing/all-in-one:1.59.0
    ```
 1. Configure Synapse to use tracing:
     `homeserver.yaml`
     ```yaml
    ## Tracing ##
    opentracing:
      enabled: true
      jaeger_config:
        sampler:
          type: const
          param: 1
        logging: false
    ```
1. Make sure the optional `opentracing` dependency is installed: `poetry
install --extras all`
1. In the `VersionsRestServlet`, modify it to kick off a dummy
background process (easy to test this way)
    ```python
    from synapse.metrics.background_process_metrics import run_as_background_process

    async def _qwer() -> None:
        await self.clock.sleep(1)

    run_as_background_process("qwer", "test_server", _qwer)
    ```
1. Run Synapse: `poetry run synapse_homeserver --config-path
homeserver.yaml`
1. Fire off a versions request: `curl
http://localhost:8008/_matrix/client/versions`
 1. Visit http://localhost:16686/search to view the traces
     - Select the correct service
     - Look for the `VersionsRestServlet` operation
     - Press the 'Find Traces' button
     - Select the relevant trace
     - Notice how the trace isn't bloated
     - Look for the `start_bgproc.qwer` span cross-linking to the background process
     - Jump to the other trace using the cross-link reference -> `bgproc.qwer`
2025-09-25 21:45:18 -05:00
Devon Hudson
03b95947e2 Add assert that updates aren't complete 2025-09-25 15:14:28 -06:00
Devon Hudson
30dbc2b55c Add shutdown test where background updates haven't completed 2025-09-25 15:12:05 -06:00
Devon Hudson
130dcda816 Re-add call_later cancellation tracking 2025-09-25 14:26:33 -06:00
Eric Eastwood
3f5c463103 Clean up test 2025-09-25 15:05:29 -05:00
Eric Eastwood
0f23e500e9 Fix test not following Synapse logcontext rules 2025-09-25 15:05:25 -05:00
Travis Ralston
d2a966f922 Use signature support from policy servers when available (#18934)
Opening on Kegan's behalf


[MSC4284](https://github.com/matrix-org/matrix-spec-proposals/pull/4284)
has already been opened accordingly.

---------

Co-authored-by: Kegan Dougal <7190048+kegsay@users.noreply.github.com>
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-25 19:30:24 +00:00
Devon Hudson
d0555f9abc Merge branch 'develop' into devon/clean-shutdown 2025-09-25 11:38:04 -06:00
Andrew Morgan
dee6ba57a6 Merge branch 'release-v1.139' into develop 2025-09-25 12:57:39 +01:00
Devon Hudson
8855f4a313 Override MemoryReactor in tests for clean shutdown 2025-09-24 16:40:32 -06:00
dependabot[bot]
6ff181dbc7 Bump typing-extensions from 4.14.1 to 4.15.0 (#18956)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-24 23:06:45 +01:00
Devon Hudson
eb16eae264 Add debugging info for shutdown test 2025-09-24 15:45:39 -06:00
Hugh Nimmo-Smith
fd8fa97b6a Document and fix room_config param when user_may_create_room callback is invoked for a room upgrade (#18721)
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-24 21:42:19 +00:00
Devon Hudson
8b086c6a84 Return deferred from cleanup_func 2025-09-24 15:31:27 -06:00
Eric Eastwood
5266e423e2 Explain how Deferred callbacks interact with logcontexts (#18914)
Spawning from
https://github.com/matrix-org/synapse/pull/12588#discussion_r865843321

> It turns out `Deferred.cancel()` is a lot like
`Deferred.callback()`/`errback()` in that it will trash the logging
context:
> it can resume a coroutine, which will restore its own logging context,
then run:
> 
>  - until it blocks, setting the sentinel context
>  - or until it terminates, setting the context it was started with
> 
> So we need to wrap it in `with PreserveLoggingContext():`, like we do
with `.callback()`:
> 
> ```python
> with PreserveLoggingContext():
>     self.render_deferred.cancel()
> ```
>
> *-- @squahtx,
https://github.com/matrix-org/synapse/pull/12588#discussion_r865843321*
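
A minimal sketch of the resulting rule, assuming Synapse's
`PreserveLoggingContext` helper (the wrapper function names here are
illustrative):

```python
from twisted.internet import defer

from synapse.logging.context import PreserveLoggingContext

def fire(d: "defer.Deferred") -> None:
    # Firing a deferred can resume a coroutine, which may leave us in the
    # sentinel context; PreserveLoggingContext restores ours afterwards.
    with PreserveLoggingContext():
        d.callback(None)

def abort(d: "defer.Deferred") -> None:
    # Deferred.cancel() behaves like callback()/errback() here, so it
    # needs the same treatment.
    with PreserveLoggingContext():
        d.cancel()
```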
2025-09-24 16:20:42 -05:00
Eric Eastwood
0458f691b6 Fix run_coroutine_in_background(...) incorrectly handling logcontext (#18964)
Regressed in
https://github.com/element-hq/synapse/pull/18900#discussion_r2331554278
(see conversation there for more context)


### How is this a regression?

> To give this an update with more hindsight: this logic *was* redundant
with the early return, and it is safe to remove this complexity

> 
> It seems like this actually has to do with completed vs incomplete
deferreds...
> 
> To explain how things previously worked *without* the early-return
shortcut:
> 
> With the normal case of **incomplete awaitable**, we store the
`calling_context` and the `f` function is called and runs until it
yields to the reactor. Because `f` follows the logcontext rules, it sets
the `sentinel` logcontext. Then in `run_in_background(...)`, we restore
the `calling_context`, store the current `ctx` (which is `sentinel`) and
return. When the deferred completes, we restore `ctx` (which is
`sentinel`) before yielding to the reactor again (all good)
> 
> With the other case where we see a **completed awaitable**, we store
the `calling_context` and the `f` function is called and runs to
completion (no logcontext change). *This is where the shortcut would
kick in but I'm going to continue explaining as if we commented out the
shortcut.* -- Then in `run_in_background(...)`, we restore the
`calling_context`, store the current `ctx` (which is same as the
`calling_context`). Because the deferred is already completed, our extra
callback is called immediately and we restore `ctx` (which is same as
the `calling_context`). Since we never yield to the reactor, the
`calling_context` is perfect as that's what we want again (all good)
> 
> ---
> 
> But this also means that our early-return shortcut is no longer just
an optimization and is *necessary* to act correctly in the **completed
awaitable** case as we want to return with the `calling_context` and not
reset to the `sentinel` context. I've updated the comment in
https://github.com/element-hq/synapse/pull/18964 to explain the
necessity as it's currently just described as an optimization.
> 
> But because we made the same change to
`run_coroutine_in_background(...)`, which didn't have the same
early-return shortcut, we regressed the correct behavior. This is
being fixed in https://github.com/element-hq/synapse/pull/18964
>
>
> *-- @MadLittleMods,
https://github.com/element-hq/synapse/pull/18900#discussion_r2373582917*
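
To make the two cases concrete, here is a toy model (plain Twisted, with
a module-level string standing in for Synapse's `LoggingContext`; the
real `run_in_background` also handles `d.paused`, exceptions, and more):

```python
from twisted.internet import defer

current_context = "sentinel"  # toy stand-in for the active logcontext

def run_in_background(f):
    global current_context
    calling_context = current_context
    d = defer.ensureDeferred(f())
    if d.called:
        # Completed awaitable: f ran to completion without yielding, so it
        # never touched the context. Returning here keeps calling_context
        # active -- the early return that turned out to be load-bearing.
        return d
    # Incomplete awaitable: f yielded to the reactor and (following the
    # logcontext rules) set the sentinel. Remember that, and restore the
    # caller's context for the code after run_in_background().
    ctx = current_context
    current_context = calling_context

    def _restore(result):
        global current_context
        current_context = ctx  # back to the sentinel before the reactor
        return result

    d.addBoth(_restore)
    return d
```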

### How did we find this problem?

Spawning from @wrjlewis
[seeing](https://matrix.to/#/!SGNQGPGUwtcPBUotTL:matrix.org/$h3TxxPVlqC6BTL07dbrsz6PmaUoZxLiXnSTEY-QYDtA?via=jki.re&via=matrix.org&via=element.io)
`Starting metrics collection 'typing.get_new_events' from sentinel
context: metrics will be lost` in the logs:

<details>
<summary>More logs</summary>

```
synapse.http.request_metrics - 222 - ERROR - sentinel - Trying to stop RequestMetrics in the sentinel context.
2025-09-23 14:43:19,712 - synapse.util.metrics - 212 - WARNING - sentinel - Starting metrics collection 'typing.get_new_events' from sentinel context: metrics will be lost
2025-09-23 14:43:19,713 - synapse.rest.client.sync - 851 - INFO - sentinel - Client has disconnected; not serializing response.
2025-09-23 14:43:19,713 - synapse.http.server - 825 - WARNING - sentinel - Not sending response to request <XForwardedForRequest at 0x7f23e8111ed0 method='POST' uri='/_matrix/client/unstable/org.matrix.simplified_msc3575/sync?pos=281963%2Fs929324_147053_10_2652457_147960_2013_25554_4709564_0_164_2&timeout=30000' clientproto='HTTP/1.1' site='8008'>, already dis
connected.
2025-09-23 14:43:19,713 - synapse.access.http.8008 - 515 - INFO - sentinel - 92.40.194.87 - 8008 - {@me:wi11.co.uk} Processed request: 30.005sec/-8.041sec (0.001sec, 0.000sec) (0.000sec/0.002sec/2) 0B 200! "POST /_matrix/client/unstable/org.matrix.simplified_msc3575/
```

</details>

From the logs there, we can see things relating to
`typing.get_new_events` and
`/_matrix/client/unstable/org.matrix.simplified_msc3575/sync`, which led
me to try out Sliding Sync with the typing extension enabled and allowed
me to reproduce the problem locally. Sliding Sync is a unique scenario:
it's the only place we use `gather_optional_coroutines(...)` ->
`run_coroutine_in_background(...)` (introduced in
https://github.com/element-hq/synapse/pull/17884), so it's the only
place that exhibits this behavior.
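
A simplified, hypothetical sketch of that call path (the exact signature
and return shape of `gather_optional_coroutines(...)` are assumptions
here, as is the stand-in extension work):

```python
from typing import Optional

from synapse.util.async_helpers import gather_optional_coroutines

async def _typing_updates() -> dict:
    # Stand-in for the typing extension's work.
    return {"typing": []}

async def compute_extensions(typing_enabled: bool) -> Optional[dict]:
    # Sliding Sync passes one optional coroutine per extension; disabled
    # extensions are passed as None. Each non-None coroutine is run via
    # run_coroutine_in_background(...), which is where the logcontext
    # handling regressed.
    typing_coro = _typing_updates() if typing_enabled else None
    (typing_result,) = await gather_optional_coroutines(typing_coro)
    return typing_result
```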


### Testing strategy

1. Configure Synapse to enable
[MSC4186](https://github.com/matrix-org/matrix-spec-proposals/pull/4186):
Simplified Sliding Sync which is actually under
[MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575)
    ```yaml
    experimental_features:
      msc3575_enabled: true
    ```
1. Start synapse: `poetry run synapse_homeserver --config-path
homeserver.yaml`
 1. Make a Sliding Sync request with one of the extensions enabled
    ```http
    POST http://localhost:8008/_matrix/client/unstable/org.matrix.simplified_msc3575/sync
    {
      "lists": {},
      "room_subscriptions": {
            "!FlgJYGQKAIvAscfBhq:my.synapse.linux.server": {
                "required_state": [],
                "timeline_limit": 1
            }
        },
        "extensions": {
            "typing": {
                "enabled": true
            }
        }
    }
    ```
1. Open your homeserver logs and notice warnings about `Starting ...
from sentinel context: metrics will be lost`
2025-09-24 15:24:47 +00:00
Eric Eastwood
25fa555395 Fix no active span when trying to log tracing error on startup (#18959)
Fix `no active span when trying to log` tracing error on startup.

Example error:
```log
synapse.logging.opentracing - 427 - ERROR - wake_destinations_needing_catchup-0 - There was no active span when trying to log. Did you forget to start one or did a context slip?
Stack (most recent call last):
  File "/usr/lib/python3.13/threading.py", line 1014, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib/python3.13/threading.py", line 1043, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.13/threading.py", line 994, in run
    self._target(*self._args, **self._kwargs)
  File "python3.13/site-packages/twisted/_threads/_threadworker.py", line 75, in work
    task()
  File "python3.13/site-packages/twisted/_threads/_team.py", line 192, in doWork
    task()
  File "python3.13/site-packages/twisted/python/threadpool.py", line 269, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "python3.13/site-packages/twisted/python/threadpool.py", line 285, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "python3.13/site-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "python3.13/site-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
  File "python3.13/site-packages/twisted/enterprise/adbapi.py", line 282, in _runWithConnection
    result = func(conn, *args, **kw)
  File "synapse/synapse/storage/database.py", line 1094, in inner_func
    return func(db_conn, *args, **kwargs)
  File "synapse/synapse/storage/database.py", line 822, in new_transaction
    opentracing.log_kv({"message": "commit"})
  File "synapse/synapse/logging/opentracing.py", line 427, in ensure_active_span_inner_2
    logger.error(
```


### Why did this happen before?

This previously occurred because we called `init_tracer(...)` after the
reactor started up in `_base.start()`. But we actually attempt some
database transactions earlier than that, which try to do some tracing,
because of the `oidc = hs.get_oidc_handler()` line.

Notice `oidc = hs.get_oidc_handler()` happened before `_base.start(hs)`:


5be7679dd9/synapse/app/homeserver.py (L397-L408)


With this PR, I've updated things to call `init_tracer(...)` earlier,
alongside where we `setup_logging(...)`.
2025-09-24 10:12:08 -05:00
Andrew Morgan
7708801d56 Fix triage_labelled GHA workflow (#18913) 2025-09-24 14:17:14 +01:00
Andrew Morgan
d3fc638c29 Merge branch 'master' into develop 2025-09-24 13:50:05 +01:00
Andrew Morgan
6c292dc4ee 1.138.2 2025-09-24 12:26:49 +01:00
Andrew Morgan
120389b077 Note ubuntu release support update in the upgrade notes 2025-09-24 12:25:41 +01:00
Andrew Morgan
71b34b3a07 Drop support for Ubuntu 24.10 'Oracular Oriole', add support for Ubuntu 25.04 'Plucky Puffin' (#18962) 2025-09-24 12:24:32 +01:00
PizZaKatZe
e766f325af fix: Compute user last seen timestamp from last seen devices (#18948)
## Fix last seen timestamp in `/_synapse/admin/v2/users` response

Fixes #18955

The last seen timestamps contained in `/_synapse/admin/v2/users`
responses were computed as follows:

```sql
                [...]
                LEFT JOIN (
                    SELECT user_id, MAX(last_seen) AS last_seen_ts
                    FROM user_ips GROUP BY user_id
                ) ls ON u.name = ls.user_id
                [...]
```

4367fb2d07/synapse/storage/databases/main/__init__.py (L302C1-L305C44)

This leads to empty timestamps (as in: user was never seen) if users are
inactive for longer than
[`user_ips_max_age`](https://element-hq.github.io/synapse/latest/usage/configuration/config_documentation.html#user_ips_max_age).

The fix is quite trivial: use the `devices` table, as it also contains
last seen timestamps but is *not* periodically purged.
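
A sketch of the changed subquery, assuming the same query shape as the
snippet above (the actual patch may differ in detail):

```sql
                -- Derive last_seen from `devices`, which is not purged
                -- according to `user_ips_max_age`.
                LEFT JOIN (
                    SELECT user_id, MAX(last_seen) AS last_seen_ts
                    FROM devices GROUP BY user_id
                ) ls ON u.name = ls.user_id
```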

We are using this for automatic user account deletion (via
[synadm](https://codeberg.org/synadm/synadm)) and the patched code works
as intended, whereas the unpatched version wants to delete users during
long vacations. 🫣
2025-09-24 11:59:11 +01:00
Tulir Asokan
512b3f50cf Update MSC4326 error code (#18947) 2025-09-24 11:57:24 +01:00
Andrew Morgan
0fbf296c99 1.138.1 2025-09-24 11:32:48 +01:00
Andrew Morgan
0c8594c9a8 Fix performance regression related to delayed events processing (#18926) 2025-09-24 11:30:47 +01:00
Shay
35c9cbb09d Add an Admin API to query a piece of local or cached remote media by ID (#18911) 2025-09-23 16:25:56 -05:00
dependabot[bot]
9680804496 Bump phonenumbers from 9.0.13 to 9.0.14 (#18954)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-23 17:18:41 +01:00
dependabot[bot]
8f63e2246a Bump pygithub from 2.7.0 to 2.8.1 (#18952)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-23 17:18:19 +01:00
dependabot[bot]
aa83d660d5 Bump anyhow from 1.0.99 to 1.0.100 (#18950)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-23 17:18:09 +01:00
dependabot[bot]
641ced06a2 Bump Swatinem/rust-cache from 2.8.0 to 2.8.1 (#18949)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-23 17:17:28 +01:00
dependabot[bot]
354f1cc219 Bump authlib from 1.6.3 to 1.6.4 (#18957)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-23 16:44:38 +01:00
dependabot[bot]
478f593b6c Bump serde from 1.0.224 to 1.0.226 (#18953)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-23 16:42:38 +01:00
dependabot[bot]
cd6c424adb Bump types-requests from 2.32.4.20250809 to 2.32.4.20250913 (#18951)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-23 16:40:36 +01:00
Andrew Morgan
b70f668a8c Merge branch 'release-v1.139' into develop 2025-09-23 16:28:04 +01:00
Andrew Morgan
0447496549 Merge branch 'release-v1.139' into develop 2025-09-23 16:05:53 +01:00
Andrew Morgan
9ed0d36fe2 Bump batch size from 50 to 1000 for _get_e2e_cross_signing_signatures_for_devices query (#18939)
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-23 15:47:29 +01:00
Devon Hudson
0bd1706837 Switch test hs cleanup to use weakref 2025-09-19 15:25:16 -06:00
Devon Hudson
9fd10db1f6 Remove unused import 2025-09-19 15:15:49 -06:00
Devon Hudson
34140b3600 Add arg to setup_test_homeserver for cleanup 2025-09-19 15:09:13 -06:00
Devon Hudson
490195f1ef Add comment about HTTP federation test 2025-09-19 15:02:54 -06:00
Devon Hudson
c36aaa1518 Fix lint ignore renames 2025-09-19 14:25:23 -06:00
Devon Hudson
63e096cb43 Move comment next to applicable arg 2025-09-19 14:15:42 -06:00
Devon Hudson
1c2a229655 Update tests/app/test_homeserver_shutdown.py
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-19 20:10:05 +00:00
Devon Hudson
3e37d95dc8 Update synapse/util/async_helpers.py
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-19 20:09:34 +00:00
Devon Hudson
6cc2c2f9df Update synapse/util/__init__.py
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-19 20:09:20 +00:00
Devon Hudson
075ef101f3 Rename lints for clarity 2025-09-19 14:08:45 -06:00
Devon Hudson
c02f0bdb06 Better lint categories 2025-09-19 13:27:50 -06:00
Devon Hudson
0ec5803364 Line wrap lint errors 2025-09-19 13:26:47 -06:00
Devon Hudson
e383758eb2 Remove args that aren't args anymore 2025-09-19 13:25:00 -06:00
Devon Hudson
f8a5bed8f8 Merge branch 'develop' into devon/clean-shutdown 2025-09-18 15:16:04 -06:00
Devon Hudson
a1e84145aa Update docstrings around using internal Clock 2025-09-18 15:04:17 -06:00
Devon Hudson
da4bdfec44 Change wording of freeze docstring 2025-09-18 14:40:33 -06:00
Devon Hudson
daa6c3eecc Update mypy ignore comments in tests 2025-09-18 12:02:00 -06:00
Devon Hudson
17d012d732 Remove cast 2025-09-18 11:55:05 -06:00
Devon Hudson
9fe95a4e99 Add comment explaining lack of shutdown 2025-09-18 11:22:53 -06:00
Devon Hudson
9123ec710b Update docstring 2025-09-18 11:18:29 -06:00
Devon Hudson
0bb0e6ef8d Update docstring 2025-09-18 11:17:57 -06:00
Devon Hudson
ecb2608880 Add comment to clock about lints 2025-09-18 11:15:53 -06:00
Devon Hudson
8f57498f5f Refactor sighup callbacks 2025-09-18 11:12:06 -06:00
Devon Hudson
5f167ff6e2 Update synapse/notifier.py
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-18 17:07:26 +00:00
Devon Hudson
fe3491b2ef Switch arg default to True 2025-09-18 09:20:47 -06:00
Devon Hudson
2465c2f872 Change call_later to set default cancel arg 2025-09-18 09:01:20 -06:00
Devon Hudson
37f971373f Extend docstring for homeserver_instance_id arg 2025-09-18 08:03:12 -06:00
Devon Hudson
83c84a0568 Make timeout_deferred use kwargs 2025-09-16 10:24:05 -06:00
Devon Hudson
4636d1f219 Propagate cancel_on_shutdown up to timeout_deferred 2025-09-16 10:19:33 -06:00
Devon Hudson
7d5f6e0551 Add comment explaining lack of tracking for response cache 2025-09-16 08:58:38 -06:00
Devon Hudson
c2d47d823e Fix call_later call cancellation 2025-09-15 19:11:01 -06:00
Devon Hudson
f3d8c17dbd Document calls to call_later 2025-09-15 19:10:45 -06:00
Devon Hudson
18ed2d14a5 Fix sleep call 2025-09-12 17:27:11 -06:00
Devon Hudson
2a4bae56c8 Add comments to call_later tracking choice 2025-09-12 17:23:58 -06:00
Devon Hudson
6b643f2b09 Remove unused optional arg 2025-09-12 16:58:28 -06:00
Devon Hudson
3fe65ab88c Explain mypy ignores 2025-09-12 16:22:42 -06:00
Devon Hudson
0337e1b6e9 Flush out freeze comments 2025-09-12 15:45:42 -06:00
Devon Hudson
0c10de30ed Update docstring for freeze arg 2025-09-12 15:41:16 -06:00
Devon Hudson
a72ce6f5f7 Update changelog.d/18828.feature
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-12 21:36:37 +00:00
Devon Hudson
5b6006486e Add docstring for delay tracking threshold 2025-09-12 15:35:52 -06:00
Devon Hudson
e7431d05ac Indent docstring 2025-09-12 15:32:22 -06:00
Devon Hudson
e3560245d6 Flush out docstring 2025-09-12 15:31:05 -06:00
Devon Hudson
ad6cdf476f Rename variable 2025-09-12 15:29:21 -06:00
Devon Hudson
83680f3f6e Update synapse/util/__init__.py
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-12 21:26:51 +00:00
Devon Hudson
996b924c83 Rename variable 2025-09-12 15:24:53 -06:00
Devon Hudson
4d87c26250 Update synapse/server.py
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-12 21:22:25 +00:00
Devon Hudson
01c9de60ee Remove unnecessary cleanup_metrics 2025-09-12 15:18:20 -06:00
Devon Hudson
49e0df21b6 Update synapse/server.py
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-12 20:51:11 +00:00
Devon Hudson
fde361a7c3 Revert log to debug 2025-09-12 14:49:45 -06:00
Devon Hudson
4023df93d1 Update scripts-dev/mypy_synapse_plugin.py
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-12 20:48:54 +00:00
Devon Hudson
3da4253f40 Fix wording 2025-09-12 14:41:25 -06:00
Devon Hudson
e278d38e01 Update scripts-dev/mypy_synapse_plugin.py
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-12 20:40:54 +00:00
Devon Hudson
169bc53a7d Make docstring more verbose 2025-09-12 12:47:47 -06:00
Devon Hudson
83be0161f4 Add reasoning for unregister_sighups being first 2025-09-12 09:00:02 -06:00
Devon Hudson
d6bdce6fba Add test for clean server shutdown 2025-09-11 17:59:29 -06:00
Devon Hudson
42d990da40 Fully shutdown background updater 2025-09-11 17:59:09 -06:00
Devon Hudson
d03ec2f62c Remove delayed calls that raise 2025-09-11 17:58:48 -06:00
Devon Hudson
03016753ce Readd mistakenly removed test 2025-09-11 09:36:29 -06:00
Devon Hudson
b117145417 Document function arg 2025-09-10 16:54:34 -06:00
Devon Hudson
2337b64e6c Change Clock to conditionally track calls for cleanup 2025-09-10 16:44:51 -06:00
Devon Hudson
a9df2ba5ff Add lint for using our internal Clock 2025-09-10 15:37:47 -06:00
Devon Hudson
a86bfe0cb3 Remove unnecessary TODO 2025-09-09 14:23:02 -06:00
Devon Hudson
da6f85ee5d Force shutdown handler registration to use keyword arguments 2025-09-09 14:05:47 -06:00
Devon Hudson
dafba10620 Readd metric_name to metrics hooks 2025-09-09 13:39:34 -06:00
Devon Hudson
4cd3d9172e Merge branch 'develop' into devon/clean-shutdown 2025-09-09 09:58:16 -06:00
Devon Hudson
9403bbdd9d Make clock variables instance specific 2025-09-09 09:49:24 -06:00
Devon Hudson
23e587f655 Merge branch 'develop' into devon/clean-shutdown 2025-09-09 14:24:23 +00:00
Devon Hudson
9202f50db5 Fully cleanup federation on shutdown 2025-09-08 17:07:09 -06:00
Devon Hudson
5dce3938e3 Remove old test 2025-09-08 17:06:47 -06:00
Devon Hudson
267da3eaed Refactor clock variable names 2025-09-08 17:06:25 -06:00
Devon Hudson
0c63671545 Fix linter error 2025-09-05 17:20:42 -06:00
Devon Hudson
ccf2585a4c Update synapse/server.py
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-05 23:18:00 +00:00
Devon Hudson
f814dd0411 Fix incorrect servername change 2025-09-05 17:17:27 -06:00
Devon Hudson
79e84ebb6f Update var name 2025-09-05 17:13:09 -06:00
Devon Hudson
672adc2a9b Fix function name 2025-09-05 17:11:44 -06:00
Devon Hudson
4a6ead1e90 Further explain docstring args 2025-09-05 17:06:27 -06:00
Devon Hudson
c3856ac65f Update test shutdown comment 2025-09-05 17:03:25 -06:00
Devon Hudson
ac8ecb1e24 Call hs.shutdown during tests 2025-09-05 16:42:28 -06:00
Devon Hudson
1f334efcb6 Move up unregister_sighups call 2025-09-05 14:40:43 -06:00
Devon Hudson
7b64868b6e Make homeserver shutdown async 2025-09-05 14:31:59 -06:00
Devon Hudson
cf10f45b0e Rename sighup map 2025-09-05 11:03:32 -06:00
Devon Hudson
219d00293a Update synapse/app/_base.py
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-05 17:01:23 +00:00
Devon Hudson
41734ca91a Add docstring for unregister_sighups 2025-09-05 11:00:29 -06:00
Devon Hudson
cb2f562e55 Add docstring for freeze 2025-09-05 10:59:16 -06:00
Devon Hudson
a53a7dfa88 Add return docstring 2025-09-05 10:54:26 -06:00
Devon Hudson
073734a409 Add instance_id to docstring 2025-09-05 10:52:37 -06:00
Devon Hudson
070f3026e5 Fix linter errors 2025-09-05 10:51:22 -06:00
Devon Hudson
e8a145b3eb More docstrings 2025-09-05 10:49:14 -06:00
Devon Hudson
27f40390b3 Update synapse/app/_base.py
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-05 16:48:38 +00:00
Devon Hudson
7fc17a887c Update synapse/_scripts/synapse_port_db.py
Co-authored-by: Eric Eastwood <erice@element.io>
2025-09-05 16:48:27 +00:00
Devon Hudson
ed1b436362 Add comment explaining need for type ignore 2025-09-05 09:48:19 -06:00
Devon Hudson
57b030d1cc Remove unused code 2025-09-05 09:31:16 -06:00
Devon Hudson
f00d90b0ac Reword SynapseProtocol docstrings 2025-09-05 09:20:32 -06:00
Devon Hudson
42f3e8fadd Callout full name of _FakePort in comment 2025-09-05 09:09:15 -06:00
Devon Hudson
a7f394814d Add TODO about shutdown freeze detection 2025-09-05 09:05:28 -06:00
Devon Hudson
62978f159a Remove unnecessary cleanup changes 2025-09-04 14:59:08 -06:00
Devon Hudson
a5ff83edb3 Add docs about using Clock instead of reactor directly 2025-09-04 14:31:40 -06:00
Devon Hudson
4d2e59526a Add comment documenting SynapseSite shutdown 2025-09-04 14:00:19 -06:00
Devon Hudson
1a22e8f0f0 Rename factory arg to site 2025-09-04 13:42:04 -06:00
Devon Hudson
0e67230b2a Clarify our_server_name 2025-09-04 13:35:47 -06:00
Devon Hudson
35f88f4527 Return None instead of raising exception 2025-09-04 13:28:13 -06:00
Devon Hudson
2619070c14 Make clock a required arg for Linearizer 2025-09-04 12:48:05 -06:00
Devon Hudson
b697f9a864 Remove background process registration 2025-09-04 11:59:57 -06:00
Devon Hudson
cc598a6e0a Fix linter error 2025-09-03 14:30:07 -06:00
Devon Hudson
5188a784ff Merge branch 'develop' into devon/clean-shutdown 2025-09-03 14:27:04 -06:00
Devon Hudson
4e7ddb34c7 Tighten type of event trigger id 2025-08-29 18:28:18 -06:00
Devon Hudson
93a218323b Add field descriptions to docs 2025-08-29 18:24:01 -06:00
Devon Hudson
649c182903 Rename var for clarity 2025-08-29 18:16:42 -06:00
Devon Hudson
0f26d20189 Use gauge.remove 2025-08-29 18:15:14 -06:00
Devon Hudson
e99f9cdfc7 Don't rename arg 2025-08-29 18:05:30 -06:00
Devon Hudson
4b67de55cd Assert metrics aren't being clobbered 2025-08-29 17:59:09 -06:00
Devon Hudson
849a093cfa Move server shutdown in tests 2025-08-29 17:40:52 -06:00
Devon Hudson
947cac97c6 Update docstring for setup function 2025-08-29 15:37:28 -06:00
Devon Hudson
4759c558ed Update comment again 2025-08-29 15:09:22 -06:00
Devon Hudson
e4233b9213 Use instance_id instead of server_name 2025-08-29 15:08:49 -06:00
Devon Hudson
9607670aa6 Update comment to reflect new variable 2025-08-29 15:04:35 -06:00
Devon Hudson
25e199a345 Cleanly shutdown open connections to Synapse 2025-08-27 17:01:28 -06:00
Devon Hudson
4f1603d127 Merge branch 'develop' into devon/clean-shutdown 2025-08-20 17:36:06 -06:00
Devon Hudson
b980858ee1 PR cleanup changes 2025-08-20 17:31:35 -06:00
Devon Hudson
25fdddd521 Revert unnecessary logic changes 2025-08-20 17:03:44 -06:00
Devon Hudson
1bd2ad3b47 Remove unnecessary cleanup step 2025-08-20 16:37:02 -06:00
Devon Hudson
7804e16c86 Apply formatting fixes 2025-08-20 13:54:04 -06:00
Devon Hudson
42882e7c01 Optionally freeze gc objects at startup 2025-08-20 13:53:18 -06:00
Devon Hudson
60c8088404 Fix formatting 2025-08-20 13:45:46 -06:00
Devon Hudson
02112fa55c Add args to registered shutdown handlers 2025-08-20 13:44:18 -06:00
Devon Hudson
51d0757300 Refactor shutdown handler registration function 2025-08-20 11:55:38 -06:00
Devon Hudson
fa978970e6 Remove shutdown function override 2025-08-20 11:44:07 -06:00
Devon Hudson
44e48f6958 Fix linter errors 2025-08-20 11:40:07 -06:00
Devon Hudson
a77e41b98b Cleanly shutdown metrics servers 2025-08-20 09:43:34 -06:00
Devon Hudson
ad6f02e695 Reset accidental constant value change 2025-08-19 16:58:30 -06:00
Devon Hudson
da4aa351e0 Address more TODO comments 2025-08-19 16:58:09 -06:00
Devon Hudson
35ee71ad3c Reintroduce code that no longer blocks clean shutdown 2025-08-19 16:08:24 -06:00
Devon Hudson
470978a9a9 Remove outdated TODO 2025-08-19 16:07:15 -06:00
Devon Hudson
667351ac5c Add manhole port to shutdown list 2025-08-19 16:02:39 -06:00
Devon Hudson
a3160908ef Add shutdown call to keyring mock 2025-08-19 15:13:59 -06:00
Devon Hudson
2b38ed02c3 Clean up batching queue metrics shutdown 2025-08-19 14:47:36 -06:00
Devon Hudson
a94c483c27 Fix type comparison 2025-08-19 12:39:29 -06:00
Devon Hudson
287fc3c18c Modify tests to shutdown homeserver on teardown 2025-08-19 12:35:07 -06:00
Devon Hudson
f26088b1c4 Add import 2025-08-19 11:38:18 -06:00
Devon Hudson
35fc370d1c Add new func to MockHomeserver 2025-08-19 11:23:36 -06:00
Devon Hudson
adeba65940 Remove unnecessary redirect 2025-08-19 11:15:23 -06:00
Devon Hudson
aa62ad866f Address linter errors 2025-08-19 10:49:55 -06:00
Devon Hudson
fe7548102b Add changelog entry 2025-08-15 17:54:29 -06:00
Devon Hudson
94a1bbb0e4 Cleanup unnecessary teardown attempts 2025-08-15 17:51:46 -06:00
Devon Hudson
5a4dd914c9 Merge branch 'develop' into devon/clean-shutdown 2025-08-15 16:53:50 -06:00
Devon Hudson
144bff02dc Remove unused code 2025-08-15 16:21:19 -06:00
Devon Hudson
31a607abee Make clock capable of cleaning up its own outstanding calls 2025-08-15 16:08:18 -06:00
Devon Hudson
5078451bdc Cleanup SynapseHomeServer without weakrefs 2025-08-14 14:26:30 -06:00
Devon Hudson
df7b437f12 Revert "Temporarily disable all tests that call generate_latest"
This reverts commit 61508a6c26.
2025-08-06 15:21:25 -06:00
Devon Hudson
61508a6c26 Temporarily disable all tests that call generate_latest 2025-08-06 15:18:48 -06:00
Devon Hudson
c283db8a06 More shutdown cleanup 2025-08-06 14:54:20 -06:00
Devon Hudson
7a9725c5ce WIP on clean shutdown of SynapseHomeServer class 2025-07-30 15:41:31 -06:00
214 changed files with 4118 additions and 1197 deletions

View File

@@ -0,0 +1,29 @@
#!/usr/bin/env bash
set -euo pipefail
# 1) Resolve project ID.
PROJECT_ID=$(gh project view "$PROJECT_NUMBER" --owner "$PROJECT_OWNER" --format json | jq -r '.id')
# 2) Find existing item (project card) for this issue.
ITEM_ID=$(
gh project item-list "$PROJECT_NUMBER" --owner "$PROJECT_OWNER" --format json \
| jq -r --arg url "$ISSUE_URL" '.items[] | select(.content.url==$url) | .id' | head -n1
)
# 3) If one doesn't exist, add this issue to the project.
if [ -z "${ITEM_ID:-}" ]; then
ITEM_ID=$(gh project item-add "$PROJECT_NUMBER" --owner "$PROJECT_OWNER" --url "$ISSUE_URL" --format json | jq -r '.id')
fi
# 4) Get Status field id + the option id for TARGET_STATUS.
FIELDS_JSON=$(gh project field-list "$PROJECT_NUMBER" --owner "$PROJECT_OWNER" --format json)
STATUS_FIELD=$(echo "$FIELDS_JSON" | jq -r '.fields[] | select(.name=="Status")')
STATUS_FIELD_ID=$(echo "$STATUS_FIELD" | jq -r '.id')
OPTION_ID=$(echo "$STATUS_FIELD" | jq -r --arg name "$TARGET_STATUS" '.options[] | select(.name==$name) | .id')
if [ -z "${OPTION_ID:-}" ]; then
echo "No Status option named \"$TARGET_STATUS\" found"; exit 1
fi
# 5) Set Status (moves item to the matching column in the board view).
gh project item-edit --id "$ITEM_ID" --project-id "$PROJECT_ID" --field-id "$STATUS_FIELD_ID" --single-select-option-id "$OPTION_ID"

View File

@@ -25,7 +25,7 @@ jobs:
with:
toolchain: ${{ env.RUST_VERSION }}
components: clippy, rustfmt
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- name: Setup Poetry
uses: matrix-org/setup-python-poetry@5bbf6603c5c930615ec8a29f1b5d7d258d905aa4 # v2.0.0

View File

@@ -47,7 +47,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
# The dev dependencies aren't exposed in the wheel metadata (at least with current
# poetry-core versions), so we install with poetry.
@@ -83,7 +83,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- run: sudo apt-get -qq install xmlsec1
- name: Set up PostgreSQL ${{ matrix.postgres-version }}
@@ -158,7 +158,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- name: Ensure sytest runs `pip install`
# Delete the lockfile so sytest will `pip install` rather than `poetry install`

View File

@@ -91,7 +91,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- uses: matrix-org/setup-python-poetry@5bbf6603c5c930615ec8a29f1b5d7d258d905aa4 # v2.0.0
with:
python-version: "3.x"
@@ -157,7 +157,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- name: Setup Poetry
uses: matrix-org/setup-python-poetry@5bbf6603c5c930615ec8a29f1b5d7d258d905aa4 # v2.0.0
@@ -220,7 +220,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- uses: matrix-org/setup-python-poetry@5bbf6603c5c930615ec8a29f1b5d7d258d905aa4 # v2.0.0
with:
poetry-version: "2.1.1"
@@ -240,7 +240,7 @@ jobs:
with:
components: clippy
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- run: cargo clippy -- -D warnings
@@ -259,7 +259,7 @@ jobs:
with:
toolchain: nightly-2025-04-23
components: clippy
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- run: cargo clippy --all-features -- -D warnings
@@ -276,7 +276,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- name: Setup Poetry
uses: matrix-org/setup-python-poetry@5bbf6603c5c930615ec8a29f1b5d7d258d905aa4 # v2.0.0
@@ -315,7 +315,7 @@ jobs:
# `.rustfmt.toml`.
toolchain: nightly-2025-04-23
components: rustfmt
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- run: cargo fmt --check
@@ -415,7 +415,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- uses: matrix-org/setup-python-poetry@5bbf6603c5c930615ec8a29f1b5d7d258d905aa4 # v2.0.0
with:
@@ -459,7 +459,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
# There aren't wheels for some of the older deps, so we need to install
# their build dependencies
@@ -576,7 +576,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- name: Run SyTest
run: /bootstrap.sh synapse
@@ -722,7 +722,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- name: Prepare Complement's Prerequisites
run: synapse/.ci/scripts/setup_complement_prerequisites.sh
@@ -756,7 +756,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- run: cargo test
@@ -776,7 +776,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: nightly-2022-12-01
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- run: cargo bench --no-run

View File

@@ -6,43 +6,26 @@ on:
jobs:
move_needs_info:
name: Move X-Needs-Info on the triage board
runs-on: ubuntu-latest
if: >
contains(github.event.issue.labels.*.name, 'X-Needs-Info')
permissions:
contents: read
env:
# This token must have the following scopes: ["repo:public_repo", "admin:org->read:org", "user->read:user", "project"]
GITHUB_TOKEN: ${{ secrets.ELEMENT_BOT_TOKEN }}
PROJECT_OWNER: matrix-org
# Backend issue triage board.
# https://github.com/orgs/matrix-org/projects/67/views/1
PROJECT_NUMBER: 67
ISSUE_URL: ${{ github.event.issue.html_url }}
# This field is case-sensitive.
TARGET_STATUS: Needs info
steps:
- uses: actions/add-to-project@4515659e2b458b27365e167605ac44f219494b66 # v1.0.2
id: add_project
- uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
project-url: "https://github.com/orgs/matrix-org/projects/67"
github-token: ${{ secrets.ELEMENT_BOT_TOKEN }}
# This action will error if the issue already exists on the project. Which is
# common as `X-Needs-Info` will often be added to issues that are already in
# the triage queue. Prevent the whole job from failing in this case.
continue-on-error: true
- name: Set status
env:
GITHUB_TOKEN: ${{ secrets.ELEMENT_BOT_TOKEN }}
run: |
gh api graphql -f query='
mutation(
$project: ID!
$item: ID!
$fieldid: ID!
$columnid: String!
) {
updateProjectV2ItemFieldValue(
input: {
projectId: $project
itemId: $item
fieldId: $fieldid
value: {
singleSelectOptionId: $columnid
}
}
) {
projectV2Item {
id
}
}
}' -f project="PVT_kwDOAIB0Bs4AFDdZ" -f item=${{ steps.add_project.outputs.itemId }} -f fieldid="PVTSSF_lADOAIB0Bs4AFDdZzgC6ZA4" -f columnid=ba22e43c --silent
# Only clone the script file we care about, instead of the whole repo.
sparse-checkout: .ci/scripts/triage_labelled_issue.sh
- name: Ensure issue exists on the board, then set Status
run: .ci/scripts/triage_labelled_issue.sh

View File

@@ -49,7 +49,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- uses: matrix-org/setup-python-poetry@5bbf6603c5c930615ec8a29f1b5d7d258d905aa4 # v2.0.0
with:
@@ -77,7 +77,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- uses: matrix-org/setup-python-poetry@5bbf6603c5c930615ec8a29f1b5d7d258d905aa4 # v2.0.0
with:
@@ -123,7 +123,7 @@ jobs:
uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
with:
toolchain: ${{ env.RUST_VERSION }}
- uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
- uses: Swatinem/rust-cache@f13886b937689c021905a6b90929199931d60db1 # v2.8.1
- name: Patch dependencies
# Note: The poetry commands want to create a virtualenv in /src/.venv/,

View File

@@ -11,8 +11,7 @@
## Internal Changes
- Drop support for Ubuntu 24.10 Oracular Oriole, and add support for Ubuntu 25.04 Plucky Puffin. ([\#18962](https://github.com/element-hq/synapse/issues/18962))
- Drop support for Ubuntu 24.10 Oracular Oriole, and add support for Ubuntu 25.04 Plucky Puffin. This change was applied on top of 1.139.0rc1. ([\#18962](https://github.com/element-hq/synapse/issues/18962))
@@ -83,6 +82,23 @@
* Bump types-requests from 2.32.4.20250611 to 2.32.4.20250809. ([\#18895](https://github.com/element-hq/synapse/issues/18895))
* Bump types-setuptools from 80.9.0.20250809 to 80.9.0.20250822. ([\#18924](https://github.com/element-hq/synapse/issues/18924))
# Synapse 1.138.2 (2025-09-24)
## Internal Changes
- Drop support for Ubuntu 24.10 Oracular Oriole, and add support for Ubuntu 25.04 Plucky Puffin. This change was applied on top of 1.138.1. ([\#18962](https://github.com/element-hq/synapse/issues/18962))
# Synapse 1.138.1 (2025-09-24)
## Bugfixes
- Fix a performance regression related to the experimental Delayed Events ([MSC4140](https://github.com/matrix-org/matrix-spec-proposals/pull/4140)) feature. ([\#18926](https://github.com/element-hq/synapse/issues/18926))
# Synapse 1.138.0 (2025-09-09)
No significant changes since 1.138.0rc1.

16
Cargo.lock generated
View File

@@ -28,9 +28,9 @@ dependencies = [
[[package]]
name = "anyhow"
version = "1.0.99"
version = "1.0.100"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b0674a1ddeecb70197781e945de4b3b8ffb61fa939a5597bcf48503737663100"
checksum = "a23eb6b1614318a8071c9b2521f36b424b2c83db5eb3a0fead4a6c0809af6e61"
[[package]]
name = "arc-swap"
@@ -1250,9 +1250,9 @@ dependencies = [
[[package]]
name = "serde"
version = "1.0.224"
version = "1.0.226"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6aaeb1e94f53b16384af593c71e20b095e958dab1d26939c1b70645c5cfbcc0b"
checksum = "0dca6411025b24b60bfa7ec1fe1f8e710ac09782dca409ee8237ba74b51295fd"
dependencies = [
"serde_core",
"serde_derive",
@@ -1260,18 +1260,18 @@ dependencies = [
[[package]]
name = "serde_core"
version = "1.0.224"
version = "1.0.226"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "32f39390fa6346e24defbcdd3d9544ba8a19985d0af74df8501fbfe9a64341ab"
checksum = "ba2ba63999edb9dac981fb34b3e5c0d111a69b0924e253ed29d83f7c99e966a4"
dependencies = [
"serde_derive",
]
[[package]]
name = "serde_derive"
version = "1.0.224"
version = "1.0.226"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "87ff78ab5e8561c9a675bfc1785cb07ae721f0ee53329a595cefd8c04c2ac4e0"
checksum = "8db53ae22f34573731bafa1db20f04027b2d25e02d8205921b569171699cdb33"
dependencies = [
"proc-macro2",
"quote",

View File

@@ -265,6 +265,8 @@ This software is dual-licensed by New Vector Ltd (Element). It can be used eithe
Unless required by applicable law or agreed to in writing, software distributed under the Licenses is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the Licenses for the specific language governing permissions and limitations under the Licenses.
Please contact `licensing@element.io <mailto:licensing@element.io>`_ to purchase an Element commercial license for this software.
.. |support| image:: https://img.shields.io/badge/matrix-community%20support-success
:alt: (get community support in #synapse:matrix.org)

1
changelog.d/18721.bugfix Normal file
View File

@@ -0,0 +1 @@
Fix room upgrade `room_config` argument and documentation for `user_may_create_room` spam-checker callback.

1
changelog.d/18767.misc Normal file
View File

@@ -0,0 +1 @@
Update OEmbed providers to use 'X' instead of 'Twitter' in URL previews, following a rebrand. Contributed by @HammyHavoc.

View File

@@ -0,0 +1 @@
Cleanly shutdown `SynapseHomeServer` object.

1
changelog.d/18868.misc Normal file
View File

@@ -0,0 +1 @@
Fix `server_name` in logging context for multiple Synapse instances in one process.

View File

@@ -0,0 +1,2 @@
Add an Admin API that allows server admins to query and investigate the metadata of local or cached remote media via
the `origin/media_id` identifier found in a [Matrix Content URI](https://spec.matrix.org/v1.14/client-server-api/#matrix-content-mxc-uris).

1
changelog.d/18913.misc Normal file
View File

@@ -0,0 +1 @@
Fix the GitHub Actions workflow that moves issues labeled "X-Needs-Info" to the "Needs info" column on the team's internal triage board.

1
changelog.d/18914.doc Normal file
View File

@@ -0,0 +1 @@
Explain how Deferred callbacks interact with logcontexts.

1
changelog.d/18932.misc Normal file
View File

@@ -0,0 +1 @@
Disconnect background process work from request trace.

View File

@@ -0,0 +1 @@
Update [MSC4284: Policy Servers](https://github.com/matrix-org/matrix-spec-proposals/pull/4284) implementation to support signatures when available.

1
changelog.d/18939.misc Normal file
View File

@@ -0,0 +1 @@
Reduce overall number of calls to `_get_e2e_cross_signing_signatures_for_devices` by increasing the batch size of devices the query is called with, reducing DB load.

1
changelog.d/18947.misc Normal file
View File

@@ -0,0 +1 @@
Update error code used when an appservice tries to masquerade as an unknown device using [MSC4326](https://github.com/matrix-org/matrix-spec-proposals/pull/4326). Contributed by @tulir @ Beeper.

1
changelog.d/18948.bugfix Normal file
View File

@@ -0,0 +1 @@
Compute a user's last seen timestamp from their devices' last seen timestamps instead of IPs, because the latter are automatically cleared according to `user_ips_max_age`.

1
changelog.d/18959.misc Normal file
View File

@@ -0,0 +1 @@
Fix `no active span when trying to log` tracing error on startup (when OpenTracing is enabled).

1
changelog.d/18964.misc Normal file
View File

@@ -0,0 +1 @@
Fix `run_coroutine_in_background(...)` incorrectly handling logcontext.

1
changelog.d/18971.misc Normal file
View File

@@ -0,0 +1 @@
Update dockerfile metadata to fix broken link; point to documentation website.

1
changelog.d/18973.misc Normal file
View File

@@ -0,0 +1 @@
Note that the code is additionally licensed under the [Element Commercial license](https://github.com/element-hq/synapse/blob/develop/LICENSE-COMMERCIAL) in SPDX expression field configs.

1
changelog.d/18974.misc Normal file
View File

@@ -0,0 +1 @@
Fix logcontext handling in `timeout_deferred` tests.

20
debian/changelog vendored
View File

@@ -1,3 +1,11 @@
matrix-synapse-py3 (1.139.0~rc3+nmu1) UNRELEASED; urgency=medium
* The licensing specifier has been updated to add an optional
`LicenseRef-Element-Commercial` license. The code was already licensed in
this manner - the debian metadata was just not updated to reflect it.
-- Synapse Packaging team <packages@matrix.org> Thu, 25 Sep 2025 12:17:17 +0100
matrix-synapse-py3 (1.139.0~rc3) stable; urgency=medium
* New Synapse release 1.139.0rc3.
@@ -16,6 +24,18 @@ matrix-synapse-py3 (1.139.0~rc1) stable; urgency=medium
-- Synapse Packaging team <packages@matrix.org> Tue, 23 Sep 2025 13:24:50 +0100
matrix-synapse-py3 (1.138.2) stable; urgency=medium
* New Synapse release 1.138.2.
-- Synapse Packaging team <packages@matrix.org> Wed, 24 Sep 2025 12:26:16 +0100
matrix-synapse-py3 (1.138.1) stable; urgency=medium
* New Synapse release 1.138.1.
-- Synapse Packaging team <packages@matrix.org> Wed, 24 Sep 2025 11:32:38 +0100
matrix-synapse-py3 (1.138.0) stable; urgency=medium
* New Synapse release 1.138.0.

2
debian/copyright vendored
View File

@@ -8,7 +8,7 @@ License: Apache-2.0
Files: *
Copyright: 2023 New Vector Ltd
License: AGPL-3.0-or-later
License: AGPL-3.0-or-later or LicenseRef-Element-Commercial
Files: synapse/config/saml2.py
Copyright: 2015, Ericsson

View File

@@ -171,10 +171,10 @@ FROM docker.io/library/python:${PYTHON_VERSION}-slim-${DEBIAN_VERSION}
ARG TARGETARCH
LABEL org.opencontainers.image.url='https://matrix.org/docs/projects/server/synapse'
LABEL org.opencontainers.image.documentation='https://github.com/element-hq/synapse/blob/master/docker/README.md'
LABEL org.opencontainers.image.url='https://github.com/element-hq/synapse'
LABEL org.opencontainers.image.documentation='https://element-hq.github.io/synapse/latest/'
LABEL org.opencontainers.image.source='https://github.com/element-hq/synapse.git'
LABEL org.opencontainers.image.licenses='AGPL-3.0-or-later'
LABEL org.opencontainers.image.licenses='AGPL-3.0-or-later OR LicenseRef-Element-Commercial'
# On the runtime image, /lib is a symlink to /usr/lib, so we need to copy the
# libraries to the right place, else the `COPY` won't work.

View File

@@ -39,6 +39,40 @@ the use of the
[List media uploaded by a user](user_admin_api.md#list-media-uploaded-by-a-user)
Admin API.
## Query a piece of media by ID
This API returns information about a piece of local or cached remote media, given the origin server name and media ID. If
information is requested for remote media which is not cached, the endpoint will return 404.
Request:
```http
GET /_synapse/admin/v1/media/<origin>/<media_id>
```
The API returns a JSON body with media info like the following:
Response:
```json
{
"media_info": {
"media_origin": "remote.com",
"user_id": null,
"media_id": "sdginwegWEG",
"media_type": "img/png",
"media_length": 67,
"upload_name": "test.png",
"created_ts": 300,
"filesystem_id": "wgeweg",
"url_cache": null,
"last_access_ts": 400,
"quarantined_by": null,
"authenticated": false,
"safe_from_quarantine": null,
"sha256": "ebf4f635a17d10d6eb46ba680b70142419aa3220f228001a036d311a22ee9d2a"
}
}
```
# Quarantine media
Quarantining media means that it is marked as inaccessible by users. It applies

View File

@@ -143,8 +143,7 @@ cares about.
The following sections describe pitfalls and helpful patterns when
implementing these rules.
Always await your awaitables
----------------------------
## Always await your awaitables
Whenever you get an awaitable back from a function, you should `await` on
it as soon as possible. Do not pass go; do not do any logging; do not
@@ -203,6 +202,171 @@ async def sleep(seconds):
return await context.make_deferred_yieldable(get_sleep_deferred(seconds))
```
## Deferred callbacks
When a deferred callback is called, it inherits the current logcontext. The deferred
callback chain can resume a coroutine, which if following our logcontext rules, will
restore its own logcontext, then run:
- until it yields control back to the reactor, setting the sentinel logcontext
- or until it finishes, restoring the logcontext it was started with (calling context)
This behavior creates two specific issues:
**Issue 1:** The first issue is that the callback may have reset the logcontext to the
sentinel before returning. This means our calling function will continue with the
sentinel logcontext instead of the logcontext it was started with (bad).
**Issue 2:** The second issue is that the current logcontext that called the deferred
callback could finish before the callback finishes (bad).
In the following example, the deferred callback is called with the "main" logcontext and
runs until we yield control back to the reactor in the `await` inside `clock.sleep(0)`.
Since `clock.sleep(0)` follows our logcontext rules, it sets the logcontext to the
sentinel before yielding control back to the reactor. Our `main` function continues with
the sentinel logcontext (first bad thing) instead of the "main" logcontext. Then the
`with LoggingContext("main")` block exits, finishing the "main" logcontext and yielding
control back to the reactor again. Finally, later on when `clock.sleep(0)` completes,
our `with LoggingContext("competing")` block exits, and restores the previous "main"
logcontext which has already finished, resulting in `WARNING: Re-starting finished log
context main` and leaking the `main` logcontext into the reactor which will then
erronously be associated with the next task the reactor picks up.
```python
async def competing_callback():
# Since this is run with the "main" logcontext, when the "competing"
# logcontext exits, it will restore the previous "main" logcontext which has
# already finished and results in "WARNING: Re-starting finished log context main"
# and leaking the `main` logcontext into the reactor.
with LoggingContext("competing"):
await clock.sleep(0)
def main():
with LoggingContext("main"):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
# Call the callback within the "main" logcontext.
d.callback(None)
# Bad: This will be logged against sentinel logcontext
logger.debug("ugh")
main()
```
**Solution 1:** We could of course fix this by following the general rule of "always
await your awaitables":
```python
async def main():
with LoggingContext("main"):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
d.callback(None)
# Wait for `d` to finish before continuing so the "main" logcontext is
# still active. This works because `d` already follows our logcontext
# rules. If not, we would also have to use `make_deferred_yieldable(d)`.
await d
# Good: This will be logged against the "main" logcontext
logger.debug("phew")
```
**Solution 2:** We could also fix this by surrounding the call to `d.callback` with a
`PreserveLoggingContext`, which will reset the logcontext to the sentinel before calling
the callback, and restore the "main" logcontext afterwards before continuing the `main`
function. This solves the problem because when the "competing" logcontext exits, it will
restore the sentinel logcontext which is never finished by its nature, so there is no
warning and no leakage into the reactor.
```python
async def main():
with LoggingContext("main"):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
with PreserveLoggingContext():
# Call the callback with the sentinel logcontext.
d.callback(None)
# Good: This will be logged against the "main" logcontext
logger.debug("phew")
```
**Solution 3:** But let's say you *do* want to run (fire-and-forget) the deferred
callback in the current context without running into issues:
We can solve the first issue by using `run_in_background(...)` to run the callback in
the current logcontext. It handles the magic behind the scenes of a) restoring the
calling logcontext before returning to the caller, and b) resetting the logcontext to the
sentinel after the deferred completes and we yield control back to the reactor, to avoid
leaking the logcontext into the reactor.
To solve the second issue, we can extend the lifetime of the "main" logcontext by
avoiding the `LoggingContext`'s context manager lifetime methods
(`__enter__`/`__exit__`). We can still set "main" as the current logcontext by using
`PreserveLoggingContext` and passing in the "main" logcontext.
```python
async def main():
main_context = LoggingContext("main")
with PreserveLoggingContext(main_context):
d = defer.Deferred()
d.addCallback(lambda _: defer.ensureDeferred(competing_callback()))
# The whole lambda will be run in the "main" logcontext. But we're using
# a trick to return the deferred `d` itself so that `run_in_background`
# will wait on that to complete and reset the logcontext to the sentinel
# when it does to avoid leaking the "main" logcontext into the reactor.
run_in_background(lambda: (d.callback(None), d)[1])
# Good: This will be logged against the "main" logcontext
logger.debug("phew")
...
# Wherever possible, it's best to finish the logcontext by calling `__exit__` at some
# point. This allows us to catch bugs if we later try to erroneously restart a finished
# logcontext.
#
# Since the "main" logcontext stores the `LoggingContext.previous_context` when it is
# created, we can wrap this call in `PreserveLoggingContext()` to restore the correct
# previous logcontext. Our goal is to have the calling context remain unchanged after
# finishing the "main" logcontext.
with PreserveLoggingContext():
# Finish the "main" logcontext
with main_context:
# Empty block - We're just trying to call `__exit__` on the "main" context
# manager to finish it. We can't call `__exit__` directly as the code expects us
# to `__enter__` before calling `__exit__` to `start`/`stop` things
# appropriately. And in any case, it's probably best not to call the internal
# methods directly.
pass
```
The same thing applies if you have deferreds stored somewhere that you want to call
back in the current logcontext.
### Deferred errbacks and cancellations
The same care should be taken when calling errbacks on deferreds. An errback and
callback act the same in this regard (see section above).
```python
d = defer.Deferred()
d.addErrback(some_other_function)
d.errback(failure)
```
Additionally, cancellation is the same as directly calling the errback with a
`twisted.internet.defer.CancelledError`:
```python
d = defer.Deferred()
d.addErrback(some_other_function)
d.cancel()
```
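As a minimal sketch applying Solution 2 above to cancellation: wrapping the `cancel()` call in `PreserveLoggingContext` fires the errback chain with the sentinel logcontext, so a suspending errback cannot leak or finish the current logcontext (names follow the snippets above):
```python
d = defer.Deferred()
d.addErrback(some_other_function)

with PreserveLoggingContext():
    # Fire the errback chain (a CancelledError) with the sentinel logcontext,
    # just as we did for `d.callback` in Solution 2 above.
    d.cancel()
```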
## Fire-and-forget
Sometimes you want to fire off a chain of execution, but not wait for

View File

@@ -195,12 +195,15 @@ _Changed in Synapse v1.132.0: Added the `room_config` argument. Callbacks that o
async def user_may_create_room(user_id: str, room_config: synapse.module_api.JsonDict) -> Union["synapse.module_api.NOT_SPAM", "synapse.module_api.errors.Codes", bool]
```
Called when processing a room creation request.
Called when processing a room creation or room upgrade request.
The arguments passed to this callback are:
* `user_id`: The Matrix user ID of the user (e.g. `@alice:example.com`).
* `room_config`: The contents of the body of a [/createRoom request](https://spec.matrix.org/latest/client-server-api/#post_matrixclientv3createroom) as a dictionary.
* `room_config`: The contents of the body of the [`/createRoom` request](https://spec.matrix.org/v1.15/client-server-api/#post_matrixclientv3createroom) as a dictionary.
For a [room upgrade request](https://spec.matrix.org/v1.15/client-server-api/#post_matrixclientv3roomsroomidupgrade) it is a synthesised subset of what an equivalent
`/createRoom` request would have looked like. Specifically, it contains the `creation_content` (linking to the previous room) and `initial_state` (containing a
subset of the state of the previous room).
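For illustration, a minimal callback sketch under the rules on this page (the blocked-alias check is invented for the example); the full set of permitted return values is listed below:
```python
from typing import Union

import synapse.module_api

async def user_may_create_room(
    user_id: str, room_config: synapse.module_api.JsonDict
) -> Union["synapse.module_api.NOT_SPAM", "synapse.module_api.errors.Codes", bool]:
    # Invented example rule: disallow rooms created with a forbidden alias.
    if room_config.get("room_alias_name") == "forbidden":
        return synapse.module_api.errors.Codes.FORBIDDEN
    return synapse.module_api.NOT_SPAM
```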
The callback must return one of:
- `synapse.module_api.NOT_SPAM`, to allow the operation. Other callbacks may still

View File

@@ -119,12 +119,6 @@ stacking them up. You can monitor the currently running background updates with
# Upgrading to v1.139.0
## Drop support for Ubuntu 24.10 Oracular Oriole, and add support for Ubuntu 25.04 Plucky Puffin
Ubuntu 24.10 Oracular Oriole [has been end-of-life since 10 Jul
2025](https://endoflife.date/ubuntu). This release drops support for Ubuntu
24.10, and in its place adds support for Ubuntu 25.04 Plucky Puffin.
## `/register` requests from old application service implementations may break when using MAS
Application Services that do not set `inhibit_login=true` when calling `POST
@@ -140,6 +134,16 @@ ensure it is up to date. If it is, then kindly let the author know that they
need to update their implementation to call `/register` with
`inhibit_login=true`.
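For illustration, a hedged sketch of a compliant registration call (the homeserver URL and AS token are assumptions; the endpoint and body fields follow the Matrix Client-Server spec):
```python
import requests

HS_URL = "https://synapse.example.com"    # assumed homeserver URL
AS_TOKEN = "<application service token>"  # assumed AS token

resp = requests.post(
    f"{HS_URL}/_matrix/client/v3/register",
    headers={"Authorization": f"Bearer {AS_TOKEN}"},
    json={
        "type": "m.login.application_service",
        "username": "_example_bridge_user",  # invented localpart
        # The important part: ask the homeserver not to return an access
        # token for the registered user.
        "inhibit_login": True,
    },
)
resp.raise_for_status()
```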
# Upgrading to v1.138.2
## Drop support for Ubuntu 24.10 Oracular Oriole, and add support for Ubuntu 25.04 Plucky Puffin
Ubuntu 24.10 Oracular Oriole [has been end-of-life since 10 Jul
2025](https://endoflife.date/ubuntu). This release drops support for Ubuntu
24.10, and in its place adds support for Ubuntu 25.04 Plucky Puffin.
This notice also applies to the v1.139.0 release.
# Upgrading to v1.136.0
## Deprecate `run_as_background_process` exported as part of the module API interface in favor of `ModuleApi.run_as_background_process`
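For illustration, a hedged sketch of the replacement pattern inside a module (the exact `ModuleApi` method signature is assumed from this note, and the module scaffolding is invented):
```python
class ExampleModule:
    def __init__(self, config: dict, api) -> None:
        # `api` is the `synapse.module_api.ModuleApi` instance passed to modules.
        self._api = api
        # Preferred: use the ModuleApi method (assumed to take a description
        # followed by the async callable) so the homeserver can track and
        # cancel the background process on shutdown, rather than the
        # deprecated bare `run_as_background_process` import.
        self._api.run_as_background_process(
            "example_module_task",  # invented description
            self._background_task,
        )

    async def _background_task(self) -> None:
        # Invented placeholder work.
        pass
```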

32
poetry.lock generated
View File

@@ -1,4 +1,4 @@
# This file is automatically @generated by Poetry 2.1.1 and should not be changed by hand.
# This file is automatically @generated by Poetry 2.2.0 and should not be changed by hand.
[[package]]
name = "annotated-types"
@@ -34,15 +34,15 @@ tests-mypy = ["mypy (>=1.11.1) ; platform_python_implementation == \"CPython\" a
[[package]]
name = "authlib"
version = "1.6.3"
version = "1.6.4"
description = "The ultimate Python library in building OAuth and OpenID Connect servers and clients."
optional = true
python-versions = ">=3.9"
groups = ["main"]
markers = "extra == \"all\" or extra == \"jwt\" or extra == \"oidc\""
files = [
{file = "authlib-1.6.3-py2.py3-none-any.whl", hash = "sha256:7ea0f082edd95a03b7b72edac65ec7f8f68d703017d7e37573aee4fc603f2a48"},
{file = "authlib-1.6.3.tar.gz", hash = "sha256:9f7a982cc395de719e4c2215c5707e7ea690ecf84f1ab126f28c053f4219e610"},
{file = "authlib-1.6.4-py2.py3-none-any.whl", hash = "sha256:39313d2a2caac3ecf6d8f95fbebdfd30ae6ea6ae6a6db794d976405fdd9aa796"},
{file = "authlib-1.6.4.tar.gz", hash = "sha256:104b0442a43061dc8bc23b133d1d06a2b0a9c2e3e33f34c4338929e816287649"},
]
[package.dependencies]
@@ -1531,14 +1531,14 @@ files = [
[[package]]
name = "phonenumbers"
version = "9.0.13"
version = "9.0.14"
description = "Python version of Google's common library for parsing, formatting, storing and validating international phone numbers."
optional = false
python-versions = "*"
groups = ["main"]
files = [
{file = "phonenumbers-9.0.13-py2.py3-none-any.whl", hash = "sha256:b97661e177773e7509c6d503e0f537cd0af22aa3746231654590876eb9430915"},
{file = "phonenumbers-9.0.13.tar.gz", hash = "sha256:eca06e01382412c45316868f86a44bb217c02f9ee7196589041556a2f54a7639"},
{file = "phonenumbers-9.0.14-py2.py3-none-any.whl", hash = "sha256:6bdf5c46dbfefa1d941d122432d1958418d1dfe3f8c8c81d4c8e80f5442ea41f"},
{file = "phonenumbers-9.0.14.tar.gz", hash = "sha256:98afb3e86bf9ae02cc7c98ca44fa8827babb72842f90da9884c5d998937572ae"},
]
[[package]]
@@ -1908,14 +1908,14 @@ typing-extensions = ">=4.6.0,<4.7.0 || >4.7.0"
[[package]]
name = "pygithub"
version = "2.7.0"
version = "2.8.1"
description = "Use the full Github API v3"
optional = false
python-versions = ">=3.8"
groups = ["dev"]
files = [
{file = "pygithub-2.7.0-py3-none-any.whl", hash = "sha256:40ecbfe26dc55cc34ab4b0ffa1d455e6f816ef9a2bc8d6f5ad18ce572f163700"},
{file = "pygithub-2.7.0.tar.gz", hash = "sha256:7cd6eafabb09b5369afba3586d86b1f1ad6f1326d2ff01bc47bb26615dce4cbb"},
{file = "pygithub-2.8.1-py3-none-any.whl", hash = "sha256:23a0a5bca93baef082e03411bf0ce27204c32be8bfa7abc92fe4a3e132936df0"},
{file = "pygithub-2.8.1.tar.gz", hash = "sha256:341b7c78521cb07324ff670afd1baa2bf5c286f8d9fd302c1798ba594a5400c9"},
]
[package.dependencies]
@@ -3011,14 +3011,14 @@ files = [
[[package]]
name = "types-requests"
version = "2.32.4.20250809"
version = "2.32.4.20250913"
description = "Typing stubs for requests"
optional = false
python-versions = ">=3.9"
groups = ["dev"]
files = [
{file = "types_requests-2.32.4.20250809-py3-none-any.whl", hash = "sha256:f73d1832fb519ece02c85b1f09d5f0dd3108938e7d47e7f94bbfa18a6782b163"},
{file = "types_requests-2.32.4.20250809.tar.gz", hash = "sha256:d8060de1c8ee599311f56ff58010fb4902f462a1470802cf9f6ed27bc46c4df3"},
{file = "types_requests-2.32.4.20250913-py3-none-any.whl", hash = "sha256:78c9c1fffebbe0fa487a418e0fa5252017e9c60d1a2da394077f1780f655d7e1"},
{file = "types_requests-2.32.4.20250913.tar.gz", hash = "sha256:abd6d4f9ce3a9383f269775a9835a4c24e5cd6b9f647d64f88aa4613c33def5d"},
]
[package.dependencies]
@@ -3038,14 +3038,14 @@ files = [
[[package]]
name = "typing-extensions"
version = "4.14.1"
version = "4.15.0"
description = "Backported and Experimental Type Hints for Python 3.9+"
optional = false
python-versions = ">=3.9"
groups = ["main", "dev"]
files = [
{file = "typing_extensions-4.14.1-py3-none-any.whl", hash = "sha256:d1e1e3b58374dc93031d6eda2420a48ea44a36c2b4766a4fdeb3710755731d76"},
{file = "typing_extensions-4.14.1.tar.gz", hash = "sha256:38b39f4aeeab64884ce9f74c94263ef78f3c22467c8724005483154c26648d36"},
{file = "typing_extensions-4.15.0-py3-none-any.whl", hash = "sha256:f0fa19c6845758ab08074a0cfa8b7aecb71c999ca73d62883bc25cc018c4e548"},
{file = "typing_extensions-4.15.0.tar.gz", hash = "sha256:0cea48d173cc12fa28ecabc3b837ea3cf6f38c6d1136f85cbaaf598984861466"},
]
[[package]]

View File

@@ -104,7 +104,7 @@ name = "matrix-synapse"
version = "1.139.0rc3"
description = "Homeserver for the Matrix decentralised comms protocol"
authors = ["Matrix.org Team and Contributors <packages@matrix.org>"]
license = "AGPL-3.0-or-later"
license = "AGPL-3.0-or-later OR LicenseRef-Element-Commercial"
readme = "README.rst"
repository = "https://github.com/element-hq/synapse"
packages = [

View File

@@ -68,18 +68,42 @@ PROMETHEUS_METRIC_MISSING_FROM_LIST_TO_CHECK = ErrorCode(
category="per-homeserver-tenant-metrics",
)
PREFER_SYNAPSE_CLOCK_CALL_LATER = ErrorCode(
"call-later-not-tracked",
"Prefer using `synapse.util.Clock.call_later` instead of `reactor.callLater`",
category="synapse-reactor-clock",
)
PREFER_SYNAPSE_CLOCK_LOOPING_CALL = ErrorCode(
"prefer-synapse-clock-looping-call",
"Prefer using `synapse.util.Clock.looping_call` instead of `task.LoopingCall`",
category="synapse-reactor-clock",
)
PREFER_SYNAPSE_CLOCK_CALL_WHEN_RUNNING = ErrorCode(
"prefer-synapse-clock-call-when-running",
"`synapse.util.Clock.call_when_running` should be used instead of `reactor.callWhenRunning`",
"Prefer using `synapse.util.Clock.call_when_running` instead of `reactor.callWhenRunning`",
category="synapse-reactor-clock",
)
PREFER_SYNAPSE_CLOCK_ADD_SYSTEM_EVENT_TRIGGER = ErrorCode(
"prefer-synapse-clock-add-system-event-trigger",
"`synapse.util.Clock.add_system_event_trigger` should be used instead of `reactor.addSystemEventTrigger`",
"Prefer using `synapse.util.Clock.add_system_event_trigger` instead of `reactor.addSystemEventTrigger`",
category="synapse-reactor-clock",
)
MULTIPLE_INTERNAL_CLOCKS_CREATED = ErrorCode(
"multiple-internal-clocks",
"Only one instance of `clock.Clock` should be created",
category="synapse-reactor-clock",
)
UNTRACKED_BACKGROUND_PROCESS = ErrorCode(
"untracked-background-process",
"Prefer using `HomeServer.run_as_background_process` method over the bare `run_as_background_process`",
category="synapse-tracked-calls",
)
class Sentinel(enum.Enum):
# defining a sentinel in this way allows mypy to correctly handle the
@@ -222,6 +246,18 @@ class SynapsePlugin(Plugin):
# callback, let's just pass it in while we have it.
return lambda ctx: check_prometheus_metric_instantiation(ctx, fullname)
if fullname == "twisted.internet.task.LoopingCall":
return check_looping_call
if fullname == "synapse.util.clock.Clock":
return check_clock_creation
if (
fullname
== "synapse.metrics.background_process_metrics.run_as_background_process"
):
return check_background_process
return None
def get_method_signature_hook(
@@ -241,6 +277,13 @@ class SynapsePlugin(Plugin):
):
return check_is_cacheable_wrapper
if fullname in (
"twisted.internet.interfaces.IReactorTime.callLater",
"synapse.types.ISynapseThreadlessReactor.callLater",
"synapse.types.ISynapseReactor.callLater",
):
return check_call_later
if fullname in (
"twisted.internet.interfaces.IReactorCore.callWhenRunning",
"synapse.types.ISynapseThreadlessReactor.callWhenRunning",
@@ -258,6 +301,78 @@ class SynapsePlugin(Plugin):
return None
def check_clock_creation(ctx: FunctionSigContext) -> CallableType:
"""
Ensure that the only `clock.Clock` instance is the one used by the `HomeServer`.
This is so that the `HomeServer` can cancel any tracked delayed or looping calls
during server shutdown.
Args:
ctx: The `FunctionSigContext` from mypy.
"""
signature: CallableType = ctx.default_signature
ctx.api.fail(
"Expected the only `clock.Clock` instance to be the one used by the `HomeServer`. "
"This is so that the `HomeServer` can cancel any tracked delayed or looping calls "
"during server shutdown",
ctx.context,
code=MULTIPLE_INTERNAL_CLOCKS_CREATED,
)
return signature
def check_call_later(ctx: MethodSigContext) -> CallableType:
"""
Ensure that `reactor.callLater` callsites aren't used.
`synapse.util.Clock.call_later` should always be used instead of `reactor.callLater`.
This is because the `synapse.util.Clock` tracks delayed calls in order to cancel any
outstanding calls during server shutdown. Delayed calls that are short-lived (<~60s), or
that are frequently called and can be tracked via other means, could be candidates for
using `synapse.util.Clock.call_later` with `call_later_cancel_on_shutdown` set to
`False`. There shouldn't be a need to use `reactor.callLater` outside of tests or the
`Clock` class itself. If a need arises, you can use a type ignore comment to disable the
check, e.g. `# type: ignore[call-later-not-tracked]`.
Args:
ctx: The `MethodSigContext` from mypy.
"""
signature: CallableType = ctx.default_signature
ctx.api.fail(
"Expected all `reactor.callLater` calls to use `synapse.util.Clock.call_later` "
"instead. This is so that long lived calls can be tracked for cancellation during "
"server shutdown",
ctx.context,
code=PREFER_SYNAPSE_CLOCK_CALL_LATER,
)
return signature
def check_looping_call(ctx: FunctionSigContext) -> CallableType:
"""
Ensure that `task.LoopingCall` callsites aren't used.
`synapse.util.Clock.looping_call` should always be used instead of `task.LoopingCall`.
`synapse.util.Clock` tracks looping calls in order to cancel any outstanding calls
during server shutdown.
Args:
ctx: The `FunctionSigContext` from mypy.
"""
signature: CallableType = ctx.default_signature
ctx.api.fail(
"Expected all `task.LoopingCall` instances to use `synapse.util.Clock.looping_call` "
"instead. This is so that long lived calls can be tracked for cancellation during "
"server shutdown",
ctx.context,
code=PREFER_SYNAPSE_CLOCK_LOOPING_CALL,
)
return signature
def check_call_when_running(ctx: MethodSigContext) -> CallableType:
"""
Ensure that `reactor.callWhenRunning` callsites aren't used.
@@ -312,6 +427,27 @@ def check_add_system_event_trigger(ctx: MethodSigContext) -> CallableType:
return signature
def check_background_process(ctx: FunctionSigContext) -> CallableType:
"""
Ensure that calls to `run_as_background_process` use the `HomeServer` method.
This is so that the `HomeServer` can cancel any running background processes during
server shutdown.
Args:
ctx: The `FunctionSigContext` from mypy.
"""
signature: CallableType = ctx.default_signature
ctx.api.fail(
"Prefer using `HomeServer.run_as_background_process` method over the bare "
"`run_as_background_process`. This is so that the `HomeServer` can cancel "
"any background processes during server shutdown",
ctx.context,
code=UNTRACKED_BACKGROUND_PROCESS,
)
return signature
def analyze_prometheus_metric_classes(ctx: ClassDefContext) -> None:
"""
Cross-check the list of Prometheus metric classes against the

View File

@@ -157,7 +157,12 @@ def get_registered_paths_for_default(
# TODO We only do this to avoid an error, but don't need the database etc
hs.setup()
registered_paths = get_registered_paths_for_hs(hs)
hs.cleanup()
# NOTE: a more robust implementation would properly shut down/clean up each server
# to avoid resource buildup. However, the call to `shutdown` is `async`, so it would
# require additional complexity here. We intentionally skip this cleanup because this
# is a short-lived, one-off utility script where the simpler approach is sufficient
# and we shouldn't run into any resource buildup issues.
return registered_paths

View File

@@ -28,7 +28,6 @@ import yaml
from twisted.internet import defer, reactor as reactor_
from synapse.config.homeserver import HomeServerConfig
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.server import HomeServer
from synapse.storage import DataStore
from synapse.types import ISynapseReactor
@@ -53,7 +52,6 @@ class MockHomeserver(HomeServer):
def run_background_updates(hs: HomeServer) -> None:
server_name = hs.hostname
main = hs.get_datastores().main
state = hs.get_datastores().state
@@ -67,9 +65,8 @@ def run_background_updates(hs: HomeServer) -> None:
def run() -> None:
# Apply all background updates on the database.
defer.ensureDeferred(
run_as_background_process(
hs.run_as_background_process(
"background_updates",
server_name,
run_background_updates,
)
)

View File

@@ -354,12 +354,10 @@ class BaseAuth:
effective_user_id, effective_device_id
)
if device_opt is None:
# For now, use 400 M_EXCLUSIVE if the device doesn't exist.
# This is an open thread of discussion on MSC3202 as of 2021-12-09.
raise AuthError(
400,
f"Application service trying to use a device that doesn't exist ('{effective_device_id}' for {effective_user_id})",
Codes.EXCLUSIVE,
Codes.UNKNOWN_DEVICE,
)
return create_requester(

View File

@@ -149,6 +149,9 @@ class Codes(str, Enum):
)
MSC4306_NOT_IN_THREAD = "IO.ELEMENT.MSC4306.M_NOT_IN_THREAD"
# Part of MSC4326
UNKNOWN_DEVICE = "ORG.MATRIX.MSC4326.M_UNKNOWN_DEVICE"
class CodeMessageException(RuntimeError):
"""An exception with integer code, a message string attributes and optional headers.

View File

@@ -28,6 +28,7 @@ import sys
import traceback
import warnings
from textwrap import indent
from threading import Thread
from typing import (
TYPE_CHECKING,
Any,
@@ -40,6 +41,7 @@ from typing import (
Tuple,
cast,
)
from wsgiref.simple_server import WSGIServer
from cryptography.utils import CryptographyDeprecationWarning
from typing_extensions import ParamSpec
@@ -73,7 +75,6 @@ from synapse.events.presence_router import load_legacy_presence_router
from synapse.handlers.auth import load_legacy_password_auth_providers
from synapse.http.site import SynapseSite
from synapse.logging.context import LoggingContext, PreserveLoggingContext
from synapse.logging.opentracing import init_tracer
from synapse.metrics import install_gc_manager, register_threadpool
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.metrics.jemalloc import setup_jemalloc_stats
@@ -98,22 +99,47 @@ reactor = cast(ISynapseReactor, _reactor)
logger = logging.getLogger(__name__)
# list of tuples of function, args list, kwargs dict
_sighup_callbacks: List[
Tuple[Callable[..., None], Tuple[object, ...], Dict[str, object]]
] = []
_instance_id_to_sighup_callbacks_map: Dict[
str, List[Tuple[Callable[..., None], Tuple[object, ...], Dict[str, object]]]
] = {}
"""
Map from homeserver instance_id to a list of callbacks.
We use `instance_id` instead of `server_name` because it's possible to have multiple
workers running in the same process with the same `server_name`.
"""
P = ParamSpec("P")
def register_sighup(func: Callable[P, None], *args: P.args, **kwargs: P.kwargs) -> None:
def register_sighup(
homeserver_instance_id: str,
func: Callable[P, None],
*args: P.args,
**kwargs: P.kwargs,
) -> None:
"""
Register a function to be called when a SIGHUP occurs.
Args:
homeserver_instance_id: The unique ID for this Synapse process instance
(`hs.get_instance_id()`) that this hook is associated with.
func: Function to be called when sent a SIGHUP signal.
*args, **kwargs: args and kwargs to be passed to the target function.
"""
_sighup_callbacks.append((func, args, kwargs))
_instance_id_to_sighup_callbacks_map.setdefault(homeserver_instance_id, []).append(
(func, args, kwargs)
)
def unregister_sighups(instance_id: str) -> None:
"""
Unregister all sighup functions associated with this Synapse instance.
Args:
instance_id: Unique ID for this Synapse process instance.
"""
_instance_id_to_sighup_callbacks_map.pop(instance_id, [])
def start_worker_reactor(
@@ -282,7 +308,9 @@ def register_start(
clock.call_when_running(lambda: defer.ensureDeferred(wrapper()))
def listen_metrics(bind_addresses: StrCollection, port: int) -> None:
def listen_metrics(
bind_addresses: StrCollection, port: int
) -> List[Tuple[WSGIServer, Thread]]:
"""
Start Prometheus metrics server.
@@ -295,14 +323,22 @@ def listen_metrics(bind_addresses: StrCollection, port: int) -> None:
bytecode at a time), this still works because the metrics thread can preempt the
Twisted reactor thread between bytecode boundaries and the metrics thread gets
scheduled with roughly equal priority to the Twisted reactor thread.
Returns:
A list of `WSGIServer`s, each with the thread it is running on.
"""
from prometheus_client import start_http_server as start_http_server_prometheus
from synapse.metrics import RegistryProxy
servers: List[Tuple[WSGIServer, Thread]] = []
for host in bind_addresses:
logger.info("Starting metrics listener on %s:%d", host, port)
start_http_server_prometheus(port, addr=host, registry=RegistryProxy)
server, thread = start_http_server_prometheus(
port, addr=host, registry=RegistryProxy
)
servers.append((server, thread))
return servers
def listen_manhole(
@@ -310,7 +346,7 @@ def listen_manhole(
port: int,
manhole_settings: ManholeConfig,
manhole_globals: dict,
) -> None:
) -> List[Port]:
# twisted.conch.manhole 21.1.0 uses "int_from_bytes", which produces a confusing
# warning. It's fixed by https://github.com/twisted/twisted/pull/1522, so
# suppress the warning for now.
@@ -322,7 +358,7 @@ def listen_manhole(
from synapse.util.manhole import manhole
listen_tcp(
return listen_tcp(
bind_addresses,
port,
manhole(settings=manhole_settings, globals=manhole_globals),
@@ -499,7 +535,7 @@ def refresh_certificate(hs: "HomeServer") -> None:
logger.info("Context factories updated.")
async def start(hs: "HomeServer") -> None:
async def start(hs: "HomeServer", freeze: bool = True) -> None:
"""
Start a Synapse server or worker.
@@ -510,6 +546,11 @@ async def start(hs: "HomeServer") -> None:
Args:
hs: homeserver instance
freeze: whether to freeze the homeserver base objects in the garbage collector.
May improve garbage collection performance by marking objects with an effectively
static lifetime as frozen so they don't need to be considered for cleanup.
If you ever want to `shutdown` the homeserver, this needs to be
`False`, otherwise the homeserver cannot be garbage collected after `shutdown`.
"""
server_name = hs.hostname
reactor = hs.get_reactor()
@@ -542,12 +583,17 @@ async def start(hs: "HomeServer") -> None:
# we're not using systemd.
sdnotify(b"RELOADING=1")
for i, args, kwargs in _sighup_callbacks:
i(*args, **kwargs)
for sighup_callbacks in _instance_id_to_sighup_callbacks_map.values():
for func, args, kwargs in sighup_callbacks:
func(*args, **kwargs)
sdnotify(b"READY=1")
return run_as_background_process(
# It's okay to ignore the linter error here and call
# `run_as_background_process` directly because `_handle_sighup` operates
# outside of the scope of a specific `HomeServer` instance and holds no
# references to it which would prevent a clean shutdown.
return run_as_background_process( # type: ignore[untracked-background-process]
"sighup",
server_name,
_handle_sighup,
@@ -565,8 +611,8 @@ async def start(hs: "HomeServer") -> None:
signal.signal(signal.SIGHUP, run_sighup)
register_sighup(refresh_certificate, hs)
register_sighup(reload_cache_config, hs.config)
register_sighup(hs.get_instance_id(), refresh_certificate, hs)
register_sighup(hs.get_instance_id(), reload_cache_config, hs.config)
# Apply the cache config.
hs.config.caches.resize_all_caches()
@@ -574,9 +620,6 @@ async def start(hs: "HomeServer") -> None:
# Load the certificate from disk.
refresh_certificate(hs)
# Start the tracer
init_tracer(hs) # noqa
# Instantiate the modules so they can register their web resources to the module API
# before we start the listeners.
module_api = hs.get_module_api()
@@ -603,11 +646,15 @@ async def start(hs: "HomeServer") -> None:
hs.get_pusherpool().start()
def log_shutdown() -> None:
with LoggingContext("log_shutdown"):
with LoggingContext(name="log_shutdown", server_name=server_name):
logger.info("Shutting down...")
# Log when we start the shut down process.
hs.get_clock().add_system_event_trigger("before", "shutdown", log_shutdown)
hs.register_sync_shutdown_handler(
phase="before",
eventType="shutdown",
shutdown_func=log_shutdown,
)
setup_sentry(hs)
setup_sdnotify(hs)
@@ -636,18 +683,24 @@ async def start(hs: "HomeServer") -> None:
# `REQUIRED_ON_BACKGROUND_TASK_STARTUP`
start_phone_stats_home(hs)
# We now freeze all allocated objects in the hopes that (almost)
# everything currently allocated are things that will be used for the
# rest of time. Doing so means less work each GC (hopefully).
#
# PyPy does not (yet?) implement gc.freeze()
if hasattr(gc, "freeze"):
gc.collect()
gc.freeze()
if freeze:
# We now freeze all allocated objects in the hopes that (almost)
# everything currently allocated are things that will be used for the
# rest of time. Doing so means less work each GC (hopefully).
#
# Note that freezing the homeserver object means that it won't be able to be
# garbage collected in the case of attempting an in-memory `shutdown`. This only
# needs to be considered if such a case is desirable. Exiting the entire Python
process will work as expected either way.
#
# PyPy does not (yet?) implement gc.freeze()
if hasattr(gc, "freeze"):
gc.collect()
gc.freeze()
# Speed up shutdowns by freezing all allocated objects. This moves everything
# into the permanent generation and excludes them from the final GC.
atexit.register(gc.freeze)
# Speed up process exit by freezing all allocated objects. This moves everything
# into the permanent generation and excludes them from the final GC.
atexit.register(gc.freeze)
def reload_cache_config(config: HomeServerConfig) -> None:

View File

@@ -329,7 +329,7 @@ def start(config: HomeServerConfig, args: argparse.Namespace) -> None:
# command.
async def run() -> None:
with LoggingContext(name="command"):
with LoggingContext(name="command", server_name=config.server.server_name):
await _base.start(ss)
await args.func(ss, args)
@@ -342,5 +342,5 @@ def start(config: HomeServerConfig, args: argparse.Namespace) -> None:
if __name__ == "__main__":
homeserver_config, args = load_config(sys.argv[1:])
with LoggingContext(name="main"):
with LoggingContext(name="main", server_name=homeserver_config.server.server_name):
start(homeserver_config, args)

View File

@@ -27,7 +27,7 @@ from synapse.util.logcontext import LoggingContext
def main() -> None:
homeserver_config = load_config(sys.argv[1:])
with LoggingContext(name="main"):
with LoggingContext(name="main", server_name=homeserver_config.server.server_name):
start(homeserver_config)

View File

@@ -27,7 +27,7 @@ from synapse.util.logcontext import LoggingContext
def main() -> None:
homeserver_config = load_config(sys.argv[1:])
with LoggingContext(name="main"):
with LoggingContext(name="main", server_name=homeserver_config.server.server_name):
start(homeserver_config)

View File

@@ -26,7 +26,7 @@ from synapse.util.logcontext import LoggingContext
def main() -> None:
homeserver_config = load_config(sys.argv[1:])
with LoggingContext(name="main"):
with LoggingContext(name="main", server_name=homeserver_config.server.server_name):
start(homeserver_config)

View File

@@ -27,7 +27,7 @@ from synapse.util.logcontext import LoggingContext
def main() -> None:
homeserver_config = load_config(sys.argv[1:])
with LoggingContext(name="main"):
with LoggingContext(name="main", server_name=homeserver_config.server.server_name):
start(homeserver_config)

View File

@@ -27,7 +27,7 @@ from synapse.util.logcontext import LoggingContext
def main() -> None:
homeserver_config = load_config(sys.argv[1:])
with LoggingContext(name="main"):
with LoggingContext(name="main", server_name=homeserver_config.server.server_name):
start(homeserver_config)

View File

@@ -27,7 +27,7 @@ from synapse.util.logcontext import LoggingContext
def main() -> None:
homeserver_config = load_config(sys.argv[1:])
with LoggingContext(name="main"):
with LoggingContext(name="main", server_name=homeserver_config.server.server_name):
start(homeserver_config)

View File

@@ -49,6 +49,7 @@ from synapse.config.server import ListenerConfig, TCPListenerConfig
from synapse.federation.transport.server import TransportLayerServer
from synapse.http.server import JsonResource, OptionsResource
from synapse.logging.context import LoggingContext
from synapse.logging.opentracing import init_tracer
from synapse.metrics import METRICS_PREFIX, MetricsResource, RegistryProxy
from synapse.replication.http import REPLICATION_PREFIX, ReplicationRestResource
from synapse.rest import ClientRestResource, admin
@@ -277,11 +278,13 @@ class GenericWorkerServer(HomeServer):
self._listen_http(listener)
elif listener.type == "manhole":
if isinstance(listener, TCPListenerConfig):
_base.listen_manhole(
listener.bind_addresses,
listener.port,
manhole_settings=self.config.server.manhole_settings,
manhole_globals={"hs": self},
self._listening_services.extend(
_base.listen_manhole(
listener.bind_addresses,
listener.port,
manhole_settings=self.config.server.manhole_settings,
manhole_globals={"hs": self},
)
)
else:
raise ConfigError(
@@ -295,9 +298,11 @@ class GenericWorkerServer(HomeServer):
)
else:
if isinstance(listener, TCPListenerConfig):
_base.listen_metrics(
listener.bind_addresses,
listener.port,
self._metrics_listeners.extend(
_base.listen_metrics(
listener.bind_addresses,
listener.port,
)
)
else:
raise ConfigError(
@@ -359,6 +364,9 @@ def start(config: HomeServerConfig) -> None:
setup_logging(hs, config, use_worker_options=True)
# Start the tracer
init_tracer(hs) # noqa
try:
hs.setup()
@@ -382,7 +390,7 @@ def start(config: HomeServerConfig) -> None:
def main() -> None:
homeserver_config = load_config(sys.argv[1:])
with LoggingContext(name="main"):
with LoggingContext(name="main", server_name=homeserver_config.server.server_name):
start(homeserver_config)

View File

@@ -22,14 +22,13 @@
import logging
import os
import sys
from typing import Dict, Iterable, List
from typing import Dict, Iterable, List, Optional
from twisted.internet.tcp import Port
from twisted.web.resource import EncodingResourceWrapper, Resource
from twisted.web.server import GzipEncoderFactory
import synapse
import synapse.config.logger
from synapse import events
from synapse.api.urls import (
CLIENT_API_PREFIX,
@@ -50,6 +49,7 @@ from synapse.app._base import (
)
from synapse.config._base import ConfigError, format_config_error
from synapse.config.homeserver import HomeServerConfig
from synapse.config.logger import setup_logging
from synapse.config.server import ListenerConfig, TCPListenerConfig
from synapse.federation.transport.server import TransportLayerServer
from synapse.http.additional_resource import AdditionalResource
@@ -60,6 +60,7 @@ from synapse.http.server import (
StaticResource,
)
from synapse.logging.context import LoggingContext
from synapse.logging.opentracing import init_tracer
from synapse.metrics import METRICS_PREFIX, MetricsResource, RegistryProxy
from synapse.replication.http import REPLICATION_PREFIX, ReplicationRestResource
from synapse.rest import ClientRestResource, admin
@@ -69,6 +70,7 @@ from synapse.rest.synapse.client import build_synapse_client_resource_tree
from synapse.rest.well_known import well_known_resource
from synapse.server import HomeServer
from synapse.storage import DataStore
from synapse.types import ISynapseReactor
from synapse.util.check_dependencies import VERSION, check_requirements
from synapse.util.httpresourcetree import create_resource_tree
from synapse.util.module_loader import load_module
@@ -276,11 +278,13 @@ class SynapseHomeServer(HomeServer):
)
elif listener.type == "manhole":
if isinstance(listener, TCPListenerConfig):
_base.listen_manhole(
listener.bind_addresses,
listener.port,
manhole_settings=self.config.server.manhole_settings,
manhole_globals={"hs": self},
self._listening_services.extend(
_base.listen_manhole(
listener.bind_addresses,
listener.port,
manhole_settings=self.config.server.manhole_settings,
manhole_globals={"hs": self},
)
)
else:
raise ConfigError(
@@ -293,9 +297,11 @@ class SynapseHomeServer(HomeServer):
)
else:
if isinstance(listener, TCPListenerConfig):
_base.listen_metrics(
listener.bind_addresses,
listener.port,
self._metrics_listeners.extend(
_base.listen_metrics(
listener.bind_addresses,
listener.port,
)
)
else:
raise ConfigError(
@@ -339,12 +345,23 @@ def load_or_generate_config(argv_options: List[str]) -> HomeServerConfig:
return config
def setup(config: HomeServerConfig) -> SynapseHomeServer:
def setup(
config: HomeServerConfig,
reactor: Optional[ISynapseReactor] = None,
freeze: bool = True,
) -> SynapseHomeServer:
"""
Create and setup a Synapse homeserver instance given a configuration.
Args:
config: The configuration for the homeserver.
reactor: Optionally provide a reactor to use. Can be useful in different
scenarios where you want control over the reactor, such as tests.
freeze: whether to freeze the homeserver base objects in the garbage collector.
May improve garbage collection performance by marking objects with an effectively
static lifetime as frozen so they don't need to be considered for cleanup.
If you ever want to `shutdown` the homeserver, this needs to be
`False`, otherwise the homeserver cannot be garbage collected after `shutdown`.
Returns:
A homeserver instance.
@@ -383,9 +400,13 @@ def setup(config: HomeServerConfig) -> SynapseHomeServer:
config.server.server_name,
config=config,
version_string=f"Synapse/{VERSION}",
reactor=reactor,
)
synapse.config.logger.setup_logging(hs, config, use_worker_options=False)
setup_logging(hs, config, use_worker_options=False)
# Start the tracer
init_tracer(hs) # noqa
logger.info("Setting up server")
@@ -401,7 +422,7 @@ def setup(config: HomeServerConfig) -> SynapseHomeServer:
# Loading the provider metadata also ensures the provider config is valid.
await oidc.load_metadata()
await _base.start(hs)
await _base.start(hs, freeze)
hs.get_datastores().main.db_pool.updates.start_doing_background_updates()
@@ -425,7 +446,7 @@ def run(hs: HomeServer) -> None:
def main() -> None:
homeserver_config = load_or_generate_config(sys.argv[1:])
with LoggingContext("main"):
with LoggingContext(name="main", server_name=homeserver_config.server.server_name):
# check base requirements
check_requirements()
hs = setup(homeserver_config)

View File

@@ -27,7 +27,7 @@ from synapse.util.logcontext import LoggingContext
def main() -> None:
homeserver_config = load_config(sys.argv[1:])
with LoggingContext(name="main"):
with LoggingContext(name="main", server_name=homeserver_config.server.server_name):
start(homeserver_config)

View File

@@ -29,9 +29,6 @@ from prometheus_client import Gauge
from twisted.internet import defer
from synapse.metrics import SERVER_NAME_LABEL
from synapse.metrics.background_process_metrics import (
run_as_background_process,
)
from synapse.types import JsonDict
from synapse.util.constants import ONE_HOUR_SECONDS, ONE_MINUTE_SECONDS
@@ -85,8 +82,6 @@ def phone_stats_home(
stats: JsonDict,
stats_process: List[Tuple[int, "resource.struct_rusage"]] = _stats_process,
) -> "defer.Deferred[None]":
server_name = hs.hostname
async def _phone_stats_home(
hs: "HomeServer",
stats: JsonDict,
@@ -200,8 +195,8 @@ def phone_stats_home(
except Exception as e:
logger.warning("Error reporting stats: %s", e)
return run_as_background_process(
"phone_stats_home", server_name, _phone_stats_home, hs, stats, stats_process
return hs.run_as_background_process(
"phone_stats_home", _phone_stats_home, hs, stats, stats_process
)
@@ -263,9 +258,8 @@ def start_phone_stats_home(hs: "HomeServer") -> None:
float(hs.config.server.max_mau_value)
)
return run_as_background_process(
return hs.run_as_background_process(
"generate_monthly_active_users",
server_name,
_generate_monthly_active_users,
)
@@ -285,10 +279,16 @@ def start_phone_stats_home(hs: "HomeServer") -> None:
# We need to defer this init for the cases that we daemonize
# otherwise the process ID we get is that of the non-daemon process
clock.call_later(0, performance_stats_init)
clock.call_later(
0,
performance_stats_init,
)
# We wait 5 minutes to send the first set of stats as the server can
# be quite busy the first few minutes
clock.call_later(
INITIAL_DELAY_BEFORE_FIRST_PHONE_HOME_SECONDS, phone_stats_home, hs, stats
INITIAL_DELAY_BEFORE_FIRST_PHONE_HOME_SECONDS,
phone_stats_home,
hs,
stats,
)

View File

@@ -27,7 +27,7 @@ from synapse.util.logcontext import LoggingContext
def main() -> None:
homeserver_config = load_config(sys.argv[1:])
with LoggingContext(name="main"):
with LoggingContext(name="main", server_name=homeserver_config.server.server_name):
start(homeserver_config)

View File

@@ -27,7 +27,7 @@ from synapse.util.logcontext import LoggingContext
def main() -> None:
homeserver_config = load_config(sys.argv[1:])
with LoggingContext(name="main"):
with LoggingContext(name="main", server_name=homeserver_config.server.server_name):
start(homeserver_config)

View File

@@ -27,7 +27,7 @@ from synapse.util.logcontext import LoggingContext
def main() -> None:
homeserver_config = load_config(sys.argv[1:])
with LoggingContext(name="main"):
with LoggingContext(name="main", server_name=homeserver_config.server.server_name):
start(homeserver_config)

View File

@@ -23,15 +23,33 @@
import logging
import re
from enum import Enum
from typing import TYPE_CHECKING, Dict, Iterable, List, Optional, Pattern, Sequence
from typing import (
TYPE_CHECKING,
Dict,
Iterable,
List,
Optional,
Pattern,
Sequence,
cast,
)
import attr
from netaddr import IPSet
from twisted.internet import reactor
from synapse.api.constants import EventTypes
from synapse.events import EventBase
from synapse.types import DeviceListUpdates, JsonDict, JsonMapping, UserID
from synapse.types import (
DeviceListUpdates,
ISynapseThreadlessReactor,
JsonDict,
JsonMapping,
UserID,
)
from synapse.util.caches.descriptors import _CacheContext, cached
from synapse.util.clock import Clock
if TYPE_CHECKING:
from synapse.appservice.api import ApplicationServiceApi
@@ -98,6 +116,15 @@ class ApplicationService:
self.sender = sender
# The application service user should be part of the server's domain.
self.server_name = sender.domain # nb must be called this for @cached
# Ideally we would require passing in the `HomeServer` `Clock` instance.
# However this is not currently possible as there are places which use
# `@cached` that aren't aware of the `HomeServer` instance.
# nb must be called this for @cached
self.clock = Clock(
cast(ISynapseThreadlessReactor, reactor), server_name=self.server_name
) # type: ignore[multiple-internal-clocks]
self.namespaces = self._check_namespaces(namespaces)
self.id = id
self.ip_range_whitelist = ip_range_whitelist

View File

@@ -81,7 +81,6 @@ from synapse.appservice import (
from synapse.appservice.api import ApplicationServiceApi
from synapse.events import EventBase
from synapse.logging.context import run_in_background
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.storage.databases.main import DataStore
from synapse.types import DeviceListUpdates, JsonMapping
from synapse.util.clock import Clock
@@ -200,6 +199,7 @@ class _ServiceQueuer:
)
self.server_name = hs.hostname
self.clock = hs.get_clock()
self.hs = hs
self._store = hs.get_datastores().main
def start_background_request(self, service: ApplicationService) -> None:
@@ -207,9 +207,7 @@ class _ServiceQueuer:
if service.id in self.requests_in_flight:
return
run_as_background_process(
"as-sender", self.server_name, self._send_request, service
)
self.hs.run_as_background_process("as-sender", self._send_request, service)
async def _send_request(self, service: ApplicationService) -> None:
# sanity-check: we shouldn't get here if this service already has a sender
@@ -361,6 +359,7 @@ class _TransactionController:
def __init__(self, hs: "HomeServer"):
self.server_name = hs.hostname
self.clock = hs.get_clock()
self.hs = hs
self.store = hs.get_datastores().main
self.as_api = hs.get_application_service_api()
@@ -448,6 +447,7 @@ class _TransactionController:
recoverer = self.RECOVERER_CLASS(
self.server_name,
self.clock,
self.hs,
self.store,
self.as_api,
service,
@@ -494,6 +494,7 @@ class _Recoverer:
self,
server_name: str,
clock: Clock,
hs: "HomeServer",
store: DataStore,
as_api: ApplicationServiceApi,
service: ApplicationService,
@@ -501,6 +502,7 @@ class _Recoverer:
):
self.server_name = server_name
self.clock = clock
self.hs = hs
self.store = store
self.as_api = as_api
self.service = service
@@ -513,9 +515,8 @@ class _Recoverer:
logger.info("Scheduling retries on %s in %fs", self.service.id, delay)
self.scheduled_recovery = self.clock.call_later(
delay,
run_as_background_process,
self.hs.run_as_background_process,
"as-recoverer",
self.server_name,
self.retry,
)
@@ -535,9 +536,8 @@ class _Recoverer:
if self.scheduled_recovery:
self.clock.cancel_call_later(self.scheduled_recovery)
# Run a retry, which will reschedule a recovery if it fails.
run_as_background_process(
self.hs.run_as_background_process(
"retry",
self.server_name,
self.retry,
)

View File

@@ -601,7 +601,7 @@ class RootConfig:
@classmethod
def load_config_with_parser(
cls: Type[TRootConfig], parser: argparse.ArgumentParser, argv: List[str]
cls: Type[TRootConfig], parser: argparse.ArgumentParser, argv_options: List[str]
) -> Tuple[TRootConfig, argparse.Namespace]:
"""Parse the commandline and config files with the given parser
@@ -611,14 +611,14 @@ class RootConfig:
Args:
parser
argv
argv_options: The options passed to Synapse. Usually `sys.argv[1:]`.
Returns:
Returns the parsed config object and the parsed argparse.Namespace
object from parser.parse_args(..)`
"""
config_args = parser.parse_args(argv)
config_args = parser.parse_args(argv_options)
config_files = find_config_files(search_paths=config_args.config_path)
obj = cls(config_files)

View File

@@ -40,7 +40,6 @@ from twisted.logger import (
)
from synapse.logging.context import LoggingContextFilter
from synapse.logging.filter import MetadataFilter
from synapse.synapse_rust import reset_logging_config
from synapse.types import JsonDict
@@ -213,13 +212,11 @@ def _setup_stdlib_logging(
# writes.
log_context_filter = LoggingContextFilter()
log_metadata_filter = MetadataFilter({"server_name": config.server.server_name})
old_factory = logging.getLogRecordFactory()
def factory(*args: Any, **kwargs: Any) -> logging.LogRecord:
record = old_factory(*args, **kwargs)
log_context_filter.filter(record)
log_metadata_filter.filter(record)
return record
logging.setLogRecordFactory(factory)
@@ -348,7 +345,9 @@ def setup_logging(
# Add a SIGHUP handler to reload the logging configuration, if one is available.
from synapse.app import _base as appbase
appbase.register_sighup(_reload_logging_config, log_config_path)
appbase.register_sighup(
hs.get_instance_id(), _reload_logging_config, log_config_path
)
# Log immediately so we can grep backwards.
logger.warning("***** STARTING SERVER *****")

View File

@@ -172,7 +172,7 @@ class Keyring:
_FetchKeyRequest, Dict[str, Dict[str, FetchKeyResult]]
] = BatchingQueue(
name="keyring_server",
server_name=self.server_name,
hs=hs,
clock=hs.get_clock(),
# The method called to fetch each key
process_batch_callback=self._inner_fetch_key_requests,
@@ -194,6 +194,14 @@ class Keyring:
valid_until_ts=2**63, # fake future timestamp
)
def shutdown(self) -> None:
"""
Prepares the Keyring for garbage collection by shutting down its queues.
"""
self._fetch_keys_queue.shutdown()
for key_fetcher in self._key_fetchers:
key_fetcher.shutdown()
async def verify_json_for_server(
self,
server_name: str,
@@ -316,7 +324,7 @@ class Keyring:
if key_result.valid_until_ts < verify_request.minimum_valid_until_ts:
continue
await self._process_json(key_result.verify_key, verify_request)
await self.process_json(key_result.verify_key, verify_request)
verified = True
if not verified:
@@ -326,7 +334,7 @@ class Keyring:
Codes.UNAUTHORIZED,
)
async def _process_json(
async def process_json(
self, verify_key: VerifyKey, verify_request: VerifyJsonRequest
) -> None:
"""Processes the `VerifyJsonRequest`. Raises if the signature can't be
@@ -479,11 +487,17 @@ class KeyFetcher(metaclass=abc.ABCMeta):
self.server_name = hs.hostname
self._queue = BatchingQueue(
name=self.__class__.__name__,
server_name=self.server_name,
hs=hs,
clock=hs.get_clock(),
process_batch_callback=self._fetch_keys,
)
def shutdown(self) -> None:
"""
Prepares the KeyFetcher for garbage collection by shutting down its queue.
"""
self._queue.shutdown()
async def get_keys(
self, server_name: str, key_ids: List[str], minimum_valid_until_ts: int
) -> Dict[str, FetchKeyResult]:

View File

@@ -119,7 +119,6 @@ class InviteAutoAccepter:
event.state_key,
event.room_id,
"join",
bg_start_span=False,
)
if is_direct_message:

View File

@@ -148,6 +148,7 @@ class FederationClient(FederationBase):
self._get_pdu_cache: ExpiringCache[str, Tuple[EventBase, str]] = ExpiringCache(
cache_name="get_pdu_cache",
server_name=self.server_name,
hs=self.hs,
clock=self._clock,
max_len=1000,
expiry_ms=120 * 1000,
@@ -167,6 +168,7 @@ class FederationClient(FederationBase):
] = ExpiringCache(
cache_name="get_room_hierarchy_cache",
server_name=self.server_name,
hs=self.hs,
clock=self._clock,
max_len=1000,
expiry_ms=5 * 60 * 1000,
@@ -495,6 +497,43 @@ class FederationClient(FederationBase):
)
return RECOMMENDATION_OK
@trace
@tag_args
async def ask_policy_server_to_sign_event(
self, destination: str, pdu: EventBase, timeout: Optional[int] = None
) -> Optional[JsonDict]:
"""Requests that the destination server (typically a policy server)
sign the event as not spam.
If the policy server could not be contacted or the policy server
returned an error, this returns no signature.
Args:
destination: The remote homeserver to ask (a policy server)
pdu: The event to sign
timeout: How long to try (in ms) the destination for before
giving up. None indicates no timeout.
Returns:
The signature from the policy server, structured in the same way as the 'signatures'
JSON in the event, e.g. { "$policy_server_via_domain" : { "ed25519:policy_server": "signature_base64" }}
"""
logger.debug(
"ask_policy_server_to_sign_event for event_id=%s from %s",
pdu.event_id,
destination,
)
try:
return await self.transport_layer.ask_policy_server_to_sign_event(
destination, pdu, timeout=timeout
)
except Exception as e:
logger.warning(
"ask_policy_server_to_sign_event: server %s responded with error: %s",
destination,
e,
)
return None
@trace
@tag_args
async def get_pdu(

View File

@@ -159,7 +159,7 @@ class FederationServer(FederationBase):
# with FederationHandlerRegistry.
hs.get_directory_handler()
self._server_linearizer = Linearizer("fed_server")
self._server_linearizer = Linearizer(name="fed_server", clock=hs.get_clock())
# origins that we are currently processing a transaction from.
# a dict from origin to txn id.

View File

@@ -144,6 +144,9 @@ class FederationRemoteSendQueue(AbstractFederationSender):
self.clock.looping_call(self._clear_queue, 30 * 1000)
def shutdown(self) -> None:
"""Stops this federation sender instance from sending further transactions."""
def _next_pos(self) -> int:
pos = self.pos
self.pos += 1

View File

@@ -168,7 +168,6 @@ from synapse.metrics import (
events_processed_counter,
)
from synapse.metrics.background_process_metrics import (
run_as_background_process,
wrap_as_background_process,
)
from synapse.types import (
@@ -232,6 +231,11 @@ WAKEUP_INTERVAL_BETWEEN_DESTINATIONS_SEC = 5
class AbstractFederationSender(metaclass=abc.ABCMeta):
@abc.abstractmethod
def shutdown(self) -> None:
"""Stops this federation sender instance from sending further transactions."""
raise NotImplementedError()
@abc.abstractmethod
def notify_new_events(self, max_token: RoomStreamToken) -> None:
"""This gets called when we have some new events we might want to
@@ -326,6 +330,7 @@ class _DestinationWakeupQueue:
_MAX_TIME_IN_QUEUE = 30.0
sender: "FederationSender" = attr.ib()
hs: "HomeServer" = attr.ib()
server_name: str = attr.ib()
"""
Our homeserver name (used to label metrics) (`hs.hostname`).
@@ -453,18 +458,30 @@ class FederationSender(AbstractFederationSender):
1.0 / hs.config.ratelimiting.federation_rr_transactions_per_room_per_second
)
self._destination_wakeup_queue = _DestinationWakeupQueue(
self, self.server_name, self.clock, max_delay_s=rr_txn_interval_per_room_s
self,
hs,
self.server_name,
self.clock,
max_delay_s=rr_txn_interval_per_room_s,
)
# It is important that `_is_shutdown` is initialised before the looping call
# for `wake_destinations_needing_catchup`.
self._is_shutdown = False
# Regularly wake up destinations that have outstanding PDUs to be caught up
self.clock.looping_call_now(
run_as_background_process,
self.hs.run_as_background_process,
WAKEUP_RETRY_PERIOD_SEC * 1000.0,
"wake_destinations_needing_catchup",
self.server_name,
self._wake_destinations_needing_catchup,
)
def shutdown(self) -> None:
self._is_shutdown = True
for queue in self._per_destination_queues.values():
queue.shutdown()
def _get_per_destination_queue(
self, destination: str
) -> Optional[PerDestinationQueue]:
@@ -503,16 +520,15 @@ class FederationSender(AbstractFederationSender):
return
# fire off a processing loop in the background
run_as_background_process(
self.hs.run_as_background_process(
"process_event_queue_for_federation",
self.server_name,
self._process_event_queue_loop,
)
async def _process_event_queue_loop(self) -> None:
try:
self._is_processing = True
while True:
while not self._is_shutdown:
last_token = await self.store.get_federation_out_pos("events")
(
next_token,
@@ -1123,7 +1139,7 @@ class FederationSender(AbstractFederationSender):
last_processed: Optional[str] = None
while True:
while not self._is_shutdown:
destinations_to_wake = (
await self.store.get_catch_up_outstanding_destinations(last_processed)
)

View File

@@ -28,6 +28,8 @@ from typing import TYPE_CHECKING, Dict, Hashable, Iterable, List, Optional, Tupl
import attr
from prometheus_client import Counter
from twisted.internet import defer
from synapse.api.constants import EduTypes
from synapse.api.errors import (
FederationDeniedError,
@@ -41,7 +43,6 @@ from synapse.handlers.presence import format_user_presence_state
from synapse.logging import issue9533_logger
from synapse.logging.opentracing import SynapseTags, set_tag
from synapse.metrics import SERVER_NAME_LABEL, sent_transactions_counter
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.types import JsonDict, ReadReceipt
from synapse.util.retryutils import NotRetryingDestination, get_retry_limiter
from synapse.visibility import filter_events_for_server
@@ -79,6 +80,7 @@ MAX_PRESENCE_STATES_PER_EDU = 50
class PerDestinationQueue:
"""
Manages the per-destination transmission queues.
Runs until `shutdown()` is called on the queue.
Args:
hs
@@ -94,6 +96,7 @@ class PerDestinationQueue:
destination: str,
):
self.server_name = hs.hostname
self._hs = hs
self._clock = hs.get_clock()
self._storage_controllers = hs.get_storage_controllers()
self._store = hs.get_datastores().main
@@ -117,6 +120,8 @@ class PerDestinationQueue:
self._destination = destination
self.transmission_loop_running = False
self._transmission_loop_enabled = True
self.active_transmission_loop: Optional[defer.Deferred] = None
# Flag to signal to any running transmission loop that there is new data
# queued up to be sent.
@@ -171,6 +176,20 @@ class PerDestinationQueue:
def __str__(self) -> str:
return "PerDestinationQueue[%s]" % self._destination
def shutdown(self) -> None:
"""Instruct the queue to stop processing any further requests"""
self._transmission_loop_enabled = False
# The transaction manager must be shut down before cancelling the active
# transmission loop. Otherwise the transmission loop can enter a new cycle
# of sleeping before retrying, since the shutdown flag of the
# `_transaction_manager` hasn't been set yet.
self._transaction_manager.shutdown()
try:
if self.active_transmission_loop is not None:
self.active_transmission_loop.cancel()
except Exception:
pass
def pending_pdu_count(self) -> int:
return len(self._pending_pdus)
@@ -309,11 +328,14 @@ class PerDestinationQueue:
)
return
if not self._transmission_loop_enabled:
logger.warning("Shutdown has been requested. Not sending transaction")
return
logger.debug("TX [%s] Starting transaction loop", self._destination)
run_as_background_process(
self.active_transmission_loop = self._hs.run_as_background_process(
"federation_transaction_transmission_loop",
self.server_name,
self._transaction_transmission_loop,
)
@@ -321,13 +343,13 @@ class PerDestinationQueue:
pending_pdus: List[EventBase] = []
try:
self.transmission_loop_running = True
# This will throw if we wouldn't retry. We do this here so we fail
# quickly, but we will later check this again in the http client,
# hence why we throw the result away.
await get_retry_limiter(
destination=self._destination,
our_server_name=self.server_name,
hs=self._hs,
clock=self._clock,
store=self._store,
)
@@ -339,7 +361,7 @@ class PerDestinationQueue:
# not caught up yet
return
while True:
while self._transmission_loop_enabled:
self._new_data_to_send = False
async with _TransactionQueueManager(self) as (
@@ -352,8 +374,8 @@ class PerDestinationQueue:
# If we've gotten told about new things to send during
# checking for things to send, we try looking again.
# Otherwise new PDUs or EDUs might arrive in the meantime,
# but not get sent because we hold the
# `transmission_loop_running` flag.
# but not get sent because we currently have an
# `active_transmission_loop` running.
if self._new_data_to_send:
continue
else:
@@ -442,6 +464,7 @@ class PerDestinationQueue:
)
finally:
# We want to be *very* sure we clear this after we stop processing
self.active_transmission_loop = None
self.transmission_loop_running = False
async def _catch_up_transmission_loop(self) -> None:
@@ -469,7 +492,7 @@ class PerDestinationQueue:
last_successful_stream_ordering: int = _tmp_last_successful_stream_ordering
# get at most 50 catchup room/PDUs
while True:
while self._transmission_loop_enabled:
event_ids = await self._store.get_catch_up_room_event_ids(
self._destination, last_successful_stream_ordering
)

View File
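`PerDestinationQueue` now remembers the `Deferred` returned by `hs.run_as_background_process` so that `shutdown()` can cancel an in-flight transmission loop. A hedged sketch of that track/cancel/clear pattern using plain Twisted (`FakeQueue` and its members are illustrative stand-ins, not Synapse's API):

```python
from typing import Optional
from twisted.internet import defer, reactor, task

class FakeQueue:
    def __init__(self) -> None:
        self._transmission_loop_enabled = True
        self.active_transmission_loop: Optional[defer.Deferred] = None

    def start(self) -> None:
        # Stand-in for the real transmission loop.
        d = task.deferLater(reactor, 10.0, lambda: None)
        d.addErrback(lambda f: f.trap(defer.CancelledError))
        d.addBoth(self._on_done)
        self.active_transmission_loop = d

    def _on_done(self, result: object) -> object:
        # Mirror the `finally:` in the diff: always clear the reference.
        self.active_transmission_loop = None
        return result

    def shutdown(self) -> None:
        self._transmission_loop_enabled = False
        try:
            if self.active_transmission_loop is not None:
                self.active_transmission_loop.cancel()
        except Exception:
            pass  # best-effort, as in the diff

q = FakeQueue()
q.start()
q.shutdown()
assert q.active_transmission_loop is None
```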

@@ -72,6 +72,12 @@ class TransactionManager:
# HACK to get unique tx id
self._next_txn_id = int(self.clock.time_msec())
self._is_shutdown = False
def shutdown(self) -> None:
self._is_shutdown = True
self._transport_layer.shutdown()
@measure_func("_send_new_transaction")
async def send_new_transaction(
self,
@@ -86,6 +92,12 @@ class TransactionManager:
edus: List of EDUs to send
"""
if self._is_shutdown:
logger.warning(
"TransactionManager has been shutdown, not sending transaction"
)
return
# Make a transaction-sending opentracing span. This span follows on from
# all the edus in that transaction. This needs to be done since there is
# no active span here, so if the edus were not received by the remote the

View File
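The manager refuses new work after shutdown and forwards the request down the stack. A tiny sketch of that guard-and-cascade shape (the class names are stand-ins):

```python
class FakeTransport:
    def __init__(self) -> None:
        self.is_shutdown = False

    def shutdown(self) -> None:
        self.is_shutdown = True

class FakeTransactionManager:
    def __init__(self, transport: FakeTransport) -> None:
        self._transport_layer = transport
        self._is_shutdown = False

    def shutdown(self) -> None:
        # Set our own flag first, then cascade to the layer below.
        self._is_shutdown = True
        self._transport_layer.shutdown()

    def send_new_transaction(self) -> None:
        if self._is_shutdown:
            # Early return: no new transactions once shutdown has begun.
            return
        ...  # build and send the transaction

tm = FakeTransactionManager(FakeTransport())
tm.shutdown()
tm.send_new_transaction()  # no-op after shutdown
```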

@@ -70,6 +70,9 @@ class TransportLayerClient:
self.client = hs.get_federation_http_client()
self._is_mine_server_name = hs.is_mine_server_name
def shutdown(self) -> None:
self.client.shutdown()
async def get_room_state_ids(
self, destination: str, room_id: str, event_id: str
) -> JsonDict:
@@ -170,6 +173,32 @@ class TransportLayerClient:
timeout=timeout,
)
async def ask_policy_server_to_sign_event(
self, destination: str, event: EventBase, timeout: Optional[int] = None
) -> JsonDict:
"""Requests that the destination server (typically a policy server)
sign the event as not spam.
If the policy server could not be contacted or the policy server
returned an error, this raises that error.
Args:
destination: The host name of the policy server / homeserver.
event: The event to sign.
timeout: How long to try (in ms) the destination for before giving up.
None indicates no timeout.
Returns:
The signature from the policy server, structured in the same way as the 'signatures'
JSON in the event, e.g. { "$policy_server_via_domain" : { "ed25519:policy_server": "signature_base64" }}
"""
return await self.client.post_json(
destination=destination,
path="/_matrix/policy/unstable/org.matrix.msc4284/sign",
data=event.get_pdu_json(),
ignore_backoff=True,
timeout=timeout,
)
async def backfill(
self, destination: str, room_id: str, event_tuples: Collection[str], limit: int
) -> Optional[Union[JsonDict, list]]:

View File
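For reference, the new sign endpoint takes the event PDU as the POST body and returns a 'signatures'-shaped object. A hypothetical standalone client using `requests` (the URL path comes from the diff; the event body and merging logic are purely illustrative):

```python
import requests

def ask_policy_server_to_sign(base_url: str, event_pdu: dict, timeout_s: float = 3.0) -> dict:
    # POST the event PDU; the response mirrors the event's `signatures` block,
    # e.g. {"$policy_server_via_domain": {"ed25519:policy_server": "<base64>"}}.
    resp = requests.post(
        f"{base_url}/_matrix/policy/unstable/org.matrix.msc4284/sign",
        json=event_pdu,
        timeout=timeout_s,
    )
    resp.raise_for_status()
    return resp.json()

# Hypothetical usage: merge the returned signatures into the event.
event = {"type": "m.room.message", "signatures": {}}
# signature = ask_policy_server_to_sign("https://policy.example.org", event)
# event["signatures"].update(signature)
```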

@@ -37,10 +37,8 @@ logger = logging.getLogger(__name__)
class AccountValidityHandler:
def __init__(self, hs: "HomeServer"):
self.hs = hs
self.server_name = (
hs.hostname
) # nb must be called this for @wrap_as_background_process
self.hs = hs # nb must be called this for @wrap_as_background_process
self.server_name = hs.hostname
self.config = hs.config
self.store = hs.get_datastores().main
self.send_email_handler = hs.get_send_email_handler()

View File

@@ -47,7 +47,6 @@ from synapse.metrics import (
event_processing_loop_room_count,
)
from synapse.metrics.background_process_metrics import (
run_as_background_process,
wrap_as_background_process,
)
from synapse.storage.databases.main.directory import RoomAliasMapping
@@ -76,9 +75,8 @@ events_processed_counter = Counter(
class ApplicationServicesHandler:
def __init__(self, hs: "HomeServer"):
self.server_name = (
hs.hostname
) # nb must be called this for @wrap_as_background_process
self.server_name = hs.hostname
self.hs = hs # nb must be called this for @wrap_as_background_process
self.store = hs.get_datastores().main
self.is_mine_id = hs.is_mine_id
self.appservice_api = hs.get_application_service_api()
@@ -98,7 +96,7 @@ class ApplicationServicesHandler:
self.is_processing = False
self._ephemeral_events_linearizer = Linearizer(
name="appservice_ephemeral_events"
name="appservice_ephemeral_events", clock=hs.get_clock()
)
def notify_interested_services(self, max_token: RoomStreamToken) -> None:
@@ -171,8 +169,8 @@ class ApplicationServicesHandler:
except Exception:
logger.error("Application Services Failure")
run_as_background_process(
"as_scheduler", self.server_name, start_scheduler
self.hs.run_as_background_process(
"as_scheduler", start_scheduler
)
self.started_scheduler = True

View File
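Most call sites in this changeset follow the same mechanical rewrite: the free function `run_as_background_process(desc, server_name, func, *args)` becomes the method `hs.run_as_background_process(desc, func, *args)`, with the homeserver supplying its own `server_name`. A minimal sketch of why the method form needs no explicit server name (`StubHomeServer` is an illustrative stand-in, not Synapse's class):

```python
import asyncio
from typing import Any, Awaitable, Callable

class StubHomeServer:
    """Illustrative stand-in, not Synapse's HomeServer."""

    def __init__(self, hostname: str) -> None:
        self.hostname = hostname

    def run_as_background_process(
        self, desc: str, func: Callable[..., Awaitable[Any]], *args: Any
    ) -> "asyncio.Task[Any]":
        # The method already knows which server it belongs to, so call
        # sites no longer thread `server_name` through explicitly.
        print(f"[{self.hostname}] starting background process: {desc}")
        return asyncio.create_task(func(*args))

async def demo() -> None:
    hs = StubHomeServer("example.org")

    async def work(n: int) -> None:
        print("did work", n)

    await hs.run_as_background_process("demo_work", work, 42)

asyncio.run(demo())
```

This also matters for the vhosting work: with multiple homeservers in one process, a module-level function can no longer know which instance's name to attach to metrics and logs.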

@@ -24,7 +24,6 @@ from typing import TYPE_CHECKING, Optional
from synapse.api.constants import Membership
from synapse.api.errors import SynapseError
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.replication.http.deactivate_account import (
ReplicationNotifyAccountDeactivatedServlet,
)
@@ -272,8 +271,8 @@ class DeactivateAccountHandler:
pending deactivation, if it isn't already running.
"""
if not self._user_parter_running:
run_as_background_process(
"user_parter_loop", self.server_name, self._user_parter_loop
self.hs.run_as_background_process(
"user_parter_loop", self._user_parter_loop
)
async def _user_parter_loop(self) -> None:

View File

@@ -24,9 +24,6 @@ from synapse.config.workers import MAIN_PROCESS_INSTANCE_NAME
from synapse.logging.context import make_deferred_yieldable
from synapse.logging.opentracing import set_tag
from synapse.metrics import SERVER_NAME_LABEL, event_processing_positions
from synapse.metrics.background_process_metrics import (
run_as_background_process,
)
from synapse.replication.http.delayed_events import (
ReplicationAddedDelayedEventRestServlet,
)
@@ -58,6 +55,7 @@ logger = logging.getLogger(__name__)
class DelayedEventsHandler:
def __init__(self, hs: "HomeServer"):
self.hs = hs
self.server_name = hs.hostname
self._store = hs.get_datastores().main
self._storage_controllers = hs.get_storage_controllers()
@@ -94,7 +92,10 @@ class DelayedEventsHandler:
hs.get_notifier().add_replication_callback(self.notify_new_event)
# Kick off again (without blocking) to catch any missed notifications
# that may have fired before the callback was added.
self._clock.call_later(0, self.notify_new_event)
self._clock.call_later(
0,
self.notify_new_event,
)
# Delayed events that are already marked as processed on startup might not have been
# sent properly on the last run of the server, so unmark them to send them again.
@@ -112,15 +113,14 @@ class DelayedEventsHandler:
self._schedule_next_at(next_send_ts)
# We can send the events in the background once marking them as processed has been awaited
run_as_background_process(
self.hs.run_as_background_process(
"_send_events",
self.server_name,
self._send_events,
events,
)
self._initialized_from_db = run_as_background_process(
"_schedule_db_events", self.server_name, _schedule_db_events
self._initialized_from_db = self.hs.run_as_background_process(
"_schedule_db_events", _schedule_db_events
)
else:
self._repl_client = ReplicationAddedDelayedEventRestServlet.make_client(hs)
@@ -145,9 +145,7 @@ class DelayedEventsHandler:
finally:
self._event_processing = False
run_as_background_process(
"delayed_events.notify_new_event", self.server_name, process
)
self.hs.run_as_background_process("delayed_events.notify_new_event", process)
async def _unsafe_process_new_event(self) -> None:
# We purposefully fetch the current max room stream ordering before
@@ -542,9 +540,8 @@ class DelayedEventsHandler:
if self._next_delayed_event_call is None:
self._next_delayed_event_call = self._clock.call_later(
delay_sec,
run_as_background_process,
self.hs.run_as_background_process,
"_send_on_timeout",
self.server_name,
self._send_on_timeout,
)
else:

View File

@@ -47,7 +47,6 @@ from synapse.api.errors import (
)
from synapse.logging.opentracing import log_kv, set_tag, trace
from synapse.metrics.background_process_metrics import (
run_as_background_process,
wrap_as_background_process,
)
from synapse.replication.http.devices import (
@@ -191,10 +190,9 @@ class DeviceHandler:
and self._delete_stale_devices_after is not None
):
self.clock.looping_call(
run_as_background_process,
self.hs.run_as_background_process,
DELETE_STALE_DEVICES_INTERVAL_MS,
desc="delete_stale_devices",
server_name=self.server_name,
func=self._delete_stale_devices,
)
@@ -1444,14 +1442,19 @@ class DeviceListUpdater(DeviceListWorkerUpdater):
def __init__(self, hs: "HomeServer", device_handler: DeviceWriterHandler):
super().__init__(hs)
self.hs = hs
self.server_name = hs.hostname
self.federation = hs.get_federation_client()
self.server_name = hs.hostname # nb must be called this for @measure_func
self.clock = hs.get_clock() # nb must be called this for @measure_func
self.device_handler = device_handler
self._remote_edu_linearizer = Linearizer(name="remote_device_list")
self._resync_linearizer = Linearizer(name="remote_device_resync")
self._remote_edu_linearizer = Linearizer(
name="remote_device_list", clock=self.clock
)
self._resync_linearizer = Linearizer(
name="remote_device_resync", clock=self.clock
)
# user_id -> list of updates waiting to be handled.
self._pending_updates: Dict[
@@ -1464,6 +1467,7 @@ class DeviceListUpdater(DeviceListWorkerUpdater):
self._seen_updates: ExpiringCache[str, Set[str]] = ExpiringCache(
cache_name="device_update_edu",
server_name=self.server_name,
hs=self.hs,
clock=self.clock,
max_len=10000,
expiry_ms=30 * 60 * 1000,
@@ -1473,9 +1477,8 @@ class DeviceListUpdater(DeviceListWorkerUpdater):
# Attempt to resync out of sync device lists every 30s.
self._resync_retry_lock = Lock()
self.clock.looping_call(
run_as_background_process,
self.hs.run_as_background_process,
30 * 1000,
server_name=self.server_name,
func=self._maybe_retry_device_resync,
desc="_maybe_retry_device_resync",
)
@@ -1595,9 +1598,8 @@ class DeviceListUpdater(DeviceListWorkerUpdater):
if resync:
# We mark as stale up front in case we get restarted.
await self.store.mark_remote_users_device_caches_as_stale([user_id])
run_as_background_process(
self.hs.run_as_background_process(
"_maybe_retry_device_resync",
self.server_name,
self.multi_user_device_resync,
[user_id],
False,

View File
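Several hunks add an explicit `clock=` argument when constructing `Linearizer`s. As a reminder of what a linearizer does (serialize concurrent work that shares a key), here is a minimal asyncio stand-in; Synapse's real class takes `name`, `max_count`, and now `clock`, none of which this toy reproduces beyond the name:

```python
import asyncio
from collections import defaultdict
from typing import DefaultDict

class ToyLinearizer:
    """Serializes coroutines that share a key (cf. Synapse's Linearizer)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._locks: DefaultDict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

    def queue(self, key: str) -> asyncio.Lock:
        # Callers do `async with linearizer.queue(key):` so that work for
        # the same key runs one at a time.
        return self._locks[key]

async def demo() -> None:
    lin = ToyLinearizer("remote_device_resync")

    async def resync(user_id: str) -> None:
        async with lin.queue(user_id):
            await asyncio.sleep(0)

    # Both calls target the same key, so they run strictly one after the other.
    await asyncio.gather(resync("@a:example.org"), resync("@a:example.org"))

asyncio.run(demo())
```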

@@ -112,8 +112,7 @@ class E2eKeysHandler:
# Limit the number of in-flight requests from a single device.
self._query_devices_linearizer = Linearizer(
name="query_devices",
max_count=10,
name="query_devices", max_count=10, clock=hs.get_clock()
)
self._query_appservices_for_otks = (
@@ -1765,7 +1764,9 @@ class SigningKeyEduUpdater:
assert isinstance(device_handler, DeviceWriterHandler)
self._device_handler = device_handler
self._remote_edu_linearizer = Linearizer(name="remote_signing_key")
self._remote_edu_linearizer = Linearizer(
name="remote_signing_key", clock=self.clock
)
# user_id -> list of updates waiting to be handled.
self._pending_updates: Dict[str, List[Tuple[JsonDict, JsonDict]]] = {}

View File

@@ -72,7 +72,6 @@ from synapse.http.servlet import assert_params_in_dict
from synapse.logging.context import nested_logging_context
from synapse.logging.opentracing import SynapseTags, set_tag, tag_args, trace
from synapse.metrics import SERVER_NAME_LABEL
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.module_api import NOT_SPAM
from synapse.storage.databases.main.events_worker import EventRedactBehaviour
from synapse.storage.invite_rule import InviteRule
@@ -160,7 +159,7 @@ class FederationHandler:
self._notifier = hs.get_notifier()
self._worker_locks = hs.get_worker_locks_handler()
self._room_backfill = Linearizer("room_backfill")
self._room_backfill = Linearizer(name="room_backfill", clock=self.clock)
self._third_party_event_rules = (
hs.get_module_api_callbacks().third_party_event_rules
@@ -180,16 +179,16 @@ class FederationHandler:
# When the lock is held for a given room, no other concurrent code may
# partial state or un-partial state the room.
self._is_partial_state_room_linearizer = Linearizer(
name="_is_partial_state_room_linearizer"
name="_is_partial_state_room_linearizer",
clock=self.clock,
)
# if this is the main process, fire off a background process to resume
# any partial-state-resync operations which were in flight when we
# were shut down.
if not hs.config.worker.worker_app:
run_as_background_process(
self.hs.run_as_background_process(
"resume_sync_partial_state_room",
self.server_name,
self._resume_partial_state_room_sync,
)
@@ -317,9 +316,8 @@ class FederationHandler:
logger.debug(
"_maybe_backfill_inner: all backfill points are *after* current depth. Trying again with later backfill points."
)
run_as_background_process(
self.hs.run_as_background_process(
"_maybe_backfill_inner_anyway_with_max_depth",
self.server_name,
self.maybe_backfill,
room_id=room_id,
# We use `MAX_DEPTH` so that we find all backfill points next
@@ -801,9 +799,8 @@ class FederationHandler:
# lots of requests for missing prev_events which we do actually
# have. Hence we fire off the background task, but don't wait for it.
run_as_background_process(
self.hs.run_as_background_process(
"handle_queued_pdus",
self.server_name,
self._handle_queued_pdus,
room_queue,
)
@@ -1876,9 +1873,8 @@ class FederationHandler:
room_id=room_id,
)
run_as_background_process(
self.hs.run_as_background_process(
desc="sync_partial_state_room",
server_name=self.server_name,
func=_sync_partial_state_room_wrapper,
)

View File

@@ -81,7 +81,6 @@ from synapse.logging.opentracing import (
trace,
)
from synapse.metrics import SERVER_NAME_LABEL
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.replication.http.federation import (
ReplicationFederationSendEventsRestServlet,
)
@@ -153,6 +152,7 @@ class FederationEventHandler:
def __init__(self, hs: "HomeServer"):
self.server_name = hs.hostname
self.hs = hs
self._clock = hs.get_clock()
self._store = hs.get_datastores().main
self._state_store = hs.get_datastores().state
@@ -175,6 +175,7 @@ class FederationEventHandler:
)
self._notifier = hs.get_notifier()
self._server_name = hs.hostname
self._is_mine_id = hs.is_mine_id
self._is_mine_server_name = hs.is_mine_server_name
self._instance_name = hs.get_instance_name()
@@ -191,7 +192,7 @@ class FederationEventHandler:
# federation event staging area.
self.room_queues: Dict[str, List[Tuple[EventBase, str]]] = {}
self._room_pdu_linearizer = Linearizer("fed_room_pdu")
self._room_pdu_linearizer = Linearizer(name="fed_room_pdu", clock=self._clock)
async def on_receive_pdu(self, origin: str, pdu: EventBase) -> None:
"""Process a PDU received via a federation /send/ transaction
@@ -974,9 +975,8 @@ class FederationEventHandler:
# Process previously failed backfill events in the background to not waste
# time on something that is likely to fail again.
if len(events_with_failed_pull_attempts) > 0:
run_as_background_process(
self.hs.run_as_background_process(
"_process_new_pulled_events_with_failed_pull_attempts",
self.server_name,
_process_new_pulled_events,
events_with_failed_pull_attempts,
)
@@ -1568,9 +1568,8 @@ class FederationEventHandler:
resync = True
if resync:
run_as_background_process(
self.hs.run_as_background_process(
"resync_device_due_to_pdu",
self.server_name,
self._resync_device,
event.sender,
)

View File

@@ -67,7 +67,6 @@ from synapse.handlers.directory import DirectoryHandler
from synapse.handlers.worker_lock import NEW_EVENT_DURING_PURGE_LOCK_NAME
from synapse.logging import opentracing
from synapse.logging.context import make_deferred_yieldable, run_in_background
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.replication.http.send_events import ReplicationSendEventsRestServlet
from synapse.storage.databases.main.events_worker import EventRedactBehaviour
from synapse.types import (
@@ -99,6 +98,7 @@ class MessageHandler:
def __init__(self, hs: "HomeServer"):
self.server_name = hs.hostname
self.hs = hs
self.auth = hs.get_auth()
self.clock = hs.get_clock()
self.state = hs.get_state_handler()
@@ -113,8 +113,8 @@ class MessageHandler:
self._scheduled_expiry: Optional[IDelayedCall] = None
if not hs.config.worker.worker_app:
run_as_background_process(
"_schedule_next_expiry", self.server_name, self._schedule_next_expiry
self.hs.run_as_background_process(
"_schedule_next_expiry", self._schedule_next_expiry
)
async def get_room_data(
@@ -444,9 +444,8 @@ class MessageHandler:
self._scheduled_expiry = self.clock.call_later(
delay,
run_as_background_process,
self.hs.run_as_background_process,
"_expire_event",
self.server_name,
self._expire_event,
event_id,
)
@@ -513,7 +512,9 @@ class EventCreationHandler:
# We limit concurrent event creation for a room to 1. This prevents state resolution
# from occurring when sending bursts of events to a local room
self.limiter = Linearizer(max_count=1, name="room_event_creation_limit")
self.limiter = Linearizer(
max_count=1, name="room_event_creation_limit", clock=self.clock
)
self._bulk_push_rule_evaluator = hs.get_bulk_push_rule_evaluator()
@@ -546,9 +547,8 @@ class EventCreationHandler:
and self.config.server.cleanup_extremities_with_dummy_events
):
self.clock.looping_call(
lambda: run_as_background_process(
lambda: self.hs.run_as_background_process(
"send_dummy_events_to_fill_extremities",
self.server_name,
self._send_dummy_events_to_fill_extremities,
),
5 * 60 * 1000,
@@ -568,6 +568,7 @@ class EventCreationHandler:
self._external_cache_joined_hosts_updates = ExpiringCache(
cache_name="_external_cache_joined_hosts_updates",
server_name=self.server_name,
hs=self.hs,
clock=self.clock,
expiry_ms=30 * 60 * 1000,
)
@@ -1138,6 +1139,12 @@ class EventCreationHandler:
assert self.hs.is_mine_id(event.sender), "User must be our own: %s" % (
event.sender,
)
# if this room uses a policy server, try to get a signature now.
# We use verify=False here as we are about to call is_event_allowed on the same event
# which will do sig checks.
await self._policy_handler.ask_policy_server_to_sign_event(
event, verify=False
)
policy_allowed = await self._policy_handler.is_event_allowed(event)
if not policy_allowed:
@@ -2105,9 +2112,8 @@ class EventCreationHandler:
if event.type == EventTypes.Message:
# We don't want to block sending messages on any presence code. This
# matters as sometimes presence code can take a while.
run_as_background_process(
self.hs.run_as_background_process(
"bump_presence_active_time",
self.server_name,
self._bump_active_time,
requester.user,
requester.device_id,

View File

@@ -29,7 +29,6 @@ from synapse.api.filtering import Filter
from synapse.events.utils import SerializeEventConfig
from synapse.handlers.worker_lock import NEW_EVENT_DURING_PURGE_LOCK_NAME
from synapse.logging.opentracing import trace
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.rest.admin._base import assert_user_is_admin
from synapse.streams.config import PaginationConfig
from synapse.types import (
@@ -116,10 +115,9 @@ class PaginationHandler:
logger.info("Setting up purge job with config: %s", job)
self.clock.looping_call(
run_as_background_process,
self.hs.run_as_background_process,
job.interval,
"purge_history_for_rooms_in_range",
self.server_name,
self.purge_history_for_rooms_in_range,
job.shortest_max_lifetime,
job.longest_max_lifetime,
@@ -244,9 +242,8 @@ class PaginationHandler:
# We want to purge everything, including local events, and to run the purge in
# the background so that it's not blocking any other operation apart from
# other purges in the same room.
run_as_background_process(
self.hs.run_as_background_process(
PURGE_HISTORY_ACTION_NAME,
self.server_name,
self.purge_history,
room_id,
token,
@@ -604,9 +601,8 @@ class PaginationHandler:
# Otherwise, we can backfill in the background for eventual
# consistency's sake but we don't need to block the client waiting
# for a costly federation call and processing.
run_as_background_process(
self.hs.run_as_background_process(
"maybe_backfill_in_the_background",
self.server_name,
self.hs.get_federation_handler().maybe_backfill,
room_id,
curr_topo,

View File

@@ -107,7 +107,6 @@ from synapse.events.presence_router import PresenceRouter
from synapse.logging.context import run_in_background
from synapse.metrics import SERVER_NAME_LABEL, LaterGauge
from synapse.metrics.background_process_metrics import (
run_as_background_process,
wrap_as_background_process,
)
from synapse.replication.http.presence import (
@@ -537,19 +536,15 @@ class WorkerPresenceHandler(BasePresenceHandler):
self._bump_active_client = ReplicationBumpPresenceActiveTime.make_client(hs)
self._set_state_client = ReplicationPresenceSetState.make_client(hs)
self._send_stop_syncing_loop = self.clock.looping_call(
self.send_stop_syncing, UPDATE_SYNCING_USERS_MS
)
hs.get_clock().add_system_event_trigger(
"before",
"shutdown",
run_as_background_process,
"generic_presence.on_shutdown",
self.server_name,
self._on_shutdown,
self.clock.looping_call(self.send_stop_syncing, UPDATE_SYNCING_USERS_MS)
hs.register_async_shutdown_handler(
phase="before",
eventType="shutdown",
shutdown_func=self._on_shutdown,
)
@wrap_as_background_process("WorkerPresenceHandler._on_shutdown")
async def _on_shutdown(self) -> None:
if self._track_presence:
self.hs.get_replication_command_handler().send_command(
@@ -779,9 +774,7 @@ class WorkerPresenceHandler(BasePresenceHandler):
class PresenceHandler(BasePresenceHandler):
def __init__(self, hs: "HomeServer"):
super().__init__(hs)
self.server_name = (
hs.hostname
) # nb must be called this for @wrap_as_background_process
self.server_name = hs.hostname
self.wheel_timer: WheelTimer[str] = WheelTimer()
self.notifier = hs.get_notifier()
@@ -842,13 +835,10 @@ class PresenceHandler(BasePresenceHandler):
# have not yet been persisted
self.unpersisted_users_changes: Set[str] = set()
hs.get_clock().add_system_event_trigger(
"before",
"shutdown",
run_as_background_process,
"presence.on_shutdown",
self.server_name,
self._on_shutdown,
hs.register_async_shutdown_handler(
phase="before",
eventType="shutdown",
shutdown_func=self._on_shutdown,
)
# Keeps track of the number of *ongoing* syncs on this process. While
@@ -872,14 +862,19 @@ class PresenceHandler(BasePresenceHandler):
] = {}
self.external_process_last_updated_ms: Dict[str, int] = {}
self.external_sync_linearizer = Linearizer(name="external_sync_linearizer")
self.external_sync_linearizer = Linearizer(
name="external_sync_linearizer", clock=self.clock
)
if self._track_presence:
# Start a LoopingCall in 30s that fires every 5s.
# The initial delay is to allow disconnected clients a chance to
# reconnect before we treat them as offline.
self.clock.call_later(
30, self.clock.looping_call, self._handle_timeouts, 5000
30,
self.clock.looping_call,
self._handle_timeouts,
5000,
)
# Presence information is persisted, whether or not it is being tracked
@@ -906,6 +901,7 @@ class PresenceHandler(BasePresenceHandler):
self._event_pos = self.store.get_room_max_stream_ordering()
self._event_processing = False
@wrap_as_background_process("PresenceHandler._on_shutdown")
async def _on_shutdown(self) -> None:
"""Gets called when shutting down. This lets us persist any updates that
we haven't yet persisted, e.g. updates that only changes some internal
@@ -1537,8 +1533,8 @@ class PresenceHandler(BasePresenceHandler):
finally:
self._event_processing = False
run_as_background_process(
"presence.notify_new_event", self.server_name, _process_presence
self.hs.run_as_background_process(
"presence.notify_new_event", _process_presence
)
async def _unsafe_process(self) -> None:

View File
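The presence handlers move from `clock.add_system_event_trigger(...)` wrapping a background process to a single `hs.register_async_shutdown_handler(...)` call. A hedged sketch of what such a registration helper might look like on top of Twisted's system event triggers (the helper body is an assumption; only the call shape comes from the diff):

```python
from typing import Awaitable, Callable
from twisted.internet import defer, reactor

def register_async_shutdown_handler(
    phase: str,
    eventType: str,
    shutdown_func: Callable[[], Awaitable[None]],
) -> None:
    # Hypothetical: wrap the coroutine so Twisted gets a Deferred back and
    # waits for it before finishing shutdown ('before'/'shutdown' triggers
    # may return a Deferred for exactly this purpose).
    def _run() -> "defer.Deferred[None]":
        return defer.ensureDeferred(shutdown_func())

    reactor.addSystemEventTrigger(phase, eventType, _run)

async def _on_shutdown() -> None:
    print("persisting outstanding presence updates...")

register_async_shutdown_handler("before", "shutdown", _on_shutdown)
```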

@@ -56,8 +56,8 @@ class ProfileHandler:
def __init__(self, hs: "HomeServer"):
self.server_name = hs.hostname # nb must be called this for @cached
self.clock = hs.get_clock() # nb must be called this for @cached
self.store = hs.get_datastores().main
self.clock = hs.get_clock()
self.hs = hs
self.federation = hs.get_federation_client()

View File

@@ -36,7 +36,9 @@ class ReadMarkerHandler:
def __init__(self, hs: "HomeServer"):
self.store = hs.get_datastores().main
self.account_data_handler = hs.get_account_data_handler()
self.read_marker_linearizer = Linearizer(name="read_marker")
self.read_marker_linearizer = Linearizer(
name="read_marker", clock=hs.get_clock()
)
async def received_client_read_marker(
self, room_id: str, user_id: str, event_id: str

View File

@@ -23,7 +23,14 @@
"""Contains functions for registering clients."""
import logging
from typing import TYPE_CHECKING, Iterable, List, Optional, Tuple, TypedDict
from typing import (
TYPE_CHECKING,
Iterable,
List,
Optional,
Tuple,
TypedDict,
)
from prometheus_client import Counter

View File

@@ -597,7 +597,7 @@ class RoomCreationHandler:
new_room_version,
additional_creators=additional_creators,
)
initial_state = {}
initial_state: MutableStateMap = {}
# Replicate relevant room events
types_to_copy: List[Tuple[str, Optional[str]]] = [
@@ -693,14 +693,23 @@ class RoomCreationHandler:
additional_creators,
)
# We construct what the body of a call to /createRoom would look like for passing
# to the spam checker. We don't include a preset here, as we expect the
# We construct a subset of what the body of a call to /createRoom would look like
# for passing to the spam checker. We don't include a preset here, as we expect the
# initial state to contain everything we need.
# TODO: given we are upgrading, it would make sense to pass the room_version
# TODO: the preset might be useful too
spam_check = await self._spam_checker_module_callbacks.user_may_create_room(
user_id,
{
"creation_content": creation_content,
"initial_state": list(initial_state.items()),
"initial_state": [
{
"type": state_key[0],
"state_key": state_key[1],
"content": event_content,
}
for state_key, event_content in initial_state.items()
],
},
)
if spam_check != self._spam_checker_module_callbacks.NOT_SPAM:

View File
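The spam-checker body change above converts the internal `{(type, state_key): content}` map into the list-of-dicts shape a real `/createRoom` body uses. The transformation in isolation:

```python
# Internal representation during a room upgrade: keyed by (type, state_key).
initial_state = {
    ("m.room.join_rules", ""): {"join_rule": "invite"},
    ("m.room.history_visibility", ""): {"history_visibility": "shared"},
}

# Wire-format shape expected in a /createRoom body (and now fed to the
# user_may_create_room spam-checker callback).
initial_state_events = [
    {"type": ev_type, "state_key": state_key, "content": content}
    for (ev_type, state_key), content in initial_state.items()
]

assert initial_state_events[0] == {
    "type": "m.room.join_rules",
    "state_key": "",
    "content": {"join_rule": "invite"},
}
```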

@@ -50,7 +50,6 @@ from synapse.handlers.state_deltas import MatchChange, StateDeltasHandler
from synapse.handlers.worker_lock import NEW_EVENT_DURING_PURGE_LOCK_NAME
from synapse.logging import opentracing
from synapse.metrics import SERVER_NAME_LABEL, event_processing_positions
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.replication.http.push import ReplicationCopyPusherRestServlet
from synapse.storage.databases.main.state_deltas import StateDelta
from synapse.storage.invite_rule import InviteRule
@@ -114,8 +113,12 @@ class RoomMemberHandler(metaclass=abc.ABCMeta):
if self.hs.config.server.include_profile_data_on_invite:
self._membership_types_to_include_profile_data_in.add(Membership.INVITE)
self.member_linearizer: Linearizer = Linearizer(name="member")
self.member_as_limiter = Linearizer(max_count=10, name="member_as_limiter")
self.member_linearizer: Linearizer = Linearizer(
name="member", clock=hs.get_clock()
)
self.member_as_limiter = Linearizer(
max_count=10, name="member_as_limiter", clock=hs.get_clock()
)
self.clock = hs.get_clock()
self._spam_checker_module_callbacks = hs.get_module_api_callbacks().spam_checker
@@ -2186,7 +2189,10 @@ class RoomForgetterHandler(StateDeltasHandler):
self._notifier.add_replication_callback(self.notify_new_event)
# We kick this off to pick up outstanding work from before the last restart.
self._clock.call_later(0, self.notify_new_event)
self._clock.call_later(
0,
self.notify_new_event,
)
def notify_new_event(self) -> None:
"""Called when there may be more deltas to process"""
@@ -2201,9 +2207,7 @@ class RoomForgetterHandler(StateDeltasHandler):
finally:
self._is_processing = False
run_as_background_process(
"room_forgetter.notify_new_event", self.server_name, process
)
self._hs.run_as_background_process("room_forgetter.notify_new_event", process)
async def _unsafe_process(self) -> None:
# If self.pos is None then it means we haven't fetched it from the DB

View File

@@ -17,6 +17,11 @@
import logging
from typing import TYPE_CHECKING
from signedjson.key import decode_verify_key_bytes
from unpaddedbase64 import decode_base64
from synapse.api.errors import SynapseError
from synapse.crypto.keyring import VerifyJsonRequest
from synapse.events import EventBase
from synapse.types.handlers.policy_server import RECOMMENDATION_OK
from synapse.util.stringutils import parse_and_validate_server_name
@@ -26,6 +31,9 @@ if TYPE_CHECKING:
logger = logging.getLogger(__name__)
POLICY_SERVER_EVENT_TYPE = "org.matrix.msc4284.policy"
POLICY_SERVER_KEY_ID = "ed25519:policy_server"
class RoomPolicyHandler:
def __init__(self, hs: "HomeServer"):
@@ -54,11 +62,11 @@ class RoomPolicyHandler:
Returns:
bool: True if the event is allowed in the room, False otherwise.
"""
if event.type == "org.matrix.msc4284.policy" and event.state_key is not None:
if event.type == POLICY_SERVER_EVENT_TYPE and event.state_key is not None:
return True # always allow policy server change events
policy_event = await self._storage_controllers.state.get_current_state_event(
event.room_id, "org.matrix.msc4284.policy", ""
event.room_id, POLICY_SERVER_EVENT_TYPE, ""
)
if not policy_event:
return True # no policy server == default allow
@@ -81,6 +89,22 @@ class RoomPolicyHandler:
if not is_in_room:
return True # policy server not in room == default allow
# Check if the event has been signed with the public key in the policy server state event.
# If it is, we can save an HTTP hit.
# We actually want to get the policy server state event BEFORE THE EVENT rather than
# the current state value, else changing the public key would cause all of these checks to fail.
# However, since we also check outlier events (is_event_allowed is called near the
# edges, in _check_sigs_and_hash) we won't know the state before the event, so the
# only safe option is to use the current state.
public_key = policy_event.content.get("public_key", None)
if public_key is not None and isinstance(public_key, str):
valid = await self._verify_policy_server_signature(
event, policy_server, public_key
)
if valid:
return True
# fallthrough to hit /check manually
# At this point, the server appears valid and is in the room, so ask it to check
# the event.
recommendation = await self._federation_client.get_pdu_policy_recommendation(
@@ -90,3 +114,73 @@ class RoomPolicyHandler:
return False
return True # default allow
async def _verify_policy_server_signature(
self, event: EventBase, policy_server: str, public_key: str
) -> bool:
# check the event is signed with this (via, public_key).
verify_json_req = VerifyJsonRequest.from_event(policy_server, event, 0)
try:
key_bytes = decode_base64(public_key)
verify_key = decode_verify_key_bytes(POLICY_SERVER_KEY_ID, key_bytes)
# We would normally use KeyRing.verify_event_for_server but we can't here as we don't
# want to fetch the server key, and instead want to use the public key in the state event.
await self._hs.get_keyring().process_json(verify_key, verify_json_req)
# If the event is correctly signed by the public key in the policy server state event, allow it.
return True
except Exception as ex:
logger.warning(
"failed to verify event using public key in policy server event: %s", ex
)
return False
async def ask_policy_server_to_sign_event(
self, event: EventBase, verify: bool = False
) -> None:
"""Ask the policy server to sign this event. The signature is added to the event signatures block.
Does nothing if there is no policy server state event in the room, or if the
policy server refuses to sign the event (because it considers it spam).
Args:
event: The event to sign
verify: If True, verify that the signature is correctly signed by the public_key in the
policy server state event.
Raises:
SynapseError: if verify=True and the policy server signed the event with an invalid
signature. Does not raise if the policy server refuses to sign the event.
"""
policy_event = await self._storage_controllers.state.get_current_state_event(
event.room_id, POLICY_SERVER_EVENT_TYPE, ""
)
if not policy_event:
return
policy_server = policy_event.content.get("via", None)
if policy_server is None or not isinstance(policy_server, str):
return
# Only ask to sign events if the policy state event has a public_key (so they can be subsequently verified)
public_key = policy_event.content.get("public_key", None)
if public_key is None or not isinstance(public_key, str):
return
# Ask the policy server to sign this event.
# We set a smallish timeout here as we don't want to block event sending too long.
signature = await self._federation_client.ask_policy_server_to_sign_event(
policy_server,
event,
timeout=3000,
)
if (
# the policy server returns {} if it refuses to sign the event.
signature and len(signature) > 0
):
event.signatures.update(signature)
if verify:
is_valid = await self._verify_policy_server_signature(
event, policy_server, public_key
)
if not is_valid:
raise SynapseError(
500,
f"policy server {policy_server} failed to sign event correctly",
)

View File
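The fast path above verifies the event against the `public_key` advertised in the policy server state event, skipping the `/check` HTTP hit when the signature is valid. A hedged sketch of the underlying signedjson verification, using a plain signed JSON object rather than a Synapse event and a freshly generated key pair (in the real flow, only the base64 public key from the state event is available):

```python
from signedjson.key import (
    decode_verify_key_bytes,
    encode_verify_key_base64,
    generate_signing_key,
    get_verify_key,
)
from signedjson.sign import SignatureVerifyException, sign_json, verify_signed_json
from unpaddedbase64 import decode_base64

POLICY_SERVER_KEY_ID = "ed25519:policy_server"

# Stand-in for the policy server's key pair.
signing_key = generate_signing_key("policy_server")
public_key_b64 = encode_verify_key_base64(get_verify_key(signing_key))

signed = sign_json({"content": {"body": "hello"}}, "policy.example.org", signing_key)

# What the handler does with the state event's public_key:
verify_key = decode_verify_key_bytes(POLICY_SERVER_KEY_ID, decode_base64(public_key_b64))
try:
    verify_signed_json(signed, "policy.example.org", verify_key)
    print("signature ok -> allow without calling /check")
except SignatureVerifyException:
    print("bad signature -> fall through to /check")
```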

@@ -224,7 +224,7 @@ class SsoHandler:
)
# a lock on the mappings
self._mapping_lock = Linearizer(name="sso_user_mapping", clock=hs.get_clock())
self._mapping_lock = Linearizer(clock=hs.get_clock(), name="sso_user_mapping")
# a map from session id to session data
self._username_mapping_sessions: Dict[str, UsernameMappingSession] = {}

View File

@@ -33,7 +33,6 @@ from typing import (
from synapse.api.constants import EventContentFields, EventTypes, Membership
from synapse.metrics import SERVER_NAME_LABEL, event_processing_positions
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.storage.databases.main.state_deltas import StateDelta
from synapse.types import JsonDict
from synapse.util.events import get_plain_text_topic_from_event_content
@@ -75,7 +74,10 @@ class StatsHandler:
# We kick this off so that we don't have to wait for a change before
# we start populating stats
self.clock.call_later(0, self.notify_new_event)
self.clock.call_later(
0,
self.notify_new_event,
)
def notify_new_event(self) -> None:
"""Called when there may be more deltas to process"""
@@ -90,7 +92,7 @@ class StatsHandler:
finally:
self._is_processing = False
run_as_background_process("stats.notify_new_event", self.server_name, process)
self.hs.run_as_background_process("stats.notify_new_event", process)
async def _unsafe_process(self) -> None:
# If self.pos is None then it means we haven't fetched it from the DB

View File

@@ -323,6 +323,7 @@ class SyncHandler:
] = ExpiringCache(
cache_name="lazy_loaded_members_cache",
server_name=self.server_name,
hs=hs,
clock=self.clock,
max_len=0,
expiry_ms=LAZY_LOADED_MEMBERS_CACHE_MAX_AGE,
@@ -980,7 +981,11 @@ class SyncHandler:
)
if cache is None:
logger.debug("creating LruCache for %r", cache_key)
cache = LruCache(max_size=LAZY_LOADED_MEMBERS_CACHE_MAX_SIZE)
cache = LruCache(
max_size=LAZY_LOADED_MEMBERS_CACHE_MAX_SIZE,
clock=self.clock,
server_name=self.server_name,
)
self.lazy_loaded_members_cache[cache_key] = cache
else:
logger.debug("found LruCache for %r", cache_key)

View File

@@ -28,7 +28,6 @@ from synapse.api.constants import EduTypes
from synapse.api.errors import AuthError, ShadowBanError, SynapseError
from synapse.appservice import ApplicationService
from synapse.metrics.background_process_metrics import (
run_as_background_process,
wrap_as_background_process,
)
from synapse.replication.tcp.streams import TypingStream
@@ -78,11 +77,10 @@ class FollowerTypingHandler:
"""
def __init__(self, hs: "HomeServer"):
self.hs = hs # nb must be called this for @wrap_as_background_process
self.store = hs.get_datastores().main
self._storage_controllers = hs.get_storage_controllers()
self.server_name = (
hs.hostname
) # nb must be called this for @wrap_as_background_process
self.server_name = hs.hostname
self.clock = hs.get_clock()
self.is_mine_id = hs.is_mine_id
self.is_mine_server_name = hs.is_mine_server_name
@@ -144,9 +142,8 @@ class FollowerTypingHandler:
if self.federation and self.is_mine_id(member.user_id):
last_fed_poke = self._member_last_federation_poke.get(member, None)
if not last_fed_poke or last_fed_poke + FEDERATION_PING_INTERVAL <= now:
run_as_background_process(
self.hs.run_as_background_process(
"typing._push_remote",
self.server_name,
self._push_remote,
member=member,
typing=True,
@@ -220,9 +217,8 @@ class FollowerTypingHandler:
self._rooms_updated.add(row.room_id)
if self.federation:
run_as_background_process(
self.hs.run_as_background_process(
"_send_changes_in_typing_to_remotes",
self.server_name,
self._send_changes_in_typing_to_remotes,
row.room_id,
prev_typing,
@@ -384,9 +380,8 @@ class TypingWriterHandler(FollowerTypingHandler):
def _push_update(self, member: RoomMember, typing: bool) -> None:
if self.hs.is_mine_id(member.user_id):
# Only send updates for changes to our own users.
run_as_background_process(
self.hs.run_as_background_process(
"typing._push_remote",
self.server_name,
self._push_remote,
member,
typing,

View File

@@ -36,7 +36,6 @@ from synapse.api.constants import (
from synapse.api.errors import Codes, SynapseError
from synapse.handlers.state_deltas import MatchChange, StateDeltasHandler
from synapse.metrics import SERVER_NAME_LABEL
from synapse.metrics.background_process_metrics import run_as_background_process
from synapse.storage.databases.main.state_deltas import StateDelta
from synapse.storage.databases.main.user_directory import SearchResult
from synapse.storage.roommember import ProfileInfo
@@ -137,11 +136,15 @@ class UserDirectoryHandler(StateDeltasHandler):
# We kick this off so that we don't have to wait for a change before
# we start populating the user directory
self.clock.call_later(0, self.notify_new_event)
self.clock.call_later(
0,
self.notify_new_event,
)
# Kick off the profile refresh process on startup
self._refresh_remote_profiles_call_later = self.clock.call_later(
10, self.kick_off_remote_profile_refresh_process
10,
self.kick_off_remote_profile_refresh_process,
)
async def search_users(
@@ -193,9 +196,7 @@ class UserDirectoryHandler(StateDeltasHandler):
self._is_processing = False
self._is_processing = True
run_as_background_process(
"user_directory.notify_new_event", self.server_name, process
)
self._hs.run_as_background_process("user_directory.notify_new_event", process)
async def handle_local_profile_change(
self, user_id: str, profile: ProfileInfo
@@ -609,8 +610,8 @@ class UserDirectoryHandler(StateDeltasHandler):
self._is_refreshing_remote_profiles = False
self._is_refreshing_remote_profiles = True
run_as_background_process(
"user_directory.refresh_remote_profiles", self.server_name, process
self._hs.run_as_background_process(
"user_directory.refresh_remote_profiles", process
)
async def _unsafe_refresh_remote_profiles(self) -> None:
@@ -655,8 +656,9 @@ class UserDirectoryHandler(StateDeltasHandler):
if not users:
return
_, _, next_try_at_ts = users[0]
delay = ((next_try_at_ts - self.clock.time_msec()) // 1000) + 2
self._refresh_remote_profiles_call_later = self.clock.call_later(
((next_try_at_ts - self.clock.time_msec()) // 1000) + 2,
delay,
self.kick_off_remote_profile_refresh_process,
)
@@ -692,9 +694,8 @@ class UserDirectoryHandler(StateDeltasHandler):
self._is_refreshing_remote_profiles_for_servers.remove(server_name)
self._is_refreshing_remote_profiles_for_servers.add(server_name)
run_as_background_process(
self._hs.run_as_background_process(
"user_directory.refresh_remote_profiles_for_remote_server",
self.server_name,
process,
)

View File

@@ -37,13 +37,13 @@ from weakref import WeakSet
import attr
from twisted.internet import defer
from twisted.internet.interfaces import IReactorTime
from synapse.logging.context import PreserveLoggingContext
from synapse.logging.opentracing import start_active_span
from synapse.metrics.background_process_metrics import wrap_as_background_process
from synapse.storage.databases.main.lock import Lock, LockStore
from synapse.util.async_helpers import timeout_deferred
from synapse.util.clock import Clock
from synapse.util.constants import ONE_MINUTE_SECONDS
if TYPE_CHECKING:
@@ -66,10 +66,8 @@ class WorkerLocksHandler:
"""
def __init__(self, hs: "HomeServer") -> None:
self.server_name = (
hs.hostname
) # nb must be called this for @wrap_as_background_process
self._reactor = hs.get_reactor()
self.hs = hs # nb must be called this for @wrap_as_background_process
self._clock = hs.get_clock()
self._store = hs.get_datastores().main
self._clock = hs.get_clock()
self._notifier = hs.get_notifier()
@@ -98,7 +96,7 @@ class WorkerLocksHandler:
"""
lock = WaitingLock(
reactor=self._reactor,
clock=self._clock,
store=self._store,
handler=self,
lock_name=lock_name,
@@ -129,7 +127,7 @@ class WorkerLocksHandler:
"""
lock = WaitingLock(
reactor=self._reactor,
clock=self._clock,
store=self._store,
handler=self,
lock_name=lock_name,
@@ -160,7 +158,7 @@ class WorkerLocksHandler:
lock = WaitingMultiLock(
lock_names=lock_names,
write=write,
reactor=self._reactor,
clock=self._clock,
store=self._store,
handler=self,
)
@@ -197,7 +195,11 @@ class WorkerLocksHandler:
if not deferred.called:
deferred.callback(None)
self._clock.call_later(0, _wake_all_locks, locks)
self._clock.call_later(
0,
_wake_all_locks,
locks,
)
@wrap_as_background_process("_cleanup_locks")
async def _cleanup_locks(self) -> None:
@@ -207,7 +209,7 @@ class WorkerLocksHandler:
@attr.s(auto_attribs=True, eq=False)
class WaitingLock:
reactor: IReactorTime
clock: Clock
store: LockStore
handler: WorkerLocksHandler
lock_name: str
@@ -246,10 +248,11 @@ class WaitingLock:
# periodically wake up in case the lock was released but we
# weren't notified.
with PreserveLoggingContext():
timeout = self._get_next_retry_interval()
await timeout_deferred(
deferred=self.deferred,
timeout=self._get_next_retry_interval(),
reactor=self.reactor,
timeout=timeout,
clock=self.clock,
)
except Exception:
pass
@@ -290,7 +293,7 @@ class WaitingMultiLock:
write: bool
reactor: IReactorTime
clock: Clock
store: LockStore
handler: WorkerLocksHandler
@@ -323,10 +326,11 @@ class WaitingMultiLock:
# periodically wake up in case the lock was released but we
# weren't notified.
with PreserveLoggingContext():
timeout = self._get_next_retry_interval()
await timeout_deferred(
deferred=self.deferred,
timeout=self._get_next_retry_interval(),
reactor=self.reactor,
timeout=timeout,
clock=self.clock,
)
except Exception:
pass

View File
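`timeout_deferred` now takes the homeserver `Clock` (with keyword arguments at every call site) instead of a raw reactor. A simplified stand-in showing the core mechanics against a bare reactor; Synapse's real helper also converts the cancellation into a timeout error and supports `cancel_on_shutdown`, neither of which this sketch does:

```python
from twisted.internet import defer, reactor

def toy_timeout_deferred(deferred: "defer.Deferred", timeout: float) -> "defer.Deferred":
    # Schedule a cancel; if the deferred fires first, unschedule it.
    delayed_call = reactor.callLater(timeout, deferred.cancel)

    def _cleanup(result: object) -> object:
        if delayed_call.active():
            delayed_call.cancel()
        return result

    deferred.addBoth(_cleanup)
    return deferred

d: "defer.Deferred" = defer.Deferred()
toy_timeout_deferred(d, 30.0)
d.callback("done before the timeout")  # the pending cancel is unscheduled
```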

@@ -54,7 +54,6 @@ from twisted.internet.interfaces import (
IOpenSSLContextFactory,
IReactorCore,
IReactorPluggableNameResolver,
IReactorTime,
IResolutionReceiver,
ITCPTransport,
)
@@ -88,6 +87,7 @@ from synapse.logging.opentracing import set_tag, start_active_span, tags
from synapse.metrics import SERVER_NAME_LABEL
from synapse.types import ISynapseReactor, StrSequence
from synapse.util.async_helpers import timeout_deferred
from synapse.util.clock import Clock
from synapse.util.json import json_decoder
if TYPE_CHECKING:
@@ -165,16 +165,17 @@ def _is_ip_blocked(
_EPSILON = 0.00000001
def _make_scheduler(
reactor: IReactorTime,
) -> Callable[[Callable[[], object]], IDelayedCall]:
def _make_scheduler(clock: Clock) -> Callable[[Callable[[], object]], IDelayedCall]:
"""Makes a schedular suitable for a Cooperator using the given reactor.
(This is effectively just a copy from `twisted.internet.task`)
"""
def _scheduler(x: Callable[[], object]) -> IDelayedCall:
return reactor.callLater(_EPSILON, x)
return clock.call_later(
_EPSILON,
x,
)
return _scheduler
@@ -367,7 +368,7 @@ class BaseHttpClient:
# We use this for our body producers to ensure that they use the correct
# reactor.
self._cooperator = Cooperator(scheduler=_make_scheduler(hs.get_reactor()))
self._cooperator = Cooperator(scheduler=_make_scheduler(hs.get_clock()))
async def request(
self,
@@ -436,9 +437,9 @@ class BaseHttpClient:
# we use our own timeout mechanism rather than treq's as a workaround
# for https://twistedmatrix.com/trac/ticket/9534.
request_deferred = timeout_deferred(
request_deferred,
60,
self.hs.get_reactor(),
deferred=request_deferred,
timeout=60,
clock=self.hs.get_clock(),
)
# turn timeouts into RequestTimedOutErrors
@@ -763,7 +764,11 @@ class BaseHttpClient:
d = read_body_with_max_size(response, output_stream, max_size)
# Ensure that the body is not read forever.
d = timeout_deferred(d, 30, self.hs.get_reactor())
d = timeout_deferred(
deferred=d,
timeout=30,
clock=self.hs.get_clock(),
)
length = await make_deferred_yieldable(d)
except BodyExceededMaxSize:
@@ -957,9 +962,9 @@ class ReplicationClient(BaseHttpClient):
# for https://twistedmatrix.com/trac/ticket/9534.
# (Updated url https://github.com/twisted/twisted/issues/9534)
request_deferred = timeout_deferred(
request_deferred,
60,
self.hs.get_reactor(),
deferred=request_deferred,
timeout=60,
clock=self.hs.get_clock(),
)
# turn timeouts into RequestTimedOutErrors

View File

@@ -67,6 +67,9 @@ class MatrixFederationAgent:
Args:
reactor: twisted reactor to use for underlying requests
clock: Internal `HomeServer` clock used to track delayed and looping calls.
Should be obtained from `hs.get_clock()`.
tls_client_options_factory:
factory to use for fetching client tls options, or none to disable TLS.
@@ -97,6 +100,7 @@ class MatrixFederationAgent:
*,
server_name: str,
reactor: ISynapseReactor,
clock: Clock,
tls_client_options_factory: Optional[FederationPolicyForHTTPS],
user_agent: bytes,
ip_allowlist: Optional[IPSet],
@@ -109,6 +113,7 @@ class MatrixFederationAgent:
Args:
server_name: Our homeserver name (used to label metrics) (`hs.hostname`).
reactor
clock: Should be the `hs` clock from `hs.get_clock()`
tls_client_options_factory
user_agent
ip_allowlist
@@ -124,7 +129,6 @@ class MatrixFederationAgent:
# addresses, to prevent DNS rebinding.
reactor = BlocklistingReactorWrapper(reactor, ip_allowlist, ip_blocklist)
self._clock = Clock(reactor)
self._pool = HTTPConnectionPool(reactor)
self._pool.retryAutomatically = False
self._pool.maxPersistentPerHost = 5
@@ -147,6 +151,7 @@ class MatrixFederationAgent:
_well_known_resolver = WellKnownResolver(
server_name=server_name,
reactor=reactor,
clock=clock,
agent=BlocklistingAgentWrapper(
ProxyAgent(
reactor=reactor,

View File

@@ -90,6 +90,7 @@ class WellKnownResolver:
self,
server_name: str,
reactor: ISynapseThreadlessReactor,
clock: Clock,
agent: IAgent,
user_agent: bytes,
well_known_cache: Optional[TTLCache[bytes, Optional[bytes]]] = None,
@@ -99,6 +100,7 @@ class WellKnownResolver:
Args:
server_name: Our homeserver name (used to label metrics) (`hs.hostname`).
reactor
clock: Should be the `hs` clock from `hs.get_clock()`
agent
user_agent
well_known_cache
@@ -107,7 +109,7 @@ class WellKnownResolver:
self.server_name = server_name
self._reactor = reactor
self._clock = Clock(reactor)
self._clock = clock
if well_known_cache is None:
well_known_cache = TTLCache(

View File

@@ -90,6 +90,7 @@ from synapse.logging.opentracing import set_tag, start_active_span, tags
from synapse.metrics import SERVER_NAME_LABEL
from synapse.types import JsonDict
from synapse.util.async_helpers import AwakenableSleeper, Linearizer, timeout_deferred
from synapse.util.clock import Clock
from synapse.util.json import json_decoder
from synapse.util.metrics import Measure
from synapse.util.stringutils import parse_and_validate_server_name
@@ -270,6 +271,7 @@ class LegacyJsonSendParser(_BaseJsonParser[Tuple[int, JsonDict]]):
async def _handle_response(
clock: Clock,
reactor: IReactorTime,
timeout_sec: float,
request: MatrixFederationRequest,
@@ -299,7 +301,12 @@ async def _handle_response(
check_content_type_is(response.headers, parser.CONTENT_TYPE)
d = read_body_with_max_size(response, parser, max_response_size)
d = timeout_deferred(d, timeout=timeout_sec, reactor=reactor)
d = timeout_deferred(
deferred=d,
timeout=timeout_sec,
cancel_on_shutdown=False, # We don't track this call since it's short
clock=clock,
)
length = await make_deferred_yieldable(d)
@@ -411,6 +418,7 @@ class MatrixFederationHttpClient:
self.server_name = hs.hostname
self.reactor = hs.get_reactor()
self.clock = hs.get_clock()
user_agent = hs.version_string
if hs.config.server.user_agent_suffix:
@@ -424,6 +432,7 @@ class MatrixFederationHttpClient:
federation_agent: IAgent = MatrixFederationAgent(
server_name=self.server_name,
reactor=self.reactor,
clock=self.clock,
tls_client_options_factory=tls_client_options_factory,
user_agent=user_agent.encode("ascii"),
ip_allowlist=hs.config.server.federation_ip_range_allowlist,
@@ -457,7 +466,6 @@ class MatrixFederationHttpClient:
ip_blocklist=hs.config.server.federation_ip_range_blocklist,
)
self.clock = hs.get_clock()
self._store = hs.get_datastores().main
self.version_string_bytes = hs.version_string.encode("ascii")
self.default_timeout_seconds = hs.config.federation.client_timeout_ms / 1000
@@ -470,9 +478,9 @@ class MatrixFederationHttpClient:
self.max_long_retries = hs.config.federation.max_long_retries
self.max_short_retries = hs.config.federation.max_short_retries
self._cooperator = Cooperator(scheduler=_make_scheduler(self.reactor))
self._cooperator = Cooperator(scheduler=_make_scheduler(self.clock))
self._sleeper = AwakenableSleeper(self.reactor)
self._sleeper = AwakenableSleeper(self.clock)
self._simple_http_client = SimpleHttpClient(
hs,
@@ -481,7 +489,13 @@ class MatrixFederationHttpClient:
use_proxy=True,
)
self.remote_download_linearizer = Linearizer("remote_download_linearizer", 6)
self.remote_download_linearizer = Linearizer(
name="remote_download_linearizer", max_count=6, clock=self.clock
)
self._is_shutdown = False
def shutdown(self) -> None:
self._is_shutdown = True
def wake_destination(self, destination: str) -> None:
"""Called when the remote server may have come back online."""
@@ -627,6 +641,7 @@ class MatrixFederationHttpClient:
limiter = await synapse.util.retryutils.get_retry_limiter(
destination=request.destination,
our_server_name=self.server_name,
hs=self.hs,
clock=self.clock,
store=self._store,
backoff_on_404=backoff_on_404,
@@ -673,7 +688,7 @@ class MatrixFederationHttpClient:
(b"", b"", path_bytes, None, query_bytes, b"")
)
while True:
while not self._is_shutdown:
try:
json = request.get_json()
if json:
@@ -731,9 +746,10 @@ class MatrixFederationHttpClient:
bodyProducer=producer,
)
request_deferred = timeout_deferred(
request_deferred,
deferred=request_deferred,
timeout=_sec_timeout,
reactor=self.reactor,
cancel_on_shutdown=False, # We don't track this call since it will typically be short
clock=self.clock,
)
response = await make_deferred_yieldable(request_deferred)
@@ -791,7 +807,10 @@ class MatrixFederationHttpClient:
# Update transactions table?
d = treq.content(response)
d = timeout_deferred(
d, timeout=_sec_timeout, reactor=self.reactor
deferred=d,
timeout=_sec_timeout,
cancel_on_shutdown=False, # We don't track this call since it will typically be short
clock=self.clock,
)
try:
@@ -860,6 +879,15 @@ class MatrixFederationHttpClient:
delay_seconds,
)
if self._is_shutdown:
# Immediately fail sending the request instead of starting a
# potentially long sleep after the server has requested
# shutdown.
# This is the code path followed when the
# `federation_transaction_transmission_loop` has been
# cancelled.
raise
# Sleep for the calculated delay, or wake up immediately
# if we get notified that the server is back up.
await self._sleeper.sleep(
@@ -1072,6 +1100,7 @@ class MatrixFederationHttpClient:
parser = cast(ByteParser[T], JsonParser())
body = await _handle_response(
self.clock,
self.reactor,
_sec_timeout,
request,
@@ -1150,7 +1179,13 @@ class MatrixFederationHttpClient:
_sec_timeout = self.default_timeout_seconds
body = await _handle_response(
self.reactor, _sec_timeout, request, response, start_ms, parser=JsonParser()
self.clock,
self.reactor,
_sec_timeout,
request,
response,
start_ms,
parser=JsonParser(),
)
return body
@@ -1356,6 +1391,7 @@ class MatrixFederationHttpClient:
parser = cast(ByteParser[T], JsonParser())
body = await _handle_response(
self.clock,
self.reactor,
_sec_timeout,
request,
@@ -1429,7 +1465,13 @@ class MatrixFederationHttpClient:
_sec_timeout = self.default_timeout_seconds
body = await _handle_response(
self.reactor, _sec_timeout, request, response, start_ms, parser=JsonParser()
self.clock,
self.reactor,
_sec_timeout,
request,
response,
start_ms,
parser=JsonParser(),
)
return body

View File

@@ -161,12 +161,13 @@ class ProxyResource(_AsyncResource):
bodyProducer=QuieterFileBodyProducer(request.content),
)
request_deferred = timeout_deferred(
request_deferred,
deferred=request_deferred,
# This should be set longer than the timeout in `MatrixFederationHttpClient`
# so that it has enough time to complete and pass us the data before we give
# up.
timeout=90,
reactor=self.reactor,
cancel_on_shutdown=False, # We don't track this call since it will typically be short
clock=self._clock,
)
response = await make_deferred_yieldable(request_deferred)

View File

@@ -411,8 +411,26 @@ class DirectServeJsonResource(_AsyncResource):
# Clock is optional as this class is exposed to the module API.
clock: Optional[Clock] = None,
):
"""
Args:
canonical_json: TODO
extract_context: TODO
clock: This is expected to be passed in by any Synapse code.
Only optional for the Module API.
"""
if clock is None:
clock = Clock(cast(ISynapseThreadlessReactor, reactor))
# Ideally we wouldn't ignore the linter error here and instead enforce a
# required `Clock` be passed into the `__init__` function.
# However, this would change the function signature which is currently being
# exported to the module api. Since we don't want to break that api, we have
# to settle with ignoring the linter error here.
# As of the time of writing this, all Synapse internal usages of
# `DirectServeJsonResource` pass in the existing homeserver clock instance.
clock = Clock( # type: ignore[multiple-internal-clocks]
cast(ISynapseThreadlessReactor, reactor),
server_name="synapse_module_running_from_unknown_server",
)
super().__init__(clock, extract_context)
self.canonical_json = canonical_json
@@ -590,8 +608,24 @@ class DirectServeHtmlResource(_AsyncResource):
# Clock is optional as this class is exposed to the module API.
clock: Optional[Clock] = None,
):
"""
Args:
extract_context: TODO
clock: This is expected to be passed in by any Synapse code.
Only optional for the Module API.
"""
if clock is None:
-clock = Clock(cast(ISynapseThreadlessReactor, reactor))
+# Ideally we wouldn't ignore the linter error here and would instead require
+# a `Clock` to be passed into `__init__`. However, that would change the
+# function signature, which is currently exported to the module API. Since
+# we don't want to break that API, we settle for ignoring the linter error.
+# As of the time of writing, all Synapse-internal usages of
+# `DirectServeHtmlResource` pass in the existing homeserver clock instance.
+clock = Clock(  # type: ignore[multiple-internal-clocks]
+cast(ISynapseThreadlessReactor, reactor),
+server_name="synapse_module_running_from_unknown_server",
+)
super().__init__(clock, extract_context)
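In practice the split described by the two docstrings looks roughly like this; `hs.get_clock()` is assumed to be a HomeServer's shared clock accessor:

```python
# Synapse-internal code: always pass the homeserver's shared clock.
resource = DirectServeHtmlResource(clock=hs.get_clock())

# Module-API code with no Clock to hand may omit it and fall back to a
# standalone Clock tagged "synapse_module_running_from_unknown_server".
resource = DirectServeHtmlResource()
```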

View File

@@ -22,7 +22,7 @@ import contextlib
import logging
import time
from http import HTTPStatus
-from typing import TYPE_CHECKING, Any, Generator, Optional, Tuple, Union
+from typing import TYPE_CHECKING, Any, Generator, List, Optional, Tuple, Union
import attr
from zope.interface import implementer
@@ -30,6 +30,7 @@ from zope.interface import implementer
from twisted.internet.address import UNIXAddress
from twisted.internet.defer import Deferred
from twisted.internet.interfaces import IAddress
from twisted.internet.protocol import Protocol
from twisted.python.failure import Failure
from twisted.web.http import HTTPChannel
from twisted.web.resource import IResource, Resource
@@ -302,10 +303,15 @@ class SynapseRequest(Request):
# this is called once a Resource has been found to serve the request; in our
# case the Resource in question will normally be a JsonResource.
-# create a LogContext for this request
+# Create a LogContext for this request
+#
+# We only care about associating logs and tallying up metrics at the
+# per-request level, so we don't set a `parent_context`; this prevents us
+# from unnecessarily piling up metrics on the main process's context.
request_id = self.get_request_id()
self.logcontext = LoggingContext(
-request_id,
+name=request_id,
+server_name=self.our_server_name,
request=ContextRequest(
request_id=request_id,
ip_address=self.get_client_ip_if_available(),
@@ -655,6 +661,70 @@ class _XForwardedForAddress:
host: str
class SynapseProtocol(HTTPChannel):
"""
Synapse-specific twisted http Protocol.
This is a small wrapper around the Twisted HTTPChannel that lets us track
active connections, so any still outstanding can be closed on shutdown.
"""
def __init__(
self,
site: "SynapseSite",
our_server_name: str,
max_request_body_size: int,
request_id_header: Optional[str],
request_class: type,
):
super().__init__()
self.factory: SynapseSite = site
self.site = site
self.our_server_name = our_server_name
self.max_request_body_size = max_request_body_size
self.request_id_header = request_id_header
self.request_class = request_class
def connectionMade(self) -> None:
"""
Called when a connection is made.
This may be considered the initializer of the protocol, because
it is called when the connection is completed.
Add the connection to the factory's connection list when it's established.
"""
super().connectionMade()
self.factory.addConnection(self)
def connectionLost(self, reason: Failure) -> None: # type: ignore[override]
"""
Called when the connection is shut down.
Clear any circular references here, and any external references to this
Protocol. The connection has been closed. In our case, we need to remove
the connection from the factory's connection list when it is lost.
"""
super().connectionLost(reason)
self.factory.removeConnection(self)
def requestFactory(self, http_channel: HTTPChannel, queued: bool) -> SynapseRequest: # type: ignore[override]
"""
A callable used to build `twisted.web.iweb.IRequest` objects.
Use our own custom SynapseRequest type instead of the regular
twisted.web.server.Request.
"""
return self.request_class(
self,
self.factory,
our_server_name=self.our_server_name,
max_request_body_size=self.max_request_body_size,
queued=queued,
request_id_header=self.request_id_header,
)
class SynapseSite(ProxySite):
"""
Synapse-specific twisted http Site
@@ -705,23 +775,44 @@ class SynapseSite(ProxySite):
assert config.http_options is not None
proxied = config.http_options.x_forwarded
-request_class = XForwardedForRequest if proxied else SynapseRequest
+self.request_class = XForwardedForRequest if proxied else SynapseRequest
-request_id_header = config.http_options.request_id_header
+self.request_id_header = config.http_options.request_id_header
+self.max_request_body_size = max_request_body_size
-def request_factory(channel: HTTPChannel, queued: bool) -> Request:
-return request_class(
-channel,
-self,
-our_server_name=self.server_name,
-max_request_body_size=max_request_body_size,
-queued=queued,
-request_id_header=request_id_header,
-)
-self.requestFactory = request_factory # type: ignore
self.access_logger = logging.getLogger(logger_name)
self.server_version_string = server_version_string.encode("ascii")
self.connections: List[Protocol] = []
def buildProtocol(self, addr: IAddress) -> SynapseProtocol:
protocol = SynapseProtocol(
self,
self.server_name,
self.max_request_body_size,
self.request_id_header,
self.request_class,
)
return protocol
def addConnection(self, protocol: Protocol) -> None:
self.connections.append(protocol)
def removeConnection(self, protocol: Protocol) -> None:
if protocol in self.connections:
self.connections.remove(protocol)
def stopFactory(self) -> None:
super().stopFactory()
# Shut down any connections which are still active.
# These can be long lived HTTP connections which wouldn't normally be closed
# when calling `shutdown` on the respective `Port`.
# Closing the connections here is required for us to fully shutdown the
# `SynapseHomeServer` in order for it to be garbage collected.
for protocol in self.connections[:]:
if protocol.transport is not None:
protocol.transport.loseConnection()
self.connections.clear()
def log(self, request: SynapseRequest) -> None: # type: ignore[override]
pass
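The same connection-tracking pattern in miniature, as a generic Twisted sketch independent of Synapse:

```python
from twisted.internet import protocol

class TrackingFactory(protocol.ServerFactory):
    """Track live connections so they can all be closed when the factory stops."""

    def __init__(self) -> None:
        self.connections: list[protocol.Protocol] = []

    def addConnection(self, proto: protocol.Protocol) -> None:
        self.connections.append(proto)

    def removeConnection(self, proto: protocol.Protocol) -> None:
        if proto in self.connections:
            self.connections.remove(proto)

    def stopFactory(self) -> None:
        # Iterate over a copy: loseConnection() eventually triggers
        # connectionLost, which mutates self.connections via removeConnection.
        for proto in self.connections[:]:
            if proto.transport is not None:
                proto.transport.loseConnection()
        self.connections.clear()
```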

View File

@@ -238,12 +238,13 @@ class _Sentinel:
we should always know which server the logs are coming from.
"""
__slots__ = ["previous_context", "finished", "request", "tag"]
__slots__ = ["previous_context", "finished", "server_name", "request", "tag"]
def __init__(self) -> None:
# Minimal set for compatibility with LoggingContext
self.previous_context = None
self.finished = False
self.server_name = "unknown_server_from_sentinel_context"
self.request = None
self.tag = None
@@ -282,14 +283,19 @@ class LoggingContext:
child to the parent
Args:
-name: Name for the context for logging. If this is omitted, it is
-inherited from the parent context.
+name: Name for the context for logging.
+server_name: The name of the server this context is associated with
+(`config.server.server_name` or `hs.hostname`)
parent_context (LoggingContext|None): The parent of the new context
request: Synapse Request Context object. Useful to associate all the logs
happening to a given request.
"""
__slots__ = [
"previous_context",
"name",
"server_name",
"parent_context",
"_resource_usage",
"usage_start",
@@ -301,7 +307,9 @@ class LoggingContext:
def __init__(
self,
-name: Optional[str] = None,
+*,
+name: str,
+server_name: str,
parent_context: "Optional[LoggingContext]" = None,
request: Optional[ContextRequest] = None,
) -> None:
@@ -314,6 +322,8 @@ class LoggingContext:
# if the context is not currently active.
self.usage_start: Optional[resource.struct_rusage] = None
self.name = name
self.server_name = server_name
self.main_thread = get_thread_id()
self.request = None
self.tag = ""
@@ -325,23 +335,15 @@ class LoggingContext:
self.parent_context = parent_context
# Inherit some fields from the parent context
if self.parent_context is not None:
-# we track the current request_id
+# which request this corresponds to
self.request = self.parent_context.request
if request is not None:
# the request param overrides the request from the parent context
self.request = request
-# if we don't have a `name`, but do have a parent context, use its name.
-if self.parent_context and name is None:
-name = str(self.parent_context)
-if name is None:
-raise ValueError(
-"LoggingContext must be given either a name or a parent context"
-)
-self.name = name
def __str__(self) -> str:
return self.name
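With the keyword-only signature above, construction now looks like this (illustrative values; real callers typically pass `hs.hostname` as `server_name`):

```python
import logging

from synapse.logging.context import LoggingContext

logger = logging.getLogger(__name__)

with LoggingContext(name="GET-42", server_name="example.com"):
    logger.info("this record carries server_name=example.com")
```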
@@ -588,7 +590,26 @@ class LoggingContextFilter(logging.Filter):
record.
"""
def __init__(self, request: str = ""):
def __init__(
self,
# `request` is here for backwards compatibility since we previously recommended
# people manually configure `LoggingContextFilter` like the following.
#
# ```yaml
# filters:
# context:
# (): synapse.logging.context.LoggingContextFilter
# request: ""
# ```
#
# TODO: Since we now configure `LoggingContextFilter` automatically since #8051
# (2020-08-11), we could consider removing this useless parameter. This would
# require people to remove their own manual configuration of
# `LoggingContextFilter` as it would cause `TypeError: Filter.__init__() got an
# unexpected keyword argument 'request'` -> `ValueError: Unable to configure
# filter 'context'`
request: str = "",
):
self._default_request = request
def filter(self, record: logging.LogRecord) -> Literal[True]:
@@ -598,11 +619,13 @@ class LoggingContextFilter(logging.Filter):
"""
context = current_context()
record.request = self._default_request
record.server_name = "unknown_server_from_no_context"
# context should never be None, but if it somehow ends up being, then
# we end up in a death spiral of infinite loops, so let's check, for
# robustness' sake.
if context is not None:
record.server_name = context.server_name
# Logging is interested in the request ID. Note that for backwards
# compatibility this is stored as the "request" on the record.
record.request = str(context)
@@ -728,12 +751,15 @@ def nested_logging_context(suffix: str) -> LoggingContext:
"Starting nested logging context from sentinel context: metrics will be lost"
)
parent_context = None
server_name = "unknown_server_from_sentinel_context"
else:
assert isinstance(curr_context, LoggingContext)
parent_context = curr_context
server_name = parent_context.server_name
prefix = str(curr_context)
return LoggingContext(
prefix + "-" + suffix,
name=prefix + "-" + suffix,
server_name=server_name,
parent_context=parent_context,
)
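Because the nested context inherits `server_name` and extends the parent's name, usage looks like this (illustrative):

```python
from synapse.logging.context import LoggingContext, nested_logging_context

with LoggingContext(name="persist_events", server_name="example.com"):
    with nested_logging_context("batch-1"):
        # Records here are attributed to "persist_events-batch-1" and
        # inherit server_name="example.com" from the parent context.
        ...
```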
@@ -1058,12 +1084,18 @@ def defer_to_threadpool(
"Calling defer_to_threadpool from sentinel context: metrics will be lost"
)
parent_context = None
server_name = "unknown_server_from_sentinel_context"
else:
assert isinstance(curr_context, LoggingContext)
parent_context = curr_context
server_name = parent_context.server_name
def g() -> R:
-with LoggingContext(str(curr_context), parent_context=parent_context):
+with LoggingContext(
+name=str(curr_context),
+server_name=server_name,
+parent_context=parent_context,
+):
return f(*args, **kwargs)
return make_deferred_yieldable(threads.deferToThreadPool(reactor, threadpool, g))
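Callers are unchanged; the logging context (and now its `server_name`) follows the function onto the thread pool. A sketch, assuming the standard reactor thread pool:

```python
from twisted.internet import reactor

from synapse.logging.context import defer_to_threadpool

def blocking_work() -> int:
    # Some blocking or CPU-bound call that must not run on the reactor thread.
    return 42

async def run() -> int:
    return await defer_to_threadpool(
        reactor, reactor.getThreadPool(), blocking_work
    )
```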

View File

@@ -1,38 +0,0 @@
#
# This file is licensed under the Affero General Public License (AGPL) version 3.
#
# Copyright 2020 The Matrix.org Foundation C.I.C.
# Copyright (C) 2023 New Vector, Ltd
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as
# published by the Free Software Foundation, either version 3 of the
# License, or (at your option) any later version.
#
# See the GNU Affero General Public License for more details:
# <https://www.gnu.org/licenses/agpl-3.0.html>.
#
# Originally licensed under the Apache License, Version 2.0:
# <http://www.apache.org/licenses/LICENSE-2.0>.
#
# [This file includes modifications made by New Vector Limited]
#
#
import logging
from typing import Literal
class MetadataFilter(logging.Filter):
"""Logging filter that adds constant values to each record.
Args:
metadata: Key-value pairs to add to each record.
"""
def __init__(self, metadata: dict):
self._metadata = metadata
def filter(self, record: logging.LogRecord) -> Literal[True]:
for key, value in self._metadata.items():
setattr(record, key, value)
return True

Some files were not shown because too many files have changed in this diff.