Fix CPU time going backwards when daemonize

When we `daemonize`, we fork the process and cputime metrics get confused about the per-thread resource usage appearing to go backwards because we're comparing the resource usage (`rusage`) from the original process to the forked process. We now kick off the background tasks (`run_as_background_process`) after we have forked the process so the `rusage` we record when we `start` is in the same thread when we `stop`. Bad log examples from before: ``` synapse.logging.context - ERROR - _schedule_next_expiry-0 - utime went backwards! 0.050467 < 0.886526 synapse.logging.context - ERROR - _schedule_db_events-0 - stime went backwards! 0.009941 < 0.155106 synapse.logging.context - ERROR - wake_destinations_needing_catchup-0 - stime went backwards! 0.010175 < 0.130923 synapse.logging.context - ERROR - resume_sync_partial_state_room-0 - utime went backwards! 0.052898 < 0.886526 ``` Testing strategy: 1. Run with `daemonize: true` in your `homeserver.yaml` 1. `poetry run synapse_homeserver --config-path homeserver.yaml` 1. Shutdown the server 1. Look for any bad log entries in your homeserver logs: - `Expected logging context sentinel but found main` - `Expected logging context main was lost` - `utime went backwards!`/`stime went backwards!`
Fix lints
2025-12-05 01:10:13 +00:00 · 2025-09-02 15:11:04 -05:00 · 2025-09-02 14:28:41 -05:00 · 2025-09-02 14:07:26 -05:00 · 2025-09-02 13:39:44 -05:00 · 2025-09-02 13:34:47 -05:00
75 changed files with 1077 additions and 1349 deletions
--- a/.github/workflows/docker.yml
+++ b/.github/workflows/docker.yml
@@ -31,7 +31,7 @@ jobs:
        uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1

      - name: Checkout repository
-        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Extract version from pyproject.toml
        # Note: explicitly requesting bash will mean bash is invoked with `-eo pipefail`, see
@@ -41,13 +41,13 @@ jobs:
          echo "SYNAPSE_VERSION=$(grep "^version" pyproject.toml | sed -E 's/version\s*=\s*["]([^"]*)["]/\1/')" >> $GITHUB_ENV

      - name: Log in to DockerHub
-        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0
+        uses: docker/login-action@184bdaa0721073962dff0199f1fb9940f07167d1 # v3.5.0
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Log in to GHCR
-        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0
+        uses: docker/login-action@184bdaa0721073962dff0199f1fb9940f07167d1 # v3.5.0
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
@@ -102,14 +102,14 @@ jobs:
          merge-multiple: true

      - name: Log in to DockerHub
-        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0
+        uses: docker/login-action@184bdaa0721073962dff0199f1fb9940f07167d1 # v3.5.0
        if: ${{ startsWith(matrix.repository, 'docker.io') }}
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Log in to GHCR
-        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0
+        uses: docker/login-action@184bdaa0721073962dff0199f1fb9940f07167d1 # v3.5.0
        if: ${{ startsWith(matrix.repository, 'ghcr.io') }}
        with:
          registry: ghcr.io
--- a/.github/workflows/docs-pr.yaml
+++ b/.github/workflows/docs-pr.yaml
@@ -13,7 +13,7 @@ jobs:
    name: GitHub Pages
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          # Fetch all history so that the schema_versions script works.
          fetch-depth: 0
@@ -50,7 +50,7 @@ jobs:
    name: Check links in documentation
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Setup mdbook
        uses: peaceiris/actions-mdbook@ee69d230fe19748b7abf22df32acaa93833fad08 # v2.0.0
--- a/.github/workflows/docs.yaml
+++ b/.github/workflows/docs.yaml
@@ -50,7 +50,7 @@ jobs:
    needs:
      - pre
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          # Fetch all history so that the schema_versions script works.
          fetch-depth: 0
--- a/.github/workflows/fix_lint.yaml
+++ b/.github/workflows/fix_lint.yaml
@@ -18,10 +18,10 @@ jobs:

    steps:
      - name: Checkout repository
-        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
          components: clippy, rustfmt
--- a/.github/workflows/latest_deps.yml
+++ b/.github/workflows/latest_deps.yml
@@ -42,9 +42,9 @@ jobs:
    if: needs.check_repo.outputs.should_run_workflow == 'true'
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -77,10 +77,10 @@ jobs:
            postgres-version: "14"

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -152,10 +152,10 @@ jobs:
      BLACKLIST: ${{ matrix.workers && 'synapse-blacklist-with-workers' }}

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -202,7 +202,7 @@ jobs:

    steps:
      - name: Check out synapse codebase
-        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          path: synapse

@@ -234,7 +234,7 @@ jobs:
    runs-on: ubuntu-latest

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - uses: JasonEtco/create-an-issue@1b14a70e4d8dc185e5cc76d3bec9eab20257b2c5 # v2.9.2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
--- a/.github/workflows/poetry_lockfile.yaml
+++ b/.github/workflows/poetry_lockfile.yaml
@@ -16,7 +16,7 @@ jobs:
    name: "Check locked dependencies have sdists"
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
        with:
          python-version: '3.x'
--- a/.github/workflows/push_complement_image.yml
+++ b/.github/workflows/push_complement_image.yml
@@ -33,22 +33,22 @@ jobs:
      packages: write
    steps:
      - name: Checkout specific branch (debug build)
-        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        if: github.event_name == 'workflow_dispatch'
        with:
          ref: ${{ inputs.branch }}
      - name: Checkout clean copy of develop (scheduled build)
-        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        if: github.event_name == 'schedule'
        with:
          ref: develop
      - name: Checkout clean copy of master (on-push)
-        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        if: github.event_name == 'push'
        with:
          ref: master
      - name: Login to registry
-        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0
+        uses: docker/login-action@184bdaa0721073962dff0199f1fb9940f07167d1 # v3.5.0
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
--- a/.github/workflows/release-artifacts.yml
+++ b/.github/workflows/release-artifacts.yml
@@ -27,7 +27,7 @@ jobs:
    name: "Calculate list of debian distros"
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
        with:
          python-version: "3.x"
@@ -55,7 +55,7 @@ jobs:

    steps:
      - name: Checkout
-        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          path: src

@@ -132,7 +132,7 @@ jobs:
            os: "ubuntu-24.04-arm"

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
        with:
@@ -165,7 +165,7 @@ jobs:
    if: ${{ !startsWith(github.ref, 'refs/pull/') }}

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
        with:
          python-version: "3.10"
--- a/.github/workflows/schema.yaml
+++ b/.github/workflows/schema.yaml
@@ -14,7 +14,7 @@ jobs:
    name: Ensure Synapse config schema is valid
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
        with:
          python-version: "3.x"
@@ -40,7 +40,7 @@ jobs:
    name: Ensure generated documentation is up-to-date
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
        with:
          python-version: "3.x"
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
@@ -86,9 +86,9 @@ jobs:
    if: ${{ needs.changes.outputs.linting == 'true' }}

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -106,7 +106,7 @@ jobs:
    if: ${{ needs.changes.outputs.linting == 'true' }}

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
        with:
          python-version: "3.x"
@@ -116,7 +116,7 @@ jobs:
  check-lockfile:
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
        with:
          python-version: "3.x"
@@ -129,7 +129,7 @@ jobs:

    steps:
      - name: Checkout repository
-        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Setup Poetry
        uses: matrix-org/setup-python-poetry@5bbf6603c5c930615ec8a29f1b5d7d258d905aa4 # v2.0.0
@@ -151,10 +151,10 @@ jobs:

    steps:
      - name: Checkout repository
-        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -187,7 +187,7 @@ jobs:
  lint-crlf:
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - name: Check line endings
        run: scripts-dev/check_line_terminators.sh

@@ -195,7 +195,7 @@ jobs:
    if: ${{ (github.base_ref == 'develop'  || contains(github.base_ref, 'release-')) && github.actor != 'dependabot[bot]' }}
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          ref: ${{ github.event.pull_request.head.sha }}
          fetch-depth: 0
@@ -213,11 +213,11 @@ jobs:
    if: ${{ needs.changes.outputs.linting == 'true' }}

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -233,10 +233,10 @@ jobs:
    if: ${{ needs.changes.outputs.rust == 'true' }}

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
            components: clippy
            toolchain: ${{ env.RUST_VERSION }}
@@ -252,10 +252,10 @@ jobs:
    if: ${{ needs.changes.outputs.rust == 'true' }}

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
            toolchain: nightly-2025-04-23
            components: clippy
@@ -270,10 +270,10 @@ jobs:

    steps:
      - name: Checkout repository
-        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -306,10 +306,10 @@ jobs:
    if: ${{ needs.changes.outputs.rust == 'true' }}

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          # We use nightly so that we can use some unstable options that we use in
          # `.rustfmt.toml`.
@@ -326,7 +326,7 @@ jobs:
    needs: changes
    if: ${{ needs.changes.outputs.linting_readme == 'true' }}
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
        with:
          python-version: "3.x"
@@ -376,7 +376,7 @@ jobs:
    needs: linting-done
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
        with:
          python-version: "3.x"
@@ -397,7 +397,7 @@ jobs:
        job:  ${{ fromJson(needs.calculate-test-jobs.outputs.trial_test_matrix) }}

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - run: sudo apt-get -qq install xmlsec1
      - name: Set up PostgreSQL ${{ matrix.job.postgres-version }}
        if: ${{ matrix.job.postgres-version }}
@@ -412,7 +412,7 @@ jobs:
            postgres:${{ matrix.job.postgres-version }}

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -453,10 +453,10 @@ jobs:
      - changes
    runs-on: ubuntu-22.04
    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -518,7 +518,7 @@ jobs:
        extras: ["all"]

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      # Install libs necessary for PyPy to build binary wheels for dependencies
      - run: sudo apt-get -qq install xmlsec1 libxml2-dev libxslt-dev
      - uses: matrix-org/setup-python-poetry@5bbf6603c5c930615ec8a29f1b5d7d258d905aa4 # v2.0.0
@@ -568,12 +568,12 @@ jobs:
        job: ${{ fromJson(needs.calculate-test-jobs.outputs.sytest_test_matrix) }}

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - name: Prepare test blacklist
        run: cat sytest-blacklist .ci/worker-blacklist > synapse-blacklist-with-workers

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -615,7 +615,7 @@ jobs:
          --health-retries 5

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - run: sudo apt-get -qq install xmlsec1 postgresql-client
      - uses: matrix-org/setup-python-poetry@5bbf6603c5c930615ec8a29f1b5d7d258d905aa4 # v2.0.0
        with:
@@ -659,7 +659,7 @@ jobs:
          --health-retries 5

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - name: Add PostgreSQL apt repository
        # We need a version of pg_dump that can handle the version of
        # PostgreSQL being tested against. The Ubuntu package repository lags
@@ -714,12 +714,12 @@ jobs:

    steps:
      - name: Checkout synapse codebase
-        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          path: synapse

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -750,10 +750,10 @@ jobs:
      - changes

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -770,10 +770,10 @@ jobs:
      - changes

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
            toolchain: nightly-2022-12-01
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
--- a/.github/workflows/triage_labelled.yml
+++ b/.github/workflows/triage_labelled.yml
@@ -11,7 +11,7 @@ jobs:
    if: >
      contains(github.event.issue.labels.*.name, 'X-Needs-Info')
    steps:
-      - uses: actions/add-to-project@c0c5949b017d0d4a39f7ba888255881bdac2a823 # v1.0.2
+      - uses: actions/add-to-project@4515659e2b458b27365e167605ac44f219494b66 # v1.0.2
        id: add_project
        with:
          project-url: "https://github.com/orgs/matrix-org/projects/67"
--- a/.github/workflows/twisted_trunk.yml
+++ b/.github/workflows/twisted_trunk.yml
@@ -43,10 +43,10 @@ jobs:
    runs-on: ubuntu-latest

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -70,11 +70,11 @@ jobs:
    runs-on: ubuntu-latest

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - run: sudo apt-get -qq install xmlsec1

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -117,10 +117,10 @@ jobs:
        - ${{ github.workspace }}:/src

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0

      - name: Install Rust
-        uses: dtolnay/rust-toolchain@b3b07ba8b418998c39fb20f53e8b695cdcc8de1b # master
+        uses: dtolnay/rust-toolchain@e97e2d8cc328f1b50210efc529dca0028893a2d9 # master
        with:
          toolchain: ${{ env.RUST_VERSION }}
      - uses: Swatinem/rust-cache@98c8021b550208e191a6a3145459bfc9fb29c4c0 # v2.8.0
@@ -175,7 +175,7 @@ jobs:

    steps:
      - name: Run actions/checkout@v4 for synapse
-        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          path: synapse

@@ -217,7 +217,7 @@ jobs:
    runs-on: ubuntu-latest

    steps:
-      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
      - uses: JasonEtco/create-an-issue@1b14a70e4d8dc185e5cc76d3bec9eab20257b2c5 # v2.9.2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
--- a/CHANGES.md
+++ b/CHANGES.md
@@ -1,3 +1,51 @@
+# Synapse 1.138.0rc1 (2025-09-02)
+
+### Features
+
+- Support for the stable endpoint and scopes of [MSC3861](https://github.com/matrix-org/matrix-spec-proposals/pull/3861) & co. ([\#18549](https://github.com/element-hq/synapse/issues/18549))
+
+### Bugfixes
+
+- Improve database performance of [MSC4293](https://github.com/matrix-org/matrix-spec-proposals/pull/4293) - Redact on Kick/Ban. ([\#18851](https://github.com/element-hq/synapse/issues/18851))
+- Do not throw an error when fetching a rejected delayed state event on startup. ([\#18858](https://github.com/element-hq/synapse/issues/18858))
+
+### Improved Documentation
+
+- Fix worker documentation incorrectly indicating all room Admin API requests were capable of being handled by workers. ([\#18853](https://github.com/element-hq/synapse/issues/18853))
+
+### Internal Changes
+
+- Instrument `_ByteProducer` with tracing to measure potential dead time while writing bytes to the request. ([\#18804](https://github.com/element-hq/synapse/issues/18804))
+- Switch to OpenTracing's `ContextVarsScopeManager` instead of our own custom `LogContextScopeManager`. ([\#18849](https://github.com/element-hq/synapse/issues/18849))
+- Trace how much work is being done while "recursively fetching redactions". ([\#18854](https://github.com/element-hq/synapse/issues/18854))
+- Link [upstream Twisted bug](https://github.com/twisted/twisted/issues/12498) tracking the problem that explains why we have to use a `Producer` to write bytes to the request. ([\#18855](https://github.com/element-hq/synapse/issues/18855))
+- Introduce `EventPersistencePair` type. ([\#18857](https://github.com/element-hq/synapse/issues/18857))
+
+
+
+### Updates to locked dependencies
+
+* Bump actions/add-to-project from c0c5949b017d0d4a39f7ba888255881bdac2a823 to 4515659e2b458b27365e167605ac44f219494b66. ([\#18863](https://github.com/element-hq/synapse/issues/18863))
+* Bump actions/checkout from 4.3.0 to 5.0.0. ([\#18834](https://github.com/element-hq/synapse/issues/18834))
+* Bump anyhow from 1.0.98 to 1.0.99. ([\#18841](https://github.com/element-hq/synapse/issues/18841))
+* Bump docker/login-action from 3.4.0 to 3.5.0. ([\#18835](https://github.com/element-hq/synapse/issues/18835))
+* Bump dtolnay/rust-toolchain from b3b07ba8b418998c39fb20f53e8b695cdcc8de1b to e97e2d8cc328f1b50210efc529dca0028893a2d9. ([\#18862](https://github.com/element-hq/synapse/issues/18862))
+* Bump phonenumbers from 9.0.11 to 9.0.12. ([\#18837](https://github.com/element-hq/synapse/issues/18837))
+* Bump regex from 1.11.1 to 1.11.2. ([\#18864](https://github.com/element-hq/synapse/issues/18864))
+* Bump reqwest from 0.12.22 to 0.12.23. ([\#18842](https://github.com/element-hq/synapse/issues/18842))
+* Bump ruff from 0.12.7 to 0.12.10. ([\#18865](https://github.com/element-hq/synapse/issues/18865))
+* Bump serde_json from 1.0.142 to 1.0.143. ([\#18866](https://github.com/element-hq/synapse/issues/18866))
+* Bump types-bleach from 6.2.0.20250514 to 6.2.0.20250809. ([\#18838](https://github.com/element-hq/synapse/issues/18838))
+* Bump types-jsonschema from 4.25.0.20250720 to 4.25.1.20250822. ([\#18867](https://github.com/element-hq/synapse/issues/18867))
+* Bump types-psycopg2 from 2.9.21.20250718 to 2.9.21.20250809. ([\#18836](https://github.com/element-hq/synapse/issues/18836))
+
+# Synapse 1.137.0 (2025-08-26)
+
+No significant changes since 1.137.0rc1.
+
+
+
+
 # Synapse 1.137.0rc1 (2025-08-19)

 ### Bugfixes
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -28,9 +28,9 @@ dependencies = [

 [[package]]
 name = "anyhow"
-version = "1.0.98"
+version = "1.0.99"
 source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "e16d2d3311acee920a9eb8d33b8cbc1787ce4a264e85f964c2404b969bdcd487"
+checksum = "b0674a1ddeecb70197781e945de4b3b8ffb61fa939a5597bcf48503737663100"

 [[package]]
 name = "arc-swap"
@@ -1062,9 +1062,9 @@ dependencies = [

 [[package]]
 name = "regex"
-version = "1.11.1"
+version = "1.11.2"
 source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "b544ef1b4eac5dc2db33ea63606ae9ffcfac26c1416a2806ae0bf5f56b201191"
+checksum = "23d7fd106d8c02486a8d64e778353d1cffe08ce79ac2e82f540c86d0facf6912"
 dependencies = [
 "aho-corasick",
 "memchr",
@@ -1091,9 +1091,9 @@ checksum = "2b15c43186be67a4fd63bee50d0303afffcef381492ebe2c5d87f324e1b8815c"

 [[package]]
 name = "reqwest"
-version = "0.12.22"
+version = "0.12.23"
 source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "cbc931937e6ca3a06e3b6c0aa7841849b160a90351d6ab467a8b9b9959767531"
+checksum = "d429f34c8092b2d42c7c93cec323bb4adeb7c67698f70839adec842ec10c7ceb"
 dependencies = [
 "base64",
 "bytes",
@@ -1270,9 +1270,9 @@ dependencies = [

 [[package]]
 name = "serde_json"
-version = "1.0.142"
+version = "1.0.143"
 source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "030fedb782600dcbd6f02d479bf0d817ac3bb40d644745b769d6a96bc3afc5a7"
+checksum = "d401abef1d108fbd9cbaebc3e46611f4b1021f714a0597a71f41ee463f5f4a5a"
 dependencies = [
 "itoa",
 "memchr",
--- a/changelog.d/18791.misc
+++ b/changelog.d/18791.misc
@@ -0,0 +1 @@
+Fix `LaterGauge` metrics to collect from all servers.
--- a/changelog.d/18871.misc
+++ b/changelog.d/18871.misc
@@ -0,0 +1 @@
+Store the `LoggingContext` in a `ContextVar` instead of a thread-local variable.
--- a/changelog.d/18878.docker
+++ b/changelog.d/18878.docker
@@ -0,0 +1 @@
+Suppress "Applying schema" log noise bulk when `SYNAPSE_LOG_TESTING` is set.
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,15 @@
+matrix-synapse-py3 (1.138.0~rc1) stable; urgency=medium
+
+  * New synapse release 1.138.0rc1.
+
+ -- Synapse Packaging team <packages@matrix.org>  Tue, 02 Sep 2025 12:16:14 +0000
+
+matrix-synapse-py3 (1.137.0) stable; urgency=medium
+
+  * New Synapse release 1.137.0.
+
+ -- Synapse Packaging team <packages@matrix.org>  Tue, 26 Aug 2025 10:23:41 +0100
+
 matrix-synapse-py3 (1.137.0~rc1) stable; urgency=medium

  * New Synapse release 1.137.0rc1.
--- a/docker/conf/log.config
+++ b/docker/conf/log.config
@@ -77,6 +77,13 @@ loggers:
    #}
    synapse.visibility.filtered_event_debug:
        level: DEBUG
+
+    {#
+      If Synapse is under test, we don't care about seeing the "Applying schema" log
+      lines at the INFO level every time we run the tests (it's 100 lines of bulk)
+    #}
+    synapse.storage.prepare_database:
+      level: WARN
    {% endif %}

 root:
--- a/docs/log_contexts.md
+++ b/docs/log_contexts.md
@@ -1,12 +1,12 @@
 # Log Contexts

 To help track the processing of individual requests, synapse uses a
-'`log context`' to track which request it is handling at any given
-moment. This is done via a thread-local variable; a `logging.Filter` is
-then used to fish the information back out of the thread-local variable
+`LoggingContext` to track which request it is handling at any given
+moment. This is done via a `ContextVar` variable; a `logging.Filter` is
+then used to fish the information back out of the `ContextVar` variable
 and add it to each log record.

-Logcontexts are also used for CPU and database accounting, so that we
+Log contexts are also used for CPU and database accounting, so that we
 can track which requests were responsible for high CPU use or database
 activity.

@@ -14,18 +14,11 @@ The `synapse.logging.context` module provides facilities for managing
 the current log context (as well as providing the `LoggingContextFilter`
 class).

-Asynchronous functions make the whole thing complicated, so this document describes
-how it all works, and how to write code which follows the rules.
-
-In this document, "awaitable" refers to any object which can be `await`ed. In the context of
-Synapse, that normally means either a coroutine or a Twisted 
+In this document, "awaitable" refers to any object which can be `await`ed. In the
+context of Synapse, that normally means either a coroutine or a Twisted
 [`Deferred`](https://twistedmatrix.com/documents/current/api/twisted.internet.defer.Deferred.html).

-## Logcontexts without asynchronous code
-
-In the absence of any asynchronous voodoo, things are simple enough. As with
-any code of this nature, the rule is that our function should leave
-things as it found them:
+## Basic usage

 ```python
 from synapse.logging import context         # omitted from future snippets
@@ -45,7 +38,7 @@ def do_request_handling():
    logger.debug("phew")  # this will be logged against request_id
 ```

-LoggingContext implements the context management methods, so the above
+`LoggingContext` implements the context management methods, so the above
 can be written much more succinctly as:

 ```python
@@ -59,197 +52,76 @@ def do_request_handling():
    logger.debug("phew")
 ```

-## Using logcontexts with awaitables
+### The `sentinel` context

-Awaitables break the linear flow of code so that there is no longer a single entry point
-where we should set the logcontext and a single exit point where we should remove it.
+The default context is `context.SENTINEL_CONTEXT`, which is a sentinel value to
+represent the root context. This is what is used when there is no other context set.

-Consider the example above, where `do_request_handling` needs to do some
-blocking operation, and returns an awaitable:
+No CPU/database usage metrics are recorded against the `sentinel` context.
+
+Ideally, nothing from the Synapse homeserver would be logged against the `sentinel`
+context as we want to know where the logs came from. In practice, this is not always the
+case yet especially outside of request handling.
+
+Previously, the `sentinel` context played a bigger role when we had to carefully deal
+with thread-local storage; as we had to make sure to not leak another context to another
+task after we gave up control to the reactor so we set the 
+
+
+
+### `PreserveLoggingContext`
+
+In a similar vein of no longer as relevant, `PreserveLoggingContext` is another context
+manager helper and a little bit of syntactic sugar to set the current log context
+(without finishing it) and restore the previous context on exit.

 ```python
-async def handle_request(request_id):
-    with context.LoggingContext() as request_context:
-        request_context.request = request_id
-        await do_request_handling()
+import logging
+from synapse.logging.context import LoggingContext
+
+logger = logging.getLogger(__name__)
+
+def main() -> None:
+    with context.LoggingContext("main"):
+        task_context = context.LoggingContext("task")
+
+        with task_context:
+            logger.debug("foo")
+
+        # Bad: will throw an error because `task_context` is already finished
+        with task_context:
+            logger.debug("bar")
+
        logger.debug("finished")
 ```

-In the above flow:
-
-   The logcontext is set
-   `do_request_handling` is called, and returns an awaitable
-   `handle_request` awaits the awaitable
-   Execution of `handle_request` is suspended
-
-So we have stopped processing the request (and will probably go on to
-start processing the next), without clearing the logcontext.
-
-To circumvent this problem, synapse code assumes that, wherever you have
-an awaitable, you will want to `await` it. To that end, wherever
-functions return awaitables, we adopt the following conventions:
-
-**Rules for functions returning awaitables:**
-
-> -   If the awaitable is already complete, the function returns with the
->     same logcontext it started with.
-> -   If the awaitable is incomplete, the function clears the logcontext
->     before returning; when the awaitable completes, it restores the
->     logcontext before running any callbacks.
-
-That sounds complicated, but actually it means a lot of code (including
-the example above) "just works". There are two cases:
-
-   If `do_request_handling` returns a completed awaitable, then the
-    logcontext will still be in place. In this case, execution will
-    continue immediately after the `await`; the "finished" line will
-    be logged against the right context, and the `with` block restores
-    the original context before we return to the caller.
-   If the returned awaitable is incomplete, `do_request_handling` clears
-    the logcontext before returning. The logcontext is therefore clear
-    when `handle_request` `await`s the awaitable.
-
-    Once `do_request_handling`'s awaitable completes, it will reinstate
-    the logcontext, before running the second half of `handle_request`,
-    so again the "finished" line will be logged against the right context,
-    and the `with` block restores the original context.
-
-As an aside, it's worth noting that `handle_request` follows our rules
- though that only matters if the caller has its own logcontext which it
-cares about.
-
-The following sections describe pitfalls and helpful patterns when
-implementing these rules.
-
-Always await your awaitables
----------------------------
-
-Whenever you get an awaitable back from a function, you should `await` on
-it as soon as possible. Do not pass go; do not do any logging; do not
-call any other functions.
+This can be fixed by using `PreserveLoggingContext`:

 ```python
-async def fun():
-    logger.debug("starting")
-    await do_some_stuff()       # just like this
+import logging
+from synapse.logging.context import LoggingContext

-    coro = more_stuff()
-    result = await coro         # also fine, of course
+logger = logging.getLogger(__name__)

-    return result
+def main() -> None:
+    with context.LoggingContext("main"):
+        task_context = context.LoggingContext("task")
+
+        with PreserveLoggingContext(task_context):
+            logger.debug("foo")
+        with PreserveLoggingContext(task_context):
+            logger.debug("bar")
+
+        logger.debug("finished") # this will be logged against main
 ```

-Provided this pattern is followed all the way back up to the callchain
-to where the logcontext was set, this will make things work out ok:
-provided `do_some_stuff` and `more_stuff` follow the rules above, then
-so will `fun`.
+Or you could equivalently just manage the log context manually via
+`set_current_context`.

-It's all too easy to forget to `await`: for instance if we forgot that
-`do_some_stuff` returned an awaitable, we might plough on regardless. This
-leads to a mess; it will probably work itself out eventually, but not
-before a load of stuff has been logged against the wrong context.
-(Normally, other things will break, more obviously, if you forget to
-`await`, so this tends not to be a major problem in practice.)
-
-Of course sometimes you need to do something a bit fancier with your
-awaitable - not all code follows the linear A-then-B-then-C pattern.
-Notes on implementing more complex patterns are in later sections.
-
-## Where you create a new awaitable, make it follow the rules
-
-Most of the time, an awaitable comes from another synapse function.
-Sometimes, though, we need to make up a new awaitable, or we get an awaitable
-back from external code. We need to make it follow our rules.
-
-The easy way to do it is by using `context.make_deferred_yieldable`. Suppose we want to implement
-`sleep`, which returns a deferred which will run its callbacks after a
-given number of seconds. That might look like:
-
-```python
-# not a logcontext-rules-compliant function
-def get_sleep_deferred(seconds):
-    d = defer.Deferred()
-    reactor.callLater(seconds, d.callback, None)
-    return d
-```
-
-That doesn't follow the rules, but we can fix it by calling it through
-`context.make_deferred_yieldable`:
-
-```python
-async def sleep(seconds):
-    return await context.make_deferred_yieldable(get_sleep_deferred(seconds))
-```

 ## Fire-and-forget

-Sometimes you want to fire off a chain of execution, but not wait for
-its result. That might look a bit like this:
-
-```python
-async def do_request_handling():
-    await foreground_operation()
-
-    # *don't* do this
-    background_operation()
-
-    logger.debug("Request handling complete")
-
-async def background_operation():
-    await first_background_step()
-    logger.debug("Completed first step")
-    await second_background_step()
-    logger.debug("Completed second step")
-```
-
-The above code does a couple of steps in the background after
-`do_request_handling` has finished. The log lines are still logged
-against the `request_context` logcontext, which may or may not be
-desirable. There are two big problems with the above, however. The first
-problem is that, if `background_operation` returns an incomplete
-awaitable, it will expect its caller to `await` immediately, so will have
-cleared the logcontext. In this example, that means that 'Request
-handling complete' will be logged without any context.
-
-The second problem, which is potentially even worse, is that when the
-awaitable returned by `background_operation` completes, it will restore
-the original logcontext. There is nothing waiting on that awaitable, so
-the logcontext will leak into the reactor and possibly get attached to
-some arbitrary future operation.
-
-There are two potential solutions to this.
-
-One option is to surround the call to `background_operation` with a
-`PreserveLoggingContext` call. That will reset the logcontext before
-starting `background_operation` (so the context restored when the
-deferred completes will be the empty logcontext), and will restore the
-current logcontext before continuing the foreground process:
-
-```python
-async def do_request_handling():
-    await foreground_operation()
-
-    # start background_operation off in the empty logcontext, to
-    # avoid leaking the current context into the reactor.
-    with PreserveLoggingContext():
-        background_operation()
-
-    # this will now be logged against the request context
-    logger.debug("Request handling complete")
-```
-
-Obviously that option means that the operations done in
-`background_operation` would be not be logged against a logcontext
-(though that might be fixed by setting a different logcontext via a
-`with LoggingContext(...)` in `background_operation`).
-
-The second option is to use `context.run_in_background`, which wraps a
-function so that it doesn't reset the logcontext even when it returns
-an incomplete awaitable, and adds a callback to the returned awaitable to
-reset the logcontext. In other words, it turns a function that follows
-the Synapse rules about logcontexts and awaitables into one which behaves
-more like an external function --- the opposite operation to that
-described in the previous section. It can be used like this:
+To drive an awaitable in the background, you can use `context.run_in_background`:

 ```python
 async def do_request_handling():
@@ -261,104 +133,13 @@ async def do_request_handling():
    logger.debug("Request handling complete")
 ```

-## Passing synapse deferreds into third-party functions
-
-A typical example of this is where we want to collect together two or
-more awaitables via `defer.gatherResults`:
-
-```python
-a1 = operation1()
-a2 = operation2()
-a3 = defer.gatherResults([a1, a2])
-```
-
-This is really a variation of the fire-and-forget problem above, in that
-we are firing off `a1` and `a2` without awaiting on them. The difference
-is that we now have third-party code attached to their callbacks. Anyway
-either technique given in the [Fire-and-forget](#fire-and-forget)
-section will work.
-
-Of course, the new awaitable returned by `gather` needs to be
-wrapped in order to make it follow the logcontext rules before we can
-yield it, as described in [Where you create a new awaitable, make it
-follow the
-rules](#where-you-create-a-new-awaitable-make-it-follow-the-rules).
-
-So, option one: reset the logcontext before starting the operations to
-be gathered:
-
-```python
-async def do_request_handling():
-    with PreserveLoggingContext():
-        a1 = operation1()
-        a2 = operation2()
-        result = await defer.gatherResults([a1, a2])
-```
-
-In this case particularly, though, option two, of using
-`context.run_in_background` almost certainly makes more sense, so that
-`operation1` and `operation2` are both logged against the original
-logcontext. This looks like:
-
 ```python
 async def do_request_handling():
    a1 = context.run_in_background(operation1)
    a2 = context.run_in_background(operation2)

-    result = await make_deferred_yieldable(defer.gatherResults([a1, a2]))
+    result = await defer.gatherResults([a1, a2])
 ```

-## A note on garbage-collection of awaitable chains
-
-It turns out that our logcontext rules do not play nicely with awaitable
-chains which get orphaned and garbage-collected.
-
-Imagine we have some code that looks like this:
-
-```python
-listener_queue = []
-
-def on_something_interesting():
-    for d in listener_queue:
-        d.callback("foo")
-
-async def await_something_interesting():
-    new_awaitable = defer.Deferred()
-    listener_queue.append(new_awaitable)
-
-    with PreserveLoggingContext():
-        await new_awaitable
-```
-
-Obviously, the idea here is that we have a bunch of things which are
-waiting for an event. (It's just an example of the problem here, but a
-relatively common one.)
-
-Now let's imagine two further things happen. First of all, whatever was
-waiting for the interesting thing goes away. (Perhaps the request times
-out, or something *even more* interesting happens.)
-
-Secondly, let's suppose that we decide that the interesting thing is
-never going to happen, and we reset the listener queue:
-
-```python
-def reset_listener_queue():
-    listener_queue.clear()
-```
-
-So, both ends of the awaitable chain have now dropped their references,
-and the awaitable chain is now orphaned, and will be garbage-collected at
-some point. Note that `await_something_interesting` is a coroutine, 
-which Python implements as a generator function.  When Python
-garbage-collects generator functions, it gives them a chance to 
-clean up by making the `await` (or `yield`) raise a `GeneratorExit`
-exception. In our case, that means that the `__exit__` handler of
-`PreserveLoggingContext` will carefully restore the request context, but
-there is now nothing waiting for its return, so the request context is
-never cleared.
-
-To reiterate, this problem only arises when *both* ends of a awaitable
-chain are dropped. Dropping the the reference to an awaitable you're
-supposed to be awaiting is bad practice, so this doesn't
-actually happen too much. Unfortunately, when it does happen, it will
-lead to leaked logcontexts which are incredibly hard to track down.
+`background_process_metrics.run_as_background_process` also exists if you want some
+automatic tracing and metrics for the background task.
--- a/docs/workers.md
+++ b/docs/workers.md
@@ -252,7 +252,7 @@ information.
    ^/_matrix/client/(api/v1|r0|v3|unstable)/directory/room/.*$
    ^/_matrix/client/(r0|v3|unstable)/capabilities$
    ^/_matrix/client/(r0|v3|unstable)/notifications$
-    ^/_synapse/admin/v1/rooms/
+    ^/_synapse/admin/v1/rooms/[^/]+$

    # Encryption requests
    ^/_matrix/client/(r0|v3|unstable)/keys/query$
--- a/poetry.lock
+++ b/poetry.lock
@@ -1531,14 +1531,14 @@ files = [

 [[package]]
 name = "phonenumbers"
-version = "9.0.11"
+version = "9.0.12"
 description = "Python version of Google's common library for parsing, formatting, storing and validating international phone numbers."
 optional = false
 python-versions = "*"
 groups = ["main"]
 files = [
-    {file = "phonenumbers-9.0.11-py2.py3-none-any.whl", hash = "sha256:a8ebb2136f1f14dfdbadb98be01cb71b96f880dea011eb5e0921967fe3a23abf"},
-    {file = "phonenumbers-9.0.11.tar.gz", hash = "sha256:6573858dcf0a7a2753a071375e154d9fc11791546c699b575af95d2ba7d84a1d"},
+    {file = "phonenumbers-9.0.12-py2.py3-none-any.whl", hash = "sha256:900633afc3e12191458d710262df5efc117838bd1e2e613b64fa254a86bb20a1"},
+    {file = "phonenumbers-9.0.12.tar.gz", hash = "sha256:ccadff6b949494bd606836d8c9678bee5b55cb1cbad1e98bf7adae108e6fd0be"},
 ]

 [[package]]
@@ -2396,30 +2396,31 @@ files = [

 [[package]]
 name = "ruff"
-version = "0.12.7"
+version = "0.12.10"
 description = "An extremely fast Python linter and code formatter, written in Rust."
 optional = false
 python-versions = ">=3.7"
 groups = ["dev"]
 files = [
-    {file = "ruff-0.12.7-py3-none-linux_armv6l.whl", hash = "sha256:76e4f31529899b8c434c3c1dede98c4483b89590e15fb49f2d46183801565303"},
-    {file = "ruff-0.12.7-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:789b7a03e72507c54fb3ba6209e4bb36517b90f1a3569ea17084e3fd295500fb"},
-    {file = "ruff-0.12.7-py3-none-macosx_11_0_arm64.whl", hash = "sha256:2e1c2a3b8626339bb6369116e7030a4cf194ea48f49b64bb505732a7fce4f4e3"},
-    {file = "ruff-0.12.7-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:32dec41817623d388e645612ec70d5757a6d9c035f3744a52c7b195a57e03860"},
-    {file = "ruff-0.12.7-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:47ef751f722053a5df5fa48d412dbb54d41ab9b17875c6840a58ec63ff0c247c"},
-    {file = "ruff-0.12.7-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a828a5fc25a3efd3e1ff7b241fd392686c9386f20e5ac90aa9234a5faa12c423"},
-    {file = "ruff-0.12.7-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:5726f59b171111fa6a69d82aef48f00b56598b03a22f0f4170664ff4d8298efb"},
-    {file = "ruff-0.12.7-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:74e6f5c04c4dd4aba223f4fe6e7104f79e0eebf7d307e4f9b18c18362124bccd"},
-    {file = "ruff-0.12.7-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:5d0bfe4e77fba61bf2ccadf8cf005d6133e3ce08793bbe870dd1c734f2699a3e"},
-    {file = "ruff-0.12.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:06bfb01e1623bf7f59ea749a841da56f8f653d641bfd046edee32ede7ff6c606"},
-    {file = "ruff-0.12.7-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:e41df94a957d50083fd09b916d6e89e497246698c3f3d5c681c8b3e7b9bb4ac8"},
-    {file = "ruff-0.12.7-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:4000623300563c709458d0ce170c3d0d788c23a058912f28bbadc6f905d67afa"},
-    {file = "ruff-0.12.7-py3-none-musllinux_1_2_i686.whl", hash = "sha256:69ffe0e5f9b2cf2b8e289a3f8945b402a1b19eff24ec389f45f23c42a3dd6fb5"},
-    {file = "ruff-0.12.7-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:a07a5c8ffa2611a52732bdc67bf88e243abd84fe2d7f6daef3826b59abbfeda4"},
-    {file = "ruff-0.12.7-py3-none-win32.whl", hash = "sha256:c928f1b2ec59fb77dfdf70e0419408898b63998789cc98197e15f560b9e77f77"},
-    {file = "ruff-0.12.7-py3-none-win_amd64.whl", hash = "sha256:9c18f3d707ee9edf89da76131956aba1270c6348bfee8f6c647de841eac7194f"},
-    {file = "ruff-0.12.7-py3-none-win_arm64.whl", hash = "sha256:dfce05101dbd11833a0776716d5d1578641b7fddb537fe7fa956ab85d1769b69"},
-    {file = "ruff-0.12.7.tar.gz", hash = "sha256:1fc3193f238bc2d7968772c82831a4ff69252f673be371fb49663f0068b7ec71"},
+    {file = "ruff-0.12.10-py3-none-linux_armv6l.whl", hash = "sha256:8b593cb0fb55cc8692dac7b06deb29afda78c721c7ccfed22db941201b7b8f7b"},
+    {file = "ruff-0.12.10-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:ebb7333a45d56efc7c110a46a69a1b32365d5c5161e7244aaf3aa20ce62399c1"},
+    {file = "ruff-0.12.10-py3-none-macosx_11_0_arm64.whl", hash = "sha256:d59e58586829f8e4a9920788f6efba97a13d1fa320b047814e8afede381c6839"},
+    {file = "ruff-0.12.10-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:822d9677b560f1fdeab69b89d1f444bf5459da4aa04e06e766cf0121771ab844"},
+    {file = "ruff-0.12.10-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:37b4a64f4062a50c75019c61c7017ff598cb444984b638511f48539d3a1c98db"},
+    {file = "ruff-0.12.10-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:2c6f4064c69d2542029b2a61d39920c85240c39837599d7f2e32e80d36401d6e"},
+    {file = "ruff-0.12.10-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:059e863ea3a9ade41407ad71c1de2badfbe01539117f38f763ba42a1206f7559"},
+    {file = "ruff-0.12.10-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:1bef6161e297c68908b7218fa6e0e93e99a286e5ed9653d4be71e687dff101cf"},
+    {file = "ruff-0.12.10-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:4f1345fbf8fb0531cd722285b5f15af49b2932742fc96b633e883da8d841896b"},
+    {file = "ruff-0.12.10-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1f68433c4fbc63efbfa3ba5db31727db229fa4e61000f452c540474b03de52a9"},
+    {file = "ruff-0.12.10-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:141ce3d88803c625257b8a6debf4a0473eb6eed9643a6189b68838b43e78165a"},
+    {file = "ruff-0.12.10-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:f3fc21178cd44c98142ae7590f42ddcb587b8e09a3b849cbc84edb62ee95de60"},
+    {file = "ruff-0.12.10-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:7d1a4e0bdfafcd2e3e235ecf50bf0176f74dd37902f241588ae1f6c827a36c56"},
+    {file = "ruff-0.12.10-py3-none-musllinux_1_2_i686.whl", hash = "sha256:e67d96827854f50b9e3e8327b031647e7bcc090dbe7bb11101a81a3a2cbf1cc9"},
+    {file = "ruff-0.12.10-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:ae479e1a18b439c59138f066ae79cc0f3ee250712a873d00dbafadaad9481e5b"},
+    {file = "ruff-0.12.10-py3-none-win32.whl", hash = "sha256:9de785e95dc2f09846c5e6e1d3a3d32ecd0b283a979898ad427a9be7be22b266"},
+    {file = "ruff-0.12.10-py3-none-win_amd64.whl", hash = "sha256:7837eca8787f076f67aba2ca559cefd9c5cbc3a9852fd66186f4201b87c1563e"},
+    {file = "ruff-0.12.10-py3-none-win_arm64.whl", hash = "sha256:cc138cc06ed9d4bfa9d667a65af7172b47840e1a98b02ce7011c391e54635ffc"},
+    {file = "ruff-0.12.10.tar.gz", hash = "sha256:189ab65149d11ea69a2d775343adf5f49bb2426fc4780f65ee33b423ad2e47f9"},
 ]

 [[package]]
@@ -2877,14 +2878,14 @@ twisted = "*"

 [[package]]
 name = "types-bleach"
-version = "6.2.0.20250514"
+version = "6.2.0.20250809"
 description = "Typing stubs for bleach"
 optional = false
 python-versions = ">=3.9"
 groups = ["dev"]
 files = [
-    {file = "types_bleach-6.2.0.20250514-py3-none-any.whl", hash = "sha256:380cb74f0db1e3c3b2e0cde217221108e975e07e95ef0970c9d41f7cd4e8ea3c"},
-    {file = "types_bleach-6.2.0.20250514.tar.gz", hash = "sha256:38c2e51d9cac51dc70c1b66121a11f4dad8bbf47fbad494bb7a77d8b8f3c4323"},
+    {file = "types_bleach-6.2.0.20250809-py3-none-any.whl", hash = "sha256:0b372a75117947d9ac8a31ae733fd0f8d92ec75c4772e7b37093ba3fa5b48fb9"},
+    {file = "types_bleach-6.2.0.20250809.tar.gz", hash = "sha256:188d7a1119f6c953140b513ed57ba4213755695815472c19d0c22ac09c79b90b"},
 ]

 [package.dependencies]
@@ -2919,14 +2920,14 @@ files = [

 [[package]]
 name = "types-jsonschema"
-version = "4.25.0.20250720"
+version = "4.25.1.20250822"
 description = "Typing stubs for jsonschema"
 optional = false
 python-versions = ">=3.9"
 groups = ["dev"]
 files = [
-    {file = "types_jsonschema-4.25.0.20250720-py3-none-any.whl", hash = "sha256:7d7897c715310d8bf9ae27a2cedba78bbb09e4cad83ce06d2aa79b73a88941df"},
-    {file = "types_jsonschema-4.25.0.20250720.tar.gz", hash = "sha256:765a3b6144798fe3161fd8cbe570a756ed3e8c0e5adb7c09693eb49faad39dbd"},
+    {file = "types_jsonschema-4.25.1.20250822-py3-none-any.whl", hash = "sha256:f82c2d7fa1ce1c0b84ba1de4ed6798469768188884db04e66421913a4e181294"},
+    {file = "types_jsonschema-4.25.1.20250822.tar.gz", hash = "sha256:aac69ed4b23f49aaceb7fcb834141d61b9e4e6a7f6008cb2f0d3b831dfa8464a"},
 ]

 [package.dependencies]
@@ -2970,14 +2971,14 @@ files = [

 [[package]]
 name = "types-psycopg2"
-version = "2.9.21.20250718"
+version = "2.9.21.20250809"
 description = "Typing stubs for psycopg2"
 optional = false
 python-versions = ">=3.9"
 groups = ["dev"]
 files = [
-    {file = "types_psycopg2-2.9.21.20250718-py3-none-any.whl", hash = "sha256:bcf085d4293bda48f5943a46dadf0389b2f98f7e8007722f7e1c12ee0f541858"},
-    {file = "types_psycopg2-2.9.21.20250718.tar.gz", hash = "sha256:dc09a97272ef67e739e57b9f4740b761208f4514257e311c0b05c8c7a37d04b4"},
+    {file = "types_psycopg2-2.9.21.20250809-py3-none-any.whl", hash = "sha256:59b7b0ed56dcae9efae62b8373497274fc1a0484bdc5135cdacbe5a8f44e1d7b"},
+    {file = "types_psycopg2-2.9.21.20250809.tar.gz", hash = "sha256:b7c2cbdcf7c0bd16240f59ba694347329b0463e43398de69784ea4dee45f3c6d"},
 ]

 [[package]]
@@ -3255,4 +3256,4 @@ url-preview = ["lxml"]
 [metadata]
 lock-version = "2.1"
 python-versions = "^3.9.0"
-content-hash = "600a349d08dde732df251583094a121b5385eb43ae0c6ceff10dcf9749359446"
+content-hash = "2e8ea085e1a0c6f0ac051d4bc457a96827d01f621b1827086de01a5ffa98cf79"
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -101,7 +101,7 @@ module-name = "synapse.synapse_rust"

 [tool.poetry]
 name = "matrix-synapse"
-version = "1.137.0rc1"
+version = "1.138.0rc1"
 description = "Homeserver for the Matrix decentralised comms protocol"
 authors = ["Matrix.org Team and Contributors <packages@matrix.org>"]
 license = "AGPL-3.0-or-later"
@@ -324,7 +324,7 @@ all = [
 # failing on new releases. Keeping lower bounds loose here means that dependabot
 # can bump versions without having to update the content-hash in the lockfile.
 # This helps prevents merge conflicts when running a batch of dependabot updates.
-ruff = "0.12.7"
+ruff = "0.12.10"
 # Type checking only works with the pydantic.v1 compat module from pydantic v2
 pydantic = "^2"

--- a/schema/synapse-config.schema.yaml
+++ b/schema/synapse-config.schema.yaml
@@ -1,5 +1,5 @@
 $schema: https://element-hq.github.io/synapse/latest/schema/v1/meta.schema.json
-$id: https://element-hq.github.io/synapse/schema/synapse/v1.137/synapse-config.schema.json
+$id: https://element-hq.github.io/synapse/schema/synapse/v1.138/synapse-config.schema.json
 type: object
 properties:
  modules:
--- a/synapse/init.py
+++ b/synapse/init.py
@@ -86,12 +86,5 @@ import synapse.util  # noqa: E402

 __version__ = synapse.util.SYNAPSE_VERSION

-if bool(os.environ.get("SYNAPSE_TEST_PATCH_LOG_CONTEXTS", False)):
-    # We import here so that we don't have to install a bunch of deps when
-    # running the packaging tox test.
-    from synapse.util.patch_inline_callbacks import do_patch
-
-    do_patch()
-

 check_rust_lib_up_to_date()
--- a/synapse/_scripts/generate_workers_map.py
+++ b/synapse/_scripts/generate_workers_map.py
@@ -153,9 +153,13 @@ def get_registered_paths_for_default(
    """

    hs = MockHomeserver(base_config, worker_app)
+
    # TODO We only do this to avoid an error, but don't need the database etc
    hs.setup()
-    return get_registered_paths_for_hs(hs)
+    registered_paths = get_registered_paths_for_hs(hs)
+    hs.cleanup()
+
+    return registered_paths


 def elide_http_methods_if_unconflicting(
--- a/synapse/_scripts/synapse_port_db.py
+++ b/synapse/_scripts/synapse_port_db.py
@@ -99,6 +99,7 @@ from synapse.storage.engines import create_engine
 from synapse.storage.prepare_database import prepare_database
 from synapse.types import ISynapseReactor
 from synapse.util import SYNAPSE_VERSION, Clock
+from synapse.util.stringutils import random_string

 # Cast safety: Twisted does some naughty magic which replaces the
 # twisted.internet.reactor module with a Reactor instance at runtime.
@@ -323,6 +324,7 @@ class MockHomeserver:
        self.config = config
        self.hostname = config.server.server_name
        self.version_string = SYNAPSE_VERSION
+        self.instance_id = random_string(5)

    def get_clock(self) -> Clock:
        return self.clock
@@ -330,6 +332,9 @@ class MockHomeserver:
    def get_reactor(self) -> ISynapseReactor:
        return reactor

+    def get_instance_id(self) -> str:
+        return self.instance_id
+
    def get_instance_name(self) -> str:
        return "master"

@@ -685,7 +690,15 @@ class Porter:
            )
            prepare_database(db_conn, engine, config=self.hs_config)
            # Type safety: ignore that we're using Mock homeservers here.
-            store = Store(DatabasePool(hs, db_config, engine), db_conn, hs)  # type: ignore[arg-type]
+            store = Store(
+                DatabasePool(
+                    hs,  # type: ignore[arg-type]
+                    db_config,
+                    engine,
+                ),
+                db_conn,
+                hs,  # type: ignore[arg-type]
+            )
            db_conn.commit()

        return store
--- a/synapse/api/auth/mas.py
+++ b/synapse/api/auth/mas.py
@@ -13,7 +13,7 @@
 #
 #
 import logging
-from typing import TYPE_CHECKING, Optional
+from typing import TYPE_CHECKING, Optional, Set
 from urllib.parse import urlencode

 from synapse._pydantic_compat import (
@@ -57,8 +57,10 @@ logger = logging.getLogger(__name__)

 # Scope as defined by MSC2967
 # https://github.com/matrix-org/matrix-spec-proposals/pull/2967
-SCOPE_MATRIX_API = "urn:matrix:org.matrix.msc2967.client:api:*"
-SCOPE_MATRIX_DEVICE_PREFIX = "urn:matrix:org.matrix.msc2967.client:device:"
+UNSTABLE_SCOPE_MATRIX_API = "urn:matrix:org.matrix.msc2967.client:api:*"
+UNSTABLE_SCOPE_MATRIX_DEVICE_PREFIX = "urn:matrix:org.matrix.msc2967.client:device:"
+STABLE_SCOPE_MATRIX_API = "urn:matrix:client:api:*"
+STABLE_SCOPE_MATRIX_DEVICE_PREFIX = "urn:matrix:client:device:"


 class ServerMetadata(BaseModel):
@@ -334,7 +336,10 @@ class MasDelegatedAuth(BaseAuth):
        scope = introspection_result.get_scope_set()

        # Determine type of user based on presence of particular scopes
-        if SCOPE_MATRIX_API not in scope:
+        if (
+            UNSTABLE_SCOPE_MATRIX_API not in scope
+            and STABLE_SCOPE_MATRIX_API not in scope
+        ):
            raise InvalidClientTokenError(
                "Token doesn't grant access to the Matrix C-S API"
            )
@@ -366,11 +371,12 @@ class MasDelegatedAuth(BaseAuth):
            # We only allow a single device_id in the scope, so we find them all in the
            # scope list, and raise if there are more than one. The OIDC server should be
            # the one enforcing valid scopes, so we raise a 500 if we find an invalid scope.
-            device_ids = [
-                tok[len(SCOPE_MATRIX_DEVICE_PREFIX) :]
-                for tok in scope
-                if tok.startswith(SCOPE_MATRIX_DEVICE_PREFIX)
-            ]
+            device_ids: Set[str] = set()
+            for tok in scope:
+                if tok.startswith(UNSTABLE_SCOPE_MATRIX_DEVICE_PREFIX):
+                    device_ids.add(tok[len(UNSTABLE_SCOPE_MATRIX_DEVICE_PREFIX) :])
+                elif tok.startswith(STABLE_SCOPE_MATRIX_DEVICE_PREFIX):
+                    device_ids.add(tok[len(STABLE_SCOPE_MATRIX_DEVICE_PREFIX) :])

            if len(device_ids) > 1:
                raise AuthError(
@@ -378,7 +384,7 @@ class MasDelegatedAuth(BaseAuth):
                    "Multiple device IDs in scope",
                )

-            device_id = device_ids[0] if device_ids else None
+            device_id = next(iter(device_ids), None)

        if device_id is not None:
            # Sanity check the device_id
--- a/synapse/api/auth/msc3861_delegated.py
+++ b/synapse/api/auth/msc3861_delegated.py
@@ -20,7 +20,7 @@
 #
 import logging
 from dataclasses import dataclass
-from typing import TYPE_CHECKING, Any, Callable, Dict, List, Optional
+from typing import TYPE_CHECKING, Any, Callable, Dict, List, Optional, Set
 from urllib.parse import urlencode

 from authlib.oauth2 import ClientAuth
@@ -34,7 +34,6 @@ from synapse.api.errors import (
    AuthError,
    HttpResponseException,
    InvalidClientTokenError,
-    OAuthInsufficientScopeError,
    SynapseError,
    UnrecognizedRequestError,
 )
@@ -63,9 +62,10 @@ logger = logging.getLogger(__name__)

 # Scope as defined by MSC2967
 # https://github.com/matrix-org/matrix-spec-proposals/pull/2967
-SCOPE_MATRIX_API = "urn:matrix:org.matrix.msc2967.client:api:*"
-SCOPE_MATRIX_GUEST = "urn:matrix:org.matrix.msc2967.client:api:guest"
-SCOPE_MATRIX_DEVICE_PREFIX = "urn:matrix:org.matrix.msc2967.client:device:"
+UNSTABLE_SCOPE_MATRIX_API = "urn:matrix:org.matrix.msc2967.client:api:*"
+UNSTABLE_SCOPE_MATRIX_DEVICE_PREFIX = "urn:matrix:org.matrix.msc2967.client:device:"
+STABLE_SCOPE_MATRIX_API = "urn:matrix:client:api:*"
+STABLE_SCOPE_MATRIX_DEVICE_PREFIX = "urn:matrix:client:device:"

 # Scope which allows access to the Synapse admin API
 SCOPE_SYNAPSE_ADMIN = "urn:synapse:admin:*"
@@ -444,9 +444,6 @@ class MSC3861DelegatedAuth(BaseAuth):
        if not self._is_access_token_the_admin_token(access_token):
            await self._record_request(request, requester)

-        if not allow_guest and requester.is_guest:
-            raise OAuthInsufficientScopeError([SCOPE_MATRIX_API])
-
        request.requester = requester

        return requester
@@ -528,10 +525,11 @@ class MSC3861DelegatedAuth(BaseAuth):
        scope: List[str] = introspection_result.get_scope_list()

        # Determine type of user based on presence of particular scopes
-        has_user_scope = SCOPE_MATRIX_API in scope
-        has_guest_scope = SCOPE_MATRIX_GUEST in scope
+        has_user_scope = (
+            UNSTABLE_SCOPE_MATRIX_API in scope or STABLE_SCOPE_MATRIX_API in scope
+        )

-        if not has_user_scope and not has_guest_scope:
+        if not has_user_scope:
            raise InvalidClientTokenError("No scope in token granting user rights")

        # Match via the sub claim
@@ -579,11 +577,12 @@ class MSC3861DelegatedAuth(BaseAuth):
            # We only allow a single device_id in the scope, so we find them all in the
            # scope list, and raise if there are more than one. The OIDC server should be
            # the one enforcing valid scopes, so we raise a 500 if we find an invalid scope.
-            device_ids = [
-                tok[len(SCOPE_MATRIX_DEVICE_PREFIX) :]
-                for tok in scope
-                if tok.startswith(SCOPE_MATRIX_DEVICE_PREFIX)
-            ]
+            device_ids: Set[str] = set()
+            for tok in scope:
+                if tok.startswith(UNSTABLE_SCOPE_MATRIX_DEVICE_PREFIX):
+                    device_ids.add(tok[len(UNSTABLE_SCOPE_MATRIX_DEVICE_PREFIX) :])
+                elif tok.startswith(STABLE_SCOPE_MATRIX_DEVICE_PREFIX):
+                    device_ids.add(tok[len(STABLE_SCOPE_MATRIX_DEVICE_PREFIX) :])

            if len(device_ids) > 1:
                raise AuthError(
@@ -591,7 +590,7 @@ class MSC3861DelegatedAuth(BaseAuth):
                    "Multiple device IDs in scope",
                )

-            device_id = device_ids[0] if device_ids else None
+            device_id = next(iter(device_ids), None)

        if device_id is not None:
            # Sanity check the device_id
@@ -617,5 +616,4 @@ class MSC3861DelegatedAuth(BaseAuth):
            user_id=user_id,
            device_id=device_id,
            scope=scope,
-            is_guest=(has_guest_scope and not has_user_scope),
        )
--- a/synapse/app/_base.py
+++ b/synapse/app/_base.py
@@ -601,6 +601,12 @@ async def start(hs: "HomeServer") -> None:
    hs.get_datastores().main.db_pool.start_profiling()
    hs.get_pusherpool().start()

+    # Register background tasks required by this server. This must be done
+    # somewhat manually due to the background tasks not being registered
+    # unless handlers are instantiated.
+    if hs.config.worker.run_background_tasks:
+        hs.start_background_tasks()
+
    # Log when we start the shut down process.
    hs.get_reactor().addSystemEventTrigger(
        "before", "shutdown", logger.info, "Shutting down..."
--- a/synapse/events/snapshot.py
+++ b/synapse/events/snapshot.py
@@ -306,6 +306,12 @@ class EventContext(UnpersistedEventContextBase):
        )


+EventPersistencePair = Tuple[EventBase, EventContext]
+"""
+The combination of an event to be persisted and its context.
+"""
+
+
@attr.s(slots=True, auto_attribs=True)
 class UnpersistedEventContext(UnpersistedEventContextBase):
    """
@@ -363,7 +369,7 @@ class UnpersistedEventContext(UnpersistedEventContextBase):
        room_id: str,
        last_known_state_group: int,
        datastore: "StateGroupDataStore",
-    ) -> List[Tuple[EventBase, EventContext]]:
+    ) -> List[EventPersistencePair]:
        """
        Takes a list of events and their associated unpersisted contexts and persists
        the unpersisted contexts, returning a list of events and persisted contexts.
--- a/synapse/federation/federation_server.py
+++ b/synapse/federation/federation_server.py
@@ -59,7 +59,7 @@ from synapse.api.errors import (
 from synapse.api.room_versions import KNOWN_ROOM_VERSIONS, RoomVersion
 from synapse.crypto.event_signing import compute_event_signature
 from synapse.events import EventBase
-from synapse.events.snapshot import EventContext
+from synapse.events.snapshot import EventPersistencePair
 from synapse.federation.federation_base import (
    FederationBase,
    InvalidEventSignatureError,
@@ -914,7 +914,7 @@ class FederationServer(FederationBase):

    async def _on_send_membership_event(
        self, origin: str, content: JsonDict, membership_type: str, room_id: str
-    ) -> Tuple[EventBase, EventContext]:
+    ) -> EventPersistencePair:
        """Handle an on_send_{join,leave,knock} request

        Does some preliminary validation before passing the request on to the
--- a/synapse/federation/send_queue.py
+++ b/synapse/federation/send_queue.py
@@ -37,6 +37,7 @@ Events are replicated via a separate events stream.
 """

 import logging
+from enum import Enum
 from typing import (
    TYPE_CHECKING,
    Dict,
@@ -67,6 +68,25 @@ if TYPE_CHECKING:
 logger = logging.getLogger(__name__)


+class QueueNames(str, Enum):
+    PRESENCE_MAP = "presence_map"
+    KEYED_EDU = "keyed_edu"
+    KEYED_EDU_CHANGED = "keyed_edu_changed"
+    EDUS = "edus"
+    POS_TIME = "pos_time"
+    PRESENCE_DESTINATIONS = "presence_destinations"
+
+
+queue_name_to_gauge_map: Dict[QueueNames, LaterGauge] = {}
+
+for queue_name in QueueNames:
+    queue_name_to_gauge_map[queue_name] = LaterGauge(
+        name=f"synapse_federation_send_queue_{queue_name.value}_size",
+        desc="",
+        labelnames=[SERVER_NAME_LABEL],
+    )
+
+
 class FederationRemoteSendQueue(AbstractFederationSender):
    """A drop in replacement for FederationSender"""

@@ -111,23 +131,16 @@ class FederationRemoteSendQueue(AbstractFederationSender):
        # we make a new function, so we need to make a new function so the inner
        # lambda binds to the queue rather than to the name of the queue which
        # changes. ARGH.
-        def register(name: str, queue: Sized) -> None:
-            LaterGauge(
-                name="synapse_federation_send_queue_%s_size" % (queue_name,),
-                desc="",
-                labelnames=[SERVER_NAME_LABEL],
-                caller=lambda: {(self.server_name,): len(queue)},
+        def register(queue_name: QueueNames, queue: Sized) -> None:
+            queue_name_to_gauge_map[queue_name].register_hook(
+                homeserver_instance_id=hs.get_instance_id(),
+                hook=lambda: {(self.server_name,): len(queue)},
            )

-        for queue_name in [
-            "presence_map",
-            "keyed_edu",
-            "keyed_edu_changed",
-            "edus",
-            "pos_time",
-            "presence_destinations",
-        ]:
-            register(queue_name, getattr(self, queue_name))
+        for queue_name in QueueNames:
+            queue = getattr(self, queue_name.value)
+            assert isinstance(queue, Sized)
+            register(queue_name, queue=queue)

        self.clock.looping_call(self._clear_queue, 30 * 1000)

--- a/synapse/federation/sender/init.py
+++ b/synapse/federation/sender/init.py
@@ -199,6 +199,24 @@ sent_pdus_destination_dist_total = Counter(
    labelnames=[SERVER_NAME_LABEL],
 )

+transaction_queue_pending_destinations_gauge = LaterGauge(
+    name="synapse_federation_transaction_queue_pending_destinations",
+    desc="",
+    labelnames=[SERVER_NAME_LABEL],
+)
+
+transaction_queue_pending_pdus_gauge = LaterGauge(
+    name="synapse_federation_transaction_queue_pending_pdus",
+    desc="",
+    labelnames=[SERVER_NAME_LABEL],
+)
+
+transaction_queue_pending_edus_gauge = LaterGauge(
+    name="synapse_federation_transaction_queue_pending_edus",
+    desc="",
+    labelnames=[SERVER_NAME_LABEL],
+)
+
 # Time (in s) to wait before trying to wake up destinations that have
 # catch-up outstanding.
 # Please note that rate limiting still applies, so while the loop is
@@ -398,11 +416,9 @@ class FederationSender(AbstractFederationSender):
        # map from destination to PerDestinationQueue
        self._per_destination_queues: Dict[str, PerDestinationQueue] = {}

-        LaterGauge(
-            name="synapse_federation_transaction_queue_pending_destinations",
-            desc="",
-            labelnames=[SERVER_NAME_LABEL],
-            caller=lambda: {
+        transaction_queue_pending_destinations_gauge.register_hook(
+            homeserver_instance_id=hs.get_instance_id(),
+            hook=lambda: {
                (self.server_name,): sum(
                    1
                    for d in self._per_destination_queues.values()
@@ -410,22 +426,17 @@ class FederationSender(AbstractFederationSender):
                )
            },
        )
-
-        LaterGauge(
-            name="synapse_federation_transaction_queue_pending_pdus",
-            desc="",
-            labelnames=[SERVER_NAME_LABEL],
-            caller=lambda: {
+        transaction_queue_pending_pdus_gauge.register_hook(
+            homeserver_instance_id=hs.get_instance_id(),
+            hook=lambda: {
                (self.server_name,): sum(
                    d.pending_pdu_count() for d in self._per_destination_queues.values()
                )
            },
        )
-        LaterGauge(
-            name="synapse_federation_transaction_queue_pending_edus",
-            desc="",
-            labelnames=[SERVER_NAME_LABEL],
-            caller=lambda: {
+        transaction_queue_pending_edus_gauge.register_hook(
+            homeserver_instance_id=hs.get_instance_id(),
+            hook=lambda: {
                (self.server_name,): sum(
                    d.pending_edu_count() for d in self._per_destination_queues.values()
                )
--- a/synapse/handlers/delayed_events.py
+++ b/synapse/handlers/delayed_events.py
@@ -215,9 +215,9 @@ class DelayedEventsHandler:
                "Handling: %r %r, %s", delta.event_type, delta.state_key, delta.event_id
            )

-            event = await self._store.get_event(
-                delta.event_id, check_room_id=delta.room_id
-            )
+            event = await self._store.get_event(delta.event_id, allow_none=True)
+            if not event:
+                continue
            sender = UserID.from_string(event.sender)

            next_send_ts = await self._store.cancel_delayed_state_events(
--- a/synapse/handlers/federation_event.py
+++ b/synapse/handlers/federation_event.py
@@ -66,7 +66,11 @@ from synapse.event_auth import (
    validate_event_for_room_version,
 )
 from synapse.events import EventBase
-from synapse.events.snapshot import EventContext, UnpersistedEventContextBase
+from synapse.events.snapshot import (
+    EventContext,
+    EventPersistencePair,
+    UnpersistedEventContextBase,
+)
 from synapse.federation.federation_client import InvalidResponseError, PulledPduInfo
 from synapse.logging.context import nested_logging_context
 from synapse.logging.opentracing import (
@@ -341,7 +345,7 @@ class FederationEventHandler:

    async def on_send_membership_event(
        self, origin: str, event: EventBase
-    ) -> Tuple[EventBase, EventContext]:
+    ) -> EventPersistencePair:
        """
        We have received a join/leave/knock event for a room via send_join/leave/knock.

@@ -1712,7 +1716,7 @@ class FederationEventHandler:
            )
            auth_map.update(persisted_events)

-        events_and_contexts_to_persist: List[Tuple[EventBase, EventContext]] = []
+        events_and_contexts_to_persist: List[EventPersistencePair] = []

        async def prep(event: EventBase) -> None:
            with nested_logging_context(suffix=event.event_id):
@@ -2225,7 +2229,7 @@ class FederationEventHandler:
    async def persist_events_and_notify(
        self,
        room_id: str,
-        event_and_contexts: Sequence[Tuple[EventBase, EventContext]],
+        event_and_contexts: Sequence[EventPersistencePair],
        backfilled: bool = False,
    ) -> int:
        """Persists events and tells the notifier/pushers about them, if
--- a/synapse/handlers/message.py
+++ b/synapse/handlers/message.py
@@ -57,6 +57,7 @@ from synapse.events import EventBase, relation_from_event
 from synapse.events.builder import EventBuilder
 from synapse.events.snapshot import (
    EventContext,
+    EventPersistencePair,
    UnpersistedEventContext,
    UnpersistedEventContextBase,
 )
@@ -1439,7 +1440,7 @@ class EventCreationHandler:
    async def handle_new_client_event(
        self,
        requester: Requester,
-        events_and_context: List[Tuple[EventBase, EventContext]],
+        events_and_context: List[EventPersistencePair],
        ratelimit: bool = True,
        extra_users: Optional[List[UserID]] = None,
        ignore_shadow_ban: bool = False,
@@ -1651,7 +1652,7 @@ class EventCreationHandler:
    async def _persist_events(
        self,
        requester: Requester,
-        events_and_context: List[Tuple[EventBase, EventContext]],
+        events_and_context: List[EventPersistencePair],
        ratelimit: bool = True,
        extra_users: Optional[List[UserID]] = None,
    ) -> EventBase:
@@ -1737,7 +1738,7 @@ class EventCreationHandler:
            raise

    async def cache_joined_hosts_for_events(
-        self, events_and_context: List[Tuple[EventBase, EventContext]]
+        self, events_and_context: List[EventPersistencePair]
    ) -> None:
        """Precalculate the joined hosts at each of the given events, when using Redis, so that
        external federation senders don't have to recalculate it themselves.
@@ -1843,7 +1844,7 @@ class EventCreationHandler:
    async def persist_and_notify_client_events(
        self,
        requester: Requester,
-        events_and_context: List[Tuple[EventBase, EventContext]],
+        events_and_context: List[EventPersistencePair],
        ratelimit: bool = True,
        extra_users: Optional[List[UserID]] = None,
    ) -> EventBase:
--- a/synapse/handlers/presence.py
+++ b/synapse/handlers/presence.py
@@ -173,6 +173,18 @@ state_transition_counter = Counter(
    labelnames=["locality", "from", "to", SERVER_NAME_LABEL],
 )

+presence_user_to_current_state_size_gauge = LaterGauge(
+    name="synapse_handlers_presence_user_to_current_state_size",
+    desc="",
+    labelnames=[SERVER_NAME_LABEL],
+)
+
+presence_wheel_timer_size_gauge = LaterGauge(
+    name="synapse_handlers_presence_wheel_timer_size",
+    desc="",
+    labelnames=[SERVER_NAME_LABEL],
+)
+
 # If a user was last active in the last LAST_ACTIVE_GRANULARITY, consider them
 # "currently_active"
 LAST_ACTIVE_GRANULARITY = 60 * 1000
@@ -779,11 +791,9 @@ class PresenceHandler(BasePresenceHandler):
            EduTypes.PRESENCE, self.incoming_presence
        )

-        LaterGauge(
-            name="synapse_handlers_presence_user_to_current_state_size",
-            desc="",
-            labelnames=[SERVER_NAME_LABEL],
-            caller=lambda: {(self.server_name,): len(self.user_to_current_state)},
+        presence_user_to_current_state_size_gauge.register_hook(
+            homeserver_instance_id=hs.get_instance_id(),
+            hook=lambda: {(self.server_name,): len(self.user_to_current_state)},
        )

        # The per-device presence state, maps user to devices to per-device presence state.
@@ -882,11 +892,9 @@ class PresenceHandler(BasePresenceHandler):
                60 * 1000,
            )

-        LaterGauge(
-            name="synapse_handlers_presence_wheel_timer_size",
-            desc="",
-            labelnames=[SERVER_NAME_LABEL],
-            caller=lambda: {(self.server_name,): len(self.wheel_timer)},
+        presence_wheel_timer_size_gauge.register_hook(
+            homeserver_instance_id=hs.get_instance_id(),
+            hook=lambda: {(self.server_name,): len(self.wheel_timer)},
        )

        # Used to handle sending of presence to newly joined users/servers
--- a/synapse/http/request_metrics.py
+++ b/synapse/http/request_metrics.py
@@ -164,11 +164,13 @@ def _get_in_flight_counts() -> Mapping[Tuple[str, ...], int]:
    return counts


-LaterGauge(
+in_flight_requests = LaterGauge(
    name="synapse_http_server_in_flight_requests_count",
    desc="",
    labelnames=["method", "servlet", SERVER_NAME_LABEL],
-    caller=_get_in_flight_counts,
+)
+in_flight_requests.register_hook(
+    homeserver_instance_id=None, hook=_get_in_flight_counts
 )


--- a/synapse/http/server.py
+++ b/synapse/http/server.py
@@ -702,6 +702,10 @@ class _ByteProducer:
        self._request: Optional[Request] = request
        self._iterator = iterator
        self._paused = False
+        self.tracing_scope = start_active_span(
+            "write_bytes_to_request",
+        )
+        self.tracing_scope.__enter__()

        try:
            self._request.registerProducer(self, True)
@@ -712,8 +716,8 @@ class _ByteProducer:
            logger.info("Connection disconnected before response was written: %r", e)

            # We drop our references to data we'll not use.
-            self._request = None
            self._iterator = iter(())
+            self.tracing_scope.__exit__(type(e), None, e.__traceback__)
        else:
            # Start producing if `registerProducer` was successful
            self.resumeProducing()
@@ -727,6 +731,9 @@ class _ByteProducer:
        self._request.write(b"".join(data))

    def pauseProducing(self) -> None:
+        opentracing_span = active_span()
+        if opentracing_span is not None:
+            opentracing_span.log_kv({"event": "producer_paused"})
        self._paused = True

    def resumeProducing(self) -> None:
@@ -737,6 +744,10 @@ class _ByteProducer:

        self._paused = False

+        opentracing_span = active_span()
+        if opentracing_span is not None:
+            opentracing_span.log_kv({"event": "producer_resumed"})
+
        # Write until there's backpressure telling us to stop.
        while not self._paused:
            # Get the next chunk and write it to the request.
@@ -771,6 +782,7 @@ class _ByteProducer:
    def stopProducing(self) -> None:
        # Clear a circular reference.
        self._request = None
+        self.tracing_scope.__exit__(None, None, None)


 def _encode_json_bytes(json_object: object) -> bytes:
@@ -913,8 +925,9 @@ def _write_bytes_to_request(request: Request, bytes_to_write: bytes) -> None:
    # once (via `Request.write`) is that doing so starts the timeout for the
    # next request to be received: so if it takes longer than 60s to stream back
    # the response to the client, the client never gets it.
+    # c.f https://github.com/twisted/twisted/issues/12498
    #
-    # The correct solution is to use a Producer; then the timeout is only
+    # One workaround is to use a `Producer`; then the timeout is only
    # started once all of the content is sent over the TCP connection.

    # To make sure we don't write all of the bytes at once we split it up into
--- a/synapse/logging/context.py
+++ b/synapse/logging/context.py
@@ -34,6 +34,7 @@ import logging
 import threading
 import typing
 import warnings
+from contextvars import ContextVar
 from types import TracebackType
 from typing import (
    TYPE_CHECKING,
@@ -56,7 +57,6 @@ from twisted.internet import defer, threads
 from twisted.python.threadpool import ThreadPool

 if TYPE_CHECKING:
-    from synapse.logging.scopecontextmanager import _LogContextScope
    from synapse.types import ISynapseReactor

 logger = logging.getLogger(__name__)
@@ -230,14 +230,13 @@ LoggingContextOrSentinel = Union["LoggingContext", "_Sentinel"]
 class _Sentinel:
    """Sentinel to represent the root context"""

-    __slots__ = ["previous_context", "finished", "request", "scope", "tag"]
+    __slots__ = ["previous_context", "finished", "request", "tag"]

    def __init__(self) -> None:
        # Minimal set for compatibility with LoggingContext
        self.previous_context = None
        self.finished = False
        self.request = None
-        self.scope = None
        self.tag = None

    def __str__(self) -> str:
@@ -290,7 +289,6 @@ class LoggingContext:
        "finished",
        "request",
        "tag",
-        "scope",
    ]

    def __init__(
@@ -311,7 +309,6 @@ class LoggingContext:
        self.main_thread = get_thread_id()
        self.request = None
        self.tag = ""
-        self.scope: Optional["_LogContextScope"] = None

        # keep track of whether we have hit the __exit__ block for this context
        # (suggesting that the the thing that created the context thinks it should
@@ -324,9 +321,6 @@ class LoggingContext:
            # we track the current request_id
            self.request = self.parent_context.request

-            # we also track the current scope:
-            self.scope = self.parent_context.scope
-
        if request is not None:
            # the request param overrides the request from the parent context
            self.request = request
@@ -660,13 +654,12 @@ class PreserveLoggingContext:
                )


-_thread_local = threading.local()
-_thread_local.current_context = SENTINEL_CONTEXT
+_current_context: ContextVar[LoggingContextOrSentinel] = ContextVar("current_context")


 def current_context() -> LoggingContextOrSentinel:
    """Get the current logging context from thread local storage"""
-    return getattr(_thread_local, "current_context", SENTINEL_CONTEXT)
+    return _current_context.get(SENTINEL_CONTEXT)


 def set_current_context(context: LoggingContextOrSentinel) -> LoggingContextOrSentinel:
@@ -687,7 +680,7 @@ def set_current_context(context: LoggingContextOrSentinel) -> LoggingContextOrSe
    if current is not context:
        rusage = get_thread_resource_usage()
        current.stop(rusage)
-        _thread_local.current_context = context
+        _current_context.set(context)
        context.start(rusage)

    return current
@@ -803,7 +796,6 @@ def run_in_background(
    CRITICAL error about an unhandled error will be logged without much
    indication about where it came from.
    """
-    current = current_context()
    try:
        res = f(*args, **kwargs)
    except Exception:
@@ -832,23 +824,6 @@ def run_in_background(
        # optimise out the messing about
        return d

-    # The function may have reset the context before returning, so
-    # we need to restore it now.
-    ctx = set_current_context(current)
-
-    # The original context will be restored when the deferred
-    # completes, but there is nothing waiting for it, so it will
-    # get leaked into the reactor or some other function which
-    # wasn't expecting it. We therefore need to reset the context
-    # here.
-    #
-    # (If this feels asymmetric, consider it this way: we are
-    # effectively forking a new thread of execution. We are
-    # probably currently within a ``with LoggingContext()`` block,
-    # which is supposed to have a single entry and exit point. But
-    # by spawning off another deferred, we are effectively
-    # adding a new exit point.)
-    d.addBoth(_set_context_cb, ctx)
    return d


@@ -868,65 +843,20 @@ def run_coroutine_in_background(
    cannot change the log contexts.
    """

-    current = current_context()
-    d = defer.ensureDeferred(coroutine)
-
-    # The function may have reset the context before returning, so
-    # we need to restore it now.
-    ctx = set_current_context(current)
-
-    # The original context will be restored when the deferred
-    # completes, but there is nothing waiting for it, so it will
-    # get leaked into the reactor or some other function which
-    # wasn't expecting it. We therefore need to reset the context
-    # here.
-    #
-    # (If this feels asymmetric, consider it this way: we are
-    # effectively forking a new thread of execution. We are
-    # probably currently within a ``with LoggingContext()`` block,
-    # which is supposed to have a single entry and exit point. But
-    # by spawning off another deferred, we are effectively
-    # adding a new exit point.)
-    d.addBoth(_set_context_cb, ctx)
-    return d
+    return defer.ensureDeferred(coroutine)


 T = TypeVar("T")


+# TODO: This function is a no-op now and should be removed in a follow-up PR.
 def make_deferred_yieldable(deferred: "defer.Deferred[T]") -> "defer.Deferred[T]":
-    """Given a deferred, make it follow the Synapse logcontext rules:
-
-    If the deferred has completed, essentially does nothing (just returns another
-    completed deferred with the result/failure).
-
-    If the deferred has not yet completed, resets the logcontext before
-    returning a deferred. Then, when the deferred completes, restores the
-    current logcontext before running callbacks/errbacks.
-
-    (This is more-or-less the opposite operation to run_in_background.)
-    """
-    if deferred.called and not deferred.paused:
-        # it looks like this deferred is ready to run any callbacks we give it
-        # immediately. We may as well optimise out the logcontext faffery.
-        return deferred
-
-    # ok, we can't be sure that a yield won't block, so let's reset the
-    # logcontext, and add a callback to the deferred to restore it.
-    prev_context = set_current_context(SENTINEL_CONTEXT)
-    deferred.addBoth(_set_context_cb, prev_context)
    return deferred


 ResultT = TypeVar("ResultT")


-def _set_context_cb(result: ResultT, context: LoggingContextOrSentinel) -> ResultT:
-    """A callback function which just sets the logging context"""
-    set_current_context(context)
-    return result
-
-
 def defer_to_thread(
    reactor: "ISynapseReactor", f: Callable[P, R], *args: P.args, **kwargs: P.kwargs
 ) -> "defer.Deferred[R]":
@@ -938,9 +868,6 @@ def defer_to_thread(
    logcontext (so its CPU usage metrics will get attributed to the current
    logcontext). `f` should preserve the logcontext it is given.

-    The result deferred follows the Synapse logcontext rules: you should `yield`
-    on it.
-
    Args:
        reactor: The reactor in whose main thread the Deferred will be invoked,
            and whose threadpool we should use for the function.
@@ -978,9 +905,6 @@ def defer_to_threadpool(
    logcontext (so its CPU usage metrics will get attributed to the current
    logcontext). `f` should preserve the logcontext it is given.

-    The result deferred follows the Synapse logcontext rules: you should `yield`
-    on it.
-
    Args:
        reactor: The reactor in whose main thread the Deferred will be invoked.
            Normally this will be hs.get_reactor().
@@ -998,18 +922,6 @@ def defer_to_threadpool(
        A Deferred which fires a callback with the result of `f`, or an
            errback if `f` throws an exception.
    """
-    curr_context = current_context()
-    if not curr_context:
-        logger.warning(
-            "Calling defer_to_threadpool from sentinel context: metrics will be lost"
-        )
-        parent_context = None
-    else:
-        assert isinstance(curr_context, LoggingContext)
-        parent_context = curr_context
-
-    def g() -> R:
-        with LoggingContext(str(curr_context), parent_context=parent_context):
-            return f(*args, **kwargs)
-
-    return make_deferred_yieldable(threads.deferToThreadPool(reactor, threadpool, g))
+    return make_deferred_yieldable(
+        threads.deferToThreadPool(reactor, threadpool, f, *args, **kwargs)
+    )
--- a/synapse/logging/opentracing.py
+++ b/synapse/logging/opentracing.py
@@ -251,18 +251,17 @@ class _DummyTagNames:
 try:
    import opentracing
    import opentracing.tags
+    from opentracing.scope_managers.contextvars import ContextVarsScopeManager

    tags = opentracing.tags
 except ImportError:
    opentracing = None  # type: ignore[assignment]
    tags = _DummyTagNames  # type: ignore[assignment]
+    ContextVarsScopeManager = None  # type: ignore
 try:
    from jaeger_client import Config as JaegerConfig
-
-    from synapse.logging.scopecontextmanager import LogContextScopeManager
 except ImportError:
    JaegerConfig = None  # type: ignore
-    LogContextScopeManager = None  # type: ignore


 try:
@@ -484,7 +483,7 @@ def init_tracer(hs: "HomeServer") -> None:
    config = JaegerConfig(
        config=jaeger_config,
        service_name=f"{hs.config.server.server_name} {instance_name_by_type}",
-        scope_manager=LogContextScopeManager(),
+        scope_manager=ContextVarsScopeManager(),
        metrics_factory=PrometheusMetricsFactory(),
    )

--- a/synapse/logging/scopecontextmanager.py
+++ b/synapse/logging/scopecontextmanager.py
@@ -1,161 +0,0 @@
-#
-# This file is licensed under the Affero General Public License (AGPL) version 3.
-#
-# Copyright 2019 The Matrix.org Foundation C.I.C.
-# Copyright (C) 2023 New Vector, Ltd
-#
-# This program is free software: you can redistribute it and/or modify
-# it under the terms of the GNU Affero General Public License as
-# published by the Free Software Foundation, either version 3 of the
-# License, or (at your option) any later version.
-#
-# See the GNU Affero General Public License for more details:
-# <https://www.gnu.org/licenses/agpl-3.0.html>.
-#
-# Originally licensed under the Apache License, Version 2.0:
-# <http://www.apache.org/licenses/LICENSE-2.0>.
-#
-# [This file includes modifications made by New Vector Limited]
-#
-#
-
-import logging
-from typing import Optional
-
-from opentracing import Scope, ScopeManager, Span
-
-from synapse.logging.context import (
-    LoggingContext,
-    current_context,
-    nested_logging_context,
-)
-
-logger = logging.getLogger(__name__)
-
-
-class LogContextScopeManager(ScopeManager):
-    """
-    The LogContextScopeManager tracks the active scope in opentracing
-    by using the log contexts which are native to synapse. This is so
-    that the basic opentracing api can be used across twisted defereds.
-
-    It would be nice just to use opentracing's ContextVarsScopeManager,
-    but currently that doesn't work due to https://twistedmatrix.com/trac/ticket/10301.
-    """
-
-    def __init__(self) -> None:
-        pass
-
-    @property
-    def active(self) -> Optional[Scope]:
-        """
-        Returns the currently active Scope which can be used to access the
-        currently active Scope.span.
-        If there is a non-null Scope, its wrapped Span
-        becomes an implicit parent of any newly-created Span at
-        Tracer.start_active_span() time.
-
-        Return:
-            The Scope that is active, or None if not available.
-        """
-        ctx = current_context()
-        return ctx.scope
-
-    def activate(self, span: Span, finish_on_close: bool) -> Scope:
-        """
-        Makes a Span active.
-        Args
-            span: the span that should become active.
-            finish_on_close: whether Span should be automatically finished when
-                Scope.close() is called.
-
-        Returns:
-            Scope to control the end of the active period for
-            *span*. It is a programming error to neglect to call
-            Scope.close() on the returned instance.
-        """
-
-        ctx = current_context()
-
-        if not ctx:
-            logger.error("Tried to activate scope outside of loggingcontext")
-            return Scope(None, span)  # type: ignore[arg-type]
-
-        if ctx.scope is not None:
-            # start a new logging context as a child of the existing one.
-            # Doing so -- rather than updating the existing logcontext -- means that
-            # creating several concurrent spans under the same logcontext works
-            # correctly.
-            ctx = nested_logging_context("")
-            enter_logcontext = True
-        else:
-            # if there is no span currently associated with the current logcontext, we
-            # just store the scope in it.
-            #
-            # This feels a bit dubious, but it does hack around a problem where a
-            # span outlasts its parent logcontext (which would otherwise lead to
-            # "Re-starting finished log context" errors).
-            enter_logcontext = False
-
-        scope = _LogContextScope(self, span, ctx, enter_logcontext, finish_on_close)
-        ctx.scope = scope
-        if enter_logcontext:
-            ctx.__enter__()
-
-        return scope
-
-
-class _LogContextScope(Scope):
-    """
-    A custom opentracing scope, associated with a LogContext
-
-      * When the scope is closed, the logcontext's active scope is reset to None.
-        and - if enter_logcontext was set - the logcontext is finished too.
-    """
-
-    def __init__(
-        self,
-        manager: LogContextScopeManager,
-        span: Span,
-        logcontext: LoggingContext,
-        enter_logcontext: bool,
-        finish_on_close: bool,
-    ):
-        """
-        Args:
-            manager:
-                the manager that is responsible for this scope.
-            span:
-                the opentracing span which this scope represents the local
-                lifetime for.
-            logcontext:
-                the log context to which this scope is attached.
-            enter_logcontext:
-                if True the log context will be exited when the scope is finished
-            finish_on_close:
-                if True finish the span when the scope is closed
-        """
-        super().__init__(manager, span)
-        self.logcontext = logcontext
-        self._finish_on_close = finish_on_close
-        self._enter_logcontext = enter_logcontext
-
-    def __str__(self) -> str:
-        return f"Scope<{self.span}>"
-
-    def close(self) -> None:
-        active_scope = self.manager.active
-        if active_scope is not self:
-            logger.error(
-                "Closing scope %s which is not the currently-active one %s",
-                self,
-                active_scope,
-            )
-
-        if self._finish_on_close:
-            self.span.finish()
-
-        self.logcontext.scope = None
-
-        if self._enter_logcontext:
-            self.logcontext.__exit__(None, None, None)
--- a/synapse/metrics/init.py
+++ b/synapse/metrics/init.py
@@ -73,8 +73,6 @@ logger = logging.getLogger(__name__)

 METRICS_PREFIX = "/_synapse/metrics"

-all_gauges: Dict[str, Collector] = {}
-
 HAVE_PROC_SELF_STAT = os.path.exists("/proc/self/stat")

 SERVER_NAME_LABEL = "server_name"
@@ -163,42 +161,110 @@ class LaterGauge(Collector):
    name: str
    desc: str
    labelnames: Optional[StrSequence] = attr.ib(hash=False)
-    # callback: should either return a value (if there are no labels for this metric),
-    # or dict mapping from a label tuple to a value
-    caller: Callable[
-        [], Union[Mapping[Tuple[str, ...], Union[int, float]], Union[int, float]]
-    ]
+    _instance_id_to_hook_map: Dict[
+        Optional[str],  # instance_id
+        Callable[
+            [], Union[Mapping[Tuple[str, ...], Union[int, float]], Union[int, float]]
+        ],
+    ] = attr.ib(factory=dict, hash=False)
+    """
+    Map from homeserver instance_id to a callback. Each callback should either return a
+    value (if there are no labels for this metric), or dict mapping from a label tuple
+    to a value.
+
+    We use `instance_id` instead of `server_name` because it's possible to have multiple
+    workers running in the same process with the same `server_name`.
+    """

    def collect(self) -> Iterable[Metric]:
        # The decision to add `SERVER_NAME_LABEL` is from the `LaterGauge` usage itself
        # (we don't enforce it here, one level up).
        g = GaugeMetricFamily(self.name, self.desc, labels=self.labelnames)  # type: ignore[missing-server-name-label]

-        try:
-            calls = self.caller()
-        except Exception:
-            logger.exception("Exception running callback for LaterGauge(%s)", self.name)
-            yield g
-            return
+        for homeserver_instance_id, hook in self._instance_id_to_hook_map.items():
+            try:
+                hook_result = hook()
+            except Exception:
+                logger.exception(
+                    "Exception running callback for LaterGauge(%s) for homeserver_instance_id=%s",
+                    self.name,
+                    homeserver_instance_id,
+                )
+                # Continue to return the rest of the metrics that aren't broken
+                continue

-        if isinstance(calls, (int, float)):
-            g.add_metric([], calls)
-        else:
-            for k, v in calls.items():
-                g.add_metric(k, v)
+            if isinstance(hook_result, (int, float)):
+                g.add_metric([], hook_result)
+            else:
+                for k, v in hook_result.items():
+                    g.add_metric(k, v)

        yield g

+    def register_hook(
+        self,
+        *,
+        homeserver_instance_id: Optional[str],
+        hook: Callable[
+            [], Union[Mapping[Tuple[str, ...], Union[int, float]], Union[int, float]]
+        ],
+    ) -> None:
+        """
+        Register a callback/hook that will be called to generate a metric samples for
+        the gauge.
+
+        Args:
+            homeserver_instance_id: The unique ID for this Synapse process instance
+                (`hs.get_instance_id()`) that this hook is associated with. This can be used
+                later to lookup all hooks associated with a given server name in order to
+                unregister them. This should only be omitted for global hooks that work
+                across all homeservers.
+            hook: A callback that should either return a value (if there are no
+                labels for this metric), or dict mapping from a label tuple to a value
+        """
+        # We shouldn't have multiple hooks registered for the same homeserver `instance_id`.
+        existing_hook = self._instance_id_to_hook_map.get(homeserver_instance_id)
+        assert existing_hook is None, (
+            f"LaterGauge(name={self.name}) hook already registered for homeserver_instance_id={homeserver_instance_id}. "
+            "This is likely a Synapse bug and you forgot to unregister the previous hooks for "
+            "the server (especially in tests)."
+        )
+
+        self._instance_id_to_hook_map[homeserver_instance_id] = hook
+
+    def unregister_hooks_for_homeserver_instance_id(
+        self, homeserver_instance_id: str
+    ) -> None:
+        """
+        Unregister all hooks associated with the given homeserver `instance_id`. This should be
+        called when a homeserver is shutdown to avoid extra hooks sitting around.
+
+        Args:
+            homeserver_instance_id: The unique ID for this Synapse process instance to
+                unregister hooks for (`hs.get_instance_id()`).
+        """
+        self._instance_id_to_hook_map.pop(homeserver_instance_id, None)
+
    def __attrs_post_init__(self) -> None:
-        self._register()
-
-    def _register(self) -> None:
-        if self.name in all_gauges.keys():
-            logger.warning("%s already registered, reregistering", self.name)
-            REGISTRY.unregister(all_gauges.pop(self.name))
-
        REGISTRY.register(self)
-        all_gauges[self.name] = self
+
+        # We shouldn't have multiple metrics with the same name. Typically, metrics
+        # should be created globally so you shouldn't be running into this and this will
+        # catch any stupid mistakes. The `REGISTRY.register(self)` call above will also
+        # raise an error if the metric already exists but to make things explicit, we'll
+        # also check here.
+        existing_gauge = all_later_gauges_to_clean_up_on_shutdown.get(self.name)
+        assert existing_gauge is None, f"LaterGauge(name={self.name}) already exists. "
+
+        # Keep track of the gauge so we can clean it up later.
+        all_later_gauges_to_clean_up_on_shutdown[self.name] = self
+
+
+all_later_gauges_to_clean_up_on_shutdown: Dict[str, LaterGauge] = {}
+"""
+Track all `LaterGauge` instances so we can remove any associated hooks during homeserver
+shutdown.
+"""


 # `MetricsEntry` only makes sense when it is a `Protocol`,
@@ -250,7 +316,7 @@ class InFlightGauge(Generic[MetricsEntry], Collector):
        # Protects access to _registrations
        self._lock = threading.Lock()

-        self._register_with_collector()
+        REGISTRY.register(self)

    def register(
        self,
@@ -341,14 +407,6 @@ class InFlightGauge(Generic[MetricsEntry], Collector):
                gauge.add_metric(labels=key, value=getattr(metrics, name))
            yield gauge

-    def _register_with_collector(self) -> None:
-        if self.name in all_gauges.keys():
-            logger.warning("%s already registered, reregistering", self.name)
-            REGISTRY.unregister(all_gauges.pop(self.name))
-
-        REGISTRY.register(self)
-        all_gauges[self.name] = self
-

 class GaugeHistogramMetricFamilyWithLabels(GaugeHistogramMetricFamily):
    """
--- a/synapse/metrics/background_process_metrics.py
+++ b/synapse/metrics/background_process_metrics.py
@@ -223,10 +223,9 @@ def run_as_background_process(
    This should be used to wrap processes which are fired off to run in the
    background, instead of being associated with a particular request.

-    It returns a Deferred which completes when the function completes, but it doesn't
-    follow the synapse logcontext rules, which makes it appropriate for passing to
-    clock.looping_call and friends (or for firing-and-forgetting in the middle of a
-    normal synapse async function).
+    It returns a Deferred which completes when the function completes, which makes it
+    appropriate for passing to clock.looping_call and friends (or for
+    firing-and-forgetting in the middle of a normal synapse async function).

    Args:
        desc: a description for this background process type
@@ -241,8 +240,6 @@ def run_as_background_process(

    Returns:
        Deferred which returns the result of func, or `None` if func raises.
-        Note that the returned Deferred does not follow the synapse logcontext
-        rules.
    """

    async def run() -> Optional[R]:
--- a/synapse/module_api/init.py
+++ b/synapse/module_api/init.py
@@ -237,10 +237,9 @@ def run_as_background_process(
    This should be used to wrap processes which are fired off to run in the
    background, instead of being associated with a particular request.

-    It returns a Deferred which completes when the function completes, but it doesn't
-    follow the synapse logcontext rules, which makes it appropriate for passing to
-    clock.looping_call and friends (or for firing-and-forgetting in the middle of a
-    normal synapse async function).
+    It returns a Deferred which completes when the function completes, which makes it
+    appropriate for passing to clock.looping_call and friends (or for
+    firing-and-forgetting in the middle of a normal synapse async function).

    Args:
        desc: a description for this background process type
@@ -255,8 +254,6 @@ def run_as_background_process(

    Returns:
        Deferred which returns the result of func, or `None` if func raises.
-        Note that the returned Deferred does not follow the synapse logcontext
-        rules.
    """

    logger.warning(
@@ -1375,9 +1372,7 @@ class ModuleApi:

        Args:
            f: The function to call repeatedly. f can be either synchronous or
-                asynchronous, and must follow Synapse's logcontext rules.
-                More info about logcontexts is available at
-                https://element-hq.github.io/synapse/latest/log_contexts.html
+                asynchronous.
            msec: How long to wait between calls in milliseconds.
            *args: Positional arguments to pass to function.
            desc: The background task's description. Default to the function's name.
@@ -1431,9 +1426,7 @@ class ModuleApi:
        Args:
            msec: How long to wait before calling, in milliseconds.
            f: The function to call once. f can be either synchronous or
-                asynchronous, and must follow Synapse's logcontext rules.
-                More info about logcontexts is available at
-                https://element-hq.github.io/synapse/latest/log_contexts.html
+                asynchronous.
            *args: Positional arguments to pass to function.
            desc: The background task's description. Default to the function's name.
            **kwargs: Keyword arguments to pass to function.
@@ -1668,10 +1661,9 @@ class ModuleApi:
        This should be used to wrap processes which are fired off to run in the
        background, instead of being associated with a particular request.

-        It returns a Deferred which completes when the function completes, but it doesn't
-        follow the synapse logcontext rules, which makes it appropriate for passing to
-        clock.looping_call and friends (or for firing-and-forgetting in the middle of a
-        normal synapse async function).
+        It returns a Deferred which completes when the function completes, which makes
+        it appropriate for passing to clock.looping_call and friends (or for
+        firing-and-forgetting in the middle of a normal synapse async function).

        Args:
            desc: a description for this background process type
@@ -1686,8 +1678,6 @@ class ModuleApi:

        Returns:
            Deferred which returns the result of func, or `None` if func raises.
-            Note that the returned Deferred does not follow the synapse logcontext
-            rules.
        """
        return _run_as_background_process(
            desc, self.server_name, func, *args, bg_start_span=bg_start_span, **kwargs
--- a/synapse/notifier.py
+++ b/synapse/notifier.py
@@ -86,6 +86,24 @@ users_woken_by_stream_counter = Counter(
    labelnames=["stream", SERVER_NAME_LABEL],
 )

+
+notifier_listeners_gauge = LaterGauge(
+    name="synapse_notifier_listeners",
+    desc="",
+    labelnames=[SERVER_NAME_LABEL],
+)
+
+notifier_rooms_gauge = LaterGauge(
+    name="synapse_notifier_rooms",
+    desc="",
+    labelnames=[SERVER_NAME_LABEL],
+)
+notifier_users_gauge = LaterGauge(
+    name="synapse_notifier_users",
+    desc="",
+    labelnames=[SERVER_NAME_LABEL],
+)
+
 T = TypeVar("T")


@@ -281,28 +299,20 @@ class Notifier:
                )
            }

-        LaterGauge(
-            name="synapse_notifier_listeners",
-            desc="",
-            labelnames=[SERVER_NAME_LABEL],
-            caller=count_listeners,
+        notifier_listeners_gauge.register_hook(
+            homeserver_instance_id=hs.get_instance_id(), hook=count_listeners
        )
-
-        LaterGauge(
-            name="synapse_notifier_rooms",
-            desc="",
-            labelnames=[SERVER_NAME_LABEL],
-            caller=lambda: {
+        notifier_rooms_gauge.register_hook(
+            homeserver_instance_id=hs.get_instance_id(),
+            hook=lambda: {
                (self.server_name,): count(
                    bool, list(self.room_to_user_streams.values())
                )
            },
        )
-        LaterGauge(
-            name="synapse_notifier_users",
-            desc="",
-            labelnames=[SERVER_NAME_LABEL],
-            caller=lambda: {(self.server_name,): len(self.user_to_user_stream)},
+        notifier_users_gauge.register_hook(
+            homeserver_instance_id=hs.get_instance_id(),
+            hook=lambda: {(self.server_name,): len(self.user_to_user_stream)},
        )

    def add_replication_callback(self, cb: Callable[[], None]) -> None:
--- a/synapse/push/bulk_push_rule_evaluator.py
+++ b/synapse/push/bulk_push_rule_evaluator.py
@@ -49,7 +49,7 @@ from synapse.api.constants import (
 from synapse.api.room_versions import PushRuleRoomFlag
 from synapse.event_auth import auth_types_for_event, get_user_power_level
 from synapse.events import EventBase, relation_from_event
-from synapse.events.snapshot import EventContext
+from synapse.events.snapshot import EventContext, EventPersistencePair
 from synapse.logging.context import make_deferred_yieldable, run_in_background
 from synapse.metrics import SERVER_NAME_LABEL
 from synapse.state import CREATE_KEY, POWER_KEY
@@ -352,7 +352,7 @@ class BulkPushRuleEvaluator:
        return related_events

    async def action_for_events_by_user(
-        self, events_and_context: List[Tuple[EventBase, EventContext]]
+        self, events_and_context: List[EventPersistencePair]
    ) -> None:
        """Given a list of events and their associated contexts, evaluate the push rules
        for each event, check if the message should increment the unread count, and
--- a/synapse/replication/http/federation.py
+++ b/synapse/replication/http/federation.py
@@ -24,8 +24,8 @@ from typing import TYPE_CHECKING, List, Tuple
 from twisted.web.server import Request

 from synapse.api.room_versions import KNOWN_ROOM_VERSIONS, RoomVersion
-from synapse.events import EventBase, make_event_from_dict
-from synapse.events.snapshot import EventContext
+from synapse.events import make_event_from_dict
+from synapse.events.snapshot import EventContext, EventPersistencePair
 from synapse.http.server import HttpServer
 from synapse.replication.http._base import ReplicationEndpoint
 from synapse.types import JsonDict
@@ -86,7 +86,7 @@ class ReplicationFederationSendEventsRestServlet(ReplicationEndpoint):
    async def _serialize_payload(  # type: ignore[override]
        store: "DataStore",
        room_id: str,
-        event_and_contexts: List[Tuple[EventBase, EventContext]],
+        event_and_contexts: List[EventPersistencePair],
        backfilled: bool,
    ) -> JsonDict:
        """
--- a/synapse/replication/http/send_events.py
+++ b/synapse/replication/http/send_events.py
@@ -25,8 +25,8 @@ from typing import TYPE_CHECKING, List, Tuple
 from twisted.web.server import Request

 from synapse.api.room_versions import KNOWN_ROOM_VERSIONS
-from synapse.events import EventBase, make_event_from_dict
-from synapse.events.snapshot import EventContext
+from synapse.events import make_event_from_dict
+from synapse.events.snapshot import EventContext, EventPersistencePair
 from synapse.http.server import HttpServer
 from synapse.replication.http._base import ReplicationEndpoint
 from synapse.types import JsonDict, Requester, UserID
@@ -85,7 +85,7 @@ class ReplicationSendEventsRestServlet(ReplicationEndpoint):

    @staticmethod
    async def _serialize_payload(  # type: ignore[override]
-        events_and_context: List[Tuple[EventBase, EventContext]],
+        events_and_context: List[EventPersistencePair],
        store: "DataStore",
        requester: Requester,
        ratelimit: bool,
--- a/synapse/replication/tcp/handler.py
+++ b/synapse/replication/tcp/handler.py
@@ -106,6 +106,18 @@ user_ip_cache_counter = Counter(
    "synapse_replication_tcp_resource_user_ip_cache", "", labelnames=[SERVER_NAME_LABEL]
 )

+tcp_resource_total_connections_gauge = LaterGauge(
+    name="synapse_replication_tcp_resource_total_connections",
+    desc="",
+    labelnames=[SERVER_NAME_LABEL],
+)
+
+tcp_command_queue_gauge = LaterGauge(
+    name="synapse_replication_tcp_command_queue",
+    desc="Number of inbound RDATA/POSITION commands queued for processing",
+    labelnames=["stream_name", SERVER_NAME_LABEL],
+)
+

 # the type of the entries in _command_queues_by_stream
 _StreamCommandQueue = Deque[
@@ -243,11 +255,9 @@ class ReplicationCommandHandler:
        # outgoing replication commands to.)
        self._connections: List[IReplicationConnection] = []

-        LaterGauge(
-            name="synapse_replication_tcp_resource_total_connections",
-            desc="",
-            labelnames=[SERVER_NAME_LABEL],
-            caller=lambda: {(self.server_name,): len(self._connections)},
+        tcp_resource_total_connections_gauge.register_hook(
+            homeserver_instance_id=hs.get_instance_id(),
+            hook=lambda: {(self.server_name,): len(self._connections)},
        )

        # When POSITION or RDATA commands arrive, we stick them in a queue and process
@@ -266,11 +276,9 @@ class ReplicationCommandHandler:
        # from that connection.
        self._streams_by_connection: Dict[IReplicationConnection, Set[str]] = {}

-        LaterGauge(
-            name="synapse_replication_tcp_command_queue",
-            desc="Number of inbound RDATA/POSITION commands queued for processing",
-            labelnames=["stream_name", SERVER_NAME_LABEL],
-            caller=lambda: {
+        tcp_command_queue_gauge.register_hook(
+            homeserver_instance_id=hs.get_instance_id(),
+            hook=lambda: {
                (stream_name, self.server_name): len(queue)
                for stream_name, queue in self._command_queues_by_stream.items()
            },
--- a/synapse/replication/tcp/protocol.py
+++ b/synapse/replication/tcp/protocol.py
@@ -527,7 +527,10 @@ pending_commands = LaterGauge(
    name="synapse_replication_tcp_protocol_pending_commands",
    desc="",
    labelnames=["name", SERVER_NAME_LABEL],
-    caller=lambda: {
+)
+pending_commands.register_hook(
+    homeserver_instance_id=None,
+    hook=lambda: {
        (p.name, p.server_name): len(p.pending_commands) for p in connected_connections
    },
 )
@@ -544,7 +547,10 @@ transport_send_buffer = LaterGauge(
    name="synapse_replication_tcp_protocol_transport_send_buffer",
    desc="",
    labelnames=["name", SERVER_NAME_LABEL],
-    caller=lambda: {
+)
+transport_send_buffer.register_hook(
+    homeserver_instance_id=None,
+    hook=lambda: {
        (p.name, p.server_name): transport_buffer_size(p) for p in connected_connections
    },
 )
@@ -571,7 +577,10 @@ tcp_transport_kernel_send_buffer = LaterGauge(
    name="synapse_replication_tcp_protocol_transport_kernel_send_buffer",
    desc="",
    labelnames=["name", SERVER_NAME_LABEL],
-    caller=lambda: {
+)
+tcp_transport_kernel_send_buffer.register_hook(
+    homeserver_instance_id=None,
+    hook=lambda: {
        (p.name, p.server_name): transport_kernel_read_buffer_size(p, False)
        for p in connected_connections
    },
@@ -582,7 +591,10 @@ tcp_transport_kernel_read_buffer = LaterGauge(
    name="synapse_replication_tcp_protocol_transport_kernel_read_buffer",
    desc="",
    labelnames=["name", SERVER_NAME_LABEL],
-    caller=lambda: {
+)
+tcp_transport_kernel_read_buffer.register_hook(
+    homeserver_instance_id=None,
+    hook=lambda: {
        (p.name, p.server_name): transport_kernel_read_buffer_size(p, True)
        for p in connected_connections
    },
--- a/synapse/rest/client/auth_metadata.py
+++ b/synapse/rest/client/auth_metadata.py
@@ -76,11 +76,17 @@ class AuthMetadataServlet(RestServlet):
    Advertises the OAuth 2.0 server metadata for the homeserver.
    """

-    PATTERNS = client_patterns(
-        "/org.matrix.msc2965/auth_metadata$",
-        unstable=True,
-        releases=(),
-    )
+    PATTERNS = [
+        *client_patterns(
+            "/auth_metadata$",
+            releases=("v1",),
+        ),
+        *client_patterns(
+            "/org.matrix.msc2965/auth_metadata$",
+            unstable=True,
+            releases=(),
+        ),
+    ]

    def __init__(self, hs: "HomeServer"):
        super().__init__()
--- a/synapse/server.py
+++ b/synapse/server.py
@@ -129,7 +129,10 @@ from synapse.http.client import (
 )
 from synapse.http.matrixfederationclient import MatrixFederationHttpClient
 from synapse.media.media_repository import MediaRepository
-from synapse.metrics import register_threadpool
+from synapse.metrics import (
+    all_later_gauges_to_clean_up_on_shutdown,
+    register_threadpool,
+)
 from synapse.metrics.common_usage_metrics import CommonUsageMetricsManager
 from synapse.module_api import ModuleApi
 from synapse.module_api.callbacks import ModuleApiCallbacks
@@ -363,11 +366,36 @@ class HomeServer(metaclass=abc.ABCMeta):
        self.datastores = Databases(self.DATASTORE_CLASS, self)
        logger.info("Finished setting up.")

-        # Register background tasks required by this server. This must be done
-        # somewhat manually due to the background tasks not being registered
-        # unless handlers are instantiated.
-        if self.config.worker.run_background_tasks:
-            self.setup_background_tasks()
+    def __del__(self) -> None:
+        """
+        Called when an the homeserver is garbage collected.
+
+        Make sure we actually do some clean-up, rather than leak data.
+        """
+        self.cleanup()
+
+    def cleanup(self) -> None:
+        """
+        WIP: Clean-up any references to the homeserver and stop any running related
+        processes, timers, loops, replication stream, etc.
+
+        This should be called wherever you care about the HomeServer being completely
+        garbage collected like in tests. It's not necessary to call if you plan to just
+        shut down the whole Python process anyway.
+
+        Can be called multiple times.
+        """
+        logger.info("Received cleanup request for %s.", self.hostname)
+
+        # TODO: Stop background processes, timers, loops, replication stream, etc.
+
+        # Cleanup metrics associated with the homeserver
+        for later_gauge in all_later_gauges_to_clean_up_on_shutdown.values():
+            later_gauge.unregister_hooks_for_homeserver_instance_id(
+                self.get_instance_id()
+            )
+
+        logger.info("Cleanup complete for %s.", self.hostname)

    def start_listening(self) -> None:  # noqa: B027 (no-op by design)
        """Start the HTTP, manhole, metrics, etc listeners
@@ -376,7 +404,7 @@ class HomeServer(metaclass=abc.ABCMeta):
        appropriate listeners.
        """

-    def setup_background_tasks(self) -> None:
+    def start_background_tasks(self) -> None:
        """
        Some handlers have side effects on instantiation (like registering
        background updates). This function causes them to be fetched, and
--- a/synapse/storage/controllers/persist_events.py
+++ b/synapse/storage/controllers/persist_events.py
@@ -51,7 +51,7 @@ from twisted.internet import defer

 from synapse.api.constants import EventTypes, Membership
 from synapse.events import EventBase
-from synapse.events.snapshot import EventContext
+from synapse.events.snapshot import EventContext, EventPersistencePair
 from synapse.handlers.worker_lock import NEW_EVENT_DURING_PURGE_LOCK_NAME
 from synapse.logging.context import PreserveLoggingContext, make_deferred_yieldable
 from synapse.logging.opentracing import (
@@ -144,7 +144,7 @@ class _PersistEventsTask:

    name: ClassVar[str] = "persist_event_batch"  # used for opentracing

-    events_and_contexts: List[Tuple[EventBase, EventContext]]
+    events_and_contexts: List[EventPersistencePair]
    backfilled: bool

    def try_merge(self, task: "_EventPersistQueueTask") -> bool:
@@ -391,7 +391,7 @@ class EventsPersistenceStorageController:
    @trace
    async def persist_events(
        self,
-        events_and_contexts: Iterable[Tuple[EventBase, EventContext]],
+        events_and_contexts: Iterable[EventPersistencePair],
        backfilled: bool = False,
    ) -> Tuple[List[EventBase], RoomStreamToken]:
        """
@@ -414,7 +414,7 @@ class EventsPersistenceStorageController:
                a room that has been un-partial stated.
        """
        event_ids: List[str] = []
-        partitioned: Dict[str, List[Tuple[EventBase, EventContext]]] = {}
+        partitioned: Dict[str, List[EventPersistencePair]] = {}
        for event, ctx in events_and_contexts:
            partitioned.setdefault(event.room_id, []).append((event, ctx))
            event_ids.append(event.event_id)
@@ -430,7 +430,7 @@ class EventsPersistenceStorageController:
        set_tag(SynapseTags.FUNC_ARG_PREFIX + "backfilled", str(backfilled))

        async def enqueue(
-            item: Tuple[str, List[Tuple[EventBase, EventContext]]],
+            item: Tuple[str, List[EventPersistencePair]],
        ) -> Dict[str, str]:
            room_id, evs_ctxs = item
            return await self._event_persist_queue.add_to_queue(
@@ -677,7 +677,7 @@ class EventsPersistenceStorageController:
        return replaced_events

    async def _calculate_new_forward_extremities_and_state_delta(
-        self, room_id: str, ev_ctx_rm: List[Tuple[EventBase, EventContext]]
+        self, room_id: str, ev_ctx_rm: List[EventPersistencePair]
    ) -> Tuple[Optional[Set[str]], Optional[DeltaState]]:
        """Calculates the new forward extremities and state delta for a room
        given events to persist.
@@ -802,7 +802,7 @@ class EventsPersistenceStorageController:
    async def _calculate_new_extremities(
        self,
        room_id: str,
-        event_contexts: List[Tuple[EventBase, EventContext]],
+        event_contexts: List[EventPersistencePair],
        latest_event_ids: AbstractSet[str],
    ) -> Set[str]:
        """Calculates the new forward extremities for a room given events to
@@ -862,7 +862,7 @@ class EventsPersistenceStorageController:
    async def _get_new_state_after_events(
        self,
        room_id: str,
-        events_context: List[Tuple[EventBase, EventContext]],
+        events_context: List[EventPersistencePair],
        old_latest_event_ids: AbstractSet[str],
        new_latest_event_ids: Set[str],
    ) -> Tuple[Optional[StateMap[str]], Optional[StateMap[str]], Set[str]]:
@@ -1039,7 +1039,7 @@ class EventsPersistenceStorageController:
        new_latest_event_ids: Set[str],
        resolved_state_group: int,
        event_id_to_state_group: Dict[str, int],
-        events_context: List[Tuple[EventBase, EventContext]],
+        events_context: List[EventPersistencePair],
    ) -> Set[str]:
        """See if we can prune any of the extremities after calculating the
        resolved state.
@@ -1176,7 +1176,7 @@ class EventsPersistenceStorageController:
    async def _is_server_still_joined(
        self,
        room_id: str,
-        ev_ctx_rm: List[Tuple[EventBase, EventContext]],
+        ev_ctx_rm: List[EventPersistencePair],
        delta: DeltaState,
    ) -> bool:
        """Check if the server will still be joined after the given events have
--- a/synapse/storage/database.py
+++ b/synapse/storage/database.py
@@ -61,7 +61,7 @@ from synapse.logging.context import (
    current_context,
    make_deferred_yieldable,
 )
-from synapse.metrics import SERVER_NAME_LABEL, LaterGauge, register_threadpool
+from synapse.metrics import SERVER_NAME_LABEL, register_threadpool
 from synapse.metrics.background_process_metrics import run_as_background_process
 from synapse.storage.background_updates import BackgroundUpdater
 from synapse.storage.engines import BaseDatabaseEngine, PostgresEngine, Sqlite3Engine
@@ -611,12 +611,6 @@ class DatabasePool:
        )

        self.updates = BackgroundUpdater(hs, self)
-        LaterGauge(
-            name="synapse_background_update_status",
-            desc="Background update status",
-            labelnames=[SERVER_NAME_LABEL],
-            caller=lambda: {(self.server_name,): self.updates.get_status()},
-        )

        self._previous_txn_total_time = 0.0
        self._current_txn_total_time = 0.0
--- a/synapse/storage/databases/init.py
+++ b/synapse/storage/databases/init.py
@@ -22,6 +22,7 @@
 import logging
 from typing import TYPE_CHECKING, Generic, List, Optional, Type, TypeVar

+from synapse.metrics import SERVER_NAME_LABEL, LaterGauge
 from synapse.storage._base import SQLBaseStore
 from synapse.storage.database import DatabasePool, make_conn
 from synapse.storage.databases.main.events import PersistEventsStore
@@ -40,6 +41,13 @@ logger = logging.getLogger(__name__)
 DataStoreT = TypeVar("DataStoreT", bound=SQLBaseStore, covariant=True)


+background_update_status = LaterGauge(
+    name="synapse_background_update_status",
+    desc="Background update status",
+    labelnames=["database_name", SERVER_NAME_LABEL],
+)
+
+
 class Databases(Generic[DataStoreT]):
    """The various databases.

@@ -143,6 +151,15 @@ class Databases(Generic[DataStoreT]):

            db_conn.close()

+        # Track the background update status for each database
+        background_update_status.register_hook(
+            homeserver_instance_id=hs.get_instance_id(),
+            hook=lambda: {
+                (database.name(), server_name): database.updates.get_status()
+                for database in self.databases
+            },
+        )
+
        # Sanity check that we have actually configured all the required stores.
        if not main:
            raise Exception("No 'main' database configured")
--- a/synapse/storage/databases/main/events.py
+++ b/synapse/storage/databases/main/events.py
@@ -57,7 +57,7 @@ from synapse.events import (
    is_creator,
    relation_from_event,
 )
-from synapse.events.snapshot import EventContext
+from synapse.events.snapshot import EventPersistencePair
 from synapse.events.utils import parse_stripped_state_event
 from synapse.logging.opentracing import trace
 from synapse.metrics import SERVER_NAME_LABEL
@@ -274,7 +274,7 @@ class PersistEventsStore:
    async def _persist_events_and_state_updates(
        self,
        room_id: str,
-        events_and_contexts: List[Tuple[EventBase, EventContext]],
+        events_and_contexts: List[EventPersistencePair],
        *,
        state_delta_for_room: Optional[DeltaState],
        new_forward_extremities: Optional[Set[str]],
@@ -532,7 +532,7 @@ class PersistEventsStore:
    async def _calculate_sliding_sync_table_changes(
        self,
        room_id: str,
-        events_and_contexts: Sequence[Tuple[EventBase, EventContext]],
+        events_and_contexts: Sequence[EventPersistencePair],
        delta_state: DeltaState,
    ) -> SlidingSyncTableChanges:
        """
@@ -1016,7 +1016,7 @@ class PersistEventsStore:
        txn: LoggingTransaction,
        *,
        room_id: str,
-        events_and_contexts: List[Tuple[EventBase, EventContext]],
+        events_and_contexts: List[EventPersistencePair],
        inhibit_local_membership_updates: bool,
        state_delta_for_room: Optional[DeltaState],
        new_forward_extremities: Optional[Set[str]],
@@ -1666,7 +1666,7 @@ class PersistEventsStore:
    def _persist_transaction_ids_txn(
        self,
        txn: LoggingTransaction,
-        events_and_contexts: List[Tuple[EventBase, EventContext]],
+        events_and_contexts: List[EventPersistencePair],
    ) -> None:
        """Persist the mapping from transaction IDs to event IDs (if defined)."""

@@ -2316,7 +2316,7 @@ class PersistEventsStore:
        self,
        txn: LoggingTransaction,
        room_id: str,
-        events_and_contexts: List[Tuple[EventBase, EventContext]],
+        events_and_contexts: List[EventPersistencePair],
    ) -> None:
        """
        Update the latest `event_stream_ordering`/`bump_stamp` columns in the
@@ -2456,8 +2456,8 @@ class PersistEventsStore:

    @classmethod
    def _filter_events_and_contexts_for_duplicates(
-        cls, events_and_contexts: List[Tuple[EventBase, EventContext]]
-    ) -> List[Tuple[EventBase, EventContext]]:
+        cls, events_and_contexts: List[EventPersistencePair]
+    ) -> List[EventPersistencePair]:
        """Ensure that we don't have the same event twice.

        Pick the earliest non-outlier if there is one, else the earliest one.
@@ -2468,9 +2468,7 @@ class PersistEventsStore:
        Returns:
            filtered list
        """
-        new_events_and_contexts: OrderedDict[str, Tuple[EventBase, EventContext]] = (
-            OrderedDict()
-        )
+        new_events_and_contexts: OrderedDict[str, EventPersistencePair] = OrderedDict()
        for event, context in events_and_contexts:
            prev_event_context = new_events_and_contexts.get(event.event_id)
            if prev_event_context:
@@ -2488,7 +2486,7 @@ class PersistEventsStore:
        self,
        txn: LoggingTransaction,
        room_id: str,
-        events_and_contexts: List[Tuple[EventBase, EventContext]],
+        events_and_contexts: List[EventPersistencePair],
    ) -> None:
        """Update min_depth for each room

@@ -2530,8 +2528,8 @@ class PersistEventsStore:
    def _update_outliers_txn(
        self,
        txn: LoggingTransaction,
-        events_and_contexts: List[Tuple[EventBase, EventContext]],
-    ) -> List[Tuple[EventBase, EventContext]]:
+        events_and_contexts: List[EventPersistencePair],
+    ) -> List[EventPersistencePair]:
        """Update any outliers with new event info.

        This turns outliers into ex-outliers (unless the new event was rejected), and
@@ -2638,7 +2636,7 @@ class PersistEventsStore:
    def _store_event_txn(
        self,
        txn: LoggingTransaction,
-        events_and_contexts: Collection[Tuple[EventBase, EventContext]],
+        events_and_contexts: Collection[EventPersistencePair],
    ) -> None:
        """Insert new events into the event, event_json, redaction and
        state_events tables.
@@ -2742,8 +2740,8 @@ class PersistEventsStore:
    def _store_rejected_events_txn(
        self,
        txn: LoggingTransaction,
-        events_and_contexts: List[Tuple[EventBase, EventContext]],
-    ) -> List[Tuple[EventBase, EventContext]]:
+        events_and_contexts: List[EventPersistencePair],
+    ) -> List[EventPersistencePair]:
        """Add rows to the 'rejections' table for received events which were
        rejected

@@ -2770,8 +2768,8 @@ class PersistEventsStore:
        self,
        txn: LoggingTransaction,
        *,
-        events_and_contexts: List[Tuple[EventBase, EventContext]],
-        all_events_and_contexts: List[Tuple[EventBase, EventContext]],
+        events_and_contexts: List[EventPersistencePair],
+        all_events_and_contexts: List[EventPersistencePair],
        inhibit_local_membership_updates: bool = False,
    ) -> None:
        """Update all the miscellaneous tables for new events
@@ -2865,7 +2863,7 @@ class PersistEventsStore:
    def _add_to_cache(
        self,
        txn: LoggingTransaction,
-        events_and_contexts: List[Tuple[EventBase, EventContext]],
+        events_and_contexts: List[EventPersistencePair],
    ) -> None:
        to_prefill: List[EventCacheEntry] = []

@@ -3338,8 +3336,8 @@ class PersistEventsStore:
    def _set_push_actions_for_event_and_users_txn(
        self,
        txn: LoggingTransaction,
-        events_and_contexts: List[Tuple[EventBase, EventContext]],
-        all_events_and_contexts: List[Tuple[EventBase, EventContext]],
+        events_and_contexts: List[EventPersistencePair],
+        all_events_and_contexts: List[EventPersistencePair],
    ) -> None:
        """Handles moving push actions from staging table to main
        event_push_actions table for all events in `events_and_contexts`.
@@ -3422,7 +3420,7 @@ class PersistEventsStore:
    def _store_event_state_mappings_txn(
        self,
        txn: LoggingTransaction,
-        events_and_contexts: Collection[Tuple[EventBase, EventContext]],
+        events_and_contexts: Collection[EventPersistencePair],
    ) -> None:
        """
        Raises:
--- a/synapse/storage/databases/main/events_worker.py
+++ b/synapse/storage/databases/main/events_worker.py
@@ -81,6 +81,7 @@ from synapse.storage.database import (
    DatabasePool,
    LoggingDatabaseConnection,
    LoggingTransaction,
+    make_tuple_in_list_sql_clause,
 )
 from synapse.storage.types import Cursor
 from synapse.storage.util.id_generators import (
@@ -1337,6 +1338,7 @@ class EventsWorkerStore(SQLBaseStore):
        fetched_event_ids: Set[str] = set()
        fetched_events: Dict[str, _EventRow] = {}

+        @trace
        async def _fetch_event_ids_and_get_outstanding_redactions(
            event_ids_to_fetch: Collection[str],
        ) -> Collection[str]:
@@ -1344,6 +1346,10 @@ class EventsWorkerStore(SQLBaseStore):
            Fetch all of the given event_ids and return any associated redaction event_ids
            that we still need to fetch in the next iteration.
            """
+            set_tag(
+                SynapseTags.FUNC_ARG_PREFIX + "event_ids_to_fetch.length",
+                str(len(event_ids_to_fetch)),
+            )
            row_map = await self._enqueue_events(event_ids_to_fetch)

            # we need to recursively fetch any redactions of those events
@@ -1617,21 +1623,28 @@ class EventsWorkerStore(SQLBaseStore):
            # likely that some of these events may be for the same room/user combo, in
            # which case we don't need to do redundant queries
            to_check_set = set(to_check)
-            for room_and_user in to_check_set:
-                room_redactions_sql = "SELECT redacting_event_id, redact_end_ordering FROM room_ban_redactions WHERE room_id = ? and user_id = ?"
-                txn.execute(room_redactions_sql, room_and_user)
-
-                res = txn.fetchone()
-                # we have a redaction for a room, user_id combo - apply it to matching events
-                if not res:
-                    continue
+            room_redaction_sql = "SELECT room_id, user_id, redacting_event_id, redact_end_ordering FROM room_ban_redactions WHERE "
+            (
+                in_list_clause,
+                room_redaction_args,
+            ) = make_tuple_in_list_sql_clause(
+                self.database_engine, ("room_id", "user_id"), to_check_set
+            )
+            txn.execute(room_redaction_sql + in_list_clause, room_redaction_args)
+            for (
+                returned_room_id,
+                returned_user_id,
+                redacting_event_id,
+                redact_end_ordering,
+            ) in txn:
                for e_row in events:
                    e_json = json.loads(e_row.json)
                    room_id = e_json.get("room_id")
                    user_id = e_json.get("sender")
+                    room_and_user = (returned_room_id, returned_user_id)
+                    # check if we have a redaction match for this room, user combination
                    if room_and_user != (room_id, user_id):
                        continue
-                    redacting_event_id, redact_end_ordering = res
                    if redact_end_ordering:
                        # Avoid redacting any events arriving *after* the membership event which
                        # ends an active redaction - note that this will always redact
--- a/synapse/storage/databases/main/roommember.py
+++ b/synapse/storage/databases/main/roommember.py
@@ -84,6 +84,13 @@ _CURRENT_STATE_MEMBERSHIP_UPDATE_NAME = "current_state_events_membership"
 _POPULATE_PARTICIPANT_BG_UPDATE_BATCH_SIZE = 1000


+federation_known_servers_gauge = LaterGauge(
+    name="synapse_federation_known_servers",
+    desc="",
+    labelnames=[SERVER_NAME_LABEL],
+)
+
+
@attr.s(frozen=True, slots=True, auto_attribs=True)
 class EventIdMembership:
    """Returned by `get_membership_from_event_ids`"""
@@ -116,11 +123,9 @@ class RoomMemberWorkerStore(EventsWorkerStore, CacheInvalidationWorkerStore):
                1,
                self._count_known_servers,
            )
-            LaterGauge(
-                name="synapse_federation_known_servers",
-                desc="",
-                labelnames=[SERVER_NAME_LABEL],
-                caller=lambda: {(self.server_name,): self._known_servers_count},
+            federation_known_servers_gauge.register_hook(
+                homeserver_instance_id=hs.get_instance_id(),
+                hook=lambda: {(self.server_name,): self._known_servers_count},
            )

    @wrap_as_background_process("_count_known_servers")
--- a/synapse/storage/databases/state/deletion.py
+++ b/synapse/storage/databases/state/deletion.py
@@ -25,8 +25,7 @@ from typing import (
    Tuple,
 )

-from synapse.events import EventBase
-from synapse.events.snapshot import EventContext
+from synapse.events.snapshot import EventPersistencePair
 from synapse.storage.database import (
    DatabasePool,
    LoggingDatabaseConnection,
@@ -228,7 +227,7 @@ class StateDeletionDataStore:

    @contextlib.asynccontextmanager
    async def persisting_state_group_references(
-        self, event_and_contexts: Collection[Tuple[EventBase, EventContext]]
+        self, event_and_contexts: Collection[EventPersistencePair]
    ) -> AsyncIterator[None]:
        """Wraps the persistence of the given events and contexts, ensuring that
        any state groups referenced still exist and that they don't get deleted
--- a/synapse/util/async_helpers.py
+++ b/synapse/util/async_helpers.py
@@ -84,9 +84,6 @@ class AbstractObservableDeferred(Generic[_T], metaclass=abc.ABCMeta):
        This returns a brand new deferred that is resolved when the underlying
        deferred is resolved. Interacting with the returned deferred does not
        effect the underlying deferred.
-
-        Note that the returned Deferred doesn't follow the Synapse logcontext rules -
-        you will probably want to `make_deferred_yieldable` it.
        """
        ...

@@ -100,11 +97,6 @@ class ObservableDeferred(Generic[_T], AbstractObservableDeferred[_T]):

    Cancelling or otherwise resolving an observer will not affect the original
    ObservableDeferred.
-
-    NB that it does not attempt to do anything with logcontexts; in general
-    you should probably make_deferred_yieldable the deferreds
-    returned by `observe`, and ensure that the original deferred runs its
-    callbacks in the sentinel logcontext.
    """

    __slots__ = ["_deferred", "_observers", "_result"]
@@ -861,16 +853,12 @@ def stop_cancellation(deferred: "defer.Deferred[T]") -> "defer.Deferred[T]":
    """Prevent a `Deferred` from being cancelled by wrapping it in another `Deferred`.

    Args:
-        deferred: The `Deferred` to protect against cancellation. Must not follow the
-            Synapse logcontext rules.
+        deferred: The `Deferred` to protect against cancellation.

    Returns:
        A new `Deferred`, which will contain the result of the original `Deferred`.
        The new `Deferred` will not propagate cancellation through to the original.
        When cancelled, the new `Deferred` will fail with a `CancelledError`.
-
-        The new `Deferred` will not follow the Synapse logcontext rules and should be
-        wrapped with `make_deferred_yieldable`.
    """
    new_deferred: "defer.Deferred[T]" = defer.Deferred()
    deferred.chainDeferred(new_deferred)
@@ -896,8 +884,7 @@ def delay_cancellation(awaitable: Awaitable[T]) -> Awaitable[T]:
    resolve with a `CancelledError` until the original awaitable resolves.

    Args:
-        deferred: The coroutine or `Deferred` to protect against cancellation. May
-            optionally follow the Synapse logcontext rules.
+        deferred: The coroutine or `Deferred` to protect against cancellation.

    Returns:
        A new `Deferred`, which will contain the result of the original coroutine or
@@ -906,10 +893,6 @@ def delay_cancellation(awaitable: Awaitable[T]) -> Awaitable[T]:

        When cancelled, the new `Deferred` will wait until the original coroutine or
        `Deferred` resolves before failing with a `CancelledError`.
-
-        The new `Deferred` will follow the Synapse logcontext rules if `awaitable`
-        follows the Synapse logcontext rules. Otherwise the new `Deferred` should be
-        wrapped with `make_deferred_yieldable`.
    """

    # First, convert the awaitable into a `Deferred`.
--- a/synapse/util/caches/deferred_cache.py
+++ b/synapse/util/caches/deferred_cache.py
@@ -295,9 +295,6 @@ class DeferredCache(Generic[KT, VT]):
        *original* `value`, (c) any future calls to `get()` will complete with the
        result from the *new* `value`.

-        It is expected that `value` does *not* follow the synapse logcontext rules - ie,
-        if it is incomplete, it runs its callbacks in the sentinel context.
-
        Args:
            key: Key to be set
            value: a deferred which will complete with a result to add to the cache
--- a/synapse/util/caches/response_cache.py
+++ b/synapse/util/caches/response_cache.py
@@ -234,11 +234,9 @@ class ResponseCache(Generic[KV]):
    ) -> RV:
        """Wrap together a *get* and *set* call, taking care of logcontexts

-        First looks up the key in the cache, and if it is present makes it
-        follow the synapse logcontext rules and returns it.
+        First looks up the key in the cache, and if present, returns it.

-        Otherwise, makes a call to *callback(*args, **kwargs)*, which should
-        follow the synapse logcontext rules, and adds the result to the cache.
+        Otherwise, makes a call to *callback(*args, **kwargs)* and adds the result to the cache.

        Example usage:

--- a/synapse/util/patch_inline_callbacks.py
+++ b/synapse/util/patch_inline_callbacks.py
@@ -1,250 +0,0 @@
-#
-# This file is licensed under the Affero General Public License (AGPL) version 3.
-#
-# Copyright (C) 2023 New Vector, Ltd
-#
-# This program is free software: you can redistribute it and/or modify
-# it under the terms of the GNU Affero General Public License as
-# published by the Free Software Foundation, either version 3 of the
-# License, or (at your option) any later version.
-#
-# See the GNU Affero General Public License for more details:
-# <https://www.gnu.org/licenses/agpl-3.0.html>.
-#
-# Originally licensed under the Apache License, Version 2.0:
-# <http://www.apache.org/licenses/LICENSE-2.0>.
-#
-# [This file includes modifications made by New Vector Limited]
-#
-#
-
-import functools
-import sys
-from types import GeneratorType
-from typing import Any, Callable, Generator, List, TypeVar, cast
-
-from typing_extensions import ParamSpec
-
-from twisted.internet import defer
-from twisted.internet.defer import Deferred
-from twisted.python.failure import Failure
-
-# Tracks if we've already patched inlineCallbacks
-_already_patched = False
-
-
-T = TypeVar("T")
-P = ParamSpec("P")
-
-
-def do_patch() -> None:
-    """
-    Patch defer.inlineCallbacks so that it checks the state of the logcontext on exit
-    """
-
-    from synapse.logging.context import current_context
-
-    global _already_patched
-
-    orig_inline_callbacks = defer.inlineCallbacks
-    if _already_patched:
-        return
-
-    def new_inline_callbacks(
-        f: Callable[P, Generator["Deferred[object]", object, T]],
-    ) -> Callable[P, "Deferred[T]"]:
-        @functools.wraps(f)
-        def wrapped(*args: P.args, **kwargs: P.kwargs) -> "Deferred[T]":
-            start_context = current_context()
-            changes: List[str] = []
-            orig: Callable[P, "Deferred[T]"] = orig_inline_callbacks(
-                _check_yield_points(f, changes)
-            )
-
-            try:
-                res: "Deferred[T]" = orig(*args, **kwargs)
-            except Exception:
-                if current_context() != start_context:
-                    for err in changes:
-                        print(err, file=sys.stderr)
-
-                    err = "%s changed context from %s to %s on exception" % (
-                        f,
-                        start_context,
-                        current_context(),
-                    )
-                    print(err, file=sys.stderr)
-                    raise Exception(err)
-                raise
-
-            if not isinstance(res, Deferred) or res.called:
-                if current_context() != start_context:
-                    for err in changes:
-                        print(err, file=sys.stderr)
-
-                    err = "Completed %s changed context from %s to %s" % (
-                        f,
-                        start_context,
-                        current_context(),
-                    )
-                    # print the error to stderr because otherwise all we
-                    # see in travis-ci is the 500 error
-                    print(err, file=sys.stderr)
-                    raise Exception(err)
-                return res
-
-            if current_context():
-                err = (
-                    "%s returned incomplete deferred in non-sentinel context "
-                    "%s (start was %s)"
-                ) % (f, current_context(), start_context)
-                print(err, file=sys.stderr)
-                raise Exception(err)
-
-            def check_ctx(r: T) -> T:
-                if current_context() != start_context:
-                    for err in changes:
-                        print(err, file=sys.stderr)
-                    err = "%s completion of %s changed context from %s to %s" % (
-                        "Failure" if isinstance(r, Failure) else "Success",
-                        f,
-                        start_context,
-                        current_context(),
-                    )
-                    print(err, file=sys.stderr)
-                    raise Exception(err)
-                return r
-
-            res.addBoth(check_ctx)
-            return res
-
-        return wrapped
-
-    defer.inlineCallbacks = new_inline_callbacks
-    _already_patched = True
-
-
-def _check_yield_points(
-    f: Callable[P, Generator["Deferred[object]", object, T]],
-    changes: List[str],
-) -> Callable:
-    """Wraps a generator that is about to be passed to defer.inlineCallbacks
-    checking that after every yield the log contexts are correct.
-
-    It's perfectly valid for log contexts to change within a function, e.g. due
-    to new Measure blocks, so such changes are added to the given `changes`
-    list instead of triggering an exception.
-
-    Args:
-        f: generator function to wrap
-        changes: A list of strings detailing how the contexts
-            changed within a function.
-
-    Returns:
-        function
-    """
-
-    from synapse.logging.context import current_context
-
-    @functools.wraps(f)
-    def check_yield_points_inner(
-        *args: P.args, **kwargs: P.kwargs
-    ) -> Generator["Deferred[object]", object, T]:
-        gen = f(*args, **kwargs)
-
-        # We only patch if we have a native generator function, as we rely on
-        # `gen.gi_frame`.
-        if not isinstance(gen, GeneratorType):
-            ret = yield from gen
-            return ret
-
-        last_yield_line_no = gen.gi_frame.f_lineno
-        result: Any = None
-        while True:
-            expected_context = current_context()
-
-            try:
-                isFailure = isinstance(result, Failure)
-                if isFailure:
-                    d = result.throwExceptionIntoGenerator(gen)
-                else:
-                    d = gen.send(result)
-            except StopIteration as e:
-                if current_context() != expected_context:
-                    # This happens when the context is lost sometime *after* the
-                    # final yield and returning. E.g. we forgot to yield on a
-                    # function that returns a deferred.
-                    #
-                    # We don't raise here as it's perfectly valid for contexts to
-                    # change in a function, as long as it sets the correct context
-                    # on resolving (which is checked separately).
-                    err = (
-                        "Function %r returned and changed context from %s to %s,"
-                        " in %s between %d and end of func"
-                        % (
-                            f.__qualname__,
-                            expected_context,
-                            current_context(),
-                            f.__code__.co_filename,
-                            last_yield_line_no,
-                        )
-                    )
-                    changes.append(err)
-                # The `StopIteration` contains the return value from the
-                # generator.
-                return cast(T, e.value)
-
-            frame = gen.gi_frame
-
-            if isinstance(d, defer.Deferred) and not d.called:
-                # This happens if we yield on a deferred that doesn't follow
-                # the log context rules without wrapping in a `make_deferred_yieldable`.
-                # We raise here as this should never happen.
-                if current_context():
-                    err = (
-                        "%s yielded with context %s rather than sentinel,"
-                        " yielded on line %d in %s"
-                        % (
-                            frame.f_code.co_name,
-                            current_context(),
-                            frame.f_lineno,
-                            frame.f_code.co_filename,
-                        )
-                    )
-                    raise Exception(err)
-
-            # the wrapped function yielded a Deferred: yield it back up to the parent
-            # inlineCallbacks().
-            try:
-                result = yield d
-            except Exception:
-                # this will fish an earlier Failure out of the stack where possible, and
-                # thus is preferable to passing in an exception to the Failure
-                # constructor, since it results in less stack-mangling.
-                result = Failure()
-
-            if current_context() != expected_context:
-                # This happens because the context is lost sometime *after* the
-                # previous yield and *after* the current yield. E.g. the
-                # deferred we waited on didn't follow the rules, or we forgot to
-                # yield on a function between the two yield points.
-                #
-                # We don't raise here as its perfectly valid for contexts to
-                # change in a function, as long as it sets the correct context
-                # on resolving (which is checked separately).
-                err = (
-                    "%s changed context from %s to %s, happened between lines %d and %d in %s"
-                    % (
-                        frame.f_code.co_name,
-                        expected_context,
-                        current_context(),
-                        last_yield_line_no,
-                        frame.f_lineno,
-                        frame.f_code.co_filename,
-                    )
-                )
-                changes.append(err)
-
-            last_yield_line_no = frame.f_lineno
-
-    return check_yield_points_inner
--- a/synapse/util/ratelimitutils.py
+++ b/synapse/util/ratelimitutils.py
@@ -131,22 +131,28 @@ def _get_counts_from_rate_limiter_instance(
 # We track the number of affected hosts per time-period so we can
 # differentiate one really noisy homeserver from a general
 # ratelimit tuning problem across the federation.
-LaterGauge(
+sleep_affected_hosts_gauge = LaterGauge(
    name="synapse_rate_limit_sleep_affected_hosts",
    desc="Number of hosts that had requests put to sleep",
    labelnames=["rate_limiter_name", SERVER_NAME_LABEL],
-    caller=lambda: _get_counts_from_rate_limiter_instance(
+)
+sleep_affected_hosts_gauge.register_hook(
+    homeserver_instance_id=None,
+    hook=lambda: _get_counts_from_rate_limiter_instance(
        lambda rate_limiter_instance: sum(
            ratelimiter.should_sleep()
            for ratelimiter in rate_limiter_instance.ratelimiters.values()
        )
    ),
 )
-LaterGauge(
+reject_affected_hosts_gauge = LaterGauge(
    name="synapse_rate_limit_reject_affected_hosts",
    desc="Number of hosts that had requests rejected",
    labelnames=["rate_limiter_name", SERVER_NAME_LABEL],
-    caller=lambda: _get_counts_from_rate_limiter_instance(
+)
+reject_affected_hosts_gauge.register_hook(
+    homeserver_instance_id=None,
+    hook=lambda: _get_counts_from_rate_limiter_instance(
        lambda rate_limiter_instance: sum(
            ratelimiter.should_reject()
            for ratelimiter in rate_limiter_instance.ratelimiters.values()
--- a/synapse/util/task_scheduler.py
+++ b/synapse/util/task_scheduler.py
@@ -44,6 +44,13 @@ if TYPE_CHECKING:
 logger = logging.getLogger(__name__)


+running_tasks_gauge = LaterGauge(
+    name="synapse_scheduler_running_tasks",
+    desc="The number of concurrent running tasks handled by the TaskScheduler",
+    labelnames=[SERVER_NAME_LABEL],
+)
+
+
 class TaskScheduler:
    """
    This is a simple task scheduler designed for resumable tasks. Normally,
@@ -130,11 +137,9 @@ class TaskScheduler:
                TaskScheduler.SCHEDULE_INTERVAL_MS,
            )

-        LaterGauge(
-            name="synapse_scheduler_running_tasks",
-            desc="The number of concurrent running tasks handled by the TaskScheduler",
-            labelnames=[SERVER_NAME_LABEL],
-            caller=lambda: {(self.server_name,): len(self._running_tasks)},
+        running_tasks_gauge.register_hook(
+            homeserver_instance_id=hs.get_instance_id(),
+            hook=lambda: {(self.server_name,): len(self._running_tasks)},
        )

    def register_action(
--- a/tests/init.py
+++ b/tests/init.py
@@ -21,9 +21,4 @@

 from twisted.trial import util

-from synapse.util.patch_inline_callbacks import do_patch
-
-# attempt to do the patch before we load any synapse code
-do_patch()
-
 util.DEFAULT_TIMEOUT_DURATION = 20
--- a/tests/handlers/test_oauth_delegation.py
+++ b/tests/handlers/test_oauth_delegation.py
@@ -25,11 +25,11 @@ import time
 from http import HTTPStatus
 from http.server import BaseHTTPRequestHandler, HTTPServer
 from io import BytesIO
-from typing import Any, Coroutine, Dict, Generator, Optional, TypeVar, Union
+from typing import Any, ClassVar, Coroutine, Dict, Generator, Optional, TypeVar, Union
 from unittest.mock import ANY, AsyncMock, Mock
 from urllib.parse import parse_qs

-from parameterized import parameterized_class
+from parameterized.parameterized import parameterized_class
 from signedjson.key import (
    encode_verify_key_base64,
    generate_signing_key,
@@ -46,7 +46,6 @@ from synapse.api.errors import (
    Codes,
    HttpResponseException,
    InvalidClientTokenError,
-    OAuthInsufficientScopeError,
    SynapseError,
 )
 from synapse.appservice import ApplicationService
@@ -78,11 +77,7 @@ JWKS_URI = ISSUER + ".well-known/jwks.json"
 INTROSPECTION_ENDPOINT = ISSUER + "introspect"

 SYNAPSE_ADMIN_SCOPE = "urn:synapse:admin:*"
-MATRIX_USER_SCOPE = "urn:matrix:org.matrix.msc2967.client:api:*"
-MATRIX_GUEST_SCOPE = "urn:matrix:org.matrix.msc2967.client:api:guest"
-MATRIX_DEVICE_SCOPE_PREFIX = "urn:matrix:org.matrix.msc2967.client:device:"
 DEVICE = "AABBCCDD"
-MATRIX_DEVICE_SCOPE = MATRIX_DEVICE_SCOPE_PREFIX + DEVICE
 SUBJECT = "abc-def-ghi"
 USERNAME = "test-user"
 USER_ID = "@" + USERNAME + ":" + SERVER_NAME
@@ -112,7 +107,24 @@ async def get_json(url: str) -> JsonDict:


@skip_unless(HAS_AUTHLIB, "requires authlib")
+@parameterized_class(
+    ("device_scope_prefix", "api_scope"),
+    [
+        ("urn:matrix:client:device:", "urn:matrix:client:api:*"),
+        (
+            "urn:matrix:org.matrix.msc2967.client:device:",
+            "urn:matrix:org.matrix.msc2967.client:api:*",
+        ),
+    ],
+)
 class MSC3861OAuthDelegation(HomeserverTestCase):
+    device_scope_prefix: ClassVar[str]
+    api_scope: ClassVar[str]
+
+    @property
+    def device_scope(self) -> str:
+        return self.device_scope_prefix + DEVICE
+
    servlets = [
        account.register_servlets,
        keys.register_servlets,
@@ -212,7 +224,7 @@ class MSC3861OAuthDelegation(HomeserverTestCase):
        """The handler should return a 500 when no subject is present."""

        self._set_introspection_returnvalue(
-            {"active": True, "scope": " ".join([MATRIX_USER_SCOPE])}
+            {"active": True, "scope": " ".join([self.api_scope])},
        )

        request = Mock(args={})
@@ -235,7 +247,7 @@ class MSC3861OAuthDelegation(HomeserverTestCase):
            {
                "active": True,
                "sub": SUBJECT,
-                "scope": " ".join([MATRIX_DEVICE_SCOPE]),
+                "scope": " ".join([self.device_scope]),
            }
        )
        request = Mock(args={})
@@ -282,7 +294,7 @@ class MSC3861OAuthDelegation(HomeserverTestCase):
            {
                "active": True,
                "sub": SUBJECT,
-                "scope": " ".join([SYNAPSE_ADMIN_SCOPE, MATRIX_USER_SCOPE]),
+                "scope": " ".join([SYNAPSE_ADMIN_SCOPE, self.api_scope]),
                "username": USERNAME,
            }
        )
@@ -312,9 +324,7 @@ class MSC3861OAuthDelegation(HomeserverTestCase):
            {
                "active": True,
                "sub": SUBJECT,
-                "scope": " ".join(
-                    [SYNAPSE_ADMIN_SCOPE, MATRIX_USER_SCOPE, MATRIX_GUEST_SCOPE]
-                ),
+                "scope": " ".join([SYNAPSE_ADMIN_SCOPE, self.api_scope]),
                "username": USERNAME,
            }
        )
@@ -344,7 +354,7 @@ class MSC3861OAuthDelegation(HomeserverTestCase):
            {
                "active": True,
                "sub": SUBJECT,
-                "scope": " ".join([MATRIX_USER_SCOPE]),
+                "scope": " ".join([self.api_scope]),
                "username": USERNAME,
            }
        )
@@ -374,7 +384,7 @@ class MSC3861OAuthDelegation(HomeserverTestCase):
            {
                "active": True,
                "sub": SUBJECT,
-                "scope": " ".join([MATRIX_USER_SCOPE, MATRIX_DEVICE_SCOPE]),
+                "scope": " ".join([self.api_scope, self.device_scope]),
                "username": USERNAME,
            }
        )
@@ -404,7 +414,7 @@ class MSC3861OAuthDelegation(HomeserverTestCase):
            {
                "active": True,
                "sub": SUBJECT,
-                "scope": " ".join([MATRIX_USER_SCOPE]),
+                "scope": " ".join([self.api_scope]),
                "device_id": DEVICE,
                "username": USERNAME,
            }
@@ -444,9 +454,9 @@ class MSC3861OAuthDelegation(HomeserverTestCase):
                "sub": SUBJECT,
                "scope": " ".join(
                    [
-                        MATRIX_USER_SCOPE,
-                        f"{MATRIX_DEVICE_SCOPE_PREFIX}AABBCC",
-                        f"{MATRIX_DEVICE_SCOPE_PREFIX}DDEEFF",
+                        self.api_scope,
+                        f"{self.device_scope_prefix}AABBCC",
+                        f"{self.device_scope_prefix}DDEEFF",
                    ]
                ),
                "username": USERNAME,
@@ -457,68 +467,6 @@ class MSC3861OAuthDelegation(HomeserverTestCase):
        request.requestHeaders.getRawHeaders = mock_getRawHeaders()
        self.get_failure(self.auth.get_user_by_req(request), AuthError)

-    def test_active_guest_not_allowed(self) -> None:
-        """The handler should return an insufficient scope error."""
-
-        self._set_introspection_returnvalue(
-            {
-                "active": True,
-                "sub": SUBJECT,
-                "scope": " ".join([MATRIX_GUEST_SCOPE, MATRIX_DEVICE_SCOPE]),
-                "username": USERNAME,
-            }
-        )
-        request = Mock(args={})
-        request.args[b"access_token"] = [b"mockAccessToken"]
-        request.requestHeaders.getRawHeaders = mock_getRawHeaders()
-        error = self.get_failure(
-            self.auth.get_user_by_req(request), OAuthInsufficientScopeError
-        )
-        self.http_client.get_json.assert_called_once_with(WELL_KNOWN)
-        self._rust_client.post.assert_called_once_with(
-            url=INTROSPECTION_ENDPOINT,
-            response_limit=ANY,
-            request_body=ANY,
-            headers=ANY,
-        )
-        self._assertParams()
-        self.assertEqual(
-            getattr(error.value, "headers", {})["WWW-Authenticate"],
-            'Bearer error="insufficient_scope", scope="urn:matrix:org.matrix.msc2967.client:api:*"',
-        )
-
-    def test_active_guest_allowed(self) -> None:
-        """The handler should return a requester with guest user rights and a device ID."""
-
-        self._set_introspection_returnvalue(
-            {
-                "active": True,
-                "sub": SUBJECT,
-                "scope": " ".join([MATRIX_GUEST_SCOPE, MATRIX_DEVICE_SCOPE]),
-                "username": USERNAME,
-            }
-        )
-        request = Mock(args={})
-        request.args[b"access_token"] = [b"mockAccessToken"]
-        request.requestHeaders.getRawHeaders = mock_getRawHeaders()
-        requester = self.get_success(
-            self.auth.get_user_by_req(request, allow_guest=True)
-        )
-        self.http_client.get_json.assert_called_once_with(WELL_KNOWN)
-        self._rust_client.post.assert_called_once_with(
-            url=INTROSPECTION_ENDPOINT,
-            response_limit=ANY,
-            request_body=ANY,
-            headers=ANY,
-        )
-        self._assertParams()
-        self.assertEqual(requester.user.to_string(), "@%s:%s" % (USERNAME, SERVER_NAME))
-        self.assertEqual(requester.is_guest, True)
-        self.assertEqual(
-            get_awaitable_result(self.auth.is_server_admin(requester)), False
-        )
-        self.assertEqual(requester.device_id, DEVICE)
-
    def test_unavailable_introspection_endpoint(self) -> None:
        """The handler should return an internal server error."""
        request = Mock(args={})
@@ -562,8 +510,8 @@ class MSC3861OAuthDelegation(HomeserverTestCase):
                "sub": SUBJECT,
                "scope": " ".join(
                    [
-                        MATRIX_USER_SCOPE,
-                        f"{MATRIX_DEVICE_SCOPE_PREFIX}AABBCC",
+                        self.api_scope,
+                        f"{self.device_scope_prefix}AABBCC",
                    ]
                ),
                "username": USERNAME,
@@ -611,7 +559,7 @@ class MSC3861OAuthDelegation(HomeserverTestCase):
            {
                "active": True,
                "sub": SUBJECT,
-                "scope": " ".join([MATRIX_USER_SCOPE, MATRIX_DEVICE_SCOPE]),
+                "scope": " ".join([self.api_scope, self.device_scope]),
                "username": USERNAME,
            }
        )
@@ -676,7 +624,7 @@ class MSC3861OAuthDelegation(HomeserverTestCase):
                return json.dumps(
                    {
                        "active": True,
-                        "scope": MATRIX_USER_SCOPE,
+                        "scope": self.api_scope,
                        "sub": SUBJECT,
                        "username": USERNAME,
                    },
@@ -842,8 +790,24 @@ class FakeMasServer(HTTPServer):
 T = TypeVar("T")


+@parameterized_class(
+    ("device_scope_prefix", "api_scope"),
+    [
+        ("urn:matrix:client:device:", "urn:matrix:client:api:*"),
+        (
+            "urn:matrix:org.matrix.msc2967.client:device:",
+            "urn:matrix:org.matrix.msc2967.client:api:*",
+        ),
+    ],
+)
 class MasAuthDelegation(HomeserverTestCase):
    server: FakeMasServer
+    device_scope_prefix: ClassVar[str]
+    api_scope: ClassVar[str]
+
+    @property
+    def device_scope(self) -> str:
+        return self.device_scope_prefix + DEVICE

    def till_deferred_has_result(
        self,
@@ -914,12 +878,7 @@ class MasAuthDelegation(HomeserverTestCase):
        self.server.introspection_response = {
            "active": True,
            "sub": SUBJECT,
-            "scope": " ".join(
-                [
-                    MATRIX_USER_SCOPE,
-                    f"{MATRIX_DEVICE_SCOPE_PREFIX}{DEVICE}",
-                ]
-            ),
+            "scope": " ".join([self.api_scope, self.device_scope]),
            "username": USERNAME,
            "expires_in": 60,
        }
@@ -943,12 +902,7 @@ class MasAuthDelegation(HomeserverTestCase):
        self.server.introspection_response = {
            "active": True,
            "sub": SUBJECT,
-            "scope": " ".join(
-                [
-                    MATRIX_USER_SCOPE,
-                    f"{MATRIX_DEVICE_SCOPE_PREFIX}{DEVICE}",
-                ]
-            ),
+            "scope": " ".join([self.api_scope, self.device_scope]),
            "username": USERNAME,
        }

@@ -971,12 +925,7 @@ class MasAuthDelegation(HomeserverTestCase):
        self.server.introspection_response = {
            "active": True,
            "sub": SUBJECT,
-            "scope": " ".join(
-                [
-                    MATRIX_USER_SCOPE,
-                    f"{MATRIX_DEVICE_SCOPE_PREFIX}ABCDEF",
-                ]
-            ),
+            "scope": " ".join([self.api_scope, f"{self.device_scope_prefix}ABCDEF"]),
            "username": USERNAME,
            "expires_in": 60,
        }
@@ -993,7 +942,7 @@ class MasAuthDelegation(HomeserverTestCase):
        self.server.introspection_response = {
            "active": True,
            "sub": SUBJECT,
-            "scope": " ".join([MATRIX_USER_SCOPE]),
+            "scope": " ".join([self.api_scope]),
            "username": "inexistent_user",
            "expires_in": 60,
        }
@@ -1039,7 +988,7 @@ class MasAuthDelegation(HomeserverTestCase):
        self.server.introspection_response = {
            "active": True,
            "sub": SUBJECT,
-            "scope": MATRIX_USER_SCOPE,
+            "scope": self.api_scope,
            "username": USERNAME,
            "expires_in": 60,
            "device_id": DEVICE,
@@ -1057,7 +1006,7 @@ class MasAuthDelegation(HomeserverTestCase):
        self.server.introspection_response = {
            "active": True,
            "sub": SUBJECT,
-            "scope": " ".join([SYNAPSE_ADMIN_SCOPE, MATRIX_USER_SCOPE]),
+            "scope": " ".join([SYNAPSE_ADMIN_SCOPE, self.api_scope]),
            "username": USERNAME,
            "expires_in": 60,
        }
@@ -1079,12 +1028,7 @@ class MasAuthDelegation(HomeserverTestCase):
        self.server.introspection_response = {
            "active": True,
            "sub": SUBJECT,
-            "scope": " ".join(
-                [
-                    MATRIX_USER_SCOPE,
-                    f"{MATRIX_DEVICE_SCOPE_PREFIX}{DEVICE}",
-                ]
-            ),
+            "scope": " ".join([self.api_scope, self.device_scope]),
            "username": USERNAME,
            "expires_in": 60,
        }
--- a/tests/logging/test_opentracing.py
+++ b/tests/logging/test_opentracing.py
@@ -19,7 +19,7 @@
 #
 #

-from typing import Awaitable, cast
+from typing import Awaitable, Dict, cast

 from twisted.internet import defer
 from twisted.internet.testing import MemoryReactorClock
@@ -38,9 +38,11 @@ from synapse.logging.opentracing import (
 from synapse.util import Clock

 try:
-    from synapse.logging.scopecontextmanager import LogContextScopeManager
+    import opentracing
+    from opentracing.scope_managers.contextvars import ContextVarsScopeManager
 except ImportError:
-    LogContextScopeManager = None  # type: ignore
+    opentracing = None  # type: ignore
+    ContextVarsScopeManager = None  # type: ignore

 try:
    import jaeger_client
@@ -54,9 +56,10 @@ from tests.unittest import TestCase
 logger = logging.getLogger(__name__)


-class LogContextScopeManagerTestCase(TestCase):
+class TracingScopeTestCase(TestCase):
    """
-    Test logging contexts and active opentracing spans.
+    Test that our tracing machinery works well in a variety of situations (especially
+    with Twisted's runtime and deferreds).

    There's casts throughout this from generic opentracing objects (e.g.
    opentracing.Span) to the ones specific to Jaeger since they have additional
@@ -64,7 +67,7 @@ class LogContextScopeManagerTestCase(TestCase):
    opentracing backend is Jaeger.
    """

-    if LogContextScopeManager is None:
+    if opentracing is None:
        skip = "Requires opentracing"  # type: ignore[unreachable]
    if jaeger_client is None:
        skip = "Requires jaeger_client"  # type: ignore[unreachable]
@@ -74,7 +77,7 @@ class LogContextScopeManagerTestCase(TestCase):
        # global variables that power opentracing. We create our own tracer instance
        # and test with it.

-        scope_manager = LogContextScopeManager()
+        scope_manager = ContextVarsScopeManager()
        config = jaeger_client.config.Config(
            config={}, service_name="test", scope_manager=scope_manager
        )
@@ -208,6 +211,135 @@ class LogContextScopeManagerTestCase(TestCase):
            [scopes[1].span, scopes[2].span, scopes[0].span],
        )

+    def test_run_in_background_active_scope_still_available(self) -> None:
+        """
+        Test that tasks running via `run_in_background` still have access to the
+        active tracing scope.
+
+        This is a regression test for a previous Synapse issue where the tracing scope
+        would `__exit__` and close before the `run_in_background` task completed and our
+        own previous custom `_LogContextScope.close(...)` would clear
+        `LoggingContext.scope` preventing further tracing spans from having the correct
+        parent.
+        """
+        reactor = MemoryReactorClock()
+        clock = Clock(reactor)
+
+        scope_map: Dict[str, opentracing.Scope] = {}
+
+        async def async_task() -> None:
+            root_scope = scope_map["root"]
+            root_context = cast(jaeger_client.SpanContext, root_scope.span.context)
+
+            self.assertEqual(
+                self._tracer.active_span,
+                root_scope.span,
+                "expected to inherit the root tracing scope from where this was run",
+            )
+
+            # Return control back to the reactor thread and wait an arbitrary amount
+            await clock.sleep(4)
+
+            # This is a key part of what we're testing! In a previous version of
+            # Synapse, we would lose the active span at this point.
+            self.assertEqual(
+                self._tracer.active_span,
+                root_scope.span,
+                "expected to still have a root tracing scope/span active",
+            )
+
+            # For complete-ness sake, let's also trace more sub-tasks here and assert
+            # they have the correct span parents as well (root)
+
+            # Start tracing some other sub-task.
+            #
+            # This is a key part of what we're testing! In a previous version of
+            # Synapse, it would have the incorrect span parents.
+            scope = start_active_span(
+                "task1",
+                tracer=self._tracer,
+            )
+            scope_map["task1"] = scope
+
+            # Ensure the span parent is pointing to the root scope
+            context = cast(jaeger_client.SpanContext, scope.span.context)
+            self.assertEqual(
+                context.parent_id,
+                root_context.span_id,
+                "expected task1 parent to be the root span",
+            )
+
+            # Ensure that the active span is our new sub-task now
+            self.assertEqual(self._tracer.active_span, scope.span)
+            # Return control back to the reactor thread and wait an arbitrary amount
+            await clock.sleep(4)
+            # We should still see the active span as the scope wasn't closed yet
+            self.assertEqual(self._tracer.active_span, scope.span)
+            scope.close()
+
+        async def root() -> None:
+            with start_active_span(
+                "root span",
+                tracer=self._tracer,
+                # We will close this off later. We're basically just mimicking the same
+                # pattern for how we handle requests. We pass the span off to the
+                # request for it to finish.
+                finish_on_close=False,
+            ) as root_scope:
+                scope_map["root"] = root_scope
+                self.assertEqual(self._tracer.active_span, root_scope.span)
+
+                # Fire-and-forget a task
+                #
+                # XXX: The root scope context manager will `__exit__` before this task
+                # completes.
+                run_in_background(async_task)
+
+                # Because we used `run_in_background`, the active span should still be
+                # the root.
+                self.assertEqual(self._tracer.active_span, root_scope.span)
+
+            # We shouldn't see any active spans outside of the scope
+            self.assertIsNone(self._tracer.active_span)
+
+        with LoggingContext("root context"):
+            # Start the test off
+            d_root = defer.ensureDeferred(root())
+
+            # Let the tasks complete
+            reactor.pump((2,) * 8)
+            self.successResultOf(d_root)
+
+            # After we see all of the tasks are done (like a request when it
+            # `_finished_processing`), let's finish our root span
+            scope_map["root"].span.finish()
+
+            # Sanity check again: We shouldn't see any active spans leftover in this
+            # this context.
+            self.assertIsNone(self._tracer.active_span)
+
+        # The spans should be reported in order of their finishing: task 1, task 2,
+        # root.
+        #
+        # We use `assertIncludes` just as an easier way to see if items are missing or
+        # added. We assert the order just below
+        self.assertIncludes(
+            set(self._reporter.get_spans()),
+            {
+                scope_map["task1"].span,
+                scope_map["root"].span,
+            },
+            exact=True,
+        )
+        # This is where we actually assert the correct order
+        self.assertEqual(
+            self._reporter.get_spans(),
+            [
+                scope_map["task1"].span,
+                scope_map["root"].span,
+            ],
+        )
+
    def test_trace_decorator_sync(self) -> None:
        """
        Test whether we can use `@trace_with_opname` (`@trace`) and `@tag_args`
--- a/tests/metrics/test_metrics.py
+++ b/tests/metrics/test_metrics.py
@@ -18,11 +18,18 @@
 # [This file includes modifications made by New Vector Limited]
 #
 #
-from typing import Dict, Protocol, Tuple
+from typing import Dict, NoReturn, Protocol, Tuple

 from prometheus_client.core import Sample

-from synapse.metrics import REGISTRY, InFlightGauge, generate_latest
+from synapse.metrics import (
+    REGISTRY,
+    SERVER_NAME_LABEL,
+    InFlightGauge,
+    LaterGauge,
+    all_later_gauges_to_clean_up_on_shutdown,
+    generate_latest,
+)
 from synapse.util.caches.deferred_cache import DeferredCache

 from tests import unittest
@@ -285,6 +292,95 @@ class CacheMetricsTests(unittest.HomeserverTestCase):
        self.assertEqual(hs2_cache_max_size_metric_value, "777.0")


+class LaterGaugeTests(unittest.HomeserverTestCase):
+    def setUp(self) -> None:
+        super().setUp()
+        self.later_gauge = LaterGauge(
+            name="foo",
+            desc="",
+            labelnames=[SERVER_NAME_LABEL],
+        )
+
+    def tearDown(self) -> None:
+        super().tearDown()
+
+        REGISTRY.unregister(self.later_gauge)
+        all_later_gauges_to_clean_up_on_shutdown.pop(self.later_gauge.name, None)
+
+    def test_later_gauge_multiple_servers(self) -> None:
+        """
+        Test that LaterGauge metrics are reported correctly across multiple servers. We
+        will have an metrics entry for each homeserver that is labeled with the
+        `server_name` label.
+        """
+        self.later_gauge.register_hook(
+            homeserver_instance_id="123", hook=lambda: {("hs1",): 1}
+        )
+        self.later_gauge.register_hook(
+            homeserver_instance_id="456", hook=lambda: {("hs2",): 2}
+        )
+
+        metrics_map = get_latest_metrics()
+
+        # Find the metrics from both homeservers
+        hs1_metric = 'foo{server_name="hs1"}'
+        hs1_metric_value = metrics_map.get(hs1_metric)
+        self.assertIsNotNone(
+            hs1_metric_value,
+            f"Missing metric {hs1_metric} in metrics {metrics_map}",
+        )
+        self.assertEqual(hs1_metric_value, "1.0")
+
+        hs2_metric = 'foo{server_name="hs2"}'
+        hs2_metric_value = metrics_map.get(hs2_metric)
+        self.assertIsNotNone(
+            hs2_metric_value,
+            f"Missing metric {hs2_metric} in metrics {metrics_map}",
+        )
+        self.assertEqual(hs2_metric_value, "2.0")
+
+    def test_later_gauge_hook_exception(self) -> None:
+        """
+        Test that LaterGauge metrics are collected across multiple servers even if one
+        hooks is throwing an exception.
+        """
+
+        def raise_exception() -> NoReturn:
+            raise Exception("fake error generating data")
+
+        # Make the hook for hs1 throw an exception
+        self.later_gauge.register_hook(
+            homeserver_instance_id="123", hook=raise_exception
+        )
+        # Metrics from hs2 still work fine
+        self.later_gauge.register_hook(
+            homeserver_instance_id="456", hook=lambda: {("hs2",): 2}
+        )
+
+        metrics_map = get_latest_metrics()
+
+        # Since we encountered an exception while trying to collect metrics from hs1, we
+        # don't expect to see it here.
+        hs1_metric = 'foo{server_name="hs1"}'
+        hs1_metric_value = metrics_map.get(hs1_metric)
+        self.assertIsNone(
+            hs1_metric_value,
+            (
+                "Since we encountered an exception while trying to collect metrics from hs1"
+                f"we don't expect to see it the metrics_map {metrics_map}"
+            ),
+        )
+
+        # We should still see metrics from hs2 though
+        hs2_metric = 'foo{server_name="hs2"}'
+        hs2_metric_value = metrics_map.get(hs2_metric)
+        self.assertIsNotNone(
+            hs2_metric_value,
+            f"Missing metric {hs2_metric} in cache metrics {metrics_map}",
+        )
+        self.assertEqual(hs2_metric_value, "2.0")
+
+
 def get_latest_metrics() -> Dict[str, str]:
    """
    Collect the latest metrics from the registry and parse them into an easy to use map.
--- a/tests/replication/_base.py
+++ b/tests/replication/_base.py
@@ -32,7 +32,6 @@ from synapse.config.workers import InstanceTcpLocationConfig, InstanceUnixLocati
 from synapse.http.site import SynapseRequest, SynapseSite
 from synapse.replication.http import ReplicationRestResource
 from synapse.replication.tcp.client import ReplicationDataHandler
-from synapse.replication.tcp.handler import ReplicationCommandHandler
 from synapse.replication.tcp.protocol import (
    ClientReplicationStreamProtocol,
    ServerReplicationStreamProtocol,
@@ -97,7 +96,7 @@ class BaseStreamTestCase(unittest.HomeserverTestCase):
        self.test_handler = self._build_replication_data_handler()
        self.worker_hs._replication_data_handler = self.test_handler  # type: ignore[attr-defined]

-        repl_handler = ReplicationCommandHandler(self.worker_hs)
+        repl_handler = self.worker_hs.get_replication_command_handler()
        self.client = ClientReplicationStreamProtocol(
            self.worker_hs,
            "client",
--- a/tests/rest/client/test_auth_metadata.py
+++ b/tests/rest/client/test_auth_metadata.py
@@ -18,8 +18,11 @@
 # [This file includes modifications made by New Vector Limited]
 #
 from http import HTTPStatus
+from typing import ClassVar
 from unittest.mock import AsyncMock

+from parameterized import parameterized_class
+
 from synapse.rest.client import auth_metadata

 from tests.unittest import HomeserverTestCase, override_config, skip_unless
@@ -85,17 +88,22 @@ class AuthIssuerTestCase(HomeserverTestCase):
        req_mock.assert_not_called()


+@parameterized_class(
+    ("endpoint",),
+    [
+        ("/_matrix/client/unstable/org.matrix.msc2965/auth_metadata",),
+        ("/_matrix/client/v1/auth_metadata",),
+    ],
+)
 class AuthMetadataTestCase(HomeserverTestCase):
+    endpoint: ClassVar[str]
    servlets = [
        auth_metadata.register_servlets,
    ]

    def test_returns_404_when_msc3861_disabled(self) -> None:
        # Make an unauthenticated request for the discovery info.
-        channel = self.make_request(
-            "GET",
-            "/_matrix/client/unstable/org.matrix.msc2965/auth_metadata",
-        )
+        channel = self.make_request("GET", self.endpoint)
        self.assertEqual(channel.code, HTTPStatus.NOT_FOUND)

    @skip_unless(HAS_AUTHLIB, "requires authlib")
@@ -124,10 +132,7 @@ class AuthMetadataTestCase(HomeserverTestCase):
        )
        self.hs.get_proxied_http_client().get_json = req_mock  # type: ignore[method-assign]

-        channel = self.make_request(
-            "GET",
-            "/_matrix/client/unstable/org.matrix.msc2965/auth_metadata",
-        )
+        channel = self.make_request("GET", self.endpoint)

        self.assertEqual(channel.code, HTTPStatus.OK)
        self.assertEqual(
--- a/tests/server.py
+++ b/tests/server.py
@@ -1145,6 +1145,9 @@ def setup_test_homeserver(
        reactor=reactor,
    )

+    # Register the cleanup hook
+    cleanup_func(hs.cleanup)
+
    # Install @cache_in_self attributes
    for key, val in kwargs.items():
        setattr(hs, "_" + key, val)
--- a/tests/util/caches/test_descriptors.py
+++ b/tests/util/caches/test_descriptors.py
@@ -355,7 +355,7 @@ class DescriptorTestCase(unittest.TestCase):
                    d = obj.fn(1)
                    self.assertEqual(
                        current_context(),
-                        SENTINEL_CONTEXT,
+                        c1,
                    )
                    yield d
                    self.fail("No exception thrown")
@@ -849,7 +849,7 @@ class CachedListDescriptorTestCase(unittest.TestCase):

            # start the lookup off
            d1 = obj.list_fn([10, 20], 2)
-            self.assertEqual(current_context(), SENTINEL_CONTEXT)
+            self.assertEqual(current_context(), c1)
            r = yield d1
            self.assertEqual(current_context(), c1)
            obj.mock.assert_called_once_with({10, 20}, 2)
Author	SHA1	Message	Date
Eric Eastwood	6ce2f3e59d	Fix CPU time going backwards when `daemonize` When we `daemonize`, we fork the process and cputime metrics get confused about the per-thread resource usage appearing to go backwards because we're comparing the resource usage (`rusage`) from the original process to the forked process. We now kick off the background tasks (`run_as_background_process`) after we have forked the process so the `rusage` we record when we `start` is in the same thread when we `stop`. Bad log examples from before: ``` synapse.logging.context - ERROR - _schedule_next_expiry-0 - utime went backwards! 0.050467 < 0.886526 synapse.logging.context - ERROR - _schedule_db_events-0 - stime went backwards! 0.009941 < 0.155106 synapse.logging.context - ERROR - wake_destinations_needing_catchup-0 - stime went backwards! 0.010175 < 0.130923 synapse.logging.context - ERROR - resume_sync_partial_state_room-0 - utime went backwards! 0.052898 < 0.886526 ``` Testing strategy: 1. Run with `daemonize: true` in your `homeserver.yaml` 1. `poetry run synapse_homeserver --config-path homeserver.yaml` 1. Shutdown the server 1. Look for any bad log entries in your homeserver logs: - `Expected logging context sentinel but found main` - `Expected logging context main was lost` - `utime went backwards!`/`stime went backwards!`	2025-09-02 15:11:04 -05:00
Eric Eastwood	93044f4c5b	Fix lints	2025-09-02 14:28:41 -05:00
Eric Eastwood	c7a80b63ec	No need to `patch_inline_callbacks` when we don't have log context rules anymore	2025-09-02 14:07:26 -05:00
Eric Eastwood	1f384b0e21	Merge branch 'develop' into madlittlemods/log-context-using-contextvars3	2025-09-02 13:39:44 -05:00
Eric Eastwood	b2997a8f20	Suppress "Applying schema" log noise bulk when running Complement tests (#18878 ) If Synapse is under test (`SYNAPSE_LOG_TESTING` is set), we don't care about seeing the "Applying schema" log lines at the INFO level every time we run the tests (it's 100 lines of bulk for each homeserver). ``` synapse_main \| 2025-08-29 22:34:03,453 - synapse.storage.prepare_database - 433 - INFO - main - Applying schema deltas for v73 synapse_main \| 2025-08-29 22:34:03,454 - synapse.storage.prepare_database - 541 - INFO - main - Applying schema 73/01event_failed_pull_attempts.sql synapse_main \| 2025-08-29 22:34:03,463 - synapse.storage.prepare_database - 541 - INFO - main - Applying schema 73/02add_pusher_enabled.sql synapse_main \| 2025-08-29 22:34:03,473 - synapse.storage.prepare_database - 541 - INFO - main - Applying schema 73/02room_id_indexes_for_purging.sql synapse_main \| 2025-08-29 22:34:03,482 - synapse.storage.prepare_database - 541 - INFO - main - Applying schema 73/03pusher_device_id.sql synapse_main \| 2025-08-29 22:34:03,492 - synapse.storage.prepare_database - 541 - INFO - main - Applying schema 73/03users_approved_column.sql synapse_main \| 2025-08-29 22:34:03,502 - synapse.storage.prepare_database - 541 - INFO - main - Applying schema 73/04partial_join_details.sql synapse_main \| 2025-08-29 22:34:03,513 - synapse.storage.prepare_database - 541 - INFO - main - Applying schema 73/04pending_device_list_updates.sql ... ``` The Synapse logs are visible when a Complement test fails or you use `COMPLEMENT_ALWAYS_PRINT_SERVER_LOGS=1`. This is spawning from a Complement test with three homeservers and wanting less log noise to scroll through.	2025-09-02 13:34:47 -05:00
Eric Eastwood	bff4a11b3f	Re-introduce: Fix `LaterGauge` metrics to collect from all servers (#18791 ) Re-introduce: https://github.com/element-hq/synapse/pull/18751 that was reverted in https://github.com/element-hq/synapse/pull/18789 (explains why the PR was reverted in the first place). - Adds a `cleanup` pattern that cleans up metrics from each homeserver in the tests. Previously, the list of hooks built up until our CI machines couldn't operate properly, see https://github.com/element-hq/synapse/pull/18789 - Fix long-standing issue with `synapse_background_update_status` metrics only tracking the last database listed in the config (see https://github.com/element-hq/synapse/pull/18791#discussion_r2261706749)	2025-09-02 12:14:27 -05:00
Quentin Gliech	09a489e198	1.138.0rc1	2025-09-02 14:16:55 +02:00
Quentin Gliech	537e14169e	Support stable endpoint and scopes from the MSC3861 family (#18549 ) This adds stable APIs for both MSC2965 and MSC2967	2025-09-02 13:55:12 +02:00
Eric Eastwood	3e66e0a1b8	Rewrite docs and remove log context rules (no longer relevant)	2025-08-27 22:46:58 -05:00
Eric Eastwood	0c8759bbb6	Remove mentions of Synapse logcontext rules (the distinction doesn't matter anymore)	2025-08-27 20:27:45 -05:00
Eric Eastwood	4303879cfe	Add changelog	2025-08-27 20:13:34 -05:00
Eric Eastwood	3742b3b3fb	Mark `make_deferred_yieldable` for future deletion	2025-08-27 19:57:32 -05:00
Quentin Gliech	224cb3f827	WIP: use a contextvar to store the logcontext	2025-08-27 19:57:28 -05:00
Eric Eastwood	68068de3a4	Trace how much work is being done while "recursively fetching redactions" (#18854 ) Spawning from observing this trace for a `/messages` request (`RoomMessageListRestServlet`). We don't know if it took a while for the database to fetch a single redaction or a whole chain of redactions.	2025-08-27 12:27:33 -05:00
Eric Eastwood	356cc4a0a1	Instrument `_ByteProducer` with tracing to measure potential dead time while writing bytes to the request (#18804 ) This will allow to easily see how much time is taken up by being able to filter by the `write_bytes_to_request` operation in Jaeger. Spawning from https://github.com/element-hq/synapse/issues/17722 The `write_bytes_to_request` span won't show up in the trace until https://github.com/element-hq/synapse/pull/18849 is merged. Note: It's totally fine for a span child to finish after the parent. See https://opentracing.io/specification/#references-between-spans which shows "Child Span D" outliving the "Parent Span"	2025-08-27 12:26:42 -05:00
Eric Eastwood	27fc3389f3	Switch to OpenTracing's `ContextVarsScopeManager` (#18849 ) Switch to OpenTracing's `ContextVarsScopeManager` instead of our own custom `LogContextScopeManager`. This is now possible because the linked Twisted issue from the comment in our custom `LogContextScopeManager` is resolved: https://twistedmatrix.com/trac/ticket/10301 This PR is spawning from exploring different possibilities to solve the `scope` loss problem I was encountering in https://github.com/element-hq/synapse/pull/18804#discussion_r2268254424. This appears to solve the problem and I've added the additional test from there to this PR ✅	2025-08-27 11:41:00 -05:00
Eric Eastwood	df2cfb3932	Link upstream Twisted bug: Idle connection timeout incorrectly enforced while sending large response with `Request.write(...)` (#18855 ) Link upstream Twisted bug -> https://github.com/twisted/twisted/issues/12498 Spawning from https://github.com/element-hq/synapse/pull/18852	2025-08-27 11:25:57 -05:00
Andrew Ferrazzutti	c339021ce8	Reduce strictness of delayed event delta fetching (#18858 )	2025-08-27 13:26:10 +01:00
dependabot[bot]	499f947c67	Bump actions/checkout from 4.3.0 to 5.0.0 (#18834 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-27 07:47:19 +01:00
dependabot[bot]	e76a9af4d7	Bump types-jsonschema from 4.25.0.20250720 to 4.25.1.20250822 (#18867 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-27 07:28:01 +01:00
dependabot[bot]	eec1ca6e93	Bump serde_json from 1.0.142 to 1.0.143 (#18866 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-27 07:04:50 +01:00
dependabot[bot]	56b5759c0f	Bump ruff from 0.12.7 to 0.12.10 (#18865 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-27 07:03:41 +01:00
dependabot[bot]	767177ca5a	Bump regex from 1.11.1 to 1.11.2 (#18864 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-27 07:01:59 +01:00
dependabot[bot]	5b8e6e7911	Bump actions/add-to-project from c0c5949b017d0d4a39f7ba888255881bdac2a823 to 4515659e2b458b27365e167605ac44f219494b66 (#18863 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-27 07:01:12 +01:00
dependabot[bot]	6a6be6fbe2	Bump dtolnay/rust-toolchain from b3b07ba8b418998c39fb20f53e8b695cdcc8de1b to e97e2d8cc328f1b50210efc529dca0028893a2d9 (#18862 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-27 07:00:07 +01:00
dependabot[bot]	21c7841228	Bump reqwest from 0.12.22 to 0.12.23 (#18842 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-27 06:57:17 +01:00
dependabot[bot]	5b55e3f15d	Bump anyhow from 1.0.98 to 1.0.99 (#18841 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-27 06:55:31 +01:00
dependabot[bot]	0e2b92bcbc	Bump types-bleach from 6.2.0.20250514 to 6.2.0.20250809 (#18838 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-27 06:54:32 +01:00
dependabot[bot]	481987eb83	Bump phonenumbers from 9.0.11 to 9.0.12 (#18837 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-27 06:53:52 +01:00
dependabot[bot]	5fd30c7ea7	Bump types-psycopg2 from 2.9.21.20250718 to 2.9.21.20250809 (#18836 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-27 06:53:28 +01:00
dependabot[bot]	d527c794fb	Bump docker/login-action from 3.4.0 to 3.5.0 (#18835 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-27 06:52:28 +01:00
Erik Johnston	19fe3f001e	Merge branch 'master' into develop	2025-08-26 10:54:46 +01:00
Erik Johnston	f8a44638eb	1.137.0	2025-08-26 10:23:44 +01:00
Richard van der Hoff	7ec5e60671	Introduce `EventPersistencePair` type (#18857 ) `Tuple[EventBase, EventContext]` is everywhere and I keep misspelling it. Let's just define a type for it.	2025-08-26 10:15:03 +01:00
Ben Banfield-Zanin	48184eefa3	Fix worker documentation around room Admin APIs (#18853 ) Discovered via https://github.com/element-hq/ess-helm/issues/677. Looking at https://github.com/element-hq/synapse/blob/v1.136.0/synapse/rest/admin/__init__.py#L266 only `RoomRestServlet` is generally worker capable. This is just the Room Details API and the v1 Room Delete API and not all the APIs documented on https://element-hq.github.io/synapse/latest/admin_api/rooms.html	2025-08-26 10:04:47 +02:00
Shay	205d9e4fc4	Improve `redact_on_ban` performance (#18851 ) Co-authored-by: Erik Johnston <erikj@jki.re>	2025-08-23 11:43:50 +01:00
				`@@ -0,0 +1 @@`
				Fix `LaterGauge` metrics to collect from all servers.
				`@@ -0,0 +1 @@`
				Store the `LoggingContext` in a `ContextVar` instead of a thread-local variable.