Compare commits

47 Commits

Author SHA1 Message Date
Will Hunt
1eb989fa7a await 2021-08-10 16:54:32 +01:00
Will Hunt
fb719663c5 Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-08-10 16:20:46 +01:00
Will Hunt
61293c86c1 Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-08-03 13:40:06 +01:00
Will Hunt
fad91897ec Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-08-01 12:38:40 +01:00
Will Hunt
64970400ae Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-07-27 14:20:08 +01:00
Will Hunt
f6be8041d8 Fix merge fail 2021-07-20 11:32:00 +01:00
Will Hunt
f19795355b Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-07-20 09:35:03 +01:00
Will Hunt
81dd28c216 Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-06-29 21:49:25 +01:00
Will Hunt
8bfb649cda Merge tag 'v1.37.0rc1' into hs/hacked-together-event-cache
Synapse 1.37.0rc1 (2021-06-24)
==============================

This release deprecates the current spam checker interface. See the [upgrade notes](https://matrix-org.github.io/synapse/develop/upgrade#deprecation-of-the-current-spam-checker-interface) for more information on how to update to the new generic module interface.

This release also removes support for fetching and renewing TLS certificates using the ACME v1 protocol, which has been fully decommissioned by Let's Encrypt on June 1st 2021. Admins previously using this feature should use a [reverse proxy](https://matrix-org.github.io/synapse/develop/reverse_proxy.html) to handle TLS termination, or use an external ACME client (such as [certbot](https://certbot.eff.org/)) to retrieve a certificate and key and provide them to Synapse using the `tls_certificate_path` and `tls_private_key_path` configuration settings.

Features
--------

- Implement "room knocking" as per [MSC2403](https://github.com/matrix-org/matrix-doc/pull/2403). Contributed by @Sorunome and anoa. ([\#6739](https://github.com/matrix-org/synapse/issues/6739), [\#9359](https://github.com/matrix-org/synapse/issues/9359), [\#10167](https://github.com/matrix-org/synapse/issues/10167), [\#10212](https://github.com/matrix-org/synapse/issues/10212), [\#10227](https://github.com/matrix-org/synapse/issues/10227))
- Add experimental support for backfilling history into rooms ([MSC2716](https://github.com/matrix-org/matrix-doc/pull/2716)). ([\#9247](https://github.com/matrix-org/synapse/issues/9247))
- Implement a generic interface for third-party plugin modules. ([\#10062](https://github.com/matrix-org/synapse/issues/10062), [\#10206](https://github.com/matrix-org/synapse/issues/10206))
- Implement config option `sso.update_profile_information` to sync SSO users' profile information with the identity provider each time they login. Currently only displayname is supported. ([\#10108](https://github.com/matrix-org/synapse/issues/10108))
- Ensure that errors during startup are written to the logs and the console. ([\#10191](https://github.com/matrix-org/synapse/issues/10191))

Bugfixes
--------

- Fix a bug introduced in Synapse v1.25.0 that prevented the `ip_range_whitelist` configuration option from working for federation and identity servers. Contributed by @mikure. ([\#10115](https://github.com/matrix-org/synapse/issues/10115))
- Remove a broken import line in Synapse's `admin_cmd` worker. Broke in Synapse v1.33.0. ([\#10154](https://github.com/matrix-org/synapse/issues/10154))
- Fix a bug introduced in Synapse v1.21.0 which could cause `/sync` to return immediately with an empty response. ([\#10157](https://github.com/matrix-org/synapse/issues/10157), [\#10158](https://github.com/matrix-org/synapse/issues/10158))
- Fix a minor bug in the response to `/_matrix/client/r0/user/{user}/openid/request_token` causing `expires_in` to be a float instead of an integer. Contributed by @lukaslihotzki. ([\#10175](https://github.com/matrix-org/synapse/issues/10175))
- Always require users to re-authenticate for dangerous operations: deactivating an account, modifying an account password, and adding 3PIDs. ([\#10184](https://github.com/matrix-org/synapse/issues/10184))
- Fix a bug introduced in Synapse v1.7.2 where remote server count metrics collection would be incorrectly delayed on startup. Found by @heftig. ([\#10195](https://github.com/matrix-org/synapse/issues/10195))
- Fix a bug introduced in Synapse v1.35.1 where an `allow` key of a `m.room.join_rules` event could be applied for incorrect room versions and configurations. ([\#10208](https://github.com/matrix-org/synapse/issues/10208))
- Fix performance regression in responding to user key requests over federation. Introduced in Synapse v1.34.0rc1. ([\#10221](https://github.com/matrix-org/synapse/issues/10221))

Improved Documentation
----------------------

- Add a new guide to decoding request logs. ([\#8436](https://github.com/matrix-org/synapse/issues/8436))
- Mention in the sample homeserver config that you may need to configure max upload size in your reverse proxy. Contributed by @aaronraimist. ([\#10122](https://github.com/matrix-org/synapse/issues/10122))
- Fix broken links in documentation. ([\#10180](https://github.com/matrix-org/synapse/issues/10180))
- Deploy a snapshot of the documentation website upon each new Synapse release. ([\#10198](https://github.com/matrix-org/synapse/issues/10198))

Deprecations and Removals
-------------------------

- The current spam checker interface is deprecated in favour of a new generic modules system. See the [upgrade notes](https://matrix-org.github.io/synapse/develop/upgrade#deprecation-of-the-current-spam-checker-interface) for more information on how to update to the new system. ([\#10062](https://github.com/matrix-org/synapse/issues/10062), [\#10210](https://github.com/matrix-org/synapse/issues/10210), [\#10238](https://github.com/matrix-org/synapse/issues/10238))
- Stop supporting the unstable spaces prefixes from MSC1772. ([\#10161](https://github.com/matrix-org/synapse/issues/10161))
- Remove Synapse's support for automatically fetching and renewing certificates using the ACME v1 protocol. This protocol has been fully turned off by Let's Encrypt for existing installations on June 1st 2021. Admins previously using this feature should use a [reverse proxy](https://matrix-org.github.io/synapse/develop/reverse_proxy.html) to handle TLS termination, or use an external ACME client (such as [certbot](https://certbot.eff.org/)) to retrieve a certificate and key and provide them to Synapse using the `tls_certificate_path` and `tls_private_key_path` configuration settings. ([\#10194](https://github.com/matrix-org/synapse/issues/10194))

Internal Changes
----------------

- Update the database schema versioning to support gradual migration away from legacy tables. ([\#9933](https://github.com/matrix-org/synapse/issues/9933))
- Add type hints to the federation servlets. ([\#10080](https://github.com/matrix-org/synapse/issues/10080))
- Improve OpenTracing for event persistence. ([\#10134](https://github.com/matrix-org/synapse/issues/10134), [\#10193](https://github.com/matrix-org/synapse/issues/10193))
- Clean up the interface for injecting OpenTracing over HTTP. ([\#10143](https://github.com/matrix-org/synapse/issues/10143))
- Limit the number of in-flight `/keys/query` requests from a single device. ([\#10144](https://github.com/matrix-org/synapse/issues/10144))
- Refactor EventPersistenceQueue. ([\#10145](https://github.com/matrix-org/synapse/issues/10145))
- Document `SYNAPSE_TEST_LOG_LEVEL` to see the logger output when running tests. ([\#10148](https://github.com/matrix-org/synapse/issues/10148))
- Update the Complement build tags in GitHub Actions to test currently experimental features. ([\#10155](https://github.com/matrix-org/synapse/issues/10155))
- Add a `synapse_federation_soft_failed_events_total` metric to track how often events are soft failed. ([\#10156](https://github.com/matrix-org/synapse/issues/10156))
- Fetch the corresponding complement branch when performing CI. ([\#10160](https://github.com/matrix-org/synapse/issues/10160))
- Add some developer documentation about boolean columns in database schemas. ([\#10164](https://github.com/matrix-org/synapse/issues/10164))
- Add extra logging fields to better debug where events are being soft failed. ([\#10168](https://github.com/matrix-org/synapse/issues/10168))
- Add debug logging for when we enter and exit `Measure` blocks. ([\#10183](https://github.com/matrix-org/synapse/issues/10183))
- Improve comments in structured logging code. ([\#10188](https://github.com/matrix-org/synapse/issues/10188))
- Update [MSC3083](https://github.com/matrix-org/matrix-doc/pull/3083) support with modifications from the MSC. ([\#10189](https://github.com/matrix-org/synapse/issues/10189))
- Remove redundant DNS lookup limiter. ([\#10190](https://github.com/matrix-org/synapse/issues/10190))
- Upgrade `black` linting tool to 21.6b0. ([\#10197](https://github.com/matrix-org/synapse/issues/10197))
- Expose OpenTracing trace id in response headers. ([\#10199](https://github.com/matrix-org/synapse/issues/10199))
2021-06-28 09:47:21 +01:00
Will Hunt
7d33ba70df Merge remote-tracking branch 'origin/release-v1.36' into hs/hacked-together-event-cache 2021-06-16 11:32:27 +01:00
Will Hunt
dacc395dca Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-05-11 22:28:35 +01:00
Will Hunt
6a20b4f32e fix version 2021-05-06 12:29:50 +01:00
Will Hunt
a5e400286d Attempt to pin to fix 2021-05-06 12:20:15 +01:00
Will Hunt
1f3e399ea0 Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-05-05 14:37:27 +01:00
Will Hunt
76288b9fbd Fix merge header 2021-05-05 14:33:23 +01:00
Will Hunt
89ad2f60ca Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-05-05 14:27:02 +01:00
Will Hunt
1315b6a0f2 Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-04-22 12:02:35 +01:00
Will Hunt
5e8c7b4b05 Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-04-21 15:05:40 +01:00
Will Hunt
d77072ff6e Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-04-21 14:54:42 +01:00
Will Hunt
f98d0985d3 Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-04-20 16:10:16 +01:00
Will Hunt
82fcd3cce8 Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-04-20 00:34:28 +01:00
Will Hunt
4be9d0918e Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-04-13 15:03:34 +01:00
Will Hunt
5f631512b5 Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-04-09 14:39:26 +01:00
Will Hunt
9009d29b81 Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-03-22 13:40:27 +00:00
Will Hunt
16f23761bb Merge tag 'v1.30.0' into hs/hacked-together-event-cache
Synapse 1.30.0 (2021-03-22)
===========================

Note that this release deprecates the ability for appservices to
call `POST /_matrix/client/r0/register`  without the body parameter `type`. Appservice
developers should use a `type` value of `m.login.application_service` as
per [the spec](https://matrix.org/docs/spec/application_service/r0.1.2#server-admin-style-permissions).
In future releases, calling this endpoint with an access token - but without a `m.login.application_service`
type - will fail.

No significant changes.

Synapse 1.30.0rc1 (2021-03-16)
==============================

Features
--------

- Add prometheus metrics for number of users successfully registering and logging in. ([\#9510](https://github.com/matrix-org/synapse/issues/9510), [\#9511](https://github.com/matrix-org/synapse/issues/9511), [\#9573](https://github.com/matrix-org/synapse/issues/9573))
- Add `synapse_federation_last_sent_pdu_time` and `synapse_federation_last_received_pdu_time` prometheus metrics, which monitor federation delays by reporting the timestamps of messages sent and received to a set of remote servers. ([\#9540](https://github.com/matrix-org/synapse/issues/9540))
- Add support for generating JSON Web Tokens dynamically for use as OIDC client secrets. ([\#9549](https://github.com/matrix-org/synapse/issues/9549))
- Optimise handling of incomplete room history for incoming federation. ([\#9601](https://github.com/matrix-org/synapse/issues/9601))
- Finalise support for allowing clients to pick an SSO Identity Provider ([MSC2858](https://github.com/matrix-org/matrix-doc/pull/2858)). ([\#9617](https://github.com/matrix-org/synapse/issues/9617))
- Tell spam checker modules about the SSO IdP a user registered through if one was used. ([\#9626](https://github.com/matrix-org/synapse/issues/9626))

Bugfixes
--------

- Fix long-standing bug when generating thumbnails for some images with transparency: `TypeError: cannot unpack non-iterable int object`. ([\#9473](https://github.com/matrix-org/synapse/issues/9473))
- Purge chain cover indexes for events that were purged prior to Synapse v1.29.0. ([\#9542](https://github.com/matrix-org/synapse/issues/9542), [\#9583](https://github.com/matrix-org/synapse/issues/9583))
- Fix bug where federation requests were not correctly retried on 5xx responses. ([\#9567](https://github.com/matrix-org/synapse/issues/9567))
- Fix re-activating an account via the admin API when local passwords are disabled. ([\#9587](https://github.com/matrix-org/synapse/issues/9587))
- Fix a bug introduced in Synapse 1.20 which caused incoming federation transactions to stack up, causing slow recovery from outages. ([\#9597](https://github.com/matrix-org/synapse/issues/9597))
- Fix a bug introduced in v1.28.0 where the OpenID Connect callback endpoint could error with a `MacaroonInitException`. ([\#9620](https://github.com/matrix-org/synapse/issues/9620))
- Fix Internal Server Error on `GET /_synapse/client/saml2/authn_response` request. ([\#9623](https://github.com/matrix-org/synapse/issues/9623))

Updates to the Docker image
---------------------------

- Make use of an improved malloc implementation (`jemalloc`) in the docker image. ([\#8553](https://github.com/matrix-org/synapse/issues/8553))

Improved Documentation
----------------------

- Add relayd entry to reverse proxy example configurations. ([\#9508](https://github.com/matrix-org/synapse/issues/9508))
- Improve the SAML2 upgrade notes for 1.27.0. ([\#9550](https://github.com/matrix-org/synapse/issues/9550))
- Link to the "List user's media" admin API from the media admin API docs. ([\#9571](https://github.com/matrix-org/synapse/issues/9571))
- Clarify the spam checker modules documentation example to mention that `parse_config` is a required method. ([\#9580](https://github.com/matrix-org/synapse/issues/9580))
- Clarify the sample configuration for `stats` settings. ([\#9604](https://github.com/matrix-org/synapse/issues/9604))

Deprecations and Removals
-------------------------

- The `synapse_federation_last_sent_pdu_age` and `synapse_federation_last_received_pdu_age` prometheus metrics have been removed. They are replaced by `synapse_federation_last_sent_pdu_time` and `synapse_federation_last_received_pdu_time`. ([\#9540](https://github.com/matrix-org/synapse/issues/9540))
- Registering an Application Service user without using the `m.login.application_service` login type will be unsupported in an upcoming Synapse release. ([\#9559](https://github.com/matrix-org/synapse/issues/9559))

Internal Changes
----------------

- Add tests to ResponseCache. ([\#9458](https://github.com/matrix-org/synapse/issues/9458))
- Add type hints to purge room and server notice admin API. ([\#9520](https://github.com/matrix-org/synapse/issues/9520))
- Add extra logging to ObservableDeferred when callbacks throw exceptions. ([\#9523](https://github.com/matrix-org/synapse/issues/9523))
- Fix incorrect type hints. ([\#9528](https://github.com/matrix-org/synapse/issues/9528), [\#9543](https://github.com/matrix-org/synapse/issues/9543), [\#9591](https://github.com/matrix-org/synapse/issues/9591), [\#9608](https://github.com/matrix-org/synapse/issues/9608), [\#9618](https://github.com/matrix-org/synapse/issues/9618))
- Add an additional test for purging a room. ([\#9541](https://github.com/matrix-org/synapse/issues/9541))
- Add a `.git-blame-ignore-revs` file with the hashes of auto-formatting. ([\#9560](https://github.com/matrix-org/synapse/issues/9560))
- Increase the threshold before which outbound federation to a server goes into "catch up" mode, which is expensive for the remote server to handle. ([\#9561](https://github.com/matrix-org/synapse/issues/9561))
- Fix spurious errors reported by the `config-lint.sh` script. ([\#9562](https://github.com/matrix-org/synapse/issues/9562))
- Fix type hints and tests for BlacklistingAgentWrapper and BlacklistingReactorWrapper. ([\#9563](https://github.com/matrix-org/synapse/issues/9563))
- Do not have mypy ignore type hints from unpaddedbase64. ([\#9568](https://github.com/matrix-org/synapse/issues/9568))
- Improve efficiency of calculating the auth chain in large rooms. ([\#9576](https://github.com/matrix-org/synapse/issues/9576))
- Convert `synapse.types.Requester` to an `attrs` class. ([\#9586](https://github.com/matrix-org/synapse/issues/9586))
- Add logging for redis connection setup. ([\#9590](https://github.com/matrix-org/synapse/issues/9590))
- Improve logging when processing incoming transactions. ([\#9596](https://github.com/matrix-org/synapse/issues/9596))
- Remove unused `stats.retention` setting, and emit a warning if stats are disabled. ([\#9604](https://github.com/matrix-org/synapse/issues/9604))
- Prevent attempting to bundle aggregations for state events in /context APIs. ([\#9619](https://github.com/matrix-org/synapse/issues/9619))
2021-03-22 13:40:09 +00:00
Will Hunt
45b1c58898 Merge tag 'v1.30.0rc1' into hs/hacked-together-event-cache
Synapse 1.30.0rc1 (2021-03-16)
==============================

Note that this release deprecates the ability for appservices to
call `POST /_matrix/client/r0/register`  without the body parameter `type`. Appservice
developers should use a `type` value of `m.login.application_service` as
per [the spec](https://matrix.org/docs/spec/application_service/r0.1.2#server-admin-style-permissions).
In future releases, calling this endpoint with an access token - but without a `m.login.application_service`
type - will fail.

Features
--------

- Add prometheus metrics for number of users successfully registering and logging in. ([\#9510](https://github.com/matrix-org/synapse/issues/9510), [\#9511](https://github.com/matrix-org/synapse/issues/9511), [\#9573](https://github.com/matrix-org/synapse/issues/9573))
- Add `synapse_federation_last_sent_pdu_time` and `synapse_federation_last_received_pdu_time` prometheus metrics, which monitor federation delays by reporting the timestamps of messages sent and received to a set of remote servers. ([\#9540](https://github.com/matrix-org/synapse/issues/9540))
- Add support for generating JSON Web Tokens dynamically for use as OIDC client secrets. ([\#9549](https://github.com/matrix-org/synapse/issues/9549))
- Optimise handling of incomplete room history for incoming federation. ([\#9601](https://github.com/matrix-org/synapse/issues/9601))
- Finalise support for allowing clients to pick an SSO Identity Provider ([MSC2858](https://github.com/matrix-org/matrix-doc/pull/2858)). ([\#9617](https://github.com/matrix-org/synapse/issues/9617))
- Tell spam checker modules about the SSO IdP a user registered through if one was used. ([\#9626](https://github.com/matrix-org/synapse/issues/9626))

Bugfixes
--------

- Fix long-standing bug when generating thumbnails for some images with transparency: `TypeError: cannot unpack non-iterable int object`. ([\#9473](https://github.com/matrix-org/synapse/issues/9473))
- Purge chain cover indexes for events that were purged prior to Synapse v1.29.0. ([\#9542](https://github.com/matrix-org/synapse/issues/9542), [\#9583](https://github.com/matrix-org/synapse/issues/9583))
- Fix bug where federation requests were not correctly retried on 5xx responses. ([\#9567](https://github.com/matrix-org/synapse/issues/9567))
- Fix re-activating an account via the admin API when local passwords are disabled. ([\#9587](https://github.com/matrix-org/synapse/issues/9587))
- Fix a bug introduced in Synapse 1.20 which caused incoming federation transactions to stack up, causing slow recovery from outages. ([\#9597](https://github.com/matrix-org/synapse/issues/9597))
- Fix a bug introduced in v1.28.0 where the OpenID Connect callback endpoint could error with a `MacaroonInitException`. ([\#9620](https://github.com/matrix-org/synapse/issues/9620))
- Fix Internal Server Error on `GET /_synapse/client/saml2/authn_response` request. ([\#9623](https://github.com/matrix-org/synapse/issues/9623))

Updates to the Docker image
---------------------------

- Use jemalloc if available in docker. ([\#8553](https://github.com/matrix-org/synapse/issues/8553))

Improved Documentation
----------------------

- Add relayd entry to reverse proxy example configurations. ([\#9508](https://github.com/matrix-org/synapse/issues/9508))
- Improve the SAML2 upgrade notes for 1.27.0. ([\#9550](https://github.com/matrix-org/synapse/issues/9550))
- Link to the "List user's media" admin API from the media admin API docs. ([\#9571](https://github.com/matrix-org/synapse/issues/9571))
- Clarify the spam checker modules documentation example to mention that `parse_config` is a required method. ([\#9580](https://github.com/matrix-org/synapse/issues/9580))
- Clarify the sample configuration for `stats` settings. ([\#9604](https://github.com/matrix-org/synapse/issues/9604))

Deprecations and Removals
-------------------------

- The `synapse_federation_last_sent_pdu_age` and `synapse_federation_last_received_pdu_age` prometheus metrics have been removed. They are replaced by `synapse_federation_last_sent_pdu_time` and `synapse_federation_last_received_pdu_time`. ([\#9540](https://github.com/matrix-org/synapse/issues/9540))
- Registering an Application Service user without using the `m.login.application_service` login type will be unsupported in an upcoming Synapse release. ([\#9559](https://github.com/matrix-org/synapse/issues/9559))

Internal Changes
----------------

- Add tests to ResponseCache. ([\#9458](https://github.com/matrix-org/synapse/issues/9458))
- Add type hints to purge room and server notice admin API. ([\#9520](https://github.com/matrix-org/synapse/issues/9520))
- Add extra logging to ObservableDeferred when callbacks throw exceptions. ([\#9523](https://github.com/matrix-org/synapse/issues/9523))
- Fix incorrect type hints. ([\#9528](https://github.com/matrix-org/synapse/issues/9528), [\#9543](https://github.com/matrix-org/synapse/issues/9543), [\#9591](https://github.com/matrix-org/synapse/issues/9591), [\#9608](https://github.com/matrix-org/synapse/issues/9608), [\#9618](https://github.com/matrix-org/synapse/issues/9618))
- Add an additional test for purging a room. ([\#9541](https://github.com/matrix-org/synapse/issues/9541))
- Add a `.git-blame-ignore-revs` file with the hashes of auto-formatting. ([\#9560](https://github.com/matrix-org/synapse/issues/9560))
- Increase the threshold before which outbound federation to a server goes into "catch up" mode, which is expensive for the remote server to handle. ([\#9561](https://github.com/matrix-org/synapse/issues/9561))
- Fix spurious errors reported by the `config-lint.sh` script. ([\#9562](https://github.com/matrix-org/synapse/issues/9562))
- Fix type hints and tests for BlacklistingAgentWrapper and BlacklistingReactorWrapper. ([\#9563](https://github.com/matrix-org/synapse/issues/9563))
- Do not have mypy ignore type hints from unpaddedbase64. ([\#9568](https://github.com/matrix-org/synapse/issues/9568))
- Improve efficiency of calculating the auth chain in large rooms. ([\#9576](https://github.com/matrix-org/synapse/issues/9576))
- Convert `synapse.types.Requester` to an `attrs` class. ([\#9586](https://github.com/matrix-org/synapse/issues/9586))
- Add logging for redis connection setup. ([\#9590](https://github.com/matrix-org/synapse/issues/9590))
- Improve logging when processing incoming transactions. ([\#9596](https://github.com/matrix-org/synapse/issues/9596))
- Remove unused `stats.retention` setting, and emit a warning if stats are disabled. ([\#9604](https://github.com/matrix-org/synapse/issues/9604))
- Prevent attempting to bundle aggregations for state events in /context APIs. ([\#9619](https://github.com/matrix-org/synapse/issues/9619))
2021-03-16 15:40:21 +00:00
Will Hunt
316db51bed Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-03-04 10:26:06 +00:00
Will Hunt
4a3260092d Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-02-25 19:32:18 +00:00
Will Hunt
0831f16757 this is a property 2021-02-23 01:19:51 +00:00
Will Hunt
52d3e5c796 Add Del/Expire commands to test 2021-02-22 23:58:39 +00:00
Will Hunt
2585f57e60 fix config getter 2021-02-22 22:14:01 +00:00
Will Hunt
1e11898863 linting 2021-02-22 22:04:47 +00:00
Will Hunt
48b35d9404 Make external_event_cache_expiry_ms configurable 2021-02-22 18:27:33 +00:00
Will Hunt
2be3a0284f Use ms for expiry time on get() 2021-02-22 18:27:24 +00:00
Will Hunt
f55d926682 Squash changes to events_worker to fix commit issues
This change:
- Adds dedicated functions to dehydrate and hydrate frozen events in the cache
- Resets the expiry time on events
2021-02-22 18:14:57 +00:00
Will Hunt
8b36deef2f Add delete and expire methods to external_cache 2021-02-22 18:14:40 +00:00
Will Hunt
c2f5415afe Merge remote-tracking branch 'origin/develop' into hs/hacked-together-event-cache 2021-02-22 18:11:06 +00:00
Will Hunt
2de6060266 Do not auto expire events 2021-02-16 15:46:37 +00:00
Will Hunt
873da386d9 Update synapse/storage/databases/main/events_worker.py
Co-authored-by: Christian Paul <christianp@matrix.org>
2021-02-12 10:31:00 +00:00
Will Hunt
e9fbbf1342 fix port db 2021-02-12 10:31:00 +00:00
Will Hunt
9ac17af4b4 linting 2021-02-12 10:31:00 +00:00
Will Hunt
c673e3ec1c Call external_cache.delete synchronously 2021-02-12 10:31:00 +00:00
Will Hunt
5358283ac6 Use both internal and external caches 2021-02-12 10:31:00 +00:00
Will Hunt
fc38c182bd await 2021-02-12 10:31:00 +00:00
Will Hunt
963e1c6540 changelog+linting 2021-02-12 10:31:00 +00:00
Will Hunt
0b117731ef Use external cache for events 2021-02-12 10:31:00 +00:00
Will Hunt
264e9a6ee3 Add delete key to ExternalCache 2021-02-12 10:31:00 +00:00
9 changed files with 181 additions and 9 deletions

changelog.d/9379.feature Normal file

@@ -0,0 +1 @@
Store cached events in the external redis cache, when redis is enabled.

@@ -676,6 +676,13 @@ retention:
#
#event_cache_size: 10K
# The expiry time of an event stored in the external cache (Redis). This
# time will be reset each time the event is accessed.
# This is only used when Redis is configured.
# Defaults to 30 minutes
#
#external_event_cache_expiry_ms: 1800000
caches:
# Controls the global cache factor, which is the default cache factor
# for all caches if a specific factor for that cache is not otherwise
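The `external_event_cache_expiry_ms` comment above describes a sliding expiry: every read pushes the deadline back by the configured amount. The following is a minimal in-memory sketch of that behaviour, not the Redis-backed code in this branch. The `SlidingExpiryCache` class and its injectable `clock` parameter are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class SlidingExpiryCache:
    """Hypothetical in-memory stand-in for an external cache with sliding expiry."""

    def __init__(
        self,
        expiry_ms: int = 1800000,  # 30 minutes, the default named in the sample config
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._expiry_s = expiry_ms / 1000
        self._clock = clock
        # Maps key -> (value, deadline in clock seconds).
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (value, self._clock() + self._expiry_s)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, deadline = entry
        if self._clock() >= deadline:
            # The entry outlived its deadline without being read.
            del self._entries[key]
            return None
        # Reading the entry resets its expiry, as the config comment describes.
        self._entries[key] = (value, self._clock() + self._expiry_s)
        return value
```

With a one-second expiry, an entry that is read every 0.9 seconds stays cached indefinitely, while an entry left unread for a full second is dropped on the next lookup.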

@@ -35,6 +35,7 @@ from synapse.logging.context import (
make_deferred_yieldable,
run_in_background,
)
from synapse.replication.tcp.external_cache import ExternalCache
from synapse.storage.database import DatabasePool, make_conn
from synapse.storage.databases.main.client_ips import ClientIpBackgroundUpdateStore
from synapse.storage.databases.main.deviceinbox import DeviceInboxBackgroundUpdateStore
@@ -208,13 +209,19 @@ class Store(
"Attempt to set room_is_public during port_db: database not empty?"
)
class MockHomeserver:
def __init__(self, config):
self.clock = Clock(reactor)
self.config = config
self.hostname = config.server_name
self.version_string = "Synapse/" + get_version_string(synapse)
self.external_cache = ExternalCache(self)
def get_outbound_redis_connection(self):
return None
def get_external_cache(self):
return self.external_cache
def get_clock(self):
return self.clock

@@ -31,6 +31,8 @@ class RedisProtocol(protocol.Protocol):
only_if_exists: bool = False,
) -> None: ...
async def get(self, key: str) -> Any: ...
async def delete(self, key: str) -> None: ...
async def expire(self, key: str, expire: int) -> None: ...
class SubscriberProtocol(RedisProtocol):
def __init__(self, *args, **kwargs): ...

@@ -32,6 +32,7 @@ _CACHES_LOCK = threading.Lock()
_DEFAULT_FACTOR_SIZE = 0.5
_DEFAULT_EVENT_CACHE_SIZE = "10K"
_DEFAULT_EXTERNAL_CACHE_EXPIRY_MS = 30 * 60 * 1000 # 30 minutes
class CacheProperties:
@@ -115,6 +116,13 @@ class CacheConfig(Config):
#
#event_cache_size: 10K
# The expiry time of an event stored in the external cache (Redis). This
# time will be reset each time the event is accessed.
# This is only used when Redis is configured.
# Defaults to 30 minutes
#
#external_event_cache_expiry_ms: 1800000
caches:
# Controls the global cache factor, which is the default cache factor
# for all caches if a specific factor for that cache is not otherwise
@@ -166,6 +174,13 @@ class CacheConfig(Config):
self.event_cache_size = self.parse_size(
config.get("event_cache_size", _DEFAULT_EVENT_CACHE_SIZE)
)
self.external_event_cache_expiry_ms = config.get(
"external_event_cache_expiry_ms", _DEFAULT_EXTERNAL_CACHE_EXPIRY_MS
)
if not isinstance(self.external_event_cache_expiry_ms, (int, float)):
raise ConfigError("external_event_cache_expiry_ms must be a number.")
self.cache_factors: Dict[str, float] = {}
cache_config = config.get("caches") or {}
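The validation in the hunk above can be exercised on its own. This is a sketch under stated assumptions: `read_expiry_ms` is a hypothetical free function, and `ConfigError` here is a local stand-in for Synapse's exception so the example is self-contained. It is also slightly stricter than the diff: because `bool` is a subclass of `int` in Python, `isinstance(True, (int, float))` is true, so the sketch rejects booleans explicitly.

```python
from typing import Any, Dict, Union

# Same default as _DEFAULT_EXTERNAL_CACHE_EXPIRY_MS in the diff: 30 minutes.
DEFAULT_EXTERNAL_CACHE_EXPIRY_MS = 30 * 60 * 1000


class ConfigError(Exception):
    """Local stand-in for Synapse's ConfigError (assumption for this sketch)."""


def read_expiry_ms(config: Dict[str, Any]) -> Union[int, float]:
    """Return the configured expiry in milliseconds, or the default if unset."""
    value = config.get(
        "external_event_cache_expiry_ms", DEFAULT_EXTERNAL_CACHE_EXPIRY_MS
    )
    # bool passes an isinstance(..., int) check, so it is excluded first.
    # This extra check is NOT in the diff; it is an illustrative tightening.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ConfigError("external_event_cache_expiry_ms must be a number.")
    return value
```

A string such as `"30m"` raises `ConfigError`, which matches the diff: the option is a plain number of milliseconds and is not passed through a duration parser.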

@@ -35,6 +35,12 @@ get_counter = Counter(
labelnames=["cache_name", "hit"],
)
delete_counter = Counter(
"synapse_external_cache_delete",
"Number of times we deleted keys from a cache",
labelnames=["cache_name"],
)
response_timer = Histogram(
"synapse_external_cache_response_time_seconds",
"Time taken to get a response from Redis for a cache get/set request",
@@ -72,7 +78,24 @@ class ExternalCache:
"""
return self._redis_connection is not None
-async def set(self, cache_name: str, key: str, value: Any, expiry_ms: int) -> None:
+async def delete(self, cache_name: str, key: str) -> None:
+"""Delete a key from the named cache."""
+if self._redis_connection is None:
+return
+delete_counter.labels(cache_name).inc()
+logger.debug("Deleting %s %s", cache_name, key)
+return await make_deferred_yieldable(
+self._redis_connection.delete(
+self._get_redis_key(cache_name, key),
+)
+)
+async def set(
+self, cache_name: str, key: str, value: Any, expiry_ms: Optional[int] = None
+) -> None:
"""Add the key/value to the named cache, with the expiry time given."""
if self._redis_connection is None:
@@ -95,15 +118,18 @@ class ExternalCache:
)
)
-async def get(self, cache_name: str, key: str) -> Optional[Any]:
+async def get(
+self, cache_name: str, key: str, expiry_ms: Optional[int] = None
+) -> Optional[Any]:
"""Look up a key/value in the named cache."""
if self._redis_connection is None:
return None
+cache_key = self._get_redis_key(cache_name, key)
with response_timer.labels("get").time():
result = await make_deferred_yieldable(
-self._redis_connection.get(self._get_redis_key(cache_name, key))
+self._redis_connection.get(cache_key)
)
)
logger.debug("Got cache result %s %s: %r", cache_name, key, result)
@@ -113,6 +139,13 @@ class ExternalCache:
if not result:
return None
if expiry_ms:
# If we are using this key, bump the expiry time
# NOTE: txredisapi does not support pexpire, so we must use expire (whole seconds)
await make_deferred_yieldable(
self._redis_connection.expire(cache_key, expiry_ms // 1000)
)
# For some reason the integers get magically converted back to integers
if isinstance(result, int):
return result
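The expiry bump above converts milliseconds to seconds with `expiry_ms // 1000`, which rounds down. Any value below 1000 ms therefore becomes 0, and Redis treats an `EXPIRE` with a non-positive timeout as a request to delete the key. A float from the config (the validation accepts `int` or `float`) would also yield a float here. The hypothetical helper below, which is not part of the diff, sketches one way to guard against both by returning a whole number of seconds with a floor of one:

```python
def expiry_ms_to_seconds(expiry_ms: float) -> int:
    """Convert a millisecond TTL to whole seconds for Redis EXPIRE.

    Hypothetical helper for illustration; the diff calls
    `expire(cache_key, expiry_ms // 1000)` directly.
    """
    seconds = int(expiry_ms // 1000)
    # Never return 0 or less: a non-positive EXPIRE timeout deletes the key.
    return max(seconds, 1)
```

With the default of 1800000 ms this returns 1800, matching the diff's arithmetic. The clamp only matters for sub-second settings.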

@@ -78,7 +78,7 @@ logger = logging.getLogger(__name__)
EVENT_QUEUE_THREADS = 3 # Max number of threads that will fetch events
EVENT_QUEUE_ITERATIONS = 3 # No. times we block waiting for requests for events
EVENT_QUEUE_TIMEOUT_S = 0.1 # Timeout when waiting for requests for events
GET_EVENT_CACHE_NAME = "getEvent"
@attr.s(slots=True, auto_attribs=True)
class _EventCacheEntry:
@@ -165,10 +165,14 @@ class EventsWorkerStore(SQLBaseStore):
5 * 60 * 1000,
)
self._external_cache = hs.get_external_cache()
self._get_event_cache = LruCache(
cache_name="*getEvent*",
max_size=hs.config.caches.event_cache_size,
)
self._external_cache_event_expiry_ms = (
hs.config.caches.external_event_cache_expiry_ms
)
# Map from event ID to a deferred that will result in a map from event
# ID to cache entry. Note that the returned dict may not have the
@@ -511,7 +515,7 @@ class EventsWorkerStore(SQLBaseStore):
Returns:
map from event id to result
"""
event_entry_map = self._get_events_from_cache(
event_entry_map = await self._get_events_from_cache(
event_ids,
)
@@ -593,8 +597,77 @@ class EventsWorkerStore(SQLBaseStore):
def _invalidate_get_event_cache(self, event_id):
self._get_event_cache.invalidate((event_id,))
if self._external_cache.is_enabled():
# XXX: Is there danger in doing this?
# We could hold a set of recently evicted keys in memory if
# we need this to be synchronous?
run_as_background_process(
"getEvent_external_cache_delete",
self._external_cache.delete,
GET_EVENT_CACHE_NAME,
event_id,
)
def _get_events_from_cache(
def create_external_cache_event_from_event(self, event, redacted_event=None):
if redacted_event:
redacted_event = self.create_external_cache_event_from_event(
redacted_event
)[0]
event_dict = event.get_dict()
for key, value in event.unsigned.items():
if isinstance(value, EventBase):
event_dict["unsigned"][key] = {"_cache_event_id": value.event_id}
# Return a plain (event, redacted_event) tuple of dicts: callers index the
# result with [0]/[1], and _EventCacheEntry is not subscriptable.
return (
{
"event_dict": event_dict,
"room_version": event.room_version.identifier,
"internal_metadata_dict": event.get_internal_metadata_dict(),
"rejected_reason": event.rejected_reason,
"stream_ordering": event.internal_metadata.stream_ordering,
},
redacted_event,
)
async def _create_event_cache_entry_from_external_cache_entry(
self, external_entry: Tuple[JsonDict, Optional[JsonDict]]
) -> Optional[_EventCacheEntry]:
"""Create a _EventCacheEntry from a tuple of dicts
Args:
external_entry: A tuple of event, redacted_event
Returns:
A _EventCacheEntry containing the frozen event(s)
"""
event_dict = external_entry[0].get("event_dict")
for key, value in event_dict.get("unsigned", {}).items():
# If unsigned contained any events, get them now
if isinstance(value, dict) and value.get("_cache_event_id"):
event_dict["unsigned"][key] = await self.get_event(
value["_cache_event_id"]
)
original_ev = make_event_from_dict(
event_dict=event_dict,
room_version=KNOWN_ROOM_VERSIONS[external_entry[0].get("room_version")],
internal_metadata_dict=external_entry[0].get("internal_metadata_dict"),
rejected_reason=external_entry[0].get("rejected_reason"),
)
original_ev.internal_metadata.stream_ordering = external_entry[0].get(
"stream_ordering"
)
redacted_ev = None
if external_entry[1]:
redacted_ev = make_event_from_dict(
event_dict=external_entry[1].get("event_dict"),
room_version=KNOWN_ROOM_VERSIONS[external_entry[1].get("room_version")],
internal_metadata_dict=external_entry[1].get("internal_metadata_dict"),
rejected_reason=external_entry[1].get("rejected_reason"),
)
return _EventCacheEntry(event=original_ev, redacted_event=redacted_ev)
async def _get_events_from_cache(
self, events: Iterable[str], update_metrics: bool = True
) -> Dict[str, _EventCacheEntry]:
"""Fetch events from the caches.
@@ -608,9 +681,27 @@ class EventsWorkerStore(SQLBaseStore):
event_map = {}
for event_id in events:
# L1 cache - internal
ret = self._get_event_cache.get(
(event_id,), None, update_metrics=update_metrics
)
if not ret and self._external_cache.is_enabled():
# L2 cache - external
cache_result = await self._external_cache.get(
GET_EVENT_CACHE_NAME,
event_id,
self._external_cache_event_expiry_ms,
)
if cache_result:
ret = (
await self._create_event_cache_entry_from_external_cache_entry(
cache_result
)
)
# We got a hit here, store it in the L1 cache
self._get_event_cache.set((event_id,), ret)
if not ret:
continue
@@ -889,10 +980,22 @@ class EventsWorkerStore(SQLBaseStore):
cache_entry = _EventCacheEntry(
event=original_ev, redacted_event=redacted_event
)
self._get_event_cache.set((event_id,), cache_entry)
result_map[event_id] = cache_entry
if self._external_cache.is_enabled():
# Store in the L2 cache
# Redis cannot store a FrozenEvent, so we transform these
# into two dicts
redis_cache_entry = self.create_external_cache_event_from_event(
original_ev, redacted_event
)
await self._external_cache.set(
GET_EVENT_CACHE_NAME,
event_id,
redis_cache_entry,
)
return result_map
async def _enqueue_events(self, events):
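
Review note: the lookup path in `_get_events_from_cache` is a two-tier read, checking the in-process cache first and promoting any external-cache hit back into it. The following is a minimal sketch of that pattern under stated assumptions: `TwoTierCache`, `l2_get`, and `demo` are hypothetical names, the L1 tier is a plain dict rather than `LruCache`, and the L2 tier is any async callable rather than the real `ExternalCache`.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Iterable, Optional


class TwoTierCache:
    """Hypothetical L1 (in-process dict) in front of an L2 (async callable)."""

    def __init__(self, l2_get: Callable[[str], Awaitable[Optional[Any]]]) -> None:
        self._l1: Dict[str, Any] = {}
        self._l2_get = l2_get
        self.l2_lookups = 0  # counts round trips to the slower tier

    async def get_many(self, keys: Iterable[str]) -> Dict[str, Any]:
        found: Dict[str, Any] = {}
        for key in keys:
            value = self._l1.get(key)
            if value is None:
                self.l2_lookups += 1
                value = await self._l2_get(key)
                if value is not None:
                    # Promote the L2 hit so the next lookup stays in-process.
                    self._l1[key] = value
            if value is None:
                # Missed both tiers; the caller would fall back to the database.
                continue
            found[key] = value
        return found


async def demo() -> tuple:
    l2 = {"$a": "event-a"}

    async def l2_get(key: str) -> Optional[str]:
        return l2.get(key)

    cache = TwoTierCache(l2_get)
    first = await cache.get_many(["$a", "$b"])
    second = await cache.get_many(["$a"])
    return first, second, cache.l2_lookups
```

Note that only a real hit is promoted into L1, and that each L1 miss awaits the external tier one key at a time. For large batches of event IDs that means one round trip per miss, so a batched lookup may be worth considering.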

@@ -629,7 +629,7 @@ class RoomMemberWorkerStore(EventsWorkerStore):
# We don't update the event cache hit ratio as it completely throws off
# the hit ratio counts. After all, we don't populate the cache if we
# miss it here
event_map = self._get_events_from_cache(member_event_ids, update_metrics=False)
event_map = await self._get_events_from_cache(member_event_ids, update_metrics=False)
missing_member_event_ids = []
for event_id in member_event_ids:

@@ -525,6 +525,10 @@ class FakeRedisPubSubProtocol(Protocol):
self.send("OK")
elif command == b"GET":
self.send(None)
elif command == b"DEL":
self.send("OK")
elif command == b"EXPIRE":
self.send("OK")
else:
raise Exception("Unknown command")
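
Review note: the fake answers `DEL` and `EXPIRE` with `"OK"`, whereas a real Redis server replies to both with an integer (the number of keys removed, and 1 or 0 for whether a timeout was set). The following is a minimal sketch of how those replies look on the wire in the RESP protocol. `encode_resp` and `fake_reply` are hypothetical helpers written for illustration and are not the test harness's own `send`/`encode` methods.

```python
from typing import Any


def encode_resp(obj: Any) -> bytes:
    """Encode a Python value as a RESP (v2) reply."""
    if obj is None:
        # Null bulk string, e.g. GET on a missing key.
        return b"$-1\r\n"
    if isinstance(obj, int):
        # Integer reply, e.g. DEL and EXPIRE.
        return b":%d\r\n" % obj
    if isinstance(obj, str):
        # Simple string reply, e.g. "OK" for SET.
        return b"+" + obj.encode("utf-8") + b"\r\n"
    if isinstance(obj, bytes):
        # Bulk string reply: length prefix, then the payload.
        return b"$%d\r\n" % len(obj) + obj + b"\r\n"
    if isinstance(obj, (list, tuple)):
        # Array reply: element count, then each element encoded in turn.
        return b"*%d\r\n" % len(obj) + b"".join(encode_resp(x) for x in obj)
    raise TypeError(f"Cannot encode {type(obj).__name__} as RESP")


# Reply values assumed for illustration: a cache miss for GET, and one key
# affected for DEL and EXPIRE.
_REPLIES = {b"GET": None, b"DEL": 1, b"EXPIRE": 1}


def fake_reply(command: bytes) -> bytes:
    """Return the encoded reply for a command, mirroring the fake's dispatch."""
    if command not in _REPLIES:
        raise Exception("Unknown command")
    return encode_resp(_REPLIES[command])
```

Whether the difference matters depends on how the client parses replies; the code in this change discards the results of `delete` and `expire`, so `"OK"` is likely harmless here.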