This was broken in PR #4405, commit 886e5ac, where we changed remote
rejections to be outliers.
The fix is to explicitly add the leave event in when we know its an out
of band invite. We can't always add the event as if the server is/was in
the room there might be more events to send down the sync than just the
leave.
* Fix replication for room v3
We were not correctly quoting the path fragments over http replication,
which meant that it exploded when the event IDs had a slash in them
* Newsfile
* Handle listening for ACME requests on IPv6 addresses
the weird url-but-not-actually-a-url-string doesn't handle IPv6 addresses
without extra quoting. Building a string which you are about to parse again
seems like a weird choice. Let's just use listenTCP, which is consistent with
what we do elsewhere.
* Clean up the default ACME config
make it look a bit more consistent with everything else, and tweak the defaults
to listen on port 80.
* newsfile
We only process events sent to us from a server if the event ID matches
the server, to help guard against federation storms. We replace this
with a check against the event origin.
The transaction queue only sends out events that we generate. This was
done by checking domain of event ID, but that can no longer be used.
Instead, we may as well use the sender field.
We add the constant, but don't add it to the known room versions. This
lets us start adding V3 logic, but the servers will never join or create
V3 rooms
We currently pass FrozenEvent instead of `dict` to
`compute_event_signature`, which works by accident due to `dict(event)`
producing the correct result.
This fixes PR #4493 commit 855a151
two reasons for this. One, it saves a bunch of boilerplate. Two, it squashes
unicode to IDNA-in-a-`str` (even on python 3) in a way that it turns out we
rely on to give consistent behaviour between python 2 and 3.
The validator was being run on the EventBuilder objects, and so the
validator only checked a subset of fields. With the upcoming
EventBuilder refactor even fewer fields will be there to validate.
To get around this we split the validation into those that can be run
against an EventBuilder and those run against a fully fledged event.
This is in preparation for making EventBuilder format agnostic, which
means event signing should be done against the event dict rather than
the EventBuilder object.
Turns out that the library does a better job of parsing URIs than our
reinvented wheel. Who knew.
There are two things going on here. The first is that, unlike
parse_server_name, URI.fromBytes will strip off square brackets from IPv6
literals, which means that it is valid input to ClientTLSOptionsFactory and
HostnameEndpoint.
The second is that we stay in `bytes` throughout (except for the argument to
ClientTLSOptionsFactory), which avoids the weirdness of (sometimes) ending up
with idna-encoded values being held in `unicode` variables. TBH it probably
would have been ok but it made the tests fragile.
If you use double-quotes here, you have to escape your backslashes. It's much
easier with single-quotes.
(Note that the existing double-backslashes are already interpreted by python's
""" parsing.)
This is leading to problems with people upgrading to clients that
support MSC1730 because people have this misconfigured, so try
to make the docs completely unambiguous.
The problem here is that we have cut-and-pasted an impl from Twisted, and then
failed to maintain it. It was fixed in Twisted in
https://github.com/twisted/twisted/pull/1047/files; let's do the same here.
Currently they're stored as non-outliers even though the server isn't in
the room, which can be problematic in places where the code assumes it
has the state for all non outlier events.
In particular, there is an edge case where persisting the leave event
triggers a state resolution, which requires looking up the room version
from state. Since the server doesn't have the state, this causes an
exception to be thrown.
In the debian package, make the virtualenv symlink python to /usr/bin/python3.X
rather than /usr/bin/python3. Also make sure we depend on the right python3.x
package.
This might help a bit with subtle failures when people install a package from
the wrong distro (https://github.com/matrix-org/synapse/issues/4431).
Currently we only have the one event format version defined, but this
adds the necessary infrastructure to persist and fetch the format
versions alongside the events.
We specify the format version rather than the room version as:
1. We don't necessarily know the room version, existing events may be
either v1 or v2.
2. We'd need to be careful to prevent/handle correctly if different
events in the same room reported to be of different versions, which
sounds annoying.
This was caused by accidentally overwritting a `last_seen` variable
in a for loop, causing the wrong value to be written to the progress
table. The result of which was that we didn't scan sections of the table
when searching for duplicates, and so some duplicates did not get
deleted.
* Fix race when persisting create event
When persisting a chunk of DAG it is sometimes requried to do a state
resolution, which requires knowledge of the room version. If this
happens while we're persisting the create event then we need to use that
event rather than attempting to look it up in the database.
* Remove redundant WrappedConnection
The matrix federation client uses an HTTP connection pool, which times out its
idle HTTP connections, so there is no need for any of this business.
Since 0.13.0, pymacaroons works correctly with pynacl, so there
isn’t any more reason to depend on an outdated pynacl fork.
Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
turns out that 0.34.1.1+1 comes before 0.34.1.1+bionic (etc).
The version may only contain "~ 0-9 A-Z a-z + - ." (sorting in that order).
Option 1: replace "+" with something that sorts after +. Options are "-" (but
dpkg-source complains about that) or "." (but that would mean we couldn't
distinguish packaging-only changes from real changes).
Option 2: stick with + and just find something that sorts after 'xenial'. The
only options there are "-", "." (same problems as before), "z", and "+".
Hence, ++1. Sorry.
* Correctly retry and back off if we get a HTTPerror response
* Refactor request sending to have better excpetions
MatrixFederationHttpClient blindly reraised exceptions to the caller
without differentiating "expected" failures (e.g. connection timeouts
etc) versus more severe problems (e.g. programming errors).
This commit adds a RequestSendFailed exception that is raised when
"expected" failures happen, allowing the TransactionQueue to log them as
warnings while allowing us to log other exceptions as actual exceptions.
* Fix command hint to generate a config file
When trying to start Synapse without a config file, it will complain
and give a hint towards what command to run. This hinted command
is missing the "report_stats" parameter, which is required with either
yes or no value. Add this to the command.
Not an ideal situation but makes the given command work without the
user getting another error, even though it might be unclear what
"report_stats" represents.
Signed-off-by: Jason Robinson <jasonr@matrix.org>
Hi, the original docker-compose file did not work by default.
You get federation port working but no client port.
My proposal is to let federation port work as it is by default (8448) and let traefik handle client http/https traffic.
since #4298, the optional dependencies are no longer installed with a simple
`pip install .`, which meant that they were not being included in the debian
package.
The easy fix to that is dh_virtualenv --extras, but that needs dh_virtualenv
1.1...
* Remove mention of lt-cred-mech in the sample coturn config.
See https://github.com/coturn/coturn/pull/262 for more context.
Also clean up some minor formatting issues while I'm here.
* Add changelog.
Signed-off-by: Krithin Sitaram <krithin@gmail.com>
prometheus_client 0.5 has a named-tuple Sample type with more member
than the old plain tuple had. This commit makes sure the unit test
detects this and changes the way it reads the sample.
Signed-off-by: Maarten de Vries <maarten@de-vri.es>
Allow for the creation of a support user.
A support user can access the server, join rooms, interact with other users, but does not appear in the user directory nor does it contribute to monthly active user limits.
This is largely a precursor for the removal of the bundled webclient. The idea
is to present a page at / which reassures people that something is working, and
to give them some links for next steps.
The welcome page lives at `/_matrix/static/`, so is enabled alongside the other
`static` resources (which, in practice, means the client API is enabled). We'll
redirect to it from `/` if we have nothing better to display there.
It would be nice to have a way to disable it (in the same way that you might
disable the nginx welcome page), but I can't really think of a good way to do
that without a load of ickiness.
It's based on the work done by @krombel for #2601.
https://www.kernel.org/doc/Documentation/SubmittingPatches returns a simple page indicating that it's moved to: "process/submitting-patches.rst".
I believe the new link contains the same information as what was previously linked to.
If you're installing as a system package, the system package should have set up
the systemd config, so it's more useful to give an example of running in a
virtualenv here.
This implements both a SAML2 metadata endpoint (at
`/_matrix/saml2/metadata.xml`), and a SAML2 response receiver (at
`/_matrix/saml2/authn_response`). If the SAML2 response matches what's been
configured, we complete the SSO login flow by redirecting to the client url
(aka `RelayState` in SAML2 jargon) with a login token.
What we don't yet have is anything to build a SAML2 request and redirect the
user to the identity provider. That is left as an exercise for the reader.
This is mostly factoring out the post-CAS-login code to somewhere we can reuse
it for other SSO flows, but it also fixes the userid mapping while we're at it.
* Rip out half-implemented m.login.saml2 support
This was implemented in an odd way that left most of the work to the client, in
a way that I really didn't understand. It's going to be a pain to maintain, so
let's start by ripping it out.
* drop undocumented dependency on dateutil
It turns out we were relying on dateutil being pulled in transitively by
pysaml2. There's no need for that bloat.
* Add note to UPGRADE.rst about removing riot.im from list of trusted identity servers
Signed-off-by: Aaron Raimist <aaron@raim.ist>
* Add changelog
Signed-off-by: Aaron Raimist <aaron@raim.ist>
* Clean up the CSS for the fallback login form
I was finding this hard to work with, so simplify a bunch of things. Each
flow is now a form inside a div of class login_flow.
The login_flow class now has a fixed width, as that looks much better than each
flow having a differnt width.
* Support m.login.sso
MSC1721 renames m.login.cas to m.login.sso. This implements the change
(retaining support for m.login.cas for older clients).
* changelog
* Add better diagnostics to flakey keyring test
* fix interpolation fail
* Check logcontexts before and after each test
* update changelog
* update changelog
* Some words about garbage collections and logcontexts
* Do a GC after each test to fix logcontext leaks
This feels like an awful hack, but...
* changelog
Clients often reupload their device keys (for some reason) so its
important for the server to check for no-ops before sending out device
list update notifications.
The check is broken in python 3 due to the fact comparing bytes and
unicode always fails, and that we write bytes to the DB but get unicode
when we read.
When we receive events over federation we will need to know the room
version to be able to correctly handle them, e.g. once we start changing
event formats. Currently, we attempt to handle events in unknown rooms.
It seems that, at some point, the ability to run tox on old servers (with old
setuptools) got broken - and it was only working on our Jenkins instance by
dint of reusing the tox environments.
Let's try to get tox to do the right thing, and remove the guff from
jenkins/prepare_synapse.sh.
(There is a separate question about whether the jenkins builds should be using
tox to prepare the virtualenv at all here, but that is somewhat orthogonal).
We currently send several kHz of device list updates over replication
occisonally, which often causes the replications streams to lag and then
get dropped.
A lot of those updates will actually be duplicates, since we don't send
e.g. device_ids across replication, so let's deduplicate it when we pull
them out of the database.
Fixes handling of rooms where we have permission to send the tombstone, but not
other state. We need to (a) fail more gracefully when we can't send the PLs in
the old room, and (b) not set the PLs in the new room until we are done with
the other stuff.
Currently when fetching state groups from the data store we make two
hits two the database: once for members and once for non-members (unless
request is filtered to one or the other). This adds needless load to the
datbase, so this PR refactors the lookup to make only a single database
hit.
Debug tests
Try printing the channel
fix
Import and use six
Remove debugging
Disable captcha
Add some mocks
Define the URL
Fix the clock?
Less rendering?
use the other render
Complete the dummy auth stage
Fix last stage of the test
Remove mocks we don't need
Fixes a bug introduced in https://github.com/matrix-org/synapse/pull/1783 which
meant that single backslashes were not allowed in event field filters.
The intention here is to allow single-backslashes, but disallow
double-backslashes.
Broadly three things here:
* disable W504 which seems a bit whacko
* remove a bunch of `as e` expressions from exception handlers that don't use
them
* use `r""` for strings which include backslashes
Also, we don't use pep8 any more, so we can get rid of the duplicate config
there.
Wrap calls to deferToThread() in a thing which uses a child logcontext to
attribute CPU usage to the right request.
While we're in the area, remove the logcontext_tracer stuff, which is never
used, and afaik doesn't work.
Fixes#4064
This brings it into line with on_new_notifications and on_new_receipts. It
requires a little bit of hoop-jumping in EmailPusher to load the throttle
params before the first loop.
`on_new_notifications` and `on_new_receipts` in `HttpPusher` and `EmailPusher`
now always return synchronously, so we can remove the `defer.gatherResults` on
their results, and the `run_as_background_process` wrappers can be removed too
because the PusherPool methods will now complete quickly enough.
Each pusher has its own loop which runs for as long as it has work to do. This
should run in its own background thread with its own logcontext, as other
similar loops elsewhere in the system do - which means that CPU usage is
consistently attributed to that loop, rather than to whatever request happened
to start the loop.
As of #4027, we require psutil to be installed, so it should be in our
dependency list. We can also remove some of the conditional import code
introduced by #992.
Fixes#4062.
It's quite important that get_missing_events returns the *latest* events in the
room; however we were pulling event ids out of the database until we got *at
least* 10, and then taking the *earliest* of the results.
We also shouldn't really be relying on depth, and should be checking the
room_id.
move the example email templates into the synapse package so that they can be
used as package data, which should mean that all of the packaging mechanisms
(pip, docker, debian, arch, etc) should now come with the example templates.
In order to grandfather in people who relied on the templates being in the old
place, check for that situation and fall back to using the defaults if the
templates directory does not exist.
It's quite important that get_missing_events returns the *latest* events in the
room; however we were pulling event ids out of the database until we got *at
least* 10, and then taking the *earliest* of the results.
We also shouldn't really be relying on depth, and should be checking the
room_id.
- Improve logging: log things in the right order, include destination and txids
in all log lines, don't log successful responses twice
- Fix the docstring on TransportLayerClient.send_transaction
- Don't use treq.request, which is overcomplicated for our purposes: just use a
twisted.web.client.Agent.
- simplify the logic for setting up the bodyProducer
- fix bytes/str confusions
We're better off hashing just the event_id than the whole ((type, state_key),
event_id) tuple - so use a dict instead of a set.
Also, iteritems > items.
Since we don't actually delete the keys, just mark the versions
as deleted in the db rather than actually deleting them, then we
won't reuse versions.
Fixes https://github.com/vector-im/riot-web/issues/7448
If a looping call function errors, then it kills the loop entirely.
Currently it throws away the exception logs, so we should make it
actually log them.
Fixes#3929
If a connection is lost before a request is read from Request, Twisted
sets `method` (and `uri`) attributes to dummy values. These dummy values
have incorrect types (i.e. they're not bytes), and so things like
`__repr__` would raise an exception.
To fix this we had a helper method to return the method with a
consistent type.
In particular, we assume that the name and canonical alias events in
the state have not been rejected. In practice this may not be the case
(though we should probably think about fixing that) so lets ensure that
we gracefully handle that case, rather than 404'ing the sync request
like we do now.
This test didn't do what it claimed to do, and what it claimed to do was the
same as test_cant_hide_direct_ancestors anyway.
This stuff is tested by sytest anyway.
Latest is horrible and makes debugging what has happened anywhere a
nightmare. We push a latest because of demand for it, but we'll also
push a SHA1 commit id so those wanting to know what they're running
(and be able to roll back if required) can use those instead.
Note that latest here is defined as "most recent master commit" not
"most recent released version", as the actual semantics of making latest
correct while still being able to build bugfixed releases of previous
versions is just ARGH. So we define it as "master" not "latest release".
If we have a forward extremity for a room as `E`, and you receive `A`, `B`,
s.t. `A -> B -> E`, and `B` also points to an unknown event `X`, then we need
to do state res between `X` and `E`.
When that happens, we need to make sure we include `X` in the state that goes
into the state res alg.
Fixes#3934.
If we've fetched state events from remote servers in order to resolve the state
for a new event, we need to actually pass those events into
resolve_events_with_factory (so that it can do the state res) and then persist
the ones we need - otherwise other bits of the codebase get confused about why
we have state groups pointing to non-existent events.
get_state_groups returns a map from state_group_id to a list of FrozenEvents,
so was very much the wrong thing to be putting as one of the entries in the
list passed to resolve_events_with_factory (which expects maps from
(event_type, state_key) to event id).
We actually want get_state_groups_ids().values() rather than
get_state_groups().
This fixes the main problem in #3923, but there are other problems with this
bit of code which get discovered once you do so.
* add some comments on things that look a bit bogus
* rename this `state` variable to avoid confusion with the `state` used
elsewhere in this function. (There was no actual conflict, but it was
a confusing bit of spaghetti.)
when processing incoming transactions, it can be hard to see what's going on,
because we process a bunch of stuff in parallel, and because we may end up
recursively working our way through a chain of three or four events.
This commit creates a way to use logcontexts to add the relevant event ids to
the log lines.
This ensures that its resource usage metrics get recorded somewhere rather than
getting lost.
(It also fixes an error when called from a nested logging context which
completes before the bg process)
There's really no point in checking for destinations called "localhost" because
there is nothing stopping people creating other DNS entries which point to
127.0.0.1. The right fix for this is
https://github.com/matrix-org/synapse/issues/3953.
Blocking localhost, on the other hand, means that you get a surprise when
trying to connect a test server on localhost to an existing server (with a
'normal' server_name).
When we were authorizing an event, if there was no `m.room.create` in its
auth_events, we would raise a SynapseError with a cryptic message, which then
meant that we would bail out of processing any incoming events, rather than
storing a rejection for the faulty event and moving on.
We should treat the absent event the same as any other auth failure, by
raising an AuthError, so that the event is marked as rejected.
It used to try and produce an estimate, which was sometimes negative.
This caused metrics to be sad, so lets always just calculate it from
scratch.
(This appears to have been a longstanding bug, but one which has been made more
of a problem by #3932 and #3933).
(This was originally done by Erik as part of #3933. I'm cherry-picking it
because really it's a fix in its own right)
Synapse doesn’t allow for media resources to be played directly from
Chrome. It is a problem for users on other networks (e.g. IRC)
communicating with Matrix users through a gateway. The gateway sends
them the raw URL for the resource when a Matrix user uploads a video
and the video cannot be played directly in Chrome using that URL.
Chrome argues it is not authorized to play the video because of the
Content Security Policy. Chrome checks for the "media-src" policy which
is missing, and defauts to the "default-src" policy which is "none".
As Synapse already sends "object-src: 'self'" I thought it wouldn’t be
a problem to add "media-src: 'self'" to the CSP to fix this problem.
symlinks apparently break setuptools on python3 and alpine
(https://bugs.python.org/issue31940), so let's stop using a symlink and just
use the file directly.
Given we have disabled lazy loading for incr syncs in #3840, we can make self-LL more efficient by only doing it on initial sync. Also adds a bounds check for if/when we change our mind, so that we don't try to include LL members on sync responses with no timeline.
Older Twisted (18.4.0) returns TimeoutError instead of
ConnectingCancelledError when connection times out.
This change allows tests to be compatible with this behaviour.
Signed-off-by: Oleg Girko <ol@infoserver.lv>
ExpiringCache required that `start()` be called before it would actually
start expiring entries. A number of places didn't do that.
This PR removes `start` from ExpiringCache, and automatically starts
backround reaping process on creation instead.
If a HTTP handler throws an exception while processing a request we
automatically write a JSON error response. If the handler had already
started writing a response twisted throws an exception.
We should check for this case and simple abort the connection if there
was an error after the response had started being written.
Add some informative comments about what's going on here.
Also, `sent_to_us_directly` and `get_missing` were doing the same thing (apart
from in `_handle_queued_pdus`, which looks like a bug), so let's get rid of
`get_missing` and use `sent_to_us_directly` consistently.
Let's try to rationalise the logging that happens when we are processing an
incoming transaction, to make it easier to figure out what is going wrong when
they take ages. In particular:
- make everything start with a [room_id event_id] prefix
- make sure we log a warning when catching exceptions rather than just turning
them into other, more cryptic, exceptions.
Currently we rely on the master to invalidate this cache promptly.
However, after having moved most federation endpoints off of master this
no longer happens, causing outbound fedeariont to get blackholed.
Fixes#3798
The existing deferred timeout helper function (and the one into twisted)
suffer from a bug when a deferred's canceller throws an exception, #3842.
The new helper function doesn't suffer from this problem.
We want to wait until we have read the response body before we log the request
as complete, otherwise a confusing thing happens where the request appears to
have completed, but we later fail it.
To do this, we factor the salient details of a request out to a separate
object, which can then keep track of the txn_id, so that it can be logged.
The problem with this script is that it is largely untested, entirely
unmaintained, and running it is likely to make your synapse blow up in
exciting ways.
For example, it leaves a bunch of tables with dead values in it, like
event_to_state_groups.
Having it here sends a message that it is a supported part of
synapse, which is absolutely not the case.
don't filter membership events based on history visibility
as we will already have filtered the messages in the timeline, and state events
are always visible.
and because @erikjohnston said so.
Continues from uhoreg's branch
This just fixed the errcode on /room_keys/version if no backup and
updates the schema delta to be on the latest so it gets run
If we receive an event that doesn't pass their content hash check (e.g.
due to already being redacted) then we hit a bug which causes an
exception to be raised, which then promplty stops the event (and
request) from being processed.
This effects all sorts of federation APIs, including joining rooms with
a redacted state event.
* speed up room summaries by pulling their data from room_memberships rather than room state
* disable LL for incr syncs, and log incr sync stats (#3840)
When a user joined a room any existing tags were not sent down the sync
stream. Ordinarily this isn't a problem because the user needs to be in
the room to have set tags in it, however synapse will sometimes add tags
for a user to a room, e.g. for server notices, which need to come down
sync.
We should check that both the sender's server, and the server which created the
event_id (which may be different from whatever the remote server has told us
the origin is), have signed the event.
Fetching the list of all new typing notifications involved iterating
over all rooms and comparing their serial. Lets move to using a stream
change cache, like we do for other streams.
We should explicitly close any db connections we open, because failing to do so
can block other transactions as per
https://github.com/matrix-org/synapse/issues/3682.
Let's also try to factor out some of the boilerplate by having server classes
define their datastore class rather than duplicating the whole of `setup`.
This was originally done in commit c75b71a397,
but got reverted on this branch due to the PR (#3677) being based on the wrong
branch.
We're ready to merge this to master now, so let's make it match
release-v0.33.3.
Splits the state_group_cache in two.
One half contains normal state events; the other contains member events.
The idea is that the lazyloading common case of: "I want a subset of member events plus all of the other state" can be accomplished efficiently by splitting the cache into two, and asking for "all events" from the non-members cache, and "just these keys" from the members cache. This means we can avoid having to make DictionaryCache aware of these sort of complicated queries, whilst letting LL requests benefit from the caching.
Previously we were unable to sensibly use the caching and had to pull all state from the DB irrespective of the filtering, which made things slow. Hopefully fixes https://github.com/matrix-org/synapse/issues/3720.
Turns out that the user directory handling is fairly racey as a bunch
of stuff assumes that the processing happens on master, which it doesn't
when there is a synapse.app.user_dir worker. So lets just call the
function directly until we actually get round to fixing it, since it
doesn't make the situation any worse.
Run the handlers for replication commands as background processes. This should
improve the visibility in our metrics, and reduce the number of "running db
transaction from sentinel context" warnings.
Ideally it means converting the things that fire off deferreds into the night
into things that actually return a Deferred when they are done. I've made a bit
of a stab at this, but it will probably be leaky.
First of all, avoid resetting the logcontext before running the pushers, to fix
the "Starting db txn 'get_all_updated_receipts' from sentinel context" warning.
Instead, give them their own "background process" logcontexts.
The problem with dumping all of the json response into the Request object at
once is that doing so starts the timeout for the next request to be received:
so if it takes longer than 60s to stream back the response to the client, the
client never gets it.
The correct solution is to use a Producer; then the timeout is only started
once all of the content is sent over the TCP connection.
This commit moves a bunch of the logic for deciding when to log the receipt and
completion of HTTP requests into SynapseRequest, rather than in the request
handling wrappers.
Advantages of this are:
* we get logs for *all* requests (including OPTIONS and HEADs), rather than
just those that end up hitting handlers we've remembered to decorate
correctly.
* when a request handler wires up a Producer (as the media stuff does
currently, and as other things will do soon), we log at the point that all
of the traffic has been sent to the client.
The script recaptcha_ajax.js contains minified non-free code which
we probably cannot redistribute. Since it talks to Google servers
anyway, it is better to just download it from Google directly instead
of shipping it.
This fixes#1932.
Turns out that cancellation of inlineDeferreds didn't really work properly
until Twisted 18.7. This commit refactors Linearizer.queue to avoid
inlineCallbacks.
Older identity servers may not support the unbind 3pid request, so we
shouldn't fail the requests if we received one of 400/404/501. The
request still fails if we receive e.g. 500 responses, allowing clients
to retry requests on transient identity server errors that otherwise do
support the API.
Fixes#3661
This is the first tranche of support for room versioning. It includes:
* setting the default room version in the config file
* new room_version param on the createRoom API
* storing the version of newly-created rooms in the m.room.create event
* fishing the version of existing rooms out of the m.room.create event
Synapse 0.33.1 (2018-08-02)
===========================
SECURITY FIXES
--------------
- Fix a potential issue where servers could request events for rooms they have not joined. (`#3641 <https://github.com/matrix-org/synapse/issues/3641>`_)
- Fix a potential issue where users could see events in private rooms before they joined. (`#3642 <https://github.com/matrix-org/synapse/issues/3642>`_)
Make sure that the user has permission to view the requeseted event for
/event/{eventId} and /room/{roomId}/event/{eventId} requests.
Also check that the event is in the given room for
/room/{roomId}/event/{eventId}, for sanity.
When we get a federation request which refers to an event id, make sure that
said event is in the room the caller claims it is in.
(patch supplied by @turt2live)
This code brings the SimpleHttpClient into line with the
MatrixFederationHttpClient by having it raise HttpResponseExceptions when a
request fails (rather than trying to parse for matrix errors and maybe raising
MatrixCodeMessageException).
Then, whenever we were checking for MatrixCodeMessageException and turning them
into SynapseErrors, we now need to check for HttpResponseExceptions and call
to_synapse_error.
This commit replaces SynapseError.from_http_response_exception with
HttpResponseException.to_synapse_error.
The new method actually returns a ProxiedRequestError, which allows us to pass
through additional metadata from the API call.
We really shouldn't be sending all CodeMessageExceptions back over the C-S API;
it will include things like 401s which we shouldn't proxy.
That means that we need to explicitly turn a few HttpResponseExceptions into
SynapseErrors in the federation layer.
The effect of the latter is that the matrix errcode will get passed through
correctly to calling clients, which might help with some of the random
M_UNKNOWN errors when trying to join rooms.
secrets got introduced in python 3.6 so this class is not available
in 3.5 and before.
This now checks for the current running version and only tries using
secrets if the version is 3.6 or above
Signed-Off-By: Matthias Kesler <krombel@krombel.de>
* attempt at deduplicating lazy-loaded members
as per the proposal; we can deduplicate redundant lazy-loaded members
which are sent in the same sync sequence. we do this heuristically
rather than requiring the client to somehow tell us which members it
has chosen to cache, by instead caching the last N members sent to
a client, and not sending them again. For now we hardcode N to 100.
Each cache for a given (user,device) tuple is in turn cached for up to
X minutes (to avoid the caches building up). For now we hardcode X to 30.
* add include_redundant_members filter option & make it work
* remove stale todo
* add tests for _get_some_state_from_cache
* incorporate review
This field is no longer read from, so we should stop populating it. Once we're
happy that this doesn't break everything, and a rollback is unlikely, we can
think about dropping the column.
We've long passed the point where it's possible to have the same event_id in
different tables, so these join conditions are redundant: we can just join on
event_id.
event_edges is of non-trivial size, and the room_id column is wasteful, so
let's stop reading from it. In future, we can stop writing to it, and then drop
it.
(since it uses methods therein)
Turns out that we had a bunch of things which were incorrectly importing
EventWorkerStore from events.py rather than events_worker.py, which broke once
I removed the import into events.py.
It turns out that looping_call does check the deferred returned by its
callback, and (at least in the case of client_ips), we were relying on this,
and I broke it in #3604.
Update run_as_background_process to return the deferred, and make sure we
return it to clock.looping_call.
This fixes a bug in _delete_existing_rows_txn which was introduced in #3435
(though it's been on matrix-org-hotfixes for *years*). This code is only called
when there is some sort of conflict the first time we try to persist an event,
so it only happens rarely. Still, the exceptions are annoying.
We need to run the errback in the sentinel context to avoid losing our own
context.
Also: add logging to runInteraction to help identify where "Starting db
connection from sentinel context" warnings are coming from
_update_remote_profile_cache was missing its `defer.inlineCallbacks`, so when
it was called, would just return a generator object, without actually running
any of the method body.
on_notifier_poke no longer runs synchonously, so we have to do a different hack
to make sure that the replication data has been sent. Let's actually listen for
its arrival.
We incorrectly asserted that all contexts must have a non None state
group without consider outliers. This would usually be fine as the
assertion would never be hit, as there is a shortcut during persistence
if the forward extremities don't change.
However, if the outlier is being persisted with non-outlier events, the
function would be called and the assertion would be hit.
Fixes#3601
This allows us to handle /context/ requests on the client_reader worker
without having to pull in all the various stream handlers (e.g.
precence, typing, pushers etc). The only thing the token gets used for
is pagination, and that ignores everything but the room portion of the
token.
First of all, fix the logic which looks for identical input state groups so
that we actually use them. This turned out to be most easily done by factoring
the relevant code out to a separate function so that we could do an early
return.
Secondly, avoid building the whole `conflicted_state` dict (which was only ever
used as a boolean flag).
Thirdly, replace the construction of the `state` dict (which mapped from keys
to events that set them), with an optimistic construction of the resolution
result assuming there will be no conflicts. This should be no slower than
building the old `state` dict, and:
- in the conflicted case, we'll short-cut it, saving part of the work
- in the unconflicted case, it saves rebuilding the resolution from the
`state` dict.
Finally, do a couple of s/values/itervalues/.
We don't want to bother pulling out the current state from the DB since
until we know we have to. Checking the context for state is just an
optimisation.
Since, for better or worse, we seem to have configured isort to generate
89-character lines, pycharm is now complaining at me that our lines are too
long.
So, let's configure pep8 to behave consistently with flake8.
This is more involved than it might otherwise be, because the current
implementation just drops its logcontexts and runs everything in the sentinel
context.
It turns out that we aren't actually using a bunch of the functionality here
(notably suppress_failures and the fact that Distributor.fire returns a
deferred), so the easiest way to fix this is actually by simplifying a bunch of
code.
This fixes#3518, and ensures that we get useful logs and metrics for lots of
things that happen in the background.
(There are certainly more things that happen in the background; these are just
the common ones I've found running a single-process synapse locally).
This introduces a mechanism for tracking resource usage by background
processes, along with an example of how it will be used.
This will help address #3518, but more importantly will give us better insights
into things which are happening but not being shown up by the request metrics.
We *could* do this with Measure blocks, but:
- I think having them pulled out as a completely separate metric class will
make it easier to distinguish top-level processes from those which are
nested.
- I want to be able to report on in-flight background processes, and I don't
think we want to do this for *all* Measure blocks.
The get_entities_changed function was changed to return all changed
entities since the given stream position, rather than only those changed
from a given list of entities. This resulted in the function incorrectly
returning large numbers of entities that, for example, caused large
increases in database usage.
It's much easier to build the image via docker-compose instead of an
error-prone low-level docker call.
Signed-off-by: Benedikt Heine <bebe@bebehei.de>
parse_integer and parse_string can take a request and raise errors
in case we have wrong or missing params.
This PR tries to use them more to deduplicate some code and make it
better readable
The stream cache keeps track of all entities that have changed since
a particular stream position, so get_entities_changed does not need to
return unknown entites when given a larger stream position.
This makes it consistent with the behaviour of has_entity_changed.
This line shows up as about 5% of cpu time on a synchrotron:
not_known_entities = set(entities) - set(self._entity_to_key)
Presumably the problem here is that _entity_to_key can be largeish, and
building a set for its keys every time this function is called is slow.
Here we rewrite the logic to avoid building so many sets.
Previously we queued up the poke correctly when the device was deleted,
but then the actual EDU wouldn't get sent, as the device was no longer known.
Instead, we now send EDUs for deleted devices too if there's a poke for them.
This default config is parsed and used a base before the actual
config is overlaid, so with these values not commented out, the
code to detect when no turn params were set and refuse to generate
credentials was never firing because the dummy default was always set.
* Use more portable syntax using attrs package.
Newer syntax
attr.ib(factory=dict)
is just a syntactic sugar for
attr.ib(default=attr.Factory(dict))
It was introduced in newest version of attrs package (18.1.0)
and doesn't work with older versions.
We should either require minimum version of attrs to be 18.1.0,
or use older (slightly more verbose) syntax.
Requiring newest version is not a good solution because
Linux distributions may have older version of attrs (17.4.0 in Fedora 28),
and requiring to build (and package)
newer version just to use newer syntactic sugar in only one test
is just too much.
It's much better to fix that test to use older syntax.
Signed-off-by: Oleg Girko <ol@infoserver.lv>
Newer syntax
attr.ib(factory=dict)
is just a syntactic sugar for
attr.ib(default=attr.Factory(dict))
It was introduced in newest version of attrs package (18.1.0)
and doesn't work with older versions.
We should either require minimum version of attrs to be 18.1.0,
or use older (slightly more verbose) syntax.
Requiring newest version is not a good solution because
Linux distributions may have older version of attrs (17.4.0 in Fedora 28),
and requiring to build (and package)
newer version just to use newer syntactic sugar in only one test
is just too much.
It's much better to fix that test to use older syntax.
Signed-off-by: Oleg Girko <ol@infoserver.lv>
We need to do a bit more validation when we get a server name, but don't want
to be re-doing it all over the shop, so factor out a separate
parse_and_validate_server_name, and do the extra validation.
Also, use it to verify the server name in the config file.
a61738b removed a call to run_on_reactor from a unit test, but that call was
doing something useful, in making the function in question asynchronous.
Reinstate the call and add a check that we are testing what we wanted to be
testing.
Make sure that server_names used in auth headers are sane, and reject them with
a sensible error code, before they disappear off into the depths of the system.
otherwise we explode with:
```
Traceback (most recent call last):
File /usr/lib/python2.7/logging/handlers.py, line 78, in emit
logging.FileHandler.emit(self, record)
File /usr/lib/python2.7/logging/__init__.py, line 950, in emit
StreamHandler.emit(self, record)
File /usr/lib/python2.7/logging/__init__.py, line 887, in emit
self.handleError(record)
File /usr/lib/python2.7/logging/__init__.py, line 810, in handleError
None, sys.stderr)
File /usr/lib/python2.7/traceback.py, line 124, in print_exception
_print(file, 'Traceback (most recent call last):')
File /usr/lib/python2.7/traceback.py, line 13, in _print
file.write(str+terminator)
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_io.py, line 170, in write
self.log.emit(self.level, format=u{log_io}, log_io=line)
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_logger.py, line 144, in emit
self.observer(event)
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_observer.py, line 136, in __call__
errorLogger = self._errorLoggerForObserver(brokenObserver)
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_observer.py, line 156, in _errorLoggerForObserver
if obs is not observer
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_observer.py, line 81, in __init__
self.log = Logger(observer=self)
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_logger.py, line 64, in __init__
namespace = self._namespaceFromCallingContext()
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_logger.py, line 42, in _namespaceFromCallingContext
return currentframe(2).f_globals[__name__]
File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/python/compat.py, line 93, in currentframe
for x in range(n + 1):
RuntimeError: maximum recursion depth exceeded while calling a Python object
Logged from file site.py, line 129
File /usr/lib/python2.7/logging/__init__.py, line 859, in emit
msg = self.format(record)
File /usr/lib/python2.7/logging/__init__.py, line 732, in format
return fmt.format(record)
File /usr/lib/python2.7/logging/__init__.py, line 471, in format
record.message = record.getMessage()
File /usr/lib/python2.7/logging/__init__.py, line 335, in getMessage
msg = msg % self.args
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4: ordinal not in range(128)
Logged from file site.py, line 129
```
...where the logger apparently recurses whilst trying to log the error, hitting the
maximum recursion depth and killing everything badly.
Most rooms have a trivial history visibility like "shared" or
"world_readable", especially large rooms, so lets not bother getting the
full membership of those rooms in that case.
When _get_state_for_groups is given a wildcard filter, just do a complete
lookup. Hopefully this will give us the best of both worlds by not filling up
the ram if we only need one or two keys, but also making the cache still work
for the federation reader usecase.
When we finish processing a request, log the number of events we fetched from
the database to handle it.
[I'm trying to figure out which requests are responsible for large amounts of
event cache churn. It may turn out to be more helpful to add counts to the
prometheus per-request/block metrics, but that is an extension to this code
anyway.]
SECURITY UPDATE: Prevent unauthorised users from setting state events in a room
when there is no `m.room.power_levels` event in force in the room. (PR #3397)
Discussion around the Matrix Spec change proposal for this change can be
followed at https://github.com/matrix-org/matrix-doc/issues/1304.
This appears to have stopped working since matrix.org moved to cloudflare. The
Host header should match the name of the server, not whatever is in the SRV
record.
This is only used by filter_events_for_client, so we can simplify the whole
thing by just doing one user at a time, and removing a dead storage function to
boot.
It has been over a year since any code has been commited. Some of the relevant links in
the documentation are broken, but since no pull requests are being accepted, they
won't get fixed. We should probably remove this from the README.
Changes in synapse v0.31.1 (2018-06-08)
=======================================
v0.31.1 fixes a security bug in the ``get_missing_events`` federation API
where event visibility rules were not applied correctly.
We are not aware of it being actively exploited but please upgrade asap.
Bug Fixes:
* Fix event filtering in get_missing_events handler (PR #3371)
Firstly, don't swallow the reason for the failure
Secondly, don't assume all exceptions are verification failures
Thirdly, log a bit of info about the key being used if debug is enabled
These "temporary fixes" have been here three and a half years, and I can't find
any events in the matrix.org database where the calculated signature differs
from what's in the db. It's time for them to go away.
Changes in synapse v0.31.0 (2018-06-06)
======================================
Most notable change from v0.30.0 is to switch to python prometheus library to improve system
stats reporting. WARNING this changes a number of prometheus metrics in a
backwards-incompatible manner. For more details, see
`docs/metrics-howto.rst <docs/metrics-howto.rst#removal-of-deprecated-metrics--time-based-counters-becoming-histograms-in-0310>`_.
Bug Fixes:
* Fix metric documentation tables (PR #3341)
* Fix LaterGuage error handling (694968f)
* Fix replication metrics (b7e7fd2)
Changes in synapse v0.31.0-rc1 (2018-06-04)
==========================================
Features:
* Switch to the Python Prometheus library (PR #3256, #3274)
* Let users leave the server notice room after joining (PR #3287)
Changes:
* daily user type phone home stats (PR #3264)
* Use iter* methods for _filter_events_for_server (PR #3267)
* Docs on consent bits (PR #3268)
* Remove users from user directory on deactivate (PR #3277)
* Avoid sending consent notice to guest users (PR #3288)
* disable CPUMetrics if no /proc/self/stat (PR #3299)
* Add local and loopback IPv6 addresses to url_preview_ip_range_blacklist (PR #3312) Thanks to @thegcat!
* Consistently use six's iteritems and wrap lazy keys/values in list() if they're not meant to be lazy (PR #3307)
* Add private IPv6 addresses to example config for url preview blacklist (PR #3317) Thanks to @thegcat!
* Reduce stuck read-receipts: ignore depth when updating (PR #3318)
* Put python's logs into Trial when running unit tests (PR #3319)
Changes, python 3 migration:
* Replace some more comparisons with six (PR #3243) Thanks to @NotAFile!
* replace some iteritems with six (PR #3244) Thanks to @NotAFile!
* Add batch_iter to utils (PR #3245) Thanks to @NotAFile!
* use repr, not str (PR #3246) Thanks to @NotAFile!
* Misc Python3 fixes (PR #3247) Thanks to @NotAFile!
* Py3 storage/_base.py (PR #3278) Thanks to @NotAFile!
* more six iteritems (PR #3279) Thanks to @NotAFile!
* More Misc. py3 fixes (PR #3280) Thanks to @NotAFile!
* remaining isintance fixes (PR #3281) Thanks to @NotAFile!
* py3-ize state.py (PR #3283) Thanks to @NotAFile!
* extend tox testing for py3 to avoid regressions (PR #3302) Thanks to @krombel!
* use memoryview in py3 (PR #3303) Thanks to @NotAFile!
Bugs:
* Fix federation backfill bugs (PR #3261)
* federation: fix LaterGauge usage (PR #3328) Thanks to @intelfx!
This is unused. IT MUST DIE!!!1
̧̪͈̱̹̳͖͙H̵̰̤̰͕̖e̛ ͚͉̗̼̞w̶̩̥͉̮h̩̺̪̩͘ͅọ͎͉̟ ̜̩͔̦̘ͅW̪̫̩̣̲͔̳a͏͔̳͖i͖͜t͓̤̠͓͙s̘̰̩̥̙̝ͅ ̲̠̬̥Be̡̙̫̦h̰̩i̛̫͙͔̭̤̗̲n̳͞d̸ ͎̻͘T̛͇̝̲̹̠̗ͅh̫̦̝ͅe̩̫͟ ͓͖̼W͕̳͎͚̙̥ą̙l̘͚̺͔͞ͅl̳͍̙̤̤̮̳.̢
̟̺̜̙͉Z̤̲̙̙͎̥̝A͎̣͔̙͘L̥̻̗̳̻̳̳͢G͉̖̯͓̞̩̦O̹̹̺!̙͈͎̞̬ *
[**#matrix:matrix.org**](https://matrix.to/#/#matrix:matrix.org) is the official support room for Matrix, and can be accessed by any client from https://matrix.org/docs/projects/try-matrix-now.html
It can also be access via IRC bridge at irc://irc.freenode.net/matrix or on the web here: https://webchat.freenode.net/?channels=matrix
`submitting patches process <https://www.kernel.org/doc/html/latest/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin>`_, Docker
(https://github.com/docker/docker/blob/master/CONTRIBUTING.md), and many other
projects use: the DCO (Developer Certificate of Origin:
http://developercertificate.org/). This is a simple declaration that you wrote
@@ -105,18 +143,27 @@ the contribution or otherwise have the right to contribute it to Matrix::
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
If you agree to this for your contribution, then all that's needed is to
include the line in your commit or pull request comment::
Signed-off-by: Your Name <your@email.example.org>
...using your real name; unfortunately pseudonyms and anonymous contributions
can't be accepted. Git makes this trivial - just use the -s flag when you do
``git commit``, having first set ``user.name`` and ``user.email`` git configs
(which you should have done anyway :)
We accept contributions under a legally identifiable name, such as
your name on government documentation or common-law names (names
claimed by legitimate usage or repute). Unfortunately, we cannot
accept anonymous contributions at this time.
Git allows you to add this signoff automatically when using the ``-s``
flag to ``git commit``, which uses the name and email set in your
``user.name`` and ``user.email`` git configs.
Conclusion
~~~~~~~~~~
That's it! Matrix is a very open and collaborative project as you might expect given our obsession with open communication. If we're going to successfully matrix together all the fragmented communication technologies out there we are reliant on contributions and collaboration from the community to do so. So please get involved - and we hope you have as much fun hacking on Matrix as we do!
That's it! Matrix is a very open and collaborative project as you might expect
given our obsession with open communication. If we're going to successfully
matrix together all the fragmented communication technologies out there we are
reliant on contributions and collaboration from the community to do so. So
please get involved - and we hope you have as much fun hacking on Matrix as we
automatically for you for free** through `Let's Encrypt
<https://letsencrypt.org/>`_ if you tell it to.
Note: Synapse does not currently hot-renew Let's Encrypt certificates for
you, it only checks for certificates that need renewing on restart. This
functionality will be implemented promptly, but if in the meantime your
federation certificates expire, simply restarting Synapse should renew
them automatically.
In order for Synapse to complete the ACME challenge to provision a
certificate, it needs access to port 80. Typically listening on port 80 is
only granted to applications running as root. There are thus two solutions to
this problem.
**Using a reverse proxy**
A reverse proxy such as Apache or Nginx allows a single process (the web
server) to listen on port 80 and redirect traffic to the appropriate program
running on your server.
**Authbind**
``authbind`` allows a program which does not or should not run as root to
bind to low-numbered ports in a controlled way. The setup is simpler, but
requires a webserver not to already be running on port 80. **This includes
every time Synapse renews a certificate**, which may be cumbersome if you
usually run a web server on port 80. Nevertheless, if that isn't a concern,
follow the instructions below.
Install ``authbind``. This can be done on Ubuntu/Debian with::
sudo apt-get install authbind
**Add authbind to the systemd script**
**TODO: This right?** If you would like to use your own
certificates, specifying them in Synapse's config file is sufficient.
**TODO: Fit this in**
These keys will allow your Home Server to identify itself to other Home
Servers, so don't lose or delete them. It would be wise to back them up
somewhere safe. (If, for whatever reason, you do need to change your Home
Server's keys, you may find that other Home Servers have the old key cached.
If you update the signing key, you should change the name of the key in the
``<server name>.signing.key`` file (the second word) to something different.
See `the spec`__ for more information on key management.)
**TODO: Does this still work?** This Synapse installation can then be later
upgraded by using pip again with the update flag::
source ~/synapse/env/bin/activate
pip install -U matrix-synapse[all]
In case of problems, please see the _`Troubleshooting` section below.
There is an offical synapse image available at https://hub.docker.com/r/matrixdotorg/synapse/tags/ which can be used with the docker-compose file available at `contrib/docker`. Further information on this including configuration options is available in `contrib/docker/README.md`.
We have now created a "matrix" user with its own home directory that stores
Synapse's data and configuration files, backed by a postgres database, all
packaged into a isolated python virtual environment.
Alternatively, Andreas Peters (previously Silvio Fricke) has contributed a Dockerfile to automate a synapse server in a single Docker image, at https://hub.docker.com/r/avhost/docker-matrix/tags/
Also, Martin Giess has created an auto-deployment process with vagrant/ansible,
tested with VirtualBox/AWS/DigitalOcean - see https://github.com/EMnify/matrix-synapse-auto-deploy
for details.
Configuring synapse
Configuring Synapse
-------------------
Before you can start Synapse, you will need to generate a configuration
file. To do this, run (in your virtualenv, as before)::
cd ~/.synapse
python -m synapse.app.homeserver \
--server-name my.domain.name \
--config-path homeserver.yaml \
--generate-config \
--report-stats=[yes|no]
... substituting an appropriate value for ``--server-name``. The server name
determines the "domain" part of user-ids for users on your server: these will
Before starting Synapse, inspect the ``cfg/homeserver.yaml`` file. ``server_name``
determines the "domain" part of user-ids for users on your server, which will
all be of the format ``@user:my.domain.name``. It also determines how other
matrix servers will reach yours for `Federation`_. For a test configuration,
set this to the hostname of your server. For a more production-ready setup, you
@@ -187,16 +309,7 @@ will probably want to specify your domain (``example.com``) rather than a
matrix-specific hostname here (in the same way that your email address is
probably ``user@example.com`` rather than ``user@email.example.com``) - but
doing so may require more advanced setup - see `Setting up
Federation`_. Beware that the server name cannot be changed later.
This command will generate you a config file that you can then customise, but it will
also generate a set of keys for you. These keys will allow your Home Server to
identify itself to other Home Servers, so don't lose or delete them. It would be
wise to back them up somewhere safe. (If, for whatever reason, you do need to
change your Home Server's keys, you may find that other Home Servers have the
old key cached. If you update the signing key, you should change the name of the
key in the ``<server name>.signing.key`` file (the second word) to something
different. See `the spec`__ for more information on key management.)
Federation`_. **Be aware that the server name cannot be changed later.**
..__:`key_management`_
@@ -204,10 +317,9 @@ The default configuration exposes two HTTP ports: 8008 and 8448. Port 8008 is
configured without TLS; it should be behind a reverse proxy for TLS/SSL
termination on port 443 which in turn should be used for clients. Port 8448
is configured to use TLS with a self-signed certificate. If you would like
to do initial test with a client without having to setup a reverse proxy,
you can temporarly use another certificate. (Note that a self-signed
certificate is fine for `Federation`_). You can do so by changing
``tls_certificate_path``, ``tls_private_key_path`` and ``tls_dh_params_path``
to do an initial test with a client without having to setup a reverse proxy,
you can temporarly use another certificate. You can do so by changing
``tls_certificate_path`` and ``tls_private_key_path``
in ``homeserver.yaml``; alternatively, you can use a reverse-proxy, but be sure
to read `Using a reverse proxy with Synapse`_ when doing so.
@@ -225,7 +337,7 @@ commandline script.
To get started, it is easiest to use the command line to register new users::
Slavi Pantaleev has created an Ansible playbook, which installs the offical
Docker image of Matrix Synapse along with many other Matrix-related services
(Postgres database, riot-web, coturn, mxisd, SSL support, etc.). For more
details, see https://github.com/spantaleev/matrix-docker-ansible-deploy
Troubleshooting
@@ -482,7 +573,7 @@ Troubleshooting
Troubleshooting Installation
----------------------------
Synapse requires pip 1.7 or later, so if your OS provides too old a version you
Synapse requires pip 8 or later, so if your OS provides too old a version you
may need to manually upgrade it::
sudo pip install --upgrade pip
@@ -492,7 +583,7 @@ You can fix this by manually upgrading pip and virtualenv::
sudo pip install --upgrade virtualenv
You can next rerun ``virtualenv -p python2.7 synapse`` to update the virtual env.
You can next rerun ``virtualenv -p python3 synapse`` to update the virtual env.
Installing may fail during installing virtualenv with ``InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.``
You can fix this by manually installing ndg-httpsclient::
@@ -517,30 +608,8 @@ failing, e.g.::
pip install twisted
On OS X, if you encounter clang: error: unknown argument: '-mno-fused-madd' you
will need to export CFLAGS=-Qunused-arguments.
Troubleshooting Running
-----------------------
If synapse fails with ``missing "sodium.h"`` crypto errors, you may need
to manually upgrade PyNaCL, as synapse uses NaCl (http://nacl.cr.yp.to/) for
encryption and digital signatures.
Unfortunately PyNACL currently has a few issues
(https://github.com/pyca/pynacl/issues/53) and
(https://github.com/pyca/pynacl/issues/79) that mean it may not install
correctly, causing all tests to fail with errors about missing "sodium.h". To
The `matrixdotorg/synapse` Docker image will run Synapse as a single process. It does not provide a
database server or a TURN server, you should run these separately.
If you run a Postgres server, you should simply include it in the same Compose
project or set the proper environment variables and the image will automatically
use that server.
## Build
Build the docker image with the `docker build` command from the root of the synapse repository.
```
docker build -t docker.io/matrixdotorg/synapse .
```
The `-t` option sets the image tag. Official images are tagged `matrixdotorg/synapse:<version>` where `<version>` is the same as the release tag in the synapse git repository.
You may have a local Python wheel cache available, in which case copy the relevant packages in the ``cache/`` directory at the root of the project.
## Run
This image is designed to run either with an automatically generated configuration
file or with a custom configuration that requires manual edition.
### Automated configuration
It is recommended that you use Docker Compose to run your containers, including
@@ -60,94 +36,6 @@ Then, customize your configuration and run the server:
docker-compose up -d
```
### Without Compose
### More information
If you do not wish to use Compose, you may still run this image using plain
Docker commands. Note that the following is just a guideline and you may need
to add parameters to the docker run command to account for the network situation
with your postgres database.
```
docker run \
-d \
--name synapse \
-v ${DATA_PATH}:/data \
-e SYNAPSE_SERVER_NAME=my.matrix.host \
-e SYNAPSE_REPORT_STATS=yes \
docker.io/matrixdotorg/synapse:latest
```
## Volumes
The image expects a single volume, located at ``/data``, that will hold:
* temporary files during uploads;
* uploaded media and thumbnails;
* the SQLite database if you do not configure postgres;
* the appservices configuration.
You are free to use separate volumes depending on storage endpoints at your
disposal. For instance, ``/data/media`` coud be stored on a large but low
performance hdd storage while other files could be stored on high performance
endpoints.
In order to setup an application service, simply create an ``appservices``
directory in the data volume and write the application service Yaml
configuration file there. Multiple application services are supported.
## Environment
Unless you specify a custom path for the configuration file, a very generic
file will be generated, based on the following environment settings.
These are a good starting point for setting up your own deployment.
Global settings:
* ``UID``, the user id Synapse will run as [default 991]
* ``GID``, the group id Synapse will run as [default 991]
* ``SYNAPSE_CONFIG_PATH``, path to a custom config file
If ``SYNAPSE_CONFIG_PATH`` is set, you should generate a configuration file
then customize it manually. No other environment variable is required.
Otherwise, a dynamic configuration file will be used. The following environment
variables are available for configuration:
* ``SYNAPSE_SERVER_NAME`` (mandatory), the current server public hostname.
* ``SYNAPSE_REPORT_STATS``, (mandatory, ``yes`` or ``no``), enable anonymous
statistics reporting back to the Matrix project which helps us to get funding.
* ``SYNAPSE_NO_TLS``, set this variable to disable TLS in Synapse (use this if
you run your own TLS-capable reverse proxy).
* ``SYNAPSE_ENABLE_REGISTRATION``, set this variable to enable registration on
the Synapse instance.
* ``SYNAPSE_ALLOW_GUEST``, set this variable to allow guest joining this server.
* ``SYNAPSE_EVENT_CACHE_SIZE``, the event cache size [default `10K`].
* ``SYNAPSE_CACHE_FACTOR``, the cache factor [default `0.5`].
* ``SYNAPSE_RECAPTCHA_PUBLIC_KEY``, set this variable to the recaptcha public
key in order to enable recaptcha upon registration.
* ``SYNAPSE_RECAPTCHA_PRIVATE_KEY``, set this variable to the recaptcha private
key in order to enable recaptcha upon registration.
* ``SYNAPSE_TURN_URIS``, set this variable to the coma-separated list of TURN
uris to enable TURN for this homeserver.
* ``SYNAPSE_TURN_SECRET``, set this to the TURN shared secret if required.
Shared secrets, that will be initialized to random values if not set:
* ``SYNAPSE_REGISTRATION_SHARED_SECRET``, secret for registrering users if
registration is disable.
* ``SYNAPSE_MACAROON_SECRET_KEY`` secret for signing access tokens
to the server.
Database specific values (will use SQLite if not set):
* `POSTGRES_DB` - The database name for the synapse postgres database. [default: `synapse`]
* `POSTGRES_HOST` - The host of the postgres database if you wish to use postgresql instead of sqlite3. [default: `db` which is useful when using a container on the same docker network in a compose file where the postgres service is called `db`]
* `POSTGRES_PASSWORD` - The password for the synapse postgres database. **If this is set then postgres will be used instead of sqlite3.** [default: none] **NOTE**: You are highly encouraged to use postgresql! Please use the compose file to make it easier to deploy.
* `POSTGRES_USER` - The user for the synapse postgres database. [default: `matrix`]
Mail server specific values (will not send emails if not set):
* ``SYNAPSE_SMTP_HOST``, hostname to the mail server.
* ``SYNAPSE_SMTP_PORT``, TCP port for accessing the mail server [default ``25``].
* ``SYNAPSE_SMTP_USER``, username for authenticating against the mail server if any.
* ``SYNAPSE_SMTP_PASSWORD``, password for authenticating against the mail server if any.
For more information on required environment variables and mounts, see the main docker documentation at [/docker/README.md](../../docker/README.md)
0. Set up Prometheus and Grafana. Out of scope for this readme. Useful documentation about using Grafana with Prometheus: http://docs.grafana.org/features/datasources/prometheus/
1. Have your Prometheus scrape your Synapse. https://github.com/matrix-org/synapse/blob/master/docs/metrics-howto.rst
2. Import dashboard into Grafana. Download `synapse.json`. Import it to Grafana and select the correct Prometheus datasource. http://docs.grafana.org/reference/export_import/
# prune all messages that are older than 1000 messages ago:
# LAST_MESSAGES=1000
# SQL_GET_EVENT="SELECT event_id from events WHERE type='m.room.message' AND room_id ='$ROOM' ORDER BY received_ts DESC LIMIT 1 offset $(($LAST_MESSAGES - 1))"
# ALTERNATIVELY:
# select the EVENT_ID manually:
#EVENT_ID='$1471814088343495zpPNI:matrix.org' # an example event from 21st of Aug 2016 by Matthew
POSTDATA='{"delete_local_events":"true"}'# this will really delete local events, so the messages in the room really disappear unless they are restored by remote federation
sql "SELECT * FROM room_aliases WHERE room_id='$ROOM'"
echo"get event..."
# for postgres:
EVENT_ID=$(sql "SELECT event_id FROM events WHERE type='m.room.message' AND received_ts<'$UNIX_TIMESTAMP' AND room_id='$ROOM' ORDER BY received_ts DESC LIMIT 1;")
if["$EVENT_ID"==""];then
echo"no event $TIME"
else
echo"event: $EVENT_ID"
SLEEP=2
set -x
# call purge
OUT=$(curl --header "$AUTH" -s -d $POSTDATA POST "$API_URL/admin/purge_history/$ROOM/$EVENT_ID")
\fBhash_password\fR calculates the hash of a supplied password using bcrypt\.
.
.P
\fBhash_password\fR takes a password as an parameter either on the command line or the \fBSTDIN\fR if not supplied\.
.
.P
It accepts an YAML file which can be used to specify parameters like the number of rounds for bcrypt and password_config section having the pepper value used for the hashing\. By default \fBbcrypt_rounds\fR is set to \fB10\fR\.
.
.P
The hashed password is written on the \fBSTDOUT\fR\.
.
.SH"FILES"
A sample YAML file accepted by \fBhash_password\fR is described below:
Read the password form the command line if [password] is supplied\. If not, prompt the user and read the password form the \fBSTDIN\fR\. It is not recommended to type the password on the command line directly\. Use the STDIN instead\.
.
.TP
\fB\-c\fR, \fB\-\-config\fR
Read the supplied YAML \fIfile\fR containing the options \fBbcrypt_rounds\fR and the \fBpassword_config\fR section containing the \fBpepper\fR value\.
\fBregister_new_matrix_user\fR\- Used to register new users with a given home server when registration has been disabled
.
.SH"SYNOPSIS"
\fBregister_new_matrix_user\fR options\.\.\.
.
.SH"DESCRIPTION"
\fBregister_new_matrix_user\fR registers new users with a given home server when registration has been disabled\. For this to work, the home server must be configured with the \'registration_shared_secret\' option set\.
.
.P
This accepts the user credentials like the username, password, is user an admin or not and registers the user onto the homeserver database\. Also, a YAML file containing the shared secret can be provided\. If not, the shared secret can be provided via the command line\.
.
.P
By default it assumes the home server URL to be \fBhttps://localhost:8448\fR\. This can be changed via the \fBserver_url\fR command line option\.
.
.SH"FILES"
A sample YAML file accepted by \fBregister_new_matrix_user\fR is described below:
.
.IP""4
.
.nf
registration_shared_secret: "s3cr3t"
.
.fi
.
.IP""0
.
.SH"OPTIONS"
.
.TP
\fB\-u\fR, \fB\-\-user\fR
Local part of the new user\. Will prompt if omitted\.
.
.TP
\fB\-p\fR, \fB\-\-password\fR
New password for user\. Will prompt if omitted\. Supplying the password on the command line is not recommended\. Use the STDIN instead\.
.
.TP
\fB\-a\fR, \fB\-\-admin\fR
Register new user as an admin\. Will prompt if omitted\.
.
.TP
\fB\-c\fR, \fB\-\-config\fR
Path to server config file containing the shared secret\.
.
.TP
\fB\-k\fR, \fB\-\-shared\-secret\fR
Shared secret as defined in server config file\. This is an optional parameter as it can be also supplied via the YAML file\.
.
.TP
\fBserver_url\fR
URL of the home server\. Defaults to \'https://localhost:8448\'\.
\fBsynapse_port_db\fR ports an existing synapse SQLite database to a new PostgreSQL database\.
.
.P
SQLite database is specified with \fB\-\-sqlite\-database\fR option and PostgreSQL configuration required to connect to PostgreSQL database is provided using \fB\-\-postgres\-config\fR configuration\. The configuration is specified in YAML format\.
.
.SH"OPTIONS"
.
.TP
\fB\-v\fR
Print log messages in \fBdebug\fR level instead of \fBinfo\fR level\.
.
.TP
\fB\-\-sqlite\-database\fR
The snapshot of the SQLite database file\. This must not be currently used by a running synapse server\.
.
.TP
\fB\-\-postgres\-config\fR
The database config file for the PostgreSQL database\.
.
.TP
\fB\-\-curses\fR
Display a curses based progress UI\.
.
.SH"CONFIG FILE"
The postgres configuration file must be a valid YAML file with the following options\.
.
.IP"\(bu"4
\fBdatabase\fR: Database configuration section\. This section header can be ignored and the options below may be specified as top level keys\.
.
.IP"\(bu"4
\fBname\fR: Connector to use when connecting to the database\. This value must be \fBpsycopg2\fR\.
.
.IP"\(bu"4
\fBargs\fR: DB API 2\.0 compatible arguments to send to the \fBpsycopg2\fR module\.
.
.IP"\(bu"4
\fBdbname\fR\- the database name
.
.IP"\(bu"4
\fBuser\fR\- user name used to authenticate
.
.IP"\(bu"4
\fBpassword\fR\- password used to authenticate
.
.IP"\(bu"4
\fBhost\fR\- database host address (defaults to UNIX socket if not provided)
.
.IP"\(bu"4
\fBport\fR\- connection port number (defaults to 5432 if not provided)
.
.IP""0
.
.IP"\(bu"4
\fBsynchronous_commit\fR: Optional\. Default is True\. If the value is \fBFalse\fR, enable asynchronous commit and don\'t wait for the server to call fsync before ending the transaction\. See: https://www\.postgresql\.org/docs/current/static/wal\-async\-commit\.html
.
.IP""0
.
.IP""0
.
.P
Following example illustrates the configuration file format\.
.
.IP""4
.
.nf
database:
name: psycopg2
args:
dbname: synapsedb
user: synapseuser
password: ORohmi9Eet=ohphi
host: localhost
synchronous_commit: false
.
.fi
.
.IP""0
.
.SH"COPYRIGHT"
This man page was written by Sunil Mohan Adapa <\fIsunil@medhas\.org\fR> for Debian GNU/Linux distribution\.
\fBsynctl\fR can be used to start, stop or restart Synapse server\. The control operation can be done on all processes or a single worker process\.
.
.SH"OPTIONS"
.
.TP
\fBaction\fR
The value of action should be one of \fBstart\fR, \fBstop\fR or \fBrestart\fR\.
.
.TP
\fBconfigfile\fR
Optional path of the configuration file to use\. Default value is \fBhomeserver\.yaml\fR\. The configuration file must exist for the operation to succeed\.
.
.TP
\fB\-w\fR, \fB\-\-worker\fR:
.
.IP
Perform start, stop or restart operations on a single worker\. Incompatible with \fB\-a\fR|\fB\-\-all\-processes\fR\. Value passed must be a valid worker\'s configuration file\.
.
.TP
\fB\-a\fR, \fB\-\-all\-processes\fR:
.
.IP
Perform start, stop or restart operations on all the workers in the given directory and the main synapse process\. Incompatible with \fB\-w\fR|\fB\-\-worker\fR\. Value passed must be a directory containing valid work configuration files\. All files ending with \fB\.yaml\fR extension shall be considered as configuration files and all other files in the directory are ignored\.
Synapse\'s architecture is quite RAM hungry currently \- a lot of recent room data and metadata is deliberately cached in RAM in order to speed up common requests\. This will be improved in future, but for now the easiest way to either reduce the RAM usage (at the risk of slowing things down) is to set the SYNAPSE_CACHE_FACTOR environment variable\. Roughly speaking, a SYNAPSE_CACHE_FACTOR of 1\.0 will max out at around 3\-4GB of resident memory \- this is what we currently run the matrix\.org on\. The default setting is currently 0\.1, which is probably around a ~700MB footprint\. You can dial it down further to 0\.02 if desired, which targets roughly ~512MB\. Conversely you can dial it up if you need performance for lots of users and have a box with a lot of RAM\.
.
.SH"COPYRIGHT"
This man page was written by Sunil Mohan Adapa <\fIsunil@medhas\.org\fR> for Debian GNU/Linux distribution\.
This Docker image will run Synapse as a single process. It does not provide a database
server or a TURN server, you should run these separately.
## Run
We do not currently offer a `latest` image, as this has somewhat undefined semantics.
We instead release only tagged versions so upgrading between releases is entirely
within your control.
### Using docker-compose (easier)
This image is designed to run either with an automatically generated configuration
file or with a custom configuration that requires manual editing.
An easy way to make use of this image is via docker-compose. See the
[contrib/docker](../contrib/docker)
section of the synapse project for examples.
### Without Compose (harder)
If you do not wish to use Compose, you may still run this image using plain
Docker commands. Note that the following is just a guideline and you may need
to add parameters to the docker run command to account for the network situation
with your postgres database.
```
docker run \
-d \
--name synapse \
-v ${DATA_PATH}:/data \
-e SYNAPSE_SERVER_NAME=my.matrix.host \
-e SYNAPSE_REPORT_STATS=yes \
docker.io/matrixdotorg/synapse:latest
```
## Volumes
The image expects a single volume, located at ``/data``, that will hold:
* temporary files during uploads;
* uploaded media and thumbnails;
* the SQLite database if you do not configure postgres;
* the appservices configuration.
You are free to use separate volumes depending on storage endpoints at your
disposal. For instance, ``/data/media`` coud be stored on a large but low
performance hdd storage while other files could be stored on high performance
endpoints.
In order to setup an application service, simply create an ``appservices``
directory in the data volume and write the application service Yaml
configuration file there. Multiple application services are supported.
## Environment
Unless you specify a custom path for the configuration file, a very generic
file will be generated, based on the following environment settings.
These are a good starting point for setting up your own deployment.
Global settings:
* ``UID``, the user id Synapse will run as [default 991]
* ``GID``, the group id Synapse will run as [default 991]
* ``SYNAPSE_CONFIG_PATH``, path to a custom config file
If ``SYNAPSE_CONFIG_PATH`` is set, you should generate a configuration file
then customize it manually. No other environment variable is required.
Otherwise, a dynamic configuration file will be used. The following environment
variables are available for configuration:
* ``SYNAPSE_SERVER_NAME`` (mandatory), the current server public hostname.
* ``SYNAPSE_REPORT_STATS``, (mandatory, ``yes`` or ``no``), enable anonymous
statistics reporting back to the Matrix project which helps us to get funding.
* ``SYNAPSE_NO_TLS``, set this variable to disable TLS in Synapse (use this if
you run your own TLS-capable reverse proxy).
* ``SYNAPSE_ENABLE_REGISTRATION``, set this variable to enable registration on
the Synapse instance.
* ``SYNAPSE_ALLOW_GUEST``, set this variable to allow guest joining this server.
* ``SYNAPSE_EVENT_CACHE_SIZE``, the event cache size [default `10K`].
* ``SYNAPSE_CACHE_FACTOR``, the cache factor [default `0.5`].
* ``SYNAPSE_RECAPTCHA_PUBLIC_KEY``, set this variable to the recaptcha public
key in order to enable recaptcha upon registration.
* ``SYNAPSE_RECAPTCHA_PRIVATE_KEY``, set this variable to the recaptcha private
key in order to enable recaptcha upon registration.
* ``SYNAPSE_TURN_URIS``, set this variable to the coma-separated list of TURN
uris to enable TURN for this homeserver.
* ``SYNAPSE_TURN_SECRET``, set this to the TURN shared secret if required.
* ``SYNAPSE_MAX_UPLOAD_SIZE``, set this variable to change the max upload size [default `10M`].
Shared secrets, that will be initialized to random values if not set:
* ``SYNAPSE_REGISTRATION_SHARED_SECRET``, secret for registrering users if
registration is disable.
* ``SYNAPSE_MACAROON_SECRET_KEY`` secret for signing access tokens
to the server.
Database specific values (will use SQLite if not set):
* `POSTGRES_DB` - The database name for the synapse postgres database. [default: `synapse`]
* `POSTGRES_HOST` - The host of the postgres database if you wish to use postgresql instead of sqlite3. [default: `db` which is useful when using a container on the same docker network in a compose file where the postgres service is called `db`]
* `POSTGRES_PASSWORD` - The password for the synapse postgres database. **If this is set then postgres will be used instead of sqlite3.** [default: none] **NOTE**: You are highly encouraged to use postgresql! Please use the compose file to make it easier to deploy.
* `POSTGRES_USER` - The user for the synapse postgres database. [default: `matrix`]
Mail server specific values (will not send emails if not set):
* ``SYNAPSE_SMTP_HOST``, hostname to the mail server.
* ``SYNAPSE_SMTP_PORT``, TCP port for accessing the mail server [default ``25``].
* ``SYNAPSE_SMTP_USER``, username for authenticating against the mail server if any.
* ``SYNAPSE_SMTP_PASSWORD``, password for authenticating against the mail server if any.
## Build
Build the docker image with the `docker build` command from the root of the synapse repository.
The `-t` option sets the image tag. Official images are tagged `matrixdotorg/synapse:<version>` where `<version>` is the same as the release tag in the synapse git repository.
You may have a local Python wheel cache available, in which case copy the relevant
packages in the ``cache/`` directory at the root of the project.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.