Compare commits


176 Commits

Author SHA1 Message Date
Erik Johnston
98e085513e Better terms 2018-06-04 15:34:45 +01:00
Erik Johnston
01f0c0e821 Postgres fast update 2018-06-01 17:25:07 +01:00
Erik Johnston
2908104ed6 Speed things up a bit 2018-06-01 17:13:37 +01:00
Erik Johnston
2a9d3b8a19 fixu[p 2018-06-01 15:14:56 +01:00
Erik Johnston
f4eb10533e fixup 2018-06-01 15:05:07 +01:00
Erik Johnston
1205362f1d use farey function 2018-06-01 15:01:13 +01:00
Erik Johnston
ca11acf388 foo 2018-06-01 11:06:50 +01:00
Erik Johnston
a09d5e45b0 fixu 2018-05-31 19:23:04 +01:00
Erik Johnston
397e4c1d3d fixup 2018-05-31 19:20:50 +01:00
Erik Johnston
c5da3f697e fixup 2018-05-31 19:17:17 +01:00
Erik Johnston
9d885e578a fixup 2018-05-31 19:16:25 +01:00
Erik Johnston
8384b1f3aa Fixup 2018-05-31 19:08:31 +01:00
Erik Johnston
1e666c7b72 Use fractions 2018-05-31 18:58:30 +01:00
Erik Johnston
676064f2da schema 2018-05-31 10:00:41 +01:00
Erik Johnston
f651f850a4 blah 2018-05-30 11:27:39 +01:00
Erik Johnston
1a64c21301 Make to_s_neighbours set 2018-05-25 14:46:50 +01:00
Erik Johnston
66450785d8 Ignore topological for receipts 2018-05-25 13:54:25 +01:00
Erik Johnston
3e927f85df Check is_state 2018-05-25 12:36:51 +01:00
Erik Johnston
e717693d77 fix insert 2018-05-25 12:05:47 +01:00
Erik Johnston
34240a8d18 Implement background update for chunks 2018-05-25 11:42:52 +01:00
Erik Johnston
25ae0bf3ab Set a chunk ID for all forward extremities 2018-05-25 11:42:35 +01:00
Erik Johnston
f9f6a6e0c1 Support pagination for tokens without chunk part 2018-05-25 10:56:42 +01:00
Erik Johnston
70639b07ec Fix purge history 2018-05-25 10:56:42 +01:00
Erik Johnston
516207a966 Fix non integer limit 2018-05-25 10:56:42 +01:00
Erik Johnston
7e09f57a88 Fix backfill 2018-05-25 10:56:40 +01:00
Erik Johnston
1fc988b43c Fix clamp leave and disable backfill 2018-05-25 10:55:10 +01:00
Erik Johnston
f24c3bf0be Implement pagination using chunks 2018-05-25 10:55:10 +01:00
Erik Johnston
b0beffa99e Use calculated topological ordering when persisting events 2018-05-25 10:55:10 +01:00
Erik Johnston
b65cc7defa Add chunk ID to pagination token 2018-05-25 10:55:10 +01:00
Erik Johnston
13dbcafb9b Compute new chunks for new events
We also calculate a consistent topological ordering within a chunk, but
it isn't used yet.
2018-05-25 10:54:23 +01:00
Erik Johnston
bcc9e7f777 Merge branch 'develop' of github.com:matrix-org/synapse into erikj/room_chunks 2018-05-25 10:53:43 +01:00
Amber Brown
9c36c150e7 Merge pull request #3283 from NotAFile/py3-state
py3-ize state.py
2018-05-24 14:23:33 -05:00
Amber Brown
cc1349c06a Merge pull request #3279 from NotAFile/py3-more-iteritems
more six iteritems
2018-05-24 14:23:13 -05:00
Amber Brown
5b788aba90 Merge pull request #3280 from NotAFile/py3-more-misc
More Misc. py3 fixes
2018-05-24 14:22:59 -05:00
Adrian Tschira
0e61705661 py3-ize state.py 2018-05-24 20:59:00 +02:00
Adrian Tschira
17a70cf6e9 Misc. py3 fixes
Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-05-24 20:20:33 +02:00
Adrian Tschira
6c16a4ec1b more iteritems 2018-05-24 20:19:06 +02:00
Amber Brown
7ea07c7305 Merge pull request #3278 from NotAFile/py3-storage-base
Py3 storage/_base.py
2018-05-24 13:08:09 -05:00
Amber Brown
1f69693347 Merge pull request #3244 from NotAFile/py3-six-4
replace some iteritems with six
2018-05-24 13:04:07 -05:00
Amber Brown
c4fb15a06c Merge pull request #3246 from NotAFile/py3-repr-string
use repr, not str
2018-05-24 13:00:20 -05:00
Amber Brown
36501068d8 Merge pull request #3247 from NotAFile/py3-misc
Misc Python3 fixes
2018-05-24 12:58:37 -05:00
Amber Brown
2aff6eab6d Merge pull request #3245 from NotAFile/batch-iter
Add batch_iter to utils
2018-05-24 12:54:12 -05:00
Adrian Tschira
095292304f Py3 storage/_base.py
Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-05-24 18:24:12 +02:00
David Baker
ecc4b88bd1 Merge pull request #3277 from matrix-org/dbkr/remove_from_user_dir
Remove users from user directory on deactivate
2018-05-24 16:12:12 +01:00
Erik Johnston
46345187cc Merge pull request #3243 from NotAFile/py3-six-3
Replace some more comparisons with six
2018-05-24 16:08:57 +01:00
Neil Johnson
037c6db85d Merge branch 'master' into develop 2018-05-24 16:03:44 +01:00
David Baker
7a1af504d7 Remove users from user directory on deactivate 2018-05-24 15:59:58 +01:00
Neil Johnson
14ca678674 Update CHANGES.rst 2018-05-24 15:54:02 +01:00
Neil Johnson
6f67163c63 Update CHANGES.rst 2018-05-24 15:04:41 +01:00
Neil Johnson
bdd2ed5acf update for v0.30.0 2018-05-24 15:02:03 +01:00
Erik Johnston
f72d5a44d5 Merge pull request #3261 from matrix-org/erikj/pagination_fixes
Fix federation backfill bugs
2018-05-24 14:52:03 +01:00
Erik Johnston
68399fc4de Merge pull request #3267 from matrix-org/erikj/iter_filter
Use iter* methods for _filter_events_for_server
2018-05-24 14:44:57 +01:00
Neil Johnson
91d95a1d8e bump version 2018-05-24 14:05:07 +01:00
Richard van der Hoff
8c98281b8d Merge branch 'release-v0.30.0' into develop 2018-05-24 10:33:12 +01:00
Richard van der Hoff
6abcb5d22d Merge pull request #3273 from matrix-org/rav/server_notices_avatar_url
Allow overriding the server_notices user's avatar
2018-05-24 10:31:43 +01:00
Richard van der Hoff
9bf4b2bda3 Allow overriding the server_notices user's avatar
probably should have done this in the first place, like @turt2live suggested.
2018-05-23 17:43:30 +01:00
Richard van der Hoff
23aa70cea8 Merge pull request #3272 from matrix-org/rav/localpart_in_consent_uri
Use the localpart in the consent uri
2018-05-23 16:29:44 +01:00
Richard van der Hoff
043f05a078 Merge docs on consent bits from PR #3268 into release branch 2018-05-23 16:24:58 +01:00
Richard van der Hoff
96f07cebda Merge pull request #3268 from matrix-org/rav/privacy_policy_docs
Docs on consent bits
2018-05-23 16:23:05 +01:00
Richard van der Hoff
a0b3946fe2 Merge branch 'release-v0.30.0' into rav/localpart_in_consent_uri 2018-05-23 16:06:03 +01:00
Richard van der Hoff
2f7008d4eb Merge pull request #3271 from matrix-org/rav/consent_uri_in_messages
Support for putting %(consent_uri)s in messages
2018-05-23 16:04:30 +01:00
Richard van der Hoff
e206b2c9ac consent_tracking.md: clarify link 2018-05-23 15:57:10 +01:00
Richard van der Hoff
2df8c3139a minor post-review tweaks 2018-05-23 15:39:52 +01:00
Richard van der Hoff
dda40fb55d fix typo 2018-05-23 15:30:26 +01:00
Richard van der Hoff
3ff6f50eac Use the localpart in the consent uri
... because it's shorter.
2018-05-23 15:28:23 +01:00
Richard van der Hoff
82191b08f6 Support for putting %(consent_uri)s in messages
Make it possible to put the URI in the error message and the server notice that
get sent by the server
2018-05-23 15:24:31 +01:00
Richard van der Hoff
2c62ea2515 Merge pull request #3270 from matrix-org/rav/block_remote_server_notices
Block attempts to send server notices to remote users
2018-05-23 15:06:35 +01:00
Richard van der Hoff
cd8ab9a0d8 mention public_baseurl 2018-05-23 14:43:09 +01:00
Richard van der Hoff
321f02d263 Block attempts to send server notices to remote users 2018-05-23 14:30:47 +01:00
Richard van der Hoff
1cbb8e5a33 fix wrapping 2018-05-23 13:58:28 +01:00
Richard van der Hoff
052d08a6a5 Using the manhole to send server notices 2018-05-23 13:55:39 +01:00
Richard van der Hoff
5ad1149f38 Notes on the manhole 2018-05-23 13:47:34 +01:00
Richard van der Hoff
563606b8f2 consent_tracking: formatting etc 2018-05-23 12:37:39 +01:00
Richard van der Hoff
2574ea3dc8 server_notices.md: fix link 2018-05-23 12:34:34 +01:00
Richard van der Hoff
833db2d922 consent tracking docs 2018-05-23 12:32:38 +01:00
Neil Johnson
9e8ab0a4f4 style 2018-05-23 12:05:58 +01:00
Neil Johnson
3601a240aa bump version and changelog 2018-05-23 12:03:23 +01:00
Richard van der Hoff
e7598b666b Some docs about server notices 2018-05-23 11:14:23 +01:00
Erik Johnston
6e11803ed3 Merge branch 'develop' of github.com:matrix-org/synapse into erikj/room_chunks 2018-05-23 10:54:14 +01:00
Erik Johnston
5aaa3189d5 s/values/itervalues/ 2018-05-23 10:13:05 +01:00
Erik Johnston
0a4bca4134 Use iter* methods for _filter_events_for_server 2018-05-23 10:04:23 +01:00
Erik Johnston
e85b5a0ff7 Use iter* methods 2018-05-22 19:02:48 +01:00
Erik Johnston
586b66b197 Fix that states is a dict of dicts 2018-05-22 19:02:36 +01:00
Erik Johnston
35ca3e7b65 Merge pull request #3265 from matrix-org/erikj/limit_pagination
Don't support limitless pagination
2018-05-22 18:29:36 +01:00
Erik Johnston
a17e901f4d Remove unused string formatting param 2018-05-22 18:24:32 +01:00
Erik Johnston
5494c1d71e Don't support limitless pagination
The pagination storage function supported not specifiying a limit on the
number of events returned. This was triggered when using the search or
context API with a limit of zero, which the storage function took to
mean not being limited.
2018-05-22 18:15:21 +01:00
hera
ad2823ee27 fix synchrotron 2018-05-22 17:47:42 +01:00
Richard van der Hoff
08bfc48abf custom error code for not leaving server notices room 2018-05-22 17:27:27 +01:00
David Baker
0a078026ea comment typo 2018-05-22 17:14:06 +01:00
Erik Johnston
cb2a2ad791 get_domains_from_state returns list of tuples 2018-05-22 16:23:39 +01:00
Richard van der Hoff
08a14b32ae Merge pull request #3262 from matrix-org/rav/has_already_consented
Add a 'has_consented' template var to consent forms
2018-05-22 15:35:10 +01:00
Richard van der Hoff
82c2a52987 Merge pull request #3263 from matrix-org/rav/fix_jinja_dep
Fix dependency on jinja2
2018-05-22 15:31:08 +01:00
Richard van der Hoff
7b36d06a69 Add a 'has_consented' template var to consent forms
fixes #3260
2018-05-22 14:58:34 +01:00
Richard van der Hoff
669400e22f Enable auto-escaping for the consent templates
... to reduce the risk of somebody introducing an html injection attack...
2018-05-22 14:58:34 +01:00
Richard van der Hoff
b5b2d5d64b Fix dependency on jinja2
Delay the import of ConsentResource, so that we can get away without jinja2 if
people don't have the consent resource enabled.

Fixes #3259
2018-05-22 14:03:45 +01:00
Richard van der Hoff
3b2def6c7a Merge pull request #3257 from matrix-org/rav/fonx_on_no_consent
Reject attempts to send event before privacy consent is given
2018-05-22 12:01:43 +01:00
Richard van der Hoff
a5e2941aad Reject attempts to send event before privacy consent is given
Returns an M_CONSENT_NOT_GIVEN error (cf
https://github.com/matrix-org/matrix-doc/issues/1252) if consent is not yet
given.
2018-05-22 12:00:47 +01:00
Richard van der Hoff
8aeb529262 Merge pull request #3236 from matrix-org/rav/consent_notice
Send users a server notice about consent
2018-05-22 11:56:09 +01:00
Richard van der Hoff
8810685df9 Stub out ServerNoticesSender on the workers
... and have the sync endpoints call it directly rather than obsure indirection
via PresenceHandler
2018-05-22 11:54:51 +01:00
Richard van der Hoff
d5dca9a04f Move consent config parsing into ConsentConfig
turns out we need to reuse this, so it's better in the config class.
2018-05-22 11:54:51 +01:00
Richard van der Hoff
9ea219c514 Send users a server notice about consent
When a user first syncs, we will send them a server notice asking them to
consent to the privacy policy if they have not already done so.
2018-05-22 11:54:51 +01:00
Richard van der Hoff
d14d7b8fdc Rename 'version' param on user consent config
we're going to use it for the version we require too.
2018-05-22 11:54:51 +01:00
Erik Johnston
7cfa8a87a1 Merge pull request #3258 from matrix-org/erikj/fixup_logcontext_rusage
Fix logcontext resource usage tracking
2018-05-22 11:39:56 +01:00
Erik Johnston
7948ecf234 Comment 2018-05-22 11:39:43 +01:00
Erik Johnston
020377a550 Fix logcontext resource usage tracking 2018-05-22 11:16:07 +01:00
Erik Johnston
13a8dfba0d Merge pull request #3252 from matrix-org/erikj/in_flight_requests
Add in flight request metrics
2018-05-22 09:54:30 +01:00
Erik Johnston
c435b0b441 Don't store context 2018-05-22 09:34:26 +01:00
Erik Johnston
fb2806b186 Move in_flight_requests_count to be a callback metric 2018-05-22 09:31:53 +01:00
Richard van der Hoff
413482f578 Merge pull request #3255 from matrix-org/rav/fix_transactions
Stop the transaction cache caching failures
2018-05-21 17:58:08 +01:00
Neil Johnson
4aac88928f Merge pull request #3251 from matrix-org/cohort_analytics
Tighter filtering for user_daily_visits
2018-05-21 16:42:18 +00:00
Richard van der Hoff
6e1cb54a05 Fix logcontext leak in HttpTransactionCache
ONE DAY I WILL PURGE THE WORLD OF THIS EVIL
2018-05-21 16:58:20 +01:00
Richard van der Hoff
6d6e7288fe Stop the transaction cache caching failures
The transaction cache has some code which tries to stop it caching failures,
but if the callback function failed straight away, then things would happen
backwards and we'd end up with the failure stuck in the cache.
2018-05-21 16:49:59 +01:00
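A minimal sketch of the ordering this fix implies: cache the in-flight Deferred first, then attach the eviction errback, so that a callback which fails straight away still removes its own entry. The real cache wraps results in an ObservableDeferred; this sketch ignores that detail, and the names are illustrative.

```python
def get_or_fetch(cache, key, fetch_fn):
    """Cache an in-flight Deferred, but evict it on failure so a
    failure never stays stuck in the cache."""
    if key in cache:
        return cache[key]

    d = fetch_fn(key)
    cache[key] = d  # cache first, *then* attach the eviction errback ...

    def evict(failure):
        # ... so that a Deferred which failed synchronously is still evicted
        cache.pop(key, None)
        return failure  # pass the failure on to the caller

    d.addErrback(evict)
    return d
```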
Michael Kaye
d689e0dba1 Merge pull request #3239 from ptman/develop
Add lxml to docker image for web previews
2018-05-21 16:31:18 +01:00
Erik Johnston
dfa70adc33 Add in flight request metrics
This tracks CPU and DB usage while requests are in flight, rather than
when we write the response.
2018-05-21 16:23:06 +01:00
Adrian Tschira
933bf2dd35 replace some iteritems with six
Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-05-19 17:59:26 +02:00
Adrian Tschira
d9fe2b2d9d Replace some more comparisons with six
plus a bonus b"" string I missed last time

Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-05-19 17:56:31 +02:00
Adrian Tschira
45b55e23d3 Add batch_iter to utils
There's a frequent idiom I noticed where an iterable is split up into a
number of chunks/batches. Unfortunately that method does not work with
iterators like dict.keys() in python3. This implementation works with
iterators.

Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-05-19 17:48:30 +02:00
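A minimal sketch of the idiom this commit describes (not necessarily the exact helper added to `synapse.util`): it accepts any iterable, including one-shot iterators such as `dict.keys()` on Python 3, and yields fixed-size batches.

```python
from itertools import islice

def batch_iter(iterable, size):
    """Yield tuples of up to `size` items from `iterable`."""
    it = iter(iterable)  # works for lists and one-shot iterators alike
    while True:
        batch = tuple(islice(it, size))
        if not batch:
            return
        yield batch

# e.g. list(batch_iter(range(7), 3)) == [(0, 1, 2), (3, 4, 5), (6,)]
```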
Adrian Tschira
dcc235b47d use stand-in value if maxint is not available
Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-05-19 17:35:44 +02:00
Adrian Tschira
73cbdef5f7 fix py3 intern and remove unnecessary py3 encode
Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-05-19 17:35:31 +02:00
Adrian Tschira
aafb0f6b0d py3-ize url preview 2018-05-19 17:35:20 +02:00
Adrian Tschira
b932b4ea25 use repr, not str
Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-05-19 17:28:42 +02:00
Neil Johnson
644aac5f73 Tighter filtering for user_daily_visits 2018-05-18 17:10:35 +01:00
Neil Johnson
08462620bf Merge pull request #3241 from matrix-org/fix_user_visits_insertion
fix psql compatability bug
2018-05-18 15:01:01 +00:00
Neil Johnson
ef466b3a13 fix psql compatability bug 2018-05-18 15:51:21 +01:00
Paul Tötterman
861f8a9b21 Add lxml to docker image for web previews
Signed-off-by: Paul Tötterman <paul.totterman@iki.fi>
2018-05-18 16:20:08 +03:00
Neil Johnson
2725223f08 Merge branch 'master' into develop 2018-05-18 14:07:50 +01:00
Neil Johnson
ab5e888927 Merge tag 'v0.29.1'
Changes in synapse v0.29.1 (2018-05-17)
==========================================
Changes:

* Update docker documentation (PR #3222)

Changes in synapse v0.29.0 (2018-05-16)
===========================================
No changes since v0.29.0-rc1

Changes in synapse v0.29.0-rc1 (2018-05-14)
===========================================

Notable changes, a docker file for running Synapse (Thanks to @kaiyou!) and a
closed spec bug in the Client Server API. Additionally further prep for Python 3
migration.

Potentially breaking change:

* Make Client-Server API return 401 for invalid token (PR #3161).

  This changes the Client-server spec to return a 401 error code instead of 403
  when the access token is unrecognised. This is the behaviour required by the
  specification, but some clients may be relying on the old, incorrect
  behaviour.

  Thanks to @NotAFile for fixing this.

Features:

* Add a Dockerfile for synapse (PR #2846) Thanks to @kaiyou!

Changes - General:

* nuke-room-from-db.sh: added postgresql option and help (PR #2337) Thanks to @rubo77!
* Part user from rooms on account deactivate (PR #3201)
* Make 'unexpected logging context' into warnings (PR #3007)
* Set Server header in SynapseRequest (PR #3208)
* remove duplicates from groups tables (PR #3129)
* Improve exception handling for background processes (PR #3138)
* Add missing consumeErrors to improve exception handling (PR #3139)
* reraise exceptions more carefully (PR #3142)
* Remove redundant call to preserve_fn (PR #3143)
* Trap exceptions thrown within run_in_background (PR #3144)

Changes - Refactors:

* Refactor /context to reuse pagination storage functions (PR #3193)
* Refactor recent events func to use pagination func (PR #3195)
* Refactor pagination DB API to return concrete type (PR #3196)
* Refactor get_recent_events_for_room return type (PR #3198)
* Refactor sync APIs to reuse pagination API (PR #3199)
* Remove unused code path from member change DB func (PR #3200)
* Refactor request handling wrappers (PR #3203)
* transaction_id, destination defined twice (PR #3209) Thanks to @damir-manapov!
* Refactor event storage to prepare for changes in state calculations (PR #3141)
* Set Server header in SynapseRequest (PR #3208)
* Use deferred.addTimeout instead of time_bound_deferred (PR #3127, #3178)
* Use run_in_background in preference to preserve_fn (PR #3140)

Changes - Python 3 migration:

* Construct HMAC as bytes on py3 (PR #3156) Thanks to @NotAFile!
* run config tests on py3 (PR #3159) Thanks to @NotAFile!
* Open certificate files as bytes (PR #3084) Thanks to @NotAFile!
* Open config file in non-bytes mode (PR #3085) Thanks to @NotAFile!
* Make event properties raise AttributeError instead (PR #3102) Thanks to @NotAFile!
* Use six.moves.urlparse (PR #3108) Thanks to @NotAFile!
* Add py3 tests to tox with folders that work (PR #3145) Thanks to @NotAFile!
* Don't yield in list comprehensions (PR #3150) Thanks to @NotAFile!
* Move more xrange to six (PR #3151) Thanks to @NotAFile!
* make imports local (PR #3152) Thanks to @NotAFile!
* move httplib import to six (PR #3153) Thanks to @NotAFile!
* Replace stringIO imports with six (PR #3154, #3168) Thanks to @NotAFile!
* more bytes strings (PR #3155) Thanks to @NotAFile!

Bug Fixes:

* synapse fails to start under Twisted >= 18.4 (PR #3157)
* Fix a class of logcontext leaks (PR #3170)
* Fix a couple of logcontext leaks in unit tests (PR #3172)
* Fix logcontext leak in media repo (PR #3174)
* Escape label values in prometheus metrics (PR #3175, #3186)
* Fix 'Unhandled Error' logs with Twisted 18.4 (PR #3182) Thanks to @Half-Shot!
* Fix logcontext leaks in rate limiter (PR #3183)
* notifications: Convert next_token to string according to the spec (PR #3190) Thanks to @mujx!
* nuke-room-from-db.sh: fix deletion from search table (PR #3194) Thanks to @rubo77!
* add guard for None on purge_history api (PR #3160) Thanks to @krombel!
2018-05-18 14:06:34 +01:00
Erik Johnston
0a325e5385 Merge pull request #3226 from matrix-org/erikj/chunk_base
Begin adding implementing room chunks
2018-05-18 13:54:34 +01:00
Erik Johnston
b725e128f8 Comments 2018-05-18 13:43:01 +01:00
Richard van der Hoff
6d9dc67139 Merge pull request #3232 from matrix-org/rav/server_notices_room
Infrastructure for a server notices room
2018-05-18 11:28:04 +01:00
Richard van der Hoff
ed3125b0a1 Merge pull request #3235 from matrix-org/rav/fix_receipts_deferred
Fix error in handling receipts
2018-05-18 11:23:11 +01:00
Richard van der Hoff
67af392712 Merge pull request #3233 from matrix-org/rav/remove_dead_code
Remove unused `update_external_syncs`
2018-05-18 11:22:43 +01:00
Richard van der Hoff
011e1f4010 Better docstrings 2018-05-18 11:22:12 +01:00
Richard van der Hoff
26305788fe Make sure we reject attempts to invite the notices user 2018-05-18 11:18:39 +01:00
Richard van der Hoff
d10707c810 Replace inline docstrings with "Attributes" in class docstring 2018-05-18 11:00:55 +01:00
Erik Johnston
fa30ac38cc Merge pull request #3221 from matrix-org/erikj/purge_token
Make purge_history operate on tokens
2018-05-18 10:35:23 +01:00
Richard van der Hoff
8b1c856d81 Fix error in handling receipts
Fixes an error which has been happening ever since #2158 (v0.21.0-rc1):

> TypeError: argument of type 'ObservableDeferred' is not iterable

fixes #3234
2018-05-18 09:15:35 +01:00
Richard van der Hoff
88d3405332 fix missing yield for server_notices_room 2018-05-17 18:33:45 +01:00
Richard van der Hoff
d43d480d86 Remove unused update_external_syncs
This method isn't used anywhere. Burninate it.
2018-05-17 18:22:19 +01:00
Richard van der Hoff
fed62e21ad Infrastructure for a server notices room
Server Notices use a special room which the user can't dismiss. They are
created on demand when some other bit of the code calls send_notice.

(This doesn't actually do much yet becuse we don't call send_notice anywhere)
2018-05-17 17:58:25 +01:00
Richard van der Hoff
f8a1e76d64 Merge pull request #3225 from matrix-org/rav/move_creation_handler
Move RoomCreationHandler out of synapse.handlers.Handlers
2018-05-17 17:56:42 +01:00
Erik Johnston
0504d809fd More comments 2018-05-17 17:08:36 +01:00
Erik Johnston
12fd6d7688 Document case of unconnected chunks 2018-05-17 16:07:20 +01:00
Erik Johnston
a638649254 Make insert_* functions internal and reorder funcs
This makes it clearer what the public interface is vs what subclasses
need to implement.
2018-05-17 15:10:23 +01:00
Erik Johnston
d4e4a7344f Increase range of rebalance interval
This both simplifies the code, and ensures that the target node is
roughly in the center of the range rather than at an end.
2018-05-17 15:09:31 +01:00
Erik Johnston
c771c124d5 Improve documentation and comments 2018-05-17 15:09:10 +01:00
Erik Johnston
3369354b56 Add note about index in changelog 2018-05-17 14:00:54 +01:00
Erik Johnston
3b505a80dc Merge branch 'develop' of github.com:matrix-org/synapse into erikj/chunk_base 2018-05-17 14:00:41 +01:00
Erik Johnston
943f1029d6 Begin adding implementing room chunks
This commit adds the necessary tables and columns, as well as an
implementation of an online topological sorting algorithm to maintain an
absolute ordering of the room chunks.
2018-05-17 12:05:22 +01:00
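Later commits on this branch ("Use fractions", "use farey function") suggest the ordering is kept as rational positions: a new chunk can be placed between two neighbours by taking the Farey mediant of their positions, which always lies strictly between them, so nothing else needs renumbering. A sketch of that trick only, not Synapse's actual implementation:

```python
from fractions import Fraction

def mediant(lo, hi):
    """The Farey mediant of a/b and c/d is (a+c)/(b+d); for lo < hi it
    lies strictly between the two, so it can order a chunk between any
    pair of neighbours without touching the rest."""
    return Fraction(lo.numerator + hi.numerator,
                    lo.denominator + hi.denominator)

# e.g. mediant(Fraction(1, 2), Fraction(2, 3)) == Fraction(3, 5)
```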
Erik Johnston
f7906203f6 Merge pull request #3212 from matrix-org/erikj/epa_stream
Use stream rather depth ordering for push actions
2018-05-17 12:01:21 +01:00
Richard van der Hoff
ae53c71d90 Merge pull request #1756 from rubo77/patch-4
Add instructions how to setup the postgres user and clarify the final step
2018-05-17 10:52:54 +01:00
rubo77
616da9eb1d postgres.rst: Add instructions how to setup the postgres user and clarify the final step 2018-05-17 11:48:56 +02:00
Richard van der Hoff
c46367d0d7 Move RoomCreationHandler out of synapse.handlers.Handlers
Handlers is deprecated nowadays, so let's move this out before I add a new
dependency on it.

Also fix the docstrings on create_room.
2018-05-17 09:08:42 +01:00
Neil Johnson
85b8acdeb4 Merge branch 'master' into develop 2018-05-16 16:48:10 +01:00
Erik Johnston
680530cc7f Clarify comment 2018-05-16 11:47:29 +01:00
Erik Johnston
43e6e82c4d Comments 2018-05-16 11:13:31 +01:00
Neil Johnson
dc8930ea9e Merge pull request #3163 from matrix-org/cohort_analytics
user visit data
2018-05-16 10:09:24 +00:00
Erik Johnston
c945af8799 Move and rename variable 2018-05-16 10:52:06 +01:00
Neil Johnson
be11a02c4f remove empty line 2018-05-16 10:45:40 +01:00
Neil Johnson
a2204cc9cc remove unused method recurring_user_daily_visit_stats 2018-05-16 09:47:20 +01:00
Neil Johnson
31c2502ca8 style and further contraining query 2018-05-16 09:46:43 +01:00
Richard van der Hoff
8030a825c8 Merge pull request #3213 from matrix-org/rav/consent_handler
ConsentResource to gather policy consent from users
2018-05-16 07:19:18 +01:00
Neil Johnson
c92a8aa578 pep8 2018-05-15 17:31:11 +01:00
Neil Johnson
05ac15ae82 Limit query load of generate_user_daily_visits
The aim is to keep track of when it was last called and only query from that point in time
2018-05-15 17:01:33 +01:00
Erik Johnston
5f27ed75ad Make purge_history operate on tokens
As we're soon going to change how topological_ordering works
2018-05-15 16:23:50 +01:00
Erik Johnston
37dbee6490 Use events_to_purge table rather than token 2018-05-15 16:23:47 +01:00
Richard van der Hoff
47815edcfa ConsentResource to gather policy consent from users
Hopefully there are enough comments and docs in this that it makes sense on its
own.
2018-05-15 15:11:59 +01:00
Neil Johnson
f077e97914 instead of inserting user daily visit data at the end of the day, instead insert incrementally through the day 2018-05-14 13:50:58 +01:00
Neil Johnson
977765bde2 Merge branch 'develop' of https://github.com/matrix-org/synapse into cohort_analytics 2018-05-14 09:31:42 +01:00
Erik Johnston
6406b70aeb Use stream rather depth ordering for push actions
This simplifies things as it is, but will also allow us to change the
way we traverse topologically without having to update the way push
actions work.
2018-05-11 15:30:11 +01:00
Neil Johnson
dd1a832419 remove user agent from data model, will just join on user_ips 2018-05-01 12:13:49 +01:00
Neil Johnson
d0857702e8 add inidexes based on usage 2018-05-01 12:12:57 +01:00
Neil Johnson
5917562b60 10 mins seems more reasonable that every minute 2018-05-01 12:12:22 +01:00
Neil Johnson
fb6015d0a6 pep8 2018-04-25 17:56:11 +01:00
Neil Johnson
617bf40924 Generate user daily stats 2018-04-25 17:37:29 +01:00
Neil Johnson
48c01ae851 ignore atom editor python ide files 2018-04-25 11:02:28 +01:00
93 changed files with 4382 additions and 486 deletions

.gitignore

@@ -49,3 +49,4 @@ env/
*.config
.vscode/
.ropeproject/

CHANGES.rst

@@ -1,3 +1,66 @@
Changes in <unreleased>
=======================

This release adds an index to the events table. This means that on first
startup there will be an increased amount of IO until the index is created, and
an increase in disk usage.
Changes in synapse v0.30.0 (2018-05-24)
==========================================
'Server Notices' are a new feature introduced in Synapse 0.30. They provide a
channel whereby server administrators can send messages to users on the server.
They are used as part of communication of the server policies (see ``docs/consent_tracking.md``),
however the intention is that they may also find a use for features such
as "Message of the day".
This feature is specific to Synapse, but uses standard Matrix communication mechanisms,
so should work with any Matrix client. For more details see ``docs/server_notices.md``
Further Server Notices/Consent Tracking Support:
* Allow overriding the server_notices user's avatar (PR #3273)
* Use the localpart in the consent uri (PR #3272)
* Support for putting %(consent_uri)s in messages (PR #3271)
* Block attempts to send server notices to remote users (PR #3270)
* Docs on consent bits (PR #3268)
Changes in synapse v0.30.0-rc1 (2018-05-23)
==========================================
Server Notices/Consent Tracking Support:
* ConsentResource to gather policy consent from users (PR #3213)
* Move RoomCreationHandler out of synapse.handlers.Handlers (PR #3225)
* Infrastructure for a server notices room (PR #3232)
* Send users a server notice about consent (PR #3236)
* Reject attempts to send event before privacy consent is given (PR #3257)
* Add a 'has_consented' template var to consent forms (PR #3262)
* Fix dependency on jinja2 (PR #3263)
Features:
* Cohort analytics (PR #3163, #3241, #3251)
* Add lxml to docker image for web previews (PR #3239) Thanks to @ptman!
* Add in flight request metrics (PR #3252)
Changes:
* Remove unused `update_external_syncs` (PR #3233)
* Use stream rather depth ordering for push actions (PR #3212)
* Make purge_history operate on tokens (PR #3221)
* Don't support limitless pagination (PR #3265)
Bug Fixes:
* Fix logcontext resource usage tracking (PR #3258)
* Fix error in handling receipts (PR #3235)
* Stop the transaction cache caching failures (PR #3255)
Changes in synapse v0.29.1 (2018-05-17)
==========================================
Changes:

Dockerfile

@@ -1,12 +1,12 @@
FROM docker.io/python:2-alpine3.7
RUN apk add --no-cache --virtual .nacl_deps su-exec build-base libffi-dev zlib-dev libressl-dev libjpeg-turbo-dev linux-headers postgresql-dev
RUN apk add --no-cache --virtual .nacl_deps su-exec build-base libffi-dev zlib-dev libressl-dev libjpeg-turbo-dev linux-headers postgresql-dev libxslt-dev
COPY . /synapse
# A wheel cache may be provided in ./cache for faster build
RUN cd /synapse \
&& pip install --upgrade pip setuptools psycopg2 \
&& pip install --upgrade pip setuptools psycopg2 lxml \
&& mkdir -p /synapse/cache \
&& pip install -f /synapse/cache --upgrade --process-dependency-links . \
&& mv /synapse/contrib/docker/start.py /synapse/contrib/docker/conf / \

docs/consent_tracking.md (new file)

@@ -0,0 +1,160 @@
Support in Synapse for tracking agreement to server terms and conditions
========================================================================
Synapse 0.30 introduces support for tracking whether users have agreed to the
terms and conditions set by the administrator of a server - and blocking access
to the server until they have.
There are several parts to this functionality; each requires some specific
configuration in `homeserver.yaml` to be enabled.
Note that various parts of the configuration and this document refer to the
"privacy policy": agreement with a privacy policy is one particular use of this
feature, but of course administrators can specify other terms and conditions
unrelated to "privacy" per se.
Collecting policy agreement from a user
---------------------------------------
Synapse can be configured to serve the user a simple policy form with an
"accept" button. Clicking "Accept" records the user's acceptance in the
database and shows a success page.
To enable this, first create templates for the policy and success pages.
These should be stored on the local filesystem.
These templates use the [Jinja2](http://jinja.pocoo.org) templating language,
and [docs/privacy_policy_templates](privacy_policy_templates) gives
examples of the sort of thing that can be done.
Note that the templates must be stored under a name giving the language of the
template - currently this must always be `en` (for "English");
internationalisation support is intended for the future.
The template for the policy itself should be versioned and named according to
the version: for example `1.0.html`. The version of the policy which the user
has agreed to is stored in the database.
Once the templates are in place, make the following changes to `homeserver.yaml`:
1. Add a `user_consent` section, which should look like:

   ```yaml
   user_consent:
     template_dir: privacy_policy_templates
     version: 1.0
   ```

   `template_dir` points to the directory containing the policy
   templates. `version` defines the version of the policy which will be served
   to the user. In the example above, Synapse will serve
   `privacy_policy_templates/en/1.0.html`.

2. Add a `form_secret` setting at the top level:

   ```yaml
   form_secret: "<unique secret>"
   ```

   This should be set to an arbitrary secret string (try `pwgen -y 30` to
   generate suitable secrets).

   More on what this is used for below.

3. Add `consent` wherever the `client` resource is currently enabled in the
   `listeners` configuration. For example:

   ```yaml
   listeners:
     - port: 8008
       resources:
         - names:
           - client
           - consent
   ```
Finally, ensure that `jinja2` is installed. If you are using a virtualenv, this
should be a matter of `pip install Jinja2`. On Debian, try `apt-get install
python-jinja2`.
Once this is complete, and the server has been restarted, try visiting
`https://<server>/_matrix/consent`. If correctly configured, this should give
an error "Missing string query parameter 'u'". It is now possible to manually
construct URIs where users can give their consent.
### Constructing the consent URI
It may be useful to manually construct the "consent URI" for a given user - for
instance, in order to send them an email asking them to consent. To do this,
take the base `https://<server>/_matrix/consent` URL and add the following
query parameters:
* `u`: the user id of the user. This can either be a full MXID
(`@user:server.com`) or just the localpart (`user`).
* `h`: hex-encoded HMAC-SHA256 of `u` using the `form_secret` as a key. It is
  possible to calculate this on the commandline with something like:

  ```bash
  echo -n '<user>' | openssl sha256 -hmac '<form_secret>'
  ```
This should result in a URI which looks something like:
`https://<server>/_matrix/consent?u=<user>&h=68a152465a4d...`.
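The same digest can be computed in Python; a sketch mirroring the `openssl` command above (the `form_secret` and `user` values are placeholders for your own config and target user):

```python
import hmac
from hashlib import sha256

form_secret = "<form_secret>"  # the form_secret from homeserver.yaml
user = "@user:server.com"      # or just the localpart, e.g. "user"

digest = hmac.new(
    key=form_secret.encode("utf-8"),
    msg=user.encode("utf-8"),
    digestmod=sha256,
).hexdigest()
print(digest)  # hex string to use as the `h` query parameter
```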
Sending users a server notice asking them to agree to the policy
----------------------------------------------------------------
It is possible to configure Synapse to send a [server
notice](server_notices.md) to anybody who has not yet agreed to the current
version of the policy. To do so:
* ensure that the consent resource is configured, as in the previous section
* ensure that server notices are configured, as in [server_notices.md](server_notices.md).
* Add `server_notice_content` under `user_consent` in `homeserver.yaml`. For
  example:

  ```yaml
  user_consent:
    server_notice_content:
      msgtype: m.text
      body: >-
        Please give your consent to the privacy policy at %(consent_uri)s.
  ```
  Synapse automatically replaces the placeholder `%(consent_uri)s` with the
  consent uri for that user.

* ensure that `public_baseurl` is set in `homeserver.yaml`, and gives the base
  URI that clients use to connect to the server. (It is used to construct
  `consent_uri` in the server notice.)
Blocking users from using the server until they agree to the policy
-------------------------------------------------------------------
Synapse can be configured to block any attempts to join rooms or send messages
until the user has given their agreement to the policy. (Joining the server
notices room is exempted from this).
To enable this, add `block_events_error` under `user_consent`. For example:

```yaml
user_consent:
  block_events_error: >-
    You can't send any messages until you consent to the privacy policy at
    %(consent_uri)s.
```
Synapse automatically replaces the placeholder `%(consent_uri)s` with the
consent uri for that user.

Ensure that `public_baseurl` is set in `homeserver.yaml`, and gives the base
URI that clients use to connect to the server. (It is used to construct
`consent_uri` in the error.)
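The `%(consent_uri)s` placeholder uses Python's named `%`-formatting, so conceptually the substitution is just the following (the URI shown is illustrative):

```python
template = (
    "You can't send any messages until you consent to the privacy "
    "policy at %(consent_uri)s."
)
message = template % {
    "consent_uri": "https://server/_matrix/consent?u=user&h=68a152...",
}
```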

docs/manhole.md (new file)

@@ -0,0 +1,43 @@
Using the synapse manhole
=========================
The "manhole" allows server administrators to access a Python shell on a running
Synapse installation. This is a very powerful mechanism for administration and
debugging.
To enable it, first uncomment the `manhole` listener configuration in
`homeserver.yaml`:
```yaml
listeners:
  - port: 9000
    bind_addresses: ['::1', '127.0.0.1']
    type: manhole
```
(`bind_addresses` in the above is important: it ensures that access to the
manhole is only possible for local users).
Note that this will give administrative access to synapse to **all users** with
shell access to the server. It should therefore **not** be enabled in
environments where untrusted users have shell access.
Then restart synapse, and point an ssh client at port 9000 on localhost, using
the username `matrix`:
```bash
ssh -p9000 matrix@localhost
```
The password is `rabbithole`.
This gives a Python REPL in which `hs` gives access to the
`synapse.server.HomeServer` object - which in turn gives access to many other
parts of the process.
As a simple example, retrieving an event from the database:
```
>>> hs.get_datastore().get_event('$1416420717069yeQaw:matrix.org')
<Deferred at 0x7ff253fc6998 current result: <FrozenEvent event_id='$1416420717069yeQaw:matrix.org', type='m.room.create', state_key=''>>
```

docs/postgres.rst

@@ -6,7 +6,13 @@ Postgres version 9.4 or later is known to work.
Set up database
===============
The PostgreSQL database used *must* have the correct encoding set, otherwise
Assuming your PostgreSQL database user is called ``postgres``, create a user
``synapse_user`` with::

    su - postgres
    createuser --pwprompt synapse_user

The PostgreSQL database used *must* have the correct encoding set, otherwise it
would not be able to store UTF8 strings. To create a database with the correct
encoding use, e.g.::
@@ -46,8 +52,8 @@ As with Debian/Ubuntu, postgres support depends on the postgres python connector
Synapse config
==============
When you are ready to start using PostgreSQL, add the following line to your
config file::
When you are ready to start using PostgreSQL, edit the ``database`` section in
your config file to match the following lines::
  database:
    name: psycopg2
@@ -96,9 +102,12 @@ complete, restart synapse. For instance::
    cp homeserver.db homeserver.db.snapshot
    ./synctl start
Assuming your new config file (as described in the section *Synapse config*)
is named ``homeserver-postgres.yaml`` and the SQLite snapshot is at
``homeserver.db.snapshot`` then simply run::
Copy the old config file into a new config file::

    cp homeserver.yaml homeserver-postgres.yaml

Edit the database section as described in the section *Synapse config* above
and with the SQLite snapshot located at ``homeserver.db.snapshot`` simply run::

    synapse_port_db --sqlite-database homeserver.db.snapshot \
        --postgres-config homeserver-postgres.yaml
@@ -117,6 +126,11 @@ run::
        --postgres-config homeserver-postgres.yaml
Once that has completed, change the synapse config to point at the PostgreSQL
database configuration file ``homeserver-postgres.yaml`` (i.e. rename it to
``homeserver.yaml``) and restart synapse. Synapse should now be running against
PostgreSQL.
database configuration file ``homeserver-postgres.yaml``::

    ./synctl stop
    mv homeserver.yaml homeserver-old-sqlite.yaml
    mv homeserver-postgres.yaml homeserver.yaml
    ./synctl start

Synapse should now be running against PostgreSQL.

docs/privacy_policy_templates/en/1.0.html (new file)

@@ -0,0 +1,23 @@
<!doctype html>
<html lang="en">
<head>
  <title>Matrix.org Privacy policy</title>
</head>
<body>
  {% if has_consented %}
    <p>
      Your base already belong to us.
    </p>
  {% else %}
    <p>
      All your base are belong to us.
    </p>
    <form method="post" action="consent">
      <input type="hidden" name="v" value="{{version}}"/>
      <input type="hidden" name="u" value="{{user}}"/>
      <input type="hidden" name="h" value="{{userhmac}}"/>
      <input type="submit" value="Sure thing!"/>
    </form>
  {% endif %}
</body>
</html>

docs/privacy_policy_templates/en/success.html (new file)

@@ -0,0 +1,11 @@
<!doctype html>
<html lang="en">
<head>
  <title>Matrix.org Privacy policy</title>
</head>
<body>
  <p>
    Sweet.
  </p>
</body>
</html>

docs/server_notices.md (new file)

@@ -0,0 +1,71 @@
Server Notices
==============
'Server Notices' are a new feature introduced in Synapse 0.30. They provide a
channel whereby server administrators can send messages to users on the server.
They are used as part of communication of the server policies (see
[consent_tracking.md](consent_tracking.md)), however the intention is that
they may also find a use for features such as "Message of the day".
This is a feature specific to Synapse, but it uses standard Matrix
communication mechanisms, so should work with any Matrix client.
User experience
---------------
When the user is first sent a server notice, they will get an invitation to a
room (typically called 'Server Notices', though this is configurable in
`homeserver.yaml`). They will be **unable to reject** this invitation -
attempts to do so will receive an error.
Once they accept the invitation, they will see the notice message in the room
history; it will appear to have come from the 'server notices user' (see
below).
The user is prevented from sending any messages in this room by the power
levels. They also cannot leave it.
Synapse configuration
---------------------
Server notices come from a specific user id on the server. Server
administrators are free to choose the user id - something like `server` is
suggested, meaning the notices will come from
`@server:<your_server_name>`. Once the Server Notices user is configured, that
user id becomes a special, privileged user, so administrators should ensure
that **it is not already allocated**.
In order to support server notices, it is necessary to add some configuration
to the `homeserver.yaml` file. In particular, you should add a `server_notices`
section, which should look like this:
```yaml
server_notices:
  system_mxid_localpart: server
  system_mxid_display_name: "Server Notices"
  system_mxid_avatar_url: "mxc://server.com/oumMVlgDnLYFaPVkExemNVVZ"
  room_name: "Server Notices"
```
The only compulsory setting is `system_mxid_localpart`, which defines the user
id of the Server Notices user, as above. `room_name` defines the name of the
room which will be created.
`system_mxid_display_name` and `system_mxid_avatar_url` can be used to set the
displayname and avatar of the Server Notices user.
Sending notices
---------------
As of the current version of synapse, there is no convenient interface for
sending notices (other than the automated ones sent as part of consent
tracking).
In the meantime, it is possible to test this feature using the manhole. Having
gone into the manhole as described in [manhole.md](manhole.md), a notice can be
sent with something like:
```
>>> hs.get_server_notices_manager().send_notice('@user:server.com', {'msgtype':'m.text', 'body':'foo'})
```

synapse/__init__.py

@@ -16,4 +16,4 @@
""" This is a reference implementation of a Matrix home server.
"""
__version__ = "0.29.1"
__version__ = "0.30.0"

synapse/api/errors.py

@@ -19,6 +19,7 @@ import logging
import simplejson as json
from six import iteritems
from six.moves import http_client
logger = logging.getLogger(__name__)
@@ -51,6 +52,8 @@ class Codes(object):
    THREEPID_DENIED = "M_THREEPID_DENIED"
    INVALID_USERNAME = "M_INVALID_USERNAME"
    SERVER_NOT_TRUSTED = "M_SERVER_NOT_TRUSTED"
    CONSENT_NOT_GIVEN = "M_CONSENT_NOT_GIVEN"
    CANNOT_LEAVE_SERVER_NOTICE_ROOM = "M_CANNOT_LEAVE_SERVER_NOTICE_ROOM"
class CodeMessageException(RuntimeError):
@@ -138,6 +141,32 @@ class SynapseError(CodeMessageException):
        return res
class ConsentNotGivenError(SynapseError):
    """The error returned to the client when the user has not consented to the
    privacy policy.
    """
    def __init__(self, msg, consent_uri):
        """Constructs a ConsentNotGivenError

        Args:
            msg (str): The human-readable error message
            consent_uri (str): The URL where the user can give their consent
        """
        super(ConsentNotGivenError, self).__init__(
            code=http_client.FORBIDDEN,
            msg=msg,
            errcode=Codes.CONSENT_NOT_GIVEN
        )
        self._consent_uri = consent_uri

    def error_dict(self):
        return cs_error(
            self.msg,
            self.errcode,
            consent_uri=self._consent_uri
        )


class RegistrationError(SynapseError):
    """An error raised when a registration event fails."""
    pass
@@ -292,7 +321,7 @@ def cs_error(msg, code=Codes.UNKNOWN, **kwargs):
    Args:
        msg (str): The error message.
        code (int): The error code.
        code (str): The error code.
        kwargs : Additional keys to add to the response.

    Returns:
        A dict representing the error response JSON.
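Putting the pieces together, a client rejected by `ConsentNotGivenError` would receive an error body along these lines (values illustrative; the shape follows from `error_dict` and `cs_error` above):

```python
{
    "errcode": "M_CONSENT_NOT_GIVEN",
    "error": "You can't send any messages until you consent to the privacy policy at ...",
    "consent_uri": "https://server/_matrix/consent?u=user&h=68a152...",
}
```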

synapse/api/urls.py

@@ -1,5 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2014-2016 OpenMarket Ltd
# Copyright 2018 New Vector Ltd.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -14,6 +15,12 @@
# limitations under the License.
"""Contains the URL paths to prefix various aspects of the server with. """
from hashlib import sha256
import hmac
from six.moves.urllib.parse import urlencode
from synapse.config import ConfigError
CLIENT_PREFIX = "/_matrix/client/api/v1"
CLIENT_V2_ALPHA_PREFIX = "/_matrix/client/v2_alpha"
@@ -25,3 +32,46 @@ SERVER_KEY_PREFIX = "/_matrix/key/v1"
SERVER_KEY_V2_PREFIX = "/_matrix/key/v2"
MEDIA_PREFIX = "/_matrix/media/r0"
LEGACY_MEDIA_PREFIX = "/_matrix/media/v1"
class ConsentURIBuilder(object):
    def __init__(self, hs_config):
        """
        Args:
            hs_config (synapse.config.homeserver.HomeServerConfig):
        """
        if hs_config.form_secret is None:
            raise ConfigError(
                "form_secret not set in config",
            )
        if hs_config.public_baseurl is None:
            raise ConfigError(
                "public_baseurl not set in config",
            )

        self._hmac_secret = hs_config.form_secret.encode("utf-8")
        self._public_baseurl = hs_config.public_baseurl

    def build_user_consent_uri(self, user_id):
        """Build a URI which we can give to the user to do their privacy
        policy consent

        Args:
            user_id (str): mxid or username of user

        Returns
            (str) the URI where the user can do consent
        """
        mac = hmac.new(
            key=self._hmac_secret,
            msg=user_id,
            digestmod=sha256,
        ).hexdigest()
        consent_uri = "%s_matrix/consent?%s" % (
            self._public_baseurl,
            urlencode({
                "u": user_id,
                "h": mac
            }),
        )
        return consent_uri
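A hypothetical usage sketch (`hs_config` stands in for a loaded `HomeServerConfig` with `form_secret` and `public_baseurl` set):

```python
builder = ConsentURIBuilder(hs_config)
uri = builder.build_user_consent_uri("@user:server.com")
# e.g. "https://server.com/_matrix/consent?u=%40user%3Aserver.com&h=<hmac>"
```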

synapse/app/homeserver.py

@@ -184,6 +184,15 @@ class SynapseHomeServer(HomeServer):
"/_matrix/client/versions": client_resource,
})
if name == "consent":
from synapse.rest.consent.consent_resource import ConsentResource
consent_resource = ConsentResource(self)
if compress:
consent_resource = gz_wrap(consent_resource)
resources.update({
"/_matrix/consent": consent_resource,
})
if name == "federation":
resources.update({
FEDERATION_PREFIX: TransportLayerServer(self),
@@ -475,6 +484,14 @@ def run(hs):
" changes across releases."
)
def generate_user_daily_visit_stats():
hs.get_datastore().generate_user_daily_visits()
# Rather than update on per session basis, batch up the requests.
# If you increase the loop period, the accuracy of user_daily_visits
# table will decrease
clock.looping_call(generate_user_daily_visit_stats, 5 * 60 * 1000)
if hs.config.report_stats:
logger.info("Scheduling stats reporting for 3 hour intervals")
clock.looping_call(phone_stats_home, 3 * 60 * 60 * 1000)

synapse/config/__init__.py

@@ -12,3 +12,9 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from ._base import ConfigError
# export ConfigError if somebody does import *
# this is largely a fudge to stop PEP8 moaning about the import
__all__ = ["ConfigError"]

synapse/config/consent_config.py (new file)

@@ -0,0 +1,79 @@
# -*- coding: utf-8 -*-
# Copyright 2018 New Vector Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from ._base import Config
DEFAULT_CONFIG = """\
# User Consent configuration
#
# Parts of this section are required if enabling the 'consent' resource under
# 'listeners', in particular 'template_dir' and 'version'.
#
# 'template_dir' gives the location of the templates for the HTML forms.
# This directory should contain one subdirectory per language (eg, 'en', 'fr'),
# and each language directory should contain the policy document (named as
# '<version>.html') and a success page (success.html).
#
# 'version' specifies the 'current' version of the policy document. It defines
# the version to be served by the consent resource if there is no 'v'
# parameter.
#
# 'server_notice_content', if enabled, will send a user a "Server Notice"
# asking them to consent to the privacy policy. The 'server_notices' section
# must also be configured for this to work.
#
# 'block_events_error', if set, will block any attempts to send events
# until the user consents to the privacy policy. The value of the setting is
# used as the text of the error.
#
# user_consent:
#   template_dir: res/templates/privacy
#   version: 1.0
#   server_notice_content:
#     msgtype: m.text
#     body: >-
#       To continue using this homeserver you must review and agree to the
#       terms and conditions at %(consent_uri)s
#   block_events_error: >-
#     To continue using this homeserver you must review and agree to the
#     terms and conditions at %(consent_uri)s
#
"""
class ConsentConfig(Config):
    def __init__(self):
        super(ConsentConfig, self).__init__()

        self.user_consent_version = None
        self.user_consent_template_dir = None
        self.user_consent_server_notice_content = None
        self.block_events_without_consent_error = None

    def read_config(self, config):
        consent_config = config.get("user_consent")
        if consent_config is None:
            return
        self.user_consent_version = str(consent_config["version"])
        self.user_consent_template_dir = consent_config["template_dir"]
        self.user_consent_server_notice_content = consent_config.get(
            "server_notice_content",
        )
        self.block_events_without_consent_error = consent_config.get(
            "block_events_error",
        )

    def default_config(self, **kwargs):
        return DEFAULT_CONFIG

synapse/config/homeserver.py

@@ -1,5 +1,6 @@
# -*- coding: utf-8 -*-
# Copyright 2014-2016 OpenMarket Ltd
# Copyright 2018 New Vector Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -12,7 +13,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from .tls import TlsConfig
from .server import ServerConfig
from .logger import LoggingConfig
@@ -37,6 +37,8 @@ from .push import PushConfig
from .spam_checker import SpamCheckerConfig
from .groups import GroupsConfig
from .user_directory import UserDirectoryConfig
from .consent_config import ConsentConfig
from .server_notices_config import ServerNoticesConfig
class HomeServerConfig(TlsConfig, ServerConfig, DatabaseConfig, LoggingConfig,
@@ -45,12 +47,15 @@ class HomeServerConfig(TlsConfig, ServerConfig, DatabaseConfig, LoggingConfig,
                       AppServiceConfig, KeyConfig, SAML2Config, CasConfig,
                       JWTConfig, PasswordConfig, EmailConfig,
                       WorkerConfig, PasswordAuthProviderConfig, PushConfig,
                       SpamCheckerConfig, GroupsConfig, UserDirectoryConfig,):
                       SpamCheckerConfig, GroupsConfig, UserDirectoryConfig,
                       ConsentConfig,
                       ServerNoticesConfig,
                       ):
    pass


if __name__ == '__main__':
    import sys
    sys.stdout.write(
        HomeServerConfig().generate_config(sys.argv[1], sys.argv[2])[0]
        HomeServerConfig().generate_config(sys.argv[1], sys.argv[2], True)[0]
    )

synapse/config/key.py

@@ -59,14 +59,20 @@ class KeyConfig(Config):
        self.expire_access_token = config.get("expire_access_token", False)

        # a secret which is used to calculate HMACs for form values, to stop
        # falsification of values
        self.form_secret = config.get("form_secret", None)

    def default_config(self, config_dir_path, server_name, is_generating_file=False,
                       **kwargs):
        base_key_name = os.path.join(config_dir_path, server_name)

        if is_generating_file:
            macaroon_secret_key = random_string_with_symbols(50)
            form_secret = '"%s"' % random_string_with_symbols(50)
        else:
            macaroon_secret_key = None
            form_secret = 'null'

        return """\
        macaroon_secret_key: "%(macaroon_secret_key)s"
@@ -74,6 +80,10 @@ class KeyConfig(Config):
        # Used to enable access token expiration.
        expire_access_token: False

        # a secret which is used to calculate HMACs for form values, to stop
        # falsification of values
        form_secret: %(form_secret)s

        ## Signing Keys ##

        # Path to the signing key to sign messages with

synapse/config/server_notices_config.py (new file)

@@ -0,0 +1,86 @@
# -*- coding: utf-8 -*-
# Copyright 2018 New Vector Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from ._base import Config
from synapse.types import UserID
DEFAULT_CONFIG = """\
# Server Notices room configuration
#
# Uncomment this section to enable a room which can be used to send notices
# from the server to users. It is a special room which cannot be left; notices
# come from a special "notices" user id.
#
# If you uncomment this section, you *must* define the system_mxid_localpart
# setting, which defines the id of the user which will be used to send the
# notices.
#
# It's also possible to override the room name, the display name of the
# "notices" user, and the avatar for the user.
#
# server_notices:
#   system_mxid_localpart: notices
#   system_mxid_display_name: "Server Notices"
#   system_mxid_avatar_url: "mxc://server.com/oumMVlgDnLYFaPVkExemNVVZ"
#   room_name: "Server Notices"
"""
class ServerNoticesConfig(Config):
    """Configuration for the server notices room.

    Attributes:
        server_notices_mxid (str|None):
            The MXID to use for server notices.
            None if server notices are not enabled.

        server_notices_mxid_display_name (str|None):
            The display name to use for the server notices user.
            None if server notices are not enabled.

        server_notices_mxid_avatar_url (str|None):
            The avatar URL to use for the server notices user.
            None if server notices are not enabled.

        server_notices_room_name (str|None):
            The name to use for the server notices room.
            None if server notices are not enabled.
    """
    def __init__(self):
        super(ServerNoticesConfig, self).__init__()
        self.server_notices_mxid = None
        self.server_notices_mxid_display_name = None
        self.server_notices_mxid_avatar_url = None
        self.server_notices_room_name = None

    def read_config(self, config):
        c = config.get("server_notices")
        if c is None:
            return
        mxid_localpart = c['system_mxid_localpart']
        self.server_notices_mxid = UserID(
            mxid_localpart, self.server_name,
        ).to_string()
        self.server_notices_mxid_display_name = c.get(
            'system_mxid_display_name', None,
        )
        self.server_notices_mxid_avatar_url = c.get(
            'system_mxid_avatar_url', None,
        )
        # todo: i18n
        self.server_notices_room_name = c.get('room_name', "Server Notices")

    def default_config(self, **kwargs):
        return DEFAULT_CONFIG

synapse/events/utils.py

@@ -20,6 +20,8 @@ from frozendict import frozendict
import re
from six import string_types
# Split strings on "." but not "\." This uses a negative lookbehind assertion for '\'
# (?<!stuff) matches if the current position in the string is not preceded
# by a match for 'stuff'.
@@ -277,7 +279,7 @@ def serialize_event(e, time_now_ms, as_client_event=True,
    if only_event_fields:
        if (not isinstance(only_event_fields, list) or
                not all(isinstance(f, basestring) for f in only_event_fields)):
                not all(isinstance(f, string_types) for f in only_event_fields)):
            raise TypeError("only_event_fields must be a list of strings")
        d = only_fields(d, only_event_fields)

synapse/events/validator.py

@@ -17,6 +17,8 @@ from synapse.types import EventID, RoomID, UserID
from synapse.api.errors import SynapseError
from synapse.api.constants import EventTypes, Membership
from six import string_types
class EventValidator(object):
@@ -49,7 +51,7 @@ class EventValidator(object):
strings.append("state_key")
for s in strings:
if not isinstance(getattr(event, s), basestring):
if not isinstance(getattr(event, s), string_types):
raise SynapseError(400, "Not '%s' a string type" % (s,))
if event.type == EventTypes.Member:
@@ -88,5 +90,5 @@ class EventValidator(object):
for s in keys:
if s not in d:
raise SynapseError(400, "'%s' not in content" % (s,))
if not isinstance(d[s], basestring):
if not isinstance(d[s], string_types):
raise SynapseError(400, "Not '%s' a string type" % (s,))

synapse/groups/groups_server.py

@@ -20,6 +20,8 @@ from synapse.api.errors import SynapseError
from synapse.types import GroupID, RoomID, UserID, get_domain_from_id
from twisted.internet import defer
from six import string_types
logger = logging.getLogger(__name__)
@@ -431,7 +433,7 @@ class GroupsServerHandler(object):
"long_description"):
if keyname in content:
value = content[keyname]
if not isinstance(value, basestring):
if not isinstance(value, string_types):
raise SynapseError(400, "%r value is not a string" % (keyname,))
profile[keyname] = value

synapse/handlers/__init__.py

@@ -14,9 +14,7 @@
# limitations under the License.
from .register import RegistrationHandler
from .room import (
    RoomCreationHandler, RoomContextHandler,
)
from .room import RoomContextHandler
from .message import MessageHandler
from .federation import FederationHandler
from .directory import DirectoryHandler
@@ -47,7 +45,6 @@ class Handlers(object):
    def __init__(self, hs):
        self.registration_handler = RegistrationHandler(hs)
        self.message_handler = MessageHandler(hs)
        self.room_creation_handler = RoomCreationHandler(hs)
        self.federation_handler = FederationHandler(hs)
        self.directory_handler = DirectoryHandler(hs)
        self.admin_handler = AdminHandler(hs)

synapse/handlers/deactivate_account.py

@@ -30,6 +30,7 @@ class DeactivateAccountHandler(BaseHandler):
        self._auth_handler = hs.get_auth_handler()
        self._device_handler = hs.get_device_handler()
        self._room_member_handler = hs.get_room_member_handler()
        self.user_directory_handler = hs.get_user_directory_handler()

        # Flag that indicates whether the process to part users from rooms is running
        self._user_parter_running = False
@@ -61,10 +62,13 @@ class DeactivateAccountHandler(BaseHandler):
        yield self.store.user_delete_threepids(user_id)
        yield self.store.user_set_password_hash(user_id, None)

        # Add the user to a table of users penpding deactivation (ie.
        # Add the user to a table of users pending deactivation (ie.
        # removal from all the rooms they're a member of)
        yield self.store.add_user_pending_deactivation(user_id)

        # delete from user directory
        yield self.user_directory_handler.handle_user_deactivated(user_id)

        # Now start the process that goes through that list and
        # parts users from rooms (if it isn't already running)
        self._start_user_parting()

View File

@@ -26,6 +26,8 @@ from ._base import BaseHandler
import logging
from six import itervalues, iteritems
logger = logging.getLogger(__name__)
@@ -318,7 +320,7 @@ class DeviceHandler(BaseHandler):
# The user may have left the room
# TODO: Check if they actually did or if we were just invited.
if room_id not in room_ids:
for key, event_id in current_state_ids.iteritems():
for key, event_id in iteritems(current_state_ids):
etype, state_key = key
if etype != EventTypes.Member:
continue
@@ -338,7 +340,7 @@ class DeviceHandler(BaseHandler):
# special-case for an empty prev state: include all members
# in the changed list
if not event_ids:
for key, event_id in current_state_ids.iteritems():
for key, event_id in iteritems(current_state_ids):
etype, state_key = key
if etype != EventTypes.Member:
continue
@@ -354,10 +356,10 @@ class DeviceHandler(BaseHandler):
# Check if we've joined the room? If so we just blindly add all the users to
# the "possibly changed" users.
for state_dict in prev_state_ids.itervalues():
for state_dict in itervalues(prev_state_ids):
member_event = state_dict.get((EventTypes.Member, user_id), None)
if not member_event or member_event != current_member_id:
for key, event_id in current_state_ids.iteritems():
for key, event_id in iteritems(current_state_ids):
etype, state_key = key
if etype != EventTypes.Member:
continue
@@ -367,14 +369,14 @@ class DeviceHandler(BaseHandler):
# If there has been any change in membership, include them in the
# possibly changed list. We'll check if they are joined below,
# and we're not toooo worried about spuriously adding users.
for key, event_id in current_state_ids.iteritems():
for key, event_id in iteritems(current_state_ids):
etype, state_key = key
if etype != EventTypes.Member:
continue
# check if this member has changed since any of the extremities
# at the stream_ordering, and add them to the list if so.
for state_dict in prev_state_ids.itervalues():
for state_dict in itervalues(prev_state_ids):
prev_event_id = state_dict.get(key, None)
if not prev_event_id or prev_event_id != event_id:
if state_key != user_id:

View File

@@ -19,6 +19,7 @@ import logging
from canonicaljson import encode_canonical_json
from twisted.internet import defer
from six import iteritems
from synapse.api.errors import (
SynapseError, CodeMessageException, FederationDeniedError,
@@ -92,7 +93,7 @@ class E2eKeysHandler(object):
remote_queries_not_in_cache = {}
if remote_queries:
query_list = []
for user_id, device_ids in remote_queries.iteritems():
for user_id, device_ids in iteritems(remote_queries):
if device_ids:
query_list.extend((user_id, device_id) for device_id in device_ids)
else:
@@ -103,9 +104,9 @@ class E2eKeysHandler(object):
query_list
)
)
for user_id, devices in remote_results.iteritems():
for user_id, devices in iteritems(remote_results):
user_devices = results.setdefault(user_id, {})
for device_id, device in devices.iteritems():
for device_id, device in iteritems(devices):
keys = device.get("keys", None)
device_display_name = device.get("device_display_name", None)
if keys:
@@ -250,9 +251,9 @@ class E2eKeysHandler(object):
"Claimed one-time-keys: %s",
",".join((
"%s for %s:%s" % (key_id, user_id, device_id)
for user_id, user_keys in json_result.iteritems()
for device_id, device_keys in user_keys.iteritems()
for key_id, _ in device_keys.iteritems()
for user_id, user_keys in iteritems(json_result)
for device_id, device_keys in iteritems(user_keys)
for key_id, _ in iteritems(device_keys)
)),
)

View File

@@ -48,6 +48,7 @@ class EventStreamHandler(BaseHandler):
self.notifier = hs.get_notifier()
self.state = hs.get_state_handler()
self._server_notices_sender = hs.get_server_notices_sender()
@defer.inlineCallbacks
@log_function
@@ -58,6 +59,10 @@ class EventStreamHandler(BaseHandler):
If `only_keys` is not None, events from keys will be sent down.
"""
# send any outstanding server notices to the user.
yield self._server_notices_sender.on_user_syncing(auth_user_id)
auth_user = UserID.from_string(auth_user_id)
presence_handler = self.hs.get_presence_handler()

View File

@@ -24,6 +24,7 @@ from signedjson.key import decode_verify_key_bytes
from signedjson.sign import verify_signed_json
import six
from six.moves import http_client
from six import iteritems
from twisted.internet import defer
from unpaddedbase64 import decode_base64
@@ -81,6 +82,7 @@ class FederationHandler(BaseHandler):
self.pusher_pool = hs.get_pusherpool()
self.spam_checker = hs.get_spam_checker()
self.event_creation_handler = hs.get_event_creation_handler()
self._server_notices_mxid = hs.config.server_notices_mxid
# When joining a room we need to queue any events for that room up
self.room_queues = {}
@@ -478,18 +480,18 @@ class FederationHandler(BaseHandler):
# to get all state ids that we're interested in.
event_map = yield self.store.get_events([
e_id
for key_to_eid in event_to_state_ids.values()
for key, e_id in key_to_eid.items()
for key_to_eid in event_to_state_ids.itervalues()
for key, e_id in key_to_eid.iteritems()
if key[0] != EventTypes.Member or check_match(key[1])
])
event_to_state = {
e_id: {
key: event_map[inner_e_id]
for key, inner_e_id in key_to_eid.items()
for key, inner_e_id in key_to_eid.iteritems()
if inner_e_id in event_map
}
for e_id, key_to_eid in event_to_state_ids.items()
for e_id, key_to_eid in event_to_state_ids.iteritems()
}
def redact_disallowed(event, state):
@@ -504,7 +506,7 @@ class FederationHandler(BaseHandler):
# membership states for the requesting server to determine
# if the server is either in the room or has been invited
# into the room.
for ev in state.values():
for ev in state.itervalues():
if ev.type != EventTypes.Member:
continue
try:
@@ -712,37 +714,15 @@ class FederationHandler(BaseHandler):
defer.returnValue(events)
@defer.inlineCallbacks
def maybe_backfill(self, room_id, current_depth):
def maybe_backfill(self, room_id, extremities):
"""Checks the database to see if we should backfill before paginating,
and if so do.
Args:
room_id (str)
extremities (list[str]): List of event_ids to backfill from. These
should be event IDs that we don't yet have.
"""
extremities = yield self.store.get_oldest_events_with_depth_in_room(
room_id
)
if not extremities:
logger.debug("Not backfilling as no extremeties found.")
return
# Check if we reached a point where we should start backfilling.
sorted_extremeties_tuple = sorted(
extremities.items(),
key=lambda e: -int(e[1])
)
max_depth = sorted_extremeties_tuple[0][1]
# We don't want to specify too many extremities as it causes the backfill
# request URI to be too long.
extremities = dict(sorted_extremeties_tuple[:5])
if current_depth > max_depth:
logger.debug(
"Not backfilling as we don't need to. %d < %d",
max_depth, current_depth,
)
return
# Now we need to decide which hosts to hit first.
# First we try hosts that are already in the room
# TODO: HEURISTIC ALERT.
@@ -750,9 +730,19 @@ class FederationHandler(BaseHandler):
curr_state = yield self.state_handler.get_current_state(room_id)
def get_domains_from_state(state):
"""Get joined domains from state
Args:
state (dict[tuple, FrozenEvent]): State map from type/state
key to event.
Returns:
list[tuple[str, int]]: Returns a list of servers with the
lowest depth of their joins. Sorted by lowest depth first.
"""
joined_users = [
(state_key, int(event.depth))
for (e_type, state_key), event in state.items()
for (e_type, state_key), event in state.iteritems()
if e_type == EventTypes.Member
and event.membership == Membership.JOIN
]
@@ -769,7 +759,7 @@ class FederationHandler(BaseHandler):
except Exception:
pass
return sorted(joined_domains.items(), key=lambda d: d[1])
return sorted(joined_domains.iteritems(), key=lambda d: d[1])
curr_domains = get_domains_from_state(curr_state)
@@ -786,7 +776,7 @@ class FederationHandler(BaseHandler):
yield self.backfill(
dom, room_id,
limit=100,
extremities=[e for e in extremities.keys()]
extremities=extremities,
)
# If this succeeded then we probably already have the
# appropriate stuff.
@@ -832,7 +822,7 @@ class FederationHandler(BaseHandler):
tried_domains = set(likely_domains)
tried_domains.add(self.server_name)
event_ids = list(extremities.keys())
event_ids = list(extremities)
logger.debug("calling resolve_state_groups in _maybe_backfill")
resolve = logcontext.preserve_fn(
@@ -842,31 +832,34 @@ class FederationHandler(BaseHandler):
[resolve(room_id, [e]) for e in event_ids],
consumeErrors=True,
))
# dict[str, dict[tuple, str]], a map from event_id to state map of
# event_ids.
states = dict(zip(event_ids, [s.state for s in states]))
state_map = yield self.store.get_events(
[e_id for ids in states.values() for e_id in ids],
[e_id for ids in states.itervalues() for e_id in ids.itervalues()],
get_prev_content=False
)
states = {
key: {
k: state_map[e_id]
for k, e_id in state_dict.items()
for k, e_id in state_dict.iteritems()
if e_id in state_map
} for key, state_dict in states.items()
} for key, state_dict in states.iteritems()
}
for e_id, _ in sorted_extremeties_tuple:
for e_id in event_ids:
likely_domains = get_domains_from_state(states[e_id])
success = yield try_backfill([
dom for dom in likely_domains
dom for dom, _ in likely_domains
if dom not in tried_domains
])
if success:
defer.returnValue(True)
tried_domains.update(likely_domains)
tried_domains.update(dom for dom, _ in likely_domains)
defer.returnValue(False)
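A toy, self-contained sketch of the get_domains_from_state transformation documented above (domains and depths invented):
from collections import namedtuple

Ev = namedtuple("Ev", ["depth", "membership"])  # stand-in for FrozenEvent

state = {
    ("m.room.member", "@a:one.example"): Ev(3, "join"),
    ("m.room.member", "@b:two.example"): Ev(7, "join"),
    ("m.room.member", "@c:one.example"): Ev(9, "join"),
}

joined = [
    (state_key, ev.depth)
    for (e_type, state_key), ev in state.items()
    if e_type == "m.room.member" and ev.membership == "join"
]
lowest = {}
for user_id, depth in joined:
    domain = user_id.split(":", 1)[1]
    lowest[domain] = min(lowest.get(domain, depth), depth)

print(sorted(lowest.items(), key=lambda d: d[1]))
# => [('one.example', 3), ('two.example', 7)]  -- lowest join depth first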
@@ -1180,6 +1173,13 @@ class FederationHandler(BaseHandler):
if not self.is_mine_id(event.state_key):
raise SynapseError(400, "The invite event must be for this server")
# block any attempts to invite the server notices mxid
if event.state_key == self._server_notices_mxid:
raise SynapseError(
http_client.FORBIDDEN,
"Cannot invite this user",
)
event.internal_metadata.outlier = True
event.internal_metadata.invite_from_remote = True
@@ -1367,7 +1367,7 @@ class FederationHandler(BaseHandler):
)
if state_groups:
_, state = state_groups.items().pop()
_, state = list(iteritems(state_groups)).pop()
results = {
(e.type, e.state_key): e for e in state
}
@@ -2013,7 +2013,7 @@ class FederationHandler(BaseHandler):
this will not be included in the current_state in the context.
"""
state_updates = {
k: a.event_id for k, a in auth_events.iteritems()
k: a.event_id for k, a in iteritems(auth_events)
if k != event_key
}
context.current_state_ids = dict(context.current_state_ids)
@@ -2023,7 +2023,7 @@ class FederationHandler(BaseHandler):
context.delta_ids.update(state_updates)
context.prev_state_ids = dict(context.prev_state_ids)
context.prev_state_ids.update({
k: a.event_id for k, a in auth_events.iteritems()
k: a.event_id for k, a in iteritems(auth_events)
})
context.state_group = yield self.store.store_state_group(
event.event_id,
@@ -2075,7 +2075,7 @@ class FederationHandler(BaseHandler):
def get_next(it, opt=None):
try:
return it.next()
return next(it)
except Exception:
return opt

View File

@@ -15,6 +15,7 @@
# limitations under the License.
from twisted.internet import defer
from six import iteritems
from synapse.api.errors import SynapseError
from synapse.types import get_domain_from_id
@@ -449,7 +450,7 @@ class GroupsLocalHandler(object):
results = {}
failed_results = []
for destination, dest_user_ids in destinations.iteritems():
for destination, dest_user_ids in iteritems(destinations):
try:
r = yield self.transport_client.bulk_get_publicised_groups(
destination, list(dest_user_ids),

View File

@@ -19,11 +19,17 @@ import sys
from canonicaljson import encode_canonical_json
import six
from six import string_types, itervalues, iteritems
from twisted.internet import defer, reactor
from twisted.internet.defer import succeed
from twisted.python.failure import Failure
from synapse.api.constants import EventTypes, Membership, MAX_DEPTH
from synapse.api.errors import AuthError, Codes, SynapseError
from synapse.api.errors import (
AuthError, Codes, SynapseError,
ConsentNotGivenError,
)
from synapse.api.urls import ConsentURIBuilder
from synapse.crypto.event_signing import add_hashes_and_signatures
from synapse.events.utils import serialize_event
from synapse.events.validator import EventValidator
@@ -86,14 +92,14 @@ class MessageHandler(BaseHandler):
# map from purge id to PurgeStatus
self._purges_by_id = {}
def start_purge_history(self, room_id, topological_ordering,
def start_purge_history(self, room_id, token,
delete_local_events=False):
"""Start off a history purge on a room.
Args:
room_id (str): The room to purge from
topological_ordering (int): minimum topo ordering to preserve
token (str): topological token to delete events before
delete_local_events (bool): True to delete local events as well as
remote ones
@@ -115,19 +121,19 @@ class MessageHandler(BaseHandler):
self._purges_by_id[purge_id] = PurgeStatus()
run_in_background(
self._purge_history,
purge_id, room_id, topological_ordering, delete_local_events,
purge_id, room_id, token, delete_local_events,
)
return purge_id
@defer.inlineCallbacks
def _purge_history(self, purge_id, room_id, topological_ordering,
def _purge_history(self, purge_id, room_id, token,
delete_local_events):
"""Carry out a history purge on a room.
Args:
purge_id (str): The id for this purge
room_id (str): The room to purge from
topological_ordering (int): minimum topo ordering to preserve
token (str): topological token to delete events before
delete_local_events (bool): True to delete local events as well as
remote ones
@@ -138,7 +144,7 @@ class MessageHandler(BaseHandler):
try:
with (yield self.pagination_lock.write(room_id)):
yield self.store.purge_history(
room_id, topological_ordering, delete_local_events,
room_id, token, delete_local_events,
)
logger.info("[purge] complete")
self._purges_by_id[purge_id].status = PurgeStatus.STATUS_COMPLETE
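For context on the new token parameter: topological stream tokens in this codebase look like t<topological>-<stream>, and the removed clamp code in the next hunk shows them being parsed; a hedged sketch (token value invented):
from synapse.types import RoomStreamToken

parsed = RoomStreamToken.parse("t123-456")   # hypothetical token value
# parsed.topological == 123, parsed.stream == 456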
@@ -205,31 +211,19 @@ class MessageHandler(BaseHandler):
)
if source_config.direction == 'b':
# if we're going backwards, we might need to backfill. This
# requires that we have a topo token.
if room_token.topological:
max_topo = room_token.topological
else:
max_topo = yield self.store.get_max_topological_token(
room_id, room_token.stream
)
if membership == Membership.LEAVE:
# If they have left the room then clamp the token to be before
# they left the room, to save the effort of loading from the
# database.
leave_token = yield self.store.get_topological_token_for_event(
member_event_id
member_event_id,
)
source_config.from_key = yield self.store.clamp_token_before(
room_id, source_config.from_key, leave_token,
)
leave_token = RoomStreamToken.parse(leave_token)
if leave_token.topological < max_topo:
source_config.from_key = str(leave_token)
yield self.hs.get_handlers().federation_handler.maybe_backfill(
room_id, max_topo
)
events, next_key = yield self.store.paginate_room_events(
events, next_key, extremities = yield self.store.paginate_room_events(
room_id=room_id,
from_key=source_config.from_key,
to_key=source_config.to_key,
@@ -238,6 +232,20 @@ class MessageHandler(BaseHandler):
event_filter=event_filter,
)
if source_config.direction == 'b' and extremities:
yield self.hs.get_handlers().federation_handler.maybe_backfill(
room_id, extremities
)
events, next_key, extremities = yield self.store.paginate_room_events(
room_id=room_id,
from_key=source_config.from_key,
to_key=source_config.to_key,
direction=source_config.direction,
limit=source_config.limit,
event_filter=event_filter,
)
next_token = pagin_config.from_token.copy_and_replace(
"room_key", next_key
)
@@ -397,7 +405,7 @@ class MessageHandler(BaseHandler):
"avatar_url": profile.avatar_url,
"display_name": profile.display_name,
}
for user_id, profile in users_with_profile.iteritems()
for user_id, profile in iteritems(users_with_profile)
})
@@ -431,6 +439,9 @@ class EventCreationHandler(object):
self.spam_checker = hs.get_spam_checker()
if self.config.block_events_without_consent_error is not None:
self._consent_uri_builder = ConsentURIBuilder(self.config)
@defer.inlineCallbacks
def create_event(self, requester, event_dict, token_id=None, txn_id=None,
prev_events_and_hashes=None):
@@ -482,6 +493,10 @@ class EventCreationHandler(object):
target, e
)
is_exempt = yield self._is_exempt_from_privacy_policy(builder)
if not is_exempt:
yield self.assert_accepted_privacy_policy(requester)
if token_id is not None:
builder.internal_metadata.token_id = token_id
@@ -496,6 +511,83 @@ class EventCreationHandler(object):
defer.returnValue((event, context))
def _is_exempt_from_privacy_policy(self, builder):
""""Determine if an event to be sent is exempt from having to consent
to the privacy policy
Args:
builder (synapse.events.builder.EventBuilder): event being created
Returns:
Deferred[bool]: true if the event can be sent without the user
consenting
"""
# the only thing the user can do is join the server notices room.
if builder.type == EventTypes.Member:
membership = builder.content.get("membership", None)
if membership == Membership.JOIN:
return self._is_server_notices_room(builder.room_id)
return succeed(False)
@defer.inlineCallbacks
def _is_server_notices_room(self, room_id):
if self.config.server_notices_mxid is None:
defer.returnValue(False)
user_ids = yield self.store.get_users_in_room(room_id)
defer.returnValue(self.config.server_notices_mxid in user_ids)
@defer.inlineCallbacks
def assert_accepted_privacy_policy(self, requester):
"""Check if a user has accepted the privacy policy
Called when the given user is about to do something that requires
privacy consent. We see if the user is exempt and otherwise check that
they have given consent. If they have not, a ConsentNotGiven error is
raised.
Args:
requester (synapse.types.Requester):
The user making the request
Returns:
Deferred[None]: returns normally if the user has consented or is
exempt
Raises:
ConsentNotGivenError: if the user has not given consent yet
"""
if self.config.block_events_without_consent_error is None:
return
# exempt AS users from needing consent
if requester.app_service is not None:
return
user_id = requester.user.to_string()
# exempt the system notices user
if (
self.config.server_notices_mxid is not None and
user_id == self.config.server_notices_mxid
):
return
u = yield self.store.get_user_by_id(user_id)
assert u is not None
if u["consent_version"] == self.config.user_consent_version:
return
consent_uri = self._consent_uri_builder.build_user_consent_uri(
requester.user.localpart,
)
msg = self.config.block_events_without_consent_error % {
'consent_uri': consent_uri,
}
raise ConsentNotGivenError(
msg=msg,
consent_uri=consent_uri,
)
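The % interpolation above implies the configured error string carries a %(consent_uri)s placeholder; a hedged sketch (message text invented):
block_events_without_consent_error = (
    "To continue using this homeserver you must review and agree to the "
    "terms and conditions at %(consent_uri)s"
)
msg = block_events_without_consent_error % {
    "consent_uri": "https://example.com/_matrix/consent?u=alice&h=...",
}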
@defer.inlineCallbacks
def send_nonmember_event(self, requester, event, context, ratelimit=True):
"""
@@ -578,7 +670,7 @@ class EventCreationHandler(object):
spam_error = self.spam_checker.check_event_for_spam(event)
if spam_error:
if not isinstance(spam_error, basestring):
if not isinstance(spam_error, string_types):
spam_error = "Spam is not permitted here"
raise SynapseError(
403, spam_error, Codes.FORBIDDEN
@@ -792,7 +884,7 @@ class EventCreationHandler(object):
state_to_include_ids = [
e_id
for k, e_id in context.current_state_ids.iteritems()
for k, e_id in iteritems(context.current_state_ids)
if k[0] in self.hs.config.room_invite_state_types
or k == (EventTypes.Member, event.sender)
]
@@ -806,7 +898,7 @@ class EventCreationHandler(object):
"content": e.content,
"sender": e.sender,
}
for e in state_to_include.itervalues()
for e in itervalues(state_to_include)
]
invitee = UserID.from_string(event.state_key)

View File

@@ -25,6 +25,8 @@ The methods that define policy are:
from twisted.internet import defer, reactor
from contextlib import contextmanager
from six import itervalues, iteritems
from synapse.api.errors import SynapseError
from synapse.api.constants import PresenceState
from synapse.storage.presence import UserPresenceState
@@ -40,7 +42,6 @@ import synapse.metrics
import logging
logger = logging.getLogger(__name__)
metrics = synapse.metrics.get_metrics_for(__name__)
@@ -87,6 +88,11 @@ assert LAST_ACTIVE_GRANULARITY < IDLE_TIMER
class PresenceHandler(object):
def __init__(self, hs):
"""
Args:
hs (synapse.server.HomeServer):
"""
self.is_mine = hs.is_mine
self.is_mine_id = hs.is_mine_id
self.clock = hs.get_clock()
@@ -94,7 +100,6 @@ class PresenceHandler(object):
self.wheel_timer = WheelTimer()
self.notifier = hs.get_notifier()
self.federation = hs.get_federation_sender()
self.state = hs.get_state_handler()
federation_registry = hs.get_federation_registry()
@@ -463,61 +468,6 @@ class PresenceHandler(object):
syncing_user_ids.update(user_ids)
return syncing_user_ids
@defer.inlineCallbacks
def update_external_syncs(self, process_id, syncing_user_ids):
"""Update the syncing users for an external process
Args:
process_id(str): An identifier for the process the users are
syncing against. This allows synapse to process updates
as user start and stop syncing against a given process.
syncing_user_ids(set(str)): The set of user_ids that are
currently syncing on that server.
"""
# Grab the previous list of user_ids that were syncing on that process
prev_syncing_user_ids = (
self.external_process_to_current_syncs.get(process_id, set())
)
# Grab the current presence state for both the users that are syncing
# now and the users that were syncing before this update.
prev_states = yield self.current_state_for_users(
syncing_user_ids | prev_syncing_user_ids
)
updates = []
time_now_ms = self.clock.time_msec()
# For each new user that is syncing check if we need to mark them as
# being online.
for new_user_id in syncing_user_ids - prev_syncing_user_ids:
prev_state = prev_states[new_user_id]
if prev_state.state == PresenceState.OFFLINE:
updates.append(prev_state.copy_and_replace(
state=PresenceState.ONLINE,
last_active_ts=time_now_ms,
last_user_sync_ts=time_now_ms,
))
else:
updates.append(prev_state.copy_and_replace(
last_user_sync_ts=time_now_ms,
))
# For each user that is still syncing or stopped syncing update the
# last sync time so that we will correctly apply the grace period when
# they stop syncing.
for old_user_id in prev_syncing_user_ids:
prev_state = prev_states[old_user_id]
updates.append(prev_state.copy_and_replace(
last_user_sync_ts=time_now_ms,
))
yield self._update_states(updates)
# Update the last updated time for the process. We expire the entries
# if we don't receive an update in the given timeframe.
self.external_process_last_updated_ms[process_id] = self.clock.time_msec()
self.external_process_to_current_syncs[process_id] = syncing_user_ids
@defer.inlineCallbacks
def update_external_syncs_row(self, process_id, user_id, is_syncing, sync_time_msec):
"""Update the syncing users for an external process as a delta.
@@ -581,7 +531,7 @@ class PresenceHandler(object):
prev_state.copy_and_replace(
last_user_sync_ts=time_now_ms,
)
for prev_state in prev_states.itervalues()
for prev_state in itervalues(prev_states)
])
self.external_process_last_updated_ms.pop(process_id, None)
@@ -604,14 +554,14 @@ class PresenceHandler(object):
for user_id in user_ids
}
missing = [user_id for user_id, state in states.iteritems() if not state]
missing = [user_id for user_id, state in iteritems(states) if not state]
if missing:
# There are things not in our in memory cache. Lets pull them out of
# the database.
res = yield self.store.get_presence_for_users(missing)
states.update(res)
missing = [user_id for user_id, state in states.iteritems() if not state]
missing = [user_id for user_id, state in iteritems(states) if not state]
if missing:
new = {
user_id: UserPresenceState.default(user_id)
@@ -1099,7 +1049,7 @@ class PresenceEventSource(object):
defer.returnValue((updates.values(), max_token))
else:
defer.returnValue(([
s for s in updates.itervalues()
s for s in itervalues(updates)
if s.state != PresenceState.OFFLINE
], max_token))
@@ -1356,11 +1306,11 @@ def get_interested_remotes(store, states, state_handler):
# hosts in those rooms.
room_ids_to_states, users_to_states = yield get_interested_parties(store, states)
for room_id, states in room_ids_to_states.iteritems():
for room_id, states in iteritems(room_ids_to_states):
hosts = yield state_handler.get_current_hosts_in_room(room_id)
hosts_and_states.append((hosts, states))
for user_id, states in users_to_states.iteritems():
for user_id, states in iteritems(users_to_states):
host = get_domain_from_id(user_id)
hosts_and_states.append(([host], states))

View File

@@ -34,6 +34,11 @@ logger = logging.getLogger(__name__)
class RegistrationHandler(BaseHandler):
def __init__(self, hs):
"""
Args:
hs (synapse.server.HomeServer):
"""
super(RegistrationHandler, self).__init__(hs)
self.auth = hs.get_auth()
@@ -49,6 +54,7 @@ class RegistrationHandler(BaseHandler):
self._generate_user_id_linearizer = Linearizer(
name="_generate_user_id_linearizer",
)
self._server_notices_mxid = hs.config.server_notices_mxid
@defer.inlineCallbacks
def check_username(self, localpart, guest_access_token=None,
@@ -338,6 +344,14 @@ class RegistrationHandler(BaseHandler):
yield identity_handler.bind_threepid(c, user_id)
def check_user_id_not_appservice_exclusive(self, user_id, allowed_appservice=None):
# don't allow people to register the server notices mxid
if self._server_notices_mxid is not None:
if user_id == self._server_notices_mxid:
raise SynapseError(
400, "This user ID is reserved.",
errcode=Codes.EXCLUSIVE
)
# valid user IDs must not clash with any user ID namespaces claimed by
# application services.
services = self.store.get_app_services()

View File

@@ -68,14 +68,27 @@ class RoomCreationHandler(BaseHandler):
self.event_creation_handler = hs.get_event_creation_handler()
@defer.inlineCallbacks
def create_room(self, requester, config, ratelimit=True):
def create_room(self, requester, config, ratelimit=True,
creator_join_profile=None):
""" Creates a new room.
Args:
requester (Requester): The user who requested the room creation.
requester (synapse.types.Requester):
The user who requested the room creation.
config (dict) : A dict of configuration options.
ratelimit (bool): set to False to disable the rate limiter
creator_join_profile (dict|None):
Set to override the displayname and avatar for the creating
user in this room. If unset, displayname and avatar will be
derived from the user's profile. If set, should contain the
values to go in the body of the 'join' event (typically
`avatar_url` and/or `displayname`).
Returns:
The new room ID.
Deferred[dict]:
a dict containing the keys `room_id` and, if an alias was
requested, `room_alias`.
Raises:
SynapseError if the room ID couldn't be stored, or something went
horribly wrong.
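A hedged usage sketch of the new creator_join_profile argument (all values illustrative; compare the server-notices call sites):
# Inside an @defer.inlineCallbacks method; all values illustrative.
info = yield room_creation_handler.create_room(
    requester,
    config={"preset": "public_chat"},
    ratelimit=False,
    creator_join_profile={
        "displayname": "Server Alerts",
        "avatar_url": "mxc://example.com/abc123",
    },
)
room_id = info["room_id"]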
@@ -113,6 +126,10 @@ class RoomCreationHandler(BaseHandler):
except Exception:
raise SynapseError(400, "Invalid user_id: %s" % (i,))
yield self.event_creation_handler.assert_accepted_privacy_policy(
requester,
)
invite_3pid_list = config.get("invite_3pid", [])
visibility = config.get("visibility", None)
@@ -176,7 +193,8 @@ class RoomCreationHandler(BaseHandler):
initial_state=initial_state,
creation_content=creation_content,
room_alias=room_alias,
power_level_content_override=config.get("power_level_content_override", {})
power_level_content_override=config.get("power_level_content_override", {}),
creator_join_profile=creator_join_profile,
)
if "name" in config:
@@ -256,6 +274,7 @@ class RoomCreationHandler(BaseHandler):
creation_content,
room_alias,
power_level_content_override,
creator_join_profile,
):
def create(etype, content, **kwargs):
e = {
@@ -299,6 +318,7 @@ class RoomCreationHandler(BaseHandler):
room_id,
"join",
ratelimit=False,
content=creator_join_profile,
)
# We treat the power levels override specially as this needs to be one
@@ -514,7 +534,7 @@ class RoomEventSource(object):
@defer.inlineCallbacks
def get_pagination_rows(self, user, config, key):
events, next_key = yield self.store.paginate_room_events(
events, next_key, _ = yield self.store.paginate_room_events(
room_id=key,
from_key=config.from_key,
to_key=config.to_key,

View File

@@ -17,11 +17,14 @@
import abc
import logging
from six.moves import http_client
from signedjson.key import decode_verify_key_bytes
from signedjson.sign import verify_signed_json
from twisted.internet import defer
from unpaddedbase64 import decode_base64
import synapse.server
import synapse.types
from synapse.api.constants import (
EventTypes, Membership,
@@ -46,6 +49,11 @@ class RoomMemberHandler(object):
__metaclass__ = abc.ABCMeta
def __init__(self, hs):
"""
Args:
hs (synapse.server.HomeServer):
"""
self.hs = hs
self.store = hs.get_datastore()
self.auth = hs.get_auth()
@@ -63,6 +71,7 @@ class RoomMemberHandler(object):
self.clock = hs.get_clock()
self.spam_checker = hs.get_spam_checker()
self._server_notices_mxid = self.config.server_notices_mxid
@abc.abstractmethod
def _remote_join(self, requester, remote_room_hosts, room_id, user, content):
@@ -289,12 +298,37 @@ class RoomMemberHandler(object):
is_blocked = yield self.store.is_room_blocked(room_id)
if is_blocked:
raise SynapseError(403, "This room has been blocked on this server")
else:
# we don't allow people to reject invites to, or leave, the
# server notice room.
is_blocked = yield self._is_server_notice_room(room_id)
if is_blocked:
raise SynapseError(
http_client.FORBIDDEN,
"You cannot leave this room",
errcode=Codes.CANNOT_LEAVE_SERVER_NOTICE_ROOM,
)
if effective_membership_state == Membership.INVITE:
# block any attempts to invite the server notices mxid
if target.to_string() == self._server_notices_mxid:
raise SynapseError(
http_client.FORBIDDEN,
"Cannot invite this user",
)
if effective_membership_state == "invite":
block_invite = False
is_requester_admin = yield self.auth.is_server_admin(
requester.user,
)
if (self._server_notices_mxid is not None and
requester.user.to_string() == self._server_notices_mxid):
# allow the server notices mxid to send invites
is_requester_admin = True
else:
is_requester_admin = yield self.auth.is_server_admin(
requester.user,
)
if not is_requester_admin:
if self.config.block_non_admin_invites:
logger.info(
@@ -844,6 +878,13 @@ class RoomMemberHandler(object):
defer.returnValue(False)
@defer.inlineCallbacks
def _is_server_notice_room(self, room_id):
if self._server_notices_mxid is None:
defer.returnValue(False)
user_ids = yield self.store.get_users_in_room(room_id)
defer.returnValue(self._server_notices_mxid in user_ids)
class RoomMemberMasterHandler(RoomMemberHandler):
def __init__(self, hs):

View File

@@ -28,6 +28,8 @@ import collections
import logging
import itertools
from six import itervalues, iteritems
logger = logging.getLogger(__name__)
@@ -275,7 +277,7 @@ class SyncHandler(object):
# result returned by the event source is poor form (it might cache
# the object)
room_id = event["room_id"]
event_copy = {k: v for (k, v) in event.iteritems()
event_copy = {k: v for (k, v) in iteritems(event)
if k != "room_id"}
ephemeral_by_room.setdefault(room_id, []).append(event_copy)
@@ -294,7 +296,7 @@ class SyncHandler(object):
for event in receipts:
room_id = event["room_id"]
# exclude room id, as above
event_copy = {k: v for (k, v) in event.iteritems()
event_copy = {k: v for (k, v) in iteritems(event)
if k != "room_id"}
ephemeral_by_room.setdefault(room_id, []).append(event_copy)
@@ -325,7 +327,7 @@ class SyncHandler(object):
current_state_ids = frozenset()
if any(e.is_state() for e in recents):
current_state_ids = yield self.state.get_current_state_ids(room_id)
current_state_ids = frozenset(current_state_ids.itervalues())
current_state_ids = frozenset(itervalues(current_state_ids))
recents = yield filter_events_for_client(
self.store,
@@ -382,7 +384,7 @@ class SyncHandler(object):
current_state_ids = frozenset()
if any(e.is_state() for e in loaded_recents):
current_state_ids = yield self.state.get_current_state_ids(room_id)
current_state_ids = frozenset(current_state_ids.itervalues())
current_state_ids = frozenset(itervalues(current_state_ids))
loaded_recents = yield filter_events_for_client(
self.store,
@@ -984,7 +986,7 @@ class SyncHandler(object):
if since_token:
for joined_sync in sync_result_builder.joined:
it = itertools.chain(
joined_sync.timeline.events, joined_sync.state.itervalues()
joined_sync.timeline.events, itervalues(joined_sync.state)
)
for event in it:
if event.type == EventTypes.Member:
@@ -1062,7 +1064,7 @@ class SyncHandler(object):
newly_left_rooms = []
room_entries = []
invited = []
for room_id, events in mem_change_events_by_room_id.iteritems():
for room_id, events in iteritems(mem_change_events_by_room_id):
non_joins = [e for e in events if e.membership != Membership.JOIN]
has_join = len(non_joins) != len(events)

View File

@@ -22,6 +22,7 @@ from synapse.util.metrics import Measure
from synapse.util.async import sleep
from synapse.types import get_localpart_from_id
from six import iteritems
logger = logging.getLogger(__name__)
@@ -122,6 +123,13 @@ class UserDirectoryHandler(object):
user_id, profile.display_name, profile.avatar_url, None,
)
@defer.inlineCallbacks
def handle_user_deactivated(self, user_id):
"""Called when a user ID is deactivated
"""
yield self.store.remove_from_user_dir(user_id)
yield self.store.remove_from_user_in_public_room(user_id)
@defer.inlineCallbacks
def _unsafe_process(self):
# If self.pos is None then means we haven't fetched it from DB
@@ -403,7 +411,7 @@ class UserDirectoryHandler(object):
if change:
users_with_profile = yield self.state.get_current_user_in_room(room_id)
for user_id, profile in users_with_profile.iteritems():
for user_id, profile in iteritems(users_with_profile):
yield self._handle_new_user(room_id, user_id, profile)
else:
users = yield self.store.get_users_in_public_due_to_room(room_id)

View File

@@ -42,6 +42,8 @@ import random
import sys
import urllib
from six.moves.urllib import parse as urlparse
from six import string_types
logger = logging.getLogger(__name__)
outbound_logger = logging.getLogger("synapse.http.outbound")
@@ -553,7 +555,7 @@ class MatrixFederationHttpClient(object):
encoded_args = {}
for k, vs in args.items():
if isinstance(vs, basestring):
if isinstance(vs, string_types):
vs = [vs]
encoded_args[k] = [v.encode("UTF-8") for v in vs]
@@ -668,7 +670,7 @@ def check_content_type_is_json(headers):
RuntimeError if the Content-Type header is missing or is not application/json
"""
c_type = headers.getRawHeaders("Content-Type")
c_type = headers.getRawHeaders(b"Content-Type")
if c_type is None:
raise RuntimeError(
"No Content-Type header"
@@ -685,7 +687,7 @@ def check_content_type_is_json(headers):
def encode_query_args(args):
encoded_args = {}
for k, vs in args.items():
if isinstance(vs, basestring):
if isinstance(vs, string_types):
vs = [vs]
encoded_args[k] = [v.encode("UTF-8") for v in vs]

View File

@@ -98,14 +98,87 @@ response_size = metrics.register_counter(
"response_size", labels=["method", "servlet", "tag"]
)
# In flight metrics are incremented while the requests are in flight, rather
# than when the response was written.
in_flight_requests_ru_utime = metrics.register_counter(
"in_flight_requests_ru_utime_seconds", labels=["method", "servlet"],
)
in_flight_requests_ru_stime = metrics.register_counter(
"in_flight_requests_ru_stime_seconds", labels=["method", "servlet"],
)
in_flight_requests_db_txn_count = metrics.register_counter(
"in_flight_requests_db_txn_count", labels=["method", "servlet"],
)
# seconds spent waiting for db txns, excluding scheduling time, when processing
# this request
in_flight_requests_db_txn_duration = metrics.register_counter(
"in_flight_requests_db_txn_duration_seconds", labels=["method", "servlet"],
)
# seconds spent waiting for a db connection, when processing this request
in_flight_requests_db_sched_duration = metrics.register_counter(
"in_flight_requests_db_sched_duration_seconds", labels=["method", "servlet"]
)
# The set of all in flight requests, set[RequestMetrics]
_in_flight_requests = set()
def _collect_in_flight():
"""Called just before metrics are collected, so we use it to update all
the in flight request metrics
"""
for rm in _in_flight_requests:
rm.update_metrics()
metrics.register_collector(_collect_in_flight)
def _get_in_flight_counts():
"""Returns a count of all in flight requests by (method, server_name)
Returns:
dict[tuple[str, str], int]
"""
# Map from (method, name) -> int, the number of in flight requests of that
# type
counts = {}
for rm in _in_flight_requests:
key = (rm.method, rm.name,)
counts[key] = counts.get(key, 0) + 1
return counts
metrics.register_callback(
"in_flight_requests_count",
_get_in_flight_counts,
labels=["method", "servlet"]
)
class RequestMetrics(object):
def start(self, time_msec, name):
def start(self, time_msec, name, method):
self.start = time_msec
self.start_context = LoggingContext.current_context()
self.name = name
self.method = method
self._request_stats = _RequestStats.from_context(self.start_context)
_in_flight_requests.add(self)
def stop(self, time_msec, request):
_in_flight_requests.discard(self)
context = LoggingContext.current_context()
tag = ""
@@ -147,3 +220,88 @@ class RequestMetrics(object):
)
response_size.inc_by(request.sentLength, request.method, self.name, tag)
# We always call this at the end to ensure that we update the metrics
# regardless of whether a call to /metrics was made while the request
# was in flight.
self.update_metrics()
def update_metrics(self):
"""Updates the in flight metrics with values from this request.
"""
diff = self._request_stats.update(self.start_context)
in_flight_requests_ru_utime.inc_by(
diff.ru_utime, self.method, self.name,
)
in_flight_requests_ru_stime.inc_by(
diff.ru_stime, self.method, self.name,
)
in_flight_requests_db_txn_count.inc_by(
diff.db_txn_count, self.method, self.name,
)
in_flight_requests_db_txn_duration.inc_by(
diff.db_txn_duration_ms / 1000., self.method, self.name,
)
in_flight_requests_db_sched_duration.inc_by(
diff.db_sched_duration_ms / 1000., self.method, self.name,
)
class _RequestStats(object):
"""Keeps tracks of various metrics for an in flight request.
"""
__slots__ = [
"ru_utime", "ru_stime",
"db_txn_count", "db_txn_duration_ms", "db_sched_duration_ms",
]
def __init__(self, ru_utime, ru_stime, db_txn_count,
db_txn_duration_ms, db_sched_duration_ms):
self.ru_utime = ru_utime
self.ru_stime = ru_stime
self.db_txn_count = db_txn_count
self.db_txn_duration_ms = db_txn_duration_ms
self.db_sched_duration_ms = db_sched_duration_ms
@staticmethod
def from_context(context):
ru_utime, ru_stime = context.get_resource_usage()
return _RequestStats(
ru_utime, ru_stime,
context.db_txn_count,
context.db_txn_duration_ms,
context.db_sched_duration_ms,
)
def update(self, context):
"""Updates the current values and returns the difference between the
old and new values.
Returns:
_RequestStats: The difference between the old and new values
"""
new = _RequestStats.from_context(context)
diff = _RequestStats(
new.ru_utime - self.ru_utime,
new.ru_stime - self.ru_stime,
new.db_txn_count - self.db_txn_count,
new.db_txn_duration_ms - self.db_txn_duration_ms,
new.db_sched_duration_ms - self.db_sched_duration_ms,
)
self.ru_utime = new.ru_utime
self.ru_stime = new.ru_stime
self.db_txn_count = new.db_txn_count
self.db_txn_duration_ms = new.db_txn_duration_ms
self.db_sched_duration_ms = new.db_sched_duration_ms
return diff
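A small sketch of the delta bookkeeping above (FakeContext is invented; numbers arbitrary):
class FakeContext(object):
    # Minimal stand-in for LoggingContext, for illustration only.
    db_txn_count = 5
    db_txn_duration_ms = 30
    db_sched_duration_ms = 4

    def get_resource_usage(self):
        return (2.0, 1.0)

stats = _RequestStats(1.0, 0.5, 2, 10, 1)
delta = stats.update(FakeContext())
# delta.ru_utime == 1.0, delta.db_txn_count == 3, etc.: only the usage since
# the last snapshot, so a /metrics scrape mid-request never double-counts
# when stop() runs later.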

View File

@@ -13,7 +13,8 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import cgi
from six.moves import http_client
from synapse.api.errors import (
cs_exception, SynapseError, CodeMessageException, UnrecognizedRequestError, Codes
@@ -44,6 +45,18 @@ import simplejson
logger = logging.getLogger(__name__)
HTML_ERROR_TEMPLATE = """<!DOCTYPE html>
<html lang=en>
<head>
<meta charset="utf-8">
<title>Error {code}</title>
</head>
<body>
<p>{msg}</p>
</body>
</html>
"""
def wrap_json_request_handler(h):
"""Wraps a request handler method with exception handling.
@@ -102,6 +115,65 @@ def wrap_json_request_handler(h):
return wrap_request_handler_with_logging(wrapped_request_handler)
def wrap_html_request_handler(h):
"""Wraps a request handler method with exception handling.
Also adds logging as per wrap_request_handler_with_logging.
The handler method must have a signature of "handle_foo(self, request)",
where "self" must have a "clock" attribute (and "request" must be a
SynapseRequest).
"""
def wrapped_request_handler(self, request):
d = defer.maybeDeferred(h, self, request)
d.addErrback(_return_html_error, request)
return d
return wrap_request_handler_with_logging(wrapped_request_handler)
def _return_html_error(f, request):
"""Sends an HTML error page corresponding to the given failure
Args:
f (twisted.python.failure.Failure):
request (twisted.web.iweb.IRequest):
"""
if f.check(CodeMessageException):
cme = f.value
code = cme.code
msg = cme.msg
if isinstance(cme, SynapseError):
logger.info(
"%s SynapseError: %s - %s", request, code, msg
)
else:
logger.error(
"Failed handle request %r: %s",
request,
f.getTraceback().rstrip(),
)
else:
code = http_client.INTERNAL_SERVER_ERROR
msg = "Internal server error"
logger.error(
"Failed handle request %r: %s",
request,
f.getTraceback().rstrip(),
)
body = HTML_ERROR_TEMPLATE.format(
code=code, msg=cgi.escape(msg),
).encode("utf-8")
request.setResponseCode(code)
request.setHeader(b"Content-Type", b"text/html; charset=utf-8")
request.setHeader(b"Content-Length", b"%i" % (len(body),))
request.write(body)
finish_request(request)
def wrap_request_handler_with_logging(h):
"""Wraps a request handler to provide logging and metrics
@@ -132,7 +204,7 @@ def wrap_request_handler_with_logging(h):
servlet_name = self.__class__.__name__
with request.processing(servlet_name):
with PreserveLoggingContext(request_context):
d = h(self, request)
d = defer.maybeDeferred(h, self, request)
# record the arrival of the request *after*
# dispatching to the handler, so that the handler

View File

@@ -56,7 +56,7 @@ class SynapseRequest(Request):
def __repr__(self):
# We overwrite this so that we don't log ``access_token``
return '<%s at 0x%x method=%s uri=%s clientproto=%s site=%s>' % (
return '<%s at 0x%x method=%r uri=%r clientproto=%r site=%r>' % (
self.__class__.__name__,
id(self),
self.method,
@@ -85,7 +85,9 @@ class SynapseRequest(Request):
def _started_processing(self, servlet_name):
self.start_time = int(time.time() * 1000)
self.request_metrics = RequestMetrics()
self.request_metrics.start(self.start_time, name=servlet_name)
self.request_metrics.start(
self.start_time, name=servlet_name, method=self.method,
)
self.site.access_logger.info(
"%s - %s - Received request: %s %s",

View File

@@ -15,6 +15,7 @@
import os
from six import iteritems
TICKS_PER_SEC = 100
BYTES_PER_PAGE = 4096
@@ -55,7 +56,7 @@ def update_resource_metrics():
# line is PID (command) more stats go here ...
raw_stats = line.split(") ", 1)[1].split(" ")
for (name, index) in STAT_FIELDS.iteritems():
for (name, index) in iteritems(STAT_FIELDS):
# subtract 3 from the index, because proc(5) is 1-based, and
# we've lost the first two fields in PID and COMMAND above
stats[name] = int(raw_stats[index - 3])
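A standalone sketch of the proc(5) offset convention in that comment (Linux-only; field numbers from proc(5)):
with open("/proc/self/stat") as f:
    line = f.read()

# comm can contain spaces/parens, so split once on ") " before splitting fields.
raw_stats = line.split(") ", 1)[1].split(" ")
utime_ticks = int(raw_stats[14 - 3])   # proc(5) field 14: utime
stime_ticks = int(raw_stats[15 - 3])   # proc(5) field 15: stime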

View File

@@ -30,6 +30,7 @@ from synapse.state import POWER_KEY
from collections import namedtuple
from six import itervalues, iteritems
logger = logging.getLogger(__name__)
@@ -126,7 +127,7 @@ class BulkPushRuleEvaluator(object):
)
auth_events = yield self.store.get_events(auth_events_ids)
auth_events = {
(e.type, e.state_key): e for e in auth_events.itervalues()
(e.type, e.state_key): e for e in itervalues(auth_events)
}
sender_level = get_user_power_level(event.sender, auth_events)
@@ -160,7 +161,7 @@ class BulkPushRuleEvaluator(object):
condition_cache = {}
for uid, rules in rules_by_user.iteritems():
for uid, rules in iteritems(rules_by_user):
if event.sender == uid:
continue
@@ -406,7 +407,7 @@ class RulesForRoom(object):
# If the event is a join event then it will be in the current state events
# map but not in the DB, so we have to explicitly insert it.
if event.type == EventTypes.Member:
for event_id in member_event_ids.itervalues():
for event_id in itervalues(member_event_ids):
if event_id == event.event_id:
members[event_id] = (event.state_key, event.membership)
@@ -414,7 +415,7 @@ class RulesForRoom(object):
logger.debug("Found members %r: %r", self.room_id, members.values())
interested_in_user_ids = set(
user_id for user_id, membership in members.itervalues()
user_id for user_id, membership in itervalues(members)
if membership == Membership.JOIN
)
@@ -426,7 +427,7 @@ class RulesForRoom(object):
)
user_ids = set(
uid for uid, have_pusher in if_users_with_pushers.iteritems() if have_pusher
uid for uid, have_pusher in iteritems(if_users_with_pushers) if have_pusher
)
logger.debug("With pushers: %r", user_ids)
@@ -447,7 +448,7 @@ class RulesForRoom(object):
)
ret_rules_by_user.update(
item for item in rules_by_user.iteritems() if item[0] is not None
item for item in iteritems(rules_by_user) if item[0] is not None
)
self.update_cache(sequence, members, ret_rules_by_user, state_group)

View File

@@ -21,6 +21,8 @@ from synapse.types import UserID
from synapse.util.caches import CACHE_SIZE_FACTOR, register_cache
from synapse.util.caches.lrucache import LruCache
from six import string_types
logger = logging.getLogger(__name__)
@@ -238,7 +240,7 @@ def _flatten_dict(d, prefix=[], result=None):
if result is None:
result = {}
for key, value in d.items():
if isinstance(value, basestring):
if isinstance(value, string_types):
result[".".join(prefix + [key])] = value.lower()
elif hasattr(value, "items"):
_flatten_dict(value, prefix=(prefix + [key]), result=result)
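For clarity, what the flattening above yields on a typical event body (input invented):
print(_flatten_dict({"content": {"body": "Hello World", "msgtype": "m.text"}}))
# => {'content.body': 'hello world', 'content.msgtype': 'm.text'}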

View File

@@ -68,6 +68,7 @@ import synapse.metrics
import struct
import fcntl
from six import iterkeys, iteritems
metrics = synapse.metrics.get_metrics_for(__name__)
@@ -392,7 +393,7 @@ class ServerReplicationStreamProtocol(BaseReplicationStreamProtocol):
if stream_name == "ALL":
# Subscribe to all streams we're publishing to.
for stream in self.streamer.streams_by_name.iterkeys():
for stream in iterkeys(self.streamer.streams_by_name):
self.subscribe_to_stream(stream, token)
else:
self.subscribe_to_stream(stream_name, token)
@@ -498,7 +499,7 @@ class ClientReplicationStreamProtocol(BaseReplicationStreamProtocol):
BaseReplicationStreamProtocol.connectionMade(self)
# Once we've connected subscribe to the necessary streams
for stream_name, token in self.handler.get_streams_to_replicate().iteritems():
for stream_name, token in iteritems(self.handler.get_streams_to_replicate()):
self.replicate(stream_name, token)
# Tell the server if we have any users currently syncing (should only
@@ -633,7 +634,7 @@ metrics.register_callback(
lambda: {
(k[0], p.name, p.conn_id): count
for p in connected_connections
for k, count in p.inbound_commands_counter.counts.iteritems()
for k, count in iteritems(p.inbound_commands_counter.counts)
},
labels=["command", "name", "conn_id"],
)
@@ -643,7 +644,7 @@ metrics.register_callback(
lambda: {
(k[0], p.name, p.conn_id): count
for p in connected_connections
for k, count in p.outbound_commands_counter.counts.iteritems()
for k, count in iteritems(p.outbound_commands_counter.counts)
},
labels=["command", "name", "conn_id"],
)

View File

@@ -26,6 +26,7 @@ from synapse.util.metrics import Measure, measure_func
import logging
import synapse.metrics
from six import itervalues
metrics = synapse.metrics.get_metrics_for(__name__)
stream_updates_counter = metrics.register_counter(
@@ -69,6 +70,7 @@ class ReplicationStreamer(object):
self.presence_handler = hs.get_presence_handler()
self.clock = hs.get_clock()
self.notifier = hs.get_notifier()
self._server_notices_sender = hs.get_server_notices_sender()
# Current connections.
self.connections = []
@@ -79,7 +81,7 @@ class ReplicationStreamer(object):
# We only support the federation stream if federation sending has been
# disabled on the master.
self.streams = [
stream(hs) for stream in STREAMS_MAP.itervalues()
stream(hs) for stream in itervalues(STREAMS_MAP)
if stream != FederationStream or not hs.config.send_federation
]
@@ -253,6 +255,7 @@ class ReplicationStreamer(object):
yield self.store.insert_client_ip(
user_id, access_token, ip, user_agent, device_id, last_seen,
)
yield self._server_notices_sender.on_user_ip(user_id)
def send_sync_to_all_connections(self, data):
"""Sends a SYNC command to all clients.

View File

@@ -19,6 +19,7 @@ import logging
from synapse.api.auth import get_access_token_from_request
from synapse.util.async import ObservableDeferred
from synapse.util.logcontext import make_deferred_yieldable, run_in_background
logger = logging.getLogger(__name__)
@@ -80,27 +81,26 @@ class HttpTransactionCache(object):
Returns:
Deferred which resolves to a tuple of (response_code, response_dict).
"""
try:
return self.transactions[txn_key][0].observe()
except (KeyError, IndexError):
pass # execute the function instead.
if txn_key in self.transactions:
observable = self.transactions[txn_key][0]
else:
# execute the function instead.
deferred = run_in_background(fn, *args, **kwargs)
deferred = fn(*args, **kwargs)
observable = ObservableDeferred(deferred)
self.transactions[txn_key] = (observable, self.clock.time_msec())
# if the request fails with a Twisted failure, remove it
# from the transaction map. This is done to ensure that we don't
# cache transient errors like rate-limiting errors, etc.
def remove_from_map(err):
self.transactions.pop(txn_key, None)
return err
deferred.addErrback(remove_from_map)
# if the request fails with an exception, remove it
# from the transaction map. This is done to ensure that we don't
# cache transient errors like rate-limiting errors, etc.
def remove_from_map(err):
self.transactions.pop(txn_key, None)
# we deliberately do not propagate the error any further, as we
# expect the observers to have reported it.
# We don't add any other errbacks to the raw deferred, so we ask
# ObservableDeferred to swallow the error. This is fine as the error will
# still be reported to the observers.
observable = ObservableDeferred(deferred, consumeErrors=True)
self.transactions[txn_key] = (observable, self.clock.time_msec())
return observable.observe()
deferred.addErrback(remove_from_map)
return make_deferred_yieldable(observable.observe())
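A toy, synchronous analogue of the caching contract above (the real code also shares the in-flight ObservableDeferred between concurrent callers; names invented):
cache = {}

def fetch_or_execute(txn_key, fn, *args, **kwargs):
    if txn_key in cache:
        return cache[txn_key]        # replayed transaction: reuse the response
    result = fn(*args, **kwargs)     # a raised error never enters the cache,
    cache[txn_key] = result          # mirroring remove_from_map() above
    return result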
def _cleanup(self):
now = self.clock.time_msec()

View File

@@ -151,10 +151,11 @@ class PurgeHistoryRestServlet(ClientV1RestServlet):
if event.room_id != room_id:
raise SynapseError(400, "Event is for wrong room.")
depth = event.depth
token = yield self.store.get_topological_token_for_event(event_id)
logger.info(
"[purge] purging up to depth %i (event_id %s)",
depth, event_id,
"[purge] purging up to token %s (event_id %s)",
token, event_id,
)
elif 'purge_up_to_ts' in body:
ts = body['purge_up_to_ts']
@@ -174,7 +175,9 @@ class PurgeHistoryRestServlet(ClientV1RestServlet):
)
)
if room_event_after_stream_ordering:
(_, depth, _) = room_event_after_stream_ordering
token = yield self.store.get_topological_token_for_event(
room_event_after_stream_ordering,
)
else:
logger.warn(
"[purge] purging events not possible: No event found "
@@ -187,9 +190,9 @@ class PurgeHistoryRestServlet(ClientV1RestServlet):
errcode=Codes.NOT_FOUND,
)
logger.info(
"[purge] purging up to depth %i (received_ts %i => "
"[purge] purging up to token %d (received_ts %i => "
"stream_ordering %i)",
depth, ts, stream_ordering,
token, ts, stream_ordering,
)
else:
raise SynapseError(
@@ -199,7 +202,7 @@ class PurgeHistoryRestServlet(ClientV1RestServlet):
)
purge_id = yield self.handlers.message_handler.start_purge_history(
room_id, depth,
room_id, token,
delete_local_events=delete_local_events,
)
@@ -273,8 +276,8 @@ class ShutdownRoomRestServlet(ClientV1RestServlet):
def __init__(self, hs):
super(ShutdownRoomRestServlet, self).__init__(hs)
self.store = hs.get_datastore()
self.handlers = hs.get_handlers()
self.state = hs.get_state_handler()
self._room_creation_handler = hs.get_room_creation_handler()
self.event_creation_handler = hs.get_event_creation_handler()
self.room_member_handler = hs.get_room_member_handler()
@@ -296,7 +299,7 @@ class ShutdownRoomRestServlet(ClientV1RestServlet):
message = content.get("message", self.DEFAULT_MESSAGE)
room_name = content.get("room_name", "Content Violation Notification")
info = yield self.handlers.room_creation_handler.create_room(
info = yield self._room_creation_handler.create_room(
room_creator_requester,
config={
"preset": "public_chat",

View File

@@ -23,6 +23,8 @@ from synapse.handlers.presence import format_user_presence_state
from synapse.http.servlet import parse_json_object_from_request
from .base import ClientV1RestServlet, client_path_patterns
from six import string_types
import logging
logger = logging.getLogger(__name__)
@@ -71,7 +73,7 @@ class PresenceStatusRestServlet(ClientV1RestServlet):
if "status_msg" in content:
state["status_msg"] = content.pop("status_msg")
if not isinstance(state["status_msg"], basestring):
if not isinstance(state["status_msg"], string_types):
raise SynapseError(400, "status_msg must be a string.")
if content:
@@ -129,7 +131,7 @@ class PresenceListRestServlet(ClientV1RestServlet):
if "invite" in content:
for u in content["invite"]:
if not isinstance(u, basestring):
if not isinstance(u, string_types):
raise SynapseError(400, "Bad invite value.")
if len(u) == 0:
continue
@@ -140,7 +142,7 @@ class PresenceListRestServlet(ClientV1RestServlet):
if "drop" in content:
for u in content["drop"]:
if not isinstance(u, basestring):
if not isinstance(u, string_types):
raise SynapseError(400, "Bad drop value.")
if len(u) == 0:
continue

View File

@@ -41,7 +41,7 @@ class RoomCreateRestServlet(ClientV1RestServlet):
def __init__(self, hs):
super(RoomCreateRestServlet, self).__init__(hs)
self.handlers = hs.get_handlers()
self._room_creation_handler = hs.get_room_creation_handler()
def register(self, http_server):
PATTERNS = "/createRoom"
@@ -64,8 +64,7 @@ class RoomCreateRestServlet(ClientV1RestServlet):
def on_POST(self, request):
requester = yield self.auth.get_user_by_req(request)
handler = self.handlers.room_creation_handler
info = yield handler.create_room(
info = yield self._room_creation_handler.create_room(
requester, self.get_room_config(request)
)

View File

@@ -85,6 +85,7 @@ class SyncRestServlet(RestServlet):
self.clock = hs.get_clock()
self.filtering = hs.get_filtering()
self.presence_handler = hs.get_presence_handler()
self._server_notices_sender = hs.get_server_notices_sender()
@defer.inlineCallbacks
def on_GET(self, request):
@@ -149,6 +150,9 @@ class SyncRestServlet(RestServlet):
else:
since_token = None
# send any outstanding server notices to the user.
yield self._server_notices_sender.on_user_syncing(user.to_string())
affect_presence = set_presence != PresenceState.OFFLINE
if affect_presence:

View File

@@ -0,0 +1,222 @@
# -*- coding: utf-8 -*-
# Copyright 2018 New Vector Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from hashlib import sha256
import hmac
import logging
from os import path
from six.moves import http_client
import jinja2
from jinja2 import TemplateNotFound
from twisted.internet import defer
from twisted.web.resource import Resource
from twisted.web.server import NOT_DONE_YET
from synapse.api.errors import NotFoundError, SynapseError, StoreError
from synapse.config import ConfigError
from synapse.http.server import (
finish_request,
wrap_html_request_handler,
)
from synapse.http.servlet import parse_string
from synapse.types import UserID
# language to use for the templates. TODO: figure this out from Accept-Language
TEMPLATE_LANGUAGE = "en"
logger = logging.getLogger(__name__)
# use hmac.compare_digest if we have it (python 2.7.7), else just use equality
if hasattr(hmac, "compare_digest"):
compare_digest = hmac.compare_digest
else:
def compare_digest(a, b):
return a == b
class ConsentResource(Resource):
"""A twisted Resource to display a privacy policy and gather consent to it
When accessed via GET, returns the privacy policy via a template.
When accessed via POST, records the user's consent in the database and
displays a success page.
The config should include a template_dir setting which contains templates
for the HTML. The directory should contain one subdirectory per language
(eg, 'en', 'fr'), and each language directory should contain the policy
document (named as '<version>.html') and a success page (success.html).
Both forms take a set of parameters from the browser. For the POST form,
these are normally sent as form parameters (but may be query-params); for
GET requests they must be query params. These are:
u: the complete mxid, or the localpart of the user giving their
consent. Required for both GET (where it is used as an input to the
template) and for POST (where it is used to find the row in the db
to update).
h: hmac_sha256(secret, u), where 'secret' is the form_secret in the
config file. If it doesn't match, the request is 403ed.
v: the version of the privacy policy being agreed to.
For GET: optional, and defaults to whatever was set in the config
file. Used to choose the version of the policy to pick from the
templates directory.
For POST: required; gives the value to be recorded in the database
against the user.
"""
def __init__(self, hs):
"""
Args:
hs (synapse.server.HomeServer): homeserver
"""
Resource.__init__(self)
self.hs = hs
self.store = hs.get_datastore()
# this is required by the request_handler wrapper
self.clock = hs.get_clock()
self._default_consent_version = hs.config.user_consent_version
if self._default_consent_version is None:
raise ConfigError(
"Consent resource is enabled but user_consent section is "
"missing in config file.",
)
# daemonize changes the cwd to /, so make the path absolute now.
consent_template_directory = path.abspath(
hs.config.user_consent_template_dir,
)
if not path.isdir(consent_template_directory):
raise ConfigError(
"Could not find template directory '%s'" % (
consent_template_directory,
),
)
loader = jinja2.FileSystemLoader(consent_template_directory)
self._jinja_env = jinja2.Environment(
loader=loader,
autoescape=jinja2.select_autoescape(['html', 'htm', 'xml']),
)
if hs.config.form_secret is None:
raise ConfigError(
"Consent resource is enabled but form_secret is not set in "
"config file. It should be set to an arbitrary secret string.",
)
self._hmac_secret = hs.config.form_secret.encode("utf-8")
def render_GET(self, request):
self._async_render_GET(request)
return NOT_DONE_YET
@wrap_html_request_handler
@defer.inlineCallbacks
def _async_render_GET(self, request):
"""
Args:
request (twisted.web.http.Request):
"""
version = parse_string(request, "v",
default=self._default_consent_version)
username = parse_string(request, "u", required=True)
userhmac = parse_string(request, "h", required=True)
self._check_hash(username, userhmac)
if username.startswith('@'):
qualified_user_id = username
else:
qualified_user_id = UserID(username, self.hs.hostname).to_string()
u = yield self.store.get_user_by_id(qualified_user_id)
if u is None:
raise NotFoundError("Unknown user")
try:
self._render_template(
request, "%s.html" % (version,),
user=username, userhmac=userhmac, version=version,
has_consented=(u["consent_version"] == version),
)
except TemplateNotFound:
raise NotFoundError("Unknown policy version")
def render_POST(self, request):
self._async_render_POST(request)
return NOT_DONE_YET
@wrap_html_request_handler
@defer.inlineCallbacks
def _async_render_POST(self, request):
"""
Args:
request (twisted.web.http.Request):
"""
version = parse_string(request, "v", required=True)
username = parse_string(request, "u", required=True)
userhmac = parse_string(request, "h", required=True)
self._check_hash(username, userhmac)
if username.startswith('@'):
qualified_user_id = username
else:
qualified_user_id = UserID(username, self.hs.hostname).to_string()
try:
yield self.store.user_set_consent_version(qualified_user_id, version)
except StoreError as e:
if e.code != 404:
raise
raise NotFoundError("Unknown user")
try:
self._render_template(request, "success.html")
except TemplateNotFound:
raise NotFoundError("success.html not found")
def _render_template(self, request, template_name, **template_args):
# get_template checks for ".." so we don't need to worry too much
# about path traversal here.
template_html = self._jinja_env.get_template(
path.join(TEMPLATE_LANGUAGE, template_name)
)
html_bytes = template_html.render(**template_args).encode("utf8")
request.setHeader(b"Content-Type", b"text/html; charset=utf-8")
request.setHeader(b"Content-Length", b"%i" % len(html_bytes))
request.write(html_bytes)
finish_request(request)
def _check_hash(self, userid, userhmac):
want_mac = hmac.new(
key=self._hmac_secret,
msg=userid,
digestmod=sha256,
).hexdigest()
if not compare_digest(want_mac, userhmac):
raise SynapseError(http_client.FORBIDDEN, "HMAC incorrect")
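
For reference, the `h` parameter validated above can be generated as
follows; a minimal standalone sketch mirroring `_check_hash`, with the
secret and user id as assumed illustrative values:

    import hmac
    from hashlib import sha256

    form_secret = "somesecret"      # assumed value for illustration
    username = "@bob:example.com"   # the 'u' parameter

    h = hmac.new(
        key=form_secret.encode("utf-8"),
        # encoded for py3; on py2 a plain str works directly
        msg=username.encode("utf-8"),
        digestmod=sha256,
    ).hexdigest()
    # 'u' and 'h' are then passed as query parameters to the consent page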


@@ -48,6 +48,7 @@ import shutil
import cgi
import logging
from six.moves.urllib import parse as urlparse
from six import iteritems
logger = logging.getLogger(__name__)
@@ -603,7 +604,7 @@ class MediaRepository(object):
thumbnails[(t_width, t_height, r_type)] = r_method
# Now we generate the thumbnails for each dimension, store it
for (t_width, t_height, t_type), t_method in thumbnails.iteritems():
for (t_width, t_height, t_type), t_method in iteritems(thumbnails):
# Generate the thumbnail
if t_method == "crop":
t_byte_source = yield make_deferred_yieldable(threads.deferToThread(


@@ -24,7 +24,9 @@ import shutil
import sys
import traceback
import simplejson as json
import urlparse
from six.moves import urllib_parse as urlparse
from six import string_types
from twisted.web.server import NOT_DONE_YET
from twisted.internet import defer
@@ -590,8 +592,8 @@ def _iterate_over_text(tree, *tags_to_ignore):
# to be returned.
elements = iter([tree])
while True:
el = elements.next()
if isinstance(el, basestring):
el = next(elements)
if isinstance(el, string_types):
yield el
elif el is not None and el.tag not in tags_to_ignore:
# el.text is the text before the first child, so we can immediately


@@ -46,6 +46,7 @@ from synapse.handlers.devicemessage import DeviceMessageHandler
from synapse.handlers.device import DeviceHandler
from synapse.handlers.e2e_keys import E2eKeysHandler
from synapse.handlers.presence import PresenceHandler
from synapse.handlers.room import RoomCreationHandler
from synapse.handlers.room_list import RoomListHandler
from synapse.handlers.room_member import RoomMemberMasterHandler
from synapse.handlers.room_member_worker import RoomMemberWorkerHandler
@@ -71,6 +72,11 @@ from synapse.rest.media.v1.media_repository import (
MediaRepository,
MediaRepositoryResource,
)
from synapse.server_notices.server_notices_manager import ServerNoticesManager
from synapse.server_notices.server_notices_sender import ServerNoticesSender
from synapse.server_notices.worker_server_notices_sender import (
WorkerServerNoticesSender,
)
from synapse.state import StateHandler, StateResolutionHandler
from synapse.storage import DataStore
from synapse.streams.events import EventSources
@@ -97,6 +103,9 @@ class HomeServer(object):
which must be implemented by the subclass. This code may call any of the
required "get" methods on the instance to obtain the sub-dependencies that
one requires.
Attributes:
config (synapse.config.homeserver.HomeserverConfig):
"""
DEPENDENCIES = [
@@ -106,6 +115,7 @@ class HomeServer(object):
'federation_server',
'handlers',
'auth',
'room_creation_handler',
'state_handler',
'state_resolution_handler',
'presence_handler',
@@ -151,6 +161,8 @@ class HomeServer(object):
'spam_checker',
'room_member_handler',
'federation_registry',
'server_notices_manager',
'server_notices_sender',
]
def __init__(self, hostname, **kwargs):
@@ -224,6 +236,9 @@ class HomeServer(object):
def build_simple_http_client(self):
return SimpleHttpClient(self)
def build_room_creation_handler(self):
return RoomCreationHandler(self)
def build_state_handler(self):
return StateHandler(self)
@@ -390,6 +405,16 @@ class HomeServer(object):
def build_federation_registry(self):
return FederationHandlerRegistry()
def build_server_notices_manager(self):
if self.config.worker_app:
raise Exception("Workers cannot send server notices")
return ServerNoticesManager(self)
def build_server_notices_sender(self):
if self.config.worker_app:
return WorkerServerNoticesSender(self)
return ServerNoticesSender(self)
def remove_pusher(self, app_id, push_key, user_id):
return self.get_pusherpool().remove_pusher(app_id, push_key, user_id)
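
Each name in DEPENDENCIES is paired with a build_<name> method, and a
cached get_<name> accessor is synthesised for it at import time. A
minimal sketch of that pattern (class body reduced to two illustrative
dependencies, with stand-in build methods):

    def _make_dependency_method(depname):
        builder_name = "build_%s" % (depname,)

        def _get(hs):
            # return the cached instance, building and caching it on first use
            try:
                return getattr(hs, "_" + depname)
            except AttributeError:
                pass
            dep = getattr(hs, builder_name)()
            setattr(hs, "_" + depname, dep)
            return dep

        return _get

    class MiniHomeServer(object):
        DEPENDENCIES = ["room_creation_handler", "server_notices_sender"]

        def build_room_creation_handler(self):
            return object()  # stand-in for RoomCreationHandler(self)

        def build_server_notices_sender(self):
            return object()  # stand-in for ServerNoticesSender(self)

    for _name in MiniHomeServer.DEPENDENCIES:
        setattr(MiniHomeServer, "get_%s" % (_name,), _make_dependency_method(_name))

    hs = MiniHomeServer()
    # the second call returns the cached instance
    assert hs.get_room_creation_handler() is hs.get_room_creation_handler()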


@@ -1,4 +1,5 @@
import synapse.api.auth
import synapse.config.homeserver
import synapse.federation.transaction_queue
import synapse.federation.transport.client
import synapse.handlers
@@ -8,11 +9,17 @@ import synapse.handlers.device
import synapse.handlers.e2e_keys
import synapse.handlers.set_password
import synapse.rest.media.v1.media_repository
import synapse.server_notices.server_notices_manager
import synapse.server_notices.server_notices_sender
import synapse.state
import synapse.storage
class HomeServer(object):
@property
def config(self) -> synapse.config.homeserver.HomeServerConfig:
pass
def get_auth(self) -> synapse.api.auth.Auth:
pass
@@ -40,6 +47,12 @@ class HomeServer(object):
def get_deactivate_account_handler(self) -> synapse.handlers.deactivate_account.DeactivateAccountHandler:
pass
def get_room_creation_handler(self) -> synapse.handlers.room.RoomCreationHandler:
pass
def get_event_creation_handler(self) -> synapse.handlers.message.EventCreationHandler:
pass
def get_set_password_handler(self) -> synapse.handlers.set_password.SetPasswordHandler:
pass
@@ -54,3 +67,9 @@ class HomeServer(object):
def get_media_repository(self) -> synapse.rest.media.v1.media_repository.MediaRepository:
pass
def get_server_notices_manager(self) -> synapse.server_notices.server_notices_manager.ServerNoticesManager:
pass
def get_server_notices_sender(self) -> synapse.server_notices.server_notices_sender.ServerNoticesSender:
pass


@@ -0,0 +1,132 @@
# -*- coding: utf-8 -*-
# Copyright 2018 New Vector Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
from six import (iteritems, string_types)
from twisted.internet import defer
from synapse.api.errors import SynapseError
from synapse.api.urls import ConsentURIBuilder
from synapse.config import ConfigError
from synapse.types import get_localpart_from_id
logger = logging.getLogger(__name__)
class ConsentServerNotices(object):
"""Keeps track of whether we need to send users server_notices about
privacy policy consent, and sends one if we do.
"""
def __init__(self, hs):
"""
Args:
hs (synapse.server.HomeServer):
"""
self._server_notices_manager = hs.get_server_notices_manager()
self._store = hs.get_datastore()
self._users_in_progress = set()
self._current_consent_version = hs.config.user_consent_version
self._server_notice_content = hs.config.user_consent_server_notice_content
if self._server_notice_content is not None:
if not self._server_notices_manager.is_enabled():
raise ConfigError(
"user_consent configuration requires server notices, but "
"server notices are not enabled.",
)
if 'body' not in self._server_notice_content:
raise ConfigError(
"user_consent server_notice_consent must contain a 'body' "
"key.",
)
self._consent_uri_builder = ConsentURIBuilder(hs.config)
@defer.inlineCallbacks
def maybe_send_server_notice_to_user(self, user_id):
"""Check if we need to send a notice to this user, and does so if so
Args:
user_id (str): user to check
Returns:
Deferred
"""
if self._server_notice_content is None:
# not enabled
return
# make sure we don't send two messages to the same user at once
if user_id in self._users_in_progress:
return
self._users_in_progress.add(user_id)
try:
u = yield self._store.get_user_by_id(user_id)
if u["consent_version"] == self._current_consent_version:
# user has already consented
return
if u["consent_server_notice_sent"] == self._current_consent_version:
# we've already sent a notice to the user
return
# need to send a message.
try:
consent_uri = self._consent_uri_builder.build_user_consent_uri(
get_localpart_from_id(user_id),
)
content = copy_with_str_subst(
self._server_notice_content, {
'consent_uri': consent_uri,
},
)
yield self._server_notices_manager.send_notice(
user_id, content,
)
yield self._store.user_set_consent_server_notice_sent(
user_id, self._current_consent_version,
)
except SynapseError as e:
logger.error("Error sending server notice about user consent: %s", e)
finally:
self._users_in_progress.remove(user_id)
def copy_with_str_subst(x, substitutions):
"""Deep-copy a structure, carrying out string substitions on any strings
Args:
x (object): structure to be copied
substitutions (object): substitutions to be made - passed into the
string '%' operator
Returns:
copy of x
"""
if isinstance(x, string_types):
return x % substitutions
if isinstance(x, dict):
return {
k: copy_with_str_subst(v, substitutions) for (k, v) in iteritems(x)
}
if isinstance(x, (list, tuple)):
return [copy_with_str_subst(y, substitutions) for y in x]
# assume it's uninteresting and can be shallow-copied.
return x
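
For example, substituting a consent URI into a configured notice content
with the helper above (all values illustrative):

    content = copy_with_str_subst(
        {
            "msgtype": "m.text",
            "body": "Please agree to the privacy policy at %(consent_uri)s.",
        },
        {"consent_uri": "https://example.com/consent?u=bob&h=..."},
    )
    assert "example.com/consent" in content["body"]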


@@ -0,0 +1,146 @@
# -*- coding: utf-8 -*-
# Copyright 2018 New Vector Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
from twisted.internet import defer
from synapse.api.constants import EventTypes, Membership, RoomCreationPreset
from synapse.types import create_requester
from synapse.util.caches.descriptors import cachedInlineCallbacks
logger = logging.getLogger(__name__)
class ServerNoticesManager(object):
def __init__(self, hs):
"""
Args:
hs (synapse.server.HomeServer):
"""
self._store = hs.get_datastore()
self._config = hs.config
self._room_creation_handler = hs.get_room_creation_handler()
self._event_creation_handler = hs.get_event_creation_handler()
self._is_mine_id = hs.is_mine_id
def is_enabled(self):
"""Checks if server notices are enabled on this server.
Returns:
bool
"""
return self._config.server_notices_mxid is not None
@defer.inlineCallbacks
def send_notice(self, user_id, event_content):
"""Send a notice to the given user
Creates the server notices room, if none exists.
Args:
user_id (str): mxid of user to send event to.
event_content (dict): content of event to send
Returns:
Deferred[None]
"""
room_id = yield self.get_notice_room_for_user(user_id)
system_mxid = self._config.server_notices_mxid
requester = create_requester(system_mxid)
logger.info("Sending server notice to %s", user_id)
yield self._event_creation_handler.create_and_send_nonmember_event(
requester, {
"type": EventTypes.Message,
"room_id": room_id,
"sender": system_mxid,
"content": event_content,
},
ratelimit=False,
)
@cachedInlineCallbacks()
def get_notice_room_for_user(self, user_id):
"""Get the room for notices for a given user
If we have not yet created a notice room for this user, create it
Args:
user_id (str): complete user id for the user we want a room for
Returns:
str: room id of notice room.
"""
if not self.is_enabled():
raise Exception("Server notices not enabled")
assert self._is_mine_id(user_id), \
"Cannot send server notices to remote users"
rooms = yield self._store.get_rooms_for_user_where_membership_is(
user_id, [Membership.INVITE, Membership.JOIN],
)
system_mxid = self._config.server_notices_mxid
for room in rooms:
# it's worth noting that there is an asymmetry here in that we
# expect the user to be invited or joined, but the system user must
# be joined. This is kinda deliberate, in that if somebody somehow
# manages to invite the system user to a room, that doesn't make it
# the server notices room.
user_ids = yield self._store.get_users_in_room(room.room_id)
if system_mxid in user_ids:
# we found a room which our user shares with the system notice
# user
logger.info("Using room %s", room.room_id)
defer.returnValue(room.room_id)
# apparently no existing notice room: create a new one
logger.info("Creating server notices room for %s", user_id)
# see if we want to override the profile info for the server user.
# note that if we want to override either the display name or the
# avatar, we have to use both.
join_profile = None
if (
self._config.server_notices_mxid_display_name is not None or
self._config.server_notices_mxid_avatar_url is not None
):
join_profile = {
"displayname": self._config.server_notices_mxid_display_name,
"avatar_url": self._config.server_notices_mxid_avatar_url,
}
requester = create_requester(system_mxid)
info = yield self._room_creation_handler.create_room(
requester,
config={
"preset": RoomCreationPreset.PRIVATE_CHAT,
"name": self._config.server_notices_room_name,
"power_level_content_override": {
"users_default": -10,
},
"invite": (user_id,)
},
ratelimit=False,
creator_join_profile=join_profile,
)
room_id = info['room_id']
logger.info("Created server notices room %s for %s", room_id, user_id)
defer.returnValue(room_id)
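
A caller on the master process might then use the manager like so (a
hypothetical helper; the content argument is an ordinary m.room.message
content dict):

    from twisted.internet import defer

    @defer.inlineCallbacks
    def notify_user(hs, user_id):
        notices = hs.get_server_notices_manager()
        if not notices.is_enabled():
            return
        # creates the per-user notices room on first use
        yield notices.send_notice(
            user_id, {"msgtype": "m.text", "body": "Maintenance at 02:00 UTC"},
        )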


@@ -0,0 +1,58 @@
# -*- coding: utf-8 -*-
# Copyright 2018 New Vector Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from synapse.server_notices.consent_server_notices import ConsentServerNotices
class ServerNoticesSender(object):
"""A centralised place which sends server notices automatically when
Certain Events take place
"""
def __init__(self, hs):
"""
Args:
hs (synapse.server.HomeServer):
"""
# todo: it would be nice to make this more dynamic
self._consent_server_notices = ConsentServerNotices(hs)
def on_user_syncing(self, user_id):
"""Called when the user performs a sync operation.
Args:
user_id (str): mxid of user who synced
Returns:
Deferred
"""
return self._consent_server_notices.maybe_send_server_notice_to_user(
user_id,
)
def on_user_ip(self, user_id):
"""Called on the master when a worker process saw a client request.
Args:
user_id (str): mxid
Returns:
Deferred
"""
# The synchrotrons use a stubbed version of ServerNoticesSender, so
# we check for notices to send to the user in on_user_ip as well as
# in on_user_syncing
return self._consent_server_notices.maybe_send_server_notice_to_user(
user_id,
)


@@ -0,0 +1,46 @@
# -*- coding: utf-8 -*-
# Copyright 2018 New Vector Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from twisted.internet import defer
class WorkerServerNoticesSender(object):
"""Stub impl of ServerNoticesSender which does nothing"""
def __init__(self, hs):
"""
Args:
hs (synapse.server.HomeServer):
"""
def on_user_syncing(self, user_id):
"""Called when the user performs a sync operation.
Args:
user_id (str): mxid of user who synced
Returns:
Deferred
"""
return defer.succeed(None)
def on_user_ip(self, user_id):
"""Called on the master when a worker process saw a client request.
Args:
user_id (str): mxid
Returns:
Deferred
"""
raise AssertionError("on_user_ip unexpectedly called on worker")


@@ -32,6 +32,8 @@ from frozendict import frozendict
import logging
import hashlib
from six import iteritems, itervalues
logger = logging.getLogger(__name__)
@@ -132,7 +134,7 @@ class StateHandler(object):
state_map = yield self.store.get_events(state.values(), get_prev_content=False)
state = {
key: state_map[e_id] for key, e_id in state.iteritems() if e_id in state_map
key: state_map[e_id] for key, e_id in iteritems(state) if e_id in state_map
}
defer.returnValue(state)
@@ -338,7 +340,7 @@ class StateHandler(object):
)
if len(state_groups_ids) == 1:
name, state_list = state_groups_ids.items().pop()
name, state_list = list(state_groups_ids.items()).pop()
prev_group, delta_ids = yield self.store.get_state_group_delta(name)
@@ -378,7 +380,7 @@ class StateHandler(object):
new_state = resolve_events_with_state_map(state_set_ids, state_map)
new_state = {
key: state_map[ev_id] for key, ev_id in new_state.iteritems()
key: state_map[ev_id] for key, ev_id in iteritems(new_state)
}
return new_state
@@ -458,15 +460,15 @@ class StateResolutionHandler(object):
# build a map from state key to the event_ids which set that state.
# dict[(str, str), set[str])
state = {}
for st in state_groups_ids.itervalues():
for key, e_id in st.iteritems():
for st in itervalues(state_groups_ids):
for key, e_id in iteritems(st):
state.setdefault(key, set()).add(e_id)
# build a map from state key to the event_ids which set that state,
# including only those where there are state keys in conflict.
conflicted_state = {
k: list(v)
for k, v in state.iteritems()
for k, v in iteritems(state)
if len(v) > 1
}
@@ -474,13 +476,13 @@ class StateResolutionHandler(object):
logger.info("Resolving conflicted state for %r", room_id)
with Measure(self.clock, "state._resolve_events"):
new_state = yield resolve_events_with_factory(
state_groups_ids.values(),
list(state_groups_ids.values()),
event_map=event_map,
state_map_factory=state_map_factory,
)
else:
new_state = {
key: e_ids.pop() for key, e_ids in state.iteritems()
key: e_ids.pop() for key, e_ids in iteritems(state)
}
with Measure(self.clock, "state.create_group_ids"):
@@ -489,8 +491,8 @@ class StateResolutionHandler(object):
# which will be used as a cache key for future resolutions, but
# not get persisted.
state_group = None
new_state_event_ids = frozenset(new_state.itervalues())
for sg, events in state_groups_ids.iteritems():
new_state_event_ids = frozenset(itervalues(new_state))
for sg, events in iteritems(state_groups_ids):
if new_state_event_ids == frozenset(e_id for e_id in events):
state_group = sg
break
@@ -501,11 +503,11 @@ class StateResolutionHandler(object):
prev_group = None
delta_ids = None
for old_group, old_ids in state_groups_ids.iteritems():
for old_group, old_ids in iteritems(state_groups_ids):
if not set(new_state) - set(old_ids):
n_delta_ids = {
k: v
for k, v in new_state.iteritems()
for k, v in iteritems(new_state)
if old_ids.get(k) != v
}
if not delta_ids or len(n_delta_ids) < len(delta_ids):
@@ -527,7 +529,7 @@ class StateResolutionHandler(object):
def _ordered_events(events):
def key_func(e):
return -int(e.depth), hashlib.sha1(e.event_id).hexdigest()
return -int(e.depth), hashlib.sha1(e.event_id.encode()).hexdigest()
return sorted(events, key=key_func)
@@ -584,7 +586,7 @@ def _seperate(state_sets):
conflicted_state = {}
for state_set in state_sets[1:]:
for key, value in state_set.iteritems():
for key, value in iteritems(state_set):
# Check if there is an unconflicted entry for the state key.
unconflicted_value = unconflicted_state.get(key)
if unconflicted_value is None:
@@ -640,7 +642,7 @@ def resolve_events_with_factory(state_sets, event_map, state_map_factory):
needed_events = set(
event_id
for event_ids in conflicted_state.itervalues()
for event_ids in itervalues(conflicted_state)
for event_id in event_ids
)
if event_map is not None:
@@ -662,7 +664,7 @@ def resolve_events_with_factory(state_sets, event_map, state_map_factory):
unconflicted_state, conflicted_state, state_map
)
new_needed_events = set(auth_events.itervalues())
new_needed_events = set(itervalues(auth_events))
new_needed_events -= needed_events
if event_map is not None:
new_needed_events -= set(event_map.iterkeys())
@@ -679,7 +681,7 @@ def resolve_events_with_factory(state_sets, event_map, state_map_factory):
def _create_auth_events_from_maps(unconflicted_state, conflicted_state, state_map):
auth_events = {}
for event_ids in conflicted_state.itervalues():
for event_ids in itervalues(conflicted_state):
for event_id in event_ids:
if event_id in state_map:
keys = event_auth.auth_types_for_event(state_map[event_id])
@@ -694,7 +696,7 @@ def _create_auth_events_from_maps(unconflicted_state, conflicted_state, state_ma
def _resolve_with_state(unconflicted_state_ids, conflicted_state_ds, auth_event_ids,
state_map):
conflicted_state = {}
for key, event_ids in conflicted_state_ds.iteritems():
for key, event_ids in iteritems(conflicted_state_ds):
events = [state_map[ev_id] for ev_id in event_ids if ev_id in state_map]
if len(events) > 1:
conflicted_state[key] = events
@@ -703,7 +705,7 @@ def _resolve_with_state(unconflicted_state_ids, conflicted_state_ds, auth_event_
auth_events = {
key: state_map[ev_id]
for key, ev_id in auth_event_ids.iteritems()
for key, ev_id in iteritems(auth_event_ids)
if ev_id in state_map
}
@@ -716,7 +718,7 @@ def _resolve_with_state(unconflicted_state_ids, conflicted_state_ds, auth_event_
raise
new_state = unconflicted_state_ids
for key, event in resolved_state.iteritems():
for key, event in iteritems(resolved_state):
new_state[key] = event.event_id
return new_state
@@ -741,7 +743,7 @@ def _resolve_state_events(conflicted_state, auth_events):
auth_events.update(resolved_state)
for key, events in conflicted_state.iteritems():
for key, events in iteritems(conflicted_state):
if key[0] == EventTypes.JoinRules:
logger.debug("Resolving conflicted join rules %r", events)
resolved_state[key] = _resolve_auth_events(
@@ -751,7 +753,7 @@ def _resolve_state_events(conflicted_state, auth_events):
auth_events.update(resolved_state)
for key, events in conflicted_state.iteritems():
for key, events in iteritems(conflicted_state):
if key[0] == EventTypes.Member:
logger.debug("Resolving conflicted member lists %r", events)
resolved_state[key] = _resolve_auth_events(
@@ -761,7 +763,7 @@ def _resolve_state_events(conflicted_state, auth_events):
auth_events.update(resolved_state)
for key, events in conflicted_state.iteritems():
for key, events in iteritems(conflicted_state):
if key not in resolved_state:
logger.debug("Resolving conflicted state %r:%r", key, events)
resolved_state[key] = _resolve_normal_events(


@@ -14,6 +14,11 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import calendar
import datetime
from dateutil import tz
import time
import logging
from synapse.storage.devices import DeviceStore
from .appservice import (
ApplicationServiceStore, ApplicationServiceTransactionStore
@@ -55,10 +60,6 @@ from .engines import PostgresEngine
from synapse.api.constants import PresenceState
from synapse.util.caches.stream_change_cache import StreamChangeCache
import logging
logger = logging.getLogger(__name__)
@@ -130,6 +131,7 @@ class DataStore(RoomMemberStore, RoomStore,
self._group_updates_id_gen = StreamIdGenerator(
db_conn, "local_group_updates", "stream_id",
)
self._chunk_id_gen = IdGenerator(db_conn, "events", "chunk_id")
if isinstance(self.database_engine, PostgresEngine):
self._cache_id_gen = StreamIdGenerator(
@@ -213,6 +215,9 @@ class DataStore(RoomMemberStore, RoomStore,
self._stream_order_on_start = self.get_room_max_stream_ordering()
self._min_stream_order_on_start = self.get_room_min_stream_ordering()
# Used in _generate_user_daily_visits to keep track of progress
self._last_user_visit_update = self._get_start_of_day()
super(DataStore, self).__init__(db_conn, hs)
def take_presence_startup_info(self):
@@ -347,6 +352,69 @@ class DataStore(RoomMemberStore, RoomStore,
return self.runInteraction("count_r30_users", _count_r30_users)
def _get_start_of_day(self):
"""
Returns millisecond unixtime for start of UTC day.
"""
now = datetime.datetime.utcnow()
today_start = datetime.datetime(now.year, now.month,
now.day, tzinfo=tz.tzutc())
# calendar.timegm treats the tuple as UTC; time.mktime would have
# applied the server's local offset and skewed the result
return int(calendar.timegm(today_start.timetuple())) * 1000
def generate_user_daily_visits(self):
"""
Generates daily visit data for use in cohort/retention analysis
"""
def _generate_user_daily_visits(txn):
logger.info("Calling _generate_user_daily_visits")
today_start = self._get_start_of_day()
a_day_in_milliseconds = 24 * 60 * 60 * 1000
now = self.clock.time_msec()
sql = """
INSERT INTO user_daily_visits (user_id, device_id, timestamp)
SELECT u.user_id, u.device_id, ?
FROM user_ips AS u
LEFT JOIN (
SELECT user_id, device_id, timestamp FROM user_daily_visits
WHERE timestamp = ?
) udv
ON u.user_id = udv.user_id AND u.device_id=udv.device_id
INNER JOIN users ON users.name=u.user_id
WHERE last_seen > ? AND last_seen <= ?
AND udv.timestamp IS NULL AND users.is_guest=0
AND users.appservice_id IS NULL
GROUP BY u.user_id, u.device_id
"""
# This means that the day has rolled over but there could still
# be entries from the previous day. There is an edge case
# where if the user logs in at 23:59 and overwrites their
# last_seen at 00:01 then they will not be counted in the
# previous day's stats - it is important that the query is run
# often to minimise this case.
if today_start > self._last_user_visit_update:
yesterday_start = today_start - a_day_in_milliseconds
txn.execute(sql, (
yesterday_start, yesterday_start,
self._last_user_visit_update, today_start
))
self._last_user_visit_update = today_start
txn.execute(sql, (
today_start, today_start,
self._last_user_visit_update,
now
))
# Update _last_user_visit_update to now. The reason to do this
# rather than just clamping to the beginning of the day is to limit
# the size of the join - meaning that the query can be run more
# frequently.
self._last_user_visit_update = now
return self.runInteraction("generate_user_daily_visits",
_generate_user_daily_visits)
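
To spell out the start-of-day arithmetic used above: `calendar.timegm`
interprets the time tuple as UTC, so the result does not depend on the
server's local timezone. A standalone check (times illustrative):

    import calendar
    import datetime

    def start_of_utc_day_ms(now):
        # millisecond unixtime of the most recent UTC midnight before `now`
        midnight = datetime.datetime(now.year, now.month, now.day)
        return calendar.timegm(midnight.timetuple()) * 1000

    now = datetime.datetime(2018, 5, 25, 13, 0, 0)
    assert start_of_utc_day_ms(now) == 1527206400000  # 2018-05-25T00:00:00Z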
def get_users(self):
"""Function to reterive a list of users in users table.


@@ -27,9 +27,17 @@ import sys
import time
import threading
from six import itervalues, iterkeys, iteritems
from six.moves import intern, range
logger = logging.getLogger(__name__)
try:
MAX_TXN_ID = sys.maxint - 1
except AttributeError:
# python 3 does not have a maximum int value
MAX_TXN_ID = 2**63 - 1
sql_logger = logging.getLogger("synapse.storage.SQL")
transaction_logger = logging.getLogger("synapse.storage.txn")
perf_logger = logging.getLogger("synapse.storage.TIME")
@@ -137,7 +145,7 @@ class PerformanceCounters(object):
def interval(self, interval_duration, limit=3):
counters = []
for name, (count, cum_time) in self.current_counters.iteritems():
for name, (count, cum_time) in iteritems(self.current_counters):
prev_count, prev_time = self.previous_counters.get(name, (0, 0))
counters.append((
(cum_time - prev_time) / interval_duration,
@@ -222,7 +230,7 @@ class SQLBaseStore(object):
# We don't really need these to be unique, so lets stop it from
# growing really large.
self._TXN_ID = (self._TXN_ID + 1) % (sys.maxint - 1)
self._TXN_ID = (self._TXN_ID + 1) % (MAX_TXN_ID)
name = "%s-%x" % (desc, txn_id, )
@@ -543,7 +551,7 @@ class SQLBaseStore(object):
", ".join("%s = ?" % (k,) for k in values),
" AND ".join("%s = ?" % (k,) for k in keyvalues)
)
sqlargs = values.values() + keyvalues.values()
sqlargs = list(values.values()) + list(keyvalues.values())
txn.execute(sql, sqlargs)
if txn.rowcount > 0:
@@ -561,7 +569,7 @@ class SQLBaseStore(object):
", ".join(k for k in allvalues),
", ".join("?" for _ in allvalues)
)
txn.execute(sql, allvalues.values())
txn.execute(sql, list(allvalues.values()))
# successfully inserted
return True
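
The `list(...)` wrappers in this hunk and below exist because on
Python 3 `dict.values()` returns a view object, which cannot be
concatenated or sliced:

    values = {"a": 1}
    keyvalues = {"b": 2}
    # Python 2: both are real lists, so '+' works.
    # Python 3: TypeError - dict_values does not support '+'.
    # sqlargs = values.values() + keyvalues.values()
    sqlargs = list(values.values()) + list(keyvalues.values())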
@@ -629,8 +637,8 @@ class SQLBaseStore(object):
}
if keyvalues:
sql += " WHERE %s" % " AND ".join("%s = ?" % k for k in keyvalues.iterkeys())
txn.execute(sql, keyvalues.values())
sql += " WHERE %s" % " AND ".join("%s = ?" % k for k in iterkeys(keyvalues))
txn.execute(sql, list(keyvalues.values()))
else:
txn.execute(sql)
@@ -694,7 +702,7 @@ class SQLBaseStore(object):
table,
" AND ".join("%s = ?" % (k, ) for k in keyvalues)
)
txn.execute(sql, keyvalues.values())
txn.execute(sql, list(keyvalues.values()))
else:
sql = "SELECT %s FROM %s" % (
", ".join(retcols),
@@ -725,9 +733,12 @@ class SQLBaseStore(object):
if not iterable:
defer.returnValue(results)
# iterables cannot be sliced, so convert to a list first
it_list = list(iterable)
chunks = [
iterable[i:i + batch_size]
for i in xrange(0, len(iterable), batch_size)
it_list[i:i + batch_size]
for i in range(0, len(it_list), batch_size)
]
for chunk in chunks:
rows = yield self.runInteraction(
@@ -767,7 +778,7 @@ class SQLBaseStore(object):
)
values.extend(iterable)
for key, value in keyvalues.iteritems():
for key, value in iteritems(keyvalues):
clauses.append("%s = ?" % (key,))
values.append(value)
@@ -790,7 +801,7 @@ class SQLBaseStore(object):
@staticmethod
def _simple_update_txn(txn, table, keyvalues, updatevalues):
if keyvalues:
where = "WHERE %s" % " AND ".join("%s = ?" % k for k in keyvalues.iterkeys())
where = "WHERE %s" % " AND ".join("%s = ?" % k for k in iterkeys(keyvalues))
else:
where = ""
@@ -802,7 +813,7 @@ class SQLBaseStore(object):
txn.execute(
update_sql,
updatevalues.values() + keyvalues.values()
list(updatevalues.values()) + list(keyvalues.values())
)
return txn.rowcount
@@ -850,7 +861,7 @@ class SQLBaseStore(object):
" AND ".join("%s = ?" % (k,) for k in keyvalues)
)
txn.execute(select_sql, keyvalues.values())
txn.execute(select_sql, list(keyvalues.values()))
row = txn.fetchone()
if not row:
@@ -888,7 +899,7 @@ class SQLBaseStore(object):
" AND ".join("%s = ?" % (k, ) for k in keyvalues)
)
txn.execute(sql, keyvalues.values())
txn.execute(sql, list(keyvalues.values()))
if txn.rowcount == 0:
raise StoreError(404, "No row found")
if txn.rowcount > 1:
@@ -906,7 +917,7 @@ class SQLBaseStore(object):
" AND ".join("%s = ?" % (k, ) for k in keyvalues)
)
return txn.execute(sql, keyvalues.values())
return txn.execute(sql, list(keyvalues.values()))
def _simple_delete_many(self, table, column, iterable, keyvalues, desc):
return self.runInteraction(
@@ -938,7 +949,7 @@ class SQLBaseStore(object):
)
values.extend(iterable)
for key, value in keyvalues.iteritems():
for key, value in iteritems(keyvalues):
clauses.append("%s = ?" % (key,))
values.append(value)
@@ -978,7 +989,7 @@ class SQLBaseStore(object):
txn.close()
if cache:
min_val = min(cache.itervalues())
min_val = min(itervalues(cache))
else:
min_val = max_value
@@ -1093,7 +1104,7 @@ class SQLBaseStore(object):
" AND ".join("%s = ?" % (k,) for k in keyvalues),
" ? ASC LIMIT ? OFFSET ?"
)
txn.execute(sql, keyvalues.values() + pagevalues)
txn.execute(sql, list(keyvalues.values()) + list(pagevalues))
else:
sql = "SELECT %s FROM %s ORDER BY %s" % (
", ".join(retcols),


@@ -0,0 +1,485 @@
# -*- coding: utf-8 -*-
# Copyright 2018 New Vector Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import math
import logging
from collections import deque
from fractions import Fraction
from synapse.storage._base import SQLBaseStore
from synapse.storage.engines import PostgresEngine
from synapse.util.katriel_bodlaender import OrderedListStore
import synapse.metrics
metrics = synapse.metrics.get_metrics_for(__name__)
rebalance_counter = metrics.register_counter("rebalances")
logger = logging.getLogger(__name__)
class ChunkDBOrderedListStore(OrderedListStore):
"""Used as the list store for room chunks, efficiently maintaining them in
topological order on updates.
A room chunk is a connected portion of the room events DAG. Chunks are
constructed so that they have the additional property that for all events in
the chunk, either all of their prev_events are in that chunk or none of them
are. This ensures that no event that is subsequently received needs to be
inserted into the middle of a chunk, since it cannot both reference an event
in the chunk and be referenced by an event in the chunk (assuming no
cycles).
As such the set of chunks in a room inherits a DAG: if an event in one
chunk references an event in a second chunk, then we say that the first
chunk references the second, and so the chunks themselves form a DAG.
(This means that chunks start off disconnected until an event is
received that connects the two chunks.)
We can therefore end up with multiple chunks in a room when the server
misses some events, e.g. due to the server being offline for a time.
The server may only have a subset of all events in a room, in which case
it's possible for the server to have chunks that are unconnected from
each other. The ordering between unconnected chunks is arbitrary.
The class is designed for use inside transactions and so takes a
transaction object in the constructor. This means that it needs to be
re-instantiated in each transaction, so all state needs to be stored
in the database.
Internally the ordering is implemented using fractions: when a node is
inserted between two others, a new fraction strictly between their
orderings is found using the Stern-Brocot tree. To stop the denominators
from growing too large, we rebalance whenever a newly assigned ordering's
denominator exceeds `max_denominator`. See the `_rebalance` function for
implementation details.
Note that OrderedListStore orders nodes such that the source of an edge
comes before the target. This is counter-intuitive when edges represent
causality, so for the purposes of the ordering algorithm we invert the
edge directions, i.e. if chunk A has a prev chunk of B then we say that
the edge is from B to A. This ensures that newer chunks get inserted at
the end (rather than the start).
Note: Calls to `add_node` and `add_edge` cannot overlap for the same room,
and so callers should perform some form of per-room locking when using
this class.
Args:
txn
room_id (str)
clock
database_engine
rebalance_max_denominator (int): When a rebalance is triggered, the
new orderings are drawn from the Farey sequence with this maximum
denominator.
max_denominator (int): A rebalance is triggered when a newly
assigned ordering's denominator exceeds this value.
"""
def __init__(self,
txn, room_id, clock, database_engine,
rebalance_max_denominator=100,
max_denominator=100000):
self.txn = txn
self.room_id = room_id
self.clock = clock
self.database_engine = database_engine
self.rebalance_md = rebalance_max_denominator
self.max_denominator = max_denominator
def is_before(self, a, b):
"""Implements OrderedListStore"""
return self._get_order(a) < self._get_order(b)
def get_prev(self, node_id):
"""Implements OrderedListStore"""
sql = """
SELECT chunk_id FROM chunk_linearized
WHERE next_chunk_id = ?
"""
self.txn.execute(sql, (node_id,))
row = self.txn.fetchone()
if row:
return row[0]
return None
def get_next(self, node_id):
"""Implements OrderedListStore"""
sql = """
SELECT next_chunk_id FROM chunk_linearized
WHERE chunk_id = ?
"""
self.txn.execute(sql, (node_id,))
row = self.txn.fetchone()
if row:
return row[0]
return None
def _insert_before(self, node_id, target_id):
"""Implements OrderedListStore"""
rebalance = False # Set to true if we need to trigger a rebalance
if target_id:
before_id = self.get_prev(target_id)
if before_id:
new_order = self._insert_between(node_id, before_id, target_id)
else:
new_order = self._insert_at_start(node_id, target_id)
else:
# If target_id is None then we insert at the end.
self.txn.execute("""
SELECT chunk_id
FROM chunk_linearized
WHERE room_id = ? AND next_chunk_id is NULL
""", (self.room_id,))
row = self.txn.fetchone()
if row:
new_order = self._insert_at_end(node_id, row[0])
else:
new_order = self._insert_first(node_id)
rebalance = new_order.denominator > self.max_denominator
if rebalance:
self._rebalance(node_id)
def _insert_after(self, node_id, target_id):
"""Implements OrderedListStore"""
rebalance = False # Set to true if we need to trigger a rebalance
next_chunk_id = None
if target_id:
next_chunk_id = self.get_next(target_id)
if next_chunk_id:
new_order = self._insert_between(node_id, target_id, next_chunk_id)
else:
new_order = self._insert_at_end(node_id, target_id)
else:
# If target_id is None then we insert at the start.
self.txn.execute("""
SELECT chunk_id
FROM chunk_linearized
NATURAL JOIN chunk_linearized_first
WHERE room_id = ?
""", (self.room_id,))
row = self.txn.fetchone()
if row:
new_order = self._insert_at_start(node_id, row[0])
else:
new_order = self._insert_first(node_id)
rebalance = new_order.denominator > self.max_denominator
if rebalance:
self._rebalance(node_id)
def _insert_between(self, node_id, left_id, right_id):
left_order = self._get_order(left_id)
right_order = self._get_order(right_id)
assert left_order < right_order
new_order = stern_brocot_single(left_order, right_order)
SQLBaseStore._simple_update_one_txn(
self.txn,
table="chunk_linearized",
keyvalues={"chunk_id": left_id},
updatevalues={"next_chunk_id": node_id},
)
SQLBaseStore._simple_insert_txn(
self.txn,
table="chunk_linearized",
values={
"chunk_id": node_id,
"room_id": self.room_id,
"next_chunk_id": right_id,
"numerator": int(new_order.numerator),
"denominator": int(new_order.denominator),
}
)
return new_order
def _insert_at_end(self, node_id, last_id):
last_order = self._get_order(last_id)
new_order = Fraction(int(math.ceil(last_order)) + 1, 1)
SQLBaseStore._simple_update_one_txn(
self.txn,
table="chunk_linearized",
keyvalues={"chunk_id": last_id},
updatevalues={"next_chunk_id": node_id},
)
SQLBaseStore._simple_insert_txn(
self.txn,
table="chunk_linearized",
values={
"chunk_id": node_id,
"room_id": self.room_id,
"next_chunk_id": None,
"numerator": int(new_order.numerator),
"denominator": int(new_order.denominator),
}
)
return new_order
def _insert_at_start(self, node_id, first_id):
first_order = self._get_order(first_id)
new_order = stern_brocot_single(0, first_order)
SQLBaseStore._simple_update_one_txn(
self.txn,
table="chunk_linearized_first",
keyvalues={"room_id": self.room_id},
updatevalues={"chunk_id": node_id},
)
SQLBaseStore._simple_insert_txn(
self.txn,
table="chunk_linearized",
values={
"chunk_id": node_id,
"room_id": self.room_id,
"next_chunk_id": first_id,
"numerator": int(new_order.numerator),
"denominator": int(new_order.denominator),
}
)
return new_order
def _insert_first(self, node_id):
SQLBaseStore._simple_insert_txn(
self.txn,
table="chunk_linearized_first",
values={
"room_id": self.room_id,
"chunk_id": node_id,
},
)
SQLBaseStore._simple_insert_txn(
self.txn,
table="chunk_linearized",
values={
"chunk_id": node_id,
"room_id": self.room_id,
"next_chunk_id": None,
"numerator": 1,
"denominator": 1,
}
)
return Fraction(1, 1)
def get_nodes_with_edges_to(self, node_id):
"""Implements OrderedListStore"""
# Note that we use the inverse relation here
sql = """
SELECT l.chunk_id, l.numerator, l.denominator FROM chunk_graph AS g
INNER JOIN chunk_linearized AS l ON g.prev_id = l.chunk_id
WHERE g.chunk_id = ?
"""
self.txn.execute(sql, (node_id,))
return [(Fraction(n, d), c) for c, n, d in self.txn]
def get_nodes_with_edges_from(self, node_id):
"""Implements OrderedListStore"""
# Note that we use the inverse relation here
sql = """
SELECT l.chunk_id, l.numerator, l.denominator FROM chunk_graph AS g
INNER JOIN chunk_linearized AS l ON g.chunk_id = l.chunk_id
WHERE g.prev_id = ?
"""
self.txn.execute(sql, (node_id,))
return [(Fraction(n, d), c) for c, n, d in self.txn]
def _delete_ordering(self, node_id):
"""Implements OrderedListStore"""
next_chunk_id = SQLBaseStore._simple_select_one_onecol_txn(
self.txn,
table="chunk_linearized",
keyvalues={
"chunk_id": node_id,
},
retcol="next_chunk_id",
)
SQLBaseStore._simple_delete_txn(
self.txn,
table="chunk_linearized",
keyvalues={"chunk_id": node_id},
)
sql = """
UPDATE chunk_linearized SET next_chunk_id = ?
WHERE next_chunk_id = ?
"""
self.txn.execute(sql, (next_chunk_id, node_id,))
sql = """
UPDATE chunk_linearized_first SET chunk_id = ?
WHERE chunk_id = ?
"""
self.txn.execute(sql, (next_chunk_id, node_id,))
def _add_edge_to_graph(self, source_id, target_id):
"""Implements OrderedListStore"""
# Note that we use the inverse relation
SQLBaseStore._simple_insert_txn(
self.txn,
table="chunk_graph",
values={"chunk_id": target_id, "prev_id": source_id}
)
def _get_order(self, node_id):
"""Get the ordering of the given node.
"""
row = SQLBaseStore._simple_select_one_txn(
self.txn,
table="chunk_linearized",
keyvalues={"chunk_id": node_id},
retcols=("numerator", "denominator",),
)
return Fraction(row["numerator"], row["denominator"])
def _rebalance(self, node_id):
"""Rebalances the list around the given node to ensure that the
ordering floats don't get too small.
This works by finding a range that includes the given node, and
recalculating the ordering floats such that they're equidistant in
that range.
"""
logger.info("Rebalancing room %s, chunk %s", self.room_id, node_id)
old_order = self._get_order(node_id)
a, b, c, d = find_farey_terms(old_order, self.rebalance_md)
assert old_order < Fraction(a, b)
assert c + d > self.rebalance_md
with_sql = """
WITH RECURSIVE chunks (chunk_id, next, n, a, b, c, d) AS (
SELECT chunk_id, next_chunk_id, ?, ?, ?, ?, ?
FROM chunk_linearized WHERE chunk_id = ?
UNION ALL
SELECT n.chunk_id, n.next_chunk_id, n,
c, d, ((n + b) / d) * c - a, ((n + b) / d) * d - b
FROM chunks AS c
INNER JOIN chunk_linearized AS l ON l.chunk_id = c.chunk_id
INNER JOIN chunk_linearized AS n ON n.chunk_id = l.next_chunk_id
WHERE c * 1.0 / d > n.numerator * 1.0 / n.denominator
)
"""
if isinstance(self.database_engine, PostgresEngine):
sql = with_sql + """
UPDATE chunk_linearized AS l
SET numerator = a, denominator = b
FROM chunks AS c
WHERE c.chunk_id = l.chunk_id
"""
else:
sql = with_sql + """
UPDATE chunk_linearized
SET (numerator, denominator) = (
SELECT a, b FROM chunks
WHERE chunks.chunk_id = chunk_linearized.chunk_id
)
WHERE chunk_id in (SELECT chunk_id FROM chunks)
"""
self.txn.execute(sql, (
self.rebalance_md, a, b, c, d, node_id
))
rebalance_counter.inc()
def stern_brocot_single(min_frac, max_frac):
assert 0 <= min_frac < max_frac
# If the determinant is 1 then the fraction with smallest numerator and
# denominator in the range is the mediant, so we don't have to use the
# Stern-Brocot tree to search for it.
determinant = (
min_frac.denominator * max_frac.numerator
- min_frac.numerator * max_frac.denominator
)
if determinant == 1:
return Fraction(
min_frac.numerator + max_frac.numerator,
min_frac.denominator + max_frac.denominator,
)
a, b, c, d = 0, 1, 1, 0
while True:
f = Fraction(a + c, b + d)
if f <= min_frac:
a, b, c, d = a + c, b + d, c, d
elif min_frac < f < max_frac:
return f
else:
a, b, c, d = a, b, a + c, b + d
def find_farey_terms(min_frac, max_denom):
a, b, c, d = 0, 1, 1, 0
while True:
cur_frac = Fraction(a + c, b + d)
if b + d > max_denom:
break
if cur_frac <= min_frac:
a, b, c, d = a + c, b + d, c, d
elif min_frac < cur_frac:
a, b, c, d = a, b, a + c, b + d
if Fraction(a, b) <= min_frac:
k = int((max_denom + b) / d)
a, b, c, d = c, d, (k*c-a), (k*d-b)
return a, b, c, d
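
To make the fraction-based ordering concrete: `stern_brocot_single`
returns the fraction with the smallest denominator strictly between its
bounds, and when the bounds are Farey neighbours (determinant 1) that
fraction is their mediant, which is why the function short-circuits. A
standalone check of the property (values illustrative):

    from fractions import Fraction

    def mediant(x, y):
        # for Farey neighbours, the unique fraction with the smallest
        # denominator strictly between x and y
        return Fraction(x.numerator + y.numerator, x.denominator + y.denominator)

    lo, hi = Fraction(1, 2), Fraction(2, 3)
    # determinant 2*2 - 1*3 == 1, so lo and hi are Farey neighbours
    assert lo.denominator * hi.numerator - lo.numerator * hi.denominator == 1

    m = mediant(lo, hi)
    assert m == Fraction(3, 5) and lo < m < hi
    # repeated insertions at the same spot grow the denominator, which is
    # what eventually triggers _rebalance above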


@@ -22,6 +22,8 @@ from . import background_updates
from synapse.util.caches import CACHE_SIZE_FACTOR
from six import iteritems
logger = logging.getLogger(__name__)
@@ -55,6 +57,13 @@ class ClientIpStore(background_updates.BackgroundUpdateStore):
columns=["user_id", "last_seen"],
)
self.register_background_index_update(
"user_ips_last_seen_only_index",
index_name="user_ips_last_seen_only",
table="user_ips",
columns=["last_seen"],
)
# (user_id, access_token, ip) -> (user_agent, device_id, last_seen)
self._batch_row_update = {}
@@ -92,7 +101,7 @@ class ClientIpStore(background_updates.BackgroundUpdateStore):
def _update_client_ips_batch_txn(self, txn, to_update):
self.database_engine.lock_table(txn, "user_ips")
for entry in to_update.iteritems():
for entry in iteritems(to_update):
(user_id, access_token, ip), (user_agent, device_id, last_seen) = entry
self._simple_upsert_txn(
@@ -224,5 +233,5 @@ class ClientIpStore(background_updates.BackgroundUpdateStore):
"user_agent": user_agent,
"last_seen": last_seen,
}
for (access_token, ip), (user_agent, last_seen) in results.iteritems()
for (access_token, ip), (user_agent, last_seen) in iteritems(results)
))


@@ -21,6 +21,7 @@ from synapse.api.errors import StoreError
from ._base import SQLBaseStore, Cache
from synapse.util.caches.descriptors import cached, cachedList, cachedInlineCallbacks
from six import itervalues, iteritems
logger = logging.getLogger(__name__)
@@ -360,7 +361,7 @@ class DeviceStore(SQLBaseStore):
return (now_stream_id, [])
if len(query_map) >= 20:
now_stream_id = max(stream_id for stream_id in query_map.itervalues())
now_stream_id = max(stream_id for stream_id in itervalues(query_map))
devices = self._get_e2e_device_keys_txn(
txn, query_map.keys(), include_all_devices=True
@@ -373,13 +374,13 @@ class DeviceStore(SQLBaseStore):
"""
results = []
for user_id, user_devices in devices.iteritems():
for user_id, user_devices in iteritems(devices):
# The prev_id for the first row is always the last row before
# `from_stream_id`
txn.execute(prev_sent_id_sql, (destination, user_id, from_stream_id))
rows = txn.fetchall()
prev_id = rows[0][0]
for device_id, device in user_devices.iteritems():
for device_id, device in iteritems(user_devices):
stream_id = query_map[(user_id, device_id)]
result = {
"user_id": user_id,
@@ -483,7 +484,7 @@ class DeviceStore(SQLBaseStore):
if devices:
user_devices = devices[user_id]
results = []
for device_id, device in user_devices.iteritems():
for device_id, device in iteritems(user_devices):
result = {
"device_id": device_id,
}


@@ -21,6 +21,8 @@ import simplejson as json
from ._base import SQLBaseStore
from six import iteritems
class EndToEndKeyStore(SQLBaseStore):
def set_e2e_device_keys(self, user_id, device_id, time_now, device_keys):
@@ -81,8 +83,8 @@ class EndToEndKeyStore(SQLBaseStore):
query_list, include_all_devices,
)
for user_id, device_keys in results.iteritems():
for device_id, device_info in device_keys.iteritems():
for user_id, device_keys in iteritems(results):
for device_id, device_info in iteritems(device_keys):
device_info["keys"] = json.loads(device_info.pop("key_json"))
defer.returnValue(results)


@@ -18,12 +18,12 @@ from synapse.storage._base import SQLBaseStore, LoggingTransaction
from twisted.internet import defer
from synapse.util.async import sleep
from synapse.util.caches.descriptors import cachedInlineCallbacks
from synapse.types import RoomStreamToken
from .stream import lower_bound
import logging
import simplejson as json
from six import iteritems
logger = logging.getLogger(__name__)
@@ -99,7 +99,7 @@ class EventPushActionsWorkerStore(SQLBaseStore):
def _get_unread_counts_by_receipt_txn(self, txn, room_id, user_id,
last_read_event_id):
sql = (
"SELECT stream_ordering, topological_ordering"
"SELECT stream_ordering"
" FROM events"
" WHERE room_id = ? AND event_id = ?"
)
@@ -111,17 +111,12 @@ class EventPushActionsWorkerStore(SQLBaseStore):
return {"notify_count": 0, "highlight_count": 0}
stream_ordering = results[0][0]
topological_ordering = results[0][1]
return self._get_unread_counts_by_pos_txn(
txn, room_id, user_id, topological_ordering, stream_ordering
txn, room_id, user_id, stream_ordering
)
def _get_unread_counts_by_pos_txn(self, txn, room_id, user_id, topological_ordering,
stream_ordering):
token = RoomStreamToken(
topological_ordering, stream_ordering
)
def _get_unread_counts_by_pos_txn(self, txn, room_id, user_id, stream_ordering):
# First get number of notifications.
# We don't need to put a notif=1 clause as all rows always have
@@ -132,10 +127,10 @@ class EventPushActionsWorkerStore(SQLBaseStore):
" WHERE"
" user_id = ?"
" AND room_id = ?"
" AND %s"
) % (lower_bound(token, self.database_engine, inclusive=False),)
" AND stream_ordering > ?"
)
txn.execute(sql, (user_id, room_id))
txn.execute(sql, (user_id, room_id, stream_ordering))
row = txn.fetchone()
notify_count = row[0] if row else 0
@@ -155,10 +150,10 @@ class EventPushActionsWorkerStore(SQLBaseStore):
" highlight = 1"
" AND user_id = ?"
" AND room_id = ?"
" AND %s"
) % (lower_bound(token, self.database_engine, inclusive=False),)
" AND stream_ordering > ?"
)
txn.execute(sql, (user_id, room_id))
txn.execute(sql, (user_id, room_id, stream_ordering))
row = txn.fetchone()
highlight_count = row[0] if row else 0
@@ -209,7 +204,6 @@ class EventPushActionsWorkerStore(SQLBaseStore):
" ep.highlight "
" FROM ("
" SELECT room_id,"
" MAX(topological_ordering) as topological_ordering,"
" MAX(stream_ordering) as stream_ordering"
" FROM events"
" INNER JOIN receipts_linearized USING (room_id, event_id)"
@@ -219,13 +213,7 @@ class EventPushActionsWorkerStore(SQLBaseStore):
" event_push_actions AS ep"
" WHERE"
" ep.room_id = rl.room_id"
" AND ("
" ep.topological_ordering > rl.topological_ordering"
" OR ("
" ep.topological_ordering = rl.topological_ordering"
" AND ep.stream_ordering > rl.stream_ordering"
" )"
" )"
" AND ep.stream_ordering > rl.stream_ordering"
" AND ep.user_id = ?"
" AND ep.stream_ordering > ?"
" AND ep.stream_ordering <= ?"
@@ -318,7 +306,6 @@ class EventPushActionsWorkerStore(SQLBaseStore):
" ep.highlight, e.received_ts"
" FROM ("
" SELECT room_id,"
" MAX(topological_ordering) as topological_ordering,"
" MAX(stream_ordering) as stream_ordering"
" FROM events"
" INNER JOIN receipts_linearized USING (room_id, event_id)"
@@ -329,13 +316,7 @@ class EventPushActionsWorkerStore(SQLBaseStore):
" INNER JOIN events AS e USING (room_id, event_id)"
" WHERE"
" ep.room_id = rl.room_id"
" AND ("
" ep.topological_ordering > rl.topological_ordering"
" OR ("
" ep.topological_ordering = rl.topological_ordering"
" AND ep.stream_ordering > rl.stream_ordering"
" )"
" )"
" AND ep.stream_ordering > rl.stream_ordering"
" AND ep.user_id = ?"
" AND ep.stream_ordering > ?"
" AND ep.stream_ordering <= ?"
@@ -441,7 +422,7 @@ class EventPushActionsWorkerStore(SQLBaseStore):
txn.executemany(sql, (
_gen_entry(user_id, actions)
for user_id, actions in user_id_actions.iteritems()
for user_id, actions in iteritems(user_id_actions)
))
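
The removed `lower_bound` clauses expanded to a lexicographic comparison
on (topological_ordering, stream_ordering) pairs; with topological
orderings now being chunk-local, that comparison is no longer globally
meaningful, so these queries compare stream orderings alone. The change,
as plain Python (field names illustrative):

    # old predicate: push action 'ep' came after receipt 'rl'
    def after_receipt_old(ep, rl):
        return (ep["topological_ordering"], ep["stream_ordering"]) > \
               (rl["topological_ordering"], rl["stream_ordering"])

    # new predicate: stream ordering alone
    def after_receipt_new(ep, rl):
        return ep["stream_ordering"] > rl["stream_ordering"]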
return self.runInteraction(
@@ -762,10 +743,10 @@ class EventPushActionsStore(EventPushActionsWorkerStore):
)
def _remove_old_push_actions_before_txn(self, txn, room_id, user_id,
topological_ordering, stream_ordering):
stream_ordering):
"""
Purges old push actions for a user and room before a given
topological_ordering.
stream_ordering.
We however keep a month's worth of highlighted notifications, so that
users can still get a list of recent highlights.
@@ -774,7 +755,7 @@ class EventPushActionsStore(EventPushActionsWorkerStore):
txn: The transaction
room_id: Room ID to delete from
user_id: user ID to delete for
topological_ordering: The lowest topological ordering which will
stream_ordering: The lowest stream ordering which will
not be deleted.
"""
txn.call_after(
@@ -793,9 +774,9 @@ class EventPushActionsStore(EventPushActionsWorkerStore):
txn.execute(
"DELETE FROM event_push_actions "
" WHERE user_id = ? AND room_id = ? AND "
" topological_ordering <= ?"
" stream_ordering <= ?"
" AND ((stream_ordering < ? AND highlight = 1) or highlight = 0)",
(user_id, room_id, topological_ordering, self.stream_ordering_month_ago)
(user_id, room_id, stream_ordering, self.stream_ordering_month_ago)
)
txn.execute("""


@@ -23,6 +23,7 @@ import simplejson as json
from twisted.internet import defer
from synapse.storage.events_worker import EventsWorkerStore
from synapse.storage.chunk_ordered_table import ChunkDBOrderedListStore
from synapse.util.async import ObservableDeferred
from synapse.util.frozenutils import frozendict_json_encoder
from synapse.util.logcontext import (
@@ -33,7 +34,7 @@ from synapse.util.metrics import Measure
from synapse.api.constants import EventTypes
from synapse.api.errors import SynapseError
from synapse.util.caches.descriptors import cached, cachedInlineCallbacks
from synapse.types import get_domain_from_id
from synapse.types import get_domain_from_id, RoomStreamToken
import synapse.metrics
# these are only included to make the type annotations work
@@ -201,6 +202,7 @@ def _retry_on_integrity_error(func):
class EventsStore(EventsWorkerStore):
EVENT_ORIGIN_SERVER_TS_NAME = "event_origin_server_ts"
EVENT_FIELDS_SENDER_URL_UPDATE_NAME = "event_fields_sender_url"
EVENT_FIELDS_CHUNK = "event_fields_chunk_id"
def __init__(self, db_conn, hs):
super(EventsStore, self).__init__(db_conn, hs)
@@ -232,6 +234,20 @@ class EventsStore(EventsWorkerStore):
psql_only=True,
)
self.register_background_index_update(
"events_chunk_index",
index_name="events_chunk_index",
table="events",
columns=["room_id", "chunk_id", "topological_ordering", "stream_ordering"],
unique=True,
psql_only=True,
)
self.register_background_update_handler(
self.EVENT_FIELDS_CHUNK,
self._background_compute_chunks,
)
self._event_persist_queue = _EventPeristenceQueue()
self._state_resolution_handler = hs.get_state_resolution_handler()
@@ -1010,13 +1026,20 @@ class EventsStore(EventsWorkerStore):
}
)
sql = (
"UPDATE events SET outlier = ?"
" WHERE event_id = ?"
chunk_id, topo = self._compute_chunk_id_txn(
txn, event.room_id, event.event_id,
[eid for eid, _ in event.prev_events],
)
txn.execute(
sql,
(False, event.event_id,)
self._simple_update_txn(
txn,
table="events",
keyvalues={"event_id": event.event_id},
updatevalues={
"outlier": False,
"chunk_id": chunk_id,
"topological_ordering": topo,
},
)
# Update the event_backward_extremities table now that this
@@ -1099,13 +1122,22 @@ class EventsStore(EventsWorkerStore):
],
)
if event.internal_metadata.is_outlier():
chunk_id, topo = None, 0
else:
chunk_id, topo = self._compute_chunk_id_txn(
txn, event.room_id, event.event_id,
[eid for eid, _ in event.prev_events],
)
self._simple_insert_many_txn(
txn,
table="events",
values=[
{
"stream_ordering": event.internal_metadata.stream_ordering,
"topological_ordering": event.depth,
"chunk_id": chunk_id,
"topological_ordering": topo,
"depth": event.depth,
"event_id": event.event_id,
"room_id": event.room_id,
@@ -1335,6 +1367,214 @@ class EventsStore(EventsWorkerStore):
(event.event_id, event.redacts)
)
def _compute_chunk_id_txn(self, txn, room_id, event_id, prev_event_ids):
"""Computes the chunk ID and topological ordering for an event.
Also handles updating chunk_graph table.
Args:
txn,
room_id (str)
event_id (str)
prev_event_ids (list[str])
Returns:
tuple[int, int]: Returns the chunk_id, topological_ordering for
the event
"""
# We calculate the chunk for an event using the following rules:
#
# 1. If all prev events have the same chunk ID then use that chunk ID
# 2. If we have none of the prev events but do have events pointing to
#    the event being inserted, then we use their chunk ID if:
#    - They're all in the same chunk, and
#    - All their prev events match the event being inserted
# 3. Otherwise, create a new chunk and use that
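#
# For example: if B's only prev event A is already in chunk 1, rule 1
# places B in chunk 1 too. If we then receive a previously-missing
# event P that A points to, and P is A's only prev event, rule 2
# prepends P to chunk 1; failing both, rule 3 allocates P a new chunk.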
# Set of chunks that the event refers to. Includes None if there were
# prev events that we don't have (or don't have a chunk for)
prev_chunk_ids = set()
for eid in prev_event_ids:
chunk_id = self._simple_select_one_onecol_txn(
txn,
table="events",
keyvalues={"event_id": eid},
retcol="chunk_id",
allow_none=True,
)
prev_chunk_ids.add(chunk_id)
forward_events = self._simple_select_onecol_txn(
txn,
table="event_edges",
keyvalues={
"prev_event_id": event_id,
"is_state": False,
},
retcol="event_id",
)
# Set of chunks that refer to this event.
forward_chunk_ids = set()
# Set of event_ids of all prev_events of those in `forward_events`. This
# is guaranteed to contain at least the given event_id.
sibling_events = set()
for eid in set(forward_events):
chunk_id = self._simple_select_one_onecol_txn(
txn,
table="events",
keyvalues={"event_id": eid},
retcol="chunk_id",
allow_none=True,
)
if chunk_id is not None:
# chunk_id can be None if it's an outlier
forward_chunk_ids.add(chunk_id)
pes = self._simple_select_onecol_txn(
txn,
table="event_edges",
keyvalues={
"event_id": eid,
"is_state": False,
},
retcol="prev_event_id",
)
sibling_events.update(pes)
table = ChunkDBOrderedListStore(
txn, room_id, self.clock, self.database_engine,
)
# If there is only one previous chunk (and that isn't None), then this
# satisfies condition one.
if len(prev_chunk_ids) == 1 and None not in prev_chunk_ids:
chunk_id = list(prev_chunk_ids)[0]
# This event is being inserted at the end of the chunk
new_topo = self._simple_select_one_onecol_txn(
txn,
table="events",
keyvalues={
"room_id": room_id,
"chunk_id": chunk_id,
},
retcol="COALESCE(MAX(topological_ordering), 0)",
)
new_topo += 1
# We need to now update the database with any new edges between chunks
current_prev_ids = set()
current_forward_ids = self._simple_select_onecol_txn(
txn,
table="chunk_graph",
keyvalues={
"prev_id": chunk_id,
},
retcol="chunk_id",
)
# If there is only one forward chunk and only one sibling event (which
# would be the given event), then this satisfies condition two.
elif len(forward_chunk_ids) == 1 and len(sibling_events) == 1:
chunk_id = list(forward_chunk_ids)[0]
# This event is being inserted at the start of the chunk
new_topo = self._simple_select_one_onecol_txn(
txn,
table="events",
keyvalues={
"room_id": room_id,
"chunk_id": chunk_id,
},
retcol="COALESCE(MIN(topological_ordering), 0)",
)
new_topo -= 1
# We need to now update the database with any new edges between chunks
current_prev_ids = self._simple_select_onecol_txn(
txn,
table="chunk_graph",
keyvalues={
"chunk_id": chunk_id,
},
retcol="prev_id",
)
current_forward_ids = set()
else:
chunk_id = self._chunk_id_gen.get_next()
new_topo = 0
# We've generated a new chunk, so we have to tell the
# ChunkDBOrderedListStore about that.
table.add_node(chunk_id)
# We need to now update the database with any new edges between chunks
current_prev_ids = self._simple_select_onecol_txn(
txn,
table="chunk_graph",
keyvalues={
"chunk_id": chunk_id,
},
retcol="prev_id",
)
current_forward_ids = self._simple_select_onecol_txn(
txn,
table="chunk_graph",
keyvalues={
"prev_id": chunk_id,
},
retcol="chunk_id",
)
prev_chunk_ids = set(
pid for pid in prev_chunk_ids
if pid is not None and pid not in current_prev_ids and pid != chunk_id
)
forward_chunk_ids = set(
fid for fid in forward_chunk_ids
if fid not in current_forward_ids and fid != chunk_id
)
if prev_chunk_ids:
for pid in prev_chunk_ids:
# Note that the edge direction is reversed compared to what you might
# expect. See ChunkDBOrderedListStore for more details.
table.add_edge(pid, chunk_id)
if forward_chunk_ids:
for fid in forward_chunk_ids:
# Note that the edge direction is reversed compared to what you might
# expect. See ChunkDBOrderedListStore for more details.
table.add_edge(chunk_id, fid)
# We now need to update the backwards extremities for the chunks.
txn.executemany("""
INSERT INTO chunk_backwards_extremities (chunk_id, event_id)
SELECT ?, ? WHERE NOT EXISTS (
SELECT event_id FROM events WHERE event_id = ?
AND NOT outlier
)
""", [(chunk_id, eid, eid) for eid in prev_event_ids])
self._simple_delete_txn(
txn,
table="chunk_backwards_extremities",
keyvalues={"event_id": event_id},
)
return chunk_id, new_topo
@defer.inlineCallbacks
def have_events_in_timeline(self, event_ids):
"""Given a list of event ids, check if we have already processed and
@@ -1628,6 +1868,66 @@ class EventsStore(EventsWorkerStore):
defer.returnValue(result)
@defer.inlineCallbacks
def _background_compute_chunks(self, progress, batch_size):
up_to_stream_id = progress.get("up_to_stream_id")
if up_to_stream_id is None:
up_to_stream_id = self.get_current_events_token() + 1
rows_inserted = progress.get("rows_inserted", 0)
def reindex_chunks_txn(txn):
txn.execute("""
SELECT stream_ordering, room_id, event_id,
(
SELECT COALESCE(array_agg(prev_event_id), ARRAY[]::TEXT[])
FROM event_edges AS eg
WHERE NOT is_state AND eg.event_id = e.event_id
) AS prev_events
FROM events AS e
WHERE stream_ordering < ? AND outlier = ? AND chunk_id IS NULL
ORDER BY stream_ordering DESC
LIMIT ?
""", (up_to_stream_id, False, batch_size))
rows = txn.fetchall()
stream_ordering = up_to_stream_id
for stream_ordering, room_id, event_id, prev_events in rows:
chunk_id, topo = self._compute_chunk_id_txn(
txn, room_id, event_id, prev_events,
)
self._simple_update_txn(
txn,
table="events",
keyvalues={"event_id": event_id},
updatevalues={
"chunk_id": chunk_id,
"topological_ordering": topo,
},
)
progress = {
"up_to_stream_id": stream_ordering,
"rows_inserted": rows_inserted + len(rows)
}
self._background_update_progress_txn(
txn, self.EVENT_FIELDS_CHUNK, progress
)
return len(rows)
result = yield self.runInteraction(
self.EVENT_FIELDS_CHUNK, reindex_chunks_txn
)
if not result:
yield self._end_background_update(self.EVENT_FIELDS_CHUNK)
defer.returnValue(result)
def get_current_backfill_token(self):
"""The current minimum token that backfilled events have reached"""
return -self._backfill_id_gen.get_current_token()
@@ -1803,15 +2103,14 @@ class EventsStore(EventsWorkerStore):
return self.runInteraction("get_all_new_events", get_all_new_events_txn)
def purge_history(
self, room_id, topological_ordering, delete_local_events,
self, room_id, token, delete_local_events,
):
"""Deletes room history before a certain point
Args:
room_id (str):
topological_ordering (int):
minimum topo ordering to preserve
token (str): A topological token to delete events before
delete_local_events (bool):
if True, we will delete local events as well as remote ones
@@ -1821,13 +2120,15 @@ class EventsStore(EventsWorkerStore):
return self.runInteraction(
"purge_history",
self._purge_history_txn, room_id, topological_ordering,
self._purge_history_txn, room_id, token,
delete_local_events,
)
def _purge_history_txn(
self, txn, room_id, topological_ordering, delete_local_events,
self, txn, room_id, token_str, delete_local_events,
):
token = RoomStreamToken.parse(token_str)
# Tables that should be pruned:
# event_auth
# event_backward_extremities
@@ -1872,6 +2173,13 @@ class EventsStore(EventsWorkerStore):
" ON events_to_purge(should_delete)",
)
# We do joins against events_to_purge for e.g. calculating state
# groups to purge, etc., so let's make an index.
txn.execute(
"CREATE INDEX events_to_purge_id"
" ON events_to_purge(event_id)",
)
# First ensure that we're not about to delete all the forward extremities
txn.execute(
"SELECT e.event_id, e.depth FROM events as e "
@@ -1884,7 +2192,7 @@ class EventsStore(EventsWorkerStore):
rows = txn.fetchall()
max_depth = max(row[1] for row in rows)  # depth is the second column selected
if max_depth <= topological_ordering:
if max_depth <= token.topological:
# We need to ensure we don't delete all the events from the database
# otherwise we wouldn't be able to send any events (due to not
# having any backwards extremities)
@@ -1900,17 +2208,42 @@ class EventsStore(EventsWorkerStore):
should_delete_expr += " AND event_id NOT LIKE ?"
should_delete_params += ("%:" + self.hs.hostname, )
should_delete_params += (room_id, topological_ordering)
txn.execute(
"INSERT INTO events_to_purge"
" SELECT event_id, %s"
" FROM events AS e LEFT JOIN state_events USING (event_id)"
" WHERE e.room_id = ? AND topological_ordering < ?" % (
should_delete_expr,
),
should_delete_params,
next_token = RoomStreamToken(
token.chunk,
token.topological - 1,
token.stream,
)
while True:
rows, next_token, _ = self._paginate_room_events_txn(
txn, room_id, next_token, direction='b', limit=1000,
)
next_token = RoomStreamToken.parse(next_token)
if len(rows) == 0:
break
txn.executemany(
"""INSERT INTO events_to_purge
SELECT event_id, %s
FROM events
LEFT JOIN state_events USING (event_id)
WHERE event_id = ?
""" % (
should_delete_expr,
),
(
should_delete_params + (row.event_id,)
for row in rows
),
)
txn.execute("""
DELETE FROM events_to_purge
WHERE event_id IN (
SELECT event_id FROM event_forward_extremities
)
""")
txn.execute(
"SELECT event_id, should_delete FROM events_to_purge"
)
@@ -1923,13 +2256,13 @@ class EventsStore(EventsWorkerStore):
logger.info("[purge] Finding new backward extremities")
# We calculate the new entries for the backward extremities by finding
# all events that point to events that are to be purged
# events to be purged that are pointed to by events we're not going to
# purge.
txn.execute(
"SELECT DISTINCT e.event_id FROM events_to_purge AS e"
" INNER JOIN event_edges AS ed ON e.event_id = ed.prev_event_id"
" INNER JOIN events AS e2 ON e2.event_id = ed.event_id"
" WHERE e2.topological_ordering >= ?",
(topological_ordering, )
" LEFT JOIN events_to_purge AS ep2 ON ed.event_id = ep2.event_id"
" WHERE ep2.event_id IS NULL AND NOT ed.is_state",
)
new_backwards_extrems = txn.fetchall()
@@ -1949,20 +2282,47 @@ class EventsStore(EventsWorkerStore):
]
)
txn.execute(
"""DELETE FROM chunk_backwards_extremities
WHERE event_id IN (
SELECT event_id FROM events WHERE room_id = ?
)
""",
(room_id,)
)
txn.execute(
"""
INSERT INTO chunk_backwards_extremities
SELECT DISTINCT ee.chunk_id, e.event_id
FROM events_to_purge AS e
INNER JOIN event_edges AS ed ON e.event_id = ed.prev_event_id
INNER JOIN events AS ee ON ee.event_id = ed.event_id
LEFT JOIN events_to_purge AS ep2 ON ed.event_id = ep2.event_id
WHERE ep2.event_id IS NULL AND NOT ed.is_state
""",
)
logger.info("[purge] finding redundant state groups")
# Get all state groups that are only referenced by events that are
# to be deleted.
txn.execute(
"SELECT state_group FROM event_to_state_groups"
" INNER JOIN events USING (event_id)"
" WHERE state_group IN ("
" SELECT DISTINCT state_group FROM events_to_purge"
" INNER JOIN event_to_state_groups USING (event_id)"
" )"
" GROUP BY state_group HAVING MAX(topological_ordering) < ?",
(topological_ordering, )
)
# This works by first getting state groups that we may want to delete,
# joining against event_to_state_groups to get events that use that
# state group, then left joining against events_to_purge again. Any
# state group where the left join produces *no NULLs* is referenced
# only by events that are going to be purged.
txn.execute("""
SELECT state_group FROM
(
SELECT DISTINCT state_group FROM events_to_purge
INNER JOIN event_to_state_groups USING (event_id)
) AS sp
INNER JOIN event_to_state_groups USING (state_group)
LEFT JOIN events_to_purge AS ep USING (event_id)
GROUP BY state_group
HAVING SUM(CASE WHEN ep.event_id IS NULL THEN 1 ELSE 0 END) = 0
""")
state_rows = txn.fetchall()
logger.info("[purge] found %i redundant state groups", len(state_rows))
@@ -2095,7 +2455,7 @@ class EventsStore(EventsWorkerStore):
# Mark all state and own events as outliers
logger.info("[purge] marking remaining events as outliers")
txn.execute(
"UPDATE events SET outlier = ?"
"UPDATE events SET outlier = ?, chunk_id = NULL"
" WHERE event_id IN ("
" SELECT event_id FROM events_to_purge "
" WHERE NOT should_delete"
@@ -2109,10 +2469,25 @@ class EventsStore(EventsWorkerStore):
#
# So, let's stick it at the end so that we don't block event
# persistence.
logger.info("[purge] updating room_depth")
#
# We do this by calculating the minimum depth of the backwards
# extremities. However, the events in event_backward_extremities
# are ones we don't have yet, so we need to look at the events that
# point to them via the event_edges table.
txn.execute("""
SELECT COALESCE(MIN(depth), 0)
FROM event_backward_extremities AS eb
INNER JOIN event_edges AS eg ON eg.prev_event_id = eb.event_id
INNER JOIN events AS e ON e.event_id = eg.event_id
WHERE eb.room_id = ?
""", (room_id,))
min_depth, = txn.fetchone()
logger.info("[purge] updating room_depth to %d", min_depth)
txn.execute(
"UPDATE room_depth SET min_depth = ? WHERE room_id = ?",
(topological_ordering, room_id,)
(min_depth, room_id,)
)
# finally, drop the temp table. this will commit the txn in sqlite,

View File

@@ -337,7 +337,7 @@ class EventsWorkerStore(SQLBaseStore):
def _fetch_event_rows(self, txn, events):
rows = []
N = 200
for i in range(1 + len(events) / N):
for i in range(1 + len(events) // N):
evs = events[i * N:(i + 1) * N]
if not evs:
break

View File

@@ -44,7 +44,7 @@ class FilteringStore(SQLBaseStore):
desc="get_user_filter",
)
defer.returnValue(json.loads(str(def_json).decode("utf-8")))
defer.returnValue(json.loads(bytes(def_json).decode("utf-8")))
def add_user_filter(self, user_localpart, user_filter):
def_json = encode_canonical_json(user_filter)

View File

@@ -92,7 +92,7 @@ class KeyStore(SQLBaseStore):
if verify_key_bytes:
defer.returnValue(decode_verify_key_bytes(
key_id, str(verify_key_bytes)
key_id, bytes(verify_key_bytes)
))
@defer.inlineCallbacks

View File

@@ -26,7 +26,7 @@ logger = logging.getLogger(__name__)
# Remember to update this number every time a change is made to database
# schema files, so the users will be informed on server restarts.
SCHEMA_VERSION = 48
SCHEMA_VERSION = 49
dir_path = os.path.abspath(os.path.dirname(__file__))

View File

@@ -297,18 +297,22 @@ class ReceiptsWorkerStore(SQLBaseStore):
if receipt_type != "m.read":
return
# Returns an ObservableDeferred
# Returns either an ObservableDeferred or the raw result
res = self.get_users_with_read_receipts_in_room.cache.get(
room_id, None, update_metrics=False,
)
if res:
if isinstance(res, defer.Deferred) and res.called:
# first handle the Deferred case
if isinstance(res, defer.Deferred):
if res.called:
res = res.result
if user_id in res:
# We'd only be adding to the set, so no point invalidating if the
# user is already there
return
else:
res = None
if res and user_id in res:
# We'd only be adding to the set, so no point invalidating if the
# user is already there
return
self.get_users_with_read_receipts_in_room.invalidate((room_id,))
@@ -365,7 +369,7 @@ class ReceiptsStore(ReceiptsWorkerStore):
# We don't want to clobber receipts for more recent events, so we
# have to compare orderings of existing receipts
sql = (
"SELECT topological_ordering, stream_ordering, event_id FROM events"
"SELECT stream_ordering FROM events"
" INNER JOIN receipts_linearized as r USING (event_id, room_id)"
" WHERE r.room_id = ? AND r.receipt_type = ? AND r.user_id = ?"
)
@@ -373,10 +377,8 @@ class ReceiptsStore(ReceiptsWorkerStore):
txn.execute(sql, (room_id, receipt_type, user_id))
if topological_ordering:
for to, so, _ in txn:
if int(to) > topological_ordering:
return False
elif int(to) == topological_ordering and int(so) >= stream_ordering:
for so, in txn:
if int(so) >= stream_ordering:
return False
self._simple_delete_txn(
@@ -407,7 +409,6 @@ class ReceiptsStore(ReceiptsWorkerStore):
txn,
room_id=room_id,
user_id=user_id,
topological_ordering=topological_ordering,
stream_ordering=stream_ordering,
)

View File

@@ -33,7 +33,10 @@ class RegistrationWorkerStore(SQLBaseStore):
keyvalues={
"name": user_id,
},
retcols=["name", "password_hash", "is_guest"],
retcols=[
"name", "password_hash", "is_guest",
"consent_version", "consent_server_notice_sent",
],
allow_none=True,
desc="get_user_by_id",
)
@@ -286,6 +289,53 @@ class RegistrationStore(RegistrationWorkerStore,
"user_set_password_hash", user_set_password_hash_txn
)
def user_set_consent_version(self, user_id, consent_version):
"""Updates the user table to record privacy policy consent
Args:
user_id (str): full mxid of the user to update
consent_version (str): version of the policy the user has consented
to
Raises:
StoreError(404) if user not found
"""
def f(txn):
self._simple_update_one_txn(
txn,
table='users',
keyvalues={'name': user_id, },
updatevalues={'consent_version': consent_version, },
)
self._invalidate_cache_and_stream(
txn, self.get_user_by_id, (user_id,)
)
return self.runInteraction("user_set_consent_version", f)
def user_set_consent_server_notice_sent(self, user_id, consent_version):
"""Updates the user table to record that we have sent the user a server
notice about privacy policy consent
Args:
user_id (str): full mxid of the user to update
consent_version (str): version of the policy we have notified the
user about
Raises:
StoreError(404) if user not found
"""
def f(txn):
self._simple_update_one_txn(
txn,
table='users',
keyvalues={'name': user_id, },
updatevalues={'consent_server_notice_sent': consent_version, },
)
self._invalidate_cache_and_stream(
txn, self.get_user_by_id, (user_id,)
)
return self.runInteraction("user_set_consent_server_notice_sent", f)
def user_delete_access_tokens(self, user_id, except_token_id=None,
device_id=None):
"""

View File

@@ -30,6 +30,8 @@ from synapse.types import get_domain_from_id
import logging
import simplejson as json
from six import itervalues, iteritems
logger = logging.getLogger(__name__)
@@ -272,7 +274,7 @@ class RoomMemberWorkerStore(EventsWorkerStore):
users_in_room = {}
member_event_ids = [
e_id
for key, e_id in current_state_ids.iteritems()
for key, e_id in iteritems(current_state_ids)
if key[0] == EventTypes.Member
]
@@ -289,7 +291,7 @@ class RoomMemberWorkerStore(EventsWorkerStore):
users_in_room = dict(prev_res)
member_event_ids = [
e_id
for key, e_id in context.delta_ids.iteritems()
for key, e_id in iteritems(context.delta_ids)
if key[0] == EventTypes.Member
]
for etype, state_key in context.delta_ids:
@@ -741,7 +743,7 @@ class _JoinedHostsCache(object):
if state_entry.state_group == self.state_group:
pass
elif state_entry.prev_group == self.state_group:
for (typ, state_key), event_id in state_entry.delta_ids.iteritems():
for (typ, state_key), event_id in iteritems(state_entry.delta_ids):
if typ != EventTypes.Member:
continue
@@ -771,7 +773,7 @@ class _JoinedHostsCache(object):
self.state_group = state_entry.state_group
else:
self.state_group = object()
self._len = sum(len(v) for v in self.hosts_to_joined_users.itervalues())
self._len = sum(len(v) for v in itervalues(self.hosts_to_joined_users))
defer.returnValue(frozenset(self.hosts_to_joined_users))
def __len__(self):

View File

@@ -0,0 +1,18 @@
/* Copyright 2018 New Vector Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/* record the version of the privacy policy the user has consented to
*/
ALTER TABLE users ADD COLUMN consent_version TEXT;

View File

@@ -0,0 +1,20 @@
/* Copyright 2018 New Vector Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/* record whether we have sent a server notice about consenting to the
* privacy policy. Specifically records the version of the policy we sent
* a message about.
*/
ALTER TABLE users ADD COLUMN consent_server_notice_sent TEXT;

View File

@@ -0,0 +1,21 @@
/* Copyright 2018 New Vector Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
CREATE TABLE user_daily_visits ( user_id TEXT NOT NULL,
device_id TEXT,
timestamp BIGINT NOT NULL );
CREATE INDEX user_daily_visits_uts_idx ON user_daily_visits(user_id, timestamp);
CREATE INDEX user_daily_visits_ts_idx ON user_daily_visits(timestamp);

View File

@@ -0,0 +1,17 @@
/* Copyright 2018 New Vector Ltd
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
INSERT into background_updates (update_name, progress_json)
VALUES ('user_ips_last_seen_only_index', '{}');

View File

@@ -0,0 +1,149 @@
# -*- coding: utf-8 -*-
# Copyright 2018 New Vector Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from synapse.storage._base import SQLBaseStore, LoggingTransaction
from synapse.storage.prepare_database import get_statements
SQL = """
ALTER TABLE events ADD COLUMN chunk_id BIGINT;
-- FIXME: Add index on contains_url
INSERT INTO background_updates (update_name, progress_json) VALUES
('events_chunk_index', '{}');
-- Stores how chunks of graph relate to each other
CREATE TABLE chunk_graph (
chunk_id BIGINT NOT NULL,
prev_id BIGINT NOT NULL
);
CREATE UNIQUE INDEX chunk_graph_id ON chunk_graph (chunk_id, prev_id);
CREATE INDEX chunk_graph_prev_id ON chunk_graph (prev_id);
-- The extremities in each chunk. Note that these are pointing to events that
-- we don't have, rather than the boundaries between chunks.
CREATE TABLE chunk_backwards_extremities (
chunk_id BIGINT NOT NULL,
event_id TEXT NOT NULL
);
CREATE INDEX chunk_backwards_extremities_id ON chunk_backwards_extremities(
chunk_id, event_id
);
CREATE INDEX chunk_backwards_extremities_event_id ON chunk_backwards_extremities(
event_id
);
-- Maintains an absolute ordering of chunks. Gets updated when we see new
-- edges between chunks.
CREATE TABLE chunk_linearized (
chunk_id BIGINT NOT NULL,
room_id TEXT NOT NULL,
next_chunk_id BIGINT,
numerator BIGINT NOT NULL,
denominator BIGINT NOT NULL
);
CREATE UNIQUE INDEX chunk_linearized_id ON chunk_linearized (chunk_id);
CREATE UNIQUE INDEX chunk_linearized_next_id ON chunk_linearized (
next_chunk_id, room_id
);
CREATE TABLE chunk_linearized_first (
chunk_id BIGINT NOT NULL,
room_id TEXT NOT NULL
);
CREATE UNIQUE INDEX chunk_linearized_first_id ON chunk_linearized_first (room_id);
INSERT into background_updates (update_name, progress_json)
VALUES ('event_fields_chunk_id', '{}');
"""
def run_create(cur, database_engine, *args, **kwargs):
for statement in get_statements(SQL.splitlines()):
cur.execute(statement)
txn = LoggingTransaction(
cur, "schema_update", database_engine, [], [],
)
rows = SQLBaseStore._simple_select_list_txn(
txn,
table="event_forward_extremities",
keyvalues={},
retcols=("event_id", "room_id",),
)
next_chunk_id = 1
room_to_next_order = {}
prev_chunks_by_room = {}
for row in rows:
chunk_id = next_chunk_id
next_chunk_id += 1
room_id = row["room_id"]
event_id = row["event_id"]
SQLBaseStore._simple_update_txn(
txn,
table="events",
keyvalues={"room_id": room_id, "event_id": event_id},
updatevalues={"chunk_id": chunk_id},
)
ordering = room_to_next_order.get(room_id, 1)
room_to_next_order[room_id] = ordering + 1
prev_chunks = prev_chunks_by_room.setdefault(room_id, [])
SQLBaseStore._simple_insert_txn(
txn,
table="chunk_linearized",
values={
"chunk_id": chunk_id,
"room_id": row["room_id"],
"numerator": ordering,
"denominator": 1,
},
)
if prev_chunks:
SQLBaseStore._simple_update_one_txn(
txn,
table="chunk_linearized",
keyvalues={"chunk_id": prev_chunks[-1]},
updatevalues={"next_chunk_id": chunk_id},
)
else:
SQLBaseStore._simple_insert_txn(
txn,
table="chunk_linearized_first",
values={
"chunk_id": chunk_id,
"room_id": row["room_id"],
},
)
prev_chunks.append(chunk_id)
def run_upgrade(*args, **kwargs):
pass
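The numerator/denominator pair in chunk_linearized stores each chunk's
position as a rational. A standard way to slot a chunk between two
neighbours without renumbering the rest is the Stern-Brocot mediant; the
sketch below is illustrative only, and mediant is not a helper defined in
this branch:

from fractions import Fraction

def mediant(a, b):
    # The mediant (p+r)/(q+s) of p/q and r/s lies strictly between them
    # (for positive denominators), so its numerator/denominator can be
    # stored for a chunk inserted between two existing neighbours.
    return Fraction(a.numerator + b.numerator, a.denominator + b.denominator)

left, right = Fraction(1, 2), Fraction(2, 3)
assert left < mediant(left, right) < right  # 1/2 < 3/5 < 2/3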

View File

@@ -41,6 +41,7 @@ from synapse.storage.events import EventsWorkerStore
from synapse.types import RoomStreamToken
from synapse.util.caches.stream_change_cache import StreamChangeCache
from synapse.util.logcontext import make_deferred_yieldable, run_in_background
from synapse.storage.chunk_ordered_table import ChunkDBOrderedListStore
from synapse.storage.engines import PostgresEngine
import abc
@@ -62,24 +63,25 @@ _TOPOLOGICAL_TOKEN = "topological"
# Used as return values for pagination APIs
_EventDictReturn = namedtuple("_EventDictReturn", (
"event_id", "topological_ordering", "stream_ordering",
"event_id", "chunk_id", "topological_ordering", "stream_ordering",
))
def lower_bound(token, engine, inclusive=False):
inclusive = "=" if inclusive else ""
if token.topological is None:
if token.chunk is None:
return "(%d <%s %s)" % (token.stream, inclusive, "stream_ordering")
else:
if isinstance(engine, PostgresEngine):
# Postgres doesn't optimise ``(x < a) OR (x=a AND y<b)`` as well
# as it optimises ``(x,y) < (a,b)`` on multicolumn indexes. So we
# use the later form when running against postgres.
return "((%d,%d) <%s (%s,%s))" % (
token.topological, token.stream, inclusive,
return "(chunk_id = %d AND (%d,%d) <%s (%s,%s))" % (
token.chunk, token.topological, token.stream, inclusive,
"topological_ordering", "stream_ordering",
)
return "(%d < %s OR (%d = %s AND %d <%s %s))" % (
return "(chunk_id = %d AND (%d < %s OR (%d = %s AND %d <%s %s)))" % (
token.chunk,
token.topological, "topological_ordering",
token.topological, "topological_ordering",
token.stream, inclusive, "stream_ordering",
@@ -88,18 +90,19 @@ def lower_bound(token, engine, inclusive=False):
def upper_bound(token, engine, inclusive=True):
inclusive = "=" if inclusive else ""
if token.topological is None:
if token.chunk is None:
return "(%d >%s %s)" % (token.stream, inclusive, "stream_ordering")
else:
if isinstance(engine, PostgresEngine):
# Postgres doesn't optimise ``(x > a) OR (x=a AND y>b)`` as well
# as it optimises ``(x,y) > (a,b)`` on multicolumn indexes. So we
# use the later form when running against postgres.
return "((%d,%d) >%s (%s,%s))" % (
token.topological, token.stream, inclusive,
return "(chunk_id = %d AND (%d,%d) >%s (%s,%s))" % (
token.chunk, token.topological, token.stream, inclusive,
"topological_ordering", "stream_ordering",
)
return "(%d > %s OR (%d = %s AND %d >%s %s))" % (
return "(chunk_id = %d AND (%d > %s OR (%d = %s AND %d >%s %s)))" % (
token.chunk,
token.topological, "topological_ordering",
token.topological, "topological_ordering",
token.stream, inclusive, "stream_ordering",
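To make the new bounds concrete: for a hypothetical token c3~4~99, the
non-Postgres branch of upper_bound with inclusive=True renders to:

(chunk_id = 3 AND (4 > topological_ordering OR (4 = topological_ordering AND 99 >= stream_ordering)))

so a single comparison clause never spans more than one chunk.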
@@ -275,7 +278,7 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
) % (order,)
txn.execute(sql, (room_id, from_id, to_id, limit))
rows = [_EventDictReturn(row[0], None, row[1]) for row in txn]
rows = [_EventDictReturn(row[0], None, None, row[1]) for row in txn]
return rows
rows = yield self.runInteraction("get_room_events_stream_for_room", f)
@@ -325,7 +328,7 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
)
txn.execute(sql, (user_id, from_id, to_id,))
rows = [_EventDictReturn(row[0], None, row[1]) for row in txn]
rows = [_EventDictReturn(row[0], None, None, row[1]) for row in txn]
return rows
@@ -392,7 +395,7 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
end_token = RoomStreamToken.parse(end_token)
rows, token = yield self.runInteraction(
rows, token, _ = yield self.runInteraction(
"get_recent_event_ids_for_room", self._paginate_room_events_txn,
room_id, from_token=end_token, limit=limit,
)
@@ -437,15 +440,17 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
`room_id` causes it to return the current room specific topological
token.
"""
token = yield self.get_room_max_stream_ordering()
if room_id is None:
defer.returnValue("s%d" % (token,))
token = yield self.get_room_max_stream_ordering()
defer.returnValue(str(RoomStreamToken(None, None, token)))
else:
topo = yield self.runInteraction(
"_get_max_topological_txn", self._get_max_topological_txn,
token = yield self.runInteraction(
"get_room_events_max_id", self._get_topological_token_for_room_txn,
room_id,
)
defer.returnValue("t%d-%d" % (topo, token))
if not token:
raise Exception("Server not in room")
defer.returnValue(str(token))
def get_stream_token_for_event(self, event_id):
"""The stream token for an event
@@ -460,7 +465,7 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
table="events",
keyvalues={"event_id": event_id},
retcol="stream_ordering",
).addCallback(lambda row: "s%d" % (row,))
).addCallback(lambda row: str(RoomStreamToken(None, None, row)))
def get_topological_token_for_event(self, event_id):
"""The stream token for an event
@@ -469,16 +474,34 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
Raises:
StoreError if the event wasn't in the database.
Returns:
A deferred "t%d-%d" topological token.
A deferred topological token.
"""
return self._simple_select_one(
table="events",
keyvalues={"event_id": event_id},
retcols=("stream_ordering", "topological_ordering"),
retcols=("stream_ordering", "topological_ordering", "chunk_id"),
desc="get_topological_token_for_event",
).addCallback(lambda row: "t%d-%d" % (
row["topological_ordering"], row["stream_ordering"],)
)
).addCallback(lambda row: str(RoomStreamToken(
row["chunk_id"],
row["topological_ordering"],
row["stream_ordering"],
)))
def _get_topological_token_for_room_txn(self, txn, room_id):
sql = """
SELECT chunk_id, topological_ordering, stream_ordering
FROM events
NATURAL JOIN event_forward_extremities
WHERE room_id = ?
ORDER BY stream_ordering DESC
LIMIT 1
"""
txn.execute(sql, (room_id,))
row = txn.fetchone()
if row:
c, t, s = row
return RoomStreamToken(c, t, s)
return None
def get_max_topological_token(self, room_id, stream_key):
sql = (
@@ -515,18 +538,25 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
null topological_ordering.
"""
for event, row in zip(events, rows):
chunk = row.chunk_id
topo = row.topological_ordering
stream = row.stream_ordering
if topo_order and row.topological_ordering:
topo = row.topological_ordering
else:
topo = None
internal = event.internal_metadata
internal.before = str(RoomStreamToken(topo, stream - 1))
internal.after = str(RoomStreamToken(topo, stream))
internal.order = (
int(topo) if topo else 0,
int(stream),
)
if topo_order and chunk:
internal.before = str(RoomStreamToken(chunk, topo, stream - 1))
internal.after = str(RoomStreamToken(chunk, topo, stream))
internal.order = (
int(chunk) if chunk else 0,
int(topo) if topo else 0,
int(stream),
)
else:
internal.before = str(RoomStreamToken(None, None, stream - 1))
internal.after = str(RoomStreamToken(None, None, stream))
internal.order = (
0, 0, int(stream),
)
@defer.inlineCallbacks
def get_events_around(self, room_id, event_id, before_limit, after_limit):
@@ -586,27 +616,29 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
"event_id": event_id,
"room_id": room_id,
},
retcols=["stream_ordering", "topological_ordering"],
retcols=["stream_ordering", "topological_ordering", "chunk_id"],
)
# Paginating backwards includes the event at the token, but paginating
# forward doesn't.
before_token = RoomStreamToken(
results["topological_ordering"] - 1,
results["stream_ordering"],
results["chunk_id"],
results["topological_ordering"],
results["stream_ordering"] - 1,
)
after_token = RoomStreamToken(
results["chunk_id"],
results["topological_ordering"],
results["stream_ordering"],
)
rows, start_token = self._paginate_room_events_txn(
rows, start_token, _ = self._paginate_room_events_txn(
txn, room_id, before_token, direction='b', limit=before_limit,
)
events_before = [r.event_id for r in rows]
rows, end_token = self._paginate_room_events_txn(
rows, end_token, _ = self._paginate_room_events_txn(
txn, room_id, after_token, direction='f', limit=after_limit,
)
events_after = [r.event_id for r in rows]
@@ -684,16 +716,42 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
results to only those before
direction(char): Either 'b' or 'f' to indicate whether we are
paginating forwards or backwards from `from_key`.
limit (int): The maximum number of events to return. Zero or less
means no limit.
limit (int): The maximum number of events to return.
event_filter (Filter|None): If provided filters the events to
those that match the filter.
Returns:
Deferred[tuple[list[_EventDictReturn], str]]: Returns the results
as a list of _EventDictReturn and a token that points to the end
of the result set.
Deferred[tuple[list[_EventDictReturn], str, list[int]]]: Returns
the results as a list of _EventDictReturn, a token that points to
the end of the result set, and a list of chunks iterated over.
"""
assert int(limit) >= 0
# For backwards compatibility we need to check if the token has a
# topological part but no chunk part. If that's the case we can use the
# stream part to generate an appropriate topological token.
if from_token.chunk is None and from_token.topological is not None:
res = self._simple_select_one_txn(
txn,
table="events",
keyvalues={
"stream_ordering": from_token.stream,
},
retcols=(
"chunk_id",
"topological_ordering",
"stream_ordering",
),
allow_none=True,
)
if res and res["chunk_id"] is not None:
from_token = RoomStreamToken(
res["chunk_id"],
res["topological_ordering"],
res["stream_ordering"],
)
# Tokens really represent positions between elements, but we use
# the convention of pointing to the event before the gap. Hence
# we have a bit of asymmetry when it comes to equalities.
@@ -723,29 +781,75 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
bounds += " AND " + filter_clause
args.extend(filter_args)
if int(limit) > 0:
args.append(int(limit))
limit_str = " LIMIT ?"
else:
limit_str = ""
limit = int(limit)
args.append(limit)
sql = (
"SELECT event_id, topological_ordering, stream_ordering"
"SELECT event_id, chunk_id, topological_ordering, stream_ordering"
" FROM events"
" WHERE outlier = ? AND room_id = ? AND %(bounds)s"
" ORDER BY topological_ordering %(order)s,"
" stream_ordering %(order)s %(limit)s"
" stream_ordering %(order)s LIMIT ?"
) % {
"bounds": bounds,
"order": order,
"limit": limit_str
}
txn.execute(sql, args)
rows = [_EventDictReturn(row[0], row[1], row[2]) for row in txn]
rows = [_EventDictReturn(row[0], row[1], row[2], row[3]) for row in txn]
iterated_chunks = []
chunk_id = None
if rows:
chunk_id = rows[-1].chunk_id
iterated_chunks = [r.chunk_id for r in rows]
elif from_token.chunk:
chunk_id = from_token.chunk
iterated_chunks = [chunk_id]
table = ChunkDBOrderedListStore(
txn, room_id, self.clock, self.database_engine,
)
while chunk_id and (limit <= 0 or len(rows) < limit):
if chunk_id not in iterated_chunks:
iterated_chunks.append(chunk_id)
if direction == 'b':
# FIXME: There may be multiple things here
chunk_id = table.get_prev(chunk_id)
else:
chunk_id = table.get_next(chunk_id)
if chunk_id is None:
break
sql = (
"SELECT event_id, chunk_id, topological_ordering, stream_ordering"
" FROM events"
" WHERE outlier = ? AND room_id = ? AND chunk_id = %(chunk_id)d"
" ORDER BY topological_ordering %(order)s,"
" stream_ordering %(order)s LIMIT ?"
) % {
"chunk_id": chunk_id,
"order": order,
}
txn.execute(sql, args)
new_rows = [_EventDictReturn(row[0], row[1], row[2], row[3]) for row in txn]
if not new_rows:
break
rows.extend(new_rows)
if limit > 0:
rows = rows[:limit]
if rows:
chunk = rows[-1].chunk_id
topo = rows[-1].topological_ordering
toke = rows[-1].stream_ordering
if direction == 'b':
@@ -755,12 +859,12 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
# when we are going backwards so we subtract one from the
# stream part.
toke -= 1
next_token = RoomStreamToken(topo, toke)
next_token = RoomStreamToken(chunk, topo, toke)
else:
# TODO (erikj): We should work out what to do here instead.
next_token = to_token if to_token else from_token
return rows, str(next_token),
return rows, str(next_token), iterated_chunks,
@defer.inlineCallbacks
def paginate_room_events(self, room_id, from_key, to_key=None,
@@ -782,16 +886,38 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
Returns:
tuple[list[dict], str]: Returns the results as a list of dicts and
a token that points to the end of the result set. The dicts have
the keys "event_id", "topological_ordering" and "stream_orderign".
the keys "event_id", "topological_ordering" and "stream_ordering".
"""
from_key = RoomStreamToken.parse(from_key)
if to_key:
to_key = RoomStreamToken.parse(to_key)
rows, token = yield self.runInteraction(
"paginate_room_events", self._paginate_room_events_txn,
room_id, from_key, to_key, direction, limit, event_filter,
def _do_paginate_room_events(txn):
rows, token, chunks = self._paginate_room_events_txn(
txn, room_id, from_key, to_key, direction, limit, event_filter,
)
extremities = []
seen = set()
for chunk_id in chunks:
if chunk_id in seen:
continue
seen.add(chunk_id)
event_ids = self._simple_select_onecol_txn(
txn,
table="chunk_backwards_extremities",
keyvalues={"chunk_id": chunk_id},
retcol="event_id"
)
extremities.extend(e for e in event_ids if e not in extremities)
return rows, token, extremities
rows, token, extremities = yield self.runInteraction(
"paginate_room_events", _do_paginate_room_events,
)
events = yield self._get_events(
@@ -801,7 +927,49 @@ class StreamWorkerStore(EventsWorkerStore, SQLBaseStore):
self._set_before_and_after(events, rows)
defer.returnValue((events, token))
defer.returnValue((events, token, extremities))
def clamp_token_before(self, room_id, token, clamp_to):
token = RoomStreamToken.parse(token)
clamp_to = RoomStreamToken.parse(clamp_to)
def clamp_token_before_txn(txn, token):
if not token.topological:
sql = """
SELECT chunk_id, topological_ordering FROM events
WHERE room_id = ? AND stream_ordering <= ?
ORDER BY stream_ordering DESC
"""
txn.execute(sql, (room_id, token.stream,))
row = txn.fetchone()
if not row:
return str(token)
chunk_id, topo = row
token = RoomStreamToken(chunk_id, topo, token.stream)
if token.chunk == clamp_to.chunk:
if token.topological < clamp_to.topological:
return str(token)
else:
return str(clamp_to)
sql = "SELECT rationale FROM chunk_linearized WHERE chunk_id = ?"
txn.execute(sql, (token.chunk,))
token_order, = txn.fetchone()
txn.execute(sql, (clamp_to.chunk,))
clamp_order, = txn.fetchone()
if token_order < clamp_order:
return str(token)
else:
return str(clamp_to)
return self.runInteraction(
"clamp_token_before", clamp_token_before_txn, token
)
class StreamStore(StreamWorkerStore):

View File

@@ -306,7 +306,7 @@ StreamToken.START = StreamToken(
)
class RoomStreamToken(namedtuple("_StreamToken", "topological stream")):
class RoomStreamToken(namedtuple("_StreamToken", "chunk topological stream")):
"""Tokens are positions between events. The token "s1" comes after event 1.
s0 s1
@@ -334,10 +334,17 @@ class RoomStreamToken(namedtuple("_StreamToken", "topological stream")):
def parse(cls, string):
try:
if string[0] == 's':
return cls(topological=None, stream=int(string[1:]))
if string[0] == 't':
return cls(chunk=None, topological=None, stream=int(string[1:]))
if string[0] == 't': # For backwards compat with older tokens.
parts = string[1:].split('-', 1)
return cls(topological=int(parts[0]), stream=int(parts[1]))
return cls(chunk=None, topological=int(parts[0]), stream=int(parts[1]))
if string[0] == 'c':
parts = string[1:].split('~', 2)
return cls(
chunk=int(parts[0]),
topological=int(parts[1]),
stream=int(parts[2]),
)
except Exception:
pass
raise SynapseError(400, "Invalid token %r" % (string,))
@@ -346,12 +353,14 @@ class RoomStreamToken(namedtuple("_StreamToken", "topological stream")):
def parse_stream_token(cls, string):
try:
if string[0] == 's':
return cls(topological=None, stream=int(string[1:]))
return cls(chunk=None, topological=None, stream=int(string[1:]))
except Exception:
pass
raise SynapseError(400, "Invalid token %r" % (string,))
def __str__(self):
if self.chunk is not None:
return "c%d~%d~%d" % (self.chunk, self.topological, self.stream)
if self.topological is not None:
return "t%d-%d" % (self.topological, self.stream)
else:

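A quick sketch of how the new chunk-aware tokens round-trip (the values are
made up):

tok = RoomStreamToken.parse("c3~4~99")
assert (tok.chunk, tok.topological, tok.stream) == (3, 4, 99)
assert str(tok) == "c3~4~99"

# Legacy tokens still parse, just without a chunk part.
legacy = RoomStreamToken.parse("t5-12")
assert (legacy.chunk, legacy.topological, legacy.stream) == (None, 5, 12)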
View File

@@ -20,6 +20,8 @@ from twisted.internet import defer, reactor, task
import time
import logging
from itertools import islice
logger = logging.getLogger(__name__)
@@ -79,3 +81,19 @@ class Clock(object):
except Exception:
if not ignore_errs:
raise
def batch_iter(iterable, size):
"""batch an iterable up into tuples with a maximum size
Args:
iterable (iterable): the iterable to slice
size (int): the maximum batch size
Returns:
an iterator over the chunks
"""
# make sure we can deal with iterables like lists too
sourceiter = iter(iterable)
# call islice until it returns an empty tuple
return iter(lambda: tuple(islice(sourceiter, size)), ())
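A minimal usage sketch, assuming batch_iter is exported from synapse.util as
the surrounding context suggests:

from synapse.util import batch_iter

# Batches preserve order; the final batch may be short.
assert list(batch_iter(range(7), 3)) == [(0, 1, 2), (3, 4, 5), (6,)]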

View File

@@ -16,6 +16,9 @@
import synapse.metrics
import os
from six.moves import intern
import six
CACHE_SIZE_FACTOR = float(os.environ.get("SYNAPSE_CACHE_FACTOR", 0.5))
metrics = synapse.metrics.get_metrics_for("synapse.util.caches")
@@ -66,7 +69,9 @@ def intern_string(string):
return None
try:
string = string.encode("ascii")
if six.PY2:
string = string.encode("ascii")
return intern(string)
except UnicodeEncodeError:
return string

View File

@@ -0,0 +1,349 @@
# -*- coding: utf-8 -*-
# Copyright 2018 New Vector Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""This module contains an implementation of the Katriel-Bodlaender algorithm,
which is used to do online topological ordering of graphs.
Note that the ordering derived from the graph is such that the source node of
an edge comes before the target node of the edge, i.e. a graph of A -> B -> C
would produce the ordering [A, B, C].
This ordering is therefore opposite to what one might expect when considering
the room DAG, as newer messages would be added to the start rather than the
end.
***The ChunkDBOrderedListStore therefore inverts the direction of edges***
See:
A tight analysis of the Katriel-Bodlaender algorithm for online topological
ordering
Hsiao-Fei Liu and Kun-Mao Chao
https://www.sciencedirect.com/science/article/pii/S0304397507006573
and:
Online Topological Ordering
Irit Katriel and Hans L. Bodlaender
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.78.7933
"""
from abc import ABCMeta, abstractmethod
class OrderedListStore(object):
"""An abstract base class that is used to store a graph and maintain a
topologically consistent total ordering.
Internally this uses the Katriel-Bodlaender algorithm, which requires the
store expose an interface for the total ordering that supports:
- Insertion of the node into the ordering either immediately before or
after another node.
- Deletion of the node from the ordering
- Comparing the relative ordering of two arbitrary nodes
- Get the node immediately before or after a given node in the ordering
It also needs to be able to interact with the graph in the following ways:
- Query the number of edges from a node in the graph
- Query the number of edges into a node in the graph
- Add an edge to the graph
Users of subclasses should call `add_node` and `add_edge` whenever editing
the graph. The total ordering exposed will remain constant until the next
call to one of these methods.
Note: Calls to `add_node` and `add_edge` cannot overlap, and so callers
should perform some form of locking.
"""
__metaclass__ = ABCMeta
def add_node(self, node_id):
"""Adds a node to the graph.
Args:
node_id (str)
"""
self._insert_before(node_id, None)
def add_edge(self, source, target):
"""Adds a new edge to the graph and updates the ordering.
See module level docs.
Note that both the source and target nodes must have been inserted into
the store (at an arbitrary position) already.
Args:
source (str): The source node of the new edge
target (str): The target node of the new edge
"""
# The following is the Katriel-Bodlaender algorithm.
to_s = []
from_t = []
to_s_neighbours = []
from_t_neighbours = []
to_s_indegree = 0
from_t_outdegree = 0
s = source
t = target
while s and t and not self.is_before(s, t):
m_s = to_s_indegree
m_t = from_t_outdegree
# These functions return a list of tuples where the first term is a
# float that can be used to order the list of neighbours.
# These are valid until the next write
pe_s = self.get_nodes_with_edges_to(s)
fe_t = self.get_nodes_with_edges_from(t)
for _, n in pe_s:
assert n not in to_s
for _, n in fe_t:
assert n not in from_t
l_s = len(pe_s)
l_t = len(fe_t)
if m_s + l_s <= m_t + l_t:
to_s.append(s)
to_s_neighbours.extend(pe_s)
to_s_indegree += l_s
if to_s_neighbours:
to_s_neighbours = list(set(to_s_neighbours))
to_s_neighbours.sort()
_, s = to_s_neighbours.pop()
else:
s = None
if m_s + l_s >= m_t + l_t:
from_t.append(t)
from_t_neighbours.extend(fe_t)
from_t_outdegree += l_t
if from_t_neighbours:
from_t_neighbours = list(set(from_t_neighbours))
from_t_neighbours.sort(reverse=True)
_, t = from_t_neighbours.pop()
else:
t = None
if s is None:
s = self.get_prev(target)
if t is None:
t = self.get_next(source)
for node_id in to_s:
self._delete_ordering(node_id)
while to_s:
s1 = to_s.pop()
self._insert_after(s1, s)
s = s1
for node_id in from_t:
self._delete_ordering(node_id)
while from_t:
t1 = from_t.pop()
self._insert_before(t1, t)
t = t1
self._add_edge_to_graph(source, target)
@abstractmethod
def is_before(self, first_node, second_node):
"""Returns whether the first node is before the second node.
Args:
first_node (str)
second_node (str)
Returns:
bool: True if first_node is before second_node
"""
pass
@abstractmethod
def get_prev(self, node_id):
"""Gets the node immediately before the given node in the topological
ordering.
Args:
node_id (str)
Returns:
str|None: A node ID or None if no preceding node exists
"""
pass
@abstractmethod
def get_next(self, node_id):
"""Gets the node immediately after the given node in the topological
ordering.
Args:
node_id (str)
Returns:
str|None: A node ID or None if no succeeding node exists
"""
pass
@abstractmethod
def get_nodes_with_edges_to(self, node_id):
"""Get all nodes with edges to the given node
Args:
node_id (str)
Returns:
list[tuple[float, str]]: Returns a list of tuples of an ordering
term and the node ID. The ordering term can be used to sort the
returned list.
The ordering is valid until subsequent calls to `add_edge`
functions
"""
pass
@abstractmethod
def get_nodes_with_edges_from(self, node_id):
"""Get all nodes with edges from the given node
Args:
node_id (str)
Returns:
list[tuple[float, str]]: Returns a list of tuples of an ordering
term and the node ID. The ordering term can be used to sort the
returned list.
The ordering is valid until subsequent calls to `add_edge`
functions
"""
pass
@abstractmethod
def _insert_before(self, node_id, target_id):
"""Inserts node immediately before target node.
If target_id is None then the node is inserted at the end of the list
Args:
node_id (str)
target_id (str|None)
"""
pass
@abstractmethod
def _insert_after(self, node_id, target_id):
"""Inserts node immediately after target node.
If target_id is None then the node is inserted at the start of the list
Args:
node_id (str)
target_id (str|None)
"""
pass
@abstractmethod
def _delete_ordering(self, node_id):
"""Deletes the given node from the ordered list (but not the graph).
Used when we want to reinsert it into a different position
Args:
node_id (str)
"""
pass
@abstractmethod
def _add_edge_to_graph(self, source_id, target_id):
"""Adds an edge to the graph from source to target.
Does not update ordering.
Args:
source_id (str)
target_id (str)
"""
pass
class InMemoryOrderedListStore(OrderedListStore):
"""An in memory OrderedListStore
"""
def __init__(self):
# The ordered list of nodes
self.list = []
# Map from node to set of nodes that it references
self.edges_from = {}
# Map from node to set of nodes that it is referenced by
self.edges_to = {}
def is_before(self, first_node, second_node):
return self.list.index(first_node) < self.list.index(second_node)
def get_prev(self, node_id):
idx = self.list.index(node_id) - 1
if idx >= 0:
return self.list[idx]
else:
return None
def get_next(self, node_id):
idx = self.list.index(node_id) + 1
if idx < len(self.list):
return self.list[idx]
else:
return None
def _insert_before(self, node_id, target_id):
if target_id is not None:
idx = self.list.index(target_id)
self.list.insert(idx, node_id)
else:
self.list.append(node_id)
def _insert_after(self, node_id, target_id):
if target_id is not None:
idx = self.list.index(target_id) + 1
self.list.insert(idx, node_id)
else:
self.list.insert(0, node_id)
def _delete_ordering(self, node_id):
self.list.remove(node_id)
def get_nodes_with_edges_to(self, node_id):
to_nodes = self.edges_to.get(node_id, [])
return [(self.list.index(nid), nid) for nid in to_nodes]
def get_nodes_with_edges_from(self, node_id):
from_nodes = self.edges_from.get(node_id, [])
return [(self.list.index(nid), nid) for nid in from_nodes]
def _add_edge_to_graph(self, source_id, target_id):
self.edges_from.setdefault(source_id, set()).add(target_id)
self.edges_to.setdefault(target_id, set()).add(source_id)
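A minimal sketch exercising the in-memory store (node names are
illustrative). Even when nodes are inserted in reverse, adding the edges
A -> B -> C settles the ordering to [A, B, C], matching the module-level
convention that sources sort before targets:

store = InMemoryOrderedListStore()
for node in ("C", "B", "A"):
    store.add_node(node)

# Each add_edge reorders the list so every source precedes its target.
store.add_edge("A", "B")
store.add_edge("B", "C")

assert store.list == ["A", "B", "C"]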

View File

@@ -60,7 +60,7 @@ class LoggingContext(object):
__slots__ = [
"previous_context", "name", "ru_stime", "ru_utime",
"db_txn_count", "db_txn_duration_ms", "db_sched_duration_ms",
"usage_start", "usage_end",
"usage_start",
"main_thread", "alive",
"request", "tag",
]
@@ -109,8 +109,10 @@ class LoggingContext(object):
# ms spent waiting for db txns to be scheduled
self.db_sched_duration_ms = 0
# If alive, holds the thread resource usage from when the logcontext
# last became active.
self.usage_start = None
self.usage_end = None
self.main_thread = threading.current_thread()
self.request = None
self.tag = ""
@@ -159,7 +161,7 @@ class LoggingContext(object):
"""Restore the logging context in thread local storage to the state it
was before this context was entered.
Returns:
None to avoid suppressing any exeptions that were thrown.
None to avoid suppressing any exceptions that were thrown.
"""
current = self.set_current_context(self.previous_context)
if current is not self:
@@ -185,29 +187,43 @@ class LoggingContext(object):
def start(self):
if threading.current_thread() is not self.main_thread:
logger.warning("Started logcontext %s on different thread", self)
return
if self.usage_start and self.usage_end:
self.ru_utime += self.usage_end.ru_utime - self.usage_start.ru_utime
self.ru_stime += self.usage_end.ru_stime - self.usage_start.ru_stime
self.usage_start = None
self.usage_end = None
# If we haven't already started record the thread resource usage so
# far
if not self.usage_start:
self.usage_start = get_thread_resource_usage()
def stop(self):
if threading.current_thread() is not self.main_thread:
logger.warning("Stopped logcontext %s on different thread", self)
return
# When we stop, let's record the resource used since we started
if self.usage_start:
self.usage_end = get_thread_resource_usage()
usage_end = get_thread_resource_usage()
self.ru_utime += usage_end.ru_utime - self.usage_start.ru_utime
self.ru_stime += usage_end.ru_stime - self.usage_start.ru_stime
self.usage_start = None
else:
logger.warning("Called stop on logcontext %s without calling start", self)
def get_resource_usage(self):
"""Get CPU time used by this logcontext so far.
Returns:
tuple[float, float]: The user and system CPU usage in seconds
"""
ru_utime = self.ru_utime
ru_stime = self.ru_stime
if self.usage_start and threading.current_thread() is self.main_thread:
# If we are on the correct thread and we're currently running then we
# can include resource usage so far.
is_main_thread = threading.current_thread() is self.main_thread
if self.alive and self.usage_start and is_main_thread:
current = get_thread_resource_usage()
ru_utime += current.ru_utime - self.usage_start.ru_utime
ru_stime += current.ru_stime - self.usage_start.ru_stime

View File

@@ -2,6 +2,9 @@ from synapse.rest.client.transactions import HttpTransactionCache
from synapse.rest.client.transactions import CLEANUP_PERIOD_MS
from twisted.internet import defer
from mock import Mock, call
from synapse.util import async
from synapse.util.logcontext import LoggingContext
from tests import unittest
from tests.utils import MockClock
@@ -39,6 +42,78 @@ class HttpTransactionCacheTestCase(unittest.TestCase):
# expect only a single call to do the work
cb.assert_called_once_with("some_arg", keyword="arg", changing_args=0)
@defer.inlineCallbacks
def test_logcontexts_with_async_result(self):
@defer.inlineCallbacks
def cb():
yield async.sleep(0)
defer.returnValue("yay")
@defer.inlineCallbacks
def test():
with LoggingContext("c") as c1:
res = yield self.cache.fetch_or_execute(self.mock_key, cb)
self.assertIs(LoggingContext.current_context(), c1)
self.assertEqual(res, "yay")
# run the test twice in parallel
d = defer.gatherResults([test(), test()])
self.assertIs(LoggingContext.current_context(), LoggingContext.sentinel)
yield d
self.assertIs(LoggingContext.current_context(), LoggingContext.sentinel)
@defer.inlineCallbacks
def test_does_not_cache_exceptions(self):
"""Checks that, if the callback throws an exception, it is called again
for the next request.
"""
called = [False]
def cb():
if called[0]:
# return a valid result the second time
return defer.succeed(self.mock_http_response)
called[0] = True
raise Exception("boo")
with LoggingContext("test") as test_context:
try:
yield self.cache.fetch_or_execute(self.mock_key, cb)
except Exception as e:
self.assertEqual(e.message, "boo")
self.assertIs(LoggingContext.current_context(), test_context)
res = yield self.cache.fetch_or_execute(self.mock_key, cb)
self.assertEqual(res, self.mock_http_response)
self.assertIs(LoggingContext.current_context(), test_context)
@defer.inlineCallbacks
def test_does_not_cache_failures(self):
"""Checks that, if the callback returns a failure, it is called again
for the next request.
"""
called = [False]
def cb():
if called[0]:
# return a valid result the second time
return defer.succeed(self.mock_http_response)
called[0] = True
return defer.fail(Exception("boo"))
with LoggingContext("test") as test_context:
try:
yield self.cache.fetch_or_execute(self.mock_key, cb)
except Exception as e:
self.assertEqual(e.message, "boo")
self.assertIs(LoggingContext.current_context(), test_context)
res = yield self.cache.fetch_or_execute(self.mock_key, cb)
self.assertEqual(res, self.mock_http_response)
self.assertIs(LoggingContext.current_context(), test_context)
@defer.inlineCallbacks
def test_cleans_up(self):
cb = Mock(
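
The failure-handling these two tests pin down can be reduced to a toy cache. The sketch below is not Synapse's HttpTransactionCache (it ignores the in-flight de-duplication and logcontext handling the earlier tests cover), but it shows why neither a raised exception nor a returned failure is ever cached: both propagate out of the yield before the cache assignment runs.

from twisted.internet import defer

class ResultCache(object):
    """Toy sketch: caches successful results only."""

    def __init__(self):
        self._results = {}

    @defer.inlineCallbacks
    def fetch_or_execute(self, key, fn):
        if key in self._results:
            defer.returnValue(self._results[key])
        result = yield defer.maybeDeferred(fn)
        # only successes reach this line; exceptions and Failures are
        # re-raised by the yield above, so nothing is stored for `key`
        self._results[key] = result
        defer.returnValue(result)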

View File

@@ -0,0 +1,241 @@
# -*- coding: utf-8 -*-
# Copyright 2018 New Vector Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from twisted.internet import defer
import random
import tests.unittest
import tests.utils
from gmpy2 import mpq as Fraction
from synapse.storage.chunk_ordered_table import ChunkDBOrderedListStore
class ChunkLinearizerStoreTestCase(tests.unittest.TestCase):
"""Tests to ensure that the ordering and rebalancing functions of
ChunkDBOrderedListStore work as expected.
"""
def __init__(self, *args, **kwargs):
super(ChunkLinearizerStoreTestCase, self).__init__(*args, **kwargs)
@defer.inlineCallbacks
def setUp(self):
hs = yield tests.utils.setup_test_homeserver()
self.store = hs.get_datastore()
self.clock = hs.get_clock()
@defer.inlineCallbacks
def test_simple_insert_fetch(self):
room_id = "foo_room1"
def test_txn(txn):
table = ChunkDBOrderedListStore(
txn, room_id, self.clock,
self.store.database_engine,
5, 100,
)
table.add_node("A")
table._insert_after("B", "A")
table._insert_before("C", "A")
table._insert_after("D", "A")
sql = """
SELECT chunk_id, numerator, denominator FROM chunk_linearized
WHERE room_id = ?
"""
txn.execute(sql, (room_id,))
ordered = sorted([(Fraction(n, d), r) for r, n, d in txn])
ordered = [c for _, c in ordered]
self.assertEqual(["C", "A", "D", "B"], ordered)
yield self.store.runInteraction("test", test_txn)
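
The numerator/denominator pair queried here is the chunk's position expressed as a fraction. The "use farey function" commit in this branch suggests new positions are picked as mediants: the mediant (a+c)/(b+d) of a/b < c/d always lies strictly between them, so a row can be inserted between any two neighbours without renumbering anything else. A stdlib illustration (the test itself uses gmpy2's mpq, but the arithmetic is identical):

from fractions import Fraction

def mediant(x, y):
    # for x < y, the mediant lies strictly between x and y
    return Fraction(x.numerator + y.numerator,
                    x.denominator + y.denominator)

a, b = Fraction(1, 2), Fraction(2, 3)
m = mediant(a, b)
assert a < m < b  # m == Fraction(3, 5)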
@defer.inlineCallbacks
def test_many_insert_fetch(self):
room_id = "foo_room2"
def test_txn(txn):
table = ChunkDBOrderedListStore(
txn, room_id, self.clock,
self.store.database_engine,
5, 100,
)
nodes = [(i, "node_%d" % (i,)) for i in xrange(1, 1000)]
expected = [n for _, n in nodes]
already_inserted = []
random.shuffle(nodes)
while nodes:
i, node_id = nodes.pop()
if not already_inserted:
table.add_node(node_id)
else:
for j, target_id in already_inserted:
if j > i:
break
if j < i:
table._insert_after(node_id, target_id)
else:
table._insert_before(node_id, target_id)
already_inserted.append((i, node_id))
already_inserted.sort()
sql = """
SELECT chunk_id, numerator, denominator FROM chunk_linearized
WHERE room_id = ?
"""
txn.execute(sql, (room_id,))
ordered = sorted([(Fraction(n, d), r) for r, n, d in txn])
ordered = [c for _, c in ordered]
self.assertEqual(expected, ordered)
yield self.store.runInteraction("test", test_txn)
@defer.inlineCallbacks
def test_prepend_and_append(self):
room_id = "foo_room3"
def test_txn(txn):
table = ChunkDBOrderedListStore(
txn, room_id, self.clock,
self.store.database_engine,
5, 1000,
)
table.add_node("a")
expected = ["a"]
for i in xrange(1, 1000):
node_id = "node_id_before_%d" % i
table._insert_before(node_id, expected[0])
expected.insert(0, node_id)
for i in xrange(1, 1000):
node_id = "node_id_after_%d" % i
table._insert_after(node_id, expected[-1])
expected.append(node_id)
sql = """
SELECT chunk_id, numerator, denominator FROM chunk_linearized
WHERE room_id = ?
"""
txn.execute(sql, (room_id,))
ordered = sorted([(Fraction(n, d), r) for r, n, d in txn])
ordered = [c for _, c in ordered]
self.assertEqual(expected, ordered)
yield self.store.runInteraction("test", test_txn)
@defer.inlineCallbacks
def test_worst_case(self):
room_id = "foo_room3"
def test_txn(txn):
table = ChunkDBOrderedListStore(
txn, room_id, self.clock,
self.store.database_engine,
5, 100,
)
table.add_node("a")
prev_node = "a"
expected_prefix = ["a"]
expected_suffix = []
for i in xrange(1, 100):
node_id = "node_id_%d" % i
if i % 2 == 0:
table._insert_before(node_id, prev_node)
expected_prefix.append(node_id)
else:
table._insert_after(node_id, prev_node)
expected_suffix.append(node_id)
prev_node = node_id
sql = """
SELECT chunk_id, numerator, denominator FROM chunk_linearized
WHERE room_id = ?
"""
txn.execute(sql, (room_id,))
ordered = sorted([(Fraction(n, d), r) for r, n, d in txn])
ordered = [c for _, c in ordered]
expected = expected_prefix + list(reversed(expected_suffix))
self.assertEqual(expected, ordered)
yield self.store.runInteraction("test", test_txn)
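
This zig-zag is the adversarial case for mediant-based orderings: alternately inserting just before and just after the previous node takes mediants of the two most recently inserted fractions, so the denominators grow like Fibonacci numbers, i.e. exponentially in the number of inserts. Presumably that is what the rebalancing thresholds passed to ChunkDBOrderedListStore (the 5 and 100 above) exist to contain. A sketch of the blow-up:

from fractions import Fraction

def mediant(x, y):
    return Fraction(x.numerator + y.numerator,
                    x.denominator + y.denominator)

# first two inserts land at 1/2 and 2/3; thereafter each new fraction
# is the mediant of the two most recently inserted ones
x, y = Fraction(1, 2), Fraction(2, 3)
denominators = [x.denominator, y.denominator]
for _ in range(8):
    x, y = y, mediant(x, y)
    denominators.append(y.denominator)

assert denominators == [2, 3, 5, 8, 13, 21, 34, 55, 89, 144]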
@defer.inlineCallbacks
def test_get_edges_to(self):
room_id = "foo_room4"
def test_txn(txn):
table = ChunkDBOrderedListStore(
txn, room_id, self.clock,
self.store.database_engine,
5, 100,
)
table.add_node("A")
table._insert_after("B", "A")
table._add_edge_to_graph("A", "B")
table._insert_before("C", "A")
table._add_edge_to_graph("C", "A")
nodes = table.get_nodes_with_edges_from("A")
self.assertEqual([n for _, n in nodes], ["B"])
nodes = table.get_nodes_with_edges_to("A")
self.assertEqual([n for _, n in nodes], ["C"])
yield self.store.runInteraction("test", test_txn)
@defer.inlineCallbacks
def test_get_next_and_prev(self):
room_id = "foo_room5"
def test_txn(txn):
table = ChunkDBOrderedListStore(
txn, room_id, self.clock,
self.store.database_engine,
5, 100,
)
table.add_node("A")
table._insert_after("B", "A")
table._insert_before("C", "A")
self.assertEqual(table.get_next("A"), "B")
self.assertEqual(table.get_prev("A"), "C")
yield self.store.runInteraction("test", test_txn)

View File

@@ -55,7 +55,7 @@ class EventPushActionsStoreTestCase(tests.unittest.TestCase):
def _assert_counts(noitf_count, highlight_count):
counts = yield self.store.runInteraction(
"", self.store._get_unread_counts_by_pos_txn,
room_id, user_id, 0, 0
room_id, user_id, 0
)
self.assertEquals(
counts,
@@ -86,7 +86,7 @@ class EventPushActionsStoreTestCase(tests.unittest.TestCase):
def _mark_read(stream, depth):
return self.store.runInteraction(
"", self.store._remove_old_push_actions_before_txn,
room_id, user_id, depth, stream
room_id, user_id, stream
)
yield _assert_counts(0, 0)

View File

@@ -42,9 +42,14 @@ class RegistrationStoreTestCase(unittest.TestCase):
yield self.store.register(self.user_id, self.tokens[0], self.pwhash)
self.assertEquals(
# TODO(paul): Surely this field should be 'user_id', not 'name'
# Additionally surely it shouldn't come in a 1-element list
{"name": self.user_id, "password_hash": self.pwhash, "is_guest": 0},
{
# TODO(paul): Surely this field should be 'user_id', not 'name'
"name": self.user_id,
"password_hash": self.pwhash,
"is_guest": 0,
"consent_version": None,
"consent_server_notice_sent": None,
},
(yield self.store.get_user_by_id(self.user_id))
)

View File

@@ -0,0 +1,84 @@
# -*- coding: utf-8 -*-
# Copyright 2018 New Vector Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from synapse.util.katriel_bodlaender import InMemoryOrderedListStore
from tests import unittest
class KatrielBodlaenderTests(unittest.TestCase):
def test_simple_graph(self):
store = InMemoryOrderedListStore()
nodes = [
"node_1",
"node_2",
"node_3",
"node_4",
]
for node in nodes:
store.add_node(node)
store.add_edge("node_2", "node_3")
store.add_edge("node_1", "node_2")
store.add_edge("node_3", "node_4")
self.assertEqual(nodes, store.list)
def test_reverse_graph(self):
store = InMemoryOrderedListStore()
nodes = [
"node_1",
"node_2",
"node_3",
"node_4",
]
for node in nodes:
store.add_node(node)
store.add_edge("node_3", "node_2")
store.add_edge("node_2", "node_1")
store.add_edge("node_4", "node_3")
self.assertEqual(list(reversed(nodes)), store.list)
def test_divergent_graph(self):
store = InMemoryOrderedListStore()
nodes = [
"node_1",
"node_2",
"node_3",
"node_4",
"node_5",
"node_6",
]
for node in reversed(nodes):
store.add_node(node)
store.add_edge("node_2", "node_3")
store.add_edge("node_2", "node_5")
store.add_edge("node_1", "node_2")
store.add_edge("node_3", "node_4")
store.add_edge("node_1", "node_3")
store.add_edge("node_4", "node_5")
store.add_edge("node_5", "node_6")
store.add_edge("node_4", "node_6")
self.assertEqual(nodes, store.list)
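
For reference, the contract these tests encode can be met by a deliberately naive stand-in that re-runs Kahn's algorithm after every edge insert, at O(V + E) per insert where Katriel-Bodlaender does substantially less work. NaiveOrderedListStore is a hypothetical name, not code from this branch; it happens to satisfy all three tests above because each of their graphs admits exactly one topological order.

class NaiveOrderedListStore(object):
    def __init__(self):
        self.list = []
        self.edges = []

    def add_node(self, node_id):
        # matches the store under test: new nodes go to the front
        self.list.insert(0, node_id)

    def add_edge(self, source, target):
        # record that source must sort before target, then re-sort
        self.edges.append((source, target))
        self._resort()

    def _resort(self):
        # Kahn's algorithm, seeding the queue in current-list order so
        # unconstrained nodes keep a stable relative position
        indegree = dict((n, 0) for n in self.list)
        for _, t in self.edges:
            indegree[t] += 1
        pending = [n for n in self.list if indegree[n] == 0]
        result = []
        while pending:
            node = pending.pop(0)
            result.append(node)
            for s, t in self.edges:
                if s == node:
                    indegree[t] -= 1
                    if indegree[t] == 0:
                        pending.append(t)
        assert len(result) == len(self.list), "edge insertion created a cycle"
        self.list = result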

View File

@@ -63,6 +63,8 @@ def setup_test_homeserver(name="test", datastore=None, config=None, **kargs):
config.federation_rc_concurrent = 10
config.filter_timeline_limit = 5000
config.user_directory_search_all_users = False
config.user_consent_server_notice_content = None
config.block_events_without_consent_error = None
# disable user directory updates, because they get done in the
# background, which upsets the test runner.