Module mz_adapter::coord

Translation of SQL commands into timestamped Controller commands.

The various SQL commands instruct the system to take actions that are not yet explicitly timestamped. On the other hand, the underlying data continually change as time moves forward. On the third hand, we greatly benefit from the information that some times are no longer of interest, so that we may compact the representation of the continually changing collections.

The Coordinator curates these interactions by observing the progress collections make through time, choosing timestamps for its own commands, and eventually communicating that certain times have irretrievably “passed”.

Frontiers another way

If the above description of frontiers left you with questions, this repackaged explanation might help.

  • since is the oldest time at which you can read from sources and be guaranteed that the returned data is accurate as of that time.

    Reads at times less than since may return values that were not actually seen at the specified time, but arrived later (i.e. the results are compacted).

    For correctness’ sake, the coordinator never chooses to read at a time less than an arrangement’s since.

  • upper is the first time after the most recent time that you can read from sources and receive an immediate response. Alternatively, it is the least time at which the data may still change, which is why a read at that time may not be answered immediately.

    Reads at times >= upper may not immediately return because the answer isn’t known yet. However, once the upper is > the specified read time, the read can return.

    For the sake of returned values’ freshness, the coordinator prefers performing reads at an arrangement’s upper. However, because we more strongly prefer correctness, the coordinator will choose timestamps greater than an object’s upper if it is also being accessed alongside objects whose since times are >= its upper.

This illustration attempts to show, with time moving left to right, the relationship between since and upper.

  • #: possibly inaccurate results
  • -: immediate, correct response
  • ?: not yet known
  • s: since
  • u: upper
  • |: eligible for coordinator to select
####s----u?????
    |||||||||||
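
The relationship described above can be sketched in a few lines of Rust. This is only an illustration under simplifying assumptions, not the Coordinator's actual logic: times are plain `u64` values rather than frontiers, and `Frontiers`, `ReadOutcome`, `classify`, `least_correct_time`, and `render` are hypothetical names that do not appear in this module.

```rust
/// What a read of one collection at time `t` can expect, following the
/// legend of the illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReadOutcome {
    /// `t < since`: results may have been compacted (`#`).
    PossiblyInaccurate,
    /// `since <= t < upper`: immediate, correct response (`-`).
    Immediate,
    /// `t >= upper`: the answer is not yet known (`?`).
    NotYetKnown,
}

/// Hypothetical stand-in for one collection's frontiers, assuming
/// totally ordered `u64` times.
#[derive(Debug, Clone, Copy)]
pub struct Frontiers {
    pub since: u64,
    pub upper: u64,
}

/// Classify a read of a single collection at time `t`.
pub fn classify(f: Frontiers, t: u64) -> ReadOutcome {
    if t < f.since {
        ReadOutcome::PossiblyInaccurate
    } else if t < f.upper {
        ReadOutcome::Immediate
    } else {
        ReadOutcome::NotYetKnown
    }
}

/// The smallest time that is correct for every collection read together:
/// the maximum of their `since` values. Returns `None` for an empty slice.
pub fn least_correct_time(collections: &[Frontiers]) -> Option<u64> {
    collections.iter().map(|f| f.since).max()
}

/// Render the first row of the illustration for times `0..len`.
pub fn render(f: Frontiers, len: u64) -> String {
    (0..len)
        .map(|t| {
            if t == f.since {
                's'
            } else if t == f.upper {
                'u'
            } else {
                match classify(f, t) {
                    ReadOutcome::PossiblyInaccurate => '#',
                    ReadOutcome::Immediate => '-',
                    ReadOutcome::NotYetKnown => '?',
                }
            }
        })
        .collect()
}
```

With `since = 4` and `upper = 9`, `render` reproduces the first row of the illustration. For two collections with frontiers `(since 0, upper 5)` and `(since 7, upper 10)`, `least_correct_time` yields 7, which lies beyond the first collection's upper: the read is correct for both, but it has to wait until the first collection's upper advances past 7.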

Modules

  • appends 🔒
    Logic and types for all appends executed by the Coordinator.
  • A timestamp oracle that relies on the Catalog for persistence/durability and reserves ranges of timestamps.
  • Logic for processing client Commands. Each Command is initiated by a client via some external Materialize API (ex: HTTP and psql).
  • Internal consistency checks that validate invariants of Coordinator.
  • ddl 🔒
    This module encapsulates all of the Coordinator’s logic for creating, dropping, and altering objects.
  • id_bundle 🔒
  • indexes 🔒
  • Special cases related to the “introspection” of Materialize.
  • Logic for processing Coordinator messages. The Coordinator receives messages from various sources (ex: controller, clients, background tasks, etc).
  • peek 🔒
    Logic and types for creating, executing, and tracking peeks.
  • Types and methods related to initializing, updating, and removing read policies on collections.
  • sequencer 🔒
    Logic for executing a planned SQL query.
  • sql 🔒
    Various utility methods used by the Coordinator. Ideally these are all put in more meaningfully named modules.
  • timeline 🔒
    A mechanism to ensure that a sequence of writes and reads proceeds correctly through timestamps.
  • Logic for selecting timestamps for various operations on collections.
