Module mz_storage::source::mysql::snapshot
source · Expand description
Renders the table snapshot side of the MySqlSourceConnection
dataflow.
§Snapshot reading
Depending on the source_outputs resume_upper
parameters this dataflow decides which tables to
snapshot and performs a simple SELECT * FROM table
on them in order to get a snapshot.
There are a few subtle points about this operation, described below.
It is crucial for correctness that we always perform the snapshot of all tables at a specific
point in time. This must be true even in the presence of restarts or partially committed
snapshots. The consistent point that the snapshot must happen at is discovered and durably
recorded during planning of the source and is exposed to this ingestion dataflow via the
initial_gtid_set
field in MySqlSourceDetails
.
Unfortunately MySQL does not provide an API to perform a transaction at a specific point in
time. Instead, MySQL allows us to perform a snapshot of a table and let us know at which point
in time the snapshot was taken. Using this information we can take a snapshot at an arbitrary
point in time and then rewind it to the desired initial_gtid_set
by “rewinding” it. These two
phases are described in the following section.
§Producing a snapshot at a known point in time.
Ideally we would like to start a transaction and ask MySQL to tell us the point in time this transaction is running at. As far as we know there isn’t such API so we achieve this using table locks instead.
The full set of tables that are meant to be snapshotted are partitioned among the workers. Each
worker initiates a connection to the server and acquires a table lock on all the tables that
have been assigned to it. By doing so we establish a moment in time where we know no writes are
happening to the tables we are interested in. After the locks are taken each worker reads the
current upper frontier (snapshot_upper
) using the @@gtid_executed
system variable. This
frontier establishes an upper bound on any possible write to the tables of interest until the
lock is released.
Each worker now starts a transaction via a new connection with ‘REPEATABLE READ’ and
‘CONSISTENT SNAPSHOT’ semantics. Due to linearizability we know that this transaction’s view of
the database must some time t_snapshot
such that snapshot_upper <= t_snapshot
. We don’t
actually know the exact value of t_snapshot
and it might be strictly greater than
snapshot_upper
. However, because this transaction will only be used to read the locked tables
and we know that snapshot_upper
is an upper bound on all the writes that have happened to
them we can safely pretend that the transaction’s t_snapshot
is equal to snapshot_upper
.
We have therefore succeeded in starting a transaction at a known point in time!
At this point it is safe for each worker to unlock the tables, since the transaction has established a point in time, and close the initial connection. Each worker can then read the snapshot of the tables it is responsible for and publish it downstream.
TODO: Other software products hold the table lock for the duration of the snapshot, and some do not. We should figure out why and if we need to hold the lock longer. This may be because of a difference in how REPEATABLE READ works in some MySQL-compatible systems (e.g. Aurora MySQL).
§Rewinding the snapshot to a specific point in time.
Having obtained a snapshot of a table at some snapshot_upper
we are now tasked with
transforming this snapshot into one at initial_gtid_set
. In other words we have produced a
snapshot containing all updates that happened at t: !(snapshot_upper <= t)
but what we
actually want is a snapshot containing all updates that happened at t: !(initial_gtid <= t)
.
If we assume that initial_gtid_set <= snapshot_upper
, which is a fair assumption since the
former is obtained before the latter, then we can observe that the snapshot we produced
contains all updates at t: !(initial_gtid <= t)
(i.e the snapshot we want) and some additional
unwanted updates at t: initial_gtid <= t && !(snapshot_upper <= t)
. We happen to know exactly
what those additional unwanted updates are because those will be obtained by reading the
replication stream in the replication operator and so all we need to do to “rewind” our
snapshot_upper
snapshot to initial_gtid
is to ask the replication operator to “undo” any
updates that falls in the undesirable region.
This is exactly what RewindRequest
is about. It informs the replication operator that a
particular table has been snapshotted at snapshot_upper
and would like all the updates
discovered during replication that happen at t: initial_gtid <= t && !(snapshot_upper <= t)
.
to be cancelled. In Differential Dataflow this is as simple as flipping the sign of the diff
field.
The snapshot reader emits updates at the minimum timestamp (by convention) to allow the updates to be potentially negated by the replication operator, which will emit negated updates at the minimum timestamp (by convention) when it encounters rows from a table that occur before the GTID frontier in the Rewind Request for that table.
Structs§
Functions§
- Fetch the size of the snapshot on this worker.
- render 🔒Renders the snapshot dataflow. See the module documentation for more information.