Module mz_storage::render::upsert::types
This module defines the UpsertStateBackend trait and various implementations.
This trait is the way the upsert operator interacts with various state backings.
Because it's a complex trait with a somewhat leaky abstraction, it warrants a high-level description explaining the complexity. The trait has 3 methods:
multi_get
multi_get returns the current value for a (unique) set of keys. To keep implementations
efficient, the set of keys is an iterator, and results are written back into another parallel
iterator. In addition to returning the current values, implementations must also return the
size of those values as they are stored within the implementation. Implementations are
required to chunk large iterators if they need to operate over smaller batches.
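As a rough illustration of the parallel-iterator shape described above, here is a minimal sketch against a plain HashMap. The names GetResult, MemBackend, and CHUNK are hypothetical, and the signature is simplified for illustration; it is not the signature of the real trait method.

```rust
use std::collections::HashMap;

/// Hypothetical result slot: the stored value (if any) and its stored size.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct GetResult {
    pub value: Option<Vec<u8>>,
    pub size: Option<u64>,
}

/// Hypothetical in-memory backend, used only for illustration.
#[derive(Default)]
pub struct MemBackend {
    pub state: HashMap<String, Vec<u8>>,
}

impl MemBackend {
    /// Assumed internal batch size; a real backend would pick its own.
    const CHUNK: usize = 2;

    /// Looks up each key and writes the answer into the matching slot of
    /// `results`. Returns the number of keys processed.
    pub fn multi_get<'r, K, R>(&self, keys: K, results: R) -> u64
    where
        K: IntoIterator<Item = String>,
        R: IntoIterator<Item = &'r mut GetResult>,
    {
        let mut keys = keys.into_iter();
        let mut results = results.into_iter();
        let mut processed = 0;
        loop {
            // The backend, not the caller, splits the input into batches.
            let batch: Vec<String> = keys.by_ref().take(Self::CHUNK).collect();
            if batch.is_empty() {
                break;
            }
            for key in batch {
                let slot = results.next().expect("results must be as long as keys");
                let found = self.state.get(&key);
                slot.value = found.cloned();
                slot.size = found.map(|v| v.len() as u64);
                processed += 1;
            }
        }
        processed
    }
}
```

The caller allocates the result slots once and passes `iter_mut()` over them, so the backend never has to build and return a fresh collection.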
multi_put
Update or delete values for a set of keys. To keep implementations efficient, the set
of updates is an iterator. Implementations are also required to return the difference
in values and total size after processing the updates. To simplify this (and because
in the upsert use case we have this data readily available), the updates are input
with the size of the current value (if any) that was returned from a previous multi_get.
Implementations are required to chunk large iterators if they need to operate over smaller
batches.
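The bookkeeping described above can be sketched as follows. This is a simplified illustration, not the real method: PutStats, Put, and the free function multi_put are hypothetical names, and chunking is omitted for brevity. The point it shows is that the caller-supplied previous size lets the backend compute the size difference without reading the old value back.

```rust
use std::collections::HashMap;

/// Hypothetical per-call statistics: net change in value count and stored bytes.
#[derive(Debug, Default, PartialEq)]
pub struct PutStats {
    pub values_diff: i64,
    pub size_diff: i64,
}

/// Hypothetical update: `value == None` is a delete. `previous_size` is the
/// size reported by an earlier lookup of the same key, if the key existed.
pub struct Put {
    pub value: Option<Vec<u8>>,
    pub previous_size: Option<i64>,
}

pub fn multi_put<I>(state: &mut HashMap<String, Vec<u8>>, puts: I) -> PutStats
where
    I: IntoIterator<Item = (String, Put)>,
{
    let mut stats = PutStats::default();
    for (key, put) in puts {
        match put.value {
            Some(v) => {
                stats.size_diff += v.len() as i64;
                // Only a key that was absent before adds to the value count.
                if put.previous_size.is_none() {
                    stats.values_diff += 1;
                }
                state.insert(key, v);
            }
            None => {
                // Deleting a key that existed removes one value.
                if put.previous_size.is_some() {
                    stats.values_diff -= 1;
                }
                state.remove(&key);
            }
        }
        // The caller-supplied size stands in for reading the old value back.
        stats.size_diff -= put.previous_size.unwrap_or(0);
    }
    stats
}
```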
merge_snapshot_chunk
The most complicated method, this method requires implementations to consolidate a chunk of
updates into their state. This method effectively asks implementations to implement the logic in
https://docs.rs/differential-dataflow/latest/differential_dataflow/consolidation/fn.consolidate.html,
but under the assumption that the set of updates is a valid upsert Collection. Note that this
allows implementations to do this in a memory-efficient (or even memory-bounded) way. Because
this is non-trivial, this module provides StateValue, which implements some of the core logic
required to do this. StateValue::merge_update has more information about this.
merge_snapshot_chunk has to return stats about the number of values and size of the state,
just like multi_put.
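To make the memory-bounded idea concrete, here is one way such a consolidation could work. It is a simplified sketch and should not be read as the actual StateValue logic: the Acc type, its fields, and the free function below are all hypothetical, and stats reporting is left out. The sketch relies on the upsert assumption that each key ends the snapshot with either no value or exactly one value, so values that are inserted and later retracted cancel out under XOR.

```rust
use std::collections::HashMap;

/// Hypothetical per-key accumulator for snapshot consolidation.
#[derive(Debug, Default)]
pub struct Acc {
    diff_sum: i64,
    len_sum: i64,
    xor: Vec<u8>,
}

impl Acc {
    /// Folds one `(value, diff)` update into the accumulator.
    pub fn merge_update(&mut self, value: &[u8], diff: i64) {
        self.diff_sum += diff;
        self.len_sum += diff * value.len() as i64;
        // XOR-ing a value an even number of times is a no-op, so only odd
        // diffs change the byte accumulator.
        if diff % 2 != 0 {
            if self.xor.len() < value.len() {
                self.xor.resize(value.len(), 0);
            }
            for (acc, byte) in self.xor.iter_mut().zip(value) {
                *acc ^= *byte;
            }
        }
    }

    /// After the whole snapshot is merged: the surviving value, if any.
    pub fn finish(mut self) -> Option<Vec<u8>> {
        match self.diff_sum {
            0 => None,
            1 => {
                // `len_sum` is now the length of the single surviving value.
                self.xor.truncate(self.len_sum as usize);
                Some(self.xor)
            }
            other => panic!("not a valid upsert collection: diff sum {other}"),
        }
    }
}

/// Merges one chunk of `(key, value, diff)` updates into the per-key state.
pub fn merge_snapshot_chunk<'a, I>(state: &mut HashMap<String, Acc>, updates: I)
where
    I: IntoIterator<Item = (&'a str, &'a [u8], i64)>,
{
    for (key, value, diff) in updates {
        state.entry(key.to_string()).or_default().merge_update(value, diff);
    }
}
```

Each accumulator grows only to the length of the longest value seen for its key, no matter how many updates arrive or in what order, which is what makes this approach memory-bounded.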
Another curiosity is that implementations can assume that merge_snapshot_chunk is called with
a set of updates with a number of keys not greater than UpsertStateBackend::SNAPSHOT_BATCH_SIZE. This
is different from multi_put and multi_get purely because it simplifies the way that the upsert
operator handles snapshots.
A note on state size
The UpsertStateBackend trait requires that implementations report relatively accurate information about
how the state size changes over time. Note that it does NOT ask the implementations to give
accurate information about actual resource consumption (like disk space including space
amplification), and instead is just asking about the size of the values, after they have been
encoded. For implementations like RocksDB, these may be highly accurate (it literally
reports the encoded size as written to the RocksDB API), and for others like the
InMemoryHashMap, they may be rough estimates of actual memory usage. See
StateValue::memory_size for more information.
Note also that after snapshot consolidation, additional space may be used if StateValue is
used.
Structs
- Statistics for a single call to multi_get.
- Statistics for a single call to merge_snapshot_chunk.
- Statistics for a single call to multi_put.
- A value as produced during consolidation of a snapshot.
- An UpsertStateBackend wrapper that supports snapshot merging, and reports basic metrics about the usage of the UpsertStateBackend.
- The result type for individual gets.
Enums
- In any UpsertStateBackend implementation, we need to support 2 modes:
Traits
- A trait that defines the fundamental primitives required by a state-backing of the upsert operator.
Functions
- Build the default BincodeOpts.
Type Aliases
- The default set of bincode options used for consolidating upsert snapshots (and writing values to RocksDB).