Skip to main content

Module upsert

Module upsert 

Source

ModulesΒ§

columnar_upsert_key πŸ”’
Columnar (the columnar crate, distinct from columnation) support for UpsertKey, so the upsert-v2 source stash can use a paged columnar merge batcher keyed natively by UpsertKey (no Row packing). UpsertKey is a POD [u8; 32] newtype, so the container is a fixed-stride byte column.
rocksdb πŸ”’
An UpsertStateBackend that stores values in RocksDB.
types πŸ”’
State-management for UPSERT.
upsert_stash_pager
Pager for the upsert-v2 source stash.

MacrosΒ§

upsert_source_time_unit πŸ”’
Source times whose sources never render the upsert envelope (MySQL and SQL Server CDC). Order = () keeps the generic render path free of any columnar requirement on the source time. The projection panics rather than returning: () would collapse every from_time to equal, silently breaking β€œlatest offset wins” dedup, so if such a source ever reaches the upsert path we want a loud failure, not arbitrary per-key output.

StructsΒ§

DigestHasher πŸ”’
UpsertConfig πŸ”’
UpsertKey

EnumsΒ§

DrainStyle πŸ”’
The style of drain we are performing on the stash. AtTime-drains cannot assume that all values have been seen, and must leave tombstones behind for deleted values.

TraitsΒ§

UpsertErrorEmitter πŸ”’
UpsertSourceTime
Projects a source’s native FromTime to a columnar, totally-ordered key used by the upsert source stash to keep the latest update per (key, time).

FunctionsΒ§

drain_staged_input πŸ”’
Helper method for upsert_inner used to stage data updates from the input timely edge.
process_upsert_state_error πŸ”’
Emit the given error, and stall till the dataflow is restarted.
rehydration_finished
This leaf operator drops token after the input reaches the resume_upper. This is useful to take coordinated actions across all workers, after the upsert operator has rehydrated.
stage_input πŸ”’
Helper method for upsert_classic used to stage data updates from the input/source timely edge.
upsert πŸ”’
Resumes an upsert computation at resume_upper given as inputs a collection of upsert commands and the collection of the previous output of this operator. Returns a tuple of
upsert_classic πŸ”’
upsert_operator πŸ”’
upsert_thinning πŸ”’
Renders an operator that discards updates that are known to not affect the outcome of upsert in a streaming fashion. For each distinct (key, time) in the input it emits the value with the highest from_time. Its purpose is to thin out data as much as possible before exchanging them across workers.
upsert_v2 πŸ”’
An experimental upsert implementation loosely described in this doc: Upsert V2 Much Simpler Boogaloo

Type AliasesΒ§

KeyHash πŸ”’
The hash function used to map upsert keys. It is important that this hash is a cryptographic hash so that there is no risk of collisions. Collisions on SHA256 have a probability of 2^128 which is many orders of magnitude smaller than many other events that we don’t even think about (e.g bit flips). In short, we can safely assume that sha256(a) == sha256(b) iff a == b.
UpsertValue