Function mz_storage_operators::persist_source::persist_source

pub fn persist_source<G>(
    scope: &mut G,
    source_id: GlobalId,
    persist_clients: Arc<PersistClientCache>,
    txns_ctx: &TxnsContext,
    worker_dyncfgs: &ConfigSet,
    metadata: CollectionMetadata,
    as_of: Option<Antichain<Timestamp>>,
    snapshot_mode: SnapshotMode,
    until: Antichain<Timestamp>,
    map_filter_project: Option<&mut MfpPlan>,
    max_inflight_bytes: Option<usize>,
    start_signal: impl Future<Output = ()> + 'static,
    error_handler: impl FnOnce(String) -> Pin<Box<dyn Future<Output = ()>>> + 'static
) -> (Stream<G, (Row, Timestamp, Diff)>, Stream<G, (DataflowError, Timestamp, Diff)>, Vec<PressOnDropButton>)
where G: Scope<Timestamp = Timestamp>,

Creates a new source that reads from a persist shard, distributing the work of reading data to all timely workers.

All times emitted will have been advanced by the given as_of frontier. All updates at times greater than or equal to until will be suppressed. The map_filter_project argument, if supplied, may be partially applied; any un-applied part is left behind in the argument.
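As a hypothetical sketch (plain std Rust, not Materialize code; the function name and tuple layout are illustrative assumptions), the as_of/until time semantics described above can be modeled like this:

```rust
// Illustrative only: models how `persist_source` treats times.
// Updates at times >= `until` are suppressed; all remaining times
// are advanced to be at least the `as_of` frontier.
fn apply_as_of_until(
    updates: Vec<(&'static str, u64, i64)>, // (row, time, diff)
    as_of: u64,
    until: u64,
) -> Vec<(&'static str, u64, i64)> {
    updates
        .into_iter()
        // Suppress updates at times greater than or equal to `until`.
        .filter(|&(_, t, _)| t < until)
        // Advance each surviving update's time by the `as_of` frontier.
        .map(|(row, t, diff)| (row, t.max(as_of), diff))
        .collect()
}

fn main() {
    let updates = vec![("a", 1, 1), ("b", 5, 1), ("c", 9, -1)];
    // With as_of = 3 and until = 8: "a" is advanced to time 3,
    // "b" stays at time 5, and "c" (time 9 >= 8) is suppressed.
    let out = apply_as_of_until(updates, 3, 8);
    println!("{:?}", out);
}
```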

Users of this function have the ability to apply flow control to the output to limit the in-flight data (measured in bytes) it can emit. The flow control input is a timely stream that communicates the frontier at which the data emitted by this source have been dropped.

Note: Because this function reads batches from persist, it operates at batch granularity. In practice, the source may overshoot the target flow control upper by an amount related to the size of its batches.
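The batch-granularity overshoot can be illustrated with a small std-only sketch (the function and its admission policy are illustrative assumptions, not Materialize code): batches are indivisible, so the last admitted batch can push the in-flight byte count past the budget by up to one batch's size.

```rust
// Illustrative only: admit whole batches until the in-flight byte
// budget is reached. Because a batch cannot be split, the final
// admitted batch may overshoot the budget.
fn admit_batches(batch_sizes: &[usize], max_inflight_bytes: usize) -> (usize, usize) {
    let mut inflight = 0; // bytes currently in flight
    let mut admitted = 0; // number of batches admitted
    for &size in batch_sizes {
        if inflight >= max_inflight_bytes {
            break;
        }
        // Admit the whole batch, even if it exceeds the budget.
        inflight += size;
        admitted += 1;
    }
    (admitted, inflight)
}

fn main() {
    // Budget of 100 bytes; the third batch overshoots to 120 bytes.
    let (n, bytes) = admit_batches(&[40, 50, 30], 100);
    println!("admitted {n} batches, {bytes} bytes in flight");
}
```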

If no flow control is desired, an empty stream whose frontier immediately advances to the empty antichain can be used. An easy way of creating such a stream is to use timely::dataflow::operators::generic::operator::empty.