Module source_reader_pipeline

Source
Expand description

Types related to the creation of dataflow raw sources.

Raw sources are differential dataflow collections of data directly produced by the upstream service. The main export of this module is create_raw_source, which turns RawSourceCreationConfigs into the aforementioned streams.

The full source, which is the differential stream that represents the actual object created by a CREATE SOURCE statement, is created by composing create_raw_source with decoding, SourceEnvelope rendering, and more.

Structsยง

FrontierCapture
RawSourceCreationConfig
Shared configuration information for all source types. This is used in the create_raw_source functions, which produce raw sources.
SourceExportCreationConfig
Reduced version of RawSourceCreationConfig that is used when rendering each export.

Functionsยง

create_raw_source
Creates a source dataflow operator graph from a source connection. The type of SourceConnection determines the type of connection that should be created.
reclock_committed_upper ๐Ÿ”’
Reclocks an IntoTime frontier stream into a FromTime frontier stream. This is used for the virtual (through persist) feedback edge so that we convert the IntoTime resumption frontier into the FromTime frontier that is used with the sourceโ€™s OffsetCommiter.
remap_operator ๐Ÿ”’
Mints new contents for the remap shard based on summaries about the source upper it receives from the raw reader operators.
source_render_operator ๐Ÿ”’
Renders the source dataflow fragment from the given SourceConnection. This returns a collection timestamped with the source specific timestamp type. Also returns a second stream that can be used to learn about the source_upper that all the source reader instances know about. This second stream will be used by the remap_operator to mint new timestamp bindings into the remap shard.
synthesize_probes ๐Ÿ”’
Synthesizes a probe stream that produces the frontier of the given progress stream at the given interval.