mz_compute_types::dataflows

Type Alias DataflowDesc

Source
pub type DataflowDesc = DataflowDescription<OptimizedMirRelationExpr, ()>;
Expand description

A commonly used name for dataflows contain MIR expressions.

Aliased Type§

struct DataflowDesc {
    pub source_imports: BTreeMap<GlobalId, (SourceInstanceDesc<()>, bool)>,
    pub index_imports: BTreeMap<GlobalId, IndexImport>,
    pub objects_to_build: Vec<BuildDesc<OptimizedMirRelationExpr>>,
    pub index_exports: BTreeMap<GlobalId, (IndexDesc, RelationType)>,
    pub sink_exports: BTreeMap<GlobalId, ComputeSinkDesc>,
    pub as_of: Option<Antichain<Timestamp>>,
    pub until: Antichain<Timestamp>,
    pub initial_storage_as_of: Option<Antichain<Timestamp>>,
    pub refresh_schedule: Option<RefreshSchedule>,
    pub debug_name: String,
    pub time_dependence: Option<TimeDependence>,
}

Fields§

§source_imports: BTreeMap<GlobalId, (SourceInstanceDesc<()>, bool)>

Sources instantiations made available to the dataflow pair with monotonicity information.

§index_imports: BTreeMap<GlobalId, IndexImport>

Indexes made available to the dataflow. (id of index, import)

§objects_to_build: Vec<BuildDesc<OptimizedMirRelationExpr>>

Views and indexes to be built and stored in the local context. Objects must be built in the specific order, as there may be dependencies of later objects on prior identifiers.

§index_exports: BTreeMap<GlobalId, (IndexDesc, RelationType)>

Indexes to be made available to be shared with other dataflows (id of new index, description of index, relationtype of base source/view/table)

§sink_exports: BTreeMap<GlobalId, ComputeSinkDesc>

sinks to be created (id of new sink, description of sink)

§as_of: Option<Antichain<Timestamp>>

An optional frontier to which inputs should be advanced.

If this is set, it should override the default setting determined by the upper bound of since frontiers contributing to the dataflow. It is an error for this to be set to a frontier not beyond that default.

§until: Antichain<Timestamp>

Frontier beyond which the dataflow should not execute. Specifically, updates at times greater or equal to this frontier are suppressed. This is often set to as_of + 1 to enable “batch” computations. Note that frontier advancements might still happen to times that are after the until, only data is suppressed. (This is consistent with how frontier advancements can also happen before the as_of.)

§initial_storage_as_of: Option<Antichain<Timestamp>>

The initial as_of when the collection is first created. Filled only for materialized views. Note that this doesn’t change upon restarts.

§refresh_schedule: Option<RefreshSchedule>

The schedule of REFRESH materialized views.

§debug_name: String

Human-readable name

§time_dependence: Option<TimeDependence>

Description of how the dataflow’s progress relates to wall-clock time. None for unknown.

Implementations

Source§

impl<T> DataflowDescription<OptimizedMirRelationExpr, (), T>

Source

pub fn new(name: String) -> Self

Creates a new dataflow description with a human-readable name.

Source

pub fn import_index( &mut self, id: GlobalId, desc: IndexDesc, typ: RelationType, monotonic: bool, )

Imports a previously exported index.

This method makes available an index previously exported as id, identified to the query by description (which names the view the index arranges, and the keys by which it is arranged).

Source

pub fn import_source( &mut self, id: GlobalId, typ: RelationType, monotonic: bool, )

Imports a source and makes it available as id.

Source

pub fn insert_plan(&mut self, id: GlobalId, plan: OptimizedMirRelationExpr)

Binds to id the relation expression plan.

Source

pub fn export_index( &mut self, id: GlobalId, description: IndexDesc, on_type: RelationType, )

Exports as id an index described by description.

Future uses of import_index in other dataflow descriptions may use id, as long as this dataflow has not been terminated in the meantime.

Source

pub fn export_sink(&mut self, id: GlobalId, description: ComputeSinkDesc<(), T>)

Exports as id a sink described by description.

Source

pub fn is_imported(&self, id: &GlobalId) -> bool

Returns true iff id is already imported.

Source

pub fn arity_of(&self, id: &GlobalId) -> usize

The number of columns associated with an identifier in the dataflow.

Source

pub fn visit_children<R, S, E>(&mut self, r: R, s: S) -> Result<(), E>
where R: Fn(&mut OptimizedMirRelationExpr) -> Result<(), E>, S: Fn(&mut MirScalarExpr) -> Result<(), E>,

Calls r and s on any sub-members of those types in self. Halts at the first error return.

Source§

impl<P, S> DataflowDescription<P, S, Timestamp>

Source

pub fn is_single_time(&self) -> bool

Tests if the dataflow refers to a single timestamp, namely that as_of has a single coordinate and that the until value corresponds to the as_of value plus one, or as_of is the maximum timestamp and is thus single.

Source§

impl<P, S, T> DataflowDescription<P, S, T>

Source

pub fn set_as_of(&mut self, as_of: Antichain<T>)

Sets the as_of frontier to the supplied argument.

This method allows the dataflow to indicate a frontier up through which all times should be advanced. This can be done for at least two reasons: 1. correctness and 2. performance.

Correctness may require an as_of to ensure that historical detail is consolidated at representative times that do not present specific detail that is not specifically correct. For example, updates may be compacted to times that are no longer the source times, but instead some byproduct of when compaction was executed; we should not present those specific times as meaningfully different from other equivalent times.

Performance may benefit from an aggressive as_of as it reduces the number of distinct moments at which collections vary. Differential dataflow will refresh its outputs at each time its inputs change and to moderate that we can minimize the volume of distinct input times as much as possible.

Generally, one should consider setting as_of at least to the since frontiers of contributing data sources and as aggressively as the computation permits.

Source

pub fn set_initial_as_of(&mut self, initial_as_of: Antichain<T>)

Records the initial as_of of the storage collection associated with a materialized view.

Source

pub fn import_ids(&self) -> impl Iterator<Item = GlobalId> + Clone + '_

Identifiers of imported objects (indexes and sources).

Source

pub fn imported_index_ids(&self) -> impl Iterator<Item = GlobalId> + Clone + '_

Identifiers of imported indexes.

Source

pub fn imported_source_ids(&self) -> impl Iterator<Item = GlobalId> + Clone + '_

Identifiers of imported sources.

Source

pub fn export_ids(&self) -> impl Iterator<Item = GlobalId> + Clone + '_

Identifiers of exported objects (indexes and sinks).

Source

pub fn exported_index_ids(&self) -> impl Iterator<Item = GlobalId> + Clone + '_

Identifiers of exported indexes.

Source

pub fn exported_sink_ids(&self) -> impl Iterator<Item = GlobalId> + Clone + '_

Identifiers of exported sinks.

Source

pub fn persist_sink_ids(&self) -> impl Iterator<Item = GlobalId> + '_

Identifiers of exported persist sinks.

Source

pub fn subscribe_ids(&self) -> impl Iterator<Item = GlobalId> + '_

Identifiers of exported subscribe sinks.

Source

pub fn continual_task_ids(&self) -> impl Iterator<Item = GlobalId> + '_

Identifiers of exported continual tasks.

Source

pub fn copy_to_ids(&self) -> impl Iterator<Item = GlobalId> + '_

Identifiers of exported copy to sinks.

Source

pub fn display_import_ids(&self) -> impl Display + '_

Produce a Displayable value containing the import IDs of this dataflow.

Source

pub fn display_export_ids(&self) -> impl Display + '_

Produce a Displayable value containing the export IDs of this dataflow.

Source

pub fn is_transient(&self) -> bool

Whether this dataflow installs transient collections.

Source

pub fn build_desc(&self, id: GlobalId) -> &BuildDesc<P>

Returns the description of the object to build with the specified identifier.

§Panics

Panics if id is not present in objects_to_build exactly once.

Source§

impl<P, S, T> DataflowDescription<P, S, T>
where P: CollectionPlan,

Source

pub fn depends_on(&self, collection_id: GlobalId) -> BTreeSet<GlobalId>

Computes the set of identifiers upon which the specified collection identifier depends.

collection_id must specify a valid object in objects_to_build.

This method includes identifiers for e.g. intermediate views, and should be filtered if one only wants sources and indexes.

This method is safe for mutually recursive view definitions.

Source

pub fn depends_on_into( &self, collection_id: GlobalId, out: &mut BTreeSet<GlobalId>, )

Like depends_on, but appends to an existing BTreeSet.

Source

pub fn depends_on_imports(&self, collection_id: GlobalId) -> BTreeSet<GlobalId>

Computes the set of imports upon which the specified collection depends.

This method behaves like depends_on but filters out internal dependencies that are not included in the dataflow imports.

Trait Implementations

Source§

impl Arbitrary for DataflowDescription<OptimizedMirRelationExpr, (), Timestamp>

Source§

type Strategy = BoxedStrategy<DataflowDescription<OptimizedMirRelationExpr>>

The type of Strategy used to generate values of type Self.
Source§

type Parameters = ()

The type of parameters that arbitrary_with accepts for configuration of the generated Strategy. Parameters must implement Default.
Source§

fn arbitrary_with(_: Self::Parameters) -> Self::Strategy

Generates a Strategy for producing arbitrary values of type the implementing type (Self). The strategy is passed the arguments given in args. Read more
Source§

fn arbitrary() -> Self::Strategy

Generates a Strategy for producing arbitrary values of type the implementing type (Self). Read more
Source§

impl<P: Clone, S: Clone + 'static, T: Clone> Clone for DataflowDescription<P, S, T>

Source§

fn clone(&self) -> DataflowDescription<P, S, T>

Returns a copy of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl<P: Debug, S: Debug + 'static, T: Debug> Debug for DataflowDescription<P, S, T>

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl<'de, P, S, T> Deserialize<'de> for DataflowDescription<P, S, T>
where P: Deserialize<'de>, S: Deserialize<'de> + 'static, T: Deserialize<'de>,

Source§

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer. Read more
Source§

impl<'a> Explain<'a> for DataflowDescription<OptimizedMirRelationExpr>

Source§

type Context = ExplainContext<'a>

The type of the immutable context in which the explanation will be rendered.
Source§

type Text = ExplainMultiPlan<'a, MirRelationExpr>

The explanation type produced by a successful Explain::explain_text call.
Source§

type VerboseText = ExplainMultiPlan<'a, MirRelationExpr>

The explanation type produced by a successful Explain::explain_verbose_text call.
Source§

type Json = ExplainMultiPlan<'a, MirRelationExpr>

The explanation type produced by a successful Explain::explain_json call.
Source§

type Dot = UnsupportedFormat

The explanation type produced by a successful Explain::explain_json call.
Source§

fn explain_text( &'a mut self, context: &'a Self::Context, ) -> Result<Self::Text, ExplainError>

Construct a Result::Ok of the Explain::Text format from the config and the context. Read more
Source§

fn explain_verbose_text( &'a mut self, context: &'a Self::Context, ) -> Result<Self::VerboseText, ExplainError>

Construct a Result::Ok of the Explain::VerboseText format from the config and the context. Read more
Source§

fn explain_json( &'a mut self, context: &'a Self::Context, ) -> Result<Self::Text, ExplainError>

Construct a Result::Ok of the Explain::Json format from the config and the context. Read more
Source§

fn explain( &'a mut self, format: &'a ExplainFormat, context: &'a Self::Context, ) -> Result<String, ExplainError>

Explain an instance of Self within the given Explain::Context. Read more
Source§

fn explain_dot( &'a mut self, context: &'a Self::Context, ) -> Result<Self::Dot, ExplainError>

Construct a Result::Ok of the Explain::Dot format from the config and the context. Read more
Source§

impl<P: PartialEq, S: PartialEq + 'static, T: PartialEq> PartialEq for DataflowDescription<P, S, T>

Source§

fn eq(&self, other: &DataflowDescription<P, S, T>) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
Source§

impl<P, S, T> Serialize for DataflowDescription<P, S, T>
where P: Serialize, S: Serialize + 'static, T: Serialize,

Source§

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer. Read more
Source§

impl<P: Eq, S: Eq + 'static, T: Eq> Eq for DataflowDescription<P, S, T>

Source§

impl<P, S: 'static, T> StructuralPartialEq for DataflowDescription<P, S, T>