Enum dataflow_types::plan::Plan
pub enum Plan {
Constant {
rows: Result<Vec<(Row, Timestamp, Diff)>, EvalError>,
},
Get {
id: Id,
keys: AvailableCollections,
mfp: MapFilterProject,
key_val: Option<(Vec<MirScalarExpr>, Option<Row>)>,
},
Let {
id: LocalId,
value: Box<Plan>,
body: Box<Plan>,
},
Mfp {
input: Box<Plan>,
mfp: MapFilterProject,
input_key_val: Option<(Vec<MirScalarExpr>, Option<Row>)>,
},
FlatMap {
input: Box<Plan>,
func: TableFunc,
exprs: Vec<MirScalarExpr>,
mfp: MapFilterProject,
input_key: Option<Vec<MirScalarExpr>>,
},
Join {
inputs: Vec<Plan>,
plan: JoinPlan,
},
Reduce {
input: Box<Plan>,
key_val_plan: KeyValPlan,
plan: ReducePlan,
input_key: Option<Vec<MirScalarExpr>>,
},
TopK {
input: Box<Plan>,
top_k_plan: TopKPlan,
},
Negate {
input: Box<Plan>,
},
Threshold {
input: Box<Plan>,
threshold_plan: ThresholdPlan,
},
Union {
inputs: Vec<Plan>,
},
ArrangeBy {
input: Box<Plan>,
forms: AvailableCollections,
input_key: Option<Vec<MirScalarExpr>>,
input_mfp: MapFilterProject,
},
}
A rendering plan with as much conditional logic as possible removed.
Variants
Constant
Fields
A collection with pre-determined contents.
Get
Fields
id: Id
A global or local identifier naming the collection.
keys: AvailableCollections
Arrangements that will be available.
The collection will also be loaded if available, which it will not be for imported data, but which it may be for locally defined data.
mfp: MapFilterProject
Any linear operator work to apply as part of producing the data.
This logic allows us to efficiently extract collections from data that have been pre-arranged, avoiding copying rows that are not used and columns that are projected away.
A reference to a bound collection.
This is commonly either an external reference to an existing source or maintained arrangement, or an internal reference to a Let identifier.
Let
Fields
id: LocalId
The local identifier to be used, available to body as Id::Local(id).
Binds value to id, and then results in body with that binding.
This stage has the effect of sharing value across multiple possible uses in body, and is the only mechanism we have for sharing collection information across parts of a dataflow.
The binding is not available outside of body.
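The scoping rule for Let can be sketched with a toy interpreter. This is a minimal illustration under stated assumptions, not the rendering code: ToyPlan, its integer collections, and eval are hypothetical stand-ins for the real Plan, Row, and dataflow rendering.

```rust
use std::collections::BTreeMap;

/// A hypothetical, much-reduced stand-in for `Plan` with four variants.
/// Collections are modeled as plain vectors of integers.
#[derive(Debug, Clone)]
enum ToyPlan {
    Constant(Vec<i64>),
    Get(u64),
    Let { id: u64, value: Box<ToyPlan>, body: Box<ToyPlan> },
    Union(Vec<ToyPlan>),
}

/// Evaluates a toy plan against the bindings currently in scope.
/// Returns `None` if a `Get` names an identifier that is not bound.
fn eval(plan: &ToyPlan, env: &mut BTreeMap<u64, Vec<i64>>) -> Option<Vec<i64>> {
    match plan {
        ToyPlan::Constant(rows) => Some(rows.clone()),
        ToyPlan::Get(id) => env.get(id).cloned(),
        ToyPlan::Let { id, value, body } => {
            // Evaluate `value` once and share the result with every use in `body`.
            let bound = eval(value, env)?;
            let shadowed = env.insert(*id, bound);
            let result = eval(body, env);
            // Restore the outer scope: the binding is not visible outside `body`.
            match shadowed {
                Some(previous) => {
                    env.insert(*id, previous);
                }
                None => {
                    env.remove(id);
                }
            }
            result
        }
        ToyPlan::Union(inputs) => {
            let mut out = Vec::new();
            for input in inputs {
                out.extend(eval(input, env)?);
            }
            Some(out)
        }
    }
}
```

In this sketch, a Let whose body unions two Get references to the same identifier evaluates value only once, and the environment is left exactly as it was found once the Let has been evaluated.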
Mfp
Fields
mfp: MapFilterProject
Linear operator to apply to each record.
Map, Filter, and Project operators.
This stage contains work that we would ideally like to fuse to other plan stages, but for practical reasons cannot. For example: reduce, threshold, and topk stages are not able to absorb this operator.
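As a rough illustration of what a fused map, filter, and project step does to a single record, consider the sketch below. ToyMfp, ToyExpr, and ToyPredicate are hypothetical simplifications over integer rows; the real MapFilterProject works on MirScalarExpr and Row and is not reproduced here.

```rust
/// Hypothetical scalar expression: the sum of two columns.
enum ToyExpr {
    Add(usize, usize),
}

/// Hypothetical predicate: a column must be strictly greater than a bound.
enum ToyPredicate {
    GreaterThan(usize, i64),
}

/// A toy map/filter/project bundle over rows of integers.
struct ToyMfp {
    map: Vec<ToyExpr>,
    filter: Vec<ToyPredicate>,
    project: Vec<usize>,
}

impl ToyMfp {
    /// Applies map, then filter, then project to one row.
    /// Returns `None` when any predicate rejects the row.
    fn evaluate(&self, row: &[i64]) -> Option<Vec<i64>> {
        // Map: each expression appends one new column to the row.
        let mut extended = row.to_vec();
        for expr in &self.map {
            let value = match expr {
                ToyExpr::Add(left, right) => extended[*left] + extended[*right],
            };
            extended.push(value);
        }
        // Filter: every predicate must hold on the extended row.
        for predicate in &self.filter {
            match predicate {
                ToyPredicate::GreaterThan(column, bound) => {
                    if extended[*column] <= *bound {
                        return None;
                    }
                }
            }
        }
        // Project: keep only the requested columns, in the requested order.
        Some(self.project.iter().map(|&column| extended[column]).collect())
    }
}
```

Because all three steps act on one record at a time, they can be applied in a single pass without any intermediate collection, which is what makes this kind of work attractive to fuse into neighboring stages.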
FlatMap
Fields
func: TableFunc
The variable-record emitting function.
exprs: Vec<MirScalarExpr>
Expressions that for each row prepare the arguments to func.
mfp: MapFilterProject
Linear operator to apply to each record produced by func.
input_key: Option<Vec<MirScalarExpr>>
The particular arrangement of the input we expect to use, if any.
A variable number of output records for each input record.
This stage is a bit of a catch-all for logic that does not easily fit in map stages. This includes table valued functions, but also functions of multiple arguments, and functions that modify the sign of updates.
This stage allows a MapFilterProject operator to be fused to its output, and this can be very important as otherwise the output of func is just appended to the input record, for as many outputs as it has. This has the unpleasant default behavior of repeating potentially large records that are being unpacked, producing quadratic output in those cases. Instead, in these cases use an mfp member that projects away these large fields.
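The cost of appending each output to the full input record can be seen in a small sketch. The function flat_map, the helper split_commas, and the string-valued rows are all hypothetical; the optional projection plays the role of the fused mfp, with column index row.len() referring to the value produced by func.

```rust
/// A toy flat map over rows of strings.
/// For each input row, `func` yields zero or more values. By default each
/// value is appended to a copy of the whole input row. With a projection,
/// only the listed columns are copied (index `row.len()` is the new value).
fn flat_map(
    input: &[Vec<String>],
    func: impl Fn(&[String]) -> Vec<String>,
    projection: Option<&[usize]>,
) -> Vec<Vec<String>> {
    let mut output = Vec::new();
    for row in input {
        for produced in func(row.as_slice()) {
            let out_row: Vec<String> = match projection {
                // Default behavior: repeat the full input record per output.
                None => {
                    let mut full = row.clone();
                    full.push(produced);
                    full
                }
                // Fused projection: copy only the columns that survive.
                Some(columns) => columns
                    .iter()
                    .map(|&c| if c < row.len() { row[c].clone() } else { produced.clone() })
                    .collect(),
            };
            output.push(out_row);
        }
    }
    output
}

/// Hypothetical table function: splits the first column on commas.
fn split_commas(row: &[String]) -> Vec<String> {
    row[0].split(',').map(str::to_string).collect()
}
```

If the first column holds a list of n items, the unprojected form copies that column n times, which is the quadratic behavior described above. Projecting onto only the produced column avoids the repeated copies.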
Join
Fields
plan: JoinPlan
Detailed information about the implementation of the join.
This includes information about the implementation strategy, but also any map, filter, project work that we might follow the join with, but potentially pushed down into the implementation of the join.
A multiway relational equijoin, with fused map, filter, and projection.
This stage performs a multiway join among inputs, using the equality constraints expressed in plan. The plan also describes the implementation strategy we will use, and any pushed down per-record work.
Reduce
Fields
key_val_plan: KeyValPlan
A plan for changing input records into key, value pairs.
plan: ReducePlan
A plan for performing the reduce.
The implementation of reduction has several different strategies based on the properties of the reduction, and the input itself. Please check out the documentation for this type for more detail.
input_key: Option<Vec<MirScalarExpr>>
The particular arrangement of the input we expect to use, if any.
Aggregation by key.
TopK
Fields
top_k_plan: TopKPlan
A plan for performing the Top-K.
The implementation of Top-K has several different strategies based on the properties of the operator and the input itself. Please check out the documentation for this type for more detail.
Key-based “Top K” operator, retaining the first K records in each group.
Negate
Inverts the sign of each update.
Threshold
Fields
threshold_plan: ThresholdPlan
A plan for performing the threshold.
The implementation of threshold has several different strategies based on the properties of the input. Please check out the documentation for this type for more detail.
Filters records that accumulate negatively.
Although the operator suppresses updates, it is a stateful operator taking resources proportional to the number of records with non-zero accumulation.
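A minimal sketch of how Negate and Threshold act on update multiplicities follows. The Update alias, the string-valued records, and both functions are hypothetical; timestamps and the ThresholdPlan strategies are left out entirely.

```rust
use std::collections::BTreeMap;

/// A toy update: a record together with a signed multiplicity change.
type Update = (String, i64);

/// Inverts the sign of each update.
fn negate(updates: &[Update]) -> Vec<Update> {
    updates.iter().map(|(row, diff)| (row.clone(), -diff)).collect()
}

/// Accumulates the multiplicity of each record and keeps only records whose
/// total is positive. The map of totals stands in for the state that a
/// threshold operator has to maintain.
fn threshold(updates: &[Update]) -> BTreeMap<String, i64> {
    let mut totals: BTreeMap<String, i64> = BTreeMap::new();
    for (row, diff) in updates {
        *totals.entry(row.clone()).or_insert(0) += *diff;
    }
    totals.retain(|_, total| *total > 0);
    totals
}
```

Combining one collection with the negation of another and then applying threshold gives the kind of multiset difference that a threshold stage is typically paired with. Under this sketch, a record that appears twice on the left and once on the right survives with multiplicity one, and a record that appears more often on the right is dropped.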
Union
Adds the contents of the input collections.
Importantly, this is multiset union, so the multiplicities of records will add. This is in contrast to set union, where the multiplicities would be capped at one. A set union can be formed with Union followed by Reduce implementing the “distinct” operator.
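The difference between multiset union and set union can be made concrete with a small sketch. The functions multiset_union and distinct and the (String, i64) representation of records with multiplicities are hypothetical, chosen only to illustrate the arithmetic.

```rust
use std::collections::BTreeMap;

/// Multiset union: the multiplicities of equal records add across inputs.
fn multiset_union(inputs: &[Vec<(String, i64)>]) -> BTreeMap<String, i64> {
    let mut totals: BTreeMap<String, i64> = BTreeMap::new();
    for input in inputs {
        for (row, diff) in input {
            *totals.entry(row.clone()).or_insert(0) += *diff;
        }
    }
    // Records whose multiplicities cancel are not part of the collection.
    totals.retain(|_, total| *total != 0);
    totals
}

/// A toy "distinct": caps every positive multiplicity at one.
fn distinct(collection: &BTreeMap<String, i64>) -> BTreeMap<String, i64> {
    collection
        .iter()
        .filter(|(_, count)| **count > 0)
        .map(|(row, _)| (row.clone(), 1))
        .collect()
}
```

With inputs {x: 1, y: 2} and {x: 1}, the multiset union holds x and y each with multiplicity two, and applying distinct afterwards gives the set union with both multiplicities equal to one.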
ArrangeBy
Fields
forms: AvailableCollections
A list of arrangement keys, and possibly a raw collection, that will be added to those of the input.
If any of these collection forms are already present in the input, they have no effect.
input_key: Option<Vec<MirScalarExpr>>
The key that must be used to access the input.
input_mfp: MapFilterProject
The MFP that must be applied to the input.
The input plan, but with additional arrangements.
This operator does not change the logical contents of input, but ensures that certain arrangements are available in the results. This operator can be important for e.g. the Join stage, which benefits from multiple arrangements, or to cap a Plan so that indexes can be exported.
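The rule that requested forms are added to those of the input, and that forms already present have no effect, can be sketched as a small merge. ToyForms is a hypothetical simplification that records only whether a raw collection exists and which column lists are arranged; the real AvailableCollections carries more information per key.

```rust
/// A toy description of the forms in which a collection is available:
/// optionally as a raw collection, and arranged by zero or more keys.
/// Keys are modeled as lists of column indexes.
#[derive(Debug, Clone, PartialEq, Default)]
struct ToyForms {
    raw: bool,
    keys: Vec<Vec<usize>>,
}

impl ToyForms {
    /// Adds the forms in `extra` to `self`.
    /// Forms that are already present are left untouched.
    fn add(&mut self, extra: &ToyForms) {
        self.raw |= extra.raw;
        for key in &extra.keys {
            if !self.keys.contains(key) {
                self.keys.push(key.clone());
            }
        }
    }
}
```

Under this model, asking for an arrangement the input already has changes nothing, so only the genuinely new keys represent additional work.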
Implementations
pub fn arrange_by(
self,
collections: AvailableCollections,
old_collections: &AvailableCollections,
arity: usize
) -> Self
Replace the plan with another one that has the collection in some additional forms.
pub fn from_mir(
expr: &MirRelationExpr,
arrangements: &mut BTreeMap<Id, AvailableCollections>
) -> Result<(Self, AvailableCollections), ()>
This method converts a MirRelationExpr into a plan that can be directly rendered.
The rough structure is that we repeatedly extract map/filter/project operators from each expression we see, bundle them up as a MapFilterProject object, and then produce a plan for the combination of that with the next operator.
The method takes as an argument the existing arrangements for each bound identifier, which it will locally add to and remove from for Let bindings (by the end of the call it should contain the same bindings as when it started).
The result of the method is both a Plan and a list of arrangements that are certain to be produced, which can be relied on by the next steps in the plan. Each of the arrangement keys is associated with an MFP that must be applied if that arrangement is used, to back out the permutation associated with that arrangement.
An empty list of arrangement keys indicates that only a Collection stream can be assumed to exist.
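The extraction step described above can be sketched on a toy expression type. This is only an illustration of peeling a run of linear operators off the top of an expression; ToyMir, LinearOp, and extract_linear are hypothetical names, and the sketch does not attempt to reproduce how from_mir builds a MapFilterProject or plans the remaining operator.

```rust
/// A hypothetical, much-reduced stand-in for `MirRelationExpr`.
/// Predicates and scalar expressions are kept as opaque strings.
#[derive(Debug, Clone, PartialEq)]
enum ToyMir {
    Get(String),
    Filter { input: Box<ToyMir>, predicate: String },
    Map { input: Box<ToyMir>, scalar: String },
    Project { input: Box<ToyMir>, outputs: Vec<usize> },
    Negate { input: Box<ToyMir> },
}

/// One linear (record-at-a-time) operator.
#[derive(Debug, Clone, PartialEq)]
enum LinearOp {
    Filter(String),
    Map(String),
    Project(Vec<usize>),
}

/// Peels the run of map/filter/project operators off the top of `expr`.
/// Returns the operators in the order they would be applied (innermost
/// first), together with the first non-linear expression beneath them.
fn extract_linear(expr: &ToyMir) -> (Vec<LinearOp>, &ToyMir) {
    let mut ops = Vec::new();
    let mut current = expr;
    loop {
        match current {
            ToyMir::Filter { input, predicate } => {
                ops.push(LinearOp::Filter(predicate.clone()));
                current = &**input;
            }
            ToyMir::Map { input, scalar } => {
                ops.push(LinearOp::Map(scalar.clone()));
                current = &**input;
            }
            ToyMir::Project { input, outputs } => {
                ops.push(LinearOp::Project(outputs.clone()));
                current = &**input;
            }
            // Anything else ends the run of linear operators.
            _ => break,
        }
    }
    // Operators were collected outermost first; reverse into application order.
    ops.reverse();
    (ops, current)
}
```

In this sketch, the collected operators would then be bundled and handed to whatever plans the remaining expression, so that stages able to absorb linear work can do so and the rest fall back to a separate Mfp stage.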
fn from_mir_inner(
expr: &MirRelationExpr,
arrangements: &mut BTreeMap<Id, AvailableCollections>
) -> Result<(Self, AvailableCollections), ()>
pub fn finalize_dataflow(
desc: DataflowDescription<OptimizedMirRelationExpr>
) -> Result<DataflowDescription<Self>, ()>
Convert the dataflow description into one that uses render plans.
Trait Implementations
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where
__D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
Auto Trait Implementations
impl RefUnwindSafe for Plan
impl UnwindSafe for Plan
Blanket Implementations
Mutably borrows from an owned value.
Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.
Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.