Enum dataflow::Plan [−][src]
pub enum Plan {
Constant {
rows: Result<Vec<(Row, Timestamp, Diff)>, EvalError>,
},
Get {
id: Id,
keys: Vec<Vec<MirScalarExpr>>,
mfp: MapFilterProject,
key_val: Option<(Vec<MirScalarExpr>, Row)>,
},
Let {
id: LocalId,
value: Box<Plan>,
body: Box<Plan>,
},
Mfp {
input: Box<Plan>,
mfp: MapFilterProject,
key_val: Option<(Vec<MirScalarExpr>, Row)>,
},
FlatMap {
input: Box<Plan>,
func: TableFunc,
exprs: Vec<MirScalarExpr>,
mfp: MapFilterProject,
},
Join {
inputs: Vec<Plan>,
plan: JoinPlan,
},
Reduce {
input: Box<Plan>,
key_val_plan: KeyValPlan,
plan: ReducePlan,
permutation: Permutation,
},
TopK {
input: Box<Plan>,
top_k_plan: TopKPlan,
},
Negate {
input: Box<Plan>,
},
Threshold {
input: Box<Plan>,
threshold_plan: ThresholdPlan,
},
Union {
inputs: Vec<Plan>,
},
ArrangeBy {
input: Box<Plan>,
ensure_arrangements: Vec<(Vec<MirScalarExpr>, Permutation, Vec<usize>)>,
},
}
Expand description
A rendering plan with all conditional logic removed.
This type is exposed publicly but the intent is that its details are under the control of this crate, and they are subject to change as we find more compelling ways to represent renderable plans. Several stages have already encapsulated much of their logic in their own stage-specific plans, and we expect more of the plans to do the same in the future, without consultation.
Variants
A collection containing a pre-determined collection.
Fields of Constant
A reference to a bound collection.
This is commonly either an external reference to an existing source or
maintained arrangement, or an internal reference to a Let
identifier.
Fields of Get
id: Id
A global or local identifier naming the collection.
keys: Vec<Vec<MirScalarExpr>>
Arrangements that will be available.
The collection will also be loaded if available, which it will not be for imported data, but which it may be for locally defined data.
mfp: MapFilterProject
Any linear operator work to apply as part of producing the data.
This logic allows us to efficiently extract collections from data that have been pre-arranged, avoiding copying rows that are not used and columns that are projected away.
key_val: Option<(Vec<MirScalarExpr>, Row)>
Optionally, a pair of arrangement key and row value to search for.
When this is present, it means that the implementation can search the arrangement keyed by the first argument for the value that is the second argument, and process only those elements.
Binds value
to id
, and then results in body
with that binding.
This stage has the effect of sharing value
across multiple possible
uses in body
, and is the only mechanism we have for sharing collection
information across parts of a dataflow.
The binding is not available outside of body
.
Fields of Let
Map, Filter, and Project operators.
This stage contains work that we would ideally like to fuse to other plan stages, but for practical reasons cannot. For example: reduce, threshold, and topk stages are not able to absorb this operator.
Fields of Mfp
input: Box<Plan>
The input collection.
mfp: MapFilterProject
Linear operator to apply to each record.
key_val: Option<(Vec<MirScalarExpr>, Row)>
Optionally, a pair of arrangement key and row value to search for.
When this is present, it means that the implementation can search the arrangement keyed by the first argument for the value that is the second argument, and process only those elements.
A variable number of output records for each input record.
This stage is a bit of a catch-all for logic that does not easily fit in map stages. This includes table valued functions, but also functions of multiple arguments, and functions that modify the sign of updates.
This stage allows a MapFilterProject
operator to be fused to its output,
and this can be very important as otherwise the output of func
is just
appended to the input record, for as many outputs as it has. This has the
unpleasant default behavior of repeating potentially large records that
are being unpacked, producing quadratic output in those cases. Instead,
in these cases use a mfp
member that projects away these large fields.
Fields of FlatMap
input: Box<Plan>
The input collection.
func: TableFunc
The variable-record emitting function.
exprs: Vec<MirScalarExpr>
Expressions that for each row prepare the arguments to func
.
mfp: MapFilterProject
Linear operator to apply to each record produced by func
.
A multiway relational equijoin, with fused map, filter, and projection.
This stage performs a multiway join among inputs
, using the equality
constraints expressed in plan
. The plan also describes the implementataion
strategy we will use, and any pushed down per-record work.
Fields of Join
inputs: Vec<Plan>
An ordered list of inputs that will be joined.
plan: JoinPlan
Detailed information about the implementation of the join.
This includes information about the implementation strategy, but also any map, filter, project work that we might follow the join with, but potentially pushed down into the implementation of the join.
Aggregation by key.
Fields of Reduce
input: Box<Plan>
The input collection.
key_val_plan: KeyValPlan
A plan for changing input records into key, value pairs.
plan: ReducePlan
A plan for performing the reduce.
The implementation of reduction has several different strategies based on the properties of the reduction, and the input itself. Please check out the documentation for this type for more detail.
permutation: Permutation
Permutation of the produced arrangement
Key-based “Top K” operator, retaining the first K records in each group.
Fields of TopK
Inverts the sign of each update.
Filters records that accumulate negatively.
Although the operator suppresses updates, it is a stateful operator taking resources proportional to the number of records with non-zero accumulation.
Fields of Threshold
input: Box<Plan>
The input collection.
threshold_plan: ThresholdPlan
A plan for performing the threshold.
The implementation of reduction has several different strategies based on the properties of the reduction, and the input itself. Please check out the documentation for this type for more detail.
Adds the contents of the input collections.
Importantly, this is multiset union, so the multiplicities of records will
add. This is in contrast to set union, where the multiplicities would be
capped at one. A set union can be formed with Union
followed by Reduce
implementing the “distinct” operator.
The input
plan, but with additional arrangements.
This operator does not change the logical contents of input
, but ensures
that certain arrangements are available in the results. This operator can
be important for e.g. the Join
stage which benefits from multiple arrangements
or to cap a Plan
so that indexes can be exported.
Fields of ArrangeBy
input: Box<Plan>
The input collection.
ensure_arrangements: Vec<(Vec<MirScalarExpr>, Permutation, Vec<usize>)>
A list of arrangement keys that will be added to those of the input, together with a permutation and thinning pattern. The permutation and thinning pattern will be applied on the input if there is no existing arrangement on the set of keys.
If any of these keys are already present in the input, they have no effect.
Implementations
pub fn from_mir(
expr: &MirRelationExpr,
arrangements: &mut BTreeMap<Id, Vec<Vec<MirScalarExpr>>>
) -> Result<(Self, Vec<Vec<MirScalarExpr>>), ()>
pub fn from_mir(
expr: &MirRelationExpr,
arrangements: &mut BTreeMap<Id, Vec<Vec<MirScalarExpr>>>
) -> Result<(Self, Vec<Vec<MirScalarExpr>>), ()>
This method converts a MirRelationExpr into a plan that can be directly rendered.
The rough structure is that we repeatedly extract map/filter/project operators
from each expression we see, bundle them up as a MapFilterProject
object, and
then produce a plan for the combination of that with the next operator.
The method takes as an argument the existing arrangements for each bound identifier,
which it will locally add to and remove from for Let
bindings (by the end of the
call it should contain the same bindings as when it started).
The result of the method is both a Plan
, but also a list of arrangements that
are certain to be produced, which can be relied on by the next steps in the plan.
An empty list of arrangement keys indicates that only a Collection
stream can
be assumed to exist.
pub fn finalize_dataflow(
desc: DataflowDescription<OptimizedMirRelationExpr>
) -> Result<DataflowDescription<Self>, ()>
pub fn finalize_dataflow(
desc: DataflowDescription<OptimizedMirRelationExpr>
) -> Result<DataflowDescription<Self>, ()>
Convert the dataflow description into one that uses render plans.
Trait Implementations
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where
__D: Deserializer<'de>,
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where
__D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer. Read more
Auto Trait Implementations
impl RefUnwindSafe for Plan
impl UnwindSafe for Plan
Blanket Implementations
Mutably borrows from an owned value. Read more
Attaches the provided Subscriber
to this type, returning a
WithDispatch
wrapper. Read more
Attaches the current default Subscriber
to this type, returning a
WithDispatch
wrapper. Read more