pub trait Operate<T: Timestamp> {
// Required methods
fn inputs(&self) -> usize;
fn outputs(&self) -> usize;
fn initialize(
self: Box<Self>,
) -> (Connectivity<T::Summary>, Rc<RefCell<SharedProgress<T>>>, Box<dyn Schedule>);
fn notify_me(&self) -> &[FrontierInterest];
// Provided method
fn local(&self) -> bool { ... }
}Expand description
A dataflow operator that progress with a specific timestamp type.
This trait describes the methods necessary to present as a dataflow operator.
This trait is a “builder” for operators, in that it reveals the structure of the operator
and its requirements, but then (through initialize) consumes itself to produce a boxed
schedulable object. At the moment of initialization, the values of the other methods are
captured and frozen.
Required Methods§
Sourcefn initialize(
self: Box<Self>,
) -> (Connectivity<T::Summary>, Rc<RefCell<SharedProgress<T>>>, Box<dyn Schedule>)
fn initialize( self: Box<Self>, ) -> (Connectivity<T::Summary>, Rc<RefCell<SharedProgress<T>>>, Box<dyn Schedule>)
Initializes the operator, converting the operator builder to a schedulable object.
In addition, initialization produces internal connectivity, and a shared progress conduit which must contain any initial output capabilities the operator would like to hold.
The internal connectivity summarizes the operator by a map from pairs (input, output)
to an antichain of timestamp summaries, indicating how a timestamp on any of its inputs may
be transformed to timestamps on any of its outputs. The conservative and most common result
is full connectivity between all inputs and outputs, each with the identity summary.
The shared progress object allows information to move between the host and the schedulable. Importantly, it also indicates the initial internal capabilities for all of its outputs. This must happen at this moment, as it is the only moment where an operator is allowed to safely “create” capabilities without basing them on other, prior capabilities.
Sourcefn notify_me(&self) -> &[FrontierInterest]
fn notify_me(&self) -> &[FrontierInterest]
Indicates for each input whether the operator should be invoked when that input’s frontier changes.
Returns a Vec<FrontierInterest> with one entry per input. Each entry describes whether
frontier changes on that input should cause the operator to be scheduled. The conservative
default is Always for each input.
Provided Methods§
Sourcefn local(&self) -> bool
fn local(&self) -> bool
Indicates if the operator is strictly local to this worker.
A parent scope must understand whether the progress information returned by the worker reflects only this worker’s progress, so that it knows whether to send and receive the corresponding progress messages to its peers. If the operator is strictly local, it must exchange this information, whereas if the operator is itself implemented by the same set of workers, the parent scope understands that progress information already reflects the aggregate information among the workers.
This is a coarse approximation to refined worker sets. In a future better world, operators would explain how their implementations are partitioned, so that a parent scope knows what progress information to exchange with which peers. Right now the two choices are either “all” or “none”, but it could be more detailed. In the more detailed case, this method should / could return a pair (index, peers), indicating the group id of the worker out of how many groups. This becomes complicated, as a full all-to-all exchange would result in multiple copies of the same progress messages (but aggregated variously) arriving at arbitrary times.