Trait Operate 

pub trait Operate<T: Timestamp> {
    // Required methods
    fn inputs(&self) -> usize;
    fn outputs(&self) -> usize;
    fn initialize(
        self: Box<Self>,
    ) -> (Connectivity<T::Summary>, Rc<RefCell<SharedProgress<T>>>, Box<dyn Schedule>);
    fn notify_me(&self) -> &[FrontierInterest];

    // Provided method
    fn local(&self) -> bool { ... }
}

A dataflow operator that progresses with a specific timestamp type.

This trait describes the methods necessary to present as a dataflow operator. It is a “builder” for operators: it reveals the structure of the operator and its requirements, but then consumes itself (through initialize) to produce a boxed schedulable object. At the moment of initialization, the values of the other methods are captured and frozen.
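The consuming-builder shape described above can be sketched without the library. Everything in this block is a hypothetical stand-in (the `Builder` and `Schedule` traits, `CountdownBuilder`, `Countdown`, `build_countdown`); it is not timely's API and only illustrates how a `self: Box<Self>` receiver lets a builder hand its state over to a boxed schedulable object.

```rust
// Hypothetical stand-ins for illustration only; these are NOT timely's types.
trait Schedule {
    /// Runs the operator once; returns true if more work remains.
    fn schedule(&mut self) -> bool;
}

trait Builder {
    fn inputs(&self) -> usize;
    fn outputs(&self) -> usize;
    /// Consumes the boxed builder, returning the frozen input and output
    /// counts together with the schedulable object.
    fn initialize(self: Box<Self>) -> (usize, usize, Box<dyn Schedule>);
}

struct CountdownBuilder {
    remaining: u32,
}

struct Countdown {
    remaining: u32,
}

impl Schedule for Countdown {
    fn schedule(&mut self) -> bool {
        self.remaining = self.remaining.saturating_sub(1);
        self.remaining > 0
    }
}

impl Builder for CountdownBuilder {
    fn inputs(&self) -> usize { 1 }
    fn outputs(&self) -> usize { 1 }
    fn initialize(self: Box<Self>) -> (usize, usize, Box<dyn Schedule>) {
        // Capture the structural answers while the builder still exists.
        let (inputs, outputs) = (self.inputs(), self.outputs());
        // Move the builder's state into the schedulable object.
        (inputs, outputs, Box::new(Countdown { remaining: self.remaining }))
    }
}

/// Builds through a trait object, as a host holding `Box<dyn Builder>` would.
fn build_countdown(steps: u32) -> (usize, usize, Box<dyn Schedule>) {
    let builder: Box<dyn Builder> = Box::new(CountdownBuilder { remaining: steps });
    builder.initialize()
}
```

After `initialize` returns, the builder is gone, so the host can only rely on the values it captured at that moment.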

Required Methods§

fn inputs(&self) -> usize

The number of inputs.

fn outputs(&self) -> usize

The number of outputs.

fn initialize(self: Box<Self>) -> (Connectivity<T::Summary>, Rc<RefCell<SharedProgress<T>>>, Box<dyn Schedule>)

Initializes the operator, converting the operator builder to a schedulable object.

In addition, initialization produces internal connectivity, and a shared progress conduit which must contain any initial output capabilities the operator would like to hold.

The internal connectivity summarizes the operator by a map from pairs (input, output) to an antichain of timestamp summaries, indicating how a timestamp on any of its inputs may be transformed to timestamps on any of its outputs. The conservative and most common result is full connectivity between all inputs and outputs, each with the identity summary.
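As a rough illustration of the map from (input, output) pairs to summaries, the sketch below models timestamps as `u64` and a summary as a delay added to a timestamp, so the identity summary is `0`. The aliases `Summary`, `Antichain`, and `Connectivity`, and both functions, are assumptions made for this example and do not reproduce the library's own `Connectivity` type.

```rust
/// Hypothetical summary: a delay added to a `u64` timestamp; 0 is the identity.
type Summary = u64;
/// Stand-in for an antichain of summaries. Empty means "not connected".
type Antichain = Vec<Summary>;
/// `connectivity[input][output]` lists how times at `input` may reach `output`.
type Connectivity = Vec<Vec<Antichain>>;

/// The conservative choice: every input reaches every output unchanged.
fn full_connectivity(inputs: usize, outputs: usize) -> Connectivity {
    vec![vec![vec![0]; outputs]; inputs]
}

/// Earliest timestamp that `time` at `input` could produce at `output`, if any.
fn earliest_output_time(
    conn: &Connectivity,
    input: usize,
    output: usize,
    time: u64,
) -> Option<u64> {
    conn[input][output].iter().map(|delay| time + delay).min()
}
```

Clearing an entry, for example `conn[0][1].clear()`, would model an input that can never influence that output, which is a more precise (less conservative) summary.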

The shared progress object allows information to move between the host and the schedulable. Importantly, it also indicates the initial internal capabilities for all of its outputs. This must happen at this moment, as it is the only moment where an operator is allowed to safely “create” capabilities without basing them on other, prior capabilities.
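The sharing pattern can be sketched with `Rc<RefCell<...>>` from the standard library. The `Progress` struct, its `internals` field layout, the `Operator` type, and the use of timestamp `0` as the initial capability time are all simplifying assumptions for this example; the real `SharedProgress<T>` is richer than this.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical, simplified conduit; not timely's `SharedProgress`.
struct Progress {
    /// Per output: (timestamp, change in capability count) updates.
    internals: Vec<Vec<(u64, i64)>>,
}

/// Hypothetical schedulable side, holding the second handle to the conduit.
struct Operator {
    progress: Rc<RefCell<Progress>>,
}

/// Returns the host's handle and the operator, both pointing at one conduit.
fn initialize(outputs: usize) -> (Rc<RefCell<Progress>>, Operator) {
    let mut progress = Progress { internals: vec![Vec::new(); outputs] };
    // Initial capabilities are announced here, before the host first reads.
    // Timestamp 0 stands in for the minimum timestamp (an assumption).
    for output in progress.internals.iter_mut() {
        output.push((0, 1));
    }
    let shared = Rc::new(RefCell::new(progress));
    (shared.clone(), Operator { progress: shared })
}

impl Operator {
    /// Later changes derive from a held capability: release `from`, acquire `to`.
    fn downgrade(&self, output: usize, from: u64, to: u64) {
        let mut p = self.progress.borrow_mut();
        p.internals[output].push((from, -1));
        p.internals[output].push((to, 1));
    }
}
```

In this sketch the only unmatched `+1` is the one written during `initialize`; every later update pairs an acquisition with a release of an earlier capability.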

fn notify_me(&self) -> &[FrontierInterest]

Indicates for each input whether the operator should be invoked when that input’s frontier changes.

Returns a slice of FrontierInterest values with one entry per input. Each entry describes whether frontier changes on that input should cause the operator to be scheduled. The conservative default is Always for each input.
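A minimal sketch of the one-entry-per-input convention follows. The `Interest` enum is a hypothetical stand-in for `FrontierInterest`: only `Always` is named in the text above, and the `Never` variant, the `Builder` struct, and `should_schedule` are assumptions added for illustration.

```rust
/// Hypothetical stand-in for `FrontierInterest`. `Never` is an assumed variant.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Interest {
    Always,
    Never,
}

/// Hypothetical builder that stores one interest per input.
struct Builder {
    interests: Vec<Interest>,
}

impl Builder {
    /// The conservative choice: be scheduled on every frontier change.
    fn conservative(inputs: usize) -> Self {
        Builder { interests: vec![Interest::Always; inputs] }
    }

    fn inputs(&self) -> usize {
        self.interests.len()
    }

    /// Borrowed slice, so the length always matches `inputs()`.
    fn notify_me(&self) -> &[Interest] {
        &self.interests
    }
}

/// How a host might consult the slice when one input's frontier changes.
fn should_schedule(interests: &[Interest], changed_input: usize) -> bool {
    interests[changed_input] == Interest::Always
}
```

Storing the interests in the builder and returning a borrowed slice avoids allocating on every call and keeps the result consistent with the input count.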

Provided Methods§

fn local(&self) -> bool

Indicates if the operator is strictly local to this worker.

A parent scope must understand whether the progress information returned by the operator reflects only this worker’s progress, so that it knows whether to send and receive the corresponding progress messages to and from its peers. If the operator is strictly local, the parent scope must exchange this information. If the operator is instead implemented by the same set of workers, the parent scope understands that the progress information already reflects the aggregate across those workers.
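The decision a parent scope makes can be modeled in a few lines. This is a deliberately simplified, hypothetical model: progress is reduced to a single signed count per worker, and `progress_seen_by_parent` and its parameters are invented for the example rather than taken from the library.

```rust
/// Hypothetical model: each worker reports a net change in outstanding work.
/// `operator_is_local` plays the role of the value returned by `local`.
fn progress_seen_by_parent(
    operator_is_local: bool,
    own_report: i64,
    peer_reports: &[i64],
) -> i64 {
    if operator_is_local {
        // The report covers this worker only, so peers' reports are folded in.
        own_report + peer_reports.iter().sum::<i64>()
    } else {
        // The report is already an aggregate across the workers.
        own_report
    }
}
```

Folding in peer reports for a non-local operator would double count, which is why the parent needs this flag at all.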

This is a coarse approximation to refined worker sets. In a better future, operators would explain how their implementations are partitioned, so that a parent scope knows what progress information to exchange with which peers. Right now the two choices are either “all” or “none”, but it could be more detailed. In the more detailed case, this method should / could return a pair (index, peers), giving the worker’s group id and the total number of groups. This becomes complicated, as a full all-to-all exchange would result in multiple copies of the same progress messages (but aggregated variously) arriving at arbitrary times.

Implementors§

impl<TOuter, TInner> Operate<TOuter> for Subgraph<TOuter, TInner>
where TOuter: Timestamp, TInner: Timestamp + Refines<TOuter>,