Module differential_dataflow::trace::description
source · Expand description
Descriptions of intervals of partially ordered times.
A description provides what intends to be an unambiguous characterization of a batch of updates. We do assume that these updates are in the context of a known computation and known input, so there is a well-defined “correct answer” for the full set of updates.
full = { (data, time, diff) }
Our aim with a description is to specify a subset of these updates unambiguously.
Each description contains three frontiers, sets of mutually incomparable partially ordered
times. The first two frontiers are lower
and upper
, and they indicate the subset of
full
represented in the batch: those updates whose times are greater or equal to some
element of lower
but not greater or equal to any element of upper
.
subset = { (data, time, diff) in full | lower.any(|t| t.le(time)) &&
!upper.any(|t| t.le(time)) }
The third frontier since
is used to indicate that the times presented by the batch may
no longer reflect the values in subset
above. Although the updates are precisely those
bound by lower
and upper
, we may have advanced some of the times.
The guarantee provided by a batch is that for any time greater or equal to some element of
since
, the accumulated weight of batch updates before that time is identical to the accumulated
weights of updates from full
at times greater or equal to an element of lower
, greater
or equal to no element of upper
, and less or equal to the query time.
for all times t1:
if since.any(|t2| t2.less_equal(t1)) then:
for all data:
sum x where (data, t2, x) in batch and t2.less_equal(t1)
==
sum x where (data, t2, x) in full and t2.less_equal(t1)
and lower.any(|t3| t3.less_equal(t2))
and !upper.any(|t3| t3.less_equal(t2))
Very importantly, this equality does not make any other guarantees about the contents of the batch when one iterates through it. There are some consequences of the math that can be relied upon, though.
The most important consequence is that when since <= lower
the contents of the batch
must be identical to the updates in subset
. If it is ever the case that since
is
in advance of lower
, consumers of the batch must take care that they not use the times
observed in the batch without ensuring that they are appropriately advanced (typically by
since
). Failing to do so may produce updates that are not in advance of since
, which
will often be a logic bug, as since
does not advance without a corresponding advance in
times at which data may possibly be sent.
Structs§
- Describes an interval of partially ordered times.