Module dataflow::source::timestamp[][src]

Expand description

Types and methods for managing timestamp assignment and invention in sources External users will interact primarily with instances of a TimestampBindingRc object which lets various source instances reading on the same worker coordinate about the underlying TimestampBindingBox and give readers that are lagging behind the ability to delay compaction. Besides that, the only other bit of complexity in this code is the TimestampProposer object which manages the collaborative invention of timestamps by several source instances all reading from the same worker. The key idea is that since all source readers are assigned to the same worker, only one of them will be reading at a given time, and that reader can either consult the timestamp bindings generated by its peers if it is not the furthest ahead, or if it is the furthest ahead, it can propose a new assingment of (partition, offset) -> timestamp that its peers will respect.

Structs

Timestamp that was assigned to a source message.

This struct holds per partition timestamp binding state, as a ordered list of bindings (time, offset). Each binding indicates “all offsets < offset must be bound to time”, and adjacent pairs of bindings (time1, offset1), (time2, offset2) denote that offsets in [offset1, offset2) should get bound to time1.

Source-agnostic timestamp for SourceMessages. Admittedly, this is quite Kafka-centric.

This struct holds per-source timestamp state in a way that can be shared across different source instances and allow different source instances to indicate how far they have read up to.

A wrapper that allows multiple source instances to share a TimestampBindingBox and hold back its compaction.

Helper that can track the timestamp bindings from a TimestampBindingRc and emit differential updates that can be used to reconstruct the timestamp bindings.

This struct holds state for proposed timestamps and proposed bindings from offsets to timestamps.