Struct mz_txn_wal::txn_cache::TxnsCacheState

source ·
pub struct TxnsCacheState<T> {
    txns_id: ShardId,
    pub(crate) init_ts: T,
    pub(crate) progress_exclusive: T,
    next_batch_id: usize,
    pub(crate) unapplied_batches: BTreeMap<usize, (ShardId, Vec<u8>, T)>,
    batch_idx: HashMap<Vec<u8>, usize>,
    pub(crate) datas: BTreeMap<ShardId, DataTimes<T>>,
    pub(crate) unapplied_registers: VecDeque<(ShardId, T)>,
    only_data_id: Option<ShardId>,
}
Expand description

A cache of the txn shard contents, optimized for various in-memory operations.

§Implementation Details

Reads of data shards are almost as straightforward as writes. A data shard may be read normally, using snapshots, subscriptions, shard_source, etc, through the most recent non-empty write. However, the upper of the txns shard (and thus the logical upper of the data shard) may be arbitrarily far ahead of the physical upper of the data shard. As a result, we do the following:

  • To take a snapshot of a data shard, the as_of is passed through unchanged if the timestamp of that shard’s latest non-empty write is past it. Otherwise, we know the times between them have no writes and can fill them with empty updates. Concretely, to read a snapshot as of T:
    • We read the txns shard contents up through and including T, blocking until the upper passes T if necessary.
    • We then find, for the requested data shard, the latest non-empty write at a timestamp T' <= T.
    • We wait for T' to be applied by watching the data shard upper.
    • We compare_and_append empty updates for (T', T], which is known by the txn system to not have writes for this shard (otherwise we’d have picked a different T').
    • We read the snapshot at T as normal.
  • To iterate a listen on a data shard, when writes haven’t been read yet they are passed through unchanged, otherwise if the txns shard indicates that there are ranges of empty time progress is returned, otherwise progress to the txns shard will indicate when new information is available.

Note that all of the above can be determined solely by information in the txns shard. In particular, non-empty writes are indicated by updates with positive diffs.

Also note that the above is structured such that it is possible to write a timely operator with the data shard as an input, passing on all payloads unchanged and simply manipulating capabilities in response to data and txns shard progress. See crate::operator::txns_progress.

Fields§

§txns_id: ShardId§init_ts: T

The since of the txn_shard when this cache was initialized. Some writes with a timestamp < than this may have been applied and tidied, so this cache has no way of learning about them.

Invariant: never changes.

§progress_exclusive: T

The contents of this cache are updated up to, but not including, this time.

§next_batch_id: usize§unapplied_batches: BTreeMap<usize, (ShardId, Vec<u8>, T)>

The batches needing application as of the current progress.

This is indexed by a “batch id” that is internal to this object because timestamps are not unique.

Invariant: Values are sorted by timestamp.

§batch_idx: HashMap<Vec<u8>, usize>

An index into unapplied_batches keyed by the serialized batch.

§datas: BTreeMap<ShardId, DataTimes<T>>

The times at which each data shard has been written.

Invariant: Contains all unapplied writes and registers. Invariant: Contains the latest write and registertaion >= init_ts for all shards.

§unapplied_registers: VecDeque<(ShardId, T)>

The registers and forgets needing application as of the current progress.

Invariant: Values are sorted by timestamp.

§only_data_id: Option<ShardId>

If Some, this cache only tracks the indicated data shard as a performance optimization. When used, only some methods (in particular, the ones necessary for the txns_progress operator) are supported.

TODO: It’d be nice to make this a compile time thing. I have some ideas, but they’re decently invasive, so leave it for a followup.

Implementations§

source§

impl<T: Timestamp + Lattice + TotalOrder + StepForward + Codec64 + Sync> TxnsCacheState<T>

source

fn new(txns_id: ShardId, init_ts: T, only_data_id: Option<ShardId>) -> Self

Creates a new empty TxnsCacheState.

init_ts must be == the critical handle’s since of the txn shard.

source

pub(crate) async fn init<C: TxnsCodec>( only_data_id: Option<ShardId>, txns_read: ReadHandle<C::Key, C::Val, T, i64>, ) -> (Self, Subscribe<C::Key, C::Val, T, i64>)

Creates and initializes a new TxnsCacheState.

txns_read is a ReadHandle on the txn shard.

source

pub fn txns_id(&self) -> ShardId

Returns the ShardId of the txns shard.

source

pub fn registered_at_progress(&self, data_id: &ShardId, ts: &T) -> bool

Returns whether the data shard was registered to the txns set as of the current progress.

Specifically, a data shard is registered if the most recent register timestamp is set but the most recent forget timestamp is not set.

This function accepts a timestamp as input, but that timestamp must be equal to the progress exclusive, or else the function panics. It mainly acts as a way for the caller to think about the logical time at which this function executes. Times in the past may have been compacted away, and we can’t always return an accurate answer. If this function isn’t sufficient, you can usually find what you’re looking for by inspecting the times in the most recent registration.

source

pub(crate) fn all_registered_at_progress(&self, ts: &T) -> Vec<ShardId>

Returns the set of all data shards registered to the txns set as of the current progress. See Self::registered_at_progress.

source

pub fn data_snapshot(&self, data_id: ShardId, as_of: T) -> DataSnapshot<T>

Returns a token exchangeable for a snapshot of a data shard.

A data shard might be definite at times past the physical upper because of invariants maintained by this txn system. As a result, this method discovers the latest potentially unapplied write before the as_of.

Callers must first wait for TxnsCache::update_gt with the same or later timestamp to return. Panics otherwise.

source

pub fn data_listen_next(&self, data_id: &ShardId, ts: &T) -> DataListenNext<T>

Returns the next action to take when iterating a Listen on a data shard.

A data shard Listen is executed by repeatedly calling this method with an exclusive progress frontier. The returned value indicates an action to take. Some of these actions advance the progress frontier, which results in calling this method again with a higher timestamp, and thus a new action. See DataListenNext for specifications of the actions.

Note that this is a state machine on self.progress_exclusive and the listen progress. DataListenNext indicates which state transitions to take.

source

pub(crate) fn data_subscribe( &self, data_id: ShardId, as_of: T, ) -> DataSubscribe<T>

Returns a token exchangeable for a subscribe of a data shard.

Callers must first wait for TxnsCache::update_gt with the same or later timestamp to return. Panics otherwise.

source

pub fn min_unapplied_ts(&self) -> &T

Returns the minimum timestamp not known to be applied by this cache.

source

fn min_unapplied_ts_inner(&self) -> &T

source

pub(crate) fn unapplied( &self, ) -> impl Iterator<Item = (&ShardId, Unapplied<'_>, &T)>

Returns the operations needing application as of the current progress.

source

pub(crate) fn filter_retractions<'a>( &'a self, expected_txns_upper: &T, retractions: impl Iterator<Item = (&'a Vec<u8>, &'a ([u8; 8], ShardId))>, ) -> impl Iterator<Item = (&'a Vec<u8>, &'a ([u8; 8], ShardId))>

Filters out retractions known to have made it into the txns shard.

This is called with a set of things that are known to have been applied and in preparation for retracting them. The caller will attempt to retract everything not filtered out by this method in a CaA with an expected upper of expected_txns_upper. So, we catch up to that point, and keep everything that is still outstanding. If the CaA fails with an expected upper mismatch, then it must call this method again on the next attempt with the new expected upper (new retractions may have made it into the txns shard in the meantime).

Callers must first wait for TxnsCache::update_ge with the same or later timestamp to return. Panics otherwise.

source

pub(crate) fn push_entries( &mut self, entries: Vec<(TxnsEntry, T, i64)>, progress: T, )

Update contents with entries and mark this cache as progressed up to progress.

source

fn push_register(&mut self, data_id: ShardId, ts: T, diff: i64, compacted_ts: T)

source

fn push_append(&mut self, data_id: ShardId, batch: Vec<u8>, ts: T, diff: i64)

source

pub(crate) fn mark_register_applied(&mut self, ts: &T)

Informs the cache that all registers and forgets less than ts have been applied.

source

fn compact_data_times(&mut self, data_id: &ShardId)

Compact the internal representation for data_id by removing all data that is not needed to maintain the following invariants:

  • The latest write and registration for each shard are kept in self.datas.
  • All unapplied writes and registrations are kept in self.datas.
  • All writes in self.datas are contained by some registration in self.datas.
source

pub(crate) fn update_gauges(&self, metrics: &Metrics)

source

fn assert_only_data_id(&self, data_id: &ShardId)

source

pub(crate) fn validate(&self) -> Result<(), String>

Trait Implementations§

source§

impl<T: Debug> Debug for TxnsCacheState<T>

source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more

Auto Trait Implementations§

§

impl<T> Freeze for TxnsCacheState<T>
where T: Freeze,

§

impl<T> RefUnwindSafe for TxnsCacheState<T>
where T: RefUnwindSafe,

§

impl<T> Send for TxnsCacheState<T>
where T: Send,

§

impl<T> Sync for TxnsCacheState<T>
where T: Sync,

§

impl<T> Unpin for TxnsCacheState<T>
where T: Unpin,

§

impl<T> UnwindSafe for TxnsCacheState<T>

Blanket Implementations§

source§

impl<T> Any for T
where T: 'static + ?Sized,

source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
source§

impl<T> Borrow<T> for T
where T: ?Sized,

source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
source§

impl<T, U> CastInto<U> for T
where U: CastFrom<T>,

source§

fn cast_into(self) -> U

Performs the cast.
source§

impl<T> CopyAs<T> for T

source§

fn copy_as(self) -> T

source§

impl<T> From<T> for T

source§

fn from(t: T) -> T

Returns the argument unchanged.

source§

impl<T> FutureExt for T

source§

fn with_context(self, otel_cx: Context) -> WithContext<Self>

Attaches the provided Context to this type, returning a WithContext wrapper. Read more
source§

fn with_current_context(self) -> WithContext<Self>

Attaches the current Context to this type, returning a WithContext wrapper. Read more
source§

impl<T> Instrument for T

source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
source§

impl<T, U> Into<U> for T
where U: From<T>,

source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

source§

impl<T> IntoRequest<T> for T

source§

fn into_request(self) -> Request<T>

Wrap the input message T in a tonic::Request
source§

impl<Unshared, Shared> IntoShared<Shared> for Unshared
where Shared: FromUnshared<Unshared>,

source§

fn into_shared(self) -> Shared

Creates a shared type from an unshared type.
source§

impl<T> Pointable for T

source§

const ALIGN: usize = _

The alignment of pointer.
§

type Init = T

The type for initializers.
source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more
source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer. Read more
source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer. Read more
source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer. Read more
source§

impl<P, R> ProtoType<R> for P
where R: RustType<P>,

source§

impl<T> Same for T

§

type Output = T

Should always be Self
source§

impl<'a, S, T> Semigroup<&'a S> for T
where T: Semigroup<S>,

source§

fn plus_equals(&mut self, rhs: &&'a S)

The method of std::ops::AddAssign, for types that do not implement AddAssign.
source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

§

type Error = Infallible

The type returned in the event of a conversion error.
source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

source§

fn vzip(self) -> V

source§

impl<T> WithSubscriber for T

source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more
source§

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,