Static mz_persist::indexed::columnar::arrow::SCHEMA_ARROW_KVTD

pub static SCHEMA_ARROW_KVTD: Lazy<Arc<Schema>>

The Arrow schema we use to encode ((K, V), T, D) tuples.

Both Time and Diff are exposed to persist users as type parameters that implement mz_persist_types::Codec64. Our columnar format intentionally stores both as i64 columns (rather than something like a fixed-width binary column) because this gives us additional compression options.
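
As a rough illustration of why an 8-byte encoded value can live in an i64 column without loss, the sketch below round-trips a value through its bytes. It assumes a little-endian 8-byte encoding; the helper names `encoded_to_column` and `column_to_encoded` are hypothetical and are not part of mz_persist.

```rust
/// Hypothetical helper: reinterpret an 8-byte encoded value as the i64
/// that would be stored in the column. Assumes little-endian bytes.
fn encoded_to_column(buf: [u8; 8]) -> i64 {
    i64::from_le_bytes(buf)
}

/// Hypothetical helper: recover the 8-byte encoding from the column value.
fn column_to_encoded(val: i64) -> [u8; 8] {
    val.to_le_bytes()
}

fn main() {
    // A millis-since-epoch style timestamp, encoded as 8 bytes.
    let ts: u64 = 1_700_000_000_000;
    let buf = ts.to_le_bytes();

    // Store as i64, then read back: every one of the 2^64 bit patterns
    // survives, because both directions are pure reinterpretations.
    let col = encoded_to_column(buf);
    assert_eq!(column_to_encoded(col), buf);
    println!("column value: {col}");
}
```

Because the conversion is a bit-for-bit reinterpretation, the choice of i64 affects only how the column is typed and compressed, not which values can be represented.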

Also note that we intentionally use an i64 rather than a u64 for Time. Over the range [0, i64::MAX], the bytes are the same, and we’ve talked at various times about changing Time in mz to an i64. Both millis since unix epoch and nanos since unix epoch easily fit into this range (the latter until the year 2262). Using an i64 might be a pessimization for a non-realtime mz source with u64 timestamps in the range (i64::MAX, u64::MAX], but realtime sources are overwhelmingly the common case.
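
The range claims can be checked with plain integer arithmetic. The sketch below is illustrative only: `fits_in_i64` and `max_whole_years_of_nanos` are hypothetical helpers, and the year count assumes 365-day years.

```rust
/// Hypothetical helper: returns the timestamp as an i64 if it lies in
/// [0, i64::MAX], or None if it falls in (i64::MAX, u64::MAX].
fn fits_in_i64(ts: u64) -> Option<i64> {
    i64::try_from(ts).ok()
}

/// Hypothetical helper: whole 365-day years of nanoseconds that fit in an i64.
fn max_whole_years_of_nanos() -> i64 {
    let nanos_per_year: i64 = 1_000_000_000 * 60 * 60 * 24 * 365;
    i64::MAX / nanos_per_year
}

fn main() {
    // Within [0, i64::MAX], u64 and i64 have identical bytes.
    let ts: u64 = 1_700_000_000_000;
    assert_eq!(ts.to_le_bytes(), (ts as i64).to_le_bytes());
    assert_eq!((i64::MAX as u64).to_le_bytes(), i64::MAX.to_le_bytes());

    // Just past i64::MAX, the same bits read back as a negative i64.
    let above = i64::MAX as u64 + 1;
    assert!((above as i64) < 0);
    assert_eq!(fits_in_i64(above), None);

    // Nanos since the unix epoch fit for roughly 292 years after 1970.
    assert_eq!(max_whole_years_of_nanos(), 292);
}
```

Values above i64::MAX still round-trip bit-for-bit, but they appear as negative numbers in the column, which is the case the paragraph above describes as a possible pessimization.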