Static mz_persist::indexed::columnar::arrow::SCHEMA_ARROW_RS_KVTD
pub static SCHEMA_ARROW_RS_KVTD: LazyLock<Arc<Schema>>
The Arrow schema we use to encode ((K, V), T, D) tuples.
Both Time and Diff are presented externally to persist users as a type parameter that implements mz_persist_types::Codec64. Our columnar format intentionally stores them both as i64 columns (as opposed to something like a fixed width binary column) because this allows us additional compression options.
Also note that we intentionally use an i64 over a u64 for Time. Over the range [0, i64::MAX], the bytes are the same, and we’ve talked at various times about changing Time in mz to an i64. Both millis since unix epoch and nanos since unix epoch easily fit into this range (the latter until some time after year 2200). Using an i64 might be a pessimization for a non-realtime mz source with u64 timestamps in the range (i64::MAX, u64::MAX], but realtime sources are overwhelmingly the common case.
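
The range claims above can be checked with a small standalone sketch that uses only the standard library. The helper names (`time_to_i64`, `same_bytes`, `nanos_overflow_year`) are hypothetical and are not part of mz_persist. The year estimate assumes 365-day years, so it is approximate.

```rust
/// Hypothetical helper: convert a u64 timestamp to the i64 stored in the
/// column. Returns None for values in (i64::MAX, u64::MAX].
fn time_to_i64(t: u64) -> Option<i64> {
    i64::try_from(t).ok()
}

/// Hypothetical helper: true when `t` fits in an i64 and both
/// representations have identical little-endian bytes.
fn same_bytes(t: u64) -> bool {
    match time_to_i64(t) {
        Some(i) => i.to_le_bytes() == t.to_le_bytes(),
        None => false,
    }
}

/// Hypothetical helper: approximate calendar year in which nanoseconds
/// since the unix epoch no longer fit in an i64 (assumes 365-day years).
fn nanos_overflow_year() -> u64 {
    const NANOS_PER_YEAR: u64 = 365 * 24 * 60 * 60 * 1_000_000_000;
    1970 + (i64::MAX as u64) / NANOS_PER_YEAR
}

fn main() {
    // Every value in [0, i64::MAX] has the same bytes as a u64 and an i64.
    for t in [0u64, 1, 1_700_000_000_000, i64::MAX as u64] {
        println!("{t}: same bytes = {}", same_bytes(t));
    }
    // Values above i64::MAX are not representable as a non-negative i64.
    println!("u64::MAX fits: {}", time_to_i64(u64::MAX).is_some());
    println!("nanos overflow around year {}", nanos_overflow_year());
}
```

The sketch only illustrates the arithmetic behind the choice of i64. It says nothing about how persist handles a timestamp above i64::MAX in practice.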