Module mz_persist_types::columnar

Columnar understanding of persisted data

For efficiency, we directly expose the columnar structure of persist’s internal encoding to users during encoding and decoding. Internally, we use the arrow crate, and the resulting arrow data is durably written as Parquet.

Some of the requirements that led to this design:

Support a separation of data and schema because Row is not self-describing: e.g. a Datum::Null can be one of many possible column types. A RelationDesc is necessary to describe a Row schema.
Narrow down arrow::datatypes::DataType (the arrow “logical” types) to a set we want to support in persist.
Do dyn Any downcasting of columns once per part, not once per update.
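
The last requirement can be illustrated with a minimal sketch that uses only the standard library. `I64Column` and both functions are hypothetical names invented for this example; they are not persist or arrow types. The sketch contrasts repeating the `dyn Any` type check for every update with performing it once for the whole column.

```rust
use std::any::Any;

/// Hypothetical stand-in for a typed column (not a persist or arrow type).
struct I64Column(Vec<i64>);

/// Downcasts inside the loop: one dynamic type check per update.
fn sum_downcast_per_update(col: &dyn Any, len: usize) -> Option<i64> {
    let mut total = 0;
    for idx in 0..len {
        // The type check is repeated for every row.
        let typed = col.downcast_ref::<I64Column>()?;
        total += typed.0[idx];
    }
    Some(total)
}

/// Downcasts once, then iterates over the concrete type.
fn sum_downcast_per_part(col: &dyn Any) -> Option<i64> {
    // A single type check covers the whole column.
    let typed = col.downcast_ref::<I64Column>()?;
    Some(typed.0.iter().sum())
}
```

Both functions return the same result; the second pays for the dynamic check once regardless of how many updates the column holds.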

Finally, the Schema2 trait maps an implementor of Codec to the underlying column structure. It also provides a ColumnEncoder and ColumnDecoder for amortizing any downcasting that does need to happen.
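
As a rough illustration of this shape, the sketch below pairs a schema with an encoder and a decoder using only the standard library. Every name in it (`SketchSchema`, `SketchEncoder`, `SketchDecoder`, `StringSchema`, and the method names) is an assumption made for this example; the actual signatures of Schema2, ColumnEncoder, and ColumnDecoder should be taken from their own documentation.

```rust
use std::any::Any;

/// Hypothetical encoder: appends values, then yields a finished column.
trait SketchEncoder<T> {
    type Column: Any;
    fn append(&mut self, val: &T);
    fn finish(self) -> Self::Column;
}

/// Hypothetical decoder: writes the value at `idx` into `val`.
trait SketchDecoder<T> {
    fn decode(&self, idx: usize, val: &mut T);
}

/// Hypothetical schema: ties a value type to its column structure.
trait SketchSchema<T> {
    type Column: Any;
    type Encoder: SketchEncoder<T, Column = Self::Column>;
    type Decoder: SketchDecoder<T>;
    fn encoder(&self) -> Self::Encoder;
    /// The only place a downcast happens: once per column.
    fn decoder_any(&self, col: Box<dyn Any>) -> Option<Self::Decoder>;
}

struct StringSchema;
struct StringEncoder(Vec<String>);
struct StringDecoder(Vec<String>);

impl SketchEncoder<String> for StringEncoder {
    type Column = Vec<String>;
    fn append(&mut self, val: &String) {
        self.0.push(val.clone());
    }
    fn finish(self) -> Vec<String> {
        self.0
    }
}

impl SketchDecoder<String> for StringDecoder {
    fn decode(&self, idx: usize, val: &mut String) {
        // Reuse the caller's allocation instead of returning a new String.
        val.clear();
        val.push_str(&self.0[idx]);
    }
}

impl SketchSchema<String> for StringSchema {
    type Column = Vec<String>;
    type Encoder = StringEncoder;
    type Decoder = StringDecoder;
    fn encoder(&self) -> StringEncoder {
        StringEncoder(Vec::new())
    }
    fn decoder_any(&self, col: Box<dyn Any>) -> Option<StringDecoder> {
        // Returns None when the column is not the type this schema expects.
        col.downcast::<Vec<String>>().ok().map(|c| StringDecoder(*c))
    }
}
```

In this sketch, a caller obtains a decoder once per column and then calls `decode` for each row without any further dynamic type checks.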

Traits§

ColumnDecoder
A decoder for values of a fixed schema.
ColumnEncoder
An encoder for values of a fixed schema.
FixedSizeCodec
A stable encoding for a type that gets durably persisted in an arrow::array::FixedSizeBinaryArray.
Schema2
Description of a type that we encode into Persist.
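
To make the idea of a stable fixed-size encoding concrete, here is a minimal standard-library sketch. The `Interval` struct, `PACKED_SIZE`, `pack`, and `unpack` are hypothetical names chosen for this example and are not part of FixedSizeCodec. The sketch assumes big-endian byte order so that the byte layout does not depend on the host platform.

```rust
/// Hypothetical value type, used only for this sketch.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Interval {
    months: i32,
    micros: i64,
}

/// Every encoded value occupies exactly this many bytes.
const PACKED_SIZE: usize = 12;

/// Packs the value into a fixed-width, big-endian byte array.
fn pack(v: Interval) -> [u8; PACKED_SIZE] {
    let mut buf = [0u8; PACKED_SIZE];
    buf[..4].copy_from_slice(&v.months.to_be_bytes());
    buf[4..].copy_from_slice(&v.micros.to_be_bytes());
    buf
}

/// Unpacks a value, rejecting slices of the wrong width.
fn unpack(bytes: &[u8]) -> Result<Interval, String> {
    if bytes.len() != PACKED_SIZE {
        return Err(format!(
            "expected {} bytes, got {}",
            PACKED_SIZE,
            bytes.len()
        ));
    }
    let mut months = [0u8; 4];
    months.copy_from_slice(&bytes[..4]);
    let mut micros = [0u8; 8];
    micros.copy_from_slice(&bytes[4..]);
    Ok(Interval {
        months: i32::from_be_bytes(months),
        micros: i64::from_be_bytes(micros),
    })
}
```

Because every value has the same width, values encoded this way could be stored back to back in a fixed-size binary column and located by index alone.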

Functions§

codec_to_schema2
Helper to convert from codec-encoded data to structured data.
data_type
Returns the data type of arrays generated by this schema.
schema2_to_codec
Helper to convert from structured data to codec-encoded data.
