Module columnar

Columnar understanding of persisted data

For efficiency and performance, we directly expose the columnar structure of persist’s internal encoding to users during encoding and decoding. Internally, we represent data with the arrow crate, and that data gets durably written as parquet.

Some of the requirements that led to this design:

Support a separation of data and schema because Row is not self-describing: e.g. a Datum::Null can be one of many possible column types. A RelationDesc is necessary to describe a Row schema.
Narrow down arrow::datatypes::DataType (the arrow “logical” types) to a set we want to support in persist.
Do dyn Any downcasting of columns once per part, not once per update.
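
The last requirement can be illustrated with a minimal sketch that uses only the standard library. `Int64Column` and `sum_part` are hypothetical names invented for this illustration and are not part of persist or arrow; the real code works with arrow arrays. The point is that the dynamically typed column is downcast a single time, after which every update in the part is read through the concrete type.

```rust
use std::any::Any;

/// Hypothetical stand-in for a typed column (real code would use an arrow array).
struct Int64Column {
    values: Vec<i64>,
}

/// Sums all values in one "part".
///
/// The `dyn Any` downcast happens once for the whole part; the loop over
/// individual updates then runs on the concrete type with no further
/// dynamic checks.
fn sum_part(col: &dyn Any) -> Option<i64> {
    // One downcast per part. Returns `None` if the column has another type.
    let typed = col.downcast_ref::<Int64Column>()?;
    // Per-update work on the typed data.
    Some(typed.values.iter().sum())
}
```

Calling `sum_part` with a reference to an `Int64Column` yields the sum of its values, while passing any other type yields `None` instead of panicking.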

Finally, the Schema trait maps an implementor of Codec to the underlying column structure. It also provides a ColumnEncoder and ColumnDecoder for amortizing any downcasting that does need to happen.
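
As a rough sketch of how a schema can hand out an encoder and a decoder for one column layout, consider the simplified traits below. Every name here (`SketchSchema`, `SketchEncoder`, `SketchDecoder`, `StringColumn`, and so on) is hypothetical and chosen for illustration; the actual `Schema`, `ColumnEncoder`, and `ColumnDecoder` signatures in this module may differ, and the real columns are arrow arrays rather than the hand-rolled offsets-and-bytes layout assumed here.

```rust
/// Hypothetical encoder for values of a fixed schema.
trait SketchEncoder<T> {
    type Finished;
    fn append(&mut self, val: &T);
    fn finish(self) -> Self::Finished;
}

/// Hypothetical decoder for values of a fixed schema.
trait SketchDecoder<T> {
    fn decode(&self, idx: usize, val: &mut T);
}

/// Hypothetical schema: ties a type `T` to its column layout.
trait SketchSchema<T> {
    type Column;
    type Encoder: SketchEncoder<T, Finished = Self::Column>;
    type Decoder: SketchDecoder<T>;
    fn encoder(&self) -> Self::Encoder;
    fn decoder(&self, col: Self::Column) -> Self::Decoder;
}

/// Assumed layout: concatenated UTF-8 bytes plus end offsets.
struct StringColumn {
    offsets: Vec<usize>,
    bytes: Vec<u8>,
}

struct StringSchema;

struct StringEncoder {
    offsets: Vec<usize>,
    bytes: Vec<u8>,
}

struct StringDecoder {
    col: StringColumn,
}

impl SketchEncoder<String> for StringEncoder {
    type Finished = StringColumn;
    fn append(&mut self, val: &String) {
        self.bytes.extend_from_slice(val.as_bytes());
        self.offsets.push(self.bytes.len());
    }
    fn finish(self) -> StringColumn {
        StringColumn { offsets: self.offsets, bytes: self.bytes }
    }
}

impl SketchDecoder<String> for StringDecoder {
    fn decode(&self, idx: usize, val: &mut String) {
        let (start, end) = (self.col.offsets[idx], self.col.offsets[idx + 1]);
        // Reuse the caller's allocation instead of building a new String.
        val.clear();
        let s = std::str::from_utf8(&self.col.bytes[start..end]).expect("valid UTF-8");
        val.push_str(s);
    }
}

impl SketchSchema<String> for StringSchema {
    type Column = StringColumn;
    type Encoder = StringEncoder;
    type Decoder = StringDecoder;
    fn encoder(&self) -> StringEncoder {
        // The leading 0 makes value i span offsets[i]..offsets[i + 1].
        StringEncoder { offsets: vec![0], bytes: Vec::new() }
    }
    fn decoder(&self, col: StringColumn) -> StringDecoder {
        StringDecoder { col }
    }
}
```

In this sketch a caller obtains an encoder from the schema, appends each value, and calls `finish` to get the column; passing that column back to `decoder` allows values to be read by index into a reusable `String`. Because the encoder and decoder types are fixed by the schema, no per-value downcasting is needed.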

Traits§

ColumnDecoder: A decoder for values of a fixed schema.
ColumnEncoder: An encoder for values of a fixed schema.
FixedSizeCodec: A stable encoding for a type that gets durably persisted in an arrow::array::FixedSizeBinaryArray.
Schema: Description of a type that we encode into Persist.

Functions§

codec_to_schema: Helper to convert from codec-encoded data to structured data.
data_type: Returns the data type of arrays generated by this schema.
schema_to_codec: Helper to convert from structured data to codec-encoded data.
