APIs to read from Parquet format.
Re-exports§
pub use schema::infer_schema;
pub use parquet2::fallible_streaming_iterator;
Modules§
- API to perform page-level filtering (also known as indexes)
- APIs to handle Parquet <-> Arrow schemas.
- APIs exposing `parquet2`’s statistics as arrow’s statistics.
Structs§
- Metadata for a column chunk.
- A descriptor for leaf-level primitive columns. This encapsulates information such as definition and repetition levels and is used to re-assemble nested data.
- A `CompressedDataPage` is a compressed, encoded representation of a Parquet data page. It holds actual data and thus cloning it is expensive.
- Decompressor that allows re-using the page buffer of `PageIterator`.
- Metadata for a Parquet file.
- An iterator of `Chunk`s coming from row groups of a parquet file.
- The state of nested data types.
- A fallible `Iterator` of `CompressedDataPage`. This iterator reads pages back to back until all pages have been consumed. For pages from this iterator, `crate::page::CompressedDataPage::selected_rows()` is always `None`, since filter pushdown is not supported without a pre-computed page index.
- A `MutStreamingIterator` of pre-read column chunks
- Metadata for a row group.
- An `Iterator<Item=RowGroupDeserializer>` from row groups of a parquet file.
- An iterator adapter over `NestedArrayIter` assumed to be encoded as Struct arrays
Enums§
- The initial info of nested data types.
- A `Page` is an uncompressed, encoded representation of a Parquet page. It may hold actual data and thus cloning it may be expensive.
- Errors generated by this crate
- Representation of a Parquet type describing primitive and nested fields, including the top-level schema of the parquet file.
- The set of all physical types representable in Parquet
- State of `MutStreamingIterator`.
Traits§
- A fallible, streaming iterator.
- A special kind of fallible streaming iterator where `advance` consumes the iterator.
- Trait describing a `FallibleStreamingIterator` of `Page`
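The fallible, streaming iterator traits listed above separate advancing from borrowing, so an implementation can hand out references into a buffer that it reuses between items. The following is a minimal, standalone sketch of that pattern and not the definition re-exported from `parquet2`: the trait is re-declared locally in simplified form, and `ReusedBuffer`, `total_len`, and the `String` error type are hypothetical names chosen for illustration.

```rust
/// Simplified local stand-in for a fallible, streaming iterator trait
/// (an assumption for illustration, not the crate's actual definition).
pub trait FallibleStreamingIterator {
    type Item: ?Sized;
    type Error;

    /// Moves to the next element; this step may fail.
    fn advance(&mut self) -> Result<(), Self::Error>;

    /// Borrows the current element, or `None` once exhausted.
    fn get(&self) -> Option<&Self::Item>;

    /// Advances, then borrows the new current element.
    fn next(&mut self) -> Result<Option<&Self::Item>, Self::Error> {
        self.advance()?;
        Ok(self.get())
    }
}

/// Hypothetical iterator that copies each chunk into one reused buffer.
pub struct ReusedBuffer {
    chunks: std::vec::IntoIter<Vec<u8>>,
    buffer: Vec<u8>,
    valid: bool,
}

impl ReusedBuffer {
    pub fn new(chunks: Vec<Vec<u8>>) -> Self {
        Self { chunks: chunks.into_iter(), buffer: Vec::new(), valid: false }
    }
}

impl FallibleStreamingIterator for ReusedBuffer {
    type Item = [u8];
    type Error = String;

    fn advance(&mut self) -> Result<(), String> {
        match self.chunks.next() {
            // Treat an empty chunk as a (made-up) decoding failure.
            Some(chunk) if chunk.is_empty() => {
                self.valid = false;
                Err("empty chunk".to_string())
            }
            Some(chunk) => {
                // Reuse the same allocation for every item.
                self.buffer.clear();
                self.buffer.extend_from_slice(&chunk);
                self.valid = true;
                Ok(())
            }
            None => {
                self.valid = false;
                Ok(())
            }
        }
    }

    fn get(&self) -> Option<&[u8]> {
        if self.valid { Some(&self.buffer) } else { None }
    }
}

/// Sums the lengths of all items, stopping at the first error.
pub fn total_len(chunks: Vec<Vec<u8>>) -> Result<usize, String> {
    let mut iter = ReusedBuffer::new(chunks);
    let mut total = 0;
    while let Some(item) = iter.next()? {
        total += item.len();
    }
    Ok(total)
}
```

Because each item borrows from the iterator, it cannot outlive the next call to `advance`. That constraint is what makes buffer reuse safe, and it is why such a trait cannot be expressed through the standard `Iterator`.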
Functions§
- Reads the column indexes of all `ColumnChunkMetaData` and deserializes them into `Index`. Returns an empty vector if indexes are not available
- Reads a `FileMetaData` from the reader, located at the end of the file.
- Asynchronously reads the file’s metadata
- Creates a new `ListArray` or `FixedSizeListArray`.
- Decompresses the page, using `buffer` for decompression. If `page.buffer.len() == 0`, there was no decompression and the buffer was moved. Else, decompression took place.
- Returns a `ColumnIterator` of column chunks corresponding to `field`.
- Returns all `ColumnChunkMetaData` associated to `field_name`. For non-nested parquet types, this returns a single column
- Returns all `ColumnChunkMetaData` associated to `field_name`. For non-nested parquet types, this returns a single column
- Creates a new iterator of compressed pages.
- Returns a stream of compressed data pages
- Initialize `NestedState` from `&[InitNested]`.
- Returns the number of (parquet) columns that a `DataType` contains.
- Reads all columns that are part of the parquet field `field_name`
- Reads all columns that are part of the parquet field `field_name`
- Returns a vector of iterators of `Array` corresponding to the top level parquet fields whose name matches `fields`’s names.
- Reads a parquet file’s metadata synchronously.
- Reads a parquet file’s metadata asynchronously.
- Read `PageLocation`s from the `ColumnChunkMetaData`s. Returns an empty vector if indexes are not available
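As background for the metadata readers above: in the Parquet format, an unencrypted file starts with the 4-byte magic `PAR1` and ends with the serialized file metadata, followed by that metadata's length as a 4-byte little-endian integer and the magic `PAR1` again. The sketch below locates the metadata bytes within an in-memory slice. It is an illustration only, not how this crate reads metadata; `footer_range` and its `String` errors are hypothetical, and decoding the Thrift-encoded metadata is out of scope.

```rust
use std::ops::Range;

/// Magic bytes at the start and end of an unencrypted Parquet file.
const MAGIC: &[u8; 4] = b"PAR1";

/// Hypothetical helper: returns the byte range of the serialized file
/// metadata inside `file`, based on the trailing length and magic.
pub fn footer_range(file: &[u8]) -> Result<Range<usize>, String> {
    let n = file.len();
    // Leading magic (4) + metadata length (4) + trailing magic (4).
    if n < 12 {
        return Err("file too small to be Parquet".to_string());
    }
    if !file.ends_with(MAGIC) {
        return Err("missing trailing PAR1 magic".to_string());
    }

    // The 4 bytes before the trailing magic hold the metadata length.
    let mut len_bytes = [0u8; 4];
    len_bytes.copy_from_slice(&file[n - 8..n - 4]);
    let len = u32::from_le_bytes(len_bytes) as usize;

    // The metadata ends where the length field begins and must not
    // overlap the leading magic.
    let end = n - 8;
    let start = end
        .checked_sub(len)
        .filter(|&s| s >= 4)
        .ok_or_else(|| "metadata length exceeds file size".to_string())?;
    Ok(start..end)
}
```

Reading from the end is why metadata can be fetched with a seek and a small read, without scanning the row groups first.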