APIs to read from Parquet format.
Re-exports§
pub use schema::infer_schema;
pub use parquet2::fallible_streaming_iterator;
Modules§
- API to perform page-level filtering (also known as indexes)
- APIs to handle Parquet <-> Arrow schemas.
- APIs exposing parquet2’s statistics as arrow’s statistics.
Structs§
- Metadata for a column chunk.
- A descriptor for leaf-level primitive columns. This encapsulates information such as definition and repetition levels and is used to re-assemble nested data.
- A CompressedDataPage is a compressed, encoded representation of a Parquet data page. It holds actual data and thus cloning it is expensive.
- Decompressor that allows re-using the page buffer of PageIterator.
- Metadata for a Parquet file.
- An iterator of Chunks coming from row groups of a parquet file.
- The state of nested data types.
- A fallible Iterator of CompressedDataPage. This iterator reads pages back to back until all pages have been consumed. The pages from this iterator always have None as their crate::page::CompressedDataPage::selected_rows(), since filter pushdown is not supported without a pre-computed page index.
- A MutStreamingIterator of pre-read column chunks.
- Metadata for a row group.
- An Iterator<Item=RowGroupDeserializer> from row groups of a parquet file.
- An iterator adapter over NestedArrayIter assumed to be encoded as Struct arrays.
Enums§
- The initial info of nested data types.
- A Page is an uncompressed, encoded representation of a Parquet page. It may hold actual data and thus cloning it may be expensive.
- Errors generated by this crate.
- Representation of a Parquet type describing primitive and nested fields, including the top-level schema of the parquet file.
- The set of all physical types representable in Parquet
- State of MutStreamingIterator.
Traits§
- A fallible, streaming iterator.
- A special kind of fallible streaming iterator where advance consumes the iterator.
- Trait describing a FallibleStreamingIterator of Page.
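To illustrate what "fallible, streaming" means in the traits above, here is a minimal sketch using only the standard library. The trait StreamingPages, the struct ChunkedPages, and the function page_lengths are hypothetical names invented for this example; they are not the traits exported by this module, only a simplified stand-in with a similar advance/get shape. The point is that each item is a borrow of an internal buffer that gets overwritten on the next advance, so pages can be visited without allocating per page.

```rust
/// Simplified stand-in for a fallible, streaming iterator.
/// Hypothetical trait for illustration; not the one re-exported by this module.
trait StreamingPages {
    type Item: ?Sized;
    type Error;

    /// Moves to the next item, which may fail.
    fn advance(&mut self) -> Result<(), Self::Error>;

    /// Borrows the current item, or `None` once the iterator is exhausted.
    fn get(&self) -> Option<&Self::Item>;

    /// Convenience: advance, then borrow the current item.
    fn next(&mut self) -> Result<Option<&Self::Item>, Self::Error> {
        self.advance()?;
        Ok(self.get())
    }
}

/// Yields fixed-size "pages" of a byte slice through one reusable buffer.
struct ChunkedPages<'a> {
    data: &'a [u8],
    page_size: usize,
    offset: usize,
    buffer: Vec<u8>,
    valid: bool,
}

impl<'a> ChunkedPages<'a> {
    fn new(data: &'a [u8], page_size: usize) -> Self {
        Self { data, page_size, offset: 0, buffer: Vec::new(), valid: false }
    }
}

impl<'a> StreamingPages for ChunkedPages<'a> {
    type Item = [u8];
    type Error = String;

    fn advance(&mut self) -> Result<(), String> {
        if self.page_size == 0 {
            return Err("page size must be non-zero".to_string());
        }
        if self.offset >= self.data.len() {
            // Exhausted: `get` returns `None` from now on.
            self.valid = false;
            return Ok(());
        }
        let end = (self.offset + self.page_size).min(self.data.len());
        // Reuse the same allocation for every page.
        self.buffer.clear();
        self.buffer.extend_from_slice(&self.data[self.offset..end]);
        self.offset = end;
        self.valid = true;
        Ok(())
    }

    fn get(&self) -> Option<&[u8]> {
        if self.valid { Some(&self.buffer) } else { None }
    }
}

/// Walks all pages and records their lengths.
fn page_lengths(data: &[u8], page_size: usize) -> Result<Vec<usize>, String> {
    let mut pages = ChunkedPages::new(data, page_size);
    let mut lengths = Vec::new();
    while let Some(page) = pages.next()? {
        lengths.push(page.len());
    }
    Ok(lengths)
}
```

Because the borrow returned by next ends before the following call, this shape cannot implement the standard Iterator trait, which is the usual reason for a separate streaming-iterator trait.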
Functions§
- Reads the column indexes of all ColumnChunkMetaData and deserializes them into Index. Returns an empty vector if indexes are not available.
- Reads a FileMetaData from the reader, located at the end of the file.
- Asynchronously reads the file’s metadata.
- Creates a new ListArray or FixedSizeListArray.
- Decompresses the page, using buffer for decompression. If page.buffer.len() == 0, there was no decompression and the buffer was moved. Else, decompression took place.
- Returns a ColumnIterator of column chunks corresponding to field.
- Returns all ColumnChunkMetaData associated with field_name. For non-nested parquet types, this returns a single column.
- Returns all ColumnChunkMetaData associated with field_name. For non-nested parquet types, this returns a single column.
- Creates a new iterator of compressed pages.
- Returns a stream of compressed data pages
- Initialize NestedState from &[InitNested].
- Returns the number of (parquet) columns that a DataType contains.
- Reads all columns that are part of the parquet field field_name.
- Reads all columns that are part of the parquet field field_name.
- Returns a vector of iterators of Array corresponding to the top level parquet fields whose names match those in fields.
- Reads parquets’ metadata synchronously.
- Reads parquets’ metadata asynchronously.
- Read PageLocations from the ColumnChunkMetaDatas. Returns an empty vector if indexes are not available.
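Several functions above read metadata "located at the end of the file". As a rough sketch of what that involves, the snippet below locates the footer using only the standard library, assuming the standard Parquet layout in which a file ends with a 4-byte little-endian metadata length followed by the magic bytes PAR1. The names read_footer_bytes and invalid are hypothetical helpers written for this example, not functions from this module, and the sketch stops at returning the raw footer bytes; the real readers go on to decode them into FileMetaData.

```rust
use std::io::{Error, ErrorKind, Read, Seek, SeekFrom};

/// Magic bytes that open and close a Parquet file.
const MAGIC: [u8; 4] = *b"PAR1";
/// Trailer size: 4-byte little-endian metadata length + 4-byte magic.
const FOOTER_SIZE: u64 = 8;

/// Hypothetical helper: builds an `InvalidData` I/O error.
fn invalid(msg: &str) -> Error {
    Error::new(ErrorKind::InvalidData, msg)
}

/// Hypothetical helper: returns the raw (still encoded) metadata bytes
/// stored just before the trailer at the end of the file.
fn read_footer_bytes<R: Read + Seek>(reader: &mut R) -> std::io::Result<Vec<u8>> {
    let file_len = reader.seek(SeekFrom::End(0))?;
    // A valid file holds at least the leading magic plus the trailer.
    if file_len < FOOTER_SIZE + MAGIC.len() as u64 {
        return Err(invalid("file too small to be Parquet"));
    }

    // Read the last 8 bytes: metadata length, then magic.
    reader.seek(SeekFrom::End(-(FOOTER_SIZE as i64)))?;
    let mut trailer = [0u8; 8];
    reader.read_exact(&mut trailer)?;
    if &trailer[4..] != &MAGIC[..] {
        return Err(invalid("missing PAR1 magic at end of file"));
    }

    let metadata_len =
        u32::from_le_bytes([trailer[0], trailer[1], trailer[2], trailer[3]]) as u64;
    if metadata_len + FOOTER_SIZE > file_len {
        return Err(invalid("metadata length exceeds file size"));
    }

    // The metadata sits immediately before the trailer.
    reader.seek(SeekFrom::End(-((metadata_len + FOOTER_SIZE) as i64)))?;
    let mut metadata = vec![0u8; metadata_len as usize];
    reader.read_exact(&mut metadata)?;
    Ok(metadata)
}
```

Any reader that implements Read and Seek works here, for example a std::fs::File or an in-memory std::io::Cursor over a byte vector. Reading from the end first is why the metadata functions require a seekable reader.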