pub(super) struct ParquetUploader {
desc: Arc<RelationDesc>,
next_file_index: usize,
key_manager: S3KeyManager,
batch: u64,
max_file_size: u64,
sdk_config: Arc<SdkConfig>,
row_group_size_bytes: u64,
arrow_builder_buffer_bytes: u64,
active_file: Option<ParquetFile>,
params: CopyToParameters,
}
A ParquetUploader that writes rows to parquet files and uploads them to S3.
Spawns all S3 operations in tokio tasks to avoid blocking the surrounding timely context.
§Buffering
There are several layers of buffering in this uploader:
- The uploader will hold a ParquetFile object after the first row is added. This ParquetFile holds an ArrowBuilder and an ArrowWriter.
- The ArrowBuilder builds a structure of in-memory mz_arrow_util::builder::ColBuilders from incoming mz_repr::Rows. Each mz_arrow_util::builder::ColBuilder holds a specific arrow::array::builder type for constructing a column of the given type. The entire ArrowBuilder is flushed to the ParquetFile’s ArrowWriter by converting it into an arrow::record_batch::RecordBatch once we’ve given it more than the configured arrow_builder_buffer_bytes.
- The ParquetFile holds an ArrowWriter that buffers until it has enough data to write a parquet ‘row group’. The ‘row group’ size is usually based on the number of rows (in the ArrowWriter), but we also force it to flush based on data-size (see below for more details).
- When a row group is written out, the active ParquetFile provides a reference to the row group buffer to its S3MultiPartUploader, which will copy the data to its own buffer. If this upload buffer exceeds the configured part size limit, the S3MultiPartUploader will upload parts to S3 until the upload buffer is below the limit.
- When the ParquetUploader is finished, it will flush the active ParquetFile, which will flush its ArrowBuilder and any open row groups to the S3MultiPartUploader and upload the remaining parts to S3.
      ┌───────────────┐
      │ mz_repr::Rows │
      └───────┬───────┘
┌─────────────│───────────────────────────────────────────┐
│             │               ParquetFile                 │
│ ┌───────────▼─────────────┐                             │
│ │      ArrowBuilder       │                             │
│ │                         │    ┌──────────────────┐     │
│ │     Vec<ArrowColumn>    │    │   ArrowWriter    │     │
│ │ ┌─────────┐ ┌─────────┐ │    │                  │     │
│ │ │         │ │         │ │    │   ┌──────────┐   │     │
│ │ │ColBuildr│ │ColBuildr│ ├────┼──►│  buffer  │   │     │
│ │ │         │ │         │ │    │   └─────┬────┘   │     │
│ │ └─────────┘ └─────────┘ │    │         │        │     │
│ │                         │    │   ┌─────▼────┐   │     │
│ └─────────────────────────┘    │   │ row group│   │     │
│                                │   └─┬────────┘   │     │
│                                │     │            │     │
│                                └─────┼────────────┘     │
│                               ┌──────┼────────────────┐ │
│                               │      │     S3MultiPart│ │
│                               │ ┌────▼─────┐ Uploader │ │
│                               │ │  buffer  │          │ │
│    ┌─────────┐                │ └───┬─────┬┘          │ │
│    │ S3 API  │◄───────────────┤     │     │           │ │
│    └─────────┘                │ ┌───▼──┐ ┌▼─────┐     │ │
│                               │ │ part │ │ part │     │ │
│                               │ └──────┘ └──────┘     │ │
│                               │                       │ │
│                               └───────────────────────┘ │
│                                                         │
└─────────────────────────────────────────────────────────┘
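The last buffering layer (copying flushed row groups into an upload buffer and draining it in fixed-size parts) can be sketched roughly as follows. This is a minimal illustration only: `PartBuffer`, its fields, and its methods are hypothetical names, not the real S3MultiPartUploader API, and "uploading" here just records the part size instead of calling S3.

```rust
/// Hypothetical stand-in for the upload buffer described above.
/// It is NOT the real `S3MultiPartUploader`; no S3 calls are made.
struct PartBuffer {
    /// Assumed part size limit in bytes (configurable in the real uploader).
    part_size_limit: usize,
    /// Bytes copied in from flushed row groups but not yet uploaded.
    buffer: Vec<u8>,
    /// Sizes of the parts "uploaded" so far.
    uploaded_parts: Vec<usize>,
}

impl PartBuffer {
    fn new(part_size_limit: usize) -> Self {
        assert!(part_size_limit > 0, "part size limit must be non-zero");
        PartBuffer {
            part_size_limit,
            buffer: Vec::new(),
            uploaded_parts: Vec::new(),
        }
    }

    /// Copy a flushed row group into the upload buffer, then emit full
    /// parts until the buffer is below the part size limit again.
    fn buffer_chunk(&mut self, row_group: &[u8]) {
        self.buffer.extend_from_slice(row_group);
        while self.buffer.len() >= self.part_size_limit {
            let part: Vec<u8> = self.buffer.drain(..self.part_size_limit).collect();
            // A real uploader would send `part` to S3 here.
            self.uploaded_parts.push(part.len());
        }
    }

    /// Upload whatever is left as the final (possibly short) part and
    /// return the total number of bytes uploaded.
    fn finish(&mut self) -> usize {
        if !self.buffer.is_empty() {
            let part = std::mem::take(&mut self.buffer);
            self.uploaded_parts.push(part.len());
        }
        self.uploaded_parts.iter().sum()
    }
}
```

With a 4-byte limit, buffering a 10-byte row group emits two 4-byte parts and leaves 2 bytes buffered until `finish` is called.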
§File Size & Buffer Sizes
We expose a ‘MAX FILE SIZE’ parameter to the user, but this is difficult to enforce exactly since we don’t know the exact size of the data we’re writing before a parquet row-group is flushed. This is because the encoded size of the data differs from the size of its in-memory representation, and because the data pages within each column in a row-group are compressed. We also don’t know the exact size of the parquet metadata that will be written to the file.
Therefore we don’t use the S3MultiPartUploader’s hard file size limit since it’s difficult to handle those errors after we’ve already flushed data to the ArrowWriter. Instead we implement a crude check ourselves.
This check aims to hit the max-size limit but may exceed it by some amount. To ensure that amount is small, we set the max row-group size to a configurable ratio (e.g. 20%) of the max_file_size. This determines how often we’ll flush a row-group, but is only an approximation since the actual size of the row-group is not known until it’s written. After each row-group is flushed, the size of the file is checked and if it’s exceeded max-file-size a new file is started.
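A minimal sketch of the check described above, assuming a hypothetical `FileSizeTracker` type (the name, fields, and method are illustrative and do not come from the uploader itself). It shows why the overshoot is bounded by roughly one row group: the size is only compared against the limit after a row group has already been written.

```rust
/// Hypothetical tracker for the "check after each row group" rule.
struct FileSizeTracker {
    /// The user-facing maximum file size, in bytes.
    max_file_size: u64,
    /// Encoded bytes written to the current file so far.
    current_file_bytes: u64,
    /// Number of files started, including the current one.
    files_started: usize,
}

impl FileSizeTracker {
    fn new(max_file_size: u64) -> Self {
        FileSizeTracker {
            max_file_size,
            current_file_bytes: 0,
            files_started: 1,
        }
    }

    /// Record a flushed row group of `encoded_bytes`. If the file has now
    /// exceeded the limit, close it and start a new one, returning the
    /// final size of the closed file.
    fn on_row_group_flushed(&mut self, encoded_bytes: u64) -> Option<u64> {
        self.current_file_bytes += encoded_bytes;
        if self.current_file_bytes > self.max_file_size {
            let closed_size = self.current_file_bytes;
            self.current_file_bytes = 0;
            self.files_started += 1;
            Some(closed_size)
        } else {
            None
        }
    }
}
```

For example, with a limit of 100 bytes and row groups of 20 bytes (the 20% ratio mentioned above), the sixth flush closes the file at 120 bytes, one row group over the limit.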
We also set the max ArrowBuilder buffer size to a ratio (e.g. 150%) of the row-group size to avoid the ArrowWriter buffering too much data itself before flushing a row-group. We’re aiming for the encoded & compressed size of the ArrowBuilder data to be roughly equal to the row-group size, but this is only an approximation.
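The relationship between the three sizes can be sketched as below. The 20% and 150% constants are just the example figures quoted above; the real ratios are configurable and may differ. `BufferSizes`, `percent_of`, and `derive_buffer_sizes` are hypothetical names introduced for this illustration.

```rust
/// Illustrative ratios taken from the "e.g." figures in the text.
/// The real values are configurable and may differ.
const ROW_GROUP_PERCENT_OF_FILE: u64 = 20;
const BUILDER_PERCENT_OF_ROW_GROUP: u64 = 150;

/// Hypothetical container for the two derived limits.
#[derive(Debug, PartialEq)]
struct BufferSizes {
    row_group_size_bytes: u64,
    arrow_builder_buffer_bytes: u64,
}

/// Scale `value` by `percent` / 100. The arithmetic is widened to u128 so a
/// very large `max_file_size` cannot overflow; the result is clamped to u64.
fn percent_of(value: u64, percent: u64) -> u64 {
    let scaled = u128::from(value) * u128::from(percent) / 100;
    scaled.min(u128::from(u64::MAX)) as u64
}

/// Derive the row-group and ArrowBuilder buffer limits from the max file size.
fn derive_buffer_sizes(max_file_size: u64) -> BufferSizes {
    let row_group_size_bytes = percent_of(max_file_size, ROW_GROUP_PERCENT_OF_FILE);
    let arrow_builder_buffer_bytes =
        percent_of(row_group_size_bytes, BUILDER_PERCENT_OF_ROW_GROUP);
    BufferSizes {
        row_group_size_bytes,
        arrow_builder_buffer_bytes,
    }
}
```

Under these assumed ratios, a 1,000,000,000-byte max file size yields a 200,000,000-byte row-group target and a 300,000,000-byte ArrowBuilder buffer. Because both limits scale linearly with the max file size, a very large setting inflates the in-memory buffers too, which is the concern raised in the TODO below.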
TODO: We may want to consider adding additional limits to the buffer sizes to avoid memory issues if the user sets the max file size to be very large.
Fields§
desc: Arc<RelationDesc>
The output description.
next_file_index: usize
The index of the next file to upload within the batch.
key_manager: S3KeyManager
Provides the appropriate bucket and object keys to use for uploads.
batch: u64
Identifies the batch that files uploaded by this uploader belong to.
max_file_size: u64
The desired file size. A new file upload will be started when the size exceeds this amount.
sdk_config: Arc<SdkConfig>
The aws sdk config.
row_group_size_bytes: u64
arrow_builder_buffer_bytes: u64
active_file: Option<ParquetFile>
The active parquet file being written to, stored in an option since it won’t be initialized until the builder is first flushed, and to make it easier to take ownership when calling in spawned tokio tasks (to avoid doing I/O in the surrounding timely context).
params: CopyToParameters
Upload and buffer params
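The reason given for storing active_file in an Option (making it easy to take ownership before handing work to a spawned task) can be illustrated with a small sketch. It uses `std::thread::spawn` as a stand-in for a tokio task so the example has no external dependencies; `FileHandle`, `Uploader`, and `flush_in_background` are hypothetical names, not part of the real type.

```rust
use std::thread;

/// Hypothetical stand-in for `ParquetFile`.
#[derive(Debug, PartialEq)]
struct FileHandle {
    bytes_written: u64,
}

/// Hypothetical stand-in for the uploader; only the `Option` field matters here.
struct Uploader {
    active_file: Option<FileHandle>,
}

impl Uploader {
    /// Move the active file into a background worker, do the "I/O" there,
    /// then put the file back. Returns `None` if no file is active.
    fn flush_in_background(&mut self, extra_bytes: u64) -> Option<u64> {
        // `take()` moves the file out and leaves `None` behind, so the
        // spawned closure can own it without borrowing `self`.
        let mut file = self.active_file.take()?;
        let worker = thread::spawn(move || {
            // Stand-in for real I/O performed off the calling thread.
            file.bytes_written += extra_bytes;
            file
        });
        let file = worker.join().expect("worker thread panicked");
        let written = file.bytes_written;
        self.active_file = Some(file);
        Some(written)
    }
}
```

A spawned task generally requires a `'static` closure, so it cannot hold a `&mut` borrow of a field; moving the value out with `take()` and restoring it afterwards sidesteps that restriction.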
Implementations§
impl ParquetUploader
async fn start_new_file(&mut self) -> Result<&mut ParquetFile, Error>
Start a new parquet file for upload. Will finish the current file if one is active.
Trait Implementations§
impl CopyToS3Uploader for ParquetUploader
fn new(sdk_config: SdkConfig, connection_details: S3UploadInfo, sink_id: &GlobalId, batch: u64, params: CopyToParameters) -> Result<ParquetUploader, Error>
Auto Trait Implementations§
impl Freeze for ParquetUploader
impl !RefUnwindSafe for ParquetUploader
impl Send for ParquetUploader
impl !Sync for ParquetUploader
impl Unpin for ParquetUploader
impl !UnwindSafe for ParquetUploader
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
impl<T> FmtForward for T
impl<T> FutureExt for T
impl<T> Instrument for T
impl<T> IntoRequest<T> for T
impl<T, U> OverrideFrom<Option<&T>> for U where U: OverrideFrom<T>
impl<T> Pipe for T where T: ?Sized
impl<T> Pointable for T
impl<P, R> ProtoType<R> for P where R: RustType<P>
impl<'a, S, T> Semigroup<&'a S> for T where T: Semigroup<S>
impl<T> Tap for T