Uploads a consolidated collection to S3
Modules
Structs
- dyncfg parameters for the `copy_to` operator, stored in this struct to avoid requiring storage to depend on the compute crate. See `src/compute-types/src/dyncfgs.rs` for the definition of these parameters.
Traits
- This trait is used to abstract over the upload details for different file formats. Each format has its own buffering semantics and upload logic: some can be written in a streaming fashion, row by row, whereas others use a columnar format that requires buffering a batch of rows before writing to S3.
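The buffering contrast described above can be sketched as a trait. This is a hedged, self-contained illustration: the names (`Uploader`, `buffer_row`, `finish`, `CsvUploader`) are hypothetical stand-ins, not the crate's actual API, and the "upload" here just accumulates bytes in memory.

```rust
/// Illustrative abstraction over per-format upload details.
/// A streaming format can emit bytes per row; a columnar format
/// would instead accumulate a batch inside `buffer_row` and only
/// encode it in `finish`.
trait Uploader {
    /// Buffer one row of the collection.
    fn buffer_row(&mut self, row: &[&str]);
    /// Flush any buffered data, returning the bytes that would be
    /// written to S3 and the number of rows uploaded.
    fn finish(self) -> (Vec<u8>, u64);
}

/// A row-by-row (streaming) CSV uploader: each row is encoded
/// immediately, so nothing needs to be batched.
struct CsvUploader {
    out: Vec<u8>,
    rows: u64,
}

impl Uploader for CsvUploader {
    fn buffer_row(&mut self, row: &[&str]) {
        self.out.extend_from_slice(row.join(",").as_bytes());
        self.out.push(b'\n');
        self.rows += 1;
    }
    fn finish(self) -> (Vec<u8>, u64) {
        (self.out, self.rows)
    }
}

fn main() {
    let mut u = CsvUploader { out: Vec::new(), rows: 0 };
    u.buffer_row(&["1", "a"]);
    u.buffer_row(&["2", "b"]);
    let (bytes, n) = u.finish();
    assert_eq!(n, 2);
    assert_eq!(bytes, b"1,a\n2,b\n");
    println!("rows uploaded: {n}");
}
```

A columnar implementation of the same trait would buffer rows in `buffer_row` and defer all encoding to `finish`, which is why the abstraction puts the flush point behind a method rather than writing on every row.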
Functions
- Copy the rows from the input collection to S3. `worker_callback` is used to send the final count of rows uploaded to S3, or an error message if the operator failed. This is per-worker, and these responses are aggregated upstream by the compute client. `sink_id` is used to identify the sink for logging purposes and as a unique prefix for files created by the sink.
- Renders the ‘completion’ operator, which expects a `completion_stream` that it reads over a Pipeline edge such that it receives a single completion event per worker, then forwards this result to the `worker_callback` after any cleanup work (see below).
- Renders the ‘initialization’ operator, which does work on the leader worker only.
- Renders the upload operator, which waits on the `start_stream` to ensure initialization is complete and then handles the uploads to S3. Returns a `completion_stream` containing one event per worker with the result of the upload operation: either an error or the number of rows uploaded by that worker.
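The shape of this pipeline can be sketched without timely dataflow, using plain threads and a channel: each worker emits exactly one completion event, and a completion stage folds them into a single result. This is a hedged simulation only; `run_pipeline` and the fixed per-worker row counts are invented for illustration, and the aggregation that the compute client performs upstream is done inline here for simplicity.

```rust
use std::sync::mpsc;
use std::thread;

/// Simulates the upload + completion flow: each worker performs its
/// share of the "upload" and emits one completion event; the
/// completion stage aggregates them into a single result, as the
/// compute client would for the per-worker callbacks.
fn run_pipeline(workers: usize) -> Result<u64, String> {
    // Analogue of the completion_stream: exactly one event per worker,
    // either a row count or an error message.
    let (tx, rx) = mpsc::channel::<Result<u64, String>>();

    for worker in 0..workers {
        let tx = tx.clone();
        thread::spawn(move || {
            // Stand-in for the per-worker S3 upload work.
            let rows_uploaded = (worker as u64 + 1) * 10;
            tx.send(Ok(rows_uploaded)).unwrap();
        });
    }
    // Drop the original sender so the receiver ends after all workers report.
    drop(tx);

    // 'Completion' analogue: fold one event per worker into a final
    // result, which would then be handed to the worker_callback.
    let mut total = 0;
    for event in rx {
        total += event?;
    }
    Ok(total)
}

fn main() {
    // 4 workers uploading 10, 20, 30, and 40 rows respectively.
    assert_eq!(run_pipeline(4), Ok(100));
    println!("total rows uploaded: 100");
}
```

The one-event-per-worker invariant is what makes the aggregation safe to terminate: once every worker has reported, the channel (like the `completion_stream`) is exhausted and the final result can be forwarded.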