Module dataflow::source::s3
Functionality for creating S3 sources
This source is constructed as a collection of Tokio tasks that communicate over local (worker-pinned) queues to send data into dataflow. We spin up a single “downloader” task, which is responsible for downloading S3 objects and shuffling their data into dataflow. Then, for each object name source, we spin up another task, which collects object names from that source and sends them to the downloader.
+----------------+
| bucket scanner +-                                -------
+----------------+ \---                          -/       \-
+----------------+     \--   +------------+     /           \
|  sqs listener  +--------X->| downloader +--->|  dataflow   |
+----------------+     /--   +------------+     \           /
 .  .  .  .        /--                           -\       /-
  etc  .  .     --                                 -------
 .  .  .  .
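The fan-in layout described above (many object name sources feeding one downloader, which feeds dataflow) can be sketched with standard-library threads and channels. This is only an illustration of the shape, not the module's code: the real source uses Tokio tasks and worker-pinned queues, whereas this sketch swaps in `std::thread` and `std::sync::mpsc` so that it has no external dependencies. The names `ObjectKey` and `run_pipeline` are hypothetical, and the "download" is simulated with a formatted string.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical message type: the key of an object to download.
type ObjectKey = String;

/// Runs one producer thread per object name source and a single downloader
/// thread that consumes every key. Returns the simulated payloads, sorted,
/// standing in for the data that would flow into dataflow.
fn run_pipeline(name_sources: Vec<Vec<ObjectKey>>) -> Vec<String> {
    // Queue from the object name sources to the downloader.
    let (key_tx, key_rx) = mpsc::channel::<ObjectKey>();
    // Queue from the downloader to the consumer (stand-in for dataflow).
    let (data_tx, data_rx) = mpsc::channel::<String>();

    // The single downloader: receives keys until every sender is dropped.
    let downloader = thread::spawn(move || {
        for key in key_rx {
            // Assumption: a real downloader would fetch the object from S3 here.
            let payload = format!("contents of {}", key);
            if data_tx.send(payload).is_err() {
                break; // the consumer went away
            }
        }
    });

    // One producer per object name source (bucket scanner, SQS listener, ...).
    let producers: Vec<_> = name_sources
        .into_iter()
        .map(|keys| {
            let tx = key_tx.clone();
            thread::spawn(move || {
                for key in keys {
                    if tx.send(key).is_err() {
                        break; // the downloader went away
                    }
                }
            })
        })
        .collect();

    // Drop the original sender so the downloader stops once producers finish.
    drop(key_tx);

    for producer in producers {
        producer.join().expect("producer thread panicked");
    }
    downloader.join().expect("downloader thread panicked");

    // The downloader has exited, so its sender is gone and this iterator ends.
    let mut out: Vec<String> = data_rx.iter().collect();
    out.sort();
    out
}
```

Funnelling every key through one downloader keeps the download logic in a single place, while each name source remains free to discover objects at its own pace.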
Modules
Implementation of deserialization of AWS S3 Bucket notifications
Structs
Number of records this source has downloaded
Information required to load data from S3
Enums
Constants
Size of data chunks we send to dataflow
Functions
Find the unambiguous prefix of a glob
Send the relevant parts of the message to the download objects task
Set the SQS visibility timeout back to zero, allowing the messages to be sent to other clients
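As an illustration of what "the unambiguous prefix of a glob" could mean, here is a minimal sketch that returns the literal text preceding the first glob metacharacter. Such a prefix is useful because S3 list requests can be narrowed by a key prefix before any pattern matching happens on the client. The function name `unambiguous_prefix` and the set of metacharacters are assumptions made for this example; the module's actual function may treat escapes and character classes differently.

```rust
/// Returns the longest leading portion of `glob` that contains no glob
/// metacharacters, so it can be used as a literal key prefix.
/// Hypothetical helper: not the module's actual implementation.
fn unambiguous_prefix(glob: &str) -> &str {
    // Assumption: these characters are the ones that start glob syntax.
    const META: [char; 5] = ['*', '?', '[', '{', '\\'];
    match glob.find(|c: char| META.contains(&c)) {
        // Everything before the first metacharacter is unambiguous.
        Some(idx) => &glob[..idx],
        // No metacharacters: the whole pattern is a literal key.
        None => glob,
    }
}
```

For example, under these assumptions the pattern `logs/2021/*.json` yields the prefix `logs/2021/`, and a pattern that begins with `*` yields an empty prefix, meaning the whole bucket has to be scanned.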