Module dataflow::source::s3

Functionality for creating S3 sources

This source is constructed as a collection of Tokio tasks that communicate over local (worker-pinned) queues to send data into dataflow. We spin up a single “downloader” task, which is responsible for downloading S3 objects and shuffling the data into dataflow. Then, for each object name source (for example a bucket scanner or an SQS listener), we spin up another task that collects object names from that source and sends them to the downloader.

+----------------+
| bucket scanner +-                               -------
+----------------+ \---                         -/       \-
+----------------+     \--   +------------+    /           \
| sqs listener   +--------X->| downloader +--->| dataflow  |
+----------------+     /--   +------------+    \           /
       .  .  .  .   /--                         -\       /-
      etc .  .  . --                              -------
       .  .  .  .
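
The fan-in shape in the diagram can be sketched roughly as follows. This is an illustration only, not the module's implementation: it substitutes standard-library threads and `std::sync::mpsc` channels for Tokio tasks and worker-pinned queues, fakes the download step, and every name in it (`spawn_name_source`, `spawn_downloader`, `run_pipeline`) is hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for an object name source (bucket scanner,
/// SQS listener, ...): forwards each object name to the downloader.
fn spawn_name_source(names: Vec<String>, tx: mpsc::Sender<String>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for name in names {
            // Stop early if the downloader has gone away.
            if tx.send(name).is_err() {
                break;
            }
        }
    })
}

/// Hypothetical downloader: receives object names from every source,
/// "downloads" each object, and forwards the bytes towards dataflow.
fn spawn_downloader(
    rx: mpsc::Receiver<String>,
    dataflow_tx: mpsc::Sender<Vec<u8>>,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        // The loop ends once every name sender has been dropped.
        for name in rx {
            // Placeholder for a real S3 download.
            let data = format!("contents of {}", name).into_bytes();
            if dataflow_tx.send(data).is_err() {
                break;
            }
        }
    })
}

/// Wire several name sources into one downloader and collect what
/// would be handed to dataflow.
fn run_pipeline(sources: Vec<Vec<String>>) -> Vec<Vec<u8>> {
    let (name_tx, name_rx) = mpsc::channel();
    let (dataflow_tx, dataflow_rx) = mpsc::channel();

    let downloader = spawn_downloader(name_rx, dataflow_tx);
    let handles: Vec<_> = sources
        .into_iter()
        .map(|names| spawn_name_source(names, name_tx.clone()))
        .collect();
    // Drop the original sender so the downloader can finish once all
    // sources are done.
    drop(name_tx);

    // Ends when the downloader exits and drops its dataflow sender.
    let received: Vec<Vec<u8>> = dataflow_rx.iter().collect();
    for handle in handles {
        handle.join().unwrap();
    }
    downloader.join().unwrap();
    received
}
```

Calling `run_pipeline` with two name lists yields one payload per object name, in whatever order the sources happened to interleave. Routing everything through a single downloader keeps the download logic in one place regardless of how many name sources are attached.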

Modules

Implementation of deserialization of AWS S3 Bucket notifications

Structs

Number of records this source has downloaded

Information required to load data from S3

Enums

Current dataflow status

Constants

Size of data chunks we send to dataflow

Functions

Find the unambiguous prefix of a glob

Send the relevant parts of the message to the download objects task

Set the SQS visibility timeout back to zero, allowing the messages to be sent to other clients
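
As an illustration of what "the unambiguous prefix of a glob" could mean, the sketch below returns the leading part of a pattern that contains no glob metacharacters. It is a guess at the idea, not the function documented here: the name `literal_prefix` and the chosen metacharacter set (`*`, `?`, `[`, `{`) are assumptions, and backslash escapes are not handled. Such a prefix is useful because S3 list requests accept a key prefix, which narrows the listing before any glob matching happens.

```rust
/// Hypothetical helper: the longest leading slice of `glob` that
/// contains none of the assumed metacharacters `*`, `?`, `[`, `{`.
fn literal_prefix(glob: &str) -> &str {
    let end = glob
        .find(|c: char| matches!(c, '*' | '?' | '[' | '{'))
        .unwrap_or(glob.len());
    // `find` returns a byte offset at a char boundary, so slicing is safe.
    &glob[..end]
}
```

Under these assumptions, `literal_prefix("logs/2021/*.json")` gives `"logs/2021/"`, and a pattern with no metacharacters is returned unchanged.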

Type Definitions