Module mz_environmentd::telemetry

Telemetry collection.

The reporting loop collects two types of telemetry data at a regular interval:

  • Statistics, which represent aggregated activity since the last reporting interval. An example of a statistic is “number of SUBSCRIBE queries executed in the last reporting interval.”

  • Traits, which represent the properties of the environment at the time of reporting. An example of a trait is “number of currently active SUBSCRIBE queries.”
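
As a rough illustration of the distinction, the sketch below derives one statistic and one trait from hypothetical state. The `Counters`, `Snapshot`, and `Report` types and the `build_report` function are assumptions made for this example, not items from this module: the statistic is the difference between a cumulative counter now and at the previous report, while the trait is simply read at reporting time.

```rust
/// Hypothetical cumulative counters (illustrative names, not from this module).
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct Counters {
    /// Total SUBSCRIBE queries executed since boot.
    pub subscribes_total: u64,
}

/// Hypothetical point-in-time view of the environment.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Snapshot {
    pub counters: Counters,
    /// SUBSCRIBE queries that are active right now.
    pub active_subscribes: u64,
}

/// What one reporting interval would carry in this sketch.
#[derive(Debug, PartialEq)]
pub struct Report {
    /// Statistic: activity since the previous report.
    pub subscribes: u64,
    /// Trait: state at the time of reporting.
    pub active_subscribes: u64,
}

/// Builds a report from the counters captured at the previous report
/// and the current snapshot.
pub fn build_report(prev: &Counters, now: &Snapshot) -> Report {
    Report {
        // A statistic is a delta between two cumulative readings.
        subscribes: now
            .counters
            .subscribes_total
            .saturating_sub(prev.subscribes_total),
        // A trait is copied from the current state unchanged.
        active_subscribes: now.active_subscribes,
    }
}
```

With a previous total of 100 and a current total of 123, the reported `subscribes` statistic would be 23, regardless of how many subscribes are active at that moment.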

The reporting loop makes two Segment API calls each interval:

  • A group API call to report traits. The traits are scoped to the environment’s cloud provider and region, as in:

    {
        "aws": {
            "us-east-1": {
                "active_subscribes": 2,
                ...
            }
        }
    }
    

    Downstream tools often flatten these traits into, e.g., aws_us_east_1_active_subscribes.

  • A track API call for the “Environment Rolled Up” event, containing both statistics and traits as the event properties, as in:

    {
        "cloud_provider": "aws",
        "cloud_provider_region": "us-east-1",
        "active_subscribes": 1,
        "subscribes": 23,
        ...
    }
    

    This event is only emitted after the first reporting interval has completed, since at boot all statistics will be zero.
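
The two payload shapes above can be sketched with standard-library types. This is a minimal sketch under stated assumptions: `scope_traits`, `flatten_traits`, `RolledUpEvent`, and `rolled_up_event` are hypothetical names, the flattening rule (join with underscores, replace hyphens) is inferred from the `aws_us_east_1_active_subscribes` example, and a boolean flag stands in for however the real loop tracks the first interval.

```rust
use std::collections::BTreeMap;

/// Trait name -> value (only numeric traits, to keep the sketch small).
pub type Traits = BTreeMap<String, u64>;
/// Cloud provider -> region -> traits, mirroring the group payload.
pub type GroupTraits = BTreeMap<String, BTreeMap<String, Traits>>;

/// Nests traits under provider and region, as in the group call.
pub fn scope_traits(provider: &str, region: &str, traits: Traits) -> GroupTraits {
    let mut by_region = BTreeMap::new();
    by_region.insert(region.to_string(), traits);
    let mut by_provider = BTreeMap::new();
    by_provider.insert(provider.to_string(), by_region);
    by_provider
}

/// Flattens nested traits the way a downstream tool might (assumed rule).
pub fn flatten_traits(group: &GroupTraits) -> BTreeMap<String, u64> {
    let mut flat = BTreeMap::new();
    for (provider, regions) in group {
        for (region, traits) in regions {
            for (name, value) in traits {
                let key = format!("{provider}_{region}_{name}").replace('-', "_");
                flat.insert(key, *value);
            }
        }
    }
    flat
}

/// Flat properties of the "Environment Rolled Up" track event.
#[derive(Debug, PartialEq)]
pub struct RolledUpEvent {
    pub cloud_provider: String,
    pub cloud_provider_region: String,
    pub active_subscribes: u64,
    pub subscribes: u64,
}

/// Returns `None` until the first interval has completed, because the
/// statistics would all be zero at boot.
pub fn rolled_up_event(
    first_interval_done: bool,
    provider: &str,
    region: &str,
    active_subscribes: u64,
    subscribes: u64,
) -> Option<RolledUpEvent> {
    if !first_interval_done {
        return None;
    }
    Some(RolledUpEvent {
        cloud_provider: provider.to_string(),
        cloud_provider_region: region.to_string(),
        active_subscribes,
        subscribes,
    })
}
```

Note that the group payload nests the provider and region as keys, while the track event carries them as ordinary properties next to the statistics and traits.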

Traits are included in both the group and track API calls because downstream tools want easy answers to both of these questions:

  1. What is the latest state of the environment?
  2. What was the state of this environment in this time window?

Answering question 2 requires that we periodically report statistics and traits in a track call. Strictly speaking, the track event could be used to answer question 1 too (look for the latest “Environment Rolled Up” event), but in practice it is often far more convenient to have the latest state available as a property of the environment.
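
To make the trade-off concrete, the following sketch answers both questions from a list of track events alone. The `RolledUp` struct, its `timestamp` field, and the `in_window` and `latest` helpers are hypothetical and exist only for this illustration; real downstream tools will have their own query mechanisms.

```rust
/// Hypothetical stored "Environment Rolled Up" event.
#[derive(Clone, Debug, PartialEq)]
pub struct RolledUp {
    /// Assumed event time, in seconds since some epoch.
    pub timestamp: u64,
    pub active_subscribes: u64,
    pub subscribes: u64,
}

/// Question 2: events whose timestamp falls in `[start, end)`.
pub fn in_window(events: &[RolledUp], start: u64, end: u64) -> Vec<&RolledUp> {
    events
        .iter()
        .filter(|e| e.timestamp >= start && e.timestamp < end)
        .collect()
}

/// Question 1, answered from events only: scan for the most recent one.
pub fn latest(events: &[RolledUp]) -> Option<&RolledUp> {
    events.iter().max_by_key(|e| e.timestamp)
}
```

The `latest` helper works, but it requires scanning the event history on every lookup. Reporting traits through the group call lets downstream tools read the latest state directly as a property of the environment.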

Structs

  • Telemetry configuration.

Constants

Functions