Struct mz_persist_client::cfg::PersistConfig
pub struct PersistConfig {
pub(crate) build_version: Version,
pub hostname: String,
pub now: NowFn,
pub(crate) dynamic: Arc<DynamicConfig>,
pub compaction_enabled: bool,
pub compaction_concurrency_limit: usize,
pub compaction_queue_size: usize,
pub consensus_connection_pool_max_size: usize,
pub writer_lease_duration: Duration,
pub reader_lease_duration: Duration,
pub critical_downgrade_interval: Duration,
pub pubsub_connect_attempt_timeout: Duration,
pub pubsub_connect_max_backoff: Duration,
pub pubsub_client_sender_channel_size: usize,
pub pubsub_client_receiver_channel_size: usize,
pub pubsub_server_connection_channel_size: usize,
pub pubsub_state_cache_shard_ref_channel_size: usize,
}
The tunable knobs for persist.
Tuning inputs:
- A larger blob_target_size (capped at KEY_VAL_DATA_MAX_LEN) results in fewer entries in consensus state. Before we have compaction and/or incremental state, it is already growing without bound, so this is a concern. OTOH, for any “reasonable” size (> 100MiB?) of blob_target_size, it seems we’d end up with a pretty tremendous amount of data in the shard before this became a real issue.
- A larger blob_target_size results in fewer S3 operations, which are charged per operation. (Hmm, maybe not if we’re charged per call in a multipart op. The S3Blob impl already chunks things at 8MiB.)
- A smaller blob_target_size will result in more even memory usage in readers.
- A larger batch_builder_max_outstanding_parts increases throughput (to a point).
- A smaller batch_builder_max_outstanding_parts provides a bound on the amount of memory used by a writer.
- A larger compaction_heuristic_min_inputs means state size is larger.
- A smaller compaction_heuristic_min_inputs means more compactions happen (higher write amp).
- A larger compaction_heuristic_min_updates means more consolidations are discovered while reading a snapshot (higher read amp and higher space amp).
- A smaller compaction_heuristic_min_updates means more compactions happen (higher write amp).
Tuning logic:
- blob_target_size was initially selected to be an exact multiple of 8MiB (the s3 multipart size) that was in the same neighborhood as our initial max throughput (~250MiB).
- batch_builder_max_outstanding_parts was initially selected to be as small as possible without harming pipelining. 0 means no pipelining, 1 is full pipelining as long as generating data takes less time than writing to s3 (hopefully a fair assumption), 2 is a little extra slop on top of 1.
- compaction_heuristic_min_inputs was set by running the open-loop benchmark with batches of size 10,240 bytes (selected to be small but such that the overhead of our columnar encoding format was less than 10%) and manually increased until the write amp stopped going down. This becomes much less important once we have incremental state. The initial value is a placeholder and should be revisited at some point.
- compaction_heuristic_min_updates was set via a thought experiment. This is an O(n*log(n)) upper bound on the number of unconsolidated updates that would be consolidated if we compacted as the in-mem Spine does. The initial value is a placeholder and should be revisited at some point.
TODO: Move these tuning notes into SessionVar descriptions once we have SystemVars for most of these.
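The size arithmetic in the tuning logic above can be checked with a short, self-contained sketch. It uses the value of DEFAULT_BLOB_TARGET_SIZE listed further down this page and the 8MiB multipart chunk size quoted in the notes. The helper multipart_chunks and the constant names MIB and MULTIPART_CHUNK are hypothetical and not part of the crate.

```rust
/// One mebibyte in bytes.
const MIB: usize = 1024 * 1024;
/// Multipart chunk size quoted in the tuning notes (8MiB).
const MULTIPART_CHUNK: usize = 8 * MIB;
/// Value listed on this page for PersistConfig::DEFAULT_BLOB_TARGET_SIZE.
const DEFAULT_BLOB_TARGET_SIZE: usize = 134_217_728;

/// Hypothetical helper: number of multipart chunks needed for a blob of
/// `blob_size` bytes, rounding up.
fn multipart_chunks(blob_size: usize) -> usize {
    (blob_size + MULTIPART_CHUNK - 1) / MULTIPART_CHUNK
}

fn main() {
    // The default is 128MiB, an exact multiple of the 8MiB chunk size.
    assert_eq!(DEFAULT_BLOB_TARGET_SIZE, 128 * MIB);
    assert_eq!(DEFAULT_BLOB_TARGET_SIZE % MULTIPART_CHUNK, 0);
    println!(
        "a full-size blob spans {} multipart chunks",
        multipart_chunks(DEFAULT_BLOB_TARGET_SIZE)
    );
}
```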
Fields
build_version: Version
Info about which version of the code is running.
hostname: String
Hostname of this persist user. Stored in state and used for debugging.
now: NowFn
A clock to use for all leasing and other non-debugging use.
dynamic: Arc<DynamicConfig>
Configurations that can be dynamically updated.
compaction_enabled: bool
Whether to physically and logically compact batches in blob storage.
compaction_concurrency_limit: usize
In Compactor::compact_and_apply_background, the maximum number of concurrent compaction requests that can execute for a given shard.
compaction_queue_size: usize
In Compactor::compact_and_apply_background, the maximum number of pending compaction requests to queue.
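As a rough illustration of what a bounded request queue implies, the sketch below models compaction_queue_size with a standard-library bounded channel. It assumes that requests arriving while the queue is full are dropped; this page does not state what Compactor actually does on overflow, and the function enqueue_all is hypothetical.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Hypothetical model: tries to enqueue every request into a queue holding at
/// most `queue_size` pending items. Returns (queued, dropped).
fn enqueue_all(queue_size: usize, requests: &[u32]) -> (Vec<u32>, Vec<u32>) {
    let (tx, rx) = sync_channel::<u32>(queue_size);
    let mut dropped = Vec::new();
    for &req in requests {
        match tx.try_send(req) {
            Ok(()) => {}
            // Assumption: a full queue means the request is dropped, not retried.
            Err(TrySendError::Full(r)) | Err(TrySendError::Disconnected(r)) => dropped.push(r),
        }
    }
    // Close the sending side so draining the receiver terminates.
    drop(tx);
    let queued: Vec<u32> = rx.iter().collect();
    (queued, dropped)
}

fn main() {
    // Nothing consumes the queue here, so only the first two requests fit.
    let (queued, dropped) = enqueue_all(2, &[1, 2, 3, 4]);
    println!("queued: {:?}, dropped: {:?}", queued, dropped);
}
```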
consensus_connection_pool_max_size: usize
The maximum size of the connection pool to Postgres/CRDB when performing consensus reads and writes.
writer_lease_duration: Duration
Length of time after a writer’s last operation after which the writer may be expired.
reader_lease_duration: Duration
Length of time after a reader’s last operation after which the reader may be expired.
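The lease semantics described for writer_lease_duration and reader_lease_duration can be sketched as a simple elapsed-time check. This is a minimal model, not the crate's implementation: it assumes timestamps are milliseconds since the Unix epoch and that expiry becomes possible strictly after the lease has elapsed. The function lease_expired and the 60 second lease are hypothetical.

```rust
use std::time::Duration;

/// Hypothetical check: true once more than `lease` has elapsed since the
/// handle's last operation. Timestamps are epoch milliseconds (an assumption).
fn lease_expired(last_op_ms: u64, now_ms: u64, lease: Duration) -> bool {
    // saturating_sub guards against a clock reading earlier than the last op.
    let elapsed = Duration::from_millis(now_ms.saturating_sub(last_op_ms));
    elapsed > lease
}

fn main() {
    // Illustrative lease only; this page does not list the real default.
    let lease = Duration::from_secs(60);
    println!("after 29s: {}", lease_expired(1_000, 30_000, lease));
    println!("after 61s: {}", lease_expired(1_000, 62_000, lease));
}
```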
critical_downgrade_interval: Duration
Length of time between critical handles’ calls to downgrade since
pubsub_connect_attempt_timeout: Duration
Timeout per connection attempt to Persist PubSub service.
pubsub_connect_max_backoff: Duration
Maximum backoff when retrying connection establishment to Persist PubSub service.
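One common way to apply a maximum backoff is capped exponential growth. The sketch below assumes that policy purely for illustration; this page does not say which schedule the PubSub client uses. The function backoff_for_attempt and the initial delay are hypothetical.

```rust
use std::time::Duration;

/// Hypothetical schedule: doubles `initial` on each attempt and never exceeds
/// `max_backoff` (the role played by pubsub_connect_max_backoff).
fn backoff_for_attempt(attempt: u32, initial: Duration, max_backoff: Duration) -> Duration {
    // Saturating arithmetic keeps very large attempt counts from overflowing.
    let factor = 2u32.saturating_pow(attempt);
    initial.saturating_mul(factor).min(max_backoff)
}

fn main() {
    let initial = Duration::from_millis(100);
    let max_backoff = Duration::from_secs(5);
    for attempt in [0, 3, 10] {
        println!(
            "attempt {} -> {:?}",
            attempt,
            backoff_for_attempt(attempt, initial, max_backoff)
        );
    }
}
```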
pubsub_client_sender_channel_size: usize
Size of channel used to buffer send messages to PubSub service.
pubsub_client_receiver_channel_size: usize
Size of channel used to buffer received messages from PubSub service.
pubsub_server_connection_channel_size: usize
Size of channel used per connection to buffer broadcasted messages from PubSub server.
pubsub_state_cache_shard_ref_channel_size: usize
Size of channel used by the state cache to broadcast shard state references.
Implementations
impl PersistConfig
pub fn new(build_info: &BuildInfo, now: NowFn) -> Self
Returns a new instance of PersistConfig with default tuning.
pub fn sink_minimum_batch_updates(&self) -> usize
The minimum number of updates that justify writing out a batch in persist_sink’s write_batches operator. (If there are fewer than this minimum number of updates, they’ll be forwarded on to append_batch to be combined and written there.)
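The threshold rule described for sink_minimum_batch_updates can be sketched as a small routing decision. The enum Route and the function route_updates are hypothetical names for illustration; only the comparison itself comes from the description.

```rust
/// Hypothetical outcome of the threshold check.
#[derive(Debug, PartialEq)]
enum Route {
    /// Enough updates: write a batch in the write_batches operator.
    WriteBatch,
    /// Too few updates: forward them on to append_batch to be combined there.
    ForwardToAppend,
}

/// Applies the rule: fewer updates than the minimum are forwarded,
/// everything else is written out as a batch.
fn route_updates(num_updates: usize, sink_minimum_batch_updates: usize) -> Route {
    if num_updates < sink_minimum_batch_updates {
        Route::ForwardToAppend
    } else {
        Route::WriteBatch
    }
}

fn main() {
    // With a minimum of 0 (DEFAULT_SINK_MINIMUM_BATCH_UPDATES on this page),
    // no update count is ever below the threshold.
    println!("{:?}", route_updates(1, 0));
    // With a hypothetical minimum of 100, small inputs are forwarded.
    println!("{:?}", route_updates(10, 100));
}
```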
pub fn storage_sink_minimum_batch_updates(&self) -> usize
The same as Self::sink_minimum_batch_updates, but for storage persist_sink’s.
impl PersistConfig
pub const DEFAULT_BLOB_TARGET_SIZE: usize = 134_217_728usize
Default value for DynamicConfig::blob_target_size.
pub const DEFAULT_COMPACTION_MINIMUM_TIMEOUT: Duration = _
Default value for DynamicConfig::compaction_minimum_timeout.
pub const DEFAULT_CRDB_CONNECT_TIMEOUT: Duration = _
Default value for DynamicConfig::consensus_connect_timeout.
pub const DEFAULT_CRDB_TCP_USER_TIMEOUT: Duration = _
Default value for DynamicConfig::consensus_tcp_user_timeout.
pub const DEFAULT_STATS_AUDIT_PERCENT: usize = 0usize
Default value for DynamicConfig::stats_audit_percent.
pub const DEFAULT_STATS_COLLECTION_ENABLED: bool = false
Default value for DynamicConfig::stats_collection_enabled.
pub const DEFAULT_STATS_FILTER_ENABLED: bool = false
Default value for DynamicConfig::stats_filter_enabled.
pub const DEFAULT_PUBSUB_CLIENT_ENABLED: bool = false
Default value for DynamicConfig::pubsub_client_enabled.
pub const DEFAULT_PUBSUB_PUSH_DIFF_ENABLED: bool = true
Default value for DynamicConfig::pubsub_push_diff_enabled.
pub const DEFAULT_SINK_MINIMUM_BATCH_UPDATES: usize = 0usize
Default value for PersistConfig::sink_minimum_batch_updates.
pub const DEFAULT_NEXT_LISTEN_BATCH_RETRYER: RetryParameters = _
Default value for DynamicConfig::next_listen_batch_retry_params.
pub(crate) const DEFAULT_READ_LEASE_DURATION: Duration = _
pub(crate) const NEED_ROLLUP_THRESHOLD: u64 = 128u64
pub fn set_state_versions_recent_live_diffs_limit(&self, val: usize)
Trait Implementations
impl BlobKnobs for PersistConfig
fn operation_timeout(&self) -> Duration
fn operation_attempt_timeout(&self) -> Duration
fn connect_timeout(&self) -> Duration
fn read_timeout(&self) -> Duration
impl Clone for PersistConfig
fn clone(&self) -> PersistConfig
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl ConsensusKnobs for PersistConfig
fn connection_pool_max_size(&self) -> usize
fn connection_pool_ttl(&self) -> Duration
fn connection_pool_ttl_stagger(&self) -> Duration
fn connect_timeout(&self) -> Duration
fn tcp_user_timeout(&self) -> Duration
impl Debug for PersistConfig
impl From<&PersistConfig> for BatchBuilderConfig
fn from(value: &PersistConfig) -> Self
impl From<&PersistConfig> for CompactConfig
fn from(value: &PersistConfig) -> Self
Auto Trait Implementations
impl !RefUnwindSafe for PersistConfig
impl Send for PersistConfig
impl Sync for PersistConfig
impl Unpin for PersistConfig
impl !UnwindSafe for PersistConfig
Blanket Implementations
impl<T> FutureExt for T
fn with_context(self, otel_cx: Context) -> WithContext<Self>
fn with_current_context(self) -> WithContext<Self>
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoRequest<T> for T
fn into_request(self) -> Request<T>
Wrap the input message T in a tonic::Request
impl<P, R> ProtoType<R> for P where R: RustType<P>,
fn into_rust(self) -> Result<R, TryFromProtoError>
See RustType::from_proto.
fn from_rust(rust: &R) -> P
See RustType::into_proto.