pub struct PersistConfig {
Show 23 fields pub build_version: Version, pub hostname: String, pub is_cc_active: bool, pub now: NowFn, pub configs: ConfigSet, pub dynamic: Arc<DynamicConfig>, pub compaction_enabled: bool, pub compaction_concurrency_limit: usize, pub compaction_queue_size: usize, pub compaction_yield_after_n_updates: usize, pub consensus_connection_pool_max_size: usize, pub consensus_connection_pool_max_wait: Option<Duration>, pub writer_lease_duration: Duration, pub critical_downgrade_interval: Duration, pub pubsub_connect_attempt_timeout: Duration, pub pubsub_request_timeout: Duration, pub pubsub_connect_max_backoff: Duration, pub pubsub_client_sender_channel_size: usize, pub pubsub_client_receiver_channel_size: usize, pub pubsub_server_connection_channel_size: usize, pub pubsub_state_cache_shard_ref_channel_size: usize, pub pubsub_reconnect_backoff: Duration, pub isolated_runtime_worker_threads: usize,
}
Expand description

The tunable knobs for persist.

Tuning inputs:

  • A larger blob_target_size (capped at KEY_VAL_DATA_MAX_LEN) results in fewer entries in consensus state. Before we have compaction and/or incremental state, it is already growing without bound, so this is a concern. OTOH, for any “reasonable” size (> 100MiB?) of blob_target_size, it seems we’d end up with a pretty tremendous amount of data in the shard before this became a real issue.
  • A larger blob_target_size will results in fewer s3 operations, which are charged per operation. (Hmm, maybe not if we’re charged per call in a multipart op. The S3Blob impl already chunks things at 8MiB.)
  • A smaller blob_target_size will result in more even memory usage in readers.
  • A larger batch_builder_max_outstanding_parts increases throughput (to a point).
  • A smaller batch_builder_max_outstanding_parts provides a bound on the amount of memory used by a writer.
  • A larger compaction_heuristic_min_inputs means state size is larger.
  • A smaller compaction_heuristic_min_inputs means more compactions happen (higher write amp).
  • A larger compaction_heuristic_min_updates means more consolidations are discovered while reading a snapshot (higher read amp and higher space amp).
  • A smaller compaction_heuristic_min_updates means more compactions happen (higher write amp).

Tuning logic:

  • blob_target_size was initially selected to be an exact multiple of 8MiB (the s3 multipart size) that was in the same neighborhood as our initial max throughput (~250MiB).
  • batch_builder_max_outstanding_parts was initially selected to be as small as possible without harming pipelining. 0 means no pipelining, 1 is full pipelining as long as generating data takes less time than writing to s3 (hopefully a fair assumption), 2 is a little extra slop on top of 1.
  • compaction_heuristic_min_inputs was set by running the open-loop benchmark with batches of size 10,240 bytes (selected to be small but such that the overhead of our columnar encoding format was less than 10%) and manually increased until the write amp stopped going down. This becomes much less important once we have incremental state. The initial value is a placeholder and should be revisited at some point.
  • compaction_heuristic_min_updates was set via a thought experiment. This is an O(n*log(n)) upper bound on the number of unconsolidated updates that would be consolidated if we compacted as the in-mem Spine does. The initial value is a placeholder and should be revisited at some point.

TODO: Move these tuning notes into SessionVar descriptions once we have SystemVars for most of these.

Fields§

§build_version: Version

Info about which version of the code is running.

§hostname: String

Hostname of this persist user. Stored in state and used for debugging.

§is_cc_active: bool

Whether this persist instance is running in a “cc” sized cluster.

§now: NowFn

A clock to use for all leasing and other non-debugging use.

§configs: ConfigSet

Persist Configs that can change value dynamically within the lifetime of a process.

TODO(cfg): Entirely replace dynamic with this.

§dynamic: Arc<DynamicConfig>

Configurations that can be dynamically updated.

§compaction_enabled: bool

Whether to physically and logically compact batches in blob storage.

§compaction_concurrency_limit: usize

In Compactor::compact_and_apply_background, the maximum number of concurrent compaction requests that can execute for a given shard.

§compaction_queue_size: usize

In Compactor::compact_and_apply_background, the maximum number of pending compaction requests to queue.

§compaction_yield_after_n_updates: usize

In Compactor::compact_and_apply_background, how many updates to encode or decode before voluntarily yielding the task.

§consensus_connection_pool_max_size: usize

The maximum size of the connection pool to Postgres/CRDB when performing consensus reads and writes.

§consensus_connection_pool_max_wait: Option<Duration>

The maximum time to wait when attempting to obtain a connection from the pool.

§writer_lease_duration: Duration

Length of time after a writer’s last operation after which the writer may be expired.

§critical_downgrade_interval: Duration

Length of time between critical handles’ calls to downgrade since

§pubsub_connect_attempt_timeout: Duration

Timeout per connection attempt to Persist PubSub service.

§pubsub_request_timeout: Duration

Timeout per request attempt to Persist PubSub service.

§pubsub_connect_max_backoff: Duration

Maximum backoff when retrying connection establishment to Persist PubSub service.

§pubsub_client_sender_channel_size: usize

Size of channel used to buffer send messages to PubSub service.

§pubsub_client_receiver_channel_size: usize

Size of channel used to buffer received messages from PubSub service.

§pubsub_server_connection_channel_size: usize

Size of channel used per connection to buffer broadcasted messages from PubSub server.

§pubsub_state_cache_shard_ref_channel_size: usize

Size of channel used by the state cache to broadcast shard state references.

§pubsub_reconnect_backoff: Duration

Backoff after an established connection to Persist PubSub service fails.

§isolated_runtime_worker_threads: usize

Number of worker threads to create for the crate::IsolatedRuntime, defaults to the number of threads.

Implementations§

source§

impl PersistConfig

source

pub fn new_default_configs(build_info: &BuildInfo, now: NowFn) -> Self

Returns a new instance of PersistConfig with default tuning and default ConfigSet.

source

pub fn new(build_info: &BuildInfo, now: NowFn, configs: ConfigSet) -> Self

Returns a new instance of PersistConfig with default tuning and the specified ConfigSet.

source

pub fn sink_minimum_batch_updates(&self) -> usize

The minimum number of updates that justify writing out a batch in persist_sink’s write_batches operator. (If there are fewer than this minimum number of updates, they’ll be forwarded on to append_batch to be combined and written there.)

source

pub fn storage_sink_minimum_batch_updates(&self) -> usize

The same as Self::sink_minimum_batch_updates, but for storage persist_sink’s.

source

pub fn storage_source_decode_fuel(&self) -> usize

The maximum amount of work to do in the persist_source mfp_and_decode operator before yielding.

source

pub fn set_reader_lease_duration(&self, val: Duration)

Overrides the value for “persist_reader_lease_duration”.

source

pub fn set_rollup_threshold(&self, val: usize)

Overrides the value for “persist_rollup_threshold”.

source

pub fn set_next_listen_batch_retryer(&self, val: RetryParameters)

Overrides the value for the “persist_next_listen_batch_retryer_*” configs.

source

pub fn new_for_tests() -> Self

Returns a new instance of PersistConfig for tests.

source§

impl PersistConfig

Methods from Deref<Target = ConfigSet>§

source

pub fn entries(&self) -> impl Iterator<Item = &ConfigEntry>

Returns the configs currently registered to this set.

Trait Implementations§

source§

impl BlobKnobs for PersistConfig

source§

fn operation_timeout(&self) -> Duration

Maximum time allowed for a network call, including retry attempts.
source§

fn operation_attempt_timeout(&self) -> Duration

Maximum time allowed for a single network call.
source§

fn connect_timeout(&self) -> Duration

Maximum time to wait for a socket connection to be made.
source§

fn read_timeout(&self) -> Duration

Maximum time to wait to read the first byte of a response, including connection time.
source§

fn is_cc_active(&self) -> bool

Whether this is running in a “cc” sized cluster.
source§

impl Clone for PersistConfig

source§

fn clone(&self) -> PersistConfig

Returns a copy of the value. Read more
1.0.0 · source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
source§

impl Debug for PersistConfig

source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
source§

impl Deref for PersistConfig

§

type Target = ConfigSet

The resulting type after dereferencing.
source§

fn deref(&self) -> &Self::Target

Dereferences the value.
source§

impl PostgresClientKnobs for PersistConfig

source§

fn connection_pool_max_size(&self) -> usize

Maximum number of connections allowed in a pool.
source§

fn connection_pool_max_wait(&self) -> Option<Duration>

The maximum time to wait to obtain a connection, if any.
source§

fn connection_pool_ttl(&self) -> Duration

Minimum TTL of a connection. It is expected that connections are routinely culled to balance load to the backing store.
source§

fn connection_pool_ttl_stagger(&self) -> Duration

Minimum time between TTLing connections. Helps stagger reconnections to avoid stampeding the backing store.
source§

fn connect_timeout(&self) -> Duration

Time to wait for a connection to be made before trying.
source§

fn tcp_user_timeout(&self) -> Duration

TCP user timeout for connection attempts.

Auto Trait Implementations§

Blanket Implementations§

source§

impl<T> Any for T
where T: 'static + ?Sized,

source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
source§

impl<T> Borrow<T> for T
where T: ?Sized,

source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
source§

impl<T, U> CastInto<U> for T
where U: CastFrom<T>,

source§

fn cast_into(self) -> U

Performs the cast.
source§

impl<T> DynClone for T
where T: Clone,

source§

fn __clone_box(&self, _: Private) -> *mut ()

source§

impl<T> From<T> for T

source§

fn from(t: T) -> T

Returns the argument unchanged.

source§

impl<T> FromRef<T> for T
where T: Clone,

source§

fn from_ref(input: &T) -> T

Converts to this type from a reference to the input type.
source§

impl<T> FutureExt for T

source§

fn with_context(self, otel_cx: Context) -> WithContext<Self>

Attaches the provided Context to this type, returning a WithContext wrapper. Read more
source§

fn with_current_context(self) -> WithContext<Self>

Attaches the current Context to this type, returning a WithContext wrapper. Read more
source§

impl<T> Instrument for T

source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
source§

impl<T, U> Into<U> for T
where U: From<T>,

source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

source§

impl<T> IntoRequest<T> for T

source§

fn into_request(self) -> Request<T>

Wrap the input message T in a tonic::Request
source§

impl<Unshared, Shared> IntoShared<Shared> for Unshared
where Shared: FromUnshared<Unshared>,

source§

fn into_shared(self) -> Shared

Creates a shared type from an unshared type.
source§

impl<T> Pointable for T

source§

const ALIGN: usize = _

The alignment of pointer.
§

type Init = T

The type for initializers.
source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more
source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer. Read more
source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer. Read more
source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer. Read more
source§

impl<T> ProgressEventTimestamp for T
where T: Data + Debug + Any,

source§

fn as_any(&self) -> &(dyn Any + 'static)

Upcasts this ProgressEventTimestamp to Any. Read more
source§

fn type_name(&self) -> &'static str

Returns the name of the concrete type of this object. Read more
source§

impl<P, R> ProtoType<R> for P
where R: RustType<P>,

source§

impl<T> PushInto<Vec<T>> for T

source§

fn push_into(self, target: &mut Vec<T>)

Push self into the target container.
source§

impl<T> Same for T

§

type Output = T

Should always be Self
source§

impl<T> ToOwned for T
where T: Clone,

§

type Owned = T

The resulting type after obtaining ownership.
source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

§

type Error = Infallible

The type returned in the event of a conversion error.
source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

source§

fn vzip(self) -> V

source§

impl<T> WithSubscriber for T

source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more
source§

impl<T> Data for T
where T: Clone + 'static,