Expand description
Merge-batcher for Column chunks with per-chunk paging.
Forks the differential_dataflow merge-batcher framework so chains can
hold PagedColumn entries — letting the ColumnPager page chunks
out as they’re produced and fetch them back lazily during merge / extract.
Reuses the resident building blocks from super::batcher: the inherent
Column::merge_from / Column::extract methods (per-chunk merge / split).
Input consolidation happens upstream: the chunker
(super::batcher::ColumnChunker) is supplied to the arrange operator
separately, so this batcher receives already-consolidated Column chunks
via PushInto.
Structs§
- Column
Merge Batcher - Drives the merge-batcher over
Columnchunks routed through aColumnPager. - Fetch
Iter - Streaming materializer over a chain of
PagedColumnentries.
Constants§
- MAX_
RECYCLE_ 🔒BYTES - Don’t park a buffer larger than this in the free-list. A transiently oversize merge buffer (post-explosion, past the natural ship threshold) held resident would compete with the pager’s budget; drop it and let a fresh default regrow. 2 × the natural ship word count (≈ 4 MiB serialized) keeps normal ship-sized chunks while excluding pathological ones.
- STASH_
CAP 🔒 - Max recycled empty chunks held in the per-batcher stash. Deliberately
tight: the stash is a hot-buffer cache for the result/keep/ship churn,
not a hoard. Stash entries are cleared
Column::Typedallocations that retain capacity but are not tracked byColumnPager’sResidentTicketaccounting, so each one is a chunk’s worth of resident bytes the pager’s budget doesn’t see. There’s one stash per arrange batcher per worker, so this multiplies fast.
Functions§
- account_
chunk 🔒 - Resident-only accounting. Returns
(records, size_bytes, capacity_bytes, allocations)for a single chain entry; paged-out entries contribute 0 across the board. - drain_
side 🔒 - Helper for
merge_chains’s drain phase: copy a partially-consumed head intoresult(via 1-inputmerge_from), shipresultif non-empty, then pass the remaining queuedPagedColumns straight through. - extract_
chain - Streaming extract: walks
mergedchunk-by-chunk viaColumn::extract, routing each filled keep/ship chunk through its sink after pageing. Mirrors the per-chunk ship-threshold yield already insideColumn::extract. - merge_
chains - Two-way merge driver. Reuses today’s per-chunk gallop / ship-threshold
logic from
Column::merge_from, but pulls heads fromFetchIterand emits finished output chunks throughsinkafter routing them through the pager exposed byFetchIter::pager. - recycle_
capped 🔒 - Recycle
chunkonly if the stash isn’t already atSTASH_CAPand the chunk isn’t oversize perMAX_RECYCLE_BYTES.length_in_bytesis measured before clear, so it reflects the data the chunk was carrying (a proxy for the capacity we’d park).