Struct parquet::file::properties::BloomFilterProperties
pub struct BloomFilterProperties {
pub fpp: f64,
pub ndv: u64,
}
Controls the bloom filter to be computed by the writer.
Fields§
§fpp: f64
False positive probability; must always be strictly between 0 and 1 (exclusive). Defaults to DEFAULT_BLOOM_FILTER_FPP.
Set this value by calling WriterPropertiesBuilder::set_bloom_filter_fpp.
The bloom filter data structure is a trade-off between disk and memory space on one side and fpp on the other: the smaller the fpp, the more memory and disk space is required. Setting it to a reasonable value such as 0.1, 0.05, or 0.001 is therefore recommended.
Setting fpp to a very small number diminishes the value of the filter itself, because the bitset can become even larger than simply storing the whole value. You are also expected to set ndv if it is known in advance, which can greatly reduce space usage.
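To make the space versus fpp trade-off concrete, the following minimal sketch uses the textbook Bloom filter estimate of `-ln(fpp) / (ln 2)^2` bits per distinct value. This is an assumption for illustration only: `bits_per_value` is a hypothetical helper, not part of this crate, and the writer's actual filter sizing may differ from the textbook figure.

```rust
use std::f64::consts::LN_2;

/// Hypothetical helper (not part of the parquet crate): textbook estimate of
/// the bits a Bloom filter needs per distinct value to reach `fpp`,
/// assuming an optimal number of hash functions.
fn bits_per_value(fpp: f64) -> f64 {
    assert!(fpp > 0.0 && fpp < 1.0, "fpp must be in (0, 1) exclusive");
    -fpp.ln() / (LN_2 * LN_2)
}

fn main() {
    // The three recommended values from the text, plus an extreme one.
    for fpp in [0.1, 0.05, 0.001, 1e-20] {
        println!(
            "fpp = {fpp:e}: ~{:.1} bits per distinct value",
            bits_per_value(fpp)
        );
    }
}
```

Under this estimate, an extremely small fpp such as 1e-20 costs more than 64 bits per distinct value, which is more than the 64 bits an 8-byte integer occupies. That is the situation the text warns about.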
§ndv: u64
Number of distinct values; should be non-negative to be meaningful. Defaults to DEFAULT_BLOOM_FILTER_NDV.
Set this value by calling WriterPropertiesBuilder::set_bloom_filter_ndv.
A bloom filter is most beneficial for columns with large cardinality, so a good heuristic is to set ndv to the number of rows. However, if you know in advance that the number of distinct values is smaller, setting ndv to that number reduces disk size. For a very small ndv, a bloom filter is probably not worth using at all.
Increasing this value (without increasing fpp) will result in an increase in disk or memory size.
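As a rough illustration of how ndv drives the filter size at a fixed fpp, the sketch below compares the row-count heuristic with a smaller, known distinct count. It relies on the textbook Bloom filter size formula, which is an assumption and not necessarily the sizing this crate applies. `estimated_filter_bytes`, the row count, and the known distinct count are all hypothetical.

```rust
use std::f64::consts::LN_2;

/// Hypothetical helper (not part of the parquet crate): textbook Bloom filter
/// size in bytes for `ndv` distinct values at false positive probability `fpp`.
fn estimated_filter_bytes(ndv: u64, fpp: f64) -> u64 {
    assert!(fpp > 0.0 && fpp < 1.0, "fpp must be in (0, 1) exclusive");
    let bits = -(ndv as f64) * fpp.ln() / (LN_2 * LN_2);
    (bits / 8.0).ceil() as u64
}

fn main() {
    let fpp = 0.05;
    let row_count: u64 = 10_000_000; // heuristic from the text: ndv = number of rows
    let known_ndv: u64 = 50_000; // assumed prior knowledge about the column

    println!(
        "ndv = row count: ~{} bytes",
        estimated_filter_bytes(row_count, fpp)
    );
    println!(
        "ndv = known:     ~{} bytes",
        estimated_filter_bytes(known_ndv, fpp)
    );
}
```

The estimate grows linearly with ndv, so supplying a realistic distinct count instead of the row count shrinks the filter proportionally.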
Trait Implementations§
impl Clone for BloomFilterProperties
fn clone(&self) -> BloomFilterProperties
fn clone_from(&mut self, source: &Self)
impl Debug for BloomFilterProperties
impl Default for BloomFilterProperties
impl PartialEq for BloomFilterProperties
impl StructuralPartialEq for BloomFilterProperties
Auto Trait Implementations§
impl Freeze for BloomFilterProperties
impl RefUnwindSafe for BloomFilterProperties
impl Send for BloomFilterProperties
impl Sync for BloomFilterProperties
impl Unpin for BloomFilterProperties
impl UnwindSafe for BloomFilterProperties
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
T: Clone,
default unsafe fn clone_to_uninit(&self, dst: *mut T)
This is a nightly-only experimental API. (clone_to_uninit)