growable_bloom_filter

Struct GrowableBloom

Source
pub struct GrowableBloom { /* private fields */ }
Expand description

A Growable Bloom Filter

§Overview

Implementation of Scalable Bloom Filters which also provides serde serialization and deserialize.

A bloom filter lets you insert items, and then test association with contains. It’s space and time efficient, at the cost of false positives. In particular, if contains returns true, it may be in filter. But if contains returns false, it’s definitely not in the bloom filter.

You can control the failure rate by setting desired_error_prob and est_insertions appropriately.

§Applications

Bloom filters are typically used as a pre-cache to avoid expensive operations. For example, if you need to ask ten thousand servers if they have data XYZ, you could use GrowableBloom to figure out which ones do NOT have XYZ.

§Example

use growable_bloom_filter::GrowableBloom;

// Create and insert into the bloom filter
let mut gbloom = GrowableBloom::new(0.05, 1000);
gbloom.insert(&0);
assert!(gbloom.contains(&0));

// Serialize and Deserialize the bloom filter
use serde_json;

let s = serde_json::to_string(&gbloom).unwrap();
let des_gbloom: GrowableBloom = serde_json::from_str(&s).unwrap();
assert!(des_gbloom.contains(&0));

// Builder API
use growable_bloom_filter::GrowableBloomBuilder;
let mut gbloom = GrowableBloomBuilder::new()
    .estimated_insertions(100)
    .desired_error_ratio(0.05)
    .build();
gbloom.insert(&0);
assert!(gbloom.contains(&0));

Implementations§

Source§

impl GrowableBloom

Source

pub fn new(desired_error_prob: f64, est_insertions: usize) -> GrowableBloom

Create a new GrowableBloom filter.

§Arguments
  • desired_error_prob - The desired error probability (eg. 0.05, 0.01)
  • est_insertions - The estimated number of insertions (eg. 100, 1000).

Note: You really don’t need to be accurate with est_insertions. Power of 10 granularity should be fine (~1000 is decent).

§Example
// 5% failure rate, estimated 100 elements to insert
use growable_bloom_filter::GrowableBloom;
let mut gbloom = GrowableBloom::new(0.05, 100);
§Panics
  • Panics if desired_error_prob is less then 0 or greater than 1.
  • Panics if capacity is zero. If you’re unsure, set it to 1000.
§Builder API

An alternative way to construct a GrowableBloom.

See GrowableBloomBuilder for documentation. It allows you to specify other constants to control bloom filter behaviour.

use growable_bloom_filter::GrowableBloomBuilder;
let mut gbloom = GrowableBloomBuilder::new()
    .estimated_insertions(100)
    .desired_error_ratio(0.05)
    .build();
Source

pub fn contains<T: Hash>(&self, item: T) -> bool

Test if item in the Bloom filter.

If true is returned, it may be in the filter. If false is returned, it’s NOT in the filter.

§Arguments
  • item - The item to test
§Example
use growable_bloom_filter::GrowableBloom;
let mut bloom = GrowableBloom::new(0.05, 10);
let item = 0;

bloom.insert(&item);
assert!(bloom.contains(&item));
Source

pub fn insert<T: Hash>(&mut self, item: T) -> bool

Insert item into the filter.

This may resize the GrowableBloom.

§Arguments
  • item - The item to insert
§Example
use growable_bloom_filter::GrowableBloom;
let mut bloom = GrowableBloom::new(0.05, 10);
let item = 0;

bloom.insert(&item);
bloom.insert(&-1);
bloom.insert(&vec![1, 2, 3]);
bloom.insert("hello");
Source

pub fn clear(&mut self)

Clear the bloom filter.

This does not resize the filter.

§Example
use growable_bloom_filter::GrowableBloom;
let mut bloom = GrowableBloom::new(0.05, 10);
let item = 0;

bloom.insert(&item);
assert!(bloom.contains(&item));
bloom.clear();
assert!(!bloom.contains(&item)); // No longer contains item
Source

pub fn is_empty(&self) -> bool

Whether this bloom filter contain any items.

Source

pub fn len(&self) -> usize

The current estimated number of elements added to the filter. This is an estimation, so it may or may not increase after an insertion in the filter.

§Example
use growable_bloom_filter::GrowableBloom;
let mut bloom = GrowableBloom::new(0.05, 10);

bloom.insert(0);
assert_eq!(bloom.len(), 1);
Source

pub fn capacity(&self) -> usize

The current estimated capacity of the filter. A filter starts with a small capacity and will expand to accommodate more items. The actual ratio of increase depends on the values used to construct the bloom filter.

Note: An empty filter has capacity zero as we haven’t calculated the necessary bloom filter size. Subsequent inserts will result in the capacity updating.

§Example
use growable_bloom_filter::GrowableBloom;
let mut bloom = GrowableBloom::new(0.05, 10);

assert_eq!(bloom.capacity(), 0);

bloom.insert(0);
// After an insert, our capacity is no longer zero.
assert_ne!(bloom.capacity(), 0);
Source

pub fn check_and_set<T: Hash>(&mut self, item: T) -> bool

Record if item already exists in the filter, and insert it if it doesn’t already exist.

Returns true if the item already existed in the filter.

Note: This isn’t faster than just inserting.

§Example
use growable_bloom_filter::GrowableBloom;
let mut bloom = GrowableBloom::new(0.05, 10);
let item = 0;

let existed_before = bloom.check_and_set(&item);
assert!(existed_before == false);

let existed_before = bloom.check_and_set(&item);
assert!(existed_before == true);

Trait Implementations§

Source§

impl Clone for GrowableBloom

Source§

fn clone(&self) -> GrowableBloom

Returns a copy of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl Debug for GrowableBloom

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl<'de> Deserialize<'de> for GrowableBloom

Source§

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer. Read more
Source§

impl PartialEq for GrowableBloom

Source§

fn eq(&self, other: &GrowableBloom) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
Source§

impl Serialize for GrowableBloom

Source§

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer. Read more
Source§

impl StructuralPartialEq for GrowableBloom

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dst: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,