Table of contents
1. Introduction
2. Redis Bloom
3. Quick Start
4. Launch RedisBloom with Docker
5. Running
6. Configuration
6.1. Error rate and Initial Size for Bloom Filter
6.2. Initial Size for Cuckoo Filter
7. Commands
7.1. Bloom Filter Commands
7.2. Cuckoo Filter Commands
7.3. Count Min Sketch Commands
7.4. TopK list commands
8. Frequently Asked Questions
8.1. What are data types in Redis?
8.2. Is Redis a cache or database?
8.3. What is Redis and Kafka?
8.4. What is a Redis module?
9. Conclusion
Last Updated: Mar 27, 2024

Probabilistic Data Structures for Redis Stack

Introduction 

Probabilistic data structures are a class of data structures that are especially useful in big data and streaming applications. In general, they use hash functions to randomise elements and represent a set compactly, trading a small, bounded error for large savings in memory.

RedisBloom adds four probabilistic data structures to Redis: a scalable Bloom filter, a cuckoo filter, a count-min sketch, and a top-k. This article covers what each one does and how to use it.

Redis Bloom

RedisBloom is a Redis module that adds support for additional probabilistic data structures. It allows computer science problems to be solved in constant memory space, with very fast processing and a low error rate. It includes scalable Bloom and cuckoo filters for determining, to a specified degree of certainty, whether an item is present in or absent from a collection.

The RedisBloom module provides four data types:

  • Bloom filter: A probabilistic data structure that tells you, quickly and in little memory, whether an element is present in a set. Bloom filters typically offer better performance and scalability when inserting items, so they may be suitable if you frequently add items to your dataset.
  • Cuckoo filter: An alternative to the Bloom filter that also supports deleting elements from a set. Cuckoo filters are faster on check operations.
  • Count-min sketch: Used to estimate the frequency of events in a stream. A count-min sketch can be queried for the estimated frequency of any given event.
  • Top-K: In RedisBloom, the Top-K data structure uses a deterministic algorithm to approximate frequencies for the top k items. You are notified in real time whenever an element enters your Top-K list: if an added element enters the list, the element it displaces is returned.
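
To make the idea of hashing elements into a compact structure concrete, here is a minimal Bloom filter sketch in Python. It is an illustration only and not RedisBloom's implementation: the class name TinyBloomFilter, the default sizes, and the use of salted SHA-256 to derive bit positions are all assumptions made for this example.

```python
import hashlib


class TinyBloomFilter:
    """Toy Bloom filter: a bit array plus k hashed positions per item."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the example readable (a real filter packs bits).
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive k positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[pos] for pos in self._positions(item))
```

An item that was added is always reported as present, while an item that was never added can occasionally be reported as present when its positions collide with those of other items. That false-positive rate is the "specified degree of certainty" mentioned above.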

Quick Start 

Launch RedisBloom with Docker

docker run -p 6379:6379 --name redis-redisbloom redislabs/rebloom:latest

Running

# Assuming you have a redis build from the unstable branch:
/path/to/redis-server --loadmodule ./redisbloom.so

Configuration

RedisBloom has a few run-time configuration options that must be set when the module is loaded.

In general, configuration options are passed as arguments after the --loadmodule argument on the command line, after the loadmodule configuration directive in a Redis config file, or after the MODULE LOAD command. For example:

In redis.conf:

loadmodule redisbloom.so OPT1 OPT2

From redis-cli:

127.0.0.1:6379> MODULE LOAD redisbloom.so OPT1 OPT2

From command line:

$ redis-server --loadmodule ./redisbloom.so OPT1 OPT2

Error rate and Initial Size for Bloom Filter

When loading the module, use the ERROR_RATE and INITIAL_SIZE arguments to change the default error rate and the default initial capacity (for Bloom filters), respectively. For example:

$ redis-server --loadmodule /path/to/redisbloom.so INITIAL_SIZE 400 ERROR_RATE 0.004

The default error rate is 0.01, and the initial default capacity is 100.
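
To get a feel for how these two settings interact, the sketch below applies the textbook sizing formulas for a classic Bloom filter: m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions, for n items and error rate p. The function name bloom_parameters is hypothetical, and this is only an estimate under that textbook model; RedisBloom's own sizing may differ.

```python
import math


def bloom_parameters(capacity, error_rate):
    """Return (bits, hash_count) for a classic Bloom filter.

    Textbook estimate only; not taken from RedisBloom's source.
    """
    if capacity <= 0 or not 0 < error_rate < 1:
        raise ValueError("capacity must be positive and 0 < error_rate < 1")
    # m = -n * ln(p) / (ln 2)^2, rounded up to a whole number of bits
    bits = math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2))
    # k = (m / n) * ln 2, rounded to the nearest whole hash function
    hashes = max(1, round(bits / capacity * math.log(2)))
    return bits, hashes
```

Calling bloom_parameters(100, 0.01) models the defaults above, and bloom_parameters(400, 0.004) models the command-line example. A lower error rate costs more bits per item, which is the trade-off these options control.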

Initial Size for Cuckoo Filter

For the Cuckoo filter, the default capacity is 1024.

Commands

Redis itself ships with a single probabilistic data structure (PDS), the HyperLogLog, which is used to count distinct elements in a multiset. RedisBloom extends Redis with four more PDS:

  • Bloom Filter - check whether an item is a member of a set.
  • Cuckoo Filter - check whether an item is a member of a set, with support for deletion.
  • Count-Min Sketch - count the number of times each element appears in a stream.
  • TopK - keep track of the K most common items in a stream.

The sections below group the commands by data structure; full details can be found in the official documentation.

Bloom Filter Commands
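
As a rough sketch of how the Bloom filter commands fit together, the Python function below reserves a filter, adds items, and checks membership. It assumes that client is a redis-py style object exposing execute_command() and connected to a server with RedisBloom loaded; the function name and the key name newFilter are made up for this example.

```python
def bloom_filter_demo(client, key="newFilter"):
    """Reserve a Bloom filter, add items, then test membership.

    `client` is assumed to expose execute_command(), as redis-py clients do.
    """
    # BF.RESERVE key error_rate capacity (errors if the key already exists)
    client.execute_command("BF.RESERVE", key, 0.01, 1000)
    # BF.ADD adds one item; BF.MADD adds several in one call
    client.execute_command("BF.ADD", key, "foo")
    client.execute_command("BF.MADD", key, "bar", "baz")
    return {
        # BF.EXISTS checks one item; BF.MEXISTS checks several at once
        "foo": client.execute_command("BF.EXISTS", key, "foo"),
        "bar_and_qux": client.execute_command("BF.MEXISTS", key, "bar", "qux"),
    }
```

With redis-py installed, this could be called as bloom_filter_demo(redis.Redis(host="localhost", port=6379)) against the Docker container from the Quick Start.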

Cuckoo Filter Commands
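
The cuckoo filter commands follow the same pattern, with deletion added. The sketch below again assumes a redis-py style client with execute_command(); the function name and the key name newCuckoo are hypothetical.

```python
def cuckoo_filter_demo(client, key="newCuckoo"):
    """Reserve a cuckoo filter, add an item, check it, delete it, check again."""
    # CF.RESERVE key capacity (errors if the key already exists)
    client.execute_command("CF.RESERVE", key, 1024)
    client.execute_command("CF.ADD", key, "foo")
    present = client.execute_command("CF.EXISTS", key, "foo")
    # Unlike a Bloom filter, a cuckoo filter supports deleting an item.
    client.execute_command("CF.DEL", key, "foo")
    after_delete = client.execute_command("CF.EXISTS", key, "foo")
    return present, after_delete
```

Only items that were actually added should be deleted; removing an item that was never inserted can cause false negatives for other items.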

Count Min Sketch Commands
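
A count-min sketch is created with fixed dimensions, incremented as events arrive, and queried for estimated counts. The sketch below assumes a redis-py style client with execute_command(); the function name, the key name pageViews, and the width and depth values are illustrative choices, not recommendations.

```python
def count_min_sketch_demo(client, key="pageViews"):
    """Create a count-min sketch, record some events, and query estimates."""
    # CMS.INITBYDIM key width depth
    client.execute_command("CMS.INITBYDIM", key, 2000, 5)
    # CMS.INCRBY key item increment [item increment ...]
    client.execute_command("CMS.INCRBY", key, "home", 3, "about", 1)
    # CMS.QUERY key item [item ...] replies with one estimated count per item
    return client.execute_command("CMS.QUERY", key, "home", "about")
```

CMS.INITBYPROB is the alternative initialiser when you would rather specify the acceptable error and its probability than the dimensions directly.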

TopK list commands
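
For Top-K, the typical flow is to reserve a list of size k, feed it items, and read the current leaders back. The sketch below assumes a redis-py style client with execute_command(); the function name and the key name trending are hypothetical.

```python
def top_k_demo(client, key="trending", k=3):
    """Reserve a Top-K list, add a stream of items, and list the leaders."""
    # TOPK.RESERVE key topk
    client.execute_command("TOPK.RESERVE", key, k)
    # TOPK.ADD key item [item ...]; the reply reports any items dropped from the list
    dropped = client.execute_command("TOPK.ADD", key, "a", "b", "a", "c", "a", "d")
    # TOPK.LIST key replies with the items currently in the top k
    current = client.execute_command("TOPK.LIST", key)
    return dropped, current
```

The reply to TOPK.ADD is how the displaced elements described earlier are surfaced to the caller.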

Frequently Asked Questions

What are data types in Redis?

Redis supports the following data types:

  • Strings. Strings are the most basic kind of Redis value. 
  • Lists. Redis Lists are simply lists of strings, sorted by insertion order. 
  • Sets. Redis Sets are an unordered collection of Strings. 
  • Hashes. 
  • Sorted Sets. 
  • Bitmaps and HyperLogLogs. 
  • Streams. 
  • Geospatial indexes.

Is Redis a cache or database?

Redis began as a cache, but it has since evolved into a data store that many applications use as their primary database. However, most Redis service providers support Redis as a cache, but not as a primary database.

What is Redis and Kafka?

Redis is an in-memory data store that is commonly used as a cache, a database, and a message broker, whereas Kafka is a distributed messaging and event streaming platform. Both have their own benefits, but they are used and implemented differently.

What is a Redis module?

Redis modules are dynamic libraries that can be loaded into Redis at startup or with the MODULE LOAD command. Redis exposes its module C API through a single C header file, redismodule.h.

Conclusion

In a nutshell, Redis is a wonderful tool to work with, and its support for probabilistic data structures, which complete such operations in optimised time, is useful in many cases.
