Redis (Remote Dictionary Server) is an in-memory data structure store. It is a key-value database that can persist its data to disk and can handle a variety of data structures or data types. This means that, in addition to mapping string keys to string values for storing and retrieving data (similar to the data model offered by traditional key-value stores), Redis supports more complex data structures such as lists, sets, and so on.
As we go along, we'll look at the data structures that Redis supports and learn about Redis's distinguishing qualities. Today we will discuss the data types associated with Redis.
Data Types
Redis is more than a key-value store; it is a data structures server that accepts various values. Unlike typical key-value stores, where string keys are associated with string values, Redis' value is not restricted to a simple string but can also carry more complicated data structures. The following is a list of all the data structures that Redis supports, each of which will be explored separately in this tutorial:
✅Binary Strings
✅Lists
✅Sets
✅Sorted Sets
✅Hashes
✅Bit Arrays
✅HyperLogLogs
✅Streams
Let’s start with learning each of them in brief.
Keys
Redis keys are binary-safe. You can use any type of binary sequence as a key, from a string like "foo" to the content of a JPEG file. The empty string can likewise be used as a valid key.
Let’s have a look at some rules of thumb for keys.
💡Very long keys are not a good idea. A key of 1024 bytes, for example, is a bad idea not only in terms of memory but also because looking up the key in the dataset may require several costly key comparisons. Even when the task is to check for the existence of a large value, hashing it (for example, with SHA1) is preferable, especially in terms of memory and bandwidth.
💡Very short keys are often not a good idea either. There is little point in writing "u1000flw" as a key when you can write "user:1000:followers" instead. The latter is more readable, and the added space is minor compared to the space used by the key object itself and the value object. While short keys will consume a bit less memory, your job is to find the right balance.
💡Try to stick to a schema. For instance, "object-type:id" is a good idea, as in "user:1000". Dots or dashes are often used for multi-word fields, as in "comment:1234:reply.to" or "comment:1234:reply-to".
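The key rules above can be sketched in a few lines of Python. This is only an illustration and does not talk to a Redis server: make_key and digest_key are hypothetical helper names, and the "image" prefix and the sample bytes are made up for the example.

```python
import hashlib

def make_key(object_type, object_id, *fields):
    """Build a schema-style key such as 'user:1000:followers' from its parts."""
    return ":".join([object_type, str(object_id), *fields])

def digest_key(prefix, large_value):
    """Stand in for a large binary value with its 40-character SHA1 hex digest."""
    return f"{prefix}:{hashlib.sha1(large_value).hexdigest()}"

followers_key = make_key("user", 1000, "followers")
reply_key = make_key("comment", 1234, "reply-to")

# Hypothetical large value: a few kilobytes of bytes standing in for a file.
image_key = digest_key("image", b"\xff\xd8" + b"\x00" * 4096)

print(followers_key)   # readable and follows the object-type:id schema
print(reply_key)
print(len(image_key))  # the key length no longer depends on the value size
```

Whatever the size of the original value, the hashed key stays short, which is the memory and bandwidth benefit mentioned above.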
Strings
The Redis String type is the simplest type of value you can associate with a Redis key. It is also the only data type in Memcached.
Because Redis keys are strings, using the string type as a value also maps a string to another string. The string data type is helpful for various applications, such as caching HTML fragments or pages.
Let's explore the string type a bit using redis-cli:
> set mykey somevalueofkey
OK
> get mykey
"somevalueofkey"
As you can see, we set and retrieve a string value by using the SET and GET commands. If the key already exists, SET overwrites any value stored in it, even if the key is associated with a non-string value. In other words, SET performs an assignment.
Values can be any string (including binary data). For example, a JPEG image can be stored within a value. A value cannot be larger than 512 MB.
The SET command has interesting options, passed as extra arguments. For example, I can tell SET to fail if the key already exists (NX), or, the opposite, to succeed only if the key already exists (XX):
> set mykey newval nx
(nil)
> set mykey newval xx
OK
Even though strings are the basic values of Redis, there are interesting operations you can perform with them. One example is the atomic increment.
The INCR command parses the string value as an integer, increments it by one, and finally sets the obtained value as the new value. For example, if the key counter holds "100", INCR counter returns 101. There are other similar commands such as INCRBY, DECR, and DECRBY. Internally, it's always the same command, acting in a slightly different way.
What does it mean for INCR to be atomic? It means that even multiple clients issuing INCR against the same key will never enter into a race condition. For example, it will never happen that client 1 reads "10" and client 2 reads "10" at the same time, both increment to 11, and both set the new value to 11. The final value will always be 12, because the read-increment-set operation is performed while no other client is executing a command.
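To make the lost-update scenario concrete, here is a small Python sketch. It does not use Redis: a plain dict stands in for the keyspace, and racy_increment and atomic_increment are hypothetical names that replay the two interleavings described above.

```python
def racy_increment(store, key, clients):
    """Every client reads first, then every client writes back its own result."""
    reads = [int(store[key]) for _ in range(clients)]  # all clients see the same value
    for value in reads:
        store[key] = str(value + 1)                    # each write overwrites the last
    return int(store[key])

def atomic_increment(store, key, clients):
    """Each read-increment-set completes before the next client starts."""
    for _ in range(clients):
        store[key] = str(int(store[key]) + 1)
    return int(store[key])

racy_result = racy_increment({"counter": "10"}, "counter", clients=2)
atomic_result = atomic_increment({"counter": "10"}, "counter", clients=2)

print(racy_result)    # one increment is lost
print(atomic_result)  # both increments are applied
```

The second function corresponds to what INCR guarantees: no other client's command runs in the middle of the read-increment-set sequence.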
There are several commands for operating on strings. The GETSET command, for example, sets a key to a new value and returns the old value as the result. You can use it, for instance, if you have a system that increments a Redis key using INCR every time your website receives a new visitor. You may want to collect this information once every hour without losing a single increment. You can GETSET the key, assigning it the new value "0" and reading the old value back.
The ability to set or get the value of numerous keys in a single command is also advantageous in terms of latency. As a result, the MSET and MGET commands exist:
> mset a 10 b 20 c 30
OK
> mget a b c
1) "10"
2) "20"
3) "30"
When MGET is used, Redis will return an array of values.
Lists
To describe the List data type, it's best to start with a little theory, as the term List is often used improperly by information technology professionals. For example, "Python Lists" are not what the name may suggest (Linked Lists), but rather Arrays (the same data type is called Array in Ruby, actually). In the most general sense, a List is just a sequence of ordered elements: 10,20,1,2,3 is a list. However, the properties of a List implemented with an Array differ significantly from those of a List implemented with a Linked List.
Redis lists are implemented with Linked Lists. This means that even if a list contains millions of elements, adding a new element at the head or at the tail of the list is performed in constant time. Adding a new element with LPUSH to the head of a list of ten elements is as fast as adding one to the head of a list of ten million elements.
What's the downside? Accessing an element by index is very fast in lists implemented with an Array (constant-time indexed access) but not so fast in lists implemented with linked lists (where the operation requires an amount of work proportional to the index of the accessed element).
So why are Redis Lists implemented with linked lists?
Because for a database system it is crucial to be able to add elements to a very long list very quickly. Another significant advantage, as you'll see in a moment, is that Redis Lists can be kept at a constant length in constant time. When fast access to the middle of a large collection of elements is required, a different data structure called the sorted set can be used. Sorted sets are covered later in this article.
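The trade-off between the two kinds of list can be illustrated in Python itself. This is only an analogy, not Redis code: Python's list is array-based, while collections.deque offers constant-time pushes at both ends (with slower access to the middle), which is roughly the behavior described for Redis lists.

```python
from collections import deque

array_list = [10, 20, 1, 2, 3]      # array-backed: fast indexing, slow head insert
linked_list = deque(array_list)     # constant-time pushes at both ends

# Pushing at the head: the array has to shift every existing element,
# the deque does not.
array_list.insert(0, "first")       # roughly what LPUSH would cost on an array
linked_list.appendleft("first")     # comparable to LPUSH on a linked list

# Pushing at the tail is cheap for both.
array_list.append("last")
linked_list.append("last")          # comparable to RPUSH

print(array_list)
print(list(linked_list))
```

Both containers end up with the same elements in the same order; what differs is how the cost of the head insert grows with the length of the list.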
Operations in Lists
The LPUSH command adds a new element to a list on the left (at the head), whereas the RPUSH command adds a new element on the right (at the tail). Finally, the LRANGE command extracts ranges of elements from lists:
> rpush mylist A
(integer) 1
> rpush mylist B
(integer) 2
> lpush mylist first
(integer) 3
> lrange mylist 0 -1
1) "first"
2) "A"
3) "B"
LRANGE takes two indexes: the first and the last element of the range to return. Both indexes can be negative, telling Redis to count from the end: -1 is the last element, -2 is the penultimate element of the list, and so on.
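The index rules for LRANGE can be modeled on a plain Python list. This is a sketch of the semantics rather than Redis's own code, and the function name lrange is just borrowed from the command for readability.

```python
def lrange(items, start, stop):
    """Return items[start..stop] with both ends inclusive, LRANGE-style."""
    size = len(items)
    if start < 0:
        start = max(size + start, 0)   # negative start counts from the end
    if stop < 0:
        stop = size + stop             # negative stop counts from the end
    stop = min(stop, size - 1)         # a stop past the end is clamped
    if start > stop or start >= size:
        return []                      # empty or out-of-range selection
    return items[start:stop + 1]

mylist = ["first", "A", "B"]
print(lrange(mylist, 0, -1))    # the whole list
print(lrange(mylist, -2, -1))   # the last two elements
print(lrange(mylist, 0, 100))   # stop beyond the end is not an error
```

Note that, unlike Python slices, both indexes are inclusive, which is why 0 -1 selects every element.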
As you can see, RPUSH appended the elements on the right of the list, while LPUSH added the element on the left.
Both commands are variadic, which means you can push several elements onto a list in a single call, for example rpush mylist 1 2 3 4 5.
Another important operation defined on Redis lists is the ability to pop elements. Popping an element retrieves it from the list and removes it from the list at the same time. You can pop elements from the left and from the right, just as you can push elements on both sides of the list:
> rpush mylist a b c
(integer) 3
> rpop mylist
"c"
> rpop mylist
"b"
> rpop mylist
"a"
We pushed three elements and popped three elements, so the list is empty at the end of this command sequence and there are no more elements to pop. If we try to pop yet another element, we get the following result:
> rpop mylist
(nil)
To indicate that there are no elements in the list, Redis returned a NULL result.
Capped Lists
In many cases we simply want to use lists to store the latest items, whether they are social-network updates, logs, or anything else.
Using the LTRIM command, we can use lists as a capped collection, remembering only the most recent N items and discarding all the older ones.
The LTRIM command is similar to LRANGE, but instead of displaying the specified range of elements, it sets this range as the new list value. All elements that fall outside of the specified range are removed.
For example, LTRIM mylist 0 2 instructs Redis to keep only the list elements from index 0 to 2, discarding everything else. This enables a very simple but useful pattern: combining a List push operation with a List trim operation to add a new element and discard those that exceed a limit:
LPUSH mylist <some element>
LTRIM mylist 0 999
The preceding combination adds a new element and keeps only the 1000 most recent elements in the list. With LRANGE you can then access the top items without having to remember very old data.
While LRANGE is theoretically an O(N) command, accessing small ranges near the top or bottom of the list is a constant time process.
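The push-then-trim pattern can be modeled on a Python list to see what it does to the data. This is an illustrative sketch only: push_capped is a hypothetical helper, and the integers stand in for whatever items you would store.

```python
def push_capped(items, element, limit):
    """Model LPUSH followed by LTRIM 0 limit-1 on a plain Python list."""
    items.insert(0, element)   # LPUSH: the newest element goes to the head
    del items[limit:]          # LTRIM 0 limit-1: drop everything past the cap

timeline = []
for update_id in range(1500):          # 1500 hypothetical updates
    push_capped(timeline, update_id, limit=1000)

latest_ten = timeline[:10]             # comparable to LRANGE mylist 0 9

print(len(timeline))   # never grows past the cap
print(latest_ten)      # newest first
```

After 1500 pushes the list still holds exactly 1000 items, and the 500 oldest ones have been discarded.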
Hashes
Redis hashes map field names to values under a single key. While hashes are handy for representing objects, the number of fields you can put inside a hash has no practical limit (aside from available memory), so you can use hashes in many different ways within your application.
The HMSET command sets several fields of a hash, whereas HGET retrieves a single field. HMGET is similar to HGET but returns an array of values, one for each requested field.
Sets
Redis Sets are unordered collections of strings. The SADD command adds new elements to a set. It is also possible to do a variety of other operations on sets, such as testing whether a given element already exists, or performing the intersection, union, or difference of several sets.
For example, if you add three elements with SADD myset 1 2 3 and then ask Redis to return them with SMEMBERS myset, they may come back unsorted; Redis is free to return the elements in any order because there is no contract with the user regarding element ordering.
To check whether an element exists, use SISMEMBER: SISMEMBER myset 3 returns 1 because 3 is a member, whereas SISMEMBER myset 30 returns 0.
Sets are helpful in representing object-to-object relationships. For example, we can efficiently utilize sets to implement tags.
A simple way to model this problem is to have a set for every object we want to tag. The set contains the IDs of the tags associated with the object.
One illustration is tagging news articles. If article ID 1000 is tagged with tags 1, 2, 5, and 77, a set can associate these tag IDs with the news item:
> sadd news:1000:tags 1 2 5 77
(integer) 4
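The tagging model can be mimicked with Python's built-in sets. This is a rough sketch rather than a Redis client: the sadd helper below is hypothetical and only imitates the return value of SADD, and the second article (news:2000) is made up for the example.

```python
def sadd(store, key, *members):
    """Add members to the set stored at key; return how many were newly added."""
    target = store.setdefault(key, set())
    added = set(members) - target
    target |= added
    return len(added)

store = {}
first_call = sadd(store, "news:1000:tags", "1", "2", "5", "77")
second_call = sadd(store, "news:1000:tags", "5", "99")   # "5" is already present
sadd(store, "news:2000:tags", "2", "77", "10")           # hypothetical second article

has_tag_77 = "77" in store["news:1000:tags"]                     # like SISMEMBER
shared_tags = store["news:1000:tags"] & store["news:2000:tags"]  # like SINTER

print(first_call, second_call)
print(has_tag_77, sorted(shared_tags))
```

Because a set never stores duplicates, adding an existing tag again changes nothing, and intersecting two tag sets gives the tags the two articles have in common.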
Sorted Sets💫
Sorted sets are a data type that resembles a cross between a Set and a Hash. Sorted sets, like sets, are formed of unique, non-repeating string elements. Hence a sorted set is also a set in some ways.
While elements inside sets are not ordered, every element in a sorted set is associated with a floating-point value called the score, and elements are kept ordered by that score (this is why the type is also similar to a hash, since every element is mapped to a value).
Bitmaps
Bitmaps are not a data type in and of themselves but rather a set of bit-oriented operations defined on the String type. Because strings are binary-safe blobs with a maximum length of 512 MB, they can be used to set up to 2^32 distinct bits. Bit operations are classified into two types: constant-time single-bit operations, such as setting a bit to 1 or 0 or retrieving its value, and operations on groups of bits, such as counting the number of set bits in a given range of bits (population counting).
One of the most significant advantages of bitmaps is that they frequently give significant space reductions when storing information. In a system where incremental user IDs represent different users, for example, it is possible to remember a single bit of information (for example, whether a user wants to receive a newsletter) for 4 billion users using only 512 MB of memory.
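The 512 MB figure follows directly from the arithmetic, which can be checked with a few lines of Python (one bit per user, user IDs used as bit offsets):

```python
MAX_BITS = 2 ** 32                      # distinct bits addressable in one string

bytes_needed = MAX_BITS // 8            # eight bits fit in each byte
mib_needed = bytes_needed // (1024 * 1024)

print(MAX_BITS)      # a little over 4 billion users
print(bytes_needed)
print(mib_needed)    # size of the string in MB (binary megabytes)
```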
The SETBIT command takes the bit number as its first argument after the key, and the value to set the bit to (1 or 0) as its second. The command automatically enlarges the string if the addressed bit is outside the current string length. GETBIT just returns the value of the bit at the specified index. Out-of-range bits (addressing a bit that lies beyond the length of the string stored in the target key) are always treated as zero.
BITOP performs bitwise operations between strings. AND, OR, XOR, and NOT are the available operations.
BITCOUNT performs population counting, reporting the number of bits that are set to 1.
BITPOS searches for the first bit with the provided value of 0 or 1.
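The behavior of these commands can be approximated on a Python bytearray. This is a model for illustration, not Redis code: the function names are borrowed from the commands, and the bit numbering assumes that bit 0 is the most significant bit of the first byte, which is how Redis numbers bits.

```python
def setbit(buf, offset, value):
    """Set one bit, growing the buffer with zero bytes if needed; return the old bit."""
    byte_index, bit = divmod(offset, 8)
    if byte_index >= len(buf):
        buf.extend(b"\x00" * (byte_index + 1 - len(buf)))
    mask = 0x80 >> bit                    # bit 0 is the leftmost bit of a byte
    previous = 1 if buf[byte_index] & mask else 0
    if value:
        buf[byte_index] |= mask
    else:
        buf[byte_index] &= ~mask & 0xFF
    return previous

def getbit(buf, offset):
    """Return one bit; anything past the end of the buffer reads as 0."""
    byte_index, bit = divmod(offset, 8)
    if byte_index >= len(buf):
        return 0
    return 1 if buf[byte_index] & (0x80 >> bit) else 0

def bitcount(buf):
    """Count the bits set to 1 across the whole buffer."""
    return sum(bin(byte).count("1") for byte in buf)

flags = bytearray()
old_value = setbit(flags, 10, 1)   # the buffer grows to two bytes
print(old_value, len(flags), getbit(flags, 10), getbit(flags, 1000), bitcount(flags))
```

Setting bit 10 on an empty buffer enlarges it to two bytes, and reading a bit far beyond the end simply returns 0 instead of failing.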
🤗Bitmaps are commonly used for:
Real-time analytics of all kinds.
Storing space-efficient but high-performance boolean information associated with object IDs.
Assume you want to determine which of your website's visitors has the longest streak of daily visits. You start counting days from zero, the day you made your website public, and set a bit with SETBIT every time a user visits the website. To get the bit index, simply take the current Unix time, subtract the initial offset, and divide by the number of seconds in a day (normally 3600*24).
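The bit-index calculation and the streak count can be sketched as follows. The launch timestamp and the visit times are hypothetical values chosen for the example, a plain Python list of 0/1 values stands in for the user's bitmap, and day_index and longest_streak are illustrative names.

```python
SECONDS_PER_DAY = 3600 * 24

def day_index(now, launch):
    """Days elapsed since launch: the bit offset to set for a visit at time 'now'."""
    return (now - launch) // SECONDS_PER_DAY

def longest_streak(visit_bits):
    """Length of the longest run of consecutive 1 bits."""
    best = current = 0
    for bit in visit_bits:
        current = current + 1 if bit else 0
        best = max(best, current)
    return best

launch = 1_700_000_000                       # hypothetical go-live Unix time
visit_times = [
    launch + 10,                             # day 0
    launch + 90_000,                         # day 1
    launch + 2 * SECONDS_PER_DAY + 5,        # day 2
    launch + 5 * SECONDS_PER_DAY,            # day 5
]

visits = [0] * 7                             # one bit per day for the first week
for moment in visit_times:
    visits[day_index(moment, launch)] = 1    # the step SETBIT would perform

print(visits, longest_streak(visits))
```

The user visited on days 0, 1, 2, and 5, so the longest streak of consecutive daily visits is three days.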
HyperLogLogs🚥
A HyperLogLog is a probabilistic data structure used to count unique items (technically, this is referred to as estimating the cardinality of a set). Counting unique items usually requires an amount of memory proportional to the number of items you want to count, because you need to remember the elements you have already seen in order to avoid counting them multiple times. However, there is a set of algorithms that trade memory for precision: you end up with an estimate that has a standard error, which in the case of the Redis implementation is less than 1%. The beauty of this technique is that you no longer need an amount of memory proportional to the number of items counted and can instead use a constant amount: 12k bytes in the worst case, or much less if your HyperLogLog (we'll just call them HLLs from now on) has seen very few elements.
While HLLs are technically a distinct data structure in Redis, they are encoded as a Redis string, so you can use GET to serialize an HLL and SET to deserialize it back to the server.
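The "12k bytes" and "less than 1%" figures can be reproduced with a little arithmetic. The numbers below are assumptions that are not stated in this article: that the Redis implementation uses 16384 registers of 6 bits each, and that the standard error of a HyperLogLog is approximately 1.04 divided by the square root of the number of registers.

```python
import math

REGISTERS = 16384          # assumed number of registers in the Redis implementation
BITS_PER_REGISTER = 6      # assumed width of each register

dense_bytes = REGISTERS * BITS_PER_REGISTER // 8
standard_error = 1.04 / math.sqrt(REGISTERS)   # usual HyperLogLog approximation

print(dense_bytes)                      # worst-case size of the register array
print(f"{standard_error:.4%}")          # relative standard error of the estimate
```

Under these assumptions the register array takes 12288 bytes regardless of how many items are counted, and the expected error is a bit over 0.8%, which matches the figures quoted above.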
The HLL API is conceptually equivalent to utilizing Sets to accomplish the same objective. You would SADD every observed element into a set and then use SCARD to count the number of unique items within the set, as SADD will not re-add an existing element.
While you don't actually add items to an HLL, because the data structure only holds a state that does not include the actual elements, the API is the same:
Every time you see a new element, you add it to the count with PFADD.
Every time you want to retrieve the current approximation of the unique elements added with PFADD so far, you use PFCOUNT.
An example of a use case for this data structure is counting unique queries performed by users in a search form every day.
> pfadd hll a b c d
(integer) 1
> pfcount hll
(integer) 4
Streams
The Stream is a data type introduced with Redis 5.0 that models a log data structure in a more abstract way. However, the essence of a log remains intact: like a log file, which is commonly implemented as a file opened in append-only mode, Redis Streams are primarily an append-only data structure. At least conceptually, because streams are an abstract data type represented in memory, they can offer more powerful operations that overcome the limits of a log file.
We have only covered streams briefly here.
Let's move to our FAQs section.
FAQs
Can Redis store integers?
Yes. A string value that holds an integer is stored in its integer representation, so there is no overhead of keeping the string representation of the number.
How is data stored in Redis?
Redis is a non-relational key-value store that runs in memory (sometimes referred to as a data structure server). This implies that it stores data using keys and values – think of it as a huge dictionary that stores information using words and definitions.
Is Redis atomic?
Individual Redis commands are atomic, and Redis transactions are atomic as well: either all of the commands in a transaction are processed or none of them are.
Does Redis save to disk?
By default, Redis saves snapshots of the dataset to disk in a binary file named dump.rdb. You can configure Redis to save the dataset every N seconds if there are at least M changes, or you can call the SAVE or BGSAVE commands manually.
Can Redis lose the data?
It can. Redis is commonly used as a fast cache rather than as a database that guarantees durability, and with snapshot-based persistence any writes made since the last snapshot may be lost in a crash. Its use cases therefore often differ from those of traditional databases: for example, you can store sessions, performance counters, or similar data in it with very high performance and little real loss in the event of a crash.
Conclusion
In this article, we discussed the various data types in Redis, how they differ from each other, and the properties that make them useful. For more, you can visit our Blogs section and read our related articles on Tools for Redis, Streams, and NoSQL databases.