I’m talking, of course, about Redis. Do you know all it can do? Let’s discover this database together. The goal is to understand when Redis is the right tool to use. And why it is interesting to learn how to use it.
We’ll go through Redis’ basic principles, the different data structures it offers, and its specific features: what it brings to the table, and the situations where Redis might be the solution to your problem.
Redis. What is it, what does it do? Why is it called “Redis”?
Redis stands for Remote Dictionary Server. And Redis is a distributed NoSQL in-memory database.
It is a key/value data store or, more exactly, a key/data-structure store.
It is worth noting that Redis has been voted among the most liked databases in the Stack Overflow survey every year since 2017. And according to the DB-Engines ranking, it is the most popular key-value database. Why is that?
Well, it's two things: first, everything happens in memory. And second, the data structures Redis supports. Those two features are what give Redis its strength.
First: Redis stores data in memory. What is the point of that?
Well, memory is fast. In contrast, let’s look at what happens when a SQL database receives a request for a row with an index. The database has to use the index to find out where the data is, then read a file from the disk, then load the data into memory, and then return the data.
Accessing in-memory data is faster by an order of magnitude. There’s no interaction with the file system. Neither when locating where the data is nor when reading it. So everything just happens much faster. An in-memory data store allows for much greater speed, including real-time usage.
Now, what is the point of data structures?
In short, Redis allows some interesting use cases that other traditional databases cannot meet natively.
To understand this, we’re going to go through them in detail. But first, let's look at some of Redis’ particular features.
First of all, you interact with Redis using text commands. So, for example, to create a key, we send SET key value. To retrieve it, we will send GET key.
You can send these commands with the Redis command-line client, called redis-cli, when testing. You can also find an interactive tutorial on the Redis website to explore these commands.
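From application code, the same two commands go through a client library. Here is a minimal sketch in Python, assuming the redis-py package; the function name is made up for illustration, and `client` is assumed to be something like `redis.Redis(decode_responses=True)`.

```python
def save_and_load(client, key, value):
    """Store a value under a key, then read it back.

    `client` is assumed to be a redis-py client, for example
    redis.Redis(host="localhost", port=6379, decode_responses=True).
    """
    client.set(key, value)   # sends: SET key value
    return client.get(key)   # sends: GET key (None if the key is missing)
```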
Now, what are these commands? First, there are a lot of general-use commands which apply regardless of the data structure used.
For example, listing the different keys that follow a certain pattern. Or deleting one or more keys. Or moving or copying the content of a key, or retrieving the type of the content of a key.
And then of course, there are specific commands for each type of data structure. When coding, these commands are sent by the client libraries.
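As a rough illustration of those general-purpose commands, here is a sketch using redis-py method names (`scan_iter`, `delete`, `type`). The helper names and the key pattern are assumptions made for the example, and `client` is again assumed to be a redis-py client.

```python
def delete_matching(client, pattern):
    """Delete every key matching a glob-style pattern such as 'session:*'.

    Returns the names of the deleted keys.
    """
    deleted = []
    # scan_iter walks the keyspace incrementally with the SCAN command,
    # which avoids blocking the server the way KEYS can on a large database.
    for key in client.scan_iter(match=pattern):
        client.delete(key)   # DEL key
        deleted.append(key)
    return deleted


def describe(client, key):
    """Return the type of the value stored at key ('string', 'hash', 'list', ...)."""
    return client.type(key)  # TYPE key ('none' if the key does not exist)
```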
Lua scripts can also trigger Redis commands, which allows complex queries.
On their own, the commands do little more than read and write data, but Lua scripting lets you combine them into join-like queries or even apply some transactional logic.
And as the script runs on the database, there’s no back-and-forth between the application server and the database, which makes things even faster.
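To make this concrete, here is a hedged sketch of a small Lua script sent with redis-py's `eval`. The script, the key names and the "transfer points between two counters" scenario are all invented for illustration; the point is only that the check and the two writes happen on the server in one round trip.

```python
# Lua source executed atomically inside Redis. KEYS and ARGV are the
# arrays Redis passes to every script.
TRANSFER_SCRIPT = """
local balance = tonumber(redis.call('GET', KEYS[1]) or '0')
local amount = tonumber(ARGV[1])
if balance < amount then
  return 0
end
redis.call('DECRBY', KEYS[1], amount)
redis.call('INCRBY', KEYS[2], amount)
return 1
"""


def transfer(client, source_key, target_key, amount):
    """Move `amount` from one counter to another in a single round trip.

    Returns True if the transfer happened, False if the source was too low.
    """
    # EVAL script numkeys key [key ...] arg [arg ...]
    result = client.eval(TRANSFER_SCRIPT, 2, source_key, target_key, amount)
    return result == 1
```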
PUB / SUB
Redis also provides mechanisms for publishing and subscribing to message channels.
This allows for some very interesting mechanisms. For example, it allows Redis to function as a message broker. These mechanisms are somewhat outside the scope of this article. It’s more interesting to focus on the different data structures.
But before going on to those, a word about keys. They are binary-safe strings and can theoretically be as big as 512 MB.
In practice, it’s better to keep them much shorter. I would recommend structuring them so that they contain a bit of information, for example by specifying the type of object, a colon, and the identifier: user:42.
You should know, by the way, that it is possible, as I said, to delete these keys, but it is also possible to give them a TTL, a "time to live" after which they disappear. This is typically useful when using Redis as a cache.
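A small sketch of both ideas, the naming convention and the TTL, might look like this. The helper names are hypothetical; `set(..., ex=...)` and `ttl` are redis-py methods, and `client` is assumed to be a redis-py client.

```python
def make_key(object_type, identifier):
    """Build a key such as 'user:42' from an object type and an identifier."""
    return f"{object_type}:{identifier}"


def remember_for(client, key, value, seconds):
    """Store a value that Redis removes by itself after `seconds` seconds."""
    client.set(key, value, ex=seconds)  # SET key value EX seconds
    return client.ttl(key)              # TTL key: remaining lifetime in seconds
```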
The first type of data structure, the simplest, is the String.
This is the go-to data type for caching. For example, when you need to store the result of a complicated query. And as I mentioned before, this works particularly well with a TTL.
We just need to serialise the data, for example as JSON, and store it. The next time, we retrieve the stored value instead of redoing the calculation.
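That caching pattern can be sketched in a few lines. This is one possible shape, not the only one: the `cached` helper and its default TTL of 60 seconds are assumptions, and `client` is assumed to be a redis-py client created with `decode_responses=True`.

```python
import json


def cached(client, key, compute, ttl_seconds=60):
    """Return the cached value for `key`, computing and storing it on a miss.

    `compute` is any zero-argument function returning JSON-serialisable data.
    """
    raw = client.get(key)
    if raw is not None:
        return json.loads(raw)          # cache hit: skip the computation
    value = compute()                   # cache miss: do the expensive work
    client.set(key, json.dumps(value), ex=ttl_seconds)
    return value
```

You would call it as `cached(client, "report:2024", build_report)`, where `build_report` stands for whatever slow query you want to avoid repeating.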
You can put binary data in, but you are again limited to 512 MB.
And you should know that even if it is a string, you can apply various manipulations to it.
For example, if the string represents a number, you can increment or decrement it, which allows us to create counters.
We can also have conditional assignments, which only assign a value if it doesn’t exist. So for example, we’d say "if the counter doesn't exist, assign it the value 0".
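Put together, a counter with a conditional initial value could look like the sketch below. The function name is made up; `set(..., nx=True)` and `incr` are redis-py methods, and `client` is assumed to be a redis-py client.

```python
def count_visit(client, key):
    """Create the counter at 0 if it is missing, then add one to it."""
    client.set(key, 0, nx=True)  # SET key 0 NX: only written if key is absent
    return client.incr(key)      # INCR key: atomic increment, returns new value
```

INCR on its own already treats a missing key as 0, so the explicit SET with NX is only there to show the conditional assignment described above.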
The Hash Map
Now let’s talk about a slightly more complex data structure, but one that is just as useful: the hash map.
This is typically the equivalent of an SQL row or MongoDB document.
For example, we can make a user:42 key with the first name being David, the last name being Kodaps, and the id being 42.
The hash-map allows us to represent the properties of a given entity in the code.
For those of you who are more familiar with SQL, it should be noted that there is no schema. Just because a key is called user:id does not mean it has the same fields as another key following the same naming convention.
It is the developer’s responsibility to manage any missing data.
We have the same notion of a counter inside the hash, in the sense that we can increment the value of a field.
You can assign and retrieve all the values of a hash map at once, or retrieve only some. We can also delete a field in the hash map and list the fields in the hash.
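Here is a sketch of the user:42 example with redis-py's hash methods (`hset`, `hgetall`, `hincrby`). The field names and the `logins` counter are assumptions chosen for the example, and `client` is assumed to be a redis-py client.

```python
def save_user(client, user_id, first_name, last_name):
    """Store a user as a hash under the key 'user:<id>'."""
    key = f"user:{user_id}"
    # HSET key field value [field value ...]
    client.hset(key, mapping={
        "id": user_id,
        "first_name": first_name,
        "last_name": last_name,
    })
    return key


def load_user(client, user_id):
    """Fetch every field of the user hash; an empty dict means no such key."""
    return client.hgetall(f"user:{user_id}")  # HGETALL key


def record_login(client, user_id):
    """Increment a per-user counter stored as a field of the hash."""
    return client.hincrby(f"user:{user_id}", "logins", 1)  # HINCRBY key field 1
```

Since there is no schema, `load_user` can return a hash without a `logins` field, and the calling code has to cope with that.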
Next come Lists. Redis Lists are linked lists of strings, which makes them an inefficient data model for locating an element by its index in the middle of the chain.
However, for adding or removing elements at either end, a list is perfect: the cost is constant, whatever the length. All you have to do is create or delete a link at one of the ends of the list.
This is perfect for making queues (in first-in-first-out or FIFO mode) or stacks (in last-in-first-out or LIFO mode).
We can limit this list’s length, making it trivial to manage the list of the last X elements.
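The queue, the stack and the capped "last X elements" list can each be sketched with a couple of redis-py list methods (`lpush`, `rpop`, `lpop`, `ltrim`, `lrange`). The helper names are hypothetical, and `client` is assumed to be a redis-py client.

```python
def enqueue(client, queue_key, job):
    """Add a job at the head of the list."""
    client.lpush(queue_key, job)  # LPUSH key value


def dequeue(client, queue_key):
    """FIFO: take the oldest job from the tail (None if the list is empty)."""
    return client.rpop(queue_key)  # RPOP key


def pop_latest(client, stack_key):
    """LIFO: take the most recently pushed item from the head."""
    return client.lpop(stack_key)  # LPOP key


def remember_recent(client, list_key, item, max_items):
    """Keep only the `max_items` most recent items in the list."""
    client.lpush(list_key, item)
    client.ltrim(list_key, 0, max_items - 1)  # LTRIM key 0 max_items-1
    return client.lrange(list_key, 0, -1)     # LRANGE key 0 -1: whole list
```

Pushing on one end and popping from the other gives a queue; pushing and popping on the same end gives a stack.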
Sets are collections of unique strings. They’re handy for counting distinct items, since duplicates are ignored. They’re also useful for modelling 1-N or N-N relationships. For example, you can imagine that a Set represents all the tags of a blog post or all the blog posts with the same tag.
It becomes trivial to retrieve these relationships, which might require a heavy join query in SQL.
You can also make equiprobable random draws in these Sets, which offers fun possibilities like choosing one or more users at random or making a card draw.
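The blog post and tag example could be sketched as follows with redis-py's set methods (`sadd`, `sinter`, `srandmember`). The key layout (`post:<id>:tags`, `tag:<name>:posts`) is an assumption made for the example, and `client` is assumed to be a redis-py client.

```python
def tag_post(client, post_id, tags):
    """Record the relationship in both directions: post -> tags, tag -> posts.

    `tags` is expected to be a non-empty list of strings.
    """
    client.sadd(f"post:{post_id}:tags", *tags)  # SADD key member [member ...]
    for tag in tags:
        client.sadd(f"tag:{tag}:posts", post_id)


def posts_with_all_tags(client, tags):
    """Posts carrying every given tag, computed as a set intersection."""
    return client.sinter([f"tag:{tag}:posts" for tag in tags])  # SINTER key [key ...]


def draw_random_post(client, tag):
    """Pick one post at random among those carrying the tag."""
    return client.srandmember(f"tag:{tag}:posts")  # SRANDMEMBER key
```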
Redis also has sorted Sets, where each element is associated with a score.
This makes it trivial — in terms of performance — to extract the top values to make a leaderboard, a ranking, or to retrieve a segment around a given score.
Sorted Sets are also useful, therefore, to model time series.
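A leaderboard built on a sorted Set might be sketched like this, using redis-py's `zincrby`, `zrevrange` and `zrangebyscore`. The helper names are hypothetical, and `client` is assumed to be a redis-py client.

```python
def add_points(client, board_key, player, points):
    """Add points to a player's score, creating the entry if needed."""
    return client.zincrby(board_key, points, player)  # ZINCRBY key increment member


def top_players(client, board_key, count):
    """Return the `count` best (player, score) pairs, highest score first."""
    # ZREVRANGE key 0 count-1 WITHSCORES
    return client.zrevrange(board_key, 0, count - 1, withscores=True)


def players_between(client, board_key, low, high):
    """Return the players whose score lies in [low, high], lowest first."""
    return client.zrangebyscore(board_key, low, high)  # ZRANGEBYSCORE key low high
```

For a time series, the same structure works with a timestamp as the score, so that `players_between` becomes "everything recorded between two instants".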
Other data structures
There is the Bitmap data structure. These are bit arrays, which work like a bit mask. They can be very useful for stuff like managing rights in a session.
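As an illustration of the session rights idea, here is a sketch using redis-py's `setbit`, `getbit` and `bitcount`. The permission numbering is entirely hypothetical, and `client` is assumed to be a redis-py client.

```python
# Hypothetical permission numbering, used as bit offsets.
CAN_READ, CAN_WRITE, CAN_ADMIN = 0, 1, 2


def grant(client, session_key, permission):
    """Turn on the bit for one permission."""
    client.setbit(session_key, permission, 1)  # SETBIT key offset 1


def is_allowed(client, session_key, permission):
    """Check a single permission bit; unset bits read as 0."""
    return client.getbit(session_key, permission) == 1  # GETBIT key offset


def permission_count(client, session_key):
    """Number of permissions granted, i.e. number of bits set to 1."""
    return client.bitcount(session_key)  # BITCOUNT key
```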
Next, there is the HyperLogLog data structure. The name is weird. The concept is fascinating but a bit complex to wrap your mind around. HyperLogLogs are probabilistic data structures for counting unique items. What does this mean? HyperLogLogs allow you to estimate the number of unique things (such as IPs) while sacrificing accuracy for performance and maintaining a small memory footprint. Typically this is useful for analytics, where you need orders of magnitude, not exact numbers.
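Counting unique IPs per day could be sketched like this with redis-py's `pfadd` and `pfcount`. The key layout `visitors:<day>` is an assumption for the example, and `client` is assumed to be a redis-py client.

```python
def record_visitor(client, day, ip_address):
    """Add one observation to the day's HyperLogLog."""
    client.pfadd(f"visitors:{day}", ip_address)  # PFADD key element


def unique_visitors(client, day):
    """Approximate number of distinct IPs seen that day."""
    return client.pfcount(f"visitors:{day}")  # PFCOUNT key
```

Adding the same IP twice does not change the count, and on a real server the result is an estimate rather than an exact figure.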
There are also data structures for geospatial indexing and for streams, which work like append-only logs.
In short, as you can see, Redis offers a whole load of special features with very versatile data models.
Redis’ Use Cases
Now, when should you use Redis, and when should you not?
Let’s start with when you should not.
The most obvious case where you should avoid Redis is when manipulating very large datasets. Redis is made for data that fits within a computer’s memory. And even if it is possible to cluster Redis servers, memory is still the most expensive storage medium.
The other case where you should avoid Redis, which is a little more subtle, is when you must have ACID transactions. I have a video that explains what this means.
Since memory storage is volatile, you risk losing a few seconds of information if there is a power cut, even if you are backing up your data to disk. However, if you don't need that level of Durability (the D of ACID), Redis can function as a primary database.
Now what else makes a good Redis use case?
Redis is ideal for use as a cache, i.e. to speed up a stack where data access is slower. The caching of frequently used objects is probably Redis’ most frequent use case. Redis is also very useful for managing sessions or analytical data. In general, Redis is very useful for all its specific use cases: coding a leaderboard, a task queue, or a message transfer system.
In short, whatever stack you use, Redis can probably be used as a complement. It can provide specific functionalities that traditional databases do not.
But to make that choice, you need to know and understand its features, and I hope this article has given you a taste of what Redis can do for you.