- Data Structures / Types
Binary-safe Strings (Keys)
Redis is actually a data structures server that can also be used as a plain key-value store like Memcached.
The Redis API offers far richer data structures than Memcached.
It's not always trivial to grasp how Redis data types work or which one to use to solve a specific problem.
The Redis API supports the following data structures (citing redis.io):
Strings are the simplest values you can associate with a Redis key. Since they are also the only data type in Memcached, newcomers to Redis tend to rely on them entirely.
Redis keys are binary safe, meaning that you can use any binary sequence as a key. The empty string is also a key.
> set mykey somevalue
OK
> get mykey
"somevalue"
The SET and GET commands are used to assign and retrieve string values.
The value can be a string (as shown above) or binary data of any kind, e.g. a JPEG image. The maximum allowed size is 512MB.
Memcached, by contrast, does not understand data structures - you must upload data that is pre-serialized. It does not care what your data looks like. Items are made up of a key, an expiration date, optional flags and raw data.
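Because a string-only cache treats every value as opaque bytes, the application has to serialize structured data itself. Here is a minimal sketch of that round trip, using a plain Python dict as a stand-in for the cache. The `cache` dict and the `cache_set` / `cache_get` helpers are hypothetical names for illustration, not part of any Memcached client API.

```python
import json

# Stand-in for a string-only cache such as Memcached: keys map to raw bytes.
cache = {}

def cache_set(key, obj):
    """Serialize obj to JSON bytes before storing it; the cache only sees bytes."""
    cache[key] = json.dumps(obj).encode("utf-8")

def cache_get(key):
    """Fetch the raw bytes and deserialize them back into a Python object."""
    raw = cache.get(key)
    return None if raw is None else json.loads(raw.decode("utf-8"))

cache_set("user:1", {"name": "alice", "tags": ["admin", "dev"]})
user = cache_get("user:1")
```

To change a single field, the whole value has to be fetched, deserialized, modified and stored again, which is the overhead that richer server-side data types avoid.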
Redis Strings Common Use Cases
Used mostly in caching HTML fragments or pages.
Redis Lists are sequences of ordered elements, implemented as Linked Lists. A list implemented with a Linked List has different properties from one implemented with an Array.
This means that the operation of adding a new element at the head or at the tail is performed in constant time. Adding an element with the LPUSH command to the head of a list with 10 elements takes the same time as adding one to the head of a list with 10 million elements.
The downside is that accessing an element by index is very fast (constant time) in lists implemented with Arrays, but slower in lists implemented with Linked Lists, where it takes time proportional to the index of the element.
Redis Lists are implemented with linked lists because, for a database system, it is crucial to be able to add elements to a very long list very quickly. Redis Lists can also be kept at a constant length in constant time.
> rpush mylist A
(integer) 1
> rpush mylist B
(integer) 2
> lpush mylist first
(integer) 3
> lrange mylist 0 -1
1) "first"
2) "A"
3) "B"
The RPUSH command adds a new element at the tail of a list, while the LPUSH command adds one at the head.
The LRANGE command extracts ranges of elements from lists.
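The behaviour of these three commands can be modelled in a few lines of Python. This is only a sketch of the semantics, not how Redis is implemented: it uses `collections.deque` as a stand-in because it also offers constant-time pushes at both ends. The helper functions `rpush`, `lpush` and `lrange` are hypothetical and named after the commands they imitate.

```python
from collections import deque

# A deque supports O(1) pushes at both ends, like a linked list.
mylist = deque()

def rpush(lst, value):
    """Append at the tail and return the new length, like RPUSH."""
    lst.append(value)
    return len(lst)

def lpush(lst, value):
    """Insert at the head and return the new length, like LPUSH."""
    lst.appendleft(value)
    return len(lst)

def lrange(lst, start, stop):
    """Return elements from start to stop inclusive; negative indexes count from the tail."""
    n = len(lst)
    if start < 0:
        start = max(n + start, 0)
    if stop < 0:
        stop = n + stop
    if stop < 0:
        return []
    return list(lst)[start:stop + 1]

rpush(mylist, "A")
rpush(mylist, "B")
lpush(mylist, "first")
everything = lrange(mylist, 0, -1)   # same order as the transcript above
```

Note that, unlike Python slices, the stop index is inclusive, so `0 -1` means "from the first element to the last".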
Redis Lists Common Use Cases
Often used to remember the latest updates posted by users on a social network.
Twitter, for example, stores the latest tweets posted by users in Redis lists.
Used for communication between processes using a producer-consumer pattern (where the producer pushes items into a list and a consumer pops those items and executes the corresponding actions). In fact, Redis has special list commands to make this more reliable and efficient.
Used for storing logs.
For simple list storage, Redis allows the use of lists as a capped collection - only remembering the latest n items and discarding the oldest items using the LTRIM command.
Redis Lists are suitable for blocking operations (inter-process communication) in the usual producer/consumer setup. Items are pushed into the list using LPUSH and then extracted using RPOP by consumers.
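Both patterns, the capped collection and the producer/consumer queue, can be sketched with a Python deque standing in for a Redis List. The helper `lpush_capped` and the sample item names are hypothetical. The sketch only mirrors the push-then-trim and push-then-pop logic described above.

```python
from collections import deque

def lpush_capped(lst, value, max_len):
    """Push at the head, then keep only the newest max_len items
    (the effect of LPUSH followed by LTRIM 0 max_len-1)."""
    lst.appendleft(value)
    while len(lst) > max_len:
        lst.pop()  # discard the oldest item at the tail
    return len(lst)

latest = deque()
for update in ["u1", "u2", "u3", "u4", "u5"]:
    lpush_capped(latest, update, max_len=3)
# latest now holds the three newest updates, newest first

# Producer/consumer: the producer pushes at the head and the consumer pops
# from the tail, so jobs are processed in the order they were produced.
jobs = deque()
for job in ["job-1", "job-2", "job-3"]:
    jobs.appendleft(job)          # producer side, like LPUSH
processed = []
while jobs:
    processed.append(jobs.pop())  # consumer side, like RPOP
```

This sketch simply drains the queue and stops. A real consumer would normally use a blocking pop (such as BRPOP) so that it can wait for new items instead of polling.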
Redis Sets are simply unordered collections of Strings.
The SADD command adds elements to a set.
Possible operations in a Redis Set include testing existence of specific elements, performing intersections, unions, differences between multiple sets etc.
> sadd myset 1 2 3
(integer) 3
> smembers myset
1. 3
2. 1
3. 2
> sismember myset 3
(integer) 1
> sismember myset 30
(integer) 0
The elements inserted into myset above are not sorted. 3 is a member of myset; 30 is not.
Common Use Cases for Redis Sets
Sets are good for expressing relationships, e.g. implementing tags. Used in this way, we would have a set for every object we intend to tag. The set would then contain the IDs of the tags associated with the object.
To model a web-based poker game, you may want to represent your deck with a set and then use the SPOP command to extract random elements.
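The two use cases above map naturally onto Python's built-in `set`, which makes for a quick way to see the idea. The object names, the tag IDs and the `spop` helper below are all made up for illustration. `spop` imitates the remove-and-return-a-random-member behaviour of SPOP.

```python
import random

# One set per tagged object, holding the IDs of its tags (hypothetical IDs).
article_tags = {
    "article:1": {1, 2, 5},
    "article:2": {2, 5, 7},
}

# Tags shared by both articles: the same idea as a set intersection in Redis.
common = article_tags["article:1"] & article_tags["article:2"]

def spop(s, rng=random):
    """Remove and return a random element, like SPOP; return None for an empty set."""
    if not s:
        return None
    member = rng.choice(sorted(s))
    s.remove(member)
    return member

# A 52-card deck as a set of unique strings such as "AS" or "TD".
deck = {f"{rank}{suit}" for rank in "A23456789TJQK" for suit in "CDHS"}
hand = [spop(deck) for _ in range(5)]
```

Because each card is removed from the deck as it is dealt, the same card can never be dealt twice.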
Redis Sorted Sets and Hashes
Redis Sorted Sets are a mix between a Set and a Hash.
Like Sets they are composed of unique, non-repeating string elements.
Unlike Sets, the elements in a Sorted Set are, well... sorted. Each element is mapped to a floating point value (the score) and ordered by it, much like a hash maps fields to values.
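The elements of Redis Sorted Sets are ordered with the following two rules:
- If x and y are two elements with different scores, x is greater than y if the score of x is greater than the score of y.
- If x and y have exactly the same score, x is greater than y if the string x is lexicographically greater than the string y.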
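The two ordering rules amount to sorting by the pair (score, element). Here is a minimal Python sketch with made-up members and scores. The function name `sorted_set_order` is hypothetical. For plain ASCII strings, Python's string comparison matches the lexicographic rule.

```python
def sorted_set_order(members):
    """Order (member, score) pairs by score first, then lexicographically by member."""
    return sorted(members, key=lambda pair: (pair[1], pair[0]))

entries = [("carol", 20.0), ("bob", 10.0), ("alice", 20.0), ("dave", 5.0)]
ordered = sorted_set_order(entries)
# dave has the lowest score, so it comes first;
# alice precedes carol because both have the same score of 20.0
```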
Redis Sorted Sets Use Cases
Ranges of sorted sets can be inclusive or exclusive. This feature is important because it makes Sorted Sets useful as a generic index.
Redis Sorted Sets are also used for leaderboards because the score of an element can be updated at any time!
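A toy leaderboard shows why updatable scores and range queries go well together. The `Leaderboard` class below is a hypothetical in-memory model, not a Redis client. Its methods loosely imitate ZADD, ZINCRBY and a score-range query, and it re-sorts on every read, which a real sorted set does not need to do.

```python
class Leaderboard:
    """Toy leaderboard: unique members, each mapped to a score that can change."""

    def __init__(self):
        self.scores = {}

    def add(self, member, score):
        """Insert a member or overwrite its score, like ZADD."""
        self.scores[member] = float(score)

    def incr(self, member, delta):
        """Add delta to a member's score, like ZINCRBY."""
        self.scores[member] = self.scores.get(member, 0.0) + delta
        return self.scores[member]

    def top(self, n):
        """Return the n highest-scoring (member, score) pairs, best first."""
        ranked = sorted(self.scores.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
        return ranked[:n]

    def range_by_score(self, low, high, exclusive_low=False):
        """Members with low <= score <= high, or low < score <= high if exclusive_low."""
        ranked = sorted(self.scores.items(), key=lambda kv: (kv[1], kv[0]))
        return [m for m, s in ranked
                if (s > low if exclusive_low else s >= low) and s <= high]

board = Leaderboard()
board.add("alice", 100)
board.add("bob", 80)
board.add("carol", 90)
board.incr("bob", 25)              # bob overtakes everyone
podium = board.top(2)
inclusive = board.range_by_score(90, 105)
exclusive = board.range_by_score(90, 105, exclusive_low=True)
```

With the inclusive range carol (score 90) is returned, while the exclusive lower bound leaves her out.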
Bitmaps (Bit Arrays)
Bitmaps are sets of bit-oriented operations defined on String types.
Since Strings are binary-safe blobs (with a max length of 512MB), they are suitable for setting up to 2^32 different bits.
Bit operations can be either constant-time single-bit operations or operations on groups of bits.
Setting a bit to 1 or 0 and getting its value are single-bit operations.
Counting the number of set bits in a given range of bits (e.g. population counting) would be an operation on groups of bits.
Bitmaps often provide extreme space savings when storing information.
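To make the single-bit and group operations concrete, here is a small Python model of a bitmap stored in a byte string. The `Bitmap` class is hypothetical. Its methods are named after SETBIT, GETBIT and BITCOUNT and follow the Redis convention that bit 0 is the most significant bit of the first byte.

```python
class Bitmap:
    """Toy bitmap backed by a growable byte string."""

    def __init__(self):
        self.data = bytearray()

    def setbit(self, offset, value):
        """Set the bit at offset to 0 or 1 and return its previous value."""
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.data):
            # Grow with zero bytes so that the offset becomes addressable.
            self.data.extend(bytes(byte_index + 1 - len(self.data)))
        mask = 0x80 >> bit_index  # bit 0 is the most significant bit of byte 0
        previous = 1 if self.data[byte_index] & mask else 0
        if value:
            self.data[byte_index] |= mask
        else:
            self.data[byte_index] &= ~mask & 0xFF
        return previous

    def getbit(self, offset):
        """Return the bit at offset; bits beyond the end read as 0."""
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.data):
            return 0
        return 1 if self.data[byte_index] & (0x80 >> bit_index) else 0

    def bitcount(self):
        """Count the bits set to 1 (population count over the whole string)."""
        return sum(bin(byte).count("1") for byte in self.data)

bm = Bitmap()
bm.setbit(7, 1)
bm.setbit(10, 1)
```

Two flags set at offsets 7 and 10 occupy just two bytes here, which is where the space savings come from.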
Use Cases of Redis Bitmaps
In a system where different users are stored with incremental IDs, it is possible to record a single bit of information (for example, whether a user wants to receive a newsletter) for about 4 billion users using just 512MB of memory.
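The 512MB figure follows directly from the number of addressable bits. A quick check of the arithmetic, taking 1MB as 1024 * 1024 bytes:

```python
BITS = 2 ** 32                      # one bit per user ID, about 4.29 billion users
BYTES = BITS // 8                   # eight bits fit in one byte
MEGABYTES = BYTES // (1024 * 1024)  # bytes in one MB
```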
Hyperloglogs are probabilistic data structures used to count unique objects (estimating the cardinality of a set).
Usually counting unique items requires using an amount of memory proportional to the number of items being counted (to remember those already counted).
With hyperloglogs, you no longer need to use the same amount of memory as items being counted. You can use constant memory instead.
Hyperloglogs in Redis are encoded as a Redis String, so you can call GET to serialize a Hyperloglog and SET to deserialize it back to the server.
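The constant-memory idea is easier to see in code. Below is a deliberately simplified HyperLogLog in Python. It is a sketch of the general algorithm under my own assumptions (SHA-1 as the hash function, 4096 registers, the class name `TinyHyperLogLog`) and not the encoding or the parameters Redis itself uses.

```python
import hashlib
import math

class TinyHyperLogLog:
    """Simplified HyperLogLog: a fixed number of registers, however many items are added."""

    def __init__(self, p=12):
        self.p = p                    # number of hash bits used to pick a register
        self.m = 1 << p               # number of registers (4096 when p=12)
        self.registers = bytearray(self.m)

    def add(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")          # 64-bit hash value
        index = x >> (64 - self.p)                     # top p bits choose the register
        rest = x & ((1 << (64 - self.p)) - 1)          # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1   # position of the leftmost 1 bit
        self.registers[index] = max(self.registers[index], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)          # bias constant, valid for m >= 128
        estimate = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return round(estimate)

hll = TinyHyperLogLog()
for _ in range(2):                    # adding every item twice does not change the count
    for i in range(20000):
        hll.add(f"user:{i}")
estimate = hll.count()
```

The result is an estimate rather than an exact count, but the memory used stays at 4096 one-byte registers whether 20 thousand or 20 million distinct items are added.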
Memcached has no other data types besides Strings.
One of the key areas where Redis completely blows Memcached out of the water is data persistence.
Memcached doesn't provide any sort of data persistence. Data stored in Memcached doesn't survive reboots because it is never written to disk.
No Redis feature is as misunderstood as Redis Persistence.
Redis provides three persistence options:
- RDB persistence.
- AOF persistence.
- Both RDB and AOF Persistence in the same instance.
RDB Persistence (Point-in-time Snapshotting)
RDB persistence takes point-in-time snapshots of your dataset at specified intervals.
By default Redis saves the snapshots on disk in a binary file called dump.rdb. You can then configure Redis to save the dataset every n seconds if there are at least m changes in the dataset.
Alternatively, you can manually call the SAVE or BGSAVE commands to take snapshots.
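The "every n seconds if there are at least m changes" rule can be read as a simple predicate. The function below is a hypothetical model of that rule, and the two save points are example values chosen for illustration. In redis.conf each such pair is written as a `save <seconds> <changes>` line.

```python
def should_snapshot(seconds_since_last_save, changes_since_last_save, save_points):
    """Return True if any (seconds, changes) save point has been reached."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )

# Example save points: after 900 s if at least 1 change, after 60 s if at least 1000 changes.
SAVE_POINTS = [(900, 1), (60, 1000)]

busy = should_snapshot(61, 1500, SAVE_POINTS)    # many changes, a minute has passed
quiet = should_snapshot(61, 10, SAVE_POINTS)     # too few changes for the 60 s rule
```

Several save points can be combined so that a busy instance snapshots often while a quiet one still snapshots eventually.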
RDB is a very compact single-file point-in-time representation of your Redis data.
The Pros of RDB Persistence
- RDB files are perfect for backups. This allows easy data restoration on failovers.
- RDB is good for disaster recovery. A single-compact file is easily portable between data centers.
- Maximizes Redis performance. The parent process never performs disk I/O; it only forks a child process that does the rest.
- Allows faster restarts with big datasets.
The Cons of RDB Persistence
- Does not completely eliminate the chance of data loss if Redis stops working (say, during a power outage). Since you'll usually create an RDB snapshot every five minutes or more, the latest data could be lost.
- RDB needs to fork often, and a fork can be time consuming if the dataset is huge and CPU performance is sub-optimal.
AOF persistence logs every write operation received by the server. The log is then replayed at server startup to reconstruct the original dataset.
Commands are logged in the Redis protocol format, in an append-only fashion. When the log gets too big, Redis can rewrite it in the background.
Reconstructing the original dataset from the log guarantees completeness.
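The log-and-replay idea can be illustrated with a tiny in-memory model. Everything here is a sketch: `AppendOnlyStore`, `apply` and `replay` are hypothetical names, the log is a Python list rather than a file, and only three commands are supported.

```python
def apply(dataset, command):
    """Apply one write command (a tuple) to an in-memory dict."""
    name, key, *args = command
    if name == "SET":
        dataset[key] = args[0]
    elif name == "DEL":
        dataset.pop(key, None)
    elif name == "INCR":
        dataset[key] = int(dataset.get(key, 0)) + 1
    else:
        raise ValueError(f"unsupported command: {name}")

class AppendOnlyStore:
    def __init__(self):
        self.dataset = {}
        self.log = []             # stand-in for the append-only file

    def execute(self, *command):
        apply(self.dataset, command)
        self.log.append(command)  # every write is appended, never edited in place

def replay(log):
    """Rebuild a dataset from scratch by re-running the logged writes in order."""
    dataset = {}
    for command in log:
        apply(dataset, command)
    return dataset

store = AppendOnlyStore()
store.execute("SET", "visits", "10")
store.execute("INCR", "visits")
store.execute("SET", "tmp", "x")
store.execute("DEL", "tmp")
rebuilt = replay(store.log)       # what a restart would reconstruct
```

The log holds four commands even though the final dataset has a single key, which hints at why such a log grows faster than a snapshot and benefits from being rewritten.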
The Pros of AOF Persistence
- AOF logs are more durable because you can choose between different fsync policies, e.g. fsync on every query or fsync every second.
- AOF logs are append-only, so there are no seeks and no corruption problems, even after a power outage. There is also a redis-check-aof tool for fixing logs that end with a half-written command.
- Rewrites can be done in the background while a minimal log is maintained for current operations.
- AOF logs are easy to parse and export.
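The trade-off between fsync policies can be sketched with Python's `os.fsync`. The `LogWriter` class and the file name are hypothetical. The policy names `always`, `everysec` and `no` are borrowed from the values of Redis's appendfsync setting, but this sketch only checks the clock when a record is written, which is a simplification.

```python
import os
import tempfile
import time

class LogWriter:
    """Append-only writer with three fsync policies: 'always', 'everysec', 'no'."""

    def __init__(self, path, policy="everysec"):
        self.file = open(path, "ab")
        self.policy = policy
        self.last_sync = time.monotonic()
        self.sync_count = 0

    def _sync(self):
        self.file.flush()               # move Python's buffer to the operating system
        os.fsync(self.file.fileno())    # ask the operating system to write to the device
        self.last_sync = time.monotonic()
        self.sync_count += 1

    def append(self, record):
        self.file.write(record + b"\n")
        if self.policy == "always":
            self._sync()
        elif self.policy == "everysec" and time.monotonic() - self.last_sync >= 1.0:
            self._sync()
        # policy "no": leave flushing to the operating system

    def close(self):
        self.file.close()

path = os.path.join(tempfile.mkdtemp(), "appendonly.log")
writer = LogWriter(path, policy="always")
for i in range(3):
    writer.append(f"SET key{i} {i}".encode())
writer.close()
```

With `always`, each of the three writes pays for its own fsync. With `everysec` or `no` the writes return faster, but the most recent records may be lost if the machine loses power.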
The Cons of AOF Persistence
- AOF files are usually bigger than equivalent RDB files for a similar dataset.
- Depending on the exact fsync policy, AOF can be slower than RDB. With fsync disabled, however, it should be as fast as RDB, even under high load.
- There have been bugs in AOF that prevented the exact dataset from being reproduced on reloading.
Which Redis Persistence Option Should You Use?
Short answer. Both.
Using both gives you a degree of data safety akin to what PostgreSQL provides. If you can live with a few minutes of data loss, you can use RDB Persistence alone. You should not use AOF Persistence alone because point-in-time RDB snapshots are perfect for backups, for faster restarts and as a fallback in case of AOF bugs.
Memcached has no support for data replication.
Data replication is used both for scalability (multiple slaves for read-only queries) and data safety.
Redis, by contrast, has a very intuitive master-slave replication that allows slave Redis servers to be exact copies of their masters.
Slaves automatically reconnect to their masters every time the connection breaks.
A master can have multiple slaves.
Redis replication is achieved in three main ways:
- When master and slave are connected, the master streams commands to replicate real-time changes in the master dataset to the slave.
- When the connection breaks, the slave reconnects and attempts a partial resynchronization (fetching only the part of the command stream it missed).
- When the partial resync fails, the slave tries a full resync. For this, the master needs to take a full snapshot of its data, send it to the slave and then resume the streams.
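The choice between a partial and a full resync can be sketched with a replication offset and a bounded backlog of recent commands. This is a toy model under my own assumptions: the `Master` and `Replica` classes (the replica plays the role of the slave), the backlog size of 3 and the offset bookkeeping are all hypothetical and much simpler than the real protocol.

```python
class Master:
    def __init__(self, backlog_size=3):
        self.dataset = {}
        self.offset = 0               # number of write commands applied so far
        self.backlog = []             # most recent (offset, command) pairs
        self.backlog_size = backlog_size

    def write(self, key, value):
        self.dataset[key] = value
        self.offset += 1
        self.backlog.append((self.offset, (key, value)))
        self.backlog = self.backlog[-self.backlog_size:]

class Replica:
    def __init__(self):
        self.dataset = {}
        self.offset = 0

    def sync(self, master):
        """Catch up with the master and return the kind of resync that was used."""
        if self.offset == master.offset:
            return "up-to-date"
        oldest = master.backlog[0][0] if master.backlog else master.offset + 1
        if self.offset + 1 >= oldest:
            # Partial resync: replay only the commands the replica missed.
            for offset, (key, value) in master.backlog:
                if offset > self.offset:
                    self.dataset[key] = value
            kind = "partial"
        else:
            # Full resync: the missed commands are gone, so copy a full snapshot.
            self.dataset = dict(master.dataset)
            kind = "full"
        self.offset = master.offset
        return kind

master, replica = Master(), Replica()
master.write("a", 1)
master.write("b", 2)
first = replica.sync(master)          # short gap, still in the backlog
for key in "cdef":
    master.write(key, 0)              # long disconnection: backlog overflows
second = replica.sync(master)
```

The first call can be served from the backlog, while the second cannot because the oldest missed command has already been dropped from it.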
By default, Redis replication is asynchronous, which gives it high performance and low latency.
Synchronous replication of specific writes can be requested by clients using the WAIT command. Even then, clients still run the risk of losing acknowledged writes in case of a failover.
Slaves can connect with other slaves in a cascading-like structure. Sub-slaves then get the exact same replication stream.
Redis replication is non-blocking. Slave Redis servers are only blocking for a brief window after the initial sync - this is when the old dataset is deleted and the new one loaded.
Redis replication can be used to avoid the cost of having the master write the full dataset to disk - usually by configuring
When Redis replication is used, Redis persistence is strongly advised on both the master and the slaves. Otherwise, instances should be configured to avoid restarting automatically after a reboot.
This averts the risk of data being wiped out from the master and all its slaves during a failure. The same logic applies when using Redis Sentinel.
Memcached is often preferable when caching small, static data because its memory management is more efficient than Redis's in the simplest use cases.
When data is dynamic, that memory efficiency disappears rather quickly. Large data sets often involve serialized data, which almost always requires more storage space.
Although Memcached is more memory efficient than Redis for simple key-value stores, Redis becomes much more efficient when using hashes.
Using Hashes to Abstract Plain Key/Val Stores in Redis
It's always recommended to use hashes in Redis whenever possible. Small hashes are encoded in a very small space. For instance, instead of storing users in a web app with different keys for fields such as name, email, password etc., use a single hash with all the required fields.
Basically, it's possible to model a plain key-value store using Redis hashes where the values are just Strings. This is not just more memory efficient than plain Redis keys, it's also much more memory efficient than Memcached.
The logic is simple: a few keys use a lot more memory than a single key (containing a hash with a few fields).
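One way to model a plain key-value store on top of hashes is to split each key into a hash name and a field, so that many logical keys share one top-level key. The Python sketch below shows only the key layout, using nested dicts as a stand-in for Redis. The helpers `split_key`, `kv_set` and `kv_get`, the `object:` prefix and the two-character field length are assumptions for illustration.

```python
# Stand-in for Redis: top-level keys map to hashes (dicts of field -> value).
store = {}

def split_key(key, field_len=2):
    """Split 'object:1234' into the hash key 'object:12' and the field '34'."""
    return key[:-field_len], key[-field_len:]

def kv_set(key, value):
    hash_key, field = split_key(key)
    store.setdefault(hash_key, {})[field] = value   # like HSET hash_key field value

def kv_get(key):
    hash_key, field = split_key(key)
    return store.get(hash_key, {}).get(field)       # like HGET hash_key field

for n in (1234, 1235, 1299, 5600):
    kv_set(f"object:{n}", f"value-{n}")
```

Four logical keys end up in only two top-level keys here. Python dicts do not reproduce the memory savings, which come from how Redis encodes small hashes.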
Data Persistence: Data stored in Redis can survive reboots because it is written to disk according to the configured persistence policy (for example, an fsync every second with AOF). Memcached does not support persistent data.
Data Analytics: Redis is, functionally, a database that resides in memory (one of the reasons developers love using Redis for real-time analytics). In contrast, Memcached is a cache: it serves data from the server's memory so that the application does not have to hit the database again.