Persisting Redis data on disk

Redis keeps its data in memory, which makes it very fast to read from and write to. The downside is that if the server crashes, you lose everything that was in memory. For some applications it's OK to lose this data after a crash, but for others it's important to be able to reload the Redis data when the server restarts.

In this post we're going to discuss the two persistence options Redis provides:

Snapshotting

With the classic default configuration, Redis stores snapshots of your data to disk in a dump.rdb file under the following conditions:

  1. Every minute if at least 10,000 keys were changed
  2. Every 5 minutes if at least 10 keys were changed
  3. Every 15 minutes if at least 1 key was changed

So if you're doing heavy work and changing lots of keys, a snapshot is generated every minute. If your changes are more moderate, you get a snapshot every 5 minutes, and if there are very few changes, a snapshot is taken every 15 minutes.
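The rule above can be sketched as a small check. This is a simplified model and not how Redis implements it internally; the names `SAVE_POINTS` and `snapshot_due` are made up for illustration, and the values mirror the `save 900 1`, `save 300 10` and `save 60 10000` lines found in classic redis.conf files.

```python
# Simplified model of Redis save points: (seconds, minimum changed keys).
# Hypothetical names; values mirror the classic redis.conf defaults.
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]


def snapshot_due(seconds_since_last_save, changes_since_last_save,
                 save_points=SAVE_POINTS):
    """Return True if at least one save point is satisfied."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


if __name__ == "__main__":
    print(snapshot_due(61, 20000))  # heavy write load, one minute has passed
    print(snapshot_due(61, 50))     # moderate load, too early for the 5-minute rule
    print(snapshot_due(301, 50))    # moderate load, five minutes have passed
```

A snapshot is taken as soon as any one of the rules matches, which is why heavy traffic leads to more frequent snapshots.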

If Redis fails to create a snapshot of your data, by default it stops accepting new writes and returns an error to clients, so that you know something is wrong.

To protect your data against disk damage, you can copy your RDB file to external storage such as Amazon S3 at a regular interval.

You can change how often these snapshots are taken, and you can create a snapshot manually at runtime with the SAVE command (or BGSAVE, which does the work in the background without blocking clients).
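The snapshot frequency is controlled by `save <seconds> <changes>` lines in redis.conf. As a small illustration, the hypothetical helper below (not part of Redis or any client library) turns a list of rules into those lines; the example values are arbitrary.

```python
def render_save_directives(save_points):
    """Turn (seconds, changes) pairs into redis.conf `save` lines."""
    return "\n".join(
        f"save {seconds} {changes}" for seconds, changes in save_points
    )


if __name__ == "__main__":
    # Example: snapshot every 10 minutes after 100 changes,
    # or every minute after 5000 changes.
    print(render_save_directives([(600, 100), (60, 5000)]))
```

The generated lines would replace the existing `save` lines in your configuration file.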

Append-only file

This works differently: every time you send a write command to Redis, it is appended to a file. Later on, Redis can replay that file to rebuild the entire dataset.

After a while this file can get really big, since it contains the entire history of every key. To deal with that, Redis rewrites the file every once in a while to keep it as small as possible: instead of storing the entire history of a key, the new file starts from the latest state of that key.

// instead of
INCR counter
INCR counter
INCR counter

// it stores
SET counter 3
You can configure when this rewrite happens based on the current file size and the allowed percentage of growth since the last rewrite (the auto-aof-rewrite-min-size and auto-aof-rewrite-percentage settings).
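A rough sketch of that decision is shown below. The function and parameter names are hypothetical; the default values follow the usual `auto-aof-rewrite-percentage 100` and `auto-aof-rewrite-min-size 64mb` settings, and the exact arithmetic inside Redis may differ.

```python
MB = 1024 * 1024


def should_rewrite(current_size, size_after_last_rewrite,
                   growth_percentage=100, min_size=64 * MB):
    """Decide whether the append-only file has grown enough to rewrite it."""
    if current_size < min_size:
        return False  # small files are never worth rewriting
    base = max(size_after_last_rewrite, 1)  # avoid dividing by zero
    growth = (current_size - base) * 100 / base
    return growth >= growth_percentage


if __name__ == "__main__":
    print(should_rewrite(130 * MB, 64 * MB))  # more than doubled since last rewrite
    print(should_rewrite(100 * MB, 64 * MB))  # grew by roughly half
    print(should_rewrite(10 * MB, 1 * MB))    # grew a lot but still below min_size
```

The minimum size exists so that a tiny file that doubles from a few kilobytes does not trigger a rewrite over and over.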

Even though Redis sends each new command to be appended to the file, the OS usually keeps these writes in a buffer and flushes that buffer to disk every few seconds. If something goes wrong while commands are still in the buffer and not yet stored on disk, you lose them. To limit this, Redis forces a buffer flush every second, so in the worst case you lose about one second of writes.

You can, however, configure Redis to force a flush on every single command instead of every second (appendfsync always). That makes each write noticeably slower and is generally not recommended, but it's the safest option ¯\_(ツ)_/¯.
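The trade-off between the flush policies can be sketched with a toy append-only log. The `AppendLog` class is hypothetical, and the policy names are borrowed from the `appendfsync` setting (`always`, `everysec`, `no`). Unlike Redis, which flushes in the background, this sketch only checks the one-second timer when a new command is appended.

```python
import os
import time


class AppendLog:
    """Toy append-only log with Redis-like fsync policies."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.file = open(path, "ab")
        self.last_sync = time.monotonic()
        self.sync_count = 0

    def _sync(self):
        self.file.flush()             # hand Python's buffer to the OS
        os.fsync(self.file.fileno())  # ask the OS to write its buffer to disk
        self.last_sync = time.monotonic()
        self.sync_count += 1

    def append(self, command):
        self.file.write(command.encode() + b"\n")
        if self.policy == "always":
            self._sync()
        elif (self.policy == "everysec"
              and time.monotonic() - self.last_sync >= 1.0):
            self._sync()
        # policy "no": leave flushing entirely to the OS

    def close(self):
        self.file.close()


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as directory:
        log = AppendLog(os.path.join(directory, "appendonly.aof"), policy="always")
        for _ in range(3):
            log.append("INCR counter")
        log.close()
        print(log.sync_count)  # one forced flush per command
```

With `always`, every append pays the cost of a disk flush, which is where the slowdown comes from. With `everysec`, that cost is paid at most once per second.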

What happens after server restarts

Redis will load your data from your persistence files and put it back into memory. If you use both snapshotting and the append-only file, Redis will use the AOF (append-only file), since it's guaranteed to have the most recent data.

What persistence strategy should I use?

By default Redis is configured to use snapshotting. You can keep using only that if you're OK with losing a few minutes of data, but if you need to lose as little data as possible, use AOF. You can also use both strategies, so that you have a clean backup of your data from the snapshots as well as an append-only file.