Interview with Salvatore Sanfilippo about Redis

Submitted by mudge on 6 March, 2011 - 22:36

Salvatore Sanfilippo (also known as antirez) is the author of Redis, an open source key-value data store. Redis is also known as a data structure server because its values can be strings, hashes, lists, sets and sorted sets.

Why did you choose to write Redis in C, and why ANSI C specifically?

The first prototype of Redis was a Tcl script. I started prototyping with Tcl since the language was designed more or less from the start with event-driven programming in mind.

The prototype helped me understand whether the idea was actually viable. It showed that it was, but also that memory efficiency and speed were crucial to the project.

So I started rewriting the whole project in C, and after a week I had a prototype (which I posted on Hacker News). I used C because I needed speed, memory efficiency, real-time features (no garbage collector or other delays I was not able to control), and the ability to talk well with the operating system layer, so that every optimization was possible. It helped that I'm fluent in C too.

I don't know Java at all and it does not have the features I needed. C++ was out of the question as I don't like the language's complexity. So basically C was more or less the only option. Given that I love C, that was not a problem at all.

How do you feel about the Redis code base?

I think it's pretty clean code, but there is room for improvement. However, both Pieter and I prefer to refactor when some specific need triggers it, instead of doing refactoring sessions without one. In general we also try to find a decent design from the start: not the very best possible, but something viable.

One thing I try hard to avoid is adding complexity in the form of intermediate layers of abstraction when I can't benefit from the new layer in many parts of the program. If two designs are more or less equivalent, I always go for the raw form instead of the possibly more elegant but non-obvious one.

Redis was born from scaling LLOOGG. Can you tell me about your experience in scaling LLOOGG?

My experience was that while MySQL is a great tool for a programmer, the SQL approach is not good for modeling many real world problems without adding a huge overhead.

I wanted to add elements to a linked list, and then get the most recent 100 elements. I also wanted only the last 5000 items saved, so I needed to delete old entries. I needed this to be very fast. With SQL you basically have to do strange things to obtain such an obvious linked-list-like effect. Why should I use ORDER BY if I'm already adding elements in the same order I want to retrieve them?

The reason is that it's a different paradigm. It is a very interesting one, as SQL's flexibility is pretty amazing, but many problems where you know beforehand what kind of queries you need to run against your data are better modeled with a different API, in my opinion.
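
The capped-list pattern described in this answer can be sketched in a few lines. This is only an illustration in Python using the standard library, not code from LLOOGG or Redis; the names `MAX_ITEMS`, `RECENT`, `add_entry` and `most_recent` are hypothetical. In Redis itself, a similar effect is commonly obtained with the list commands LPUSH, LTRIM and LRANGE.

```python
from collections import deque

MAX_ITEMS = 5000   # keep only the newest 5000 entries (figure from the interview)
RECENT = 100       # how many entries a reader asks for (figure from the interview)

# A deque with maxlen drops the oldest entry automatically on overflow,
# which plays the role of LPUSH followed by LTRIM 0 4999 in Redis.
entries = deque(maxlen=MAX_ITEMS)

def add_entry(item):
    """Insert at the head so the newest entry is always first."""
    entries.appendleft(item)

def most_recent(n=RECENT):
    """Return up to n newest entries, newest first (like LRANGE 0 n-1)."""
    return [entries[i] for i in range(min(n, len(entries)))]

# Simulate 6000 insertions: the first 1000 fall off the tail.
for i in range(6000):
    add_entry(i)

print(len(entries))        # 5000
print(most_recent()[:3])   # [5999, 5998, 5997]
```

No sorting step appears anywhere: the order of insertion is the order of retrieval, which is the point made above about ORDER BY.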

What are your favorite things about working on Redis?

I think I really love the process of hacking something new into Redis, and a few weeks later seeing that people are taking advantage of it. People exactly like me, other programmers that are designing systems to solve real world problems.

What has been the hardest thing to overcome with the Redis project?

I think the hardest thing is finding a good balance between hacking on the project with maximum freedom and, at the same time, providing new stable releases from time to time that people can actually use, along with decent documentation and support. Basically the 'software engineering' part is really important in a project, and may be the difference between a failed project and a successful one.

What do you use Redis for?

I use Redis for my personal projects every time I can, but I'm also involved as an advisor and shareholder with two small companies based in Sicily doing web apps and iPhone / iPad apps. In these two companies Redis is used a lot, both for caching and as a primary database. It is also used a lot to exchange messages using Resque.

Can you tell me what you are working on right now and why?

Currently I'm working on a number of things in parallel:

1) The 'diskstore' feature, to optionally make Redis able to use the disk instead of memory. Our main focus remains in-memory storage, but we want to provide an alternative for use cases where it makes sense. As part of this effort I'm also writing a btree implementation.
2) Redis Cluster. We need an easy way to partition bigger databases into N nodes, with fault tolerance, ability to add/remove nodes at runtime and so forth.
3) 2.2 improvements. For instance, two days ago I committed changes into a 2.2 sub-branch that make saving and loading RDB databases many times faster.

What are the plans for the O'Reilly Redis book?

Sure, it's already on their site:

http://oreilly.com/catalog/0636920014294

Currently I'm writing the replication chapter.

In the last two years what has been the most surprising thing about the Redis project?

Probably that people cared so much. Redis saw the light when there were already tons of NoSQL DBs out there. Even though Redis is developed by one full-time developer (me) and one part-time developer (Pieter Noordhuis), the project gained a lot of users and attention. I guess the reason is that we are helping people to get things done easily.

What is your favorite scaling experience?

I love it when people tell me that with a Redis sorted set they can do things that were formerly *not possible* at all with their infrastructure without adding prohibitive costs.

What are some of your favorite programming books?

A few books I like are:

1) Introduction to Algorithms
2) Structure and Interpretation of Computer Programs
3) Applied Cryptography

Has your personal life changed since working on Redis full time?

In some ways, yes. I work more hours every day, but at the same time I'm more relaxed, as I enjoy working on Redis more than any other work activity I did in the past.

How did it happen that VMware hired you to work full time on Redis?

I simply started chatting with VMware a few months before I was hired. They were interested in understanding more about Redis, but without pressing me at all. VMware picked a very soft and sane approach I think.

Eventually I had to pick one company to continue Redis development, as it was no longer viable to do it for free. I realized that VMware was the best pick for Redis, and I was hired after talking with them.
