
The Engineering Wisdom Behind Redis’s Single-Threaded Design
In the relentless pursuit of performance, our industry often gravitates toward seemingly obvious solutions: more cores, more threads, more concurrency. Yet Redis—one of the most performant databases in the world—has maintained its commitment to a primarily single-threaded execution model. As various Redis forks emerge claiming dramatic performance improvements through multi-threading, I want to explore why Redis’s core architectural choices remain fundamentally sound, even at scale.
/* Redis main event loop */
void aeMain(aeEventLoop *eventLoop) {
    eventLoop->stop = 0;
    while (!eventLoop->stop) {
        /* Process pending time events */
        processTimeEvents(eventLoop);
        /* Wait for I/O or timer events */
        processFileEvents(eventLoop);
    }
}
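For intuition, here is a minimal, hypothetical sketch of the same pattern using raw epoll rather than Redis's ae abstraction, with error handling omitted: one thread registers every socket with the kernel, sleeps until something is readable, and then services each ready connection in turn.

/* Hypothetical single-threaded event loop using epoll (Linux).
 * Error handling is omitted to keep the shape of the loop visible. */
#include <sys/epoll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(7777);                 /* arbitrary demo port */
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 128);

    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event ready[64];
    for (;;) {
        /* The single thread sleeps here until a socket has work for it;
         * the short timeout leaves room to run timer events, as aeMain does. */
        int n = epoll_wait(epfd, ready, 64, 100);
        for (int i = 0; i < n; i++) {
            if (ready[i].data.fd == listen_fd) {
                int client = accept(listen_fd, NULL, NULL);
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = client };
                epoll_ctl(epfd, EPOLL_CTL_ADD, client, &cev);
            } else {
                char buf[512];
                ssize_t r = read(ready[i].data.fd, buf, sizeof(buf));
                if (r <= 0) { close(ready[i].data.fd); continue; }
                write(ready[i].data.fd, buf, (size_t)r);   /* echo the bytes back */
            }
        }
    }
}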
Single-threaded Redis:
┌─────────────────────────┐
│ CPU Cache │
│ ┌─────────────────┐ │
│ │ Dict structures │ │
│ │ Hash tables │ │
│ │ Recent keys │ │
│ └─────────────────┘ │
└──────────┬──────────────┘
│
│ Cache hit (~1ns)
▼
┌─────────────────────────┐
│ Redis Process │
│ │
│ Process command 1 │
│ Process command 2 │
│ Process command 3 │
└─────────────────────────┘
Multi-threaded Redis:
┌─────────────────┐ ┌─────────────────┐
│ CPU 1 Cache │ │ CPU 2 Cache │
│ ┌─────────────┐ │ │ ┌─────────────┐ │
│ │Dict structures│◄────┼─┤Dict structures│
│ │Hash tables │ │ │ │Hash tables │ │
│ └─────────────┘ │ │ └─────────────┘ │
└─────────────────┘ └─────────────────┘
│ │
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ Thread 1 │ │ Thread 2 │
│ │ │ │
│ Process cmd 1 │ │ Process cmd 2 │
└─────────────────┘ └─────────────────┘
│ │
│ Cache line │
└───────invalidation────┘
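The cost of that cache-line invalidation is easy to reproduce. In this hypothetical C demo (not Redis code), two threads increment counters that sit on the same cache line, and the run is then repeated with the counters padded onto separate lines; on most multi-core machines the padded run is several times faster.

/* Hypothetical false-sharing demo; build with: gcc -O2 -pthread demo.c */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define ITERS 100000000UL

struct shared { volatile unsigned long a, b; };   /* a and b share a cache line */
struct padded { volatile unsigned long a;
                char pad[64];                     /* typical 64-byte line keeps b apart */
                volatile unsigned long b; };

static struct shared s;
static struct padded p;

static void *inc_sa(void *arg) { (void)arg; for (unsigned long i = 0; i < ITERS; i++) s.a++; return NULL; }
static void *inc_sb(void *arg) { (void)arg; for (unsigned long i = 0; i < ITERS; i++) s.b++; return NULL; }
static void *inc_pa(void *arg) { (void)arg; for (unsigned long i = 0; i < ITERS; i++) p.a++; return NULL; }
static void *inc_pb(void *arg) { (void)arg; for (unsigned long i = 0; i < ITERS; i++) p.b++; return NULL; }

static void run(void *(*fa)(void *), void *(*fb)(void *), const char *label) {
    struct timespec t0, t1;
    pthread_t a, b;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    pthread_create(&a, NULL, fa, NULL);
    pthread_create(&b, NULL, fb, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("%-15s %.0f ms\n", label,
           (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6);
}

int main(void) {
    run(inc_sa, inc_sb, "false sharing:");   /* both counters on one cache line */
    run(inc_pa, inc_pb, "padded:");          /* counters on separate cache lines */
    return 0;
}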
Consider an HMSET command that modifies multiple fields in a hash. In a multi-threaded environment, this would require acquiring a lock on the database, acquiring a lock on the key's entry, acquiring a lock on the hash structure, performing the modifications, and releasing all locks in reverse order. Each lock operation adds overhead, and under contention this overhead grows dramatically. The single-threaded model eliminates it entirely.
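As a rough sketch (hypothetical types and function names, not Redis internals), that locking choreography might look like this with pthreads:

/* Hypothetical sketch of the per-level locking a multi-threaded store would need. */
#include <pthread.h>

typedef struct { pthread_mutex_t lock; /* ... field array ...  */ } hash_obj;
typedef struct { pthread_mutex_t lock; hash_obj *value;          } dict_entry;
typedef struct { pthread_mutex_t lock; /* ... entry table ... */ } database;

void hmset_with_locks(database *db, dict_entry *entry, hash_obj *h) {
    pthread_mutex_lock(&db->lock);     /* guard the keyspace lookup    */
    pthread_mutex_lock(&entry->lock);  /* guard this key's entry       */
    pthread_mutex_lock(&h->lock);      /* guard the hash object itself */

    /* ... modify the hash fields ... */

    pthread_mutex_unlock(&h->lock);    /* release in reverse order     */
    pthread_mutex_unlock(&entry->lock);
    pthread_mutex_unlock(&db->lock);
}

int main(void) {
    database db; dict_entry e; hash_obj h;
    pthread_mutex_init(&db.lock, NULL);
    pthread_mutex_init(&e.lock, NULL);
    pthread_mutex_init(&h.lock, NULL);
    e.value = &h;
    hmset_with_locks(&db, &e, &h);
    return 0;
}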
Redis (single-threaded) execution flow:
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│ HSET │→ │ HSET │→ │ EXPIRE │→ │ GET │
│user:123 │ │user:123│ │user:123│ │config: │
│login │ │status │ │3600 │ │timeout │
└────────┘ └────────┘ └────────┘ └────────┘
1ms 1.2ms 0.3ms 0.6ms
Multi-threaded database execution flow:
┌────────┐ ┌────────┐ ┌────────┐
│ HSET │ │ GET │ │ EXPIRE │
│user:123│ │config: │ │user:123│
│login │ │timeout │ │3600 │
└───┬────┘ └───┬────┘ └───┬────┘
│ │ │
▼ ▼ ▼
┌────────┐ ┌────────┐ ┌────────┐
│Thread 1│ │Thread 2│ │Thread 3│
└────────┘ └────────┘ └────────┘
│
▼
┌────────┐
│ HSET │
│user:123│
│status │
└────────┘
(slow - why?)
Because commands execute one at a time in a single, well-defined order, introspection tools such as MONITOR provide accurate representations of system behavior. The MONITOR command, which streams every command processed by Redis, is particularly valuable for debugging. In a multi-threaded system, such a tool would need to merge events from multiple threads, potentially losing the causal relationships between operations.

On the performance side, Redis 8 ships optimizations for individual commands, including PFCOUNT, PFMERGE, GET, EXISTS, LRANGE, HSET, HGETALL, and more. These optimizations focus on fundamental efficiency improvements and demonstrate a crucial insight: algorithmic and implementation improvements often yield better results than simply throwing more threads at a problem. Many of them target hot-path commands like GET and SET, where even small improvements have a significant cumulative impact. For computation-intensive operations, Redis 8 leverages Single Instruction, Multiple Data (SIMD) instructions available in modern CPUs. Vector instructions like AVX2/AVX-512 are used for operations such as CRC64 calculation, allowing multiple data elements to be processed simultaneously within a single thread. The team also implemented algorithms specifically designed for SIMD execution.
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
Redis 8's I/O Threading Architecture:
┌─────────────────────────────────────────────────────┐
│ Main Thread │
│ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ Command │ │ Command │ │ Command │ │
│ │ Processing│ │ Processing│ │ Processing│ │
│ └───────────┘ └───────────┘ └───────────┘ │
└────────┬────────────────┬────────────────┬──────────┘
│ │ │
▼ ▼ ▼
┌─────────────────┐ ┌─────────────┐ ┌─────────────────┐
│ I/O Thread 1 │ │I/O Thread 2 │ │ I/O Thread N │
│ │ │ │ │ │
│ ┌─────────────┐ │ │┌───────────┐│ │ ┌─────────────┐ │
│ │Socket Read/ │ │ ││Socket Read││ │ │Socket Read/ │ │
│ │Write/Parse │ │ ││Write/Parse││ │ │Write/Parse │ │
│ └─────────────┘ │ │└───────────┘│ │ └─────────────┘ │
└─────────────────┘ └─────────────┘ └─────────────────┘
This behavior is controlled through configuration. The io-threads parameter sets the number of I/O threads (default: 1, which effectively disables threading), while io-threads-do-reads determines whether the I/O threads handle reads as well as writes (it defaults to off); once more than one I/O thread is configured, outgoing writes are offloaded to those threads automatically.
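A minimal configuration enabling the feature might look like the following (values are illustrative; check the redis.conf shipped with your version for the exact directives and defaults):

io-threads 4
io-threads-do-reads yes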
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Before Redis 8:
┌──────────────┐ ┌──────────────┐
│ Primary │ │ Replica │
│ │ │ │
│ 1. RDB dump │─────────────────▶│ 1. Load RDB │
│ 2. Buffer │ │ │
│ changes │ │ │
│ 3. Send │ │ 2. Apply │
│ buffer │─────────────────▶│ changes │
└──────────────┘ └──────────────┘
Redis 8:
┌──────────────┐ ┌──────────────┐
│ Primary │──────Stream 1───▶│ Replica │
│ │ │ │
│ 1. RDB dump │ │ 1. Load RDB │
│ 2. Stream │ │ │
│ changes │──────Stream 2───▶│ 2. Apply │
│ in parallel│ │ changes │
└──────────────┘ └──────────────┘
Hidden Complexities of Multi-Threaded Databases
Speedup = 1 / ((1 - P) + P/N)
Where:
- P is the proportion of the program that can be parallelized
- N is the number of processors
With only about 30% of the workload parallelizable (P ≈ 0.3), the returns diminish sharply:
- 2 threads: 1.18x speedup
- 4 threads: 1.29x speedup
- 8 threads: 1.36x speedup
- 16 threads: 1.40x speedup
- 32 threads: 1.42x speedup
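These figures follow directly from the formula. As a quick check, a few lines of C recompute the curve for any assumed parallel fraction (P = 0.3 here approximately reproduces the numbers above):

/* Amdahl's Law: predicted speedup for N threads given parallel fraction P. */
#include <stdio.h>

int main(void) {
    const double p = 0.3;                     /* assumed parallelizable fraction */
    const int threads[] = {2, 4, 8, 16, 32};
    for (int i = 0; i < 5; i++) {
        int n = threads[i];
        double speedup = 1.0 / ((1.0 - p) + p / n);
        printf("%2d threads: %.2fx speedup\n", n, speedup);
    }
    return 0;
}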
Memory Access Hierarchy:
┌─────────────────────────────────────────────────────────┐
│ L1 Cache: ~1ns │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ L2 Cache: ~3ns │ │
│ │ ┌─────────────────────────────────────────────────┐ │ │
│ │ │ L3 Cache: ~10ns │ │ │
│ │ │ ┌─────────────────────────────────────────────┐ │ │ │
│ │ │ │ │ │ │ │
│ │ │ │ Main Memory: ~100ns │ │ │ │
│ │ │ │ │ │ │ │
│ │ │ └─────────────────────────────────────────────┘ │ │ │
│ │ └─────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
NUMA Architecture:
┌─────────────┐ ┌─────────────┐
│ CPU 0 │ │ CPU 1 │
│ ┌─────────┐ │ │ ┌─────────┐ │
│ │ Cores │ │ │ │ Cores │ │
│ └─────────┘ │ │ └─────────┘ │
│ ┌─────────┐ │ │ ┌─────────┐ │
│ │ Memory │◄┼─┼─┤ Memory │ │
│ └─────────┘ │ │ └─────────┘ │
└─────────────┘ └─────────────┘
▲ ▲
│ │
└───────┬───────┘
│
Interconnect
(Slower Access)
if (counter > 0) {
    // Another thread might change counter here
    counter++;
}
// Correct, but now every thread serializes on the same lock
synchronized (lock) {
    if (counter > 0) {
        counter++;
    }
}
Redis Cluster Scaling:
┌────────────┐ ┌────────────┐ ┌────────────┐
│ Redis │ │ Redis │ │ Redis │
│ Instance 1 │ │ Instance 2 │ │ Instance 3 │
│ │ │ │ │ │
│ Keys 0-5461│ │ Keys 5462- │ │ Keys 10923-│
│ │ │ 10922 │ │ 16383 │
└────────────┘ └────────────┘ └────────────┘
▲ ▲ ▲
│ │ │
└──────────────┼──────────────┘
│
▼
┌─────────────────────────────────────┐
│ Redis Cluster Client │
│ │
│ (automatically routes commands to │
│ the appropriate instance) │
└─────────────────────────────────────┘
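For completeness, the slot ranges shown above come from hashing every key into one of 16384 slots. Here is a minimal sketch of that computation (Redis Cluster uses CRC16/XMODEM modulo 16384; this simplified version ignores {hash tag} handling):

/* Map a key to a Redis Cluster hash slot: CRC16(key) mod 16384.
 * Simplified: real clients also honor {hash tags} inside the key. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

static uint16_t crc16_xmodem(const char *buf, size_t len) {
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint8_t)buf[i]) << 8;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

int main(void) {
    const char *keys[] = {"user:123", "config:timeout"};
    for (int i = 0; i < 2; i++) {
        unsigned slot = crc16_xmodem(keys[i], strlen(keys[i])) & 0x3FFF;  /* % 16384 */
        printf("%-16s -> slot %u\n", keys[i], slot);
    }
    return 0;
}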