Latency Numbers Every Programmer Should Know

Dr. Jeff Dean originally popularized this list at Google. Knowing these orders of magnitude is the difference between designing a system that feels "instant" and one that feels "broken".

[Figure: Latency chart]

The Human Scale Metaphor

If 1 CPU cycle (0.3 ns) were stretched to 1 second, then:

L1 Cache Access: 0.5 ns → 1.5 seconds (Picking up a pen)
L2 Cache Access: 7 ns → 23 seconds (Walking to the bookshelf)
Main Memory (RAM): 100 ns → 5.5 minutes (Walking to the corner store)
NVMe SSD Read: 15 µs → 14 hours (Driving to the next city)
Disk Seek (HDD): 10 ms → 1 year (Sailing around the world)
Packet CA -> Netherlands: 150 ms → 15 years (A significant portion of your life)
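The scaling behind this metaphor is a single division, and it can be sketched in a few lines. The helper names below (`human_scale_seconds`, `describe`) are illustrative assumptions, not part of the original list. The results land close to the rounded figures above rather than matching them digit for digit.

```python
# Scale factor taken from the text: one 0.3 ns CPU cycle is stretched to 1 second.
CYCLE_NS = 0.3

# Real latencies in nanoseconds, copied from the list above.
LATENCIES_NS = {
    "L1 cache access": 0.5,
    "L2 cache access": 7,
    "Main memory (RAM)": 100,
    "NVMe SSD read": 15_000,
    "Disk seek (HDD)": 10_000_000,
    "Packet CA -> Netherlands": 150_000_000,
}


def human_scale_seconds(latency_ns: float) -> float:
    """Return the 'human scale' duration in seconds for a real latency in ns."""
    return latency_ns / CYCLE_NS


def describe(seconds: float) -> str:
    """Format a duration using the largest unit that fits (hypothetical helper)."""
    units = [
        ("years", 365 * 24 * 3600),
        ("days", 24 * 3600),
        ("hours", 3600),
        ("minutes", 60),
    ]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds:.1f} seconds"


for name, ns in LATENCIES_NS.items():
    print(f"{name}: {describe(human_scale_seconds(ns))}")
```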

The Real Numbers (2026)

Hardware has gotten faster, but the ratios remain brutal.

Operation	Time	Notes
L1 Cache Reference	0.5 ns	Basically free.
Branch Misprediction	2.5 ns	This is why `if` statements in tight loops matter.
L2 Cache Reference	7 ns	14x slower than L1.
Mutex Lock/Unlock	25 ns	Concurrency is not free.
Main Memory Reference	100 ns	The "Memory Wall".
Compress 1KB w/ Zippy	10 µs	CPU is fast at math.
Send 2KB over 1Gbps Network	20 µs	Networking is surprisingly fast locally.
Read 1MB sequentially from RAM	250 µs	Bandwidth is high.
Round trip within Datacenter	500 µs	The "microservices tax".
Read 1MB sequentially from SSD	1 ms	Flash storage is a miracle.
Disk Seek (HDD)	10 ms	Spinning rust is the bottleneck. Avoid it.
Packet CA -> Netherlands	150 ms	The speed of light is a hard limit.
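The three transfer rows in the table also imply a throughput, which is sometimes easier to reason about than a latency. A minimal sketch, assuming 1 KB = 1,000 bytes and 1 MB = 1,000,000 bytes (the table does not say which convention it uses):

```python
# (bytes transferred, seconds taken), copied from the table above.
# Decimal units (1 KB = 1,000 bytes) are an assumption made for this sketch.
TRANSFERS = {
    "1 Gbps network (2 KB)": (2_000, 20e-6),
    "RAM sequential (1 MB)": (1_000_000, 250e-6),
    "SSD sequential (1 MB)": (1_000_000, 1e-3),
}


def throughput_mb_per_s(num_bytes: int, seconds: float) -> float:
    """Return the throughput in MB/s implied by moving num_bytes in the given time."""
    return num_bytes / seconds / 1_000_000


for name, (num_bytes, seconds) in TRANSFERS.items():
    print(f"{name}: {throughput_mb_per_s(num_bytes, seconds):,.0f} MB/s")
```

Under these assumptions the rows work out to roughly 100 MB/s for the network, 4,000 MB/s for RAM, and 1,000 MB/s for the SSD. The network figure sits just under the 125 MB/s line rate of a 1 Gbps link.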

Why This Matters for System Design

Avoid Network Calls in Loops: A round trip to Redis (0.5 ms) is 5,000x slower than a main memory reference (100 ns). Fetch everything you need in one batch instead of making one call per item.
Locality is King: Sequential access is orders of magnitude faster than random access, especially on spinning disks. This is why LSM trees (Cassandra, RocksDB) and append-only logs (Kafka) handle writes faster than B-trees: they turn random writes into sequential writes.
The "Microservices Tax": Splitting a monolith into 10 services can add up to 10 network hops to a request. If a user request passes through 50 services one after another, that's 50 * 0.5 ms = 25 ms of pure network overhead, before any processing time.
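The arithmetic behind the first and third points is the same: sequential round trips multiply. A minimal sketch, where `network_overhead_ms` is a hypothetical helper and the 0.5 ms round trip is the datacenter figure from the table:

```python
RTT_MS = 0.5  # datacenter round trip, from the table above


def network_overhead_ms(round_trips: int, rtt_ms: float = RTT_MS) -> float:
    """Network time spent when each round trip waits for the previous one to finish."""
    return round_trips * rtt_ms


# 1,000 keys fetched one at a time in a loop.
print(network_overhead_ms(1_000))  # 500.0

# The same keys fetched in one batched request (payload transfer time ignored).
print(network_overhead_ms(1))      # 0.5

# A request that passes through 50 services one after another.
print(network_overhead_ms(50))     # 25.0
```

This model only covers calls made back to back. If the downstream calls are independent and issued concurrently, the overhead is bounded by the slowest call rather than the sum, which is one reason fan-out designs tend to tolerate many services better than long call chains.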

First Principles Thinking

When designing a system, do "Back of the Envelope" math:

Goal: 100,000 requests per second.
Constraint: Each request writes to the DB.
Math: A disk seek takes 10 ms, so one HDD sustains about 100 random IOPS. 100,000 writes per second would need roughly 1,000 disks.
Conclusion: You cannot do this with a single HDD. You need RAM buffering (Redis) or sequential appends (Kafka).
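The same estimate can be written as a small calculation. This is a sketch under the stated assumptions only: every write costs exactly one seek, and there is no batching, caching, or write coalescing. The function names are illustrative.

```python
import math


def max_iops(seek_ms: float) -> float:
    """Upper bound on random operations per second if each one costs a full seek."""
    return 1000.0 / seek_ms


def disks_needed(target_rps: int, seek_ms: float) -> int:
    """Number of disks required if every request performs one random write."""
    return math.ceil(target_rps / max_iops(seek_ms))


print(max_iops(10))               # 100.0 IOPS per HDD
print(disks_needed(100_000, 10))  # 1000 disks
```

Changing `seek_ms` to the latency of another device from the table gives a quick feel for how far a different storage tier moves the answer.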

Don't guess. Calculate.
