Pranesh Nikhar's personal site.


🌐 CAP Theorem with a Real Outage Story

CAP theorem defined, why "pick two" is wrong, real outage stories from GitHub and DynamoDB, CP vs AP systems, CRDTs, tunable consistency, and a trade-off decision table for real workloads.

🧭 The Most Misunderstood Theorem in Distributed Systems

Every developer has heard “CAP theorem: pick two of Consistency, Availability, Partition Tolerance.” This is wrong, or at least dangerously incomplete.

The CAP theorem (Brewer’s conjecture, proven by Gilbert and Lynch in 2002) actually says:

When a network partition occurs, you must choose between consistency and availability.

Not at design time. At runtime, during a partition. The “pick two” framing makes it sound like you choose your trade-off once during architecture design. In reality, you must design for partitions (P is non-negotiable), then decide what happens when they occur.

Let’s look at what CAP actually means, what real outages teach us, and how systems like DynamoDB and Cassandra implement the trade-offs in practice.


πŸ“ CAP Defined

CAP TRIANGLE:
                      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”
                      β”‚        β”‚
              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€   CP   β”œβ”€β”€β”€β”€β”€β”€β”€β”
              β”‚       β”‚        β”‚       β”‚
              β”‚       β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜       β”‚
              β–Ό                        β–Ό
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”
        β”‚        β”‚   Partition   β”‚        β”‚
        β”‚   CA   │◄────────────►│   AP   β”‚
        β”‚        β”‚   (not real) β”‚        β”‚
        β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜              β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜
PropertyMeaning
ConsistencyEvery read receives the most recent write or an error. All nodes see the same data at the same time (linearizability).
AvailabilityEvery request receives a (non-error) response, without guarantee that it contains the most recent write.
Partition ToleranceThe system continues to operate despite an arbitrary number of messages being dropped or delayed between nodes.

Critical insight: In a distributed system, partitions are inevitable. Network switches fail, packets are dropped, links degrade. You must tolerate partitions (P). The real question is: during a partition, do you prefer C or A?
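
To make the C-versus-A choice concrete, here is a minimal sketch of a single replica that has lost contact with its peer. The `Replica` class, the `mode` labels, and the `Unavailable` exception are hypothetical names invented for this illustration; no real database exposes such a switch in this form.

```python
from dataclasses import dataclass, field


class Unavailable(Exception):
    """Raised when a CP replica refuses to answer during a partition."""


@dataclass
class Replica:
    mode: str                      # "CP" or "AP" (hypothetical labels)
    data: dict = field(default_factory=dict)
    partitioned: bool = False      # True when the peer is unreachable

    def read(self, key):
        if self.partitioned and self.mode == "CP":
            # Cannot prove the local value is the latest, so fail instead.
            raise Unavailable(f"cannot confirm latest value of {key!r}")
        # AP (or no partition): answer from local state, possibly stale.
        return self.data.get(key)


cp = Replica(mode="CP", data={"x": 1}, partitioned=True)
ap = Replica(mode="AP", data={"x": 1}, partitioned=True)

try:
    cp.read("x")
    cp_result = "value"
except Unavailable:
    cp_result = "error"

ap_result = ap.read("x")
print(cp_result, ap_result)  # error 1
```

Both replicas hold the same data. The only difference is what each one does when it cannot verify that the data is current.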

Why CA Is a Lie

A “CA” system (Consistent + Available, no Partition Tolerance) would need a perfectly reliable network, which doesn’t exist. A single-node database is CA by default, but no distributed system can be both C and A when the network splits. If you claim your system is “CA,” it means you haven’t thought about partitions.


💥 Real Outage #1: GitHub Availability Incident (October 2018)

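On October 21, 2018, GitHub experienced its most severe outage in years. According to GitHub’s post-incident report, a 43-second loss of connectivity between its US East Coast network hub and its primary US East Coast data center triggered a chain of events that led to just over 24 hours of degraded service.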

What Happened

GitHub uses MySQL with Orchestrator for automatic failover. The partition:

GitHub's US East DC                     GitHub's US West DC
┌────────────────────┐    partition     ┌────────────────────┐
│  MySQL Primary     │────────✗─────────│  MySQL Replica     │
│  (writable)        │                  │  (read-only)       │
└────────────────────┘                  └────────────────────┘

During the network partition, Orchestrator (which manages MySQL failover) determined that the US West replica could not reach the US East primary. Orchestrator’s automated failover logic promoted the US West replica to primary. But the US East node was still running as primary; it just couldn’t talk to US West.

Result: Two MySQL primaries accepting writes (split-brain).

Because the services pointed at the US East primary (GitHub.com, API, Issues, Pull Requests) kept using it, while the services routed to the newly promoted US West primary used that one instead, the two data sets diverged.

User makes a PR comment on github.com
    → hits US East → writes to MySQL-1 (old primary)
User makes another comment
    → hits US West → writes to MySQL-2 (new primary)
After partition heals:
MySQL-1 has: "comment A"
MySQL-2 has: "comment B"
Replication can't merge these: conflict!

GitHub had to:

  1. Identify which primary had the authoritative data
  2. Manually resolve data conflicts
  3. Rebuild replicas from the authoritative primary
  4. Accept some data loss (some comments/issues lost)

CAP Analysis

GitHub’s MySQL setup was configured as a CP system, with consistent replication and strict ordering. But the automatic failover violated the C guarantee by allowing writes to two primaries. During the partition, GitHub chose availability (keep writing) when their automation ran, but the system was designed for consistency. The mismatch caused the 24-hour outage.

Lesson: If you design for CP, you need to actually refuse writes during a partition. GitHub’s Orchestrator accidentally made the system AP during the outage, with all the conflict-resolution pain that entails.
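
One common way to "actually refuse writes" is to let a node accept writes only while it can see a majority of the cluster. The sketch below illustrates that idea only. The `Node` class, the `witness` peer, and the `NoQuorum` exception are hypothetical, and this is not how Orchestrator or MySQL replication is implemented.

```python
class NoQuorum(Exception):
    """Raised when a node cannot see a majority of the cluster."""


class Node:
    def __init__(self, name, cluster_size):
        self.name = name
        self.cluster_size = cluster_size
        self.reachable_peers = set()   # peers this node can currently contact
        self.log = []

    def has_majority(self):
        # Count this node plus the peers it can reach.
        return len(self.reachable_peers) + 1 > self.cluster_size // 2

    def write(self, record):
        if not self.has_majority():
            raise NoQuorum(f"{self.name} is on the minority side")
        self.log.append(record)


# Three-node cluster split into {east} and {west, witness}.
east = Node("east", cluster_size=3)
west = Node("west", cluster_size=3)
west.reachable_peers = {"witness"}

west.write("comment B")            # majority side keeps accepting writes
try:
    east.write("comment A")        # minority side refuses, so no split-brain
    east_accepted = True
except NoQuorum:
    east_accepted = False

print(west.log, east.log, east_accepted)  # ['comment B'] [] False
```

Because at most one side of any split can hold a majority, the two logs can never both grow during the same partition.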


💥 Real Outage #2: DynamoDB’s Pounding (AWS re:Invent 2012)

At AWS re:Invent 2012, Netflix’s presentation revealed how DynamoDB’s design choices during partitions affected real users.

The Setup

DynamoDB descends from the design described in the 2007 Dynamo paper. Dynamo-style stores are AP by default: during a partition, they prefer to accept writes on both sides and reconcile later.

DynamoDB Ring (simplified):
                    ┌─────┐
                    │ N1  │
                    /     \
            ┌─────┐         ┌─────┐
            │ N2  │         │ N3  │
            └─────┘         └─────┘
                \             /
                    ┌─────┐
                    │ N4  │
                    └─────┘

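In the Dynamo model, replication is described by three parameters (the managed DynamoDB service does not expose them directly; it offers a per-request consistency flag instead):

  • N (replication factor, typically 3)
  • R (read quorum size)
  • W (write quorum size)

For strong consistency: R + W > N (e.g., R=2, W=2, N=3)
For eventual consistency: W = 1, R = 1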
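
The R + W > N rule works because it forces every read quorum to share at least one node with every write quorum. A brute-force check over all possible quorums makes that visible. This is an illustrative sketch, and `quorums_overlap` is a hypothetical helper name.

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    """Return True if every read quorum shares a node with every write quorum."""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )


strong = quorums_overlap(n=3, r=2, w=2)      # R + W = 4 > 3
eventual = quorums_overlap(n=3, r=1, w=1)    # R + W = 2 <= 3
print(strong, eventual)  # True False
```

With R=1 and W=1, a read can land on a node that the latest write never touched, which is exactly what "eventual" means here.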

What Happened

During the re:Invent keynote demo, DynamoDB’s request rates for some tables hit unexpected levels. The system’s partition detection kicked in, and some tables became unavailable for strongly-consistent reads while the partition was being resolved.

Normal:                        Partition:
Read "key_xyz":                Read "key_xyz" (strong):
  R → N1, N2 (strong)            R → N1 (can't reach N2!)
  N1: value                      ✗ Can't reach R quorum
  N2: value                      → Return error
  → Return value

                               Read "key_xyz" (eventual):
                                 R → N1
                                 N1: value (may be stale)
                                 → Return value

The AP trade-off in action: during a partition, DynamoDB refused strongly-consistent reads (because it couldn’t assemble a full quorum) but continued to accept eventually-consistent reads and all writes.

DynamoDB also offers tunable consistency; you choose per-request:

import boto3

# Hypothetical table name; replace with your own
table = boto3.resource('dynamodb').Table('example-table')

# Eventually consistent (default: faster, cheaper)
response = table.get_item(Key={'pk': '123'})
# → ConsistentRead defaults to False (half the read capacity cost)

# Strongly consistent (slower, 2× RCU cost)
response = table.get_item(Key={'pk': '123'}, ConsistentRead=True)
# → Returns the latest write or an error

The cost difference is real: a strongly-consistent read consumes 2× the read capacity units of an eventually-consistent one, because it must be answered by a replica guaranteed to hold every acknowledged write, not just by whichever replica responds fastest.
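
The 2× factor can be worked through as arithmetic. The sketch below assumes DynamoDB's documented read unit of 4 KB per read and ignores transactional reads. `read_capacity_units` is a hypothetical helper, not part of any SDK.

```python
import math


def read_capacity_units(item_size_bytes, consistent_read=False):
    """RCUs consumed by one read, assuming a 4 KB read unit."""
    blocks = math.ceil(item_size_bytes / 4096)   # reads are billed per 4 KB
    return blocks * (1.0 if consistent_read else 0.5)


print(read_capacity_units(3_000))                        # 0.5
print(read_capacity_units(3_000, consistent_read=True))  # 1.0
print(read_capacity_units(9_000, consistent_read=True))  # 3.0
```

A 9,000-byte item spans three 4 KB units, so a strongly-consistent read of it costs 3 RCUs and an eventually-consistent read costs 1.5.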


πŸ—οΈ CP Systems: PostgreSQL Sync Replication

A classic CP design. PostgreSQL with synchronous replication:

Client writes "x = 42"
    │
    ▼
┌──────────────┐
│ PostgreSQL   │  WAL flushed to disk ✓
│ Primary      │  Waiting for replica...
└──────┬───────┘
       │ WAL record
       ▼
┌──────────────┐
│ PostgreSQL   │  WAL flushed to disk ✓
│ Replica      │  Sends ACK to primary
└──────┬───────┘
       │ ACK
       ▼
┌──────────────┐
│ Primary      │  Write confirmed to client
│ returns OK   │
└──────────────┘

During a Partition

Client writes "x = 42"
    │
    ▼
┌──────────────┐
│ PostgreSQL   │  WAL flushed ✓
│ Primary      │  Waiting for replica ACK...
└──────┬───────┘
       │ Partition! Packet dropped!
       ▼
┌──────────────┐
│ ✗ Replica    │  Unreachable
│              │
└──────────────┘
While the replica is unreachable:
→ The commit blocks, waiting for the replica's ACK
→ PostgreSQL itself does not time this wait out; the client sees a hang
  until its own statement or connection timeout fires
→ Primary is still serving reads, still alive
→ But writes are not acknowledged until the replica comes back
  (or an operator removes it from synchronous_standby_names)

This is the CP trade-off: you get consistency (a write is never acknowledged unless the replica has confirmed it) at the cost of availability (writes stall during the partition).

PostgreSQL also supports quorum-based synchronous replication (PostgreSQL 10+): you specify that G out of N replicas must ACK. With G=2 and N=3, you can lose one replica and still accept writes. This is a hybrid: writes stay available until more than N − G replicas are unreachable.
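
The G-out-of-N rule can be modeled in a few lines. This is a toy model of the acknowledgement rule only, with hypothetical standby names `s1` to `s3`. In PostgreSQL itself the equivalent setting would be something like `synchronous_standby_names = 'ANY 2 (s1, s2, s3)'`.

```python
def commit_acknowledged(required_acks, reachable_standbys):
    """True if a commit can be confirmed with the standbys currently reachable."""
    return len(reachable_standbys) >= required_acks


standbys = {"s1", "s2", "s3"}        # hypothetical standby names
required = 2                          # G in "G out of N"

one_down = commit_acknowledged(required, standbys - {"s3"})
two_down = commit_acknowledged(required, standbys - {"s2", "s3"})
tolerated_failures = len(standbys) - required
print(one_down, two_down, tolerated_failures)  # True False 1
```

Raising G buys durability on more replicas per commit. Lowering it buys tolerance for more simultaneous replica failures.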


🌊 AP Systems: Cassandra

Cassandra is the most prominent AP system. It’s a Dynamo-style database (same lineage as DynamoDB):

Cassandra Ring:
Each row has a partition key → determines which nodes hold its replicas

Write "x = 42":
1. Client sends to any node (coordinator)
2. Coordinator writes to all replicas in parallel
3. Responds to client after W nodes acknowledge

Read "x":
1. Coordinator queries R replicas
2. Picks the most recent version (by timestamp)
3. If versions diverge → read repair updates the stale replicas

During a Partition

Replicas: N1, N2, N3 (RF=3)
Partition splits the cluster:

Group A (reachable): N1, N2
Group B (isolated):  N3

Write "x = 42" with W=1 (consistency level ONE):
→ N1 or N2 acknowledges → client gets OK
→ N3 is missed → but it's fine! W=1 requires only 1 node

Later, partition heals:
→ N3 has old value for x
→ Read repair triggers during the next read
→ OR: hinted handoff replays the write to N3
→ OR: anti-entropy repair runs periodically
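
The read-repair path can be sketched as a last-write-wins comparison. This is a simplified model: replicas are plain dicts, versions are hypothetical `(timestamp, value)` tuples, and real Cassandra resolves conflicts per column and uses digests rather than full reads.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and overwrite stale replicas with it.

    Each replica is a dict mapping key -> (timestamp, value).
    """
    versions = [r[key] for r in replicas if key in r]
    if not versions:
        return None
    newest = max(versions)             # tuples compare by timestamp first
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest      # read repair
    return newest[1]


n1 = {"x": (2, 42)}
n2 = {"x": (2, 42)}
n3 = {"x": (1, 7)}      # missed the write while partitioned

value = read_with_repair([n1, n2, n3], "x")
print(value, n3["x"])  # 42 (2, 42)
```

After the read, all three replicas agree, even though no explicit repair job ran.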

In Cassandra, you choose consistency level per operation:

-- In cqlsh, CONSISTENCY is a shell command that applies to the statements after it

-- Strongest: QUORUM (R + W > RF)
CONSISTENCY QUORUM;
SELECT * FROM users WHERE id = 123;

-- Fastest: ONE (eventual)
CONSISTENCY ONE;
SELECT * FROM users WHERE id = 123;

-- Tolerance: ANY (write succeeds once a hint is stored, even if all replicas are down)
CONSISTENCY ANY;
INSERT INTO users (id, name) VALUES (123, 'Alice');

| Consistency Level | R / W (assuming RF=3) | Behavior During Partition |
| --- | --- | --- |
| ANY | W=any | Write accepted by coordinator; may be lost on coordinator crash |
| ONE | W=1 | Write to any single replica; fastest, most available |
| LOCAL_QUORUM | R=2, W=2 | Quorum within a single datacenter; ignores cross-DC |
| EACH_QUORUM | R=2, W=2 (each DC) | Strong but requires all DCs; unavailable during cross-DC partition |
| ALL | R=3, W=3 | Write to all replicas; zero tolerance for failure |
| SERIAL | R=quorum + Paxos | Linearizable consistency via Paxos; slowest |
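
The availability column follows from how many replica acknowledgements each level needs. The sketch below covers only the single-datacenter levels and uses the usual quorum formula, floor(RF / 2) + 1. `required_acks` and `is_available` are hypothetical helpers, not driver APIs, and the multi-DC levels are left out because they depend on per-DC replication factors.

```python
def required_acks(level, rf):
    """Replica acknowledgements needed for a consistency level at a given RF."""
    quorum = rf // 2 + 1
    table = {"ONE": 1, "TWO": 2, "THREE": 3, "QUORUM": quorum, "ALL": rf}
    return table[level]


def is_available(level, rf, replicas_up):
    """True if enough replicas are reachable to satisfy the level."""
    return replicas_up >= required_acks(level, rf)


for level in ("ONE", "QUORUM", "ALL"):
    print(level, required_acks(level, 3), is_available(level, 3, replicas_up=2))
# ONE 1 True
# QUORUM 2 True
# ALL 3 False
```

With RF=3 and one replica cut off, ONE and QUORUM keep working while ALL fails.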

🧬 CRDTs: Reconciling Conflicts Automatically

Conflict-free Replicated Data Types (CRDTs) are the mechanism that makes AP systems work without human intervention. They provide automatic conflict resolution based on mathematical properties.

State-based CRDT (CvRDT)

Each node maintains a state that can be merged with any other node’s state using a commutative, associative, idempotent merge function:

# A Grow-Only Counter (G-Counter)
class GCounter:
    def __init__(self, node_id, num_nodes):
        self.node_id = node_id          # index of this node's slot
        self.counts = [0] * num_nodes   # one slot per node

    def increment(self):
        # A node only ever increments its own slot
        self.counts[self.node_id] += 1

    def value(self):
        return sum(self.counts)

    def merge(self, other):
        # Element-wise max: commutative, associative, idempotent
        for i in range(len(self.counts)):
            self.counts[i] = max(self.counts[i], other.counts[i])

# Node A: increment → [1, 0, 0]
# Node B: increment → [0, 1, 0]
# After partition + merge: [1, 1, 0] → value = 2 (correct!)
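
The three merge properties can be checked directly on small states. This sketch uses a standalone `merge` function over plain lists (a hypothetical simplification of the class above) so that it runs on its own.

```python
def merge(a, b):
    """Element-wise max of two G-Counter states (lists of per-node counts)."""
    return [max(x, y) for x, y in zip(a, b)]


a, b, c = [1, 0, 0], [0, 1, 0], [0, 1, 2]

commutative = merge(a, b) == merge(b, a)
associative = merge(merge(a, b), c) == merge(a, merge(b, c))
idempotent = merge(a, a) == a
print(commutative, associative, idempotent, sum(merge(a, b)))  # True True True 2
```

These properties are what let replicas exchange state in any order, any number of times, and still converge.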

Operation-based CRDT (CmRDT)

Instead of merging states, nodes broadcast operations. If all operations are commutative, the order doesn’t matter:

# A Grow-Only Set (G-Set)
class GSet:
    def __init__(self):
        self.elements = set()

    def add(self, e):
        # The operation that gets broadcast. Adds commute, and adding
        # twice is the same as adding once (idempotent).
        self.elements.add(e)

    def merge(self, other):
        # State-based alternative: set union is commutative and idempotent
        self.elements |= other.elements
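
To see why delivery order does not matter in the operation-based style, the sketch below applies the same operations to two replicas, with the second replica receiving them shuffled and with one duplicate. `OpGSet` and the `("add", element)` tuple format are hypothetical choices for this illustration.

```python
import random


class OpGSet:
    """Grow-only set replicated by shipping 'add' operations."""

    def __init__(self):
        self.elements = set()

    def apply(self, op):
        kind, element = op
        if kind == "add":
            self.elements.add(element)   # commutative and idempotent


ops = [("add", "a"), ("add", "b"), ("add", "c")]

replica_1, replica_2 = OpGSet(), OpGSet()
for op in ops:
    replica_1.apply(op)

shuffled = ops + [ops[0]]        # one operation is delivered twice
random.shuffle(shuffled)         # and everything arrives in a different order
for op in shuffled:
    replica_2.apply(op)

print(replica_1.elements == replica_2.elements)  # True
```

A data type with non-commutative operations (for example, remove followed by add) would need extra metadata to get the same guarantee.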

Real CRDT Implementations

| System | CRDT Type | Real Usage |
| --- | --- | --- |
| Riak | State-based (vectors) | Riak’s “last write wins” is a simple CRDT |
| Redis Enterprise | CRDT sets, counters, maps | Active-Active Redis geo-distributed |
| Automerge (JavaScript) | Multi-Value Registers + Sequences | Collaborative editing (like Google Docs) |
| delta-CRDTs | State-based, but sends diffs | Riak 2.0, NDN (Naspers) |
| SoundCloud | Custom CRDTs | Playlist ordering across devices |

🎯 Practical Trade-off Decision Table

When would you choose CP vs AP? Here’s a decision table:

| Use Case | CAP Choice | Why | Example Systems |
| --- | --- | --- | --- |
| Banking / Ledger | CP | Cannot lose or duplicate transactions. Refuse writes during partition rather than risk inconsistency. | PostgreSQL sync replication, Spanner |
| DNS | AP | Better to serve a slightly stale IP than return error. The internet itself works this way. | All DNS servers (AP by necessity) |
| Shopping cart | AP | Losing a cart item is worse than briefly seeing a stale cart. CRDTs reconcile smoothly. | DynamoDB, Cassandra |
| User sessions | AP | Stale session data (e.g., showing user as logged out for 1 second) is acceptable. Downtime is not. | Redis Cluster, ElastiCache |
| Stock inventory | CP | Overselling stock due to inconsistent counts costs real money and trust. | MySQL sync replication, PostgreSQL |
| Social feed | AP | Seeing an old post for a few seconds is fine. The site being down is a headline. | Cassandra (used by Instagram?), DynamoDB |
| CI/CD pipeline state | CP | Recording an incorrect “build passed” status erodes trust. Wait for quorum. | PostgreSQL, etcd, Consul |
| Distributed locks / coordination | CP | Linearizability is non-negotiable for locks. An unavailable lock is better than a broken lock. | etcd (Raft), ZooKeeper (Zab), Consul |
| Content delivery (CDN) | AP | Serve stale cache during partition. Cannot serve = bad UX. Serving old version > 500 error. | CloudFront, Fastly, Cloudflare |

🧠 Key Takeaways

  1. “Pick two” is misleading. You must pick P because partitions are inevitable. The real choice is C vs A during a partition.

  2. Tunable consistency (DynamoDB, Cassandra) is the pragmatic middle ground. Choose consistency per operation, not per system.

  3. PostgreSQL sync replication is CP: it stops acknowledging writes during a partition. This is correct for financial data, terrible for social media.

  4. Cassandra/DynamoDB are AP: they accept writes during partition and reconcile later. This works for most web workloads.

  5. CRDTs make AP systems practical: they provide automatic conflict resolution without human intervention or complex rollback logic.

  6. True CA systems don’t exist in distributed settings. If your system is “CA,” you haven’t experienced a partition yet.

  7. The real world demands both. Many systems offer tunable consistency so you can be CP for critical operations and AP for everything else.

The CAP theorem doesn’t tell you what to build. It tells you what you’re giving up, so you can make that choice deliberately rather than discovering it during your next outage.

