Skip to content

🗄️ HLD: Database Scaling & Caching

📝 Overview

Scaling a database is one of the hardest challenges in HLD. This module covers techniques for handling massive read/write loads and ensuring high availability.

Core Concepts

  • Sharding: Horizontal partitioning. Choosing a shard key and avoiding hotspots.
  • Consistent Hashing: Minimizing data movement during re-sharding.
  • Replication: Active-Passive vs. Active-Active setups.
  • Indexing: B-Trees, LSM Trees, and how they impact query speed vs. write speed.

🚀 Strategies

1. Caching Strategies

  • Cache-Aside: Application manages cache.
  • Write-Through: Cache updated synchronously with DB.
  • Write-Back: Cache updated first, DB updated asynchronously (high performance, risk of data loss).

2. Distributed Transactions

  • Two-Phase Commit (2PC): Strong consistency, high latency, blocking.
  • Three-Phase Commit (3PC): Non-blocking but higher complexity.

3. Concurrency Control

  • Optimistic Concurrency: Versioning (e.g., version_id). Best for low contention.
  • Pessimistic Concurrency: Locks (e.g., SELECT FOR UPDATE). Best for high contention.
  • Two-Phase Locking (2PL): Ensuring serializability.