Okay, let's move on to 2.2.b Caching. We touched upon this in Phase 1, but now we'll dive deeper into strategies, types, and considerations, as caching is fundamental to optimizing performance and reducing load in almost any system.
- Recap: Caching involves storing frequently accessed data temporarily in a faster storage layer (the cache) to speed up subsequent requests for that same data.
- Primary Goals:
- Reduce Latency: Serve requests faster by avoiding slower backend operations (like database queries or complex computations).
- Reduce Load: Decrease the number of requests hitting slower, more expensive backend resources (like databases or external APIs).
Where Can Caching Occur? (Layers of Caching)
Caching can happen at various levels in a system architecture:
- Client-Side Caching (Browser/Mobile App):
- Web browsers automatically cache static assets (HTML, CSS, JS, images) based on HTTP headers (e.g., `Cache-Control`, `Expires`); see the sketch below.
- Mobile apps can implement local caching for API responses or frequently used data.
- Pros: Fastest access (no network request), reduces network traffic.
- Cons: Less control over invalidation, limited storage, data might be stale if not managed properly.
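For a concrete picture of how a server opts responses into browser caching, here is a minimal sketch. Flask is assumed purely for illustration; any HTTP framework exposes the same headers.

```python
# Minimal sketch (assumes Flask): opting a response into browser caching.
from flask import Flask, Response

app = Flask(__name__)

@app.route("/styles.css")
def styles():
    resp = Response("body { margin: 0; }", mimetype="text/css")
    # Any cache (browser or CDN) may store this response for up to an hour.
    resp.headers["Cache-Control"] = "public, max-age=3600"
    return resp
```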
- CDN (Content Delivery Network):
- Geographically distributed network of proxy servers caching static content closer to users.
- Pros: Significantly reduces latency for global users, offloads traffic from origin servers, and often provides DDoS protection.
- Cons: Primarily for static assets (though some CDNs offer dynamic content caching), incurs cost.
- Load Balancer Caching:
- Some advanced Layer 7 load balancers can cache responses from backend servers.
- Pros: Reduces load on application servers.
- Cons: Cache features might be less sophisticated than dedicated cache solutions.
- Application/Server-Side Caching:
- In-Process Cache: Data stored within the application server's memory (e.g., using language-specific libraries like Guava Cache in Java, or simple dictionaries in Python).
- Pros: Extremely fast access (no network overhead).
- Cons: Cache is local to each server instance (not shared), limited by server RAM, data lost on application restart.
- Distributed Cache (External): A separate caching service shared by multiple application servers.
- Pros: Shared cache accessible by all application instances, survives application restarts, highly scalable, offers advanced features (data structures, persistence options).
- Cons: Introduces network latency (though typically much lower than DB access), requires managing separate infrastructure (or using a managed cloud service).
- Examples: Redis, Memcached. (Both styles are sketched below.)
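To make the two styles concrete, here is a minimal sketch in Python. The in-process half uses only the standard library; the distributed half assumes a Redis server on localhost and the redis-py client package.

```python
import functools

import redis  # assumption: the redis-py client package is installed

# In-process cache: lives inside this Python process, so every server
# instance keeps its own private copy, which is lost on restart.
@functools.lru_cache(maxsize=1024)
def expensive_computation(n: int) -> int:
    return sum(i * i for i in range(n))

# Distributed cache: a shared store that every application instance can reach.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)
r.set("greeting", "hello")  # visible to all servers
print(r.get("greeting"))    # "hello", even when read from another process
```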
- Database Caching:
- Databases often have internal caches (e.g., buffer pools, query caches) to optimize query performance.
- Pros: Often transparent to the application.
- Cons: Limited control from the application perspective, primarily focuses on optimizing DB operations, not necessarily caching application-level data objects.
Caching Strategies (How Applications Interact with the Cache)
- Cache-Aside (Lazy Loading):
- Workflow:
- Application checks the cache for data.
- Cache Hit: Return data from cache.
- Cache Miss: Application fetches data from the database.
- Application stores the fetched data in the cache.
- Application returns data.
- Pros: Resilient (application can still function if cache fails, albeit slower), only requested data is cached.
- Cons: Higher latency on initial cache misses (the "miss penalty"), potential for stale data between cache updates.
- This is a very common and versatile strategy; a minimal sketch follows below.
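A minimal cache-aside sketch in Python, assuming a Redis cache via redis-py; `query_db` is a hypothetical stand-in for the real database call.

```python
import json

import redis  # assumption: the redis-py client package

cache = redis.Redis(host="localhost", port=6379)

def query_db(product_id: int) -> dict:
    return {"id": product_id, "name": "widget"}  # stand-in for a real query

def get_product(product_id: int) -> dict:
    key = f"product:{product_id}"
    cached = cache.get(key)                      # 1. check the cache
    if cached is not None:
        return json.loads(cached)                # 2. hit: serve from cache
    product = query_db(product_id)               # 3. miss: fetch from the DB
    cache.set(key, json.dumps(product), ex=600)  # 4. populate (10-minute TTL)
    return product                               # 5. return to the caller
```

Steps 3 and 4 run only on a miss, which is exactly the "miss penalty" noted above.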
- Read-Through:
- Workflow:
- Application queries the cache for data.
- Cache checks if data exists internally.
- Cache Hit: Cache returns data to the application.
- Cache Miss: Cache fetches data from the underlying database.
- Cache stores the data internally.
- Cache returns data to the application.
- Pros: Application code is simpler (treats cache like the primary data source).
- Cons: Requires cache provider support, similar miss penalty and staleness concerns as cache-aside.
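Read-through logic normally lives inside the cache provider itself, but its behavior can be sketched as a small wrapper that the application treats as its only data source; `load_from_db` is a hypothetical loader.

```python
def load_from_db(key):
    return {"id": key}  # stand-in for a real database query

class ReadThroughCache:
    """Toy read-through cache: callers never touch the database directly."""

    def __init__(self, loader):
        self._loader = loader  # the cache owns DB access, not the application
        self._store = {}

    def get(self, key):
        if key not in self._store:                # miss: the cache fetches
            self._store[key] = self._loader(key)  # and stores internally
        return self._store[key]                   # hit or freshly loaded

users = ReadThroughCache(loader=load_from_db)
user = users.get(42)  # the application only ever talks to the cache
```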
- Write-Through:
- Workflow:
- Application writes data to the cache.
- Cache writes the data to the underlying database synchronously.
- Database confirms write completion to the cache.
- Cache confirms write completion to the application.
- Pros: High consistency between cache and database, data is durable (written to DB). Reduces risk of data loss on cache failure.
- Cons: Higher write latency (waits for both cache and DB), cache can be a write bottleneck.
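In real deployments the cache layer performs the synchronous database write itself; the sketch below emulates that contract at the application level, with `write_to_db` as a hypothetical call that raises on failure.

```python
import json

import redis  # assumption: the redis-py client package

cache = redis.Redis(host="localhost", port=6379)

def write_to_db(product_id: int, product: dict) -> None:
    pass  # stand-in for a real write; assume it raises on failure

def save_product(product_id: int, product: dict) -> None:
    # 1. Write to the cache.
    cache.set(f"product:{product_id}", json.dumps(product))
    # 2. Synchronously write to the database before acknowledging.
    write_to_db(product_id, product)
    # 3. Control returns only after both writes succeed: cache and DB agree,
    #    at the cost of paying both write latencies on every save.
```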
- Write-Back (Write-Behind):
- Workflow:
- Application writes data to the cache.
- Cache immediately confirms write completion to the application.
- Cache writes the data to the underlying database asynchronously later (often in batches).
- Pros: Very low write latency, high write throughput (good for write-heavy loads).
- Cons: High risk of data loss if the cache fails before data is persisted to the DB. Eventual consistency between cache and DB. More complex implementation.
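Production systems usually lean on the cache product's own write-behind support, but the mechanics can be sketched with a buffer of dirty entries flushed in the background; `write_batch_to_db` is a hypothetical bulk insert.

```python
import threading
import time

_dirty = {}                # cache entries not yet persisted to the DB
_lock = threading.Lock()

def write(key, value):
    """Steps 1-2: write to the cache and acknowledge immediately."""
    with _lock:
        _dirty[key] = value

def write_batch_to_db(batch: dict) -> None:
    print(f"persisting {len(batch)} entries")  # stand-in for a bulk insert

def _flush_forever(interval_sec: float = 5.0):
    """Step 3: persist asynchronously, in batches. If the process dies
    before a flush, the buffered writes are lost (the key risk above)."""
    while True:
        time.sleep(interval_sec)
        with _lock:
            batch = dict(_dirty)
            _dirty.clear()
        if batch:
            write_batch_to_db(batch)

threading.Thread(target=_flush_forever, daemon=True).start()
```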
- Write-Around:
- Workflow:
- Application writes data directly to the database (bypassing the cache).
- Data might be loaded into the cache later during a read operation (using cache-aside or read-through).
- Pros: Avoids flooding the cache with data that might not be read soon (useful for write-heavy workloads where reads are infrequent). Database handles write load directly.
- Cons: Reads of recently written data will result in a cache miss, increasing read latency initially.
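A write-around sketch is notable mostly for what it omits: the write path never touches the cache. `write_to_db` is again a hypothetical stand-in.

```python
def write_to_db(event_id: int, event: dict) -> None:
    print(f"INSERT event {event_id}")  # stand-in for a real insert

def save_event(event_id: int, event: dict) -> None:
    # The write goes straight to the database; deliberately no cache.set().
    write_to_db(event_id, event)
    # The entry only enters the cache if a later read misses and loads it
    # (via cache-aside or read-through), so the first read pays a miss.
```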
Cache Eviction Policies (Recap):
- When the cache is full, some data must be removed (evicted). Common policies:
- LRU (Least Recently Used): Removes the item not accessed for the longest time. (Common default)
- LFU (Least Frequently Used): Removes the item accessed the fewest times.
- FIFO (First-In, First-Out): Removes the item that has been in the cache the longest.
- The best policy depends on the data access patterns (a toy LRU implementation is sketched below).
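As an illustration of how a policy decides what to evict, here is a toy LRU cache built on the standard library's OrderedDict.

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU cache: evicts the least recently accessed entry when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()  # key order doubles as recency order

    def get(self, key):
        if key not in self._data:
            return None                     # miss
        self._data.move_to_end(key)         # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used

cache = LRUCache(capacity=2)
cache.put("a", 1); cache.put("b", 2)
cache.get("a")     # touch "a", so "b" is now the least recently used
cache.put("c", 3)  # evicts "b"
```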
Cache Invalidation (Recap & Importance):
- Ensuring data in the cache is up-to-date is crucial.
- TTL (Time-To-Live): Set an expiration time for cache entries. Simple but can lead to stale data if the underlying data changes before TTL expires.
- Active Invalidation: Explicitly remove or update cache entries when the corresponding data changes in the database (e.g., using database triggers or application logic). More complex but ensures better consistency.
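With a store like Redis, both techniques are one-liners; the sketch below assumes redis-py, with `update_db` as a hypothetical database write.

```python
import json

import redis  # assumption: the redis-py client package

cache = redis.Redis(host="localhost", port=6379)

# TTL: the entry silently expires after 60 seconds, bounding how stale
# it can ever get.
cache.set("user:42", json.dumps({"name": "Ada"}), ex=60)

def update_db(user_id: int, user: dict) -> None:
    pass  # stand-in for a real UPDATE

def update_user(user_id: int, user: dict) -> None:
    update_db(user_id, user)
    # Active invalidation: drop the cached copy the moment the source of
    # truth changes; the next read repopulates it with fresh data.
    cache.delete(f"user:{user_id}")
```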
Key Considerations & Trade-offs:
- Cache Hit Ratio: The percentage of requests served from the cache. Aim to maximize this (a quick way to measure it is sketched after this list).
- Staleness: How long are you willing to tolerate potentially stale data? (Performance vs. Consistency)
- Cache Size & Cost: Balancing the amount of data cached vs. memory/infrastructure costs.
- Complexity: Implementing and managing caching adds complexity to the system.
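On measuring the hit ratio: many caches track it for you. Redis, for example, exposes keyspace_hits and keyspace_misses in its stats section; a quick check via redis-py might look like this.

```python
import redis  # assumption: the redis-py client package

r = redis.Redis(host="localhost", port=6379)
stats = r.info("stats")  # Redis keeps hit/miss counters itself
hits, misses = stats["keyspace_hits"], stats["keyspace_misses"]
total = hits + misses
print(f"cache hit ratio: {hits / total:.1%}" if total else "no traffic yet")
```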
In an Interview:
Discuss where you would place caches in your design (CDN, distributed cache, etc.). Explain the caching strategy you would use (e.g., cache-aside) and justify it based on the read/write patterns and consistency requirements. Mention eviction policies and TTL/invalidation strategies to handle stale data. Show you understand the trade-offs.