Overview
What is caching?
A cache is a high-speed data storage layer that stores a subset of data. Caching lets you efficiently reuse previously retrieved or computed data.
Caching types
CPU caches: L1, L2, L3, TLB.
HTTP caching
Usage
Use Cache-Control header:
Cache-Control: no-store
Cache-Control: no-cache
Cache-Control: private
Cache-Control: public
Cache-Control: max-age=<seconds>
Cache-Control: must-revalidate
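A minimal sketch of how a cache might interpret the directives above. The function names are hypothetical, and the logic looks only at these directives (real caches also consider the request method, status code, and other headers).

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    # Simplification: does not handle commas inside quoted arguments.
    for part in value.split(","):
        name, sep, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') if sep else None
    return directives


def shared_cache_may_store(value: str) -> bool:
    """no-store forbids storing anywhere; private forbids shared caches."""
    directives = parse_cache_control(value)
    return "no-store" not in directives and "private" not in directives


def freshness_lifetime(value: str) -> int:
    """Seconds a stored response may be reused without revalidation."""
    directives = parse_cache_control(value)
    # no-cache allows storing but requires revalidation before every reuse.
    if "no-store" in directives or "no-cache" in directives:
        return 0
    max_age = directives.get("max-age")
    return int(max_age) if max_age is not None and max_age.isdigit() else 0
```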
Cache validation:
ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4" // Strong validator
ETag: W/"67ab43" // Weak validator
Request header: If-None-Match
If-None-Match: "bfc13a64729c4290ef5b2c2730249c88ca92d82d"
If-None-Match: W/"67ab43", "54ed21", "7892dd"
If-None-Match: *
Response header: Last-Modified
Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT
Request header: If-Modified-Since
If-Modified-Since: Wed, 21 Oct 2015 07:28:00 GMT
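A minimal sketch of how a server could use these validators to decide between `304 Not Modified` and a full response. The function name `is_not_modified` is hypothetical; the sketch assumes `last_modified` is a timezone-aware datetime and splits the ETag list on commas, which is a simplification.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Dict


def _opaque(etag: str) -> str:
    """Drop the W/ prefix so weak and strong tags compare equal (weak comparison)."""
    etag = etag.strip()
    return etag[2:] if etag.startswith("W/") else etag


def is_not_modified(request_headers: Dict[str, str], etag: str,
                    last_modified: datetime) -> bool:
    """Return True if the server may answer 304 Not Modified."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match takes precedence; If-Modified-Since is then ignored.
        if if_none_match.strip() == "*":
            return True  # matches any existing representation
        candidates = {_opaque(tag) for tag in if_none_match.split(",")}
        return _opaque(etag) in candidates

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
            # HTTP dates have one-second resolution.
            return last_modified.replace(microsecond=0) <= since
        except (TypeError, ValueError):
            return False  # unparsable date: send the full response
    return False
```

When the function returns True, the server sends `304 Not Modified` with no body and the client reuses its stored copy.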
Strategies
Weak validation vs strong validation:
Strong validation guarantees that the resource is byte-for-byte identical to the one it is compared to. Weak validation considers two versions of a document identical if their content is equivalent. For example, a page that differs from another only by the date in its footer is considered identical under weak validation.
Infrequently updated files are named in a specific way: a revision (or version) number is added to their URL, usually in the filename.
Varying responses:
The Vary HTTP response header describes the parts of the request message, aside from the method and URL, that influenced the content of the response it occurs in.
Vary: *
Vary: <header-name>, <header-name>, ...
Vary: Accept-Encoding
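A minimal sketch of how a cache could fold the headers named in Vary into its lookup key. The function name `cache_key` and the key layout are assumptions for illustration, not how any particular cache server does it.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, str, Tuple[Tuple[str, str], ...]]


def cache_key(method: str, url: str, request_headers: Dict[str, str],
              vary: str) -> Optional[Key]:
    """Build a lookup key from method, URL and the request headers listed in Vary.

    Returns None for "Vary: *", which never matches a later request.
    """
    if vary.strip() == "*":
        return None
    # Header names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    varied = tuple((name, lowered.get(name, "")) for name in names)
    return (method.upper(), url, varied)
```

Two requests for the same URL share a cache entry only if every header listed in Vary has the same value in both.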
Normalization: Caching servers will by default match future requests only to requests with exactly the same headers and header values. To avoid unnecessary requests and duplicated cache entries, caching servers should use normalization to pre-process the request and cache only files that are needed.
Accept-Encoding: gzip,deflate,sdch
Accept-Encoding: gzip,deflate
Accept-Encoding: gzip
// Normalize Accept-Encoding
if (req.http.Accept-Encoding) {
    if (req.http.Accept-Encoding ~ "gzip") {
        set req.http.Accept-Encoding = "gzip";
    }
    // elseif other encoding types to check
    else {
        unset req.http.Accept-Encoding;
    }
}
Database cache
Avoid using the MySQL query cache (for versions < 5.7.20; the query cache is deprecated as of MySQL 5.7.20 and removed in MySQL 8.0):
Enabling the query cache adds some overhead for both reads and writes (from High Performance MySQL, 3rd edition):
Read queries must check the cache before beginning.
If the query is cacheable and isn’t in the cache yet, there’s some overhead due to storing the result after generating it.
There’s overhead for write queries, which must invalidate the cache entries for queries that use tables they change. Invalidation can be very costly if the cache is fragmented and/or large (has many cached queries, or is configured to use a large amount of memory).
Server application caching
Cache types
Using a shared cache
Eviction Policies
Frequency and recency
Caching strategies
Cache-aside
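A minimal sketch of cache-aside, in which the application itself checks the cache, falls back to the data store on a miss, and populates the cache afterwards. The dictionaries `database` and `cache` and the functions `get_user` and `update_user` are hypothetical stand-ins for a real store and a cache such as Redis or Memcached.

```python
from typing import Dict, Optional

database: Dict[int, str] = {1: "alice", 2: "bob"}  # stand-in for the source of truth
cache: Dict[int, str] = {}                         # stand-in for the cache server
stats = {"db_reads": 0}                            # counts trips to the data store


def get_user(user_id: int) -> Optional[str]:
    """Read path: try the cache first, load from the store on a miss."""
    if user_id in cache:
        return cache[user_id]          # cache hit
    stats["db_reads"] += 1             # cache miss: go to the data store
    value = database.get(user_id)
    if value is not None:
        cache[user_id] = value         # populate the cache for later reads
    return value


def update_user(user_id: int, name: str) -> None:
    """Write path: update the store, then invalidate the cached copy."""
    database[user_id] = name
    cache.pop(user_id, None)
```

Only keys that were actually read end up in `cache`, and if the cache is empty or lost, reads still succeed by falling back to `database`.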
Pros:
Only requested data is cached.
Node failures aren't fatal for your application.
Cons:
A cache miss can trigger a cache stampede: many concurrent requests for the same missing key hit the data store at once.
Read-through/Write-through
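A minimal sketch of this strategy, assuming the application talks only to the cache object, which loads missing keys from the backing store and writes to the store and the cache in the same operation. The class name `WriteThroughCache` and the dict-based store are hypothetical.

```python
from typing import Any, Dict, Optional


class WriteThroughCache:
    """Cache that sits in front of a backing store for both reads and writes."""

    def __init__(self, store: Dict[str, Any]) -> None:
        self._store = store                  # stand-in for the database
        self._cache: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        # Read-through: on a miss the cache itself loads from the store.
        if key not in self._cache:
            if key not in self._store:
                return None
            self._cache[key] = self._store[key]
        return self._cache[key]

    def put(self, key: str, value: Any) -> None:
        # Write-through: the store and the cache are updated together,
        # so a later read never sees a value older than the store's.
        self._store[key] = value
        self._cache[key] = value
```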
Pros:
Data in the cache is never stale.
It simplifies cache expiration.
Cons:
Cache can be filled with unnecessary objects.
When cache nodes fail, cached objects may no longer be in the cache.
Implementation:
In the application layer.
Write-behind (write-back)
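These notes give no details for this strategy. Under the usual reading, writes land in the cache immediately and are persisted to the backing store later in batches. A minimal sketch under that assumption follows; the class name `WriteBehindCache` and the explicit `flush` call (which a background worker or timer would normally trigger) are hypothetical.

```python
from typing import Any, Dict, Optional


class WriteBehindCache:
    """Cache that acknowledges writes immediately and persists them later."""

    def __init__(self, store: Dict[str, Any]) -> None:
        self._store = store                  # stand-in for the database
        self._cache: Dict[str, Any] = {}
        self._dirty: Dict[str, Any] = {}     # pending writes, coalesced per key

    def put(self, key: str, value: Any) -> None:
        # The write only touches memory; the store is updated on flush().
        self._cache[key] = value
        self._dirty[key] = value

    def get(self, key: str) -> Optional[Any]:
        if key in self._cache:
            return self._cache[key]
        return self._store.get(key)

    def flush(self) -> int:
        """Persist all pending writes and return how many keys were written."""
        pending, self._dirty = self._dirty, {}
        self._store.update(pending)
        return len(pending)
```

Repeated writes to the same key are merged into one store write, but anything not yet flushed is lost if the cache node fails.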
Refresh-ahead
Use the Change Data Capture (CDC) pattern: listen to changes from the database and refresh the related cache data.
Use a cron job to periodically refresh cache data.
Cache data
Results of expensive calculations.
Should use a binary format such as msgpack, cbor, or bson.
Concerns when using cache
Monitoring: