Cache Part 2 — Cache update strategy

SAKSHI CHHABRA
3 min read · Mar 22, 2022

Since we can only store a limited amount of data in a cache, we need to choose the cache update strategy that makes the best use of that limited memory.

We need to update the cache frequently so that the most recently used data stays in it. Our target is to maximize cache hits while minimizing cache misses.

In case you haven’t read my previous blog, I would recommend reading it here first. It covers the basics of caching along with the various cache levels, advantages, disadvantages and use cases.

There are four main cache update strategies that we are going to discuss here:

Cache aside

The application is responsible for reading and writing data to both the database and the cache. The cache does not interact with the data store directly.

The application first looks for an entry in the cache. If it’s a cache hit, voila. If it’s a cache miss, the application fetches the data from the database and adds the entry to the cache.
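The read path described above can be sketched in a few lines. This is only a minimal illustration: the class name `CacheAsideReader`, the `db_reads` counter and the plain dictionaries standing in for the cache and the database are all assumptions made for the example, not a real client library.

```python
class CacheAsideReader:
    """Application-side helper: the app talks to both the cache and the database."""

    def __init__(self, database):
        self.database = database  # plain dict standing in for the real database
        self.cache = {}           # plain dict standing in for a cache server
        self.db_reads = 0         # counts how often we had to go to the database

    def get(self, key):
        if key in self.cache:             # cache hit: return straight away
            return self.cache[key]
        self.db_reads += 1                # cache miss: read from the database
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value       # add the entry for later reads
        return value


reader = CacheAsideReader(database={"user:1": "Alice"})
first = reader.get("user:1")    # miss: loaded from the database
second = reader.get("user:1")   # hit: served from the cache
```

Only the first call touches the database; the second one is answered from the cache.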

Advantages:

  • This is also known as lazy loading: only requested data is loaded into the cache, which avoids filling the cache with data that is never requested
  • Subsequent reads of data already in the cache are fast

Disadvantages:

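  • Noticeable delay, as each cache miss results in three trips: checking the cache, reading from the database, and writing the entry to the cache
  • When a node fails, it’s replaced by a new, empty node, which increases latency until the cache fills up again
  • Data can become stale if it is updated in the database but not in the cache. This can be mitigated by configuring a TTL (time-to-live), which forces the cache entry to be refreshed once it expires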
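To make the TTL idea concrete, here is a rough sketch of a cache whose entries expire after a fixed number of seconds. The `TTLCache` class and its injectable `clock` parameter are hypothetical and exist only to keep the example self-contained; real caches expose TTL through their own configuration.

```python
import time


class TTLCache:
    """Toy cache where every entry expires ttl_seconds after it was set."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock    # injectable so the example can fake the time
        self.entries = {}     # key -> (value, expires_at)

    def set(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        item = self.entries.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self.clock() >= expires_at:
            del self.entries[key]   # expired: drop it and report a miss
            return None
        return value


now = [0.0]                                   # fake clock for the demo
cache = TTLCache(ttl_seconds=30, clock=lambda: now[0])
cache.set("price", 100)
fresh = cache.get("price")                    # still within the TTL
now[0] = 31.0                                 # move past the TTL
expired = cache.get("price")                  # entry has expired
```

After the entry expires, the next read is a miss, so the application goes back to the database and picks up the current value.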

Example: Memcached is generally used in this manner.

Write Through

The application uses the cache as its main data storage system. The application reads, writes or updates data in the cache, while it’s the cache’s responsibility to synchronously read, write or update that data in the data store.

Write through is slow overall because every write has to reach both the cache and the data store, but subsequent reads of recently written data are fast. This usually works out, as users are more patient when writing data than when reading it.
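A minimal sketch of the write path, assuming a hypothetical `WriteThroughCache` class and a plain dictionary standing in for the data store:

```python
class WriteThroughCache:
    """Toy cache that owns the synchronous write to the data store."""

    def __init__(self, data_store):
        self.data_store = data_store  # plain dict standing in for a database
        self.entries = {}

    def set(self, key, value):
        self.entries[key] = value      # write to the cache first
        self.data_store[key] = value   # then write synchronously to the data store

    def get(self, key):
        return self.entries.get(key)   # reads are served from the cache


data_store = {}
cache = WriteThroughCache(data_store)
cache.set("user:1", "Alice")
cached_value = cache.get("user:1")
```

The call to `set` does not return until both copies are written, which is where the extra write latency comes from.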

Advantages:

  • Data in the cache is never stale, as every write goes to the cache first
  • Subsequent reads of just-written data are fast

Disadvantages:

  • A major proportion of the written data may never be read. This can be handled by configuring a TTL, which removes entries that are not read before they expire
  • When a new node is created due to failure or scaling, the new node won’t cache an entry until that entry is next updated in the database

Write Behind

The application uses the cache as its main storage system. The application reads and writes data in the cache; the cache adds each write event to a queue, and an event processor applies the events to the data store asynchronously, which improves write performance.
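Here is a rough sketch of that flow. The `WriteBehindCache` class, its `flush` method and the dictionary data store are assumptions made for illustration; in a real system the event processor would run in a background worker rather than being called by hand.

```python
from collections import deque


class WriteBehindCache:
    """Toy cache that queues writes and applies them to the data store later."""

    def __init__(self, data_store):
        self.data_store = data_store  # plain dict standing in for a database
        self.entries = {}
        self.queue = deque()          # pending write events

    def set(self, key, value):
        self.entries[key] = value          # the write is visible in the cache at once
        self.queue.append((key, value))    # the data store write is deferred

    def get(self, key):
        return self.entries.get(key)

    def flush(self):
        """Event processor: drain the queue in order into the data store."""
        while self.queue:
            key, value = self.queue.popleft()
            self.data_store[key] = value


data_store = {}
cache = WriteBehindCache(data_store)
cache.set("user:1", "Alice")
pending = "user:1" not in data_store   # the data store has not been written yet
cache.flush()                          # the event processor catches up
```

Between `set` and `flush` the new value exists only in the cache and the queue, which is the window in which a cache failure would lose it.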

Advantages:

  • Faster write performance due to asynchronous writes to the data store
  • Data in the cache is never stale, as every write goes to the cache first
  • Subsequent reads of just-written data are fast

Disadvantages:

  • Possible data loss if the cache goes down before its queued writes reach the data store
  • More complex to implement than the two strategies above

Refresh Ahead

The cache can be configured to automatically refresh any recently accessed entry before its expiration (TTL).

Refresh ahead can reduce latency if the cache accurately predicts which items will be needed in the future and refreshes them in time.
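One way to picture this is the sketch below. It assumes a hypothetical `RefreshAheadCache` with a `refresh_due` method that a background task would call periodically, and it uses "read at least once since the last load" as a very simple stand-in for predicting which items will be needed. The `loader`, `refresh_window` and `clock` names are made up for the example.

```python
import time


class RefreshAheadCache:
    """Toy cache that reloads recently read entries shortly before they expire."""

    def __init__(self, loader, ttl_seconds, refresh_window, clock=time.monotonic):
        self.loader = loader                  # function that reads from the data store
        self.ttl = ttl_seconds
        self.refresh_window = refresh_window  # how long before expiry to refresh
        self.clock = clock
        self.entries = {}                     # key -> dict(value, expires_at, accessed)

    def _load(self, key, now, accessed):
        value = self.loader(key)
        self.entries[key] = {"value": value, "expires_at": now + self.ttl,
                             "accessed": accessed}
        return value

    def get(self, key):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is None or now >= entry["expires_at"]:
            return self._load(key, now, accessed=True)   # miss: slow path
        entry["accessed"] = True                          # remember the read
        return entry["value"]

    def refresh_due(self):
        """Reload entries that were read recently and are about to expire."""
        now = self.clock()
        for key, entry in list(self.entries.items()):
            remaining = entry["expires_at"] - now
            if entry["accessed"] and 0 < remaining <= self.refresh_window:
                self._load(key, now, accessed=False)


now = [0.0]      # fake clock for the demo
loads = []       # records every trip to the data store

def load_from_store(key):
    loads.append(key)
    return f"value-for-{key}"

cache = RefreshAheadCache(load_from_store, ttl_seconds=60, refresh_window=10,
                          clock=lambda: now[0])
cache.get("a")           # miss at t=0, entry would expire at t=60
now[0] = 55.0
cache.refresh_due()      # within the refresh window: reloaded ahead of expiry
now[0] = 70.0
value = cache.get("a")   # still a hit, even though the original TTL has passed
```

Entries that nobody read since their last load are not refreshed and simply expire, so the cache does not keep reloading data that is no longer needed.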

Advantages:

  • Provides customers with close to up-to-date data while improving performance

Disadvantages:

  • Reduced performance if the cache cannot accurately predict which items will be needed in the future

Every application has different needs, so the cache update strategy should be selected based on the use case. A combination of update strategies can also be used.

In the next blog, I will cover communication between client and server.

Follow me if you want to get notified about my upcoming blogs!

SAKSHI CHHABRA

Master's student in Computer Science at the University of Florida. I love to write and help others, so here I am.