Introduction to Cache Part 1

Mar 18, 2022

A cache is a high-speed data storage layer that holds a small subset of data, so that future requests for that data can be served faster than by accessing the primary storage system.

Caches are typically key-value stores with O(1) average lookup time, and they can be set up at many levels of a system to increase throughput and reduce latency.

Caching is the process of storing recently or frequently used items in this fast storage layer.

In this blog we will cover the basics of caching: the motivation, some facts about caching, levels of caching, advantages, disadvantages and use cases.

Need for caching:

A cache improves page load time and takes a lot of load off servers and/or databases. The main purpose of caching is to perform frequent tasks more quickly so that customers are served efficiently.

Let’s use a simple example to understand the importance of caching:

Fetching website without cache

Without caches: Suppose a user requests a website from the browser. The browser fetches the HTML page from the server and renders it for the user. If the user requests the same website again after a while, the browser again fetches the HTML page from the server and renders it.

Under this model, the browser has to do a complete round trip to fetch the HTML page every time, even for very popular pages that are requested frequently. The internet connection is the slowest link in the entire process, so we want to set something up that minimizes these round trips.

Fetching website with cache

With caches: Now suppose we have a cache on the hard disk that stores the HTML pages received from the server.

When the user requests a website from the browser for the first time, the browser fetches the HTML page from the server, stores a copy in the cache and displays it to the user. When the user requests the same website the next time, the browser checks the cache first. If the HTML page exists in the cache, the browser renders the page from the cache. Otherwise, the browser fetches the page from the server, stores a copy in the cache and renders it. When the page is found in the cache, the browser responds faster and the client is served more efficiently.
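The flow described above can be sketched in a few lines of Python. This is only an illustration, not how a real browser is implemented: `PageCache` and `fake_server` are hypothetical names, the cache is a plain in-memory dictionary rather than a disk cache, and the "server" is a stand-in function instead of a real network call.

```python
from typing import Callable, Dict


class PageCache:
    """Minimal cache for fetched pages, keyed by URL (illustrative only)."""

    def __init__(self, fetch_from_server: Callable[[str], str]) -> None:
        self._fetch = fetch_from_server
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get_page(self, url: str) -> str:
        # 1. Check the cache first.
        if url in self._store:
            self.hits += 1
            return self._store[url]
        # 2. Not cached: do the slow round trip, then keep a copy.
        self.misses += 1
        html = self._fetch(url)
        self._store[url] = html
        return html


def fake_server(url: str) -> str:
    # Hypothetical stand-in for a real network round trip.
    return f"<html><body>Content of {url}</body></html>"


cache = PageCache(fake_server)
first = cache.get_page("https://example.com/")   # fetched from the "server"
second = cache.get_page("https://example.com/")  # served from the cache
```

After these two calls, `cache.misses` and `cache.hits` are both 1: the first request paid for the round trip, and the second one did not.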

Now suppose the client requests a website that isn’t available in the cache. In this case, the browser is slightly less efficient with a cache than without one, because it spends time looking in the cache before going to the server.

One of the challenges of cache design is to minimize the cost of searching the cache. The time spent looking in the cache should be so small compared to the time taken to fetch the HTML page that it can be ignored.
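To make this trade-off concrete, here is a small back-of-the-envelope calculation. It assumes a simple model that is not from this post's example: every request pays for the cache lookup, and a miss additionally pays for the full fetch. The function name and the timings are made up for illustration.

```python
def average_access_time(hit_rate: float,
                        cache_lookup_ms: float,
                        origin_fetch_ms: float) -> float:
    """Expected time per request under a simple hit/miss model."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    hit_cost = cache_lookup_ms                     # found in the cache
    miss_cost = cache_lookup_ms + origin_fetch_ms  # looked, then fetched anyway
    return hit_rate * hit_cost + (1.0 - hit_rate) * miss_cost


# Assumed numbers: 1 ms to search the cache, 200 ms to fetch from the server.
with_good_cache = average_access_time(0.8, 1.0, 200.0)
with_useless_cache = average_access_time(0.0, 1.0, 200.0)
```

With an 80% hit rate the average comes out at roughly 41 ms instead of 200 ms. With a 0% hit rate it is 201 ms, slightly worse than having no cache at all, which is why the lookup itself has to be cheap.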

Facts about caching:

  1. Caching uses a faster but smaller type of memory to accelerate a task that depends on a slower but larger type of memory.
  2. If an item exists in the cache, it’s called a cache hit. Otherwise, it’s called a cache miss, and the slower task needs to be executed.
  3. The cache must be a faster memory type with a small size, so that its lookup time is negligible compared to a lookup in the larger but slower memory. In the example above, the faster memory is the hard disk, while the slower one is the internet.
  4. Caching can be done at multiple levels. The first level could be at the browser, while the second level could be at the load balancer that distributes requests to the servers.
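A rough sketch of how hits, misses and multiple levels fit together is shown below. The `lookup` function and the dictionaries standing in for the browser and load balancer caches are hypothetical; copying a value into the faster levels after a lower-level hit is one common choice, assumed here for illustration.

```python
from typing import Callable, Dict, List, Tuple


def lookup(key: str,
           levels: List[Dict[str, str]],
           origin: Callable[[str], str]) -> Tuple[str, str]:
    """Search caches from fastest to slowest; fall back to the origin."""
    for index, level in enumerate(levels):
        if key in level:
            value = level[key]
            # Copy the value into the faster levels so the next lookup
            # is answered earlier.
            for faster in levels[:index]:
                faster[key] = value
            return value, f"level {index + 1} hit"
    # Cache miss at every level: run the slow task and fill all levels.
    value = origin(key)
    for level in levels:
        level[key] = value
    return value, "miss"


def origin_server(path: str) -> str:
    # Hypothetical stand-in for the real server.
    return f"<html>{path} from server</html>"


browser_cache: Dict[str, str] = {}
load_balancer_cache: Dict[str, str] = {"/home": "<html>home</html>"}
levels = [browser_cache, load_balancer_cache]

r1 = lookup("/home", levels, origin_server)   # found at the second level
r2 = lookup("/home", levels, origin_server)   # now found at the first level
r3 = lookup("/about", levels, origin_server)  # not cached anywhere
```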

Types of caches:

  • Client cache: A cache on the client side, i.e. on the client’s computer (for example, the browser cache).
  • CDN cache: A content delivery network is a network of proxy servers that serve content to users from a location closer to them. For more information, see content delivery network.
  • Web server cache: These caches are set up at the load balancer, which distributes load evenly among all servers. They return cached content directly without contacting the servers. For more information, see load balancer.
  • Database cache: Default database configurations usually include some form of caching. In addition, popular items can skew the distribution of reads and writes and cause bottlenecks; putting a cache in front of the database helps absorb this uneven load and spikes in traffic.
  • Within-application cache: These caches are set up between the application and the database. Since an in-memory cache holds data in RAM, it is much faster than reading from the database, where data is stored on disk. Algorithms such as LRU (Least Recently Used) keep ‘hot’ entries in RAM and evict ‘cold’ ones.
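To give a feel for how LRU keeps hot entries and drops cold ones, here is a minimal sketch built on Python's `collections.OrderedDict`. The class name and its methods are illustrative choices, not the API of any particular caching library.

```python
from collections import OrderedDict
from typing import Any, Hashable


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable, default: Any = None) -> Any:
        if key not in self._items:
            return default
        # Reading an entry makes it the most recently used one.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # The first entry is the least recently used ("cold") one.
            self._items.popitem(last=False)


lru = LRUCache(capacity=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")     # "a" is now hotter than "b"
lru.put("c", 3)  # cache is full, so the cold entry "b" is evicted
```

After the last call, `lru.get("b")` returns `None`, while `"a"` and `"c"` are still cached.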

Within an application, caching is done at two main levels:

(a) Database level: Hash the query to use as the key and store the query result as the value in the cache. The drawbacks are that the entire cached query must be deleted if even one field in its result may have changed, and that it is hard to work out which cached entries to delete for complex queries.

(b) Object level: Store the data as an object, as the application sees it, rather than as a raw query result. Object-level caching is easy to add if your application is already structured with a DAO (data access object) layer. The cached object can be removed when the underlying data changes, and object-level caching allows for asynchronous processing.
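The difference between the two levels shows up mostly in how cache keys are built. The sketch below is one possible way to do it, under assumptions of my own: the helper names, the `user:42` key format and the use of SHA-256 for hashing are illustrative choices, and a plain dictionary stands in for the cache.

```python
import hashlib
from typing import Any, Dict, Tuple

cache: Dict[str, Any] = {}  # stand-in for a real cache


def query_cache_key(sql: str, params: Tuple[Any, ...]) -> str:
    """Database level: the key is a hash of the query and its parameters."""
    raw = sql + "|" + repr(params)
    return "query:" + hashlib.sha256(raw.encode("utf-8")).hexdigest()


def object_cache_key(kind: str, object_id: int) -> str:
    """Object level: the key names the object directly."""
    return f"{kind}:{object_id}"


# Database level: cache the rows returned by a query.
sql = "SELECT id, name FROM users WHERE country = %s"
q_key = query_cache_key(sql, ("IN",))
cache[q_key] = [(42, "Asha"), (43, "Ravi")]

# Object level: cache one user as an object.
o_key = object_cache_key("user", 42)
cache[o_key] = {"id": 42, "name": "Asha"}

# When user 42 changes, the object entry is easy to find and remove.
cache.pop(object_cache_key("user", 42), None)
# The hashed query key gives no hint that its result contains user 42,
# which is what makes query-level entries hard to invalidate precisely.
```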


Advantages:

  • Improves application performance: Memory access is many times faster than disk access, so reading data from an in-memory cache is extremely fast. This speeds up data access and improves overall application performance.
  • Reduces DB cost: A single cache instance can serve hundreds of requests per second or more, potentially replacing several database instances. This is especially helpful when the database is priced per unit of throughput, since it reduces overall cost.
  • Helps eliminate hotspots: In many applications, a small subset of the data (say 5%) is accessed much more frequently than the remaining 95%. Rather than over-provisioning database resources to handle the throughput needed for those few entries, we can store the most frequently used entries in the cache, which removes the need for the extra resources.


Disadvantages:

  • Data storage is temporary: an in-memory cache is volatile, so its data survives only as long as power is supplied.
  • A cache can become a bottleneck if it’s not configured properly, because every miss pays the cost of the cache lookup on top of the lookup in the main data store.
  • Cache memory is much more expensive than primary or secondary memory for the same capacity.

Use cases:

  • Content Delivery Network: When your traffic is dispersed globally, it’s not cost-efficient to replicate your entire infrastructure in every region. A CDN lets you use a global network to deliver cached copies of web content from the location nearest to the customer, which reduces response time.
  • Domain Name System: Web requests rely on DNS servers to translate the domain name requested by the client into an address. Setting up a cache to store DNS query results can reduce the load on DNS servers and speed up connection setup between client and server.
  • Session Management: An HTTP session contains the data exchanged between users and the web application, such as login information, customer information, cart information etc. HTTP sessions provide a great user experience by storing user information and preferences, and keeping this session data in a cache makes it quick to read on every request.
  • Database: Setting up a cache between the application and the database increases throughput and reduces query latency many times over, thus improving the overall performance of the application.
  • Web caching: Much of the latency in delivering web content comes from transferring images, videos, HTML documents etc. Caching can be set up on the client side and the server side to reduce server load and speed up website loading.

and many more.
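Use cases such as DNS results and sessions usually need cached entries to expire after some time. Expiry is not covered in this post, so treat the following as a side sketch under my own assumptions: `TTLCache` is a hypothetical class, the 60-second lifetime is arbitrary, and the address is a reserved documentation address rather than a real DNS answer. The clock is passed in so that the example can simulate time passing.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Cache whose entries expire after a fixed time-to-live."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (value, time at which the entry expires)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None  # never cached
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # too old: treat as a miss
            return None
        return value


# Simulated clock so the example does not have to sleep.
now = [0.0]
dns_cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])

dns_cache.put("example.com", "192.0.2.1")  # placeholder address
fresh = dns_cache.get("example.com")       # still valid

now[0] = 61.0                              # 61 seconds later
stale = dns_cache.get("example.com")       # expired, must be looked up again
```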

Popular in-memory caches are Redis and Memcached.

To understand the structure and algorithm of a cache, go through the LRU Cache problem on LeetCode.

In the upcoming blog, I will be covering cache updates in detail.

Follow me if you want to be notified about my upcoming blogs on system design.




Master's student in Computer Science at the University of Florida. I love to write and help others, so here I am.