Notes from Harvard scalability lecture

SAKSHI CHHABRA
Jun 6, 2021


Lecture 9: Scalability from Harvard Web Development by David Malan is a good start to System Design. Link: https://youtu.be/-W9F__D3oY4 . Ideally, watch the video and then go through the notes for better clarity; otherwise, just go through the notes. Personally, I prefer reading over watching, to get through more content in less time.

Vertical Scaling:

  • Get a machine with more processors, more RAM, and more disk space.
  • Not a full solution: real-world constraints cap how big a single machine can get, and bigger machines become financially expensive.

Horizontal Scaling:

  • Use many cheaper machines rather than a few expensive ones.
  • Inbound HTTP requests are distributed over the available web servers by a load balancer.
  • The IP address of the load balancer is returned to the client. So the load balancer has a public IP address, while the servers can instead have private IP addresses.
  • Since the servers' IP addresses are private, they cannot be contacted directly from the internet, which keeps them safe from the bad guys. The world is also running out of 32-bit IPv4 addresses, so public IP addresses have become expensive.
https://codeburst.io/load-balancers-an-analogy-cc64d9430db0

Load balancing with BIND:

  • We can store the same data on all the servers and distribute client requests based on server load (complex), but this results in redundancy. Alternatively, we could have dedicated servers for different tasks.
  • Round robin (which BIND can do at the DNS level) can be used to distribute client requests across the servers; see the sketch after this list. Downside: one server could get all the computationally intensive tasks while the other servers get undemanding ones.
  • Browsers cache DNS lookups for a limited time, i.e. the TTL, so a client can keep landing on the same server until the cache expires.
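
A minimal sketch of round robin in Python (the server IPs and pool are made up; BIND does this at the DNS level by rotating which address it answers with):

```python
from itertools import cycle

# Hypothetical pool of backend web servers (private IPs, per the notes above).
SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
rotation = cycle(SERVERS)

def route_request() -> str:
    """Round robin: hand out servers in a fixed rotation, ignoring actual load."""
    return next(rotation)

for i in range(6):
    print(f"request {i} -> {route_request()}")
```

This shows the downside too: the rotation is blind, so nothing stops one server from receiving all the heavy requests.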

RAID:

  • RAID 0: Two hard drives of identical size, with consecutive blocks of data striped across them. Since each drive takes time to write a block, waiting time is minimized by writing the next block to the other drive.
  • RAID 1: Data is written to both drives (mirroring). Data is saved redundantly, with a little performance overhead because everything is written twice. Upside: data can be restored if one drive is damaged and needs to be replaced.
  • RAID 10: Combination of RAID 0 and RAID 1; at least four drives are needed.
  • RAID 5: Variant of RAID 1 with better capacity: only one drive's worth of space goes to redundancy, and the remaining drives are fully usable.
  • RAID 6: Like RAID 5, but with two drives' worth of redundancy instead of one, so any two drives can fail.
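
A toy Python model of the RAID 0 vs. RAID 1 ideas, just to show where blocks land (drives modeled as plain lists, purely illustrative):

```python
# Toy model: two "drives" represented as lists of blocks.

def raid0_write(blocks, drive_a, drive_b):
    # RAID 0 (striping): alternate consecutive blocks between the drives,
    # so writes to the two drives can overlap in time.
    for i, block in enumerate(blocks):
        (drive_a if i % 2 == 0 else drive_b).append(block)

def raid1_write(blocks, drive_a, drive_b):
    # RAID 1 (mirroring): every block goes to both drives,
    # so either drive alone can rebuild the data.
    for block in blocks:
        drive_a.append(block)
        drive_b.append(block)

a, b = [], []
raid0_write(["b0", "b1", "b2", "b3"], a, b)
print(a, b)   # ['b0', 'b2'] ['b1', 'b3']

a, b = [], []
raid1_write(["b0", "b1"], a, b)
print(a, b)   # ['b0', 'b1'] ['b0', 'b1']
```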
https://www.imperva.com/learn/availability/sticky-session-persistence-and-cookies/

Sticky Sessions:

Servers save session data locally (like the logged-in username). If the load balancer directs a client's next request to a different server, that session data isn't there, so the user appears logged out. Sessions need to "stick" to one server.

Approaches to solve the problem:

  • Shared storage: Store the session data in one shared place (e.g. a shared database or file server, possibly on the load balancer itself) so every server sees the same sessions. That shared store could itself die, which can be handled by using more than one. Robust, with fast writes, but it adds redundancy and another component.
  • Cookies: Store a big random number in a cookie identifying the server that handled the client's first request; the load balancer then directs that client back to the same server based on that number. See the sketch after this list.
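
A rough sketch of the cookie approach in Python (the cookie name and server list are hypothetical; a real balancer would store an opaque or hashed ID rather than a raw index):

```python
import random

SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
STICKY_COOKIE = "lb_server_id"   # hypothetical cookie name

def pick_server(cookies: dict) -> tuple[str, dict]:
    """Route to the server named in the cookie; set the cookie on first visit."""
    if STICKY_COOKIE in cookies:
        server_id = int(cookies[STICKY_COOKIE])
    else:
        server_id = random.randrange(len(SERVERS))
        cookies = {**cookies, STICKY_COOKIE: str(server_id)}
    return SERVERS[server_id], cookies

server, cookies = pick_server({})   # first request: cookie gets set
print(server, cookies)
print(pick_server(cookies)[0])      # later requests: same server every time
```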

PHP Accelerators:

  • Interpreted languages are not as fast as compiled languages. PHP accelerators precompile PHP into bytecode and cache it, so the source doesn't have to be re-interpreted on every request.
  • This gives the ability to handle more requests per second.
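
A rough Python analogy of what an accelerator buys you: compile the source once and reuse the bytecode, instead of re-compiling on every "request" (timings will vary by machine):

```python
import time

SOURCE = "total = sum(i * i for i in range(1000))"

# Interpreted path: compile + run the source on every request.
t0 = time.perf_counter()
for _ in range(10_000):
    exec(SOURCE)
t1 = time.perf_counter()

# "Accelerated" path: compile to bytecode once, reuse the code object.
code = compile(SOURCE, "<cached>", "exec")
t2 = time.perf_counter()
for _ in range(10_000):
    exec(code)
t3 = time.perf_counter()

print(f"compile every time: {t1 - t0:.3f}s")
print(f"precompiled:        {t3 - t2:.3f}s")
```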
https://www.coconutlizard.co.uk/blog/cacheing-up/

Caching:

HTML: Pages are cached as static .html files. There's redundancy, since the same tags are stored in all the files, but serving them requires no interaction with the database, and static .html files are very fast to serve. Downside: making a change to the site requires regenerating all the stored files, which is a huge amount of work.

MySQL Caching: MySQL caches results of identical queries.

Memcached: Results of expensive operations are stored in an in-memory cache. The cache can grow too big to fit in RAM, so entries are given expiration times; data becomes garbage after some time, while entries that keep getting cache hits tend to survive eviction longer.
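
A minimal in-process stand-in for the memcached pattern, i.e. cache-aside with a TTL (the real thing is a separate server you talk to over the network via a client library; the query and TTL here are made up):

```python
import time

CACHE: dict[str, tuple[float, object]] = {}   # key -> (expires_at, value)
TTL_SECONDS = 60.0

def expensive_query(user_id: int) -> dict:
    # Stand-in for a slow database query.
    time.sleep(0.1)
    return {"id": user_id, "name": f"user{user_id}"}

def get_user(user_id: int) -> dict:
    """Cache-aside: check the cache first, fall back to the database on a miss."""
    key = f"user:{user_id}"
    hit = CACHE.get(key)
    if hit and hit[0] > time.time():
        return hit[1]                               # cache hit
    value = expensive_query(user_id)                # cache miss: do the work
    CACHE[key] = (time.time() + TTL_SECONDS, value) # store with expiration
    return value

get_user(42)   # slow: goes to the "database"
get_user(42)   # fast: served from the cache until the TTL expires
```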

Storage Engines:

  • InnoDB supports transactions, whereas MyISAM uses full-table locks.
  • Memory engine/heap engine: tables are stored in RAM and lost when the server dies, but queries avoid touching the large on-disk database.
  • Archive engine: compressed by default, but slower to query.
  • NDB: clustered, which avoids single points of failure.
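
In MySQL the engine is chosen per table. A sketch, assuming a local MySQL server and the mysql-connector-python package (credentials and table names are placeholders):

```python
import mysql.connector  # pip install mysql-connector-python

# Placeholder credentials; assumes a local MySQL server with a "test" database.
conn = mysql.connector.connect(host="localhost", user="root",
                               password="secret", database="test")
cur = conn.cursor()

# InnoDB: transactional, row-level locking; good for writes that must not clash.
cur.execute("""CREATE TABLE accounts (
    id INT PRIMARY KEY, balance INT
) ENGINE=InnoDB""")

# MEMORY (heap): lives in RAM, fast, but contents are lost on restart.
cur.execute("""CREATE TABLE sessions_cache (
    session_id CHAR(32) PRIMARY KEY, payload VARCHAR(255)
) ENGINE=MEMORY""")

conn.commit()
conn.close()
```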

Replication:


Master-Slave: A master is connected to many slaves, where every slave has a copy of the master's database. Every write on the master is replicated to the slaves, and reads can be spread across the slaves. Upside: if one server dies, we have the others. Downside: the master is a single point of failure for writes until a slave is promoted.


Master-Master: You can write to either master; a query executed on one server gets replicated to the other.
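
A sketch of the read/write split that master-slave replication enables, with made-up host names (writes go to the master, reads fan out across slaves):

```python
import random

class ReplicatedDB:
    """Route writes to the master and reads to a random slave."""

    def __init__(self, master: str, slaves: list[str]):
        self.master = master
        self.slaves = slaves

    def execute(self, sql: str) -> str:
        # Crude routing rule: SELECTs are reads, everything else is a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        target = random.choice(self.slaves) if is_read else self.master
        return f"{sql!r} -> {target}"   # a real router would open a connection here

db = ReplicatedDB("db-master", ["db-slave-1", "db-slave-2"])
print(db.execute("SELECT * FROM users"))     # goes to a slave
print(db.execute("INSERT INTO users ..."))   # goes to the master
```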

Load Balancer:

  • Active-Active: Both load balancers are constantly listening for connections; either one can receive packets from the outside world and relay them to the backend servers. They send heartbeats to each other to indicate they are operational (see the sketch after this list).
  • Active-Passive: The active load balancer handles all the requests. If the active one dies, the passive one notices the missing heartbeats, takes over, and handles all the requests from the outside world.
  • Partitioning: Approach 1 — dedicate different servers to different groups of users (e.g. a separate server for Harvard and one for MIT). Approach 2 — balance load based on high-level user information.
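
A minimal sketch of the heartbeat/failover logic (the intervals and function names are made up):

```python
import time

HEARTBEAT_INTERVAL = 1.0   # seconds between heartbeats from the active peer
FAILOVER_AFTER = 3.0       # promote the passive node after this much silence

last_heartbeat = time.time()   # updated whenever the active peer pings us

def on_heartbeat() -> None:
    """Called each time a heartbeat packet arrives from the peer."""
    global last_heartbeat
    last_heartbeat = time.time()

def passive_should_take_over() -> bool:
    """The passive balancer promotes itself if the active one goes quiet."""
    return time.time() - last_heartbeat > FAILOVER_AFTER

on_heartbeat()
print(passive_should_take_over())   # False: the active peer is still alive
```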

Using a different database for every server only works if requests are partitioned based on user information. The other solution is to network all the databases to all the servers, but this gets very ugly with many databases and servers; it can be handled by putting another layer of load balancers between the servers and the databases. Keep data centers at various locations, connected to each other, to survive a network or power failure at any one site.
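
A sketch of partitioning by high-level user information, using the email domain as the partition key (the shard names are hypothetical):

```python
# Hypothetical mapping of user partitions to dedicated databases.
SHARDS = {"harvard.edu": "db-harvard", "mit.edu": "db-mit"}

def database_for(email: str) -> str:
    """Pick the database by a high-level user attribute (email domain)."""
    domain = email.split("@")[-1]
    return SHARDS.get(domain, "db-default")

print(database_for("alice@harvard.edu"))   # db-harvard
print(database_for("bob@mit.edu"))         # db-mit
```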

Security:

Get load balancers that can handle the cryptography (terminating SSL), so the computationally expensive work happens there. We don't need that encryption between the load balancers and the servers on the private network, so it works out well even if the servers are cheap.

Hope you enjoyed reading it! Happy learning!

I publish blogs every Wednesday and Sunday! Follow me for more System Design!
