System Design of WhatsApp (Tutorial)

Table of Contents

1. Key Requirements
2. Estimating Capacity
3. High-level architecture
4. Designing key features
5. Bottlenecks
6. Optimization techniques
Conclusion

WhatsApp is a social messaging program that allows users to exchange messages with one another.

Have you ever considered how WhatsApp works?

What are the concepts that underpin its creation and operation?

This article will go over the basics of WhatsApp system design.

We’ll also go through WhatsApp’s general architecture, which can be used to build any kind of chat software.

So, without further ado, let’s have a look at WhatsApp’s system design!

1. Key Requirements

WhatsApp is a highly scalable system used by people all over the world, so it must be designed to be reliable and available almost all of the time.

Identifying the system’s critical requirements is therefore the first step.

These are the minimum requirements for the WhatsApp messenger:

Support one-on-one conversations.
Support message acknowledgment (Sent, Delivered, and Read) and last seen.
Allow end-to-end encryption and media support (images/videos).

Let’s estimate how much capacity this service requires.

2. Estimating Capacity

Our objective is to create a platform capable of handling a large amount of traffic. Assume that 10 billion messages are sent per day. That gives us:

10 billion messages are sent every day by one billion users.
At peak traffic, 700,000 users are active per second (6x the average).
During peak usage, 40 million messages are transmitted per second.
The average length of a message is 160 characters. At one byte per character, 10B * 160 bytes = 1.6 TB of data is generated every day.
Over ten years of service, that comes to 10 * 365 * 1.6 TB = 5,840 TB, or about 5.84 PB.

The entire application will be made up of microservices, each of which performs a specialized task. Assume that sending a message takes 20 milliseconds and that each server handles 100 concurrent connections. The anticipated number of chat servers is then (chat messages per second * latency) / concurrent connections per server = 40M * 20 ms / 100 = 8,000 servers.
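The arithmetic above can be checked with a short script. This is a minimal sketch that takes the article's inputs as given and assumes one byte per character and decimal units (1 TB = 10^12 bytes); the constant and variable names are illustrative.

```python
import math

# Back-of-the-envelope capacity estimate using the figures assumed above.
MESSAGES_PER_DAY = 10_000_000_000      # 10 billion messages per day
BYTES_PER_MESSAGE = 160                # assumption: 1 byte per character
PEAK_MESSAGES_PER_SECOND = 40_000_000  # peak rate, taken from the text as given
SEND_LATENCY_MS = 20                   # time to send one message
CONNECTIONS_PER_SERVER = 100           # concurrent connections per server

TB = 10**12  # decimal terabyte
PB = 10**15  # decimal petabyte

daily_storage_tb = MESSAGES_PER_DAY * BYTES_PER_MESSAGE / TB
ten_year_storage_pb = MESSAGES_PER_DAY * BYTES_PER_MESSAGE * 365 * 10 / PB

# Messages in flight at any instant = arrival rate * time each one takes.
in_flight = PEAK_MESSAGES_PER_SECOND * SEND_LATENCY_MS // 1000
chat_servers = math.ceil(in_flight / CONNECTIONS_PER_SERVER)

print(f"Daily storage: {daily_storage_tb:.1f} TB")
print(f"Ten-year storage: {ten_year_storage_pb:.2f} PB")
print(f"Chat servers: {chat_servers}")
```

Changing any single input, for example the latency or the connections per server, shows how sensitive the server count is to that assumption.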

3. High-level architecture

This system is built on two core services: the chat service and the transient service. The chat service handles all of the traffic generated by messages to online users, while the transient service handles traffic for users who are offline.

If the user is online, the chat service is in charge of delivering messages.

It checks whether the message’s recipient is online. If so, the chat service delivers the message immediately; if not, the transient service holds the message and delivers it when the recipient comes back online.

Figure: High-level architecture

The transient service maintains its own storage, which holds undelivered messages temporarily until the offline user reconnects.
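The split between the two services can be sketched in a few lines. This is a single-process toy, not the real architecture: the class name MessageRouter and its methods are hypothetical, and plain in-memory containers stand in for the chat servers and the transient store.

```python
from collections import defaultdict, deque


class MessageRouter:
    """Toy router: deliver to online users, park messages for offline ones."""

    def __init__(self):
        self.online = {}                     # user_id -> list acting as the user's inbox
        self.transient = defaultdict(deque)  # user_id -> messages queued while offline

    def send(self, recipient, message):
        if recipient in self.online:
            self.online[recipient].append(message)  # chat service path
            return "delivered"
        self.transient[recipient].append(message)   # transient service path
        return "queued"

    def connect(self, user_id):
        """Mark the user online and flush anything queued while they were away."""
        inbox = self.online.setdefault(user_id, [])
        pending = self.transient.pop(user_id, deque())
        inbox.extend(pending)
        return inbox

    def disconnect(self, user_id):
        self.online.pop(user_id, None)
```

A message sent to an offline user returns "queued" and shows up in the inbox returned by the next connect call for that user.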

Designing High-Level APIs

The service exposes two high-level APIs, one for sending messages and one for reading them. Both can be implemented as REST endpoints.

Parameters for sending messages

This API will be used to transmit messages between two users.

Figure: Parameters of the send-message API

Parameters of conversation

This API is used to display threaded chats. Think of it as the first screen you see when you open WhatsApp. A single API query should return only a few messages for one user, so the API takes offset and message count parameters.

Figure: Parameters of the conversation API
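To illustrate how the offset and message count parameters page through a conversation, here is a minimal sketch. The Message fields and the get_conversation function are hypothetical names chosen for this example, and the newest-first ordering is an assumption; they are not the API's actual parameter list.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    text: str
    timestamp: int  # seconds since epoch


def get_conversation(messages, offset, message_count):
    """Return one page of a conversation, newest message first.

    offset        -- how many of the newest messages to skip
    message_count -- maximum number of messages to return
    """
    if offset < 0 or message_count < 0:
        raise ValueError("offset and message_count must be non-negative")
    newest_first = sorted(messages, key=lambda m: m.timestamp, reverse=True)
    return newest_first[offset:offset + message_count]
```

A client would request offset 0 for the first screen and increase the offset by the page size each time the user scrolls further back.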

How do features like last seen, single tick, and double tick work?

The acknowledgment service plays the central role here. It generates and verifies the acknowledgment responses on which these features are built.

Single tick: When a message from User A reaches the server, the server sends User A an acknowledgment that the message has been sent. A single tick then appears on User A’s screen.
Double tick: After the server delivers the message to User B over the appropriate connection, User B acknowledges receipt to the server. The server then forwards this acknowledgment to User A, and a double tick appears.
Blue tick: When User B opens the message, User B sends another acknowledgment to the server. The server forwards it to User A, and a blue tick appears on User A’s screen.
Last seen: The last seen feature relies on a heartbeat mechanism. Every 5 seconds, the client sends a heartbeat to the server, which records each user’s last seen time in a table that other users can query.
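The tick states form a simple progression, which can be modeled as an ordered status per message. The sketch below is one possible way to track it; MessageStatus and AckTracker are hypothetical names, and the rule that a status never moves backwards is an assumption added to cope with acknowledgments that arrive out of order.

```python
from enum import IntEnum


class MessageStatus(IntEnum):
    PENDING = 0    # not yet acknowledged by the server
    SENT = 1       # single tick: the server received the message
    DELIVERED = 2  # double tick: the recipient acknowledged receipt
    READ = 3       # blue tick: the recipient opened the message


class AckTracker:
    """Tracks the highest acknowledgment seen for each message ID."""

    def __init__(self):
        self.status = {}

    def register(self, message_id):
        self.status[message_id] = MessageStatus.PENDING

    def acknowledge(self, message_id, new_status):
        # Acknowledgments may arrive late or out of order, so never move backwards.
        current = self.status[message_id]
        self.status[message_id] = max(current, new_status)
        return self.status[message_id]
```

With this rule, a delayed DELIVERED acknowledgment that arrives after READ leaves the message at READ.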

4. Designing key features

Personalized interaction

This is a necessary part of the chat service. It lets one user send messages directly to another. Let’s have a look at how this works:

Assume Jay wants to send a message to Aayush. Jay is connected to a chat server, which receives his message and confirms to Jay that the message has been sent. The chat server then queries the data store to find out which chat server Aayush is connected to. Jay’s chat server forwards the message to Aayush’s chat server, and Aayush receives the message via a push mechanism. Aayush then sends an acknowledgment to Jay’s chat server, which notifies Jay that the message has been delivered. When Aayush reads the message, a further acknowledgment is sent to Jay to indicate that the message has been read.
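The lookup-and-forward step in this flow can be sketched as follows. It is a simplified, in-process model: the ChatServer class is hypothetical, a shared dict stands in for the data store that maps users to chat servers, and the "delivered" acknowledgment is produced as soon as the recipient's server pushes the message, whereas in the flow above it comes from the recipient's client.

```python
class ChatServer:
    def __init__(self, name, registry):
        self.name = name
        self.registry = registry  # shared dict: user_id -> ChatServer
        self.inboxes = {}         # user_id -> messages pushed to that user

    def connect(self, user_id):
        self.inboxes[user_id] = []
        self.registry[user_id] = self  # record which server holds the connection

    def send(self, sender, recipient, text):
        """Called on the sender's server; returns the acknowledgments the sender sees."""
        acks = ["sent"]                        # the server accepted the message
        target = self.registry.get(recipient)  # look up the recipient's server
        if target is not None:
            target.push(recipient, sender, text)
            acks.append("delivered")           # simplified: ack on push
        return acks

    def push(self, recipient, sender, text):
        self.inboxes[recipient].append((sender, text))
```

For example, with Jay connected to one server and Aayush to another, calling send on Jay's server places the message in Aayush's inbox on the second server.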

Status of User Activity

Showing when a user was last active is a standard feature of instant messengers.

Figure: User activity status

The diagram depicts a system for maintaining a connection between the client and the server. WebSockets are used to establish a bidirectional connection between the two. The client sends heartbeats over this connection, and the server uses them to track the user’s activity status.
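On the server side, heartbeat tracking amounts to storing the latest timestamp per user and comparing it with the current time. The sketch below assumes the 5-second heartbeat interval mentioned earlier; the PresenceTracker class and the grace period of two intervals are assumptions made for illustration, and timestamps are passed in explicitly so that no real WebSocket connection is needed.

```python
HEARTBEAT_INTERVAL = 5                 # seconds between heartbeats
ONLINE_GRACE = 2 * HEARTBEAT_INTERVAL  # assumption: tolerate one missed heartbeat


class PresenceTracker:
    def __init__(self):
        self.last_seen = {}  # user_id -> timestamp of the latest heartbeat

    def heartbeat(self, user_id, now):
        self.last_seen[user_id] = now

    def status(self, user_id, now):
        seen = self.last_seen.get(user_id)
        if seen is None:
            return "unknown"
        if now - seen <= ONLINE_GRACE:
            return "online"
        return f"last seen {int(now - seen)} seconds ago"
```

In a real deployment the last_seen table would live in a shared store so that any chat server can answer a status query.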

End-to-End Privacy

End-to-end encryption is a key feature that ensures that only the users in a conversation can read its messages. It relies on public-key cryptography: each user shares a public key with the other participants and keeps a private key secret. Assume that two users, Jay and Aayush, are communicating with each other.

Jay has Aayush’s public key, and Aayush has Jay’s public key. Each also holds their own private key, which is never shared. When Jay sends a message, he encrypts it with Aayush’s public key, so it can be decrypted only with Aayush’s private key.

Similarly, only Jay can decrypt the messages that Aayush sends him. As a result, only Jay and Aayush can read each other’s messages, and the server acts merely as a relay throughout the process.
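The public/private key relationship can be demonstrated with textbook RSA on tiny primes. This is purely a teaching sketch: the key sizes are trivially breakable, there is no padding, and it is not the protocol any production messenger uses. Real systems rely on vetted cryptographic libraries. The function names and the chosen primes are assumptions made for this example.

```python
def make_keypair(p, q, e=17):
    """Build a textbook RSA key pair from two small primes (toy sizes only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)    # modular inverse of e; requires Python 3.8+
    return (e, n), (d, n)  # (public key, private key)


def encrypt(public_key, plaintext):
    e, n = public_key
    # Encrypt each byte separately; every byte value is smaller than n.
    return [pow(byte, e, n) for byte in plaintext.encode("utf-8")]


def decrypt(private_key, ciphertext):
    d, n = private_key
    return bytes(pow(c, d, n) for c in ciphertext).decode("utf-8")


# Each user publishes the public key and keeps the private key to themselves.
jay_public, jay_private = make_keypair(61, 53)
aayush_public, aayush_private = make_keypair(89, 97)

# Jay encrypts with Aayush's public key; Aayush's private key reverses it.
ciphertext = encrypt(aayush_public, "hello Aayush")
recovered = decrypt(aayush_private, ciphertext)
```

The server only ever sees the ciphertext list, which is what lets it act as a relay without being able to read the message.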

5. Bottlenecks

Every system is prone to failure. To manage such a large volume of traffic, the service must stay operational and fault-tolerant at all times. Because our service relies entirely on the chat and transient servers, we must address the failures that can arise in each of them.

Failure of the Chat Server: This is the heart of our system. When users are online, it is responsible for managing and delivering messages, so it maintains connections with those users.

If this service fails, the entire architecture suffers. There are two approaches to handling a chat server failure. One is to transfer the TCP connections to another server; the other is to have clients automatically re-establish their connections when a connection is lost.
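The second approach, client-side reconnection, is commonly implemented with exponential backoff so that many clients do not all retry at the same instant. The sketch below is one way to do it; the reconnect function, its parameters, and the backoff policy are assumptions for illustration, and connect is a hypothetical callable supplied by the caller.

```python
import time


def reconnect(connect, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure.

    connect -- hypothetical callable that returns a connection or raises
               ConnectionError
    sleep   -- injectable so the delays can be skipped in tests
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            sleep(delay)
            delay *= 2
```

Adding a small random jitter to each delay is a common refinement when a very large number of clients lose their connection at once.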

Failure of Transient Storage: Transient storage is another component whose failure can harm the entire service. If it fails, messages in transit to offline users are lost.

We can prevent message loss by replicating each user’s transient storage. When the user comes back online, the replica can then serve the pending messages. Once the original server becomes available again, the original and replica copies of the user’s transient storage are merged into a single store.
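Replication and the later merge can be sketched with two in-memory stores. The names TransientStore, replicated_put, and merge are hypothetical, and keying messages by a unique message ID is an assumption that makes the merge a simple de-duplicating union.

```python
class TransientStore:
    """One copy of the pending-message store, keyed by user and message ID."""

    def __init__(self):
        self.pending = {}  # user_id -> {message_id: text}

    def put(self, user_id, message_id, text):
        self.pending.setdefault(user_id, {})[message_id] = text


def replicated_put(stores, user_id, message_id, text):
    """Write to every store that is currently reachable (None marks a failed one)."""
    for store in stores:
        if store is not None:
            store.put(user_id, message_id, text)


def merge(primary, replica, user_id):
    """Combine both copies; shared message IDs collapse into a single entry."""
    merged = dict(replica.pending.get(user_id, {}))
    merged.update(primary.pending.get(user_id, {}))
    return merged
```

If the primary misses a write while it is down, the merged view still contains that message because the replica received it.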

6. Optimization techniques

Latency: To deliver a seamless client experience, the messenger service must operate in real time. We can reduce latency by caching frequently accessed data. For example, user activity status and recent conversations can be cached in memory using a distributed cache like Redis.
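The caching idea follows the cache-aside pattern: read from the cache first and fall back to the slower store on a miss. The sketch below uses a small in-memory class as a stand-in for a distributed cache such as Redis, so it runs without a server; TTLCache, get_last_seen, and load_from_db are hypothetical names, and the expiry policy is an assumption.

```python
import time


class TTLCache:
    """In-memory stand-in for a distributed cache; entries expire after ttl seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.entries = {}  # key -> (value, expiry time)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None or entry[1] <= self.clock():
            self.entries.pop(key, None)  # drop missing or expired entries
            return None
        return entry[0]

    def set(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl)


def get_last_seen(user_id, cache, load_from_db):
    """Cache-aside read: try the cache first, fall back to the slower store."""
    value = cache.get(user_id)
    if value is None:
        value = load_from_db(user_id)
        cache.set(user_id, value)
    return value
```

A short expiry keeps activity status reasonably fresh while still absorbing most repeated reads.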

Availability: We need our service to be available nearly all of the time. Our system must be fault-tolerant, so we keep several copies of transient messages; any message that is lost can then be quickly recovered from a replica without affecting the system’s availability.

Conclusion

Our system currently supports only a few capabilities, but we can easily extend it with group chats to deliver messages to several people at once. We could also add voice and video calls, or let users publish status updates or stories and view each other’s.

I worked hard to provide you with a high-level overview of the WhatsApp system design. I hope you enjoyed it and will put it to good use.
