If you are curious about how browsers work or want to quickly prepare for your frontend interview 👀, then you have come to the right place. This article comes from my notepad, where I summarized the browser rendering process while I was studying it myself. I felt it would be better to turn it into an article that I can easily find for reference and also share with you. Alright, let's get straight to business.
A DNS lookup happens the first time a client navigates to a certain domain. The browser requests the lookup and a DNS name server responds with an IP address, which will likely be cached for a certain duration so that subsequent requests are faster.
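To make the caching idea concrete, here is a minimal sketch of a lookup cache with an expiry time. The `DnsCache` class, its method names, and the TTL value are hypothetical and only for illustration (real browsers and operating systems manage this internally); the IP address is from a range reserved for documentation.

```typescript
// Hypothetical in-memory DNS cache; all names and values are illustrative.
interface DnsCacheEntry {
  ip: string;
  expiresAt: number; // timestamp in milliseconds
}

class DnsCache {
  private entries: Record<string, DnsCacheEntry> = {};

  // Store an answer together with the moment it stops being valid.
  set(hostname: string, ip: string, ttlSeconds: number, now: number): void {
    this.entries[hostname] = { ip: ip, expiresAt: now + ttlSeconds * 1000 };
  }

  // Return the cached IP, or undefined when the entry is missing or expired.
  get(hostname: string, now: number): string | undefined {
    if (!Object.prototype.hasOwnProperty.call(this.entries, hostname)) {
      return undefined;
    }
    const entry = this.entries[hostname];
    return now >= entry.expiresAt ? undefined : entry.ip;
  }
}

const dnsCache = new DnsCache();
dnsCache.set("example.com", "192.0.2.1", 300, 0);
console.log(dnsCache.get("example.com", 1000));   // within the TTL: cached IP
console.log(dnsCache.get("example.com", 300000)); // TTL elapsed: undefined
```

When `get` returns `undefined`, a real client would perform a fresh lookup and store the new answer.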
Once the IP address is known, the browser sets up a connection to the server via a TCP three-way handshake, often referred to as "SYN, SYN-ACK, ACK" (SYNchronize, SYNchronize-ACKnowledgement, and ACKnowledge). This lets the two computers negotiate the parameters of the connection before transmitting data such as the browser's HTTP requests.
The host, generally the browser, sends a TCP SYNchronize packet to the server. The server receives the SYN and sends back a SYNchronize-ACKnowledgement. The host receives the server's SYN-ACK and sends an ACKnowledge. The server receives ACK and the TCP socket connection is established.
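The sequence above can be sketched as a small state simulation. This is only a model of the message order, not real networking code: the `simulateHandshake` function and its types are made up for illustration, and the state names follow the usual TCP terminology.

```typescript
// Illustrative model of the three-way handshake; no packets are sent.
type TcpState = "CLOSED" | "LISTEN" | "SYN_SENT" | "SYN_RECEIVED" | "ESTABLISHED";
type Segment = "SYN" | "SYN-ACK" | "ACK";

interface HandshakeStep {
  from: "client" | "server";
  segment: Segment;
  clientState: TcpState; // client state once the segment has been delivered
  serverState: TcpState; // server state once the segment has been delivered
}

function simulateHandshake(): HandshakeStep[] {
  const steps: HandshakeStep[] = [];
  let client: TcpState = "CLOSED";
  let server: TcpState = "LISTEN";

  // 1. The client sends SYN; the server receives it.
  client = "SYN_SENT";
  server = "SYN_RECEIVED";
  steps.push({ from: "client", segment: "SYN", clientState: client, serverState: server });

  // 2. The server replies with SYN-ACK; the client receives it.
  client = "ESTABLISHED";
  steps.push({ from: "server", segment: "SYN-ACK", clientState: client, serverState: server });

  // 3. The client sends ACK; the server receives it and the connection is open.
  server = "ESTABLISHED";
  steps.push({ from: "client", segment: "ACK", clientState: client, serverState: server });

  return steps;
}

for (const step of simulateHandshake()) {
  console.log(step.from + " sends " + step.segment +
    " (client: " + step.clientState + ", server: " + step.serverState + ")");
}
```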
TLS negotiation is another handshake, this time for establishing a secure connection. It determines which cipher will be used to encrypt the communication, verifies the server's identity, and establishes that a secure connection is in place before the actual data transfer begins.
Once a secure connection has been established, the browser makes an HTTP GET request, and the response to this initial request contains the first byte of data received.
TIME TO FIRST BYTE (TTFB) - This is the time between when the user made the request and the receipt of the first byte of the response. The first chunk of HTML to arrive is usually about 14KB of data, because TCP slow start limits how much the server sends before it receives acknowledgements.
The browser turns the data it receives over the network into the DOM and the CSSOM, which the renderer uses to paint the page on the screen.
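The 14KB figure can be reproduced with some quick arithmetic. The sketch below assumes an initial congestion window of 10 segments and a maximum segment size of 1460 bytes (common defaults, but assumptions here), and it idealizes TCP slow start as a window that doubles on every round trip with no packet loss. The function name `roundTripsToDeliver` is hypothetical.

```typescript
// Idealized slow-start model: the window doubles each round trip, no loss.
// The default MSS and initial window are assumed values, not from the article.
function roundTripsToDeliver(
  totalBytes: number,
  mssBytes = 1460,
  initialWindowSegments = 10
): number {
  let windowBytes = mssBytes * initialWindowSegments; // 14600 bytes, about 14KB
  let delivered = 0;
  let roundTrips = 0;
  while (delivered < totalBytes) {
    delivered += windowBytes; // everything in the current window arrives
    windowBytes *= 2;         // the window doubles after it is acknowledged
    roundTrips += 1;
  }
  return roundTrips;
}

console.log(roundTripsToDeliver(14600));  // fits in the first window
console.log(roundTripsToDeliver(100000)); // needs several round trips
```

Under these assumptions, anything that fits in the first 14600 bytes arrives in a single round trip, which is why keeping the critical HTML and CSS small matters.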
Building the DOM tree
The browser parses the HTML markup by tokenizing it and builds the DOM tree from the tokens. The tokens include opening and closing tags, as well as attribute names and values. The browser keeps parsing the HTML when it encounters a non-blocking resource such as an image, and it also continues when it encounters a CSS file, but it pauses parsing when it sees a script tag without an async or defer attribute.
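As a rough illustration of tokenization and tree building, here is a toy parser. It is nowhere near a real HTML parser: it only understands simple tags with double-quoted attributes, ignores comments, doctypes, and error recovery, and every name in it (`tokenize`, `buildDom`, `DomNode`, `VOID_TAGS`) is made up for this sketch.

```typescript
type HtmlToken =
  | { kind: "open"; tag: string; attrs: Record<string, string> }
  | { kind: "close"; tag: string }
  | { kind: "text"; text: string };

interface DomNode {
  tag: string; // "#text" for text nodes, "#document" for the root
  attrs: Record<string, string>;
  text: string;
  children: DomNode[];
}

// A few elements that never have a closing tag (incomplete list).
const VOID_TAGS = ["img", "br", "hr", "link", "meta", "input"];

function tokenize(html: string): HtmlToken[] {
  const tokens: HtmlToken[] = [];
  // Alternatives: closing tag | opening tag with attributes | text run.
  const pattern = /<\/([a-zA-Z][a-zA-Z0-9]*)\s*>|<([a-zA-Z][a-zA-Z0-9]*)((?:\s+[a-zA-Z-]+="[^"]*")*)\s*>|([^<]+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    if (match[1] !== undefined) {
      tokens.push({ kind: "close", tag: match[1].toLowerCase() });
    } else if (match[2] !== undefined) {
      const attrs: Record<string, string> = {};
      const attrPattern = /([a-zA-Z-]+)="([^"]*)"/g;
      let attr: RegExpExecArray | null;
      while ((attr = attrPattern.exec(match[3])) !== null) {
        attrs[attr[1]] = attr[2];
      }
      tokens.push({ kind: "open", tag: match[2].toLowerCase(), attrs: attrs });
    } else if (match[4] !== undefined && match[4].trim() !== "") {
      tokens.push({ kind: "text", text: match[4].trim() });
    }
  }
  return tokens;
}

function buildDom(tokens: HtmlToken[]): DomNode {
  const root: DomNode = { tag: "#document", attrs: {}, text: "", children: [] };
  const stack: DomNode[] = [root]; // currently open elements
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.kind === "open") {
      const node: DomNode = { tag: token.tag, attrs: token.attrs, text: "", children: [] };
      parent.children.push(node);
      if (VOID_TAGS.indexOf(token.tag) === -1) {
        stack.push(node); // later tokens become children of this node
      }
    } else if (token.kind === "close") {
      if (stack.length > 1 && parent.tag === token.tag) {
        stack.pop(); // mismatched closing tags are simply ignored
      }
    } else {
      parent.children.push({ tag: "#text", attrs: {}, text: token.text, children: [] });
    }
  }
  return root;
}

const sampleDom = buildDom(tokenize('<div id="app"><p>Hello</p><img src="a.png"></div>'));
console.log(JSON.stringify(sampleDom, null, 2));
```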
The preload scanner is an optimization that reduces blockages by retrieving resources in the background, so that by the time the HTML parser reaches a requested asset, it may already be in flight or downloaded.
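The idea of scanning ahead for resources can be approximated with a quick pass over the raw HTML. Real browsers use a dedicated lightweight tokenizer rather than a regular expression, so treat this purely as a sketch; `scanForResources` and `PreloadCandidate` are hypothetical names, and only double-quoted `src` and `href` attributes on a few tags are recognized.

```typescript
interface PreloadCandidate {
  tag: string;
  url: string;
}

// Collect URLs that could be fetched early, without building any tree.
function scanForResources(html: string): PreloadCandidate[] {
  const found: PreloadCandidate[] = [];
  const pattern = /<(img|script|link)\b[^>]*?\b(?:src|href)="([^"]+)"/gi;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    found.push({ tag: match[1].toLowerCase(), url: match[2] });
  }
  return found;
}

const samplePage =
  '<link rel="stylesheet" href="style.css">' +
  '<script src="app.js"></script>' +
  '<p>Some text</p>' +
  '<img src="hero.png" alt="">';

for (const candidate of scanForResources(samplePage)) {
  console.log("could start fetching " + candidate.url + " (" + candidate.tag + ")");
}
```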
Building the CSSOM
The browser converts the CSS rules into a map of styles by going through each rule set in the CSS and creating a tree of nodes with parent, child, and sibling relationships based on the CSS selectors.
Accessibility Object Model (AOM): the browser also builds an accessibility tree that assistive devices use to parse and interpret content. The AOM is like a semantic version of the DOM.
The DOM tree and the CSSOM tree are combined into a render tree, which is then used to compute the layout of the visible elements.
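Here is a minimal sketch of how the two trees might be combined. It assumes a drastically simplified style lookup in which rules are keyed by tag name only (real selector matching and the cascade are far more involved), and it leaves out any node whose style is `display: none`, since such elements are not visible. All type and function names are hypothetical.

```typescript
interface SimpleDomNode {
  tag: string;
  children: SimpleDomNode[];
}

// Simplified stand-in for the CSSOM: declarations keyed by tag name.
type StyleRules = Record<string, Record<string, string>>;

interface RenderNode {
  tag: string;
  style: Record<string, string>;
  children: RenderNode[];
}

// Returns null for nodes (and their subtrees) that are not rendered.
function buildRenderTree(node: SimpleDomNode, rules: StyleRules): RenderNode | null {
  const style = rules[node.tag] || {};
  if (style["display"] === "none") {
    return null;
  }
  const children: RenderNode[] = [];
  for (const child of node.children) {
    const rendered = buildRenderTree(child, rules);
    if (rendered !== null) {
      children.push(rendered);
    }
  }
  return { tag: node.tag, style: style, children: children };
}

const sampleBody: SimpleDomNode = {
  tag: "body",
  children: [
    { tag: "h1", children: [] },
    { tag: "aside", children: [{ tag: "p", children: [] }] },
    { tag: "p", children: [] }
  ]
};
const sampleRules: StyleRules = {
  h1: { color: "red" },
  aside: { display: "none" }
};

console.log(JSON.stringify(buildRenderTree(sampleBody, sampleRules), null, 2));
```

In this example the `aside` and the paragraph inside it are dropped, while the `h1` keeps its color declaration.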
Layout is the process by which the width, height, and location of the nodes in the render tree are determined, as well as the size and position of each object on the page.
In the paint step, the browser converts each box calculated in the layout phase into pixels on the screen, drawing the visual parts of an element such as text, colors, borders, and shadows.
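To get a feel for what computing size and position means, here is a toy block layout. It assumes every box takes the full width of its container, boxes stack vertically, leaf boxes have a known content height, and a container is as tall as its children combined. Real layout also handles margins, padding, inline content, floats, flexbox, and much more. The names `BlockInput`, `LayoutBox`, and `layoutBlock` are hypothetical.

```typescript
interface BlockInput {
  name: string;
  contentHeight: number; // only used for leaf boxes
  children: BlockInput[];
}

interface LayoutBox {
  name: string;
  x: number;
  y: number;
  width: number;
  height: number;
  children: LayoutBox[];
}

// Place a box at (x, y) with the given width and stack its children vertically.
function layoutBlock(node: BlockInput, x: number, y: number, width: number): LayoutBox {
  const children: LayoutBox[] = [];
  let cursorY = y;
  for (const child of node.children) {
    const box = layoutBlock(child, x, cursorY, width);
    children.push(box);
    cursorY += box.height; // the next sibling starts below this one
  }
  const height = node.children.length > 0 ? cursorY - y : node.contentHeight;
  return { name: node.name, x: x, y: y, width: width, height: height, children: children };
}

const samplePageBlocks: BlockInput = {
  name: "page",
  contentHeight: 0,
  children: [
    { name: "header", contentHeight: 50, children: [] },
    {
      name: "main",
      contentHeight: 0,
      children: [
        { name: "intro", contentHeight: 100, children: [] },
        { name: "details", contentHeight: 80, children: [] }
      ]
    }
  ]
};

// Lay the page out in an 800px-wide viewport, starting at the top-left corner.
console.log(JSON.stringify(layoutBlock(samplePageBlocks, 0, 0, 800), null, 2));
```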
- MDN Docs - Populating the page: how browsers work
If you would like to connect with me, I'm available on:
- Twitter: BrunoElo