The Definitive Guide to Web Performance - Book Notes


Components of TCP

The Internet has two core protocols: IP (Internet Protocol), which is responsible for routing and addressing between networked hosts, and TCP (Transmission Control Protocol), which provides a reliable abstraction layer over unreliable transmission channels. TCP hides most of the complexity of network communication from the application layer: packet-loss retransmission, in-order delivery, congestion control and avoidance, and data integrity.

Disadvantages

  • Handshake mechanism: every new connection requires a full round trip (the three-way handshake) before any application data can be sent
  • Congestion collapse: in complex networks, IP gateways are vulnerable to congestion collapse. If round-trip times exceed the retransmission timeout of all hosts, every host starts sending each packet multiple times, creating more and more copies of the same datagrams in the network; eventually the buffers of all switching nodes fill up and excess packets must be dropped, paralyzing the network
  • Flow control: after the three-way handshake, each side of a TCP connection advertises its receive window (rwnd), which describes how much buffer space it has available to hold incoming data
  • Slow start: to keep either end from flooding a network of unknown capacity, the amount of data in flight on a TCP connection is limited to the minimum of rwnd and cwnd (the congestion window). When TCP starts transmitting on a network, or begins retransmitting after data loss, it probes the actual capacity of the network gradually to avoid congestion: the host sends data, then stops and waits for acknowledgments, and each acknowledgment received grows the congestion window until it reaches the slow-start threshold
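
The slow-start growth described above can be sketched with a toy model (an illustrative simulation, not a real TCP stack; the default values for `init_cwnd`, `ssthresh`, and `rwnd` are made up):

```python
# Toy model of TCP slow start: cwnd doubles each round trip until it hits
# ssthresh, and the sender is always limited to min(cwnd, rwnd).
# All default values below are made up for illustration.

def slow_start_rounds(segment_goal, init_cwnd=10, ssthresh=64, rwnd=128):
    """Round trips needed before `segment_goal` segments fit in one window."""
    if segment_goal > rwnd:
        raise ValueError("goal exceeds the receive window")
    cwnd, rounds = init_cwnd, 0
    while min(cwnd, rwnd) < segment_goal:
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # exponential growth phase
        else:
            cwnd += 1                       # linear congestion avoidance
        rounds += 1
    return rounds

# Starting from a 10-segment initial window, reaching a 64-segment
# window costs 3 extra round trips of waiting.
print(slow_start_rounds(64))  # 3
```

This is why increasing the initial congestion window (next section) matters: every doubling saved is a full round trip saved.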

Optimization

  • Increase TCP’s initial congestion window: lets TCP transfer more data during the first round trip; subsequent speedups are significant
  • Slow-start restart: disabling slow start after idle improves performance for long-lived TCP connections that send data in bursts
  • Enable window scaling: increases the maximum receive window size, letting high-latency connections achieve better throughput
  • TCP Fast Open: allows application data to be sent in the first SYN packet (the first step of the handshake)

Transport Layer Security (TLS)

TLS (Transport Layer Security) 1.0 is the upgraded successor of SSL (Secure Sockets Layer) 3.0. TLS sits above TCP (the transport layer), at the session layer. It does not alter the upper-layer (application) protocols, but secures their network communication.

Encryption, authentication and integrity

  • Encryption: a mechanism for obfuscating data. The handshake specified by TLS uses asymmetric (public-key) encryption; the detailed process is described in the TLS handshake section below.
  • Authentication: a mechanism for verifying the validity of an identity, allowing both ends of the communication to verify each other’s identity. This authentication begins with the establishment of a chain of trust rooted in certificate authorities.
  • Integrity: a mechanism for detecting whether a message has been tampered with or forged. Each message is signed with a MAC (Message Authentication Code): whenever a TLS record is sent, a MAC value is generated and appended to the message, and the receiver determines the integrity and authenticity of the message by recomputing and verifying this MAC value.
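
The MAC-based integrity check can be sketched with Python's standard `hmac` module (illustrative only: TLS derives its MAC keys during the handshake, whereas a fixed stand-in key is used here):

```python
import hashlib
import hmac

key = b"shared-secret-from-handshake"  # stand-in for the negotiated MAC key
record = b"application data"

# Sender: compute a MAC over the record and append it.
tag = hmac.new(key, record, hashlib.sha256).digest()

# Receiver: recompute the MAC and compare in constant time.
def verify(key, record, tag):
    expected = hmac.new(key, record, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

print(verify(key, record, tag))            # True
print(verify(key, b"tampered data", tag))  # False
```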

TLS handshake

  1. The TLS handshake runs on top of a reliable transport (TCP), so the TCP three-way handshake must complete first.
  2. Once the TCP connection is established, the client sends its specifications in plain text, including the TLS protocol version and the cipher suites it supports.
  3. The server responds based on the client’s specifications, attaching its own certificate. It may also request that the client provide a certificate.
  4. Assuming both ends have agreed on a common version and cipher suite (and the client has provided its own certificate to the server if requested), the client generates a new symmetric key, encrypts it with the server’s public key, and sends the ciphertext to the server.
  5. The server decrypts the symmetric key with its own private key, checks message integrity by verifying the MAC, and returns an encrypted “Finished” message to the client.
  6. The client decrypts this message with the symmetric key it generated earlier and verifies the MAC; if all is well, the channel is established and application data can be sent.
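
On the client side, Python's `ssl` module performs the handshake steps above. A minimal sketch (the actual connection is commented out because it requires network access; `example.com` is a placeholder host):

```python
import ssl

# A default context verifies the server certificate against the system's
# trusted CA store and checks that the certificate matches the hostname.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True
print(ctx.check_hostname)                    # True

# Performing the actual handshake would look like:
# import socket
# with socket.create_connection(("example.com", 443)) as sock:
#     with ctx.wrap_socket(sock, server_hostname="example.com") as tls:
#         print(tls.version())  # e.g. "TLSv1.3"
```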

Chain of trust and certificate authorities

How do we verify the identity of both ends of the communication? The most common solution is authentication through a CA (Certificate Authority): a third party trusted by both the certificate owner and the party relying on the certificate. The browser ships with a set of trusted CAs, and each CA audits and verifies that a site’s certificate has not been misused or impersonated, revoking it when necessary.

Web Performance Essentials

The main factor limiting web performance is the network round-trip delay between client and server. This follows directly from the performance characteristics of the underlying protocols: the TCP handshake, flow and congestion control, and head-of-line blocking caused by packet loss.

Optimization

Most browsers do these optimizations for us automatically:

  • Resource prefetching and prioritization
  • DNS pre-resolution: by learning navigation history, user mouse hover, etc.
  • TCP pre-connection
  • Page pre-rendering: pre-render entire pages in hidden tabs

Optimization suggestions for developers:

  • Optimize page structure: important resources such as CSS and JS should appear in the document as early as possible; CSS in particular should be delivered early to unblock rendering and allow JS to execute
  • Embedding hints in the document to trigger the browser’s optimization mechanisms: e.g.
<!-- Pre-resolve a domain name -->
<link rel="dns-prefetch" href="//xxx.com">
<!-- Prefetch a critical resource -->
<link rel="subresource" href="/xxx.js">
<!-- Prefetch a resource needed for future navigation -->
<link rel="prefetch" href="/xxx.jpeg">
<!-- Prerender a specific page -->
<link rel="prerender" href="//xxx/xxx.html">

HTTP 1.x

Persistent connection

One of the major improvements in HTTP 1.1 was the introduction of persistent HTTP connections. An HTTP request over a new TCP connection costs at least two network round trips: one for the handshake and one for the request and response. By reusing the underlying connection, persistent connections avoid the three-way handshake of a second TCP connection, eliminating another round of TCP slow start and saving a full round trip of network latency. Thus:

  • Without persistent connections: each request incurs two round-trip delays
  • With persistent connections: only the first request incurs two round-trip delays; subsequent requests incur only one
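
The round-trip arithmetic above can be sketched as a back-of-the-envelope model (ignoring slow start and TLS):

```python
# Latency model for persistent vs. non-persistent connections: every new
# TCP connection costs one extra round trip for the handshake before the
# request/response round trip.

def total_rtts(n_requests, persistent):
    if persistent:
        # One handshake for the first request, then one RTT per request.
        return 2 + (n_requests - 1) * 1
    # A fresh connection (handshake + request) for every request.
    return n_requests * 2

print(total_rtts(10, persistent=False))  # 20 round trips
print(total_rtts(10, persistent=True))   # 11 round trips
```

For ten requests, connection reuse nearly halves the total network latency.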

HTTP Pipeline

Persistent connections let us reuse an existing connection for multiple application requests, but those requests must be satisfied in strict first-in-first-out (FIFO) order. HTTP pipelining moves the FIFO queue from the client (request queue) to the server (response queue).

  • Without HTTP pipelining: the client sends a request, the server processes it and returns the response, and only then can the client send the next request
  • With HTTP pipelining: the client can send requests back to back and the server can process them in parallel, eliminating the wait between request and response, but responses must still be returned serially

Because of limitations of the protocol, multiplexing (interleaving multiple responses on a single connection) is not possible, and responses can only be returned serially. This can cause the following problems:

  • Head-of-line blocking: if the client sends two requests (first for HTML, then for CSS) and the CSS resource is ready first, the server must still send the HTML response before delivering the CSS
  • A single slow response can block all subsequent requests
  • A failed response may terminate the TCP connection, forcing the client to resend requests for all subsequent resources, resulting in duplicate processing


Using multiple TCP connections

Since the protocol does not support multiplexing, and sending requests one at a time is too slow, browser vendors let us open multiple TCP connections in parallel. Most browsers allow up to 6 connections per host.

Domain sharding

For pages containing many resources, 6 parallel connections may still not be enough. Instead of serving all resources from a single host, we can manually spread them across multiple subdomains, sidestepping the browser’s per-host connection limit and achieving higher parallelism. The disadvantages: each new hostname requires an additional DNS lookup, each additional socket consumes resources on both ends, and resources must be manually split and hosted across multiple hosts.
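
A minimal sketch of manual sharding, assuming hypothetical `static1`/`static2` hostnames, hashing each resource path so the same resource always maps to the same shard (which keeps caching effective):

```python
import hashlib

# Hypothetical shard hostnames; real deployments would point these
# at the same origin servers.
SHARDS = ["static1.example.com", "static2.example.com"]

def shard_url(path: str) -> str:
    # Hash the path so a given resource deterministically maps to one shard.
    idx = int(hashlib.md5(path.encode()).hexdigest(), 16) % len(SHARDS)
    return f"https://{SHARDS[idx]}{path}"

print(shard_url("/img/logo.png"))
```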

Other optimizations

  • Measuring and controlling protocol overhead: servers and clients can extend headers, and the protocol places no limit on header size, so headers should carry as little (highly repetitive, uncompressed) data as possible
  • Concatenation and spriting: combine multiple JS files (or other resources) into a single file; combine multiple images into one larger composite image (an image sprite). Disadvantages: slower initial startup, slower resource updates, and higher memory usage. Mitigations: separate out the CSS needed for first paint, deliver smaller JS chunks incrementally, etc.
  • Embedded resources: small, only used once resources can be directly embedded in the page. Such as script and style blocks, data URIs (base64) etc.
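
For example, a small resource can be turned into a base64 data URI with a few lines of Python (a sketch; `to_data_uri` is a made-up helper name):

```python
import base64

def to_data_uri(data: bytes, mime: str) -> str:
    # Encode the raw bytes and wrap them in the data: URI scheme.
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"

print(to_data_uri(b"hello", "text/plain"))  # data:text/plain;base64,aGVsbG8=
```

Note that base64 inflates the payload by roughly a third, which is why inlining only pays off for small, single-use resources.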

HTTP 2.0

The purpose of HTTP 2.0 is to reduce latency by supporting multiplexed requests and responses, to minimize protocol overhead by compressing HTTP header fields, and to add support for request prioritization and server push.

Binary framing layer

At the heart of HTTP 2.0’s performance enhancements lies the new binary framing layer (HEADERS frames, DATA frames, and so on), which defines how HTTP messages are encapsulated and transmitted between client and server. HTTP 2.0 makes the frame the basic unit of protocol communication; frames carry messages belonging to a logical stream, and many streams can exchange messages in parallel over the same TCP connection.

Multiplexed requests and responses

As noted in the HTTP 1.x section above, a client that wants to issue multiple requests in parallel to improve performance must open multiple TCP connections, because the HTTP 1.x delivery model allows each connection to deliver only one response at a time (no multiplexing). This also causes head-of-line blocking and reduces the efficiency of the TCP connection. HTTP 2.0’s binary framing layer removes these limitations and enables multiplexed requests and responses: client and server can break an HTTP message into independent frames, send them interleaved and out of order, and reassemble them at the other end. Multiple requests and responses can thus be sent in parallel over a single connection without interfering with one another, eliminating unnecessary delays and reducing page load time.

Request priority

Each stream can carry a 31-bit priority value. HTTP 2.0 further improves performance by optimizing the interleaving and transmission order of frames based on it.

  • 0 indicates highest priority
  • 2^31 - 1 indicates lowest priority

The browser prioritizes requests based on the type of resource and its position on the page, and even learns prioritization patterns from previous visits.
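
The priority scheme above can be illustrated by how a sender might order streams for transmission (stream names and priority values are made up):

```python
# Streams mapped to their 31-bit priority values: 0 = highest priority,
# 2**31 - 1 = lowest. A sender would service lower values first.
streams = {"html": 0, "css": 8, "image": 2**31 - 1, "script": 16}

send_order = sorted(streams, key=streams.get)
print(send_order)  # ['html', 'css', 'script', 'image']
```

Real HTTP 2.0 senders interleave frames rather than fully draining one stream at a time, but the ordering principle is the same.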

Other optimizations

  • One connection per origin: instead of relying on multiple TCP connections as in HTTP 1.x, only one connection is needed per origin between client and server
  • Flow control: multiple streams over the same TCP connection share its bandwidth; flow control is performed per hop (not end to end) using window-update frames
  • Server push: the server can send multiple responses to a client request. That is, in addition to the response to the initial request, the server can additionally push resources to the client without the client having to explicitly request them.
  • Header compression: compresses header metadata using a “header table” that tracks and stores previously sent key-value pairs. This header table exists for the duration of the connection and is updated by both client and server. The second (and subsequent) requests then only need to send the header fields that changed; common key-value pairs, which hardly ever change, need to be sent only once.
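
The header-table idea can be sketched in a few lines (a simplified illustration, not the real HPACK algorithm, which also uses indexed static/dynamic tables and Huffman coding):

```python
# Simplified sketch of connection-scoped header compression: both sides
# remember previously sent key-value pairs, so a later request only needs
# to transmit the headers that changed.

class HeaderTable:
    def __init__(self):
        self.table = {}

    def encode(self, headers):
        # Only emit pairs that differ from what the peer already knows.
        delta = {k: v for k, v in headers.items() if self.table.get(k) != v}
        self.table.update(delta)
        return delta

enc = HeaderTable()
first = enc.encode({":method": "GET", ":path": "/", "user-agent": "demo"})
second = enc.encode({":method": "GET", ":path": "/app.js", "user-agent": "demo"})
print(len(first), len(second))  # 3 1 (only :path changed on request two)
```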

Classic performance optimization best practices

Transport layer/network layer aspects

  • Reduce DNS lookups
  • Reuse TCP connections, using persistent connections where possible, eliminating TCP handshakes and slow start delays
  • Reduce HTTP redirects, especially between different domains
  • Use CDNs (Content Delivery Networks) to put data closer to the user’s geographic location
  • Remove unnecessary resources

Application layer (HTTP) aspects

  • Caching resources on the client side (cache)
  • Transmit compressed content
  • Eliminate unnecessary request overhead by reducing request header data
  • Process requests and responses in parallel (HTTP 2.0 optimizations)
  • Apply optimizations appropriate to the protocol version

Browser APIs and protocols

Having discussed the performance characteristics of TCP and HTTP, it is now important to understand how to best utilize the browser’s network APIs, protocols, and services to bring performance gains to applications.

Browser network overview

  • Automated socket pool management: manages all open socket pools and enforces connection limits; formats all requests and automatically decodes responses
  • Network security and sandboxing: performs TLS handshakes and the necessary certificate checks; enforces same-origin policies
  • Resource and client state caching
  • Application APIs and protocols: XMLHttpRequest, Server-Sent Events, WebSocket, etc.

XMLHttpRequest

XMLHttpRequest (XHR) is a browser-provided API that lets developers transfer data via JavaScript, enabling asynchronous communication in the browser.

This post is licensed under CC BY 4.0 by the author.