Types of performance testing

Today, performance testing encompasses many types of tests.

  • Latency Test – Measures end-to-end transaction time. Results can vary depending on the vantage point: from the user's perspective or from within the data center.
  • Throughput Test – Measures how many transactions the system can handle in a given period of time.
  • Load Test – A pass/fail test: can the system handle the expected load or not?
  • Stress Test – Finds the breaking point of a system.
  • Endurance Test – Runs the system under load for an extended period to surface anomalies such as memory leaks.
  • Capacity Planning Test – Verifies that the system performs as expected given your capacity planning and provisioning.
  • Degradation Test – Determines at what point system performance starts to degrade.

Several of these tests should be run to get a proper picture of your environment.
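As a minimal sketch of the latency and throughput measurements above (using a stubbed operation in place of a real HTTP transaction):

```python
import time

def measure(operation, iterations=50):
    """Run `operation` repeatedly; return (avg latency in s, throughput in ops/s)."""
    latencies = []
    start = time.perf_counter()
    for _ in range(iterations):
        t0 = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return sum(latencies) / len(latencies), iterations / elapsed

# Stub standing in for a real transaction (e.g. an HTTP request).
avg_latency, throughput = measure(lambda: time.sleep(0.001))
print(f"avg latency: {avg_latency * 1000:.2f} ms, throughput: {throughput:.0f} ops/s")
```

A real latency test would time requests against the deployed system, from both the user's vantage point and inside the data center.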

Resource hints for performance optimization

Resource hints can help you boost your website's or app's performance.

You can read the full Resource Hints specification on the W3C site or a summary on scotch.io; below is a brief overview.

You can also check out browser support for resource hints on caniuse.com.

dns-prefetch pre-resolves DNS hostnames for objects on the page. Read more under Optimizing DNS Lookups below. Usage is as follows:

<link rel="dns-prefetch" href="//cdn.example.com">

preconnect hints the browser to initiate early connections, covering the DNS lookup, TCP handshake and, where applicable, TLS negotiation. Usage is as follows:

<link rel="preconnect" href="//cdn.example.com" crossorigin>

prefetch fetches assets that are likely needed for future navigations and places them in the cache. Usage is as follows:

<link rel="prefetch" href="/images/hello.jpg">

prerender renders a page in the background. Usage:

<link rel="prerender" href="/about/index.html">

Bundling and minification

Bundling and minification reduce the number of HTTP requests and the bytes transferred, which has a positive impact on performance: fewer HTTP requests mean a faster website.

Today, many web applications use several CSS and JS files. Each JS and CSS file requires an HTTP request that goes to an edge server or the origin. Even if you are using persistent connections and multiplexing, there is an associated latency cost. To avoid extra round-trip times (RTTs), JS and CSS files can be bundled into a single file and minified. Fewer bytes mean fewer round trips, which means less time spent.
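As a toy sketch of the idea (the CSS snippets and file names are made up, and the "minifier" only collapses whitespace, far less than a real one does):

```python
# Hypothetical example: three small CSS sources that would each cost one HTTP request.
css_files = {
    "reset.css":  "body { margin: 0; padding: 0; }",
    "layout.css": "main  { display:  flex;      }",
    "theme.css":  "body  { color:  #333;        }",
}

# Bundling: concatenate into a single file -> one HTTP request instead of three.
bundle = "\n".join(css_files.values())

# Naive minification: collapse runs of whitespace into single spaces.
minified = " ".join(bundle.split())

requests_saved = len(css_files) - 1
print(f"requests saved: {requests_saved}, {len(bundle)} -> {len(minified)} bytes")
```

In practice you would use a build tool (bundler plus minifier) rather than hand-rolling this, but the effect is the same: fewer requests and fewer bytes.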

Moreover, you can compress HTML output and strip whitespace and newlines. This also reduces response time.

For text files, bundling and minification combined with HTTP compression make a noticeable difference: the server compresses objects before sending them, which can cut the bytes on the wire by up to ~90%.

All textual content (html, js, css, svg, xml, json, fonts, etc.), can benefit from compression and minification.
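A quick demonstration of the compression gains on repetitive text, using Python's gzip module as a stand-in for server-side HTTP compression (the payload is a made-up CSS rule repeated many times):

```python
import gzip

# A repetitive text payload, standing in for a bundled CSS/JS file.
text = (".button { color: #336699; border-radius: 4px; }\n" * 200).encode("utf-8")

compressed = gzip.compress(text)
ratio = 1 - len(compressed) / len(text)
print(f"{len(text)} bytes -> {len(compressed)} bytes ({ratio:.0%} reduction)")
```

Real-world reductions depend on how repetitive the content is, but textual formats routinely compress very well.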

Client side caching (Browser caching)

Nothing is faster than serving resources from the client’s machine. You won’t use any network resources.

The fastest request is the one you don’t make. The directive that tells the browser how long to cache an object is called the Time to Live or TTL.

It is not easy to decide the best TTL for a resource, however there are some heuristics.

Client-side caching TTL can be set through the HTTP Cache-Control header with the max-age directive (in seconds), or through the Expires header.
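A small sketch of reading the TTL out of a Cache-Control value (a simplified parser, not a full implementation of the header grammar):

```python
def max_age_seconds(cache_control: str):
    """Extract the max-age TTL (in seconds) from a Cache-Control header value."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None  # no max-age directive present

# A typical header for a long-lived static asset: cache publicly for one year.
header = "public, max-age=31536000, immutable"
ttl = max_age_seconds(header)
print(ttl)  # → 31536000
```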

Static content like images, JS, CSS and other files can be versioned. Once the version changes, the client makes a request to get the newer version from the server. For example:

<link rel="stylesheet" href="/css/site.css?v=2">

Here v is the version number of the file. Once it changes, the client goes to the server and requests the changed static file, in our case the CSS file.
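A common variant derives the version from the file content itself, so the URL changes automatically exactly when the file does (a sketch with hypothetical paths and contents):

```python
import hashlib

def versioned_url(path: str, content: bytes) -> str:
    """Append a short content hash as the `v` query parameter, so the URL
    changes whenever the file content changes, busting the client cache."""
    v = hashlib.md5(content).hexdigest()[:8]
    return f"{path}?v={v}"

url_old = versioned_url("/css/site.css", b"body { color: #333; }")
url_new = versioned_url("/css/site.css", b"body { color: #000; }")
print(url_old)
print(url_new)
```

Because the version only changes with the content, such assets can safely be given very long TTLs.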

If you are using a CDN, it usually embraces client-side caching as well.

Avoid HTTP redirects

Redirects are bad for performance, so avoid them unless they serve a purpose such as SEO.

On many high-traffic sites you will see HTTP 301 redirects, usually done for SEO purposes. HTTP 301 redirects can be cached.

However, a redirect to another domain means a new HTTP connection, which can add a DNS lookup and extra latency.

If the redirect stays on the same domain, you can use rewrite rules instead, avoiding new connections and keeping the change transparent to the user.
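A minimal sketch of the idea, using a hypothetical in-application rewrite table rather than real web-server rewrite rules (in practice you would configure this in nginx, Apache, etc.):

```python
# Hypothetical internal rewrite map: old paths are served from new locations
# without sending a 301/302 back to the client, so no extra round trip.
REWRITES = {
    "/old-blog/post-1": "/blog/post-1",
    "/home":            "/",
}

def resolve(path: str) -> str:
    """Return the path actually served; the client never sees a redirect."""
    return REWRITES.get(path, path)

print(resolve("/old-blog/post-1"))  # served directly from /blog/post-1
```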

On a different note, HTTP 301 and 302 have newer counterparts: 308 and 307, respectively.

See this Stack Overflow question.

Optimizing DNS Lookups

A DNS lookup must be made before a connection to a host can be opened, so it is essential that this resolution happens as fast as possible.

Following can be implemented as best practices:

Limit the number of unique hostnames in your application: too many domains increase response time due to DNS lookups. On the other hand, keep Domain Sharding in mind, so you will have to balance the two.
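A quick way to audit this is to count the unique hostnames among a page's resource URLs (the URLs below are made up for illustration):

```python
from urllib.parse import urlparse

# Hypothetical resource URLs referenced by a page.
resources = [
    "https://www.example.com/css/site.css",
    "https://cdn.example.com/js/app.js",
    "https://cdn.example.com/img/logo.png",
    "https://fonts.example.org/f/roboto.woff2",
]

# Each unique hostname is a potential extra DNS lookup (and connection setup).
hostnames = {urlparse(u).hostname for u in resources}
print(f"{len(hostnames)} unique hostnames: {sorted(hostnames)}")
```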

You can use dns-prefetch, a browser hint, to prefetch DNS resolution for your resources.

<link rel="dns-prefetch" href="//ajax.googleapis.com">

In this example, while the initial HTML is being loaded and processed, DNS resolution will take place for this particular host.

You can easily spot these browser hints in the source code of amazon.com and other high-traffic sites. You will also see a meta tag that enables or disables prefetch control, directing the browser accordingly.

<meta http-equiv="x-dns-prefetch-control" content="on">
<link rel="dns-prefetch" href="https://s3.images.amazon.com">

With this approach, DNS lookups happen in the background, so by the time the user needs a resource the browser won't have to do an additional DNS lookup, reducing latency when the user takes action.

Hopefully these techniques will help you reduce DNS lookup time.

Domain Sharding

Even though this method is considered an anti-pattern and obsolete nowadays (with HTTP/2), it is worth mentioning and is still valid for HTTP/1.1 applications. Domain sharding is the most extreme, and possibly the most successful, HTTP/1 optimization.

As I mentioned in the Connection Management post, browsers usually open 6 connections per host/domain. To increase the performance of a web page or app, we need the browser to open more connections, so that the assets/objects on the page can be downloaded in parallel. So we shard our application to load resources from multiple domains.

If we want a faster website or application, it is possible to force the browser to open more connections. Instead of serving all resources from the same domain, say www.foobar.com, we can split them over several domains: www1.foobar.com, www2.foobar.com, www3.foobar.com. Each of these domains resolves to the same server, and the web browser will open 6 connections to each (in our example, 18 connections in total). This technique is called domain sharding.
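One detail worth getting right: an asset should always map to the same shard, otherwise the browser re-downloads it under a different hostname and the cache is wasted. A deterministic, hash-based assignment (a sketch using the shard names from the example above) handles this:

```python
import hashlib

SHARDS = ["www1.foobar.com", "www2.foobar.com", "www3.foobar.com"]

def shard_for(asset_path: str) -> str:
    """Pick a shard by hashing the asset path, so a given asset always maps
    to the same hostname and stays cacheable across page views."""
    digest = hashlib.md5(asset_path.encode("utf-8")).digest()
    return SHARDS[digest[0] % len(SHARDS)]

print(shard_for("/img/logo.png"))
```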

Without domain sharding:

With domain sharding:

With some DNS and app-server tricks, you don't have to host the files/assets on a different server; you can serve them all from the same one.

The cost here is extra DNS lookups and connection setup for the sharded domains.

It is also worth mentioning that you should have a wildcard (*) certificate covering the shards you are using.

While domain sharding helps performance by providing higher parallelism, it has some drawbacks as well.

First of all, domain sharding introduces extra complexity into the application and code, with an associated development cost. There is no perfect number of shards: each connection to a shard consumes resources (I/O, CPU) and races with the others for bandwidth, which causes poor TCP performance.


Connection Management in HTTP

Connection management is a key topic in HTTP that affects performance: opening and maintaining connections impacts the performance of websites and web applications.

HTTP uses TCP for its reliability. There have been different models of connection management throughout the evolution of HTTP.

Initially, the connection model was short-lived connections: prior to HTTP/1.1, for every request a new connection was set up, used, and disposed of. As you can imagine, this has a major effect on performance, but at the time it was the simplest workable way to deliver HTTP. Each HTTP request completed on its own connection, meaning a TCP handshake happened before each HTTP request, and these were serialized.

Opening each TCP connection is a resource-consuming operation: several messages (RTTs) must be exchanged between the client and the server, so network latency and bandwidth affect performance every time a request needs to be sent.

The TCP handshake itself is time-consuming, and a TCP connection adapts to its load, becoming more efficient over sustained (warm) connections. Short-lived connections do not exploit this property of TCP: performance degrades below optimum because transmission keeps happening over new, cold connections.

With HTTP/1.1, two new models were introduced: persistent connections and HTTP pipelining.

A picture is worth a thousand words.

The persistent-connection (keep-alive) model keeps connections opened between successive requests, reducing the time needed to open new connections and thus saving resources. HTTP pipelining goes one step further and sends multiple requests without waiting for a response.
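As a back-of-envelope model of the savings (deliberately ignoring TLS setup, TCP slow start, and parallel connections), assume each request costs one RTT and each new connection costs one extra RTT for the TCP handshake:

```python
def total_rtts(requests: int, persistent: bool) -> int:
    """Round trips for `requests` serialized HTTP requests.
    Each request costs 1 RTT; a TCP handshake costs 1 extra RTT per new
    connection. TLS and slow start are ignored in this rough model."""
    if persistent:
        return 1 + requests          # one handshake, then all requests reuse it
    return requests * 2              # handshake + request, every single time

print(total_rtts(10, persistent=False))  # → 20
print(total_rtts(10, persistent=True))   # → 11
```

Even in this simplified model, a persistent connection nearly halves the round trips for ten requests; pipelining (and later HTTP/2 multiplexing) reduces the waiting further by not serializing request/response pairs.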

The Connection header was introduced with HTTP/1.1, and persistent connections are enabled by default.

A persistent connection is one that remains open for a period and can be reused for several requests, saving the need for a new TCP handshake and exploiting TCP's warm-connection performance. The connection will not stay open forever: idle connections are closed after some time.

One drawback of persistent connections is that they consume resources on servers even while idle, which is why they must be closed after a period of time.

HTTP pipelining is not used, and has been removed from browsers, due to its complexity and problems such as head-of-line blocking.

Cookieless domains

“An HTTP cookie (also called web cookie, Internet cookie, browser cookie or simply cookie) is a small piece of data sent from a website and stored on the user’s computer by the user’s web browser while the user is browsing. Cookies were designed to be a reliable mechanism for websites to remember stateful information (such as items added in the shopping cart in an online store) or to record the user’s browsing activity (including clicking particular buttons, logging in, or recording which pages were visited in the past). They can also be used to remember arbitrary pieces of information that the user previously entered into form fields such as names, addresses, passwords, and credit card numbers.” Wikipedia.

Cookies are sent to the server in the request header, on every single request. In HTTP/1, headers are not compressed. Many modern apps use several cookies to store user information, and it is not unusual to see cookie sizes larger than a single TCP packet (~1.5 KB). As a result, sending cookies on every request has an associated cost, resulting in increased round trips.

That’s why it has been a rational recommendation to set up cookieless domains for resources that don’t rely on cookies, for instance images, CSS and other static files. This applies to HTTP/1; HTTP/2 is a different story, since it compresses headers.

Note the trade-off: serving static objects from the same hostname as the HTML eliminates the additional DNS lookups and (potentially) socket connections that would otherwise delay fetching the static resources.

Still, it is a best practice for HTTP/1 applications to serve static files from a cookieless domain.
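A rough back-of-envelope calculation of the waste, using the ~1.5 KB cookie size mentioned above and an assumed 50 static requests per page view:

```python
# Rough estimate of upstream overhead from sending cookies on static requests.
cookie_bytes = 1500          # ~ one TCP packet of cookies, as mentioned above
static_requests = 50         # assumed images, CSS, JS fetched per page view

wasted = cookie_bytes * static_requests
print(f"{wasted / 1024:.1f} KB of cookies sent for resources that ignore them")
```

Upstream bandwidth is usually the scarcer direction for clients, which makes this overhead more painful than the raw byte count suggests.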

** The only time you want to send cookies to the server when requesting an image or other static file is when you want to track the user; this is what ad-serving businesses usually do.