A gentle introduction to HTTP/2

Let’s start with HTTP/0.9

The original HTTP, defined in 1991, was developed by Tim Berners-Lee. It was a text-based request-and-response protocol: you could issue a GET request, and the only response type was HTML. It was used mostly for sharing documents, largely about physics, and the connection was closed after each request.

Later on, HTTP/0.9 was extended into HTTP/1.0: request and response headers were added, and you could request images, text files, CSS, and other content types.

In 1999, HTTP/1.1 showed up: persistent connections (keep-alive) were introduced, and chunked transfer encoding and the Host header were added. With the Host header, it became possible to host multiple sites on a single IP address. It was a huge success!

Problems with HTTP/1.1:

  • It wasn’t designed for today’s web pages
    • 100+ HTTP requests, and 2MB+ page size.
  • Requires multiple connections
  • Lack of prioritization
  • Verbose headers
  • Head of line blocking

Let’s break it down.

Multiple connections:

When a page requires 100+ HTTP requests, there is a limit to the number of connections a browser can open per host; most browsers support six simultaneous connections. This becomes a problem because connections take time to establish and to reach full efficiency: there is a TCP three-way handshake every time a new connection is needed.

Before HTTP/1.1, each resource required its own connection, and therefore its own three-way handshake, which was very wasteful.

With HTTP/1.1, the Connection header was introduced, and Connection: keep-alive became the default behavior. With this header, the repeated three-way handshake was eliminated and everything could be done over a single TCP connection.
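For illustration, a persistent HTTP/1.1 request looks roughly like this (example.com is a placeholder host):

GET /index.html HTTP/1.1
Host: example.com
Connection: keep-alive

After the response arrives, the same TCP connection stays open and is reused for the next request instead of being torn down.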

TCP is a reliable protocol: for each packet sent, an acknowledgement is received.

This introduces a head-of-line blocking problem at the TCP level: if one packet (say, packet 2 of a transfer) is lost, it blocks all subsequent packets until it is retransmitted.

There are other TCP mechanisms at play as well, like slow start and the sliding window, which adjust the transfer rate to the conditions of the network. This is called flow control.

Lack of prioritization:

Priority is decided by the browser, not by the developer or the application. There is no way to specify the order of responses; browsers have to decide how best to use their connections and in what order to request resources. There is no prioritization built into HTTP/1.1.

Verbose headers:

There is no header compression. You can use gzip to compress the content, but headers such as Cookie, User-Agent, Referer, and others are not compressed. HTTP is stateless by default; therefore Cookie, User-Agent, and other headers are sent to the server on every single request, which is inefficient, especially for very high-volume sites.

Head of line blocking:

Once you open a connection and send a request for a resource, that connection is dedicated to that request: until the response comes back, you can’t use the connection for anything else. For example, if you need to get 30 resources from a host, you can fetch only six at a time; once you have requested six resources, the others must wait for those six requests to finish. So HTTP/1.1 works in a serial way (request, then response, then the next request). This is called head-of-line blocking.

These are the problems with HTTP/1.1. Beyond them, there are the more general constraints of bandwidth and latency.

Bandwidth is measured in bits per second, and it is relatively easy to add more of it to a system. It is usually referred to as network bandwidth, data bandwidth, or digital bandwidth.

Latency is the time interval between cause and effect in a system. On the internet, latency is typically the time between a request and its response. It is measured in milliseconds and is bounded by distance and the speed of light, so there is not much you can do about it. You can use a CDN to fight latency, but that comes at a price.

For more information about latency and its impact, read “It’s the Latency, Stupid” by Stuart Cheshire.

Increasing bandwidth helps improve web performance and page load times, but only up to a limit. Reducing latency, on the other hand, helps performance almost linearly. You can read this post for more information about bottlenecks; the tests described there indicate that improving latency is more effective than improving bandwidth when it comes to web page optimization.

If we compare the internet to a highway, bandwidth is the number of lanes, while latency is the time it takes to travel a given distance, which depends on traffic, the speed limit, and so on.

Goals of HTTP/2

  • Minimize impact of Latency
  • Avoid head of line blocking
  • Use single connection per host
  • Backwards compatible
    • Fall back to HTTP/1.1
    • HTTP/1.1 methods, status codes, and headers still work.

The biggest success of HTTP/2 is reducing latency by introducing full request and response multiplexing.

HTTP/2 Major Features

  • Binary framing layer
    • No longer a text-based protocol; binary protocols are easier to parse and more robust.
  • Resource prioritization
  • Single TCP connection
    • Fully multiplexed
    • Able to send multiple requests in parallel over a single TCP connection.
  • Header compression
    • It uses HPACK to reduce overhead.
  • Server push

HTTP/2 introduces some further improvements as well; for more details, see the spec, RFC 7540.
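To see the multiplexing from the list above in action, here is a small sketch using Java 11’s standard HttpClient. The host and resource paths are placeholders; when the server speaks HTTP/2, the three requests below travel as separate streams over one TCP connection.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class MultiplexDemo {
    public static void main(String[] args) {
        // Ask for HTTP/2; the client silently falls back to HTTP/1.1 if the server can't do it.
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .build();

        // Hypothetical resources on one host; over HTTP/2 these requests are
        // multiplexed as separate streams on a single TCP connection.
        List<CompletableFuture<Void>> inFlight = List.of("/styles.css", "/app.js", "/logo.png")
                .stream()
                .map(path -> HttpRequest.newBuilder(URI.create("https://example.com" + path)).build())
                .map(request -> client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                        .thenAccept(response ->
                                System.out.println(response.uri() + " served over " + response.version())))
                .collect(Collectors.toList());

        inFlight.forEach(CompletableFuture::join); // wait for all responses
    }
}

With HTTP/1.1 the same three requests would either queue on one connection or force the client to open extra connections.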

The binary framing layer doesn’t use any text. Today we can trace and debug HTTP/1.1 simply by reading the traffic; this change means we will need tools for debugging HTTP/2.

Resource prioritization allows browsers and developers to prioritize the requested resources, and priorities can be changed at any time based on the resources or the application. So far so good. If there is a problem with a high-priority resource, the browser can intervene and request lower-priority resources in the meantime.

The most important change in HTTP/2 is the single TCP connection per host, which solves a lot of problems. HTTP/2 multiplexes request and response frames from multiple streams over that one connection. Far fewer resources are used: there are no repeated three-way handshakes, no repeated TCP slow start, and no HTTP-level head-of-line blocking.

When a user requests a web page, headers such as Cookie and User-Agent are sent with every single request. It doesn’t make much sense to resend the same User-Agent every time. To solve this problem, HPACK’s dynamic table was introduced.

When you send a request to a host, headers like the ones below are sent along. On subsequent requests, the full values are not sent again; instead, short index values are transmitted. If a header’s value is unchanged on a future request, only its index is sent, so an unchanging User-Agent is never transmitted in full again.

HTTP headers

Header       Original value                     Compression value
Method       GET                                2
Scheme       HTTP                               6
User-agent   Mozilla Firefox/Windows/10/Blah    34
Host         Yahoo                              20
Cookie       Blah                               89
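The table above can be made concrete with a toy sketch of the dynamic-table idea. This is not the real HPACK wire format (which also has a 61-entry static table and Huffman coding); it only illustrates how a repeated header shrinks down to a small index:

import java.util.HashMap;
import java.util.Map;

// Toy illustration of HPACK's dynamic-table idea (not the real wire format):
// the first time a header is seen, send it literally and index it;
// afterwards, send only the small index number.
public class DynamicTableSketch {
    private final Map<String, Integer> table = new HashMap<>();
    private int nextIndex = 62; // real HPACK dynamic entries start after the 61 static entries

    public String encode(String name, String value) {
        String entry = name + ": " + value;
        Integer index = table.get(entry);
        if (index != null) {
            return "index " + index;            // header already in the table
        }
        table.put(entry, nextIndex);
        return "literal '" + entry + "' (indexed as " + nextIndex++ + ")";
    }

    public static void main(String[] args) {
        DynamicTableSketch t = new DynamicTableSketch();
        System.out.println(t.encode("user-agent", "Mozilla Firefox/Windows/10/Blah")); // sent literally
        System.out.println(t.encode("user-agent", "Mozilla Firefox/Windows/10/Blah")); // just the index
    }
}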


Server push is still experimental today: servers try to predict what will be requested next and push it to the client proactively.

ALPN (Application-Layer Protocol Negotiation) is needed to negotiate HTTP/2 over TLS.

As part of the spec, HTTP/2 doesn’t require HTTPS; however, if you do use HTTPS, you need TLS 1.2+ and must avoid a number of blacklisted cipher suites.

Can you use HTTP/2?

With HTTP/2 we have a single multiplexed connection, which means some established performance optimization techniques, such as CSS sprites, JS bundling, and domain sharding, become more or less obsolete. Since connections are cheaper and more efficient with HTTP/2, resources can be cached, modified, and requested independently. There are of course tradeoffs, and you need to decide what to apply.

However, I think web performance basics still apply: make fewer HTTP requests, and send as little as possible, as infrequently as possible.

While many major internet giants are already using HTTP/2, it is still not widely adopted. I assume adoption will take a while, and the maturity of this exciting new protocol will come along with it.

Here are some demos showing the difference between HTTP/1.1 and HTTP/2.

Akamai demo

CDN 77 demo



Git view files in another branch without checkout

If you’d like to view a file in another branch without checking it out, you can do this:

git show branch:file

So if there is a branch called Feature_fancy containing a file called fancy_controller.js:

git show Feature_fancy:fancy_controller.js

This will show the content of the file without having to check out this branch.


Git remove untracked files

If you have a bunch of untracked files sitting around, you can clean them all at once with git clean. It has three useful options.

The -n option does a dry run and shows you which files would be removed.

git clean -n

The -i option asks you interactively what to do:

git clean -i

The last one is the -f option, which forces removal of the untracked files.

git clean -f

That is about it.


Git Amend your last commit

Git has great features, and one of them is amending your previous commit. If you left something out of your last commit, whether a whole file or a change within a file you just committed, git amend can easily fix your problem.

You need to stage your changes with git add first.

git add .

Then commit it again with:

git commit --amend

You can also use the -m option to overwrite the previous commit message.

git commit --amend -m "your new message"

Once you are done you can check the log with:

git log --stat

Git amend is pretty useful for recovering from incomplete commits.


Git Cherry-pick

In a recent project, the release team was having lots of problems going to deployment.

Scenario: there is a set of defects and new features planned for a release cycle, but not all of them can make it to production. As things stand, the whole release gets postponed or cancelled.

Or: you have a feature branch that isn’t ready for a full merge, but it contains a few commits that you want to get into master.

Git cherry-pick to the rescue.

From your feature branch, defect branch, or any other branch, copy the first seven characters of the commit hash that you want to cherry-pick (any unique abbreviation will do).

To find the commit hash, you can use git log:

git log --oneline

Now check out the branch that you want to apply the commit to, such as master, and cherry-pick:

git cherry-pick ad9381b

Now if you do a git log you will see your cherry-picked commit at the top of the history.

If you’d like to record that a commit was cherry-picked, you can use the -x option so that the commit message contains cherry-pick information.
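For example, reusing the hash from above:

git cherry-pick -x ad9381b

The resulting commit message will then end with a line like the following: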

(cherry picked from commit 676aac131c1349ebd5610b270c470a33bddc7cb0)

Be careful when cherry-picking a lot of commits out of order: the git log will reflect the order in which you cherry-picked, not the chronological order of the original commits (the original commit date is preserved, however).

While cherry-picking is pretty handy, use it with caution.


Circuit breakers

In my previous post, bulkheads, I mentioned: “Since the popularity of service-oriented architectures and then microservices, people have been talking about bulkheads and other terms like circuit breakers, timeouts, etc. However, I see that many lack an understanding of what these terms really mean or how to implement them. I decided to cover some of them, starting with bulkheads.” In this post I will briefly cover circuit breakers.

Not so many years ago, people used to plug too many electrical appliances into a single circuit. Each appliance drew a certain amount of current, and as the wiring resisted that current it generated so much heat inside the walls that houses burned down. Later there were some improvements, but houses still caught fire.

Circuit breakers came to the rescue.

The same idea can be applied to software with integration points. An application with one or more integrations is destined to fail at some point; that is a given. Circuit breakers help by preventing operations that are already known to be unhealthy from being attempted.

Circuit breakers are a way to degrade functionality gracefully when the system is under stress. Changes in a circuit breaker’s state should always be logged and monitored. Circuit breakers are effective at guarding integration points against cascading failures, slow responses, and so on.

In the normal, closed state, a circuit breaker executes operations as usual: other services can be invoked and internal operations can proceed. However, if an operation fails, the circuit breaker records this, including timeouts. Once the number of failures exceeds a certain threshold, the circuit breaker trips and opens the circuit; from then on, any call made through it fails immediately.

This is very important, because many failures occur due to blocked threads, race conditions, and deadlocks. If a service is not responding or is continuously timing out, what is the point of invoking it? All your threads will block until you run out of them, and soon your JVM or runtime will crash.

After a configurable amount of time, the circuit breaker goes into a half-open state, in which a call is allowed to pass through; if it succeeds, the circuit breaker closes again, and if things are still failing, it re-opens.
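To make the state machine concrete, here is a minimal sketch (this is not Hystrix, just an illustration of the closed/open/half-open logic; the threshold and timeout are arbitrary constructor parameters):

import java.util.function.Supplier;

// Minimal circuit breaker sketch: closed -> open after too many failures,
// open -> half-open after a cooldown, half-open -> closed on one success.
public class CircuitBreaker {
    private enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long resetTimeoutMillis;
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    public CircuitBreaker(int failureThreshold, long resetTimeoutMillis) {
        this.failureThreshold = failureThreshold;
        this.resetTimeoutMillis = resetTimeoutMillis;
    }

    public synchronized <T> T call(Supplier<T> operation) {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt >= resetTimeoutMillis) {
                state = State.HALF_OPEN;   // let one trial call through
            } else {
                throw new IllegalStateException("circuit open: failing fast");
            }
        }
        try {
            T result = operation.get();
            state = State.CLOSED;          // success: close the circuit
            failures = 0;
            return result;
        } catch (RuntimeException e) {
            failures++;
            if (state == State.HALF_OPEN || failures >= failureThreshold) {
                state = State.OPEN;        // trip the breaker
                openedAt = System.currentTimeMillis();
            }
            throw e;
        }
    }
}

A call site would wrap a remote invocation, e.g. breaker.call(() -> client.fetchProfile(id)) with a hypothetical client, and treat the fail-fast exception as the signal to use a fallback.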

When the circuit breaker is open, you can either let the user know that something is not working and ask them to check back soon, or you can fall back to an alternative service. The latter is much better, but unfortunately some services cannot have a fallback because of their responsibility and function.

You can have multiple circuit breakers for different purposes, such as timeouts, refused connections, and other types of failures.

There are several tools that can help you implement circuit breakers in your system. Netflix has an open source project for this purpose called Hystrix; you can check it out and see how things work.



Bulkheads

Since the popularity of service-oriented architectures and then microservices, people have been talking about bulkheads and other terms like circuit breakers, timeouts, etc. However, I see that many lack an understanding of what these terms really mean or how to implement them. I decided to cover some of them, starting with bulkheads.

The term comes from ships: in a ship, bulkheads are partitions that can be sealed and closed during an emergency.

If one of the compartments starts taking on water, it can be sealed off by closing the hatches, which prevents the water from moving from one compartment to the next and sinking the ship.

The same technique can be employed in software architecture. By partitioning your system, you can avoid cascading failures. Bulkheads can be applied to physical and application services so that if one piece of hardware or one application fails, the rest of the system continues functioning. Critical applications should be partitioned, and bulkheads should be implemented around them.

Imagine you have an application A and an application B, plus a common service C that is very critical for both apps. In a conventional architecture, both apps call a single shared instance of service C.

The problem with this architecture is that if service C goes down for any reason, both apps are affected. The bulkheads pattern instead recommends giving each app its own instance of service C.

Deploying a separate instance of service C for each app provides better stability. The partition can simply be independent hardware, a separate application host, or a dedicated thread pool; you can also partition by deploying to multiple virtual machines. A sketch of the thread-pool variant follows below.
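As an illustration of the thread-pool variant, here is a minimal sketch in which each caller gets its own fixed-size pool for reaching service C (invokeServiceC is a hypothetical placeholder for the real remote call). Exhausting one pool cannot starve the other:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Bulkhead via dedicated thread pools: each dependency path gets its own
// fixed-size pool, so a slow or failing dependency can only exhaust
// its own threads, not the whole application's.
public class BulkheadSketch {
    private final ExecutorService serviceCPoolForA = Executors.newFixedThreadPool(10);
    private final ExecutorService serviceCPoolForB = Executors.newFixedThreadPool(10);

    public Future<String> callFromAppA() {
        // If service C hangs for app A's traffic, only these 10 threads block.
        return serviceCPoolForA.submit(() -> invokeServiceC("request-from-A"));
    }

    public Future<String> callFromAppB() {
        // App B's calls keep flowing on their own pool.
        return serviceCPoolForB.submit(() -> invokeServiceC("request-from-B"));
    }

    private String invokeServiceC(String payload) {
        // Placeholder for the real remote call to service C.
        return "response-to-" + payload;
    }
}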

Today many application servers provide the means to separate runtime environments for applications: you can deploy the same application under different contexts and assign a separate JVM or CLR to each.

We also have Docker and various virtualization software today, which make implementing bulkheads easy.



The notorious SSL Handshake

The notorious SSL handshake process happens as follows:

  1. The client issues a secure session request.
  2. The server sends back an X.509 certificate containing the server’s public key.
  3. The client validates the server’s certificate against its list of known CAs (Certificate Authorities). If the certificate’s issuer is not in the list, the user is prompted to accept the certificate.
  4. The client generates a random symmetric key and encrypts it using the server’s public key.
  5. The client and the server now both have the symmetric key; the client encrypts data with this symmetric key and sends it to the server for the rest of the session.
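If you’d like to trigger and inspect a handshake programmatically, here is a minimal sketch using Java’s standard SSLSocket API (the host is just an example):

import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class HandshakeDemo {
    public static void main(String[] args) throws Exception {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket socket = (SSLSocket) factory.createSocket("amazon.com", 443)) {
            socket.startHandshake(); // performs the steps above before any application data flows
            System.out.println("Protocol:     " + socket.getSession().getProtocol());
            System.out.println("Cipher suite: " + socket.getSession().getCipherSuite());
            System.out.println("Peer:         " + socket.getSession().getPeerPrincipal());
        }
    }
}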

If you’d like to see it in action, open up your Chrome tools: browse to chrome://net-internals/#events

Then go to a secure URL, such as https://amazon.com. In the events log you will see the events for the SSL handshake.

Browsing through the events will show you the whole handshaking process.
