Exposé of AWS’s ELB “pre-open” Optimization, Part 2

or “what are all these …reading… requests for?”

Summary:

To get straight to the point: the pre-opened connections are an Amazon ELB optimization that eliminates the TCP handshake time between the ELB and the EC2 instance when a client request arrives – I’ve confirmed this behavior with a member of the Amazon ELB engineering team. Note: if you do not wish to use the pre-open optimization, AWS support can disable it for you. If the details interest you, the remainder of the post describes how I went about discovering the purpose of the pre-open optimization prior to my discussion with AWS.

Investigation:

From the start I suspected that the “pre-open” behavior was the result of the AWS ELB “warming” connections to the EC2 backend. By matching incoming connections to the source IP of the ELB, I came to understand that the TCP “pre-open” flow produced the following sequence:

  1. A TCP connection is established: SYN (LB) -> SYN,ACK (EC2) -> ACK (LB).
  2. The EC2 instance transmits a second ACK 1 second later.
  3. The EC2 instance closes the established TCP stream 20 seconds later (note: how long the server waits before closing the idle stream is set by the Apache configuration; see the sketch below the screenshot).

The image below is a screenshot of a captured “…reading…” connection.

ELB Reading Connection - TCP Stream
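To make the behavior in steps 1–3 concrete, here is a minimal server-side sketch (Python, not Apache) of what the capture shows: the handshake completes at accept(), no request bytes arrive, and the server closes the socket once an idle timeout expires. The 20-second timeout and port 8080 are illustrative assumptions standing in for the Apache setting mentioned above.

```python
import socket
import time

IDLE_TIMEOUT = 20   # seconds a connection may sit idle before we close it (assumed)
PORT = 8080         # assumed port for the sketch

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", PORT))
server.listen(128)

while True:
    conn, addr = server.accept()        # handshake done: SYN -> SYN,ACK -> ACK
    opened_at = time.time()
    conn.settimeout(IDLE_TIMEOUT)
    try:
        data = conn.recv(4096)          # a pre-opened connection sends nothing yet
        waited = time.time() - opened_at
        print(f"{addr[0]} sent {len(data)} bytes after {waited:.1f}s idle")
    except socket.timeout:
        print(f"{addr[0]} idle for {IDLE_TIMEOUT}s; closing, as in step 3 above")
    finally:
        conn.close()
```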

I wanted to confirm that this behavior was, in fact, a “pre-open” optimization, and to confirm my suspicion that these connections would be used by the ELB if requests arrived during the window in which a connection was open. To test this, I planned to prime the ELB with a number of requests, wait a number of seconds (for this example, 5 seconds), and then generate additional requests for a particular unique URL. The results showed that a pre-open connection was created by the ELB and then, after 5 seconds, an HTTP request was placed over the already-established connection. By changing the “wait” value in my tests, I was able to control how long after a connection was opened additional data would be sent. For example, if I primed the ELB with traffic and waited 5 seconds before sending a further HTTP request, I’d see a TCP connection that sat idle for 5 seconds before delivering an HTTP request; if I waited 18 seconds, I’d see a TCP connection that sat idle for 18 seconds before delivering the request. A rough sketch of this test follows.
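The sketch below assumes a placeholder ELB hostname and uses only the Python standard library; the request count and wait value are arbitrary. Run tcpdump or Wireshark on the backend instance while it runs to see whether the marker request arrives over an already-established (pre-opened) connection.

```python
import time
import urllib.request
import uuid

ELB_URL = "http://my-elb-example.us-east-1.elb.amazonaws.com"  # placeholder hostname
PRIME_REQUESTS = 20   # arbitrary number of priming requests
WAIT_SECONDS = 5      # vary this (e.g. 5, 18) and compare against the capture

# 1. Prime the ELB so it opens (and pre-opens) connections to the backend.
for _ in range(PRIME_REQUESTS):
    urllib.request.urlopen(ELB_URL + "/", timeout=10).read()

# 2. Wait, then request a unique URL that is easy to spot in the packet capture.
time.sleep(WAIT_SECONDS)
marker = f"/pre-open-test-{uuid.uuid4()}"
urllib.request.urlopen(ELB_URL + marker, timeout=10).read()
print(f"sent marker request {marker} after waiting {WAIT_SECONDS}s")
```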

Final Thoughts:
  1. No documentation exists for the “pre-open” optimization. This isn’t good.
  2. In most cases the ELB “pre-open” optimization will have a positive impact on performance and will not have a negative impact on service availability.
  3. Amazon support can be persuaded to turn off the “pre-open” optimization for ELBs that do not use sticky-sessions.

7 thoughts on “AWS ELB pre-open Connection Exposé, Part 2”

  1. Really informative and insightful. I’d be curious to run the same tests with other web servers like Nginx. I have a couple of questions: when do the extra reading requests become an issue? Is it because of Apache’s max connections, or are there other scenarios where these reading requests cause issues? Lastly, are you implying that any ELB not using sticky sessions should have pre-opened connections turned off?

    1. My colleague’s concern (and the reason I did the investigation) was that the pre-opened connections would use up all of Apache’s client pool and then force subsequent connections into the backlog. The belief was that the ELB health check itself did *not* use one of the pre-opened connections, causing the health check request to get caught in the ListenBacklog (http://httpd.apache.org/docs/2.2/mod/mpm_common.html#listenbacklog); the ListenBacklog wouldn’t decrease in size because the pre-opened connections weren’t being terminated. I believe the pre-opened connections / (…reading…) requests are a smart optimization, but the lack of documentation creates a problem.

  2. *Sorry, posted the same question twice – didn’t realize the first post went through. Anyway, good to know – especially about the possibly non-terminated reading requests. Thanks for the update! 🙂 Again, really useful for anyone using ELBs / AWS.

  3. Awesome blog post. Found this after encountering this issue ourselves. We are using mod_php and as a result have a low number of workers to keep memory in check. The ELB was keeping all the workers in the reading state, and as a result our monitoring server was getting timeouts to otherwise very healthy servers. We’re pushing AWS support to turn this off for us now; the only other solution we see would be to move to php-fpm.

  4. Thanks so much for this great blog post! I spent several days trying to figure out what was causing these ‘reading’ requests before finding your article. We also use PHP (Zend) and had to switch to ‘medium’ Amazon instances in order to avoid Apache going into swap because of these excessive ‘reading’ connections…
