The web 2.0 Ajax/Comet use-cases have been getting most of the attention when it comes to the asynchronous features of servlet containers. Asynchronous features are needed so that a request waiting for a Comet event can be held without a thread being allocated until there is something to deliver to the Ajax client.
However, there are some compelling ways in which traditional web 1.0 applications can also benefit greatly from asynchronous features. Previously I have looked at how threadless waiting for resources can improve quality of service. In this blog, I look at how an asynchronous servlet container like Jetty can greatly improve scalability when serving medium and large content. Furthermore, I show how your benchmarking is probably not giving you the full story when it comes to scalability.
When applications are benchmarked and/or stress tested, it is frequently the case that while the server environment is a good approximation of the deployment environment, the simulated client environment is by necessity only a very rough approximation of the real world. Consider the network speed between test client and test server: it is frequently a 100Mb/s (or even gigabit) LAN connection. Unfortunately, the load profile of 1000 simulated users sharing 1 fast connection can be vastly different from that of 1000 real users on 1000 ISP-limited connections.
Assume all users are on broadband and that transfer rates average 1Mb/s. Content can then take over 100 times as long to transfer over an ISP-limited connection as over a fast local LAN: a 20KB logo takes 0.15s to serve, the 250KB content of a typical home page takes 1.9s, a 1MB image takes 8s, and a 5MB mpeg or pdf takes 40s.
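These figures are simple size-over-bandwidth arithmetic (they come out exactly if 1Mb is read as 2^20 bits, matching 1MB = 2^20 bytes). A quick check, with class and method names of my own choosing:

```java
// Transfer-time arithmetic for a bandwidth-limited client connection,
// reproducing the figures above (1Mb treated as 2^20 bits).
public class TransferTimes {
    static double seconds(double kiloBytes, double megabitsPerSecond) {
        // bytes -> bits, divided by link speed in bits per second
        return (kiloBytes * 1024 * 8) / (megabitsPerSecond * (1 << 20));
    }

    public static void main(String[] args) {
        System.out.printf("20KB logo:  %.2fs%n", seconds(20, 1));       // ~0.16s
        System.out.printf("250KB page: %.2fs%n", seconds(250, 1));      // ~1.95s
        System.out.printf("1MB image:  %.2fs%n", seconds(1024, 1));     // 8s
        System.out.printf("5MB pdf:    %.2fs%n", seconds(5 * 1024, 1)); // 40s
    }
}
```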
The problem with this latency is that a blocking servlet container must keep a thread allocated while the content is being flushed to the client. If a blocking container is serving 10 mpegs per second, then 400 threads could be needed by a non-asynchronous server just to flush the content! Serving 20KB dynamic content at 1000 requests per second needs 156 threads dedicated to flushing content, above and beyond the threads needed to receive requests, handle them, and generate the responses.
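These thread counts follow from Little's law: concurrent threads needed = request rate x time each response ties up a thread. A minimal sketch (names are mine):

```java
// Little's law applied to blocking content flushing:
// threads in flight = arrival rate * per-request flush time.
public class ThreadEstimate {
    static long threadsNeeded(double requestsPerSecond, double flushSeconds) {
        return Math.round(requestsPerSecond * flushSeconds);
    }

    public static void main(String[] args) {
        // 10 mpeg downloads/s, each taking 40s to flush -> 400 threads
        System.out.println(threadsNeeded(10, 40));
        // 1000 req/s of 20KB content, each taking ~0.156s to flush -> 156 threads
        System.out.println(threadsNeeded(1000, 0.15625));
    }
}
```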
While servlets themselves are blocking, Jetty 6 has asynchronous features that allow threads to be freed while content is being flushed after a servlet has returned from service. If the content being served is static, an NIO memory-mapped buffer is used to send the content directly to the SocketChannel: a thread is allocated only to handle the initial request, and the remainder of the content is flushed by the JVM and operating system without any user threads being allocated.
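The underlying mechanism can be illustrated with the standard NIO API. This is a simplified sketch, not Jetty's actual code: FileChannel.transferTo lets the operating system move file bytes straight to the target channel, so no user thread sits in a read/write loop.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative zero-copy serving of static content (not Jetty's real code).
public class ZeroCopyServe {
    static void serve(Path content, FileChannel target) throws IOException {
        try (FileChannel src = FileChannel.open(content, StandardOpenOption.READ)) {
            long pos = 0, size = src.size();
            while (pos < size) {
                // In a real server the target would be the client's SocketChannel;
                // transferTo delegates the copy to the OS where possible.
                pos += src.transferTo(pos, size - pos, target);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("logo", ".bin");
        Path out = Files.createTempFile("sent", ".bin");
        Files.write(in, "static content".getBytes(StandardCharsets.UTF_8));
        try (FileChannel dst = FileChannel.open(out, StandardOpenOption.WRITE)) {
            serve(in, dst);
        }
        System.out.println(new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
    }
}
```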
For dynamic content, Jetty 6 buffers content until the buffer is filled, the servlet flushes the buffer, or the servlet returns from the service method. In this latter case, Jetty is able to flush the buffer asynchronously, allocating a thread only for the short period needed to initiate the write of each block of data. Because of Jetty's flexible buffer allocation mechanism, large content buffers can be allocated, which means that servlets frequently run to completion without blocking, and the generated content is then flushed asynchronously.
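A toy model of this buffer-then-flush-asynchronously pattern, using only standard library classes (this is my own simplification, not Jetty's implementation): the "servlet" writes into an in-memory buffer and returns immediately, and the flush is handed to a background writer so the request thread is freed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model of asynchronous flushing of buffered dynamic content
// (not Jetty's real code).
public class AsyncFlushModel {
    static String handle(ExecutorService flusher) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream(8192);
        // The "servlet" runs to completion without blocking on the network,
        // because the whole response fits in the large buffer.
        buffer.write("<html>generated response</html>".getBytes(StandardCharsets.UTF_8));
        // After the service method returns, the flush proceeds on a
        // background thread; the request thread is now free.
        Future<String> flushed = flusher.submit(
            () -> new String(buffer.toByteArray(), StandardCharsets.UTF_8));
        return flushed.get(); // wait here only so the demo can show the result
    }

    public static void main(String[] args) throws Exception {
        ExecutorService flusher = Executors.newSingleThreadExecutor();
        System.out.println(handle(flusher));
        flusher.shutdown();
    }
}
```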
For example, I ran Jetty 6 on a mid-sized Amazon EC2 node serving 20KB dynamic content at 80 requests per second. Over an 8Mb/s link with a 180ms ping time, over 40 threads were needed when running in blocking mode (content flushed by the servlet). In asynchronous mode, the content was served with only 2 threads! When the same test was run locally, 2 threads were sufficient in both cases.
So in summary, there are two morals to this blog! Firstly, asynchronous features can have a significant impact on the scalability of your web application, even if you are not using any Ajax clients. Secondly, running benchmarks on fast networks may give you a very false reading of the true scalability of your application.