Blog

  • Jetty @ Eclipse Summit Europe

    I presented the advanced features of Jetty at the Eclipse Summit Europe 2010.

    Highlights of the features that I presented are:

    • Jetty & OSGi, showing the full integration of Jetty with the OSGi world
    • Jetty’s asynchronous HTTP client
    • Jetty’s WebSocket protocol support
    • Jetty’s WTP integration
    • Jetty’s Cloudtide, for multi-tenanted deployments of web applications

    Here you can download the slides of the presentation.

  • Comet Panel Video

    In July I participated in the Comet Panel in London, organized by the London Ajax User Group.

    A number of other Comet experts and I talked about our own Comet projects; I talked about CometD.

    Here you can watch the video of the meeting (English, 1hr 38m).

  • CometD 2 Annotated Services

    A new feature recently added to the upcoming CometD 2.1.0 release is
    support for annotated services, on both the server side and the client side.

    Services are the heart of a CometD application, because they allow you to write business logic that is executed when a message
    arrives on a particular channel.
    For example, in chat applications, when a new member joins a chat room, the chat room’s members list is broadcast to all
    existing chat room members.

    This is what an annotated service looks like:

    @Service("echoService")
    public class EchoService
    {
        @Inject
        private BayeuxServer bayeux;
        @Session
        private ServerSession serverSession;

        @PostConstruct
        public void init()
        {
            System.out.println("Echo Service Initialized");
        }

        @Listener("/echo")
        public void echo(ServerSession remote, ServerMessage.Mutable message)
        {
            String channel = message.getChannel();
            Object data = message.getData();
            remote.deliver(serverSession, channel, data, null);
        }
    }

    Annotations are great because they convey semantic meaning about their annotation target.
    Even if you have never written a CometD service before, the code above is pretty straightforward, and it is easy
    to understand that the echo() method is invoked when a message arrives on the "/echo" channel.

    Annotated services also reduce the amount of boilerplate code, especially on the client side, where it is typical
    to add listeners and subscribers as anonymous inner classes.

    Compare:

    bayeuxClient.getChannel(Channel.META_CONNECT).addListener(new ClientSessionChannel.MessageListener()
    {
        public void onMessage(ClientSessionChannel channel, Message message)
        {
            // Connect handling...
        }
    });

    with:

    @Listener(Channel.META_CONNECT)
    public void metaConnect(Message connect)
    {
        // Connect handling...
    }

    I find the second version much more readable, and since code is read much more often than it is written, this is a fine improvement.

    Lastly, CometD annotated services leverage the standard annotations for dependency injection and lifecycle management,
    and therefore integrate nicely with Spring 3.x.

    Take a look at the annotated services documentation
    and at the CometD 2 Spring integration for details.

    Enjoy !

  • ITConversation podcast on Cometd and Push Technology

    Phil Windley of Technometria has recorded an interview with me on Cometd and Push Technology. The podcast is available from ITConversations and provides an introduction to Comet and Cometd.


  • Cometd-2 Throughput vs Latency

    With the imminent release of cometd-2.0.0, it’s time to publish some of our own lies, damned lies and benchmarks. It has been over 2 years since we published the 20,000
    reasons that cometd scales,
    and in that time we have completely reworked both the client side and the server side of cometd, and we have moved to Jetty 7.1.4 from Eclipse as the main web server for cometd.

    Cometd is a publish/subscribe framework that delivers events from an HTTP server to the browser via comet server-push techniques. Cometd-1 was developed in parallel with many of the ideas and techniques for comet, so the code base reflected some changed ideas and old thinking, and was in need of a cleanup. Cometd-2 was a total redevelopment of all parts of the Java and JavaScript code base and provides:

    • Improved Java API for both client and server side interaction.
    • Improved concurrency in the server and client code base.
    • Fully pluggable transports
    • Support for a websocket transport (that works with the latest Chromium browsers).
    • Improved extensions
    • More comprehensive testing and examples.
    • More graceful degradation under extreme load.

    The results have been a dramatic increase in throughput while maintaining sub second latencies and great scalability.

    The chart above shows the preliminary results of recent benchmarking carried out by Simone Bordet for a 100 room chat server. The test was done on Amazon EC2 nodes with 2 x amd64 CPUs and 8GB of memory, running Ubuntu Linux 2.6.32 with Sun’s 1.6.0_20-b02 JVM. Simone did some tuning of the Java heap and garbage collector, but the operating system was not customized other than to increase the file descriptor limits. The test used the HTTP long polling transport. A single server machine was used, and 4 identical machines generated the load using the cometd Java client that is bundled with the cometd release.

    It is worth remembering that the latencies/throughput measured include the time spent in the client load generators, each running the full HTTP/cometd stack for many thousands of clients, when in a real deployment each client would have its own computer/browser. It is also noteworthy that the server is not just a dedicated comet server, but the fully featured Jetty Java Servlet container, and the cometd messages are handled within the rich application context it provides.

    It can be seen from the chart above that the message rate has been significantly improved from the 3800/s achieved in 2008. All scenarios tested were able to achieve 10,000 messages per second with excellent latency. Only with 20,000 clients did the average latency start to climb rapidly once the message rate exceeded 8000/s. The top average server CPU usage was 140/200, and for the most part latencies were under 100ms over the Amazon network, which indicates that there is some additional capacity available for this server. Our experience of cometd in the wild indicates that you can expect another 50 to 200ms of network latency crossing the public internet, but due to the asynchronous design of cometd, the extra latency does not reduce throughput.

    Below is an example of the raw output of one of the 4 load generators, which shows some of the capabilities of the java cometd client, which can be used to develop load generators specific for your own application:

    Statistics Started at Mon Jun 21 15:50:58 UTC 2010
    Operative System: Linux 2.6.32-305-ec2 amd64
    JVM : Sun Microsystems Inc. Java HotSpot(TM) 64-Bit Server VM runtime 16.3-b01 1.6.0_20-b02
    Processors: 2
    System Memory: 93.82409% used of 7.5002174 GiB
    Used Heap Size: 2453.7236 MiB
    Max Heap Size: 5895.0 MiB
    Young Generation Heap Size: 2823.0 MiB
    - - - - - - - - - - - - - - - - - - - -
    Testing 2500 clients in 100 rooms
    Sending 3000 batches of 1x50B messages every 8000
    
  • Lies, Damned Lies and Benchmarks

    Benchmarks, like statistics, can be incredibly misleading in ways that are only obvious with detailed analysis. Recently the Apache HttpCore project released some benchmark results whose headline results read as:

                   Jetty     HttpCore
    Linux BIO     35,342       56,185
    Linux NIO      1,873       25,970
    Windows BIO   31,641       29,438
    Windows NIO    6,045       13,076
    Looking at these results, you see that HttpCore has better throughput in all scenarios except blocking IO running on Windows. More importantly, for Linux NIO the performance of Jetty is an order of magnitude behind HttpCore!!
    So is HttpCore really faster than Jetty, and does Jetty NIO suck? For this particular benchmark, the answer is obviously YES and YES. But the qualification “for this particular benchmark” is very important, since this benchmark is set up to place a huge penalty on the kind of latency that Jetty uses to dispatch requests. Normally latency can be traded off for throughput, but with this benchmark, adding 2ms of latency to a request is the difference between 56,000 requests/sec and 8000 requests/sec. Jetty makes frequent latency vs throughput tradeoffs, and is thus severely penalized by this benchmark.
    [Note that I’m not saying the HttpCore team have done anything wrong, and the “Lies, Damned Lies” headline is only a joking reference to the Mark Twain quote about the power of numbers to show almost anything. Our own benchmarks are biased towards our own sweet spots. This blog seeks only to explain the reasons for the results, not to criticize the HttpCore team.]

    HTTP Server Throughput Limits

    Typically the throughput of a server is going to be limited by the minimum of one of the following factors:

    Network Bandwidth Limitations

    The total network capacity may limit the maximum throughput. If each response is 10KB in size and the network is only capable of 10MB/s, then 1024 requests per second will saturate that network. The HttpCore benchmark used 2048B messages over the localhost network, which essentially has no maximum throughput. So for the modelling of this benchmark, I have assumed a 1GB/s network, which would have a potential maximum throughput of 524,288 requests/sec, if it is not limited by other factors.

    CPU Limitations

    The number of requests that can be processed may be limited by the available CPU power. If each request took 2ms of CPU time to process, then each CPU could only handle 500 requests per second. For the HttpCore benchmark, they had a 4 CPU box and very simple/efficient request handling that took less than 0.018ms per request, which results in a potential maximum throughput of 4*1000/0.018 = 222,222 requests/sec, if
    it is not limited by other factors.

    Connection Limitations

    HTTP typically has 1 request outstanding per connection (except when pipelines are used, which is rare), so the maximum throughput of the server may be limited by the sum of the maximum throughputs of each connection. The maximum throughput of an HTTP connection is mostly governed by the round trip time of each request; for example, if each request takes 10ms in its round trip, then a connection can only handle 100 requests per second. The HttpCore benchmark has requests that take 0.445ms round trip on 25 connections, which results in a potential maximum throughput of 25*1000/0.445 = 56,180 requests/second.
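    As a rough sketch, the three limits above can be expressed in a few lines of Java. The class and method names are illustrative only; the figures plugged into the comments are the ones quoted in this post.

```java
// Sketch of the three server throughput limits described above.
// Names are illustrative, not a real API.
public class ThroughputLimits
{
    // Bandwidth limit: network bytes/sec divided by bytes per response.
    // e.g. a 1GB/s network and 2048B responses: 1073741824 / 2048 = 524,288/s.
    public static double byBandwidth(double networkBytesPerSec, double responseBytes)
    {
        return networkBytesPerSec / responseBytes;
    }

    // CPU limit: total CPU milliseconds available per second divided by
    // CPU ms per request. e.g. 4 CPUs at 0.018ms/request: ~222,222/s.
    public static double byCpu(int cpus, double cpuMsPerRequest)
    {
        return cpus * 1000.0 / cpuMsPerRequest;
    }

    // Connection limit: one outstanding request per connection, bounded by
    // round trip time. e.g. 25 connections at 0.445ms round trip: ~56,180/s.
    public static double byConnections(int connections, double roundTripMs)
    {
        return connections * 1000.0 / roundTripMs;
    }

    // The server throughput is the minimum of the three limits.
    public static double limit(double bandwidth, double cpu, double connections)
    {
        return Math.min(bandwidth, Math.min(cpu, connections));
    }
}
```

    For the HttpCore benchmark figures, the connection limit (~56,180/s) is the smallest of the three, which is why it dominates the analysis below.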

    HttpCore Throughput Limitation

    It can be seen from the analysis above that the HttpCore benchmark throughput is limited to 56,180 requests/second by the number of connections and the round trip time of a request over each connection. More importantly, this limit is numerically sensitive to the specific values chosen for the number of connections and latency. The following chart shows the minimum of the 3 limitations for the HttpCore benchmark against the number of connections and additional request latency (either in the network or the server):

    CPUs: 4
    CPU time per request (ms): 0.018
    Network latency (ms): 0.427
    Max requests/s by CPU: 222,222
    Request size (B): 2048
    Network bandwidth (MB/s): 1024
    Max requests/s by bandwidth: 524,288

    It can be seen that the network bandwidth limitation (524288/s) is never the limiting factor. The CPU limitation (222222/s) is only applicable once the number of connections
    exceeds 125.  At the 25 connections used by the HttpCore benchmark, it can be seen that any extra latency results in a rapid reduction in throughput from almost 60000/s to less than 2000/s.

    The benchmark puts both Jetty and HttpCore on the red (25 connection) curve, but HttpCore is on the absolute left hand side of the curve, while Jetty is a few ms of latency to the right. Thus Jetty, which uses extra latency (for good reasons described below), is heavily punished by this benchmark, because it happens to sit on one of the steepest sections of that graph (although it looks like it could be worse at 125 connections, but I expect some other limitation would prevent HttpCore reaching 222,222/s).

    Realistic Throughput limitations

    The configuration of the HttpCore benchmark does not match well the reality faced by most HTTP servers for which throughput is a concern. Specifically:

    • The localhost network has less than 1ms of round trip latency, when real internet applications must expect at least 10s if not 100s of ms of network latency.
    • A modern browser will open 6 connections to the same host, so 25 connections represent only 4 simultaneous users. The expectation for a loaded HTTP server is that it will see at least 100s if not 1000s of simultaneous connections.
    • Real connections are mostly idle and will hardly ever see a new request <1ms after a response is sent.
    • Java webservers are typically used for dynamic pages that will take more than 0.018ms to generate. If the CPU time per request is increased to 0.15ms per request, then the CPU limitation is reduced to 26,667/s.

    The chart below is updated with these more realistic assumptions:


    CPUs: 4
    CPU time per request (ms): 0.150
    Network latency (ms): 20.000
    Max requests/s by CPU: 26,667
    Request size (B): 2048
    Network bandwidth (MB/s): 100
    Max requests/s by bandwidth: 51,200

    This shows that when realistic assumptions are applied, the throughput is far less sensitive to additional latency. Above 500 connections, the throughput is rapidly limited by the available CPU and is unaffected by any extra latency in the handling of each request. Even at 125 connections, extra latency only slightly reduces throughput. This shows that there is little or no cost associated with increased latency, and thus a server can consider using extra latency if it has a good reason to (see below).
    I invite you to download the spreadsheet used to generate these graphs and experiment with the assumptions, so that you can see that in many (if not most) scenarios, throughput is not significantly sensitive to latency. It is only with the specific assumptions used by HttpCore that latency is a sensitive parameter.

    Why use Latency?

    I have demonstrated that in realistic scenarios (with many connections and some network latency), additional latency in handling a request should not have a significant impact on throughput. So why does Jetty have a higher latency per request than HttpCore?

    The HttpCore NIO server as configured for the benchmark used a single thread per CPU core, each allocated to a select set terminating a proportion of the connections. Each thread reads a request, writes the response and then loops back to the select set, looking for the next connection with an available request. This is very efficient, but only if all requests can be handled without blocking. If handling a request blocks for any reason (e.g. writing a response to a slow client, waiting for a DB, waiting for a synchronized lock, etc.), then all the other requests from the same select set will also be blocked and throughput will be greatly reduced. For a 4 CPU machine, it would only take 4 slow clients or 4 long DB queries to block the entire server and prevent any requests from being handled. The HttpCore benchmark avoids this situation by having simple non-blocking request handlers and no slow clients, networks or databases.

    It is unacceptable in most real HTTP deployments to allow one request to block unrelated requests due to thread starvation. Thus most HTTP servers operate with a thread pool and dispatch the handling of each request to a different thread from the one handling the NIO select set. Since each request is handled in a dedicated thread, it may block without affecting other requests or reducing the throughput of the server.
    When a request is dispatched to a thread pool, it typically waits a few ms in a job queue for a thread to be allocated to handle it. For most realistic scenarios this extra latency has little or no cost and significant benefit, but in the HttpCore benchmark this latency is heavily penalized, as it delays the response and thus delays the load generator sending the next request. Throughput is reduced because the client sends fewer requests, not because the server cannot handle them.
    Also, the benchmark compared the raw HTTP components of HttpCore with the rich servlet environment of Jetty. Jetty will consume some extra CPU/latency to establish the servlet context, which provides many benefits of functionality to the application developer. Jetty could also be configured as a simple HTTP handler and would thus use both less CPU and less latency.
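    As a minimal sketch of this dispatch pattern (not Jetty’s actual internals), a selector thread can hand each request to a thread pool and return immediately to the select set; a plain ExecutorService stands in for a server’s thread pool, and the names here are illustrative:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: the selector thread hands each request off to a pool,
// so a handler that blocks (slow client, DB query) cannot stall other connections.
public class DispatchSketch
{
    private final ExecutorService pool = Executors.newFixedThreadPool(50);

    // Called from the NIO selector thread: hand off and return immediately.
    // The brief wait in the pool's job queue is the extra latency discussed above.
    public void onRequestReady(Runnable requestHandler)
    {
        pool.execute(requestHandler);
    }

    public void shutdown() throws InterruptedException
    {
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

    Because each request runs on its own pooled thread, one request blocking on a database or a slow client only occupies one of the 50 threads instead of stalling a whole select set.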

    Conclusion

    The HttpCore benchmark is essentially comparing apples with oranges. The benchmark is set up mostly to measure the raw speed of the HttpCore HTTP parsing/generating capabilities and does not represent a realistic load test. The Jetty configuration used has been optimized to be a general purpose application server and HTTP server for large numbers of mostly idle connections. Given this disparity, I think it is great that Jetty was
    able to achieve similar and sometimes better performance in some of the scenarios. This shows that Jetty’s own HTTP parsing/generation is no slouch, and that it would be interesting to benchmark Jetty stripped of its thread pool and servlet container. If we find the time, we may provide such a configuration.
    Anybody who really wants to know which server would be faster for them (and whose selection is not guided by the different feature sets) needs to set up their own benchmark with a load generator that produces a traffic profile as close as possible to what their real application will experience.

  • Guide to Jetty Webinar (Thu 8 April)

    Jan Bartel and I will be presenting a "Guide to Jetty" webinar on Thu, Apr 8, 2010, 8:00 AM – 9:00 AM PDT. We’ll present an overview of Jetty and then show some hands-on examples of running Jetty, deploying webapps and coding against the embedded API, plus a cometd demo. We’ll also take questions from the attendees.

    Please register at  http://gotomeeting.com/429603674

  • The Streamlined Life with Jetty

    Johannes Brodwall has posted another great description of his techniques for dealing with Jetty in everyday development life. It was great to finally meet him in person in Oslo recently, and he has definitely thought through what he does and why, which, for your reference, you can find at his blog. And, stay tuned on the monitoring side!

  • Websocket Chat

    The websocket protocol has been touted as a great leap forward for bidirectional web applications like chat, promising a new era of simple comet applications. Unfortunately there is no such thing as a silver bullet and this blog will walk through a simple chat room to see where websocket does and does not help with comet applications. In a websocket world, there is even more need for frameworks like cometd.

    Simple Chat

    A chat is the “helloworld” application of web-2.0, and a simple websocket chat room is included with Jetty 7, which now supports websockets. The source of the simple chat can be seen in svn for the client side and the server side. The key part of the client side is establishing a WebSocket connection:

    join: function(name) {
        this._username = name;
        var location = document.location.toString().replace('http:', 'ws:');
        this._ws = new WebSocket(location);
        this._ws.onopen = this._onopen;
        this._ws.onmessage = this._onmessage;
        this._ws.onclose = this._onclose;
    },

    It is then possible for the client to send a chat message to the server:

    _send: function(user, message) {
        user = user.replace(':', '_');
        if (this._ws)
            this._ws.send(user + ':' + message);
    },

    and to receive a chat message from the server and to display it:

    _onmessage: function(m) {
        if (m.data) {
            var c = m.data.indexOf(':');
            var from = m.data.substring(0, c).replace('<', '&lt;').replace('>', '&gt;');
            var text = m.data.substring(c + 1).replace('<', '&lt;').replace('>', '&gt;');

            var chat = $('chat');
            var spanFrom = document.createElement('span');
            spanFrom.className = 'from';
            spanFrom.innerHTML = from + ': ';
            var spanText = document.createElement('span');
            spanText.className = 'text';
            spanText.innerHTML = text;
            var lineBreak = document.createElement('br');
            chat.appendChild(spanFrom);
            chat.appendChild(spanText);
            chat.appendChild(lineBreak);
            chat.scrollTop = chat.scrollHeight - chat.clientHeight;
        }
    },

    For the server side, we simply accept incoming connections as members:

    public void onConnect(Outbound outbound)
    {
        _outbound = outbound;
        _members.add(this);
    }

    and then for all messages received, we send them to all members:

    public void onMessage(byte frame, String data)
    {
        for (ChatWebSocket member : _members)
        {
            try
            {
                member._outbound.sendMessage(frame, data);
            }
            catch (IOException e)
            {
                Log.warn(e);
            }
        }
    }

    So we are done, right? We have a working chat room – let’s deploy it and we’ll be the next Google chat!! Unfortunately reality is not that simple, and this chat room is a long way short of the kind of functionality that users expect from a chat room – even a simple one.

    Not So Simple Chat

    On Close?

    With a chat room, the standard use-case is that once you establish your presence in the room, it remains until you explicitly leave the room. In the context of webchat, that means that you can send and receive chat messages until you close the browser or navigate away from the page. Unfortunately the simple chat example does not implement this semantic, because the websocket protocol allows for an idle timeout of the connection. So if nothing is said in the chat room for a short while, then the websocket connection will be closed, either by the client, the server or even an intermediary. The application will be notified of this event by the onClose method being called.
    So how should the chat room handle onClose? The obvious thing to do is for the client to simply call join again and open a new connection back to the server:

    _onclose: function() {
        this._ws = null;
        this.join(this._username);
    }

    This indeed maintains the user’s presence in the chat room, but is far from an ideal solution since every few idle minutes the user will leave the room and rejoin. For the short period between connections, they will miss any messages sent and will not be able to send any chat themselves.

    Keep Alives

    In order to maintain presence, the chat application can send keep-alive messages on the websocket to prevent it being closed due to an idle timeout. However, the application has no idea at all what the idle timeouts are, so it will have to pick some arbitrary frequent period (e.g. 30s) to send keep-alives and hope that it is less than any idle timeout on the path (more or less as long-polling does now). Ideally a future version of websocket will support timeout discovery, so it can either tell the application the period for keep-alive messages or even send the keep-alives on behalf of the application.
    Unfortunately keep-alives don’t avoid the need for onClose to initiate new websockets, because the internet is not a perfect place, and especially with wifi and mobile clients, sometimes connections just drop. It is a standard part of HTTP that if a connection closes while being used, GET requests are retried on new connections, so users are mostly insulated from transient connection failures. A websocket chat room needs to work with the same assumption, and even with keep-alives, it needs to be prepared to reopen a connection when onClose is called.
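    A sketch of such a keep-alive sender in Java terms; the Connection interface and the message format are assumptions for illustration (a real client would send on its websocket):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of the keep-alive strategy described above: send a no-op message on a
// fixed, arbitrarily chosen period and hope it is shorter than any idle timeout
// on the path. Connection and the payload are illustrative assumptions.
public class KeepAlive
{
    public interface Connection { void send(String message); }

    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    // Start sending keep-alives every periodMs (e.g. 30000 for 30s).
    public void start(Connection connection, long periodMs)
    {
        scheduler.scheduleAtFixedRate(
            () -> connection.send("{\"keepalive\":true}"),
            periodMs, periodMs, TimeUnit.MILLISECONDS);
    }

    // Stop on orderly shutdown or when onClose fires.
    public void stop()
    {
        scheduler.shutdownNow();
    }
}
```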

    Queues

    With keep-alives, the websocket chat connection should mostly be a long lived entity, with only the occasional reconnect due to transient network problems or server restarts. Occasional loss of presence might not be seen as a problem, unless you’re the dude that just typed a long chat message on the tiny keyboard of your vodafone360 app, or instead of chat you are playing on chess.com and you don’t want to abandon a game due to transient network issues. So for any reasonable level of quality of service, the application is going to need to “pave over” any small gaps in connectivity by providing some kind of message queue in both the client and the server. If a message is sent during a period when there is no websocket connection, it needs to be queued until the new connection is established.
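    A minimal sketch of such a queue, assuming an illustrative Sender interface in place of a real websocket API:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch of the client-side queue described above: messages sent
// while the websocket is down are buffered, then flushed once it reopens.
// The Sender interface and all names are assumptions, not a real API.
public class QueueingSender
{
    public interface Sender { void send(String message); }

    private final Queue<String> queue = new ArrayDeque<>();
    private Sender connection; // null while there is no open websocket

    public void send(String message)
    {
        if (connection != null)
            connection.send(message);
        else
            queue.add(message); // pave over the gap in connectivity
    }

    public void onOpen(Sender newConnection)
    {
        connection = newConnection;
        while (!queue.isEmpty())          // flush everything queued while offline
            connection.send(queue.poll());
    }

    public void onClose()
    {
        connection = null;
    }

    public int queuedCount() { return queue.size(); }
}
```

    The server side needs the mirror image: a per-user queue that holds messages addressed to a user whose connection has dropped, until they reconnect or their presence times out.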

    Timeouts

    Unfortunately, some failures are not transient, and sometimes a new connection will not be established. We can’t allow queues to grow for ever, nor pretend that a user is present long after their connection is gone. Thus both ends of the
    chat application will also need timeouts, and a user will not be seen to have left the chat room until they have had no connection for the period of the timeout, or until an explicit leaving message is received.
    Ideally a future version of websocket will support an orderly close message, so the application can distinguish between a network failure (and keep the user’s presence for a time) and an orderly close as the user leaves the page (and remove the user’s presence).

    Message Retries

    Even with message queues, there is a race condition that makes it difficult to completely close the gaps between connections. If the onClose method is called very soon after a message is sent, then the application has no way to know if that close happened before or after the message was delivered. If quality of service is important, then the application currently has no option but to have some kind of per-message or periodic acknowledgement of message delivery. Ideally a future version of websocket will support orderly close, so that delivery can be known for non-failed connections and the complication of acknowledgements can be avoided unless the highest quality of service is required.
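    A sketch of such per-message acknowledgement tracking; the names, the wire format implied, and the idea of retrying everything unacknowledged on the next connection are illustrative assumptions:

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the acknowledgement scheme described above: each outgoing message
// gets an id and stays pending until the server acks that id. Anything still
// pending when onClose fires has unknown delivery and can be retried.
public class AckTracker
{
    private final Map<Long, String> pending = new LinkedHashMap<>();
    private long nextId;

    // Record an outgoing message; returns the id to send along with it.
    public long sent(String message)
    {
        long id = nextId++;
        pending.put(id, message);
        return id;
    }

    // The server acknowledged delivery of this id.
    public void acked(long id)
    {
        pending.remove(id);
    }

    // Messages whose delivery is unknown, in send order (retry candidates).
    public Collection<String> unacked()
    {
        return pending.values();
    }
}
```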

    Backoff

    With onClose handling, keep-alives, message queues, timeouts and retries, we will finally have a chat room that can maintain a user’s presence while they remain on the web page. But unfortunately the chat room is still not complete, because it needs to handle errors and non-transient failures. Some of the circumstances that need to be handled include:

    • If the chat server is shut down, the client application is notified simply by a call to onClose rather than an onOpen call. In this case, onClose should not just reopen the connection, as a 100% CPU busy loop will result. Instead the chat application has to infer that there was a connection problem and at least pause a short while before trying again – potentially with a retry backoff algorithm to reduce retries over time. Ideally a future version of websocket will allow more access to connection errors, as the handling of no-route-to-host may be entirely different to the handling of a 401 unauthorized response from the server.
    • If the user types a large chat message, then the websocket frame sent may exceed some resource limit on the client, server or an intermediary. Currently the websocket response to such resource issues is to simply close the connection. Unfortunately for the chat application, this may look like a transient network failure (coming after a successful onOpen call), so it may just reopen the connection and naively retry sending the message, which will again exceed the max message size, and we can lather, rinse and repeat! Again it is important that any automatic retries performed by the application are limited by a backoff timeout and/or max retries. Ideally a future version of websocket will be able to send an error status as something distinct from a network failure or idle timeout, so the application will know not to retry errors.
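    A sketch of such a retry backoff: the delay doubles on each failed reconnect attempt up to a cap, and resets after a successful open. The constants and names are illustrative.

```java
// Exponential backoff sketch for reconnect attempts, as suggested above.
// Doubling with a cap prevents the 100% CPU reconnect busy loop while still
// recovering quickly when the server comes back.
public class Backoff
{
    private final long initialMs;
    private final long maxMs;
    private long currentMs;

    public Backoff(long initialMs, long maxMs)
    {
        this.initialMs = initialMs;
        this.maxMs = maxMs;
        this.currentMs = initialMs;
    }

    // Delay to wait before the next reconnect attempt.
    public long nextDelayMs()
    {
        long delay = currentMs;
        currentMs = Math.min(currentMs * 2, maxMs); // double, but never exceed the cap
        return delay;
    }

    // Call on a successful onOpen: start small again.
    public void reset()
    {
        currentMs = initialMs;
    }
}
```

    With initial and max delays of, say, 1s and 30s, repeated failures produce delays of 1s, 2s, 4s, 8s, 16s, 30s, 30s, ... instead of an immediate retry loop.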

    Does it have to be so hard?

    The above scenario is not the only way that a robust chat room could be developed. With some compromises on quality of service and some good user interface design, it would certainly be possible to build a chat room with a less complex usage of a WebSocket. However, the design decisions represented by the above scenario are not unreasonable even for chat, and are certainly applicable to applications needing a better QoS than most chat rooms.
    What this blog illustrates is that there is no silver bullet, and that WebSocket will not solve many of the complexities that need to be addressed when developing robust comet web applications. Hopefully features such as keep-alives, timeout negotiation, orderly close and error notification can be built into a future version of websocket, but it is not the role of websocket to provide the more advanced handling of queues, timeouts, reconnections, retries and backoffs. If you wish to have a high quality of service, then either your application or the framework that it uses will need to provide these features.

    Cometd with Websocket

    Cometd version 2 will soon be released with support for websocket as an alternative transport to the currently supported JSON long polling and JSONP callback polling. Cometd supports all the features discussed in this blog and makes them available transparently to browsers with or without websocket support. We are hopeful that websocket will give cometd even better throughput and latency than the already impressive results achieved with long polling.


  • Webinar on reliable messaging with Jetty, Cometd and ActiveMQ

    Jan Bartel (Intalio) and Daan Van Santeen (Progress FUSE) will be giving a series of live webinars on how Jetty, Cometd and ActiveMQ can be used to provide a reliable messaging platform to the browser.

    • What Jetty is and how its CometD Bayeux implementation provides messaging to the browser
    • What Apache ActiveMQ is and how its JMS implementation allows for reliable messaging
    • What the JMS and Bayeux messaging protocols are and how they complement each other
    • How these technologies can be combined to create a flexible and reliable platform to implement messaging from a back-end system to a browser

    The webinar includes a simple demonstration of fault-tolerant failover from the browser.


    Click to register: