Because I consider some of the “features” of the latest 2.4 specification rather dubious or expensive,
I was reluctant to implement them as core features of Jetty. Instead, Jetty 5.1 will use optional
filters to provide JSR77 statistics, wrap under dispatch semantics, request events and request attribute
events.
Thus Jetty is able to support the features needed for specification compliance, yet they can
be removed, with no lingering classpath dependencies, simply by removing the filters.
While this demonstrates the power of filters, it has also revealed a missing filter mapping in
the specification. There is no way to define a filter to intercept ALL dispatches in a webapplication.
A Filter mapped to / for REQUEST,FORWARD,INCLUDE and ERROR comes close, but is unable to filter
named dispatches. Named dispatches can be filtered with filters mapped to a servlet name, but it
is not possible to generically map a filter to all servlets. Something for the 2.5 spec…
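For reference, the closest a 2.4 deployment descriptor can get is something like the following (the filter name here is illustrative); even this mapping cannot see named dispatches:

```xml
<filter-mapping>
  <filter-name>stats</filter-name>
  <url-pattern>/*</url-pattern>
  <dispatcher>REQUEST</dispatcher>
  <dispatcher>FORWARD</dispatcher>
  <dispatcher>INCLUDE</dispatcher>
  <dispatcher>ERROR</dispatcher>
</filter-mapping>
```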
-
Missing Filter Mapping.
-
Next Generation Jetty and Servlets
Jetty 5.0.0 is out the door and the 2.4 servlet spec is implemented.
So what’s next for Jetty and what’s next for the servlet API? It’s been a long journey for
Jetty from its birth in late 1995 to the 5.0.0 release. When the core of Jetty was written,
there was no J2ME, J2EE or even J2SE, no servlet API and no non-blocking IO; multiple-CPU
machines were rare and no JVM used them. The situation is much the same for the servlet
API, which has grown from a simple protocol handler to a core component architecture for
enterprise solutions.
While these changing environments and requirements have mostly been handled well, the
results are not perfect: I have previously blogged about the servlet API
problems
and Jetty is no longer best of breed when it comes to raw speed or features.
Thus I believe the mid-term future for Jetty and the Servlet API should involve a bit more
revolution than evolution. For this purpose the
JettyExperimental(JE) branch
has been created and is being used to test ideas to greatly improve the raw HTTP performance
as well as the application API. This blog introduces JE and some of my ideas for how Jetty and
the servlet API could change.
Push Me Pull You
At the root of many problems with the servlet API is that it is a pull-push API, where the servlet
is given control and pulls headers, parameters and other content from the request object before pushing
response code, headers and content at the response object. This style of API, while very convenient for
println-style dynamic content generation, has many undesirable consequences:

- The request headers must be buffered in complex/expensive hash structures so that the application can access them in arbitrary order. One could ask why application code should be handling HTTP headers anyway…
- The application code contains the IO loops to read and write content. These IO loops are written assuming the blocking IO API.
- The pull-push API is based on stream, reader and writer abstractions, which makes it impossible for the servlet application code to use efficient IO mechanisms such as gather-writes or memory-mapped file buffers for static content.
- The response headers must be buffered in complex/expensive hash structures so that applications can set and reset them in arbitrary order. One could ask why application code should be writing HTTP headers anyway…
- The application code needs to be aware of HTTP codes and headers. The API itself provides no support for separating the concerns of content generation and content transport.
From a container implementer’s point of view, it would be far more efficient for
the servlet API to be push-pull, where the container pushes headers, parameters and content
into the API as they are parsed from a request and then pulls headers and content from the
application as they are needed to construct the response.
This would remove the need for applications to do IO, for the additional buffering, for the arbitrary
ordering, and for dealing with application developers who don’t read HTTP RFCs.
Unfortunately a full push-pull API would also push an event driven model onto the application,
which is not an easy model to deal with nor suitable for the simple println style of
dynamic content generation used for most “hello world” inspired servlets.
The challenge of Jetty and servlet API reform is to allow the container to be written in
the efficient push-pull style, but to retain the fundamentals of pull-push in the application
development model we have come to know and live with. The way to do this is to change the
semantic level of what is being pushed and pulled, so that the container is written to
push-pull HTTP headers and bytes of data, but the application is written to pull-push content
in a non-IO style.
Content IO
Except for the application/x-www-form-urlencoded
mime-type, the application must perform its own IO to read content from the request and to
write content to the response. Due to the nature of the servlet API and threading model,
this IO is written assuming blocking semantics.
Thus it is difficult to apply alternative IO methods, such as NIO.
Unfortunately the event-driven nature of non-blocking IO is incompatible with the servlet
threading model, so it is not possible to simply ask developers to start writing IO assuming
non-blocking semantics or using NIO channels.
The NIO API cannot be effectively used without direct access to the low level IO classes, as
low level API is required to efficiently write static content using a file
MappedByteBuffer to a
WritableByteChannel
or to combine content
and HTTP header into a single packet without copying using a
GatheringByteChannel.
The true power of the NIO
API cannot be abstracted into InputStreams and OutputStreams.
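To make the gather-write point concrete, here is a small self-contained sketch (modern Java for brevity, not Jetty code) that sends an HTTP header and a memory-mapped file body in a single gathering write:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

public class GatherWriteDemo {

    /** Writes a tiny HTTP response (header + mapped file body) with one
     *  gathering write and returns the number of bytes written. */
    static long gatherWrite() throws Exception {
        // Stand-in for a piece of static content.
        File content = File.createTempFile("body", ".txt");
        content.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(content)) {
            out.write("Hello, world".getBytes(StandardCharsets.ISO_8859_1));
        }

        // Map the file read-only: the body is never copied onto the Java heap.
        MappedByteBuffer body;
        try (RandomAccessFile raf = new RandomAccessFile(content, "r")) {
            body = raf.getChannel().map(FileChannel.MapMode.READ_ONLY, 0, raf.length());
        }

        ByteBuffer header = ByteBuffer.wrap(
                ("HTTP/1.1 200 OK\r\nContent-Length: " + body.remaining() + "\r\n\r\n")
                        .getBytes(StandardCharsets.ISO_8859_1));

        // A real server would write to a SocketChannel; a FileChannel is also
        // a GatheringByteChannel, which is enough for this demo.
        File response = File.createTempFile("response", ".http");
        response.deleteOnExit();
        try (FileChannel ch = new FileOutputStream(response).getChannel()) {
            // One gathering write combines header and body without copying.
            return ch.write(new ByteBuffer[] { header, body });
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(gatherWrite());
    }
}
```

On a SocketChannel the same call lets the kernel assemble header and body into one packet, with the JVM never copying the file contents into its heap.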
Thus to use NIO, the servlet API must either expose these low levels (a bad idea – as NIO might not
always be the latest and greatest) or take away content IO responsibilities from the application
developers.
The answer is to take away from the application servlets the responsibility for performing
IO. This has already been done for application/x-www-form-urlencoded, so
why not let the container handle the IO for text/xml, text/html etc.
If the responsibility for reading and writing bytes (or characters) was moved to the container,
then the application servlet code could deal with higher-level content objects
such as org.w3c.dom.Document, java.io.File or java.util.HashMap. Such a container mechanism
would avoid the current need for many webapps to provide their own implementation of a
multipart request class or
Compression filter.
If we look at the client side of HTTP connections, the
java.net package provides the
ContentHandlerFactory mechanism so that
the details of IO and parsing content can be hidden behind a simple call to
getContent(). Adding a similar mechanism (and
a setContent() equivalent) to the servlet API would move the IO responsibility
to the container. The container could push-pull bytes from the content factories and
the application could pull-push high-level objects from the same factories.
Note that a content based API does not preclude streaming of content or require that large
content be held in memory. Content objects passed to and from the container could include
references to content (eg File), content handlers (JAXP handler) or even Readers, Writers,
InputStreams and OutputStreams.
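To make the idea concrete, here is a minimal, entirely hypothetical sketch of such a mechanism (none of these names exist in the servlet API): the container owns a registry of per-type writers and performs all the byte-level IO itself, while the application only ever hands over a high-level object.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a container-side registry that turns high-level
 *  content objects into bytes, so the application never does IO itself. */
public class ContentWriters {

    interface ContentWriter<T> {
        /** Serializes content to out and returns the content type. */
        String write(T content, OutputStream out) throws IOException;
    }

    private final Map<Class<?>, ContentWriter<?>> writers = new HashMap<>();

    <T> void register(Class<T> type, ContentWriter<T> writer) {
        writers.put(type, writer);
    }

    @SuppressWarnings("unchecked")
    <T> String serialize(T content, OutputStream out) throws IOException {
        ContentWriter<T> w = (ContentWriter<T>) writers.get(content.getClass());
        if (w == null)
            throw new IOException("no writer for " + content.getClass());
        return w.write(content, out);
    }

    public static void main(String[] args) throws IOException {
        ContentWriters container = new ContentWriters();
        // The container, not the servlet, knows how a Map becomes bytes.
        container.register(HashMap.class, (map, out) -> {
            out.write(map.toString().getBytes(StandardCharsets.UTF_8));
            return "text/plain; charset=UTF-8";
        });

        HashMap<String, String> model = new HashMap<>();
        model.put("greeting", "hello");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // The application would just call something like response.setContent(model);
        String type = container.serialize(model, out);
        System.out.println(type + ": " + out.toString("UTF-8"));
    }
}
```

Because the container drives the writer, it can schedule the byte generation against non-blocking IO and choose the headers itself, which is exactly the separation argued for above.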
HTTP Headers
As well as the IO of content, the application is currently responsible for handling the
associated meta-data such as character and content encoding, modification dates and caching control.
This meta-data is mostly well specified in HTTP and MIME RFCs and could be best handled by
the container itself rather than by the application or the libraries bundled with it. For
example it would be far better for the container to handle gzip encoding of content directly
to/from its private buffers rather than for webapps to bundle their own CompressFilter.
Without knowledge of which HTTP headers an application uses, or in what order they
will be accessed, the container is forced to parse incoming requests into expensive hashtables
of headers. The vast majority of applications do not deal with most of the headers in
an HTTP request. Consider the following request from mozilla-firefox:

GET /MB/search.png HTTP/1.1
Host: www.mortbay.com
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040614 Firefox/0.8
Accept: image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
keep-alive: 300
Connection: keep-alive
Referer: http://localhost:8080/MB/log/
Cookie: JSESSIONID=1ttffb8ss1idk
If-Modified-Since: Fri, 21 Nov 2003 16:59:29 GMT
Cache-Control: max-age=0
An application is likely to make only indirect usage of the
Host and Cookie
headers, and perhaps direct usage of the If-Modified-Since and Accept-Encoding
headers. Yet all these headers and values are available via the HttpServletRequest object, to be pulled
by the application at any time during request processing. Expensive hashmaps are created,
and values received as bytes either have to be stringified or their buffers kept aside for later
lazy evaluation.
If the application was written at a content level, then most (if not all) HTTP header
handling could be performed by the content factories. For example, if given an org.w3c.dom.Document
to write, the container could set the HTTP headers for a content type of text/xml, with
an acceptable charset and encoding selected by server configuration and the request headers.
Once the headers are set, the byte content can be generated accordingly by the container,
but scheduled so that excess buffering is not required and non-blocking IO can be done.
Unfortunately, not all headers will be able to be handled directly from the content objects.
For example, If-Modified-Since headers could be handled for File content objects,
but not for an org.w3c.dom.Document. So a mechanism for the application to communicate additional
meta-data will need to be provided.
Summary and Status
JettyExperimental now implements most of HTTP/1.1 in a push-pull architecture that works with
either bio or nio. When using nio, gather-writes are used to combine header and content into
a single write, and static content is served directly from mapped file buffers. An advanced
NIO scheduler avoids many of the
NIO problems
inherent in a producer/consumer model.
Thus JE is ready as a platform to experiment with the content API ideas introduced
above. I plan to initially work toward a pure content based application API and thus to
discover what cannot be easily and efficiently fitted into that model. Hopefully what
will result is lots of great ideas for the next generation servlet API and a great HTTP
infrastructure for Jetty6.
-
Servlet Performance Report – The Jetty Response.
Christopher Merrill of Web Performance Inc has released a report on Servlet Performance, which included Jetty.
Firstly, it is good to see somebody again tackling the difficult task of benchmarking servlet containers. It is a task that will always draw criticism, and the results will always be disputed.
The basic methodology of the test is to gradually increase load on the servers tested and chart their performance over time. The results in a nutshell were that all servers did pretty much the same up to a point. One by one, the servers reached a limit and their performance degraded.
Unfortunately, Jetty was one of the first servers to break ranks and display degraded performance. This, however, is not as disturbing as it might first seem. The test used the default configuration for all the servers, and these configurations greatly differ. For example, Jetty by default is configured with 50 threads maximum, while tomcat has 150 by default. This is a significant flaw in the study and makes it all but useless for comparing the maximum performances of the different containers. In fact it is amazing that Jetty kept pace with the others for so long, considering it had limited resources.
The bad news for Jetty is that, even if you ignore the other containers in the study, the shape of the curve is not nice. When a server hits a configured limit it should gracefully degrade its performance in the face of more load. Unfortunately, when presented with the load in this study, Jetty’s degradation is a bit less than graceful. We have not seen this ugly curve in our own testing, and reports from the field are mostly about grace under fire. So hopefully this is an artifact of the artificial load used in the test. However, this test does show that there is at least one load profile that causes Jetty grief once resources are exhausted. So we have some work to do.
In summary, the study is interesting in that it at least shows that most servlet containers are approaching terminal velocity for less than pathological loads. But if you want to maximise throughput for your web application, then don’t use the default configuration. Instead read the Jetty Optimisation Guide.
-
Jetty on the Mort Bay host.
This case study describes how
Jetty
has been used on our own sites, to show that we are “eating our own dogfood”.
While there is nothing revolutionary in this blog, it is sometimes good to see
examples of the ordinary and I believe it is a good example
of how the simplicity and flexibility of Jetty has allowed simple things to
be done simply.

The jetty host is donated to the Jetty project by
Mort Bay Consulting
and
InetU, and the machine is
now not of the highest spec: a 500MHz Celeron with 128MB running FreeBSD at 1061
BogoMIPS (about half the speed of my aging notebook). On this
machine we run over 13 web contexts for 6 domains in a single
Jetty server with a 1.4.1-b21 Sun JVM using the latest release of Jetty 5.

The Sites:
The websites run by the server are for diverse purposes and
are implemented using diverse technologies:

/jetty/* – The Jetty site is implemented as a servlet that wraps a look and feel around static content and is deployed as an unpacked web application.
/demo/* – The demo context is a custom context built with a collection of Jetty handlers using the java API, called from the Jetty configuration XML.
/servlets-examples/* – The jakarta servlet examples, deployed and run as a packed WAR.
/jsp-examples/* – The jakarta JSP examples, deployed precompiled as a packed WAR.
/javadoc/* – The jetty javadoc in a jar file deployed as a webapplication. Because there is no WEB-INF structure, the jar is served purely as static content.
/cgi-bin/* – A context configured to run the Jetty CGI servlet.
www.mortbay.com/ – The Mort Bay site is a look and feel servlet wrapping static content, deployed as a standard webapplication.
www.mortbay.com/images/holidays – A foto diary site, created by deploying a directory structure as a webapplication so that its static content is served. A HTAccess handler is used to secure access to some areas (nothing exciting I’m afraid).
www.mortbay.com/MB – This blog site, which uses the excellent Blojsom web application with velocity rendering and log4j.
www.collettadicastelbianco.com/ – A site about the Italian borgo telematico in which I sometimes live and work. Written in dynamic JSP2.0 with tag files and heavy use of Servlet 2.4 Filters as aspects. Responsible for me finally liking JSPs.
www.jsig.com/ – Static content web application.
www.safari-afrika.com/ – Static content web application.
www.ncc.com.au/ – Static content web application.

Configuration:
The server is configured from a single Jetty XML file using explicit
adding of all contexts rather than automatic discovery. Doing this is good for security, but it also allows extra configuration to
be added for each context, such as customized logging.

While all domains have unique IP addresses, the site is actually configured to treat them as virtual hosts. This allows simpler
configuration of a single set of listeners for all contexts. A default root context is also configured to redirect requests without
a host header to an appropriate context.

Two listeners (http 8080 & https 8443) are configured using a shared thread pool of max 30 threads. Ipchains is used to redirect
ports 80 and 443 to these listeners, and the server is run as an unprivileged user.

Two authentication realms are defined, for the jetty demo and the jakarta servlet demo. Both use simple property files.
The realm name is used to map each realm to its webapplications.

Logging:

Jetty 5 uses commons logging, plus the Jetty 4 logger wrapped as a commons logger. This is configured in the jetty.xml
to log to a file that is rolled over daily; historic files are kept for 90 days. A specific logger instance is
declared for the classes from the colletta web site. This logger is also mapped to the context name so that
ServletContext.log calls are also directed to it by jetty, and all the log information generated by the app is in one file.

NCSA request logs are defined for all the main contexts, and a catch-all request log is defined for those without.
The webalizer utility is used to generate regular reports of our loads on each context.

Conclusion:

The whole thing is kept running using the
Bernstein daemontools supervise program, which calls “java -jar start.jar”
with no special parameters.

So really nothing unusual to see here, just business as usual, so move on…
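A jetty.xml along these lines would express that setup. This fragment is reconstructed from memory of the Jetty 4/5 configuration format, so treat the element and method names as illustrative rather than authoritative:

```xml
<Configure class="org.mortbay.jetty.Server">
  <!-- One shared listener; ipchains forwards the privileged ports here -->
  <Call name="addListener">
    <Arg>
      <New class="org.mortbay.http.SocketListener">
        <Set name="Port">8080</Set>
        <Set name="MaxThreads">30</Set>
      </New>
    </Arg>
  </Call>
  <!-- Contexts are added explicitly (no auto-discovery), per virtual host -->
  <Call name="addWebApplication">
    <Arg>www.mortbay.com</Arg>  <!-- virtual host -->
    <Arg>/</Arg>                <!-- context path -->
    <Arg>./webapps/mortbay</Arg>
  </Call>
</Configure>
```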
-
Living la vita JSP.
I have finally seen the JSP light! I think I actually like them and
I will have to stop pouring scorn upon them (and all who use them)!
So maybe 2004 is a bit late to be blogging about the wonders of JSPs, but
the agent of my conversion has been the new JSP 2.0 combined with the latest
2.4 servlet specification. Together with a bit of careful thought, the new
features result in a capable MVC web framework directly on a raw servlet container, without the need for a heavyweight framework like struts.

Living “la vita open source” as I do, I have recently moved to a telecommuting
village on the Italian riviera: Colletta
di Castelbianco. I undertook to redevelop the village’s
website, which was a rushed job
that needed an internationalised wiki wrapped around a booking engine.
As there are too many web frameworks out there to evaluate, I decided to try to do
without.

My basic complaint against JSPs has always been that they do not really
separate concerns very well. In fact JSPs originally munged concerns into
a single file of unreadable HTML and java code together.
The later addition of custom
tags improved the principle of JSPs, but the practice was still a
mess of descriptors, classes and way too much <% } %> going on.

JSP2.0 and 2.4 Servlets have changed all that. They finally provide the features
needed to develop parameterized HTML in modules that are well separated from
the business and implementation logic. The key new features are:

- Filters: Added in 2.3 and refined in 2.4, Filters allow a form of aspect oriented programming to be used within a webapp. I have used separate filters to capture each of the major concerns of the Colletta site: Internationalisation, Authentication, Navigation, generic POST handling and specific business logic filters derived from the generic POST filter. Filters provide an easy way of adding logic to a webapp, without the need for a Controller servlet and additional configuration mappings. Why have yet another xml descriptor file mapping controllers to URLs when web.xml can now do that for you!

- Expressions: At first glance, JSP expressions look like a bit of a ${novelty + act}. But they provide a vital link between logic and presentation that avoids much boilerplate syntax. Expressions provide a search mechanism that looks up request, parameter, session and context scopes, and hides all the null checking, bean methods, reflection and map lookups that may otherwise end up in the JSP. Expressions finally provide an elegant way of parameterising both HTML fragments and calls to custom tags.

- Tag Files: Modularity, encapsulation and reuse can now be had in the presentation layer without the hassle of descriptors, helper classes and bizarre over-complex APIs. No more is a tag developer forced to use java for generating reusable HTML. The simplicity of tag files encourages the creation of a tag almost as soon as a single line of HTML is repeated. Development of a set of custom form input widgets is now trivial, and JSP expressions allow excellent parameterisation of the tags.

The Colletta site has been internationalised with a 167-line LocaleFilter. Its initial
job is to look at requests and to determine the locale for them, either from
the session, cookies set previously or from HTTP accept headers. Once the locale
is determined for the request, the input and output encodings are handled and
the filter can forward the request to a localised resource. Thus a request for logo.gif
may be forwarded to logo_it.gif or logo_en.gif. More importantly, because 2.4 filters can
be applied to request dispatches, this filter is in the path of a
<jsp:include page="text.jsp"/> and can include text_it.jsp instead.
Thus layout JSPs can simply include localised content.

But localisation cannot be done simply by bulk includes of content. Thus the PageFilter
uses the locale and the requested resource to select a localised property file
for the request (eg page_it.properties in the same directory as the requested
resource). The resulting property file is combined with those from all ancestor
directories, cached, and set as the “prop” attribute on the request. The presentation
JSPs can thus access localised properties via JSP expressions like
${prop.name}. This allows simple localisation of complex layouts and error
messages without making the HTML unreadable.

As a final example, I will show you the JSP code that renders the discount field
in the booking form:

<td ...>
  ${prop.discount}:
</td>
<td ...>
  <tag:curInput name="discount"
                value="${reservation.discount}"
                readonly="${!user.managerOf[reservation.aptId]}"/>
</td>

This input field is preceded with a localised label and then calls the tag file
for currency Input to render the actual HTML input field. It is passed the current
value of the discount from the reservation option that the BookingFilter has
put in the session. Also passed is the readonly flag that is true if the current
user (placed in the request by the UserFilter) is not the manager of the
apartment.
The power here is not in the tag file
(whose details I leave to your imagination), but in the way that JSP expressions allow
complex data, derived in separate logic, to be combined and passed to parameterise modules
of HTML.

Web frameworks – who needs them!
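The localised-resource forwarding described above (logo.gif becoming logo_it.gif) is, at heart, simple path rewriting. A minimal sketch in plain Java (no servlet API; the names are mine, not the site’s code):

```java
/** Illustrative sketch: map a requested path to its localised variant,
 *  e.g. /images/logo.gif + "it" -> /images/logo_it.gif. */
public class LocalePaths {

    static String localise(String path, String locale) {
        int dot = path.lastIndexOf('.');
        if (dot < 0)
            return path + "_" + locale;  // no extension: just append the locale
        // Insert "_locale" before the file extension.
        return path.substring(0, dot) + "_" + locale + path.substring(dot);
    }

    public static void main(String[] args) {
        System.out.println(localise("/images/logo.gif", "it")); // /images/logo_it.gif
        System.out.println(localise("/text.jsp", "en"));        // /text_en.jsp
    }
}
```

A real filter would first check that the localised resource exists, falling back to the unlocalised name before forwarding.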
-
WADI Web Application Distribution Infrastructure.
The problem with the current breed of HTTP servlet session distribution mechanisms
is that they do not scale well. I have just tried the first release of
WADI ‘Web Application Distribution Infrastructure’, and it
shows great promise as a next-generation session distribution mechanism that can be used for
simple load balancing, as well as huge multi-redundant clusters.

Web applications are put into clusters with one or more of the following aims:
- Scalability: to use N nodes to increase throughput almost N times.
- Availability: to use N nodes to ensure that after N-1 failures, requests will find at least 1 node running the application.
- Fault Tolerance: to use N nodes to ensure that after N-1 failures, the results of previous requests are correctly available to the application.

Unfortunately, these aims are not always mutually compatible, nor completely understood
by those who want to implement a cluster. Also, the components that are used to make a cluster
(load balancers, session replicators, EJB replications, distributed databases) all have
different focuses with respect to these aims. Thus it is important to remember why you
want a cluster, and not to let the aims of a single component dominate your own.

The current crop of load balancers (eg.
mod_jk,
Pound and IP routers) are very good at scalability
and quickly distribute the HTTP requests to a node in a cluster. But they are not so good
at fault tolerance as once a failure occurs, the mechanisms for selecting the correct node for
a request are replaced by a “pick a node, any node” approach.

The currently available
HTTP session distribution mechanisms
have mostly been written with fault tolerance in mind. When a node
receives some session data, it immediately attempts to replicate it somewhere else. Thus the session is
broadcast on the network or persisted in a database, and significant extra load is generated. Because
load balancers cannot be trusted to select the correct node, the session is often distributed to
all nodes. Thus in an N node cluster, every node must store and broadcast its own sessions while
receiving and storing the sessions from N-1 other nodes. This is great for
fault tolerance but a disaster for scalability. Rarely is 1+1>1, let alone 1+1+1+1>1, and throughput
is inversely proportional to cluster size. Worse still, the complexity of the mechanism often
results in more failures and less availability.

It is difficult to achieve scalability and/or availability with a session distribution mechanism
that has been designed for fault tolerance. Thus, many clusters do not use session distribution.
Unfortunately this has its own problems, as load balancers are not perfect at sticking a session
to a node in a cluster. For example, if IP stickiness is used and the client is on AOL, then their
source IP can change mid session and the load balancers will route the requests to the wrong node.
To handle the imperfections of load balancers, a subset of session distribution is needed – namely
session migration.

Session migration is the initial focus of WADI, so
that it can provide scalability and availability to match the available load balancers.
However, WADI’s extensible architecture has also been designed for fault tolerance, so eventually
it will be able to handle all the concerns of clustering Servlet sessions.

WADI has been simply integrated with the
Jetty and
Tomcat Servlet containers. For normal operation, it adds very
little overhead to session access: only the cost of synchronizing
requests to the session and of some AOP interceptors. Expensive mechanisms such as migration or
persistence are only used when a node has been shut down or a load balancer has directed the
request to a different node.

When a WADI node receives a request for which it has no session data, it can ask the cluster
for the location of the session. This can either be on another node or in persistent storage (after
a timeout or graceful node shutdown). The request can then be directed to the correct node, or the session can
be migrated to the current node.

WADI is still alpha software, but is actively being developed by
Julian Gosnell
of Core Developers Network and should soon be included with
Jetty 5 and Geronimo.
-
A Shared Jetty Server in a College Environment
When I was systems administrator for the University of North Carolina at
Charlotte department of Computer Science, I saw the need to establish an
environment for our students to experiment with servlets.
The goal was to provide a centralized system for our students to
deploy their servlets / web-applications on without having to shoulder
the additional burden of administrating their own web / servlet containers.
At the time, the department of computer science was a partner in a heavily
centralized workstation-based computing system modeled after MIT’s Project
Athena, using AFS as an enterprise filesystem. Having such a filesystem
solved the problem of getting the student’s code + web-application data
to the servlet container — just save it in a readable portion of the
student’s AFS home directory, and the servlet server can pick it up
and go.

I developed a workable solution with Jetty version 3. I found the internal
architecture of Jetty to follow the ‘as simple as things should be, but
no simpler’ rule, allowing me, a time-constrained sys admin with Java
skills, to squeeze off the project in the time allowed so that courses could begin to utilize the service. Jetty’s native extensibility allowed me
to easily extend its functionality, allowing the students to remote-deploy
and administrate their own servlet / JSP code on the machine via webspace
gestures.
Implementation
The core bits of this was a new HttpHandler implementation which acted as
the main controller. In Jetty 3, HttpHandlers are stacked within
HandlerContext objects, which are mapped to URI patterns within the
Jetty server itself. The HandlerContext most closely matching the
URI named in an HTTP request is asked to handle the request, which it
does by iterating over each of its contained HttpHandlers.
The HandlerContext containing this HttpHandler implementation was mapped to URI “/~*”, so that this handler would be considered to handle a request to “/~username/…”. The handler’s initial
responsibilities were to:

- Dynamically build contexts for users on demand by first
reference to their own webspace on the machine, such as the first
hit to “/~username/*”. This handler would look up the user’s homedir
in a UNIX passwd file containing only students (no system accounts),
and then create a new HandlerContext to serve out static content and JSPs
out of ~username/public_html in filespace, and dynamically
mapped servlets from ~username/public_html/servlets in filespace. The
ability to lazily deploy a user’s personal context was paramount, since
possibly only 20 out of many thousands of possible students would use
the server any given day. The newly-created HandlerContext would be
preferred by Jetty to serve out requests to “/~username/*” over this
handler’s context, since the match to “/~username/” was more specific.

- Reload any one of the already-deployed user contexts, so that
Jetty would reload any class files that had been recompiled. This was
done through merely stopping, removing, and destroying the user context
in question (easy, since the HttpHandler implementation maintained a
Map of username -> created context). After removal of the old context,
we would lazily initialize a new context upon next request to a resource
in that context via step 1. This action was done through a custom servlet
in the same package which maintained a reference to the HttpHandler via the
singleton design pattern. This servlet, when asked via a webspace gesture,
would make a protected method call into the HttpHandler to perform this
step to user foo’s context.
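Stripped of the Jetty 3 specifics, the lazy deploy-and-reload bookkeeping described above amounts to a per-user map. A sketch (class and method names are mine, not the Jetty API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Illustrative sketch of lazy per-user context deployment and reload. */
public class UserContexts {

    /** Stand-in for a started HandlerContext. */
    static class Context {
        final String docRoot;
        Context(String docRoot) { this.docRoot = docRoot; }
    }

    private final Map<String, Context> deployed = new ConcurrentHashMap<>();
    private final Function<String, Context> factory;

    UserContexts(Function<String, Context> factory) {
        this.factory = factory;
    }

    /** First request to /~username/* builds the context; later ones reuse it. */
    Context contextFor(String username) {
        return deployed.computeIfAbsent(username, factory);
    }

    /** Reload = stop and forget; the next request lazily rebuilds it. */
    void reload(String username) {
        deployed.remove(username);  // a real impl would also stop/destroy it
    }

    public static void main(String[] args) {
        UserContexts contexts =
                new UserContexts(u -> new Context("~" + u + "/public_html"));
        Context first = contexts.contextFor("joe");
        // Same instance until a reload is requested:
        System.out.println(first == contexts.contextFor("joe"));  // true
        contexts.reload("joe");
        System.out.println(first == contexts.contextFor("joe"));  // false
    }
}
```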
As time went on, additional features were added:
- Web applications. Students could deploy web-applications, either
in expanded format or jar’d up into a subordinate webspace of their
personal webspace of their own choosing (i.e /~username/mywebapp/*). They
could then choose to undeploy, redeploy, or to view the logs generated
by this web-application’s servlet / JSP code (hooks to personal log sinks
per each webapplication). I chose to have the deployed web-applications
be ‘sticky’, living through a server reset. This was accomplished by
serializing the Map of loaded web-applications to disk whenever it changed,
and to replay it as a log upon server startup. In hindsight, I should have
deferred the full reload of a known web-application until a resource within
the web-application was actually referenced, reducing the memory footprint
of the server, as well as greatly reducing the server startup time (150
webapps can contain quite a lot of XML to parse).

- User authentication realms. Users could configure simple
Jetty HashUserRealms via indicating where in their filespace to load in
the data for the realm. Realms defined by students in this way were
forced to be named relative to their own username, such as ‘joe:realm’.
The student’s web-applications could then contain security constraints
referencing user / role mappings of their own choosing.
Security
Security is an issue for any resource shared by students. The servlet
allowing users to remote-control their own resources was ultimately
made available through SSL, locked down via a HTTP basic security
realm backed by interprocess communication to C code to perform the
AFS/Kerberos authentication checks given a username / password, allowing
the server to accurately trust gestures controlling a given user’s
resources on the server. A java security policy was installed
in the JVM running Jetty, limiting filespace access, as well as disallowing
calls to System.exit() and other obvious baddies, as I quickly found out
that their JDBC code’s SQLException handler was often just System.exit().
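The policy amounted to a standard java.policy file; a rough sketch of its shape follows, with all paths and grants illustrative rather than taken from the real file:

```
// trusted container code gets everything
grant codeBase "file:/opt/jetty/lib/-" {
    permission java.security.AllPermission;
};

// student webapp code: narrow filespace access and little else.
// Note no RuntimePermission "exitVM", so System.exit() is refused.
grant codeBase "file:/afs/site/webspace/-" {
    permission java.io.FilePermission "/afs/site/webspace/-", "read";
    permission java.net.SocketPermission "localhost:1024-", "connect,listen";
};
```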
Unfortunately, the java SecurityManager model cannot protect against
many types of ‘attacks’ brought on by untrusted student code, such as
CPU-hogging broken loops, thread-bombs, and the like. A babysitting
parent process was quickly written to restart the servlet server if it
ever went down, as well as would bounce the server if it had
consumed more CPU than it should have (probable student-code busy-loop).
Daily restarting the server acted as ultimate garbage collection.
AFS supports ACLs on directories, and instead of requiring all
servlet-related files to be flagged as world-readable, the servlet
server process ran authenticated to AFS as a particular entity which
students could grant access to. This reduced the capability of just
stealing their classmates code using filesystem-based methods, but
they could conceivably just write a servlet to do the same thing.
Possibly deeper insight into the java security model could have
corrected this.
The RequestDispatcher API was another thorn in the side of security,
allowing any single servlet to tunnel through a context / web-application
barrier to any other URI valid on the machine, conceivably allowing
a nefarious student to snarf up content served by another student’s
servlets, even if that student had wrapped the servlet in a security
constraint.
Symbolic-link misdirection based thefts were not considered at all.
Ultimately, students were warned many times up and down that this
was a shared server running your peer’s untrustable code, and that
you should only be using it for your coursework and explorations into
the servlet world. Nothing requiring true tamper-proof security should
be deployed on this box.
Lessons Learned
As the service became more and more popular, I wish that I had been able
to move it to a bigger box, something other than a non-dedicated Sun Ultra
5 with 256M RAM. Having more memory available to the JVM would have greatly
helped out when a 30 student section all tried to deploy SOAP-based
applications, each using their own jars for apache axis, xalan, etc.
Using an inverse-proxy front-end to the system would have allowed splitting
the users across multiple JVM / Jetty instances, allowing gains on
uptimes (as seen from an individual’s perspective, since a server kick
to clear out a runaway thread would cause downtime for, say, 50% or 33%
of the entire user base, as opposed to 100%). It would also have allowed
me to have the service facade running at port 80, as opposed to the truly
evil JNI hack I had to do to have Jetty start up as root, bind to port
80, then setuid() away its rights N seconds after startup. Way ugly.
After the setuid() call was made, a latch was set in the custom
HttpHandler, allowing it to begin servicing user contexts. However, having
more than one Jetty instance would have complicated the implementation of
the controlling servlet, requiring it to perform RMI when asked to
control a non-local user context. This pattern could have been used
to scale down to one user per JVM, with the inverse-proxy being able
to fork / exec the JVM for the user upon demand, especially with
Jetty now having HTTP proxy capability. That would probably be overkill
for a student service, but having a static inverse-proxy with a fixed
mapping to 2 or 3 Jetty instances (possibly running on distinct
machines) would have been a relatively attractive performance and
reliability enhancer per the effort.
Impressions from the users were mixed. When all of the code being run on
the machine was benign, neither memory- nor CPU-hoggish, all was well and
the students were generally ambivalent — this service was something that
they had to use to pass their coursework, servlet coding / debugging was
slower and more cumbersome than ‘regular’ coding, etc. Having a hot-redeploy-capable container didn’t seem whiz-bang to them because they
had no other experience in servlet-land. When the machine was unhappy,
such as if it was 3 AM on the night before the project was due and
one student’s code decided to allocate and leak way more memory than
it had any right doing, causing the others to begin to get OutOfMemory
exceptions left and right, then they were (rightly) annoyed and let
me hear about it the next day.
If I were to re-solve the problem today, I would:
- Use some sort of inverse-proxy to smear the load over more
than one JVM for higher availability, allowing the Jetty instances to bind to an unprivileged port.
- Use the JDK 1.4’s internal Kerberos client implementation to
authenticate the campus users. Both of these steps would eliminate all
C code from the system.
- Run on at least one bigger dedicated machine.
- Encourage the faculty to work with me to ensure that their
central APIs can be loaded by the system classloader as opposed to their
student’s web-application loader so we don’t end up with 30 copies of
XSLT or SOAP implementations all at once.
- Lazy-load web-applications and auth realms upon first demand
instead of at server startup.
- Age-out defined web-applications and auth realms if they
have not been referenced in the past X days, so that they’ll eventually
be forgotten about completely when a student has finished the course.
[Copyright James Robinson 2004]
-
Filters vs Aspects
Web Application Filters were added to the 2.3 Servlet specification and have been
enhanced in the just finalised 2.4 spec. Filters
allow cross cutting concerns to be applied to a web application, which is
exactly the gig of Aspect Oriented Programming (AOP).
Being born and bred with OOP, I have always viewed AOP in the cool but
useless category. Thus I’ve never given myself the opportunity to
use them for real. On the other hand, I view webapp Filters as useful
but ugly. It would be great if AOP could replace/enhance Filters so we
could get into the cool and useful territory.
So I downloaded AspectJ with the intent of seeing how AOP cross cutting
concerns could help. The example I considered was the CompressionFilter, which can be applied to
a web application to gzip the content generated on the fly. A compression
filter can be declared in web.xml with:
<filter>
  <filter-name>Compression</filter-name>
  <filter-class>com.acme.CompressionFilter</filter-class>
</filter>
<filter-mapping>
  <filter-name>Compression</filter-name>
  <url-pattern>*.html</url-pattern>
  <dispatcher>REQUEST</dispatcher>
  <dispatcher>FORWARD</dispatcher>
  <dispatcher>ERROR</dispatcher>
</filter-mapping>
This CompressionFilter will apply to any request to a URL ending with
html
that is not an
include request. The filter wraps the HttpServletResponse object with a facade, that
in turn wraps any OutputStream or Writer provided by the response with a version capable
of compressing the output. The wrappers also allow HTTP headers to be intercepted and
adjusted. The multiple wrappers involved have always struck me as a bit over complex, error
prone and in need of a better way. But it does allow the compression concern to be applied
to a web application without considering how the html content is generated.
So how would AspectJ help applying such a cross cutting concern to a webapplication?
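As an aside, the core of the Filter's wrapping trick is plain stream decoration. A minimal sketch using only java.util.zip, with none of the servlet API; the class and method names here are mine, not the CompressionFilter's:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// The facade hands the servlet a compressing stream instead of the raw
// one; the servlet writes as normal and gzip'd bytes hit the wire.
class CompressingOutput {
    static byte[] gzip(byte[] plain) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            GZIPOutputStream gz = new GZIPOutputStream(wire);
            gz.write(plain);          // the servlet's writes go through here
            gz.close();               // flushes the gzip trailer
            return wire.toByteArray();
        } catch (IOException e) { throw new RuntimeException(e); }
    }

    // What the browser does on receipt, shown here for the round trip.
    static byte[] gunzip(byte[] wire) {
        try {
            GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(wire));
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            byte[] buf = new byte[512];
            for (int n; (n = gz.read(buf)) > 0; )
                plain.write(buf, 0, n);
            return plain.toByteArray();
        } catch (IOException e) { throw new RuntimeException(e); }
    }
}
```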
Reading the tutorial I determined that the basic approach needed was to define a
CompressionAspect that contains:
- PointCut definitions that identified the getOutputStream and getWriter calls
of the response object.
- Advice to wrap the response stream/writer in a compressing stream/writer.
- PointCut definitions that identified the header setting methods of the response object.
- Advice to intercept and adjust the HTTP headers and other meta data.
AspectJ certainly provides the Join Points and Advice types required to create this
aspect, but unfortunately I was unable to define a PointCut that captures the equivalent
semantics of the CompressionFilter declaration above. I believe that the Point Cut semantics
required are something along the lines of:
calls to getOutputStream, getWriter & setContentLength methods
on objects that implement the HttpServletResponse interface
when RequestDispatcher.include is not on the calling stack
and the associated request.getRequestURI() ending with '*.html'
and the request is being handled by my webapplication/classloader
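For what it is worth, the first three clauses can be sketched in AspectJ syntax roughly as follows. This is untested pseudocode of my own invention, never compiled against a container:

```aspectj
public aspect CompressionAspect {
    // calls to the stream accessors, outside any RequestDispatcher.include
    pointcut output():
        (call(* javax.servlet.ServletResponse+.getOutputStream()) ||
         call(* javax.servlet.ServletResponse+.getWriter()))
        && !cflow(call(* javax.servlet.RequestDispatcher.include(..)));

    Object around(): output() {
        // wrap the returned stream/writer in a compressing version here;
        // restricting this to '*.html' URIs and to one webapplication
        // is the part I could not express.
        return proceed();
    }
}
```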
It was looking oh-so-good up to those last two clauses! The ending with ‘*.html’ could
be done with an if() point cut, except that there is no way to navigate from a response
object to the associated request. I couldn’t find any point cuts that would assist me
with restricting the aspect to a particular web application or classloader.
My next thought was to get around this problem by creating a no-op Filter and hanging the
PointCut off calls that go via it. I think this approach would have allowed me to
specify the required PointCuts, but then I realized another problem. In order for
these point cuts to work, I would need to use the AspectJ compiler to modify the response
classes used by the web container and passed into the web application. This breaking of
web application encapsulation was not something that I was prepared to do. It would
modify my infrastructure and allow aspects from one web application to be passed, if not
executed, in other web applications.
AOP, or at least AspectJ, does not appear to be a good replacement for Filters, as:
- PointCuts are defined in the implementation domain (java calls and classes),
while Filters are defined closer to the application domain (URLs, requests, responses). Using
the implementation domain events to trigger application level concerns may be impossible or
at the very least devoid of the application abstractions we so carefully build in OOP.
- The technology of AspectJ does not appear appropriate for Container/Infrastructure
based deployment. It is not appropriate for the container/infrastructure classes to
be modified in order to support the application concerns of a particular application.
- The declarative nature of aspects means that PointCuts need to do a lot of work to
reduce a global scope to a particular instance. It would be great if procedural semantics
were available so you could say in code: “wrap that object with this aspect” or “apply this
advice to that Point Cut”. Such programmatic code would also assist with the traceability
concerns of AOP.
So my first attempt at AOP has not been successful and I’m still left with the “cool but useless”
feeling. The OOP design of Filters does allow cross cutting concerns to be implemented
in a modular fashion, so useful if not cool applies.
Maybe I’m still missing something or am trying the wrong problem. I hope that an AOP
guru reading this will be able to correct the error of my ways?
-
NIO and the Servlet API.
Taylor Cowan has written a short paper regarding
Combining the Servlet API and NIO,
which has been briefly
discussed on the serverside.
NIO Servlets have often been discussed as the holy grail of java web application performance.
The promise of efficient buffers and reduced thread loads are very attractive
for providing scalable 100% java web servers. Taylor writes about a mockup NIO server that
he implemented which shows some of this promise.
Taylor’s results were not with a real Servlet container running realistic
loads. But his results look promising and his approach has inspired me
to try and apply it to the
Jetty Servlet container.
The fundamental problem with using NIO with servlets is how to combine the
non-blocking features of NIO with the blocking streams used by
servlets. I have tried several times before to introduce a
SocketChannelListener to Jetty, which only used non-blocking NIO semantics
to manage idle connections. Connections with active requests were converted
to blocking mode, assigned a thread and handled by the servlet container
normally.
Unfortunately, the cost of manipulating select sets and changing socket modes
was vastly greater than any savings. So while this listener did go into
production in a few sites, there was no significant gain in scalability and an
actual loss in max throughput.
Taylor has tried a different approach, where a producer/consumer model is used
to link NIO to servlets via piped streams. A single thread is responsible for
reading all incoming packets and placing them in the non-blocking pipes.
A pool of worker threads takes jobs from a queue of connections with input and
does the actual request handling. I have applied this approach to Jetty as
follows:
- The PipedInputStream used by Taylor requires all data read to be copied
into byte arrays. My natural loathing of data copies led me to write a
ByteBufferInputStream, which allows the NIO direct buffers to be used as the
InputStream buffers and then recycled for later use.
- Taylor’s mock server uses direct NIO writes to copy data from a file to the
response. While a great way to send static content, this is not realistic for
a servlet container which must treat all content as dynamic. Thus I wrote
SocketChannelOutputStream to map a blocking OutputStream to a non-blocking
SocketChannel. It works on the assumption that a write to a NIO stream will
rarely return 0 bytes written. I have not well tested this assumption.
- There is no job queue in the Jetty implementation; instead requests are
directly delegated to the current Jetty thread pool. The effect of this
change is to reduce the thread savings. A thread is required for all
simultaneous requests, which is better than a thread per connection, but not
as trim as Taylor’s minimal set of worker threads. A medium sized thread
pool is being used as a fixed size job queue.
- Taylor’s mock server only handled simple requests for static content, which
may be handled with a simple 304 response. Thus no requests contained any
content of size and neither did the responses. This is not a good test for
the movement of real content that most web applications must do. The
Jetty test setup is against a more realistic mix of static and dynamic
content as well as a reasonable mix of POST requests with content.
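To illustrate the first point, here is a much-simplified sketch of the ByteBufferInputStream idea; it is illustrative only, not the actual org.mortbay.http.nio code. The selector thread queues filled buffers and the blocking read side consumes them without copying into intermediate arrays:

```java
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.util.LinkedList;

// Filled NIO buffers are handed over by the single producer thread and
// consumed, copy-free, by the blocking InputStream the servlet reads.
class ByteBufferInputStream extends InputStream {
    private final LinkedList<ByteBuffer> buffers = new LinkedList<>();
    private boolean closed = false;

    // Called by the NIO producer thread after a channel read.
    public synchronized void addBuffer(ByteBuffer filled) {
        buffers.add(filled);
        notifyAll();
    }

    @Override
    public synchronized void close() {
        closed = true;
        notifyAll();
    }

    // Called by the worker thread; blocks until data or end of stream.
    @Override
    public synchronized int read() {
        while (true) {
            while (!buffers.isEmpty() && !buffers.getFirst().hasRemaining())
                buffers.removeFirst();   // an exhausted buffer could be recycled here
            if (!buffers.isEmpty())
                return buffers.getFirst().get() & 0xff;
            if (closed)
                return -1;
            try { wait(); }
            catch (InterruptedException e) { throw new RuntimeException(e); }
        }
    }
}
```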
This code has been written against Jetty 5.0 and is currently checked into
Jetty CVS HEAD in the
org.mortbay.http.nio
package. So far I have not had
time to really optimise or analyse the results, but early indications are
that this is no silver bullet.
The initial effects of using the NIO listener are that the latency of the
server under low load has doubled, and this latency gets worse with load.
The maximum throughput of the server has been reduced by about 10%, but
is maintained to much higher levels of load. In fact, with my current test setup I
was unable to produce enough load to significantly reduce the throughput.
So technically at least, this has delivered on the scalability promise?
The producer/consumer model allows a trade off of some low and mid level
performance in return for grace under extreme load. But you have to ask
yourself, is this a reasonable trade? Do I want to offer crappy service
to 10000 users, or reasonable service to 5000? To answer this, you have to
consider the psychology of the users of the system.
Load generators do not have any psychology and are happy to wait out the
increasing latency to the limits of the timeouts, often 30 seconds or more.
But real users are not so well behaved and often have patience thresholds set well below
the timeouts. Unfortunately a common user response to a slowly displaying web
page is to hit the retry button, or
worse still the shift-retry! Having your server handle 1000 requests per
second may not be such a great thing if 50% of those requests are retries
from upset users.
I suspect that the producer/consumer model may be costing real quality of
service in return for good technical numbers. Consider the logical extreme of
the job queue within Taylor’s mock implementation. If sustained load is
offered in excess of the level that the workers can handle, then that
queue will simply grow and grow. The workers will still be operating at
near their optimal throughput, but the latency of all requests served
will increase until timeouts start to expire. Throughput is maintained, but
well beyond the point of offering a reasonable quality of service.
Even with a limited job queue (as in the Jetty implementation),
the simple producer/consumer model suffers from the inability to target
resources to where they are best used. The single producer thread gives
equal effort towards handling new requests as it does to receiving
packets for requests that have already started processing. On a loaded
server, it is better to use your resources to clear existing requests so
that their resources may be freed for other requests. On a multi-CPU machine, it
will be a significant restriction to only allow a single CPU to perform any
IO reads, as other CPUs may be delayed from doing useful work or real requests, while
one CPU is reading more load onto the system.Taylors producer/consumers approach is significantly better than my preceding attempts,
but has not produced an easy win when applied to a real Servlet container.
I am also concerned that the analysis has focused too much on throughput without
any due consideration for latency and QOS. This is not to say that this is a dead
end. Just that more thought and effort are required if producer/consumer NIO is to match
the wonderful job that modern JVMs do with threading.
I plan to leave the SocketChannelListener in the development branch of Jetty for
some time to allow further experimentation and analysis. However, I fear that
the true benefits of NIO will not be available to java web applications until we
look at an
API other than Servlets
for our content generation.
-
Servlets must DIE! – Slowly
Now that the 2.4 servlet spec is final, I believe the time is right to start
considering the end of life for the API. This may sound a little
strange coming from somebody on the JSR and who has spent years writing
a servlet container, but I think the API has outgrown its original purpose
and no longer is the best API to go forward with.
The problem is that servlets are trying to be too many things to too many
people. On one hand servlets are protocol handlers, able to control the
details of connections, transport and encoding. But they are also
intended as application components, to be used by presentation programmers
and to be plugged and played with during deployment. There is also the
whole stand-alone versus J2EE application server issue within the
specification.
The problems that result are many:
- Web Applications commonly get deployed with their own CompressionFilter,
XML parser, logging jars and authentication mechanism. What exactly is the
"application" infrastructure being provided to them?
- Because protocol concerns are implemented within application components,
the extensibility and portability of the container is limited.
For example, compression could be implemented more simply, uniformly and maintainably
by the container. Instead we have many web applications shipping their own copy of
the CompressionFilter, complete with the bugs of the original sample code.
It also limits the efficient portability of servlets, even over versions of HTTP.
It is often good practice for Servlets written for HTTP/1.0 to
consume memory and CPU calculating the content length so that connections can
be persisted. Because this HTTP concern has been implemented by the
application, it cannot be disabled when the servlet is deployed in a HTTP/1.1
container that does not need content length for persistent connections.
- Servlet containers are unable to take advantage of non-blocking
NIO because application servlets do their own IO assuming blocking semantics. Nor
are HTTP features like range requests, trailers and accept headers able to be
used without involving the application developers.
- The contract between servlet and container is not well defined for HTTP headers.
What does it mean when a servlet sets a HTTP header?
Is it a command to the container to do something (e.g. Connection: close), a signal
that the servlet has done something (e.g. Transfer-Encoding: gzip) or a helpful
hint that the container can use, ignore or enforce (e.g. Content-Length: 42)?
- There is no standard lightweight deployment package for HTTP consumers
such as SOAP. A full 2.4 web application container is a bit of overkill if
all you want to do is transport XML blobs. Why should you have a fully
capable web tier just to accept web services?
I believe the answer to these woes is to create a new API that is purely about
content and is really protocol independent. This API would allow for the creation
of Contentlets, which are application objects that can be selected and queried for meta data and
content without reference to HTTP headers, transport encodings, threading or streams.
Contentlets would be equally able to serve their content over HTTP, email, rsync or
whatever new protocols emerge that fit a generic request/response style.
It would be the responsibility of the transport implementation to pull meta
data and content from the Contentlet. This is the opposite of Servlets, which push
HTTP headers and content, even if they are not required or accepted by the client.
The Container would be able to efficiently implement transport features such as
compression, conditionals, encodings, ranges and protocol specific features.
The application components could be written with no concern for transport
and thus application developers need not become protocol experts in order
to write safe, portable and efficient code.
Of course Servlets could be used to initially implement Contentlets, but
the eventual aim should be to have direct HTTP to Contentlet containers,
perhaps built on a revamped and simplified Servlet 3.0 API? Either way,
we need to begin thinking about the end-of-life of 2.x Servlets.
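To make the pull model concrete, here is a purely hypothetical sketch of what a Contentlet might look like; no such API exists and every name in it is invented:

```java
// Hypothetical Contentlet: the transport *pulls* meta data and content,
// rather than the component pushing headers and bytes at it.
interface Contentlet {
    // Meta data is queried by abstract key, not by HTTP header name;
    // the transport decides how (and whether) to express it on the wire.
    String getMetaData(String key);
    // Lets the transport implement conditionals (If-Modified-Since etc.)
    // without involving the application component.
    long getLastModified();
    // The transport pulls content and applies its own compression,
    // encodings or ranges as the protocol allows.
    byte[] getContent();
}

class HelloContentlet implements Contentlet {
    public String getMetaData(String key) {
        return "type".equals(key) ? "text/plain" : null;
    }
    public long getLastModified() { return 0L; }
    public byte[] getContent() { return "hello".getBytes(); }
}
```

The same HelloContentlet could then be served over HTTP, email or any other request/response transport without change.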