Author: admin

  • Global firm extends free open source training to Filipinos

    EXIST Engineering, a global company founded by a Filipino, is providing open source technology training in Manila this month, INQ7.net learned.

    In an e-mailed advisory, the company founded by Winston Damarillo, a former Intel Capital venture capital executive, said it is offering free training on Ajax, Comet and Jetty technologies on May 6, 2006.

    The company has invited Greg Wilkins and Jan Bartel as speakers and trainers.

    Exist said Wilkins is a founding partner of the Core Developers Network and the founder of Mortbay Consulting. He is the developer of the Jetty HTTP server and servlet container.

    Bartel, for her part, is a contributor to the Jetty open source project in a number of capacities, Exist said. She is the author of both the Jetty website and the online tutorial, a collaborator on the integration of Jetty with JBoss, and a key developer of J2EE-related service enhancements to Jetty.

    The company said that those interested in taking part can send their confirmation to info-ajaxtraining@exist.com on or before May 3, 2006.

    The training will accommodate 150 people.

    The training will run from 8 a.m. to 12 noon at the Assembly Room of the Meralco Foundation Bldg., Ortigas Ave. Extension (just beside the New Medical City), Pasig City, Metro Manila.



  • Former Intel Capital VC exec seeds tech firms in RP

    WINSTON Damarillo, a former venture capitalist working at Intel Capital, is set to open more companies in the Philippines that would be involved in open source software development.

    Damarillo currently runs a company called Exist, which is engaged in open source software development. He has been its chairman since its inception in 2001, and the company has reported profitable operations, growing at over 100 percent annually. It currently has two operations in the Philippines: one in Ortigas in Pasig City and another in Cebu City. The company also runs an office in Los Angeles.

    Last year, Damarillo became a “celebrity” in the US open source community after he sold Gluecode, a company he co-founded, to IBM for less than 100 million dollars. Gluecode develops open source application servers. He eventually used the money to set up an incubator firm for open source projects called Simula Labs. “Simula” is a Filipino word for beginning, and is used in this context to refer to startups.

    A believer in Filipino software engineering talent, Damarillo revealed that Filipino software engineers in the Philippines account for much of the open source development work done in his company.

    “We have been building our software in the Philippines,” Damarillo, who was in Manila for a visit, told INQ7.net.

    Hoping to reverse the ongoing brain drain in the software industry by locating open source software development jobs in the Philippines, Damarillo said he is also bringing in open source software experts to train Filipino software engineers in the country.

    “I

  • Scaling Connections for AJAX with Jetty 6

    With most web applications today, the number of simultaneous users can greatly exceed the number of connections to the server.
    This is because connections can be closed during the frequent pauses in the conversation while the user reads the
    content or completes a form. Thousands of users can be served with hundreds of connections.

    But AJAX-based web applications have very different
    traffic profiles from traditional webapps.
    While a user is filling out a form, AJAX requests to the server will be asking for
    entry validation and completion support. While a user is reading content, AJAX requests may
    be issued to asynchronously obtain new or updated content. Thus
    an AJAX application needs a connection to the server almost continuously, and it is no
    longer the case that the number of simultaneous users can greatly exceed the number of
    simultaneous TCP/IP connections.

    If you want thousands of users, you need thousands of connections, and if you want tens of thousands
    of users, then you need tens of thousands of simultaneous connections. It is a challenge for Java
    web containers to deal with significant numbers of connections, and you must look at your entire system,
    from your operating system to your JVM to your container implementation.

    Operating Systems & Connections

    A few years ago, many operating systems could not cope with more than a few hundred TCP/IP connections.
    JVMs could not handle the thread requirements of blocking models and the poll system call used for asynchronous
    handling could not efficiently work with more than a few hundred connections.

    Solaris 7 introduced the /dev/poll
    mechanism for efficiently handling thousands of connections, and Sun has
    continued that development, so that Solaris 10 now has a
    new optimized TCP/IP stack that is reported
    to support over 100 thousand simultaneous TCP/IP connections. Linux has also made great advances in this area and
    comes close to Solaris 10’s performance. If you want a scalable AJAX application server, you must start with such an
    operating system, configured correctly, and with a JVM that uses these facilities.

    Connection Buffers

    In my previous blog entry I described how
    Jetty 6 uses Continuations and javax.nio to limit the number of threads required to service AJAX traffic. But threads are
    not the only resources that scale with connections; you must also consider buffers. Significant memory can be consumed if
    a buffer is allocated per connection, yet memory cannot simply be saved by shrinking the buffer size, as there are good
    reasons to have significantly large buffers:

    • Below 8KB, TCP/IP can have efficiency problems with its sliding window protocol.
    • When a buffer overflows, the application needs to be blocked. This holds a thread and associated resources
      and increases switching overheads.
    • If the servlet can complete without needing to flush the response, then the container can flush the buffer
      outside of the blocking application context of a servlet, potentially using non-blocking IO.
    • If the entire response is held in the buffer, then the container can set the content length header and can avoid
      using chunking and extra complexity.

    Jetty 6 contains a number of features designed to allow larger buffers to be used
    in a scalable AJAX server.
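    The buffer sizes discussed above are configurable per connector. As a rough configuration sketch only (the method names below assume the Jetty 6 org.mortbay.jetty connector API and should be checked against the javadoc for your release):

```java
// configuration sketch: tuning the per-connection buffer sizes on a
// Jetty 6 NIO connector (assumes the org.mortbay.jetty API)
import org.mortbay.jetty.Server;
import org.mortbay.jetty.nio.SelectChannelConnector;

public class BufferConfig
{
    public static void main(String[] args) throws Exception
    {
        Server server = new Server();
        SelectChannelConnector connector = new SelectChannelConnector();
        connector.setPort(8080);
        connector.setHeaderBufferSize(8 * 1024);    // small buffer for request/response headers
        connector.setResponseBufferSize(64 * 1024); // large content buffer, allocated lazily
        server.addConnector(connector);
        server.start();
    }
}
```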

    Jetty 6 Split Buffers

    Jetty 6 uses a split buffer architecture and dynamic buffer allocation. An idle connection has no buffer allocated to it,
    but once a request arrives a small header buffer is allocated. Most requests have no content, so often this is the only
    buffer required for the request. If the request has a little content, then the header buffer is used for that content as
    well. Only if the received header indicates that the request content is too large for the header buffer is an
    additional, larger receive buffer allocated.

    For responses, a similar approach is used with a large content buffer being allocated once response data starts to be generated.
    If the content might need to be chunked, space is reserved at the start and the end of the content buffer to allow the data to
    be wrapped as a chunk without additional data copying.
    Only when the response is committed is a smaller header buffer allocated.

    These strategies mean that Jetty 6 allocates buffers only when they are required and that these buffers are of
    a size suitable for the specific usage. Response content buffers of 64KB or more can easily be used without
    blowing out total memory usage.

    Gather writes

    Because the response header and response content are held in different buffers, gather writes
    are used to combine the header and response into a single write to the operating system. As efficient direct buffers are used, no
    additional data copying is needed to combine header and response into a single packet.
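    The same mechanism is available directly in java.nio. As a minimal, self-contained sketch (it writes to a temporary file rather than a socket, so it illustrates gathering writes in general, not Jetty's connector code; the header and body strings are made up):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class GatherWriteDemo
{
    // Writes a header buffer and a content buffer with a single gathering
    // write, and returns the total number of bytes written.
    public static long gatherWrite(Path out) throws IOException
    {
        ByteBuffer header = ByteBuffer.wrap(
            "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        ByteBuffer content = ByteBuffer.wrap("hello".getBytes(StandardCharsets.US_ASCII));
        try (FileChannel ch = FileChannel.open(out, StandardOpenOption.WRITE))
        {
            // One call to the OS combines both buffers; no copy is needed
            // to merge header and content.
            return ch.write(new ByteBuffer[] { header, content });
        }
    }

    public static void main(String[] args) throws IOException
    {
        Path out = Files.createTempFile("gather", ".bin");
        System.out.println(gatherWrite(out));
    }
}
```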

    Direct File Buffers

    Of course there will always be content larger than the buffers allocated, but if the content is large then it
    is highly desirable to completely avoid copying the data to a buffer. For very large static content,
    Jetty 6 supports the use of mapped file buffers,
    which can be directly passed to the gather write with the header buffer for the ultimate in Java IO speed.

    For intermediate sized static content, the Jetty 6 resource cache stores direct byte buffers which also can be written
    directly to the channel without additional buffering.

    For small static content, the Jetty 6 resource cache stores byte buffers which are copied into the
    header buffer to be written in a single normal write.
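    The mapped-buffer idea can be sketched with plain java.nio, independently of Jetty's resource cache (the file and its content here are made up for illustration):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedContentDemo
{
    // Maps a file into memory and reads it back; no copy through an
    // intermediate heap buffer is required.
    public static String readMapped(Path file) throws IOException
    {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ))
        {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            byte[] bytes = new byte[buf.remaining()];
            buf.get(bytes);
            return new String(bytes, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException
    {
        Path file = Files.createTempFile("static", ".txt");
        Files.write(file, "static content".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readMapped(file));
    }
}
```

    In a server, a buffer like this can be handed straight to a gathering write alongside the header buffer.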

    Conclusion

    Jetty 6 employs a number of innovative strategies to ensure that only the resources that are actually
    required are assigned to a connection, and only for as long as they are needed. This careful
    resource management gives Jetty an architecture designed to scale to meet the needs of AJAX
    applications.

  • Jetty 6.0 Continuations – AJAX Ready!

    The 6.0.0alpha3 release of Jetty is now available
    and provides a 2.4 servlet server in a 400k jar, with only 140k of dependencies (2.6M more if you want JSP!).
    But as well as being small, fast, clean and sexy, Jetty 6 supports a new feature
    called Continuations that allows scalable AJAX applications to be built, with
    threadless waiting for asynchronous events.

    Thread per connection

    One of the main challenges in building a scalable servlet server is how to
    handle threads and connections. The traditional IO model of Java associates a thread
    with every TCP/IP connection. If you have a few very active threads, this model can
    scale to a very high number of requests per second.
    However, the traffic profile typical of many web applications is many persistent HTTP
    connections that are mostly idle while users read pages or search for the next link
    to click. With such profiles, the thread-per-connection model can have problems scaling
    to the thousands of threads required to support thousands of users on large scale deployments.

    Thread per request

    The NIO libraries can help, as they allow asynchronous IO to be used, so threads can be
    allocated to connections only when requests are being processed. When the connection is
    idle between requests, then the thread can be returned to a thread pool and the
    connection can be added to an NIO select set to detect new requests. This thread-per-request
    model allows much greater scaling of connections (users) at the expense of a
    reduced maximum requests per second for the server as a whole (in Jetty 6 this expense
    has been significantly reduced).
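    The select-set idea can be sketched with plain java.nio; this illustrates the mechanism only, not Jetty's actual connector implementation:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

public class SelectLoop
{
    // Registers a listening socket with a selector and polls it once.
    // In a real container the loop would run continuously and dispatch
    // ready keys to a thread pool; idle connections consume no thread.
    public static int pollOnce() throws IOException
    {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        try
        {
            server.bind(new InetSocketAddress(0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            // selectNow() returns immediately with the number of ready channels
            return selector.selectNow();
        }
        finally
        {
            server.close();
            selector.close();
        }
    }

    public static void main(String[] args) throws IOException
    {
        System.out.println(pollOnce());
    }
}
```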

    AJAX polling problem

    But there is a new problem. The advent of AJAX as a
    web application model is significantly changing the traffic profile seen on the server side. Because
    AJAX servers cannot deliver
    asynchronous events to the client, the AJAX client
    must poll for events on the server. To avoid a busy polling
    loop, AJAX servers will often hold onto a poll request
    until either there is an event or a timeout occurs.
    Thus an idle AJAX application will
    have an outstanding request waiting on the server which can be used to send a response to the
    client the instant an asynchronous event occurs.
    This is a great technique, but it breaks the thread-per-request model, because
    now every client will have a request outstanding in the server. Thus the server again
    needs to have one or more threads for every client and again there are problems scaling
    to thousands of simultaneous users.

    Jetty 6 Continuations

    The solution is Continuations, a new feature introduced in Jetty 6. A Java Filter or
    Servlet that is handling an AJAX request may now request a Continuation object
    that can be used to effectively suspend the request and free the current
    thread. The request is resumed after a timeout or immediately if the resume method
    is called on the Continuation object. In the Jetty 6 chat room demo, the following
    code handles the AJAX poll for events:

    private void doGetEvents(HttpServletRequest request, AjaxResponse response)
    {
        Member member = (Member)chatroom.get(request.getSession(true).getId());

        // Get an existing Continuation, or create a new one if there are no events.
        boolean create = !member.hasEvents();
        Continuation continuation = ContinuationSupport.getContinuation(request, create);
        if (continuation != null)
        {
            if (continuation.isNew())
                // Register it with the chatroom to receive async events.
                member.setContinuation(continuation);

            // Get the event object. The request may be suspended here.
            Object event = continuation.getEvent(timeoutMS);
        }

        // Send any events that have arrived.
        member.sendEvents(response);

        // Signal for a new poll.
        response.objectResponse("poll", "");
    }

    When another user says something in the chat room, the event is delivered to
    each member by another thread calling the method:

    class Member
    {
        public synchronized void addEvent(Event event)
        {
            _events.add(event);
            if (_continuation != null)
                // Resume requests suspended in getEvents.
                _continuation.resume(event);
        }
        ...
    }

    How it works

    Behind the scenes, Jetty has to be a bit sneaky to work around Java and the Servlet specification
    as there is no mechanism in Java to suspend a thread and then resume it later.
    The first time the request handler calls continuation.getEvent(timeoutMS), a
    RetryRequest runtime exception is thrown. This exception propagates out of all the request
    handling code and is caught by Jetty and handled specially.
    Instead of producing an error response, Jetty places the request on a timeout queue and returns the
    thread to the thread pool.
    When the timeout expires, or if another thread calls continuation.resume(event)
    then the request is retried. This time, when continuation.getEvent(timeoutMS)
    is called, either the event is returned or null is returned to indicate a timeout.
    The request handler then produces a response as it normally would.
    Thus this mechanism uses the stateless nature of HTTP request handling to simulate a
    suspend and resume. The runtime exception allows the thread to legally exit the
    request handler and any upstream filters/servlets plus any associated security context.
    The retry of the request re-enters the filter/servlet chain and any security context
    and continues normal handling at the point of continuation.
    Furthermore, the API of Continuations is portable. If it is run on a non-Jetty6 server
    it will simply use wait/notify to block the request in getEvent. If Continuations prove
    to work as well as I hope, I plan to propose them as part of the 3.0 Servlet JSR.
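    That portable wait/notify fallback can be sketched as a small self-contained class (an illustration of the blocking pattern, not the actual ContinuationSupport code):

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class BlockingEvents
{
    private final Queue<Object> events = new ArrayDeque<Object>();

    // Called by another thread when an event arrives.
    public synchronized void addEvent(Object event)
    {
        events.add(event);
        notifyAll(); // wake any request blocked in getEvent
    }

    // Blocks for up to timeoutMs; returns the event, or null on timeout.
    public synchronized Object getEvent(long timeoutMs) throws InterruptedException
    {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (events.isEmpty())
        {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0)
                return null; // timed out, like a poll that saw no events
            wait(remaining);
        }
        return events.poll();
    }
}
```

    On a non-Jetty6 server, blocking like this holds the request's thread for the duration of the poll, which is exactly the scaling cost that the retry-based Continuation avoids.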

  • Cinémathèque

    This case study looks at Cinémathèque from PowerSource Software Pty Ltd. It is a digital interactive entertainment system that embeds Jetty as the backend server for the set top box browser.

    PowerSource Software is a boutique software developer based in
    Sydney, Australia. The company has particular expertise in several
    interesting real-time areas: conventional wagering and gaming systems
    (totalisator systems for on and off-track betting, lotteries and
    wide-area keno systems), community gaming systems (trade promotions,
    competitions and opinion polls via mobile devices using SMS), and IPTV
    and video on demand (VOD).

    PowerSource became active in IPTV and VOD because the company was
    looking for new ways to capitalise on its core expertise – high speed
    transaction processing. As luck would have it, the majority of the
    company’s betting systems ran on real-time platforms supplied by
    Concurrent Computer Corporation, and Concurrent had started to utilise
    their hardware and real-time operating systems in the development of
    their MediaHawk video servers. At roughly the same time, the first
    deployable IP-based set top boxes also appeared. However,
    commercialisation of these IPTV-related technologies was being stymied
    by the absence of an affordable way of gluing these sophisticated
    components together into a customer-based, money-making enterprise. So
    PowerSource developed Cinémathèque.

    Cinémathèque is a comprehensive solution for service providers offering
    interactive digital entertainment including IPTV and VOD, for wide area
    residential, residential multi-dwelling, and hospitality environments.
    It provides a tightly integrated suite of monitoring, control and
    support facilities that maximise the features and facilities available
    to subscribers while minimising the operational burden of the service
    provider.

    In an IPTV-VOD system, a high speed two-way network connects the video
    servers and management system at the head-end to set top boxes in
    subscribers’ premises. Perhaps the biggest difference between IPTV
    systems and traditional hybrid-fibre-coax (HFC) pay TV deployments is
    the speed of the network, and especially the speed of the back channel.

    When a viewer selects an on-demand program the set top box sends in a
    play-out request which needs to be authorised before the video server
    will start streaming the content. At any time the viewer can stop,
    pause, rewind or fast-forward the video. It is important to note that
    the content is streamed across the network and played out in real-time.
    It is not stored or buffered in the set top box for later play-out –
    there are no disks in IP set top boxes. All subscriber interactions
    with the set top box – including play-out control – are transmitted
    across the network in real-time to the servers at the head-end.

    Of course, people will only pay to watch if there is something
    worthwhile to watch and it’s available to them at a convenient time. In
    this regard, digital video on demand differentiates itself from older
    hotel movie systems which provide a very narrow range of content, and
    from traditional subscription TV which offers only “near” video on
    demand services in which programs start at pre-designated times. The
    versatility of a digital video on demand system allows a hotel, or a
    residential IPTV service provider, to offer not only the latest
    Hollywood movies, but also classics, cult films, documentaries and the
    crème de la crème of TV. In other words, subscribers can watch what
    they want, when they want.

    It happens that content owners, and the Hollywood studios in particular,
    go to great lengths to ensure that their valuable property is presented
    in an appropriate manner. For this reason, most content is delivered to
    the set top box as a 4 Mbit per second MPEG2 Transport Stream. This
    bit-rate provides the viewer with a near-DVD quality viewing experience
    on a standard television. While this generally guarantees that a movie
    will be seen in the best light, it also makes simultaneously delivering
    a large number of streams quite challenging. Clearly, there is a big
    difference between streaming numerous film clips at 64 or 128 Kbps over
    the net compared to pumping hundreds, if not thousands, of 4 Mbps
    streams simultaneously. This is especially true considering how easily
    human eyes and ears can detect jitter in the video and audio resulting
    from lost frames or uneven play-out. This is the realm of “big-iron” video servers like Concurrent’s MediaHawks.

    An IP set top box has three principal software components: an operating
    system, a highly customised web browser and a media player. Many of the
    better IP set top boxes run Linux – which is either booted out of
    non-volatile memory or over the network – together with a small
    footprint version of the Mozilla browser. The browser is heavily
    customised to cater for the aspect ratio, resolution and colour palette
    of a standard television. It is also adapted to make it easy to use in
    the “lean back” environment in which people watch television.

    Experience shows that in the lean back environment of the TV room, less
    hand-eye coordination is required to successfully operate the remote
    control if “compass” keys are used for navigation instead of a
    track-ball or other mouse-like device that uses a floating cursor. The
    compass keys on the remote control let the subscriber navigate and
    select a hyperlink; the set top box then sends an HTTP request to
    Cinémathèque which returns the appropriate page in response.

    Cinémathèque plays a vital role in a digital entertainment service
    network because it is responsible for handling all subscriber
    interactions. Each time a subscriber follows a hyperlink, and each time
    they request video play-out or select some other supplementary service,
    Cinémathèque must accept and validate the request, secure a transaction
    to disk, update the subscriber’s account and other persistent data
    structures, and format and return a suitable response. This workload
    represents a unique mix of web content requests and complex customer
    transactions.

    Since Cinémathèque essentially provides the virtual shop window for the
    digital entertainment service provider, it must respond quickly even
    under considerable load, and even when the content is being generated
    dynamically. The content itself also has to be thoughtfully designed to
    facilitate effortless navigation to the items of most interest to a
    subscriber. The experience has to be more like watching television than
    surfing the web.

    Although it is widely recognised that the architecture and performance
    of the video servers is vital to satisfy service level expectations, the
    performance characteristics of the management system – the so-called
    middleware layer – are often overlooked. But all subscriber activity
    starts out as an HTTP request to Cinémathèque – only when a response
    from Cinémathèque includes authorisation to commence video play-out does
    a set top box actually communicate with a media server. This is why
    Cinémathèque’s heritage is so important: it relies heavily on
    PowerSource’s experience building high performance transaction
    processing systems.

    Cinémathèque is comprised of three primary functional modules:

    • Jetty servlet engine
    • Javelin transaction processor
    • Cinémathèque application core

    The Jetty servlet engine provides Cinémathèque with the flexibility of a
    conventional web server but without the bloat, without the inevitable
    performance problems, and without the implicit security worries. Jetty
    is embedded within Cinémathèque and acts as a servlet container and
    dispatcher. In this role Jetty is reliable, secure and fast. Jetty
    invokes specialised Cinémathèque servlets in response to requests from
    set top boxes; these servlets interact with the Cinémathèque application
    core to provide the necessary services to subscribers.

    Javelin is PowerSource’s secure, non-stop transaction processing engine
    – it is written in Java and is the component on which all of
    Cinémathèque’s other application features and facilities are based.
    Javelin secures all transactions to duplicated disk files – it handles
    all data mirroring itself rather than delegating this to the operating
    system, and it provides Cinémathèque with a robust and persistent data
    store. On a mid-range Linux server, Javelin is capable of recording in
    excess of 500 transactions per second while maintaining an average
    response time of less than 100 milliseconds. Javelin also performs an
    automatic restart and recovery to ensure that no data is lost as a
    consequence of a system outage.

    The Cinémathèque application, too, is written entirely in Java to
    maximise portability, reliability and flexibility. Cinémathèque
    supports true video on demand, as well as near video on demand via its
    multicast scheduler. Since every subscriber interaction is handled by
    Cinémathèque the number of subscribers watching on-demand programs can
    be monitored in real-time. Similarly, Cinémathèque also tracks, in
    real-time, the number of subscribers tuned to each reticulated
    free-to-air, pay, or multicast TV channel. This permits a service
    provider to perform very accurate capacity planning as well as knowing
    what content sells and what doesn’t.

    Very little of the HTML content returned to the set top box by
    Cinémathèque is static. Instead, Cinémathèque creates portions of many
    pages dynamically according to the attributes of the viewer’s
    subscription package, the titles and packages that they’ve previously
    purchased, titles that are currently book-marked, and the rating level
    of the content that the current user is permitted to see (to safeguard
    children from accessing inappropriate content).

    All transactions, including billing transactions, are processed by
    Cinémathèque in real-time. Cinémathèque gathers operational,
    statistical and performance data continuously, and records this to its
    transaction files; this data is available for on-demand display on
    system administration workstations.

    Cinémathèque is set top box independent and supports any number of
    different types of set top boxes simultaneously within a single
    deployment. Similarly, it does not rely on any set top box specific
    features and doesn’t require any specialised application software or
    middleware to be present in the set top box. Adding support for other
    set top boxes is straightforward and entails adapting several
    JavaScript functions which are embedded in HTML pages returned to the
    set top box; this JavaScript accommodates the inevitable differences
    between the ways that set top box vendors invoke their media players.
    These are important features in Cinémathèque because they help service
    providers avoid set top box vendor lock-in.

    A virtue of IP-based set top boxes is their uniformity – almost without
    exception they provide a consistent “application environment” by way
    of their standards-compliant HTTP, HTML and JavaScript implementations.
    Indeed, all set top box functions, including invoking, controlling and
    monitoring the embedded media player are achieved with JavaScript.
    Although they have the capability to run a Java Virtual Machine, most IP
    boxes don’t for two reasons: firstly, it substantially increases the
    memory footprint (something to be avoided in a cost sensitive consumer
    device), and secondly, most boxes don’t have sufficient CPU resources to
    spare (a box with a 400 MHz clock CPU is considered fast).

    Cinémathèque returns JavaScript objects to the set top box’s browser in
    response to each HTTP request; the data embedded in these objects is
    then rendered using JavaScript. This mechanism allows the look-and-feel
    designer to expose as little or as much of the service or
    program-related “metadata” to subscribers as they like without the
    requirement to change any server-side software.

    For optimum performance, Cinémathèque comes bundled with a Concurrent
    Computer Corporation iHawk application server. iHawks run Concurrent’s
    RedHawk Linux operating system which is a POSIX-compliant, real-time
    version of the open source Linux operating system. RedHawk is based on
    a standard Red Hat distribution but substitutes the usual kernel with a
    real-time enhanced one; it provides enhancements that maximise
    Cinémathèque’s performance.

    Cinémathèque uses Java’s extensive internationalisation support to make
    locale and language customisation straightforward. Each word and phrase
    that appears in the Cinémathèque administration client is maintained in
    a resource bundle – adding support for a new language simply requires
    adding the appropriate translations to the bundle. Cinémathèque
    currently supports English, Japanese, Korean, Simplified Chinese and
    Traditional Chinese and any combination of these languages can be used
    simultaneously within a single system on both set top boxes and the
    system’s administration workstations.
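    The resource-bundle mechanism described here is standard Java internationalisation. As a minimal illustration (the class name and message key below are hypothetical, not Cinémathèque's actual bundles):

```java
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

// Hypothetical in-memory bundle; a real deployment would typically keep one
// properties file or bundle class per locale (e.g. Messages_ja, Messages_ko),
// so adding a language means adding translations, not changing code.
public class Messages extends ListResourceBundle
{
    protected Object[][] getContents()
    {
        return new Object[][] {
            { "menu.rentals", "Active Rentals" }
        };
    }

    public static void main(String[] args)
    {
        ResourceBundle bundle = new Messages();
        System.out.println(bundle.getString("menu.rentals"));
    }
}
```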

    In residential mode, a subscriber can only access IPTV and other
    chargeable services after first logging in with their unique client id
    and password. Cinémathèque lets a subscriber assign a different
    password to each content rating classification to prevent children from
    accessing inappropriate material, thereby imposing parental control. In
    hospitality mode, access to services is controlled by Cinémathèque which
    receives guest check-in and check-out notifications from the hotel’s
    property management system.

    A subscriber can have an unlimited number of simultaneously active
    rentals and can switch between active rentals and initiate additional
    rentals at any time. Whenever video play-out is suspended, Cinémathèque
    automatically sets a bookmark for that rental – the subscriber can
    resume play-out either from the start of the program or from the
    bookmark. The subscriber can view their active rental list and review
    their complete rental history via their set top box at any time. The
    active rentals and rental history displays are filtered according to the
    rating level of the current login. Again, this prevents children from
    seeing references to inappropriate content.

    Cinémathèque also has an integral customer loyalty program that works in
    conjunction with its customer profiling capabilities. The loyalty
    program provides for standard and VIP customers and reward points can be
    allocated based on spending behaviour. Accumulated reward points can be
    redeemed for specially created package deals and service upgrades.

    Jetty was selected after PowerSource’s engineers had evaluated several
    servlet engines.

    So why did PowerSource choose Jetty?

    Firstly, Jetty offered superior performance. Secondly, it was easy to
    embed within a larger application. In this regard, PowerSource was
    looking for a servlet engine that didn’t “get in the way” of the rest of
    the larger application. Thirdly, it was particularly important that the
    servlet engine wasn’t a resource hog. And fourthly, Cinémathèque
    systems are installed at customer sites and are expected to run
    unattended in a lights out environment – PowerSource was looking for a
    servlet engine that the engineers could “set and forget”.

    Jetty’s reliability and performance counted highly in its favour because
    Cinémathèque essentially controls the delivery of premium subscription
    television services that customers are buying with their discretionary
    expenditure. In this situation, paying customers don’t tolerate service
    unavailability because they’ve become accustomed to TV not being
    interrupted. If the responsiveness of the IPTV service is poor, or if
    it is unreliable, then customers will buy their entertainment elsewhere.

    Finally, as the company’s software engineers were making their minds up
    about Jetty, it became obvious that there was another significant aspect
    related to performance: namely the super-responsiveness of the team at
    Mortbay and the enthusiasm of the Jetty users active on the mailing lists.

    PowerSource have several new products under development; their positive
    experience with Jetty, and their ability to rely on it, mean that it will
    remain one of the key components in PowerSource’s systems.

    Screen shots and diagrams

    Related links

    Cinematheque: http://www.powersource.com.au/cine

    RedHawk Linux: http://www.ccur.com/isd_solutions_redhawklinux.asp

    MediaHawk video servers: http://www.ccur.com/vod_default.asp

    Kreatel set top boxes: http://www.kreatel.se

  • Interview with Peter Rodgers of 1060 NetKernel™

    This Jetty Case Study takes a look at an intriguing
    software infrastructure product called 1060 NetKernel™.

    Announcing the recent release of version 2 of the product, 1060 Research describes NetKernel as “an advanced service oriented microkernel”. Complex applications are produced by creating simple services and then aggregating or pipelining them together. NetKernel services interact with the external environment via pluggable transports, such as SMTP, SOAP and – importantly – HTTP.

    NetKernel can be used standalone, for example in place of a J2EE application server, or alternatively embedded within a J2EE app server, or in fact embedded in any Java application.

    We spoke via email with Peter Rodgers – founder and CEO of 1060 Research and one of the product’s architects – to find out more about NetKernel, and how Jetty contributes to it:

    1. You describe NetKernel as a “REST” based microkernel. Firstly, can
      you explain what is “REST”?

      REST is an acronym for REpresentational State Transfer. It was coined by
      Roy Fielding in his PhD dissertation, which retrospectively presents a
      formalism of the Web architecture.

      What it means in practice is that resources are addressed by URI and
      instead of a resource being generated by hidden interactions it is generated
      by the transfer of state to a service – often though not exclusively
      expressed in the URI.

      Whilst this seems like a new pattern when applied in the context of the
      Web, it is really an old pattern that is basically one of the
      foundational principles of Unix. In Unix, software applications are
      modular services which may be configured by switches to process a
      resource. Higher order applications are created by orchestrating
      pipelines of lower-level software services – today this design pattern
      is frequently called “loose coupling”.

      The NetKernel microkernel allows software applications to be flexibly
      composed like a Unix system, but employs URI addressing and a URI address
      space abstraction to present a uniform application context.

    2. Now can you describe NetKernel for us?

      Fundamentally NetKernel is a virtual operating system abstraction which
      you could describe as “Unix meets the Web”. More practically, the
      microkernel provides the basis for a general-purpose Application
      Server which in particular supports rich XML processes and services.

      Software applications on NetKernel consist of fine-grained URI
      addressable services which may be composed into higher-order services or
      abstracted behind higher-level URI interfaces. Basically the Unix model
      of loose coupling in a URI address space.

      NetKernel’s URI address space is an *internal* abstraction – composite
      services may be exposed to the world by mapping their URI address onto a
      transport.

    3. You mention request shaping and request scheduling. Does NetKernel
      also support load balancing of requests or failover of services?

      We perform request shaping between transports and the microkernel
      scheduler – this ensures that we get close to ideal throughput for any
      given load[1].

      We can do this since the microkernel has a re-entrant asynchronous
      scheduler – in effect every transport-generated request initiates the
      execution of an asynchronous application on NetKernel.

      This is different to the free-running multi-threaded model typical of
      unmanaged application servers. Basically, once there are more than one
      or two threads per native CPU, adding more threads does not mean more
      processing; it just increases the native OS context-switching overhead.
      A well-managed system will linearly increase its throughput with load
      until all CPUs are fully occupied and then operate with constant
      throughput as load increases further – the NetKernel throttle ensures
      that the system operates in this regime.
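      The throttle behaviour described above – admit roughly as many concurrent
      requests as there are CPUs and queue the rest – can be sketched with a
      plain java.util.concurrent semaphore. This is an illustrative analogy
      only, not NetKernel's actual scheduler; the Throttle class and its names
      are invented for the example:

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Illustrative request throttle: permits bound the number of requests
// executing concurrently; excess requests queue instead of spawning
// more threads and paying context-switch overhead.
public class Throttle {
    private final Semaphore permits;

    public Throttle(int concurrency) {
        this.permits = new Semaphore(concurrency);
    }

    public <T> T shape(Supplier<T> request) {
        permits.acquireUninterruptibly(); // queue here once saturated
        try {
            return request.get();          // do the actual work
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        Throttle throttle = new Throttle(Runtime.getRuntime().availableProcessors());
        System.out.println(throttle.shape(() -> "handled")); // prints handled
    }
}
```

      Sized to the CPU count, the semaphore keeps the system in the constant-throughput
      regime the answer describes once load exceeds capacity.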

      The NetKernel scheduler actually allows concurrent execution of
      applications on a single Java thread if necessary.

      In terms of system-wide load-balancing – so far we’ve concentrated on
      the architectural fundamentals. NetKernel can be embedded within J2EE
      application servers. So, many Enterprise deployments steal the
      load-balancing infrastructure of their existing application servers!

      We will very soon have native JMS support which will allow NetKernel to
      be used as highly-scaled general purpose message-oriented middleware.

    4. What about hot deployment of services? Graceful service upgrades?

      As a microkernel architecture NetKernel is completely modular – it
      supports hot installation and updates of all services.

      It also supports version enforcement on modules which means that you can
      concurrently execute multiple generations of the same application in
      isolation on the same system. Good for development and it allows legacy
      systems to keep running irrespective of future additions.

    5. Are requests asynchronous, synchronous or either?

      NetKernel is an asynchronous architecture. However the microkernel will
      attempt to optimally reuse threads synchronously such that context
      switching is minimized.

      However, asynchronous applications are generally not easy to build and
      maintain, so we provide the NetKernel Foundation API which is written so
      that most applications can be developed using synchronous patterns – even
      though under the hood they’ll execute asynchronously.

      At the application level, just like on Unix, applications can fork
      asynchronous sub-processes – they may also explicitly join a forked
      sub-process in order to retrieve a result or handle exceptions.
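      In plain Java the same fork/join shape can be sketched with
      CompletableFuture; this is only an analogy, not the NetKernel Foundation
      API itself:

```java
import java.util.concurrent.CompletableFuture;

public class ForkJoinSketch {
    // Fork two asynchronous sub-requests, then join explicitly to
    // retrieve the results (exceptions would also surface at join()).
    static int forkAndJoin() {
        CompletableFuture<Integer> a = CompletableFuture.supplyAsync(() -> 2 * 3);
        CompletableFuture<Integer> b = CompletableFuture.supplyAsync(() -> 4 * 5);
        // ... the parent could do other work here before joining ...
        return a.join() + b.join();
    }

    public static void main(String[] args) {
        System.out.println(forkAndJoin()); // prints 26
    }
}
```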

    6. Requests are received on transports. What type of transports does
      NetKernel support?

      A transport on NetKernel is a little like a device-driver. It issues
      NetKernel URI requests based upon some application or application
      protocol specific event. It is pretty easy to add new transports.

      Out of the box we have HTTP, SMTP, POP, IMAP, Telnet, In-tray (directory
      based) and SOAP 1.1/1.2.

      The NetKernel administration applications and services run as
      Web-applications over HTTP on port 1060 – not very subtle subliminal
      marketing!

    7. Which brings us to Jetty – what role/s does Jetty play within NetKernel?

      We use Jetty as the backbone of our HTTP transport module. HTTP is a
      very important application protocol since Web-applications are the
      dominant class of Enterprise application today.

      It’s interesting, we didn’t design NetKernel as a Web-application server
      – we came from wanting a general resource processing model – but it
      turns out that it is very simple to expose NetKernel services as
      Web-applications over the Jetty HTTP transport.

      This is quite different from a Servlet, which provides a hard boundary
      between Web and non-Web. With a Web-application on NetKernel the
      Web-boundary becomes a continuum – you can also do some cool things like
      aspect-oriented layering over the web-address space, but that’s probably
      an advanced topic!

      Jetty is an outstanding HTTP server.

    8. Why did you select Jetty?

      We needed a clean, simple HTTP server. To bootstrap our system we
      started off writing our own – it worked, but you wouldn’t have wanted to
      rely on it! So we looked around and discovered Jetty – it had all the
      features we were looking for…

      • A pure HTTP server – we had no need for a Servlet engine.
      • Small footprint.
      • Great compliance with HTTP 1.0 and 1.1 standards.
      • Highly scalable.
      • Easy XML-based configurability.
      • Widespread adoption.
      • Extensible with a custom Request Handler chain.
      • An open-source implementation that overlapped with our business model.

      Jetty has proved to be an incredibly dependable infrastructure, to the
      point where we now just don’t really think about it!

    9. Have you written any custom extensions to Jetty for NetKernel, and if
      so, what were your experiences?

      The NetKernel HTTP transport implements a NetKernel ITransport interface
      which is managed by the microkernel. When the transport is started it
      fires up the Jetty container and uses an XML configuration document to
      declaratively configure Jetty – this includes things like request and
      thread limits and SSL.

      Very early on we developed a custom Request Handler which is hooked into
      the Jetty request handler stack. This handler hooks all HTTP requests
      and wraps them as URI requests against the NetKernel URI address space.
      So, with this handler, Jetty is a thin-bridge from the HTTP application
      protocol to the NetKernel virtual address space.

      That’s sort of where Jetty ends and NetKernel begins – though not quite.
      We’ve generalized the idea of the Servlet from being a static interface
      tightly bound to the HTTP protocol. Instead we offer a service
      called the HTTPBridge – this is a configurable HTTP Request
      Filter service which can be transparently layered over the
      URI address space.

      The HTTPBridge pre-processes the low-level HTTPRequest. It can be dynamically
      configured to XML’ize URI parameters or POST data, pre-process file uploads,
      process HTTP headers, process Cookies, etc. It is a general-purpose
      service which can slice and dice the low level HTTPRequest in many
      ways – including for example performing SOAP HTTP bindings.

      The HTTPBridge re-issues the pre-processed request into the NetKernel URI
      space. Ultimately the Bridge receives a response for the
      request, for which it then performs any final HTTP specific processing –
      such as setting HTTP response codes, cookies, etc and of course
      serializing any generated resource into the HTTPResponse stream.

      So the NetKernel HTTP transport is decomposed into clean, easily
      reconfigurable layers and of course the HTTPRequest/Response objects
      come from Jetty.

    10. Are there any features you would like to see in Jetty and why? For
      example, you mention NIO in relation to the Http transport …

      Jetty is an excellent solution as it is. We have a philosophy of always
      trying to keep everything as simple, as lean and as minimal as possible.
      So we’re always looking for more from less. As mentioned above we do
      this in the kernel by always operating with an efficiently small number
      of threads. We’re also considering a NetKernel Micro-Edition for J2ME
      embedded applications.

      At the moment, actually based on your empirical advice on the Jetty
      site, we have not used NIO. We’d be very interested to understand
      better if NIO would offer any advantage in terms of HTTP server
      footprint etc. Though obviously the performance trade-off would need to
      be understood.

    11. It looks like you guys have had fun with NetKernel – I’m referring
      in particular to a home monitoring system that you’ve put together
      (http://www.1060.org/blogxter/publish/4). . .

      Yes, though NetKernel is a serious software infrastructure, it is
      completely general purpose. This application was put together by Tony
      Butterfield as an example of something a little more entertaining – it
      also means he can talk about how much rain we get from anywhere in the
      world.

      I reckon a good criterion for evaluating software is ‘Is this cool?’.
      I put Jetty into that category straight-away – our hope is that
      people will have the same reaction the first time they boot up NetKernel.

    12. What licensing options are there for NetKernel?

      We have a dual-license business model – basically, NetKernel is on the
      open-source commons; to use it we ask you to OSI-license your code. If you
      are unable to, or prefer the additional benefits of a commercial
      relationship, then we offer flexible commercial licensing.

    13. You’ve just released NetKernel 2.0, what’s next on the horizon?

      Immediately we have a 2.0.1 update due in the next few weeks. This
      ships some trivial patches but more importantly will provide a new JMS
      transport which didn’t make the release cut for the 2.0 product.

      Our short-term plans are to keep explaining what NetKernel is! We’re
      finding that once people understand, they really like it – but when
      anything is fundamentally different it can take a while to get used to.

      Next steps for NetKernel are a general Unix-like security infrastructure.


    References:


    [1] Whitepaper: NetKernel Scalability

  • Missing Filter Mapping.

    Because I consider some of the “features” of the latest 2.4 specification rather dubious or expensive,
    I was reluctant to implement them as core features of Jetty. Instead, Jetty 5.1 will use optional
    filters to provide JSR77 statistics, wrap under dispatch semantics, request events and request attribute
    events.
    Thus Jetty is able to support these features needed for specification compliance, but they can
    be removed, with no lingering classpath dependencies, simply by removing the filters.
    While this demonstrates the power of filters, it has also revealed a missing filter mapping in
    the specification: there is no way to define a filter to intercept ALL dispatches in a web application.
    A filter mapped to /* for REQUEST, FORWARD, INCLUDE and ERROR comes close, but is unable to filter
    named dispatches. Named dispatches can be filtered with filters mapped to a servlet name, but it
    is not possible to generically map a filter to all servlets. Something for the 2.5 spec…
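    For comparison, the closest the 2.4 deployment descriptor gets is a filter
    mapped to the /* pattern with every dispatcher type listed, which still
    cannot see named dispatches (the filter name here is hypothetical):

```xml
<filter-mapping>
    <filter-name>statsFilter</filter-name>
    <url-pattern>/*</url-pattern>
    <dispatcher>REQUEST</dispatcher>
    <dispatcher>FORWARD</dispatcher>
    <dispatcher>INCLUDE</dispatcher>
    <dispatcher>ERROR</dispatcher>
</filter-mapping>
```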

  • Next Generation Jetty and Servlets

    Jetty 5.0.0 is out the door and the 2.4 servlet spec is implemented.
    So what’s next for Jetty and what’s next for the servlet API? It’s been a long journey for
    Jetty from its birth in late 1995 to the 5.0.0 release. When the core of Jetty was written,
    there was no J2ME, J2EE, or even J2SE, nor a servlet API; there was no non-blocking IO, and
    multiple-CPU machines were rare and no JVM used them. The situation is much the same for the servlet
    API, which has grown from a simple protocol handler to a core component architecture for
    enterprise solutions.
    While these changing environments and requirements have mostly been handled well, the
    results are not perfect: I have previously blogged about the servlet API
    problems
    and Jetty is no longer best of breed when it comes to raw speed or features.
    Thus I believe the mid-term future for Jetty and the Servlet API should involve a bit more
    revolution than evolution. For this purpose the
    JettyExperimental(JE) branch
    has been created and is being used to test ideas to greatly improve the raw HTTP performance
    as well as the application API. This blog introduces JE and some of my ideas for how Jetty and
    the servlet API could change.
    Push Me Pull You
    At the root of many problems with the servlet API is that it is a pull-push API, where the servlet
    is given control and pulls headers, parameters and other content from the request object before pushing
    the response code, headers and content at the response object. This style of API, while very convenient for
    println-style dynamic content generation, has many undesirable consequences:

    • The request headers must be buffered in complex/expensive hash structures so that the application
      can access them in arbitrary order. One could ask why application code should be handling HTTP headers anyway…
    • The application code contains the IO loops to read and write content. These IO loops
      are written assuming the blocking IO API.
    • The pull-push API is based on stream, reader and writer abstractions, which makes it impossible for the
      servlet application code
      to use efficient IO mechanisms such as gather-writes or memory-mapped file buffers for
      static content.
    • The response headers must be buffered in complex/expensive hash structures so that applications
      can set and reset them in arbitrary order. One could ask why application code should be writing HTTP headers anyway…
    • The application code needs to be aware of HTTP codes and headers. The API itself provides
      no support for separating the concerns of content generation and content transport.

    From a container implementer’s point of view, it would be far more efficient for
    the servlet API to be push-pull, where the container pushes headers, parameters and content
    into the API as they are parsed from a request and then pulls headers and content from the
    application as they are needed to construct the response.
    This would remove the need for applications to do IO, additional buffering, arbitrary
    ordering and dealing with application developers that don’t read HTTP RFCs.
    Unfortunately a full push-pull API would also push an event driven model onto the application,
    which is not an easy model to deal with nor suitable for the simple println style of
    dynamic content generation used for most “hello world” inspired servlets.
    The challenge of Jetty and servlet API reform is to allow the container to be written in
    the efficient push-pull style, but to retain the fundamentals of pull-push in the application
    development model we have come to know and live with. The way to do this is to change the
    semantic level of what is being pushed and pulled, so that the container is written to
    push-pull HTTP headers and bytes of data, but the application is written to pull-push content
    in a non-IO style.
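    To make the idea concrete, here is a hypothetical sketch of what such a
    content-level interface might look like; none of these names exist in the
    servlet API, they are invented purely to illustrate the inversion:

```java
import java.util.Map;

// Hypothetical content-level servlet: the container parses the request
// body into an Object and pushes it in; the application returns an
// Object and the container serializes it and sets the HTTP headers.
interface ContentServlet {
    Object serve(String path, Map<String, String> meta, Object content);
}

public class ContentApiSketch {
    // A toy "container" driving the interface: a real one would parse
    // bytes off the wire and serialize the returned object back out.
    static Object dispatch(ContentServlet servlet, String path, Object parsed) {
        return servlet.serve(path, Map.of("Content-Type", "text/plain"), parsed);
    }

    public static void main(String[] args) {
        ContentServlet echo = (path, meta, content) -> "echo: " + content;
        System.out.println(dispatch(echo, "/demo", "hello")); // prints echo: hello
    }
}
```

    The application never touches streams or headers; IO and HTTP stay on the
    container side of the interface.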
    Content IO
    Except for the application/x-www-form-urlencoded
    mime-type, the application must perform its own IO to read content from the request and to
    write content to the response. Due to the nature of the servlet API and threading model,
    this IO is written assuming blocking semantics.
    Thus it is difficult to apply alternative IO methods, such as NIO.
    Unfortunately the event driven nature of non-blocking IO is incompatible with the servlet
    threading model, so it is not possible to simply ask developers to start writing IO assuming
    non-blocking IO semantics or using NIO channels.
    The NIO API cannot be effectively used without direct access to the low level IO classes, as
    low level API is required to efficiently write static content using a file
    MappedByteBuffer to a
    WritableByteChannel
    or to combine content
    and HTTP header into a single packet without copying using a
    GatheringByteChannel.
    The true power of the NIO
    API cannot be abstracted into InputStreams and OutputStreams.
    Thus to use NIO, the servlet API must either expose these low levels (bad idea – as NIO might not
    always be the latest and greatest) or to take away content IO responsibilities from the application
    developers.
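    The gather-write itself is a real java.nio facility: FileChannel implements
    GatheringByteChannel, so a header buffer and a content buffer can be handed
    to the channel in one call. A small self-contained demonstration, writing
    to a temporary file rather than a socket:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class GatherWrite {
    // Write header and body buffers with a single gathering write, then
    // read the file back to show the bytes landed contiguously.
    static String demo(String header, String body) {
        try {
            Path out = Files.createTempFile("gather", ".http");
            ByteBuffer[] buffers = {
                ByteBuffer.wrap(header.getBytes(StandardCharsets.US_ASCII)),
                ByteBuffer.wrap(body.getBytes(StandardCharsets.US_ASCII))
            };
            try (FileChannel ch = FileChannel.open(out, StandardOpenOption.WRITE)) {
                ch.write(buffers); // GatheringByteChannel: one call, no copying
            }
            String result = new String(Files.readAllBytes(out), StandardCharsets.US_ASCII);
            Files.delete(out);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("HTTP/1.1 200 OK\r\n\r\n", "hello"));
    }
}
```

    A SocketChannel supports the same write(ByteBuffer[]) call, which is
    exactly the header-plus-content packet case described above.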
    The answer is to take away from the application servlets the responsibility for performing
    IO. This has already been done for application/x-www-form-urlencoded, so
    why not let the container handle the IO for text/xml, text/html etc.
    If the responsibility for reading and writing bytes (or characters) was moved to the container,
    then the application servlet code could deal with higher level content Objects
    such as org.w3c.dom.Document, java.io.File or java.util.HashMap. Such a container mechanism
    would avoid the current need for many webapps to provide their own implementation of a
    multipart request class or
    Compression filter.
    If we look at the client side of HTTP connections, the
    java.net package provides the
    ContentHandlerFactory mechanism so that
    the details of IO and parsing content can be hidden behind a simple call to
    getContent(). Adding a similar mechanism (and
    a setContent() equivalent) to the servlet API would move the IO responsibility
    to the container. The container could push-pull bytes from the content factories and
    the application could pull-push high level objects from the same factories.
    Note that a content based API does not preclude streaming of content or require that large
    content be held in memory. Content objects passed to and from the container could include
    references to content (eg File), content handlers (JAXP handler) or even Readers, Writers,
    InputStream and OutputStreams.
    HTTP Headers
    As well as the IO of content, the application is currently responsible for handling the
    associated meta-data such as character and content encoding, modification dates and caching control.
    This meta-data is mostly well specified in HTTP and MIME RFCs and could be best handled by
    the container itself rather than by the application or the libraries bundled with it. For
    example it would be far better for the container to handle gzip encoding of content directly
    to/from its private buffers rather than for webapps to bundle their own CompressFilter.
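    The compression itself is just the standard java.util.zip stream pair; a
    minimal sketch of the round trip a container could perform on the
    application's behalf:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipSketch {
    // On the way out: compress the response bytes directly, so the
    // webapp never needs to look at Accept-Encoding at all.
    static byte[] compress(byte[] plain) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
                gz.write(plain);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // On the way in: transparently inflate a gzipped request body.
    static byte[] decompress(byte[] zipped) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(zipped))) {
            return gz.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] page = "<html>hello</html>".getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(decompress(compress(page)), StandardCharsets.UTF_8));
    }
}
```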
    Without knowledge of which HTTP headers an application uses, or in what order they
    will be accessed, the container is forced to parse incoming requests into expensive hashtables
    of headers. The vast majority of applications do not deal with most headers in
    an HTTP request. For example, consider the following request from Mozilla Firefox:

    GET /MB/search.png HTTP/1.1
    Host: www.mortbay.com
    User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040614 Firefox/0.8
    Accept: image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1
    Accept-Language: en-us,en;q=0.5
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    keep-alive: 300
    Connection: keep-alive
    Referer: http://localhost:8080/MB/log/
    Cookie: JSESSIONID=1ttffb8ss1idk
    If-Modified-Since: Fri, 21 Nov 2003 16:59:29 GMT
    Cache-Control: max-age=0
    

    an application is likely to only make indirect usage of the Host and Cookie
    headers and perhaps direct usage of the If-Modified-Since and Accept-Encoding
    headers. Yet all these headers and values are available via the HttpServletRequest object to be pulled
    by the application at any time during the request processing. Expensive hashmaps are created
    and values received as bytes either have to be stringified or buffers kept aside for later
    lazy evaluation.
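    The lazy-evaluation idea can be sketched as a header table that keeps raw
    bytes and only stringifies a value the first time it is pulled; the class
    below is invented purely for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Illustrative lazy header table: values stay as the raw bytes sliced
// out of the request buffer until the application actually pulls one.
public class LazyHeaders {
    private final Map<String, byte[]> raw = new HashMap<>();
    private final Map<String, String> cooked = new HashMap<>();

    void put(String name, byte[] value) {
        raw.put(name.toLowerCase(), value);
    }

    // Stringify on first access only; most headers are never asked for,
    // so most values are never converted at all.
    String get(String name) {
        return cooked.computeIfAbsent(name.toLowerCase(), k -> {
            byte[] v = raw.get(k);
            return v == null ? null : new String(v, StandardCharsets.ISO_8859_1);
        });
    }

    public static void main(String[] args) {
        LazyHeaders h = new LazyHeaders();
        h.put("Host", "www.mortbay.com".getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(h.get("host")); // prints www.mortbay.com
    }
}
```

    For the firefox request above, only Host, Cookie and perhaps two other
    values would ever be converted; the other nine stay as untouched bytes.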
    If the application was written at a content level, then most (if not all) HTTP header
    handling could be performed by the content factories. For example, if given an org.w3c.dom.Document
    to write, the container could set the http headers for a content type of text/xml with
    an acceptable charset and encoding selected by server configuration and the request headers.
    Once the headers are set, the byte content can be generated accordingly by the container,
    but scheduled so that excess buffering is not required and non-blocking IO can be done.
    Unfortunately, not all headers will be able to be handled directly from the content objects.
    For example, If-Modified-Since headers could be handled for a File content Object,
    but not for a org.w3c.dom.Document. So a mechanism for the application to communicate additional
    meta-data will need to be provided.
    Summary and Status
    JettyExperimental now implements most of HTTP/1.1 in a push-pull architecture that works with
    either bio or nio. When using nio, gather writes are used to combine header and content into
    a single write and static content is served directly from mapped file buffers. An advanced
    NIO scheduler avoids many of the
    NIO problems
    inherent with a producer/consumer model.
    Thus JE is ready as a platform to experiment with the content API ideas introduced
    above. I plan to initially work toward a pure content based application API and thus to
    discover what cannot be easily and efficiently fitted into that model. Hopefully what
    will result is lots of great ideas for the next generation servlet API and a great HTTP
    infrastructure for Jetty6.

  • Servlet Performance Report – The Jetty Response.

    Christopher Merrill of Web Performance Inc has released a report on Servlet Performance, which includes Jetty.

    Firstly, it is good to see somebody again tackling the difficult task of benchmarking servlet containers. Such a task will always draw criticism, and the results will always be disputed.

    The basic methodology of the test is to gradually increase load on the servers tested and chart their performance over time. The results in a nutshell were that all servers did pretty much the same up to a point. One by one, the servers reached a limit and their performance degraded.

    Unfortunately, Jetty was one of the first servers to break ranks and display degraded performance. This, however, is not as disturbing as it might seem at first. The test used the default configuration for all the servers, which differs greatly between them. For example, Jetty by default is configured with a maximum of 50 threads, while Tomcat has 150 by default. This is a significant flaw in the study and makes it all but useless for comparing the maximum performance of the different containers. In fact it is amazing that Jetty kept pace with the others for so long, considering it had limited resources.

    The bad news for Jetty is that, even if you ignore the other containers in the study, the shape of the curve is not nice. When a server hits a configured limit it should gracefully degrade its performance in the face of more load. Unfortunately, when presented with the load in this study, Jetty’s degradation is a bit less than graceful. We have not seen this ugly curve in our own testing, and reports from the field are mostly about grace under fire. So hopefully this is an artifact of the artificial load used in the test. However, this test does show that there is at least one load profile that causes Jetty grief once resources are exhausted. So we have some work to do.

    In summary, the study is interesting in that it at least shows that most servlet containers are approaching terminal velocity for less than pathological loads. But if you want to maximise throughput for your web application, then don’t use the default configuration. Instead read the Jetty Optimisation Guide.

  • Jetty on the Mort Bay host.

    This case study describes how
    Jetty
    has been used on our own sites, to show that we are “eating our own dogfood”.
    While there is nothing revolutionary in this blog, it is sometimes good to see
    examples of the ordinary and I believe it is a good example
    of how the simplicity and flexibility of Jetty has allowed simple things to
    be done simply.

    The jetty host is donated to the Jetty project by
    Mort Bay Consulting
    and
    InetU, and the machine is
    now not of the highest spec: a 500MHz Celeron with 128MB running FreeBSD at 1061
    BogoMIPS (about half the speed of my aging notebook). On this
    machine we run over 13 web contexts for 6 domains in a single
    Jetty server with a 1.4.1-b21 Sun JVM using the latest release of Jetty 5.

    The Sites:
    The websites run by the server are for diverse purposes and
    are implemented using diverse technologies:

    /jetty/* The Jetty site is
    implemented as a servlet that wraps a look and feel around static content and
    is deployed as an unpacked web application.
    /demo/* The demo context is a
    custom context built with a collection of Jetty handlers using the Java API called from the
    Jetty configuration XML.
    /servlets-examples/* The jakarta servlet examples deployed and run as a packed WAR.
    /jsp-examples/* The jakarta JSP examples deployed precompiled as a packed WAR.
    /javadoc/* The jetty
    javadoc in a jar file deployed as a webapplication. Because there is no
    WEB-INF structure, the jar is served purely as static content.
    /cgi-bin/* A
    context configured to run the Jetty CGI servlet
    www.mortbay.com/ The Mort Bay site
    is a look and feel servlet wrapping static content deployed as a standard
    webapplication.
    www.mortbay.com/images/holidays A photo diary site, created by deploying a
    directory structure as a webapplication so that its static content is served. An HTAccess handler is used to secure access to some areas (nothing exciting
    I’m afraid).
    www.mortbay.com/MB This blog site, which uses the excellent
    Blojsom web application using
    velocity rendering and
    log4j
    www.collettadicastelbianco.com/ A site about the Italian borgo telematico in
    which I sometimes live and work. Written in dynamic JSP 2.0 with tag files and heavy use of Servlet 2.4 Filters as aspects. Responsible for me
    finally liking JSPs
    www.jsig.com/ Static content web application.
    www.safari-afrika.com/ Static content web application.
    www.ncc.com.au/ Static content web application.

    Configuration:

    The server is configured from a single Jetty XML file using explicit
    adding of all contexts rather than automatic discovery. Doing this is good for security, but also allows extra configuration to
    be added for each context, such as customized logging.

    While all domains have unique IP addresses, the site is actually configured to treat them as virtual hosts. This allows simpler
    configuration of a single set of listeners for all contexts. A default root context is also configured to redirect requests without
    a host header to an appropriate context.

    Two listeners (http 8080 & https 8443) are configured using a shared thread pool of max 30 threads. Ipchains is used to redirect
    ports 80 and 443 to these listeners, and the server is run as an unprivileged user.

    Two authentication realms are defined, for jetty demo and jakarta servlet demo. Both use simple property files.
    The realm name is used to map the realm to the webapplications.
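    A jetty.xml along these lines is sketched below, purely to illustrate the
    shape of the configuration; the exact Jetty 4/5 class and property names
    are from recollection and may differ in detail:

```xml
<Configure class="org.mortbay.jetty.Server">
  <!-- One listener shown; the real server also adds an SSL listener. -->
  <Call name="addListener">
    <Arg>
      <New class="org.mortbay.http.SocketListener">
        <Set name="Port">8080</Set>
        <Set name="MaxThreads">30</Set>
      </New>
    </Arg>
  </Call>

  <!-- Contexts are added explicitly rather than auto-discovered. -->
  <Call name="addWebApplication">
    <Arg>/jetty/*</Arg>
    <Arg>./webapps/jetty</Arg>
  </Call>
</Configure>
```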

    Logging:

    Jetty 5 uses commons logging, plus the Jetty 4 logger wrapped as a commons logger. This is configured in the jetty.xml
    to log to a file that is rolled over daily and historic files are kept for 90 days. A specific logger instance is
    declared for the classes from the colletta web site. This logger is also mapped to the context name so that
    ServletContext.log calls are also directed to it by jetty and all the log information generated by the app is in one file.

    NCSA requests logs are defined for all the main contexts and a catch all request log is defined for those without.
    The webalizer utility is used to generate regular reports of our loads on each context.

    Conclusion:

    The whole thing is kept running using the
    Bernstein daemontools supervise program which calls “java -jar start.jar”
    with no special parameters.

    So really nothing unusual to see here, just business as usual, so move on…