
  • Eat What You Kill without Starvation!

    Jetty 9 introduced the Eat-What-You-Kill execution strategy (named after a hunting proverb in the sense that one should only kill to eat; the use of this phrase is not an endorsement of hunting nor of killing wildlife for food or sport) to apply mechanically sympathetic techniques to the scheduling of threads in the producer-consumer patterns used for core capabilities of the server. The initial implementations proved vulnerable to thread starvation, so Jetty-9.3 introduced dual scheduling strategies to keep the server running, which in turn suffered from lock contention on machines with more than 16 cores. The Jetty-9.4 release now contains the latest incarnation of the Eat-What-You-Kill scheduling strategy, which provides mechanical sympathy without the risk of thread starvation, in a single strategy. This blog is an update of the original post with the latest refinements.

    Parallel Mechanical Sympathy

    Parallel computing is a “false friend” for many web applications. The textbooks will tell you that parallelism is about decomposing large tasks into smaller ones that can be executed simultaneously by different computing engines to complete the task faster. While this is true, the issue is that for web application containers there is no agreement on what the “large task” that needs to be decomposed actually is.

    From the application’s point of view, the large task to be solved is how to render a complex page for a user, combining multiple requests and resources, using many services for authentication and perhaps RESTful access to a data model on multiple back-end servers. For the application, parallelism can improve the quality of service of rendering a single page by spreading the decomposed tasks over all the available CPUs of the server.

    However, a web application container has a different large task to solve: how to provide service to hundreds or thousands, maybe even hundreds of thousands, of simultaneous users. Unfortunately, for the container, the way to optimally allocate this decomposed task to CPUs is completely opposite to how the application would like its decomposed tasks to be executed.

    Consider a server with 4 CPUs serving 4 users, each of which has 4 tasks. The application’s ideal view of parallel decomposition looks like:

    Labels UxTy represent Task y for User x. Tasks for the same user are coloured alike.

    This view suggests that each user’s combined task will be executed in minimum time. However, some users must wait for prior users’ tasks to complete before their execution can start, so average latency is higher.

    Furthermore, we know from Mechanical Sympathy that such ideal execution is rarely possible, especially if there is data shared between tasks. Each CPU needs time to load its caches and registers with data before that data can be acted on. If that data is specific to the problem each user is trying to solve, then the real view of the parallel execution looks more like the following, with the orange blocks indicating the time taken to load the CPU cache with user and task related data:

    Labels UxTy represent Task y for User x. Tasks for the same user are coloured alike. Orange blocks represent cache load time.

    So from the container’s point of view, the last thing it wants is the data from one user’s large problem spread over all its CPUs, because that means that when it executes the next task, it will have a cold cache that must be reloaded with the data of the next user. Furthermore, executing tasks for the same user on different CPUs risks Parallel Slowdown, where the cost of mutual exclusion, synchronisation and communication between CPUs can increase the total time needed to execute the tasks beyond that of serial execution. If the tasks are fully mutually excluded on user data (unlikely, but a bounding case), then the execution could look like:

    For optimal execution from the container’s point of view, it is far better if tasks from each user, which use common data, are kept on the same CPU, so the cache only needs to be loaded once and there is no mutual exclusion on user data:

    While this style of execution does not achieve the minimal latency and maximal throughput of the idealised application view, in reality it is the fairest and most efficient execution, with all users receiving similar quality of service and the best average latency.

    In summary, when scheduling the execution of parallel tasks, it is best to keep tasks that share data on the same CPU so that they may benefit from a hot cache (the original blog contains some micro benchmark results that quantify the benefit).

    Produce Consume (PC)

    In order to facilitate the decomposition of large problems into smaller ones, the Jetty container uses the Producer-Consumer pattern:

    • The NIO Selector produces IO events that need to be consumed by reading, parsing and handling the data.
    • A multiplexed HTTP/2 connection produces Frames that need to be consumed by calling the Servlet Container. Note that the producer of HTTP/2 frames is itself a consumer of IO events!

    The producer-consumer pattern adds another way that tasks can be related by data. Not only might they be for the same user, but consuming a task will share the data that results from producing the task. A simple implementation can achieve this by using only a single CPU to both produce and consume the tasks:

    while (true)
    {
      Runnable task = _producer.produce();
      if (task == null)
        break;
      task.run();
    }

    The resulting execution pattern has good mechanical sympathy characteristics:

    Labels UxPy represent Produce Task y for User x; labels UxCy represent Consume Task y for User x. Tasks for the same user are coloured in similar tones. Orange blocks are cache load times.

    Here all the produced tasks are immediately consumed on the same CPU with a hot cache!  Cache load times are minimised, but the cost is that the server will suffer from Head of Line (HOL) Blocking, where the serial execution of tasks from a queue forces tasks to wait for the completion of unrelated tasks. In this case U1C0 need not wait for U0C0, and U2C0 need not wait for U1C1 or U0C1, etc. There is no parallel execution, and thus this is not an optimal use of the server’s resources.
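    Head-of-line blocking is easy to observe outside Jetty with only JDK classes. The sketch below is a toy model with illustrative names (not Jetty code): it parks a slow task on a single consuming thread and shows that a quick task queued behind it cannot run until the slow one completes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model of head-of-line blocking: a single consuming thread executes
// tasks serially from its queue, so a quick task waits for an unrelated
// slow task in front of it.
class HeadOfLineDemo {
    static boolean quickTaskBlockedBehindSlowOne() throws InterruptedException {
        ExecutorService single = Executors.newSingleThreadExecutor();
        CountDownLatch slowGate = new CountDownLatch(1);
        CountDownLatch quickDone = new CountDownLatch(1);
        // A slow (here: deliberately blocked) task occupies the only consumer thread.
        single.execute(() -> {
            try { slowGate.await(); } catch (InterruptedException ignored) { }
        });
        // The quick task cannot start until the slow one completes.
        single.execute(quickDone::countDown);
        boolean blocked = !quickDone.await(200, TimeUnit.MILLISECONDS);
        slowGate.countDown(); // release the slow task
        quickDone.await();    // now the quick task runs
        single.shutdown();
        return blocked;
    }
}
```

    With a pool of two or more threads the quick task would complete immediately, which is exactly the motivation for the PEC strategy that follows.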

    Produce Execute Consume (PEC)

    To solve the HOL blocking problem, multiple CPUs must be used, so that produced tasks can be executed in parallel and, even if one is slow or blocks, the other CPUs can progress the other tasks. To achieve this, a typical solution is to have one Thread executing on a CPU that will only produce tasks, which are then placed in a queue of tasks to be executed by Threads running on other CPUs. Typically the task queue is abstracted into an Executor:

    while (true)
    {
        Runnable task = _producer.produce();
        if (task == null)
            break;
        _executor.execute(task);
    }

    This strategy could be considered the canonical solution to the producer-consumer problem, where producers are separated from consumers by a queue, and it is at the heart of architectures such as SEDA. This strategy solves the head of line blocking issue well, since all tasks produced can complete independently in different Threads on different CPUs:

    This represents a good improvement in throughput and average latency over the simple Produce Consume solution, but the cost is that every consumed task is executed on a different Thread (and thus likely a different CPU) from the one that produced it. While this may appear a small cost for avoiding HOL blocking, our experience is that CPU cache misses significantly reduced the performance of early Jetty 9 releases.

    Eat What You Kill (EWYK) AKA Execute Produce Consume (EPC)

    To achieve good mechanical sympathy and avoid HOL blocking, Jetty has developed the Execute Produce Consume strategy, which we have nicknamed Eat What You Kill (EWYK) after the expression stating that a hunter should only kill an animal they intend to eat. Applied to the producer-consumer problem, this policy says that a thread should only produce (kill) a task if it intends to consume (eat) it. A task queue is still used to achieve parallel execution, but it is the producer that is dispatched rather than the produced task:

        while (true)
        {
            Runnable task = _producer.produce();
            if (task == null)
                break;
            _executor.execute(this); // dispatch production
            task.run(); // consume the task ourselves
        }

    The result is that a task is consumed by the same Thread, and thus likely the same CPU, that produced it, so that consumption is always done with a hot cache:

    Moreover, because any thread that completes consuming a task will immediately attempt to produce another task, there is the possibility of a single Thread/CPU executing multiple produce/consume cycles for the same user. The result is improved average latency and reduced total CPU time.

    Starvation!

    Unfortunately, a pure implementation of EWYK suffers from a fatal flaw! Since any thread producing a task will go on to consume that task, it is possible for all threads/CPUs to be consuming at once. This was initially seen as a feature, as it exerted good back pressure on the network: a busy server used all its resources consuming existing tasks rather than producing new ones. However, in an application server, consuming a task may be a blocking process that waits for more data/frames to be produced. If every thread/CPU ends up consuming such a blocking task, then there are no threads left available to produce the tasks that would unblock them. Deadlock!

    A real example of this occurred with HTTP/2, when every Thread from the pool was blocked in an HTTP/2 request because it had used up its flow control window. The window can be expanded by flow control frames from the other end, but there were no threads available to process the flow control frames!
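    The starvation scenario can be reproduced in miniature with JDK classes only. In this hypothetical sketch (a model of the problem, not Jetty code) every pool thread — here just one — is busy consuming a task that blocks waiting for work that only another pool thread could run, so nothing progresses:

```java
import java.util.concurrent.*;

// Toy model of consumer starvation: the pool's only thread is consuming,
// and the task it is waiting for can never be scheduled.
class StarvationDemo {
    static boolean allThreadsConsuming() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Future<?> consumer = pool.submit(() -> {
            // This "consumer" needs a task run by the pool to continue,
            // but the pool has no free thread left to run it.
            Future<?> producer = pool.submit(() -> { });
            try {
                producer.get();
            } catch (InterruptedException | ExecutionException ignored) {
            }
        });
        boolean deadlocked;
        try {
            consumer.get(200, TimeUnit.MILLISECONDS);
            deadlocked = false;
        } catch (ExecutionException e) {
            deadlocked = false;
        } catch (TimeoutException e) {
            deadlocked = true; // still blocked: starvation
        }
        pool.shutdownNow();   // interrupt so the JVM can exit
        return deadlocked;
    }
}
```

    The adaptive strategy described next exists precisely to guarantee that this situation cannot arise.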

    Thus the EWYK execution strategy used in Jetty is now adaptive and can use the most appropriate of the three strategies outlined above, ensuring there is always at least one thread/CPU producing so that starvation does not occur. To be adaptive, Jetty uses two mechanisms:

    • Tasks that are produced can be interrogated via the Invocable interface to determine if they are non-blocking, blocking, or can be run in either mode. NON_BLOCKING or EITHER tasks can be directly consumed in PC mode.
    • The thread pools used by Jetty implement the TryExecutor interface, which supports the method boolean tryExecute(Runnable task). This allows the scheduler to know if a thread was available to continue producing, and thus allows EWYK/EPC mode; otherwise the task must be passed to an executor to be consumed in PEC mode. To implement this semantic, Jetty maintains a dynamically sized pool of reserved threads that can respond to tryExecute(Runnable) calls.
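    A minimal sketch of the reserved-thread hand-off behind tryExecute, using only JDK classes (the class name and the SynchronousQueue approach are our illustrative assumptions, not Jetty’s actual implementation):

```java
import java.util.concurrent.SynchronousQueue;

// Sketch of a reserved-thread pool: tryExecute only succeeds if a reserved
// thread is currently idle and waiting for a hand-off, which tells the
// scheduler whether it is safe to consume the task itself (EWYK/EPC mode).
class ReservedThreads {
    private final SynchronousQueue<Runnable> handoff = new SynchronousQueue<>();

    ReservedThreads(int reserved) {
        for (int i = 0; i < reserved; i++) {
            Thread thread = new Thread(() -> {
                try {
                    while (true)
                        handoff.take().run(); // wait, then run the handed-off task
                } catch (InterruptedException stopped) {
                    // reserved thread retired
                }
            });
            thread.setDaemon(true);
            thread.start();
        }
    }

    boolean tryExecute(Runnable task) {
        // offer() succeeds only if a reserved thread is already waiting;
        // otherwise it returns false immediately, without queuing.
        return handoff.offer(task);
    }
}
```

    The crucial property is that tryExecute never queues: it either hands the producer over to an idle reserved thread, or fails fast so the caller can fall back to PEC mode.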

    Thus the simple Produce Consume (PC) model is used for non-blocking tasks; for blocking tasks, the EWYK, aka Execute Produce Consume (EPC), mode is used if a reserved thread is available; otherwise the SEDA-style Produce Execute Consume (PEC) model is used.

    The adaptive EWYK strategy can be written as:

        while (true)
        {
            Runnable task = _producer.produce();
            if (task == null)
                break;
            if (Invocable.getInvocationType(task) == NON_BLOCKING)
                task.run();                     // Produce Consume
            else if (executor.tryExecute(this)) // recruit a new producer?
                task.run();                     // Execute Produce Consume (EWYK!)
            else
                executor.execute(task);         // Produce Execute Consume
        }
    

    Chained Execution Strategies

    As stated above, in the Jetty use-case it is common for the execution strategy used by the IO layer to call tasks that are themselves an execution strategy for producing and consuming HTTP/2 frames. Thus EWYK strategies can be chained, and by knowing the mode in which the prior strategy has executed them, the strategies can be even more adaptive.

    The adaptable chainable EWYK strategy is outlined here:

      while (true) {
        Runnable task = _producer.produce();
        if (task == null)
          break;
        if (thisThreadIsNonBlocking())
        {
          switch (Invocable.getInvocationType(task))
          {
            case NON_BLOCKING:
              task.run();                 // Produce Consume
              break;
            case BLOCKING:
              executor.execute(task);     // Produce Execute Consume
              break;
            case EITHER:
              executeAsNonBlocking(task); // Produce Consume
              break;
          }
        }
        else
        {
          switch (Invocable.getInvocationType(task))
          {
            case NON_BLOCKING:
              task.run();                   // Produce Consume
              break;
            case BLOCKING:
              if (_executor.tryExecute(this))
                task.run();                 // Execute Produce Consume (EWYK!)
              else
                executor.execute(task);     // Produce Execute Consume
              break;
            case EITHER:
              if (_executor.tryExecute(this))
                task.run();                 // Execute Produce Consume (EWYK!)
              else
                executeAsNonBlocking(task); // Produce Consume
              break;
          }
        }
      }
    An example of how the chaining works: an HTTP/2 task declares itself as invocable EITHER in blocking or non-blocking mode. If the IO strategy is operating in PEC mode, then the HTTP/2 task is in its own thread and free to block, so it can itself use EWYK and potentially execute a blocking task that it produced.

    However, if the IO strategy has no reserved threads, it cannot risk queuing an important flow control frame in a job queue. Instead it can execute the HTTP/2 task as a non-blocking task in PC mode. So even if the last available thread is running the IO strategy, it can use PC mode to execute HTTP/2 tasks in non-blocking mode. The HTTP/2 strategy is then always able to handle flow control frames, as they are non-blocking tasks run as PC, while all other frames that may block are queued with PEC.

    Conclusion

    The EWYK execution strategy has been implemented in Jetty to improve performance through mechanical sympathy, whilst avoiding the issues of Head of Line blocking, Thread Starvation and Parallel Slowdown. The team at Webtide continues to work with our clients and users to analyse and innovate better solutions to serve high performance real world applications.

  • CometD 4.0.0 Released

    The CometD Project is happy to announce the availability of CometD 4.0.0.
    CometD 4.0.0 builds on top of the CometD 3.1.x series, bringing improvements and new features.
    You can find a migration guide at the official CometD documentation site.

    What’s new in CometD 4.0.0

    The main theme behind CometD 4.0.x is the complete support for asynchronous APIs.
    CometD 3.1.x had a number of APIs, in particular server-side extensions and server-side channel listeners, that required applications to return a value from the API methods they implemented.
    A typical example is the support for authentication during a CometD handshake: applications needed to implement SecurityPolicy.canHandshake(...):

    // CometD 3.1.x
    public class MyAppAuthenticator extends DefaultSecurityPolicy {
        @Override
        public boolean canHandshake(BayeuxServer server, ServerSession session, ServerMessage message) {
            try {
                // Call third party service via HTTP.
                CompletableFuture<Boolean> future = thirdParty.authenticate(message);
                return future.get(); // Blocks until the result is available.
            } catch (InterruptedException | ExecutionException x) {
                return false;
            }
        }
    }
    

    Because canHandshake(...) returns a boolean, authentication using a third party service via HTTP must block (via CompletableFuture.get()) until the third party service returns a response. Furthermore, it was not possible to call the third party service in a different thread, freeing the CometD thread to handle other messages: the CometD thread still had to block until a boolean response was available. This severely harmed the scalability of CometD when hundreds of threads were blocked waiting for a slow third party authentication system, and it was due purely to a CometD API design mistake.
    With CometD 4.0.x we took the chance to correct this design mistake and make all CometD APIs completely asynchronous, with the introduction of a CometD Promise class: a callback that is completed with either a result or a failure.
    The example above can now be written as:

    // CometD 4.0.x
    public class MyAppAuthenticator extends DefaultSecurityPolicy {
        @Override
        public void canHandshake(BayeuxServer server, ServerSession session, ServerMessage message, Promise<Boolean> promise) {
            // Call third party service via HTTP.
            CompletableFuture<Boolean> future = thirdParty.authenticate(message);
            future.whenComplete((result, failure) -> {
                if (failure == null) {
                    promise.succeed(result);
                } else {
                    promise.fail(failure);
                }
            });
        }
    }
    

    Method canHandshake(...) can now return immediately, freeing the thread to handle other messages; when the CompletableFuture returned by the third party authentication system completes (i.e. succeeds or fails), the CometD Promise is completed as well and the processing of that CometD message continues.
    This apparently simple API change required a massive rewrite of the internals of CometD and a few other breaking API changes, in particular related to how to obtain a BayeuxContext instance. In CometD 3.1.x, BayeuxContext instances could be retrieved via BayeuxServer.getContext() thanks to the fact that the BayeuxContext instance was stored in a ThreadLocal. In CometD 4.0.x, it is not possible to use a ThreadLocal due to the asynchronous nature of the APIs: the thread that starts an asynchronous operation is not the same thread that finishes it. In CometD 4.0.x, you can instead obtain a BayeuxContext instance via ServerMessage.getBayeuxContext().
    A complete list of breaking and removed API can be found in the migration guide.
    Another new feature of CometD 4.0.x is the support for JPMS modules. Currently, this is achieved by adding Automatic-Module-Name entries to the relevant CometD jars. Proper support for JPMS modules via module-info.java files is scheduled for CometD 5.x.

    What’s changed in CometD 4.0.0

    CometD 4.0.x now requires JDK 8 and Jetty 9.4.x as detailed in the migration guide.

    Conclusions

    CometD 4.0.x is now the mainstream CometD release, and will be the primary focus for development and bug fixes. CometD 3.1.x enters a maintenance mode, so that only urgent or sponsored fixes will be applied to it, possibly leading to new CometD 3.1.x releases – although these will be rare.
    Work on CometD 5.x will likely require JDK 11 and Jetty 10 and will start as soon as Jetty 10 is released.

  • Fast MultiPart FormData

    Jetty’s venerable MultiPartInputStreamParser for parsing multipart form-data has been deprecated and replaced by the much more efficient MultiPartFormInputStream, based on a new MultiPartParser. This is much faster, but less forgiving of non-compliant formats, so we have implemented a legacy mode to access the old parser, with enhancements to make logging of compliance violations possible.

    Benchmarks

    We have achieved an order of magnitude speed-up in the parsing of large uploaded content and even small content is significantly faster.
    We performed a JMH benchmark of the (new) HTTP MultiPartFormInputStream vs the (old) UTIL MultiPartInputStreamParser. Our tests were:

    • testLargeGenerated:  parses a 10MB file of random binary data
    • testParser:  parses a series of small multipart forms captured by a browser

    Our results clearly show that the new multipart processing is superior in terms of speed to the old processing:

    # Run complete. Total time: 00:02:09
    Benchmark                              (parserType)  Mode  Cnt  Score   Error  Units
    MultiPartBenchmark.testLargeGenerated          UTIL  avgt   10  0.252 ± 0.025   s/op
    MultiPartBenchmark.testLargeGenerated          HTTP  avgt   10  0.035 ± 0.004   s/op
    MultiPartBenchmark.testParser                  UTIL  avgt   10  0.028 ± 0.005   s/op
    MultiPartBenchmark.testParser                  HTTP  avgt   10  0.015 ± 0.006   s/op
    

    How To Use

    By default in Jetty 9.4, the old MultiPartInputStreamParser will be used; the default will switch to the new MultiPartFormInputStream in Jetty 10. To use the new parser (available since release 9.4.10) you can change the compliance mode in the server.ini file so that it defaults to using RFC7578 instead of the LEGACY mode.

    ## multipart/form-data compliance mode of: LEGACY(slow), RFC7578(fast)
    # jetty.httpConfig.multiPartFormDataCompliance=LEGACY

    This feature can also be used programmatically by setting the compliance mode through the HttpConfiguration instance which can be obtained through the HttpConnectionFactory in the connector.

    connector.getConnectionFactory(HttpConnectionFactory.class).getHttpConfiguration()
        .setMultiPartFormDataCompliance(MultiPartFormDataCompliance.RFC7578);
    

    Compliance Modes

    There are now two compliance modes for MultiPart form parsing:

    • LEGACY mode, which uses the old MultiPartInputStreamParser in jetty-util; this will be slower but more forgiving in accepting formats that are non-compliant with RFC7578.
    • RFC7578 mode, which uses the new MultiPartFormInputStream in jetty-http; this will perform faster than the LEGACY mode, however there may be issues receiving badly formatted MultiPart forms that were previously accepted.

    The default compliance mode is currently LEGACY; however, this will be changed to RFC7578 in a future release.

    Legacy Mode Compliance Warnings

    When the old MultiPartInputStreamParser accepts a format that is non-compliant with the RFC, a violation is recorded as an attribute in the request.

    The list of violations, as Strings, can be obtained from the request by accessing the attribute HttpCompliance.VIOLATIONS_ATTR.

    (List<String>)request.getAttribute(HttpCompliance.VIOLATIONS_ATTR);

    Each violation string gives the name of the violation followed by a link to the RFC describing that particular violation.
    Here are two examples:
    CR_LINE_TERMINATION: https://tools.ietf.org/html/rfc2046#section-4.1.1
    NO_CRLF_AFTER_PREAMBLE: https://tools.ietf.org/html/rfc2046#section-5.1.1

    The Future

    The parser is async capable, so expect further innovations with non-blocking uploads and possibly reactive parts.

  • Conscrypting native SSL for Jetty

    By default, Jetty uses the JSSE provider from the JVM for SSL, which has three significant problems:

    • It’s slow!
    • It doesn’t support ALPN in Java 8, which is needed for HTTP/2
    • It’s REALLY slow!

    There are workarounds for both problems: using SSL offloading and/or using our boot path patched JSSE for ALPN. Neither approach is optimal, however, especially as interest in securing connections within the data center continues to rise.
    We have previously looked at JNI native integration with a library like OpenSSL, but it was simply too much work to integrate and maintain. Having a well-maintained SSL integration is important as ciphers do change and exploits are found, so it is vital that updates are available as soon as the base library is updated.
    Luckily, all that hard work is now done by others and we are able to stand on the shoulders of taller giants! Google maintains BoringSSL as a fork of the OpenSSL project that is used in their Chrome and Android products; thus it is an excellent, well-maintained library with good security and performance. Google have also built on the work of the Netty project to develop Conscrypt as the Java library that exposes the native BoringSSL API as a compliant JSSE SecurityProvider.
    So how do you use Conscrypt in Jetty? It is actually quite easy: instantiate Conscrypt’s OpenSSLProvider; add it as a provider in the JVM’s Security class; and set “Conscrypt” as the provider name on Jetty’s SslContextFactory.
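    In embedded code those three steps look roughly like the following configuration fragment (sketched from the Conscrypt and Jetty 9.4 APIs described above; treat it as an outline rather than a verified drop-in, and note the keystore path and password are placeholders):

```java
import java.security.Security;

import org.conscrypt.OpenSSLProvider;
import org.eclipse.jetty.util.ssl.SslContextFactory;

// 1) Instantiate Conscrypt's provider and 2) register it with the JVM.
Security.addProvider(new OpenSSLProvider());

// 3) Tell Jetty's SslContextFactory to use the "Conscrypt" provider.
SslContextFactory sslContextFactory = new SslContextFactory();
sslContextFactory.setProvider("Conscrypt");
sslContextFactory.setKeyStorePath("/path/to/keystore");       // placeholder
sslContextFactory.setKeyStorePassword("keystore-password");   // placeholder
```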
    Using the Jetty distribution? Since Jetty-9.4.7, these steps can all be done by enabling the “conscrypt” module:

    cd $JETTY_BASE
    java -jar $JETTY_HOME/start.jar --add-to-start=conscrypt
    

    The integration with ALPN required a bit more work and some collaboration with the Conscrypt team. This will be included in the 9.4.8 release of Jetty as part of the conscrypt module enabled above.
    So far, we’ve had reports of almost a 10 times increase in throughput with Conscrypt! It also provides ALPN support on both Java 8 and Java 9 without the need to amend the boot path.  Try it out!

  • HTTP Trailers in Jetty

    HTTP/1.1 and HTTP/2 have the concept of trailers, that is, HTTP headers that can be sent after the message body, in both requests and responses.
    In HTTP/1.1, trailers can be sent using the chunked transfer coding, for example in requests (but the same is valid for responses):

    POST / HTTP/1.1\r\n
    Host: host\r\n
    Transfer-Encoding: chunked\r\n
    \r\n
    A\r\n
    0123456789\r\n
    0\r\n
    Trailer-Name: trailer-value\r\n
    Foo: bar\r\n
    \r\n
    

    As you can see, HTTP/1.1 allows the trailers to be placed between the indication of the terminal chunk length 0\r\n and the terminal empty line \r\n.
    In HTTP/2, the situation is similar:

    HEADERS - end_stream=false
    DATA - length=10, end_stream=false
    HEADERS - end_stream=true
    

    The first HEADERS frame contains the request line and headers, followed by a DATA frame that does not end the stream yet, followed by a HEADERS frame that contains the trailers, and that ends the stream.
    A typical use of trailers would be to add dynamically generated metadata about the content, for example message integrity checksums.
    Another typical use is for applications that stream content: in case of problems during the streaming, they can add trailers with information about what went wrong.
    Other protocols such as gRPC make use of the trailers and therefore can be mapped on top of HTTP.
    The Servlet APIs, up to version 3.1, do not expose a standard API to access the trailers. HTTP trailers APIs are, however, now being discussed for inclusion in Servlet 4.0.
    The recently released Jetty 9.4.4.v20170414 includes support for HTTP trailers, for both HTTP/1.1 and HTTP/2, via custom Jetty APIs.
    This is how you can use them in a Servlet:

    public class TrailerServlet extends HttpServlet {
      @Override
      protected void service(HttpServletRequest request, HttpServletResponse response) throws IOException, ServletException {
        Request jettyRequest = (Request)request;
        // Read the content first.
        ServletInputStream input = jettyRequest.getInputStream();
        while (true) {
          int read = input.read();
          if (read < 0) {
            break;
          }
        }
        // Now the request trailers can be accessed.
        HttpFields requestTrailers = jettyRequest.getTrailers();
        // Use the request trailers.
        HttpFields trailers = new HttpFields();
        trailers.put("trailer1", "foo");
        // Set the trailers Supplier to tell the container
        // that there will be response trailers.
        Response jettyResponse = (Response)response;
        jettyResponse.setTrailers(() -> trailers);
        // Write some content and commit the response.
        ServletOutputStream output = response.getOutputStream();
        output.print("foo_bar_baz");
        output.flush();
        // Add another trailer.
        trailers.put("trailer2", "bar");
        // Write more content.
        output.print("done");
        // Add a last trailer.
        trailers.put("last", "baz");
      }
    }
    

    Request trailers will only be available after the request content has been fully read.
    For the response trailers, the reason to use a Supplier in the response APIs is to tell the container to use the chunked transfer coding (in case of HTTP/1.1), even if the response content length is known. In this way, the container can prepare for sending the trailers, and eventually send them when the whole content has been sent.
    Try out HTTP trailers in Jetty 9.4.4, and report back how you use it and how you like it (so that we can make it even better) either in the Jetty mailing lists, or in a Jetty GitHub issue (open it just for the discussion).
    Enjoy !

  • Jetty, Cookies and RFC6265 Compliance

    Starting with release 9.4.3, Jetty will be fully compliant with RFC6265, which introduces changes to cookies that may have significant impact for some users.
    Up until now, Jetty has supported Version=1 cookies as defined in RFC2109 (and continued in RFC2965), which allow special/reserved characters (control, separator, et al.) to be enclosed within double quotes when declared in a Set-Cookie response header.
    Example:

    Set-Cookie: foo="bar;baz";Version=1;Path="/secur"
    

    This cookie was added to the HTTP response headers using the following calls:

    Cookie cookie = new Cookie("foo", "bar;baz");
    cookie.setPath("/secur");
    response.addCookie(cookie);

    This allowed normally non-permitted characters (such as the ; separator found in the example above) to be used as part of a cookie value. With the introduction of RFC6265 (replacing the now obsolete RFC2965 and RFC2109), this use of double quotes to enclose special characters is no longer possible.
    This change was made as a reaction to the strict RFC6265 validation rules present in Chrome/Chromium.
    As such, users are now required to encode their cookie values to use these characters.
    Utilizing javax.servlet.http.Cookie, this can be done as:

    Cookie cookie = new Cookie("foo", URLEncoder.encode("bar;baz", "utf-8"));
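    Note that the application must decode the value again when reading the cookie back. The round trip looks like this with JDK classes only (the helper class name is our own, for illustration):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// The reserved characters survive the round trip once encoded: the server
// sets the encoded value, and the application decodes it when reading the
// cookie back.
class CookieValueCodec {
    static String encode(String value) throws UnsupportedEncodingException {
        return URLEncoder.encode(value, "UTF-8"); // ";" becomes "%3B"
    }

    static String decode(String value) throws UnsupportedEncodingException {
        return URLDecoder.decode(value, "UTF-8");
    }
}
```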

    Starting with Jetty 9.4.3, we will now validate all cookie names and values when being added to the HttpServletResponse via the addCookie(Cookie) method.  If there is something amiss, Jetty will throw an IllegalArgumentException with the details.
    Of note, this new addCookie(Cookie) validation will be applied via the ServerConnector, and will work on HTTP/1.0, HTTP/1.1, and HTTP/2.
    Additionally, Jetty has added a CookieCompliance property to the HttpConfiguration object which can be utilized to define which cookie policy the ServerConnectors will adhere to. By default, this will be set to RFC6265.
    In the standard Jetty Distribution, this can be found in the server’s jetty.xml as:

    <Set name="cookieCompliance">
      <Call class="org.eclipse.jetty.http.CookieCompliance" name="valueOf">
        <Arg><Property name="jetty.httpConfig.cookieCompliance" default="RFC6265"/></Arg>
      </Call>
    </Set>

    Or if you are utilizing the module system in the Jetty distribution, you can set the jetty.httpConfig.cookieCompliance property in the appropriate start INI file for your ${jetty.base} (such as ${jetty.base}/start.ini or ${jetty.base}/start.d/server.ini):

    ## Cookie compliance mode of: RFC6265
    # jetty.httpConfig.cookieCompliance=RFC6265

    Or, for older Version=1 Cookies, use:

    ## Cookie compliance mode of: RFC2965
    # jetty.httpConfig.cookieCompliance=RFC2965


  • Patch for a Patch!

    Are you an Eclipse Jetty user who enjoys contributing to the open source project and wants to let the rest of the world know? Of course you are! As a thank you to our great community, we’ve had some fancy patches made up and have launched a Patch for a Patch program. If you submit a patch to the Jetty project and it is accepted, we will send you a jetty:// iron-on patch that you can attach to your bag, coat, house, pet, etc. Show friends, family and strangers your dedication to the open source community!
    If you have submitted a patch in the last year and want to take advantage of this offer, please fill out this form, which will ask for your contact information and a link to the patch you submitted. Supplies are limited! We will ship anywhere worldwide that we can reach for a reasonable amount.

  • CometD 3.1.0 Released

    The CometD Project is happy to announce the availability of CometD 3.1.0.
    CometD 3.1.0 builds on top of the CometD 3.0.x series, bringing improvements and new features.
    You can find a migration guide at the official CometD documentation site.

    What’s new in CometD 3.1.0

    CometD 3.1.0 now supports HTTP/2.
    HTTP/2 support should be transparent for applications: the browser on the client side and the server (such as Jetty) on the server side take care of handling HTTP/2, so that nothing changes for applications.
    However, CometD applications may now leverage the fact that the application is deployed over HTTP/2 and remove the limit of only one outstanding long poll per client.
    This means that CometD applications that are opened in multiple browser tabs and using HTTP/2 can now have each tab performing the long poll, rather than just one tab.
    CometD 3.1.0 brings support for messages containing binary data.
    Now that JavaScript has evolved to support binary data types, use cases such as uploading or downloading files or other binary data are likely to become more common.
    CometD 3.1.0 allows applications to specify binary data in messages, and the CometD implementation will take care of converting the binary data into the textual format (using the Z85 encoding) required to send the message, and of converting the textual format back into binary data when the message is received.
    Binary data support is available in both the JavaScript and Java CometD libraries.
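    For illustration, Z85 maps every 4 input bytes, read as a big-endian 32-bit integer, onto 5 characters of an 85-symbol alphabet. Below is a minimal sketch of the encoding direction following the ZeroMQ Z85 specification; it is not CometD's own implementation, and it assumes the input length is already a multiple of 4:

    ```java
    public class Z85 {
        // The 85-character alphabet from the Z85 specification (ZeroMQ RFC 32).
        private static final char[] ALPHABET =
            ("0123456789abcdefghijklmnopqrstuvwxyz"
           + "ABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#").toCharArray();

        // Encode binary data (length must be a multiple of 4) into Z85 text:
        // each 4-byte big-endian word becomes 5 base-85 digits,
        // most significant digit first.
        public static String encode(byte[] data) {
            if (data.length % 4 != 0)
                throw new IllegalArgumentException("length must be a multiple of 4");
            StringBuilder out = new StringBuilder(data.length / 4 * 5);
            for (int i = 0; i < data.length; i += 4) {
                long value = ((data[i] & 0xFFL) << 24) | ((data[i + 1] & 0xFFL) << 16)
                           | ((data[i + 2] & 0xFFL) << 8) | (data[i + 3] & 0xFFL);
                for (long divisor = 85L * 85 * 85 * 85; divisor > 0; divisor /= 85)
                    out.append(ALPHABET[(int) (value / divisor % 85)]);
            }
            return out.toString();
        }

        public static void main(String[] args) {
            // Test vector from the Z85 specification.
            byte[] data = {(byte) 0x86, 0x4F, (byte) 0xD2, 0x6F,
                           (byte) 0xB5, 0x59, (byte) 0xF7, 0x5B};
            System.out.println(encode(data)); // HelloWorld
        }
    }
    ```

    The decoding direction simply inverts the mapping, multiplying each symbol's alphabet index back into a 32-bit word.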
    In the JavaScript library, several changes have been made to support both the CommonJS and AMD module styles.
    CometD 3.1.0 is now also deployed to NPM and Bower.
    The package name for both NPM and Bower is cometd; please make sure you filter out the other variants, such as cometd-jquery, that are not directly managed by the CometD Project.
    The CometD JavaScript library has been designed in a way that leverages bindings to JavaScript toolkits such as jQuery or Dojo.
    This is because JavaScript toolkits are really good at working around browser quirks/differences/bugs and we did not want to duplicate all those magic workarounds in CometD itself.
    In CometD 3.1.0 a new binding is available for Angular 1. As a JavaScript toolkit, Angular 1 requires tight integration with other libraries that make XMLHttpRequest calls, and the binding architecture of the CometD JavaScript library fits in nicely.
    You can now use CometD from within Angular 1 applications in a way that is very natural for Angular 1 users.
    The JavaScript library now also supports vanilla transports. This means that you are not bound to the bindings: you can write applications without using any framework or toolkit, or using just the bare minimum support given by module loaders such as RequireJS or build-time tools such as Browserify or webpack.
    Supporting vanilla transports became possible because recent browsers have finally fixed their quirks and agreed on the XMLHttpRequest events that a poor JavaScript developer should use to write portable-across-browsers code.
    A couple of new Java APIs have been added, detailed in the migration guide.

    What’s changed in CometD 3.1.0

    In the JavaScript library, browser evolution also brought support for window.sessionStorage, so the CometD reload extension now uses the sessionStorage mechanism rather than cookies.
    You can find the details on the CometD reload extension documentation.
    It is now forbidden to invoke handshake() multiple times without disconnecting in-between, so applications need to ensure that the handshake operation is performed only once.
    In order to better support CommonJS, NPM and Bower, the location of the JavaScript files has changed.
    Applications will probably need to change paths that were referencing the CometD JavaScript files and bindings as detailed in the migration guide.
    Adding support for binary data revealed a mistake in the processing of incoming messages. While this has not been fixed in CometD 3.0.x to avoid breaking existing code, it had to be fixed in CometD 3.1.0 to correctly support binary data.
    This change affects only applications that have written custom extensions, implementing either BayeuxServer.Extension.send(...) or ServerSession.Extension.send(...). Refer to the migration guide for further details.
    CometD 3.1.0 now supports all Jetty versions from the 9.2.x, 9.3.x and 9.4.x series.
    While before only the Jetty 9.2.x series was officially supported, we have now decided to support all the above Jetty series so that CometD users can benefit from the bug fixes and performance improvements that come with upgrading Jetty.
    Do not mix Jetty versions, however. If you decide to use Jetty 9.3.15, make sure that all the Jetty libraries used in your CometD application reference that Jetty version, and not other Jetty versions.

    What’s been removed in CometD 3.1.0

    CometD 3.1.0 drops support for Jackson 1.x, since Jackson 2.x is now mainstream.
    Server-side parameter allowMultiSessionsNoBrowser has been removed, since sessions not identified by the CometD cookie are not allowed anymore for security reasons.

    Conclusions

    CometD 3.1.0 is now the mainstream CometD release, and will be the primary focus for development and bug fixes.
    CometD 3.0.x enters maintenance mode, so that only urgent or sponsored fixes will be applied to it, possibly leading to new CometD 3.0.x releases – although these will be rare.
    Work on CometD 4.x will start soon, using issue #647 as the basis to review the CometD APIs to be fully non-blocking and investigating the possibility of adding backpressure.

  • Thread Starvation with Eat What You Kill

    This is going to be a blog of mixed metaphors as I try to explain how we avoid thread starvation when we use Jetty’s eat-what-you-kill[n]The EatWhatYouKill strategy is named after a hunting proverb in the sense that one should only kill to eat. The use of this phrase is not an endorsement of hunting nor killing of wildlife for food or sport.[/n] scheduling strategy.
    Jetty has several instances of a computing pattern called ProduceConsume, where a task is run that produces other tasks that need to be consumed. An example of a Producer is the HTTP/1.1 Connection, where the Producer task looks for IO activity on any connection. Each IO event detected is a Consumer task which will read and handle the IO event (typically a HTTP request). In Java NIO terms, the Producer in this example is running the NIO Selector and the Consumers are handling the HTTP protocol and the application’s Servlets. Note that the split between Producing and Consuming can be rather arbitrary, and we have tried making the HTTP protocol part of the Producer, but as we have previously blogged, that split has poor mechanical sympathy. So the key point about the Producer-Consumer pattern for Jetty is that we use it when the tasks produced can be executed in any order or in parallel: HTTP requests from different connections or HTTP/2 frames from different streams.

    Eat What You Kill

    Mechanical Sympathy not only affects where the split is between producing and consuming, but also how the Producer task and Consumer tasks should be executed (typically by a thread pool), and such considerations can have a dramatic effect on server performance. For example, if one thread produced a task then it is likely that the CPU’s cache is now hot with all the data relating to that task, and so it is best that the same CPU consumes that task using the hot cache. This could be achieved with a complex core-locking mechanism, but it is far more straightforward to consume the task using the same thread.
    Jetty has an ExecutionStrategy called Eat-What-You-Kill (EWYK), that has excellent mechanical sympathy properties. We have previously explained this strategy in detail, but in summary it follows the hunter’s ethic[n]The EatWhatYouKill strategy is named after a hunting proverb in the sense that one should only kill to eat. The use of this phrase is not an endorsement of hunting nor killing of wildlife for food or sport.[/n] that one should only kill (produce) something that you intend to eat (consume). This strategy allows a thread to run the producing task only if it is immediately able to run any consumer task that is produced (using the hot CPU cache). In order to allow other consumer tasks to run in parallel, another thread (if available) is dispatched to do more producing and consuming.
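    The difference between always dispatching produced tasks and eating what you kill can be shown with a deliberately simplified sketch. This is schematic code with illustrative names, not Jetty's actual ExecutionStrategy implementation:

    ```java
    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import java.util.concurrent.Executor;
    import java.util.concurrent.atomic.AtomicInteger;

    // Schematic sketch only -- not Jetty's actual ExecutionStrategy code.
    public class StrategySketch {
        interface Producer { Runnable produce(); } // returns null when nothing to produce

        // ProduceExecuteConsume: one thread produces; every task is dispatched
        // to the pool, so the producer never blocks in a consumer.
        static void produceExecuteConsume(Producer producer, Executor pool) {
            Runnable task;
            while ((task = producer.produce()) != null)
                pool.execute(task);
        }

        // Eat-What-You-Kill: the producing thread consumes its own task
        // (keeping the CPU cache hot), after asking another thread to
        // take over producing so production continues in parallel.
        static void eatWhatYouKill(Producer producer, Executor pool) {
            Runnable task = producer.produce();
            if (task == null)
                return;
            pool.execute(() -> eatWhatYouKill(producer, pool)); // hand off producing
            task.run();                                         // eat what we killed
        }

        // Demo: both strategies consume every queued task.
        static int runDemo(boolean useEwyk) {
            Queue<Runnable> jobs = new ConcurrentLinkedQueue<>();
            AtomicInteger consumed = new AtomicInteger();
            for (int i = 0; i < 4; i++)
                jobs.add(consumed::incrementAndGet);
            // A same-thread "pool" keeps the demo deterministic.
            if (useEwyk)
                eatWhatYouKill(jobs::poll, Runnable::run);
            else
                produceExecuteConsume(jobs::poll, Runnable::run);
            return consumed.get();
        }

        public static void main(String[] args) {
            System.out.println(runDemo(true));  // 4
            System.out.println(runDemo(false)); // 4
        }
    }
    ```

    With a real thread pool, the EWYK hand-off lets another CPU start producing while the current thread consumes its task on a hot cache.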

    Thread Starvation

    EWYK is an excellent execution strategy that has given Jetty significantly better throughput and reduced latency. That said, it is susceptible to thread starvation when it bites off more than it can chew.
    The issue is that EWYK works by using the same thread that produced a task to immediately consume it, and it is possible (even likely) that the consumer task will block, as it often calls application code which may do blocking IO or wait for some other event. To ensure this does not block the entire server, EWYK dispatches another task to the thread pool to do more producing.
    The problem is that if the thread pool is empty (because all the threads are in blocking application code) then the last non-blocked producing thread may produce a task which it then calls and also blocks on. A task to do more producing will have been dispatched to the thread pool, but as it was generated by the last available thread, that producing task will be waiting in the job queue for an available thread. All the threads are blocked, and it may be that they are all blocked on IO operations that will only be unblocked if some data is read/written. Unless something calls the NIO Selector, the read/write will not be seen. Since the Selector is called by the Producer task, that task is waiting in the queue, and the queue is stalled because all the threads are blocked waiting for the Selector, the server is now deadlocked by thread starvation!

    Always two there are!

    Jetty’s clever solution to this problem is to not only run our EWYK execution strategy, but to also run the alternative ProduceExecuteConsume strategy, where one thread does all the producing and always dispatches any produced tasks to the thread pool. Because this is not mechanically sympathetic, we run that producer task at low priority. This effectively reserves one thread from the thread pool to always be a producer, but because it is low priority it will seldom run unless the server is idle – or completely stalled due to thread starvation. This means that Jetty always has a thread available to Produce, thus there is always a thread available to run the NIO Selector, and any IO events that will unblock threads will be detected. This needs one more trick to work – the producing task must be able to tell if a detected IO task is non-blocking (i.e. a wakeup of a blocked read or write), in which case it executes the task itself rather than submitting it to any execution strategy. Jetty uses the InvocationType interface to tag such tasks and thus avoid thread starvation.
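    The trick can be sketched in a few lines: tasks advertise whether running them may block, and the reserved producer runs non-blocking tasks inline while dispatching anything that might block to the pool. The names below are illustrative simplifications of the InvocationType idea, not Jetty's actual Invocable API:

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Executor;

    // Schematic sketch of InvocationType-style tagging; names are illustrative.
    public class InvocationSketch {
        enum InvocationType { BLOCKING, NON_BLOCKING }

        interface TaggedTask extends Runnable {
            InvocationType getInvocationType();
        }

        static TaggedTask task(InvocationType type, Runnable body) {
            return new TaggedTask() {
                public InvocationType getInvocationType() { return type; }
                public void run() { body.run(); }
            };
        }

        // The reserved low-priority producer must never block, so it runs
        // non-blocking tasks (e.g. wakeups of blocked reads/writes) itself
        // and dispatches anything that might block to the thread pool.
        static void invoke(TaggedTask t, Executor pool) {
            if (t.getInvocationType() == InvocationType.NON_BLOCKING)
                t.run();
            else
                pool.execute(t);
        }

        // Demo: record which path each task takes.
        static List<String> demo() {
            List<String> log = new ArrayList<>();
            Executor pool = r -> { log.add("dispatched"); r.run(); };
            invoke(task(InvocationType.NON_BLOCKING, () -> log.add("ran inline")), pool);
            invoke(task(InvocationType.BLOCKING, () -> log.add("ran in pool")), pool);
            return log;
        }

        public static void main(String[] args) {
            System.out.println(demo()); // [ran inline, dispatched, ran in pool]
        }
    }
    ```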
    This is a great solution when a thread can be dedicated to always Producing (e.g. NIO Selecting). However Jetty has other Producer-Consumer patterns that cannot afford a dedicated thread. HTTP/2 Connections are consumers of IO Events, but are themselves producers of parsed HTTP/2 frames which may be handled in parallel due to the multiplexed nature of HTTP/2. So each HTTP/2 connection is itself a Produce-Consume pattern, but we cannot allocate a Producer thread to each connection, as a server may have many tens of thousands of connections!
    Yet, to avoid thread starvation, we must also always call the Producer task for HTTP/2. This is necessary because it may parse the HTTP/2 flow control frames that are needed to unblock application threads that are blocked and holding all the available threads from the pool.
    Even if there is a thread reserved as the Producer/Selector by a connector, it may detect IO on a HTTP/2 connection and use the last thread from the thread pool to Consume that IO. If that thread produces a HTTP/2 frame and the EWYK strategy is used, then this last thread may Consume that frame and it too may block in application code. So even if the reserved thread detects more IO, there are no more available threads to consume them!
    So the solution in HTTP/2 is similar to the approach taken with the Connector. Each HTTP/2 connection has two execution strategies – EWYK, which is used when the calling thread (the Connector’s consumer) is allowed to block, and the traditional ProduceExecuteConsume strategy, which is used when the calling thread is not allowed to block. The HTTP/2 Connection then advertises itself as an InvocationType of EITHER to the Connector. If the Connector is running normally, an EWYK strategy will be used and the HTTP/2 Connection will do the same. However, if the Connector is running the low priority ProduceExecuteConsume strategy, it invokes the HTTP/2 connection as non-blocking. This tells the HTTP/2 Connection that when it is acting as a Consumer of the Connector’s task, it must not block – so it uses its own ProduceExecuteConsume strategy, as it knows the producing task will parse the HTTP/2 frame and not perform the Consume task itself (which may block).
    The final part is that the HTTP/2 frame Producer can look at the frames produced. If they are frames that will not block when handled (i.e. Flow Control), they are handled by the Producer and not submitted to any strategy to be Consumed. Thus, even if the Server is on its last thread, Flow Control frames will be detected, parsed and handled – unblocking other threads and avoiding starvation!

  • HTTP/2 at JAX

    I was invited to speak at the JAX conference in Mainz about HTTP/2.
    Jetty has always been a front-runner when it comes to web protocols: first with WebSocket, then with SPDY and finally with HTTP/2.
    We believe that HTTP/2 is going to make the web much better, and we try to spread the word at conferences.
    The JAX conference was great, and despite most of the sessions being in German, I had the chance to network with various speakers – it is always great to be able to speak to top notch people over breakfast or dinner, or while waiting for the next session.
    Below you can find Oracle’s Yolande Poirier’s video interview with me about HTTP/2, and the JAX textual interview on the same topic.
    Enjoy!