Pathfinder Blog
Topic Archive: Scalability

Grizzly - Infrastructure for COMET and AJAX

Other than being the black hole into which the JavaMail API has disappeared, the Open Source Glassfish project -- a J2EE app server building on the Toplink and J2EE code donated by Oracle and Sun respectively -- has some interesting stuff under the hood. They have a new NIO-based HTTP Connector named Grizzly. See their Webtier page; down a little on the page, you'll see a heading entitled "HTTP Connector." Along with a nice high level diagram, the section describes Grizzly as

Grizzly is an HTTP Listener using Java's NIO technology and implemented entirely in Java. A re-usable NIO based framework that can be used for any HTTP related operations (HTTP Listener/Connector) as well as non-HTTP operations, thus allowing the creation of any type of scalable multi-threaded server.

With NIO, as anyone who has ever used the Unix select system call knows, you can have a single thread operating on many connections instead of one thread per connection. This can save tremendous overhead and achieve a surprisingly high degree of performance. See the old single-threaded HTTP servers Boa and Tiny httpd for evidence that this is not a new concept.
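
For readers who have not seen the select pattern in Java, here is a minimal sketch of it: a toy echo server in which one thread and one Selector service every connection. It is not Grizzly's code; the class name SelectorEchoServer and its demo helper are made up for illustration, and a real server would need proper write handling and error recovery.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical toy echo server: one thread and one Selector serve every connection.
class SelectorEchoServer implements Runnable, AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;

    SelectorEchoServer() throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: the OS picks a free port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override
    public void run() {
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until at least one channel is ready
                if (!selector.isOpen()) break;
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buffer.clear();
                        if (client.read(buffer) < 0) { // peer closed the connection
                            client.close();
                            continue;
                        }
                        buffer.flip();
                        // Toy shortcut: a real server would register OP_WRITE instead of looping.
                        while (buffer.hasRemaining()) client.write(buffer);
                    }
                }
            }
        } catch (IOException | ClosedSelectorException | CancelledKeyException e) {
            // Shutdown or an I/O failure simply ends the loop in this sketch.
        }
    }

    @Override
    public void close() throws IOException {
        selector.close(); // also wakes the loop thread if it is blocked in select()
        server.close();
    }

    // Opens one client per message, all connected at once, and returns what each got back.
    static List<String> demo(List<String> messages) {
        try (SelectorEchoServer srv = new SelectorEchoServer()) {
            Thread loop = new Thread(srv, "selector-loop");
            loop.setDaemon(true);
            loop.start();
            List<Socket> clients = new ArrayList<>();
            for (int i = 0; i < messages.size(); i++) {
                clients.add(new Socket("127.0.0.1", srv.port()));
            }
            List<String> echoes = new ArrayList<>();
            for (int i = 0; i < clients.size(); i++) {
                try (Socket s = clients.get(i)) {
                    byte[] out = messages.get(i).getBytes(StandardCharsets.UTF_8);
                    s.getOutputStream().write(out);
                    byte[] in = new byte[out.length];
                    new DataInputStream(s.getInputStream()).readFully(in);
                    echoes.add(new String(in, StandardCharsets.UTF_8));
                }
            }
            return echoes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(List.of("hello", "world")));
    }
}
```

The point to notice is that the number of server threads stays at one no matter how many clients the demo opens; only the set of registered channels grows.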

To understand why Grizzly helps in one piece of the COMET puzzle, we look at Jean-Francois Arcand's blog entry, Can a Grizzly run faster than a Coyote. He ran ApacheBench against Tomcat and Glassfish and came up with the following result: Tomcat needs 500 threads where Grizzly needs only 10 to handle a large benchmark test.

The results are a little controversial -- nobody likes their app server trashed -- and I don't want to get into a discussion about the relative performance of Tomcat vs Glassfish, but the ability to handle the IO of an application server with a small number of threads bodes well for COMET. COMET, remember, keeps a connection open to the browser while the Servlet is performing some long running calculation or waiting on a message, and that means we'll have more connections open at one time. If we needed one thread per connection, we'd be hosed.

What NIO doesn't solve, however, is that the waiting Servlet still chews up a thread. As we've mentioned previously, Jetty 6 has a continuation mechanism for Servlets that allows waiting Servlets to give up their threads. That's the other part of the puzzle for COMET. Jetty 6, BTW, also has an NIO HTTP Connector to go with its Servlet Continuations, so this may be the way to go for COMET early adopters.

 

Stomping out the Misconceptions

A reader pointed out this blog entry from Infoworld, Mercury: AJAX has its drawbacks. It's from the middle of April, but it is still worth responding to.

"AJAX is incredible where people are starting
to adopt it and it immediately causes a lot of problems because it's
not very structured," said Rajesh Radhakrishnan, vice president of
Application Delivery at Mercury. Several Mercury executives met with
InfoWorld editors at Mercury offices in Mountain View, Calif. on
Tuesday morning.

"We've seen tons and tons of problems," with AJAX, Radhakrishnan
said. In testing for functionality and regression, Mercury has seen an
increased number of regressions in AJAX, said Radhakrishnan.

As a workaround, Radhakrishnan suggests using AJAX for the cutting
edge part of UI development, to enable interactions between the client
and server in which the server is able to respond to client requests
later. "For the rest of it, you don't really use AJAX," Radhakrishnan
said.

It is precisely using AJAX for the "cutting edge" parts of a UI that causes the problems. Calling an architectural principle like AJAX "not very structured" betrays an ignorance of the topic. It's like calling web services "not very structured" because people are using raw java.net sockets to do everything.

This post is from April of 2006, not 2005 when this sort of comment would have been excusable. Now there are several stable intermediate forms such as DWR that allow for fairly structured development of AJAX solutions on top of existing webapp frameworks. Further, there are already some more advanced forms such as Tibco GI, OpenLaszlo, ZK and Echo2 that allow for development of sophisticated desktop-type apps.

Mercury may be trying to discourage folks from developing AJAX apps until they've had a chance to update their testing software to keep pace. I suggest that they work harder on their next release instead.

Not There Yet: COMET with Apache and Jetty

I had intended to marry the nice Apache2 Event MPM and Jetty 6 with Continuations in order to achieve a thrifty, thread-sparing, COMET-capable Java app. The idea is that the Event MPM module in Apache frees up a thread when an open connection to the browser is snoozing and Jetty frees up a thread when the backend servlet is snoozing. There are two problems, however.

  1. The Event MPM module seems to only snooze between requests. With COMET the need to snooze happens during a request, i.e. while we are expecting something to come back from the server.
  2. The request handler module -- in this case mod_jk or mod_proxy_ajp -- and not the Event MPM module handles the socket connections to the servlet container. From my reading of the most recent SVN branch, the modules are using blocking I/O and not polling.

It seems there's still a bit of work to be done to make Apache and Jetty do the COMET dance.

COMET: Socket Hungry AJAX

From back in late March, Alex Russell over at IrishDev writes about a new AJAX technique, calling it COMET. What is COMET? Basically the browser makes a request of the server, but the server keeps the socket open over a long period of time.

[COMET applications] all use long-lived HTTP
connections to reduce the latency with which messages are passed to the
server. In essence, they do not poll the server occasionally. Instead the server has an open line of communication with which it can push data to the client.

Does it scale? We've talked about this stuff before when we spoke about Jetty Continuations. I wrote then that

I don't like this method because it is wasteful in terms of sockets and
threads; also, it is likely to stress stateful firewalls, load
balancers, etc., and may break in lots of client environments.

I stand by that statement. Beyond the issue of migrating these connections between nodes in a load-balanced cluster (yes, you could close the connection and have the client automatically reopen the connection), there are serious scaling issues.

One of the things that made HTTP based applications scalable was that they made use of small, stateless requests. This meant you could handle requests from an order of magnitude or more users than a comparable stateful application.

It's true that the typical AJAX polling for async updates also puts a burden on the server, firewall, load balancers, etc., but that depends partly on the frequency of the polling and the number of clients doing the polling. Even if I have 10,000 users polling the server every half second, I may still only have a few hundred sockets open at any one time if the request/response size is small and the user's network latency is low.
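
To put rough numbers on that claim: by Little's law, the average number of requests in flight equals the arrival rate times how long each request holds its socket. The sketch below is only a back-of-the-envelope estimate. The 15 ms hold time is an assumed figure, not a measurement, and the estimate ignores keep-alive connections and sockets lingering in TIME_WAIT. Under those assumptions, 10,000 users polling every half second keep about 300 sockets busy on average, against 10,000 for COMET.

```java
// Back-of-the-envelope estimate; the 15 ms hold time used in main is an assumed figure.
class PollingLoad {
    // Little's law: average requests in flight = arrival rate * time each one holds a socket.
    static double pollingSockets(int users, double pollIntervalSeconds, double holdSeconds) {
        double requestsPerSecond = users / pollIntervalSeconds;
        return requestsPerSecond * holdSeconds;
    }

    // With COMET every connected user holds a socket for as long as they are connected.
    static int cometSockets(int users) {
        return users;
    }

    public static void main(String[] args) {
        System.out.printf("polling: about %.0f sockets%n", pollingSockets(10_000, 0.5, 0.015));
        System.out.printf("COMET:   %d sockets%n", cometSockets(10_000));
    }
}
```

The estimate is linear in the hold time, so a slow network or a slow handler erodes the advantage of polling quickly.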

Modifications to server software like Apache and Jetty to conserve resources like threads and make use of IO multiplexing are a first and probably necessary step. Maybe I'm making too much of this stateful thing. We may have so much application state information floating around on the server side anyway that it will dwarf any OS and network resources that COMET and related technologies seek to spend.

Update 1: DWR's next release should have an implementation of COMET that they are calling "Reverse AJAX." More interesting, however, is the fact that they are releasing an API that allows one to write Javascript by writing Java.

Ajax: The “Husky” Client

Scott Dietzen over at Zimbra has a post in a continuing series on AJAX scalability. Besides coining the humorous term "Husky" Client to describe AJAX -- not quite thin, but not quite fat -- he makes some excellent points about the importance of design and choosing the appropriate browser/server boundary for an application in order to minimize the impact on the server.

I thought the following early paragraph was a nice observation by someone who clearly has a bit of experience developing applications:

Traditional fat client applications, on the other hand, off-load all of
the UI and most of the business logic (modulo stored procedures and
triggers) from the server to the client. Fat client app's could
nevertheless hammer their servers simply by not being sophisticated
about how much and how often data was being requested---that is, data shipping to the client can be more expensive than function shipping
to the server (with stored procedures, triggers, et al). With a
reasonably smart design, however, fat client applications typically use
more client and less server CPU per operation than a corresponding
server-centric application.

I think the generation of RIAs (Rich Internet Applications) that is about to sweep the web is likely to repeat many of the mistakes of the client/server age. As Scott points out, how poorly these applications perform will depend in part on how well they are designed. As Santayana wrote, "Those who cannot remember the past are condemned to repeat it."

Again on Scalability

Since my rant started flowing, I just thought I'd pull this out of the comments section at Ajaxian.com:

Beyond becoming multi-threaded, AJAX is beginning to open up
the web to new types of applications, ones that are document centric
(word processors, modeling tools, etc.) rather than data and
transaction centric, i.e. all those rectangular CRUD applications that
make up 99.9% of webapps. It also means that the sorts and types of
rich client interactions are going to dwarf the traffic that we see
today.

That means 1. abandoning the forms-and-reports way of
writing webapps (which will break when you try to write something like
Rational Rose as a webapp) and moving to the component GUI model (like
Swing, Winforms, etc.) and 2. being very clever about the frequency and
size of your XHR conversations with the server. From my unscientific
tests (Yahoo mail and Google calendar)
it seems that some are winning and some are losing the battle on fat
XHR. I don’t think any amount of JSON or compressed XML magic will
solve the problem of poor design.

I think the right way to
achieve all of this is by moving AJAX/Webapp development to component
GUI application frameworks. Properly done, they have the potential to
hide all of the messy bits like exposing too much of your business
logic on the client side, optimizing XHR requests for components that
have empty server-side event listeners, reducing the impedance mismatch
between the Javascript/CSS/XHTML world and the business logic.

Those
who don’t move in this direction will be stuck building and maintaining
ever more complex applications because they didn’t make the shift to a
new design. It’s time to think of the browser simply as a display server.
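
As one illustration of the "empty server-side event listeners" point in that comment, here is a hypothetical sketch of how a component framework might decide which browser events deserve an XHR at all. ServerComponent and its methods are invented for this example and are not taken from any real framework.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical server-side component; not taken from any real framework.
class ServerComponent {
    private final Map<String, List<Runnable>> listeners = new HashMap<>();

    void addListener(String eventType, Runnable listener) {
        listeners.computeIfAbsent(eventType, k -> new ArrayList<>()).add(listener);
    }

    // Event types the browser-side renderer should wire to an XHR call.
    // Anything absent from this set can be handled entirely in the browser.
    Set<String> eventsNeedingRoundTrip() {
        return new TreeSet<>(listeners.keySet());
    }

    // What the server does when such an XHR arrives; returns how many listeners ran.
    int fire(String eventType) {
        List<Runnable> registered = listeners.getOrDefault(eventType, List.of());
        registered.forEach(Runnable::run);
        return registered.size();
    }
}
```

A renderer built this way never generates client-side XHR code for an event nobody on the server is listening to, which is the kind of optimization an application developer should not have to write by hand.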

Jetty 6’s Continuation Mechanism for Ajax

I've touched on the topic of updates and asynchronous processing before. My preferred method of performing updates between the browser and server is via a polling mechanism that returns quickly. An alternative is to open up an XHR connection and keep it open to wait for a response (or a timeout). I don't like this method because it is wasteful in terms of sockets and threads; also, it is likely to stress stateful firewalls, load balancers, etc., and may break in lots of client environments.

Nevertheless, if you want to keep a connection open for notification initiated by the server, this is the way for now. And the Jetty 6 server has at least addressed the thread issue with Continuations.

Behind the scenes, Jetty has to be a bit sneaky to work around Java
and the Servlet specification as there is no mechanism in Java to
suspend a thread and then resume it later. The first time the request
handler calls continuation.getEvent(timeoutMS) a RetryRequest runtime
exception is thrown. This exception propagates out of all the request
handling code and is caught by Jetty and handled specially. Instead of
producing an error response, Jetty places the request on a timeout
queue and returns the thread to the thread pool.

When the timeout expires, or if another thread calls
continuation.resume(event) then the request is retried. This time, when
continuation.getEvent(timeoutMS) is called, either the event is
returned or null is returned to indicate a timeout. The request handler
then produces a response as it normally would.
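
The retry trick in that description can be modeled in a few lines. What follows is a toy, single-threaded sketch: the classes borrow the names used in the quote but are illustrative stand-ins, not Jetty's classes, and the timeout is modeled by simply never calling resume.

```java
import java.util.ArrayList;
import java.util.List;

// Toy, single-threaded model of the retry trick described above. These classes are
// illustrative stand-ins that borrow names from the quote; they are not Jetty's classes.
class RetryContinuationSketch {
    static class RetryRequest extends RuntimeException {}

    static class Continuation {
        private boolean parked; // true after the first getEvent call has thrown
        private Object event;   // set by resume(); stays null if the timeout fires first

        Object getEvent(long timeoutMs) {
            if (!parked) {
                parked = true;
                throw new RetryRequest(); // first call: unwind the handler, free the thread
            }
            parked = false;
            Object delivered = event;
            event = null;
            return delivered; // retry: the event, or null to signal a timeout
        }

        void resume(Object e) {
            event = e;
        }
    }

    interface Handler {
        String handle(Continuation c);
    }

    // Runs one request through both dispatches. Pass null to model the timeout expiring.
    static List<String> serve(Object eventOrNull) {
        List<String> log = new ArrayList<>();
        Continuation continuation = new Continuation();
        Handler handler = c -> {
            Object event = c.getEvent(1000); // the timeout value is ignored by this toy
            return event == null ? "timed out" : "got " + event;
        };
        try {
            log.add("response: " + handler.handle(continuation));
        } catch (RetryRequest retry) {
            log.add("parked, thread back in pool"); // a container would queue the request here
        }
        if (eventOrNull != null) {
            continuation.resume(eventOrNull); // in a real server another thread does this
        }
        log.add("response: " + handler.handle(continuation)); // the retried dispatch
        return log;
    }

    public static void main(String[] args) {
        System.out.println(serve("new message"));
        System.out.println(serve(null));
    }
}
```

Note that the handler runs twice from the top, so anything it does before calling getEvent must be safe to repeat.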

Sockets are still consumed, though. Hopefully the next servlet specification will address some of these issues. Until that time, this may be a good workaround.

Still, my preference is to keep everything except for the display logic on the server side, and that includes handling complex communication with async processing.

Update 1: ActiveMQ can make use of Jetty 6's continuation mechanism.

Update 2: Greg Wilkins has some more extensive thoughts on using Jetty 6 to scale Ajax apps.
