mod_cluster 1.0.0 GA released

Last week, the mod_cluster team (made up of members of the JBoss Web and AS Clustering teams) announced the release of version 1.0.0 GA [1].

What is mod_cluster?

mod_cluster is an extension of the Apache httpd mod_proxy module that works together with a server-side Java library to load balance web requests across multiple instances of JBoss Application Server, JBoss Web standalone, or Tomcat.

We already have mod_jk and mod_proxy_balancer.  Why yet another load balancer for httpd?

mod_jk and mod_proxy_balancer are both great, but have the following notable shortcomings:

  1. Static balancer member configuration
    These load balancers require that each AS node (AJP connector address/port) be predefined in a configuration file. You cannot add new nodes without editing a configuration file and restarting the httpd process.
  2. Load factors determined by the load balancer itself
    The load balancing methods employed by mod_jk and mod_proxy_balancer are limited by the information httpd can retain about the requests forwarded to a given AS node, including the traffic or busyness of the AJP connector, or the request or session count. These balancing methods assume that individual threads/requests/sessions contribute equally to the load of the server, and that the load of a machine is dominated by only one of these factors.  If your application servers do more work than just processing web requests, these methods quickly become poor indicators of a server’s load.
  3. Ignorance of web application lifecycle
    Both mod_jk and mod_proxy_balancer use server granularity. So long as the single AJP connector to a given node is functional, that node is eligible to receive web requests.  The load balancer knows nothing of the deployment state of individual web applications.  Say, for example, you wanted to patch a deployed web application on each server in your cluster by undeploying the old WAR and deploying a new one. The load balancer will continue directing requests for the target web application to a given node, even while the application is not deployed there.  Since the server cannot distinguish a request for an undeployed web application from a request for a nonexistent resource, the end user sees a 404 error.  To work around this, you must shut down and restart the entire application server just to update a single web application.

I see.  How does mod_cluster help?

  1. Dynamic configuration
    mod_cluster addresses the issue of static configuration in two ways:

    • Rather than httpd defining the AS nodes to which to talk, mod_cluster works in reverse. JBoss AS nodes define the httpd instance(s) that will talk to them. While this is still technically static configuration, new nodes can be added to the AS cluster without requiring any configuration changes on the httpd side.
    • mod_cluster includes an optional mod_advertise module that allows httpd to broadcast its existence to the AS nodes. This eliminates the need for static configuration of the httpd instances within the AS nodes entirely.
  2. Server-side load calculation
    In mod_cluster, the AS nodes themselves dictate their load factor to httpd. This allows mod_cluster to load balance based on attributes that httpd could not otherwise know about (e.g. CPU load, memory usage, etc.).  Load is periodically calculated from any number of user-defined load metrics.  Several LoadMetric implementations are provided out of the box, and it is trivial to provide your own.
  3. Full awareness of web application lifecycle
    If your application server uses web application granularity, why shouldn’t your load balancer?  In mod_cluster, AS nodes inform httpd of each web application deployment.  This allows mod_cluster to gracefully redirect web traffic in the event an individual web application is undeployed and/or redeployed.
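
To make points 2 and 3 a little more concrete, here is a minimal Java sketch of the idea: each node combines several weighted metrics into a single load value and reports it, and the balancer only considers nodes that currently have the requested web application deployed. Everything in it is an assumption made for illustration. The class names (LoadAwareRoutingSketch, Metric, Node), the weighted-average formula, and the 0.0 to 1.0 load scale are hypothetical and are not the mod_cluster LoadMetric API or its wire protocol.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Illustrative only: none of these types belong to mod_cluster. */
public class LoadAwareRoutingSketch {

    /** Hypothetical metric: a weight and a reading normalized to 0.0 (idle) .. 1.0 (saturated). */
    public static final class Metric {
        final String name;
        final double weight;
        final double reading;

        public Metric(String name, double weight, double reading) {
            this.name = name;
            this.weight = weight;
            this.reading = reading;
        }
    }

    /** Hypothetical view of one AS node as the balancer sees it. */
    public static final class Node {
        public final String name;
        public final Set<String> deployedContexts = new HashSet<>();
        public double load; // last load value reported by the node itself

        public Node(String name) {
            this.name = name;
        }

        /** Node side: combine the metrics into one weighted-average load (assumed formula). */
        public void reportLoad(List<Metric> metrics) {
            double weighted = 0.0;
            double totalWeight = 0.0;
            for (Metric m : metrics) {
                weighted += m.weight * m.reading;
                totalWeight += m.weight;
            }
            load = totalWeight == 0.0 ? 0.0 : weighted / totalWeight;
        }
    }

    /** Balancer side: least-loaded node that currently has the context deployed, if any. */
    public static Optional<Node> route(List<Node> nodes, String contextPath) {
        Node best = null;
        for (Node n : nodes) {
            if (!n.deployedContexts.contains(contextPath)) {
                continue; // web application not deployed here, so skip the node
            }
            if (best == null || n.load < best.load) {
                best = n;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        Node node1 = new Node("node1");
        Node node2 = new Node("node2");
        node1.deployedContexts.add("/shop");
        node2.deployedContexts.add("/shop");

        // Each node reports a load built from a CPU metric (weight 2) and a heap metric (weight 1).
        node1.reportLoad(Arrays.asList(new Metric("cpu", 2.0, 0.9), new Metric("heap", 1.0, 0.3)));
        node2.reportLoad(Arrays.asList(new Metric("cpu", 2.0, 0.2), new Metric("heap", 1.0, 0.5)));

        List<Node> nodes = Arrays.asList(node1, node2);
        System.out.println(route(nodes, "/shop").map(n -> n.name).orElse("none")); // node2 has the lower load

        // Undeploying /shop on node2 takes it out of rotation for that context only.
        node2.deployedContexts.remove("/shop");
        System.out.println(route(nodes, "/shop").map(n -> n.name).orElse("none")); // only node1 qualifies
    }
}
```

In this sketch, a node stops receiving requests for a context as soon as that context is removed from its deployed set, while it would remain eligible for any other web applications it still hosts. That is the behaviour a server-granularity balancer cannot provide.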

Sounds promising!  Where can I get it?

[1] GA = General Availability, i.e. a stable release