mod_athena load balancer for httpd mod_proxy

2009-10-27: Added an update in trunk that lets you give a path to a health check script for ath_agent.sh to execute. The agent won't send an update if the check fails, which works well with "expect update" on the load balancer proxy.

NOTE for people building on httpd-2.2.11 with the latest apr: you need to edit the configure script. Right after line 20035, add:

APR_CFLAGS=`$APR_CONFIG --cflags --cppflags`
CFLAGS="$CFLAGS $APR_CFLAGS"

so that the section reads:

echo "${ECHO_T}setting apxs to... $APR_CONFIG" >&6; }
APR_LIB_DEPS=`$APR_CONFIG --libs`
APR_LIB_LD=`$APR_CONFIG --link-ld`
APR_CFLAGS=`$APR_CONFIG --cflags --cppflags`
CFLAGS="$CFLAGS $APR_CFLAGS"
{ echo "$as_me:$LINENO: result: apr-1-config lib deps... $APR_LIB_DEPS" >&5

I will need to update the GNU tool chain and fix my build/ath_common.m4 for a permanent fix.

There is a new release, 2.2.3, that addresses bugs in the "sticky" and "blind sticky" session feature. This code is new to our production environment. YMMV. UPDATED 2009/3/9: 2.2.4 will REALLY fix this. Posted soon.

This tool gets periodic blog attention from me here: http://damao.net/vhosts/node.to/wordpress/?cat=4

Current install notes: if you already have apr installed independently on your system inside the default lib search paths, it can cause build problems. Also, it seems certain combinations of libtool and automake in your path might cause issues. I'm investigating this. This mostly affects Redhat 5 users. Hopefully we can get an RPM put together. The best way to install this is on a custom-built reverse proxy httpd, with the --enable-proxy --enable-proxy-http --enable-ssl --enable-threads --with-mpm=worker configure params.

Often the best way to get a sense of a module is to read its directives. They are here, still on sourceforge: http://ath.sourceforge.net/mod_athena_doc/html/mod_athena_directives.html

This module provides a feature-rich load balancer built as a module for Apache httpd-2.2.x. It requires no patches to httpd, and requires only the presence of mod_proxy. It can function in both prefork and threaded modes. There is a GET interface for sending and retrieving data from the lb engine. It works in conjunction with mod_proxy, very much like mod_proxy_balancer, but follows a different implementation strategy that allows for some additional features. It also lacks a couple of things that mod_proxy_balancer provides, which we hope to rectify either by using this code to extend mod_proxy_balancer or by implementing some of its features in this module.

To describe this, here are some terms to make things clearer:

Farm: this corresponds to mod_proxy_balancer's idea of clusters, or "farms" in most hardware load balancers. A farm contains members, and members represent an application server resource.

Physicals: this is synonymous with the Members described above, or "workers" in mod_proxy_balancer.

Here are the fundamental design features/differences from mod_proxy_balancer:

Shared Memory Model: mod_athena stores its table of member and farm data in a separate shared memory segment, with tunable fine-grained locking. This means common data is available to all processes and threads, without contention slowing the server. This has been tested on Linux 2.6 up to approximately 10,000 concurrent open proxy requests on a quad-core box.

Statistics Push Model: mod_athena makes load balancing decisions based on per-member data that is updated by agents running on the members themselves. This allows you to balance on arbitrary data like CPU load, thread count, etc. The code path for load balancing is optimized and fixed-length. Fine-grained locking for updates and dirty reads for reading ensure the speed of this operation. Updates are sent via a simple GET query based protocol.

Algorithms: mod_athena offers a round-robin algorithm, a "simple" algorithm that uses one member data column, e.g. CPU, to decide, and a "dynamic" algorithm that normalizes and combines an arbitrary combination of factors. This dynamic algorithm is very powerful in the modern web application environment: it can make decisions based on the combined factors of things like back-end db connection counts, memory loads, CPU loads, etc.

What mod_proxy_balancer can do that this does not:

Retry of request: We made the early design decision not to do this, although I think it might be sensible to have it as an option. It would increase stress on the proxy.
Also, it exposes interesting risk scenarios. For example, if the original request is actually crashing the worker, it seems like you would quickly iterate through all your workers and starve. In our model, the request would fail, the member gets marked out, and the user gets an error. The user would then have to "choose" to try again (which they do sometimes).

In the reverse http(s) proxy configuration, a request from a client is answered by apache. The request is handled by mod_proxy, which would then rewrite the url to substitute a mod_athena "AthFarm" where the target hostname usually resides.

httpd.conf:

ProxyPass /reports/ http://farm_one/reports/
<AthFarm farm_one>
Member server1
Member server2
Member server3:8080
</AthFarm>

Then, mod_athena takes over the request. It will look up the farm, run the algorithm and analyze health status for that farm, and then substitute a real address for the farm name. If the farm or servers are disabled, or all are sick, a complete alternative URL will be substituted. The result is handed back to mod_proxy to finish the request.

pseudo-code: http://farm_one/reports/ --> http://chosen_server/reports/
pseudo-code: http://farm_one/reports/ --> http://www.yoursite.com/outage/

In the query configuration, a client (in this case probably another application) sends a GET request with a query string containing the desired farm(s) to httpd.

> fetch -q -o - http://my_athena/ath/balance/?farm_one
chosen_server

The module engine will catch this request, run the appropriate algorithm(s), and return the appropriate result(s) as text content to the client. You can easily use mod_athena in both modes in the same instance of the engine.

The package comes with scripts that can be used to mirror and monitor the system, along with a perl package that provides an HTML based front end for runtime management. It is designed to satisfy the most complicated of large online applications. It is currently used as the principal load distribution mechanism in an enterprise-class HR product of a Fortune 500 company, and has 6 years of continuous production deployment, servicing many millions of requests a day. The interface to the engine is via intuitive http GET calls.

The project is being migrated from ath.sourceforge.net. Please go there for documentation for the moment.
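
As a rough illustration of the reverse proxy flow described earlier (look up the farm, pick a member, substitute its address for the farm name, or fall back to an alternative URL), here is a minimal Python sketch. It is not mod_athena's code: the FARMS table, the field names ("up", "cpu", "down_url") and the functions choose_member and rewrite are all hypothetical, and the "lowest value in one column wins" policy is only an assumption about how a "simple" single-column algorithm might behave.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical in-memory farm table. mod_athena keeps its real table in a
# shared memory segment; this dict only stands in for it.
FARMS = {
    "farm_one": {
        "down_url": "http://www.yoursite.com/outage/",  # assumed field name
        "members": [
            {"host": "server1", "up": True, "cpu": 0.70},
            {"host": "server2", "up": True, "cpu": 0.20},
            {"host": "server3:8080", "up": False, "cpu": 0.05},
        ],
    }
}


def choose_member(farm, column="cpu"):
    """Assumed 'simple' policy: healthy member with the lowest value in one column."""
    healthy = [m for m in farm["members"] if m["up"]]
    if not healthy:
        return None
    return min(healthy, key=lambda m: m[column])


def rewrite(url, farms=FARMS):
    """Replace a farm name in the host position with a chosen member address."""
    parts = urlsplit(url)
    farm = farms.get(parts.netloc)
    if farm is None:
        return url  # host is not a known farm name; leave the URL untouched
    member = choose_member(farm)
    if member is None:
        return farm["down_url"]  # every member is out: use the alternative URL
    return urlunsplit(parts._replace(netloc=member["host"]))


if __name__ == "__main__":
    print(rewrite("http://farm_one/reports/"))
```

With the sample table, server3 is skipped because it is marked down, so the farm URL is rewritten to point at server2. If every member were marked down, rewrite would return the outage URL instead.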