6. Load balancing

Just like any web server, Pazpar2 can be load balanced by a standard hardware or software load balancer, as long as session stickiness is ensured. If you are already running the Apache 2 web server in front of Pazpar2 and use the Apache mod_proxy module to relay client requests to Pazpar2, this setup can easily be extended to include load balancing capabilities. To do so, you need to enable the mod_proxy_balancer module in your Apache 2 installation.

On a Debian-based Apache 2 system, the relevant modules can be enabled with:

     sudo a2enmod proxy_http proxy_balancer
    

The mod_proxy_balancer module can pass all session-sticky requests to the same backend worker, as long as the requests are marked with the originating worker's ID (called the 'route'). If the Pazpar2 server ID is configured (by setting an 'id' attribute on the 'server' element in the Pazpar2 configuration file), Pazpar2 appends it to the 'session' element returned by the 'init' command, in a form that mod_proxy_balancer understands. Since the client then re-sends the 'session' value with every Pazpar2 request except 'init', the balancer can use this marker to pass each request to the right route. For this to work, the balancer must be configured to inspect the 'session' parameter.
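
To make the marker format concrete, here is a minimal Python sketch of how a 'session' value of the form 'sessid.serverid' can be split into its two parts. It is an illustration only, not the code used by mod_proxy_balancer or Pazpar2; the function name split_session and the sample session IDs are hypothetical.

```python
from typing import Optional, Tuple


def split_session(session: str) -> Tuple[str, Optional[str]]:
    """Split a 'session' value of the assumed form 'sessid.serverid'.

    Returns (session_id, route). The route is None when no server ID
    was appended, i.e. when the value contains no '.' or nothing after it.
    """
    session_id, dot, route = session.partition(".")
    return session_id, (route if dot and route else None)


# Hypothetical session values, as a client might re-send them.
print(split_session("1234567.pz2"))  # ('1234567', 'pz2')
print(split_session("1234567"))      # ('1234567', None)
```

A value without a route part gives the balancer nothing to match on, which is why the server ID has to be configured on every Pazpar2 instance behind the balancer.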

Example 3.1. Apache 2 load balancing configuration

With four Pazpar2 instances running on the same host on ports 8004-8007, with server IDs pz1, pz2, pz3 and pz4 respectively, the following Apache 2 configuration exposes a single Pazpar2 'endpoint' at the standard location (/pazpar2/search.pz2):

       <Proxy *>
         AddDefaultCharset off
         Order deny,allow
         Allow from all
       </Proxy>
       ProxyVia Off

       # 'route' has to match the configured pazpar2 server ID
       <Proxy balancer://pz2cluster>
         BalancerMember http://localhost:8004 route=pz1
         BalancerMember http://localhost:8005 route=pz2
         BalancerMember http://localhost:8006 route=pz3
         BalancerMember http://localhost:8007 route=pz4
       </Proxy>

       # The route is re-sent in the 'session' parameter, which has the form
       # 'sessid.serverid' and is understood by mod_proxy_balancer.
       # This will not work if the client tampers with the 'session' parameter.
       ProxyPass /pazpar2/search.pz2 balancer://pz2cluster lbmethod=byrequests stickysession=session nofailover=On
     

The 'ProxyPass' line sets up a reverse proxy for requests to '/pazpar2/search.pz2' and delegates them to the load balancer (virtual worker) named 'pz2cluster'. Sticky sessions are enabled and implemented using the 'session' parameter. The 'Proxy' section lists all the servers (real workers) which the load balancer can use.
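
The routing decision described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not Apache's implementation: the worker table mirrors the example configuration, the function name pick_worker is hypothetical, and a plain rotation stands in for the 'byrequests' scheduler (which with equal weights spreads requests evenly, but is not literally this algorithm).

```python
from itertools import cycle
from typing import Dict, Optional

# Route -> backend URL, mirroring the BalancerMember lines in the example.
WORKERS: Dict[str, str] = {
    "pz1": "http://localhost:8004",
    "pz2": "http://localhost:8005",
    "pz3": "http://localhost:8006",
    "pz4": "http://localhost:8007",
}

# Simplified stand-in for lbmethod=byrequests: rotate over the routes.
_rotation = cycle(WORKERS)


def pick_worker(session: Optional[str]) -> str:
    """Choose a backend for a request carrying the given 'session' value."""
    if session:
        # The route is the part after the '.' in 'sessid.serverid'.
        _, dot, route = session.partition(".")
        if dot and route in WORKERS:
            return WORKERS[route]  # sticky: same worker as the 'init' call
    # No usable route (e.g. the 'init' request): let the scheduler decide.
    return WORKERS[next(_rotation)]


# A request re-sending a session created on pz3 goes back to port 8006.
print(pick_worker("1234567.pz3"))  # http://localhost:8006
```

In this model an 'init' request carries no session, so it is assigned by the rotation, and every later request from that client follows the route embedded in its session value.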