3. Multi-database search with the multi filter

To make Metaproxy broadcast searches to multiple back-end servers, the configuration must include two components: a virt_db filter that specifies multiple <target> elements, and a subsequent multi filter. Here, for example, is a complete configuration that broadcasts searches to both the Library of Congress catalogue and Index Data's tiny testing database of MARC records:

<?xml version="1.0"?>
<metaproxy xmlns="http://indexdata.com/metaproxy" version="1.0">
  <start route="start"/>
  <routes>
    <route id="start">
      <filter type="frontend_net">
        <threads>10</threads>
        <port>@:9000</port>
      </filter>
      <filter type="virt_db">
        <virtual>
          <database>lc</database>
          <target>lx2.loc.gov:210/LCDB_MARC8</target>
        </virtual>
        <virtual>
          <database>marc</database>
          <target>indexdata.com/marc</target>
        </virtual>
        <virtual>
          <database>all</database>
          <target>lx2.loc.gov:210/LCDB_MARC8</target>
          <target>indexdata.com/marc</target>
        </virtual>
      </filter>
      <filter type="multi"/>
      <filter type="z3950_client">
        <timeout>30</timeout>
      </filter>
      <filter type="bounce"/>
    </route>
  </routes>
</metaproxy>

(Using a virt_db filter that specifies multiple <target> elements, but without a subsequent multi filter, yields surprising and undesirable results, as will be described below. Don't do that.)

Metaproxy can be invoked with this configuration as follows:

../src/metaproxy --config config-simple-multi.xml

Thereafter, Z39.50 clients can connect to the running server (on port 9000, as specified in the configuration) and search in any of the databases lc (the Library of Congress catalogue), marc (Index Data's test database of MARC records) or all (both of these). As an example, here is a session using the YAZ command-line client yaz-client (edited for brevity and clarity):

$ yaz-client @:9000
Connecting...OK.
Z> base lc
Z> find computer
Search was a success.
Number of hits: 10000, setno 1
Elapsed: 5.521070
Z> base marc
Z> find computer
Search was a success.
Number of hits: 10, setno 3
Elapsed: 0.060187
Z> base all
Z> find computer
Search was a success.
Number of hits: 10010, setno 4
Elapsed: 2.237648
Z> show 1
[marc]Record type: USmarc
001    11224466
003 DLC
005 00000000000000.0
008 910710c19910701nju           00010 eng
010    $a 11224466
040    $a DLC $c DLC
050 00 $a 123-xyz
100 10 $a Jack Collins
245 10 $a How to program a computer
260 1  $a Penguin
263    $a 8710
300    $a p. cm.
Elapsed: 0.119612
Z> show 2
[VOYAGER]Record type: USmarc
001 13339105
005 20041229102447.0
008 030910s2004    caua          000 0 eng
035    $a (DLC)  2003112666
906    $a 7 $b cbc $c orignew $d 4 $e epcn $f 20 $g y-gencatlg
925 0  $a acquire $b 1 shelf copy $x policy default
955    $a pc10 2003-09-10 $a pv12 2004-06-23 to SSCD; $h sj05 2004-11-30 $e sj05 2004-11-30 to Shelf.
010    $a   2003112666
020    $a 0761542892
040    $a DLC $c DLC $d DLC
050 00 $a MLCM 2004/03312 (G)
245 10 $a 007, everything or nothing : $b Prima's official strategy guide / $c created by Kaizen Media Group.
246 3  $a Double-O-seven, everything or nothing
246 30 $a Prima's official strategy guide
260    $a Roseville, CA : $b Prima Games, $c c2004.
300    $a 161 p. : $b col. ill. ; $c 28 cm.
500    $a "Platforms: Nintendo GameCube, Macintosh, PC, PlayStation 2 computer entertainment system, Xbox"--P. [4] of cover.
650  0 $a Video games.
710 2  $a Kaizen Media Group.
856 42 $3 Publisher description $u http://www.loc.gov/catdir/description/random052/2003112666.html
Elapsed: 0.150623
Z>

As can be seen, the first record in the result set is from the Index Data test database, and the second from the Library of Congress database. The result set continues alternating records round-robin style until the records of one of the databases are exhausted.

This example uses only two back-end databases; more may be used. No limit is imposed on the number of databases that may be metasearched in this way: resource usage and administrative complexity dictate the practical limits.
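
As a sketch of how this extends, the following virtual database lists three targets. The first two are the targets used above; the third target, z3950.example.org:210/books, and the database name all3 are hypothetical placeholders chosen for illustration, not real servers or names from this configuration:

        <virtual>
          <database>all3</database>
          <target>lx2.loc.gov:210/LCDB_MARC8</target>
          <target>indexdata.com/marc</target>
          <!-- hypothetical third back-end; substitute a real Z39.50 target -->
          <target>z3950.example.org:210/books</target>
        </virtual>

A fragment like this would sit inside the virt_db filter alongside the other <virtual> elements, with the multi filter still following virt_db in the route.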

What happens when one of the databases doesn't respond? By default, the entire multi-database search fails, and the appropriate diagnostic is returned to the client. This is usually appropriate during development, when technicians need maximum information, but can be inconvenient in deployment, when users typically don't want to be bothered with problems of this kind and prefer just to get the records from the databases that are available. To obtain this latter behavior, add an empty <hideunavailable> element inside the multi filter:

      <filter type="multi">
        <hideunavailable/>
      </filter>

Under this regime, an error is reported to the client only if all the databases in a multi-database search are unavailable.
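
For context, here is a sketch of how the tail of the route in the complete configuration shown earlier might look with this option in place. It assumes that only the multi filter changes and that the other filters stay as they were:

      <filter type="multi">
        <!-- return records from the reachable back-ends instead of failing the whole search -->
        <hideunavailable/>
      </filter>
      <filter type="z3950_client">
        <timeout>30</timeout>
      </filter>
      <filter type="bounce"/>

The multi filter keeps its position between virt_db and z3950_client; only its content changes from an empty element to one containing <hideunavailable/>.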