To arrange for Metaproxy to broadcast searches to multiple back-end
servers, the configuration needs to include two components: a virt_db
filter that specifies multiple <target> elements, and a subsequent
multi filter. Here, for example, is a complete configuration that
broadcasts searches to both the Library of Congress catalogue and
Index Data's tiny testing database of MARC records:
<?xml version="1.0"?>
<metaproxy xmlns="http://indexdata.com/metaproxy" version="1.0">
  <start route="start"/>
  <routes>
    <route id="start">
      <filter type="frontend_net">
        <threads>10</threads>
        <port>@:9000</port>
      </filter>
      <filter type="virt_db">
        <virtual>
          <database>lc</database>
          <target>lx2.loc.gov:210/LCDB_MARC8</target>
        </virtual>
        <virtual>
          <database>marc</database>
          <target>indexdata.com/marc</target>
        </virtual>
        <virtual>
          <database>all</database>
          <target>lx2.loc.gov:210/LCDB_MARC8</target>
          <target>indexdata.com/marc</target>
        </virtual>
      </filter>
      <filter type="multi"/>
      <filter type="z3950_client">
        <timeout>30</timeout>
      </filter>
      <filter type="bounce"/>
    </route>
  </routes>
</metaproxy>
(Using a virt_db filter that specifies multiple <target> elements,
but without a subsequent multi filter, yields surprising and
undesirable results, as will be described below. Don't do that.)
Metaproxy can be invoked with this configuration as follows:
../src/metaproxy --config config-simple-multi.xml
Thereafter, Z39.50 clients can connect to the running server
(on port 9000, as specified in the configuration) and search in
any of the databases lc (the Library of Congress catalogue),
marc (Index Data's test database of MARC records) or all (both of
these). As an example, a session using the YAZ command-line client
yaz-client is included here (edited for brevity and clarity):
$ yaz-client @:9000
Connecting...OK.
Z> base lc
Z> find computer
Search was a success.
Number of hits: 10000, setno 1
Elapsed: 5.521070
Z> base marc
Z> find computer
Search was a success.
Number of hits: 10, setno 3
Elapsed: 0.060187
Z> base all
Z> find computer
Search was a success.
Number of hits: 10010, setno 4
Elapsed: 2.237648
Z> show 1
[marc]Record type: USmarc
001 11224466
003 DLC
005 00000000000000.0
008 910710c19910701nju 00010 eng
010 $a 11224466
040 $a DLC $c DLC
050 00 $a 123-xyz
100 10 $a Jack Collins
245 10 $a How to program a computer
260 1 $a Penguin
263 $a 8710
300 $a p. cm.
Elapsed: 0.119612
Z> show 2
[VOYAGER]Record type: USmarc
001 13339105
005 20041229102447.0
008 030910s2004 caua 000 0 eng
035 $a (DLC) 2003112666
906 $a 7 $b cbc $c orignew $d 4 $e epcn $f 20 $g y-gencatlg
925 0 $a acquire $b 1 shelf copy $x policy default
955 $a pc10 2003-09-10 $a pv12 2004-06-23 to SSCD; $h sj05 2004-11-30 $e sj05 2004-11-30 to Shelf.
010 $a 2003112666
020 $a 0761542892
040 $a DLC $c DLC $d DLC
050 00 $a MLCM 2004/03312 (G)
245 10 $a 007, everything or nothing : $b Prima's official strategy guide / $c created by Kaizen Media Group.
246 3 $a Double-O-seven, everything or nothing
246 30 $a Prima's official strategy guide
260 $a Roseville, CA : $b Prima Games, $c c2004.
300 $a 161 p. : $b col. ill. ; $c 28 cm.
500 $a "Platforms: Nintendo GameCube, Macintosh, PC, PlayStation 2 computer entertainment system, Xbox"--P. [4] of cover.
650 0 $a Video games.
710 2 $a Kaizen Media Group.
856 42 $3 Publisher description $u http://www.loc.gov/catdir/description/random052/2003112666.html
Elapsed: 0.150623
Z>
As can be seen, the first record in the result set is from the Index Data test database, and the second is from the Library of Congress database. The result set continues to alternate records round-robin style until the records from one of the databases are exhausted.
This example uses only two back-end databases; more may be used. There is no limitation imposed on the number of databases that may be metasearched in this way: issues of resource usage and administrative complexity dictate the practical limits.
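As a rough sketch of extending the broadcast to a third server, the all virtual database from the configuration above could simply list one more <target>. The host z3950.example.org:210/catalogue is a hypothetical placeholder, not a real service; substitute a back-end you actually have access to.

```xml
<virtual>
  <database>all</database>
  <target>lx2.loc.gov:210/LCDB_MARC8</target>
  <target>indexdata.com/marc</target>
  <!-- hypothetical third back-end, for illustration only -->
  <target>z3950.example.org:210/catalogue</target>
</virtual>
```

The rest of the route (the multi, z3950_client and bounce filters) would stay as it is in the two-target configuration.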
What happens when one of the databases doesn't respond? By default,
the entire multi-database search fails, and the appropriate
diagnostic is returned to the client. This is usually appropriate
during development, when technicians need maximum information, but
can be inconvenient in deployment, when users typically don't want
to be bothered with problems of this kind and prefer just to get
the records from the databases that are available. To obtain this
latter behavior, add an empty <hideunavailable> element inside the
multi filter:
<filter type="multi">
  <hideunavailable/>
</filter>
Under this regime, an error is reported to the client only if all the databases in a multi-database search are unavailable.
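For orientation, here is a sketch of how the tail of the route from the earlier complete configuration might look with this option enabled. It assumes the same filter order as that configuration; only the multi filter changes.

```xml
<filter type="multi">
  <!-- report an error only if every target is unavailable -->
  <hideunavailable/>
</filter>
<filter type="z3950_client">
  <timeout>30</timeout>
</filter>
<filter type="bounce"/>
```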