In this section, we will test the system by indexing a small set of sample OAI records that are included with the Zebra distribution, running a Zebra server against the newly created database, and searching the indexes with a client that connects to that server.
Go to the examples/oai-pmh subdirectory of the distribution archive, or make a deep copy of the Debian installation
An XML file containing multiple OAI records is located in the sub
Additional OAI test records can be downloaded by running a shell script. The download can take a while, so you may want to abort the script if it runs longer than it takes to brew your coffee.
cd data
./fetch_OAI_data.sh
cd ../
To index these OAI records, type:
zebraidx-2.0 -c conf/zebra.cfg init
zebraidx-2.0 -c conf/zebra.cfg update data
zebraidx-2.0 -c conf/zebra.cfg commit
If you have not installed Zebra yet but have compiled the binaries from this tarball, use the following command form:
../../index/zebraidx -c conf/zebra.cfg this and that
On some systems the Zebra binaries are installed under generic names. In that case, use the following command form:
zebraidx -c conf/zebra.cfg this and that
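The three command forms above differ only in the name of the binary. The following is a minimal sketch, assuming a POSIX shell and that it is run from the examples/oai-pmh directory, that picks whichever binary is available and then runs the three indexing steps. The function names find_zebraidx and index_oai are hypothetical helpers for illustration, not part of Zebra.

```shell
#!/bin/sh
# Sketch only: find_zebraidx and index_oai are hypothetical helper names.

# Print the first zebraidx binary that can be found: the versioned name,
# the generic name, or the binary built inside the source tree.
find_zebraidx() {
    for candidate in zebraidx-2.0 zebraidx ../../index/zebraidx; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Run the init, update and commit steps from the text, stopping at the
# first step that fails.
index_oai() {
    zebraidx_bin=$(find_zebraidx) || {
        echo "no zebraidx binary found" >&2
        return 1
    }
    "$zebraidx_bin" -c conf/zebra.cfg init &&
    "$zebraidx_bin" -c conf/zebra.cfg update data &&
    "$zebraidx_bin" -c conf/zebra.cfg commit
}
```

After loading these definitions into the current shell, calling index_oai runs the same three commands shown earlier, using whichever binary name was found.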
In this command, the word update is followed by the name of a directory: zebraidx updates all files in the hierarchy rooted at
The command option -c conf/zebra.cfg points to the proper
You may wonder how XML content is indexed using XSLT stylesheets. To satisfy your curiosity, run the indexing transformation on the example debugging OAI record:
xsltproc conf/oai2index.xsl data/debug-record.xml
Here you see the OAI record transformed into the indexing XML format. Zebra creates several inverted indexes, and their names and types are clearly visible in the indexing XML format.
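To get a quick overview of the index names without reading the whole output, the transformation result can be filtered. The sketch below assumes that each index element in the indexing XML carries its index name in a name="..." attribute. That layout is an assumption and is not shown in this section, so check it against the actual xsltproc output. The function name list_index_names is hypothetical.

```shell
#!/bin/sh
# Sketch only: list_index_names is a hypothetical helper. It assumes the
# index name is stored in a name="..." attribute (an assumption).

# Read indexing XML on standard input and print each distinct
# name="..." attribute value once, sorted.
list_index_names() {
    # Put every whitespace-separated token on its own line, keep the
    # tokens that start with name="...", and strip everything but the value.
    tr ' \t' '\n\n' |
        sed -n 's/^name="\([^"]*\)".*/\1/p' |
        sort -u
}
```

With the definition loaded, it can be used as a filter: xsltproc conf/oai2index.xsl data/debug-record.xml | list_index_names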
If your indexing command was successful, you are now ready to fire up a server. To start a server on port 9999, type:
zebrasrv-2.0 -c conf/zebra.cfg @:9999
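The command above keeps the server in the foreground, so a client needs a second terminal. As a rough sketch, assuming a POSIX shell, the server can instead be started in the background and stopped later from the same shell. The function names start_server and stop_server, the variable server_pid, and the log file name zebrasrv.log are all hypothetical choices made for this example.

```shell
#!/bin/sh
# Sketch only: start_server, stop_server, server_pid and zebrasrv.log
# are hypothetical names chosen for this example.

# Start the server in the background on the given port (default 9999)
# and remember its process ID in server_pid.
start_server() {
    port=${1:-9999}
    zebrasrv-2.0 -c conf/zebra.cfg "@:$port" >zebrasrv.log 2>&1 &
    server_pid=$!
}

# Stop the server started by start_server and wait for it to exit.
stop_server() {
    kill "$server_pid" 2>/dev/null
    wait "$server_pid" 2>/dev/null || true
}
```

Both functions must be called from the same shell session, because the process ID is kept in a shell variable. If the binaries are installed under generic names, replace zebrasrv-2.0 accordingly.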
The Zebra index that you have just created has a single database
The database contains several OAI records, and the server will
return records in XML format only. The indexing machinery
split the input into individual records behind the scenes.