pazpar2_protocol — The webservice protocol of Pazpar2
Webservice requests are any that refer to filename "search.pz2". Arguments are GET-style parameters. Argument 'command' is always required and specifies the operation to perform. Any request not recognized as a webservice request is forwarded to the HTTP server specified in the configuration using the proxy setting. This way, a regular webserver can host the user interface (itself dynamic or static HTML), and Ajax-style calls can be used from JS (or any other client-based scripting environment) to interact with the search logic in Pazpar2.
Each command is described in the subsections that follow.
The init command initializes a session and returns a session ID to be used in subsequent requests. If a server ID is given in the Pazpar2 server section, it is included in the session ID as a suffix after a period (.).
If the init command is performed as an HTTP GET request, service and settings from local files are used. The service parameter may choose a particular local service.
If the init command is performed as an HTTP POST request and the content-type is text/xml, then the content is parsed as XML and treated as the service for the session. The root element should be service. Refer to the description of the service format. The posting of a service appeared in Pazpar2 version 1.2.1.
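A minimal Python sketch of the POST variant follows. It only builds the request and does not send it. The endpoint address, the helper name init_post_request, the placement of command=init in the query string, and the placeholder body "<service/>" are all assumptions for illustration; a real service definition follows the service format documentation.

```python
import urllib.request

# Assumed endpoint; replace with the address of your own Pazpar2 instance.
BASE_URL = "http://localhost:9004/search.pz2"

def init_post_request(service_xml, base_url=BASE_URL):
    """Build (but do not send) an init request that posts a service definition.

    Assumption: the command is still passed as a GET-style parameter in the URL.
    """
    return urllib.request.Request(
        base_url + "?command=init",
        data=service_xml.encode("utf-8"),
        headers={"Content-Type": "text/xml"},
        method="POST",
    )

# Placeholder body only; it stands in for a complete service definition.
request = init_post_request("<service/>")
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```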
Example:
search.pz2?command=init
Response:
<init> <status>OK</status> <session>2044502273</session> </init>
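The exchange above can be sketched in Python as follows. The endpoint address and the helper names init_url and parse_session are assumptions for illustration; the response is parsed from a literal string, so no running server is needed.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumed endpoint; replace with the address of your own Pazpar2 instance.
BASE_URL = "http://localhost:9004/search.pz2"

def init_url(base_url=BASE_URL, **params):
    """Build the URL of an init request; keyword arguments become GET parameters."""
    query = {"command": "init"}
    query.update(params)
    return base_url + "?" + urlencode(query)

def parse_session(xml_text):
    """Return the session ID from an <init> response, or raise ValueError."""
    root = ET.fromstring(xml_text)
    if root.tag != "init" or root.findtext("status") != "OK":
        raise ValueError("init did not succeed")
    session = root.findtext("session")
    if not session:
        raise ValueError("init response carries no session ID")
    # Kept as a string, because the ID may carry a ".serverid" suffix.
    return session

sample = "<init> <status>OK</status> <session>2044502273</session> </init>"
session_id = parse_session(sample)
```

The session ID is treated as an opaque string rather than an integer, since a configured server ID is appended after a period.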
The init command may take a number of setting parameters, similar to the 'settings' command described below. These settings are immediately applied to the new session. Other parameters for init are:
If the clear parameter is defined and its value is non-zero, the session will not use the predefined databases in the configuration; it will use only those specified in the settings parameters (per-session databases).
If the service parameter is defined, it specifies a service ID, and the session uses the service with this ID. If it is omitted, the session uses the unnamed service in the Pazpar2 configuration.
The ping command keeps a session alive. An idle session times out after one minute; ping can be used to keep it alive in the absence of other activity. It is suggested that any browser client have a simple alarm handler that sends a ping every 50 seconds or so once a session has been initialized.
Example:
search.pz2?command=ping&session=2044502273
Response:
<ping> <status>OK</status> </ping>
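The suggested alarm handler could look roughly like the sketch below, written in Python rather than browser script. The endpoint address, the helper names, and the caller-supplied send function (which would perform the actual HTTP GET) are assumptions for illustration.

```python
import threading
from urllib.parse import urlencode

# Assumed endpoint; replace with the address of your own Pazpar2 instance.
BASE_URL = "http://localhost:9004/search.pz2"

def ping_url(session, base_url=BASE_URL):
    """Build the URL of a ping request for the given session."""
    return base_url + "?" + urlencode({"command": "ping", "session": session})

def keep_alive(session, send, stop, interval=50.0):
    """Call send(url) every `interval` seconds until the `stop` event is set.

    `send` is a hypothetical callable that issues the HTTP GET.
    """
    url = ping_url(session)
    # Event.wait returns False on timeout and True once the event is set.
    while not stop.wait(interval):
        send(url)
```

In practice keep_alive would run in a background thread, and the application would set the stop event when the session is no longer needed.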
The settings command applies session-specific settings to one or more databases. A typical function of this is to enable access to restricted resources for registered users, or to set a user- or library-specific username/password to use against a target.
Each setting parameter has the form name[target]=value, where name is the name of the setting (e.g. pz:authentication), target is a target ID, or possibly a wildcard, and value is the desired value for the setting.
Because the settings command manipulates potentially sensitive information, it is possible to configure Pazpar2 to only allow access to this command from a trusted site -- usually from server-side scripting, which in turn is responsible for authenticating the user, and possibly for determining which resources they have access to, etc.
As a shortcut, it is also possible to override settings directly in the init command.
If the settings command is performed as HTTP POST and the content-type is text/xml, then the content is XML parsed and treated as settings - with a format identical to local settings files. The posting of settings appeared in Pazpar2 version 1.2.1.
Example:
search.pz2?command=settings&session=2044502273&pz:allow[search.com:210/db1]=1
Response:
<settings> <status>OK</status> </settings>
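As a sketch of how a client might assemble name[target]=value parameters, the Python below builds a settings URL and decodes it again to show that the percent-encoding round-trips. The endpoint address and the helper name settings_url are assumptions for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qsl

# Assumed endpoint; replace with the address of your own Pazpar2 instance.
BASE_URL = "http://localhost:9004/search.pz2"

def settings_url(session, settings, base_url=BASE_URL):
    """Build a settings URL from (name, target, value) triples."""
    params = [("command", "settings"), ("session", session)]
    for name, target, value in settings:
        # Each setting is sent as a parameter named name[target].
        params.append((f"{name}[{target}]", str(value)))
    # urlencode percent-encodes the colon, slash and brackets in the names.
    return base_url + "?" + urlencode(params)

url = settings_url("2044502273", [("pz:allow", "search.com:210/db1", 1)])
decoded = parse_qsl(urlsplit(url).query)
```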
The search command launches a search. Parameters:
The session parameter is the session ID.
The query parameter is the CCL query.
The filter parameter limits the search to a given set of targets. The filter consists of a comma-separated list of setting+operator+args pairs, all of which must be satisfied (matched) for Pazpar2 to include the target. The setting is a Pazpar2 setting (such as pz:id). The operator is either = (string match) or ~ (substring match). The args is a list of values separated by |. If any of these values matches, the pair is matched.
As of Pazpar2 1.13.0, the filter can be prefixed with a vertical bar (|) as its first character. In this case, Pazpar2 includes the target if any of the pairs matches.
The limit parameter narrows the search by one or more fields (typically facets). The limit is a sequence of one or more name=args pairs separated by commas. The args is a list of values separated by vertical bars (|). The meaning of | is alternative (i.e. OR). In a value, a comma (,), a vertical bar (|) or a backslash must itself be preceded by a backslash (\). The pz:limitmap configuration item defines how the searches are mapped to a database.
The startrecs parameter specifies the first record to retrieve from each target. The first record in a result set for a target is numbered 0, the next record is numbered 1. By default startrecs is 0.
The maxrecs parameter specifies the maximum number of records to retrieve from each target. The default value is 100. This setting has the same meaning as the per-target setting pz:maxrecs. If pz:maxrecs is set, it takes precedence over the maxrecs argument.
The sort parameter specifies sort criteria. The argument is a comma-separated list (no whitespace allowed) of sort fields, with the highest-priority field first. A sort field may be followed by a colon followed by the number '0' (decreasing) or '1' (increasing). Default sort order is decreasing. Sort field names can be any field name designated as a sort field in the pazpar2.cfg file, or the special names 'relevance', 'retrieval' and 'position'.
Sort type 'position' sorts by position/offset for each database. Sort type 'retrieval' sorts by position of retrieval (first record retrieved is 1, second record is 2, etc.).
If not specified here or as sort-default in pazpar2.cfg, Pazpar2 will default to the built-in 'relevance' ranking.
Having sort criteria at search is important for targets that support native sorting in order to get best results. Pazpar2 will trigger a new search if the sort criteria change from Pazpar2-based to target-based sorting or vice versa.
The mergekey parameter sets the mergekey for this search and the rest of the session, or until another mergekey is given for show/search. The mergekey value is a comma-separated list of one or more names as they appear in the service description, equivalent to mergekey="optional" inside a metadata element. If the empty string is given for mergekey, it is disabled and the rest of the session will use the default mergekey from the service or stylesheet.
This facility, "dynamic mergekey", appeared in Pazpar2 version 1.6.31.
The rank parameter sets the rank method for this search and the rest of the session, or until another rank is given for show/search. The rank value is a comma-separated list of field=value pairs. The format is the same as rank for a metadata element. If the empty string is given for rank, it is disabled and the rest of the session will use the default rank from metadata or stylesheet.
This facility, "dynamic ranking", appeared in Pazpar2 version 1.6.31.
Example:
search.pz2?session=2044502273&command=search&query=computer+science
Response:
<search> <status>OK</status> </search>
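The filter, limit and sort syntaxes described for the search command lend themselves to small helper functions. The Python below is a sketch only: the endpoint address and all helper names are assumptions, and filter values are passed through unescaped because the text describes escaping only for limit values.

```python
from urllib.parse import urlencode

# Assumed endpoint; replace with the address of your own Pazpar2 instance.
BASE_URL = "http://localhost:9004/search.pz2"

def escape_limit_value(value):
    """Backslash-escape backslash, comma and vertical bar inside a limit value."""
    return "".join("\\" + ch if ch in "\\,|" else ch for ch in value)

def build_limit(limits):
    """limits maps a field name to a list of alternative (OR-ed) values."""
    return ",".join(
        name + "=" + "|".join(escape_limit_value(v) for v in values)
        for name, values in limits.items()
    )

def build_filter(conditions, match_any=False):
    """conditions is a list of (setting, operator, values) triples."""
    parts = []
    for setting, operator, values in conditions:
        if operator not in ("=", "~"):
            raise ValueError("operator must be '=' or '~'")
        parts.append(setting + operator + "|".join(values))
    joined = ",".join(parts)
    # A leading vertical bar switches from "all must match" to "any may match".
    return "|" + joined if match_any else joined

def build_sort(fields):
    """fields is a list of (name, increasing) pairs, highest priority first."""
    return ",".join(f"{name}:{1 if increasing else 0}" for name, increasing in fields)

def search_url(session, query, base_url=BASE_URL, **optional):
    """Build a search URL; `optional` holds already-formatted extra parameters."""
    params = [("session", session), ("command", "search"), ("query", query)]
    params.extend((k, v) for k, v in optional.items() if v is not None)
    return base_url + "?" + urlencode(params)
```

For example, search_url("2044502273", "computer science", sort=build_sort([("title", True)])) yields a URL whose query string carries the sort value in percent-encoded form.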
The stat command provides status information about an ongoing search. Parameters:
The session parameter is the session ID.
Example:
search.pz2?session=2044502273&command=stat
Output:
<stat>
  <activeclients>26</activeclients>
  <hits>84919</hits>                -- Total hitcount
  <records>742</records>            -- Total number of records fetched in last query
  <clients>45</clients>             -- Total number of associated clients
  <unconnected>0</unconnected>      -- Number of disconnected clients
  <connecting>0</connecting>        -- Number of clients in connecting state
  <working>26</working>             -- ... working (searching, presenting, etc.)
  <idle>2</idle>                    -- ... idle (not doing anything)
  <failed>0</failed>                -- ... Connection failed
  <error>13</error>                 -- ... Error was produced somewhere
</stat>
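A client polling stat might read the counters as in the Python sketch below. The sample string repeats the values of the output above without the explanatory annotations, which are not part of the XML. The helper name parse_stat is hypothetical, and treating an activeclients value of 0 as "search finished" is an assumption about how a client could use the counter.

```python
import xml.etree.ElementTree as ET

def parse_stat(xml_text):
    """Return the children of a <stat> response as a dict.

    Purely numeric values become ints; anything else is kept as text.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "stat":
        raise ValueError("not a stat response")
    counters = {}
    for child in root:
        text = (child.text or "").strip()
        counters[child.tag] = int(text) if text.isdigit() else text
    return counters

sample = (
    "<stat><activeclients>26</activeclients><hits>84919</hits>"
    "<records>742</records><clients>45</clients><unconnected>0</unconnected>"
    "<connecting>0</connecting><working>26</working><idle>2</idle>"
    "<failed>0</failed><error>13</error></stat>"
)
counters = parse_stat(sample)
# Assumption: no active clients means there is nothing left to wait for.
search_finished = counters["activeclients"] == 0
```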
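The show command shows the records retrieved. Parameters:
The session parameter is the session ID.
The start parameter is the first record to show, 0-indexed.
The num parameter is the number of records to show. If omitted, 20 is used.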
If block is set to 1, the command will hang until there are records ready to display. Use this to show first records quickly without requiring rapid polling.
If block is set to preferred, the command will wait until records have been received from all databases with the preferred setting.
The sort parameter specifies sort criteria. The argument is a comma-separated list (no whitespace allowed) of sort fields, with the highest-priority field first. A sort field may be followed by a colon followed by the number '0' (decreasing) or '1' (increasing). Default sort order is decreasing. Sort field names can be any field name designated as a sort field in the pazpar2.cfg file, or the special names 'relevance', 'retrieval' and 'position'.
Sort type 'position' sorts by position/offset for each database. Sort type 'retrieval' sorts by position of retrieval (first record retrieved is 1, second record is 2, etc.).
If not specified here or as sort-default in pazpar2.cfg, Pazpar2 will default to the built-in 'relevance' ranking.
Having sort criteria at search is important for targets that support native sorting in order to get best results. Pazpar2 will trigger a new search if the sort criteria change from Pazpar2-based to target-based sorting.
For targets where pz:sortmap is defined, a sort operation will be executed (which will possibly include extending the search).
The mergekey parameter sets the mergekey for this show and the rest of the session, or until another mergekey is given for show/search. The mergekey value is a comma-separated list of one or more names as they appear in the service description, equivalent to mergekey="optional" inside a metadata element. If the empty string is given for mergekey, it is disabled and the rest of the session will use the default mergekey from the service or stylesheet.
This facility, "dynamic mergekey", appeared in Pazpar2 version 1.6.31.
The rank parameter sets the rank method for this show and the rest of the session, or until another rank is given for show/search. The rank value is a comma-separated list of field=value pairs. The format is the same as rank for a metadata element. If the empty string is given for rank, it is disabled and the rest of the session will use the default rank from metadata or stylesheet.
This facility, "dynamic ranking", appeared in Pazpar2 version 1.6.31.
If the snippets parameter is specified and set to 1, the data will include snippets marked with <match> tags. Otherwise snippets will not be included.
This facility, "snippets", appeared in Pazpar2 version 1.6.32.
If the version parameter is specified and set to 2, Pazpar2 may return approximate hits and counts when records are filtered using the limit parameter on search and a limitmap with a value of "local:".
This facility, "version", appeared in Pazpar2 version 1.6.13.
Example:
search.pz2?session=2044502273&command=show&start=0&num=2&sort=title:1
Output:
<show>
  <status>OK</status>
  <activeclients>3</activeclients>   -- How many clients are still working
  <merged>6</merged>                 -- Number of merged records
  <total>7</total>                   -- Total of all hitcounts
  <start>0</start>                   -- The start number you requested
  <num>2</num>                       -- Number of records retrieved
  <hit>
    <md-title>How to program a computer, by Jack Collins</md-title>
    <count>2</count>                 -- Number of merged records
    <recid>6</recid>                 -- Record ID for this record
  </hit>
  <hit>
    <md-title>
      Computer processing of dynamic images from an Anger scintillation camera :
      the proceedings of a workshop /
    </md-title>
    <recid>2</recid>
  </hit>
</show>
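A client might turn a show response into plain data structures as in the Python sketch below. The sample repeats the output above without its annotations. The helper name parse_show is hypothetical, and treating a hit without a count element as a single unmerged record is an assumption.

```python
import xml.etree.ElementTree as ET

def parse_show(xml_text):
    """Extract the counters and the hit list from a <show> response."""
    root = ET.fromstring(xml_text)
    if root.findtext("status") != "OK":
        raise ValueError("show did not succeed")
    hits = []
    for hit in root.findall("hit"):
        hits.append({
            "recid": hit.findtext("recid"),
            "title": " ".join((hit.findtext("md-title") or "").split()),
            # Assumption: a missing <count> means the hit was not merged.
            "count": int(hit.findtext("count", default="1")),
        })
    return {
        "activeclients": int(root.findtext("activeclients")),
        "merged": int(root.findtext("merged")),
        "total": int(root.findtext("total")),
        "hits": hits,
    }

sample = """<show><status>OK</status><activeclients>3</activeclients>
<merged>6</merged><total>7</total><start>0</start><num>2</num>
<hit><md-title>How to program a computer, by Jack Collins</md-title>
<count>2</count><recid>6</recid></hit>
<hit><md-title> Computer processing of dynamic images from an Anger
scintillation camera : the proceedings of a workshop / </md-title>
<recid>2</recid></hit></show>"""
result = parse_show(sample)
```

The recid values can then be passed as the id parameter of the record command.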
The record command retrieves a detailed record. Unlike the show command, this command returns metadata records before merging takes place. Parameters:
The session parameter is the session ID.
The id parameter is the record ID as provided by the show command.
The optional offset parameter is an integer which, when given, makes Pazpar2 return the original record for a specific target. The record set from the first target is numbered 0, the second record set is numbered 1, etc. The nativesyntax setting, as usual, is used to determine how to create XML from the original record, unless the binary parameter is given, in which case the record is fetched as "raw" from ZOOM C (raw, original record).
When offset/checksum is not given, the Pazpar2 metadata for the record is returned, with the metadata for each target's data specified in a "location" list.
The optional checksum parameter is a string which, when given, makes Pazpar2 return the original record for a specific target. The checksum is returned as the attribute 'checksum' of the element 'location' by the show command and by the record command (when checksum and offset are not given). The nativesyntax setting, as usual, is used to determine how to create XML from the original record, unless the binary parameter is given, in which case the record is fetched as "raw" from ZOOM C (raw, original record).
The optional nativesyntax parameter can be used to override pz:nativesyntax as given for the target. This allows an alternative nativesyntax to be used for original records (see the offset parameter above).
The optional syntax parameter is the record syntax used for raw transfer (i.e. when offset is specified). If syntax is not given, but offset is used, the value of pz:requestsyntax is used.
The optional esn parameter is the element set name used for retrieval of a raw record (i.e. when offset is specified). If esn is not given, but offset is used, the value of pz:elements is used.
The optional binary parameter enables a "binary" response for retrieval of an original record (i.e. when offset is specified). For a binary response, the record is by default fetched from ZOOM C using the "raw" option, or as specified by the nativesyntax parameter if given.
If the snippets parameter is specified and set to 1, the data will include snippets marked with <match> tags. Otherwise snippets will not be included.
This facility, "snippets", appeared in Pazpar2 version 1.6.32.
Example:
search.pz2?session=605047297&command=record&id=3
Example output:
<record> <md-title> The Puget Sound Region : a portfolio of thematic computer maps / </md-title> <md-date>1974</md-date> <md-author>Mairs, John W.</md-author> <md-subject>Cartography</md-subject> </record>
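The record parameters combine as ordinary GET parameters. The Python sketch below builds such URLs; the endpoint address and the helper name record_url are assumptions, and sending binary=1 to enable the binary response is also an assumption, since the text does not state which value the parameter expects.

```python
from urllib.parse import urlencode

# Assumed endpoint; replace with the address of your own Pazpar2 instance.
BASE_URL = "http://localhost:9004/search.pz2"

def record_url(session, recid, offset=None, checksum=None, syntax=None,
               esn=None, binary=False, base_url=BASE_URL):
    """Build a record URL; optional parameters are added only when given."""
    params = [("session", session), ("command", "record"), ("id", recid)]
    for name, value in (("offset", offset), ("checksum", checksum),
                        ("syntax", syntax), ("esn", esn)):
        if value is not None:
            params.append((name, str(value)))
    if binary:
        # Assumed value; the text only says the parameter enables binary output.
        params.append(("binary", "1"))
    return base_url + "?" + urlencode(params)

merged_url = record_url("605047297", "3")
original_url = record_url("605047297", "3", offset=0, binary=True)
```

The first URL requests the Pazpar2 metadata for the record; the second requests the original record from the first target's record set.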
The termlist command retrieves term list(s). Parameters:
The session parameter is the session ID.
The name parameter is a comma-separated list of termlist names. If omitted, all termlists are returned.
The num parameter is the maximum number of entries to return. The default is 15.
If the version parameter is specified and set to 2, Pazpar2 may return approximate hits and counts when records are filtered using the limit parameter on search and a limitmap with a value of "local:".
This facility, "version", appeared in Pazpar2 version 1.6.13.
Example:
search.pz2?session=2044502273&command=termlist&name=author,subject
Output:
<termlist> <activeclients>3</activeclients> <list name="author"> <term> <name>Donald Knuth</name> <frequency>10</frequency> </term> <term> <name>Robert Pirsig</name> <frequency>2</frequency> </term> </list> <list name="subject"> <term> <name>Computer programming</name> <frequency>10</frequency> </term> </list> </termlist>
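A termlist response maps naturally onto a dictionary of facet lists. The Python below is a sketch using the output above as sample input; the helper name parse_termlists is hypothetical.

```python
import xml.etree.ElementTree as ET

def parse_termlists(xml_text):
    """Map each termlist name to a list of (term, frequency) pairs."""
    root = ET.fromstring(xml_text)
    result = {}
    for termlist in root.findall("list"):
        result[termlist.get("name")] = [
            (term.findtext("name"), int(term.findtext("frequency")))
            for term in termlist.findall("term")
        ]
    return result

sample = """<termlist><activeclients>3</activeclients>
<list name="author">
<term><name>Donald Knuth</name><frequency>10</frequency></term>
<term><name>Robert Pirsig</name><frequency>2</frequency></term>
</list>
<list name="subject">
<term><name>Computer programming</name><frequency>10</frequency></term>
</list></termlist>"""
facets = parse_termlists(sample)
```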
For the special termlist name "xtargets", results are returned about the targets which have returned the most hits. The 'term' subtree has additional elements, specifically a state and diagnostic field. In the example below, a target ID is returned in place of 'name'. This may or may not change later.
Example:
<term>
  <name>library2.mcmaster.ca</name>
  <frequency>11734</frequency>     -- Number of hits
  <state>Client_Idle</state>       -- See the description of 'bytarget' below
  <diagnostic>0</diagnostic>       -- Z39.50 diagnostic codes
</term>
The bytarget command returns information about the status of each active client. Parameters:
The session parameter is the session ID.
If the version parameter is specified and set to 2, Pazpar2 may return approximate hits and counts when records are filtered using the limit parameter on search and a limitmap with a value of "local:".
This facility, "version", appeared in Pazpar2 version 1.6.13.
Example:
search.pz2?session=605047297&command=bytarget
Example output:
<bytarget> <status>OK</status> <target> <id>lx2.loc.gov:210/LCDB_MARC8</id> <name>Library of Congress</name> <hits>10000</hits> <diagnostic>0</diagnostic> <records>20</records> <filtered>0</filtered> <state>Client_Working</state> <query_type>pqf</query_type> <query_data>@attr 1=1016 birds</query_data> </target> <!-- ... more target nodes below as necessary --> </bytarget>
The following client states are defined: Client_Connecting, Client_Idle, Client_Working, Client_Error, Client_Failed, Client_Disconnected.
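A client could use the bytarget output and the states listed above to report targets that need attention, as in the Python sketch below. The helper names are hypothetical, the second target in the sample is invented for illustration, and flagging Client_Error, Client_Failed or a non-zero diagnostic as a problem is an assumption about what a client may want to surface.

```python
import xml.etree.ElementTree as ET

# Assumption: these two states are the ones worth reporting to the user.
PROBLEM_STATES = {"Client_Error", "Client_Failed"}

def parse_bytarget(xml_text):
    """Return one dict per <target> element of a bytarget response."""
    root = ET.fromstring(xml_text)
    targets = []
    for node in root.findall("target"):
        targets.append({
            "id": node.findtext("id"),
            "name": node.findtext("name"),
            "hits": int(node.findtext("hits", default="0")),
            "records": int(node.findtext("records", default="0")),
            "diagnostic": int(node.findtext("diagnostic", default="0")),
            "state": node.findtext("state"),
        })
    return targets

def problem_targets(targets):
    """Return the IDs of targets in a problem state or with a diagnostic."""
    return [t["id"] for t in targets
            if t["state"] in PROBLEM_STATES or t["diagnostic"] != 0]

# The second target is hypothetical and only illustrates a failure.
sample = """<bytarget><status>OK</status>
<target><id>lx2.loc.gov:210/LCDB_MARC8</id><name>Library of Congress</name>
<hits>10000</hits><diagnostic>0</diagnostic><records>20</records>
<filtered>0</filtered><state>Client_Working</state></target>
<target><id>example.org:210/db</id><name>Example</name>
<hits>0</hits><diagnostic>0</diagnostic><records>0</records>
<filtered>0</filtered><state>Client_Failed</state></target>
</bytarget>"""
targets = parse_bytarget(sample)
```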