Name

pazpar2_protocol — The webservice protocol of Pazpar2

DESCRIPTION

Webservice requests are any that refer to filename "search.pz2". Arguments are GET-style parameters. Argument 'command' is always required and specifies the operation to perform. Any request not recognized as a webservice request is forwarded to the HTTP server specified in the configuration using the proxy setting. This way, a regular webserver can host the user interface (itself dynamic or static HTML), and Ajax-style calls can be used from JS (or any other client-based scripting environment) to interact with the search logic in Pazpar2.
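As a minimal sketch of how a client might assemble such a request, the Python fragment below builds a GET URL for a given command. The base URL and the helper name command_url are assumptions made for illustration; they are not part of the protocol.

```python
from urllib.parse import urlencode

# Hypothetical location of a Pazpar2 instance; adjust to your setup.
BASE_URL = "http://localhost:9004/search.pz2"


def command_url(command, **params):
    """Return a webservice URL for `command` with GET-style parameters."""
    query = {"command": command}  # 'command' is always required
    query.update(params)
    # urlencode percent-encodes values and turns spaces into '+'.
    return BASE_URL + "?" + urlencode(query)


print(command_url("ping", session="2044502273"))
```

Any other HTTP client or language works equally well; the only requirement stated above is that the request refers to search.pz2 and carries a command argument.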

Each command is described in the subsections that follow.

info

Returns version and statistics about the Pazpar2 instance.

init

Initializes a session. Returns session ID to be used in subsequent requests. If a server ID is given in the Pazpar2 server section, then that is included in the session ID as suffix after a period (.).

If the init command is performed as an HTTP GET request, service and settings from local files are used. The service parameter may select a particular local service.

If the init command is performed as an HTTP POST request and the content-type is text/xml, the content is parsed as XML and treated as the service for the session. The root element should be service. Refer to the description of the service format. The posting of a service appeared in Pazpar2 version 1.2.1.

Example:

     search.pz2?command=init
    

Response:

    <init>
     <status>OK</status>
     <session>2044502273</session>
    </init>

The init command may take a number of setting parameters, similar to the 'settings' command described below. These settings are immediately applied to the new session. Other parameters for init are:

clear

If this is defined and the value is non-zero, the session will not use the predefined databases in the configuration; only those specified in the settings parameters (per session databases).

service

If this is defined it specifies a service ID. Makes the session use the service with this ID. If this setting is omitted, the session will use the unnamed service in the Pazpar2 configuration.
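The following Python sketch shows one way a client could extract the session ID from an init response like the one above. The function name parse_init is hypothetical, and the sketch assumes that a successful response always carries a status of OK.

```python
import xml.etree.ElementTree as ET


def parse_init(xml_text):
    """Return the session ID from an <init> response, or raise on failure."""
    root = ET.fromstring(xml_text)
    status = root.findtext("status")
    if root.tag != "init" or status != "OK":
        raise ValueError(f"init failed: status={status!r}")
    # Keep the ID as a string: it may carry a ".serverid" suffix.
    return root.findtext("session")


response = """<init>
 <status>OK</status>
 <session>2044502273</session>
</init>"""
print(parse_init(response))
```

The ID is deliberately not converted to an integer, since a configured server ID is appended after a period.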

ping

Keeps a session alive. An idle session will time out after one minute. The ping command can be used to keep the session alive, absent other activity. It is suggested that any browser client have a simple alarm handler which sends a ping every 50 seconds or so, once a session has been initialized.

Example:

     search.pz2?command=ping&session=2044502273

    

Response:

<ping>
  <status>OK</status>
</ping>
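A keep-alive handler of the kind suggested above could be sketched in Python as follows. The names keep_alive and send_ping are hypothetical; send_ping stands for whatever function issues the actual ping request. The demonstration uses a very short interval and a fake ping so that it finishes immediately.

```python
import threading


def keep_alive(send_ping, stop_event, interval=50.0):
    """Call send_ping() every `interval` seconds until stop_event is set."""
    # Event.wait returns False on timeout and True once the event is set.
    while not stop_event.wait(interval):
        send_ping()


# Demonstration with a fake ping that stops the loop after two calls.
stop = threading.Event()
sent = []


def fake_ping():
    sent.append("ping")
    if len(sent) == 2:
        stop.set()


keep_alive(fake_ping, stop, interval=0.01)
print(len(sent))
```

In a real client the loop would typically run in a background thread (threading.Thread with daemon=True) with the default 50-second interval, and stop_event would be set when the session is no longer needed.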

settings

The settings command applies session-specific settings to one or more databases. A typical function of this is to enable access to restricted resources for registered users, or to set a user- or library-specific username/password to use against a target.

Each setting parameter has the form name[target]=value, where name is the name of the setting (e.g. pz:authentication), target is a target ID, or possibly a wildcard, and value is the desired value for the setting.

Because the settings command manipulates potentially sensitive information, it is possible to configure Pazpar2 to only allow access to this command from a trusted site -- usually from server-side scripting, which in turn is responsible for authenticating the user, and possibly determining which resources they have access to, etc.

Note

As a shortcut, it is also possible to override settings directly in the init command.

If the settings command is performed as an HTTP POST and the content-type is text/xml, the content is parsed as XML and treated as settings, with a format identical to local settings files. The posting of settings appeared in Pazpar2 version 1.2.1.

Example:

search.pz2?command=settings&session=2044502273&pz:allow[search.com:210/db1]=1
      

Response:

<settings>
  <status>OK</status>
</settings>
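To illustrate the name[target]=value form, here is a small Python sketch that assembles the query string for a settings request. The helper name settings_query is hypothetical. The parameter name is left unencoded, as in the example above, because this document does not state whether percent-encoded names are accepted; only the value is percent-encoded.

```python
from urllib.parse import quote


def settings_query(session, settings):
    """Build the query string for a settings command.

    `settings` is an iterable of (name, target, value) triples.
    """
    parts = ["command=settings", "session=" + quote(str(session), safe="")]
    for name, target, value in settings:
        # name[target] is written literally; the value is percent-encoded.
        parts.append(f"{name}[{target}]=" + quote(str(value), safe=""))
    return "&".join(parts)


qs = settings_query("2044502273", [("pz:allow", "search.com:210/db1", "1")])
print("search.pz2?" + qs)
```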

search

Launches a search. Parameters:

session

Session ID

query

CCL query

filter

Limits the search to a given set of targets specified by the filter. The filter consists of a comma-separated list of setting+operator+args pairs, all of which must be satisfied (matched) for Pazpar2 to include the target. The setting is a Pazpar2 setting (such as pz:id). The operator is either = (string match) or ~ (substring match). The args is a list of values separated by |. If any of these values matches, the key-value pair is matched.

In Pazpar2 1.13.0 and later, the filter can be prefixed with a vertical bar (|) as its first character. In this case, Pazpar2 includes the target if any of the key-value pairs matches.

limit

Narrows the search by one or more fields (typically facets). The limit is a sequence of one or more name=args pairs separated by commas. The args is a list of values separated by vertical bar (|). The meaning of | is alternative (i.e. OR). A value that contains a comma (,), a vertical bar (|) or a backslash itself must have that character preceded by a backslash (\). The pz:limitmap configuration item defines how the searches are mapped to a database.

startrecs

Specifies the first record to retrieve from each target. The first record in a result set for a target is numbered 0, next record is numbered 1. By default startrecs is 0.

maxrecs

Specifies the maximum number of records to retrieve from each target. The default value is 100. This setting has the same meaning as the per-target setting pz:maxrecs. If pz:maxrecs is set, it takes precedence over argument maxrecs.

sort

Specifies sort criteria. The argument is a comma-separated list (no whitespace allowed) of sort fields, with the highest-priority field first. A sort field may be followed by a colon followed by the number '0' (decreasing) or '1' (increasing). Default sort order is decreasing. Sort field names can be any field name designated as a sort field in the pazpar2.cfg file, or the special names 'relevance', 'retrieval' and 'position'.

Sort type 'position' sorts by position/offset for each database. Sort type 'retrieval' sorts by position of retrieval (first record retrieved is 1, second record is 2, etc.).

If not specified here or as sort-default in pazpar2.cfg, Pazpar2 will default to the built-in 'relevance' ranking.

Having sort criteria at search is important for targets that support native sorting in order to get best results. Pazpar2 will trigger a new search if the sort criteria change from Pazpar2-based to target-based sorting or vice versa.

mergekey

Sets mergekey for this search, and for the rest of the session, or until another mergekey is given for show/search. The mergekey value is a comma-separated list with one or more names as they appear in the service description equivalent to mergekey="optional" inside a metadata element. If the empty string is given for mergekey, it is disabled and the rest of the session will use the default mergekey from service or stylesheet.

This facility, "dynamic mergekey", appeared in Pazpar2 version 1.6.31.

rank

Sets rank method for this search, and for the rest of the session, or until another rank is given for show/search. The rank value is a comma-separated list of field=value pairs. The format is the same as rank for a metadata element. If the empty string is given for rank, it is disabled and the rest of the session will use the default rank from metadata or stylesheet.

This facility, "dynamic ranking", appeared in Pazpar2 version 1.6.31.
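The filter and limit syntaxes described above can be sketched with a few Python helpers. All function names are hypothetical. Only the limit parameter is documented as supporting backslash escaping, so the filter helper does no escaping; the resulting strings still need ordinary URL encoding when placed in a request.

```python
def build_filter(conditions, match_any=False):
    """conditions: iterable of (setting, operator, values).

    operator is '=' (string match) or '~' (substring match).
    """
    pairs = []
    for setting, op, values in conditions:
        if op not in ("=", "~"):
            raise ValueError(f"unsupported operator: {op!r}")
        pairs.append(f"{setting}{op}{'|'.join(values)}")
    # A leading '|' means "any pair may match" instead of "all must match".
    prefix = "|" if match_any else ""
    return prefix + ",".join(pairs)


def escape_limit_value(value):
    # Backslash first, so escapes added for ',' and '|' are not doubled.
    for ch in ("\\", ",", "|"):
        value = value.replace(ch, "\\" + ch)
    return value


def build_limit(fields):
    """fields: iterable of (name, values); values are OR-ed together."""
    return ",".join(
        f"{name}={'|'.join(escape_limit_value(v) for v in values)}"
        for name, values in fields
    )


print(build_filter([("pz:id", "=", ["a", "b"]), ("pz:name", "~", ["Library"])]))
print(build_limit([("author", ["Knuth, Donald"]), ("date", ["1974", "1975"])]))
```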

Example:

search.pz2?session=2044502273&command=search&query=computer+science

     

Response:

<search>
  <status>OK</status>
</search>
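The sort argument accepted by search (and by show) can be assembled as in the following Python sketch. The helper name build_sort and the use of None to mean "no explicit direction" are assumptions made for illustration.

```python
def build_sort(fields):
    """fields: iterable of (name, increasing).

    increasing is True (':1'), False (':0') or None (no suffix, so the
    default decreasing order applies).
    """
    parts = []
    for name, increasing in fields:
        if increasing is None:
            parts.append(name)
        else:
            parts.append(f"{name}:{1 if increasing else 0}")
    spec = ",".join(parts)
    if any(ch.isspace() for ch in spec):
        raise ValueError("whitespace is not allowed in a sort specification")
    return spec


# Highest-priority field first.
print(build_sort([("title", True), ("date", False), ("relevance", None)]))
```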
     

stat

Provides status information about an ongoing search. Parameters:

session

Session ID

Example:

search.pz2?session=2044502273&command=stat
    

Output:

<stat>
  <activeclients>26</activeclients>
  <hits>84919</hits>            -- Total hitcount
  <records>742</records>        -- Total number of records fetched in last query
  <clients>45</clients>         -- Total number of associated clients
  <unconnected>0</unconnected>  -- Number of disconnected clients
  <connecting>0</connecting>    -- Number of clients in connecting state
  <working>26</working>         -- ... working (searching, presenting, etc.)
  <idle>2</idle>                -- ... idle (not doing anything)
  <failed>0</failed>            -- ... Connection failed
  <error>13</error>             -- ... Error was produced somewhere
</stat>
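A client polling stat might read the counters as in this Python sketch. The function name parse_stat is hypothetical, and treating an activeclients value of 0 as "search finished" is an assumption of this example rather than something stated by the protocol. The sample below is an abbreviated stat response without the explanatory annotations.

```python
import xml.etree.ElementTree as ET


def parse_stat(xml_text):
    """Return the numeric counters of a <stat> response as a dict of ints."""
    root = ET.fromstring(xml_text)
    counters = {}
    for child in root:
        text = (child.text or "").strip()
        if text.isdigit():  # skip any non-numeric elements
            counters[child.tag] = int(text)
    return counters


sample = """<stat>
  <activeclients>26</activeclients>
  <hits>84919</hits>
  <records>742</records>
  <clients>45</clients>
  <working>26</working>
  <idle>2</idle>
</stat>"""

counters = parse_stat(sample)
# Assumption: no active clients means nothing more will arrive.
finished = counters["activeclients"] == 0
print(counters["hits"], counters["records"], finished)
```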
     

show

Shows records retrieved. Parameters:

session

Session ID

start

First record to show - 0-indexed.

num

Number of records to show. If omitted, 20 is used.

block

If block is set to 1, the command will hang until there are records ready to display. Use this to show first records quickly without requiring rapid polling.

If block is set to preferred, the command will wait until records have been received from all databases with the preferred setting.

sort

Specifies sort criteria. The argument is a comma-separated list (no whitespace allowed) of sort fields, with the highest-priority field first. A sort field may be followed by a colon followed by the number '0' (decreasing) or '1' (increasing). Default sort order is decreasing. Sort field names can be any field name designated as a sort field in the pazpar2.cfg file, or the special names 'relevance', 'retrieval' and 'position'.

Sort type 'position' sorts by position/offset for each database. Sort type 'retrieval' sorts by position of retrieval (first record retrieved is 1, second record is 2, etc.).

If not specified here or as sort-default in pazpar2.cfg, then Pazpar2 will default to the built-in 'relevance' ranking.

Having sort criteria at search is important for targets that support native sorting in order to get best results. Pazpar2 will trigger a new search if the sort criteria change from Pazpar2-based to target-based sorting.

For targets where pz:sortmap is defined, a sort operation will be executed (which will possibly include extending the search).

mergekey

Sets mergekey for this show, and for the rest of the session, or until another mergekey is given for show/search. The mergekey value is a comma-separated list with one or more names as they appear in the service description equivalent to mergekey="optional" inside a metadata element. If the empty string is given for mergekey, it is disabled and rest of session will use the default mergekey from service or stylesheet.

This facility, "dynamic mergekey", appeared in Pazpar2 version 1.6.31.

rank

Sets rank method for this show, and for the rest of the session, or until another rank is given for show/search. The rank value is a comma-separated list of field=value pairs. The format is the same as rank for a metadata element. If the empty string is given for rank, it is disabled and rest of session will use the default rank from metadata or stylesheet.

This facility, "dynamic ranking", appeared in Pazpar2 version 1.6.31.

snippets

If specified and set to 1, then data will include snippets marked with <match> tags. Otherwise snippets will not be included.

This facility, "snippets", appeared in Pazpar2 version 1.6.32.

version

If specified and set to 2, enables Pazpar2 to return approximations of hits and counts when doing record filtering using the limit parameter on search and a limitmap with a value of "local:".

This facility, "version", appeared in Pazpar2 version 1.6.13.

Example:

search.pz2?session=2044502273&command=show&start=0&num=2&sort=title:1

Output:

<show>
  <status>OK</status>
  <activeclients>3</activeclients>     -- How many clients are still working
  <merged>6</merged>                   -- Number of merged records
  <total>7</total>                     -- Total of all hitcounts
  <start>0</start>                     -- The start number you requested
  <num>2</num>                         -- Number of records retrieved
  <hit>
    <md-title>How to program a computer, by Jack Collins</md-title>
    <count>2</count>                   -- Number of merged records
    <recid>6</recid>                   -- Record ID for this record
  </hit>
  <hit>
    <md-title>
  Computer processing of dynamic images from an Anger scintillation camera :
  the proceedings of a workshop /
    </md-title>
    <recid>2</recid>
  </hit>
</show>
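The response above, minus its explanatory annotations, can be turned into a simple list of hits as in this Python sketch. The function name parse_show is hypothetical. The second sample hit has no count element, so the sketch assumes a count of 1 in that case; that default is a guess, not documented behavior.

```python
import xml.etree.ElementTree as ET


def parse_show(xml_text):
    """Return (total, hits) from a <show> response."""
    root = ET.fromstring(xml_text)
    hits = []
    for hit in root.findall("hit"):
        # Collapse the line breaks and indentation inside long titles.
        title = " ".join((hit.findtext("md-title") or "").split())
        hits.append({
            "recid": hit.findtext("recid"),
            "title": title,
            # Assumption: a missing <count> means an unmerged record.
            "count": int(hit.findtext("count", default="1")),
        })
    return int(root.findtext("total")), hits


sample = """<show>
  <status>OK</status>
  <activeclients>3</activeclients>
  <merged>6</merged>
  <total>7</total>
  <start>0</start>
  <num>2</num>
  <hit>
    <md-title>How to program a computer, by Jack Collins</md-title>
    <count>2</count>
    <recid>6</recid>
  </hit>
  <hit>
    <md-title>
  Computer processing of dynamic images from an Anger scintillation camera :
  the proceedings of a workshop /
    </md-title>
    <recid>2</recid>
  </hit>
</show>"""

total, hits = parse_show(sample)
for h in hits:
    print(h["recid"], h["count"], h["title"])
```

The recid values are what the record command expects in its id parameter.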
     

record

Retrieves a detailed record. Unlike the show command, this command returns metadata records before merging takes place. Parameters:

session

Session ID

id

Record ID as provided by the show command.

offset

This optional parameter is an integer which, when given, makes Pazpar2 return the original record for a specific target. The record set from the first target is numbered 0, the second record set is numbered 1, etc. The nativesyntax setting, as usual, is used to determine how to create XML from the original record, unless parameter binary is given, in which case the record is fetched as "raw" from ZOOM C (raw, original record).

When offset/checksum is not given, the Pazpar2 metadata for the record is returned, with metadata for each target's data specified in a "location" list.

checksum

This optional parameter is a string which, when given, makes Pazpar2 return the original record for a specific target. The checksum is returned as attribute 'checksum' in element 'location' for the show command and the record command (when checksum and offset are not given). The nativesyntax setting, as usual, is used to determine how to create XML from the original record, unless parameter binary is given, in which case the record is fetched as "raw" from ZOOM C (raw, original record).

When offset/checksum is not given, the Pazpar2 metadata for the record is returned, with metadata for each target's data specified in a 'location' list.

nativesyntax

This optional parameter can be used to override pz:nativesyntax as given for the target. This allows an alternative nativesyntax to be used for original records (see parameter offset above).

syntax

This optional parameter is the record syntax used for raw transfer (i.e. when offset is specified). If syntax is not given, but offset is used, the value of pz:requestsyntax is used.

esn

This optional parameter is the element set name used for retrieval of a raw record (i.e. when offset is specified). If esn is not given, but offset is used, the value of pz:elements is used.

binary

This optional parameter enables "binary" response for retrieval of an original record (i.e. when offset is specified). For binary response, the record by default is fetched from ZOOM C using the "raw" option or by parameter nativesyntax if given.

snippets

If specified and set to 1, then data will include snippets marked with <match> tags. Otherwise snippets will not be included.

This facility, "snippets", appeared in Pazpar2 version 1.6.32.

Example:

search.pz2?session=605047297&command=record&id=3

Example output:

<record>
  <md-title>
	The Puget Sound Region : a portfolio of thematic computer maps /
  </md-title>
  <md-date>1974</md-date>
  <md-author>Mairs, John W.</md-author>
  <md-subject>Cartography</md-subject>
</record>
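A client might collect the md- elements of such a response into a dictionary, as in this Python sketch. The function name record_metadata is hypothetical, and the sketch only looks at top-level elements; it makes no attempt to interpret per-target location data.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def record_metadata(xml_text):
    """Map metadata names (without the 'md-' prefix) to lists of values."""
    root = ET.fromstring(xml_text)
    fields = defaultdict(list)
    for elem in root:
        if elem.tag.startswith("md-") and elem.text:
            # Normalize surrounding whitespace and line breaks.
            fields[elem.tag[3:]].append(" ".join(elem.text.split()))
    return dict(fields)


sample = """<record>
  <md-title>
	The Puget Sound Region : a portfolio of thematic computer maps /
  </md-title>
  <md-date>1974</md-date>
  <md-author>Mairs, John W.</md-author>
  <md-subject>Cartography</md-subject>
</record>"""

meta = record_metadata(sample)
print(meta["title"][0])
print(meta["author"], meta["date"])
```

Values are kept in lists because a metadata element may occur more than once in a record.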

stop

Makes Pazpar2 stop further search and retrieval for busy databases.

termlist

Retrieves term list(s). Parameters:

session

Session ID

name

Comma-separated list of termlist names. If omitted, all termlists are returned.

num

Maximum number of entries to return. The default is 15.

version

If specified and set to 2, enables Pazpar2 to return approximations of hits and counts when doing record filtering using the limit parameter on search and a limitmap with a value of "local:".

This facility, "version", appeared in Pazpar2 version 1.6.13.

Example:

search.pz2?session=2044502273&command=termlist&name=author,subject

Output:

<termlist>
  <activeclients>3</activeclients>
  <list name="author">
    <term>
      <name>Donald Knuth</name>
      <frequency>10</frequency>
    </term>
    <term>
      <name>Robert Pirsig</name>
      <frequency>2</frequency>
    </term>
  </list>
  <list name="subject">
    <term>
      <name>Computer programming</name>
      <frequency>10</frequency>
    </term>
  </list>
</termlist>
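The termlist output above maps naturally onto a dictionary of facets, as this Python sketch shows. The function name parse_termlist is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_termlist(xml_text):
    """Return {list name: [(term name, frequency), ...]}."""
    root = ET.fromstring(xml_text)
    result = {}
    for lst in root.findall("list"):
        result[lst.get("name")] = [
            (term.findtext("name"), int(term.findtext("frequency")))
            for term in lst.findall("term")
        ]
    return result


sample = """<termlist>
  <activeclients>3</activeclients>
  <list name="author">
    <term>
      <name>Donald Knuth</name>
      <frequency>10</frequency>
    </term>
    <term>
      <name>Robert Pirsig</name>
      <frequency>2</frequency>
    </term>
  </list>
  <list name="subject">
    <term>
      <name>Computer programming</name>
      <frequency>10</frequency>
    </term>
  </list>
</termlist>"""

facets = parse_termlist(sample)
for name, terms in facets.items():
    print(name, terms)
```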

For the special termlist name "xtargets", results are returned about the targets which have returned the most hits. The 'term' subtree has additional elements, specifically a state and diagnostic field. In the example below, a target ID is returned in place of 'name'. This may or may not change later.

Example:

<term>
  <name>library2.mcmaster.ca</name>
  <frequency>11734</frequency>         -- Number of hits
  <state>Client_Idle</state>           -- See the description of 'bytarget' below
  <diagnostic>0</diagnostic>           -- Z39.50 diagnostic codes
</term>

bytarget

Returns information about the status of each active client. Parameters:

session

Session ID

version

If specified and set to 2, enables Pazpar2 to return approximations of hits and counts when doing record filtering using the limit parameter on search and a limitmap with a value of "local:".

This facility, "version", appeared in Pazpar2 version 1.6.13.

Example:

search.pz2?session=605047297&command=bytarget

Example output:

<bytarget>
  <status>OK</status>
  <target>
    <id>lx2.loc.gov:210/LCDB_MARC8</id>
    <name>Library of Congress</name>
    <hits>10000</hits>
    <diagnostic>0</diagnostic>
    <records>20</records>
    <filtered>0</filtered>
    <state>Client_Working</state>
    <query_type>pqf</query_type>
    <query_data>@attr 1=1016 birds</query_data>
  </target>
  <!-- ... more target nodes below as necessary -->
</bytarget>

The following client states are defined: Client_Connecting, Client_Idle, Client_Working, Client_Error, Client_Failed, Client_Disconnected.
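To show how a client might use the per-target state, the Python sketch below groups target IDs by their reported state. The function name targets_by_state is hypothetical, and the second target in the sample is invented purely for the demonstration.

```python
import xml.etree.ElementTree as ET


def targets_by_state(xml_text):
    """Return {state: [target id, ...]} from a <bytarget> response."""
    root = ET.fromstring(xml_text)
    groups = {}
    for target in root.findall("target"):
        state = target.findtext("state")
        groups.setdefault(state, []).append(target.findtext("id"))
    return groups


sample = """<bytarget>
  <status>OK</status>
  <target>
    <id>lx2.loc.gov:210/LCDB_MARC8</id>
    <name>Library of Congress</name>
    <hits>10000</hits>
    <diagnostic>0</diagnostic>
    <records>20</records>
    <state>Client_Working</state>
  </target>
  <target>
    <id>example.org:210/demo</id>
    <name>Invented target for this sketch</name>
    <hits>0</hits>
    <diagnostic>0</diagnostic>
    <records>0</records>
    <state>Client_Idle</state>
  </target>
</bytarget>"""

groups = targets_by_state(sample)
print(groups.get("Client_Working", []))
print(groups.get("Client_Error", []))
```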

service

Returns service definition (XML). Parameters:

session

Session ID

The service command appeared in Pazpar2 version 1.6.32.

SEE ALSO

Pazpar2: pazpar2(8)

Pazpar2 Configuration: pazpar2_conf(5)