Pazpar2 conf


Name

pazpar2_conf — Pazpar2 Configuration

Synopsis

pazpar2.conf

DESCRIPTION

The Pazpar2 configuration file, together with any referenced XSLT files, governs Pazpar2's behavior as a client and controls the normalization and extraction of data elements from incoming result records for the purposes of merging, sorting, facet analysis, and display.

The file is specified using the option -f on the Pazpar2 command line. There is not presently a way to reload the configuration file without restarting Pazpar2, although this will most likely be added some time in the future.

FORMAT

The configuration file is XML-structured. It must be well-formed XML. All elements specific to Pazpar2 should belong to the namespace http://www.indexdata.com/pazpar2/1.0 (this is assumed in the following examples). The root element is named "pazpar2". Under the root element are a number of elements which group categories of information. The categories are described below.

threads

This section is optional and is supported for Pazpar2 version 1.3.1 and later. It is identified by element "threads" which may include one attribute "number" which specifies the number of worker-threads that the Pazpar2 instance is to use. A value of 0 (zero) disables worker-threads (all work is carried out in main thread).

sockets

This section is optional and is supported for Pazpar2 version 1.13.0 and later. It is identified by element "sockets" which may include one attribute "max" which specifies the maximum number of sockets to be used by Pazpar2.

file

The file element takes one attribute, path, which specifies a path to search for local files, such as XSLTs and settings. The path is a colon-separated list of directories. Its default value is ".", which is equivalent to the location of the main configuration file (where the file element itself is given).
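
As an illustration, the three optional top-level elements described so far could be combined as in the sketch below. The thread count and search path are taken from the EXAMPLE section of this document; the sockets limit of 1000 is an arbitrary value chosen for illustration only.

    <pazpar2 xmlns="http://www.indexdata.com/pazpar2/1.0">
     <!-- use 10 worker threads (0 would run everything in the main thread) -->
     <threads number="10"/>
     <!-- illustrative limit; the sockets element needs Pazpar2 1.13.0 or later -->
     <sockets max="1000"/>
     <!-- search the configuration directory first, then a shared XSL directory -->
     <file path=".:/usr/share/pazpar2/xsl"/>
    </pazpar2>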

server

This section governs overall behavior of a server endpoint. It is identified by the element "server" which takes an optional attribute, "id", which identifies this particular Pazpar2 server. Any string value for "id" may be given.

The data elements are described below. From Pazpar2 version 1.2 this is a repeatable element.

listen

Configures the webservice -- this controls how you can connect to Pazpar2 from your browser or server-side code. The attributes 'host' and 'port' control the binding of the server. The 'host' attribute can be used to bind the server to a secondary IP address of your system, enabling you to run Pazpar2 on port 80 alongside a conventional web server. You can override this setting on the command line using the option -h.
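
For instance, a listen element that binds Pazpar2 to one specific address on port 80 might look like the following sketch. The address 192.0.2.10 is a documentation placeholder, not a value from this manual.

    <server>
     <!-- bind only to a secondary IP address, leaving port 80 on the
          primary address free for a conventional web server -->
     <listen host="192.0.2.10" port="80"/>
    </server>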

proxy

If this item is given, Pazpar2 will forward all incoming HTTP requests that do not contain the filename 'search.pz2' to the host and port specified using the 'host' and 'port' attributes. The 'myurl' attribute is required, and should provide the base URL of the server. Generally, this is the HTTP URL for the host specified in the 'listen' parameter. This functionality is crucial if you wish to use Pazpar2 in conjunction with browser-based code (JS, Flash, applets, etc.) which operates in a security sandbox. Such code can only connect to the same server from which the enclosing HTML page originated. Pazpar2's proxy functionality enables you to host all of the main pages (plus images, CSS, etc.) of your application on a conventional webserver, while efficiently processing webservice requests for metasearch status, results, etc.
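
A minimal sketch of such a setup, assuming Pazpar2 listens on port 9004 and a conventional web server runs on the same machine on port 8080. The host name and both port numbers are placeholder values.

    <server>
     <listen port="9004"/>
     <!-- requests that do not contain 'search.pz2' are forwarded to
          localhost:8080; myurl is the base URL of the listening server -->
     <proxy host="localhost" port="8080" myurl="http://localhost:9004"/>
    </server>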

icu_chain

Specifies character set normalization for relevancy / sorting / mergekey and facets - for the server. These definitions serve as default for services that don't have these given. For the meaning of these settings refer to the icu_chain element inside service.

relevance / sort / mergekey / facet

Obsolete. Use element icu_chain instead.

settings

Specifies target settings for the server. These settings serve as default for all services which don't have these given. The settings element requires one attribute 'src' which specifies a settings file or a directory. If a directory is given, all files with suffix .xml are read from this directory. Refer to the section called “TARGET SETTINGS” for more information.

service

This nested element controls the behavior of Pazpar2 with respect to your data model. In Pazpar2, incoming records are normalized, using XSLT, into an internal representation. The 'service' section controls the further processing and extraction of data from the internal representation, primarily through the 'metadata' sub-element.

Pazpar2 version 1.2 and later allows multiple service elements. When multiple services are defined, each must be given a unique ID by specifying the attribute id. A single service may be unnamed (service ID omitted). The service ID is referred to in the service parameter of the init webservice command.
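
The following sketch shows one server with two named services. The IDs "books" and "articles" are hypothetical names chosen for illustration; the metadata attributes are explained in the metadata section.

    <server>
     <listen port="9004"/>
     <!-- selected by passing service=books to the init command -->
     <service id="books">
      <metadata name="title" brief="yes" merge="longest"/>
     </service>
     <!-- selected by passing service=articles to the init command -->
     <service id="articles">
      <metadata name="title" brief="yes" merge="longest"/>
      <metadata name="date" type="year" merge="range"/>
     </service>
    </server>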

metadata

One of these elements is required for every data element in the internal representation of the record (see Section 2, “Your data model”). It governs subsequent processing as pertains to sorting, relevance ranking, merging, and display of data elements. It supports the following attributes:

name

This is the name of the data element. It is matched against the 'type' attribute of the 'metadata' element in the normalized record. A warning is produced if metadata elements with an unknown name are found in the normalized record. This name is also used to represent data elements in the records returned by the webservice API, and to name sort lists and browse facets.

type

The type of data element. This value governs any normalization or special processing that might take place on an element. Possible values are 'generic' (basic string) and 'year' (a range is computed if multiple years are found in the record). Note: This list is likely to grow in the future.

brief

If this is set to 'yes', then the data element is included in brief records in the webservice API. Note that this only makes sense for metadata elements that are merged (see below). The default value is 'no'.

sortkey

Specifies that this data element is to be used for sorting. The possible values are 'numeric' (numeric value), 'skiparticle' (string; skip common, leading articles), and 'no' (no sorting). The default value is 'no'.

When 'skiparticle' is used, some common articles from the English and German languages are ignored. At present the list is: 'the', 'den', 'der', 'die', 'des', 'an', 'a'.

rank

Specifies that this element is to be used to help rank records against the user's query (when ranking is requested). The value is of the form

M [F N]

where M is an integer, used as a weight against the basic TF*IDF score. A value of 1 is the base, higher values give additional weight to elements of this type. The default is '0', which excludes this element from the rank calculation.

F is a CCL field and N is the multiplier for terms that match that CCL field in the search. The F+N combination allows the system to use a different multiplier for a certain field. For example, a rank value of "1 au 3" gives a multiplier of 3 for all terms that are part of the au (author) field, and 1 for everything else.

For Pazpar2 1.6.13 and later, the rank may also be defined "per-document", by the normalization stylesheet.

The per field rank was introduced in Pazpar2 1.6.15. Earlier releases only allowed a rank value M (simple integer).

See Section 7, “Relevance ranking” for more about ranking.
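
As a sketch of the two forms of the rank value described above, the metadata names below are reused from the EXAMPLE section; the weights are illustrative only.

    <!-- per-field form (Pazpar2 1.6.15 and later): weight 1 in general,
         but 3 for query terms that belong to the au (author) CCL field -->
    <metadata name="author" brief="yes" termlist="yes"
              merge="longest" rank="1 au 3"/>
    <!-- simple integer form, accepted by earlier releases as well -->
    <metadata name="title" brief="yes" sortkey="skiparticle"
              merge="longest" rank="6"/>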

termlist

Specifies that this element is to be used as a termlist, or browse facet. Values are tabulated from incoming records, and a highscore of values (with their associated frequency) is made available to the client through the webservice API. The possible values are 'yes' and 'no' (default).

merge

This governs whether, and how, elements are extracted from individual records and merged into cluster records. The possible values are: 'unique' (include all unique elements), 'longest' (include only the longest element (strlen)), 'range' (calculate a range of values across all matching records), 'all' (include all elements), or 'no' (don't merge; this is the default).

Pazpar2 1.6.24 also offers a new value for merge, 'first', which is like 'all' but takes elements only from the first database that returns the particular metadata field.

mergekey

If set to 'required', the value of this metadata element is appended to the resulting mergekey if the metadata is present in a record instance. If the metadata element is not present, then a unique mergekey will be generated instead.

If set to 'optional', the value of this metadata element is appended to the resulting mergekey if the metadata is present in a record instance. If the metadata is not present, it will be empty.

If set to 'no' or the mergekey attribute is omitted, the metadata will not be used in the creation of a mergekey.
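
The mergekey values above might be combined as in this sketch. The choice of title as required and author as optional is an assumption made for illustration, not a recommendation from this manual.

    <!-- a record without a title gets a unique mergekey instead -->
    <metadata name="title" brief="yes" merge="longest" mergekey="required"/>
    <!-- the author is appended to the mergekey when present,
         and contributes an empty value otherwise -->
    <metadata name="author" brief="yes" merge="longest" mergekey="optional"/>
    <!-- mergekey omitted: the ISBN plays no part in the mergekey -->
    <metadata name="isbn" merge="unique"/>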

facetrule

Specifies the ICU rule set to be used for normalizing facets. If facetrule is omitted from metadata, the rule set 'facet' is used.

limitcluster

Allow a limit on merged metadata. The value of this attribute is the name of actual metadata content to be used for matching (most often same name as metadata name).

Note

Requires Pazpar2 1.6.23 or later.

limitmap

Specifies a default limitmap for this field. This is to avoid mass configuring of targets. However it is important to review/do this on a per-target basis, since it is usually target-specific. See limitmap for format.

facetmap

Specifies a default facetmap for this field. This is to avoid mass configuring of targets. However it is important to review/do this on a per-target basis, since it is usually target-specific. See facetmap for format.

icurule

Specifies the ICU rule set to be used for normalizing metadata text. The "display" part of the rule is kept in the returned metadata record (record and show commands); the end result, the normalized text, is used for performing within-cluster merge (unique, longest, etc.). If the icurule is omitted, type generic (text) is converted as follows: any of the characters " ,/.:([" are chopped off the beginning and end of the text content, unless it includes the characters "://" (i.e. a URL).

Note

Requires Pazpar2 1.9.0 or later.

setting

This attribute allows you to make use of static database settings in the processing of records. Three possible values are allowed. 'no' is the default and doesn't do anything. 'postproc' copies the value of a setting with the same name into the output of the normalization stylesheet(s). 'parameter' makes the value of a setting with the same name available as a parameter to the normalization stylesheet, so you can further process the value inside of the stylesheet, or use the value to decide how to deal with other data values.

The purpose of using settings in this way can either be to control the behavior of normalization stylesheets in a database-dependent way, or to easily make database-dependent values available to display-logic in your user interface, without having to implement complicated interactions between the user interface and your configuration system.
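
A sketch of the 'postproc' case, shown as two separate fragments. The setting name "category" and its value are hypothetical; the target is the one used in the settings examples elsewhere in this document.

    <!-- fragment 1, in the service definition: copy the setting named
         'category' into the output of the normalization stylesheet -->
    <metadata name="category" brief="yes" setting="postproc"/>

    <!-- fragment 2, in a settings file: the value for one target -->
    <settings target="funkytarget.com:210/db1">
     <set name="category" value="commercial"/>
    </settings>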

xslt

Defines an XSLT stylesheet. The xslt element takes exactly one attribute, id, which names the stylesheet. This name can be referred to in the target setting pz:xslt.

The content of the xslt element is the embedded stylesheet XML.
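
A skeleton of an embedded stylesheet, shown as two separate fragments. The id "mystylesheet" is a hypothetical name, the stylesheet body is left as a placeholder, and the assumption that pz:xslt takes the id as its value follows from the description above rather than from a tested configuration.

    <!-- fragment 1, in the service definition -->
    <xslt id="mystylesheet">
     <xsl:stylesheet version="1.0"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <!-- templates that produce the normalized record go here -->
     </xsl:stylesheet>
    </xslt>

    <!-- fragment 2, in a settings file: refer to the stylesheet by its id -->
    <set target="funkytarget.com:210/db1" name="pz:xslt" value="mystylesheet"/>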

icu_chain

Specifies a named ICU rule set. The icu_chain element must include attribute 'id' which specifies the identifier (name) for the ICU rule set. Pazpar2 uses the particular rule sets for particular purposes. Rule set 'relevance' is used to normalize terms for relevance ranking. Rule set 'sort' is used to normalize terms for sorting. Rule set 'mergekey' is used to normalize terms for making a mergekey. Rule set 'facet' is normally used to normalize facet terms, unless facetrule is given for a metadata field.

The icu_chain element must also include a 'locale' attribute which must be set to one of the locale strings defined in ICU. The child elements listed below can be in any order, except the 'index' element which logically belongs to the end of the list. The stated tokenization, transformation and charmapping instructions are performed in order from top to bottom.

casemap: The attribute 'rule' defines the direction of the per-character casemapping. Allowed values are "l" (lower), "u" (upper), "t" (title).
transform: Normalization and transformation of tokens follows the rules defined in the 'rule' attribute. For possible values we refer to the extensive ICU documentation found at the ICU transformation home page. Set filtering principles are explained at the ICU set and filtering page.
tokenize: Tokenization is the only rule in the ICU chain which splits one token into multiple tokens. The 'rule' attribute may have the following values: "s" (sentence), "l" (line-break), "w" (word), and "c" (character), with the latter probably not being very useful in a running Pazpar2 installation.

From Pazpar2 version 1.1 the ICU wrapper from YAZ is used. Refer to the yaz-icu utility for more information.
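
As a sketch, a rule set for sorting could be assembled from the same building blocks. The transform rules are copied from the EXAMPLE section of this document; whether they suit your data is an assumption you should verify, for example with the yaz-icu utility.

    <icu_chain id="sort" locale="en">
     <!-- steps run from top to bottom: drop control characters first -->
     <transform rule="[:Control:] Any-Remove"/>
     <!-- then remove white space and punctuation -->
     <transform rule="[[:WhiteSpace:][:Punctuation:]] Remove"/>
     <!-- finally map every character to lower case -->
     <casemap rule="l"/>
    </icu_chain>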

relevance

Specifies the ICU rule set used for relevance ranking. The child element of 'relevance' must be 'icu_chain' and the 'id' attribute of the icu_chain is ignored. This definition is obsolete and should be replaced by the equivalent construct:

	   <icu_chain id="relevance" locale="en">..</icu_chain>

sort

Specifies the ICU rule set used for sorting. The child element of 'sort' must be 'icu_chain' and the 'id' attribute of the icu_chain is ignored. This definition is obsolete and should be replaced by the equivalent construct:

	   <icu_chain id="sort" locale="en">..</icu_chain>

mergekey

Specifies ICU tokenization and transformation rules for tokens that are used in Pazpar2's mergekey. The child element of 'mergekey' must be 'icu_chain' and the 'id' attribute of the icu_chain is ignored. This definition is obsolete and should be replaced by the equivalent construct:

	   <icu_chain id="mergekey" locale="en">..</icu_chain>

facet

Specifies ICU tokenization and transformation rules for tokens that are used in Pazpar2's facets. The child element of 'facet' must be 'icu_chain' and the 'id' attribute of the icu_chain is ignored. This definition is obsolete and should be replaced by the equivalent construct:

	   <icu_chain id="facet" locale="en">..</icu_chain>

ccldirective

Customizes the CCL parsing (interpretation of the query parameter in search). The name and value of the CCL directive are given by the attributes 'name' and 'value' respectively. Refer to the list of possible names in the YAZ manual.

rank

Customizes the ranking (relevance) algorithm. Also known as rank tweaks. The rank element accepts the following attributes - all being optional:

cluster: Attribute 'cluster' is a boolean that controls whether Pazpar2 should boost ranking for merged records. The default is 'yes'. A value of 'no' will make Pazpar2 average the ranking of each record in a cluster.
debug: Attribute 'debug' is a boolean that controls whether Pazpar2 should include details about ranking for each document in the show command's response. Enable by using value "yes", disable by using value "no" (default).
follow: Attribute 'follow' is a floating point number greater than or equal to 0. A positive number will boost weight for terms that occur close to each other (proximity, distance). A value of 1 will double the weight if two terms are in proximity distance of 1 (next to each other). The default value of 'follow' is 0 (order will not affect weight).
lead: Attribute 'lead' is a floating point number. It controls if term weight should be reduced by position from start in a metadata field. A positive value of 'lead' will reduce weight as it appears further away from the lead of the field. Default value is 0 (no reduction of weight by position).
length: Attribute 'length' determines how/if term weight should be divided by length of metadata field. A value of "linear" will divide by length. A value of "log" will divide by log2(length). A value of "none" will leave term weight as is (no division). Default value is "linear".

Refer to Section 7, “Relevance ranking” to see how these tweaks are used in computation of score.

Customization of ranking algorithm was introduced with Pazpar2 1.6.18. The semantics of some of the fields changed in versions up to 1.6.22.
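
A sketch of a rank element that sets every attribute explicitly. The particular combination of values is illustrative only and not a tuning recommendation.

    <service>
     <!-- average the ranking within a cluster, double the weight of
          adjacent terms, leave weight unaffected by position, divide by
          log2 of the field length, and report ranking details in show -->
     <rank cluster="no" follow="1" lead="0" length="log" debug="yes"/>
    </service>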

sort-default

Specifies the default sort criteria (default 'relevance'), which were previously hard-coded as the default criteria in search. This is a work-around to avoid re-searching when using target-based sorting. For this to work efficiently, the search command must also include the sort criteria parameter; otherwise Pazpar2 will re-search if the criteria change between the search and show commands.

This configuration was added in Pazpar2 1.6.20.

settings

Specifies target settings for this service. Refer to the section called “TARGET SETTINGS”.

timeout

Specifies timeout parameters for this service. The timeout element supports the following attributes: session, z3950_operation, and z3950_session, which specify the session timeout, the Z39.50 operation timeout, and the Z39.50 session timeout respectively. The Z39.50 operation timeout is the time Pazpar2 will wait for an active Z39.50/SRU operation before it gives up (times out). The Z39.50 session timeout is the time Pazpar2 will keep an idle session (no operation) alive.

The following is recommended but not required: z3950_operation (30) < session (60) < z3950_session (180). The default values are given in parentheses.

The Z39.50 operation timeout may be set per database. Refer to pz:timeout.
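
For illustration, the sketch below writes out the default values given above, which already satisfy the recommended ordering.

    <service>
     <!-- z3950_operation (30) < session (60) < z3950_session (180) -->
     <timeout session="60" z3950_operation="30" z3950_session="180"/>
    </service>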

EXAMPLE

Below is a working example configuration:

   
<?xml version="1.0" encoding="UTF-8"?>
<pazpar2 xmlns="http://www.indexdata.com/pazpar2/1.0">
 <threads number="10"/>
 <file path=".:/usr/share/pazpar2/xsl"/>
 <server>
  <listen port="9004"/>
  <service>
   <rank debug="yes"/>
   <metadata name="title" brief="yes" sortkey="skiparticle"
             merge="longest" rank="6"/>
   <metadata name="isbn" merge="unique"/>
   <metadata name="date" brief="yes" sortkey="numeric"
             type="year" merge="range" termlist="yes"/>
   <metadata name="author" brief="yes" termlist="yes"
             merge="longest" rank="2"/>
   <metadata name="subject" merge="unique" termlist="yes" rank="3" limitmap="local:"/>
   <metadata name="url" merge="unique"/>
   <icu_chain id="relevance" locale="el">
    <transform rule="[:Control:] Any-Remove"/>
    <tokenize rule="l"/>
    <transform rule="[[:WhiteSpace:][:Punctuation:]] Remove"/>
    <casemap rule="l"/>
   </icu_chain>
   <settings src="mysettings"/>
   <timeout session="60"/>
  </service>
 </server>
</pazpar2>

INCLUDE FACILITY

The XML configuration may be partitioned into multiple files by using the include element, which takes a single attribute, src. The src attribute is a regular shell-like glob pattern. For example,

   <include src="/etc/pazpar2/conf.d/*.xml"/>

The include facility requires Pazpar2 version 1.2.

TARGET SETTINGS

Pazpar2 features a cunning scheme by which you can associate various kinds of attributes, or settings with search targets. This can be done through XML files which are read at startup; each file can associate one or more settings with one or more targets. The file format is generic in nature, designed to support a wide range of application requirements. The settings can be purely technical things, like, how to perform a title search against a given target, or it can associate arbitrary name=value pairs with groups of targets -- for instance, if you would like to place all commercial full-text bases in one group for selection purposes, or you would like to control what targets are accessible to users by default. Per-database settings values can even be used to drive sorting, facet/termlist generation, or end-user interface display logic.

During startup, Pazpar2 will recursively read a specified directory (can be identified in the pazpar2.cfg file or on the command line), and process any settings files found therein.

Clients of the Pazpar2 webservice interface can selectively override settings for individual targets within the scope of one session. This can be used in conjunction with an external authentication system to determine which resources are to be accessible to which users. Pazpar2 itself has no notion of end-users, and so can be used in conjunction with any type of authentication system. Similarly, the authentication tokens submitted to access-controlled search targets can be overridden, to allow use of Pazpar2 in a consortial or multi-library environment, where different end-users may need to be represented to some search targets in different ways. This, again, can be managed using an external database or other lookup mechanism. Setting overrides can be performed using either the init or the settings webservice command.

In fact, every setting that applies to a database (except pz:id, which can only be used for filtering targets to use for a search) can be overridden on a per-session basis. This allows the client to override specific CCL fields for searching, etc., to meet the needs of a session or user.

Finally, as an extreme case of this, the webservice client can introduce entirely new targets, on the fly, as part of the init or settings command. This is useful if you desire to manage information about your search targets in a separate application such as a database. You do not need any static settings file whatsoever to run Pazpar2 -- as long as the webservice client is prepared to supply the necessary information at the beginning of every session.

Note

The following discussion of practical issues related to session and settings management is cast in terms of a user interface based on Ajax/Javascript technology. It would apply equally well to many other kinds of browser-based logic.

Typically, a Javascript client is not allowed to directly alter the parameters of a session. There are two reasons for this. One has to do with access to information: typically, information about a user will be stored in a system on the server side, or it will be accessible in some way from the server. The other has to do with trust: since the Javascript client cannot be entirely trusted (some hostile agent might in fact 'pretend' to be a regular WS client), it is more robust to control session settings from scripting that you run as part of your webserver. Typically, this can be handled during the session initialization, as follows:

Step 1: The Javascript client loads, and asks the webserver for a new Pazpar2 session ID. This can be done using a Javascript call, for instance. Note that it is possible to submit Ajax XMLHttpRequest calls either to Pazpar2 or to the webserver that Pazpar2 is proxying for. Refer to Pazpar2 protocol(7).

Step 2: Code on the webserver authenticates the user, by database lookup, LDAP access, NCIP, etc., and determines which resources the user has access to, and any user-specific parameters that are to be applied during this session.

Step 3: The webserver initializes a new Pazpar2 session, and sets user-specific parameters as necessary, using the init webservice command. A new session ID is returned.

Step 4: The webserver returns this session ID to the Javascript client, which then uses the session ID to submit searches, show results, etc.

Step 5: When the Javascript client ceases to use the session, Pazpar2 destroys any session-specific information.

SETTINGS FILE FORMAT

Each file contains a root element named <settings>. It may contain one or more <set> elements. The settings and set elements may contain the following attributes. Attributes in the set node override those in the settings root element. Each set node must specify (directly, or inherited from the parent node) at least a target, name, and value.

target

This specifies the search target to which this setting should be applied. Targets are identified by their Z39.50 URL, generally including the host, port, and database name (e.g. z3950.indexdata.com:210/marc). Two wildcard forms are accepted: * (asterisk) matches all known targets; z3950.indexdata.com:210/* matches all known databases on the given host.

A precedence system determines what happens if there are overlapping values for the same setting name for the same target. A setting for a specific target name overrides a setting which specifies target using a wildcard. This makes it easy to set defaults for all targets, and then override them for specific targets or hosts. If there are multiple overlapping settings with the same name and target value, the 'precedence' attribute determines what happens.

For Pazpar2 1.6.4 or later, the target ID may be user-defined, in which case, the actual host, port, etc., is given by the setting pz:url.
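
A sketch of a user-defined target ID. The ID "mylibrary" is a hypothetical name; the address is the example Z39.50 URL used above.

    <settings target="mylibrary">
     <!-- with a user-defined ID, pz:url carries the actual host,
          port and database name -->
     <set name="pz:url" value="z3950.indexdata.com:210/marc"/>
     <set name="pz:requestsyntax" value="marc21"/>
    </settings>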

name

The name of the setting. This can be anything you like. However, Pazpar2 reserves a number of setting names for specific purposes, all starting with 'pz:', and it is a good idea to avoid that prefix if you make up your own setting names. See below for a list of reserved variables.

value

The value of the setting. Generally, this can be anything you want -- however, some of the reserved settings may expect specific kinds of values.

precedence

This should be an integer. If not provided, the default value is 0. If two (or more) settings have the same content for target and name, the precedence value determines the outcome. If both settings have the same precedence value, they are both applied to the target(s). If one has a higher value, then the value of that setting is applied, and the other one is ignored.
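
The precedence rule can be sketched as follows. The target and name are inherited from the root element, and the values 100 and 50 are arbitrary.

    <settings target="*" name="pz:maxrecs">
     <!-- precedence omitted, so it defaults to 0 -->
     <set value="100"/>
     <!-- same target and name with a higher precedence: this value
          is applied and the one above is ignored -->
     <set value="50" precedence="1"/>
    </settings>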

By setting defaults for target, name, or value in the root settings node, you can use the settings files in many different ways. For instance, you can use a single file to set defaults for many different settings, like search fields, retrieval syntaxes, etc. You can have one file per server, which groups settings for that server or target. You could also have one file which associates a number of targets with a given setting, for instance, to associate many databases with a given category or class that makes sense within your application.

The following examples illustrate uses of the settings system to associate settings with targets to meet different requirements.

The example below associates a set of default values that can be used across many targets. Note the wildcard for targets. This associates the given settings with all targets for which no other information is provided.

    <settings target="*">

    <!-- This file introduces default settings for pazpar2 -->

    <!-- mapping for unqualified search -->
    <set name="pz:cclmap:term" value="u=1016 t=l,r s=al"/>

    <!-- field-specific mappings -->
    <set name="pz:cclmap:ti" value="u=4 s=al"/>
    <set name="pz:cclmap:su" value="u=21 s=al"/>
    <set name="pz:cclmap:isbn" value="u=7"/>
    <set name="pz:cclmap:issn" value="u=8"/>
    <set name="pz:cclmap:date" value="u=30 r=r"/>

    <set name="pz:limitmap:title" value="rpn:@attr 1=4 @attr 6=3"/>
    <set name="pz:limitmap:date" value="ccl:date"/>

    <!-- Retrieval settings -->

    <set name="pz:requestsyntax" value="marc21"/>
    <set name="pz:elements" value="F"/>

    <!-- Query encoding -->
    <set name="pz:queryencoding" value="iso-8859-1"/>

    <!-- Result normalization settings -->

    <set name="pz:nativesyntax" value="iso2709"/>
    <set name="pz:xslt" value="marc21.xsl"/>

    </settings>

The next example shows certain settings overridden for one target, one which returns XML records containing Dublin Core elements, and which furthermore requires a username/password.

    <settings target="funkytarget.com:210/db1">
    <set name="pz:requestsyntax" value="xml"/>
    <set name="pz:nativesyntax" value="xml"/>
    <set name="pz:xslt" value="../etc/dublincore.xsl"/>

    <set name="pz:authentication" value="myuser/password"/>
    </settings>

The following example associates a specific name/value combination with a number of targets. The targets below are access-restricted, and can only be used by users with special credentials.

    <settings name="pz:allow" value="0">
    <set target="funkytarget.com:210/*"/>
    <set target="commercial.com:2100/expensiveDb"/>
    </settings>

RESERVED SETTING NAMES

The following setting names are reserved by Pazpar2 to control the behavior of the client function.

pz:allow

Allows or denies access to the resources it is applied to. Possible values are '0' and '1'. The default is '1' (allow access to this resource).

pz:apdulog

If the 'pz:apdulog' setting is defined and has a value other than 0, then Z39.50 APDUs are written to the log.

pz:authentication

Sets an authentication string for a given database. For Z39.50, this is carried as part of the Initialize Request. In order to carry the information in the "open" elements, separate username and password with a slash (In Z39.50 it is a VisibleString). In order to carry the information in the idPass elements, separate username term, password term and, optionally, a group term with a single blank. If three terms are given, the order is user, group, password. If only two terms are given, the order is user, password.

For HTTP based protocols, such as SRU and Apache Solr, the authentication string includes a username term and, optionally, a password term. Each term is separated by a single blank. The authentication information is passed either by HTTP basic authentication or via URL parameters. The mode of operation is determined by pz:authentication_mode setting.

pz:authentication_mode

Determines how authentication is carried in HTTP based protocols. Value may be "basic" or "url".
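
The authentication formats described above might be written as in this sketch. The user, group and password strings are placeholders, and the target ID sru.example.org/db is hypothetical.

    <settings name="pz:authentication">
     <!-- Z39.50 "open" form: user and password separated by a slash -->
     <set target="funkytarget.com:210/db1" value="myuser/password"/>
     <!-- Z39.50 idPass form with three terms: user, group, password -->
     <set target="commercial.com:2100/expensiveDb"
          value="myuser mygroup password"/>
     <!-- HTTP based target: user and password separated by a blank -->
     <set target="sru.example.org/db" value="myuser password"/>
     <!-- pass the credentials above by HTTP basic authentication;
          the name given here overrides the one in the root element -->
     <set target="sru.example.org/db" name="pz:authentication_mode"
          value="basic"/>
    </settings>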

pz:block_timeout

(Not yet implemented.) Specifies the time after which a block should be released anyway.

pz:cclmap:xxx

This establishes a CCL field definition or other setting, for the purpose of mapping end-user queries. XXX is the field or setting name, and the value of the setting provides parameters (e.g. parameters to send to the server, etc.). Please consult the YAZ manual for a full overview of the many capabilities of the powerful and flexible CCL parser.

Note that it is easy to establish a set of default parameters, and then override them individually for a given target.

pz:elements

The element set name to be used when retrieving records from a server.

pz:extendrecs

If a show command reaches the boundary of a result set for a database (which depends on sorting) and pz:extendrecs is set to a positive value, then Pazpar2 waits for show to fetch pz:extendrecs more records. This setting is best used if a database does native sorting, because the result set otherwise may be completely re-sorted during extended fetch. The default value of pz:extendrecs is 0 (no extended fetch).

Warning

The pz:extendrecs setting appeared in Pazpar2 version 1.6.26, but the behavior changed with the release of Pazpar2 1.6.29.

pz:facetmap:name

Specifies that for field name, the target supports (native) facets. The value is the name of the field on the target.

pz:facetmap:split:name

Like pz:facetmap, but makes Pazpar2 inspect the term value, which consists of two items separated by a colon. The first item is the raw ID to be sent to the database if limitmap on the field name is used. The second item is the display term.

This facility was added in Pazpar2 version 1.11.0.

pz:id

This setting can't be 'set' -- it contains the ID (normally ZURL) for a given target, and is useful for filtering -- specifically when you want to select one or more specific targets in the search command.

pz:limitmap:name

Specifies attributes for limiting a search to a field - using the limit parameter for search. It can be used to filter locally or remotely (search in a target). In some cases the mapping of a field to a value is identical to an existing cclmap field; in other cases the field must be specified in a different way - for example to match a complete field (rather than parts of a subfield).

The value of limitmap may have one of three forms: a referral to an existing CCL field, a raw PQF string, or a local limit. The leading string determines the type: ccl: for a CCL field, rpn: for PQF/RPN, or local: for filtering in Pazpar2. The local: prefix may be followed by the name of a metadata field (the default is to use the name of the limitmap itself).

For Pazpar2 version 1.6.23 and later the limitmap may include multiple specifications, separated by , (comma). For example: ccl:title,local:ltitle,rpn:@attr 1=4.

Note

The limitmap facility is supported for Pazpar2 version 1.6.0 and later. Local filtering is supported for Pazpar2 1.6.6 and later.
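
The three forms of limitmap value might look like this in a settings file. The target address is hypothetical, the first entry assumes that a pz:cclmap:date setting exists for the target, and the Bib-1 attributes in the second entry are only one plausible choice.

```xml
<settings target="z3950.example.org:210/db">
  <!-- Refer to an existing CCL field (assumes pz:cclmap:date is set). -->
  <set name="pz:limitmap:date" value="ccl:date"/>
  <!-- Raw PQF attributes: author (1=1003) as a complete field (6=3). -->
  <set name="pz:limitmap:author" value="rpn:@attr 1=1003 @attr 6=3"/>
  <!-- Filter locally in Pazpar2 on the metadata field "medium",
       which is the name of the limitmap itself. -->
  <set name="pz:limitmap:medium" value="local:"/>
  <!-- Multiple specifications, Pazpar2 1.6.23 and later. -->
  <set name="pz:limitmap:title" value="ccl:title,local:ltitle,rpn:@attr 1=4"/>
</settings>
```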

pz:maxrecs

Controls the maximum number of records to be retrieved from a server. The default is 100.

pz:memcached

If set and non-empty, libMemcached will be configured and enabled for the target. The value of this setting is the same as the ZOOM option memcached, which in turn is the configuration string passed to the memcached function of libMemcached.

This setting is honored in Pazpar2 1.6.39 or later. Pazpar2 must be using YAZ version 5.0.13 or later.

pz:redis

If set and non-empty, redis will be configured and enabled for the target. The value of this setting is the same as the redis option of ZOOM C in YAZ. Refer to the YAZ manual.

This setting is honored in Pazpar2 1.6.43 or later. Pazpar2 must be using YAZ version 5.2.0 or later.

pz:nativesyntax

Specifies how Pazpar2 should map retrieved records to XML. Currently supported values are xml, iso2709 and txml.

The value iso2709 makes Pazpar2 convert retrieved MARC records to MARCXML. In order to convert to XML, the exact character set of the MARC must be known (if not, the resulting XML is probably not well-formed). The character set may be specified by adding: ;charset to iso2709. If omitted, a charset of MARC-8 is assumed. This is correct for most MARC21/USMARC records.

The value txml is like iso2709 except that records are converted to TurboMARC instead of MARCXML.

The value xml is used if Pazpar2 retrieves records that are already XML (no conversion takes place).
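
The supported values could be applied to different targets as sketched below. All target addresses are hypothetical, and the second entry assumes that the character set name directly follows the semicolon, as the text above describes.

```xml
<settings>
  <!-- MARC records in the default MARC-8 character set, to MARCXML. -->
  <set target="marc8.example.org:210/db"
       name="pz:nativesyntax" value="iso2709"/>
  <!-- MARC records encoded in ISO-8859-1, to MARCXML. -->
  <set target="latin1.example.org:210/db"
       name="pz:nativesyntax" value="iso2709;iso-8859-1"/>
  <!-- MARC records converted to TurboMARC instead. -->
  <set target="turbo.example.org:210/db"
       name="pz:nativesyntax" value="txml"/>
  <!-- Records that are already XML; no conversion takes place. -->
  <set target="sru.example.org/db"
       name="pz:nativesyntax" value="xml"/>
</settings>
```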

pz:negotiation_charset

Sets the character set for Z39.50 negotiation. Most targets do not support this, and some will even close the connection if it is set (crash on server side or similar). If set, you probably want to set it to UTF-8.

pz:piggyback

Piggybacking enables Pazpar2 to retrieve records from the server as part of the search response in Z39.50. Almost all servers support this (or fail it gracefully), but a few servers will produce undesirable results. Set to '1' to enable piggybacking, '0' to disable it. Default is 1 (piggybacking enabled).

pz:pqf_prefix

Allows you to specify an arbitrary PQF query language substring. The provided string is prefixed to the user's query after it has been normalized to PQF internally in Pazpar2. This allows you to attach complex 'filters' to queries for a given target, sometimes necessary to select sub-catalogs in union catalog systems, etc.
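
A sub-catalog filter of this kind might be sketched as follows. The target address, the use attribute 9999 and the term branchA are all placeholders; a real union catalog would define its own attribute and values.

```xml
<settings target="union.example.org:210/catalog">
  <!-- Placeholder filter restricting every query to one sub-catalog.
       A user query normalized to the PQF "@attr 1=4 water" would then
       be sent roughly as:
       @and @attr 1=9999 branchA @attr 1=4 water -->
  <set name="pz:pqf_prefix" value="@and @attr 1=9999 branchA"/>
</settings>
```

Because @and takes two operands, the prefix supplies the operator and the first operand, and the user's query becomes the second.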

pz:pqf_strftime

Allows you to extend a query with dates and operators. The provided string allows certain substitutions and serves as a format string. The special two-character sequence '%%' is replaced with the original query. Other sequences that begin with a percent sign are conversions supported by strftime. All other characters are copied verbatim. For example, the string @and @attr 1=30 @attr 2=3 %Y %% would search for the current year combined with the original PQF (%%).

This setting can also be used as a more general alternative to pz:pqf_prefix -- a way of embedding the submitted query anywhere in the string rather than appending it to a prefix. For example, to omit all records satisfying the query @attr 1=pica.bib 0007, this subquery can be combined with the submitted query as the second operand of the PQF and-not operator (@not) by using the pz:pqf_strftime value @not %% @attr 1=pica.bib 0007.
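
Both values from the text could appear in a settings file as sketched here. The target addresses are hypothetical, and the expansions in the comments assume a user query that normalizes to the PQF "@attr 1=4 water" and, for the first entry, that the current year is 2024.

```xml
<settings>
  <!-- Expands to: @and @attr 1=30 @attr 2=3 2024 @attr 1=4 water -->
  <set target="dates.example.org:210/db"
       name="pz:pqf_strftime" value="@and @attr 1=30 @attr 2=3 %Y %%"/>
  <!-- Expands to: @not @attr 1=4 water @attr 1=pica.bib 0007 -->
  <set target="pica.example.org:210/db"
       name="pz:pqf_strftime" value="@not %% @attr 1=pica.bib 0007"/>
</settings>
```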

pz:preferred

Specifies that a target is preferred, for example a local, faster target. Using block=preferred on the show command will wait for all preferred targets to return records before releasing the block. If no target is preferred, block=preferred is identical to block=1, which releases when one target has returned records.

pz:present_chunk

Controls the chunk size in present requests. Pazpar2 will make (maxrecs / chunk) request(s). The default is 20.
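
The relation between pz:maxrecs and pz:present_chunk can be illustrated with a small sketch. The target address and both values are chosen for illustration only.

```xml
<settings target="z3950.example.org:210/db">
  <!-- Retrieve at most 200 records from this target... -->
  <set name="pz:maxrecs" value="200"/>
  <!-- ...in chunks of 50, that is, up to 200 / 50 = 4 present requests. -->
  <set name="pz:present_chunk" value="50"/>
</settings>
```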

pz:queryencoding

The encoding of the search terms that a target accepts. Most targets do not honor UTF-8, in which case this needs to be specified. Each term in a query will be converted if this setting is given.

pz:recordfilter

Specifies a filter which allows Pazpar2 to include only records that meet a certain criterion in a result. Unmatched records will be ignored. The filter takes the form name, name~value, or name=value, which will include only records with a metadata element (name) that contains the given substring (~value) or matches it exactly (=value). If value is omitted, all records with the named metadata element present will be included.
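
The three filter forms might be written as follows. The target addresses and the metadata values (water, book) are invented for illustration; title and medium are assumed to be metadata elements defined by the service.

```xml
<settings>
  <!-- Keep only records that have a title element at all. -->
  <set target="a.example.org:210/db"
       name="pz:recordfilter" value="title"/>
  <!-- Keep only records whose title contains the substring "water". -->
  <set target="b.example.org:210/db"
       name="pz:recordfilter" value="title~water"/>
  <!-- Keep only records whose medium is exactly "book". -->
  <set target="c.example.org:210/db"
       name="pz:recordfilter" value="medium=book"/>
</settings>
```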

pz:requestsyntax

This specifies the record syntax to use when requesting records from a given server. The value can be a symbolic name like marc21 or xml, or it can be a Z39.50-style dot-separated OID.

pz:sort

Specifies sort criteria to be applied to the result set. Only works for targets which support the sort service.

pz:sortmap:field

Specifies native sorting for a target where field is a sort criterion (see command show). The value has two components separated by a colon: strategy and native-field. Strategy is one of z3950, type7, cql, sru11, or embed. The second component, native-field, is the field that is recognized by the target.

Note

Only supported for Pazpar2 1.6.4 and later.
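
A rough sketch of the strategy:native-field form follows. The target addresses are hypothetical and the native-field parts (1=4 and title) are placeholders; the correct value depends entirely on what the target recognizes.

```xml
<settings>
  <!-- Z39.50 target: sort criterion "title" handled by the target's
       sort service, using a placeholder native field. -->
  <set target="z3950.example.org:210/db"
       name="pz:sortmap:title" value="z3950:1=4"/>
  <!-- SRU 1.1 target: the same criterion mapped to a placeholder
       native sort key. -->
  <set target="sru.example.org/db"
       name="pz:sortmap:title" value="sru11:title"/>
</settings>
```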

pz:sru

This setting enables SRU/Solr support. It has four possible values. 'get' enables SRU access through GET requests. 'post' enables SRU/POST support, which is less commonly supported but useful if very large requests are to be submitted. 'soap' enables the SRW (SRU over SOAP) variation of the protocol.

A value of 'solr' enables Solr client support. This is supported for Pazpar2 version 1.5.0 and later.

pz:sru_version

This allows the SRU version to be specified. If unset, Pazpar2 will use the default of YAZ (currently 1.2). It should be set to 1.1 or 1.2. For Solr, the currently supported/tested versions are 1.4 and 3.x.
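
As a sketch, an SRU target and a Solr target could be configured side by side like this. Both target addresses are hypothetical.

```xml
<settings>
  <!-- SRU over HTTP GET, protocol version 1.2. -->
  <set target="sru.example.org/db" name="pz:sru" value="get"/>
  <set target="sru.example.org/db" name="pz:sru_version" value="1.2"/>
  <!-- Solr client support (Pazpar2 1.5.0 and later). -->
  <set target="solr.example.org:8983/solr/select"
       name="pz:sru" value="solr"/>
</settings>
```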

pz:termlist_term_count

Specifies the number of facet terms to be requested from the target. The default is unspecified, i.e. decided by the server. Also see pz:facetmap.

pz:termlist_term_factor

Specifies whether to use a factor for Pazpar2-generated facets (1) or not (0). When mixing locally generated facets (based on the downloaded sample of up to pz:maxrecs records) with native (target-generated) facets, the latter will dominate the facet list since they are generated from the complete result set. By scaling up the local facet counts using the ratio between the total hit count and the sample size, the total facet count can be approximated and thus better compared with native facets. This is not enabled by default.
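
The scaling can be worked through with invented numbers. The target address, the hit count of 2000 and the term count of 7 are all assumptions made for the sake of the arithmetic.

```xml
<settings target="z3950.example.org:210/db">
  <!-- Sample size: at most 100 records are downloaded. -->
  <set name="pz:maxrecs" value="100"/>
  <!-- Scale locally generated facet counts.
       Suppose the target reports 2000 hits: the ratio is
       2000 / 100 = 20, so a term seen 7 times in the sample
       is reported with a count of about 7 * 20 = 140. -->
  <set name="pz:termlist_term_factor" value="1"/>
</settings>
```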

pz:timeout

Specifies the timeout for an operation (e.g. search and fetch) for a database. This overrides the z3950_operation timeout that is given for a service. See timeout.

Note

The timeout facility is supported for Pazpar2 version 1.8.4 and later.

pz:url

Specifies URL for the target and overrides the target ID.

Note

pz:url is only recognized for Pazpar2 1.6.4 and later.
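
For instance, a target could be given a short symbolic ID while the actual address is supplied through pz:url. The ID "library-a" and the address below are hypothetical.

```xml
<settings target="library-a">
  <!-- "library-a" is the target ID used in settings and filters;
       the connection itself is made to the URL given here. -->
  <set name="pz:url" value="z3950.example.org:210/db"/>
</settings>
```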

pz:xslt

A comma-separated list of stylesheet names that specifies how to convert incoming records to the internal representation.

For each name, the embedded stylesheets (XSL) that come with the service definition are consulted first and take precedence over external files (see xslt of service definition). If the name does not match an embedded stylesheet, it is considered a filename.

The suffix of each file specifies the kind of transformation. Suffix ".xsl" makes an XSL transform. Suffix ".mmap" will use the MMAP transform (described below).

The special value "auto" will use a file which is the pz:requestsyntax's value followed by '.xsl'.

When mapping MARC records, XSLT can be bypassed for increased performance with the alternate "MARC map" format. Provide the path of a file with extension ".mmap" containing on each line:

       <field> <subfield> <metadata element>

For example:

       245 a title
       500 $ description
       773 * citation

To map the field value, specify a subfield of '$'. To store a concatenation of all subfields, specify a subfield of '*'.
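
The variants of pz:xslt described above might be used as sketched here. The target addresses and the file names (marc21.xsl, cleanup.xsl, marc21.mmap) are hypothetical, and the files are assumed to be found via the configured file path.

```xml
<settings>
  <!-- A comma-separated list: two stylesheets applied in order. -->
  <set target="a.example.org:210/db"
       name="pz:xslt" value="marc21.xsl,cleanup.xsl"/>
  <!-- "auto": with pz:requestsyntax set to marc21, the file
       marc21.xsl is used. -->
  <set target="b.example.org:210/db"
       name="pz:requestsyntax" value="marc21"/>
  <set target="b.example.org:210/db"
       name="pz:xslt" value="auto"/>
  <!-- A MARC map file instead of XSLT. -->
  <set target="c.example.org:210/db"
       name="pz:xslt" value="marc21.mmap"/>
</settings>
```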

pz:zproxy

The 'pz:zproxy' setting has the value syntax 'host.internet.address:port'. It is used to tunnel Z39.50 requests through the named Z39.50 proxy.
