pazpar2_conf — Pazpar2 Configuration
pazpar2.conf
The Pazpar2 configuration file, together with any referenced XSLT files, governs Pazpar2's behavior as a client and controls the normalization and extraction of data elements from incoming result records for the purposes of merging, sorting, facet analysis, and display.
The file is specified using the option -f on the Pazpar2 command line. There is not presently a way to reload the configuration file without restarting Pazpar2, although this will most likely be added some time in the future.
The configuration file is XML-structured. It must be well-formed XML. All elements specific to Pazpar2 should belong to the namespace http://www.indexdata.com/pazpar2/1.0 (this is assumed in the following examples). The root element is named "pazpar2".
Under the root element are a number of elements which group categories of
information. The categories are described below.
This section is optional and is supported for Pazpar2 version 1.3.1 and later. It is identified by element "threads" which may include one attribute "number" which specifies the number of worker-threads that the Pazpar2 instance is to use. A value of 0 (zero) disables worker-threads (all work is carried out in the main thread).
This section is optional and is supported for Pazpar2 version 1.13.0 and later. It is identified by element "sockets" which may include one attribute "max" which specifies the maximum number of sockets to be used by Pazpar2.
This configuration takes one attribute, path, which specifies a path to search for local files, such as XSLTs and settings. The path is a colon-separated list of directories. Its default value is "." which is equivalent to the location of the main configuration file (where indeed the file element is given).
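As a minimal sketch, the three optional top-level elements described above might be combined as follows. The values 4 and 512 are illustrative assumptions rather than recommendations; the path is the one used in the example configuration later in this page.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<pazpar2 xmlns="http://www.indexdata.com/pazpar2/1.0">
  <!-- Four worker threads; a value of 0 would keep all work in the main thread -->
  <threads number="4"/>
  <!-- Upper bound on sockets (needs Pazpar2 1.13.0 or later); 512 is illustrative -->
  <sockets max="512"/>
  <!-- Look in the configuration file's own directory first, then in a shared one -->
  <file path=".:/usr/share/pazpar2/xsl"/>
  <!-- one or more server elements would follow here -->
</pazpar2>
```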
This section governs overall behavior of a server endpoint. It is identified by the element "server" which takes an optional attribute, "id", which identifies this particular Pazpar2 server. Any string value for "id" may be given.
The data elements are described below. From Pazpar2 version 1.2 this is a repeatable element.
Configures the webservice -- this controls how you can connect to Pazpar2 from your browser or server-side code. The attributes 'host' and 'port' control the binding of the server. The 'host' attribute can be used to bind the server to a secondary IP address of your system, enabling you to run Pazpar2 on port 80 alongside a conventional web server. You can override this setting on the command line using the option -h.
If this item is given, Pazpar2 will forward all incoming HTTP requests that do not contain the filename 'search.pz2' to the host and port specified using the 'host' and 'port' attributes. The 'myurl' attribute is required, and should provide the base URL of the server. Generally, this is the HTTP URL of the host specified in the 'listen' parameter. This functionality is crucial if you wish to use Pazpar2 in conjunction with browser-based code (JS, Flash, applets, etc.) which operates in a security sandbox. Such code can only connect to the same server from which the enclosing HTML page originated. Pazpar2's proxy functionality enables you to host all of the main pages (plus images, CSS, etc.) of your application on a conventional webserver, while efficiently processing webservice requests for metasearch status, results, etc.
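The binding and forwarding behavior described above could be sketched like this. The element name listen is taken from the text; the element name proxy is an assumption based on the phrase "proxy functionality", and the server id, addresses, ports, and URL are hypothetical.

```xml
<server id="main">
  <!-- Bind the webservice to one address and port (hypothetical values) -->
  <listen host="192.0.2.10" port="9004"/>
  <!-- Forward every request that does not name search.pz2 to a local web server;
       myurl gives the base URL under which this Pazpar2 server is reached -->
  <proxy host="localhost" port="8080" myurl="http://192.0.2.10:9004"/>
</server>
```

With a layout like this, browser code loaded from http://192.0.2.10:9004 can reach both the static pages and search.pz2 on the same origin.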
Specifies character set normalization for relevancy / sorting / mergekey and facets - for the server. These definitions serve as default for services that don't have these given. For the meaning of these settings refer to the icu_chain element inside service.
Obsolete. Use element icu_chain instead.
Specifies target settings for the server. These settings serve as default for all services which don't have these given. The settings element requires one attribute 'src' which specifies a settings file or a directory. If a directory is given, all files with suffix .xml are read from this directory. Refer to the section called “TARGET SETTINGS” for more information.
This nested element controls the behavior of Pazpar2 with respect to your data model. In Pazpar2, incoming records are normalized, using XSLT, into an internal representation. The 'service' section controls the further processing and extraction of data from the internal representation, primarily through the 'metadata' sub-element.
Pazpar2 version 1.2 and later allows multiple service elements. When multiple services are defined, each must be given a unique ID by specifying the attribute id. A single service may be unnamed (service ID omitted). The service ID is referred to in the init webservice command's service parameter.
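A minimal sketch of a server with two named services follows. The IDs "books" and "articles" are hypothetical; the metadata lines reuse attributes that appear in the example configuration later in this page.

```xml
<server>
  <listen port="9004"/>
  <!-- Each service carries its own ID and its own data model -->
  <service id="books">
    <metadata name="title" brief="yes" merge="longest" rank="6"/>
    <metadata name="isbn" merge="unique"/>
  </service>
  <service id="articles">
    <metadata name="title" brief="yes" merge="longest" rank="6"/>
    <metadata name="url" merge="unique"/>
  </service>
</server>
```

A client would then pick one of them by passing service=books or service=articles to the init command.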
One of these elements is required for every data element in the internal representation of the record (see Section 2, “Your data model”). It governs subsequent processing as pertains to sorting, relevance ranking, merging, and display of data elements. It supports the following attributes:
This is the name of the data element. It is matched against the 'type' attribute of the 'metadata' element in the normalized record. A warning is produced if metadata elements with an unknown name are found in the normalized record. This name is also used to represent data elements in the records returned by the webservice API, and to name sort lists and browse facets.
The type of data element. This value governs any normalization or special processing that might take place on an element. Possible values are 'generic' (basic string), 'year' (a range is computed if multiple years are found in the record). Note: This list is likely to increase in the future.
If this is set to 'yes', then the data element is included in brief records in the webservice API. Note that this only makes sense for metadata elements that are merged (see below). The default value is 'no'.
Specifies that this data element is to be used for sorting. The possible values are 'numeric' (numeric value), 'skiparticle' (string; skip common, leading articles), and 'no' (no sorting). The default value is 'no'.
When 'skiparticle' is used, some common articles from the English and German languages are ignored. At present the list is: 'the', 'den', 'der', 'die', 'des', 'an', 'a'.
Specifies that this element is to be used to help rank records against the user's query (when ranking is requested). The value is of the form
M [F N]
where M is an integer, used as a weight against the basic TF*IDF score. A value of 1 is the base, higher values give additional weight to elements of this type. The default is '0', which excludes this element from the rank calculation.
F is a CCL field and N is the multiplier for terms that match those parts of the CCL field in search. The F+N combination allows the system to use a different multiplier for a certain field. For example, a rank value of "1 au 3" gives a multiplier of 3 for all terms that are part of the au (author) field, and 1 for everything else.
For Pazpar2 1.6.13 and later, the rank may also be defined "per-document", by the normalization stylesheet.
The per field rank was introduced in Pazpar2 1.6.15. Earlier releases only allowed a rank value M (simple integer).
See Section 7, “Relevance ranking” for more about ranking.
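The two forms of the rank value can be illustrated with a short sketch. The element names and weights are illustrative; the second line assumes Pazpar2 1.6.15 or later, since it uses the per-field form.

```xml
<service>
  <!-- Plain form M: weight 6 against the basic TF*IDF score -->
  <metadata name="title" brief="yes" rank="6"/>
  <!-- Form M F N: weight 1 in general, but 3 for query terms from the CCL field au -->
  <metadata name="author" brief="yes" rank="1 au 3"/>
  <!-- rank omitted: the default of 0 excludes this element from ranking -->
  <metadata name="isbn" merge="unique"/>
</service>
```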
Specifies that this element is to be used as a termlist, or browse facet. Values are tabulated from incoming records, and a highscore of values (with their associated frequency) is made available to the client through the webservice API. The possible values are 'yes' and 'no' (default).
This governs whether, and how, elements are extracted from individual records and merged into cluster records. The possible values are: 'unique' (include all unique elements), 'longest' (include only the longest element (strlen)), 'range' (calculate a range of values across all matching records), 'all' (include all elements), or 'no' (don't merge; this is the default).
Pazpar2 1.6.24 and later also offer the value 'first', which is like 'all' but takes values only from the first database that returns the particular metadata field.
If set to 'required', the value of this metadata element is appended to the resulting mergekey if the metadata is present in a record instance. If the metadata element is not present, then a unique mergekey will be generated instead.
If set to 'optional', the value of this metadata element is appended to the resulting mergekey if the metadata is present in a record instance. If the metadata is not present, it will be empty.
If set to 'no' or the mergekey attribute is omitted, the metadata will not be used in the creation of a mergekey.
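The three mergekey values described above might be combined as in the following sketch. The choice of fields is an assumption made for illustration only.

```xml
<service>
  <!-- A record without a title gets a unique mergekey of its own -->
  <metadata name="title" merge="longest" mergekey="required"/>
  <!-- The author is appended to the key when present and is empty otherwise -->
  <metadata name="author" merge="longest" mergekey="optional"/>
  <!-- No mergekey attribute: this element plays no part in the key -->
  <metadata name="url" merge="unique"/>
</service>
```

Under this configuration two records would share a mergekey only if their normalized titles and authors agree.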
Specifies the ICU rule set to be used for normalizing facets. If facetrule is omitted from metadata, the rule set 'facet' is used.
Allow a limit on merged metadata. The value of this attribute is the name of actual metadata content to be used for matching (most often same name as metadata name).
Requires Pazpar2 1.6.23 or later.
Specifies a default limitmap for this field. This is to avoid mass configuring of targets. However it is important to review/do this on a per-target basis, since it is usually target-specific. See limitmap for format.
Specifies a default facetmap for this field. This is to avoid mass configuring of targets. However it is important to review/do this on a per-target basis, since it is usually target-specific. See facetmap for format.
Specifies the ICU rule set to be used for normalizing metadata text. The "display" part of the rule is kept in the returned metadata record (record+show commands), while the end result (the normalized text) is used for performing within-cluster merge (unique, longest, etc.). If the icurule is omitted, type generic (text) is converted as follows: any of the characters " ,/.:([" are chopped off the prefix and suffix of the text content unless it includes the characters "://" (i.e. a URL).
Requires Pazpar2 1.9.0 or later.
This attribute allows you to make use of static database settings in the processing of records. Three possible values are allowed. 'no' is the default and doesn't do anything. 'postproc' copies the value of a setting with the same name into the output of the normalization stylesheet(s). 'parameter' makes the value of a setting with the same name available as a parameter to the normalization stylesheet, so you can further process the value inside of the stylesheet, or use the value to decide how to deal with other data values.
The purpose of using settings in this way can be either to control the behavior of normalization stylesheets in a database-dependent way, or to easily make database-dependent values available to display-logic in your user interface, without having to implement complicated interactions between the user interface and your configuration system.
Defines an XSLT stylesheet. The xslt element takes exactly one attribute, id, which names the stylesheet. This can be referred to in target settings pz:xslt.
The content of the xslt element is the embedded stylesheet XML.
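An embedded stylesheet could look roughly like the sketch below. The id "dc-title" is hypothetical, and the output element names pz:record and pz:metadata are assumptions about the internal record format (only the 'type' attribute of 'metadata' is confirmed by the text above); the Dublin Core namespace is the standard one.

```xml
<xslt id="dc-title">
  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:pz="http://www.indexdata.com/pazpar2/1.0"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
    <!-- Map each Dublin Core title of the incoming record to a "title" element -->
    <xsl:template match="/*">
      <pz:record>
        <xsl:for-each select="dc:title">
          <pz:metadata type="title">
            <xsl:value-of select="."/>
          </pz:metadata>
        </xsl:for-each>
      </pz:record>
    </xsl:template>
  </xsl:stylesheet>
</xslt>
```

A target would then select it by name rather than by file, for instance with a set element whose name is pz:xslt and whose value is dc-title.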
Specifies a named ICU rule set. The icu_chain element must include attribute 'id' which specifies the identifier (name) for the ICU rule set. Pazpar2 uses the particular rule sets for particular purposes. Rule set 'relevance' is used to normalize terms for relevance ranking. Rule set 'sort' is used to normalize terms for sorting. Rule set 'mergekey' is used to normalize terms for making a mergekey. Rule set 'facet' is normally used to normalize facet terms, unless facetrule is given for a metadata field.
The icu_chain element must also include a 'locale' attribute which must be set to one of the locale strings defined in ICU. The child elements listed below can be in any order, except the 'index' element which logically belongs to the end of the list. The stated tokenization, transformation and charmapping instructions are performed in order from top to bottom.
The attribute 'rule' defines the direction of the per-character casemapping. Allowed values are "l" (lower), "u" (upper), "t" (title).
Normalization and transformation of tokens follows the rules defined in the 'rule' attribute. For possible values we refer to the extensive ICU documentation found at the ICU transformation home page. Set filtering principles are explained at the ICU set and filtering page.
Tokenization is the only rule in the ICU chain which splits one token into multiple tokens. The 'rule' attribute may have the following values: "s" (sentence), "l" (line-break), "w" (word), and "c" (character), with the latter probably not being very useful in a pruning Pazpar2 installation.
From Pazpar2 version 1.1 the ICU wrapper from YAZ is used. Refer to the yaz-icu utility for more information.
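As a hedged illustration, a rule set for mergekeys might be written as follows. The child element names tokenize, transform, and casemap are taken from the example configuration later in this page; the particular rules are one plausible choice, not a recommended default.

```xml
<icu_chain id="mergekey" locale="en">
  <!-- Steps are applied in order from top to bottom -->
  <!-- Split the text into word tokens -->
  <tokenize rule="w"/>
  <!-- Drop whitespace and punctuation from each token -->
  <transform rule="[[:WhiteSpace:][:Punctuation:]] Remove"/>
  <!-- Fold everything to lower case -->
  <casemap rule="l"/>
</icu_chain>
```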
Specifies the ICU rule set used for relevance ranking. The child element of 'relevance' must be 'icu_chain' and the 'id' attribute of the icu_chain is ignored. This definition is obsolete and should be replaced by the equivalent construct:
<icu_chain id="relevance" locale="en">..</icu_chain>
Specifies the ICU rule set used for sorting. The child element of 'sort' must be 'icu_chain' and the 'id' attribute of the icu_chain is ignored. This definition is obsolete and should be replaced by the equivalent construct:
<icu_chain id="sort" locale="en">..</icu_chain>
Specifies ICU tokenization and transformation rules for tokens that are used in Pazpar2's mergekey. The child element of 'mergekey' must be 'icu_chain' and the 'id' attribute of the icu_chain is ignored. This definition is obsolete and should be replaced by the equivalent construct:
<icu_chain id="mergekey" locale="en">..</icu_chain>
Specifies ICU tokenization and transformation rules for tokens that are used in Pazpar2's facets. The child element of 'facet' must be 'icu_chain' and the 'id' attribute of the icu_chain is ignored. This definition is obsolete and should be replaced by the equivalent construct:
<icu_chain id="facet" locale="en">..</icu_chain>
Customizes the CCL parsing (interpretation of query parameter in search). The name and value of the CCL directive are given by the attributes 'name' and 'value' respectively. Refer to the list of possible names in the YAZ manual.
Customizes the ranking (relevance) algorithm. Also known as rank tweaks. The rank element accepts the following attributes - all being optional:
Attribute 'cluster' is a boolean that controls whether Pazpar2 should boost ranking for merged records. The default is 'yes'. A value of 'no' makes Pazpar2 average the ranking of the records in a cluster.
Attribute 'debug' is a boolean that controls whether Pazpar2 should include details about ranking for each document in the show command's response. Enable by using value "yes", disable by using value "no" (default).
Attribute 'follow' is a floating point number greater than or equal to 0. A positive number will boost weight for terms that occur close to each other (proximity, distance). A value of 1, will double the weight if two terms are in proximity distance of 1 (next to each other). The default value of 'follow' is 0 (order will not affect weight).
Attribute 'lead' is a floating point number. It controls if term weight should be reduced by position from start in a metadata field. A positive value of 'lead' will reduce weight as it appears further away from the lead of the field. Default value is 0 (no reduction of weight by position).
Attribute 'length' determines how/if term weight should be divided by length of metadata field. A value of "linear" will divide by length. A value of "log" will divide by log2(length). A value of "none" will leave term weight as is (no division). Default value is "linear".
Refer to Section 7, “Relevance ranking” to see how these tweaks are used in computation of score.
Customization of ranking algorithm was introduced with Pazpar2 1.6.18. The semantics of some of the fields changed in versions up to 1.6.22.
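A sketch of a rank element that sets every tweak explicitly is shown below. The values are chosen only to illustrate the attributes described above, and Pazpar2 1.6.18 or later is assumed.

```xml
<service>
  <!-- cluster="no": average the ranking of the records in a cluster
       debug="yes":  include ranking details in the show response
       follow="1":   double the weight of terms that are next to each other
       lead="0":     no reduction of weight by position in the field
       length="log": divide term weight by log2(length) of the field -->
  <rank cluster="no" debug="yes" follow="1" lead="0" length="log"/>
</service>
```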
Specifies the default sort criteria (default 'relevance'), which previously was hard-coded as default criteria in search. This is a fix/work-around to avoid re-searching when using target-based sorting. In order for this to work efficiently, the search must also have the sort criteria parameter; otherwise pazpar2 will do re-searching on search criteria changes, if changed between search and show command.
This configuration was added in Pazpar2 1.6.20.
Specifies target settings for this service. Refer to the section called “TARGET SETTINGS”.
Specifies timeout parameters for this service.
The timeout element supports the following attributes: session, z3950_operation, z3950_session, which specify 'session timeout', 'Z39.50 operation timeout', and 'Z39.50 session timeout' respectively. The Z39.50 operation timeout is the time Pazpar2 will wait for an active Z39.50/SRU operation before it gives up (times out). The Z39.50 session timeout is the time Pazpar2 will keep the session alive for an idle session (no operation).
The following is recommended but not required: z3950_operation (30) < session (60) < z3950_session (180). The default values are given in parentheses.
The Z39.50 operation timeout may be set per database. Refer to pz:timeout.
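For illustration, a timeout element that simply spells out the defaults and satisfies the recommended ordering would look like this:

```xml
<service>
  <!-- z3950_operation (30) < session (60) < z3950_session (180) -->
  <timeout session="60" z3950_operation="30" z3950_session="180"/>
</service>
```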
Below is a working example configuration:
<?xml version="1.0" encoding="UTF-8"?>
<pazpar2 xmlns="http://www.indexdata.com/pazpar2/1.0">
  <threads number="10"/>
  <file path=".:/usr/share/pazpar2/xsl"/>
  <server>
    <listen port="9004"/>
    <service>
      <rank debug="yes"/>
      <metadata name="title" brief="yes" sortkey="skiparticle" merge="longest" rank="6"/>
      <metadata name="isbn" merge="unique"/>
      <metadata name="date" brief="yes" sortkey="numeric" type="year" merge="range" termlist="yes"/>
      <metadata name="author" brief="yes" termlist="yes" merge="longest" rank="2"/>
      <metadata name="subject" merge="unique" termlist="yes" rank="3" limitmap="local:"/>
      <metadata name="url" merge="unique"/>
      <icu_chain id="relevance" locale="el">
        <transform rule="[:Control:] Any-Remove"/>
        <tokenize rule="l"/>
        <transform rule="[[:WhiteSpace:][:Punctuation:]] Remove"/>
        <casemap rule="l"/>
      </icu_chain>
      <settings src="mysettings"/>
      <timeout session="60"/>
    </service>
  </server>
</pazpar2>
The XML configuration may be partitioned into multiple files by using the include element which takes a single attribute, src. The src attribute is a regular shell-like glob pattern. For example,
<include src="/etc/pazpar2/conf.d/*.xml"/>
The include facility requires Pazpar2 version 1.2.
Pazpar2 features a cunning scheme by which you can associate various kinds of attributes, or settings with search targets. This can be done through XML files which are read at startup; each file can associate one or more settings with one or more targets. The file format is generic in nature, designed to support a wide range of application requirements. The settings can be purely technical things, like, how to perform a title search against a given target, or it can associate arbitrary name=value pairs with groups of targets -- for instance, if you would like to place all commercial full-text bases in one group for selection purposes, or you would like to control what targets are accessible to users by default. Per-database settings values can even be used to drive sorting, facet/termlist generation, or end-user interface display logic.
During startup, Pazpar2 will recursively read a specified directory (can be identified in the pazpar2.cfg file or on the command line), and process any settings files found therein.
Clients of the Pazpar2 webservice interface can selectively override settings for individual targets within the scope of one session. This can be used in conjunction with an external authentication system to determine which resources are to be accessible to which users. Pazpar2 itself has no notion of end-users, and so can be used in conjunction with any type of authentication system. Similarly, the authentication tokens submitted to access-controlled search targets can be overridden, to allow use of Pazpar2 in a consortial or multi-library environment, where different end-users may need to be represented to some search targets in different ways. This, again, can be managed using an external database or other lookup mechanism. Setting overrides can be performed using either the init or the settings webservice command.
In fact, every setting that applies to a database (except pz:id, which can only be used for filtering targets to use for a search) can be overridden on a per-session basis. This allows the client to override specific CCL fields for searching, etc., to meet the needs of a session or user.
Finally, as an extreme case of this, the webservice client can introduce entirely new targets, on the fly, as part of the init or settings command. This is useful if you desire to manage information about your search targets in a separate application such as a database. You do not need any static settings file whatsoever to run Pazpar2 -- as long as the webservice client is prepared to supply the necessary information at the beginning of every session.
The following discussion of practical issues related to session and settings management is cast in terms of a user interface based on Ajax/Javascript technology. It would apply equally well to many other kinds of browser-based logic.
Typically, a Javascript client is not allowed to directly alter the parameters of a session. There are two reasons for this. One has to do with access to information; typically, information about a user will be stored in a system on the server side, or it will be accessible in some way from the server. However, since the Javascript client cannot be entirely trusted (some hostile agent might in fact 'pretend' to be a regular WS client), it is more robust to control session settings from scripting that you run as part of your webserver. Typically, this can be handled during the session initialization, as follows:
Step 1: The Javascript client loads, and asks the webserver for a new Pazpar2 session ID. This can be done using a Javascript call, for instance. Note that it is possible to submit Ajax XMLHttpRequest calls either to Pazpar2 or to the webserver that Pazpar2 is proxying for. Refer to Pazpar2 protocol(7).
Step 2: Code on the webserver authenticates the user, by database lookup, LDAP access, NCIP, etc., and determines which resources the user has access to, and any user-specific parameters that are to be applied during this session.
Step 3: The webserver initializes a new Pazpar2 session, and sets user-specific parameters as necessary, using the init webservice command. A new session ID is returned.
Step 4: The webserver returns this session ID to the Javascript client, which then uses the session ID to submit searches, show results, etc.
Step 5: When the Javascript client ceases to use the session, Pazpar2 destroys any session-specific information.
Each file contains a root element named <settings>. It may contain one or more <set> elements. The settings and set elements may contain the following attributes. Attributes in the set node override those in the settings root element. Each set node must specify (directly, or inherited from the parent node) at least a target, name, and value.
This specifies the search target to which this setting should be applied. Targets are identified by their Z39.50 URL, generally including the host, port, and database name (e.g. z3950.indexdata.com:210/marc). Two wildcard forms are accepted: * (asterisk) matches all known targets; z3950.indexdata.com:210/* matches all known databases on the given host.
A precedence system determines what happens if there are overlapping values for the same setting name for the same target. A setting for a specific target name overrides a setting which specifies target using a wildcard. This makes it easy to set defaults for all targets, and then override them for specific targets or hosts. If there are multiple overlapping settings with the same name and target value, the 'precedence' attribute determines what happens.
For Pazpar2 1.6.4 or later, the target ID may be user-defined, in which case, the actual host, port, etc., is given by the setting pz:url.
The name of the setting. This can be anything you like. However, Pazpar2 reserves a number of setting names for specific purposes, all starting with 'pz:', and it is a good idea to avoid that prefix if you make up your own setting names. See below for a list of reserved variables.
The value of the setting. Generally, this can be anything you want -- however, some of the reserved settings may expect specific kinds of values.
This should be an integer. If not provided, the default value is 0. If two (or more) settings have the same content for target and name, the precedence value determines the outcome. If both settings have the same precedence value, they are both applied to the target(s). If one has a higher value, then the value of that setting is applied, and the other one is ignored.
By setting defaults for target, name, or value in the root settings node, you can use the settings files in many different ways. For instance, you can use a single file to set defaults for many different settings, like search fields, retrieval syntaxes, etc. You can have one file per server, which groups settings for that server or target. You could also have one file which associates a number of targets with a given setting, for instance, to associate many databases with a given category or class that makes sense within your application.
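The interplay of wildcards and precedence described above can be sketched in one small file. The setting name is inherited from the root node; the value "B" and the choice of targets are illustrative assumptions.

```xml
<settings name="pz:elements">
  <!-- Wildcard default for all known targets -->
  <set target="*" value="F"/>
  <!-- A specific target overrides the wildcard default -->
  <set target="z3950.indexdata.com:210/marc" value="B"/>
  <!-- Same target and name twice: the higher precedence is applied,
       the other one is ignored -->
  <set target="funkytarget.com:210/db1" value="B" precedence="0"/>
  <set target="funkytarget.com:210/db1" value="F" precedence="1"/>
</settings>
```

Read this way, funkytarget.com:210/db1 ends up with the value F, z3950.indexdata.com:210/marc with B, and every other target with the wildcard default F.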
The following examples illustrate uses of the settings system to associate settings with targets to meet different requirements.
The example below associates a set of default values that can be used across many targets. Note the wildcard for targets. This associates the given settings with all targets for which no other information is provided.
<settings target="*">
  <!-- This file introduces default settings for pazpar2 -->
  <!-- mapping for unqualified search -->
  <set name="pz:cclmap:term" value="u=1016 t=l,r s=al"/>
  <!-- field-specific mappings -->
  <set name="pz:cclmap:ti" value="u=4 s=al"/>
  <set name="pz:cclmap:su" value="u=21 s=al"/>
  <set name="pz:cclmap:isbn" value="u=7"/>
  <set name="pz:cclmap:issn" value="u=8"/>
  <set name="pz:cclmap:date" value="u=30 r=r"/>
  <set name="pz:limitmap:title" value="rpn:@attr 1=4 @attr 6=3"/>
  <set name="pz:limitmap:date" value="ccl:date"/>
  <!-- Retrieval settings -->
  <set name="pz:requestsyntax" value="marc21"/>
  <set name="pz:elements" value="F"/>
  <!-- Query encoding -->
  <set name="pz:queryencoding" value="iso-8859-1"/>
  <!-- Result normalization settings -->
  <set name="pz:nativesyntax" value="iso2709"/>
  <set name="pz:xslt" value="marc21.xsl"/>
</settings>
The next example shows certain settings overridden for one target, one which returns XML records containing Dublin Core elements, and which furthermore requires a username/password.
<settings target="funkytarget.com:210/db1">
  <set name="pz:requestsyntax" value="xml"/>
  <set name="pz:nativesyntax" value="xml"/>
  <set name="pz:xslt" value="../etc/dublincore.xsl"/>
  <set name="pz:authentication" value="myuser/password"/>
</settings>
The following example associates a specific name/value combination with a number of targets. The targets below are access-restricted, and can only be used by users with special credentials.
<settings name="pz:allow" value="0">
  <set target="funkytarget.com:210/*"/>
  <set target="commercial.com:2100/expensiveDb"/>
</settings>
The following setting names are reserved by Pazpar2 to control the behavior of the client function.
Allows or denies access to the resources it is applied to. Possible values are '0' and '1'. The default is '1' (allow access to this resource).
If the 'pz:apdulog' setting is defined and has a value other than 0, then Z39.50 APDUs are written to the log.
Sets an authentication string for a given database. For Z39.50, this is carried as part of the Initialize Request. In order to carry the information in the "open" elements, separate username and password with a slash (In Z39.50 it is a VisibleString). In order to carry the information in the idPass elements, separate username term, password term and, optionally, a group term with a single blank. If three terms are given, the order is user, group, password. If only two terms are given, the order is user, password.
For HTTP based protocols, such as SRU and Apache Solr, the authentication string includes a username term and, optionally, a password term. Each term is separated by a single blank. The authentication information is passed either by HTTP basic authentication or via URL parameters. The mode of operation is determined by the pz:authentication_mode setting.
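For an HTTP based target, the two authentication settings might be given together as in the sketch below. The target name and credentials are hypothetical, any further settings needed to mark the target as SRU or Solr are omitted, and "basic" is one of the two values accepted by pz:authentication_mode (the other being "url").

```xml
<settings target="sru.example.org/db">
  <!-- Username and password separated by a single blank -->
  <set name="pz:authentication" value="myuser mypassword"/>
  <!-- Send the credentials using HTTP basic authentication -->
  <set name="pz:authentication_mode" value="basic"/>
</settings>
```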
Determines how authentication is carried in HTTP based protocols. Value may be "basic" or "url".
(Not yet implemented). Specifies the time for which a block should be released anyway.
This establishes a CCL field definition or other setting, for the purpose of mapping end-user queries. XXX is the field or setting name, and the value of the setting provides parameters (e.g. parameters to send to the server, etc.). Please consult the YAZ manual for a full overview of the many capabilities of the powerful and flexible CCL parser.
Note that it is easy to establish a set of default parameters, and then override them individually for a given target.
The element set name to be used when retrieving records from a server.
If a show command goes to the boundary of a result set for a database (this depends on sorting) and pz:extendrecs is set to a positive value, then Pazpar2 waits for show to fetch pz:extendrecs more records. This setting is best used if a database does native sorting, because the result set otherwise may be completely re-sorted during extended fetch. The default value of pz:extendrecs is 0 (no extended fetch).
The pz:extendrecs setting appeared in Pazpar2 version 1.6.26, but the behavior changed with the release of Pazpar2 1.6.29.
name
Specifies that for field name, the target supports (native) facets. The value is the name of the field on the target.
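A hedged sketch of such a mapping follows. It assumes the setting is written as pz:facetmap followed by a colon and the field name, in the same way as the pz:cclmap and pz:limitmap settings shown elsewhere in this page; the target-side field name subject_exact is hypothetical.

```xml
<settings target="funkytarget.com:210/db1">
  <!-- The Pazpar2 field "subject" is served by a native facet field on the target -->
  <set name="pz:facetmap:subject" value="subject_exact"/>
</settings>
```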
name
Like pz:facetmap, but makes Pazpar2 inspect the term value, which consists of two items separated by a colon. The first item is the raw ID to be sent to the database if limitmap on the field name is used. The second item is the display term.
This facility was added in Pazpar2 version 1.11.0.
This setting can't be 'set' -- it contains the ID (normally ZURL) for a given target, and is useful for filtering -- specifically when you want to select one or more specific targets in the search command.
name
Specifies attributes for limiting a search to a field - using the limit parameter for search. It can be used to filter locally or remotely (search in a target). In some cases the mapping of a field to a value is identical to an existing cclmap field; in other cases the field must be specified in a different way - for example to match a complete field (rather than parts of a subfield).
The value of limitmap may have one of three forms: referral to an existing CCL field, a raw PQF string, or a local limit. The leading string determines the type: either ccl: for a CCL field, rpn: for PQF/RPN, or local: for filtering in Pazpar2. The local: prefix may be followed by the name of a metadata field (the default is to use the name of the limitmap itself).
For Pazpar2 version 1.6.23 and later the limitmap may include multiple specifications, separated by , (comma). For example: ccl:title,local:ltitle,rpn:@attr 1=4.
The limitmap facility is supported for Pazpar2 version 1.6.0. Local filtering is supported in Pazpar2 1.6.6.
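The three forms of a limitmap value can be put side by side in a settings file. The first two lines are taken from the default settings example in this page; the field name subject in the third is an illustrative assumption.

```xml
<settings target="*">
  <!-- ccl: reuse the existing CCL field "date" -->
  <set name="pz:limitmap:date" value="ccl:date"/>
  <!-- rpn: send a raw PQF attribute list to the target -->
  <set name="pz:limitmap:title" value="rpn:@attr 1=4 @attr 6=3"/>
  <!-- local: filter inside Pazpar2, on the metadata field of the same name -->
  <set name="pz:limitmap:subject" value="local:"/>
</settings>
```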
Controls the maximum number of records to be retrieved from a server. The default is 100.
If set and non-empty, libMemcached will be configured and enabled for the target. The value of this setting is the same as the ZOOM option memcached, which in turn is the configuration string passed to the memcached function of libMemcached.
This setting is honored in Pazpar2 1.6.39 or later. Pazpar2 must be using YAZ version 5.0.13 or later.
If set and non-empty, redis will be configured and enabled for the target. The value of this setting is exactly as the redis option for ZOOM C of YAZ. Refer to the YAZ manual.
This setting is honored in Pazpar2 1.6.43 or later. Pazpar2 must be using YAZ version 5.2.0 or later.
Specifies how Pazpar2 should map retrieved records to XML. Currently supported values are xml, iso2709 and txml.
The value iso2709 makes Pazpar2 convert retrieved MARC records to MARCXML. In order to convert to XML, the exact character set of the MARC records must be known (if not, the resulting XML is probably not well-formed). The character set may be specified by appending ;charset to iso2709. If omitted, a charset of MARC-8 is assumed. This is correct for most MARC21/USMARC records.
The value txml is like iso2709 except that records are converted to TurboMARC instead of MARCXML.
The value xml is used if Pazpar2 retrieves records that are already XML (no conversion takes place).
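The value syntax described above (a format name, optionally followed by a semicolon and a character set) can be sketched as follows. The function name parse_nativesyntax is hypothetical, and returning no charset for xml is an assumption made for this illustration, since no conversion takes place in that case.

```python
def parse_nativesyntax(value):
    """Return (format, charset) for a native-syntax value such as 'iso2709;utf-8'."""
    fmt, _, charset = value.partition(";")
    fmt = fmt.strip()
    if fmt not in ("xml", "iso2709", "txml"):
        raise ValueError(f"unsupported value: {value!r}")
    if fmt == "xml":
        # Records are already XML; no character set conversion is involved.
        return fmt, None
    # MARC-8 is assumed when no charset is given.
    return fmt, charset.strip() or "MARC-8"


for value in ("iso2709", "iso2709;latin-1", "txml;utf-8", "xml"):
    print(value, "->", parse_nativesyntax(value))
```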
Sets the character set for Z39.50 negotiation. Most targets do not support this, and some will even close the connection if it is set (crash on the server side or similar). If set, you probably want to set it to UTF-8.
Piggybacking enables Pazpar2 to retrieve records from the server as part of the search response in Z39.50. Almost all servers support this (or fail it gracefully), but a few servers will produce undesirable results. Set to '1' to enable piggybacking, '0' to disable it. Default is 1 (piggybacking enabled).
Allows you to specify an arbitrary PQF query language substring. The provided string is prefixed to the user's query after it has been normalized to PQF internally in pazpar2. This allows you to attach complex 'filters' to queries for a given target, sometimes necessary to select sub-catalogs in union catalog systems, etc.
Allows you to extend a query with dates and operators. The provided string allows certain substitutions and serves as a format string. The special two-character sequence '%%' is replaced by the original query. Other sequences that begin with a percent sign are conversions supported by strftime. All other characters are copied verbatim. For example, the string
@and @attr 1=30 @attr 2=3 %Y %%
would search for the current year combined with the original PQF (%%).
This setting can also be used as a more general alternative to pz:pqf_prefix: a way of embedding the submitted query anywhere in the string rather than appending it to a prefix. For example, if it is desired to omit all records satisfying the query @attr 1=pica.bib 0007, then this subquery can be combined with the submitted query as the second argument of @not (the PQF and-not operator) by using the pz:pqf_strftime value
@not %% @attr 1=pica.bib 0007
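The substitution rules for pz:pqf_strftime can be illustrated with a short sketch. It is not Pazpar2's implementation: the function name expand_pqf_strftime is hypothetical, and Python's strftime stands in for the C library function. The sample query "@attr 1=4 water" is made up for the example.

```python
from datetime import datetime


def expand_pqf_strftime(fmt, query, now=None):
    """Expand a pz:pqf_strftime-style format string.

    '%%' is replaced by the original query; other %-sequences are
    strftime conversions; everything else is copied verbatim.
    """
    now = now or datetime.now()
    # Split on '%%' first so strftime never sees the query placeholder.
    parts = fmt.split("%%")
    return query.join(now.strftime(part) for part in parts)


fixed = datetime(2024, 5, 17)  # fixed date so the output is reproducible
print(expand_pqf_strftime("@and @attr 1=30 @attr 2=3 %Y %%", "@attr 1=4 water", fixed))
print(expand_pqf_strftime("@not %% @attr 1=pica.bib 0007", "@attr 1=4 water", fixed))
```

The first call combines the year 2024 with the submitted query; the second wraps the submitted query as the first argument of @not.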
Specifies that a target is preferred, e.g. a possibly local, faster target. Using block=preferred on the show command will wait for all these targets to return records before releasing the block. If no target is preferred, block=preferred is identical to block=1, which releases the block when one target has returned records.
Controls the chunk size in present requests. Pazpar2 will make (maxrecs / chunk) request(s). The default is 20.
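As a rough illustration of the (maxrecs / chunk) arithmetic, the sketch below lists the present requests implied by the two defaults (100 records, chunks of 20). The function name present_ranges is hypothetical; 1-based positions and a shorter final chunk when the division is not exact are assumptions, and in practice the number of requests also depends on how many hits the target reports.

```python
def present_ranges(maxrecs=100, chunk=20):
    """Yield (start, count) pairs covering records 1..maxrecs in chunks."""
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    start = 1
    while start <= maxrecs:
        # The last chunk may be shorter than the configured chunk size.
        count = min(chunk, maxrecs - start + 1)
        yield start, count
        start += count


ranges = list(present_ranges())
print(len(ranges), "requests:", ranges)
```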
The encoding of the search terms that a target accepts. Most targets do not honor UTF-8, in which case this needs to be specified. If this setting is given, each term in a query will be converted to that encoding.
Specifies a filter which allows Pazpar2 to include only records that meet certain criteria in a result. Records that do not match are ignored. The filter takes the form name, name~value, or name=value, which includes only records with a metadata element (name) that contains the given substring (~value) or matches the value exactly (=value). If value is omitted, all records in which the named metadata element is present are included.
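The three filter forms can be sketched as follows. This is only an illustration of the matching rules as described: the function names are hypothetical, a record is modelled as a plain dictionary from metadata element names to lists of values, and case-sensitive comparison is an assumption.

```python
def parse_recordfilter(spec):
    """Split 'name', 'name~value' or 'name=value' into (name, op, value)."""
    positions = [i for i in (spec.find("~"), spec.find("=")) if i != -1]
    if not positions:
        return spec, None, None
    i = min(positions)  # the first operator character decides the form
    return spec[:i], spec[i], spec[i + 1:]


def record_matches(record, spec):
    """Return True if the record (dict of name -> list of values) passes the filter."""
    name, op, value = parse_recordfilter(spec)
    values = record.get(name, [])
    if op is None:
        return bool(values)                      # element merely has to be present
    if op == "~":
        return any(value in v for v in values)   # substring match
    return any(value == v for v in values)       # exact match


record = {"title": ["Water supply"], "date": ["2001"]}
for spec in ("title", "title~supply", "title=Water", "date=2001", "author"):
    print(spec, record_matches(record, spec))
```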
This specifies the record syntax to use when requesting records from a given server. The value can be a symbolic name like marc21 or xml, or it can be a Z39.50-style dot-separated OID.
Specifies sort criteria to be applied to the result set. Only works for targets which support the sort service.
field
Specifies native sorting for a target where field is a sort criterion (see command show). The value has two components separated by a colon: strategy and native-field. Strategy is one of z3950, type7, cql, sru11, or embed. The second component, native-field, is the field that is recognized by the target.
Only supported for Pazpar2 1.6.4 and later.
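The two-component value can be split and checked with a few lines of code. The sketch below is illustrative only: parse_sortmap is a hypothetical name, and the sample value "type7:1=4" is made up to show the shape of a value rather than taken from a real target.

```python
STRATEGIES = {"z3950", "type7", "cql", "sru11", "embed"}


def parse_sortmap(value):
    """Split a 'strategy:native-field' value and validate the strategy."""
    strategy, sep, native_field = value.partition(":")
    if not sep or strategy not in STRATEGIES:
        raise ValueError(f"expected strategy:native-field, got {value!r}")
    return strategy, native_field


print(parse_sortmap("type7:1=4"))  # hypothetical value
```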
This setting enables SRU/Solr support. It has four possible settings. 'get' enables SRU access through GET requests. 'post' enables SRU/POST support, less commonly supported, but useful if very large requests are to be submitted. 'soap' enables the SRW (SRU over SOAP) variation of the protocol.
A value of 'solr' enables Solr client support. This is supported for Pazpar2 version 1.5.0 and later.
This allows the SRU version to be specified. If unset, Pazpar2 will use the default of YAZ (currently 1.2). Should be set to 1.1 or 1.2. For Solr, the currently supported/tested versions are 1.4 and 3.x.
Specifies the number of facet terms to be requested from the target. The default is unspecified, i.e. decided by the server. Also see pz:facetmap.
Specifies whether to use a factor for Pazpar2-generated facets (1) or not (0). When mixing locally generated facets (computed from the downloaded sample of up to pz:maxrecs records) with native (target-generated) facets, the latter will dominate the facet list since they are generated based on the complete result set. By scaling up the facet count using the ratio between the total hit count and the sample size, the total facet count can be approximated and thus better compared with native facets. This is not enabled by default.
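The scaling can be written out as a small calculation. The sketch below is an approximation of the idea rather than Pazpar2's code: scale_facet_count is a hypothetical name, and rounding to the nearest integer and leaving counts untouched when the whole result set was downloaded are assumptions.

```python
def scale_facet_count(sample_count, total_hits, sample_size):
    """Estimate a facet count for the full result set from a sample."""
    if sample_size <= 0 or total_hits <= sample_size:
        # Nothing to extrapolate: the sample already covers the result set.
        return sample_count
    return round(sample_count * total_hits / sample_size)


# 12 of 100 downloaded records carry the term; the target reported 5000 hits.
print(scale_facet_count(12, 5000, 100))
```

With these numbers the local count of 12 is scaled by 5000 / 100, which puts it on the same footing as a count that a target computed over all 5000 hits.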
Specifies the timeout for an operation (e.g. search and fetch) on a database. This overrides the z3950_operation timeout that is given for a service. See timeout.
The timeout facility is supported for Pazpar2 version 1.8.4 and later.
Specifies the URL for the target and overrides the target ID. pz:url is only recognized for Pazpar2 1.6.4 and later.
A comma-separated list of stylesheet names that specifies how to convert incoming records to the internal representation.
For each name, the embedded stylesheets (XSL) that come with the service definition are consulted first and take precedence over external files (see xslt of the service definition). If the name does not match an embedded stylesheet, it is considered a filename.
The suffix of each file specifies the kind of transformation. Suffix ".xsl" makes an XSL transform. Suffix ".mmap" will use the MMAP transform (described below). The special value "auto" will use a file whose name is the value of pz:requestsyntax followed by '.xsl'.
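The resolution order described above (embedded stylesheets first, then "auto", then the file suffix) can be sketched as follows. The function name plan_transforms and the file names in the usage lines are hypothetical, and checking embedded names before "auto" is an assumption of this sketch.

```python
def plan_transforms(xslt_setting, requestsyntax=None, embedded=()):
    """Return a list of (name, kind) steps for a comma-separated stylesheet list."""
    steps = []
    for name in xslt_setting.split(","):
        name = name.strip()
        if name in embedded:
            # Embedded stylesheets take precedence over external files.
            steps.append((name, "embedded"))
            continue
        if name == "auto":
            if not requestsyntax:
                raise ValueError("'auto' requires a request syntax")
            name = requestsyntax + ".xsl"
        if name.endswith(".xsl"):
            kind = "xsl"
        elif name.endswith(".mmap"):
            kind = "mmap"
        else:
            kind = "unknown"  # the text above does not cover other suffixes
        steps.append((name, kind))
    return steps


print(plan_transforms("auto", requestsyntax="marc21"))
print(plan_transforms("tmarc.xsl, extra.mmap, local", embedded={"local"}))
```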
When mapping MARC records, XSLT can be bypassed for increased performance with the alternate "MARC map" format. Provide the path of a file with extension ".mmap" containing on each line:
<field> <subfield> <metadata element>
For example:
245 a title
500 $ description
773 * citation
To map the field value, specify a subfield of '$'. To store a concatenation of all subfields, specify a subfield of '*'.
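To make the '$' and '*' conventions concrete, here is a minimal sketch that parses MARC map lines and applies them to a simplified record. It is not the MMAP transform itself: the function names, the (tag, value, subfields) record model, and joining subfields with a single space for '*' are all assumptions made for illustration.

```python
def parse_mmap(text):
    """Parse lines of '<field> <subfield> <metadata element>' into tuples."""
    rules = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if len(parts) != 3:
            raise ValueError(f"expected three columns: {line!r}")
        rules.append(tuple(parts))
    return rules


def apply_mmap(rules, fields):
    """fields: list of (tag, value, subfields); subfields: list of (code, text)."""
    out = {}
    for tag, value, subfields in fields:
        for rule_tag, rule_sub, element in rules:
            if rule_tag != tag:
                continue
            if rule_sub == "$":        # the field value itself
                texts = [value] if value else []
            elif rule_sub == "*":      # concatenation of all subfields
                joined = " ".join(text for _, text in subfields)
                texts = [joined] if joined else []
            else:                      # one specific subfield code
                texts = [text for code, text in subfields if code == rule_sub]
            if texts:
                out.setdefault(element, []).extend(texts)
    return out


rules = parse_mmap("245 a title\n500 $ description\n773 * citation\n")
fields = [
    ("245", None, [("a", "Water"), ("b", "a study")]),
    ("500", "General note", []),
    ("773", None, [("t", "Journal"), ("g", "vol. 2")]),
]
mapped = apply_mmap(rules, fields)
print(mapped)
```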
The 'pz:zproxy' setting has the value syntax 'host.internet.address:port'. It is used to tunnel Z39.50 requests through the named Z39.50 proxy.