10. Extended Services: Remote Insert, Update and Delete

Note

Extended services are only supported when accessing the Zebra server using the Z39.50 protocol. The SRU protocol does not support extended services.

The extended services are not enabled by default in Zebra, because they modify the system. In the main Zebra configuration file zebra.cfg, Zebra can be configured to allow anybody to search while allowing updates only for a particular admin user. For a user named admin, you could use:

     perm.anonymous: r
     perm.admin: rw
     passwd: passwordfile
    

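In the password file passwordfile, specify users and passwords as colon-separated strings, one user per line. The passwd directive expects plain-text passwords, as in the example below. To store encrypted passwords instead, maintained with a tool like htpasswd, use the separate passwd.c directive.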

     admin:secret
    

It is essential to configure Zebra to store records internally, and to support modifications and deletion of records:

     storeData: 1
     storeKeys: 1
    

The general record type should be set to a record filter that is able to parse XML records. You may use either of the two declarations below (but not both simultaneously):

     recordType: dom.filter_dom_conf.xml
     # recordType: grs.xml
    

Note the difference from the file-type-specific instructions

     recordType.xml: dom.filter_dom_conf.xml
     # recordType.xml: grs.xml
    

which only work when indexing XML files from the filesystem using the *.xml naming convention.

To enable transaction-safe shadow indexing, which is especially important for this kind of operation, set

     shadow: directoryname: size (e.g. 1000M)
    

See Section 2, “The Zebra Configuration File” for additional information on these configuration options.
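The settings discussed above can be checked mechanically before starting the server. The following is a minimal Python sketch, not part of Zebra: it assumes zebra.cfg consists of simple "name: value" lines with "#" comments, and the helper names parse_zebra_cfg and check_es_settings, as well as the shadow directory name shadowdir, are hypothetical.

```python
def parse_zebra_cfg(text):
    """Parse 'name: value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values such as
        # 'shadowdir:1000M' keep their own colon.
        name, sep, value = line.partition(":")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


def check_es_settings(settings):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    # At least one perm.* entry must grant write access.
    if not any("w" in v for k, v in settings.items() if k.startswith("perm.")):
        problems.append("no perm.* entry grants write access")
    for key in ("storeData", "storeKeys"):
        if settings.get(key) != "1":
            problems.append(f"{key} should be 1")
    # Only the general recordType counts; recordType.xml is a different key.
    if "recordType" not in settings:
        problems.append("general recordType is not set")
    if "shadow" not in settings:
        problems.append("shadow is not set (recommended for updates)")
    return problems


# Example configuration assembled from the fragments in this section.
EXAMPLE_CFG = """\
perm.anonymous: r
perm.admin: rw
passwd: passwordfile
storeData: 1
storeKeys: 1
recordType: dom.filter_dom_conf.xml
shadow: shadowdir:1000M
"""

if __name__ == "__main__":
    for problem in check_es_settings(parse_zebra_cfg(EXAMPLE_CFG)):
        print("problem:", problem)
```

Because the check looks for the exact key recordType, a configuration that only sets recordType.xml is reported as missing the general record type, which mirrors the distinction made above.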

Note

It is not possible to carry information about record types or similar to Zebra when using extended services, due to limitations of the Z39.50 protocol. Therefore, indexing filters cannot be chosen on a per-record basis. One and only one general XML indexing filter must be defined.

10.1. Extended services in the Z39.50 protocol

The Z39.50 standard allows servers to accept special binary extended services protocol packages, which may be used to insert, update and delete records on servers. These packages carry control and update information to the server, encoded in seven package fields:

Table 6.1. Extended services Z39.50 Package Fields

Parameter        Value                 Notes
---------------  --------------------  -----------------------------------------
type             'update'              Must be set to trigger extended services.
action           string                Extended service action type, with one of four possible values: recordInsert, recordReplace, recordDelete, and specialUpdate.
record           XML string            An XML formatted string containing the record.
syntax           'xml'                 XML/SUTRS/MARC. GRS-1 not supported. The default filter (record type) as given by recordType in zebra.cfg is used to parse the record.
recordIdOpaque   string                Optional client-supplied, opaque record identifier used under insert operations.
recordIdNumber   positive number       Zebra's internal system number, not allowed for recordInsert or specialUpdate actions which result in fresh record inserts.
databaseName     database identifier   The name of the database to which the extended services should be applied.

The action parameter can be any of recordInsert (will fail if the record already exists), recordReplace (will fail if the record does not exist), recordDelete (will fail if the record does not exist), and specialUpdate (will insert or update the record as needed; record deletion is not possible).
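The semantics of the four actions can be illustrated with a toy in-memory model. This is a minimal Python sketch for illustration only; it is not how Zebra implements updates, and the RecordStore class and its apply method are hypothetical names.

```python
class RecordStore:
    """Toy in-memory model of the four update actions (not Zebra code)."""

    def __init__(self):
        self.records = {}

    def apply(self, action, record_id, record=None):
        exists = record_id in self.records
        if action == "recordInsert":
            # Fails if the record already exists.
            if exists:
                raise KeyError(f"{record_id} already exists")
            self.records[record_id] = record
        elif action == "recordReplace":
            # Fails if the record does not exist.
            if not exists:
                raise KeyError(f"{record_id} does not exist")
            self.records[record_id] = record
        elif action == "recordDelete":
            # Fails if the record does not exist.
            if not exists:
                raise KeyError(f"{record_id} does not exist")
            del self.records[record_id]
        elif action == "specialUpdate":
            # Inserts or updates as needed; never deletes.
            self.records[record_id] = record
        else:
            raise ValueError(f"unknown action: {action}")


if __name__ == "__main__":
    store = RecordStore()
    store.apply("specialUpdate", "id1234", "<record>v1</record>")  # insert
    store.apply("specialUpdate", "id1234", "<record>v2</record>")  # update
    store.apply("recordDelete", "id1234")
    print(len(store.records))
```

In this model specialUpdate is the only action that never raises for a known action name, which is why it is convenient when the client does not know whether the record already exists.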

During all actions, the usual rules for internal record ID generation apply, unless an optional recordIdNumber Zebra internal ID or a recordIdOpaque string identifier is assigned. The default ID generation is configured using the recordId directive in zebra.cfg. See Section 2, “The Zebra Configuration File”.

Setting of the recordIdNumber parameter, which must be an existing Zebra internal system ID number, is not allowed during any recordInsert or specialUpdate action resulting in fresh record inserts.
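A client can catch some invalid field combinations before sending a package. The following Python sketch assembles the package fields from Table 6.1 into a dictionary and applies the rules stated above; the function name build_update_package and its parameter names are hypothetical, and no such helper exists in YAZ or Zebra. For specialUpdate the client cannot know whether the action will result in a fresh insert, so that case is left for the server to reject.

```python
ACTIONS = {"recordInsert", "recordReplace", "recordDelete", "specialUpdate"}


def build_update_package(action, database_name, record=None, syntax="xml",
                         record_id_opaque=None, record_id_number=None):
    """Assemble the extended services package fields as a plain dict."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if record_id_number is not None:
        # Zebra's internal system number must refer to an existing record,
        # so it cannot accompany a fresh insert.
        if action == "recordInsert":
            raise ValueError("recordIdNumber is not allowed with recordInsert")
        if not isinstance(record_id_number, int) or record_id_number <= 0:
            raise ValueError("recordIdNumber must be a positive number")

    package = {
        "type": "update",  # must be set to trigger extended services
        "action": action,
        "syntax": syntax,
        "databaseName": database_name,
    }
    # Optional fields are only included when the caller supplied them.
    if record is not None:
        package["record"] = record
    if record_id_opaque is not None:
        package["recordIdOpaque"] = record_id_opaque
    if record_id_number is not None:
        package["recordIdNumber"] = record_id_number
    return package


if __name__ == "__main__":
    pkg = build_update_package(
        "recordReplace", "Default",
        record="<record><title>Example</title></record>",
        record_id_number=447,  # hypothetical system number
    )
    print(sorted(pkg))
```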

When retrieving existing records indexed with GRS-1 indexing filters, the Zebra internal ID number is returned in the field /*/id:idzebra/localnumber in the namespace xmlns:id="http://www.indexdata.dk/zebra/", where it can be picked up for later record updates or deletes.
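Picking up the internal ID from a retrieved record might look like the following Python sketch. The sample record is invented for illustration, and the exact shape of real retrieval records may differ; in particular, the code accepts a localnumber child with or without the Zebra namespace, since that detail is an assumption here. The function name zebra_local_number is hypothetical.

```python
import xml.etree.ElementTree as ET

ZEBRA_NS = "http://www.indexdata.dk/zebra/"

# Invented sample record; real records carry application-specific content.
SAMPLE_RECORD = """<record>
  <title>Example</title>
  <id:idzebra xmlns:id="http://www.indexdata.dk/zebra/">
    <id:localnumber>447</id:localnumber>
  </id:idzebra>
</record>"""


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def zebra_local_number(record_xml):
    """Return the Zebra internal ID found at /*/id:idzebra/localnumber, or None."""
    root = ET.fromstring(record_xml)
    # idzebra is looked up as a direct child of the root element.
    idzebra = root.find(f"{{{ZEBRA_NS}}}idzebra")
    if idzebra is None:
        return None
    for child in idzebra:
        if local_name(child.tag) == "localnumber" and child.text:
            return int(child.text.strip())
    return None


if __name__ == "__main__":
    print(zebra_local_number(SAMPLE_RECORD))
```

The returned number can then be supplied as recordIdNumber in a later recordReplace or recordDelete package.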

A new element set for retrieval of internal record data has been added, which can be used to access minimal records containing only the recordIdNumber Zebra internal ID, or the recordIdOpaque string identifier. This works for any indexing filter used. See Section 4, “Retrieval of Zebra internal record data”.

The recordIdOpaque string parameter is a client-supplied, opaque record identifier, which may be used under insert, update and delete operations. The client software is responsible for assigning these identifiers to records. This identifier replaces Zebra's own automatic identifier generation with a unique mapping from recordIdOpaque to the Zebra internal recordIdNumber. The opaque recordIdOpaque string identifiers are not visible in retrieval records, nor are they searchable, so the value of this parameter is questionable. It serves mostly as a convenient mapping from application domain string identifiers to Zebra internal IDs.

10.2. Extended services from yaz-client

We can now start a yaz-client admin session and create a database:

      
      $ yaz-client localhost:9999 -u admin/secret
      Z> adm-create
      
     

Now that the Default database has been created, we can insert an XML file (esdd0006.grs from example/gils/records) and index it:

      
      Z> update insert id1234 esdd0006.grs
      
     

The third word of the command, id1234 here, is the recordIdOpaque package field.

Actually, we should have a way to specify "no opaque record id" for yaz-client's update command. We'll fix that.

The newly inserted record can be searched as usual:

      
      Z> f utah
      Sent searchRequest.
      Received SearchResponse.
      Search was a success.
      Number of hits: 1, setno 1
      SearchResult-1: term=utah cnt=1
      records returned: 0
      Elapsed: 0.014179
      
     

Let's delete the beast, using the same recordIdOpaque string parameter:

      
      Z> update delete id1234
      No last record (update ignored)
      Z> update delete 1 esdd0006.grs
      Got extended services response
      Status: done
      Elapsed: 0.072441
      Z> f utah
      Sent searchRequest.
      Received SearchResponse.
      Search was a success.
      Number of hits: 0, setno 2
      SearchResult-1: term=utah cnt=0
      records returned: 0
      Elapsed: 0.013610
      
     

If the shadow register is enabled in your zebra.cfg, you must run the adm-commit command

      
      Z> adm-commit
      
     

after each update session in order to write your changes from the shadow to the live register space.

10.3. Extended services from yaz-php

Extended services are also available from the YAZ PHP client layer. An example of a YAZ-PHP extended service transaction is given here:

      
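      // Connection details are assumptions, reused from the yaz-client example
      $yaz = yaz_connect('localhost:9999',
                         array('user' => 'admin', 'password' => 'secret'));

      $record = '<record><title>A fine specimen of a record</title></record>';

      $options = array('action' => 'recordInsert',
                       'syntax' => 'xml',
                       'record' => $record,
                       'databaseName' => 'mydatabase');

      // Queue the update and the commit, then send both to the server
      yaz_es($yaz, 'update', $options);
      yaz_es($yaz, 'commit', array());
      yaz_wait();

      // Report any error from the transaction
      if ($error = yaz_error($yaz)) {
          echo "$error";
      }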
      
     

10.4. Extended services debugging guide

When debugging ES over PHP we recommend the following order of tests:

  • Make sure you have a sample record on your filesystem that you can index with the zebraidx command. Do it exactly as you planned, using one of the GRS-1 filters or the DOMXML filter. When this works, proceed.

  • Check that your server setup is OK before you write a single line of PHP using ES. Take the same record from the filesystem and send it as ES via yaz-client, as described in Section 10.2, “Extended services from yaz-client”, and remember the -a option, which tells you what goes over the wire. Also note the section on permissions: try

            perm.anonymous: rw
           

    in zebra.cfg to make sure you do not run into permission problems (but never expose such an insecure setup on the internet). Then make sure to set the general recordType instruction, pointing correctly to the GRS-1 filters or the DOMXML filters.

  • If you insist on using the sysno in the recordIdNumber setting, please make sure you do only updates and deletes. Zebra's internal system number is not allowed for recordInsert or specialUpdate actions which result in fresh record inserts.

  • If the shadow register is enabled in your zebra.cfg, remember to run the

            Z> adm-commit
           

    command as well.

  • If this works, then proceed to do the same thing in your PHP script.