5. Extended indexing of MARC records

Extended indexing of MARC records helps when you need to index a combination of subfields, index only part of a field, or use the embedded fields of a MARC record during indexing.

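Extended indexing of MARC records additionally allows indexing data from the LEADER and from control fields, using indicator values during indexing, and indexing embedded (linked) fields in UNIMARC based formats.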

Note

Compared with simple indexing, extended indexing may increase the time needed to index MARC records by a factor of about 2-3.

5.1. The index-formula

First, we have to define the term index-formula for MARC records. This term helps in understanding the notation that Zebra uses for extended indexing of MARC records. Our definition is based on the document "The table of conformity for Z39.50 use attributes and RUSMARC fields", which is available only in Russian.

The index-formula is a combination of subfields presented in the following way:

     71-00$a, $g, $h ($c){.$b ($c)} , (1)
    

Zebra supports right truncation, a BIB-1 attribute. In this case, index-formula (1) consists of the following forms, written in the same notation as (1):

     71-00$a, $g, $h
     71-00$a, $g
     71-00$a
    

Note

The original MARC record may lack some of the elements included in the index-formula.

This notation uses the following operands:

#

Denotes a whitespace character.

-

The position may contain any value defined by the MARC format. For example, the index-formula

	 70-#1$a, $g , (2)
	

includes

	 700#1$a, $g
	 701#1$a, $g
	 702#1$a, $g
	
{...}

Repeatable elements are enclosed in curly braces {}. For example, the index-formula

	 71-00$a, $g, $h ($c){.$b ($c)} , (3)
	

includes

	 71-00$a, $g, $h ($c). $b ($c)
	 71-00$a, $g, $h ($c). $b ($c). $b ($c)
	 71-00$a, $g, $h ($c). $b ($c). $b ($c). $b ($c)
	

Note

All other operands are the same as those used in the MARC world.
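
As a rough illustration of the "-" operand described above, the following Python sketch checks which concrete tags a tag pattern such as "70-" covers. The function name tag_matches and the list of candidate tags are assumptions made for this example only; they are not part of Zebra or of any MARC tool.

```python
def tag_matches(pattern: str, tag: str) -> bool:
    """Return True if a tag fits a pattern in which '-' stands for any value."""
    if len(pattern) != len(tag):
        return False
    # Every position must either be the wildcard or match exactly.
    return all(p == "-" or p == t for p, t in zip(pattern, tag))


# Hypothetical list of tags, chosen only to exercise the pattern.
candidates = ["700", "701", "702", "710", "200"]
covered = [tag for tag in candidates if tag_matches("70-", tag)]
print(covered)  # ['700', '701', '702']
```

The sketch only compares tags; indicators and subfields in a formula such as (2) would need the same position-by-position check.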

5.2. Notation of index-formula for Zebra

Extended indexing overloads the path of the elm definition in the Zebra abstract syntax file (.abs file): names beginning with "mc-" are interpreted by Zebra as an index-formula. The database index is created and linked with an access point (BIB-1 use attribute) according to this formula.

For example, the index-formula

     71-00$a, $g, $h ($c){.$b ($c)} , (4)
    

in the .abs file looks like:

     mc-71.00_$a,_$g,_$h_(_$c_){.$b_(_$c_)}
    

The notation of the index-formula uses the following operands:

_

Denotes a whitespace character.

.

The position may contain any value defined by the MARC format. For example, the index-formula

	 70-#1$a, $g , (5)
	

matches mc-70._1_$a,_$g_ and includes

	 700_1_$a,_$g_
	 701_1_$a,_$g_
	 702_1_$a,_$g_
	
{...}

Repeatable elements are enclosed in curly braces {}. For example, the index-formula

	 71-00$a, $g, $h ($c){.$b ($c)} , (6)
	

matches mc-71.00_$a,_$g,_$h_(_$c_){.$b_(_$c_)} and includes

	 71.00_$a,_$g,_$h_(_$c_).$b_(_$c_)
	 71.00_$a,_$g,_$h_(_$c_).$b_(_$c_).$b_(_$c_)
	 71.00_$a,_$g,_$h_(_$c_).$b_(_$c_).$b_(_$c_).$b_(_$c_)
	
<...>

An embedded index-formula (for linked fields) is enclosed in <>. For example, the index-formula

	 4--#-$170-#1$a, $g ($c) , (7)
	

matches mc-4.._._$1<70._1_$a,_$g_(_$c_)>_ and includes

	 463_._$1<70._1_$a,_$g_(_$c_)>_
	

Note

All other operands are the same as those used in the MARC world.
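
The forms listed for the repeatable group {...} above can be reproduced with a short Python sketch. The helper name expand_repeats is an assumption made for this illustration. It only builds the listed forms as strings and says nothing about how Zebra itself evaluates the formula.

```python
import re


def expand_repeats(formula: str, count: int) -> str:
    """Replace each {...} group with `count` copies of its content."""
    return re.sub(r"\{([^{}]*)\}", lambda m: m.group(1) * count, formula)


# The formula from the example, without the "mc-" prefix.
formula = "71.00_$a,_$g,_$h_(_$c_){.$b_(_$c_)}"
for n in (1, 2, 3):
    print(expand_repeats(formula, n))
```

With one, two and three repetitions, the printed lines correspond to the three forms shown for the {...} operand.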

5.2.1. Examples

  1. indexing LEADER

    Use the keyword "ldr" to index the leader. For example, to index data from the 6th and 7th positions of the LEADER:

    	 elm mc-ldr[6] Record-type !
    	 elm mc-ldr[7] Bib-level   !
    	
  2. indexing data from control fields

    indexing the date (the time the record was added to the database)

    	 elm mc-008[0-5] Date/time-added-to-db !
    	

    or for RUSMARC (this data is included in field 100)

    	 elm mc-100___$a[0-7]_ Date/time-added-to-db !
    	
  3. using indicators while indexing

    For RUSMARC, the index-formula 70-#1$a, $g matches

    	 elm mc-70._1_$a,_$g_ Author !:w,!:p
    	

    When Zebra finds a field that matches the "70." pattern, it checks the indicators. In this case the first indicator must be whitespace ("_") and the second must be "1"; otherwise the field is not indexed.

  4. indexing embedded (linked) fields for UNIMARC based formats

    For RUSMARC, the index-formula 4--#-$170-#1$a, $g ($c) matches

    	 elm mc-4.._._$1<70._1_$a,_$g_(_$c_)>_ Author !:w,!:p
    	 

    Data are extracted from the record if the field matches the "4.._." pattern and the data in the linked field match the embedded index-formula 70._1_$a,_$g_(_$c_).
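
The positional selectors used in examples 1 and 2 above ("[6]", "[7]", "[0-5]", "[0-7]") can be read as character positions within the leader or field data. The Python sketch below illustrates that reading. It assumes that positions are counted from zero, as is usual for MARC leader and control-field positions, and that a range includes both ends. The helper name select_positions and the sample leader and 008 values are made up for this illustration.

```python
def select_positions(data: str, spec: str) -> str:
    """Return the characters named by a spec such as '6' or '0-5'.

    Assumption: positions are zero-based and a range includes both ends.
    """
    if "-" in spec:
        start, end = (int(part) for part in spec.split("-"))
        return data[start:end + 1]
    return data[int(spec)]


leader = "00714cam a2200205 a 4500"                   # made-up sample leader
field_008 = "850101s1985    xxu           000 0 eng d"  # made-up sample 008

print(select_positions(leader, "6"))       # a
print(select_positions(leader, "7"))       # m
print(select_positions(field_008, "0-5"))  # 850101
```

Under these assumptions, "mc-ldr[6]" and "mc-ldr[7]" would pick up single characters of the leader, and "mc-008[0-5]" the first six characters of field 008.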