-Since the existing records in an index cannot be addressed by their
-IDs, it is impossible to delete or modify records when using this method.
-
-<sect1>Indexing with File Record IDs
-
-<p>
-If you have a set of external records that you wish to index, you may
-use the file key feature of the Zebra system. In short, the file key
-methodology uses the paths of the files containing records as their
-unique identifiers. To perform indexing of a directory with file keys,
-again, you specify the top-level directory after the <tt>update</tt>
-command. The command will recursively traverse the directories and
-compare each one with whatever has been indexed before in the same
-directory. If a file is new (not in the previous version of the
-directory), it is inserted into the registers; if a file was already
-indexed and it has been modified since the last insertion, the index
-is also modified; if a file has been removed since the last visit, it
-is deleted from the index.
-
-The resulting system is easy to administer. To delete a record
-you simply have to delete the corresponding file (say, with the
-<tt/rm/ command).
-To force an update of a given file, you may use the <tt>touch</tt>
-command. To add records, create new files (or directories with files).
-For your changes to take effect in the register, you must run
-<tt>zebraidx</tt> with the same directory root again.
-
-To use this method, you must specify <tt>file</tt> as the value
-of <tt>recordId</tt> in the configuration file. In addition, you
-should set <tt>storeKeys</tt> to <tt>1</tt>, since the Zebra
-indexer must save additional information about the keys to each record in order to
-modify the indices correctly at a later time.
-
-For example, to update group <tt>esdd</tt> records below
-<tt>/home/grs</tt> you could type:
-<tscreen><verb>
-$ zebraidx -g esdd update /home/grs
-</verb></tscreen>
-
-The corresponding configuration file includes:
-<tscreen><verb>
-esdd.recordId: file
-esdd.recordType: grs
-esdd.storeKeys: 1
-</verb></tscreen>
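-
-As a rough sketch of the maintenance cycle described above, assuming the
-same <tt>esdd</tt> group and two hypothetical record files below
-<tt>/home/grs</tt> (the file names are invented for illustration), a
-session might look like this:
-<tscreen><verb>
-$ rm /home/grs/obsolete.grs      # this record will be deleted from the index
-$ touch /home/grs/changed.grs    # this record will be indexed again
-$ zebraidx -g esdd update /home/grs
-</verb></tscreen>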
-
-<em>Important note: You cannot start out with a group of records with simple
-indexing (no record IDs, as in the previous section) and then later
-enable file record IDs. Zebra must know from the first time that you
-index the group that
-the files should be indexed with file record IDs.
-</em>
-
-You cannot explicitly delete records when using this method (using the
-<bf/delete/ command to <tt/zebraidx/). Instead,
-you have to delete the files from the file system (or move them
-elsewhere) and then run <tt>zebraidx</tt> with the <bf/update/ command again.
-
-<sect1>Indexing with General Record IDs
-<p>
-When using this method you construct an (almost) arbitrary, internal
-record key based on the contents of the record itself and other system
-information. If you have a group of records that associates an ID with
-each record, this method is convenient. For example, the record may
-contain a title or an ID number that is unique within the group. In either
-case you specify the Z39.50 attribute set and use-attribute location
-in which this information is stored, and the system looks at this
-field to determine the identity of the record.
-
-As before, the record ID is defined by the <tt>recordId</tt> setting
-in the configuration file. The value of the record ID specification
-consists of one or more tokens separated by whitespace. The resulting
-ID is
-represented in the index by concatenating the tokens, separated by the
-character with ASCII value 1.
-
-There are three kinds of tokens:
-<descrip>
-<tag>Internal record info</tag> The token refers to a key that is
-extracted from the record. The syntax of this token is
- <tt/(/ <em/set/ <tt/,/ <em/use/ <tt/)/, where <em/set/ is the
-attribute set ordinal number and <em/use/ is the use value of the attribute.
-<tag>System variable</tag> The system variables are preceded by
-<verb>$</verb> and immediately followed by the system variable name, which
-may be one of
- <descrip>
- <tag>group</tag> Group name.
- <tag>database</tag> Current database specified.
- <tag>type</tag> Record type.
- </descrip>
-<tag>Constant string</tag> A string used as part of the ID, surrounded
- by single or double quotes.
-</descrip>
-
-The sample GILS records that come with the Zebra distribution contain a
-unique ID
-in the Control-Identifier field. This field is mapped to the Bib-1
-use attribute 1007. To use this field as a record ID, specify
-<tt>(1,1007)</tt> as the value of the <tt>recordId</tt> in the
-configuration file. If you have other record types that use
-the same field for a different purpose, you might add the record type (or group or database name)
-to the record ID of the GILS records as well, to prevent matches
-with other types of records. In this case the recordId might be
-set like this:
-<tscreen><verb>
-gils.recordId: $type (1,1007)
-</verb></tscreen>
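-
-A specification may also combine all three kinds of token. The following
-line is only an illustration: the constant string <tt>"gils"</tt> is an
-arbitrary choice and is not taken from the Zebra distribution.
-<tscreen><verb>
-gils.recordId: $database (1,1007) "gils"
-</verb></tscreen>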
-
-As with file record IDs, described in the previous section,
-updating your system is simply a matter of running <tt>zebraidx</tt>
-with the <tt>update</tt> command. However, the update with general
-keys is considerably slower than with file record IDs, since all files
-visited must be (re)read to find their IDs.
-
-You may have noticed that when using the general record IDs
-method, the <tt>update</tt> command can only add new records or modify
-existing ones. If you wish to delete records, you must use the
-<tt>delete</tt> command, with a directory as a parameter.
-This will remove all records that match the files below that root
-directory.
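-
-For instance, assuming a <tt>gils</tt> group and a hypothetical
-directory <tt>/home/gils/withdrawn</tt> that holds the files of the
-records to be removed, the command might look like this:
-<tscreen><verb>
-$ zebraidx -g gils delete /home/gils/withdrawn
-</verb></tscreen>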
-
-<sect1>Register Location<label id="register-location">
-
-<p>
-Normally, the index files that form dictionaries, inverted
-files, record info, etc., are stored in the directory where you run
-<tt>zebraidx</tt>. If you wish to store these, possibly large, files
-somewhere else, you must add the <tt>register</tt> entry to the
-configuration file. Furthermore, the Zebra system allows its file
-structures to
-span multiple file systems, which is useful if a very large number of
-records are stored.
-
-The value of the <tt>register</tt> entry is a sequence of tokens.
-Each token takes the form:
-<tscreen>
-<em>dir</em><tt>:</tt><em>size</em>.
-</tscreen>
-The <em>dir</em> specifies a directory in which index files will be
-stored and the <em>size</em> specifies the maximum size of all
-files in that directory. The Zebra indexer fills each directory
-in the order specified and uses the next specified directory as needed.
-The <em>size</em> is an integer followed by a qualifier
-code: <tt>M</tt> for megabytes or <tt>k</tt> for kilobytes.
-
-For instance, if you have two spare disks :) and the first disk is mounted
-on <tt>/d1</tt> and has 200 MB of free space and the
-second, mounted on <tt>/d2</tt>, has 300 MB, you could
-put this entry in your configuration file:
-<tscreen><verb>
-register: /d1:200M /d2:300M
-</verb></tscreen>
-
-Note that Zebra does not verify that the amount of space specified is
-actually available in the directory (file system) specified - it is
-your responsibility to ensure that enough space is available, and that
-other applications do not use the free space. In a large production system,
-it is recommended that you allocate one or more file systems exclusively
-to the Zebra register files.
-
-<sect1>Safe Updating - Using Shadow Registers<label id="shadow-registers">
-
-<sect2>Description
-
-<p>
-The Zebra server supports updating of the index structures. That is,
-you can add records to databases managed by Zebra without rebuilding
-the entire index. Since this process involves modifying structured
-files with various references between blocks of data in the files, the
-update process is inherently sensitive to system crashes, or to
-process interruptions: Anything but a successfully completed update
-process will leave the register files in an unknown state, and you
-will essentially have no recourse but to re-index everything, or to
-restore the register files from a backup medium. Further, while the
-update process is active, users cannot be allowed to access the
-system, as the contents of the register files may change unpredictably.
-
-You can solve these problems by enabling the shadow register system in
-Zebra. During the updating procedure, <tt/zebraidx/ will temporarily
-write changes to the involved files in a set of &dquot;shadow
-files&dquot;, without modifying the files that are accessed by the
-active server processes. If the update procedure is interrupted by a
-system crash or a signal, you simply repeat the procedure - the
-register files have not been changed or damaged, and the partially
-written shadow files are automatically deleted before the new updating
-procedure commences.
-
-At the end of the updating procedure (or in a separate operation, if
-you so desire), the system enters a &dquot;commit mode&dquot;. First,
-any active server processes are forced to access those blocks that
-have been changed from the shadow files rather than from the main
-register files; the unmodified blocks are still accessed at their
-normal location (the shadow files are not a complete copy of the
-register files - they only contain those parts that have actually been
-modified). If the process is interrupted at any point during the
-commit process, the server processes will continue to access the
-shadow files until you can repeat the commit procedure and complete
-the writing of data to the main register files. You can perform
-multiple update operations to the registers before you commit the
-changes to the system files, or you can execute the commit operation
-at the end of each update operation. When the commit phase has
-completed successfully, any running server processes are instructed to
-switch their operations to the new, operational register, and the
-temporary shadow files are deleted.
-
-<sect2>How to Use Shadow Register Files
-
-<p>
-The first step is to allocate space on your system for the shadow
-files. You do this by adding a <tt/shadow/ entry to the <tt/zebra.cfg/
-file. The syntax of the <tt/shadow/ entry is exactly the same as for
-the <tt/register/ entry (see section <ref name="Register Location"
-id="register-location">). The location of the shadow area should be
-<it/different/ from the location of the main register area (if you
-have specified one - remember that the default register area is the
-working directory of the server and indexing processes).
-
-The following excerpt from a <tt/zebra.cfg/ file shows one example of
-a setup that configures both the main register location and the shadow
-file area. Note that two directories or partitions have been set aside
-for the shadow file area. You can specify any number of directories
-for each of the file areas.
-
-<tscreen><verb>
-register: /d1:500M
-
-shadow: /scratch1:100M /scratch2:200M
-</verb></tscreen>
-
-When shadow files are enabled, an extra command is available at the
-<tt/zebraidx/ command line. In order to make changes to the system
-take effect for the users, you'll have to submit a
-&dquot;commit&dquot; command after a (sequence of) update
-operation(s). You can ask the indexer to commit the changes
-immediately after the update operation:
-
-<tscreen><verb>
-$ zebraidx update /d1/records update /d2/more-records commit
-</verb></tscreen>
-
-Or you can execute multiple updates before committing the changes:
-
-<tscreen><verb>
-$ zebraidx -g books update /d1/records update /d2/more-records
-$ zebraidx -g fun update /d3/fun-records
-$ zebraidx commit
-</verb></tscreen>
-
-If one of the update operations above had been interrupted, the commit
-operation on the last line would fail: <tt/zebraidx/ will not let you
-commit changes that would destroy the running register. You'll have to
-rerun all of the update operations since your last commit operation,
-before you can commit the new changes.
-
-Similarly, if the commit operation fails, <tt/zebraidx/ will not let
-you start a new update operation before you have successfully repeated
-the commit operation. The server processes will keep accessing the
-shadow files rather than the (possibly damaged) blocks of the main
-register files until the commit operation has successfully completed.
-
-You should be aware that update operations may take slightly longer
-when the shadow register system is enabled, since more file access
-operations are involved. Further, while the disk space required for
-the shadow register data is modest for a small update operation, you
-may prefer to disable the system if you are adding a very large number
-of records to an already very large database (we use the terms
-<it/large/ and <it/modest/ very loosely here, since every
-application's perception of size is different). To update the system
-without the use of the shadow files, simply run <tt/zebraidx/ with
-the <tt/-n/ option (note that you do not have to execute the
-<bf/commit/ command of <tt/zebraidx/ when you temporarily disable the
-use of the shadow registers in this fashion). Note also that, just as
-when the shadow registers are not enabled, server processes will be
-barred from accessing the main register while the update procedure
-takes place.
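-
-As a sketch, and reusing the <tt>/d1/records</tt> directory from the
-examples above purely for illustration, an update that bypasses the
-shadow files might be run like this (no <bf/commit/ follows):
-<tscreen><verb>
-$ zebraidx -n update /d1/records
-</verb></tscreen>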
-