<chapter id="record-model">
- <!-- $Id: recordmodel.xml,v 1.1 2002-04-09 13:26:26 adam Exp $ -->
+ <!-- $Id: recordmodel.xml,v 1.3 2002-04-10 14:47:49 heikki Exp $ -->
<title>The Record Model</title>
<para>
structured
record type <literal>grs</literal> as introduced in
<xref linkend="record-types"/>.
+ FIXME - Need to describe the simple string-tag model, or at least
+ refer to it here. -H
</para>
<para>
<para>
This allows Zebra to read
records in the ISO2709 (MARC) encoding standard. In this case, the
- last paramemeter <emphasis>abstract syntax</emphasis> names the
+ last parameter <emphasis>abstract syntax</emphasis> names the
<literal>.abs</literal> file (see below)
which describes the specific MARC structure of the input record as
well as the indexing rules.
<note>
<para>
The indentation used above is used to illustrate how Zebra
- interprets the markup. The indentation, in itself, has no
+ interprets the mark-up. The indentation, in itself, has no
significance to the parser for the canonical input format, which
discards superfluous whitespace.
</para>
<term>FINISH</term>
<listitem>
<para>
- The expression asssociated with this pattern is evaluated
+ The expression associated with this pattern is evaluated
once, before the application terminates. It can be used to release
system resources - typically ones allocated in the
<emphasis>INIT</emphasis> step.
<term>record</term>
<listitem>
<para>
- Begin a new record. The followingparameter should be the
+ Begin a new record. The following parameter should be the
name of the schema that describes the structure of the record, eg.
<literal>gils</literal> or <literal>wais</literal> (see below).
The <literal>begin record</literal> call should precede
<note>
<para>
- Documentation needs extension here about types of nodes - numerical,
+ FIXME! Documentation needs extension here about types of nodes - numerical,
textual, etc., plus the various types of inclusion notes.
</para>
</note>
</para>
<para>
+ FIXME - Need a diagram here, or a simple explanation of how it all hangs together -H
+ </para>
+
+ <para>
<itemizedlist>
<listitem>
Generally, settings are characterized by a single
keyword, identifying the setting, followed by a number of parameters.
Some settings are repeatable (r), while others may occur only once in a
- file. Some settings are optional (o), whicle others again are
+ file. Some settings are optional (o), while others again are
mandatory (m).
</para>
The <emphasis>names</emphasis> parameter is a list of names
by which the tag should be recognized in the input file format.
The names should be separated by slashes (/).
- The <emphasis>type</emphasis> is th recommended datatype of
+ The <emphasis>type</emphasis> is the recommended data type of
the tag.
It should be one of the following:
</para>
<para>
- <emphasis>NOTE: The schema-mapping functions are so far limited to a
+ <emphasis>NOTE: FIXME! The schema-mapping functions are so far limited to a
straightforward mapping of elements. This should be extended with
mechanisms for conversions of the element contents, and conditional
mappings of elements based on the record contents.</emphasis>
</para>
<para>
- <emphasis>NOTE: This will be described better. We're in the process of
+ <emphasis>NOTE: FIXME! This will be described better. We're in the process of
re-evaluating and most likely changing the way that MARC records are
handled by the system.</emphasis>
</para>
(preceded by <literal>x</literal>).
In addition, the combinations
\\, \\r, \\n, \\t, \\s (space — remember that real
- space-characters may ot occur in the value definition), and
- \\ are recognised, with their usual interpretation.
+ space-characters may not occur in the value definition), and
+ \\ are recognized, with their usual interpretation.
</para>
</listitem>
<para>
Curly braces {} may be used to enclose ranges of single
characters (possibly using the escape convention described in the
- preceding point), eg. {a-z} to entroduce the
+ preceding point), eg. {a-z} to introduce the
standard range of ASCII characters.
Note that the interpretation of such a range depends on
the concrete representation in your local, physical character set.
<listitem>
<para>
- SUTRS. Again, the mapping is fairly straighforward. Indentation
+ SUTRS. Again, the mapping is fairly straightforward. Indentation
is used to show the hierarchical structure of the record. All
"GRS" type records support both the GRS-1 and SUTRS
representations.
+ FIXME - What is SUTRS? This should be expanded here.
</para>
</listitem>
abstract syntaxes can be mapped to the SOIF format, although nested
elements are represented by concatenation of the tag names at each
level.
+ FIXME - Is this used anywhere? What is SOIF anyway? -H
</para>
</listitem>