- <section id="record-model-domxml-internal">
- <title>&dom; Internal Record Representation</title>
- <para>When indexing, an &xml; Reader is invoked to split the input
- files into suitable record &xml; pieces. Each record piece is then
- transformed to an &xml; &dom; structure, which is essentially the
- record model. Only &xslt; transformations can be applied during
- index, search and retrieval. Consequently, output formats are
- restricted to whatever &xslt; can deliver from the record &xml;
- structure, be it other &xml; formats, HTML, or plain text. In case
- you have <literal>libxslt1</literal> running with E&xslt; support,
- you can use this functionality inside the &dom;
- filter configuration &xslt; stylesheets.
+ <section id="record-model-domxml-pipeline-store">
+ <title>Store pipeline</title>
+ <para>
+ The <literal>&lt;store&gt;</literal> pipeline takes documents
+ from any common &dom; &xml; format to the &zebra; specific
+ storage &dom; &xml; format.
+ It may consist of zero or more
+ <literal><![CDATA[<xslt stylesheet="path/file.xsl"/>]]></literal>
+ &xslt; transformations, and the outcome is handed to the
+ &zebra; core for deposition into the internal storage system.
+ </para>
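+ <para>
+ As a minimal sketch, a <literal>&lt;store&gt;</literal> pipeline
+ with a single transformation could be written as follows. The
+ stylesheet name <literal>common2store.xsl</literal> is a
+ hypothetical placeholder, not a file shipped with &zebra;.
+ <screen>
+ <![CDATA[
+ <store>
+   <!-- hypothetical stylesheet: common format to storage format -->
+   <xslt stylesheet="common2store.xsl"/>
+ </store>
+ ]]>
+ </screen>
+ </para>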
+ </section>
+
+ <section id="record-model-domxml-pipeline-retrieve">
+ <title>Retrieve pipeline</title>
+ <para>
+ Finally, there may be one or more
+ <literal>&lt;retrieve&gt;</literal> pipeline definitions, each
+ of them again consisting of zero or more
+ <literal><![CDATA[<xslt stylesheet="path/file.xsl"/>]]></literal>
+ &xslt; transformations. These are used for document
+ presentation after search, and take the internal storage &dom;
+ &xml; to the requested output formats during record present
+ requests.
+ </para>
+ <para>
+ Multiple
+ <literal>&lt;retrieve&gt;</literal> pipeline definitions
+ are distinguished by their unique <literal>name</literal>
+ attributes. These names are the literal <literal>schema</literal> or
+ <literal>element set</literal> names used in
+ <ulink url="http://www.loc.gov/standards/sru/srw/">&srw;</ulink>,
+ <ulink url="&url.sru;">&sru;</ulink> and
+ &z3950; protocol queries.
+ </para>
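+ <para>
+ The naming scheme can be sketched as below, assuming two output
+ formats. The names <literal>dc</literal> and
+ <literal>mods</literal> and both stylesheet file names are
+ illustrative assumptions. A client asking for the schema or
+ element set <literal>dc</literal> would be served by the first
+ pipeline, one asking for <literal>mods</literal> by the second.
+ <screen>
+ <![CDATA[
+ <!-- selected by schema / element set name "dc" -->
+ <retrieve name="dc">
+   <xslt stylesheet="store2dc.xsl"/>
+ </retrieve>
+ <!-- selected by schema / element set name "mods" -->
+ <retrieve name="mods">
+   <xslt stylesheet="store2mods.xsl"/>
+ </retrieve>
+ ]]>
+ </screen>
+ </para>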
+ </section>
+
+
+ <section id="record-model-domxml-canonical-index">
+ <title>Canonical Indexing Format</title>
+
+ <para>
+ &dom; &xml; indexing comes in two flavors: pure
+ processing-instruction governed plain &xml; documents, and - very
+ similar to the Alvis filter indexing format - &xml; documents
+ containing &xml; <literal>&lt;record&gt;</literal> and
+ <literal>&lt;index&gt;</literal> instructions from the magic
+ namespace <literal>xmlns:z="http://indexdata.dk/zebra-2.0"</literal>.
+ </para>
+
+ <section id="record-model-domxml-canonical-index-pi">
+ <title>Processing-instruction governed indexing format</title>
+
+ <para>The output of the processing instruction driven
+ indexing &xslt; stylesheets must contain
+ processing instructions named
+ <literal>zebra-2.0</literal>.
+ The output of the &xslt; indexing transformation is then
+ parsed using &dom; methods, and the contained instructions are
+ performed on the <emphasis>elements and their
+ subtrees directly following the processing instructions</emphasis>.
+ </para>
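+ <para>
+ A stylesheet can emit such instructions with the standard
+ &xslt; element <literal>xsl:processing-instruction</literal>.
+ The following is only a sketch: the input element
+ <literal>book</literal>, its <literal>id</literal> attribute
+ and the chosen index names are hypothetical and do not
+ describe any stylesheet shipped with &zebra;.
+ <screen>
+ <![CDATA[
+ <?xml version="1.0" encoding="UTF-8"?>
+ <xsl:stylesheet version="1.0"
+     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
+   <xsl:output method="xml" indent="yes" encoding="UTF-8"/>
+
+   <!-- hypothetical input: <book id="..."><title>...</title></book> -->
+   <xsl:template match="/book">
+     <!-- record-level instruction, placed before the record element -->
+     <xsl:processing-instruction name="zebra-2.0">
+       <xsl:text>record id=</xsl:text>
+       <xsl:value-of select="@id"/>
+     </xsl:processing-instruction>
+     <record>
+       <!-- index instruction applies to the element that follows it -->
+       <xsl:processing-instruction name="zebra-2.0">index any:w title:w</xsl:processing-instruction>
+       <title><xsl:value-of select="title"/></title>
+     </record>
+   </xsl:template>
+ </xsl:stylesheet>
+ ]]>
+ </screen>
+ </para>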
+ <para>
+ For example, the output of the command
+ <screen>
+ xsltproc dom-index-pi.xsl marc-one.xml
+ </screen>
+ might look like this:
+ <screen>
+ <![CDATA[
+ <?xml version="1.0" encoding="UTF-8"?>
+ <?zebra-2.0 record id=11224466 rank=42?>
+ <record>
+ <?zebra-2.0 index control:0?>
+ <control>11224466</control>
+ <?zebra-2.0 index any:w title:w title:p title:s?>
+ <title>How to program a computer</title>
+ </record>
+ ]]>
+ </screen>