<chapter id="examples">
<!-- $Id: examples.xml,v 1.8 2002-10-10 14:27:18 heikki Exp $ -->
<title>Example Configurations</title>
<title>Overview</title>
<literal>zebraidx</literal> and <literal>zebrasrv</literal> are both
driven by a master configuration file, which may refer to other
subsidiary configuration files. By default, they try to use
<filename>zebra.cfg</filename> in the working directory as the
master file; but this can be changed using the <literal>-c</literal>
option to specify an alternative master configuration file.
The master configuration file tells Zebra:
Where to find subsidiary configuration files, including
<literal>default.idx</literal>
which specifies the default indexing rules.
What attribute sets to recognise in searches.
Policy details such as what record type to expect, what
low-level indexing algorithm to use, how to identify potential
duplicate records, etc.
Now let's see what goes in the <literal>zebra.cfg</literal> file
for some example configurations.
<title>Example 1: XML Indexing And Searching</title>
This example shows how Zebra can be used with absolutely minimal
configuration to index a body of
<ulink url="http://www.w3.org/xml/###">XML</ulink>
documents, and search them using
<ulink url="http://www.w3.org/xpath/###">XPath</ulink>
expressions to specify access points.
Go to the <literal>examples/dinosauricon</literal> subdirectory
of the distribution archive.
There you will find a <literal>records</literal> subdirectory,
which contains some raw XML data to be added to the database: in
this case, a single file, <literal>genera.xml</literal>,
which contains information about all the known dinosaur genera as of
Now we need to create the Zebra database, which we do with the
Zebra indexer, <literal>zebraidx</literal>, which is
driven by the <literal>zebra.cfg</literal> configuration file.
For our purposes, we don't need any
special behaviour - we can use the defaults - so we start with a
minimal file that just tells <literal>zebraidx</literal> where to
find the default indexing rules, and how to parse the records:
profilePath: .:../../tab:../../../yaz/tab
That's all you need for a minimal Zebra configuration. Now you can
roll the XML records into the database and build the indexes:
zebraidx update records
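Before indexing, it can help to confirm that every directory named in
<literal>profilePath</literal> actually exists, since that setting is a
colon-separated list of places to look for subsidiary configuration files.
The following is only a sketch: <literal>check_profile_path</literal> is a
hypothetical helper, not part of Zebra, and it assumes that relative entries
are resolved against the current directory, so run it from the directory
where you would run <literal>zebraidx</literal>.

```shell
#!/bin/sh
# Sketch only: check_profile_path is a hypothetical helper, not part of Zebra.
# It reads the first profilePath line of a master configuration file and
# reports, for each colon-separated entry, whether that directory exists.
# Assumption: relative entries are resolved against the current directory.
check_profile_path() {
    cfg=$1
    # Keep only the value that follows "profilePath:".
    path=$(sed -n 's/^profilePath:[[:space:]]*//p' "$cfg" | head -n 1)
    old_ifs=$IFS
    IFS=:
    # With IFS set to a colon, the unquoted expansion splits on colons.
    for dir in $path; do
        if [ -d "$dir" ]; then
            echo "ok      $dir"
        else
            echo "missing $dir"
        fi
    done
    IFS=$old_ifs
}
```

Calling <literal>check_profile_path zebra.cfg</literal> prints one line per
entry, so a mistyped or missing <literal>tab</literal> directory shows up
before the indexer complains about it.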
Now start the server. Like the indexer, its behaviour is
controlled by the
<literal>zebra.cfg</literal> file; and like the indexer, it works
just fine with this minimal configuration.
By default, the server listens on IP port number 9999, although
this can easily be changed - see
<xref linkend="zebrasrv"/>.
Now you can use the Z39.50 client program of your choice to execute
XPath-based boolean queries and fetch the XML records that satisfy
them:
$ yaz-client tcp:@:9999
Z> find @attr 1=/GENUS/MEANING @and lizard earthquakes
<GENUS name="Sauroposeidon" type="with">
<MEANING>lizard Poseidon <LOW>(Greek god of, among other things, earthquakes)</LOW></MEANING>
<SPECIES name="proteles">
<AUTHOR type="vide" name="Franklin" year="2000"></AUTHOR>
<AUTHOR name="Wedel, Cifelli, Sanders"></AUTHOR>
<PLACE name="Oklahoma"></PLACE>
<TIME value="Albian"></TIME>
<LENGTH value="30" q="1"></LENGTH>
<REMAINS content="rib, cervical vertebrae"></REMAINS>
<P> This new <NOMEN name="Brachiosaurus"></NOMEN>-like <LINK content="dinosaur"></LINK>
was perhaps the tallest. With its head raised, it stood 60 feet (nearly
20 m) tall. </P>
<idzebra xmlns="http://www.indexdata.dk/zebra/">
<size>593</size>
<localnumber>891</localnumber>
<filename>records/genera.xml</filename>
Now wasn't that easy?
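The search command in the session above is written in prefix notation:
<literal>@attr 1=/GENUS/MEANING</literal> selects the access point, and
<literal>@and</literal> takes the two terms that follow it as operands.
As a rough sketch of how such a command is put together, the hypothetical
shell helper <literal>pqf_and</literal> below (not part of Zebra or YAZ)
builds one from an access point and any number of single-word terms, nesting
one <literal>@and</literal> per extra term.

```shell
#!/bin/sh
# Sketch only: pqf_and is a hypothetical helper, not part of Zebra or YAZ.
# It prints a yaz-client "find" command that requires every given term
# at the given access point. Terms containing spaces are not handled.
pqf_and() {
    if [ "$#" -lt 2 ]; then
        echo "usage: pqf_and ACCESS_POINT TERM..." >&2
        return 1
    fi
    access_point=$1
    query=$2
    shift 2
    # Each further term wraps the query so far in one more @and.
    for term in "$@"; do
        query="@and $query $term"
    done
    printf 'find @attr 1=%s %s\n' "$access_point" "$query"
}

pqf_and /GENUS/MEANING lizard earthquakes
```

With the two terms shown, this prints the same <literal>find</literal>
command that was typed at the <literal>Z></literal> prompt above.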
<sect1 id="example2">
<title>Example 2: Supporting Z39.50 Searches</title>
You may have noticed as <literal>zebraidx</literal> was building
the database that it issued a warning, which we ignored at the
time:
$ zebraidx update records
00:45:46-08/10: ../../index/zebraidx(5016) [warn] records/genera.xml:0 Couldn't open GENUS.abs [No such file or directory]
<!-- FIXME ### This needs more text -->
The master configuration file, <literal>zebra.cfg</literal>,
which is as short and simple as it can be:
# $Header: /home/cvsroot/idis/doc/examples.xml,v 1.8 2002-10-10 14:27:18 heikki Exp $
# Bare-bones master configuration file for Zebra
profilePath: .:../../tab:../../../yaz/tab
Apart from the comments, which are ignored, all this specifies is
that the server should recognise the attribute set described in
<literal>bib1.att</literal>.
### What is an attribute set?
The BIB-1 attribute set configuration file,
<literal>bib1.att</literal>, which is also as short as possible:
# $Header: /home/cvsroot/idis/doc/examples.xml,v 1.8 2002-10-10 14:27:18 heikki Exp $
# Bare-bones BIB-1 attribute set file for Zebra
Apart from the comments, all this specifies is that the reference of
the attribute set described by this file is
<literal>Bib-1</literal>, a name recognised by the system as
referring to a well-known opaque identifier that is transmitted
by clients as part of their searches.
### Yeuch! Surely we can say that better!
### Can't we somehow say this trivial thing in the main
The simplest hello-world example could go like this:
<title>The art of motorcycle maintenance</title>
<subject scheme="Dewey">zen</subject>
f @attr 1=/book/title motorcycle
f @attr 1=/book/subject[@scheme=Dewey] zen
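To try these two searches, the record needs to live in a file that
<literal>zebraidx</literal> can index. The sketch below writes it out; the
enclosing <literal>book</literal> element is inferred from the
<literal>/book/title</literal> access point, and the helper name
<literal>make_hello_record</literal> and the file name
<literal>book.xml</literal> are made up for illustration.

```shell
#!/bin/sh
# Sketch only: make_hello_record and book.xml are illustrative names.
# Writes the hello-world record into DIR/records and prints its path.
# Assumption: the record's root element is <book>, as the queries suggest.
make_hello_record() {
    dir=$1
    mkdir -p "$dir/records" || return 1
    cat > "$dir/records/book.xml" <<'EOF'
<book>
 <title>The art of motorcycle maintenance</title>
 <subject scheme="Dewey">zen</subject>
</book>
EOF
    echo "$dir/records/book.xml"
}
```

After <literal>make_hello_record .</literal>, running
<literal>zebraidx update records</literal> as in Example 1 should pick the
file up, given a suitable <literal>zebra.cfg</literal> in the same directory.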
If you suddenly decide you want broader interop, you can add
an abs file (more or less like this):
elm (2,1) title title
elm (2,21) subject subject
How to include images:
<imagedata fileref="system.eps" format="eps">
<imagedata fileref="system.gif" format="gif">
<phrase>The Multi-Lingual Search System Architecture</phrase>
<emphasis role="strong">
The Multi-Lingual Search System Architecture.
Network connections across local area networks are
represented by straight lines, and those over the
internet by jagged lines.
Where the three <*object> thingies inside the top-level <mediaobject>
are decreasingly preferred versions to include, depending on what the
rendering engine can handle. I generated the EPS version of the image
by exporting a line-drawing done in TGIF, then converted that to the
GIF using a shell-script called "epstogif" which used an appallingly
baroque sequence of conversions, which I would prefer not to pollute
the Zebra build environment with:
# Yes, what follows is stupidly convoluted, but I can't find a
# more straightforward path from the EPS generated by tgif's
# "Print" command into a browser-friendly format.
# Strip only a trailing ".eps", not one that appears mid-name.
file=`echo "$1" | sed 's/\.eps$//'`
ps2pdf "$1" "$file".pdf
pdftopbm "$file".pdf "$file"
pnmscale 0.50 < "$file"-000001.pbm | pnmcrop | ppmtogif
rm -f "$file".pdf "$file"-000001.pbm
<!-- Keep this comment at the end of the file
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-parent-document: "zebra.xml"
sgml-local-catalogs: nil
sgml-namecase-general:t