<chapter id="introduction">
<title>Introduction</title>
<title>Overview</title>
The Zebra system is a fielded free-text indexing and retrieval engine with a
Z39.50 frontend. You can use any commercial or freeware Z39.50 client
to access data stored in Zebra.
The Zebra server is our first step towards the development of a fully
configurable, open information system. Eventually, it will be paired
with a powerful Z39.50 client to support complex information
management tasks within almost any application domain. We're making
the server available now because it's no fun to be in the open
information retrieval business all by yourself. We want to allow
people with interesting data to make it available in interesting
ways, without having to start out by implementing yet another
protocol stack from scratch.
This document is an introduction to the Zebra system. It will tell you
how to compile the software, and how to prepare your first database.
It also explains how the server can be configured to give you the
functionality that you need.
If you find the software interesting, you should join the support
mailing list by sending email to
<literal>zebra-request@indexdata.dk</literal>.
<title>Features</title>
This is a list of some of the most important features of the
Supports updating - records can be added and deleted without
rebuilding the index from scratch.
The update procedure is tolerant of crashes or hard interrupts
during register updating - registers can be reconstructed following
Registers can be safely updated even while users are accessing
Supports large databases - files for indices, etc. can be
automatically partitioned over multiple disks.
Supports arbitrarily complex records - the base input format is an
SGML-like syntax which allows nested (structured) data elements, as
well as variant forms of data.
Supports arbitrary storage formats. A system of input filters driven by
regular expressions allows you to easily process most ASCII-based
data formats. SGML, ISO2709 (MARC), and raw text are also supported.
Supports Boolean queries as well as relevance-ranking (free-text)
searching. Right truncation and masking in terms are supported, as
well as full regular expressions.
Supports multiple concrete syntaxes
for record exchange (depending on the configuration): GRS-1, SUTRS,
ISO2709 (*MARC). Records can be mapped between record syntaxes and
Supports approximate matching in registers (i.e., spelling mistakes,
Protocol facilities: Init, Search, Retrieve, Browse and Sort.
Piggy-backed presents are honored in the search-request.
Named result sets are supported.
Easily configured to support different application profiles, with
tables for attribute sets, tag sets, and abstract syntaxes.
Additional tables control facilities such as element mappings to
different schemas (e.g., GILS-to-USMARC).
Complex composition specifications using Espec-1 are partially
supported (simple element requests only).
Element Set Names are defined using the Espec-1 capability of the
system, and are given in configuration files as simple element
requests (and possibly variant requests).
Some variant support (not fully implemented yet).
Using the YAZ toolkit for the protocol implementation, the
server can utilise a plug-in XTI/mOSI implementation (not included) to
provide SR services over an OSI stack, as well as Z39.50 over TCP/IP.
Zebra runs on most Unix-like systems as well as Windows NT. A binary
distribution for Windows NT is forthcoming; so far, installation on
that platform requires MSVC++ to compile the system (we use version 5.0).
<title>Future Work</title>
These are some of the plans that we have for the software in the near
and far future, ordered roughly by relative importance.
Items marked with an asterisk will be implemented before the
* Complete the support for variants.
* Finalize the data element <emphasis>include</emphasis> facility
to support multimedia data elements in records.
Add more sophisticated relevance ranking mechanisms.
Add support for soundex and stemming.
Add relevance <emphasis>feedback</emphasis> support.
Complete EXPLAIN support.
Add support for very large records by implementing segmentation and/or
Support the Item Update extended service of the protocol.
We want to add a management system that allows you to
control your databases and configuration tables from a graphical
Programmers thrive on user feedback. If you are interested in a
facility that you don't see mentioned here, or if there's something
you think we could do better, please send us an email.
If you think it's all really neat, you're welcome to drop us a line
saying that, too. You'll find contact info at the end of this file.
<!-- Keep this comment at the end of the file
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-parent-document: "zebra.xml"
sgml-local-catalogs: nil
sgml-namecase-general:t