From: Mike Taylor
Date: Wed, 28 Aug 2002 08:14:47 +0000 (+0000)
Subject: Rolling tweaks. Nothing earth-shattering yet, though I have
X-Git-Tag: ZEBRA.1.3.2~48
X-Git-Url: http://sru.miketaylor.org.uk/cgi-bin?a=commitdiff_plain;h=79e9818dfb6b9a0a04bdd6bc6467c8dae3b8f493;p=idzebra-moved-to-github.git

Rolling tweaks. Nothing earth-shattering yet, though I have
added an initial Applications section to introduction.xml
---

diff --git a/doc/administration.xml b/doc/administration.xml
index 417a3da..5ebfcd3 100644
--- a/doc/administration.xml
+++ b/doc/administration.xml
@@ -1,5 +1,5 @@
-
+
 Administrating Zebra
@@ -433,7 +433,7 @@
 in the configuration file. In addition, you should set storeKeys to 1, since the Zebra indexer must save additional information about the contents of each record
- in order to modify the indices correctly at a later time.
+ in order to modify the indexes correctly at a later time.
diff --git a/doc/introduction.xml b/doc/introduction.xml
index 6553bc6..b9a68d2 100644
--- a/doc/introduction.xml
+++ b/doc/introduction.xml
@@ -1,5 +1,5 @@
-
+
 Introduction
@@ -10,34 +10,40 @@
 Zebra is a high-performance, general-purpose structured text indexing and retrieval engine. It reads structured records in a
- variety of input formats (eg. email, XML, MARC) and allows access
- to them through exact boolean search expressions and
- relevance-ranked free-text queries.
-
+ variety of input formats (eg. email, XML, MARC) and provides access
+ to them through a powerful combination of boolean search
+ expressions and relevance-ranked free-text queries.
+
-
- Zebra supports large databases (more than ten gigabytes of data,
- tens of millions of records). It supports safe, incremental
- database updates on live systems. You can access data stored in
- Zebra using a variety of Index Data tools (eg. YAZ and PHP/YAZ) as
- well as commercial and freeware Z39.50 clients and toolkits.
-
+
+ Zebra supports large databases (tens of millions of records,
+ tens of gigabytes of data). It allows safe, incremental
+ database updates on live systems. Because Zebra supports
+ the industry-standard information retrieval protocol, Z39.50,
+ you can search Zebra databases using an enormous variety of
+ programs and toolkits, both commercial and free, which understand
+ this protocol. Application libraries are available to allow
+ bespoke clients to be written in Perl, C, C++, Java, Tcl, Visual
+ Basic, Python, PHP and more - see
+ the ZOOM web site
+ for more information on some of these client toolkits.
+
- This document is an introduction to the Zebra system. It will tell you
- how to compile the software, and how to prepare your first database.
- It also explains how the server can be configured to give you the
+ This document is an introduction to the Zebra system. It explains
+ how to compile the software, how to prepare your first database,
+ and how to configure the server to give you the
 functionality that you need.
-
- If you find the software interesting, you should visit the
-
- Zebra web site, where you can join the
+ If you use Zebra, you should visit its
+ web site,
+ where you can join the
 mailing-list by sending email to
+ ###
 zebra-subscribe@mailman.indexdata.dk
@@ -55,7 +61,7 @@
- Supports large databases - files for indices, etc. can be
+ Supports large databases - files for indexes, etc. can be
 automatically partitioned over multiple disks.
@@ -199,16 +205,34 @@
 DADS - the DTV Article Database Service
- DADS is a huge database of ### records, allowing students and
- researchers at DTU (###) to search and order articles from several
- different databases at once. The database contains
- literature on all engineering subjects. It's available on-line
- through a web gateway at
+ DADS is a huge database of more than ten million records, totally
+ over ten gigabytes of data.
The records are metadata about academic
+ journal articles, primarily scientific; about 10% of these
+ metadata records link to the full text of the articles they
+ describe, a body of about a terabyte of information (although the
+ full text is not indexed.)
+
+
+ It allows students and researchers at DTU (###) to find and order
+ articles from multiple databases in a single query. The database
+ contains literature on all engineering subjects. It's available
+ on-line through a web gateway at
 http://www.dtv.dk/search/index_e.htm
- though only to members of the university.
+ though currently only to registered users.
+
+
+
+
+ Various web indexes
+
+ Zebra has been used by a variety of institutions to construct
+ indexes of large web sites, typically in the region of tens of
+ millions of pages. In this role, it functions somewhat similarly
+ to the engine of google or altavista, but for a selected intranet
+ or subset of the whole Web.
- ### Much more information needed.
+ ### examples, details and numbers, please!