X-Git-Url: http://sru.miketaylor.org.uk/?a=blobdiff_plain;f=doc%2Fserver.xml;h=f09b2cfc35d6c8195bc7178618768015efaabae1;hb=27742a4ea82e9b3494c166203b06d1d7c48da923;hp=993fe63f60f73e6b51567e73cbcc785b0125d4ad;hpb=d2e692248eac6469ef7a3a3f8044010cb5cc1da7;p=idzebra-moved-to-github.git
diff --git a/doc/server.xml b/doc/server.xml
index 993fe63..f09b2cf 100644
--- a/doc/server.xml
+++ b/doc/server.xml
@@ -1,5 +1,5 @@
-
+
The Z39.50 Server
@@ -12,6 +12,72 @@
can be run (inetd, nt service, stand-alone program, daemon...) -H
-->
+
+
+
+ Description
+ Zebra is a high-performance, general-purpose structured text indexing
+ and retrieval engine. It reads structured records in a variety of input
+ formats (e.g. email, XML, MARC) and allows access to them through exact
+ boolean search expressions and relevance-ranked free-text queries.
+
+
+ zebrasrv is the Z39.50 and SRW/U frontend
+ server for the Zebra indexer.
+
+
+ On Unix you can run the zebrasrv
+ server from the command line and put it
+ in the background. It may also operate under the inet daemon.
+ On WIN32 you can run the server as a console application or
+ as a WIN32 Service.
+
+
+
+
+ Synopsis
+ &zebrasrv-synopsis;
+
+
+
+ Options
+
+
+ The options for zebrasrv are the same
+ as those for YAZ' yaz-ztest.
+ Option -c specifies a Zebra configuration
+ file; if omitted, zebra.cfg is read.
+
+
+ &zebrasrv-options;
+
+
+ Files
+
+ zebra.cfg
+
+
+ See Also
+
+
+ zebraidx
+ 1
+ ,
+
+ yaz-ztest
+ 8
+
+
+
+ The Zebra software is Copyright Index Data
+ http://www.indexdata.dk
+ and distributed under the
+ GPLv2 license.
+
+
+
+
+
Z39.50 Protocol Support and Behavior
@@ -476,8 +479,8 @@
-
-
+
+
Present
@@ -531,8 +534,408 @@
timeout.
+
+
+ Explain
+
+ Zebra maintains a "classic"
+ Explain database
+ on the side.
+ This database is called IR-Explain-1 and can be
+ searched using attribute Exp-1.
+
+
+ The records in the explain database are of type
+ grs.sgml and can be retrieved as
+ SUTRS, XML, GRS-1 and ASN.1 Explain.
+
+
+ Classic Explain only defines retrieval of Explain information
+ via ASN.1. Practically no Z39.50 clients support this. Fortunately
+ they don't have to, since Zebra allows retrieval of this information
+ in the other formats.
+
+
+ The root element for the Explain grs.sgml records is
+ explain, thus
+ explain.abs is used for indexing.
+
+
+
+ Zebra must be able to locate
+ explain.abs in order to index the Explain
+ records properly. Zebra will work without it but the information
+ will not be searchable.
+
+
+
+ The following Explain categories are supported: CategoryList, TargetInfo,
+ DatabaseInfo, AttributeDetails.
+
+
+ The following Explain search attributes are supported:
+ ExplainCategory (1), DatabaseName (3), DateAdded (9), DateChanged (10).
+ See tab/explain.att for more information.
+
+
+
+ Example searches
+
+
+ List supported categories:
+
+ @attr exp1 1=1 categorylist
+
+
+
+
+ Get targetinfo
+
+ @attr exp1 1=1 targetinfo
+
+
+
+
+ Get databaseinfo record for database Default.
+
+ @and @attr exp1 1=1 databaseinfo @attr exp1 1=3 Default
+
+
+
+
+
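+ The example searches above all follow one pattern, which can be
+ sketched in a few lines of Python. This is only an illustration:
+ the helper names explain_term and explain_query are hypothetical,
+ and the attribute numbers are the ones listed earlier in this section.
+

```python
from typing import Optional

# Attribute numbers as listed above (see tab/explain.att).
EXPLAIN_ATTRS = {
    "ExplainCategory": 1,
    "DatabaseName": 3,
    "DateAdded": 9,
    "DateChanged": 10,
}


def explain_term(attr_name: str, term: str) -> str:
    """Return one PQF operand that searches a single Exp-1 attribute."""
    return f"@attr exp1 1={EXPLAIN_ATTRS[attr_name]} {term}"


def explain_query(category: str, database: Optional[str] = None) -> str:
    """Build a PQF query for an Explain category, optionally per database."""
    query = explain_term("ExplainCategory", category)
    if database is not None:
        # PQF is prefix notation: the operator comes before both operands.
        query = f"@and {query} {explain_term('DatabaseName', database)}"
    return query


print(explain_query("categorylist"))
print(explain_query("databaseinfo", "Default"))
```

+ The two printed strings match the first and third example searches
+ above and can be sent to the IR-Explain-1 database from any client
+ that accepts PQF.
+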
+
+
+
+ The SRU/SRW Server
+
+ In addition to Z39.50, Zebra supports the more recent and
+ web-friendly IR protocol SRU, described at
+ .
+ SRU is ``Search/Retrieve via URL'', a simple, REST-like protocol
+ that uses HTTP GET to request search responses. The request
+ itself is made of parameters such as
+ query,
+ startRecord,
+ maximumRecords
+ and
+ recordSchema;
+ the response is an XML document containing hit-count, result-set
+ records, diagnostics, etc. SRU can be thought of as a re-casting
+ of Z39.50 semantics in web-friendly terms; or as a standardisation
+ of the ad-hoc query parameters used by search engines such as Google
+ and AltaVista; or as a superset of A9's OpenSearch (which it
+ predates).
+
+
+ Zebra further supports SRW, described at
+ .
+ SRW is the ``Search/Retrieve Web Service'', a SOAP-based alternative
+ implementation of the abstract protocol that SRU implements as HTTP
+ GET requests. In SRW, requests are encoded as XML documents which
+ are posted to the server. The responses are identical to those
+ returned by SRU servers, except that they are wrapped in several
+ layers of SOAP envelope.
+
+
+ Zebra supports all three protocols - Z39.50, SRU and SRW - on the
+ same port, recognising what protocol is used by each incoming
+ request and handling it accordingly. This is achieved through
+ the use of Deep Magic; civilians are warned not to stand too close.
+
+
+ From here on, ``SRU'' is used to indicate both the SRU and SRW
+ protocols, as they are identical except for the transport used for
+ the protocol packets and Zebra's support for them is equivalent.
+
+
+
+ Running the SRU Server (zebrasrv)
+
+ Because Zebra supports all three protocols on one port, it would
+ seem to follow that the SRU server is run in the same way as
+ the Z39.50 server, as described above. This is true, but only in
+ an uninterestingly vacuous way: a Zebra server run in this manner
+ will indeed recognise and accept SRU requests; but since it
+ doesn't know how to handle the CQL queries that these protocols
+ use, all it can do is send failure responses.
+
+
+
+ It is possible to cheat, by having SRU search Zebra with
+ a PQF query instead of CQL, using the
+ x-pquery
+ parameter instead of
+ query.
+ This is a
+ non-standard extension
+ of CQL, and a
+ very naughty
+ thing to do, but it does give you a way to see Zebra serving SRU
+ ``right out of the box''. If you start your favourite Zebra
+ server in the usual way, on port 9999, then you can send your web
+ browser to:
+
+
+ http://localhost:9999/Default?version=1.1
+ &operation=searchRetrieve
+ &x-pquery=mineral
+ &startRecord=1
+ &maximumRecords=1
+
+
+ This will display the XML-formatted SRU response that includes the
+ first record in the result-set found by the query
+ mineral. (For clarity, the SRU URL is shown
+ here broken across lines, but the lines should be joined together
+ to make a single-line URL for the browser to submit.)
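+
+ One way to produce the joined URL without hand-editing is to let a
+ library assemble it. The following Python sketch assumes the same
+ host, port and database name as the example above; it only builds
+ the string and does not contact a server.
+

```python
from urllib.parse import urlencode

# Host, port and database name are those of the example above.
BASE_URL = "http://localhost:9999/Default"

# Parameters in the order shown in the text.
params = [
    ("version", "1.1"),
    ("operation", "searchRetrieve"),
    ("x-pquery", "mineral"),  # non-standard PQF extension parameter
    ("startRecord", "1"),
    ("maximumRecords", "1"),
]

# urlencode joins the pairs with "&" and percent-encodes the values.
url = BASE_URL + "?" + urlencode(params)
print(url)
```

+ With a Zebra server listening on that port, passing the resulting
+ string to urllib.request.urlopen should fetch the same XML response
+ that the browser displays.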
+
+
+
+ In order to turn on Zebra's support for CQL queries, it's necessary
+ to have the YAZ generic front-end (which Zebra uses) translate them
+ into the Z39.50 Type-1 query format that is used internally. And
+ to do this, the generic front-end's own configuration file must be
+ used. This file is described
+ elsewhere;
+ the salient point for SRU support is that
+ zebrasrv
+ must be started with the
+ -f frontendConfigFile
+ option rather than the
+ -c zebraConfigFile
+ option,
+ and that the front-end configuration file must include references
+ to both the Zebra configuration file and the CQL-to-PQF
+ translator configuration file.
+
+
+ A minimal front-end configuration file that does this would read as
+ follows:
+
+
+<yazgfs>
+  <server>
+    <config>zebra.cfg</config>
+    <cql2rpn>../../tab/pqf.properties</cql2rpn>
+  </server>
+</yazgfs>
+
+
+ The
+ <config>
+ element contains the name of the Zebra configuration file that was
+ previously specified by the
+ -c
+ command-line argument, and the
+ <cql2rpn>
+ element contains the name of the CQL properties file specifying how
+ various CQL indexes, relations, etc. are translated into Type-1
+ queries.
+
+
+ A Zebra server running with such a configuration can then be
+ queried using proper, conformant SRU URLs with CQL queries:
+
+
+ http://localhost:9999/Default?version=1.1
+ &operation=searchRetrieve
+ &query=title=utah and description=epicent*
+ &startRecord=1
+ &maximumRecords=1
+
+
+
+
+ SRU and SRW Protocol Support and Behavior
+
+ Zebra running as an SRU server supports SRU version 1.1, including
+ CQL version 1.1. In particular, it provides support for the
+ following elements of the protocol.
+
+
+
+ Search and Retrieval
+
+ Zebra fully supports SRU's core
+ searchRetrieve
+ operation, as described at
+
+
+
+ One of the great strengths of SRU is that it mandates a standard
+ query language, CQL, and that all conforming implementations can
+ therefore be trusted to correctly interpret the same queries. It
+ is with some shame, then, that we admit that Zebra also supports
+ an additional query language, our own Prefix Query Format (PQF,
+ ).
+ A PQF query is submitted by using the extension parameter
+ x-pquery,
+ in which case the
+ query
+ parameter must be omitted, which makes the request not valid SRU.
+ Please don't do this.
+
+
+
+
+ Scan
+
+ Zebra does not support SRU's
+ scan
+ operation, as described at
+
+
+
+ This is a rather embarrassing surprise as the pieces are all
+ there: Z39.50 scan is supported, and SRU scan requests are
+ recognised and diagnosed. To add further to the embarrassment, a
+ mutant form of SRU scan is supported, using
+ the non-standard x-pScanClause parameter in
+ place of the standard scanClause to scan on a
+ PQF query clause.
+
+
+
+
+ Explain
+
+ Zebra fully supports SRU's core
+ explain
+ operation, as described at
+
+
+
+ The ZeeRex record explaining a database may be requested either
+ with a fully fledged SRU request (with
+ operation=explain
+ and version-number specified)
+ or with a simple HTTP GET at the server's basename.
+ The ZeeRex record returned in response is the one embedded
+ in the YAZ Frontend Server configuration file that is described in the
+ Virtual Hosts documentation.
+
+
+ Unfortunately, the data found in the
+ CQL-to-PQF text file must be added by hand to the explain
+ section of the YAZ Frontend Server configuration file in order
+ to provide a suitable explain record.
+ Too bad, but this is all extremely
+ new alpha stuff, and a lot of work has yet to be done.
+
+
+ There is no linkage whatsoever between the Z39.50 explain model
+ and the SRU/SRW explain response (well, at least none implemented
+ in Zebra, that is). Zebra does not provide a means of obtaining
+ the ZeeRex record using Z39.50.
+
+
+
+
+ Some SRU Examples
+
+ Surf into http://localhost:9999
+ to get an explain response, or use
+
+
+
+ See number of hits for a query
+
+
+
+ Fetch records 5-7 in Dublin Core format
+
+
+
+ Even search using PQF queries using the extended naughty
+ verb x-pquery
+
+
+
+ Or scan indexes using the extended extremely naughty
+ verb x-pScanClause
+
+ Don't do this in production code!
+ But it's a great fast debugging aid.
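+
+ Requests of the kinds listed above (explain, hit count, a range of
+ records) could be built along the following lines. This is a
+ sketch under assumptions: the /Default database path, the dc
+ schema name and the use of maximumRecords=0 to ask for the hit
+ count alone are illustrative choices, and sru_url and
+ record_window are hypothetical helper names.
+

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

BASE_URL = "http://localhost:9999/Default"  # assumed host, port and database


def sru_url(operation: str, **params: str) -> str:
    """Build a single-line SRU 1.1 GET URL with percent-encoded values."""
    query = {"version": "1.1", "operation": operation, **params}
    # quote_via=quote encodes spaces as %20 rather than "+".
    return BASE_URL + "?" + urlencode(query, quote_via=quote)


def record_window(first: int, last: int) -> dict:
    """Translate an inclusive, 1-based record range into SRU parameters."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return {"startRecord": str(first), "maximumRecords": str(last - first + 1)}


CQL = "title=utah and description=epicent*"  # query reused from earlier text

explain = sru_url("explain")
hits_only = sru_url("searchRetrieve", query=CQL, maximumRecords="0")
dublin_core = sru_url("searchRetrieve", query=CQL, recordSchema="dc",
                      **record_window(5, 7))

for url in (explain, hits_only, dublin_core):
    print(url)

# Decoding the URL again recovers the original CQL query unchanged.
print(parse_qs(urlsplit(dublin_core).query)["query"][0])
```

+ Records 5-7 are three records, so the range becomes startRecord=5
+ with maximumRecords=3. The schema names a server actually accepts
+ depend on its configuration.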
+
+
+
+
+ Initialization, Present, Sort, Close
+
+ In the Z39.50 protocol, Initialization, Present, Sort and Close
+ are separate operations. In SRU, however, these operations do not
+ exist.
+
+
+
+
+ SRU has no explicit initialization handshake phase, but
+ commences immediately with searching, scanning and explain
+ operations.
+
+
+
+
+ Neither does SRU have a close operation, since the protocol is
+ stateless and each request is self-contained. (It is true that
+ multiple SRU request/response pairs may be implemented as
+ multiple HTTP request/response pairs over a single persistent
+ TCP/IP connection; but the closure of that connection is not a
+ protocol-level operation.)
+
+
+
+
+ Retrieval in SRU is part of the
+ searchRetrieve operation, in which a search
+ is submitted and the response includes a subset of the records
+ in the result set. There is no direct analogue of Z39.50's
+ Present operation which requests records from an established
+ result set. In SRU, this is achieved by sending a subsequent
+ searchRetrieve request with the query
+ cql.resultSetId=id where
+ id is the identifier of the previously
+ generated result-set.
+
+
+
+
+ Sorting in CQL is done within the
+ searchRetrieve operation - in v1.1, by an
+ explicit sort parameter, but the forthcoming
+ v1.2 or v2.0 will most likely use an extension of the query
+ language, CQL, for sorting: see
+
+
+
+
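+ The Present-like retrieval described in the list above can be
+ sketched as follows. The response document here is a hand-written
+ stand-in with invented values, not output captured from Zebra, and
+ whether a given server returns a resultSetId at all is an
+ assumption; result_set_id and present_url are hypothetical helper
+ names.
+

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, urlencode

SRW_NS = {"srw": "http://www.loc.gov/zing/srw/"}  # SRU 1.1 response namespace

# Hand-written stand-in for a server response; the values are invented.
SAMPLE_RESPONSE = """\
<srw:searchRetrieveResponse xmlns:srw="http://www.loc.gov/zing/srw/">
  <srw:version>1.1</srw:version>
  <srw:numberOfRecords>42</srw:numberOfRecords>
  <srw:resultSetId>rs-1</srw:resultSetId>
</srw:searchRetrieveResponse>
"""


def result_set_id(response_xml: str):
    """Return the resultSetId from an SRU response, or None if absent."""
    root = ET.fromstring(response_xml)
    return root.findtext("srw:resultSetId", namespaces=SRW_NS)


def present_url(base_url: str, rs_id: str, start: int, count: int) -> str:
    """Build a searchRetrieve URL that reads from an existing result set."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": f"cql.resultSetId={rs_id}",
        "startRecord": str(start),
        "maximumRecords": str(count),
    }
    return base_url + "?" + urlencode(params, quote_via=quote)


rs_id = result_set_id(SAMPLE_RESPONSE)
print(present_url("http://localhost:9999/Default", rs_id, 11, 10))
```

+ A client would fall back to repeating the original query when the
+ response carries no resultSetId.
+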
+
+ It can be seen, then, that while Zebra operating as an SRU server
+ does not provide the same set of operations as when operating as a
+ Z39.50 server, it does provide equivalent functionality.
+
+
+
+
+