X-Git-Url: http://sru.miketaylor.org.uk/?a=blobdiff_plain;ds=sidebyside;f=doc%2Fbook.xml;h=385f76328d3fad1a1688f29b239b465e6f2dd2b9;hb=92d556a79d978f878f90ef01b97c5b313dd01e15;hp=d6e6f8d355072044c5dcdbc43f20e069925f8f38;hpb=8fc15e69384e20bb9c305a84683c36311bdec9f3;p=metaproxy-moved-to-github.git
diff --git a/doc/book.xml b/doc/book.xml
index d6e6f8d..385f763 100644
--- a/doc/book.xml
+++ b/doc/book.xml
@@ -1,4 +1,4 @@
-
+
Metaproxy - User's Guide and Reference
@@ -9,16 +9,20 @@
2006
- Index Data
+ Index Data ApS
Metaproxy is a universal router, proxy and encapsulated
metasearcher for information retrieval protocols. It accepts,
processes, interprets and redirects requests from IR clients using
- standard protocols such as ANSI/NISO Z39.50 (and in the future SRU
- and SRW), as well as functioning as a limited
- HTTP server. Metaproxy is configured by an XML file which
+ standard protocols such as
+ ANSI/NISO Z39.50
+ (and in the future SRU
+ and SRW), as
+ well as functioning as a limited
+ HTTP server.
+ Metaproxy is configured by an XML file which
specifies how the software should function in terms of routes that
the request packets can take through the proxy, each step on a
route being an instantiation of a filter. Filters come in many
@@ -33,6 +37,16 @@
should not at this stage redistribute the code without explicit
written permission from the copyright holders, Index Data ApS.
+
+
+
+
+
+
+
+
+
+
@@ -40,39 +54,55 @@
Introduction
-
- Metaproxy
- is a standalone program that acts as a universal router, proxy and
- encapsulated metasearcher for information retrieval protocols such
- as Z39.50, and in the future SRU and SRW. To clients, it acts as a
- server of these
- protocols: it can be searched, records can be retrieved from it,
- etc. To servers, it acts as a client: it searches in them,
- retrieves records from them, etc. it satisfies its clients'
- requests by transforming them, multiplexing them, forwarding them
- on to zero or more servers, merging the results, transforming
- them, and delivering them back to the client. In addition, it
- acts as a simple HTTP server; support for further protocols can be
- added in a modular fashion, through the creation of new filters.
-
-
- Anything goes in!
- Anything goes out!
- Cold bananas, fish, pyjamas,
- Mutton, beef and trout!
+
+ Metaproxy
+ is a standalone program that acts as a universal router, proxy and
+ encapsulated metasearcher for information retrieval protocols such
+ as Z39.50, and in the future
+ SRU and SRW.
+ To clients, it acts as a server of these protocols: it can be searched,
+ records can be retrieved from it, etc.
+ To servers, it acts as a client: it searches in them,
+ retrieves records from them, etc. It satisfies its clients'
+ requests by transforming them, multiplexing them, forwarding them
+ on to zero or more servers, merging the results, transforming
+ them, and delivering them back to the client. In addition, it
+ acts as a simple HTTP server; support
+ for further protocols can be added in a modular fashion, through the
+ creation of new filters.
+
+
+ Anything goes in!
+ Anything goes out!
+ Fish, bananas, cold pyjamas,
+ Mutton, beef and trout!
- attributed to Cole Porter.
-
-
- Metaproxy is a more capable alternative to
- YAZ Proxy,
- being more powerful, flexible, configurable and extensible. Among
- its many advantages over the older, more pedestrian work are
- support for multiplexing (encapsulated metasearching), routing by
- database name, authentication and authorisation and serving local
- files via HTTP. Equally significant, its modular architecture
- facilitites the creation of pluggable modules implementing further
- functionality.
-
+
+
+ Metaproxy is a more capable alternative to
+ YAZ Proxy,
+ being more powerful, flexible, configurable and extensible. Among
+ its many advantages over the older, more pedestrian work are
+ support for multiplexing (encapsulated metasearching), routing by
+ database name, authentication and authorisation and serving local
+ files via HTTP. Equally significant, its modular architecture
+ facilitates the creation of pluggable modules implementing further
+ functionality.
+
+
+ This manual will briefly describe Metaproxy's licensing situation
+ before giving an overview of its architecture, then discussing the
+ key concept of a filter in some depth and giving an overview of
+ the various filter types, then discussing the configuration file
+ format. After this come several optional chapters which may be
+ freely skipped: a detailed discussion of virtual databases and
+ multi-database searching, some notes on writing extensions
+ (additional filter types) and a high-level description of the
+ source code. Finally comes the reference guide, which contains
+ instructions for invoking the metaproxy
+ program, and detailed information on each type of filter,
+ including examples.
+
@@ -81,8 +111,8 @@
The Metaproxy Licence
- No decision has yet been made on the terms under which
- Metaproxy will be distributed.
+ No decision has yet been made on the terms under which
+ Metaproxy will be distributed.
It is possible that, unlike
other Index Data products, metaproxy may not be released under a
@@ -95,8 +125,219 @@
+
+ Installation
+
+ Metaproxy depends on the following tools/libraries:
+
+ YAZ++
+
+
+ This is a C++ library based on YAZ.
+
+
+
+ Libxslt
+
+ This is an XSLT processor - based on
+ Libxml2. Both Libxml2 and
+ Libxslt must be installed with the development components
+ (header files, etc.) as well as the run-time libraries.
+
+
+
+ Boost
+
+
+ The popular C++ library. Initial versions of Metaproxy
+ were built with Boost 1.33.0. Version 1.33.1 works too.
+
+
+
+
+
+
+ To compile Metaproxy, a modern C++ compiler is
+ required. Boost, in particular, requires a C++ compiler
+ that supports recent language features. Refer to Boost
+ Compiler Status
+ for more information.
+
+
+ We have successfully used Metaproxy with Boost using the compilers
+ GCC version 4.0 and
+ Microsoft Visual Studio 2003/2005.
+
+
+ Installation on Unix (from Source)
+
+ Here is a quick step-by-step guide to compiling all the
+ tools that Metaproxy uses. Only a few systems offer none of the
+ required tools as binary packages. If, for example, Libxml2/libxslt are
+ already installed as development packages, use those (and omit compilation).
+
+
+
+ Libxml2/libxslt:
+
+
+ gunzip -c libxml2-version.tar.gz|tar xf -
+ cd libxml2-version
+ ./configure
+ make
+ su
+ make install
+
+
+ gunzip -c libxslt-version.tar.gz|tar xf -
+ cd libxslt-version
+ ./configure
+ make
+ su
+ make install
+
+
+ YAZ/YAZ++:
+
+
+ gunzip -c yaz-version.tar.gz|tar xf -
+ cd yaz-version
+ ./configure
+ make
+ su
+ make install
+
+
+ gunzip -c yazpp-version.tar.gz|tar xf -
+ cd yazpp-version
+ ./configure
+ make
+ su
+ make install
+
+
+ Boost:
+
+
+ gunzip -c boost-version.tar.gz|tar xf -
+ cd boost-version
+ ./configure
+ make
+ su
+ make install
+
+
+ Metaproxy:
+
+
+ gunzip -c metaproxy-version.tar.gz|tar xf -
+ cd metaproxy-version
+ ./configure
+ make
+ su
+ make install
+
+
+
+
+ Installation on Debian
+
+ ### To be written
+
+
+
+ Installation on Windows
+
+ Compilation of Metaproxy can be done using
+ Microsoft Visual Studio.
+ We know Version 2003 works. We expect Version 2005 to
+ work as well.
+
+
+ Boost
+
+ Get Boost from its home page.
+ You also need Boost Jam (an alternative to make).
+ That's also available from this
+ home page. The downloaded files are called something like:
+ boost_1_33-1.exe
+ and
+ boost-jam-3.1.12-1-ntx86.zip.
+ Unpack Boost Jam first. Put bjam.exe
+ in your system path. Open a command prompt and check that
+ bjam.exe can be found automatically. If not, check the PATH.
+ The Boost .exe is a self-extracting archive containing the
+ complete source for Boost. Compile that source with
+ Boost Jam.
+ The compilation takes a while.
+ By default, the Boost build process puts the resulting
+ libraries + header files in
+ \boost\lib, \boost\include.
+
+
+ For more information about installing Boost refer to the
+ getting started
+ pages.
+
+
+
+
+ Libxslt
+
+ Libxslt can be downloaded
+ for Windows from
+ here.
+
+
+ Libxslt has other dependencies, but these can all be downloaded
+ from the same site. Get the following:
+ iconv, zlib, libxml2, libxslt.
+
+
+
+
+ YAZ
+
+ YAZ can be downloaded
+ for Windows from
+ here.
+
+
+
+
+ YAZ++
+
+ Get YAZ++ as well.
+ Version 1.0 or later is required. For now get it from
+ Index Data's
+ Snapshot area.
+
+
+ YAZ++ includes NMAKE makefiles, similar to those found in the
+ YAZ package.
+
+
+
+
+
+
+
+
The Metaproxy Architecture
@@ -640,6 +881,16 @@
Introductory notes
+
+ Lark's vomit
+
+ This chapter goes into a level of technical detail that is
+ probably not necessary in order to configure and use Metaproxy.
+ It is provided only for those who like to know how things work.
+ You should feel free to skip on to the next section if this one
+ doesn't seem like fun.
+
+
Two of Metaproxy's filters are concerned with multiple-database
operations. Of these, virt_db can work alone
@@ -647,15 +898,119 @@
while multi can work with the output of
virt_db to perform multicast searching, merging
the results into a unified result-set. The interaction between
- these two filters is necessarily complex, reflecting the real
- complexity of multicast searching in a protocol such as Z39.50
- that separates initialisation from searching, with the database to
- search known only during the latter operation.
+ these two filters is necessarily complex: it reflects the real,
+ irreducible complexity of multicast searching in a protocol such
+ as Z39.50 that separates initialisation from searching, and in
+ which the database to be searched is not known at initialisation
+ time.
+
+
+ Hold on tight - this may get a little hairy.
+
+
+
+
+
+ Virtual databases with the virt_db filter
+
+ In the general course of things, a Z39.50 Init request may carry
+ with it an otherInfo packet of type VAL_PROXY,
+ whose value indicates the address of a Z39.50 server to which the
+ ultimate connection is to be made. (This otherInfo packet is
+ supported by YAZ-based Z39.50 clients and servers, but has not yet
+ been ratified by the Maintenance Agency and so is not widely used
+ in non-Index Data software. We're working on it.)
+ The VAL_PROXY packet functions
+ analogously to the absoluteURI-style Request-URI used with the GET
+ method when a web browser asks a proxy to forward its request: see
+ the
+ Request-URI
+ section of
+ the HTTP 1.1 specification.
+
+
+ The role of the virt_db filter is to rewrite
+ this otherInfo packet dependent on the virtual database that the
+ client wants to search. For example, a virt_db
+ filter could be set up so that searches in the virtual database
+ ``lc'' are forwarded to the Library of Congress server, and
+ searches in the virtual database ``id'' are forwarded to the toy
+ GILS database that Index Data hosts for testing purposes. A
+ virt_db configuration to make this switch would
+ look like this:
+
+
+
+ lc
+ z3950.loc.gov:7090/Voyager
+
+
+ id
+ indexdata.dk/gils
+
+ ]]>
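As a rough illustration of the mapping just described (a sketch, not Metaproxy's actual implementation), the lookup that virt_db performs can be modelled as a table from virtual database name to back-end address. The names VIRTUAL_DATABASES and rewrite_proxy_target are hypothetical, and raising LookupError for an unknown database is an assumption: the text does not say how Metaproxy reports that case.

```python
# Hypothetical model of a virt_db routing table; not Metaproxy's real API.
VIRTUAL_DATABASES = {
    "lc": "z3950.loc.gov:7090/Voyager",  # Library of Congress
    "id": "indexdata.dk/gils",           # Index Data's test GILS database
}


def rewrite_proxy_target(database, routes=VIRTUAL_DATABASES):
    """Return the back-end address to place in the VAL_PROXY otherInfo.

    `database` is the virtual database named in the client's search.
    """
    try:
        return routes[database]
    except KeyError:
        # Assumption: an unknown virtual database is treated as an error.
        raise LookupError(f"unknown virtual database: {database!r}") from None


if __name__ == "__main__":
    for name in ("lc", "id"):
        print(name, "->", rewrite_proxy_target(name))
```

The point of the sketch is only that the filter's decision depends on nothing but the database name carried by the search request.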
+
+ When Metaproxy receives a Z39.50 Init request from a client, it
+ doesn't immediately forward that request to the back-end server.
+ Why not? Because it doesn't know which
+ back-end server to forward it to until the client sends a search
+ request that specifies the database that it wants to search in.
+ Instead, it just treasures the Init request up in its heart; and,
+ later, the first time the client does a search on one of the
+ specified virtual databases, a connection is forged to the
+ appropriate server and the Init request is forwarded to it. If,
+ later in the session, the same client searches in a different
+ virtual database, then a connection is forged to the server that
+ hosts it, and the same cached Init request is forwarded there,
+ too.
- ### Much, much more to say!
+ All of this clever Init-delaying is done by the
+ frontend_net filter. The
+ virt_db filter knows nothing about it; in
+ fact, because the Init request that is received from the client
+ doesn't get forwarded until a Search request is received, the
+ virt_db filter (and the
+ z3950_client filter behind it) doesn't even get
+ invoked at Init time. The only thing that a
+ virt_db filter ever does is rewrite the
+ VAL_PROXY otherInfo in the requests that pass
+ through it.
+
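The deferred-Init behaviour described above can be sketched as a toy session object. This is a simplified model under stated assumptions, not Metaproxy code: the class DeferredInitSession, its method names, and the use of plain lists to stand in for back-end connections are all hypothetical.

```python
class DeferredInitSession:
    """Toy model of one client session (hypothetical; not Metaproxy code)."""

    def __init__(self, routes):
        self.routes = routes      # virtual database -> back-end address
        self.cached_init = None   # Init request held back until it is needed
        self.connections = {}     # back-end address -> requests forwarded so far

    def handle_init(self, init_request):
        # Nothing is forwarded yet: the target is unknown until a search arrives.
        self.cached_init = init_request
        return "init-accepted"

    def handle_search(self, database, query):
        if self.cached_init is None:
            raise RuntimeError("search received before Init")
        target = self.routes[database]
        if target not in self.connections:
            # First search against this target: connect, then replay the
            # cached Init before anything else is sent.
            self.connections[target] = [("init", self.cached_init)]
        self.connections[target].append(("search", query))
        return target


if __name__ == "__main__":
    session = DeferredInitSession({
        "lc": "z3950.loc.gov:7090/Voyager",
        "id": "indexdata.dk/gils",
    })
    session.handle_init({"user": "anonymous"})
    session.handle_search("lc", "dinosaur")
    session.handle_search("id", "mineral")
    for target, log in session.connections.items():
        print(target, [kind for kind, _ in log])
```

Each back-end sees the same cached Init exactly once, immediately before its first search, which is the property the prose describes.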
+
+ A picture is worth a thousand words (but only five hundred on 64-bit architectures)
+
+
+
+
+
+
+
+
+
+
+
+ [Here there should be a diagram showing the progress of
+ packages through the filters during a simple virtual-database
+ search and a multi-database search, but it seems that your
+ toolchain has not been able to include the diagram in this
+ document. This is because of LaTeX suckage. Time to move to
+ OpenOffice. Yes, really.]
+
+
+
+
+
+
@@ -969,6 +1324,5 @@
sgml-parent-document: "main.xml"
sgml-local-catalogs: nil
sgml-namecase-general:t
- nxml-child-indent: 1
End:
-->