1 <?xml version="1.0" standalone="no"?>
2 <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
3 "@DTD_DIR@/docbookx.dtd" [
<!ENTITY oslash "ø"> <!-- LATIN SMALL LETTER O WITH STROKE -->
6 <!-- $Id: yp2.xml.in,v 1.1 2006-02-02 18:22:47 mike Exp $ -->
9 <title>YP2 - User's Guide and Reference</title>
11 <firstname>Mike</firstname><surname>Taylor</surname>
15 <holder>Index Data</holder>
20 YP2 is ... in need of description :-)
27 <chapter id="introduction">
28 <title>Introduction</title>
32 <title>Overview</title>
34 <ulink url="http://indexdata.dk/yp2/">YP2</ulink>
38 ### We should probably consider saying a little more by way of
46 <chapter id="filters">
47 <title>Filters</title>
51 <title>Introductory notes</title>
53 It's useful to think of YP2 as an interpreter providing a small
54 number of primitives and operations, but operating on a very
55 complex data type, namely the ``package''.
58 A package represents a Z39.50 or SRW/U request (whether for Init,
59 Search, Scan, etc.) together with information about where it came
60 from. Packages are created by front-end filters such as
61 <literal>frontend_net</literal> (see below), which reads them from
62 the network; other front-end filters are possible. They then pass
63 along a route consisting of a sequence of filters, each of which
64 transforms the package and may also have side-effects such as
65 generating logging. Eventually, the route will yield a response,
66 which is sent back to the origin.
There are many kinds of filter: some that are defined statically
as part of YP2, and others that may be provided by third parties
and dynamically loaded. They all conform to the same simple API
of essentially two methods: <function>configure()</function> is
called at startup time, and is passed a DOM tree representing that
part of the configuration file that pertains to this filter
instance: it is expected to walk that tree extracting relevant
information; and <function>process()</function> is called every
time the filter has to process a package.
While all filters provide the same API, there are different modes
of functionality. Some filters are sources: they create packages
(<literal>frontend_net</literal>);
others are sinks: they consume packages and return a result
(<literal>z3950_client</literal>,
<literal>backend_test</literal>,
<literal>http_file</literal>);
the others are true filters, which read, process and pass on the
packages
(<literal>auth_simple</literal>,
<literal>log</literal>,
<literal>multi</literal>,
<literal>session_shared</literal>,
<literal>template</literal>,
<literal>virt_db</literal>).
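<para>
As a rough illustration of these three modes, a single route might
chain a source, a true filter and a sink, in that order. This is
only a sketch: the route ID <literal>start</literal> is a
hypothetical name, and the element syntax and filter settings are
borrowed from the per-filter configuration examples in the
configuration chapter rather than checked against a running YP2.
</para>
<screen><![CDATA[
<route id="start">
  <!-- source: creates packages from incoming network connections -->
  <filter type="frontend_net">
    <threads>10</threads>
    <port>@:9000</port>
  </filter>
  <!-- true filter: checks credentials, then passes the package on -->
  <filter type="auth_simple">
    <userRegister>../etc/example.simple-auth</userRegister>
  </filter>
  <!-- sink: consumes the package and returns a dummy response -->
  <filter type="backend_test"/>
</route>
]]></screen>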
101 <title>Individual filters</title>
103 The filters are here named by the string that is used as the
104 <literal>type</literal> attribute of a
<literal>&lt;filter&gt;</literal> element in the configuration
106 file to request them, with the name of the class that implements
111 <title><literal>auth_simple</literal>
112 (yp2::filter::AuthSimple)</title>
Simple authentication and authorisation. The configuration
specifies the name of a file that is the user register, which
lists <varname>username</varname>:<varname>password</varname>
pairs, one per line, colon separated. When a session begins, it
is rejected unless a username and password are supplied and match
a pair in the register.
122 ### discuss authorisation phase
127 <title><literal>backend_test</literal>
128 (yp2::filter::Backend_test)</title>
130 A sink that provides dummy responses in the manner of the
131 <literal>yaz-ztest</literal> Z39.50 server. This is useful only
137 <title><literal>frontend_net</literal>
138 (yp2::filter::FrontendNet)</title>
140 A source that accepts Z39.50 and SRW connections from a port
141 specified in the configuration, reads protocol units, and
142 feeds them into the next filter, eventually returning the
143 result to the origin.
148 <title><literal>http_file</literal>
149 (yp2::filter::HttpFile)</title>
151 A sink that returns the contents of files from the local
152 filesystem in response to HTTP requests. (Yes, Virginia, this
153 does mean that YP2 is also a Web-server in its spare time. So
154 far it does not contain either an email-reader or a Lisp
155 interpreter, but that day is surely coming.)
160 <title><literal>log</literal>
161 (yp2::filter::Log)</title>
163 Writes logging information to standard output, and passes on
164 the package unchanged.
169 <title><literal>multi</literal>
170 (yp2::filter::Multi)</title>
172 Performs multicast searching. See the extended discussion of
173 multi-database searching below.
178 <title><literal>session_shared</literal>
179 (yp2::filter::SessionShared)</title>
181 When this is finished, it will implement global sharing of
182 result sets (i.e. between threads and therefore between
183 clients), but it's not yet done.
188 <title><literal>template</literal>
189 (yp2::filter::Template)</title>
Does nothing at all, merely passing the package on. (Maybe it
192 should be called <literal>nop</literal> or
193 <literal>passthrough</literal>?) This exists not to be used, but
194 to be copied - to become the skeleton of new filters as they are
200 <title><literal>virt_db</literal>
201 (yp2::filter::Virt_db)</title>
203 Performs virtual database selection. See the extended discussion
204 of virtual databases below.
209 <title><literal>z3950_client</literal>
210 (yp2::filter::Z3950Client)</title>
Performs Z39.50 searching and retrieval by proxying the
packages that are passed to it. Init requests are sent to the
address specified in the <literal>VAL_PROXY</literal> otherInfo
attached to the request: this may have been specified by the client,
or generated by a <literal>virt_db</literal> filter earlier in
the route. Subsequent requests are sent to the same address,
which is remembered at Init time in a Session object.
225 <title>Future directions</title>
227 Some other filters that do not yet exist, but which would be
228 useful, are briefly described. These may be added in future
234 <term><literal>frontend_cli</literal> (source)</term>
237 Command-line interface for generating requests.
242 <term><literal>srw2z3950</literal> (filter)</term>
245 Translate SRW requests into Z39.50 requests.
250 <term><literal>srw_client</literal> (sink)</term>
253 SRW searching and retrieval.
258 <term><literal>sru_client</literal> (sink)</term>
261 SRU searching and retrieval.
266 <term><literal>opensearch_client</literal> (sink)</term>
269 A9 OpenSearch searching and retrieval.
279 <chapter id="configuration">
280 <title>Configuration: the YP2 configuration file format</title>
284 <title>Introductory notes</title>
286 If YP2 is an interpreter providing operations on packages, then
287 its configuration file can be thought of as a program for that
288 interpreter. Configuration is by means of a single file, the name
289 of which is supplied as the sole command-line argument to the
290 <command>yp2</command> program.
293 The configuration files are written in XML. (But that's just an
294 implementation detail - they could just as well have been written
295 in YAML or Lisp-like S-expressions, or in a custom syntax.)
298 Since XML has been chosen, an XML schema,
299 <filename>config.xsd</filename>, is provided for validating
300 configuration files. This file is supplied in the
301 <filename>etc</filename> directory of the YP2 distribution. It
302 can be used by (among other tools) the <command>xmllint</command>
303 program supplied as part of the <literal>libxml2</literal>
307 xmllint --noout --schema etc/config.xsd my-config-file.xml
310 (A recent version of <literal>libxml2</literal> is required, as
311 support for XML Schemas is a relatively recent addition.)
316 <title>Overview of XML structure</title>
318 All elements and attributes are in the namespace
319 <ulink url="http://indexdata.dk/yp2/config/1"/>.
320 This is most easily achieved by setting the default namespace on
321 the top-level element, as here:
324 <yp2 xmlns="http://indexdata.dk/yp2/config/1">
The top-level element is &lt;yp2&gt;. This contains a
&lt;start&gt; element, a &lt;filters&gt; element and a
&lt;routes&gt; element, in that order. &lt;filters&gt; is
optional; the other two are mandatory. All three are
The &lt;start&gt; element is empty, but carries a
<literal>route</literal> attribute, whose value is the name of
the route at which to start running, analogous to the name of the
start production in a formal grammar.
If present, &lt;filters&gt; contains zero or more &lt;filter&gt;
elements; filters carry a <literal>type</literal> attribute and
contain various elements that provide suitable configuration for
filters of that type. The filter-specific elements are described
below. Filters defined in this part of the file must carry an
<literal>id</literal> attribute so that they can be referenced
&lt;routes&gt; contains one or more &lt;route&gt; elements, each
of which must carry an <literal>id</literal> attribute. One of the
routes must have the ID value that was specified as the start
route in the &lt;start&gt; element's <literal>route</literal>
attribute. Each route contains zero or more &lt;filter&gt;
elements. These are of two types. They may be empty, but carry a
<literal>refid</literal> attribute whose value is the same as the
<literal>id</literal> of a filter previously defined in the
&lt;filters&gt; section. Alternatively, a filter within a route
may omit the <literal>refid</literal> attribute, but contain
configuration elements similar to those used for filters defined
in the &lt;filters&gt; section.
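<para>
Putting these pieces together, a complete configuration file might
look something like the following sketch. The IDs
<literal>frontend</literal> and <literal>start</literal> are
hypothetical names chosen for illustration, and the file has not
been validated against <filename>config.xsd</filename>, so treat it
as an outline of the structure rather than a tested configuration.
It shows one filter defined in the &lt;filters&gt; section and
referenced by <literal>refid</literal>, and two filters configured
inline within the route.
</para>
<screen><![CDATA[
<?xml version="1.0"?>
<yp2 xmlns="http://indexdata.dk/yp2/config/1">
  <!-- names the route at which to start running -->
  <start route="start"/>
  <filters>
    <!-- defined once here, with an id, so that routes can refer to it -->
    <filter id="frontend" type="frontend_net">
      <threads>10</threads>
      <port>@:9000</port>
    </filter>
  </filters>
  <routes>
    <route id="start">
      <!-- empty element referring back to the definition above -->
      <filter refid="frontend"/>
      <!-- filters configured inline, with no refid -->
      <filter type="log">
        <message>B</message>
      </filter>
      <filter type="backend_test"/>
    </route>
  </routes>
</yp2>
]]></screen>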
366 <title>Filter configuration</title>
All &lt;filter&gt; elements have in common that they must carry a
<literal>type</literal> attribute whose value is one of the
supported ones, listed in the schema file and discussed below. In
addition, &lt;filter&gt;s occurring in the &lt;filters&gt; section
must have an <literal>id</literal> attribute, and those occurring
within a route must either have a <literal>refid</literal>
attribute referencing a previously defined filter or contain their
own configuration information.
378 In general, each filter recognises different configuration
379 elements within its element, as each filter has different
380 functionality. These are as follows:
384 <title><literal>auth_simple</literal></title>
386 <filter type="auth_simple">
387 <userRegister>../etc/example.simple-auth</userRegister>
393 <title><literal>backend_test</literal></title>
395 <filter type="backend_test"/>
400 <title><literal>frontend_net</literal></title>
402 <filter type="frontend_net">
403 <threads>10</threads>
404 <port>@:9000</port>
410 <title><literal>http_file</literal></title>
412 <filter type="http_file">
413 <mimetypes>/etc/mime.types</mimetypes>
415 <documentroot>.</documentroot>
416 <prefix>/etc</prefix>
423 <title><literal>log</literal></title>
425 <filter type="log">
426 <message>B</message>
432 <title><literal>multi</literal></title>
434 <filter type="multi"/>
439 <title><literal>session_shared</literal></title>
441 <filter type="session_shared">
448 <title><literal>template</literal></title>
450 <filter type="template"/>
455 <title><literal>virt_db</literal></title>
457 <filter type="virt_db">
459 <database>loc</database>
460 <target>z3950.loc.gov:7090/voyager</target>
463 <database>idgils</database>
464 <target>indexdata.dk/gils</target>
471 <title><literal>z3950_client</literal></title>
473 <filter type="z3950_client">
474 <timeout>30</timeout>
483 <chapter id="multidb">
484 <title>Virtual database as multi-database searching</title>
488 <title>Introductory notes</title>
490 Two of YP2's filters are concerned with multiple-database
491 operations. Of these, <literal>virt_db</literal> can work alone
492 to control the routing of searches to one of a number of servers,
493 while <literal>multi</literal> can work with the output of
494 <literal>virt_db</literal> to perform multicast searching, merging
495 the results into a unified result-set. The interaction between
496 these two filters is necessarily complex, reflecting the real
497 complexity of multicast searching in a protocol such as Z39.50
498 that separates initialisation from searching, with the database to
499 search known only during the latter operation.
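<para>
A route that combines the two filters might be laid out roughly as
follows. This is a sketch under stated assumptions, not a tested
configuration: it assumes that <literal>virt_db</literal> comes
before <literal>multi</literal> in the route (since
<literal>multi</literal> works with its output), and that
<literal>z3950_client</literal> serves as the sink that actually
contacts the servers. The route ID <literal>start</literal> is
hypothetical, and the database-to-target mappings inside the
<literal>virt_db</literal> filter are deliberately left out.
</para>
<screen><![CDATA[
<route id="start">
  <filter type="frontend_net">
    <threads>10</threads>
    <port>@:9000</port>
  </filter>
  <filter type="virt_db">
    <!-- database-to-target mappings go here; see the virt_db
         entry in the filter configuration section -->
  </filter>
  <!-- fans the search out and merges the results -->
  <filter type="multi"/>
  <filter type="z3950_client">
    <timeout>30</timeout>
  </filter>
</route>
]]></screen>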
502 ### Much, much more to say!
509 <chapter id="classes">
510 <title>Classes in the YP2 source code</title>
514 <title>Introductory notes</title>
516 <emphasis>Stop! Do not read this!</emphasis>
517 You won't enjoy it at all.
520 This chapter contains documentation of the YP2 source code, and is
521 of interest only to maintainers and developers. If you need to
522 change YP2's behaviour or write a new filter, then you will most
523 likely find this chapter helpful. Otherwise it's a waste of your
524 good time. Seriously: go and watch a film or something.
525 <citetitle>This is Spinal Tap</citetitle> is particularly good.
528 Still here? OK, let's continue.
531 In general, classes seem to be named big-endianly, so that
532 <literal>FactoryFilter</literal> is not a filter that filters
533 factories, but a factory that produces filters; and
534 <literal>FactoryStatic</literal> is a factory for the statically
535 registered filters (as opposed to those that are dynamically
541 <title>Individual classes</title>
543 The classes making up the YP2 application are here listed by
544 class-name, with the names of the source files that define them in
<title><literal>yp2::FactoryFilter</literal>
550 (<filename>factory_filter.cpp</filename>)</title>
552 A factory class that exists primarily to provide the
553 <literal>create()</literal> method, which takes the name of a
554 filter class as its argument and returns a new filter of that
555 type. To enable this, the factory must first be populated by
556 calling <literal>add_creator()</literal> for static filters (this
557 is done by the <literal>FactoryStatic</literal> class, see below)
558 and <literal>add_creator_dyn()</literal> for filters loaded
564 <title><literal>yp2::FactoryStatic</literal>
565 (<filename>factory_static.cpp</filename>)</title>
567 A subclass of <literal>FactoryFilter</literal> which is
568 responsible for registering all the statically defined filter
569 types. It does this by knowing about all those filters'
570 structures, which are listed in its constructor. Merely
571 instantiating this class registers all the static classes. It is
572 for the benefit of this class that <literal>struct
573 yp2_filter_struct</literal> exists, and that all the filter
574 classes provide a static object of that type.
579 <title><literal>yp2::filter::Base</literal>
580 (<filename>filter.cpp</filename>)</title>
582 The virtual base class of all filters. The filter API is, on the
583 surface at least, extremely simple: two methods.
584 <literal>configure()</literal> is passed a DOM tree representing
585 that part of the configuration file that pertains to this filter
586 instance, and is expected to walk that tree extracting relevant
587 information. And <literal>process()</literal> processes a
package (see below). That surface simplicity is a bit
589 misleading, as <literal>process()</literal> needs to know a lot
590 about the <literal>Package</literal> class in order to do
596 <title><literal>yp2::filter::AuthSimple</literal>,
597 <literal>Backend_test</literal>, etc.
598 (<filename>filter_auth_simple.cpp</filename>,
599 <filename>filter_backend_test.cpp</filename>, etc.)</title>
601 Individual filters. Each of these is implemented by a header and
602 a source file, named <filename>filter_*.hpp</filename> and
603 <filename>filter_*.cpp</filename> respectively. All the header
604 files should be pretty much identical, in that they declare the
605 class, including a private <literal>Rep</literal> class and a
606 member pointer to it, and the two public methods. The only extra
607 information in any filter header is additional private types and
608 members (which should really all be in the <literal>Rep</literal>
609 anyway) and private methods (which should also remain known only
610 to the source file, but C++'s brain-damaged design requires this
611 dirty laundry to be exhibited in public. Thanks, Bjarne!)
614 The source file for each filter needs to supply:
619 A definition of the private <literal>Rep</literal> class.
624 Some boilerplate constructors and destructors.
629 A <literal>configure()</literal> method that uses the
630 appropriate XML fragment.
635 Most important, the <literal>process()</literal> method that
636 does all the actual work.
643 <title><literal>yp2::Package</literal>
644 (<filename>package.cpp</filename>)</title>
646 Represents a package on its way through the series of filters
647 that make up a route. This is essentially a Z39.50 or SRU APDU
648 together with information about where it came from, which is
649 modified as it passes through the various filters.
654 <title><literal>yp2::Pipe</literal>
655 (<filename>pipe.cpp</filename>)</title>
657 This class provides a compatibility layer so that we have an IPC
658 mechanism that works the same under Unix and Windows. It's not
659 particularly exciting.
664 <title><literal>yp2::RouterChain</literal>
665 (<filename>router_chain.cpp</filename>)</title>
672 <title><literal>yp2::RouterFleXML</literal>
673 (<filename>router_flexml.cpp</filename>)</title>
680 <title><literal>yp2::Session</literal>
681 (<filename>session.cpp</filename>)</title>
688 <title><literal>yp2::ThreadPoolSocketObserver</literal>
689 (<filename>thread_pool_observer.cpp</filename>)</title>
696 <title><literal>yp2::util</literal>
697 (<filename>util.cpp</filename>)</title>
699 A namespace of various small utility functions and classes,
700 collected together for convenience. Most importantly, includes
701 the <literal>yp2::util::odr</literal> class, a wrapper for YAZ's
707 <title><literal>yp2::xml</literal>
708 (<filename>xmlutil.cpp</filename>)</title>
710 A namespace of various XML utility functions and classes,
711 collected together for convenience.
718 <title>Other Source Files</title>
720 In addition to the YP2 source files that define the classes
721 described above, there are a few additional files which are
722 briefly described here:
726 <term><literal>yp2_prog.cpp</literal></term>
729 The main function of the <command>yp2</command> program.
734 <term><literal>ex_router_flexml.cpp</literal></term>
737 Identical to <literal>yp2_prog.cpp</literal>: it's not clear why.
742 <term><literal>test_*.cpp</literal></term>
745 Unit-tests for various modules.
751 ### Still to be described:
752 <literal>ex_filter_frontend_net.cpp</literal>,
753 <literal>filter_dl.cpp</literal>,
754 <literal>plainfile.cpp</literal>,
755 <literal>tstdl.cpp</literal>.
764 <!-- This is just a lame way to get some vertical whitespace at
765 the end of the document -->
775 <!-- Keep this comment at the end of the file
780 sgml-minimize-attributes:nil
781 sgml-always-quote-attributes:t
784 sgml-parent-document:nil
785 sgml-local-catalogs: nil
786 sgml-namecase-general:t