<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
<!ENTITY copyright SYSTEM "copyright.xml">
<!ENTITY % idcommon SYSTEM "common/common.ent">
<refentry id="ref-zoom">
<productname>Metaproxy</productname>
<info><orgname>Index Data</orgname></info>
<refentrytitle>zoom</refentrytitle>
<manvolnum>3mp</manvolnum>
<refmiscinfo class="manual">Metaproxy Module</refmiscinfo>
<refname>zoom</refname>
<refpurpose>Metaproxy ZOOM Module</refpurpose>
<title>DESCRIPTION</title>

This filter implements a generic client based on
<ulink url="&url.yaz.zoom;">ZOOM</ulink> of YAZ.
The client implements the protocols that ZOOM C does: Z39.50, SRU
(GET, POST, SOAP) and Solr.

This filter only deals with Z39.50 on input. The following services
are supported: init, search, present and close. The backend target
is selected based on the database as part of search and
<emphasis>not</emphasis> as part of init.

This filter is an alternative to the z3950_client filter but also
shares properties with virt_db, in that the target is selected
for a specific database.

The ZOOM filter relies on a target profile description, which is
XML based. It picks the profile for a given database from a web service,
or the profile may be given locally for each unique database (AKA virtual
database in virt_db). Target profiles are given, directly or indirectly,
as part of the <literal>torus</literal> element in the configuration.
<title>CONFIGURATION</title>

The configuration consists of six parts: <literal>torus</literal>,
<literal>fieldmap</literal>, <literal>cclmap</literal>,
<literal>contentProxy</literal>, <literal>log</literal>
and <literal>zoom</literal>.

The <literal>torus</literal> element specifies target profiles
and takes the following content:
<term>attribute <literal>url</literal></term>

URL of the Web service to be used when fetching target profiles from
a remote service (normally Torus).

The sequence <literal>%query</literal> is replaced with a CQL
query for the Torus search.

The special sequence <literal>%realm</literal> is replaced by the value
of attribute <literal>realm</literal> or by the realm DATABASE parameter.

The special sequence <literal>%db</literal> is replaced with
a single database while searching. Note that this sequence
is no longer needed, because <literal>%query</literal> can already
query for a single database by using the CQL query
<literal>udb==...</literal>.

<term>attribute <literal>content_url</literal></term>

URL of the Web service to be used to fetch the target profile
for a given database (udb) of type content. Semantics are otherwise
the same as for the <literal>url</literal> attribute above.

<varlistentry id="auth_url">
<term>attribute <literal>auth_url</literal></term>

URL of the Web service to be used for auth/IP lookup. If this
is defined, all access is granted or denied as part of Z39.50 Init
by the ZOOM module, and the use of database parameters realm and
torus_url is not allowed. If this setting is not defined,
all access is allowed and realm and/or torus_url may be used.

<term>attribute <literal>realm</literal></term>

The default realm value. Used for %realm in the URL, unless
specified in a DATABASE parameter.

<term>attribute <literal>proxy</literal></term>

HTTP proxy to be used for fetching target profiles.
<term>attribute <literal>xsldir</literal></term>

Directory that is searched for XSL stylesheets. Stylesheets
are specified in the target profile by the
<literal>transform</literal> element.

<term>attribute <literal>element_transform</literal></term>

Specifies the element set name that triggers retrieval and transform using
the parameters elementSet, recordEncoding, requestSyntax, transform
from the target profile. The default value
is "pz2" because, for historical reasons, the
common format is the one used in Pazpar2.

<term>attribute <literal>element_raw</literal></term>

Specifies an element set name that triggers retrieval using the
parameters elementSet, recordEncoding, requestSyntax from the
target profile. Same actions as for element_transform, but without
the XSL transform. Useful for debugging.
The default value is "raw".

<term>attribute <literal>explain_xsl</literal></term>

Specifies a stylesheet that converts one or more Torus records
to ZeeExplain records. The content of recordData is assumed to
hold each Explain record.

<term>attribute <literal>record_xsl</literal></term>

Specifies a stylesheet that converts retrieval records after
transform/literal operations.

When Metaproxy creates a content proxy session, the XSL parameter
<literal>cproxyhost</literal> is passed to the transform.

<term>element <literal>records</literal></term>

Local target profiles. This element may include zero or
more <literal>record</literal> elements (one per target
profile). See section TARGET PROFILE.
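
As an illustration, a <literal>torus</literal> element that fetches
profiles from a remote service might look like the sketch below. Only
the attribute names come from the description above; the host name,
realm value and stylesheet directory are placeholders chosen for this
example.
<screen><![CDATA[
<!-- hypothetical host, realm and directory -->
<torus
  url="http://torus.example.org/%realm/records/?query=%query"
  realm="demo"
  proxy="localhost:3128"
  xsldir="/usr/share/metaproxy/xsl"
  element_transform="pz2"
  element_raw="raw"
  />
]]></screen>
With these assumed values, %realm would be replaced by "demo" unless a
realm is given as a DATABASE parameter.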
<refsect2 id="fieldmap">
<title>fieldmap</title>

The <literal>fieldmap</literal> element may be specified zero or more times. It
specifies the map from CQL fields to CCL fields and takes the
following content:

<term>attribute <literal>cql</literal></term>

CQL field that we are mapping "from".

<term>attribute <literal>ccl</literal></term>

CCL field that we are mapping "to".
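
A minimal sketch of two such mappings, reusing index names from the
EXAMPLES section. Leaving out <literal>ccl</literal>, as in the first
line, is understood here to mean that the CQL index maps to no CCL
field.
<screen><![CDATA[
<fieldmap cql="cql.serverChoice"/>
<fieldmap cql="dc.title" ccl="ti"/>
]]></screen>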
<refsect2 id="cclmap_base">
<title>cclmap</title>

The third part of the configuration consists of zero or more
<literal>cclmap</literal> elements that specify the
<emphasis>base</emphasis> CCL profile to be used for all targets.
This configuration is combined with the cclmap definitions
from the target profile.
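
A base CCL map could be sketched as follows. The
<literal>attr</literal> lines and the qualifier name "ocn" are taken
from the EXAMPLES section; the <literal>qual</literal> wrapper and its
<literal>name</literal> attribute are an assumption about the
surrounding structure and should be checked against the schema.
<screen><![CDATA[
<cclmap>
  <!-- assumed wrapper: one qual element per CCL qualifier -->
  <qual name="ocn">
    <attr type="u" value="12"/>
    <attr type="s" value="107"/>
  </qual>
</cclmap>
]]></screen>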
<title>contentProxy</title>

The <literal>contentProxy</literal> element controls content proxying.
It is optional and must only be defined if content proxying is enabled.

<term>attribute <literal>config_file</literal></term>

Specifies the file that configures the cf-proxy system. Metaproxy
uses the settings <literal>sessiondir</literal> and
<literal>proxyhostname</literal> from that file to configure the
name of the proxy host and the directory of parameter files for the cf-proxy.

<term>attribute <literal>server</literal></term>

Specifies the content proxy host. The host is of the form
host[:port], that is, without a method (such as HTTP) and with an optional
port number.

This setting is deprecated. Use config_file (above)
to specify the proxy server.

<term>attribute <literal>tmp_file</literal></term>

Specifies the filename of a session file for content proxying. The
file should be an absolute filename that includes
<literal>XXXXXX</literal>, which is replaced to form a unique filename
using the mkstemp(3) library function. The default value of this
setting is <literal>/tmp/cf.XXXXXX.p</literal>.

This setting is deprecated. Use config_file (above)
to specify the session file area.
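
Using the non-deprecated attribute, the element reduces to a single
line. The path below is a placeholder for wherever the cf-proxy
configuration file is installed.
<screen><![CDATA[
<!-- hypothetical path -->
<contentProxy config_file="/etc/cf-proxy/cproxy.cfg"/>
]]></screen>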
The <literal>log</literal> element controls logging for the
ZOOM filter.

<term>attribute <literal>apdu</literal></term>

If the value of apdu is "true", then protocol packages
(APDUs and HTTP packages) from the ZOOM filter will be
logged to the yaz_log system. With a value of "false",
protocol packages are not logged (the default).

The <literal>zoom</literal> element controls settings for
ZOOM.

<term>attribute <literal>timeout</literal></term>

An integer that specifies, in seconds, how long an operation
may take before ZOOM gives up. The default value is 40.

<term>attribute <literal>proxy_timeout</literal></term>

An integer that specifies, in seconds, how long
a proxy check will wait before giving up. The default value is 1.
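
For example, the following fragment enables protocol logging and spells
out the two timeouts with their documented default values, so it only
changes behaviour for <literal>apdu</literal>.
<screen><![CDATA[
<log apdu="true"/>
<zoom timeout="40" proxy_timeout="1"/>
]]></screen>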
<title>QUERY HANDLING</title>

The ZOOM filter accepts three query types: RPN (Type-1), CCL and
CQL.

Queries are converted in two separate steps. In the first step,
the input query is converted to RPN/Type-1. This is always
the common internal format between step 1 and step 2.
In step 2, the query is converted to the native query type of the target.

Step 1: For RPN, the query is passed unmodified to the target.

Step 1: For CCL, the query is converted to RPN via the
<link linkend="cclmap"><literal>cclmap</literal></link> elements that are part of
the target profile as well as the
<link linkend="cclmap_base">base CCL maps</link>.

Step 1: For CQL, the query is converted to CCL. The mappings of
CQL fields to CCL fields are handled by
<link linkend="fieldmap"><literal>fieldmap</literal></link>
elements as part of the target profile. The resulting CCL query
is then converted to RPN using the scheme mentioned earlier (via
<literal>cclmap</literal>).

Step 2: If the target is Z39.50-based, the query is passed verbatim (RPN).
If the target is SRU-based, the RPN will be converted to CQL.
If the target is Solr-based, the RPN will be converted to Solr's query
type.
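
The two steps can be traced for a single query. The sketch assumes a
<literal>fieldmap</literal> from dc.title to the CCL field ti and a
target profile with <literal>cclmap_ti</literal> set to a numeric use
attribute, both as in the EXAMPLES section. The intermediate queries in
the comment are written by hand and only approximate; they are not
captured from a running Metaproxy.
<screen><![CDATA[
<!-- configuration: CQL index dc.title maps to CCL field ti -->
<fieldmap cql="dc.title" ccl="ti"/>

<!-- target profile: CCL field ti maps to use attribute 4 -->
<cclmap_ti>u=4 t=l,r</cclmap_ti>

<!--
  Step 1, CQL input:   dc.title=water
          as CCL:      ti=water
          as RPN:      @attr 1=4 water
  Step 2, Z39.50 target: the RPN is sent as it is
-->
]]></screen>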
<title>SORTING</title>

The ZOOM module actively handles CQL sorting, using the SORTBY parameter
that was introduced in SRU version 1.2. The conversion from a SORTBY clause
to the native sort for a given target is driven by two parameters:
<link linkend="sortStrategy"><literal>sortStrategy</literal></link>
and <link linkend="sortmap"><literal>sortmap_</literal><replaceable>field</replaceable></link>.

A sort field that does not have an equivalent
<literal>sortmap_</literal> mapping is passed unmodified through the
conversion. No diagnostic is returned.
<title>TARGET PROFILE</title>

The ZOOM module is driven by a number of settings that specify how
to handle each target.
Note that unknown elements are silently <emphasis>ignored</emphasis>.

The elements, in alphabetical order, are:

<term id="zoom-torus-authentication">authentication</term><listitem>

Authentication parameters to be sent to the target. For
Z39.50 targets, this will be sent as part of the
Init Request. Authentication consists of two components: username
and password, separated by a slash.

If this value is omitted or empty, no authentication information is sent.

<varlistentry id="cclmap">
<term>cclmap_<replaceable>field</replaceable></term><listitem>

This value specifies the CCL field (qualifier) definition for a given
field. For Z39.50 targets, this will most likely specify the
mapping to a numeric use attribute plus a structure attribute.
For SRU targets, the use attribute should be string based, in
order to make the RPN to CQL conversion work properly (step 2).

<term>cfAuth</term><listitem>

When cfAuth is defined, its value will be used as authentication
to the backend target, and the authentication setting will be specified
as part of a database. This is like a "proxy" for authentication and
is used for Connector Framework based targets.
<term id="zoom-torus-cfproxy">cfProxy</term><listitem>

Specifies the HTTP proxy for the target in the form
<replaceable>host</replaceable>:<replaceable>port</replaceable>.

<term>cfSubDB</term><listitem>

Specifies the sub database for a Connector Framework based target.

<varlistentry id="zoom-torus-contentConnector">
<term>contentConnector</term><listitem>

Specifies a database for content-based proxying.

<term>elementSet</term><listitem>

Specifies the elementSet to be sent to the target if record
transform is enabled (not to be confused with the record_transform
module). The record transform is enabled only if the client uses
record syntax = XML and an element set determined by
the <literal>element_transform</literal> /
<literal>element_raw</literal> attributes from the configuration.
By default, those are the element sets <literal>pz2</literal>
and <literal>raw</literal>.
If record transform is not enabled, this setting is
not used and the element set specified by the client
is passed verbatim.

<term>literalTransform</term><listitem>

Specifies an XSL stylesheet to be used if record
transform is enabled; see description of elementSet.
The XSL transform is only used if the element set is set to the
value of <literal>element_transform</literal> in the configuration.

The value of literalTransform is the XSL - string encoded.
<term>piggyback</term><listitem>

A value of 1/true is a hint to the ZOOM module that this Z39.50
target supports piggyback searches, i.e. Search Response with
records. Any other value (false) will prevent the ZOOM module
from making use of piggyback (all records are part of the Present Response).

<term>queryEncoding</term><listitem>

If this value is defined, all queries will be converted
to this encoding. This should be used for all Z39.50 targets that
do not use UTF-8 for query terms.

<term>recordEncoding</term><listitem>

Specifies the character encoding of records that are returned
by the target. This is primarily used for targets where records
are not already UTF-8 encoded. This setting is only used
if the record transform is enabled (see description of elementSet).

<term>requestSyntax</term><listitem>

Specifies the record syntax to be specified for the target
if record transform is enabled; see description of elementSet.
If record transform is not enabled, the record syntax of the
client is passed verbatim to the target.

<varlistentry id="sortmap">
<term>sortmap_<replaceable>field</replaceable></term><listitem>

This value specifies the native field for a target. The form of the value is
given by <link linkend="sortStrategy">sortStrategy</link>.

<varlistentry id="sortStrategy">
<term>sortStrategy</term><listitem>

Specifies the sort strategy for a target. One of:
<literal>z3950</literal>, <literal>type7</literal>,
<literal>cql</literal>, <literal>sru11</literal> or
<literal>embed</literal>. The <literal>embed</literal> strategy chooses type-7
or CQL sortby depending on whether Type-1 or CQL is
actually sent to the target.
<term>sru</term><listitem>

If this setting is set, it specifies that the target is web service
based. The value must be one of: <literal>get</literal>,
<literal>post</literal>, <literal>soap</literal>
or <literal>solr</literal>.

<term>sruVersion</term><listitem>

Specifies the SRU version to use. If unset, version 1.2 will be
used. Some servers do not support this version, in which case
version 1.1 or even 1.0 could be set.

<term>transform</term><listitem>

Specifies an XSL stylesheet filename to be used if record
transform is enabled; see description of elementSet.
The XSL transform is only used if the element set is set to the
value of <literal>element_transform</literal> in the configuration.

<term>udb</term><listitem>

This value is required and specifies the unique database for
this profile. All target profiles should hold a unique database.

<varlistentry id="urlRecipe">
<term>urlRecipe</term><listitem>

The value of this field is a string that generates a dynamic link
based on record content. If the resulting string is non-zero in length,
a new field, <literal>metadata</literal> with attribute
<literal>type="generated-url"</literal>, is generated.
The content of this field is the result of the URL recipe conversion.
The urlRecipe value may refer to an existing metadata element by
${field[pattern/result/flags]}, which will take the content
of field and perform a regular expression conversion using the pattern
given. For example: <literal>${md-title[\s+/+/g]}</literal> takes
metadata element <literal>title</literal> and converts one or more
spaces to a plus character.

<term>zurl</term><listitem>

This setting is mandatory and specifies the ZURL of the
target in the form of host/database. The HTTP method should
not be provided as this is guessed from the "sru" attribute value.
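
Several of these settings can be combined into one local target
profile. The sketch below describes an imaginary SRU target; the udb,
credentials, host names and URL recipe are all made up for
illustration, and only the element names are taken from the list above.
<screen><![CDATA[
<record>
  <!-- hypothetical SRU target -->
  <udb>example-sru</udb>
  <authentication>guest/secret</authentication>
  <!-- string-based use attribute, as recommended for SRU targets -->
  <cclmap_ti>u=title t=z</cclmap_ti>
  <sru>get</sru>
  <sruVersion>1.1</sruVersion>
  <urlRecipe>http://www.example.org/search?title=${md-title[\s+/+/g]}</urlRecipe>
  <zurl>sru.example.org/db</zurl>
</record>
]]></screen>
Such a <literal>record</literal> would be placed inside the
<literal>records</literal> element of <literal>torus</literal>.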
<title>DATABASE parameters</title>

Extra information may be carried in the Z39.50 Database or SRU path,
such as authentication to be passed to the backend. Some of
the parameters override TARGET profile values. The format is

udb,parm1=value1&parm2=value2&...

where udb is the unique database recognised by the backend and parm1,
value1, ... are parameters to be passed. The following describes the
supported parameters. Like form values in HTTP, the parameters and
values are URL encoded. The separator between udb and the parameters,
however, is a comma rather than a question mark. What follows a question mark are
HTTP arguments (in this case SRU arguments).
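
For instance, a client might send a database name like the one below.
The udb "loc" and the credentials are invented for this example; the
password "p@ss" is URL encoded as "p%40ss".
<screen><![CDATA[
loc,user=guest&password=p%40ss
]]></screen>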
<term>content-password</term>

The password to be used for the content proxy session. If this parameter
is not given, the value of parameter <literal>password</literal> is passed
to the content proxy session.

<term>content-proxy</term>

Specifies the proxy to be used for the content proxy session. If this parameter
is not given, the value of parameter <literal>proxy</literal> is passed
to the content proxy session.

<term>content-user</term>

The user to be used for the content proxy session. If this parameter
is not given, the value of parameter <literal>user</literal> is passed
to the content proxy session.

<term>cproxysession</term>

Specifies the session ID for content proxy. This parameter is, generally,
not used by anything but the content proxy itself when invoking

<term>nocproxy</term>

If this parameter is specified, content proxying is disabled
<term>password</term>

Specifies the password to be passed to the backend. It is also passed
to the content proxy session unless overridden by content-password.
If this parameter is omitted, the password will be taken from
TARGET profile setting
<link linkend="zoom-torus-authentication">
<literal>authentication</literal></link>.

Specifies one or more proxies for the backend. If this parameter is
omitted, the proxy will be taken from TARGET profile setting
<link linkend="zoom-torus-cfproxy">
<literal>cfProxy</literal></link>.
The parameter is a list of comma-separated host:port entries.
Both host and port must be given for each proxy.

Session realm to be used for this target. It changes the resulting
URL used for getting a target profile by changing the
value that gets substituted for the %realm string. This parameter
is not allowed if access is controlled by
<link linkend="auth_url">auth_url</link>.

<term>torus_url</term>

Sets the URL to be used for fetching Torus records, overriding the value
of the <literal>url</literal> attribute of element <literal>torus</literal>
in the zoom configuration. This parameter is not allowed if access is
controlled by
<link linkend="auth_url">auth_url</link> in configuration.

Specifies the user to be passed to the backend. It is also passed
to the content proxy session unless overridden by content-user.
If this parameter is omitted, the user will be taken from TARGET
profile setting
<link linkend="zoom-torus-authentication">
<literal>authentication</literal></link>.

All parameters that have the prefix x- are passed verbatim
to the backend.
<title>SCHEMA</title>
<literallayout><xi:include
xi:href="../xml/schema/filter_zoom.rnc"
xmlns:xi="http://www.w3.org/2001/XInclude" />
<title>EXAMPLES</title>

In the example below, target definitions (Torus records) are fetched
from a web service via a proxy. A CQL profile is configured which
maps to a set of CCL fields ("no field", au, ti and su). Presumably
the fetched target definitions map the CCL to their native RPN.
A CCL "ocn" is mapped for all targets. Logging of APDUs is enabled,
and a timeout is given.

url="http://torus.indexdata.com/src/records/?query=%query"
proxy="localhost:3128"

<fieldmap cql="cql.anywhere"/>
<fieldmap cql="cql.serverChoice"/>
<fieldmap cql="dc.creator" ccl="au"/>
<fieldmap cql="dc.title" ccl="ti"/>
<fieldmap cql="dc.subject" ccl="su"/>

<attr type="u" value="12"/>
<attr type="s" value="107"/>

Here is another example with two locally defined targets: a
Solr target and a Z39.50 target.

<cclmap_term>t=z</cclmap_term>
<cclmap_ti>u=title t=z</cclmap_ti>
<zurl>ocs-test.indexdata.com/solr/select</zurl>

<cclmap_term>t=l,r</cclmap_term>
<cclmap_ti>u=4 t=l,r</cclmap_ti>
<zurl>z3950.loc.gov:7090/voyager</zurl>

<fieldmap cql="cql.serverChoice"/>
<fieldmap cql="dc.title" ccl="ti"/>
<title>SEE ALSO</title>
<refentrytitle>metaproxy</refentrytitle>
<manvolnum>1</manvolnum>
<refentrytitle>virt_db</refentrytitle>
<manvolnum>3mp</manvolnum>
<!-- Keep this comment at the end of the file