1 <!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook V4.4//EN"
2 "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
3 <!ENTITY copyright SYSTEM "copyright.xml">
4 <!ENTITY % idcommon SYSTEM "common/common.ent">
7 <refentry id="ref-zoom">
9 <productname>Metaproxy</productname>
10 <info><orgname>Index Data</orgname></info>
14 <refentrytitle>zoom</refentrytitle>
15 <manvolnum>3mp</manvolnum>
16 <refmiscinfo class="manual">Metaproxy Module</refmiscinfo>
20 <refname>zoom</refname>
21 <refpurpose>Metaproxy ZOOM Module</refpurpose>
25 <title>DESCRIPTION</title>
27 This filter implements a generic client based on
28 <ulink url="&url.yaz.zoom;">ZOOM</ulink> of YAZ.
29 The client implements the protocols that ZOOM C does: Z39.50, SRU
30 (GET, POST, SOAP) and Solr.
34 This filter only deals with Z39.50 on input. The following services
35 are supported: init, search, present and close. The backend target
36 is selected based on the database as part of search and
37 <emphasis>not</emphasis> as part of init.
41 This filter is an alternative to the z3950_client filter but also
42 shares properties with virt_db, in that the target is selected
43 for a specific database.
47 The ZOOM filter relies on a target profile description, which is
48 XML based. It picks the profile for a given database from a web service
49 or it may be locally given for each unique database (AKA virtual database
50 in virt_db). Target profiles are directly and indirectly given as part
51 of the <literal>torus</literal> element in the configuration.
57 <title>CONFIGURATION</title>
59 The configuration consists of six parts: <literal>torus</literal>,
60 <literal>fieldmap</literal>, <literal>cclmap</literal>,
61 <literal>contentProxy</literal>, <literal>log</literal>
62 and <literal>zoom</literal>.
67 The <literal>torus</literal> element specifies target profiles
68 and takes the following content:
72 <term>attribute <literal>url</literal></term>
75 URL of Web service to be used when fetching target profiles from
76 a remote service (Torus normally).
79 The sequence <literal>%query</literal> is replaced with a CQL
80 query for the Torus search.
83 The special sequence <literal>%realm</literal> is replaced by the value
84 of attribute <literal>realm</literal> or by the realm DATABASE parameter.
87 The special sequence <literal>%db</literal> is replaced with
88 a single database while searching. Note that this sequence
89 is no longer needed, because the <literal>%query</literal> can already
90 query for a single database by using CQL query
91 <literal>udb==...</literal>.
96 <term>attribute <literal>content_url</literal></term>
99 URL of Web service to be used to fetch target profile
100 for a given database (udb) of type content. Semantics otherwise like
101 <literal>url</literal> attribute above.
105 <varlistentry id="auth_url">
106 <term>attribute <literal>auth_url</literal></term>
109 URL of Web service to be used for auth/IP lookup. If this is
110 defined, all access is granted or denied as part of Z39.50 Init
111 by the ZOOM module and the use of database parameters realm and
112 torus_url is not allowed. If this setting is not defined,
113 all access is allowed and realm and/or torus_url may be used.
118 <term>attribute <literal>realm</literal></term>
121 The default realm value. Used for %realm in URL, unless
122 specified in DATABASE parameter.
127 <term>attribute <literal>proxy</literal></term>
130 HTTP proxy to be used for fetching target profiles.
135 <term>attribute <literal>xsldir</literal></term>
138 Directory that is searched for XSL stylesheets. Stylesheets
139 are specified in the target profile by the
140 <literal>transform</literal> element.
145 <term>attribute <literal>element_transform</literal></term>
148 Specifies the element that triggers retrieval and transform using
149 the parameters elementSet, recordEncoding, requestSyntax, transform
150 from the target profile. Default value
151 is "pz2", due to the fact that for historical reasons the
152 common format is that used in Pazpar2.
157 <term>attribute <literal>element_raw</literal></term>
160 Specifies an element that triggers retrieval using the
161 parameters elementSet, recordEncoding, requestSyntax from the
162 target profile. Same actions as for element_transform, but without
163 the XSL transform. Useful for debugging.
164 The default value is "raw".
169 <term>attribute <literal>explain_xsl</literal></term>
172 Specifies a stylesheet that converts one or more Torus records
173 to ZeeExplain records. The content of recordData is assumed to be
174 holding each Explain record.
179 <term>attribute <literal>record_xsl</literal></term>
182 Specifies a stylesheet that converts retrieval records after
183 transform/literal operations.
186 When Metaproxy creates a content proxy session, the XSL parameter
187 <literal>cproxyhost</literal> is passed to the transform.
192 <term>element <literal>records</literal></term>
195 Local target profiles. This element may include zero or
196 more <literal>record</literal> elements (one per target
197 profile). See section TARGET PROFILE.
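<para>
 As a rough sketch of how the attributes above might be combined, the
 fragment below fetches target profiles from a remote service and
 supplies a default realm. The host name
 <literal>torus.example.org</literal>, the realm
 <literal>demo</literal> and the stylesheet directory are hypothetical
 placeholders, not values from a real installation.
</para>
<screen><![CDATA[
<!-- Hypothetical values: adjust host, realm and path to your setup -->
<torus url="http://torus.example.org/%realm/records/?query=%query"
       realm="demo"
       xsldir="/usr/share/metaproxy/xsl"
       element_transform="pz2"
       element_raw="raw"/>
]]></screen>
<para>
 Here <literal>%realm</literal> would expand to
 <literal>demo</literal> unless a realm is given as a DATABASE
 parameter, and <literal>%query</literal> would be replaced with the
 CQL query for the Torus search.
</para>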
203 <refsect2 id="fieldmap">
204 <title>fieldmap</title>
206 The <literal>fieldmap</literal> may be specified zero or more times and
207 specifies the map from CQL fields to CCL fields and takes the
212 <term>attribute <literal>cql</literal></term>
215 CQL field that we are mapping "from".
220 <term>attribute <literal>ccl</literal></term>
223 CCL field that we are mapping "to".
229 <refsect2 id="cclmap_base">
230 <title>cclmap</title>
232 The third part of the configuration consists of zero or more
233 <literal>cclmap</literal> elements that specify the
234 <emphasis>base</emphasis> CCL profile to be used for all targets.
235 This configuration will thus be combined with cclmap definitions
236 from the target profile.
240 <title>contentProxy</title>
242 The <literal>contentProxy</literal> element controls content proxy'ing. It
244 is optional and must only be defined if content proxy'ing is enabled.
248 <term>attribute <literal>config_file</literal></term>
251 Specifies the file that configures the cf-proxy system. Metaproxy
252 uses setting <literal>sessiondir</literal> and
253 <literal>proxyhostname</literal> from that file to configure
254 name of proxy host and directory of parameter files for the cf-proxy.
259 <term>attribute <literal>server</literal></term>
262 Specifies the content proxy host. The host is of the form
263 host[:port]. That is without a method (such as HTTP) and optional
268 This setting is deprecated. Use the config_file (above)
269 to inform about the proxy server.
275 <term>attribute <literal>tmp_file</literal></term>
278 Specifies a filename of a session file for content proxy'ing. The
279 file should be an absolute filename that includes
280 <literal>XXXXXX</literal> which is replaced by a unique filename
281 using the mkstemp(3) system call. The default value of this
282 setting is <literal>/tmp/cf.XXXXXX.p</literal>.
286 This setting is deprecated. Use the config_file (above)
287 to inform about the session file area.
297 The <literal>log</literal> element controls logging for the
302 <term>attribute <literal>apdu</literal></term>
305 If the value of apdu is "true", then protocol packages
306 (APDUs and HTTP packages) from the ZOOM filter will be
307 logged to the yaz_log system. A value of "false" will
308 not perform logging of protocol packages (the default
319 The <literal>zoom</literal> element controls settings for the
324 <term>attribute <literal>timeout</literal></term>
327 Is an integer that specifies, in seconds, how long an operation
328 may take before ZOOM gives up. Default value is 40.
333 <term>attribute <literal>proxy_timeout</literal></term>
336 Is an integer that specifies, in seconds, how long
337 a proxy check will wait before giving up. Default value is 1.
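<para>
 A minimal sketch of the <literal>log</literal> and
 <literal>zoom</literal> elements, assuming both are written as empty
 elements that carry only the attributes described above. The values
 shown simply enable APDU logging and restate the documented defaults
 for the two timeouts.
</para>
<screen><![CDATA[
<!-- Log protocol packages (APDUs and HTTP packages) via yaz_log -->
<log apdu="true"/>
<!-- Give up on an operation after 40 s, on a proxy check after 1 s -->
<zoom timeout="40" proxy_timeout="1"/>
]]></screen>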
346 <title>QUERY HANDLING</title>
348 The ZOOM filter accepts three query types: RPN (Type-1), CCL and CQL.
352 Queries are converted in two separate steps. In the first step
353 the input query is converted to RPN/Type-1. This is always
354 the common internal format between step 1 and step 2.
355 In step 2 the query is converted to the native query type of the target.
358 Step 1: for RPN, the query is passed un-modified to the target.
361 Step 1: for CCL, the query is converted to RPN via
362 <link linkend="zoom-torus-cclmap"><literal>cclmap</literal></link>
364 the target profile as well as
365 <link linkend="cclmap_base">base CCL maps</link>.
368 Step 1: For CQL, the query is converted to CCL. The mappings of
369 CQL fields to CCL fields are handled by
370 <link linkend="fieldmap"><literal>fieldmap</literal></link>
371 elements as part of the target profile. The resulting query, CCL,
372 is then converted to RPN using the schema mentioned earlier (via
373 <literal>cclmap</literal>).
376 Step 2: If the target is Z39.50-based, it is passed verbatim (RPN).
377 If the target is SRU-based, the RPN will be converted to CQL.
378 If the target is Solr-based, the RPN will be converted to Solr's query
384 <title>SORTING</title>
386 The ZOOM module actively handles CQL sorting, using the SORTBY parameter
387 which was introduced in SRU version 1.2. The conversion from a SORTBY clause
388 to the native sort for a target is driven by two parameters:
389 <link linkend="zoom-torus-sortStrategy">
390 <literal>sortStrategy</literal>
392 and <link linkend="zoom-torus-sortmap">
393 <literal>sortmap_</literal><replaceable>field</replaceable>
397 A sort field that does not have an equivalent
398 <literal>sortmap_</literal> mapping is passed unmodified through the
399 conversion. No diagnostic is returned in that case.
404 <title>TARGET PROFILE</title>
406 The ZOOM module is driven by a number of settings that specify how
407 to handle each target.
408 Note that unknown elements are silently <emphasis>ignored</emphasis>.
411 The elements, in alphabetical order, are:
415 <term id="zoom-torus-authentication">authentication</term><listitem>
417 Authentication parameters to be sent to the target. For
418 Z39.50 targets, this will be sent as part of the
419 Init Request. Authentication consists of two components: username
420 and password, separated by a slash.
423 If this value is omitted or empty, no authentication information is sent.
429 <term id="zoom-torus-authenticationMode">authenticationMode</term><listitem>
431 Specifies how authentication parameters are passed to server
432 for SRU. Possible values are: <literal>url</literal>
433 and <literal>basic</literal>. For the url mode username and password
434 are carried in URL arguments x-username and x-password.
435 For the basic mode, HTTP basic authentication is used.
436 This setting only takes effect
437 if <link linkend="zoom-torus-authentication">authentication</link>
441 If this value is omitted, HTTP basic authentication is used.
446 <varlistentry id="zoom-torus-cclmap">
447 <term>cclmap_<replaceable>field</replaceable></term><listitem>
449 This value specifies the CCL field (qualifier) definition for a
450 field. For Z39.50 targets this will most likely specify the
451 mapping to a numeric use attribute + a structure attribute.
452 For SRU targets, the use attribute should be string based, in
453 order to make the RPN to CQL conversion work properly (step 2).
459 <term>cfAuth</term><listitem>
461 When cfAuth is defined, its value will be used as authentication
462 to backend target and authentication setting will be specified
463 as part of a database. This is like a "proxy" for authentication and
464 is used for Connector Framework based targets.
470 <term id="zoom-torus-cfproxy">cfProxy</term><listitem>
472 Specifies HTTP proxy for the target in the form
473 <replaceable>host</replaceable>:<replaceable>port</replaceable>.
479 <term>cfSubDB</term><listitem>
481 Specifies sub database for a Connector Framework based target.
486 <varlistentry id="zoom-torus-contentConnector">
487 <term>contentConnector</term><listitem>
489 Specifies a database for content-based proxy'ing.
495 <term>elementSet</term><listitem>
497 Specifies the elementSet to be sent to the target if record
498 transform is enabled (not to be confused with the record_transform
499 module). The record transform is enabled only if the client uses
500 record syntax = XML and an element set determined by
501 the <literal>element_transform</literal> /
502 <literal>element_raw</literal> from the configuration.
503 By default that is the element sets <literal>pz2</literal>
504 and <literal>raw</literal>.
505 If record transform is not enabled, this setting is
506 not used and the element set specified by the client
513 <term>literalTransform</term><listitem>
515 Specifies an XSL stylesheet to be used if record
516 transform is enabled; see description of elementSet.
517 The XSL transform is only used if the element set is set to the
518 value of <literal>element_transform</literal> in the configuration.
521 The value of literalTransform is the XSL - string encoded.
527 <term>piggyback</term><listitem>
529 A value of 1/true is a hint to the ZOOM module that this Z39.50
530 target supports piggyback searches, i.e. Search Response with
531 records. Any other value (false) will prevent the ZOOM module
532 from making use of piggyback (all records part of Present Response).
538 <term>queryEncoding</term><listitem>
540 If this value is defined, all queries will be converted
541 to this encoding. This should be used for all Z39.50 targets that
542 do not use UTF-8 for query terms.
548 <term>recordEncoding</term><listitem>
550 Specifies the character encoding of records that are returned
551 by the target. This is primarily used for targets where records
552 are not UTF-8 encoded already. This setting is only used
553 if the record transform is enabled (see description of elementSet).
559 <term>requestSyntax</term><listitem>
561 Specifies the record syntax to be specified for the target
562 if record transform is enabled; see description of elementSet.
563 If record transform is not enabled, the record syntax of the
564 client is passed verbatim to the target.
569 <varlistentry id="zoom-torus-sortmap">
570 <term>sortmap_<replaceable>field</replaceable></term><listitem>
572 This value specifies the native field for a target. The form of the value is
573 given by <link linkend="zoom-torus-sortStrategy">sortStrategy</link>.
578 <varlistentry id="zoom-torus-sortStrategy">
579 <term>sortStrategy</term><listitem>
581 Specifies sort strategy for a target. One of:
582 <literal>z3950</literal>, <literal>type7</literal>,
583 <literal>cql</literal>, <literal>sru11</literal> or
584 <literal>embed</literal>. The <literal>embed</literal> chooses type-7
585 or CQL sortby depending on whether Type-1 or CQL is
586 actually sent to the target.
592 <term>sru</term><listitem>
594 If this setting is set, it specifies that the target is web service
595 based. The value must be one of: <literal>get</literal>,
596 <literal>post</literal>, <literal>soap</literal>
597 or <literal>solr</literal>.
602 <varlistentry id="sruVersion">
603 <term>sruVersion</term><listitem>
605 Specifies the SRU version to use. If unset, version 1.2 will be
606 used. Some servers do not support this version, in which case
607 version 1.1 or even 1.0 could be set instead.
612 <varlistentry id="transform">
613 <term>transform</term><listitem>
615 Specifies an XSL stylesheet filename to be used if record
616 transform is enabled; see description of elementSet.
617 The XSL transform is only used if the element set is set to the
618 value of <literal>element_transform</literal> in the configuration.
623 <varlistentry id="udb">
624 <term>udb</term><listitem>
626 This value is required and specifies the unique database for
627 this profile. All target profiles should hold a unique database.
632 <varlistentry id="urlRecipe">
633 <term>urlRecipe</term><listitem>
635 The value of this field is a string that generates a dynamic link
636 based on record content. If the resulting string is non-zero in length
637 a new field, <literal>metadata</literal> with attribute
638 <literal>type="generated-url"</literal> is generated.
639 The content of this field is the result of the URL recipe conversion.
640 The urlRecipe value may refer to an existing metadata element by
641 ${field[pattern/result/flags]}, which will take content
642 of field and perform a regular expression conversion using the pattern
643 given. For example: <literal>${md-title[\s+/+/g]}</literal> takes
644 metadata element <literal>title</literal> and converts one or more
645 spaces to a plus character.
650 <varlistentry id="zurl">
651 <term>zurl</term><listitem>
653 This setting is mandatory and specifies the ZURL of the
654 target in the form of host/database. The HTTP method should
655 not be provided as this is guessed from the "sru" attribute value.
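<para>
 To illustrate how several of these settings fit together, the sketch
 below shows what a local target profile for an SRU target might look
 like. It assumes the profile is given as a <literal>record</literal>
 inside the <literal>records</literal> element of
 <literal>torus</literal>. The database name, host and credentials are
 hypothetical, and the string-based use attribute
 <literal>u=title</literal> is only an assumption about the index
 names the target understands.
</para>
<screen><![CDATA[
<record>
  <!-- Hypothetical unique database name for this profile -->
  <udb>sru-example</udb>
  <!-- String-based use attribute so RPN to CQL conversion can work -->
  <cclmap_term>t=z</cclmap_term>
  <cclmap_ti>u=title t=z</cclmap_ti>
  <!-- Web service target using SRU GET, with an older SRU version -->
  <sru>get</sru>
  <sruVersion>1.1</sruVersion>
  <!-- username/password, sent as x-username and x-password -->
  <authentication>guest/secret</authentication>
  <authenticationMode>url</authenticationMode>
  <!-- host/database, without an HTTP method prefix -->
  <zurl>sru.example.org/catalog</zurl>
</record>
]]></screen>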
662 <title>DATABASE parameters</title>
664 Extra information may be carried in the Z39.50 Database or SRU path,
665 such as authentication to be passed to backend etc. Some of
666 the parameters override TARGET profile values. The format is
669 udb,parm1=value1&parm2=value2&...
672 Where udb is the unique database recognised by the backend and parm1,
673 value1, .. are parameters to be passed. The following describes the
674 supported parameters. Like form values in HTTP the parameters and
675 values are URL encoded. The separator, though, between udb and parameters
676 is a comma rather than a question mark. What follows the question mark are
677 HTTP arguments (in this case SRU arguments).
681 <term>content-password</term>
684 The password to be used for content proxy session. If this parameter
685 is not given, value of parameter <literal>password</literal> is passed
686 to content proxy session.
691 <term>content-proxy</term>
694 Specifies proxy to be used for content proxy session. If this parameter
695 is not given, value of parameter <literal>proxy</literal> is passed
696 to content proxy session.
701 <term>content-user</term>
704 The user to be used for content proxy session. If this parameter
705 is not given, value of parameter <literal>user</literal> is passed
706 to content proxy session.
711 <term>cproxysession</term>
714 Specifies the session ID for content proxy. This parameter is, generally,
715 not used by anything but the content proxy itself when invoking
721 <term>nocproxy</term>
724 If this parameter is specified, content-proxying is disabled
730 <term>password</term>
733 Specifies password to be passed to backend. It is also passed
734 to content proxy session unless overridden by content-password.
735 If this parameter is omitted, the password will be taken from
736 TARGET profile setting
737 <link linkend="zoom-torus-authentication">
738 <literal>authentication</literal>
748 Specifies one or more proxies for backend. If this parameter is
749 omitted, the proxy will be taken from TARGET profile setting
750 <link linkend="zoom-torus-cfproxy">
751 <literal>cfProxy</literal></link>.
752 The parameter is a list of comma-separated host:port entries.
753 Both host and port must be given for each proxy.
761 Session realm to be used for this target. It changes the resulting
762 URL used for getting a target profile by changing the
763 value that gets substituted for the %realm string. This parameter
764 is not allowed if access is controlled by
765 <link linkend="auth_url">auth_url</link>
772 <term>torus_url</term>
775 Sets the URL to be used for Torus records fetch - overriding value
776 of <literal>url</literal> attribute of element <literal>torus</literal>
777 in zoom configuration. This parameter is not allowed if access is
779 <link linkend="auth_url">auth_url</link> in configuration.
788 Specifies user to be passed to backend. It is also passed
789 to content proxy session unless overridden by content-user.
790 If this parameter is omitted, the user will be taken from TARGET
792 <link linkend="zoom-torus-authentication">
793 <literal>authentication</literal>
803 All parameters that have the prefix x- are passed verbatim
811 <title>SCHEMA</title>
812 <literallayout><xi:include
813 xi:href="../xml/schema/filter_zoom.rnc"
815 xmlns:xi="http://www.w3.org/2001/XInclude" />
820 <title>EXAMPLES</title>
822 In the example below, target definitions (Torus records) are fetched
823 from a web service via a proxy. A CQL profile is configured which
824 maps to a set of CCL fields ("no field", au, ti and su). Presumably
825 the target definitions fetched map the CCL to their native RPN.
826 A CCL "ocn" is mapped for all targets. Logging of APDUs is enabled,
827 and a timeout is given.
831 url="http://torus.indexdata.com/src/records/?query=%query"
832 proxy="localhost:3128"
834 <fieldmap cql="cql.anywhere"/>
835 <fieldmap cql="cql.serverChoice"/>
836 <fieldmap cql="dc.creator" ccl="au"/>
837 <fieldmap cql="dc.title" ccl="ti"/>
838 <fieldmap cql="dc.subject" ccl="su"/>
842 <attr type="u" value="12"/>
843 <attr type="s" value="107"/>
854 Here is another example with two locally defined targets: A
855 Solr target and a Z39.50 target.
863 <cclmap_term>t=z</cclmap_term>
864 <cclmap_ti>u=title t=z</cclmap_ti>
866 <zurl>ocs-test.indexdata.com/solr/select</zurl>
870 <cclmap_term>t=l,r</cclmap_term>
871 <cclmap_ti>u=4 t=l,r</cclmap_ti>
872 <zurl>z3950.loc.gov:7090/voyager</zurl>
876 <fieldmap cql="cql.serverChoice"/>
877 <fieldmap cql="dc.title" ccl="ti"/>
885 <title>SEE ALSO</title>
888 <refentrytitle>metaproxy</refentrytitle>
889 <manvolnum>1</manvolnum>
894 <refentrytitle>virt_db</refentrytitle>
895 <manvolnum>3mp</manvolnum>
903 <!-- Keep this comment at the end of the file