2 <title>The YAZ Proxy</title>
4 The YAZ proxy is a transparent Z39.50-to-Z39.50 gateway. That is,
5 it is a Z39.50 server which has as its back-end a Z39.50 client
6 that forwards requests on to another server (known as the
7 <firstterm>backend target</firstterm>).
10 The YAZ proxy is useful for debugging Z39.50 software, logging
11 APDUs, redirecting Z39.50 traffic through firewalls, and so on.
12 Furthermore, it offers facilities that often
13 boost performance for connectionless Z39.50 clients such as web gateways.
17 Unlike most other server software, the proxy runs as a single
18 process with a single thread. Every I/O operation
19 is non-blocking, which keeps it lightweight and fast.
20 It does not store any state information on the hard drive,
21 except any log files you ask for.
24 <section id="proxy-example">
25 <title>Example: Using the Proxy to Log APDUs</title>
27 Suppose you use a commercial Z39.50 client for which you do not
28 have source code, and it's not behaving how you think it should
29 when running against some specific server that you have no control
30 over. One way to diagnose the problem is to find out what packets
31 (APDUs) are being sent and received, but not all client
32 applications have facilities to do APDU logging.
35 No problem. Run the proxy on a friendly machine, get it to log
36 APDUs, and point the errant client at the proxy instead of
37 directly at the server that's causing it problems.
40 Suppose the server is running on <literal>foo.bar.com</literal>,
41 port 18398. Run the proxy on the machine of your choice, say
42 <literal>your.company.com</literal>, like this:
45 yaz-proxy -a - -t tcp:foo.bar.com:18398 tcp:@:9000
48 (The <literal>-a -</literal> option requests APDU logging on
49 standard output, <literal>-t tcp:foo.bar.com:18398</literal>
50 specifies where the backend target is, and
51 <literal>tcp:@:9000</literal> tells the proxy to listen on port
52 9000 and accept connections from any machine.)
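If you launch the proxy from a script, the same invocation can be assembled programmatically. The following is a minimal Python sketch, not part of YAZ: the helper name build_proxy_command and its parameters are hypothetical, and it relies only on the -a and -t options explained above.

```python
import shlex


def build_proxy_command(target, listen, apdu_log=None):
    """Return the yaz-proxy argument list for the given settings.

    build_proxy_command is a hypothetical helper used for illustration only.
    """
    args = ["yaz-proxy"]
    if apdu_log is not None:
        # "-a -" logs APDUs to standard output; any other value is a filename.
        args += ["-a", apdu_log]
    # "-t" names the backend target that the proxy forwards requests to.
    args += ["-t", target]
    # The final argument is the address the proxy listens on.
    args.append(listen)
    return args


command = build_proxy_command("tcp:foo.bar.com:18398", "tcp:@:9000", apdu_log="-")
# Show the command as it would be typed in a shell.
print(shlex.join(command))
```

The resulting list can be handed to subprocess.Popen unchanged, which avoids shell quoting problems when the target or log filename comes from configuration.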
55 Now change your client application's configuration so that instead
56 of connecting to <literal>foo.bar.com</literal> port 18398, it
57 connects to <literal>your.company.com</literal> port 9000, and
58 start it up. It will work exactly as usual, but all the packets
59 will be sent via the proxy, which will generate a log like this:
64 referenceId OCTETSTRING(len=4) 69 6E 69 74
65 protocolVersion BITSTRING(len=1)
66 options BITSTRING(len=2)
67 preferredMessageSize 1048576
68 maximumRecordSize 1048576
69 implementationId 'Mike Taylor (id=169)'
70 implementationName 'Net::Z3950.pm (Perl)'
71 implementationVersion '0.31'
75 referenceId OCTETSTRING(len=4) 69 6E 69 74
76 protocolVersion BITSTRING(len=1)
77 options BITSTRING(len=2)
78 preferredMessageSize 1048576
79 maximumRecordSize 1048576
82 implementationName 'GFS/YAZ / Zebra Information Server'
83 implementationVersion 'YAZ 1.9.1 / Zebra 1.3.3'
87 referenceId OCTETSTRING(len=1) 30
90 mediumSetPresentNumber 0
92 resultSetName 'default'
97 smallSetElementSetNames choice
101 mediumSetElementSetNames choice
104 preferredRecordSyntax OID: 1 2 840 10003 5 10
108 attributeSetId OID: 1 2 840 10003 3 1
116 general OCTETSTRING(len=7) 6D 69 6E 65 72 61 6C
125 <section id="proxy-target">
126 <title>Specifying the Backend Target</title>
128 When the proxy accepts a Z39.50 client session, it
129 determines the backend target by the following rules:
132 <para> If the <literal>InitializeRequest</literal> PDU from the
133 client includes an <literal>otherInfo</literal> element with OID
134 <literal>1.2.840.10003.10.1000.81.1</literal>, then the
135 contents of that element specify the target to be used, in the
136 usual YAZ address format (typically
137 <literal>tcp:<parameter>hostname</parameter>:<parameter>port</parameter></literal>)
139 <ulink url="http://www.indexdata.dk/yaz/doc/comstack.addresses.php"
140 >the Addresses section of the YAZ manual</ulink>.
144 <para> Otherwise, the proxy uses the default target, if one was
145 specified on the command line with the <literal>-t</literal>
150 <para> Otherwise, the proxy closes the connection with
157 <section id="proxy-keepalive">
158 <title>Keep-alive Facility for Stateless Clients</title>
160 Stateless clients such as web gateways may generate a cookie for a Z39.50
161 session which is sent to the proxy as part of PDUs.
162 In this case, the proxy will keep alive its Z39.50 session
163 to the backend target even when the connection from the client
164 to the proxy is closed. When the client contacts the
165 proxy again, and re-issues the same cookie, the proxy reuses the
166 Z39.50 connection with the backend target.
170 guarantee that the Z39.50 connection to the backend
171 target is kept forever: the proxy will shut it down after certain
173 <!-- ### How long? Wot no command-line option? -->
174 So in effect, the connection from the client's
175 point of view should be considered stateless, and the keep-alive
176 facility should be treated only as a performance booster.
179 Cookies may be passed in an <literal>otherInfo</literal> element
180 with OID <literal>1.2.840.10003.10.1000.81.2</literal>.
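The keep-alive behaviour can be pictured as a table of backend sessions keyed by cookie. The Python sketch below is a toy model under stated assumptions, not the proxy's actual implementation: the class BackendSessionPool, its acquire method, and the idle_timeout value are all hypothetical, and a plain dictionary stands in for a real Z39.50 connection.

```python
import time


class BackendSessionPool:
    """Toy model of cookie-keyed backend sessions (not the proxy's real code)."""

    def __init__(self, idle_timeout=60.0, clock=time.monotonic):
        # idle_timeout is an assumed value; the text does not state the real one.
        self.idle_timeout = idle_timeout
        self.clock = clock
        self.sessions = {}  # cookie -> (session, time last used)
        self.opened = 0     # number of backend sessions opened so far

    def _open_session(self, target):
        # Stand-in for opening a real Z39.50 connection to the backend target.
        self.opened += 1
        return {"target": target, "id": self.opened}

    def acquire(self, cookie, target):
        """Return the session for this cookie, reusing it if it is still alive."""
        now = self.clock()
        # Shut down sessions that have been idle for too long.
        expired = [c for c, (_, used) in self.sessions.items()
                   if now - used > self.idle_timeout]
        for c in expired:
            del self.sessions[c]
        entry = self.sessions.get(cookie)
        session = entry[0] if entry is not None else self._open_session(target)
        self.sessions[cookie] = (session, now)
        return session


pool = BackendSessionPool(idle_timeout=60.0)
first = pool.acquire("cookie-1", "tcp:foo.bar.com:18398")
second = pool.acquire("cookie-1", "tcp:foo.bar.com:18398")
print(first is second)  # the second call reuses the first session
```

Because a session may have expired between two requests, a client in this model can never tell whether it received a reused or a fresh backend session, which is why the facility should be treated as a performance aid only.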
184 <section id="proxy-cache">
185 <title>Query Caching</title>
187 Simple stateless clients often send identical Z39.50 searches
188 in a relatively short period of time (e.g. in order to produce a
189 results-list page, the next page,
190 a single full record, etc.). For many targets, it's
191 much more expensive to produce a new result set than to
192 reuse an existing one.
195 The proxy addresses this by remembering the last query for each
196 backend target, so that if the next query received is identical, it
197 is turned into Present Requests rather than a new Search Request.
199 <!-- ### should be generalised to an arbitrary-sized cache -->
201 This optimization should work for any Z39.50 client and/or
202 target. The target does not have to support named result sets.
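The last-query rule can be sketched in a few lines of Python. This is an illustration under assumptions rather than the proxy's code: the class LastQueryCache and the backend_search callable are hypothetical, queries are compared as plain strings, and the returned label merely indicates which kind of request would reach the backend.

```python
class LastQueryCache:
    """Remember only the most recent query per backend target (toy model)."""

    def __init__(self, backend_search):
        # backend_search(target, query) -> hit count; a hypothetical callable
        # standing in for a real Search Request sent to the backend.
        self.backend_search = backend_search
        self.last = {}  # target -> (query, hit count)

    def search(self, target, query):
        """Return (hits, kind), where kind is "search" or "present"."""
        cached = self.last.get(target)
        if cached is not None and cached[0] == query:
            # Same query as last time: reuse the existing result set, so the
            # backend only sees Present Requests.
            return cached[1], "present"
        # Different query: run a new search and replace the remembered one.
        hits = self.backend_search(target, query)
        self.last[target] = (query, hits)
        return hits, "search"


def slow_backend(target, query):
    # Pretend backend that always reports the same number of hits.
    return 10000


cache = LastQueryCache(slow_backend)
print(cache.search("tcp:foo.bar.com:18398", "mineral"))
print(cache.search("tcp:foo.bar.com:18398", "mineral"))
```

Only one query is remembered per target, so an intervening different query evicts the previous one and the next repeat of the original query is sent to the backend as a new search.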
204 <!-- ### There should be an option to turn this off, as it will
205 affect semantics for some searches on some databases:
206 e.g. "ten most recent stories" in a newswire database.
210 <section id="proxy-optimizations">
211 <title>Other Optimizations</title>
213 We've had some plans to support caching of result set records,
214 but this has not yet been implemented.
218 <section id="proxy-usage">
219 <title>Proxy Usage</title>
222 <refentry id="yaz-proxy">
224 <refentrytitle>yaz-proxy</refentrytitle>
225 <manvolnum>8</manvolnum>
228 <refname>yaz-proxy</refname>
229 <refpurpose>The YAZ toolkit's transparent Z39.50 proxy</refpurpose>
233 <command>yaz-proxy</command>
234 <arg choice="opt">-a <replaceable>filename</replaceable></arg>
235 <arg choice="opt">-c <replaceable>num</replaceable></arg>
236 <arg choice="opt">-v <replaceable>level</replaceable></arg>
237 <arg choice="opt">-t <replaceable>target</replaceable></arg>
238 <arg choice="opt">-u <replaceable>auth</replaceable></arg>
239 <arg choice="opt">-o <replaceable>level</replaceable></arg>
240 <arg choice="req"><replaceable>host</replaceable>:<replaceable>port</replaceable></arg>
244 <refsect1><title>DESCRIPTION</title>
246 The proxy runs stand-alone (not from
247 <literal>inetd</literal>). The
248 <replaceable>host</replaceable>:<replaceable>port</replaceable>
249 argument specifies the host address and the port to
250 listen on. Use the host <literal>@</literal>
251 to listen for connections coming from any address.
254 <refsect1><title>OPTIONS</title>
256 <varlistentry><term>-a <replaceable>filename</replaceable></term>
258 Specifies the name of a file to which to write a log of the
259 APDUs (protocol packets) that pass through the proxy. The
260 special filename <literal>-</literal> may be used to indicate standard output.
264 <varlistentry><term>-c <replaceable>num</replaceable></term>
266 Specifies the maximum number of connections to be cached
270 <varlistentry><term>-v <replaceable>level</replaceable></term>
272 Sets the logging level. <replaceable>level</replaceable> is
273 a comma-separated list of members of the set
274 {<literal>fatal</literal>,<literal>debug</literal>,<literal>warn</literal>,<literal>log</literal>,<literal>malloc</literal>,<literal>all</literal>,<literal>none</literal>}.
277 <varlistentry><term>-t <replaceable>target</replaceable></term>
279 Specifies the default backend target to use when a client
280 connects that does not explicitly specify a target in its
281 <literal>initRequest</literal>.
284 <varlistentry><term>-u <replaceable>auth</replaceable></term>
286 Specifies authentication info to be sent to the backend target.
287 This is useful if you happen to have an internal target that
288 requires authentication, or if the client software does not allow
292 <varlistentry><term>-o <replaceable>level</replaceable></term>
294 Sets the optimization level. Use zero to disable; non-zero
295 to enable. Handling for this is not fully implemented;
296 we will probably use a bit mask to enable/disable specific
303 <title>EXAMPLES</title>
305 The following command starts the proxy, listening on port
306 9000, with its default backend target set to the Library of
307 Congress bibliographic server:
310 $ yaz-proxy -t z3950.loc.gov:7090 @:9000
313 The LOC target is sometimes very slow. You can connect to
314 it through the proxy using yaz-client as follows:
317 $ yaz-client localhost:9000/voyager
320 Connection accepted by target.
322 Name : Voyager LMS - Z39.50 Server
324 Options: search present
328 Received SearchResponse.
329 Search was a success.
330 Number of hits: 10000
335 Received SearchResponse.
336 Search was a success.
337 Number of hits: 10000
342 In this test, the second search was more than 4000 times faster
343 than the first, because the proxy cached the result of the first
344 search and noticed that the second was the same.
347 The YAZ command-line client,
348 <literal>yaz-client</literal>,
349 allows you to set the backend target in
350 the <literal>initRequest</literal> using the
351 <literal>-p</literal> option. For example, to connect to
352 Index Data's target you could use:
355 yaz-client -p indexdata.dk localhost:9000/gils
361 <!-- Keep this comment at the end of the file
366 sgml-minimize-attributes:nil
367 sgml-always-quote-attributes:t
370 sgml-parent-document: "yaz++.xml"
371 sgml-local-catalogs: nil
372 sgml-namecase-general:t