1 # $Id: ZOOM.pod,v 1.30 2006-03-09 12:57:19 mike Exp $
8 ZOOM - Perl extension implementing the ZOOM API for Information Retrieval
14 $conn = new ZOOM::Connection($host, $port,
15 databaseName => "mydb");
16 $conn->option(preferredRecordSyntax => "usmarc");
17 $rs = $conn->search_pqf('@attr 1=4 dinosaur');
19 print $rs->record(0)->render();
22 print "Error ", $@->code(), ": ", $@->message(), "\n";
27 This module provides a nice, Perlish implementation of the ZOOM
28 Abstract API described and documented at http://zoom.z3950.org/api/
30 The ZOOM module is implemented as a set of thin classes on top of the
31 non-OO functions provided by this distribution's C<Net::Z3950::ZOOM>
33 module, which in turn is a thin layer on top of the ZOOM-C code supplied as part of
34 Index Data's YAZ Toolkit. Because ZOOM-C is also the underlying code
35 that implements ZOOM bindings in C++, Visual Basic, Scheme, Ruby, .NET
36 (including C#) and other languages, this Perl module works compatibly
37 with those other implementations. (Of course, the point of a public
38 API such as ZOOM is that all implementations should be compatible
39 anyway; but knowing that the same code is running is reassuring.)
41 The ZOOM module provides two enumerations (C<ZOOM::Error> and
42 C<ZOOM::Event>), two utility functions C<diag_str()> and C<event()> in
43 the C<ZOOM> package itself, and eight classes:
53 Of these, the Query class is abstract, and has three concrete
58 subclasses: C<ZOOM::Query::CQL>, C<ZOOM::Query::PQF> and C<ZOOM::Query::CQL2RPN>.
59 Finally, it also provides a
61 module which supplies a useful general-purpose logging facility.
62 Many useful ZOOM applications can be built using only the Connection,
63 ResultSet, Record and Exception classes, as in the example
66 A typical application will begin by creating a Connection object,
67 then using that to execute searches that yield ResultSet objects, then
68 fetching records from the result-sets to yield Record objects. If an
69 error occurs, an Exception object is thrown and can be dealt with.
71 More sophisticated applications might also browse the server's indexes
72 to create a ScanSet, from which indexed terms may be retrieved; others
73 might send ``Extended Services'' Packages to the server, to achieve
74 non-standard tasks such as database creation and record update.
75 Searching using a query syntax other than PQF can be done using a
76 query object of one of the Query subclasses. Finally, sets of options
77 may be manipulated independently of the objects they are associated
78 with using an Options object.
80 In general, method calls throw an exception if anything goes wrong, so
81 you don't need to test for success after each call. See the section
82 below on the Exception class for details.
84 =head1 UTILITY FUNCTIONS
86 =head2 ZOOM::diag_str()
88 $msg = ZOOM::diag_str(ZOOM::Error::INVALID_QUERY);
90 Returns a human-readable English-language string corresponding to the
91 error code that is its sole parameter. This works for any error-code
93 C<ZOOM::Exception::code()>,
94 C<ZOOM::Connection::error_x()>
96 C<ZOOM::Connection::errcode()>,
97 irrespective of whether it is a member of the C<ZOOM::Error>
98 enumeration or drawn from the BIB-1 diagnostic set.
103 Lark's vomit. Do not read this section.
105 $which = ZOOM::event([ $conn1, $conn2, $conn3 ]);
107 Used only in complex asynchronous applications, this function takes a
108 reference to a list of Connection objects, waits until an event
109 occurs on any one of them, and returns an integer indicating which of
110 the connections it occurred on. The return value is a 1-based index
111 into the list; 0 is returned if no event occurs within the longest
112 timeout specified by the C<timeout> options of all the connections.
115 This function is not yet implemented.
119 The eight ZOOM classes are described here in ``sensible order'':
120 first, the four commonly used classes, in the order that they will
121 tend to be used in most programs (Connection, ResultSet, Record,
122 Exception); then the four more esoteric classes in descending order of
123 how often they are needed.
125 With the exception of the Options class, which is an extension to the
126 ZOOM model, the introduction to each class includes a link to the
127 relevant section of the ZOOM Abstract API.
129 =head2 ZOOM::Connection
131 $conn = new ZOOM::Connection("indexdata.dk:210/gils");
132 print("server is '", $conn->option("serverImplementationName"), "'\n");
133 $conn->option(preferredRecordSyntax => "usmarc");
134 $rs = $conn->search_pqf('@attr 1=4 mineral');
135 $ss = $conn->scan_pqf('@attr 1=1003 a');
136 if ($conn->errcode() != 0) {
137 die("something went wrong: " . $conn->errmsg())
141 This class represents a connection to an information retrieval server,
142 using an IR protocol such as ANSI/NISO Z39.50, SRW (the
143 Search/Retrieve Webservice), SRU (the Search/Retrieve URL) or
144 OpenSearch. Not all of these protocols require a low-level connection
145 to be maintained, but the Connection object nevertheless provides a
146 location for the necessary cache of configuration and state
147 information, as well as a uniform API to the connection-oriented
148 facilities (searching, index browsing, etc.), provided by these
151 See the description of the C<Connection> class in the ZOOM Abstract
153 http://zoom.z3950.org/api/zoom-current.html#3.2
159 $conn = new ZOOM::Connection("indexdata.dk", 210);
160 $conn = new ZOOM::Connection("indexdata.dk:210/gils");
161 $conn = new ZOOM::Connection("tcp:indexdata.dk:210/gils");
162 $conn = new ZOOM::Connection("http:indexdata.dk:210/gils");
163 $conn = new ZOOM::Connection("indexdata.dk", 210,
164 databaseName => "mydb",
165 preferredRecordSyntax => "marc");
167 Creates a new Connection object, and immediately connects it to the
168 specified server. If you want to make a new Connection object but
169 delay forging the connection, use the C<create()> and C<connect()>
172 This constructor can be called with two arguments or a single
173 argument. In the former case, the arguments are the name and port
174 number of the Z39.50 server to connect to; in the latter case, the
175 single argument is a YAZ service-specifier string of the form
177 When the two-argument form is used (which may be done using a vacuous
178 second argument of zero), any number of additional argument pairs may
179 be provided, which are interpreted as key-value pairs to be set as
180 options after the Connection object is created but before it is
181 connected to the server. This is a convenient way to set options,
182 including those that must be set before connecting such as
183 authentication tokens.
189 [I<scheme>:]I<host>[:I<port>][/I<databaseName>]
193 In which the I<host> and I<port> parts are as in the two-argument
194 form, the I<databaseName> if provided specifies the name of the
195 database to be used in subsequent searches on this connection, and the
196 optional I<scheme> (default C<tcp>) indicates what protocol should be
197 used. At present, the following schemes are supported:
207 Z39.50 connection encrypted using SSL (Secure Sockets Layer). Not
208 many servers support this, but Index Data's Zebra is one that does.
212 Z39.50 connection on a Unix-domain (local) socket, in which case the
213 I<host> portion of the string is instead used as a filename in the
218 SRW connection using SOAP over HTTP.
222 Support for SRU will follow in the fullness of time.
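To make the shape of the service-specifier string concrete, here is a rough pure-Perl sketch that splits one into its parts. The C<parse_service_specifier()> helper is hypothetical and is not part of the ZOOM module; it recognises only the C<tcp> and C<http> scheme names used in the examples above, and YAZ itself remains the authority on how these strings are really interpreted.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the ZOOM module): split a specifier of
# the form [scheme:]host[:port][/databaseName] into its parts.
# Assumption: only the "tcp" and "http" schemes are recognised here.
sub parse_service_specifier {
    my ($spec) = @_;
    my ($scheme, $host, $port, $db) =
        $spec =~ m{\A(?:(tcp|http):)?([^:/]+)(?::(\d+))?(?:/(.+))?\z}
        or return;    # no match: return nothing
    return {
        scheme       => defined $scheme ? $scheme : "tcp",    # tcp is the default
        host         => $host,
        port         => $port,    # undef if not given
        databaseName => $db,      # undef if not given
    };
}

for my $spec ("indexdata.dk", "indexdata.dk:210/gils",
              "http:indexdata.dk:210/gils") {
    my $p = parse_service_specifier($spec) or die "cannot parse '$spec'\n";
    printf "%-28s scheme=%s host=%s port=%s db=%s\n", $spec,
        $p->{scheme}, $p->{host},
        defined $p->{port}         ? $p->{port}         : "(none)",
        defined $p->{databaseName} ? $p->{databaseName} : "(none)";
}
```

The scheme names are listed explicitly because a bare host followed by a port (such as C<localhost:9999>) would otherwise be indistinguishable from a scheme followed by a host.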
224 If an error occurs, an exception is thrown. This may indicate a
225 networking problem (e.g. the host is not found or unreachable), or a
226 protocol-level problem (e.g. a Z39.50 server rejected the Init
229 =head4 create() / connect()
231 $options = new ZOOM::Options();
232 $options->option(implementationName => "my client");
233 $conn = create ZOOM::Connection($options);
234 $conn->connect($host, 0);
236 The usual Connection constructor, C<new()>, brings a new object into
237 existence and forges the connection to the server all in one
238 operation, which is often what you want. For applications that need
239 more control, however, these two methods separate the two steps,
240 allowing additional steps in between such as the setting of options.
242 C<create()> creates and returns a new Connection object, which is
243 I<not> connected to any server. It may be passed an options block, of
244 type C<ZOOM::Options> (see below), into which options may be set
245 before or after the creation of the Connection. The connection to the
246 server may then be forged by the C<connect()> method, the arguments of
247 which are the same as those of the C<new()> constructor.
249 =head4 error_x() / errcode() / errmsg() / addinfo() / diagset()
251 ($errcode, $errmsg, $addinfo, $diagset) = $conn->error_x();
252 $errcode = $conn->errcode();
253 $errmsg = $conn->errmsg();
254 $addinfo = $conn->addinfo();
255 $diagset = $conn->diagset();
257 These methods may be used to obtain information about the last error
258 to have occurred on a connection - although typically they will not
259 be used, as the same information is available through the
260 C<ZOOM::Exception> that is thrown when the error occurs. The
266 methods each return one element of the diagnostic, and
268 returns all four at once.
270 See the C<ZOOM::Exception> class for the interpretation of these elements.
272 =head4 option() / option_binary()
274 print("server is '", $conn->option("serverImplementationName"), "'\n");
275 $conn->option(preferredRecordSyntax => "usmarc");
276 $conn->option_binary(iconBlob => "foo\0bar");
277 die if length($conn->option_binary("iconBlob")) != 7;
279 Objects of the Connection, ResultSet, ScanSet and Package classes
280 carry with them a set of named options which affect their behaviour in
281 certain ways. See the ZOOM-C options documentation for details:
283 Connection options are listed at
284 http://indexdata.com/yaz/doc/zoom.tkl#zoom.connections
286 These options are set and fetched using the C<option()> method, which
287 may be called with either one or two arguments. In the two-argument
288 form, the option named by the first argument is set to the value of
289 the second argument, and its old value is returned. In the
290 one-argument form, the value of the specified option is returned.
292 For historical reasons, values handled by C<option()> are not binary-clean, so that a
293 value containing a NUL byte will be returned in truncated form. The
294 C<option_binary()> method behaves identically to C<option()> except
295 that it is binary-clean, so that values containing NUL bytes are set
296 and returned correctly.
298 =head4 search() / search_pqf()
300 $rs = $conn->search(new ZOOM::Query::CQL('title=dinosaur'));
301 # The next two lines are equivalent
302 $rs = $conn->search(new ZOOM::Query::PQF('@attr 1=4 dinosaur'));
303 $rs = $conn->search_pqf('@attr 1=4 dinosaur');
305 The principal purpose of a search-and-retrieve protocol is searching
306 (and, er, retrieval), so the principal method used on a Connection
307 object is C<search()>. It accepts a single argument, a C<ZOOM::Query>
308 object (or, more precisely, an object of a subclass of this class);
309 and it creates and returns a new ResultSet object representing the set
310 of records resulting from the search.
312 Since queries using PQF (Prefix Query Format) are so common, we make
313 them a special case by providing a C<search_pqf()> method. This is
314 identical to C<search()> except that it accepts a string containing
315 the query rather than an object, thereby obviating the need to create
316 a C<ZOOM::Query::PQF> object. See the documentation of that class for
317 information about PQF.
319 =head4 scan() / scan_pqf()
321 $ss = $conn->scan(new ZOOM::Query::CQL('title=dinosaur'));
322 # The next two lines are equivalent
323 $ss = $conn->scan(new ZOOM::Query::PQF('@attr 1=4 dinosaur'));
324 $ss = $conn->scan_pqf('@attr 1=4 dinosaur');
326 Many Z39.50 servers allow you to browse their indexes to find terms to
327 search for. This is done using the C<scan()> method, which creates and
328 returns a new ScanSet object representing the set of terms resulting
331 C<scan()> takes a single argument, but it has to work hard: it
332 specifies both what index to scan for terms, and where in the index to
333 start scanning. What's more, the specification of what index to scan
334 includes multiple facets, such as what database fields it's an index
335 of (author, subject, title, etc.) and whether to scan for whole fields
336 or single words (e.g. the title ``I<The Empire Strikes Back>'', or the
337 four words ``Back'', ``Empire'', ``Strikes'' and ``The'', interleaved
338 with words from other titles in the same index).
340 All of this is done by using a Query object representing a query of a
341 single term as the C<scan()> argument. The attributes associated with
342 the term indicate which index is to be used, and the term itself
343 indicates the point in the index at which to start the scan. For
344 example, if the argument is the query C<@attr 1=4 fish>, then
350 This is the BIB-1 attribute with type 1 (meaning access-point, which
351 specifies an index), and value 4 (which means ``title''). So the scan
352 is in the title index.
356 Start the scan from the lexicographically earliest term that is equal
357 to or falls after ``fish''.
361 The argument C<@attr 1=4 @attr 6=3 fish> would behave similarly; but
362 the BIB-1 attribute 6=3 means completeness=``complete field'', so the
363 scan would be for complete titles rather than for words occurring in
366 This takes a bit of getting used to.
368 The behaviour of C<scan()> is affected by the following options, which
369 may be set on the Connection through which the scan is done:
373 =item number [default: 10]
375 Indicates how many terms should be returned in the ScanSet. The
376 number actually returned may be less, if the start-point is near the
377 end of the index, but will not be greater.
379 =item position [default: 1]
381 A 1-based index specifying where in the returned list of terms the
382 seed-term should appear. By default it should be the first term
383 returned, but C<position> may be set, for example, to zero (requesting
384 the next terms I<after> the seed-term), or to the same value as
385 C<number> (requesting the index terms I<before> the seed term).
387 =item stepSize [default: 0]
389 An integer indicating how many indexed terms are to be skipped between
390 each one returned in the ScanSet. By default, no terms are skipped,
391 but overriding this can be useful to get a high-level overview of the
394 Since scans using PQF (Prefix Query Format) are so common, we make
395 them a special case by providing a C<scan_pqf()> method. This is
396 identical to C<scan()> except that it accepts a string containing the
397 query rather than an object, thereby obviating the need to create a
398 C<ZOOM::Query::PQF> object.
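The interaction between C<number> and C<position> can be easier to see on a small in-memory list. The following is only a toy model of the behaviour described above, not server code: C<scan_window()> is a hypothetical function, C<stepSize> is ignored, and real servers may behave differently at the edges of an index.

```perl
use strict;
use warnings;

# Toy model (an assumption, not what any server literally does): pick the
# terms a scan would return from a sorted list, honouring the "number"
# and "position" options. stepSize is not modelled.
sub scan_window {
    my ($index, $seed, $number, $position) = @_;

    # Find the lexicographically earliest term equal to or after the seed.
    my $at = 0;
    $at++ while $at < @$index && $index->[$at] lt $seed;

    # The seed term is to appear at 1-based slot $position of the result.
    my $first = $at - ($position - 1);
    my $last  = $first + $number - 1;

    # Modelling choice: terms that would fall off either end are dropped,
    # so fewer than $number terms may come back.
    $first = 0         if $first < 0;
    $last  = $#$index  if $last > $#$index;
    return $first > $last ? () : @{$index}[$first .. $last];
}

my @index = qw(ant bee cat dog eel fish fox gnu);
for my $position (1, 0, 3) {
    my @terms = scan_window(\@index, "fish", 3, $position);
    print "position=$position: @terms\n";
}
```

With C<position> set to the same value as C<number>, the seed term comes last in the list; with C<position> set to zero, it does not appear at all.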
404 $p = $conn->package();
405 $o = new ZOOM::Options();
406 $o->option(databaseName => "newdb");
407 $p = $conn->package($o);
409 Creates and returns a new C<ZOOM::Package>, to be used in invoking an
410 Extended Service. An options block may optionally be passed in. See
411 the C<ZOOM::Package> documentation.
415 if ($conn->last_event() == ZOOM::Event::CONNECT) {
416 print "Connected!\n";
419 Returns a C<ZOOM::Event> enumerated value indicating the type of the
420 last event that occurred on the connection. This is used only in
421 complex asynchronous applications - see the section below on
422 C<ZOOM::Event> for more information.
425 This method has not been tested.
431 Destroys a Connection object, tearing down any low-level connection
432 associated with it and freeing its resources. It is an error to reuse
433 a Connection that has been C<destroy()>ed.
435 =head2 ZOOM::ResultSet
437 $rs = $conn->search_pqf('@attr 1=4 mineral');
440 $rec = $rs->record($i-1);
441 print $rec->render();
444 A ResultSet object represents the set of zero or more records
445 resulting from a search, and is the means whereby these records can be
446 retrieved. A ResultSet object may maintain a client-side cache of some,
447 none or all of the server's records: in general, this is
448 supposed to be an implementation detail of no interest to a typical
449 application, although more sophisticated applications do have
450 facilities for messing with the cache. Most applications will only
451 need the C<size()>, C<record()> and C<sort()> methods.
453 There is no C<new()> method nor any other explicit constructor. The
454 only way to create a new ResultSet is by using C<search()> (or
455 C<search_pqf()>) on a Connection.
457 See the description of the C<Result Set> class in the ZOOM Abstract
459 http://zoom.z3950.org/api/zoom-current.html#3.4
465 $rs->option(elementSetName => "f");
467 Allows options to be set into, and read from, a ResultSet, just like
468 the Connection class's C<option()> method. There is no
469 C<option_binary()> method for ResultSet objects.
471 ResultSet options are listed at
472 http://indexdata.com/yaz/doc/zoom.resultsets.tkl
476 print "Found ", $rs->size(), " records\n";
478 Returns the number of records in the result set.
480 =head4 record() / record_immediate()
482 $rec = $rs->record(0);
483 $rec2 = $rs->record_immediate(0);
484 $rec3 = $rs->record_immediate(1)
485 or print "second record wasn't in cache\n";
487 The C<record()> method returns a C<ZOOM::Record> object representing
488 a record from the result-set, whose position is indicated by the argument
489 passed in. This is a zero-based index, so that legitimate values
490 range from zero to C<< $rs->size()-1 >>.
492 The C<record_immediate()> API is identical, but it never invokes a
493 network operation, merely returning the record from the ResultSet's
494 cache if it's already there, or an undefined value otherwise. So if
495 you use this method, B<you must always check the return value>.
499 $rs->records(0, 10, 0);
501 print $rs->record_immediate($i)->render();
504 @nextseven = $rs->records(10, 7, 1);
506 The C<record_immediate()> method only fetches records from the cache,
507 whereas C<record()> fetches them from the server if they have not
508 already been cached; but the ZOOM module has to guess what the most
509 efficient strategy for this is. It might fetch each record alone,
510 when asked for: that's optimal in an application that's only
511 interested in the top hit from each search, but pessimal for one that
512 wants to display a whole list of results. Conversely, the software's
513 strategy might be always to ask for blocks of twenty records:
514 that's great for assembling long lists of things, but wasteful when
515 only one record is wanted. The problem is that the ZOOM module can't
516 tell, when you call C<< $rs->record() >>, what your intention is.
518 But you can tell it. The C<records()> method fetches a sequence of
519 records, all in one go. It takes three arguments: the first is the
520 zero-based index of the first record in the sequence, the second is
521 the number of records to fetch, and the third is a boolean indication
522 of whether or not to return the retrieved records as well as adding
523 them to the cache. (You can always pass 1 for this if you like, and
524 Perl will discard the unused return value, but there is a small
525 efficiency gain to be had by passing 0.)
527 Once the records have been retrieved from the server
528 (i.e. C<records()> has completed without throwing an exception), they
529 can be fetched much more efficiently using C<record()> - or
530 C<record_immediate()>, which is then guaranteed to succeed.
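The block-fetching pattern can be wrapped up in a small helper. This is a minimal sketch rather than anything the module provides: C<for_each_record()> is a hypothetical name, and it assumes only that C<$rs> offers the C<size()>, C<records()> and C<record_immediate()> methods described above.

```perl
use strict;
use warnings;

# Hypothetical helper: walk a whole result set in blocks of $block records.
# $rs is expected to behave like a ZOOM::ResultSet; $callback is called
# once per record with the record and its zero-based index.
sub for_each_record {
    my ($rs, $block, $callback) = @_;
    die "block size must be positive\n" unless $block > 0;

    my $size = $rs->size();
    for (my $start = 0; $start < $size; $start += $block) {
        my $left  = $size - $start;
        my $count = $left < $block ? $left : $block;

        # Fetch the whole block in one go; 0 = do not return the records.
        $rs->records($start, $count, 0);

        for my $i ($start .. $start + $count - 1) {
            # Still check the return value, as advised above.
            my $rec = $rs->record_immediate($i)
                or die "record $i unexpectedly missing from cache\n";
            $callback->($rec, $i);
        }
    }
    return $size;
}
```

A typical call would be C<< for_each_record($rs, 20, sub { print $_[0]->render() }) >>; the best block size depends on the application, which is exactly the knowledge the ZOOM module lacks.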
536 Resets the ResultSet's record cache, so that subsequent invocations of
537 C<record_immediate()> will fail. I struggle to imagine a real
538 scenario where you'd want to do this.
542 if ($rs->sort("yaz", "1=4 >i 1=21 >s") < 0) {
546 Sorts the ResultSet in place (discarding any cached records, as they
547 will in general be sorted into a different position). There are two
548 arguments: the first is a string indicating the type of the
549 sort-specification, and the second is the specification itself.
551 The C<sort()> method returns 0 on success, or -1 if the
552 sort-specification is invalid.
554 At present, the only supported sort-specification type is C<yaz>.
555 Such a specification consists of a space-separated sequence of keys,
556 each of which itself consists of two space-separated words (so that
557 the total number of words in the sort-specification is even). The two
558 words making up each key are a field and a set of flags. The field
559 can take one of two forms: if it contains an C<=> sign, then it is a
560 BIB-1 I<type>=I<value> pair specifying which field to sort
561 (e.g. C<1=4> for a title sort); otherwise it is sent for the server to
562 interpret as best it can. The word of flags is made up from one or
563 more of the following: C<s> for case sensitive, C<i> for case
564 insensitive; C<E<lt>> for ascending order and C<E<gt>> for descending
567 For example, the sort-specification in the code-fragment above will
568 sort the records in C<$rs> case-insensitively in descending order of
569 title, with records having equivalent titles sorted case-sensitively
570 in descending order of subject. (The BIB-1 access points 4 and 21
571 represent title and subject respectively.)
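As a way of checking one's reading of the C<yaz> sort-specification format, the rules above can be written down as a small decoder. This is a sketch only: C<parse_yaz_sortspec()> is a hypothetical helper that is not part of the module, and treating a missing direction flag as ascending and a missing case flag as case-insensitive is an assumption of this example rather than documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a "yaz" sort-specification into a list of
# keys, following only the rules stated in the text above.
sub parse_yaz_sortspec {
    my ($spec) = @_;
    my @words = split ' ', $spec;
    die "sort-specification needs an even, non-zero number of words\n"
        if !@words || @words % 2;

    my @keys;
    while (my ($field, $flags) = splice(@words, 0, 2)) {
        die "bad flags '$flags'\n" unless $flags =~ /\A[si<>]+\z/;
        push @keys, {
            field          => $field,
            is_bib1        => ($field =~ /=/ ? 1 : 0),   # type=value pair?
            descending     => ($flags =~ />/ ? 1 : 0),   # assumed default: ascending
            case_sensitive => ($flags =~ /s/ ? 1 : 0),   # assumed default: insensitive
        };
    }
    return @keys;
}

for my $key (parse_yaz_sortspec("1=4 >i 1=21 >s")) {
    printf "%-5s %s, %s\n", $key->{field},
        $key->{descending}     ? "descending"     : "ascending",
        $key->{case_sensitive} ? "case-sensitive" : "case-insensitive";
}
```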
577 Destroys a ResultSet object, freeing its resources. It is an error to
578 reuse a ResultSet that has been C<destroy()>ed.
582 $rec = $rs->record($i);
583 print $rec->render();
585 $marc = new_from_usmarc MARC::Record($raw);
586 print "Record title is: ", $marc->title(), "\n";
588 A Record object represents a record that has been retrieved from the
591 There is no C<new()> method nor any other explicit constructor. The
592 only way to create a new Record is by using C<record()> (or
593 C<record_immediate()>, or C<records()>) on a ResultSet.
595 In general, records are ``owned'' by the result-sets that they were
596 retrieved from, so they do not have to be explicitly memory-managed:
597 they are deallocated (and therefore can no longer be used) when the
598 result-set is destroyed.
600 See the description of the C<Record> class in the ZOOM Abstract
602 http://zoom.z3950.org/api/zoom-current.html#3.5
610 Returns a human-readable representation of the record. Beyond that,
611 no promises are made: careful programs should not make assumptions
612 about the format of the returned string.
614 This method is useful mostly for debugging.
620 $marc = new_from_usmarc MARC::Record($raw);
622 Returns an opaque blob of data that is the raw form of the record.
623 Exactly what this is, and what you can do with it, varies depending on
624 the record-syntax. For example, XML records will be returned as,
625 well, XML; MARC records will be returned as ISO 2709-encoded blocks
626 that can be decoded by software such as the fine C<MARC::Record>
627 module; GRS-1 records will be ... gosh, what an interesting question.
628 But no-one uses GRS-1 any more, do they?
630 =head4 clone() / destroy()
632 $rec = $rs->record($i);
633 $newrec = $rec->clone();
635 print $newrec->render();
638 Usually, it's convenient that Record objects are owned by their
639 ResultSets and go away when the ResultSet is destroyed; but
640 occasionally you need a Record to outlive its parent and destroy it
641 later, explicitly. To do this, C<clone()> the record, keep the new
642 Record object that is returned, and C<destroy()> it when it's no
643 longer needed. This is the B<only> situation in which a Record needs to
646 =head2 ZOOM::Exception
648 In general, method calls throw an exception (of class
649 C<ZOOM::Exception>) if anything goes wrong, so you don't need to test
650 for success after each call. Exceptions are caught by enclosing the
651 main code in an C<eval{}> block and checking C<$@> on exit from that
652 block, as in the code-sample above.
654 There are a small number of exceptions to this rule: the three
655 record-fetching methods in the C<ZOOM::ResultSet> class,
657 C<record_immediate()>,
660 can all return undefined values for legitimate reasons, under
661 circumstances that do not merit throwing an exception. For this
662 reason, the return values of these methods should be checked. See the
663 individual methods' documentation for details.
665 An exception carries the following pieces of information:
671 A numeric code that specifies the type of error. This can be checked
672 for equality with known values, so that intelligent applications can
673 take appropriate action.
677 A human-readable message corresponding with the code. This can be
678 shown to users, but its value should not be tested, as it could vary
679 in different versions or under different locales.
681 =item additional information [optional]
683 A string containing information specific to the error-code. For
684 example, when the error-code is the BIB-1 diagnostic 109 ("Database
685 unavailable"), the additional information is the name of the database
686 that the application tried to use. For some error-codes, there is no
687 additional information at all; for some others, the additional
688 information is undefined and may just be a human-readable string.
690 =item diagnostic set [optional]
692 A short string specifying the diagnostic set from which the error-code
693 was drawn: for example, C<ZOOM> for a ZOOM-specific error such as
694 C<ZOOM::Error::MEMORY> ("out of memory"), and C<BIB-1> for a Z39.50
695 error-code drawn from the BIB-1 diagnostic set.
699 In theory, the error-code should be interpreted in the context of the
700 diagnostic set from which it is drawn; in practice, nearly all errors
701 are from either the ZOOM or BIB-1 diagnostic sets, and the codes in
702 those sets have been chosen so as not to overlap, so the diagnostic
703 set can usually be ignored.
705 See the description of the C<Exception> class in the ZOOM Abstract
707 http://zoom.z3950.org/api/zoom-current.html#3.7
713 die new ZOOM::Exception($errcode, $errmsg, $addinfo, $diagset);
715 Creates and returns a new Exception object with the specified
716 error-code, error-message, additional information and diagnostic set.
717 Applications will not in general need to use this, but may find it
718 useful to simulate ZOOM exceptions. As is usual with Perl, exceptions
719 are thrown using C<die()>.
721 =head4 code() / message() / addinfo() / diagset()
723 print "Error ", $@->code(), ": ", $@->message(), "\n";
724 print "(addinfo '", $@->addinfo(), "', set '", $@->diagset(), "')\n";
726 These methods, of no arguments, return the exception's error-code,
727 error-message, additional information and diagnostic set respectively.
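The four accessors are often combined into a single diagnostic line. The sketch below shows one way to do that while coping with failures that are not ZOOM exceptions at all; C<describe_failure()> is a hypothetical helper, and it relies only on the accessor methods listed above.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: turn whatever was left in $@ into one line of text.
sub describe_failure {
    my ($err) = @_;
    return "no error" if !defined $err || (!ref $err && $err eq "");

    # Anything that is not a ZOOM::Exception (e.g. a plain die string)
    # is reported as-is.
    return "non-ZOOM error: $err"
        unless blessed($err) && $err->isa("ZOOM::Exception");

    my $text    = "Error " . $err->code() . ": " . $err->message();
    my $addinfo = $err->addinfo();    # optional
    my $diagset = $err->diagset();    # optional
    $text .= " ($addinfo)" if defined $addinfo && length $addinfo;
    $text .= " [$diagset]" if defined $diagset && length $diagset;
    return $text;
}
```

A typical call is C<print describe_failure($@), "\n" if $@;> placed immediately after an C<eval{}> block, before anything else has a chance to overwrite C<$@>.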
733 Returns a human-readable rendition of an exception. The C<"">
734 operator is overloaded on the Exception class, so that an Exception
735 used in a string context is automatically rendered. Among other
736 consequences, this has the useful result that a ZOOM application that
737 died due to an uncaught exception will emit an informative message
742 $ss = $conn->scan_pqf('@attr 1=1003 a');
744 ($term, $occ) = $ss->term($n-1);
745 $rs = $conn->search_pqf('@attr 1=1003 "' . $term . '"');
746 assert($rs->size() == $occ);
748 A ScanSet represents a set of candidate search-terms returned from an
749 index scan. Its sole purpose is to provide access to those terms, to
750 the corresponding display terms, and to the occurrence-counts of the
753 There is no C<new()> method nor any other explicit constructor. The
754 only way to create a new ScanSet is by using C<scan()> on a
757 See the description of the C<Scan Set> class in the ZOOM Abstract
759 http://zoom.z3950.org/api/zoom-current.html#3.6
765 print "Found ", $ss->size(), " terms\n";
767 Returns the number of terms in the scan set. In general, this will be
768 the scan-set size requested by the C<number> option in the Connection
769 on which the scan was performed [default 10], but it may be fewer if
770 the scan is close to the end of the index.
772 =head4 term() / display_term()
774 $ss = $conn->scan_pqf('@attr 1=1004 whatever');
775 ($term, $occurrences) = $ss->term(0);
776 ($displayTerm, $occurrences2) = $ss->display_term(0);
777 assert($occurrences == $occurrences2);
778 if (user_likes_the_look_of($displayTerm)) {
779 $rs = $conn->search_pqf('@attr 1=1004 "' . $term . '"');
780 assert($rs->size() == $occurrences);
783 These methods return the scanned terms themselves. C<term()> returns
784 the term in a form suitable for submitting as part of a query, whereas
785 C<display_term()> returns it in a form suitable for displaying to a
786 user. Both versions also return the number of occurrences of the term
787 in the index, i.e. the number of hits that will be found if the term
788 is subsequently used in a query.
790 In most cases, the term and display term will be identical; however,
791 they may be different in cases where punctuation or case is
792 normalised, or where identifiers rather than the original document
797 print "scan status is ", $ss->option("scanStatus");
799 Allows options to be set into, and read from, a ScanSet, just like
800 the Connection class's C<option()> method. There is no
801 C<option_binary()> method for ScanSet objects.
803 ScanSet options are also described, though not particularly
805 http://indexdata.com/yaz/doc/zoom.scan.tkl
811 Destroys a ScanSet object, freeing its resources. It is an error to
812 reuse a ScanSet that has been C<destroy()>ed.
816 $p = $conn->package();
817 $p->option(action => "specialUpdate");
818 $p->option(recordIdOpaque => 145);
819 $p->option(record => content_of("/tmp/record.xml"));
823 This class represents an Extended Services Package: an instruction to
824 the server to do something not covered by the core parts of the Z39.50
825 standard (or the equivalent in SRW or SRU). Since the core protocols
826 are read-only, such requests are often used to make changes to the
827 database, such as in the record update example above.
829 Requesting an extended service is a four-step process: first, create a
830 package associated with the connection to the relevant database;
831 second, set options on the package to instruct the server on what to
832 do; third, send the package (which may result in an exception being
833 thrown if the server cannot execute the requested operations); and
834 finally, destroy the package.
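The four steps can be sketched as a single function that makes sure the package is destroyed even when sending it fails. This is an illustration under stated assumptions, not code from the module: C<update_record()> is a hypothetical name, and C<$conn> need only behave like a C<ZOOM::Connection> whose C<package()> method returns an object with C<option()>, C<send()> and C<destroy()>.

```perl
use strict;
use warnings;

# Hypothetical helper: run the four steps for one record update.
# Extra package options (e.g. recordIdOpaque) may be passed as pairs.
sub update_record {
    my ($conn, $xml, %options) = @_;

    my $p = $conn->package();                     # 1. create the package
    my $ok = eval {
        $p->option(action => "specialUpdate");    # 2. set its options
        $p->option(record => $xml);
        $p->option($_ => $options{$_}) for sort keys %options;
        $p->send("update");                       # 3. send it (may throw)
        1;
    };
    my $err = $@;                                 # save before cleanup
    $p->destroy();                                # 4. destroy, even on failure
    die $err unless $ok;                          # re-throw for the caller
    return 1;
}
```

Saving C<$@> before calling C<destroy()> is a precaution: it means the original exception is the one the caller sees, whatever the cleanup does.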
836 Package options are listed at
837 http://indexdata.com/yaz/doc/zoom.ext.tkl
839 The particular options that have meaning are determined by the
840 top-level operation string specified as the argument to C<send()>.
841 For example, when the operation is C<update> (the most commonly used
842 extended service), the C<action> option may be set to any of
844 (add a new record, failing if that record already exists),
846 (delete a record, failing if it is not in the database).
848 (replace a record, failing if an old version is not already present)
851 (add a record, replacing any existing version that may be present).
853 For update, the C<record> option should be set to the full text of the
854 XML record to be added, deleted or replaced. Depending on how the server
855 is configured, it may extract the record's unique ID from the text
856 (i.e. from a known element such as the C<001> field of a MARCXML
857 record), or it may require the unique ID to be passed in explicitly using
858 the C<recordIdOpaque> option.
860 Extended services packages are B<not currently described> in the ZOOM
862 http://zoom.z3950.org/api/zoom-current.html
863 They will be added in a forthcoming version, and will function much
864 as those implemented in this module.
870 $p->option(recordIdOpaque => "46696f6e61");
872 Allows options to be set into, and read from, a Package, just like
873 the Connection class's C<option()> method. There is no
874 C<option_binary()> method for Package objects.
876 Package options are listed at
877 http://indexdata.com/yaz/doc/zoom.ext.tkl
883 Sends a package to the server associated with the Connection that
884 created it. Problems are reported by throwing an exception. The
885 single parameter indicates the operation that the server is being
886 requested to perform, and controls the interpretation of the package's
887 options. Valid operations include:
893 Request a copy of a nominated object, e.g. place an ILL request.
897 Create a new database, the name of which is specified by the
898 C<databaseName> option.
902 Drop an existing database, the name of which is specified by the
903 C<databaseName> option.
907 Commit changes made to the database within a transaction.
911 Modify the contents of the database by adding, deleting or replacing
912 records (as described above in the overview of the C<ZOOM::Package>
917 I have no idea what this does.
921 Although the module is capable of I<making> all these requests, not
922 all servers are capable of I<executing> them. Refusal is indicated by
923 throwing an exception. Problems may also be caused by lack of
924 privileges; so C<send()> must be used with caution, and is perhaps
925 best wrapped in a clause that checks for exceptions, like so:
927 eval { $p->send("create") };
928 if ($@ && $@->isa("ZOOM::Exception")) {
929 print "Oops! ", $@->message(), "\n";
937 Destroys a Package object, freeing its resources. It is an error to
938 reuse a Package that has been C<destroy()>ed.
942 $q = new ZOOM::Query::CQL("creator=pike and subject=unix");
943 $q->sortby("1=4 >i 1=21 >s");
944 $rs = $conn->search($q);
947 C<ZOOM::Query> is a virtual base class from which various concrete
948 subclasses can be derived. Different subclasses implement different
949 types of query. The sole purpose of a Query object is to be used in a
950 C<search()> on a Connection; because PQF is such a common special
951 case, the shortcut Connection method C<search_pqf()> is provided.
953 The following Query subclasses are provided, each providing the
954 same set of methods described below:
958 =item ZOOM::Query::PQF
960 Implements Prefix Query Format (PQF), also sometimes known as Prefix
961 Query Notation (PQN). This esoteric but rigorous and expressive
962 format is described in the YAZ Manual at
963 http://indexdata.com/yaz/doc/tools.tkl#PQF
965 =item ZOOM::Query::CQL
967 Implements the Common Query Language (CQL) of SRU, the Search/Retrieve
968 URL. CQL is a much friendlier notation than PQF, using a simple infix
969 notation. The queries are passed ``as is'' to the server rather than
970 being compiled into a Z39.50 Type-1 query, so only CQL-compliant
971 servers can support such queries. CQL is described at
972 http://www.loc.gov/standards/sru/cql/
973 and in a slightly out-of-date but nevertheless useful tutorial at
974 http://zing.z3950.org/cql/intro.html
976 =item ZOOM::Query::CQL2RPN
978 Implements CQL by compiling it on the client-side into a Z39.50
979 Type-1 (RPN) query, and sending that. This provides essentially the
980 same functionality as C<ZOOM::Query::CQL>, but it will work against
981 any standard Z39.50 server rather than only against the small subset
982 that support CQL natively. The drawback is that, because the
983 compilation is done on the client side, a configuration file is
984 required to direct the mapping of CQL constructs such as index names,
985 relations and modifiers into Type-1 query attributes. An example CQL
986 configuration file is included in the ZOOM-Perl distribution, in the
987 file C<samples/cql/pqf.properties>.
991 See the description of the C<Query> class in the ZOOM Abstract
993 http://zoom.z3950.org/api/zoom-current.html#3.3
999 $q = new ZOOM::Query::CQL('title=dinosaur');
1000 $q = new ZOOM::Query::PQF('@attr 1=4 dinosaur');
1002 Creates a new query object, compiling the query passed as its argument
1003 according to the rules of the particular query-type being
1004 instantiated. If compilation fails, an exception is thrown.
1005 Otherwise, the query may be passed to the C<Connection> method
1008 $conn->option(cqlfile => "samples/cql/pqf.properties");
1009 $q = new ZOOM::Query::CQL2RPN('title=dinosaur', $conn);
1011 Note that for the C<ZOOM::Query::CQL2RPN> subclass, the Connection
1012 must also be passed into the constructor. This is used for two
1013 purposes: first, its C<cqlfile> option is used to find the CQL
1014 configuration file that directs the translations into RPN; and second,
1015 if compilation fails, then diagnostic information is cached in the
1016 Connection and can be retrieved using C<$conn-E<gt>errcode()> and related
1021 $q->sortby("1=4 >i 1=21 >s");
1023 Sets a sort specification into the query, so that when a C<search()>
1024 is run on the query, the result is automatically sorted. The sort
1025 specification language is the same as the C<yaz> sort-specification
1026 type of the C<ResultSet> method C<sort()>, described above.
1032 Destroys a Query object, freeing its resources. It is an error to
1033 reuse a Query that has been C<destroy()>ed.
1035 =head2 ZOOM::Options
1037 $o1 = new ZOOM::Options();
1038 $o1->option(user => "alf");
1039 $o2 = new ZOOM::Options();
1040 $o2->option(password => "fruit");
1041 $opts = new ZOOM::Options($o1, $o2);
1042 $conn = create ZOOM::Connection($opts);
1043 $conn->connect($host); # Uses the specified username and password
1045 Several classes of ZOOM objects carry their own sets of options, which
1046 can be manipulated using their C<option()> method. Sometimes,
1047 however, it's useful to deal with the option sets directly, and the
1048 C<ZOOM::Options> class exists to enable this approach.
1050 Option sets are B<not currently described> in the ZOOM
1052 http://zoom.z3950.org/api/zoom-current.html
1053 They are an extension to that specification.
1059 $o1 = new ZOOM::Options();
1060 $o1and2 = new ZOOM::Options($o1);
1061 $o3 = new ZOOM::Options();
1062 $o1and3and4 = new ZOOM::Options($o1, $o3);
1064 Creates and returns a new option set. One or two (but no more)
1065 existing option sets may be passed as arguments, in which case they
1066 become ``parents'' of the new set, which thereby ``inherits'' their
1067 options, the values of the first parent overriding those of the second
1068 when both have a value for the same key. An option set that inherits
1069 from a parent that has its own parents also inherits the grandparent's
1072 =head4 option() / option_binary()
1074 $o->option(preferredRecordSyntax => "usmarc");
1075 $o->option_binary(iconBlob => "foo\0bar");
1076 die if length($o->option_binary("iconBlob")) != 7;
1078 These methods are used to get and set options within a set, and behave
1079 the same way as the same-named C<Connection> methods - see above. As
1080 with the C<Connection> methods, values passed to and retrieved using
1081 C<option()> are interpreted as NUL-terminated, while those passed to
1082 and retrieved from C<option_binary()> are binary-clean.
1086 $o->option(x => "T");
1087 $o->option(y => "F");
1088 assert($o->bool("x", 1));
1089 assert(!$o->bool("y", 1));
1090 assert($o->bool("z", 1));
1092 The first argument is a key, and the second is a default value.
1093 Returns the value associated with the specified key as a boolean, or
1094 the default value if the key has not been set. The values C<T> (upper
1095 case) and C<1> are considered true; all other values (including C<t>
1096 (lower case) and non-zero integers other than one) are considered
1099 This method is provided in ZOOM-C because in a statically typed
1100 language it's convenient to have the result returned as an
1101 easy-to-test type. In a dynamically typed language such as Perl, this
1102 problem doesn't arise, so C<bool()> is nearly useless; but it is made
1103 available in case applications need to duplicate the idiosyncratic
1104 interpretation of truth and falsehood that ZOOM-C uses.
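For comparison, the truth rule described here can be written out in plain Perl. The function below is a hypothetical stand-in, not part of the module: it does not call ZOOM-C, and it is only as accurate as the description above.

```perl
use strict;
use warnings;

# Hypothetical pure-Perl stand-in for the rule described above.
# $value is the stored option value (undef if the key is not set);
# $default is returned only when the key is not set.
sub zoom_style_bool {
    my ($value, $default) = @_;
    return $default unless defined $value;
    return ($value eq "T" || $value eq "1") ? 1 : 0;
}

# Compare with Perl's own notion of truth, which treats "t" and "2" as true.
for my $v ("T", "1", "t", "2", "F") {
    printf "%-3s zoom-style=%d perl=%d\n",
        $v, zoom_style_bool($v, 1), ($v ? 1 : 0);
}
```

The values C<t> and C<2> are where the two interpretations disagree, which is the only situation in which C<bool()> gives a different answer from an ordinary Perl test.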
1108 $o->option(x => "012");
1109 assert($o->int("x", 20) == 12);
1110 assert($o->int("y", 20) == 20);
1112 Returns the value associated with the specified key as an integer, or
1113 the default value if the key has not been set. See the description of
1114 C<bool()> for why you almost certainly don't want to use this.
1118 $o->set_int(x => "29");
1120 Sets the value of the specified option as an integer. Of course, Perl
1121 happily converts strings to integers on its own, so you can just use
1122 C<option()> for this, but C<set_int()> is guaranteed to use the same
1123 string-to-integer conversion as ZOOM-C does, which might occasionally
1124 be useful. Though I can't imagine how.
1126 =head4 set_callback()
1130 return "$udata-$key-$udata";
1132 $o->set_callback(\&cb, "xyz");
1133 assert($o->option("foo") eq "xyz-foo-xyz");
1135 This method allows a callback function to be installed in an option
1136 set, so that the values of options can be calculated algorithmically
1137 rather than, as usual, looked up in a table. Along with the callback
1138 function itself, an additional datum is provided: when an option is
1139 subsequently looked up, this datum is passed to the callback function
1140 along with the key; and its return value is returned to the caller as
1141 the value of the option.
1144 Although it ought to be possible to specify a callback function using
1145 the C<\&name> syntax above, or a literal C<sub { code }> code
1146 reference, the complexities of the Perl-internal memory management
1147 system mean that the function must currently be specified as a string
1148 containing the fully-qualified name, e.g. C<"main::cb">.>
1151 The current implementation of this method leaks memory, not only
1152 when the callback is installed, but on every occasion that it is
1153 consulted to look up an option value.
1159 Destroys an Options object, freeing its resources. It is an error to
1160 reuse an Options object that has been C<destroy()>ed.
1164 The ZOOM module provides two enumerations that list possible return
1165 values from particular functions. They are described in the following
1170 if ($@->code() == ZOOM::Error::QUERY_PQF) {
1171 return "your query was not accepted";
1174 This class provides a set of manifest constants representing some of
1175 the possible error codes that can be raised by the ZOOM module. The
1176 methods that return error-codes are
1177 C<ZOOM::Exception::code()>,
1178 C<ZOOM::Connection::error_x()>
1180 C<ZOOM::Connection::errcode()>.
1182 The C<ZOOM::Error> class provides the constants
1192 C<UNSUPPORTED_PROTOCOL>,
1193 C<UNSUPPORTED_QUERY>,
1204 each of which specifies a client-side error. These codes constitute
1205 the C<ZOOM> diagnostic set.
1207 Since errors may also be diagnosed by the server, and returned to the
1208 client, error codes may also take values from the BIB-1 diagnostic set
1209 of Z39.50, listed at the Z39.50 Maintenance Agency's web-site at
1210 http://www.loc.gov/z3950/agency/defns/bib1diag.html
1212 All error-codes, whether client-side from the C<ZOOM::Error>
1213 enumeration or server-side from the BIB-1 diagnostic set, can be
1214 translated into human-readable messages by passing them to the
1215 C<ZOOM::diag_str()> utility function.
1219 if ($conn->last_event() == ZOOM::Event::CONNECT) {
1220 print "Connected!\n";
1223 In applications that need it - mostly complex multiplexing
1224 applications - the C<ZOOM::Connection::last_event()> method is used to
1225 return an indication of the last event that occurred on a particular
1226 connection. It always returns a value drawn from this enumeration,
1227 that is, one of C<NONE>, C<CONNECT>, C<SEND_DATA>, C<RECV_DATA>,
1228 C<TIMEOUT>, C<UNKNOWN>, C<SEND_APDU>, C<RECV_APDU>, C<RECV_RECORD> or
1231 You almost certainly don't need to know about this. Frankly, I'm not
1232 sure how to use it myself.
1236 ZOOM::Log::init_level(ZOOM::Log::mask_str("zoom,myapp,-warn"));
1237 ZOOM::Log::log("myapp", "starting up with pid ", $$);
1239 Logging facilities are provided by a set of functions in the
1240 C<ZOOM::Log> module. Note that C<ZOOM::Log> is not a class, and it
1241 is not possible to create C<ZOOM::Log> objects: the API is imperative,
1242 reflecting that of the underlying YAZ logging facilities. Although
1243 there are nine logging functions altogether, you can ignore nearly
1244 all of them: most applications that use logging will begin by calling
1245 C<mask_str()> and C<init_level()> once each, as above, and will then
1246 repeatedly call C<log()>.
1250 $level = ZOOM::Log::mask_str("zoom,myapp,-warn");
1252 Returns an integer corresponding to the log-level specified by the
1253 parameter. This is a string of zero or more comma-separated
1254 module-names, each indicating an individual module to be either added
1255 to the default log-level or removed from it (for those components
1256 prefixed by a minus-sign). The names may be those of either standard
1257 YAZ-logging modules such as C<fatal>, C<debug> and C<warn>, or custom
1258 modules such as C<myapp> in the example above. The module C<zoom>
1259 requests logging from the ZOOM module itself, which may be helpful for
1262 Note that calling this function does not in any way change the logging
1263 state: it merely returns a value. To change the state, this value
1264 must be passed to C<init_level()>.
1266 =head2 module_level()
1268 $level = ZOOM::Log::module_level("zoom");
1269 ZOOM::Log::log($level, "all systems clear: thrusters invigorated");
1271 Returns the integer corresponding to the single log-level specified as
1272 the parameter, or zero if that level has not been registered by a
1273 prior call to C<mask_str()>. Since C<log()> accepts either a numeric
1274 log-level or a string, there is no reason to call this function; but,
1275 what the heck, maybe you enjoy that kind of thing. Who are we to
1280 ZOOM::Log::init_level($level);
1282 Initialises the log-level to the specified integer, which is a bitmask
1283 of values, typically as returned from C<mask_str()>. All subsequent
1284 calls to C<log()> made with a log-level that matches one of the bits
1285 in this mask will result in a log-message being emitted. All logging
1286 can be turned off by calling C<init_level(0)>.
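The bitmask test described above can be illustrated without the module. In this sketch the bit values and the C<would_log()> helper are invented for the purpose; real values should always be obtained from C<mask_str()> or C<module_level()>.

```perl
use strict;
use warnings;

# Illustrative bit values only: these are assumptions, not values
# guaranteed by YAZ. Each module occupies a single bit.
my %bit = (fatal => 0x0001, warn => 0x0004, myapp => 0x2000);

# A mask enabling "fatal" and "myapp" but not "warn".
my $mask = $bit{fatal} | $bit{myapp};

# Hypothetical helper: a message is emitted when its level shares
# at least one bit with the current mask.
sub would_log {
    my ($mask, $level) = @_;
    return ($mask & $level) ? 1 : 0;
}

for my $name (sort keys %bit) {
    printf "%-5s %s\n", $name,
        would_log($mask, $bit{$name}) ? "logged" : "dropped";
}
```

A mask of zero shares no bits with any level, which is why C<init_level(0)> silences all logging.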
1288 =head2 init_prefix()
1290 ZOOM::Log::init_prefix($0);
1292 Initialises a prefix string to be included in all log-messages.
1296 ZOOM::Log::init_file("/tmp/myapp.log");
1298 Initialises the output file to be used for logging: subsequent
1299 log-messages are written to the nominated file. If this function is
1300 not called, log-messages are written to the standard error stream.
1304 ZOOM::Log::init($level, $0, "/tmp/myapp.log");
1306 Initialises the log-level, the logging prefix and the logging output
1307 file in a single operation.
1309 =head2 time_format()
1311 ZOOM::Log::time_format("%Y-%m-%d %H:%M:%S");
1313 Sets the format in which log-messages' timestamps are emitted, by
1314 means of a format-string like that used in the C function
1315 C<strftime()>. The example above emits year, month, day, hours,
1316 minutes and seconds in big-endian order, such that timestamps can be
1317 sorted lexicographically.
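The sorting property can be checked with Perl's own C<strftime()>, from the core C<POSIX> module, which accepts the same kind of format string. This sketch does not use C<ZOOM::Log> at all; the sample times are arbitrary.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

my $fmt   = "%Y-%m-%d %H:%M:%S";
my @times = (86400 * 400, 0, 86400 * 20);   # arbitrary epoch seconds

# Format first and sort as text ...
my @by_text = sort map { strftime($fmt, gmtime($_)) } @times;

# ... or sort numerically first and then format.
my @by_time = map { strftime($fmt, gmtime($_)) } sort { $a <=> $b } @times;

# With a big-endian format, both approaches give the same order.
print "same order\n" if "@by_text" eq "@by_time";
```

A format that put the day or month before the year would not have this property.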
1319 =head2 init_max_size()
1321 (This doesn't seem to work, so I won't bother describing it.)
1325 ZOOM::Log::log(8192, "reducing to warp-factor $wf");
1326 ZOOM::Log::log("myapp", "starting up with pid ", $$);
1328 Provided that the first argument, log-level, is among the modules
1329 previously established by C<init_level()>, this function emits a
1330 log-message made up of a timestamp, the prefix supplied to
1331 C<init_prefix()>, if any, and the concatenation of all arguments after
1332 the first. The message is written to the standard error stream, or
1333 to the file previously specified by C<init_file()> if this has been
1336 The log-level argument may be either a numeric value, as returned from
1337 C<module_level()>, or a string containing the module name.
1341 The ZOOM abstract API,
1342 http://zoom.z3950.org/api/zoom-current.html
1344 The C<Net::Z3950::ZOOM> module, included in the same distribution as this one.
1346 The C<Net::Z3950> module, which this one supersedes.
1347 http://perl.z3950.org/
1349 The documentation for the ZOOM-C module of the YAZ Toolkit, which this
1350 module is built on. Specifically, its lists of options are useful.
1351 http://indexdata.com/yaz/doc/zoom.tkl
1353 The BIB-1 diagnostic set of Z39.50,
1354 http://www.loc.gov/z3950/agency/defns/bib1diag.html
1358 Mike Taylor, E<lt>mike@indexdata.comE<gt>
1360 =head1 COPYRIGHT AND LICENCE
1362 Copyright (C) 2005 by Index Data.
1364 This library is free software; you can redistribute it and/or modify
1365 it under the same terms as Perl itself, either Perl version 5.8.4 or,
1366 at your option, any later version of Perl 5 you may have available.