From: Adam Dickmeiss
Date: Wed, 14 Sep 2011 09:27:25 +0000 (+0200)
Subject: Reformat documentation for nxml mode.
X-Git-Tag: v1.6.1~7
X-Git-Url: http://sru.miketaylor.org.uk/cgi-bin?a=commitdiff_plain;h=022710cf16caff9ab16cf3375e222f335b4e4327;p=pazpar2-moved-to-github.git

Reformat documentation for nxml mode.
---

diff --git a/doc/ajaxdev.xml b/doc/ajaxdev.xml
index d997c88..042792f 100644
--- a/doc/ajaxdev.xml
+++ b/doc/ajaxdev.xml
@@ -1,286 +1,330 @@
- Ajax client development - - - Pazpar2 offers programmer a simple Web Service protocol that can be used (queried in a request/response fashion) from any, server- or client-side, programming language with an XML support. However, when programming a Web-based client to Pazpar2, to achieve certain level of interactivity and instant notification of latest changes in the result set, Ajax (Asynchronous JavaScript and XML) technology may be used. An Ajax client allows user to browse the results before the lengthy process of information retrieval from the back-end targets is finished. Blocking and waiting for usually slow back-end targets is one of the biggest functionality issues in a federated search engine. - - - Pz2.js - - - Pazpar2 comes with a small JavaScript library called pz2.js. This library is designed to simplify development of an Ajax-based pazpar2 client and alleviate programmer from the low-level details like polling the web service, fetching and parsing returned XML output or managing timers, sessions and basic state variables. - - - - The library supports most major browsers including Firefox 1.5+, IE 6+, Safari 2+, Opera 9+ and Konqueror. - - - - The library can work in two modes: a session-aware mode and a session-less mode. - + Ajax client development + + + Pazpar2 offers programmer a simple Web Service protocol that can be + used (queried in a request/response fashion) from any, server- or + client-side, programming language with an XML support. However, when + programming a Web-based client to Pazpar2, to achieve certain level of + interactivity and instant notification of latest changes in the result + set, Ajax (Asynchronous JavaScript and XML) technology may be used. An + Ajax client allows user to browse the results before the lengthy + process of information retrieval from the back-end targets is + finished. Blocking and waiting for usually slow back-end targets is + one of the biggest functionality issues in a federated search engine. 
+ - In the session-aware mode, the library assumes that the pazpar2 daemon is contacted directly (preferably via Apache proxy to avoid security breaches) and tracks the session Ids internally. - - - In the session-less mode the library assumes that the client is identified on the server and the session Ids are not managed directly. This way of operation requires more sophisticated pazpar2 proxy (preferably a wrapper written in a server-side scripting language like PHP that can identify clients and relate them to open pazpar2 sessions). - - Using pz2.js - - - Client development with the pz2.js is strongly event based and the style should be familiar to most JavaScript developers. A simple client (jsdemo) is distributed with pazpar2's source code and shows how to set-up and use pz2.js. - - - - In short, programmer starts by instantiating the pz2 object and passing an array of parameters to the constructor. The parameter array specifies callbacks used for handling responses to the pazpar2 commands. Additionally, the parameter array is used to configure run-time parameters of the pz2.js like polling timer time-outs, session-mode and XSLT style-sheets. - - - Command callbacks - - - Callback naming is simple and follows “on” prefix plus command name scheme (like onsearch, onshow, onrecord, ... etc.). When programmer calls a function like show or record on the pz2 object, pz2.js will keep on polling pazpar2 (until the backend targets are idle) and with each command's response an assigned callback will be called. In case of pazpar2's internal error an error callback is called. - - - + Pz2.js + + + Pazpar2 comes with a small JavaScript library called pz2.js. This + library is designed to simplify development of an Ajax-based pazpar2 + client and alleviate programmer from the low-level details like + polling the web service, fetching and parsing returned XML output or + managing timers, sessions and basic state variables. 
+ + + + The library supports most major browsers including Firefox 1.5+, IE + 6+, Safari 2+, Opera 9+ and Konqueror. + + + + The library can work in two modes: a session-aware mode and a + session-less mode. + + + + In the session-aware mode, the library assumes that the pazpar2 + daemon is contacted directly (preferably via Apache proxy to avoid + security breaches) and tracks the session Ids internally. + + + + In the session-less mode the library assumes that the client is + identified on the server and the session Ids are not managed + directly. This way of operation requires more sophisticated pazpar2 + proxy (preferably a wrapper written in a server-side scripting + language like PHP that can identify clients and relate them to open + pazpar2 sessions). + + + Using pz2.js + + + Client development with the pz2.js is strongly event based and the + style should be familiar to most JavaScript developers. A simple + client (jsdemo) is distributed with pazpar2's source code and shows + how to set-up and use pz2.js. + + + + In short, programmer starts by instantiating the pz2 object and + passing an array of parameters to the constructor. The parameter array + specifies callbacks used for handling responses to the pazpar2 + commands. Additionally, the parameter array is used to configure + run-time parameters of the pz2.js like polling timer time-outs, + session-mode and XSLT style-sheets. + + + Command callbacks + + + Callback naming is simple and follows “on” prefix plus command name + scheme (like onsearch, onshow, onrecord, ... etc.). When programmer + calls a function like show or record on the pz2 object, pz2.js will + keep on polling pazpar2 (until the backend targets are idle) and with + each command's response an assigned callback will be called. In case + of pazpar2's internal error an error callback is called. 
+ + + my_paz = new pz2 ( { - "pazpar2path": "/pazpar2/search.pz2", - "usesessions" : true, - - // assigning command handler, turns on automatic polling - "onshow": my_onshow, - // polling period for each command can be specified - "showtime": 500, - - "onterm": my_onterm, - // facet terms are specified as a comma separated list - "termlist": "subject,author", - - "onrecord": my_onrecord - } - ); - - - - Each command callback is a user defined function that takes a hash object as a parameter. The hash object contains parsed pazpar2 responses (hash members that correspond to the elements in the response XML document). Within the handler programmer further processes the data and updates the viewed document. - - - - function my_onstat(data) { - var stat = document.getElementById("stat"); - stat.innerHTML = '<span>Active clients: '+ data.activeclients - + '/' + data.clients + ' | </span>' - + '<span>Retrieved records: ' + data.records - + '/' + data.hits + '</span>'; - } - - function my_onshow(data) { - // data contains parsed show response - for (var i = 0; i < data.hits[0].length; i++) - // update page with the hits - } - - function on_record(data) { - // if detailsstylesheet parameter was set data array - // will contain raw xml and xsl data - Element_appendTransformResult(someDiv, data.xmlDoc, data.xslDoc); - } - - - pz2.js on runtime - - - The search process is initiated by calling the search method on the instantiated pz2 object. To initiate short status reports and per-target status information methods stat and bytarget have to be called accordingly. - + "pazpar2path": "/pazpar2/search.pz2", + "usesessions" : true, - - my_paz.search (query, recPergPage, 'relevance'); - + // assigning command handler, turns on automatic polling + "onshow": my_onshow, + // polling period for each command can be specified + "showtime": 500, - - Managing the results (keeping track of the browsed results page and sorting) is up to the client's programmer. 
At any point the show method may be called to bring up the latest result set with a different sorting criteria or range and without re-executing the search on the back-end. - + "onterm": my_onterm, + // facet terms are specified as a comma separated list + "termlist": "subject,author", - - my_paz.show (1, 10, 'relevance'); - - - - To retrieve a detailed record the record command is called. When calling record command one may temporarily override its default callback by specifying the handler parameter. This might be useful when retrieving raw records that need to be processed differently. - - - - my_paz.record (recId, 2, 'opac', { “callback”: temp_callback, “args”, caller_args}); - + "onrecord": my_onrecord + } + ); + + + + Each command callback is a user defined function that takes a hash + object as a parameter. The hash object contains parsed pazpar2 + responses (hash members that correspond to the elements in the + response XML document). Within the handler programmer further + processes the data and updates the viewed document. + + + + function my_onstat(data) { + var stat = document.getElementById("stat"); + stat.innerHTML = '<span>Active clients: '+ data.activeclients + + '/' + data.clients + ' | </span>' + + '<span>Retrieved records: ' + data.records + + '/' + data.hits + '</span>'; + } + + function my_onshow(data) { + // data contains parsed show response + for (var i = 0; i < data.hits[0].length; i++) + // update page with the hits + } + + function on_record(data) { + // if detailsstylesheet parameter was set data array + // will contain raw xml and xsl data + Element_appendTransformResult(someDiv, data.xmlDoc, data.xslDoc); + } + + + pz2.js on runtime + + + The search process is initiated by calling the search method on the + instantiated pz2 object. To initiate short status reports and + per-target status information methods stat and bytarget have to be + called accordingly. 
+ + + + my_paz.search (query, recPergPage, 'relevance'); + + + + Managing the results (keeping track of the browsed results page and + sorting) is up to the client's programmer. At any point the show + method may be called to bring up the latest result set with a + different sorting criteria or range and without re-executing the + search on the back-end. + + + + my_paz.show (1, 10, 'relevance'); + - - - PARAMATERS ARRAY + + To retrieve a detailed record the record command is called. When + calling record command one may temporarily override its default + callback by specifying the handler parameter. This might be useful + when retrieving raw records that need to be processed differently. + + + + my_paz.record (recId, 2, 'opac', { “callback”: temp_callback, “args”, caller_args}); + + + + + PARAMATERS ARRAY - pazpar2path - server path to pazpar2 (relative to the portal), when pazpar2 is installed as a package this does not have to be set - + pazpar2path + server path to pazpar2 (relative to the portal), when pazpar2 is installed as a package this does not have to be set + - usesessions - boolean, when set to true pz2.js will manage sessions internally otherwise it's left to the server-side script, default true - + usesessions + boolean, when set to true pz2.js will manage sessions internally otherwise it's left to the server-side script, default true + - autoInit - bolean, sets auto initialization of pazpar2 session on the object instantiation, default true, valid only if usesession is set to true - + autoInit + bolean, sets auto initialization of pazpar2 session on the object instantiation, default true, valid only if usesession is set to true + - detailstylesheet - path to the xsl presentation stylesheet (relative to the portal) used for the detailed record display - - errorhandler - callback function called on any, pazpar2 or pz2.js' internal, error + detailstylesheet + path to the xsl presentation stylesheet (relative to the portal) used for the detailed record 
display + + errorhandler + callback function called on any, pazpar2 or pz2.js' internal, error - oninit - specifies init response callback function + oninit + specifies init response callback function - onstat - specifies stat response callback function + onstat + specifies stat response callback function - onshow - specifies show response callback function + onshow + specifies show response callback function - onterm - specifies termlist response callback function + onterm + specifies termlist response callback function - onrecord - specifies record response callback function + onrecord + specifies record response callback function - onbytarget - specifies bytarget response callback function + onbytarget + specifies bytarget response callback function - onreset - specifies reset method callback function + onreset + specifies reset method callback function - termlist - comma separated list of facets + termlist + comma separated list of facets - keepAlive - ping period, should not be lower than 5000 usec + keepAlive + ping period, should not be lower than 5000 usec - stattime - default 1000 usec + stattime + default 1000 usec - termtime + termtime - showtime + showtime - bytargettime + bytargettime - + - - - METHODS - - stop () - stop activity by clearing timeouts + + + METHODS + + stop () + stop activity by clearing timeouts - reset () - reset state + reset () + reset state - init (sesionId) - session-mode, initialize new session or pick up a session already initialized + init (sesionId) + session-mode, initialize new session or pick up a session already initialized - ping () - session-mode, intitialize pinging + ping () + session-mode, intitialize pinging - search (query, num, sort, filter, showfrom) - execute piggy-back search and activate polling on every command specified by assigning command callback (in the pz2 constructor) + search (query, num, sort, filter, showfrom) + execute piggy-back search and activate polling on every command specified by assigning 
command callback (in the pz2 constructor) - show (start, num, sort) - start or change parameters of polling for a given window of records + show (start, num, sort) + start or change parameters of polling for a given window of records - record (id, offset, syntax, handler) - retrieve detailed or raw record. handler temporarily overrides default callback function. + record (id, offset, syntax, handler) + retrieve detailed or raw record. handler temporarily overrides default callback function. - termlist () - start polling for termlists + termlist () + start polling for termlists - bytarget () - start polling for target status + bytarget () + start polling for target status - stat () - start polling for pazpar2 status + stat () + start polling for pazpar2 status - - - - Pz2.js comes with a set of cross-browser helper classes and functions. + + + + Pz2.js comes with a set of cross-browser helper classes and functions. - + - Ajax helper class + Ajax helper class - pzHttpRequest - a cross-browser Ajax wrapper class + pzHttpRequest + a cross-browser Ajax wrapper class - constructor (url, errorHandler) - create new request for a given url + constructor (url, errorHandler) + create new request for a given url - get (params, callback) - asynchronous, send the request with given parameters (array) and call callback with response as parameter + get (params, callback) + asynchronous, send the request with given parameters (array) and call callback with response as parameter - post (params, data, callback) - asychronous, post arbitrary data (may be XML doc) and call callback with response as parameter + post (params, data, callback) + asychronous, post arbitrary data (may be XML doc) and call callback with response as parameter - load () - synchronous, returns the response for the given request + load () + synchronous, returns the response for the given request - + - + - XML helper functions - - document.newXmlDoc (root) - create new XML document with root node as specified in 
parameter + XML helper functions + + document.newXmlDoc (root) + create new XML document with root node as specified in parameter - document.parseXmlFromString (xmlString) - create new XML document from string + document.parseXmlFromString (xmlString) + create new XML document from string - document.transformToDoc (xmlDoc, xslDoc) - returns new XML document as a result + document.transformToDoc (xmlDoc, xslDoc) + returns new XML document as a result - Element_removeFromDoc (DOM_Element) - remove element from the document + Element_removeFromDoc (DOM_Element) + remove element from the document - Element_emptyChildren (DOM_Element) + Element_emptyChildren (DOM_Element) - Element_appendTransformResult (DOM_Element, xmlDoc, xslDoc) - append xsl transformation result to a DOM element + Element_appendTransformResult (DOM_Element, xmlDoc, xslDoc) + append xsl transformation result to a DOM element - Element_appendTextNode (DOM_Element, tagName, textContent) - append new text node to the element + Element_appendTextNode (DOM_Element, tagName, textContent) + append new text node to the element - Element_setTextContent (DOM_Element, textContent) - set text content of the element + Element_setTextContent (DOM_Element, textContent) + set text content of the element - Element_getTextContent (DOM_Element) - get text content of the element + Element_getTextContent (DOM_Element) + get text content of the element - Element_parseChildNodes (DOM_Element) - parse all descendants into an associative array + Element_parseChildNodes (DOM_Element) + parse all descendants into an associative array - - + +
+Local variables:
+mode: nxml
+nxml-child-indent: 1
+End:
+-->
diff --git a/doc/book.xml b/doc/book.xml
index 7c9c9d4..f11e68a 100644
--- a/doc/book.xml
+++ b/doc/book.xml
@@ -51,121 +51,121 @@
 - - - - - + + + + + Introduction - +
- What Pazpar2 is - - Pazpar2 is a stand-alone metasearch engine with a web-service API, designed - to be used either from a browser-based client (JavaScript, Flash, - Java applet, - etc.), from server-side code, or any combination of the two. - Pazpar2 is a highly optimized client designed to - search many resources in parallel. It implements record merging, - relevance-ranking and sorting by arbitrary data content, and facet - analysis for browsing purposes. It is designed to be data-model - independent, and is capable of working with MARC, DublinCore, or any - other XML-structured response format - -- XSLT is used to normalize and extract - data from retrieval records for display and analysis. It can be used - against any server which supports the - Z39.50, SRU/SRW - or SOLR protocol. Proprietary - backend modules can function as connectors between these standard - protocols and any non-standard API, including web-site scraping, to - support a large number of other protocols. - - - Additional functionality such as - user management and attractive displays are expected to be implemented by - applications that use Pazpar2. Pazpar2 itself is user-interface independent. - Its functionality is exposed through a simple XML-based web-service API, - designed to be easy to use from an Ajax-enabled browser, Flash - animation, Java applet, etc., or from a higher-level server-side language - like PHP, Perl or Java. Because session information can be shared between - browser-based logic and server-side scripting, there is tremendous - flexibility in how you implement application-specific logic on top - of Pazpar2. - - - Once you launch a search in Pazpar2, the operation continues behind the - scenes. Pazpar2 connects to servers, carries out searches, and - retrieves, deduplicates, and stores results internally. Your application - code may periodically inquire about the status of an ongoing operation, - and ask to see records or result set facets. 
Results become - available immediately, and it is easy to build end-user interfaces than - feel extremely responsive, even when searching more than 100 servers - concurrently. - - - Pazpar2 is designed to be highly configurable. Incoming records are - normalized to XML/UTF-8, and then further normalized using XSLT to a - simple internal representation that is suitable for analysis. By - providing XSLT stylesheets for different kinds of result records, you - can configure Pazpar2 to work against different kinds of information - retrieval servers. Finally, metadata is extracted in a configurable - way from this internal record, to support display, merging, ranking, - result set facets, and sorting. Pazpar2 is not bound to a specific model - of metadata, such as DublinCore or MARC: by providing the right - configuration, it can work with any combination of different kinds of data in - support of many different applications. - - - Pazpar2 is designed to be efficient and scalable. You can set it up to - search several hundred targets in parallel, or you can use it to support - hundreds of concurrent users. It is implemented with the same attention - to performance and economy that we use in our indexing engines, so that - you can focus on building your application without worrying about the - details of metasearch logic. You can devote all of your attention to - usability and let Pazpar2 do what it does best -- metasearch. - - - Pazpar2 is our attempt to re-think the traditional paradigms for - implementing and deploying metasearch logic, with an uncompromising - approach to performance, and attempting to make maximum use of the - capabilities of modern browsers. The demo user interface that - accompanies the distribution is but one example. 
If you think of new - ways of using Pazpar2, we hope you'll share them with us, and if we - can provide assistance with regards to training, design, programming, - integration with different backends, hosting, or support, please don't - hesitate to contact us. If you'd like to see functionality in Pazpar2 - that is not there today, please don't hesitate to contact us. It may - already be in our development pipeline, or there might be a - possibility for you to help out by sponsoring development time or - code. Either way, get in touch and we will give you straight answers. - - - Enjoy! - - - Pazpar2 is covered by the GNU General Public License (GPL) version 2. - See for further information. - + What Pazpar2 is + + Pazpar2 is a stand-alone metasearch engine with a web-service API, designed + to be used either from a browser-based client (JavaScript, Flash, + Java applet, + etc.), from server-side code, or any combination of the two. + Pazpar2 is a highly optimized client designed to + search many resources in parallel. It implements record merging, + relevance-ranking and sorting by arbitrary data content, and facet + analysis for browsing purposes. It is designed to be data-model + independent, and is capable of working with MARC, DublinCore, or any + other XML-structured response format + -- XSLT is used to normalize and extract + data from retrieval records for display and analysis. It can be used + against any server which supports the + Z39.50, SRU/SRW + or SOLR protocol. Proprietary + backend modules can function as connectors between these standard + protocols and any non-standard API, including web-site scraping, to + support a large number of other protocols. + + + Additional functionality such as + user management and attractive displays are expected to be implemented by + applications that use Pazpar2. Pazpar2 itself is user-interface independent. 
+ Its functionality is exposed through a simple XML-based web-service API, + designed to be easy to use from an Ajax-enabled browser, Flash + animation, Java applet, etc., or from a higher-level server-side language + like PHP, Perl or Java. Because session information can be shared between + browser-based logic and server-side scripting, there is tremendous + flexibility in how you implement application-specific logic on top + of Pazpar2. + + + Once you launch a search in Pazpar2, the operation continues behind the + scenes. Pazpar2 connects to servers, carries out searches, and + retrieves, deduplicates, and stores results internally. Your application + code may periodically inquire about the status of an ongoing operation, + and ask to see records or result set facets. Results become + available immediately, and it is easy to build end-user interfaces than + feel extremely responsive, even when searching more than 100 servers + concurrently. + + + Pazpar2 is designed to be highly configurable. Incoming records are + normalized to XML/UTF-8, and then further normalized using XSLT to a + simple internal representation that is suitable for analysis. By + providing XSLT stylesheets for different kinds of result records, you + can configure Pazpar2 to work against different kinds of information + retrieval servers. Finally, metadata is extracted in a configurable + way from this internal record, to support display, merging, ranking, + result set facets, and sorting. Pazpar2 is not bound to a specific model + of metadata, such as DublinCore or MARC: by providing the right + configuration, it can work with any combination of different kinds of data + in support of many different applications. + + + Pazpar2 is designed to be efficient and scalable. You can set it up to + search several hundred targets in parallel, or you can use it to support + hundreds of concurrent users. 
It is implemented with the same attention + to performance and economy that we use in our indexing engines, so that + you can focus on building your application without worrying about the + details of metasearch logic. You can devote all of your attention to + usability and let Pazpar2 do what it does best -- metasearch. + + + Pazpar2 is our attempt to re-think the traditional paradigms for + implementing and deploying metasearch logic, with an uncompromising + approach to performance, and attempting to make maximum use of the + capabilities of modern browsers. The demo user interface that + accompanies the distribution is but one example. If you think of new + ways of using Pazpar2, we hope you'll share them with us, and if we + can provide assistance with regards to training, design, programming, + integration with different backends, hosting, or support, please don't + hesitate to contact us. If you'd like to see functionality in Pazpar2 + that is not there today, please don't hesitate to contact us. It may + already be in our development pipeline, or there might be a + possibility for you to help out by sponsoring development time or + code. Either way, get in touch and we will give you straight answers. + + + Enjoy! + + + Pazpar2 is covered by the GNU General Public License (GPL) version 2. + See for further information. +
- Connectors to non-standard databases - - If you wish to connect to commercial or other databases which do not - support open standards, please contact Index Data on - info@indexdata.com. We have a - proprietary framework for building connectors that enable Pazpar2 - to access - thousands of online databases, in addition to the vast number of catalogs - and online services that support the Z39.50/SRU/SRW/SOLR protocols. - + Connectors to non-standard databases + + If you wish to connect to commercial or other databases which do not + support open standards, please contact Index Data on + info@indexdata.com. We have a + proprietary framework for building connectors that enable Pazpar2 + to access + thousands of online databases, in addition to the vast number of catalogs + and online services that support the Z39.50/SRU/SRW/SOLR protocols. +
- +
A note on the name Pazpar2 @@ -193,30 +193,30 @@ Pazpar2 depends on the following tools/libraries: YAZ - - - The popular Z39.50 toolkit for the C language. - YAZ must be compiled with Libxml2/Libxslt support. - - + + + The popular Z39.50 toolkit for the C language. + YAZ must be compiled with Libxml2/Libxslt support. + + International - Components for Unicode (ICU) - - - ICU provides Unicode support for non-English languages with - character sets outside the range of 7bit ASCII, like - Greek, Russian, German and French. Pazpar2 uses the ICU - Unicode character conversions, Unicode normalization, case - folding and other fundamental operations needed in - tokenization, normalization and ranking of records. - - - Compiling, linking, and usage of the ICU libraries is optional, - but strongly recommended for usage in an international - environment. - - + Components for Unicode (ICU) + + + ICU provides Unicode support for non-English languages with + character sets outside the range of 7bit ASCII, like + Greek, Russian, German and French. Pazpar2 uses the ICU + Unicode character conversions, Unicode normalization, case + folding and other fundamental operations needed in + tokenization, normalization and ranking of records. + + + Compiling, linking, and usage of the ICU libraries is optional, + but strongly recommended for usage in an international + environment. + + @@ -261,59 +261,59 @@ changed with configure option .
- +
- Installation from source on Windows - - Pazpar2 can be built for Windows using - Microsoft Visual Studio. - The support files for building YAZ on Windows are located in the - win directory. The compilation is performed - using the win/makefile which is to be - processed by the NMAKE utility part of Visual Studio. - - - Ensure that the development libraries and header files are - available on your system before compiling Pazpar2. For installation - of YAZ, refer to - the Installation chapter of the YAZ manual at - . - It is easiest if YAZ and Pazpar2 are unpacked in the same - directory (side-by-side). - - - The compilation is tuned by editing the makefile of Pazpar2. - The process is similar to YAZ. Adjust the various directories - YAZ_DIR, ZLIB_DIR, etc., - as required. - - - Compile Pazpar2 by invoking nmake in - the win directory. - The resulting binaries of the build process are located in the - bin of the Pazpar2 source - tree - including the pazpar2.exe and necessary DLLs. - - - The Windows version of Pazpar2 is a console application. It may - be installed as a Windows Service by adding option - -install for the pazpar2 program. This will - register Pazpar2 as a service and use the other options provided - in the same invocation. For example: - - cd \MyPazpar2\etc - ..\bin\pazpar2 -install -f pazpar2.cfg -l pazpar2.log - - The Pazpar2 service may now be controlled via the Service Control - Panel. It may be unregistered by passing the -remove - option. Example: - - cd \MyPazpar2\etc - ..\bin\pazpar2 -remove - - + Installation from source on Windows + + Pazpar2 can be built for Windows using + Microsoft Visual Studio. + The support files for building YAZ on Windows are located in the + win directory. The compilation is performed + using the win/makefile which is to be + processed by the NMAKE utility part of Visual Studio. + + + Ensure that the development libraries and header files are + available on your system before compiling Pazpar2. 
For installation + of YAZ, refer to + the Installation chapter of the YAZ manual at + . + It is easiest if YAZ and Pazpar2 are unpacked in the same + directory (side-by-side). + + + The compilation is tuned by editing the makefile of Pazpar2. + The process is similar to YAZ. Adjust the various directories + YAZ_DIR, ZLIB_DIR, etc., + as required. + + + Compile Pazpar2 by invoking nmake in + the win directory. + The resulting binaries of the build process are located in the + bin of the Pazpar2 source + tree - including the pazpar2.exe and necessary DLLs. + + + The Windows version of Pazpar2 is a console application. It may + be installed as a Windows Service by adding option + -install for the pazpar2 program. This will + register Pazpar2 as a service and use the other options provided + in the same invocation. For example: + + cd \MyPazpar2\etc + ..\bin\pazpar2 -install -f pazpar2.cfg -l pazpar2.log + + The Pazpar2 service may now be controlled via the Service Control + Panel. It may be unregistered by passing the -remove + option. Example: + + cd \MyPazpar2\etc + ..\bin\pazpar2 -remove + +
- +
Installation of test interfaces @@ -395,7 +395,7 @@
- Installation on Debian or Ubuntu GNU/Linux + Installation on Debian GNU/Linux and Ubuntu Index Data provides Debian and Ubuntu packages for Pazpar2. As of February 2010, these @@ -406,14 +406,16 @@ .
- +
Apache 2 Proxy Apache 2 has a - + proxy module - which allows Pazpar2 to become a backend to an Apache 2 + + which allows Pazpar2 to become a backend to an Apache 2 based web service. The Apache 2 proxy must operate in the Reverse Proxy mode. @@ -425,14 +427,16 @@ sudo a2enmod proxy_http proxy_balancer - + Traditionally Pazpar2 interprets URL paths with suffix /search.pz2. The - ProxyPass directive of Apache must be used to map a URL path + + ProxyPass + + directive of Apache must be used to map a URL path to the Pazpar2 server (listening port). @@ -469,16 +473,16 @@
- +
- + Using Pazpar2 This chapter provides a general introduction to the use and deployment of Pazpar2. - +
Pazpar2 and your systems architecture @@ -509,7 +513,7 @@ with the server from which the enclosing HTML page or object originated, Pazpar2 is designed so that it can act as a transparent proxy in front of an existing webserver (see for details). + linkend="pazpar2_conf"/> for details). In this mode, all regular HTTP requests are transparently passed through to your webserver, while Pazpar2 only intercepts search-related webservice requests. @@ -584,11 +588,11 @@ ]]> - + As you can see, there isn't much to it. There are really only a few important elements to this file. - + Elements should belong to the namespace http://www.indexdata.com/pazpar2/1.0. @@ -630,7 +634,7 @@ The webservice API of Pazpar2 is described in detail in . - + In brief, you use the 'init' command to create a session, a temporary workspace which carries information about the current @@ -664,11 +668,12 @@ no effort. Resources that use non-standard record formats will require a bit of XSLT work, but that's all. - + - But what about resources that don't support Z39.50 at all? Some resources might - support OpenSearch, private, XML/HTTP-based protocols, or something - else entirely. Some databases exist only as web user interfaces and + But what about resources that don't support Z39.50 at all? + Some resources might support OpenSearch, private, XML/HTTP-based + protocols, or something else entirely. + Some databases exist only as web user interfaces and will require screen-scraping. Still others exist only as static files, or perhaps as databases supporting the OAI-PMH protocol. There is hope! Read on. @@ -679,12 +684,12 @@ work with database vendors to support standards, so you don't have to worry about programming against non-standard services. We also provide tools (see SimpleServer) + url="http://www.indexdata.com/simpleserver">SimpleServer) which make it comparatively easy to build gateways against servers with non-standard behavior. 
Again, we encourage you to share any work you do in this direction. - + But the bottom line is that working with non-standard resources in metasearching is really, really hard. If you want to build a @@ -741,9 +746,17 @@
Load balancing - Just like any web server, Pazpar2 can be load balanced by a standard hardware or software load balancer as long as the session stickiness is ensured. If you are already running the Apache2 web server in front of Pazpar2 and use the Apache mod_proxy module to 'relay' client requests to Pazpar2, this setup can be easily extended to include load balancing capabilities. To do so you need to enable the - mod_proxy_balancer - module in your Apache2 installation. + Just like any web server, Pazpar2 can be load balanced by a standard + hardware or software load balancer as long as the session stickiness + is ensured. If you are already running the Apache2 web server in front + of Pazpar2 and use the Apache mod_proxy module to 'relay' client + requests to Pazpar2, this setup can be easily extended to include + load balancing capabilities. + To do so you need to enable the + + mod_proxy_balancer + + module in your Apache2 installation. @@ -755,14 +768,28 @@ - The mod_proxy_balancer can pass all 'sessionsticky' requests to the same backend worker as long as the requests are marked with the originating worker's ID (called 'route'). If the Pazpar2 serverID is configured (by setting an 'id' attribute on the 'server' element in the Pazpar2 configuration file) Pazpar2 will append it to the 'session' element returned during the 'init' in a mod_proxy_balancer compatible manner. Since the 'session' is then re-sent by the client (for all pazpar2 requests besides 'init'), the balancer can use the marker to pass the request to the right route. To do so the balancer needs to be configured to inspect the 'session' parameter. + The mod_proxy_balancer can pass all 'sessionsticky' requests to the + same backend worker as long as the requests are marked with the + originating worker's ID (called 'route').
If the Pazpar2 serverID is + configured (by setting an 'id' attribute on the 'server' element in + the Pazpar2 configuration file) Pazpar2 will append it to the + 'session' element returned during the 'init' in a mod_proxy_balancer + compatible manner. + Since the 'session' is then re-sent by the client (for all pazpar2 + requests besides 'init'), the balancer can use the marker to pass + the request to the right route. To do so the balancer needs to be + configured to inspect the 'session' parameter. Apache 2 load balancing configuration - Having 4 Pazpar2 instances running on the same host, port range of 8004-8007 and serverIDs of: pz1, pz2, pz3 and pz4 respectively we could use the following Apache 2 configuration to expose a single pazpar2 'endpoint' on a standard (/pazpar2/search.pz2) location: - + Having 4 Pazpar2 instances running on the same host, port range of + 8004-8007 and serverIDs of: pz1, pz2, pz3 and pz4 respectively we + could use the following Apache 2 configuration to expose a single + pazpar2 'endpoint' on a standard + (/pazpar2/search.pz2) location: + AddDefaultCharset off @@ -782,15 +809,21 @@ # route is resent in the 'session' param which has the form: # 'sessid.serverid', understandable by the mod_proxy_load_balancer # this is not going to work if the client tampers with the 'session' param - ProxyPass /pazpar2/search.pz2 balancer://pz2cluster lbmethod=byrequests stickysession=session nofailover=On]]> - - The 'ProxyPass' line sets up a reverse proxy for request ‘/pazpar2/search.pz2’ and delegates all requests to the load balancer (virtual worker) with name ‘pz2cluster’. Sticky sessions are enabled and implemented using the ‘session’ parameter. The ‘Proxy’ section lists all the servers (real workers) which the load balancer can use.
- - - - + ProxyPass /pazpar2/search.pz2 balancer://pz2cluster lbmethod=byrequests stickysession=session nofailover=On + ]]> + + The 'ProxyPass' line sets up a reverse proxy for request + ‘/pazpar2/search.pz2’ and delegates all requests to the load balancer + (virtual worker) with name ‘pz2cluster’. + Sticky sessions are enabled and implemented using the ‘session’ parameter. + The ‘Proxy’ section lists all the servers (real workers) which the + load balancer can use. + + + +
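The sticky-session routing described in this section can be illustrated with a small Python sketch. It is not how mod_proxy_balancer is implemented; it only mirrors the 'sessid.serverid' convention and the pz1..pz4 on ports 8004-8007 layout from the example. The names `WORKERS` and `worker_for_session` are hypothetical, and `localhost` is an assumption standing in for "the same host".

```python
# Hypothetical illustration of sticky-session routing; this is not Apache code.
# The worker table mirrors the example: serverIDs pz1..pz4 on ports 8004..8007.
# "localhost" is an assumption; the text only says "the same host".
WORKERS = {f"pz{i + 1}": f"http://localhost:{8004 + i}" for i in range(4)}


def worker_for_session(session):
    """Return the backend URL for a 'sessid.serverid' session value.

    Returns None when the value carries no known route. A real balancer
    would then fall back to its normal scheduling or reject the request,
    depending on its configuration.
    """
    # The route is whatever follows the last dot in the session value.
    sessid, dot, route = session.rpartition(".")
    if not dot or not sessid:
        return None
    return WORKERS.get(route)


if __name__ == "__main__":
    print(worker_for_session("2044502273.pz3"))
```

Because the route travels inside the 'session' parameter, a client that alters that parameter ends up on the wrong worker or on none, which is the caveat noted in the configuration comments.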
- + @@ -805,51 +838,44 @@ &manref; - License - - - Pazpar2, - Copyright © ©right-year; Index Data. - - - - Pazpar2 is free software; you can redistribute it and/or modify it under - the terms of the GNU General Public License as published by the Free - Software Foundation; either version 2, or (at your option) any later - version. - - - - Pazpar2 is distributed in the hope that it will be useful, but WITHOUT ANY - WARRANTY; without even the implied warranty of MERCHANTABILITY or - FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License - for more details. - - - - You should have received a copy of the GNU General Public License - along with Pazpar2; see the file LICENSE. If not, write to the - Free Software Foundation, - 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA - - + + License + + + Pazpar2, + Copyright © ©right-year; Index Data. + + + + Pazpar2 is free software; you can redistribute it and/or modify it under + the terms of the GNU General Public License as published by the Free + Software Foundation; either version 2, or (at your option) any later + version. + + + + Pazpar2 is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or + FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + for more details. + + + + You should have received a copy of the GNU General Public License + along with Pazpar2; see the file LICENSE. If not, write to the + Free Software Foundation, + 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + + &gpl2; - + diff --git a/doc/pazpar2.xml b/doc/pazpar2.xml index 32b499c..9e346e8 100644 --- a/doc/pazpar2.xml +++ b/doc/pazpar2.xml @@ -46,8 +46,10 @@ - DESCRIPTION - pazpar2 is the Pazpar2 Metasearch daemon + + DESCRIPTION + + pazpar2 is the Pazpar2 Metasearch daemon and server. In normal operation it acts as a simple HTTP server which serves the Pazpar2 protocol. 
@@ -58,8 +60,8 @@ - OPTIONS - + + OPTIONS @@ -79,7 +81,9 @@ - Puts the Pazpar2 server in the background. + + Puts the Pazpar2 server in the background. + @@ -197,14 +201,14 @@ - This is an option which is only recognized on Windows. It installs - Pazpar2 as a Windows Service. + This is an option which is only recognized on Windows. It installs + Pazpar2 as a Windows Service. - - Pazpar2 only supports Windows Service options if Pazpar2 is - linked against YAZ 3.0.29 or later. - + + Pazpar2 only supports Windows Service options if Pazpar2 is + linked against YAZ 3.0.29 or later. + @@ -213,8 +217,8 @@ - This is an option which is only recognized on Windows. It removes - a Pazpar2 - Windows Service. + This is an option which is only recognized on Windows. It removes + a Pazpar2 - Windows Service. @@ -223,7 +227,8 @@ - EXAMPLES + + EXAMPLES The Debian package of pazpar2 starts the server with: pazpar2 -D -f /etc/pazpar2/pazpar2.cfg -l /var/log/pazpar2.log -p /var/run/pazpar2.pid -u nobody @@ -253,15 +258,17 @@ - FILES + + FILES /usr/sbin/pazpar2: pazpar2 daemon - + /usr/share/pazpar2: pazpar2 shared files - + /etc/pazpar2: pazpar2 config area - SEE ALSO + + SEE ALSO Pazpar2 configuration: @@ -280,18 +287,9 @@ - diff --git a/doc/pazpar2_conf.xml b/doc/pazpar2_conf.xml index 04f6e2a..8ae9b97 100644 --- a/doc/pazpar2_conf.xml +++ b/doc/pazpar2_conf.xml @@ -15,7 +15,7 @@ &version; Index Data - + Pazpar2 conf 5 @@ -33,7 +33,8 @@ - DESCRIPTION + + DESCRIPTION The Pazpar2 configuration file, together with any referenced XSLT files, govern Pazpar2's behavior as a client, and control the normalization and @@ -49,7 +50,8 @@ - FORMAT + + FORMAT The configuration file is XML-structured. It must be well-formed XML. All elements specific to Pazpar2 should belong to the namespace @@ -60,24 +62,27 @@ information. The categories are described below. - threads - - This section is optional and is supported for Pazpar2 version 1.3.1 and - later . 
It is identified by element "threads" which - may include one attribute "number" which specifies - the number of worker-threads that the Pazpar2 instance is to use. - A value of 0 (zero) disables worker-threads (all work is carried out - in main thread). - + + threads + + This section is optional and is supported for Pazpar2 version 1.3.1 and + later . It is identified by element "threads" which + may include one attribute "number" which specifies + the number of worker-threads that the Pazpar2 instance is to use. + A value of 0 (zero) disables worker-threads (all work is carried out + in main thread). + - server + + server This section governs overall behavior of a server endpoint. It is identified by the element "server" which takes an optional attribute, "id", which identifies this particular Pazpar2 server. Any string value for "id" may be given. - The data + + The data elements are described below. From Pazpar2 version 1.2 this is a repeatable element. @@ -119,7 +124,7 @@ - + relevance / sort / mergekey / facet @@ -169,19 +174,21 @@ - metadata + + metadata One of these elements is required for every data element in the internal representation of the record (see . It governs - subsequent processing as pertains to sorting, relevance - ranking, merging, and display of data elements. It supports - the following attributes: + subsequent processing as pertains to sorting, relevance + ranking, merging, and display of data elements. It supports + the following attributes: - name + + name This is the name of the data element. It is matched @@ -199,7 +206,8 @@ - type + + type The type of data element. 
This value governs any @@ -212,7 +220,8 @@ - brief + + brief If this is set to 'yes', then the data element is @@ -223,7 +232,8 @@ - sortkey + + sortkey Specifies that this data element is to be used for @@ -235,7 +245,8 @@ - rank + + rank Specifies that this element is to be used to @@ -251,7 +262,8 @@ - termlist + + termlist Specifies that this element is to be used as a @@ -265,7 +277,8 @@ - merge + + merge This governs whether, and how elements are extracted @@ -279,8 +292,9 @@ - - mergekey + + + mergekey If set to 'required', the value of this @@ -302,8 +316,9 @@ - - setting + + + setting This attribute allows you to make use of static database @@ -347,7 +362,8 @@ in order from top to bottom. - casemap + + casemap The attribute 'rule' defines the direction of the @@ -356,7 +372,8 @@ - transform + + transform Normalization and transformation of tokens follows @@ -364,14 +381,15 @@ possible values we refer to the extensive ICU documentation found at the ICU - transformation home page. Set filtering + transformation home page. Set filtering principles are explained at the ICU set and - filtering page. + filtering page. - tokenize + + tokenize Tokenization is the only rule in the ICU chain @@ -424,7 +442,7 @@ - + settings @@ -458,68 +476,71 @@ - - - - - EXAMPLE - Below is a working example configuration: - - - - - - - - + EXAMPLE + + Below is a working example configuration: + + + + + + + + + + - - + - - - - - - - - - - - - - - - - - ]]> - + + + + + + + + + + + + + + + + ]]> + - INCLUDE FACILITY + + INCLUDE FACILITY The XML configuration may be partitioned into multiple files by using the include element which takes a single attribute, src. The of the src attribute is regular Shell like glob-pattern. For example, - ]]> + + ]]> The include facility requires Pazpar2 version 1.2. - TARGET SETTINGS + + TARGET SETTINGS Pazpar2 features a cunning scheme by which you can associate various kinds of attributes, or settings with search targets. 
This can be done @@ -566,7 +587,7 @@ on a per-session basis. This allows the client to override specific CCL fields for searching, etc., to meet the needs of a session or user. - + Finally, as an extreme case of this, the webservice client can introduce entirely new targets, on the fly, as part of the @@ -578,7 +599,7 @@ long as the webservice client is prepared to supply the necessary information at the beginning of every session. - + The following discussion of practical issues related to session and settings @@ -586,7 +607,7 @@ technology. It would apply equally well to many other kinds of browser-based logic. - + Typically, a Javascript client is not allowed to directly alter the parameters of a session. There are two reasons for this. One has to do with access @@ -598,37 +619,38 @@ webserver. Typically, this can be handled during the session initialization, as follows: - + Step 1: The Javascript client loads, and asks the webserver for a new Pazpar2 session ID. This can be done using a Javascript call, for instance. Note that it is possible to submit Ajax XMLHttpRequest calls either to Pazpar2 or to the webserver that Pazpar2 is proxying for. See (XXX Insert link to Pazpar2 protocol). - -
+ Step 2: Code on the webserver authenticates the user, by database lookup, LDAP access, NCIP, etc. Determines which resources the user has access to, and any user-specific parameters that are to be applied during this session. - + Step 3: The webserver initializes a new Pazpar2 session, and sets user-specific parameters as necessary, using the init webservice command. A new session ID is returned. - + Step 4: The webserver returns this session ID to the Javascript client, which then uses the session ID to submit searches, show results, etc. - + Step 5: When the Javascript client ceases to use the session, Pazpar2 destroys any session-specific information. - SETTINGS FILE FORMAT + + SETTINGS FILE FORMAT Each file contains a root element named <settings>. It may contain one or more <set> elements. The settings and set @@ -637,7 +659,7 @@ specify (directly, or inherited from the parent node) at least a target, name, and value. - + target @@ -700,7 +722,7 @@ - + By setting defaults for target, name, or value in the root settings node, you can use the settings files in many different @@ -712,83 +734,84 @@ many databases with a given category or class that makes sense within your application. - + The following examples illustrate uses of the settings system to associate settings with targets to meet different requirements. - + The example below associates a set of default values that can be used across many targets. Note the wildcard for targets. This associates the given settings with all targets for which no other information is provided. + - + - - + + - - - - - - -q - - + + + + + + + + + - + - - + + - - + + - + - - + + - + - ]]> + ]]> - + The next example shows certain settings overridden for one target, one which returns XML records containing DublinCore elements, and which furthermore requires a username/password. - - - + + + + - - - ]]> + + + ]]> - + The following example associates a specific name/value combination with a number of targets.
The targets below are access-restricted, and can only be used by users with special credentials. - - - - ]]> + + + + + ]]> - + - - RESERVED SETTING NAMES + + + RESERVED SETTING NAMES The following setting names are reserved by Pazpar2 to control the behavior of the client function. @@ -877,9 +900,9 @@ q pz:queryencoding - The encoding of the search terms that a target accepts. Most - targets do not honor UTF-8 in which case this needs to be specified. - Each term in a query will be converted if this setting is given. + The encoding of the search terms that a target accepts. Most + targets do not honor UTF-8 in which case this needs to be specified. + Each term in a query will be converted if this setting is given. @@ -919,12 +942,13 @@ q performance with the alternate "MARC map" format. Provide the path of a file with extension ".mmap" containing on each line: - &lt;field&gt; &lt;subfield&gt; &lt;metadata element&gt; + &lt;field&gt; &lt;subfield&gt; &lt;metadata element&gt; For example: - 245 a title - 500 $ description - 773 * citation + 245 a title + 500 $ description + 773 * citation + To map the field value specify a subfield of '$'. To store a concatenation of all subfields, specify a subfield of '*'. @@ -944,9 +968,10 @@ q Allows or denies access to the resources it is applied to. Possible - values are '0' and '1'. The default is '1' (allow access to this resource). - See the manual section on authorization and authentication for discussion - about how to use this setting. + values are '0' and '1'. + The default is '1' (allow access to this resource). + See the manual section on authorization and authentication for + discussion about how to use this setting. @@ -1005,8 +1030,8 @@ q the protocol. - A value of 'solr' enables SOLR client support. This is supported - for Pazpar2 version 1.5.0 and later. + A value of 'solr' enables SOLR client support. This is supported + for Pazpar2 version 1.5.0 and later. @@ -1072,7 +1097,7 @@ q will be ignored.
The filter takes the form name, name~value, or name=value, which will include only records with metadata element (name) that has the substring (~value) given, or matches exactly (=value). If value is omitted all records - with the named + with the named metadata element present will be included. @@ -1107,10 +1132,10 @@ q field on the target. - - At this point only SOLR targets have been tested with this - facility. - + + At this point only SOLR targets have been tested with this + facility. + @@ -1119,22 +1144,22 @@ q pz:limitmap:name - Specifies attributes for limiting a search to a field - using - the limit parameter for search. In some cases the mapping of - a field to a value is identical to an existing cclmap field; in - other cases the field must be specified in a different way - for - example to match a complete field (rather than parts of a subfield). + Specifies attributes for limiting a search to a field - using + the limit parameter for search. In some cases the mapping of + a field to a value is identical to an existing cclmap field; in + other cases the field must be specified in a different way - for + example to match a complete field (rather than parts of a subfield). - The value of limitmap may have one of two forms: referral to - an existing CCL field or a raw PQF string. Leading string - determines type; either ccl: for CCL field or - rpn: for PQF/RPN. + The value of limitmap may have one of two forms: referral to + an existing CCL field or a raw PQF string. Leading string + determines type; either ccl: for CCL field or + rpn: for PQF/RPN. - - - The limitmap facility is supported for Pazpar2 version 1.6.0. - + + + The limitmap facility is supported for Pazpar2 version 1.6.0.
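The two limitmap forms described above can be told apart by their leading string. The Python sketch below shows one way to do that. The function name `parse_limitmap` and the sample values are hypothetical, and the sketch returns the text after the prefix unchanged rather than guessing how Pazpar2 treats whitespace there.

```python
def parse_limitmap(value):
    """Split a limitmap value into (kind, spec) based on its leading string.

    kind is "ccl" for a reference to an existing CCL field and "rpn" for a
    raw PQF/RPN string. Anything else is rejected, since the text above
    only describes these two forms.
    """
    for prefix, kind in (("ccl:", "ccl"), ("rpn:", "rpn")):
        if value.startswith(prefix):
            # Everything after the prefix is returned verbatim.
            return kind, value[len(prefix):]
    raise ValueError(
        f"limitmap value must start with 'ccl:' or 'rpn:': {value!r}"
    )


if __name__ == "__main__":
    # Sample values are made up for illustration only.
    print(parse_limitmap("ccl:title"))
    print(parse_limitmap("rpn:@attr 1=4"))
```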
+ @@ -1142,9 +1167,10 @@ q - + - SEE ALSO + + SEE ALSO pazpar2 @@ -1163,15 +1189,7 @@ q diff --git a/doc/pazpar2_protocol.xml b/doc/pazpar2_protocol.xml index 1de2ec2..d332a62 100644 --- a/doc/pazpar2_protocol.xml +++ b/doc/pazpar2_protocol.xml @@ -26,7 +26,8 @@ The webservice protocol of Pazpar2 - DESCRIPTION + + DESCRIPTION Webservice requests are any that refer to filename "search.pz2". Arguments are GET-style parameters. Argument 'command' is always required and specifies @@ -34,13 +35,15 @@ request is forwarded to the HTTP server specified in the configuration using the proxy setting. This way, a regular webserver can host the user interface (itself dynamic - or static HTML), and Ajax-style calls can be used from JS (or any other client-based - scripting environment) to interact with the search logic in Pazpar2. + or static HTML), and Ajax-style calls can be used from JS (or any other + client-based scripting environment) to interact with the search logic + in Pazpar2. Each command is described in sub sections to follow. - init + + init Initializes a session. Returns session ID to be used in subsequent requests. If @@ -63,9 +66,9 @@ ]]> - The init command may take a number of setting parameters, similar to - the 'settings' command described below. These settings are immediately - applied to the new session. Other parameters for init are: + The init command may take a number of setting parameters, similar to + the 'settings' command described below. These settings are immediately + applied to the new session. Other parameters for init are: clear @@ -92,7 +95,8 @@ - ping + + ping Keeps a session alive. An idle session will time out after one minute. 
The ping command can be used to keep the session alive absent other @@ -145,7 +149,7 @@ Example: Response: OK ]]> - - + + - search + + search Launches a search, parameters: - + session @@ -297,7 +302,7 @@ search.pz2?session=2044502273&command=stat Session ID - + @@ -321,8 +326,8 @@ search.pz2?session=2044502273&command=stat block - If block is set to 1, the command will hang until there are records ready - to display. Use this to show first records rapidly without + If block is set to 1, the command will hang until there are records + ready to display. Use this to show first records rapidly without requiring rapid polling. @@ -337,7 +342,8 @@ search.pz2?session=2044502273&command=stat field first. A sort field may be followed by a colon followed by the number '0' or '1', indicating whether results should be sorted in increasing or decreasing order according to that field. 0==Decreasing is - the default. Sort field names can be any field name designated as a sort field + the default. + Sort field names can be any field name designated as a sort field in the pazpar2.cfg file, or the special name 'relevance'. @@ -382,7 +388,7 @@ search.pz2?session=2044502273&command=show&start=0&num=2&sort=title:1 Retrieves a detailed record. Unlike the show command, this command returns metadata records before merging takes place. Parameters: - + session @@ -518,7 +524,7 @@ search.pz2?session=605047297&command=record&id=3 -Output: + Output: 3 @@ -540,8 +546,8 @@ Output: ]]> - - + + For the special termlist name "xtargets", results are returned about the targets which have returned the most hits. @@ -560,9 +566,9 @@ Output: 0 -- Z39.50 diagnostic codes ]]> - + - + bytarget @@ -585,7 +591,7 @@ Output: - + Example output: - SEE ALSO + + SEE ALSO Pazpar2: @@ -630,15 +637,7 @@ search.pz2?session=605047297&command=bytarget&id=3