X-Git-Url: http://sru.miketaylor.org.uk/?a=blobdiff_plain;f=doc%2Fzebra.sgml;h=258af0d51aeb6f4c0459ca0909ab559fafec3dab;hb=543fcf12c500813da5f2da099eb773852fb2bdc0;hp=6774a64d76a828f5c80f64fe89e08b525b73d552;hpb=c55a1b30ba82e0343574574da5481cac2735a82b;p=idzebra-moved-to-github.git

diff --git a/doc/zebra.sgml b/doc/zebra.sgml
index 6774a64..258af0d 100644
--- a/doc/zebra.sgml
+++ b/doc/zebra.sgml
@@ -1,13 +1,13 @@
 Zebra Server - Administrators's Guide and Reference
 <author><htmlurl url="http://www.indexdata.dk/" name="Index Data">, <tt><htmlurl url="mailto:info@index.ping.dk" name="info@index.ping.dk"></>
-<date>$Revision: 1.29 $
+<date>$Revision: 1.31 $
 <abstract>
 The Zebra information server combines a versatile fielded/free-text
 search engine with a Z39.50-1995 frontend to provide a powerful and flexible
@@ -159,9 +159,6 @@ data elements in records.
 *Port the system to Windows NT.
 
 <item>
-Add index and data compression to save disk space.
-
-<item>
 Add more sophisticated relevance ranking mechanisms. Add support for soundex
 and stemming. Add relevance <it/feedback/ support.
 
@@ -989,7 +986,33 @@ For the <bf/Truncation/ attribute, <bf/No Truncation/ is the default.
 is <bf/Regxp-1/. <bf/Regxp-2/ enables the fault-tolerant (fuzzy)
 search. As a default, a single error (deletion, insertion, replacement)
 is accepted when terms are matched against the register
-contents.
+contents. <bf/Regxp-1/ and <bf/Regxp-2/ both follow the same syntax
+with the operands:
+<descrip>
+<tag/x/ Matches the character <it/x/.
+<tag/./ Matches any character.
+<tag><tt/[/..<tt/]/</tag> Matches the set of characters specified;
+ such as <tt/[abc]/ or <tt/[a-c]/.
+</descrip>
+and the operators:
+<descrip>
+<tag/x*/ Matches <it/x/ zero or more times. Priority: high.
+<tag/x+/ Matches <it/x/ one or more times. Priority: high.
+<tag/x?/ Matches <it/x/ zero or one time. Priority: high.
+<tag/xy/ Matches <it/x/, then <it/y/. Priority: medium.
+<tag/x|y/ Matches either <it/x/ or <it/y/. Priority: low.
+</descrip>
+The order of evaluation may be changed by using parentheses.
+
+If the first character of the <bf/Regxp-2/ query is a plus character
+(<tt/+/) it marks the beginning of a section with non-standard
+specifiers. The next plus character marks the end of the section.
+Currently Zebra only supports one specifier, the error tolerance,
+which consists of one digit.
+
+Since the plus operator is normally a suffix operator the addition to
+the query syntax doesn't violate the syntax for standard regular
+expressions.
 
 <sect2>Present
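
Reviewer's note, not part of the patch: the added text describes a leading `+N+` section on Regxp-2 queries that carries the error tolerance. The Python sketch below shows one way a client could split such a query into the tolerance and the remaining pattern. The function name `split_error_tolerance`, the error handling, and the sample query are assumptions made for illustration. This is not Zebra's own parser. The default of 1 follows the "single error" default stated in the manual text.

```python
def split_error_tolerance(query, default=1):
    """Split a Regxp-2 query into (error_tolerance, pattern).

    Hypothetical helper: follows the rule described in the patch, where a
    leading '+' opens a specifier section and the next '+' closes it.
    """
    if not query.startswith("+"):
        # No specifier section: fall back to the documented default.
        return default, query

    end = query.find("+", 1)
    if end == -1:
        raise ValueError("specifier section is not closed by a second '+'")

    spec = query[1:end]
    # The patch says the error tolerance consists of one digit.
    if len(spec) != 1 or spec not in "0123456789":
        raise ValueError("error tolerance must be a single digit")

    return int(spec), query[end + 1:]


if __name__ == "__main__":
    print(split_error_tolerance("+2+comput(er|ing)"))
    print(split_error_tolerance("comput(er|ing)"))
```

Because a standard regular expression cannot begin with the suffix operator `+`, checking only the first character is enough to tell a specifier section apart from an ordinary pattern.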