Additions from d.g. gilbert (gilbertd@bio.indiana.edu).

These modifications can be picked up via ftp or gopher to ftp.bio.indiana.edu. Look in folder IUBio Software+Data/util/wais for iubio-wais-8b5.tar.Z (full source) or iubio-wais-8b5.patch for a difference or patch file.

These additions include

BOOLEANS == boolean 'and' and 'not' operators. This is at a simple level. There are no nesting symbols. Words are evaluated from left to right in the wais query string.

When a 'not' operator is found, the single word following it is moved into a buffer of not-words. This not-word buffer is evaluated after all the other words are evaluated. If a document matches a not-word, that document is removed from the set of matches (given a score of zero).

When an 'and' operator is found, the word following it is checked for matches to documents, and the set of documents matching this and-word is compared to the set of documents matching any words prior to the and-word. The intersection of prior and current matching documents is retained; others are removed (set to a zero score).

Only the waisserver is affected by BOOLEANS.

For example, this query

    red and blue or yellow but not green and orange or black but not white

will be interpreted like this (the parentheses just show the implicit left-to-right interpretation):

    ((((((red and blue) or yellow) and orange) or black) not green) not white)

PARTIALWORD == The asterisk symbol '*' is parsed, if it immediately follows a word, as a key to search for all words that begin with that word (for example, hum* matches human and hummingbird).

Only the waisserver is affected by PARTIALWORD.

LITERAL == The quote or double-quote symbols are interpreted, if they occur in pairs around a string, as requesting a literal match of the enclosed string. Any special symbols ('and', 'not', and '*') inside a literal string are not interpreted. The first part of a literal string must be a word that is indexed for that data, rather than a delimiter symbol, for a match to be found.
Only the waisserver is affected by LITERAL.

BIO == these are a set of changes that include
  - an optional 'stoplist' file of words to ignore,
  - an optional set of symbols that are used as delimiters, with other graphic characters being used as valid word symbols,
  - a selection of biology data document structures

BIO affects waisindex as well as waisserver.

Internet Gopher users: You will need my patch to the GopherD Waisindex.c to use these BIO patches from Internet Gopher. See ftp.bio.indiana.edu, /util/gopher/iubio-gopher-v1.patch for these changes.

Some general restructuring of the wais-8-b5 code was also done, to make parameter passing easier for user-defined data files. This should not cause any functional difference from the 8-b5 code when the Makefile defines are turned off.

I consider the BOOLEANS modification to be the simplest and smallest change to the wais code. The LITERAL and PARTIALWORD modifications are somewhat more complex, but still fairly restricted in scope, affecting only a few waisserver modules. The BIO modifications involve changes in many areas of the wais code.

See Makefile to enable/disable these additions. You should be able to build programs functionally equivalent to wais-8-b5 by turning off these defines, or enable them individually as needed.

........
# dgg additions
# LITERAL == waisserver, search for "literal strings"
# BOOLEANS == waisserver, search with boolean AND, NOT operators
# PARTIALWORD == waisserver, search for partial words, hum* matches human, hummingbird, ...
# BIO == waisindex, waisserver changes including symbol indexing & search & bio data formats
#
#CFLAGS = -g -I$(SUPDIR) -DSECURE_SERVER -DRELEVANCE_FEEDBACK -DUSG
CFLAGS = -g -I$(SUPDIR) -DSECURE_SERVER -DRELEVANCE_FEEDBACK -DUSG -DBIO -DBOOLEANS -DPARTIALWORD -DLITERAL
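
As an illustration of the left-to-right evaluation described under BOOLEANS and the trailing '*' described under PARTIALWORD, here is a minimal sketch in Python. It is not the waisserver code (which is C), and several things in it are assumptions made only for the example: the names evaluate_query, lookup and IGNORED_WORDS are hypothetical, the index is a plain dictionary mapping each word to {document: score}, scores are combined by simple addition, 'or' and 'but' are skipped as noise words so that unmarked words are simply merged into the running set, and a document "set to a zero score" is modeled by dropping it from the result. LITERAL quoting is not handled.

```python
# Sketch only: hypothetical names, not taken from the wais sources.
IGNORED_WORDS = {"or", "but"}  # assumption: treated as noise words


def lookup(term, index):
    """Return {doc: score} for one term; a trailing '*' matches by prefix."""
    if term.endswith("*") and len(term) > 1:
        prefix = term[:-1]
        hits = {}
        for word, docs in index.items():
            if word.startswith(prefix):
                for doc, score in docs.items():
                    hits[doc] = hits.get(doc, 0) + score
        return hits
    return dict(index.get(term, {}))


def evaluate_query(query, index):
    """Evaluate a query left to right with simple 'and' / 'not' handling."""
    scores = {}      # running set of matching documents and their scores
    not_words = []   # buffer of not-words, applied after everything else
    pending = None   # operator that applies to the next word, if any

    for word in query.lower().split():
        if word in ("and", "not"):
            pending = word
            continue
        if word in IGNORED_WORDS:
            continue
        if pending == "not":
            # Only the single following word is buffered.
            not_words.append(word)
        elif pending == "and":
            # Keep the intersection of prior matches and this word's matches.
            hits = lookup(word, index)
            scores = {d: s + hits[d] for d, s in scores.items() if d in hits}
        else:
            # Default: merge this word's matches into the running set.
            for doc, score in lookup(word, index).items():
                scores[doc] = scores.get(doc, 0) + score
        pending = None

    # The not-word buffer is evaluated last: matching documents are removed.
    for word in not_words:
        for doc in lookup(word, index):
            scores.pop(doc, None)
    return scores
```

Because the not-words are applied only at the end, a document that matches 'green' or 'white' in the example query is excluded even if a later 'or' word would have matched it, which is what the parenthesized interpretation shows.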