ht://Dig htsearch

ht://Dig © 1995, 1996 Andrew Scherpbier andrew@sdsu.edu
Please see the file COPYING for license information.


Synopsis

htsearch

Description

Htsearch is the actual search engine of the ht://Dig search system. It is a CGI program that is expected to be invoked by an HTML form. It accepts data passed by either the GET or the POST method.

The HTML form is expected to contain at least a text field named words. This is where the user will enter the search words. Other values are:

restrict
This value is a pattern that every URL in the search results must match.
The default is blank.
exclude
This value is a pattern that no URL in the search results may match.
The default is blank.
config
Specifies the name of the configuration file. The name here is the name without the path and without the .conf at the end. This file is assumed to be located in the CONFIG_DIR directory.
The default is htdig.
method
This can be one of and, or, or boolean. It determines what type of search will be performed.
The default is specified by the match_method attribute in the configuration file.
format
This specifies the name of the template to display the search results in. There are two builtin templates named builtin-long and builtin-short which can be used.
The default is specified by the template_name attribute in the configuration file.
matchesperpage
Specifies how many matches will be displayed on each page of results.
The default is specified by the matches_per_page attribute in the configuration file.
page
This should normally not be used. It is generated by the paged results display.
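
The form fields above can also be supplied directly in a GET request. The following is a minimal sketch of how such a request URL might be assembled; the server address and the helper name build_search_url are hypothetical and not part of ht://Dig. Fields that are left out fall back to the defaults described above.

```python
from urllib.parse import urlencode

# Hypothetical location of the CGI program; adjust for the actual server.
HTSEARCH_URL = "http://www.example.com/cgi-bin/htsearch"


def build_search_url(words, **options):
    """Return a GET URL for htsearch.

    `words` is the required search text. `options` may hold any of the
    other form fields (restrict, exclude, config, method, format, ...).
    Options set to None are omitted so the configured defaults apply.
    """
    fields = {"words": words}
    fields.update({name: value for name, value in options.items()
                   if value is not None})
    # urlencode takes care of escaping spaces and special characters.
    return HTSEARCH_URL + "?" + urlencode(fields)


url = build_search_url("search engine", method="and",
                       format="builtin-short", exclude=None)
print(url)
```

The same fields could equally be sent in the body of a POST request, since htsearch accepts both methods.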

The htsearch program will normally produce HTML output. In this process it makes extensive use of template files in which variables will be substituted. The template files are specified in the configuration file. The configuration file attributes defining these templates are:

In addition to these templates, the individual search results are also displayed using templates. These result templates are specified in the template_map attribute.

There are many variables that can be substituted into these templates. Not all of them make sense for each file, so not all of them will be substituted for every file. The variables are:

EXCERPT
The relevant excerpt for the current match
URL
The URL to the document for the current match
SCORE
The score of the current match
TITLE
The title of the document for the current match
STARSRIGHT
A set of HTML <img> tags with the stars aligned on the right.
STARSLEFT
A set of HTML <img> tags with the stars aligned on the left.
SIZE
The size of the document for the current match
SIZEK
The size in kilobytes of the document for the current match
HOPCOUNT
The distance of this match away from the starting document(s).
DOCID
The internal ID for the document for the current match.
DESCRIPTIONS
A list of descriptions for the matched document. The entries in the list are separated by <br>.
MATCHES_PER_PAGE
The configured maximum number of matches on this page
MAX_STARS
The configured maximum number of stars to display in matches.
MATCH_MESSAGE
This is either all or some depending on the match method used.
MATCHES
The total number of matches that were found.
PLURAL_MATCHES
If the MATCHES variable is other than 1, this will be a single 's'.
PAGE
The current page number.
PAGES
The total number of pages.
PAGEHEADER
This expands to the value of either the page_list_header or the no_page_list_header attribute, depending on how many pages there are.
FIRSTDISPLAYED
The index of the first match on this page.
LASTDISPLAYED
The index of the last match on this page.
CGI
This expands to the value of the SCRIPT_NAME environment variable.
FORMAT
Expands to an HTML menu of all the available formats. The current format will be the default one.
METHOD
Expands to an HTML menu of all the available matching methods. The current method will be the default one.
PREVPAGE
This expands to the value of either the prev_page_text or the no_prev_page_text attribute, depending on whether there is a previous page.
NEXTPAGE
This expands to the value of either the next_page_text or the no_next_page_text attribute, depending on whether there is a next page.
PAGELIST
This expands to a list of hyperlinks using the page_number_text and no_page_number_text attributes.
SYNTAXERROR
The text of the boolean expression syntax error.
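
To illustrate how the paging variables relate to one another, the sketch below derives them from a match count, a page size, and a page number. It is inferred from the descriptions above and is not taken from the htsearch source; in particular, it assumes that match indexes and page numbers start at 1 and that there is at least one match. The function name paging_variables is hypothetical.

```python
import math


def paging_variables(matches, matches_per_page, page):
    """Derive the paging-related template variables for one results page.

    Assumes 1-based page numbers and match indexes, and matches >= 1.
    """
    pages = math.ceil(matches / matches_per_page)
    first = (page - 1) * matches_per_page + 1
    # The last page may hold fewer than matches_per_page results.
    last = min(page * matches_per_page, matches)
    return {
        "MATCHES": matches,
        "MATCHES_PER_PAGE": matches_per_page,
        # A single 's' unless there is exactly one match.
        "PLURAL_MATCHES": "" if matches == 1 else "s",
        "PAGE": page,
        "PAGES": pages,
        "FIRSTDISPLAYED": first,
        "LASTDISPLAYED": last,
    }


variables = paging_variables(matches=23, matches_per_page=10, page=3)
for name, value in variables.items():
    print(name, value)
```

With 23 matches and 10 matches per page, the third page is the last one and shows matches 21 through 23.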

Files

CONFIG_DIR/htdig.conf
The default configuration file.
COMMON_DIR/header.html
The default search results header file
COMMON_DIR/footer.html
The default search results footer file
COMMON_DIR/nomatch.html
The default 'no matches found' HTML file
COMMON_DIR/syntax.html
The default file that explains boolean expression syntax errors

See Also

htdig, htmerge, Configuration file format.

Andrew Scherpbier <andrew@sdsu.edu>
Last modified: Tue Jul 16 16:12:47 PDT