
Accessing Internet with Libwww

To start using the package, you must load it first:


    :- [libwww].

The general form of a Web call is as follows:


    :- libwww_request([request1, request2, ..., request_n]).
Each request has the following syntax:

    request_type(+URL, +RequestParams, -ResponseParams, -Result, -Status)
The request type functor must be one of htmlparse, xmlparse, fetch, or header. The first two are requests to parse HTML and XML pages, respectively. fetch is a request to bring in a page without parsing, and header is a request to retrieve only the header information (which is returned in the ResponseParams argument, see below).

The URL must be an atom or a string (list of characters). Request parameters must be either a variable (in which case the request is considered to have no special parameters) or a list. The following terms are allowed in that list:

The ResponseParams argument is a list of terms returned by the libwww_request call. It contains two kinds of information: header information and sub-request information.

The header information consists of terms like header('Content-Type', 'text/html'), header('Server', 'Netscape-Enterprise/3.6 SP2'), etc., as defined by the HTTP protocol (header/2 is a functor and its arguments are atoms).

The sub-request information consists of terms of the form subrequest('http://www.foo.org/test/file.html',-401). Such a term indicates that, while processing the current request, it was necessary to access another page, http://www.foo.org/test/file.html, but the server responded with the error code -401 (authentication error). Such sub-requests might be spawned during XML parsing.
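As an illustration of how a ResponseParams list might be consumed, the following sketch separates header terms from sub-request terms. The predicates header_value/3 and subrequests/2 are hypothetical helpers written for this example and are not part of the Libwww package; only the header/2 and subrequest/2 term shapes come from the description above.

```prolog
% header_value(+ResponseParams, +Name, -Value)
% Hypothetical helper: returns the value of the first header(Name, Value)
% term in the list. Fails if no such header is present.
header_value([header(Name, Value)|_], Name, Value) :- !.
header_value([_|Rest], Name, Value) :-
    header_value(Rest, Name, Value).

% subrequests(+ResponseParams, -Pairs)
% Hypothetical helper: collects every subrequest(Url, Code) term
% as a Url-Code pair, preserving the order of the list.
subrequests([], []).
subrequests([subrequest(Url, Code)|Rest], [Url-Code|Pairs]) :- !,
    subrequests(Rest, Pairs).
subrequests([_|Rest], Pairs) :-
    subrequests(Rest, Pairs).

% Sample data built from the terms quoted in the text above.
sample_response_params([header('Content-Type', 'text/html'),
                        header('Server', 'Netscape-Enterprise/3.6 SP2'),
                        subrequest('http://www.foo.org/test/file.html', -401)]).
```

In a session with the package loaded, the list would presumably come from a call such as libwww_request([header('http://www.foo.org/', _, Params, _, Status)]), after which header_value(Params, 'Content-Type', Type) could be used to inspect the reply.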

The Result of a libwww_request call depends on the request type. For fetch requests, it is an atom or a list of characters (depending on whether URL was specified as an atom or a list of characters), or it may remain an unbound variable if an error occurs. For header requests, Result is always an unbound variable.

For htmlparse and xmlparse, Result is a variable in case of an error and a complex term otherwise. In the latter case, it is a list of the form [elt1,...,elt_n], where each elt_i is of the form:


    elt(tag, [attval(attrname,value),...], [elt1',...,elt'_m])
The second argument here represents the list of attribute-value pairs. In HTML, some attributes, like checked, can be binary, in which case the corresponding value will be unbound. The third argument represents the HTML or XML elements that are within the scope of tag. These elements have the same syntax as the parent element: elt(tag',attrs,sub-elements). If a tag has no attributes or no sub-elements, the corresponding list will be empty.

One special tag, pcdata, is introduced to represent pieces of text that appear in the document. This tag is our own creation: neither HTML nor XML uses tags to represent text. One important difference between pcdata and other tags is that the third argument in elt(pcdata,...,...) holds the text itself rather than a list of sub-elements. If URL was specified as an atom, then the third argument of the pcdata element is an atom as well. If URL is a character list, then so is the corresponding argument in the pcdata element.
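The element structure described above lends itself to simple recursive traversal. The sketch below gathers the text of all pcdata elements in document order and looks up an attribute value. The predicates tree_text/2, attr_value/3 and app/3 are hypothetical helpers defined here for illustration, and the sample tree (including its tag names and the empty attribute list on pcdata) is an assumption, not output from the parser.

```prolog
% tree_text(+Elts, -Texts)
% Hypothetical helper: Texts is the list of pcdata contents found in the
% element list Elts, in document order. Each entry is an atom or a
% character list, depending on how the URL was specified.
tree_text([], []).
tree_text([elt(pcdata, _, Text)|Rest], [Text|Texts]) :- !,
    tree_text(Rest, Texts).
tree_text([elt(_, _, Children)|Rest], Texts) :-
    tree_text(Children, Inner),
    tree_text(Rest, Outer),
    app(Inner, Outer, Texts).

% attr_value(+Attrs, +Name, -Value)
% Hypothetical helper: finds the first attval(Name, Value) in Attrs.
% For a binary attribute, Value is left unbound.
attr_value([attval(Name, Value)|_], Name, Value) :- !.
attr_value([_|Rest], Name, Value) :-
    attr_value(Rest, Name, Value).

% Local list concatenation, so the example needs no imports.
app([], L, L).
app([X|Xs], L, [X|Ys]) :- app(Xs, L, Ys).

% Assumed sample tree, shaped like a parse result for an atom URL.
sample_tree([elt(p, [attval(class, intro)],
                 [elt(pcdata, [], 'Hello, '),
                  elt(b, [], [elt(pcdata, [], 'world')])])]).
```

With this sample, sample_tree(T), tree_text(T, Texts) binds Texts to ['Hello, ', 'world'], and attr_value([attval(class, intro)], class, V) binds V to intro.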

Finally, Status is bound to an integer that represents the return code from the HTTP request. A complete list of return codes is given in XSB/prolog_includes/http_errors.h. If you need to refer to error codes in your Prolog application, it is advisable to use symbolic notation. To make this happen, put the following lines at the top of your program:


    :- compiler_options([xpp_on]).
    #include "http_errors.h"
The Libwww package also includes a predicate that provides English-language explanations of the error codes:

    :- import http_liberr/3 from usermod.
The first argument of this predicate is the error code, the second is an explanation in English, and the last is the class of the error (e.g., internal, server error, client error, etc.). For full details see XSB/packages/libwww/http_liberr.P. Note that the code for a successful call is HT_LOADED (=200), not zero or one!
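Because success is signalled by 200 rather than by zero or one, it can help to check the Status of every request explicitly. The following sketch walks a list of completed request terms and reports those that did not succeed. The predicates request_ok/1 and failed_requests/2 are hypothetical and not part of the package, and the literal 200 is used only to keep the example self-contained; with xpp_on and http_errors.h included as shown above, the symbolic name HT_LOADED would be the preferable spelling.

```prolog
% request_ok(+Status)
% Hypothetical helper. The text gives HT_LOADED = 200; the literal is an
% assumption-free stand-in for the symbolic name.
request_ok(200).

% failed_requests(+Requests, -Failures)
% Hypothetical helper: Requests is a list of request terms of the form
% request_type(URL, RequestParams, ResponseParams, Result, Status) whose
% Status has been bound by libwww_request. Failures lists URL-Status for
% each request that did not return the success code.
failed_requests([], []).
failed_requests([Request|Rest], Failures) :-
    arg(1, Request, Url),
    arg(5, Request, Status),
    (   request_ok(Status)
    ->  Failures = More
    ;   Failures = [Url-Status|More]
    ),
    failed_requests(Rest, More).
```

Each Status collected this way could then be passed to http_liberr/3 to obtain its explanation and error class.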


Baoqiu Cui
2000-04-23