Computer Science Department
Technical University of Madrid (UPM), Spain
Email: {dcabeza,herme,sacha}@dia.fi.upm.es
Abstract:
Keywords: WWW, HTML, HTTP, Distributed Execution, (Constraint) Logic Programming.
In fact, Prolog, its concurrent and constraint based extensions, and logic programming languages in general have many characteristics which appear to make them particularly well placed for making an impact on the development of practical networked applications, ranging from the simple to the quite sophisticated. Notably, LP/CLP systems share many characteristics with other recently proposed network programming tools, such as Java, including dynamic memory management, well-behaved structure and pointer manipulation, robustness, and compilation to architecture-independent bytecode. Furthermore, and unlike the scripting or application languages currently being proposed (e.g., shell scripts, Perl, Java, etc.), LP/CLP systems offer a unique set of additional features including dynamic databases, search facilities, grammars, sophisticated meta-programming, and well understood semantics.
In addition, most LP/CLP systems also already offer some kind of low level support for remote communication using Internet protocols. This generally involves providing a sockets (or ports) interface whereby it is possible to make remote data connections via the Internet's native protocol, TCP/IP. A few systems support higher level functionality layers on top of this interface, including Linda-style blackboards (e.g., SICStus Prolog [7], &-Prolog/CIAO [16, 17, 14], BinProlog/μ²-Prolog [28, 2], etc.) or shared variable based communication (e.g., KL1 [8], AKL [20], Oz [25], &-Prolog/CIAO [15, 4], etc.). In some cases, this functionality is provided via libraries, building on top of the basic TCP/IP primitives. This is the case, for example, of the SICStus (and CIAO) Linda interface. In fact, as we have shown, shared variable based communication can also be implemented in conventional systems via library predicates, by using attributed variables [15, 4].
WWW applications generally use higher level protocols (such as HTTP or FTP) and application architectures (e.g., the cgi-bin interface) which are different from the shared variable or linda-based protocols. In this paper we study how good support for such protocols and architectures can be provided for LP/CLP systems, building on the basic, widely available TCP/IP protocols. Our aim is to discuss from a practical point of view a number of the new issues involved in writing Internet and WWW applications using LP/CLP systems, as well as the architecture of some typical applications. In the process, we will describe PiLLoW (``Programming in Logic Languages on the Web''), an Internet/WWW programming library for LP/CLP systems which, we argue, significantly simplifies the process of writing such applications. PiLLoW provides facilities for writing cgi-scripts, generating HTML structured documents by handling them as Herbrand terms, producing HTML forms, writing form handlers, accessing and parsing WWW documents, and accessing code posted at HTTP addresses. We also describe the architecture of some relatively sophisticated application classes, using a high-level model of client-server interaction, active modules [4]. Finally we describe an architecture for automatic LP/CLP code downloading for local execution, using just the library and generic browsers.
The argument throughout the paper is that, with only very small limitations in functionality (which disappear when concurrency is added to the system, as in systems such as BinProlog/μ²-Prolog, AKL, Oz, KL1, and &-Prolog/CIAO), it is possible to add an extremely useful Internet/WWW programming layer to any LP/CLP system without making any significant changes in the implementation. We argue that this layer can simplify the generation of applications in LP/CLP systems including active WWW pages, search tools, content analyzers, indexers, software demonstrators, collaborative work systems, MUDs and MOOs, code distributors, etc.
The PiLLoW library has been developed in the context of the &-Prolog and CIAO systems, but it has been adapted to a number of popular LP/CLP systems, which support most of its functionality. This document can also serve as a WWW/HTML primer, containing sufficient information for developing relatively high-complexity WWW applications in Prolog and other LP and CLP languages.
The following is an example of how a very simple such executable can be written in an LP/CLP language. The source might be as follows:
    main :-
        write('Content-type: text/html'), nl, nl,
        write('<HTML>'),
        write('Hello world.'),
        write('</HTML>').
And the actual executable could be generated as a saved state at the system prompt in the standard way. E.g., for most Edinburgh-style systems:
    ?- compile('hello_world.pl'),
       save('/usr/local/etc/httpd/cgi-bin/hello_world'),
       main.
The address of the executable in machine www.xxx.yyy would then be
http://www.xxx.yyy/cgi-bin/hello_world.
In some systems, saved states have the disadvantage of their generally large size, but many systems have other ways of producing reasonably-sized executables. For example, in the &-Prolog/CIAO system compiled executables can be generated which are generally of smaller size than the source program.
Logic languages are, a priori, excellent candidates to be used as scripting languages: for example, the built-in grammars and databases greatly simplify many typical script-based applications. (Note that grammars are much more powerful than the regular expressions often found in other scripting languages, which in general can only provide an approximation to the solution.) However, the relative complication in making executables (needing in most systems to start the system, compile or consult the file, and make a saved state) and the often large size of the resulting executables may deter CGI application programmers. It appears convenient to provide a means for LP/CLP programs to be executable as scripts, even if with reduced performance.
It is generally relatively easy to support scripts with the same functionality in most LP/CLP systems (see [13] for an example developed for the CIAO system and adapted to SICStus, which is included with the PiLLoW library). Let's assume that lpshell is a version of the LP/CLP system (for example, a saved state), which first loads the file given to it as the first argument (but excluding the first line and routing all loading messages away from the standard output) and then starts execution at main/1 (the argument provides the list of command line options). Then, for example, in a Unix system, the following program can be run directly as a script without any need for compilation:
    #!/usr/local/bin/lpshell

    main(_) :-
        write('Content-type: text/html'), nl, nl,
        write('<HTML>'),
        write('Hello world.'),
        write('</HTML>').
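The requirement that lpshell load the file "excluding the first line" can be illustrated with a small sketch. It is not the lpshell implementation from [13]. The predicate names skip_shebang/2 and drop_line/2 are hypothetical, and the script text is assumed to be already available as a list of character codes.

```prolog
% Sketch only: drop a leading "#!" interpreter line from the text of a
% script before the rest is handed to the Prolog reader.
% skip_shebang/2 and drop_line/2 are hypothetical helper names.

% skip_shebang(+Codes, -Rest)
skip_shebang(Codes, Rest) :-
    Codes = [0'#, 0'! | _],        % the text starts with "#!"
    !,
    drop_line(Codes, Rest).
skip_shebang(Codes, Codes).        % no interpreter line: keep the text as is

% drop_line(+Codes, -Rest): Rest is whatever follows the first newline.
drop_line([], []).
drop_line([0'\n | Rest], Rest) :- !.
drop_line([_ | Codes], Rest) :-
    drop_line(Codes, Rest).
```

A loader built along these lines would then read clauses from the remaining text and finally call main/1 with the command line options.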
Since CGI executables only produce output, and this output is not a function of the input, CGI executables by themselves are only of limited interest. However, they become most useful when combined with HTML forms. HTML forms are HTML documents (or parts of HTML documents) which include special fields such as text areas, menus, radio buttons, etc. The steps involved in the handling of the input contained in a form are illustrated in Figure 2. When a document containing a form is accessed via a form-capable browser (Mosaic, Netscape, Lynx, etc.), the browser displays the input fields, buttons, menus, etc. indicated in the document, and locally allows the user to perform input by modifying such fields. However, this input is not ultimately handled by the browser. Instead, it will be sent to a ``handler'' program, which can be anywhere on the net, and whose address must be given in the form itself (1). Forms generally have a ``submit'' button such that, when pressed, the input provided through the menus, text areas, etc. is sent by the browser to the HTTP server corresponding to the handler (2). Two methods for sending this input exist: ``GET'' and ``POST''. In the meantime, the sending browser waits for a response from that program, which should come in the form of a new HTML document. The handler program is invoked in much the same way as a cgi-bin application (3), except that the information from the form is supplied to the handler (in different ways depending on the system and on whether the method is ``GET'' or ``POST'') (4). This information is encoded in a predefined format, which relates each piece of information to the corresponding field in the form, by means of a keyword associated with each field. The handler then identifies the information corresponding to each field in the original form, processes it, and then responds by writing an HTML document to its standard output (5), which is forwarded by the server to the waiting browser when the handler terminates (6).
An important point to be noted is that, as with simple cgi-bin applications, the handler is started and should terminate for each transaction. The reader is referred, for example, to http://kuhttp.cc.ukans.edu/info/forms/forms-intro.html for a more complete introduction to CGI scripts and HTML forms.
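To make the ``predefined format'' concrete, the following is a minimal sketch of decoding form input in the usual URL-encoded form (name=value pairs separated by "&", with "+" for spaces and %XX hexadecimal escapes). It is an illustration under that assumption, not the code used by PiLLoW. The predicate names parse_form/2, pairs//1, token//1 and hex/2 are hypothetical.

```prolog
% Sketch only: decode URL-encoded form data into a list of Name=Value atoms.
% All predicate names here are hypothetical.

% parse_form(+QueryAtom, -Pairs)
parse_form(Query, Pairs) :-
    atom_codes(Query, Codes),
    phrase(pairs(Pairs), Codes).

pairs([Name=Value|Ps]) -->
    token(NameCs), [0'=], token(ValueCs),
    { atom_codes(Name, NameCs), atom_codes(Value, ValueCs) },
    more_pairs(Ps).

more_pairs(Ps) --> [0'&], !, pairs(Ps).
more_pairs([]) --> [].

% token//1 decodes characters up to the next "=" or "&".
token([C|Cs]) -->                       % %XX escape
    [0'%, H1, H2],
    { hex(H1, V1), hex(H2, V2) }, !,
    { C is V1*16 + V2 },
    token(Cs).
token([32|Cs]) --> [0'+], !, token(Cs). % "+" stands for a space (code 32)
token([C|Cs]) --> [C], { C \== 0'=, C \== 0'& }, !, token(Cs).
token([]) --> [].

% hex(+Code, -Value): value of a hexadecimal digit.
hex(C, V) :- C >= 0'0, C =< 0'9, !, V is C - 0'0.
hex(C, V) :- C >= 0'a, C =< 0'f, !, V is C - 0'a + 10.
hex(C, V) :- C >= 0'A, C =< 0'F, V is C - 0'A + 10.
```

With this sketch, the query parse_form('person_name=daniel+cabeza', Ps) binds Ps to [person_name='daniel cabeza']. A handler would then look up the value associated with each field name in this list.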
For example, suppose we want to make a handler which implements a database of telephone numbers and is queried by a form including a single entry field with name person_name. The handler might be coded as follows:
    #!/usr/local/bin/lpshell

    :- use_module('/usr/local/src/pillow/pillow.pl').

    main(_) :-
        get_form_input(Input),
        get_form_value(Input,person_name,Name),
        write('Content-type: text/html'), nl, nl,
        write('<HTML><TITLE>Telephone database</TITLE>'), nl,
        write('<IMG SRC="phone.gif">'),
        write('<H2>Telephone database</H2><HR>'),
        write_info(Name),
        write('</HTML>').

    write_info(Name) :-
        form_empty_value(Name) ->
            write('You have to provide a name.')
        ; phone(Name, Phone) ->
            write('Telephone number of <B>'), write(Name), write('</B>: '),
            write(Phone)
        ; write('No telephone number available for <B>'), write(Name),
          write('</B>.').

    phone(daniel, '336-7448').
    phone(manuel, '336-7435').
    phone(sacha, '543-5316').
The code above is quite simple. On the other hand, the interspersion throughout the text of calls to write with HTML markup inside makes the code somewhat inelegant. Also, there is no separation between computation and input/output, as is normally desirable. It would be much preferable to have an encoding of HTML code as Prolog terms, which could then be manipulated easily in a more elegant way, and a predicate to translate such terms to HTML for output. This facility, provided by the PiLLoW library, is presented in the next section.
In an HTML term certain atoms and structures represent special functionality at the HTML level. An HTML term can be recursively a list of HTML terms. The following are legal HTML terms:
    hello
    [hello, world]
    ['This is an ', em('HTML'), ' term']
html_term/3 in general leaves atoms in HTML terms unchanged, but translates structures into the corresponding format in HTML, and is then applied recursively to their arguments. HTML terms may contain logic variables, provided they are instantiated before the term is translated or output. This allows creating documents piecemeal, backpatching of references in documents, etc.
In the following sections we list the meaning of the principal Prolog structures that represent special functionality at the HTML level. Only special atoms are translated; the rest are assumed to be normal text and are passed through to the HTML document.
An HTML environment has the form ``<NAME Attributes > Text </NAME>'' where NAME is the name of the environment and Attributes has the same form as before.
The general Prolog structures that represent these two HTML constructions include:
img$[src='images/map.gif',alt='A map',ismap]
is translated into the HTML source
<img src="images/map.gif" alt="A map" ismap>
Note that HTML is not case-sensitive, so we can use lower-case atoms.
address('clip@dia.fi.upm.es')
is translated into the HTML source
<address>clip@dia.fi.upm.es</address>
a([href='http://www.clip.dia.fi.upm.es/'],'Clip home')
represents the HTML source
<a href="http://www.clip.dia.fi.upm.es/">Clip home</a>
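The translation illustrated by these examples can be sketched as a small definite clause grammar. This is a simplified illustration written for this text, not the html_term/3 predicate of the library. The names html_text/2, html//1, atts//1, att//1 and text/3 are hypothetical, the infix operator declaration for $ is an assumption about how such terms are read, and append/3 is assumed to be available from a list library.

```prolog
% Sketch only: translate a subset of HTML terms into HTML text.
% html_text/2 and its helpers are hypothetical names.

:- op(150, xfx, '$').              % assumed declaration, so that img$[...] parses

% html_text(+Term, -Atom): Atom holds the HTML text for Term.
html_text(Term, Atom) :-
    phrase(html(Term), Codes),
    atom_codes(Atom, Codes).

html(Term) --> { var(Term) }, !,   % variables must be bound before output
    { throw(error(instantiation_error, html_text/2)) }.
html([]) --> !, [].
html([T|Ts]) --> !, html(T), html(Ts).
html(Name$Atts) --> !,             % element without closing tag
    text('<'), text(Name), atts(Atts), text('>').
html(env(Name,Atts,Body)) --> !,   % environment with attributes
    text('<'), text(Name), atts(Atts), text('>'),
    html(Body),
    text('</'), text(Name), text('>').
html(Term) --> { atomic(Term) }, !, text(Term).       % plain text
html(Term) --> { Term =.. [Name,Atts,Body] }, !,      % name(Atts,Body)
    html(env(Name,Atts,Body)).
html(Term) --> { Term =.. [Name,Body] },              % name(Body)
    html(env(Name,[],Body)).

atts([]) --> [].
atts([A|As]) --> text(' '), att(A), atts(As).

att(Name=Value) --> !, text(Name), text('="'), text(Value), text('"').
att(Name) --> text(Name).

% text(+Atomic, ?S0, ?S): emit the characters of Atomic.
text(X, S0, S) :-
    atom_codes(X, Cs),
    append(Cs, S, S0).
```

Under these assumptions, html_text(address('clip@dia.fi.upm.es'), A) binds A to '<address>clip@dia.fi.upm.es</address>', matching the example above. Because the translation only inspects the term, the computation that builds the term stays separate from the output step.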
Now we can rewrite the previous example as follows:
    #!/usr/local/bin/lpshell

    :- use_module('/usr/local/src/pillow/pillow.pl').

    main(_) :-
        get_form_input(Input),
        get_form_value(Input,person_name,Name),
        response(Name,Response),
        output_html([
            'Content-type: text/html',
            html([title('Telephone database'),
                  img$[src='phone.gif'],
                  h2('Telephone database'),
                  hr$[],
                  Response])]).

    response(Name, Response) :-
        form_empty_value(Name) ->
            Response = 'You have to provide a name.'
        ; phone(Name, Phone) ->
            Response = ['Telephone number of ',b(Name),': ',Phone]
        ; Response = ['No telephone number available for ',b(Name),'.'].

    phone(daniel, '336-7448').
    phone(manuel, '336-7435').
    phone(sacha, '543-5316').
Any HTML construction can be represented with these structures (except comments, which could be included as atoms), but the PiLLoW library provides additional, specific structures to simplify HTML creation.
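start: used at the beginning of a document (translates to <html>).
end: used at the end of a document (translates to </html>).
--: produces a horizontal rule (translates to <hr>).
\\: produces a line break (translates to <br>).
$: produces a paragraph break (translates to <p>).
image(Addr): includes the image at address Addr (translates to an <img> element).
A structure for hypertext links, with address Addr and text Text (translates to <a href="Addr">Text</a>).
A structure for link targets, with label Label and text Text (translates to <a name="Label">Text</a>).
heading(N,Text): produces a heading of level N (translates to an <hN> environment).
A structure for itemized lists (translates to a <ul> environment).
A structure for preformatted text (translates to a <pre> environment).

With these additional structures, we can rewrite the previous example as follows (note that in this particular example the use of heading/2 or h2/1 is equally suitable):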
    #!/usr/local/bin/lpshell

    :- use_module('/usr/local/src/pillow/pillow.pl').

    main(_) :-
        get_form_input(Input),
        get_form_value(Input,person_name,Name),
        response(Name,Response),
        output_html([
            form_reply,
            start,
            title('Telephone database'),
            image('phone.gif'),
            heading(2,'Telephone database'),
            --,
            Response,
            end]).

    response(Name, Response) :-
        form_empty_value(Name) ->
            Response = 'You have to provide a name.'
        ; phone(Name, Phone) ->
            Response = ['Telephone number of ',b(Name),': ',Phone]
        ; Response = ['No telephone number available for ',b(Name),'.'].

    phone(daniel, '336-7448').
    phone(manuel, '336-7435').
    phone(sacha, '543-5316').
We have not included above the specific structures for creating forms. They are presented and explained in the following section.
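start_form(Addr): begins a form whose handler is at address Addr (translates to <form method="POST" action="Addr">).
end_form: ends a form (translates to </form>).
A structure for a checkbox with name Name, State=on if the checkbox is initially checked (translates to an <input> element).
A structure for a radio button with name Name (several radio buttons which are interlocked must share their name), Value is the value returned by the button, if Selected=Value the button is initially checked (translates to an <input> element).
input(Type,Atts): other types of input such as text, hidden, submit, reset, ... (translates to an <input> element).
A structure for text areas (translates to a <textarea> environment).
A structure for selection menus (translates to a <select> environment).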
For example, in order to generate a form suitable for sending input to the previously described phone database handler one could type at a Prolog prompt:
    ?- ['/usr/local/src/pillow/pillow.pl'],
       output_html([
           start,
           title('Telephone database'),
           heading(2,'Telephone database'),
           $,
           start_form('http://www.clip.dia.fi.upm.es/cgi-bin/phone_db.pl'),
           'Click here, enter name of clip member, and press Return:',
           \\,
           input(text,[name=person_name,size=20]),
           end_form,
           end]).
Of course, one could have also simply written directly the resulting HTML document:
    <html>
    <title>Telephone database</title>
    <h2>Telephone database</h2>
    <p>
    <form method="POST" action="http://www.clip.dia.fi.upm.es/cgi-bin/phone_db.pl">
    Click here, enter name of clip member, and press Return:
    <br>
    <input type="text" name="person_name" size="20">
    </form>
    </html>
    #!/usr/local/bin/lpshell

    :- use_module('/usr/local/src/pillow/pillow.pl').

    main(_) :-
        get_form_input(Input),
        get_form_value(Input,person_name,Name),
        response(Name,Response),
        my_url(MyURL),
        output_html([
            form_reply,
            start,
            title('Telephone database'),
            image('phone.gif'),
            heading(2,'Telephone database'),
            --,
            Response,
            start_form(MyURL),
            'Click here, enter name of clip member, and press Return:',
            \\,
            input(text,[name=person_name,size=20]),
            end_form,
            end]).

    response(Name, Response) :-
        form_empty_value(Name) ->
            Response = []
        ; phone(Name, Phone) ->
            Response = ['Telephone number of ',b(Name),': ',Phone,$]
        ; Response = ['No telephone number available for ',b(Name),'.',$].

    phone(daniel, '336-7448').
    phone(manuel, '336-7435').
    phone(sacha, '543-5316').
This combination of the form producer and the handler allows producing applications that give the impression of being interactive, even if each step involves starting and running the handler to completion. Note that forms can contain fields which are not displayed and are passed as input to the next invocation of the handler. This allows passing state from one invocation of the handler to the next one.
GlobalOptions provides a means of specifying options which will apply to all the requests generated by fetch_urls. At present these include:
It is worth noting that fetch_urls only fetches an entire document when it absolutely has to, which is normally only when the content or file options are included. In any other case, it merely requests the document header from the server, which is clearly more efficient in terms of speed, amount of memory used for the request, and amount of network traffic generated. Ideally, document content ought to be fetched only if the matching of the option output arguments related to the header succeeds, but currently the predicate unifies them after the fetching. Although not explained here for brevity, comprehensive treatment of errors is also provided in the PiLLoW package.
For example, a simple fetch of a document can be done as follows:
    fetch_urls(doc('http://www.foo.com',[content(D)]),[]).

Note that the first argument does not need to be a list since only one document is being fetched. For convenience, the functionality of the predicate above is provided also simply as fetch_url(URL,Content). The following call fetches two documents, getting also the type and size of the first, and checking for non_fatal errors in the second, allowing only one socket for use:
    fetch_urls([doc('http://www.foo.com',
                    [content(D1),
                     content_length(S1),
                     content_type(T1)]),
                doc('http://www.bar.com/~hello/world.html',
                    [content(D2),
                     errors(non_fatal,E)])
               ],
               [sockets(1)]
              ).
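The decision between fetching the whole document and fetching only its header can be sketched as follows. This is an illustration, not the code of fetch_urls. It assumes that a header-only request is made with the HTTP HEAD method and a full fetch with GET, which the text does not state explicitly. The predicates needs_content/1, request_method/2 and request_line/3 are hypothetical, and atomic_list_concat/2 is assumed to be available as in SWI-Prolog.

```prolog
% Sketch only: choose the cheapest HTTP request that satisfies the options
% given for a document. All predicate names are hypothetical.

% needs_content(+Options): some option requires the document body.
needs_content(Options) :- member(content(_), Options), !.
needs_content(Options) :- member(file(_), Options).

% request_method(+Options, -Method)
request_method(Options, 'GET') :- needs_content(Options), !.
request_method(_, 'HEAD').         % assumed: header-only request

% request_line(+Method, +Path, -Line): first line of an HTTP/1.0 request.
request_line(Method, Path, Line) :-
    atomic_list_concat([Method, ' ', Path, ' HTTP/1.0'], Line).
```

For the two documents in the call above, both option lists contain content/1, so both would be fetched in full. A document requested only with errors(fatal,Errors), as in a remote link check, would need just its header.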
Other useful predicates provided are:
The following is a simple application illustrating the use of fetch_urls and ascii2terms. The example defines check_links(URL,BadLinks). The predicate jumps to the HTML document pointed to by URL, extracts all the links in that document, and then recursively traverses each link to see if it is a valid one (that is, it checks that no fatal errors occur when it attempts to jump through a link). Recursion does not venture into documents that are not on URL's host machine. The list BadLinks contains all the bad links found, stored as compound terms of the form: badlink(Link,URL,Errors) where Link is the problematic link, URL is the URL of the parent document containing the bad link, and Errors is the list of errors that occurred when check_links attempted to jump through the link. Note that this code, for simplicity, does not handle cyclic references.
    check_links(URL,BadLinks) :-
        url_info(URL,Protocol,Host,_,_),
        supported_protocol(Protocol),
        check_link(Host,Host,URL,first_link,BadLinks/[]).

    check_link(Host,Host,Link,Parent,BadLinks/BadLinks_) :- !,   % Local link
        fetch_urls(doc(Link,[content(Content),content_type(Type),
                             errors(fatal,Errors)]),[]),
        ( Errors \== [] ->
            BadLinks = [badlink(Link,Parent,Errors)|BadLinks_]
        ; Type == 'text/html' ->
            ascii2terms(Content,Terms),
            extract_links(Terms,Links/[]),
            check_list_links(Links,Host,Link,BadLinks/BadLinks_)
        ; BadLinks = BadLinks_
        ).
    check_link(Host,_LinkHost,Link,Parent,BadLinks/BadLinks_) :- % Remote link
        fetch_urls(doc(Link,[errors(fatal,Errors)]),[]),
        ( Errors \== [] ->
            BadLinks = [badlink(Link,Parent,Errors)|BadLinks_]
        ; BadLinks = BadLinks_
        ).

    check_list_links([],_Host,_Parent,BadLinks_/BadLinks_).
    check_list_links([Link|MoreLinks],Host,Parent,BadLinks/BadLinks_) :-
        url_info(Link,Protocol,LinkHost,_,_),
        ( supported_protocol(Protocol) ->
            check_link(Host,LinkHost,Link,Parent,BadLinks/BadLinks1)
        ; BadLinks = BadLinks1
        ),
        check_list_links(MoreLinks,Host,Parent,BadLinks1/BadLinks_).

    extract_links([E|Es], Links/Links_) :- !,
        extract_links(E,Links/Links1),
        extract_links(Es,Links1/Links_).
    extract_links(env(a,AnchorAtts,_),[URL|Links_]/Links_) :-
        member((href=URL),AnchorAtts), !.
    extract_links(env(_Name,_Atts,Env_html),LinksDL) :- !,
        extract_links(Env_html,LinksDL).
    extract_links(_,Links_/Links_).
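Since the code above does not handle cyclic references, it may be useful to sketch one common way of avoiding them: threading a list of already visited documents through the traversal. The sketch below is not part of the example. It works on a toy link graph given by hypothetical link/2 facts rather than on fetched documents, and the names reachable/2 and visit/3 are also hypothetical. The list predicates are assumed to be available from a list library.

```prolog
% Sketch only: traverse a link graph, visiting each page once, so that
% cycles do not cause non-termination.

% link(From, To): toy link graph standing in for fetched documents.
link(a, b).
link(b, c).
link(c, a).                        % cycle back to the start
link(c, d).

% reachable(+Start, -Pages): pages reachable from Start, in visiting order.
reachable(Start, Pages) :-
    visit([Start], [], Visited),
    reverse(Visited, Pages).

% visit(+Pending, +Seen, -Visited)
visit([], Seen, Seen).
visit([P|Ps], Seen, Visited) :-
    (   memberchk(P, Seen)
    ->  visit(Ps, Seen, Visited)           % already checked: skip it
    ;   findall(Q, link(P, Q), Next),
        append(Next, Ps, Pending),         % depth-first order
        visit(Pending, [P|Seen], Visited)
    ).
```

The same idea could be applied to check_link by adding an argument pair for the visited links, at the cost of some extra bookkeeping in each clause.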
    #!/usr/local/bin/lpshell

    :- use_module('http://www.clip.dia.fi.upm.es/lib/pillow.pl').

    main(_) :-
        get_form_input(Input),
        get_form_value(Input,person_name,Name),
        ...

would load the current version of the library each time it is executed. This generalized module declaration is just syntactic sugar, using expand_term, for a document fetch, using fetch_url, followed by a standard use_module declaration. It is obviously interesting to combine this facility with caching strategies. An interesting (and straightforward to implement) additional feature is to fetch remote byte-code (as generally done by use_module), if available, but this is only possible if the two systems use the same byte-code (this can normally be checked easily in the bytecode itself). Also, it may be interesting to combine this type of code downloading with WWW document accesses, so that code is downloaded automatically when a particular document is fetched. This issue is addressed in Section 11. Finally, there are obvious security issues related to downloading code in general, which can be addressed with standard techniques such as security signatures.
Despite its power, the cgi-bin interface also has some shortcomings. The most serious is perhaps the fact that the handler is started and expected to terminate for each interaction. This has two disadvantages. First, no state is preserved from one query to the next. However, as mentioned before, this can be fixed by passing the state through the form or also by saving it in a temporary file at the server side. Second, and more importantly, starting and stopping the application may be inefficient. For example, if the idea is to query a large database or a natural language understanding system, it may take a long time to start and stop the system. In order to avoid this we propose an alternative architecture for cgi-bin applications (a similar idea, although not based on the idea of active modules, has been proposed independently by Ken Bowen [3]).
The basic idea is illustrated in Figure 3. The operation is identical to that of standard form handlers, as illustrated in Figure 2, up to step 3. In this step, the handler started is not the application itself, but rather an interface to the actual application, which is running continuously and thus contains state. Thus, only the interface is started and stopped with every transaction. The interface simply passes the form input received from the server (4) to the running application (5) and then forwards the output from the application (6) to the server before terminating, while the application itself continues running. Both the interface and the application can be written in LP/CLP, using the predicates presented. The interface can be a simple script, while the application itself will be typically compiled.
An interesting issue is that of communication between interface and application. This can of course be done through sockets. However, as a cleaner and much simpler alternative, the concept of active modules [4] can be used to advantage in this application. An active module (or an active object, if modularity is implemented via objects) is an ordinary module to which computational resources are attached (for example, a process on a UNIX machine), and which resides at a given (socket) address on the network. Compiling an active module produces an executable which, when running, acts as a server for a number of relations, which are the predicates exported by the module. The relations exported by the active module can be accessed by any program on the network by simply ``loading'' the module and thus importing such ``remote relations.'' The idea is that the process of loading an active module does not involve transferring any code, but rather setting up things so that calls in the local module are executed as remote procedure calls to the active module, possibly over the network. Except for saving it in a special way, an active module is identical from the programmer's point of view to an ordinary module. Also, a program using an active module imports it and uses it in the same way as any other module, except that it uses ``use_active_module'' rather than ``use_module'' (see below). Also, an active module has an address (network address) which must be known in order to use it. The address can be announced by the active module when it is started via a file or a name server (which would be itself another active module with a fixed address).
We now present the constructs used by active modules. Note that for concreteness and compatibility in the description of modules we mainly follow the same scheme as SICStus Prolog.
Note that this scheme is very flexible. For example, the predicate module_address/2 itself could be imported, thus allowing a configurable standard way of locating active modules. One could, for example, use a directory accessible by all the involved machines to store the addresses of the active modules in them, and this predicate would examine this directory to find the required data. A more elegant solution would be to implement a name server, that is, an active module with a known address that records the addresses of active modules and supplies this data to the modules that actively import it.
From the implementation point of view, active modules are essentially daemons: Prolog executables which are started as independent processes at the operating system level. In the CIAO system library, communication with active modules is implemented using sockets (thus, the address of an active module is a UNIX socket in a machine). Requests to execute goals in the module are sent through the socket by remote programs. When such a request arrives, the process running the active module takes it and executes it, returning through the socket the computed results. These results are then taken by the remote processes.
Thus, when the compiler finds a use_active_module declaration, it defines the imported predicates as remote calls to the active module. For example, if the predicate P is imported from the active module M, the predicate would be defined as
P :- module_address(M,A), remote_call(A,P)
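The construction of such a definition from an import specification can be sketched as follows. This is only an illustration of the scheme just described, not the compiler's actual code, and the predicate name remote_stub/3 is hypothetical.

```prolog
% Sketch only: build the clause that defines an imported predicate as a
% remote call to the active module. remote_stub/3 is a hypothetical name.

% remote_stub(+Module, +Name/Arity, -Clause)
remote_stub(Module, Name/Arity, Clause) :-
    functor(Head, Name, Arity),    % most general call, e.g. response(_,_)
    Clause = ( Head :- module_address(Module, Address),
                       remote_call(Address, Head) ).
```

For example, remote_stub(phone_db, response/2, C) binds C to a clause of the form response(X,Y) :- module_address(phone_db,A), remote_call(A,response(X,Y)). A compiler could add one such clause per imported predicate when it processes a use_active_module declaration.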
The predicate save_active_module/3 saves the current code like save/1, but when the execution is started a socket is created whose address is the second argument of the predicate, and the expression in the third argument is executed. Then, the execution goes into a loop of reading execution requests from the socket, executing them, and returning the solutions back through the socket.
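The request loop can be sketched without any networking by letting a list of goals stand in for the requests read from the socket, and a list of replies stand in for what is written back. This is a simplified model, not the CIAO implementation. The names serve/2 and answer/2 and the reply terms solutions/1 and error/1 are hypothetical.

```prolog
% Sketch only: the serving loop of an active module, with lists standing
% in for the socket. All names below except phone/2 are hypothetical.

:- dynamic phone/2.
phone(daniel, '336-7448').
phone(manuel, '336-7435').

% serve(+Requests, -Replies): answer each requested goal in turn.
serve([], []).
serve([Goal|Goals], [Reply|Replies]) :-
    answer(Goal, Reply),
    serve(Goals, Replies).

% answer(+Goal, -Reply): all solutions of Goal, or the error it raised.
answer(Goal, Reply) :-
    catch(( findall(Goal, Goal, Solutions),
            Reply = solutions(Solutions) ),
          Error,
          Reply = error(Error)).
```

Here serve([phone(daniel,P)], Replies) binds Replies to [solutions([phone(daniel,'336-7448')])]. Catching errors in answer/2 keeps one failing request from stopping the loop, which matters for a process that is meant to run continuously.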
Compiling the following program creates an executable phone_db which, when started as a process (for example, by typing ``phone_db &'' at a UNIX shell prompt) saves its address (i.e., that of its socket) in file phone_db.addr and waits for queries from any module which ``imports'' this module (it also provides a predicate to dynamically add information to the database):
    :- module(phone_db,[response/2,add_phone/2]).

    response(Name, Response) :-
        form_empty_value(Name) ->
            Response = 'You have to provide a name.'
        ; phone(Name, Phone) ->
            Response = ['Telephone number of ',b(Name),': ',Phone]
        ; Response = ['No telephone number available for ',b(Name),'.'].

    add_phone(Name, Phone) :-
        assert(phone(Name, Phone)).

    :- dynamic phone/2.

    phone(daniel, '336-7448').
    phone(manuel, '336-7435').
    phone(sacha, '543-5316').

    :- save_active_module(phone_db,Address,
           (tell('phone_db.addr'), write(Address), told)).
The following simple script can be used as a cgi-bin executable which will be the active module interface for the previous active module. When started, it will process the form input, issue a call to response/2 (which will be automatically handled by the phone_db active module), and produce a new form before terminating. It will locate the address of the phone_db active module via the module_address/2 predicate it defines.
    #!/usr/local/bin/lpshell

    :- use_active_module(phone_db,[response/2]).
    :- use_module('/usr/local/src/pillow/pillow.pl').

    main(_) :-
        get_form_input(Input),
        get_form_value(Input,person_name,Name),
        response(Name,Response),
        my_url(MyURL),
        output_html([
            form_reply,
            start,
            title('Telephone database'),
            image('phone.gif'),
            heading(2,'Telephone database'),
            --,
            Response,
            $,
            start_form(MyURL),
            'Click here, enter name of clip member, and press Return:',
            \\,
            input(text,[name=person_name,size=20]),
            end_form,
            end]).

    module_address(phone_db,Address) :-
        see('phone_db.addr'),
        read(Address),
        seen.
There are many enhancements to this simple schema which, for brevity, are only sketched here. One is to add concurrency to the active module (or whatever means of handling the client-server interaction is being used), in order to handle queries from different clients concurrently. This is easy to do in systems that support concurrency natively, such as &-Prolog/CIAO, BinProlog/μ²-Prolog, AKL, Oz, and KL1. We feel that &-Prolog/CIAO can offer advantages in this area because it offers compatibility with Prolog and CLP systems while at the same time efficiently supporting concurrent execution of clause goals via local or distributed threads. Such goals can communicate at different levels of abstraction: sockets/ports, blackboard, or shared variables. BinProlog/μ²-Prolog also supports threads, although the communication mechanisms are somewhat different. Finally, as shown in [27], it is also possible to exploit the concurrency present in or-parallel Prolog systems such as Aurora for implementing a multiprocessing server.
It is also interesting to set things up so that a single active module can handle different forms. This can even be done dynamically (i.e., the capabilities of the active module are augmented on the fly, so that it can handle a new form), by designating a directory in which code to be loaded by the active module is placed, and having the active module consult the directory periodically to extend its functionality. Finally, another important issue that has not been addressed is that of providing security, i.e., ensuring that only allowed clients connect to the active module. As in the case of remote code downloading, standard forms of authentication based on codes can be used.
An implementation of active modules as described is included in the CIAO library which provides the concurrent and distributed execution facilities mentioned above [4]. Like the PiLLoW library, this library only uses standard features of LP/CLP systems, although it does require attributed variables [18] if shared-variable communication is to be used [15]. However, this is not necessary for implementing active modules.
In this section we describe an architecture which, using only the facilities we have presented in previous sections, allows the downloading and local execution of Prolog (or other LP/CLP) code by accessing a WWW address, without requiring a special browser. This is a complementary approach to giving WWW access to an active module in the sense that it provides code which will be executed in the client machine. More concretely, the functionality that we desire is that by simply clicking on a WWW pointer, and transparently for the user, remote Prolog code is automatically downloaded in such a way that it can be queried via forms and all the processing is done locally.
To allow this, the HTTP server on the server machine is configured to give a specific MIME type (for example application/x-prolog) to the files which will hold WWW-downloadable Prolog code (for example those with a special suffix, like .wpl). On the other side, the browser is configured to start the wpl_handler helper application when receiving data of type application/x-prolog. This wpl_handler application is the interface to a Prolog engine which will execute the WWW downloaded code, acting as an active module. We now sketch the procedure (see Figure 4):
The net effect of the approach is that by simply clicking on a WWW pointer, remote Prolog code is automatically downloaded to a local Prolog engine. Queries posed via the form are answered locally by the Prolog engine.
There are obvious security issues that need to be taken care of in this architecture. Again, standard authentication techniques can be used. However, since source code is being passed around, it is comparatively easy to verify that no dangerous predicates (for example, perhaps those that can access files) are executed. Note again that it is also possible to download bytecode, since this is supported by most current LP/CLP systems, using a similar approach.
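As an illustration of why such verification is comparatively easy on source code, the following sketch checks a list of clauses against a blacklist before they are loaded. It is only a sketch under stated assumptions: the predicates dangerous/1, safe_program/1, safe_clause/1 and safe_body/1 are hypothetical names, the blacklist shown is an arbitrary example, and only a few control constructs are traversed. Directives, grammar rules, and goals that are unbound variables are rejected conservatively.

```prolog
% Hypothetical blacklist; a real deployment would choose its own
% (a whitelist of permitted predicates would be stricter).
dangerous(open/3).
dangerous(open/4).
dangerous(see/1).
dangerous(tell/1).
dangerous(consult/1).

% safe_program(+Clauses): every clause in the list passes safe_clause/1.
safe_program([]).
safe_program([C|Cs]) :- safe_clause(C), safe_program(Cs).

% safe_clause(+Clause)
safe_clause((:- _)) :- !, fail.            % directives are rejected outright
safe_clause((_ --> _)) :- !, fail.         % grammar rules: not handled here
safe_clause((_ :- Body)) :- !, safe_body(Body).
safe_clause(_).                            % facts have no body to check

% safe_body(+Goal): no goal reachable in Goal is on the blacklist.
safe_body(G) :- var(G), !, fail.           % unknown goal: reject
safe_body((A, B)) :- !, safe_body(A), safe_body(B).
safe_body((A ; B)) :- !, safe_body(A), safe_body(B).
safe_body((A -> B)) :- !, safe_body(A), safe_body(B).
safe_body(\+ A) :- !, safe_body(A).
safe_body(call(G)) :- !, safe_body(G).
safe_body(findall(_, G, _)) :- !, safe_body(G).
safe_body(G) :- functor(G, N, A), \+ dangerous(N/A).
```

With these definitions, `safe_clause((p(X) :- q(X), \+ r(X)))` succeeds, whereas `safe_clause((p :- q, open(f, read, _)))` and `safe_clause((p(G) :- call(G)))` fail. The clauses themselves could be obtained by reading the downloaded file term by term with read/2, which parses terms without executing them.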
The main previous body of work that we know of on general-purpose interfacing of logic programming and the WWW is the LogicWeb [22] system, by S.W. Loke and A. Davison. The aim of LogicWeb is to use logic programming to extend the concept of WWW pages, incorporating in them programmable behaviour and state. In this, it shares goals with Java. It also offers rich primitives for accessing code in remote pages and for module structuring. The aims of LogicWeb are different from those of html.pl/PiLLoW. LogicWeb is presented as a system in itself, and its implementation is done through a tight integration with the Mosaic browser, making use of special features of this browser. In contrast, html.pl/PiLLoW is a general-purpose library, meant to be used by general computational logic systems, and is browser-independent. html.pl/PiLLoW offers a wide range of functionalities, such as syntax conversion between HTML and logic terms, access predicates for WWW pages, predicates for handling forms, etc., which are generally at a somewhat lower level of abstraction than those of LogicWeb. We believe that, using PiLLoW and the ideas sketched in this paper, it is possible to add the quite interesting functionality offered by LogicWeb to standard LP and CLP systems. We have shown some examples, including access to passive remote code (modules with an ftp or http address) from programs, and automatic remote code access and querying using standard browsers and forms. In addition, we have discussed active remote code, where the functionality, rather than the code itself, is exported.
Also, much related work is being presented in this workshop. The work in [26] is based on LogicWeb, and aims to provide distributed lightweight databases on the WWW. As with the basic LogicWeb system, we believe that the PiLLoW library can be used to implement in other systems the interesting ideas proposed therein. As briefly mentioned before, the work in [27] proposes an architecture similar to that of our active modules in order to handle form requests. In this solution the handling of multiple requests is performed by using or-parallelism. While we feel that and-parallelism is more natural for modelling concurrency, the ideas proposed are quite interesting. The ECLiPSe HTTP-library [24], aimed at implementing Internet agents, offers functionality that is in part similar to that of the CIAO html.pl/PiLLoW libraries, including facilities that are similar to our active modules, although the interface provided for these is at a lower level. The approach is different, however, in several respects. The ECLiPSe library implements special HTTP servers and clients. In contrast, PiLLoW uses standard HTTP servers and interfaces. Using special-purpose servers may be interesting because the approach possibly allows greater functionality. On the other hand, this approach in general requires either replacing the standard server on a given machine or setting up the special server at a different socket address from the standard one. Finally, other papers describing very interesting WWW applications are being presented, which underline the suitability of computational logic systems for the task. We believe that the CIAO PiLLoW library can contribute to making it even easier to develop such applications in the future.
We are currently working on extended versions of the library which may, for example, make extensive use of concurrency internally (on those LP/CLP systems that support it) to overlap network requests, and which may include support for (active) VRML (some quite interesting work in this area, in the motivating context of MOOs, is presented in [29]). We are also considering interfaces with the Java language, including making the LP/CLP system available as a Java library and also calling Java from the LP/CLP system in order to use its libraries. Finally, we are also considering the possibility of compiling LP/CLP code to the Java abstract machine. This seems possible, although at a cost in performance with respect to a direct WAM-like implementation, since the Java abstract machine does not have built-in support for unification or backtracking, which would have to be interpreted.
In addition to being part of the &-Prolog/CIAO system, the PiLLoW library is being provided as a public domain standard library for SICStus Prolog and other Prolog and CLP systems, with most of its functionality supported on these systems. Please contact the authors or consult our WWW site http://www.clip.dia.fi.upm.es for details.
The authors would like to thank Mats Carlsson, Tony Beaumont, Ken Bowen, Michael Codish, Markus Fromherz, Paul Tarau, Andrew Davison, and Koen De Bosschere for useful feedback on previous versions of the presented libraries. The first versions of the CIAO system and the html.pl library were developed under partial support from the ACCLAIM ESPRIT project.