Persistent predicate database
Author(s): José Manuel Gómez Pérez, Daniel Cabeza, Manuel Hermenegildo, The CLIP Group.
Introduction to persistent predicates
This library implements a generic persistent predicate database. The basic notion implemented by the library is that of a persistent predicate. The persistent predicate concept provides a simple, yet powerful generic persistent data access method [CHGT98,Par97]. A persistent predicate is a special kind of dynamic, data predicate that “resides” in some persistent medium (such as a set of files, a database, etc.) that is typically external to the program using such predicates. The main effect is that any changes made to a persistent predicate from a program “survive” across executions. That is, if the program is halted and restarted, the predicate that the new process sees is in precisely the same state as it was when the old process was halted (provided no change was made to the storage in the meantime by other processes or the user).
Persistent predicates appear to a program as ordinary predicates, and calls to these predicates can appear in clause bodies in the usual way. However, the definitions of these predicates do not appear in the program. Instead, the library automatically maintains, in the persistent storage, the definitions of the predicates which have been declared as persistent.
Updates to persistent predicates can be made using enhanced versions of asserta_fact/1, assertz_fact/1 and retract_fact/1. The library makes sure that each update is a transactional update, in the sense that if the update terminates, then the permanent storage has definitely been modified. For example, if the program making the updates is halted just after the update and then restarted, then the updated state of the predicate will be seen. This provides security against possible data loss due to, for example, a system crash. Also, due to the atomicity of the transactions, persistent predicates allow concurrent updates from several programs.
Persistent predicates, files, and relational databases
The concept of persistent predicates provided by this library essentially implements a light-weight, simple, and at the same time powerful form of relational database (a deductive database) which is standalone, in the sense that it does not require external support other than the file management capabilities provided by the operating system. This is because the persistent predicates are in fact stored in one or more auxiliary files below a given directory.
This type of database is especially useful when building small to medium-sized standalone applications in Prolog which require persistent storage. In many cases it provides a much easier way of implementing such storage than using files under direct program control. For example, interactive applications can use persistent predicates to represent their internal state in a way that is close to the application. The persistence of such predicates then allows automatically restoring the state to that at the end of a previous session. Using persistent predicates amounts to simply declaring some predicates as such and eliminates having to worry about opening files, closing them, recovering from system crashes, etc.
In other cases, however, it may be convenient to use a relational database as persistent storage. This may be the case, for example, when the data already resides in such a database (where it is perhaps accessed also by other applications) or the volume of data is very large. persdb_sql [CCG98] is a companion library which implements the same notion of persistent predicates used herein, but keeping the storage in a relational database. This provides a very natural and transparent way to access SQL database relations from a Prolog program. In that library, facilities are also provided for reflecting more complex views of the database relations as predicates. Such views can be constructed as conjunctions, disjunctions, projections, etc. of database relations, and may include SQL-like aggregation operations.
A nice characteristic of the notion of persistent predicates used in both of these libraries is that it abstracts away how the predicate is actually stored. Thus, a program can use persistent predicates stored in files or in external relational databases interchangeably, and the type of storage used for a given predicate can be changed without having to modify the program (except for replacing the corresponding persistent/2 declarations).
An example application of the persdb and persdb_sql libraries (and also the pillow library [CH97]) is WebDB [GCH98]. WebDB is a generic, highly customizable deductive database engine with an HTML interface. WebDB allows creating and maintaining Prolog-based databases as well as relational databases (residing in conventional relational database engines) using any standard WWW browser.
Using file-based persistent predicates
Persistent predicates can be declared statically, using persistent/2 declarations (which is the preferred method, when possible), or dynamically via calls to make_persistent/2. Currently, persistent predicates may only contain facts, i.e., they are dynamic predicates of type data/1.
Predicates declared as persistent are linked to a directory, and the persistent state of the predicate will be kept in several files below that directory. The files in which the persistent predicates are stored are in readable, plain ASCII format, and in Prolog syntax. One advantage of this approach is that such files can also be created or edited by hand, in a text editor, or even by other applications.
An example definition of a persistent predicate implemented by files follows:
    :- persistent(p/3,dbdir).
    persistent_dir(dbdir, '/home/clip/public_html/db').
The first line declares the predicate p/3 persistent. The argument dbdir is a key used to index into a fact of the relation persistent_dir/2-4, which specifies the directory where the corresponding files will be kept. The effect of the declaration, together with the persistent_dir/2-4 fact, is that, although the predicate is handled in the same way as a normal data predicate, in addition the system will create and maintain efficiently a persistent version of p/3 via files in the directory /home/clip/public_html/db.
The level of indirection provided by the dbdir argument makes it easy to place the storage of several persistent predicates in a common directory, by specifying the same key for all of them. It also allows changing the directory for several such persistent predicates by modifying only one fact in the program. Furthermore, the persistent_dir/2-4 predicate can even be dynamic and specified at run-time.
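As a minimal sketch of how the declaration, the directory fact, and ordinary calls fit together, consider the module below. The key notes_dir, the directory './notes_db' (which has to exist before the program runs), and the predicate note/1 are hypothetical names chosen for illustration. The sketch assumes that the persdb package is loaded through the module's package list and that findall/3 and format/2 come from the aggregates and format libraries.

```prolog
:- module(_, [main/0], [persdb]).

:- use_module(library(aggregates), [findall/3]).
:- use_module(library(format), [format/2]).

% Hypothetical key and directory (the directory must already exist).
persistent_dir(notes_dir, './notes_db').

% note/1 is kept in files below the directory registered under notes_dir.
:- persistent(note/1, notes_dir).

main :-
    % Updates both the in-memory predicate and the persistent storage.
    assertz_fact(note(started)),
    % A persistent predicate is called like any ordinary data predicate.
    findall(N, note(N), Ns),
    format("Notes recorded so far: ~w~n", [Ns]).
```

Under the behavior described above, each run would be expected to print a list containing one more note(started) fact than the previous run.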
Implementation Issues
We outline the current implementation approach. This implementation attempts to provide at the same time efficiency and security. To this end, up to three files are used for each predicate (the persistence set): the data file, the operations file, and the backup file. In the updated state the facts (tuples) that define the predicate are stored in the data file and the operations file is empty (the backup file, which contains a security copy of the data file, may or may not exist).
While a program using a persistent predicate is running, any insertion (assert) or deletion (retract) operations on the predicate are performed on both the program memory and on the persistence set. However, in order to incur only a small overhead in the execution, rather than changing the data file directly, a record of each of the insertion and deletion operations is appended to the operations file. The predicate is then in a transient state, in that the contents of the data file do not reflect exactly the current state of the corresponding predicate. However, the complete persistence set does.
When a program starts, all pending operations in the operations file are performed on the data file. A backup of the data file is created first to prevent data loss if the system crashes during this operation. The order in which this updating of files is done ensures that, if at any point the process dies, on restart the data will be completely recovered. This process of updating the persistence set can also be triggered at any point in the execution of the program (for example, when halting) by calling update_files.
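The replay of pending operations can be pictured with a small, purely in-memory model. This is not the library's code, and the on-disk record format is not specified in this document: the operation terms asserta/1, assertz/1, and retract/1 and the predicate apply_ops/3 are hypothetical. The data file is modeled as a list of facts and the operations file as a list of pending operations, applied in the order in which they were logged.

```prolog
:- module(_, [apply_ops/3]).

:- use_module(library(lists), [append/3]).

% apply_ops(+Ops, +Facts0, -Facts)
% Facts is the result of applying the logged operations Ops, oldest first,
% to the list Facts0 that stands for the contents of the data file.
apply_ops([], Facts, Facts).
apply_ops([Op|Ops], Facts0, Facts) :-
    apply_op(Op, Facts0, Facts1),
    apply_ops(Ops, Facts1, Facts).

% Hypothetical operation records.
apply_op(asserta(F), Fs, [F|Fs]).
apply_op(assertz(F), Fs0, Fs) :- append(Fs0, [F], Fs).
apply_op(retract(F), Fs0, Fs) :- remove_first(F, Fs0, Fs).

% Remove the first fact identical to F, if there is one.
% (Identity rather than unification keeps the model simple.)
remove_first(_, [], []).
remove_first(F, [G|Gs], Gs) :- F == G, !.
remove_first(F, [G|Gs], [G|Rest]) :- remove_first(F, Gs, Rest).
```

For instance, the query apply_ops([assertz(p(3)), retract(p(1)), asserta(p(0))], [p(1),p(2)], Fs) binds Fs to [p(0),p(2),p(3)]. In this model the "updated state" corresponds to an empty operation list, and the transient state to a non-empty one.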
Defining an initial database
It is possible to define an initial database by simply including in the program code facts of persistent predicates. They will be included in the persistent database when it is created. They are ignored in successive executions.
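A short sketch of an initial database follows, under the assumption that the persdb package is used as in the earlier declaration example. The key dbdir, the directory './db' (which must already exist), and the predicate runs/1 are hypothetical.

```prolog
:- module(_, [main/0], [persdb]).

:- use_module(library(format), [format/2]).

% Hypothetical key and directory (the directory must already exist).
persistent_dir(dbdir, './db').

:- persistent(runs/1, dbdir).

% Initial database: taken into account only when the persistent
% storage for runs/1 is first created.
runs(0).

main :-
    % Take the stored counter, increment it, and store it again.
    retract_fact(runs(N)), !,
    N1 is N + 1,
    assertz_fact(runs(N1)),
    format("This program has been run ~d times.~n", [N1]).
```

Since the fact runs(0) is ignored once the storage exists, the counter would be expected to keep increasing across executions rather than being reset.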
Using persistent predicates from the top level
Special care must be taken when loading modules or user files which use persistent predicates into the top level. Beforehand, a goal use_module(library(persdb(persdbrt))) must be issued. Furthermore, since persistent predicates defined by the loaded files are in this way defined dynamically, a call to initialize_db/0 is commonly needed after loading and before calling predicates of these files.
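A top-level session following these steps might look as sketched below. The module name notes and its exported predicate main/0 are hypothetical; only the first and third goals come from the description above.

```prolog
?- use_module(library(persdb(persdbrt))).  % must be issued first
?- use_module(notes).                      % hypothetical module using persistent predicates
?- initialize_db.                          % bring the persistent predicates up to date
?- main.                                   % hypothetical predicate exported by notes
```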
Usage and interface
- Library usage:
  There are two packages which implement persistence: persdb and 'persdb/ll' (for low level). In the first, the standard builtins asserta_fact/1, assertz_fact/1, and retract_fact/1 are replaced by new versions which handle persistent data predicates, behaving as usual for normal data predicates. In the second package, predicates with names starting with p are defined, so that there is no overhead in calling the standard builtins. In any case, each package is used as usual: including it in the package list of the module, or using the use_package/1 declaration.
- Exports:
  - Predicates:
    passerta_fact/1, passertz_fact/1, pretract_fact/1, pretractall_fact/1, asserta_fact/1, assertz_fact/1, retract_fact/1, retractall_fact/1, initialize_db/0, make_persistent/2, update_files/0, update_files/1, create/2.
  - Regular Types:
    meta_predname/1, directoryname/1.
  - Multifiles:
    $is_persistent/2, persistent_dir/2, persistent_dir/4.
- Imports:
  - System library modules:
    lists, read, aggregates, system, file_locks/file_locks, persdb/persdbcache.
  - Packages:
    prelude, nonpure, assertions, regtypes, nortchecks, persdb(persdb_decl).
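The difference between the two packages can be illustrated with a sketch that uses the low-level one. It assumes that 'persdb/ll' is accepted in the package list under exactly that name and that it supports the same persistent/2 declaration; the key logdir, the directory './eventlog', and the predicates event/1 and scratch/1 are hypothetical.

```prolog
:- module(_, [log_event/1, clear_events/0], ['persdb/ll']).

% Hypothetical key and directory (the directory must already exist).
persistent_dir(logdir, './eventlog').

:- persistent(event/1, logdir).

% An ordinary, non-persistent data predicate.
:- data scratch/1.

log_event(E) :-
    passertz_fact(event(E)),     % persistent update, p-prefixed predicate
    assertz_fact(scratch(E)).    % standard builtin, left untouched by this package

clear_events :-
    pretractall_fact(event(_)),
    retractall_fact(scratch(_)).
```

With the persdb package instead, both updates would be written with assertz_fact/1 and retractall_fact/1, and the library would choose the behavior depending on whether the predicate is persistent.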
Documentation on exports
Usage: passerta_fact(Fact)
Persistent version of asserta_fact/1: the current instance of Fact is interpreted as a fact (i.e., a relation tuple) and is added at the beginning of the definition of the corresponding predicate. The predicate concerned must be declared persistent. Any uninstantiated variables in the Fact will be replaced by new, private variables. Defined in the 'persdb/ll' package.
- The following properties should hold at call time:
  (basic_props:callable/1) Fact is a term which represents a goal, i.e., an atom or a structure.
Usage: passertz_fact(Fact)
Persistent version of assertz_fact/1: the current instance of Fact is interpreted as a fact (i.e., a relation tuple) and is added at the end of the definition of the corresponding predicate. The predicate concerned must be declared persistent. Any uninstantiated variables in the Fact will be replaced by new, private variables. Defined in the 'persdb/ll' package.
- The following properties should hold at call time:
  (basic_props:callable/1) Fact is a term which represents a goal, i.e., an atom or a structure.
Retracts a fact of a predicate from both the dynamic and the persistent databases.
Usage: pretract_fact(Fact)
Persistent version of retract_fact/1: deletes on backtracking all the facts which unify with Fact. The predicate concerned must be declared persistent. Defined in the 'persdb/ll' package.
- The following properties should hold at call time:
  (basic_props:callable/1) Fact is a term which represents a goal, i.e., an atom or a structure.
Retracts all the instances of a predicate from both the dynamic and the persistent databases.
Meta-predicate with arguments: pretractall_fact(fact).
Usage: asserta_fact(Fact)
Same as passerta_fact/1, but if the predicate concerned is not persistent then behaves as the builtin of the same name. Defined in the persdb package.
- The following properties should hold at call time:
  (basic_props:callable/1) Fact is a term which represents a goal, i.e., an atom or a structure.
Usage: assertz_fact(Fact)
Same as passertz_fact/1, but if the predicate concerned is not persistent then behaves as the builtin of the same name. Defined in the persdb package.
- The following properties should hold at call time:
  (basic_props:callable/1) Fact is a term which represents a goal, i.e., an atom or a structure.
Usage: retract_fact(Fact)
Same as pretract_fact/1, but if the predicate concerned is not persistent then behaves as the builtin of the same name. Defined in the persdb package.
- The following properties should hold at call time:
  (basic_props:callable/1) Fact is a term which represents a goal, i.e., an atom or a structure.
Usage: retractall_fact(Fact)
Same as pretractall_fact/1, but if the predicate concerned is not persistent then behaves as the builtin of the same name. Defined in the persdb package.
- The following properties should hold at call time:
  (basic_props:callable/1) Fact is a term which represents a goal, i.e., an atom or a structure.
Usage: initialize_db
Initializes the whole database, updating the state of the declared persistent predicates. Must be called explicitly after dynamically defining clauses for persistent_dir/2.
Usage: make_persistent(PredDesc,Keyword)
Dynamic version of the persistent declaration.
- The following properties should hold at call time:
  (persdbrt:meta_predname/1) persdbrt:meta_predname(PredDesc)
  (persdbcache:keyword/1) Keyword is an atom corresponding to a directory identifier.
Usage: update_files
Updates the files comprising the persistence set of all persistent predicates defined in the application.
Usage: update_files(PredSpecList)
Updates the files comprising the persistence set of the persistent predicates in PredSpecList.
- Call and exit should be compatible with:
  (basic_props:list/2) PredSpecList is a list of prednames.
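The two forms of update_files can be contrasted in a short sketch. It assumes that both predicates are visible through the persdb package without further imports (if they are not, the run-time module would have to be imported explicitly); the key dbdir, the directory './db', and the predicates event/1 and runs/1 are hypothetical.

```prolog
:- module(_, [record/1, checkpoint/0, checkpoint_all/0], [persdb]).

% Hypothetical key and directory (the directory must already exist).
persistent_dir(dbdir, './db').

:- persistent(event/1, dbdir).
:- persistent(runs/1, dbdir).

% Appends a record to the persistence set of event/1.
record(E) :- assertz_fact(event(E)).

% Updates the files of event/1 only.
checkpoint :- update_files([event/1]).

% Updates the files of every persistent predicate in the application.
checkpoint_all :- update_files.
```

Calling checkpoint_all just before halting is one way to leave every persistence set in the updated state described in the implementation notes.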
Documentation on multifiles
Usage: $is_persistent(Spec,Key)
Predicate Spec persists within database Key. Programmers should not define this predicate directly in the program.
The predicate is multifile.
The predicate is of type data.
Usage: persistent_dir(Keyword,Location_Path)
Relates identifiers of locations (the Keywords) with descriptions of such locations (Location_Paths). Location_Path is a directory: the definitions of the persistent predicates associated with Keyword are kept in files below that directory (which must already exist). These files, in the updated state, contain the actual definition of the predicate in Prolog syntax (but with module names resolved).
- Call and exit should be compatible with:
  (persdbcache:keyword/1) Keyword is an atom corresponding to a directory identifier.
  (persdbrt:directoryname/1) Location_Path is an atom, the name of a directory.
The predicate is of type data.
Usage: persistent_dir(Keyword,Location_Path,DirPerms,FilePerms)
The same as persistent_dir/2, but also including the permission modes for persistent directories and files.
- Call and exit should be compatible with:
  (persdbcache:keyword/1) Keyword is an atom corresponding to a directory identifier.
  (persdbrt:directoryname/1) Location_Path is an atom, the name of a directory.
  (basic_props:int/1) DirPerms is an integer.
  (basic_props:int/1) FilePerms is an integer.
The predicate is of type data.
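A sketch of a persistent_dir/4 fact follows. It assumes that the two integers are interpreted as Unix-style permission modes, which this document does not state explicitly; the key private_db, the directory './private_db', and the predicate secret/1 are hypothetical.

```prolog
:- module(_, [remember/1], [persdb]).

% Hypothetical key and directory. The modes are written in decimal
% and assumed to be Unix-style: 448 is octal 700 (for directories)
% and 384 is octal 600 (for files).
persistent_dir(private_db, './private_db', 448, 384).

:- persistent(secret/1, private_db).

remember(X) :- assertz_fact(secret(X)).
```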
Documentation on internals
Usage: :- persistent(PredDesc,Keyword).
Declares the predicate PredDesc as persistent. Keyword is the identifier of a location where the persistent storage for the predicate is kept. The location Keyword is described in the persistent_dir predicate, which must contain a fact in which the first argument unifies with Keyword.
- The following properties should hold upon exit:
  (basic_props:predname/1) PredDesc is a Name/Arity structure denoting a predicate name: predname(P/A) :- atm(P), nnegint(A).
  (persdbcache:keyword/1) Keyword is an atom corresponding to a directory identifier.
Known bugs and planned improvements
- Run-time checks have been reported not to work with this code. This means that either the assertions here or the code that implements the run-time checks is erroneous.
- To load in the toplevel a file which uses this package, module library(persdb(persdbrt)) has to be previously loaded.