next up previous contents
Next: Adding Computation Domains Up: A Basic Language Previous: Database Programming

   
Datalog and the Relational Database Model

The language we have seen so far, having (logical) variables, constants, user-defined predicates (which can be assimilated to program procedures), and the equality constraint =/2 is a constraint language. This language is, however, severely impeded by the lack of data structures and arithmetical operations, and we will introduce them later. In fact, its power is equivalent to that of propositional logic (i.e., logic without variables), because every program in our first language can be rewritten to a semantically equivalent propositional program, and any propositional program is, directly, correct in our language.

Notwithstanding, augmenting this language with numbers and arithmetical operations, and (for the sake of practicality) other facilities (such as negation), produces a far superior language, termed Datalog, which is often used in advanced databases. Without adding anything to our language, we will show how it can be directly used to model common operations in relational databases.

Basic structural components of relational databases are tables, which are collections of tuples (rows) having the same number of components in each tuple. Each component of every row has a type, such a string, number, date, etc., usually from a set of predefined types available in the database; we will not deal with such types at the moment. The arguments in the same position of all the rows in each table belong to the same column, and every column has an attribute, which usually names that column. Figure 2.4 shows two tables which can be part of a database which collects information about persons and cities where they have lived.


  
Figure 2.4: Two tables in the relational database model


Name Age Sex
Brown 20 M
Jones 21 F
Smith 36 M
     


Person





Name Town Years
Brown London 15
Brown York 5
Jones Paris 21
Smith Brussels 15
Smith Santander 5
     


Lived-in




The order of rows in immaterial, since they are not accessed and retrieved by number, but according to the matching of the arguments. Similarly, the order of columns is not important either, since they are labeled with attributes; but it will be important for our translation to a logic language. It is important to note that duplicate rows are not allowed, or, rather, that they are meaningless, since duplicated solutions are not taken into account at all.

A translation to our logic language takes every part of the database and casts it into the component of the constraint language following the paths below:


Relat. Database $\rightarrow$
Logic Program


Relation Name $\rightarrow$ Predicate symbol
Relation $\rightarrow$ Predicate consisting of ground facts (facts without variables)
Tuple $\rightarrow$ Ground fact
Attribute $\rightarrow$ Argument of predicate

It is important to note that, since in our language, arguments of an atom cannot receive a name (but other logic languages allow it), the correspondence attribute name $\rightarrow$ argument position must be respected in the whole translation. The fragment of database in Figure 2.4 can be translated to the set of facts below:

person(brown,20,male).
person(jones,21,female).
person(smith,36,male).

lived_in(brown,london,15).
lived_in(brown,york,5).
lived_in(jones,paris,21).
lived_in(smith,brussels,15).
lived_in(smith,santander,5).

Using this translation scheme, which uses a set of facts to model a static database, the usual operations on relational databases can be easily defined, an implemented using clauses. As mentioned before, the result is that not only the database, but also the different queries, views, etc. can be programmed using the same language.

Some operations can be expressed as derivatives from the above ones, but they can also be expressed more directly in CLP:

The appearance of duplicate answers, even if there are no duplicates in the original table (e.g., projecting the table lived-in on its first argument) is not a theoretical problem, since they are simply ignored, but it can be a practical problem. Database implementations automatically discard repeated tuples. Similarly, CLP languages have built-in primitives which allow the gathering of all answers to a query and remmoving duplicates.


% latex2html id marker 3365
$\mathbf\therefore$


The so-called deductive databases are relational databases which use heavily concepts from first-order logic to implement (actually, to program) explicitly deduction and coherence rules. They use commonly a language similar to the one we have just developed, plus some extended facilities. This language is usually a subset of a logic-based full-fledged language. It is language of this kind, even augmented with constraint solving capabilities, which we are aiming at now.


% latex2html id marker 3367
$\mathbf\therefore$



next up previous contents
Next: Adding Computation Domains Up: A Basic Language Previous: Database Programming
MCL
1998-12-03