Multiple Argument Indexing

Author(s): Anil Nair (original work), Tom Howland (http://home.pacbell.net/tomjdnh/pd.html, derived the original work), Francisco Bueno (initial port to Ciao), Jose F. Morales (improvements in implementation and documentation).

Indexing (in Prolog) is an optimization technique that reduces the search space of the predicates without altering the Prolog semantics.

In the most general case, predicate clauses are tryed on backtracking one after the other, in sequential order. We can call this list of clauses a try-list. Indexing is based on removing clauses that are known to fail without any observable output (no side-effects) from the try-list. A typical implementation introduces tests before the actual predicate execution to discriminate among a collection of precomputed specialized try-lists. In the best case, this technique can obtain try-lists with 0 or 1 elements for calls.

Currently, the Ciao engine implements a limited but fast 1st-argument-1st-level indexing. The indexer package provides more powerful indexing schemes. It lets you pick different combinations of arguments to index on. E.g., it will let you index on the first and third argument or the second and the third argument of a predicate.

The selection of the try-list is based on computing a hash value for the terms (or part of them) to be indexed upon. Given this, the optimization pays off only when the amount of clashing that your original predicate causes without indexing superseeds the cost of the hashing function. Such amount of course depends on the number and form of the facts in your predicate, and the calling modes.

Important Note about Performance

The current implementation of the package is done at the source level, so it may sometimes not be as fast as expected.
The complexity of the hashing function currently used is $O(n)$ with $n$ the number of characters in the textual representation of the term. Thus, even if the search tree is reduced, performance can be much slower in some cases that the cheaper internal (1st argument, 1st level) indexing used in Ciao.

Despite this, the package implements some indexing schemes with low overhead.

A single :- index p(+,?,...?) indexer (1st argument, 1st level). Reuses the internal indexing.
A single :- index p(?,...,+,...?) indexer (one argument, 1st level). Reuses the internal indexing by reordering the predicate arguments.

Usage and interface

Library usage:
This facility is used as a package, thus either including indexer in the package list of the module, or by using the use_package/1 declaration. The facility predicate hash_term/2, documented here, is defined in library module indexer(hash).
Exports:
- Predicates:
  hash_term/2.
Imports:
- System library modules:
  assertions/native_props, indexer/hash.
- Packages:
  prelude, nonpure, assertions, regtypes.

Documentation on exports

PREDICATE

hash_term/2:

hash_term(Term,HashValue)

Provides an efficient way to calculate an integer HashValue for a ground Term.

Usage 1:hash_term(T,N)

N is a hashing index for T.

The following properties should hold at call time:
(term_typing:ground/1)T is currently ground (it contains no variables).
(term_typing:var/1)N is a free variable.
The following properties should hold upon exit:
(basic_props:int/1)N is an integer.

Usage 2:hash_term(T,N)

The following properties should hold at call time:
(native_props:nonground/1)T is not ground.
(term_typing:var/1)N is a free variable.
The following properties should hold upon exit:
(term_typing:var/1)N is a free variable.

Documentation on internals

DECLARATION

index/1:

Usage::- index(IndexSpecs).

Declares an indexing scheme for a predicate. Each spec declares an indexing on a combination of the arguments. Indexing will be performed using any of the specs in IndexSpecs (being thus interpreted as an or).

You should use a * in an argument position if you wish to hash on the entire term in that argument. If a + is used only one level of the term in the argument is used for hashing. An i is used to indicate that argument is already an integer, and therefore its own value will be used for hashing. The argspec ? simply indicates not to use the argument for indexing.

For example, the index specification:

:- index foo(+,?,*,i), foo(?,?,?,i).

declares indexing for foo/4 either on a combination of the first, third, and fourht arguments, or only on the last argument, which is an integer. In the first case, only the principal functor of the first argument will be used for hashing; the third argument will be used in its entirety.

The argspec n is a pragmatic extension and can not be used in conjunction with the other specifiers aside from ?. It stands for "nonvar" and implies that the argument will not be used for hashing, since only ground terms can effectively be used in hashing. Thus, it can not be used in combination with other specifiers within a particular index specification. It is often the fastest thing to use.

The following properties should hold upon exit:
(indexer_doc:indexspecs/1)IndexSpecs is an index specification.

REGTYPE

indexspecs/1:

An index specification is defined as follows:

indexspecs(Spec) :-
        indexspec(Spec).
indexspecs((Spec,Specs)) :-
        indexspec(Spec),
        indexspecs(Specs).

indexspec(Spec) :-
        Spec=..[_F|Args],
        list(Args,argspec).

Usage:indexspecs(IndexSpecs)

IndexSpecs is an index specification.

REGTYPE

argspec/1:

An argument hash specification is defined as follows:

argspec(+).
argspec(*).
argspec(i).
argspec(n).
argspec(?).

Usage:argspec(Spec)

Spec is an argument hash specification.

Known bugs and planned improvements

The semantics of cut (!) are not preserved with the 'general' indexing scheme (see implementation). Translations should happen after choice idiom / cut idiom are introduced. That is not possible with the current expansion mechanism. Communication with the internal indexing tables could be the easier solution
Indexing specs must appear before the clauses of the predicate they specify.

Multiple Argument Indexing

Usage and interface

Documentation on exports

Documentation on internals

Known bugs and planned improvements

Module Sections

Global Links

Indices