Pattern (regular expression) matching
Author(s): The CLIP Group.
This library provides facilities for matching strings and terms against patterns. Its behavior can be adjusted through the following Prolog flags and syntax facility:
- The Prolog flag case_insensitive selects case-insensitive matching. If its value is on, matching is case insensitive; if its value is off, matching is case sensitive. By default, its value is off.
- A syntax facility lets matching be used more or less like unification. Writing =~ "regexp" as an argument of a predicate requires that argument to match regexp. For example:
pred(=~ "ab*c", B) :- ...
is equivalent to
pred(X, B) :- match_posix("ab*c", X, R), ...
Two Prolog flags govern this translation. The first is format. Its values are shell, posix, list and pred, which replace match_posix in the example with match_shell, match_posix, match_struct and match_pred, respectively. By default, its value is posix. The second is exact. Its values are on and off. If its value is off, R in the example is replaced by []. If its value is on, R is a variable. By default, its value is on.
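As an illustration of the flags and the =~ syntax described above, here is a minimal sketch. The module and the predicate names tagged/2 and match_ignoring_case/2 are hypothetical. It assumes that the case_insensitive flag can be changed with the standard set_prolog_flag/2 builtin and that it is consulted when the match is performed, which this page does not state explicitly.

```prolog
:- module(_, [tagged/2, match_ignoring_case/2], [regexp]).

% Hypothetical predicate using the =~ argument syntax. With the default
% flag values (format = posix, exact = on) the first argument is expected
% to be translated into a match_posix call on a fresh variable, as in the
% equivalence shown in the text.
tagged(=~ "ab*c", Tag) :-
    Tag = abc_like.

% Hypothetical helper: runs a complete POSIX match with the
% case_insensitive flag switched on, then restores the default (off).
% Assumption: the flag is read at match time.
match_ignoring_case(Pattern, String) :-
    set_prolog_flag(case_insensitive, on),
    ( match_posix(Pattern, String) -> Result = yes ; Result = no ),
    set_prolog_flag(case_insensitive, off),
    Result = yes.
```

Restoring the flag explicitly keeps the change from leaking into later matches in the same program.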
Usage and interface
- Library usage:
  :- use_package(regexp). or :- module(...,...,[regexp]).
- New operators defined:
  =~/1 [200,fy].
- Imports:
  - System library modules:
    regexp/regexp_code.
  - Packages:
    prelude, nonpure, assertions.
Documentation on internals
Usage:match_shell(Exp,IN,Rest)
Matches IN against Exp. Rest is the longest remainder of the string after the match. For example, match_shell("??*","foo.pl",Tail) succeeds, instantiating Tail to "o.pl".
- The following properties should hold at call time:
(regexp_code:shell_regexp/1)Exp is a shell regular expression to match against.
(basic_props:string/1)IN is a string (a list of character codes).
(basic_props:string/1)Rest is a string (a list of character codes).
Usage:match_shell(Exp,IN)
Matches all of IN against Exp (no tail can remain unmatched), similarly to match_shell/3.
- The following properties should hold at call time:
(regexp_code:shell_regexp/1)Exp is a shell regular expression to match against.
(basic_props:string/1)IN is a string (a list of character codes).
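The two shell-matching predicates above can be wrapped as follows. This is only a sketch: the module and the names shell_tail/2 and is_prolog_file/1 are made up for illustration, and the "*.pl" pattern assumes the usual shell convention that "*" matches any sequence of characters and "." matches itself.

```prolog
:- module(_, [shell_tail/2, is_prolog_file/1], [regexp]).

% shell_tail(+String, -Tail): Tail is what remains of String after
% matching the shell pattern "??*". For String = "foo.pl" the
% documentation above gives Tail = "o.pl".
shell_tail(String, Tail) :-
    match_shell("??*", String, Tail).

% is_prolog_file(+Name): Name must match "*.pl" completely, since
% match_shell/2 leaves no unmatched tail.
is_prolog_file(Name) :-
    match_shell("*.pl", Name).
```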
Usage:match_posix(Exp,IN)
Matches all of IN against Exp (no tail can remain unmatched), similarly to match_posix/3.
- The following properties should hold at call time:
(regexp_code:shell_regexp/1)Exp is a shell regular expression to match against.
(basic_props:string/1)IN is a string (a list of character codes).
Usage:match_posix(Exp,In,Match,Rest)
- The following properties should hold at call time:
(regexp_code:shell_regexp/1)Exp is a shell regular expression to match against.
(basic_props:string/1)In is a string (a list of character codes).
(basic_props:list/2)Match is a list of strings.
(basic_props:string/1)Rest is a string (a list of character codes).
Usage:match_posix_rest(Exp,IN,Rest)
Matches IN against Exp. Rest is the remainder of the string after the match. For example, match_posix_rest("ab*c","abbbbcdf",Tail) succeeds, instantiating Tail to "df".
- The following properties should hold at call time:
(regexp_code:posix_regexp/1)Exp is a posix regular expression to match against.
(basic_props:string/1)IN is a string (a list of character codes).
(basic_props:string/1)Rest is a string (a list of character codes).
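A short sketch contrasting prefix matching with complete matching, using the pattern from the example above. The module and the predicate names strip_abc_prefix/2 and all_abc/1 are hypothetical.

```prolog
:- module(_, [strip_abc_prefix/2, all_abc/1], [regexp]).

% strip_abc_prefix(+String, -Rest): Rest is what follows a leading match
% of the POSIX pattern "ab*c". For String = "abbbbcdf" the documentation
% above gives Rest = "df".
strip_abc_prefix(String, Rest) :-
    match_posix_rest("ab*c", String, Rest).

% all_abc(+String): the whole of String must match "ab*c", because
% match_posix/2 allows no unmatched tail.
all_abc(String) :-
    match_posix("ab*c", String).
```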
Usage:match_posix_matches(Exp,IN,Matches)
Matches all of IN against Exp. Exp can contain anchored expressions of the form \(regexp\). On success, Matches contains a list of the anchored expressions that were matched. Note that since POSIX expressions are read inside a string, backslashes have to be doubled. For example,
?- match_posix_matches("\\(aa|bb\\)\\(bb|aa\\)", "bbaa", M).
M = ["bb","aa"] ? ;
no
?- match_posix_matches("\\(aa|bb\\)\\(bb|aa\\)", "aabb", M).
M = ["aa","bb"] ? ;
no
- The following properties should hold at call time:
(regexp_code:shell_regexp/1)Exp is a shell regular expression to match against.
(basic_props:string/1)IN is a string (a list of character codes).
(basic_props:list/2)Matches is a list of strings.
Usage:match_struct(Exp,IN,Rest,Tail)
Matches IN against Exp. Tail is the remainder of the list of atoms IN after the match. For example, match_struct([a,*(b),c],[a,b,b,b,c,d,e],Tail) succeeds, instantiating Tail to [d,e].
- Call and exit should be compatible with:
(regexp_code:struct_regexp/1)Exp is a struct regular expression to match against.
(basic_props:string/1)IN is a string (a list of character codes).
(basic_props:string/1)Rest is a string (a list of character codes).
Usage:match_pred(Pred1,Pred2)
Tests if two predicates Pred1 and Pred2 match using posix regular expressions.
Usage:replace_first(IN,Old,New,Resul)
Replaces the first occurrence of Old with New in IN and returns the result in Resul.
- The following properties should hold at call time:
(basic_props:string/1)IN is a string (a list of character codes).
(regexp_code:posix_regexp/1)Old is a posix regular expression to match against.
(basic_props:string/1)New is a string (a list of character codes).
(basic_props:string/1)Resul is a string (a list of character codes).
Usage:replace_all(IN,Old,New,Resul)
Replaces all occurrences of Old with New in IN and returns the result in Resul.
- The following properties should hold at call time:
(basic_props:string/1)IN is a string (a list of character codes).
(regexp_code:posix_regexp/1)Old is a posix regular expression to match against.
(basic_props:string/1)New is a string (a list of character codes).
(basic_props:string/1)Resul is a string (a list of character codes).
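The two replacement predicates take their arguments in the same order, so they can be wrapped side by side. The sketch below is illustrative only: the module and the names squash_first/2 and squash_all/2 are hypothetical, and the replacement string "X" is arbitrary.

```prolog
:- module(_, [squash_first/2, squash_all/2], [regexp]).

% squash_first(+In, -Out): Out is In with the first substring matching
% the POSIX pattern "ab*c" replaced by "X".
squash_first(In, Out) :-
    replace_first(In, "ab*c", "X", Out).

% squash_all(+In, -Out): Out is In with every substring matching
% "ab*c" replaced by "X".
squash_all(In, Out) :-
    replace_all(In, "ab*c", "X", Out).
```

Note that Old is a POSIX regular expression while New is a plain string, so only the third argument is interpreted as a pattern.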