Pattern (regular expression) matching

Author(s): The CLIP Group.

This library provides facilities for matching strings and terms against patterns. Its behavior is controlled by the following Prolog flags and syntax facility:

  • The Prolog flag case_insensitive controls case-insensitive matching. If its value is on, matching is case insensitive; if its value is off, matching is case sensitive. The default value is off.

  • There is a syntax facility for using matching more or less like unification. You can write =~ "regexp" as an argument of a predicate, which requires that argument to match regexp. For example:

              pred( =~ "ab*c", B) :- ...
     

    is equivalent to

              pred(X,B) :- match_posix("ab*c",X,R), ...
     

    Two Prolog flags control this translation. The first is format, whose values are shell, posix, list and pred; they replace match_posix in the example with match_shell, match_posix, match_struct and match_pred, respectively. Its default value is posix. The second is exact, whose values are on and off. If its value is off, R in the example is replaced by []; if its value is on, R is a variable. Its default value is on.


Usage and interface

  • Library usage:
    :- use_package(regexp). or :- module(...,...,[regexp]).
  • New operators defined:
    =~/1 [200,fy].
  • Other modules used:
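
As a minimal sketch of the second usage form above, the module below loads the package through its module declaration and calls one of the predicates documented in this section. The module name regexp_demo and the predicate is_abc_like/1 are hypothetical names chosen for illustration; only match_posix/2 and the pattern "ab*c" come from this documentation.

```prolog
:- module(regexp_demo, [is_abc_like/1], [regexp]).

% is_abc_like(+Codes)
% Hypothetical helper: succeeds if the whole string Codes (a list of
% character codes) matches the POSIX pattern "ab*c". match_posix/2
% allows no unmatched tail.
is_abc_like(Codes) :-
    match_posix("ab*c", Codes).
```

With this module loaded, a query such as is_abc_like("abbbc") should succeed, while is_abc_like("abbbcd") should fail because a tail would remain unmatched.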

Documentation on internals

PREDICATE

Usage: match_shell(Exp,IN,Rest)

  • Description: Matches IN against Exp. Rest is the longest remainder of the string after the match. For example, match_shell("??*","foo.pl",Tail) succeeds, instantiating Tail to "o.pl".
  • The following properties should hold at call time:
    (regexp_code:shell_regexp/1)Exp is a shell regular expression to match against.
    (basic_props:string/1)IN is a string (a list of character codes).
    (basic_props:string/1)Rest is a string (a list of character codes).

PREDICATE

Usage: match_shell(Exp,IN)

  • Description: Matches IN completely against Exp (no tail can remain unmatched), similarly to match_shell/3.
  • The following properties should hold at call time:
    (regexp_code:shell_regexp/1)Exp is a shell regular expression to match against.
    (basic_props:string/1)IN is a string (a list of character codes).
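
The difference between match_shell/3 and match_shell/2 can be sketched as follows. The module and predicate names are hypothetical, and the sketch assumes that the shell pattern "*.pl" is read in the usual globbing sense (any string followed by a literal ".pl"), which this section does not spell out.

```prolog
:- module(shell_demo, [after_two_chars/2, is_prolog_file/1], [regexp]).

% after_two_chars(+Name, -Tail)
% Partial match: "??*" must match a prefix of Name, and Tail is the
% longest remainder left over after the match.
after_two_chars(Name, Tail) :-
    match_shell("??*", Name, Tail).

% is_prolog_file(+Name)
% Complete match: the whole of Name must match the pattern
% (assumption: "*.pl" follows the usual shell globbing conventions).
is_prolog_file(Name) :-
    match_shell("*.pl", Name).
```

According to the description of match_shell/3 above, the query after_two_chars("foo.pl", T) instantiates T to "o.pl".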

PREDICATE

Usage: match_posix(Exp,IN)

  • Description: Matches IN completely against Exp (no tail can remain unmatched), similarly to match_posix_rest/3.
  • The following properties should hold at call time:
    (regexp_code:shell_regexp/1)Exp is a shell regular expression to match against.
    (basic_props:string/1)IN is a string (a list of character codes).

PREDICATE

Usage: match_posix(Exp,In,Match,Rest)

  • The following properties should hold at call time:
    (regexp_code:shell_regexp/1)Exp is a shell regular expression to match against.
    (basic_props:string/1)In is a string (a list of character codes).
    (basic_props:list/2)Match is a list of strings.
    (basic_props:string/1)Rest is a string (a list of character codes).

PREDICATE

Usage: match_posix_rest(Exp,IN,Rest)

  • Description: Matches IN against Exp. Rest is the remainder of the string after the match. For example, match_posix_rest("ab*c","abbbbcdf",Rest) succeeds, instantiating Rest to "df".
  • The following properties should hold at call time:
    (regexp_code:posix_regexp/1)Exp is a posix regular expression to match against.
    (basic_props:string/1)IN is a string (a list of character codes).
    (basic_props:string/1)Rest is a string (a list of character codes).
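
A small sketch of how the remainder returned by match_posix_rest/3 might be used to strip a matched prefix from an input string. The module name rest_demo and the predicate strip_abc_prefix/2 are hypothetical; the pattern is the one used in the description above.

```prolog
:- module(rest_demo, [strip_abc_prefix/2], [regexp]).

% strip_abc_prefix(+In, -Rest)
% Hypothetical helper: matches the POSIX pattern "ab*c" at the start
% of In and returns whatever follows the match in Rest.
strip_abc_prefix(In, Rest) :-
    match_posix_rest("ab*c", In, Rest).
```

Following the description above, strip_abc_prefix("abbbbcdf", R) instantiates R to "df".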

PREDICATE

Usage: match_posix_matches(Exp,IN,Matches)

  • Description: Matches IN completely against Exp. Exp can contain anchored expressions of the form \(regexp\). On success, Matches contains the list of anchored expressions that were matched. Note that since POSIX expressions are read inside a string, backslashes have to be doubled. For example:

    ?- match_posix_matches("\\(aa|bb\\)\\(bb|aa\\)", "bbaa", M).
    M = ["bb","aa"] ? ;
    no
    
    ?- match_posix_matches("\\(aa|bb\\)\\(bb|aa\\)", "aabb", M).
    M = ["aa","bb"] ? ;
    no
    
  • The following properties should hold at call time:
    (regexp_code:shell_regexp/1)Exp is a shell regular expression to match against.
    (basic_props:string/1)IN is a string (a list of character codes).
    (basic_props:list/2)Matches is a list of strings.

PREDICATE

Usage: match_struct(Exp,IN,Rest,Tail)

  • Description: Matches IN against Exp. Tail is the remainder of the list of atoms IN after the match. For example, match_struct([a,*(b),c],[a,b,b,b,c,d,e],Tail) succeeds, instantiating Tail to [d,e].
  • Call and exit should be compatible with:
    (regexp_code:struct_regexp/1)Exp is a struct regular expression to match against.
    (basic_props:string/1)IN is a string (a list of character codes).
    (basic_props:string/1)Rest is a string (a list of character codes).

PREDICATE

Usage: match_pred(Pred1,Pred2)

  • Description: Tests whether two predicates Pred1 and Pred2 match using POSIX regular expressions.

PREDICATE

Usage: replace_first(IN,Old,New,Resul)

  • Description: Replaces the first occurrence of Old in IN with New and returns the result in Resul.
  • The following properties should hold at call time:
    (basic_props:string/1)IN is a string (a list of character codes).
    (regexp_code:posix_regexp/1)Old is a posix regular expression to match against.
    (basic_props:string/1)New is a string (a list of character codes).
    (basic_props:string/1)Resul is a string (a list of character codes).

PREDICATE

Usage: replace_all(IN,Old,New,Resul)

  • Description: Replaces all occurrences of Old in IN with New and returns the result in Resul.
  • The following properties should hold at call time:
    (basic_props:string/1)IN is a string (a list of character codes).
    (regexp_code:posix_regexp/1)Old is a posix regular expression to match against.
    (basic_props:string/1)New is a string (a list of character codes).
    (basic_props:string/1)Resul is a string (a list of character codes).
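
The two replacement predicates share the same argument order, so they can be wrapped side by side. The sketch below is illustrative only: the module name replace_demo and both wrapper predicates are hypothetical, and the replacement string "X" is an arbitrary choice. No output is shown because this section does not document one.

```prolog
:- module(replace_demo, [mask_first/2, mask_all/2], [regexp]).

% mask_first(+In, -Out)
% Replaces only the first substring of In that matches the POSIX
% pattern "ab*c" with "X".
mask_first(In, Out) :-
    replace_first(In, "ab*c", "X", Out).

% mask_all(+In, -Out)
% Replaces every substring of In that matches "ab*c" with "X".
mask_all(In, Out) :-
    replace_all(In, "ab*c", "X", Out).
```

Both wrappers expect In to be a string (a list of character codes), as required by the call-time properties listed above.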