HTML/XML parser and generator

Author(s): Daniel Cabeza, Manuel Hermenegildo, Sacha Varma, The Ciao Development Team.

This module implements predicates for HTML/XML generation and parsing.

Usage and interface

Documentation on exports

A term representing HTML code in a canonical, structured way. It is a list of terms defined by the following predicate:
canonic_html_item(comment(S)) :-
    string(S).
canonic_html_item(declare(S)) :-
    string(S).
canonic_html_item(env(Tag,Atts,Terms)) :-
    atm(Tag),
    list(tag_attrib,Atts),
    canonic_html_term(Terms).
canonic_html_item($(Tag,Atts)) :-
    atm(Tag),
    list(tag_attrib,Atts).
canonic_html_item(S) :-
    string(S).
tag_attrib(Att) :-
    atm(Att).
tag_attrib(Att=Val) :-
    atm(Att),
    string(Val).
Each structure represents one HTML construction:

env(tag,attribs,terms)
An HTML environment, with name tag, list of attributes attribs and contents terms.

$(tag,attribs)
An HTML element of name tag and list of attributes attribs. ($)/2 is defined by the pillow package as an infix, binary operator.

comment(string)
An HTML comment (translates to/from <!--string-->).

declare(string)
An HTML declaration; declarations are used only in the header (translates to/from <!string>).

string
Normal text is represented as a list of character codes.

For example, the term

env(a,[href="www.therainforestsite.com"],
      ["Visit ",img$[src="TRFS.gif"]])
is output to (or parsed from):
<a href="www.therainforestsite.com">Visit <img src="TRFS.gif"></a>

Usage: canonic_html_term(HTMLTerm)

HTMLTerm is a term representing HTML code in canonical form.

    A term representing XML code in a canonical, structured way. It is a list of terms defined by the following predicate (see tag_attrib/1 definition in canonic_html_term/1):
    canonic_xml_item(Term) :-
        canonic_html_item(Term).
    canonic_xml_item(xmldecl(Atts)) :-
        list(tag_attrib,Atts).
    canonic_xml_item(env(Tag,Atts,Terms)) :-
        atm(Tag),
        list(tag_attrib,Atts),
        canonic_xml_term(Terms).
    canonic_xml_item(elem(Tag,Atts)) :-
        atm(Tag),
        list(tag_attrib,Atts).
    
    In addition to the structures defined by canonic_html_term/1 (the ($)/2 structure appears only in malformed XML code), the following structures can be used:

    elem(tag,atts)
    Specifies an XML empty element of name tag and list of attributes atts. For example, the term
    elem(arc,[weigh="3",begin="n1",end="n2"])
    is output to (or parsed from):
    <arc weigh="3" begin="n1" end="n2"/>

    xmldecl(atts)
    Specifies an XML declaration with attributes atts (translates to/from <?xml atts?>).

    Usage: canonic_xml_term(XMLTerm)

    XMLTerm is a term representing XML code in canonical form.

      A term which represents HTML or XML code in a structured way. In addition to the structures defined by canonic_html_term/1 or canonic_xml_term/1, the following structures can be used:

      begin(tag,atts)
      Translates to the start of an HTML environment of name tag and attributes atts. A begin(tag) structure also exists. In conjunction with the next structure, it is useful for including in a document the output generated by an existing piece of code (e.g. tag = pre). Its use is otherwise discouraged.

      end(tag)
      Translates to the end of an HTML environment of name tag.

      start
      Used at the beginning of a document (translates to <html>).

      end
      Used at the end of a document (translates to </html>).

      --
      Produces a horizontal rule (translates to <hr>).

      \
      Produces a line break (translates to <br>).

      $
      Produces a paragraph break (translates to <p>).

      image(address)
      Used to include an image of address (URL) address (equivalent to img$[src=address]).

      image(address,atts)
      As above with the list of attributes atts.

      ref(address,text)
      Produces a hypertext link: address is the URL of the referenced resource and text is the text of the reference (equivalent to a([href=address],text)).

      label(name,text)
      Labels text as a target destination with label name (equivalent to a([name=name],text)).

      heading(n,text)
      Produces a heading of level n (between 1 and 6), where text is the text to be used as the heading. Useful when one wants a heading level relative to another heading (equivalent to hn(text)).

      itemize(items)
      Produces a list of bulleted items, where items is a list of the corresponding HTML terms (translates to a <ul> environment).

      enumerate(items)
      Produces a list of numbered items, where items is a list of the corresponding HTML terms (translates to an <ol> environment).

      description(defs)
      Produces a list of defined items, where defs is a list whose elements are definitions, each of them a Prolog sequence (composed with the ','/2 operator). The last element of the sequence is the definition; the others (if any) are the defined terms (translates to a <dl> environment).

      preformatted(text)
      Used to include preformatted text, where text is a list of HTML terms, each element of the list being a line of the resulting document (translates to a <pre> environment).

      verbatim(text)
      Used to include text verbatim: special HTML characters (<, >, &, " and space) are translated into their quoted HTML equivalents.

      prolog_term(term)
      Includes any Prolog term term, represented in functional notation. Variables are output as _.

      nl
      Used to include a newline in the HTML source (just to improve human readability).

      entity(name)
      Includes the entity of name name (ISO-8859-1 special character).

      start_form(addr,atts)
      Specifies the beginning of a form. addr is the address (URL) of the program that will handle the form, and atts is a list of other attributes of the form, such as the method used to invoke it. If atts is not present (there is only one argument) the method defaults to POST.

      start_form
      Specifies the beginning of a form without assigning an address to the handler.

      end_form
      Specifies the end of a form.

      checkbox(name,state)
      Specifies an input of type checkbox with name name; state is on if the checkbox is initially checked.

      radio(name,value,selected)
      Specifies an input of type radio with name name (interlocked radio buttons must share the same name). value is the value returned by the button; if selected=value, the button is initially checked.

      input(type,atts)
      Specifies an input of type type with a list of attributes atts. Possible values of type are text, hidden, submit, reset, ...

      textinput(name,atts,text)
      Specifies an input text area of name name. text provides the default text to be shown in the area, and atts is a list of attributes.

      option(name,val,options)
      Specifies a simple option selector of name name, where options is the list of available options and val is the initially selected option (if val is not in options, the first item is selected by default) (translates to a <select> environment).

      menu(name,atts,items)
      Specifies a menu of name name, list of attributes atts and list of options items. The elements of the list items are marked with the prefix operator $ to indicate that they are selected (translates to a <select> environment).

      name(text)
      A term with functor name/1, different from the special functors defined herein, represents an HTML environment of name name and included text text. For example, the term:
      address('clip@clip.dia.fi.upm.es')
      is translated into the HTML source:
      <address>clip@clip.dia.fi.upm.es</address>

      name(atts,text)
      A term with functor name/2, different from the special functors defined herein, represents an HTML environment of name name, attributes atts and included text text. For example, the term
      
         a([href='http://www.clip.dia.fi.upm.es/'],"Clip home")
      represents the HTML source
         <a href="http://www.clip.dia.fi.upm.es/">Clip home</a>

      Usage: html_term(HTMLTerm)

      HTMLTerm is a term representing HTML code.

        PREDICATE output_html/1
        output_html(HTMLTerm)

        Outputs HTMLTerm, interpreted as an html_term/1, to the current output stream.

        Usage:

        • The following properties should hold at call time:
          (html:html_term/1) HTMLTerm is a term representing HTML code.
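
As a minimal sketch of how the html_term/1 structures above combine, the following module prints a small page with output_html/1. It assumes that loading the pillow package in the module declaration makes output_html/1 visible; the module name page_demo and the page contents are hypothetical.

```prolog
:- module(page_demo, [main/0], [pillow]).  % assumption: the pillow package imports output_html/1

% main/0 writes a small HTML page to the current output stream.
main :-
    output_html([
        start,                                   % <html>
        title("Demo page"),                      % generic name(text) environment
        heading(1, "Demo page"),                 % level-1 heading
        itemize(["First item", "Second item"]),  % <ul> environment
        ref("http://www.clip.dia.fi.upm.es/", "Clip home"),
        end                                      % </html>
    ]).
```

Every list element is an html_term/1, so canonical items such as env/3 can be mixed freely with the higher-level structures.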

        PREDICATE html2terms/2
        html2terms(String,Terms)

        String is a character list containing HTML code and Terms is its structured Prolog representation.

        Usage 1:

        Translates an HTML-term into the HTML code it represents.

        • The following properties should hold at call time:
          (term_typing:var/1) String is a free variable.
          (html:html_term/1) Terms is a term representing HTML code.
        • The following properties should hold upon exit:
          (basic_props:string/1) String is a string (a list of character codes).

        Usage 2:

        Translates HTML code into a structured HTML-term.

        • Call and exit should be compatible with:
          (html:canonic_html_term/1) Terms is a term representing HTML code in canonical form.
        • The following properties should hold at call time:
          (basic_props:string/1) String is a string (a list of character codes).
        • The following properties should hold upon exit:
          (html:canonic_html_term/1) Terms is a term representing HTML code in canonical form.
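
The two usages can be sketched as follows. This is an illustration only: it assumes the pillow package makes html2terms/2 and the ($)/2 operator available, and the module and predicate names are hypothetical.

```prolog
:- module(html_roundtrip_demo, [link_html/1, link_terms/1], [pillow]).

% link_html(-String): generation direction (Usage 1).
% String is unbound at call time, so html2terms/2 produces the HTML text
% for the given term (the canonic_html_term/1 example shown earlier).
link_html(String) :-
    html2terms(String,
               [env(a, [href="www.therainforestsite.com"],
                    ["Visit ", img$[src="TRFS.gif"]])]).

% link_terms(-Terms): parsing direction (Usage 2).
% The HTML text is given, so html2terms/2 returns its canonical form.
link_terms(Terms) :-
    html2terms("<b>hello</b>", Terms).
```

Since String is a list of character codes, a call such as link_html(S) followed by a string-printing predicate is needed to see the result as text.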

        PREDICATE xml2terms/2
        xml2terms(String,Terms)

        String is a character list containing XML code and Terms is its structured Prolog representation.

        Usage 1:

        Translates an XML-term into the XML code it represents.

        • The following properties should hold at call time:
          (term_typing:var/1) String is a free variable.
          (html:html_term/1) Terms is a term representing HTML code.
        • The following properties should hold upon exit:
          (basic_props:string/1) String is a string (a list of character codes).

        Usage 2:

        Translates XML code into a structured XML-term.

        • Call and exit should be compatible with:
          (html:canonic_xml_term/1) Terms is a term representing XML code in canonical form.
        • The following properties should hold at call time:
          (basic_props:string/1) String is a string (a list of character codes).
        • The following properties should hold upon exit:
          (html:canonic_xml_term/1) Terms is a term representing XML code in canonical form.
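
A small sketch of both directions for XML, reusing the elem/2 example from canonic_xml_term/1. It assumes the pillow package makes xml2terms/2 available; the module and predicate names are hypothetical.

```prolog
:- module(xml_demo, [arc_xml/1, arc_terms/1], [pillow]).

% arc_xml(-String): generation direction (Usage 1), from an XML-term
% containing one empty element.
arc_xml(String) :-
    xml2terms(String, [elem(arc, [weigh="3", begin="n1", end="n2"])]).

% arc_terms(-Terms): parsing direction (Usage 2), from XML text.
% Double quotes inside the Prolog string are escaped with a backslash.
arc_terms(Terms) :-
    xml2terms("<arc weigh=\"3\" begin=\"n1\" end=\"n2\"/>", Terms).
```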

        html_template(Chars,Terms,Dict)

        Interprets Chars as an HTML template, returning in Terms the corresponding structured HTML-term, which includes variables, and unifying Dict with a dictionary of those variables (an incomplete list of name=Var pairs). An HTML template is standard HTML code in which "slots" can be defined and given an identifier. These slots represent parts of the HTML code in which other HTML code can be inserted, and are represented in the HTML-term as free variables. There are two kinds of variables in templates:

        • Variables representing page contents. A variable with name name is defined with the special tag <V>name</V>.

        • Variables representing tag attributes. They occur as an attribute or an attribute value starting with _, followed by its name, which must be formed by alphabetic characters.

        As an example, consider the following HTML template:

        <html>
        <body bgcolor=_bgcolor>
        <v>content</v>
        </body>
        </html>
        

        The following query in the Ciao toplevel shows how the template is parsed, and the dictionary returned:

        ?- file_to_string('template.html',_S), html_template(_S,Terms,Dict). 
        
        Dict = [bgcolor=_A,content=_B|_],
        Terms = [env(html,[],["
        ",env(body,[bgcolor=_A],["
        ",_B,"
        "]),"
        "]),"
        "] ? 
        
        yes

        If a dictionary with values is supplied at call time, then variables are unified accordingly inside the template:

        ?- file_to_string('template.html',_S),
           html_template(_S,Terms,[content=b("hello world!"),bgcolor="white"]). 
        
        Terms = [env(html,[],["
        ",env(body,[bgcolor="white"],["
        ",b("hello world!"),"
        "]),"
        "]),"
        "] ? 
        
        yes
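
The queries above can be combined into a small page generator. The sketch below keeps the template in a Prolog string instead of reading template.html, so that it does not depend on where file_to_string/2 is defined. It assumes the pillow package makes html_template/3 and output_html/1 available; the names template_demo, page_template/1 and show_page/1 are hypothetical.

```prolog
:- module(template_demo, [show_page/1], [pillow]).

% The template from the example above, written on a single line.
page_template("<html><body bgcolor=_bgcolor><v>content</v></body></html>").

% show_page(+Text): fills both slots of the template and prints the page.
% The dictionary is supplied at call time, so the slot variables are
% unified with these values while the template is parsed.
show_page(Text) :-
    page_template(Chars),
    html_template(Chars, Terms, [content=b(Text), bgcolor="white"]),
    output_html(Terms).
```

A call such as show_page("hello world!") would then write the filled-in page to the current output stream.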

        Usage:

        Documentation on multifiles

        Usage: html_expansion(Term,Expansion)

        Hook predicate to define macros. Expands occurrences of Term into Expansion in output_html/1. Take care not to transform a term into itself!

          The predicate is multifile.
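
As an illustration of the hook, the sketch below defines one macro and uses it. The warning/1 term and the module name are hypothetical, and the sketch assumes the pillow package makes output_html/1 available. The expansion uses a different functor (b/1), so the macro does not expand into itself.

```prolog
:- module(macro_demo, [main/0], [pillow]).

% The hook is multifile, so it is declared as such before adding clauses.
:- multifile html_expansion/2.

% Hypothetical macro: warning(Text) is output as bold text.
html_expansion(warning(Text), b(Text)).

% main/0 prints one warning followed by a newline in the HTML source.
main :-
    output_html([warning("Disk almost full"), nl]).
```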

          Documentation on imports

          This module has the following direct dependencies:

          Other information

          The code uses input from L. Naish's forms and Francisco Bueno's previous Chat interface. Other people who have contributed are (please inform us if we leave out anybody): Markus Fromherz.