bare-bones HTML parser

HTML::SimpleParse is a bare-bones HTML parser, similar to HTML::Parser, but with a couple of important distinctions:

First, HTML::Parser knows which tags can contain other tags, which start tags have corresponding end tags, which tags can exist only in the <HEAD> portion of the document, and so forth. HTML::SimpleParse does not know any of these things. It just finds tags and text in the HTML you give it; it does not care about the specific content of these tags (though it does distinguish between different _types_ of tags, such as comments, starting tags like <b>, ending tags like </b>, and so on).

Second, HTML::SimpleParse does not create a hierarchical tree of HTML content, but rather a simple linear list. It does not pay any attention to balancing start tags with corresponding end tags, or which pairs of tags are inside other pairs of tags.
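
To make the "simple linear list" idea concrete, here is a minimal, self-contained sketch in plain Perl. It is not the module's implementation and does not use its API: the helper name tokenize_html, the regular expression, and the record keys (type, content) are assumptions chosen for illustration only. It shows how a parser can emit a flat sequence of typed chunks without ever checking whether tags are balanced or nested.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of HTML::SimpleParse: split HTML into a
# flat list of { type, content } records without tracking any nesting.
sub tokenize_html {
    my ($html) = @_;
    my @tokens;

    # Alternatives are tried in order: comments, end tags, start tags, text.
    # \G with /gc resumes each match where the previous one stopped.
    while ($html =~ /\G(?:<!--(.*?)-->|<\/([^>]*)>|<([^>]*)>|([^<]+))/gcs) {
        if    (defined $1) { push @tokens, { type => 'comment',  content => $1 } }
        elsif (defined $2) { push @tokens, { type => 'endtag',   content => $2 } }
        elsif (defined $3) { push @tokens, { type => 'starttag', content => $3 } }
        else               { push @tokens, { type => 'text',     content => $4 } }
    }
    return @tokens;
}

# Unbalanced on purpose: <i> is never closed, yet nothing complains.
my @tokens = tokenize_html('<b>bold</b> <i>open<!-- note -->');
print "$_->{type}: $_->{content}\n" for @tokens;
```

The sample input yields seven records in document order (start tag, text, end tag, text, start tag, text, comment). The unclosed <i> is simply one more entry in the list, which is the behavior the paragraph above describes.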

       HTML::SimpleParse - a bare-bones HTML parser

        use HTML::SimpleParse;

        # Parse the text into a simple linear list of tags and text
        my $p = HTML::SimpleParse->new( $html_text );
        $p->output;                 # Output the HTML verbatim

        $p->text( $new_text );      # Give it some new HTML to chew on


libhtml-simpleparse-perl (0.12-2) unstable; urgency=low

  [ gregor herrmann ]
  * debian/control: Added: Vcs-Svn field (source stanza); Vcs-Browser
    field (source stanza); Homepage field (source stanza).
  * Set Maintainer to Debian Perl Group.
  * Use dist-based URL in debian/watch.
  * debian/control: Changed: Switched Vcs-Browser field to ViewSVN
    (source stanza).
  * debian/control: Add


Revision history for Perl extension HTML::SimpleParse.

0.12  Wed Jul  9 12:19:38 CDT 2003

 - Clarify the relationship between this module and HTML::TreeBuilder
   in the documentation. [suggested by Gisle Aas]

 - Moved regression tests from to t/basic.t

0.11  Sun Jan 26 10:00:41 CST 2003

 - Use to output testing results.

