[ic] Froogle.google.com anyone using this yet?

Kevin Walsh interchange-users@icdevgroup.org
Fri Dec 20 09:47:01 2002


cfm@maine.com wrote:
>
> The problem you have is the HTML in the database.  That makes
> it really hard to reuse.  You might want to consider ways of
> getting HTML out of your raw data.
>
A quick test script for you:

----------------------------------------------------------------------
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::FormatText;

my $text =<<'EOB';
    <body>
        <p>
            This is a test blah blah.&nbsp;
            <a href="foobar.html">What's this, a link?</a>.
        </p>
        <p>
            Let's have some text in <font color="#FF0000">red</font>.
        </p>
        <p>
            Some &quot;entities&quot; will make another test case.
        </p>
    </body>
EOB

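# Build a parse tree from the HTML fragment.
my $tree = HTML::TreeBuilder->new;
$tree->parse($text);
$tree->eof;    # signal end of input so the parser finishes building the tree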

# Render the tree as plain text, wrapped between columns 4 and 74.
my $formatter = HTML::FormatText->new(
    leftmargin => 4,
    rightmargin => 74,
);
$text = $formatter->format($tree);
print $text;
----------------------------------------------------------------------

The output is:

    This is a test blah blah.  What's this, a link?.

    Let's have some text in red.

    Some "entities" will make another test case.

--
   _/   _/  _/_/_/_/  _/    _/  _/_/_/  _/    _/
  _/_/_/   _/_/      _/    _/    _/    _/_/  _/   K e v i n   W a l s h
 _/ _/    _/          _/ _/     _/    _/  _/_/    kevin@cursor.biz
_/   _/  _/_/_/_/      _/    _/_/_/  _/    _/