[ic] strip html filter AND apply two filters to one value?

John Young interchange-users@icdevgroup.org
Wed Jul 24 18:10:02 2002


> Can't get the message in that URL, but this should work, if the docs on the
> subject are right:
> 
> GlobalSub <<EOR
> sub new_filter {
>      BEGIN {
>          package Vend::Interpolate;
>          $Filter{nohtml} = sub {
>                          my $val = shift;
>                          $val =~ s/<\w+?>//g;
>                          return $val;
>                          };
>      }
> }
> EOR
> 
> ...call it as [filter nohtml]...[/filter]

Depending on the HTML you receive, you may also need to handle tags
that wrap across lines and markup nested inside other constructs, such
as comments.  Consider:
<!-- Some comment
  <p>Some old paragraph</p>
-->
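
To make the problem concrete, here is a small sketch that runs the
regex from the quoted filter over that sample, next to a broader
pattern.  The sub names and the second pattern are my own illustrations
and are not part of Interchange; neither result is what you would want
from a "strip HTML" filter.

```perl
use strict;
use warnings;

# The sample above: markup hidden inside a multi-line comment.
my $html = "<!-- Some comment\n  <p>Some old paragraph</p>\n-->\n";

# Pattern from the quoted filter: matches only bare tags such as <p>,
# so </p>, tags with attributes, and the comment itself are left alone.
sub strip_word_tags {
    my $val = shift;
    $val =~ s/<\w+?>//g;
    return $val;
}

# Hypothetical broader pattern: everything from "<" to the next ">".
# It swallows the start of the comment together with the first <p>,
# then leaks the commented-out text and leaves the closing "-->".
sub strip_any_tags {
    my $val = shift;
    $val =~ s/<[^>]*>//g;
    return $val;
}

print strip_word_tags($html);
print "----\n";
print strip_any_tags($html);
```

The first call removes only the <p>, and the second returns the
commented-out text followed by a stray "-->".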

For these tricky situations, the Perl Cookbook (by Tom Christiansen &
Nathan Torkington) recommends the HTML parsing routines from CPAN.
The following example is from the book (not in a UserTag framework,
obviously):

package MyParser;
use HTML::Parser;
use HTML::Entities qw(decode_entities);

@ISA = qw(HTML::Parser);

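# Called by the parser for each run of plain text; tags and comments
# are dispatched to other methods and never reach this one.
sub text {
  my ($self, $text) = @_;
  print decode_entities($text);
}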

package main;
# F is assumed to be a filehandle already opened on the HTML input.
MyParser->new->parse_file(*F);
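
A filter needs to return the stripped text and not print it, so
here is a rough sketch of the same idea that collects the text into a
string.  The package name TextOnly, the sub strip_html, and the _plain
hash key are my own inventions for illustration; it assumes
HTML::Parser 3.x from CPAN is installed and has not been tested inside
Interchange.

```perl
package TextOnly;
use strict;
use warnings;
use HTML::Parser ();
use HTML::Entities qw(decode_entities);

our @ISA = qw(HTML::Parser);

# Append each run of text to a buffer kept in the parser object
# instead of printing it.  Text may arrive in several pieces.
sub text {
    my ($self, $text) = @_;
    $self->{_plain} .= decode_entities($text);
}

package main;

# Hypothetical helper: parse a string and return only its text content.
sub strip_html {
    my $html = shift;
    my $p = TextOnly->new;      # no arguments: method-callback mode
    $p->{_plain} = '';
    $p->parse($html);
    $p->eof;                    # flush any text still buffered
    return $p->{_plain};
}

print strip_html("<p>Fish &amp; chips</p>"), "\n";
```

Presumably the body of the nohtml filter could then call something
like strip_html($val) in place of the regex, but check that against
the Interchange docs before relying on it.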