[ic] Strip HTML and IC tags

Ron Phipps interchange-users@interchange.redhat.com
Fri Feb 22 16:47:01 2002


> From: interchange-users-admin@interchange.redhat.com
> [mailto:interchange-users-admin@interchange.redhat.com] On Behalf Of Ron Phipps
> 
> How should I go about stripping all HTML and IC tags from a variable or
> field?  A search of the filter tag only turned up text2html which will
> convert line breaks to <BR> for display on an html page.  I'd like to go
> the other way, but remove all ITL and HTML tags.  I gather that it would
> take a complex set of regexes to do this from my search of google.  Is
> there a way to do this that is included with IC or should I look at
> writing a usertag of my own?
> 
> Thanks,
> -Ron
> 

To strip HTML, add the following filter to interchange.cfg:

GlobalSub <<EOR
sub strip_html {
        BEGIN {
                package Vend::Interpolate;
                $Filter{striphtml} = sub {
                        my $val = shift;
                        $val =~ s/<(.|\n)+?>//gis;
                        return $val;
                };
        }
}
EOR

And call it like this:

[filter striphtml]<a href="test">Testing the striphtml filter</a>[/filter]

Not as difficult as the first example I saw on Google :)  This one
removes everything between a < and the next >, brackets included.  There
is probably a better way to handle the regex in the case that a < or >
appears outside of an HTML tag.  I'll work on a strip ITL filter as
well.  IC team, if you'd like to add this to the default filter list,
please do.
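
To see what the substitution does outside of Interchange, here is a
small standalone Perl sketch.  strip_html below is the same regex as in
the filter above, wrapped in a plain sub.  strip_tags_strict is a
hypothetical variant (not part of IC, just one possible tightening) that
only removes a < when it is followed by a letter or by / and a letter,
so a bare "a < b" comparison is left alone.  It is still not a real HTML
parser: it will not remove comments such as <!-- ... -->, and text like
"a <b and c> d" would still be eaten.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same substitution as the striphtml filter: remove everything from a
# "<" up to the next ">", across newlines.
sub strip_html {
    my ($val) = @_;
    $val =~ s/<(.|\n)+?>//gis;
    return $val;
}

# Hypothetical stricter variant (an assumption, not from IC): only treat
# "<" as the start of a tag when a letter, or "/" plus a letter, follows.
sub strip_tags_strict {
    my ($val) = @_;
    $val =~ s{</?[A-Za-z][^>]*>}{}g;
    return $val;
}

print strip_html('<a href="test">Testing the striphtml filter</a>'), "\n";
# prints: Testing the striphtml filter

print strip_html('if a < b and c > d'), "\n";
# prints: if a  d    (the text between the stray < and > is lost)

print strip_tags_strict('if a < b and c > d'), "\n";
# prints: if a < b and c > d
```

The second call shows the case mentioned above, where a < and > that
are not part of a tag still get treated as one.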

Thanks,
-Ron