[ic] Strip HTML and IC tags

interchange-users@interchange.redhat.com
Fri Feb 22 18:44:00 2002


On Sat, Feb 23, 2002 at 12:09:41AM +0100, Joachim Leidinger wrote:
> Ron Phipps wrote:
> > 
> > How should I go about stripping all HTML and IC tags from a variable or
> > field?  A search of the filter tag only turned up text2html which will
> > convert line breaks to <BR> for display on an html page.  I'd like to go
> > the other way, but remove all ITL and HTML tags.  I gather that it would
> > take a complex set of regexes to do this from my search of google.  Is
> > there a way to do this that is included with IC or should I look at
> > writing a usertag of my own?
> 
> Maybe that, or you can define your own filter, which can be used like the
> other IC filters.

You can write a filter.  PITA.  We've had to do that in some
cases where we get the data marked up in the first place, say in
book descriptions that come to us word processed.  I think that
is a flawed process conceptually.

It's WAY better to pull the data **before** it gets marked up if that
is at all possible.  There is no reason you should have to be
stripping IC tags.
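
If you do end up having to strip markup after the fact, here is a rough
sketch of the idea in Python rather than as an Interchange filter. The
function names and the ITL_TAG pattern are my own assumptions for
illustration, not anything that ships with IC. It treats anything shaped
like "[name ...]" or "[/name]" as an ITL tag, so it will also eat ordinary
bracketed text, and it keeps the contents of <script> and <style> blocks.

```python
import re
from html.parser import HTMLParser

# Assumed shape of an ITL tag: "[name ...]" or "[/name]" with no brackets
# inside it. Nested tags are handled by removing innermost tags first.
ITL_TAG = re.compile(r"\[/?[A-Za-z][\w.-]*(?:\s[^\[\]]*)?\]")


class _TextCollector(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_itl(text):
    """Remove bracket-style tags repeatedly until none are left."""
    previous = None
    while previous != text:
        previous = text
        text = ITL_TAG.sub("", text)
    return text


def strip_html(text):
    """Drop HTML tags and decode character references such as &amp;."""
    parser = _TextCollector()
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


def strip_markup(text):
    """Strip ITL tags first (they can sit inside HTML attributes), then HTML."""
    plain = strip_html(strip_itl(text))
    return re.sub(r"[ \t]+", " ", plain).strip()
```

Calling strip_markup() on a marked-up description field gives back the
bare text. The ITL pass runs first because IC tags often sit inside HTML
attribute values, where they would confuse the HTML pass.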



> 
> But I have trouble understanding what you want. Maybe you want to get the IC
> page (the page as it is viewed in the browser) as a simple text file? How
> about creating a script that uses LWP or some other kind of agent to
> access that page and store it as a file, then calls a script or program to
> convert that HTML page into another format such as a text file, a PDF file,
> and so on?
> 
> I'm in a muddle!
> 
> Joachim
> 
> 
> -- 
> Hans-Joachim Leidinger | Dipl.-Phys.Ing. Entwicklung eCommerce
> [leidinger@bpanet.de] 
> Black Point Arts Internet Solutions GmbH
> http://www.bpanet.de

-- 

Christopher F. Miller, Publisher                               cfm@maine.com
MaineStreet Communications, Inc           208 Portland Road, Gray, ME  04039
1.207.657.5078                                         http://www.maine.com/
Content/site management, online commerce, internet integration, Debian linux