[ic] get-url strip selective

Jon Jensen jon at endpoint.com
Sat Jan 8 23:15:50 EST 2005


On Sat, 8 Jan 2005, David Radovanovic wrote:

> the results from the get-url tag above are good, however the tag below
> doesn't strip the <PRE> tags that reside in the document:
>
> [get-url
> url="http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=12345,9997&retmode=html&rettype=abstract"
> strip=1*]
>
> I'd like to use the strip parm, though. Any apparent reason for its
> selective stripping?

I'm not sure where you got the idea that the strip parameter did anything 
with <PRE> tags ... was that (mis)documented somewhere?

The strip parameter removes everything up to and including the opening 
<body> tag, and everything from the closing </body> tag through the end 
of the file:

     if($opt->{strip}) {
         $html =~ s/.*<body[^>]*>//si;
         $html =~ s:</body>.*::si;
     }

No more, no less.
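
To see that behavior in isolation, here is a minimal stand-alone sketch that 
applies the same two substitutions to a made-up page. The strip_body name 
and the sample HTML are assumptions for illustration only, not part of 
Interchange; the point is that <pre> tags inside the body pass through 
untouched.

```perl
use strict;
use warnings;

# Stand-alone copy of the two substitutions quoted above.
# strip_body is a hypothetical name, not an Interchange function.
sub strip_body {
    my ($html) = @_;
    $html =~ s/.*<body[^>]*>//si;   # drop everything through the <body ...> tag
    $html =~ s:</body>.*::si;       # drop </body> and everything after it
    return $html;
}

# Made-up sample document with a <pre> block inside the body.
my $page = "<html><head><title>t</title></head>\n"
         . "<body bgcolor=\"white\"><pre>abstract text</pre></body></html>\n";

# The <pre> tags are still present in what gets printed.
print strip_body($page), "\n";
```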

Of course you could write your own usertag or filter to strip <PRE> tags 
and anything else you want, and pass in the output from your [get-url] tag 
call.
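
As a rough sketch of the stripping logic such a usertag or filter might 
contain, something like the following would do. The strip_pre_tags name is 
hypothetical, and wiring it into Interchange as a usertag or filter is not 
shown here. It removes only the <PRE> and </PRE> tags themselves and keeps 
the text between them.

```perl
use strict;
use warnings;

# Hypothetical helper: removes opening and closing <pre> tags
# (case-insensitive, with or without attributes) and keeps their contents.
sub strip_pre_tags {
    my ($html) = @_;
    # \b keeps this from matching other tag names that merely begin with "pre".
    $html =~ s{</?pre\b[^>]*>}{}gi;
    return $html;
}

# Made-up input standing in for the output of a [get-url] call.
print strip_pre_tags("<PRE>1: Some abstract</PRE>"), "\n";
```

If you wanted to drop the contents of the <PRE> block as well, the pattern 
would have to match from the opening tag through the closing one instead.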

Jon


-- 
Jon Jensen
End Point Corporation
http://www.endpoint.com/
Software development with Interchange, Perl, PostgreSQL, Apache, Linux, ...

