[ic] Filters with UTF-8 body

David Christensen david at endpoint.com
Thu Mar 12 21:26:51 UTC 2009


On Mar 12, 2009, at 4:10 PM, Peter wrote:

> On 03/12/2009 12:28 PM, David Christensen wrote:
>> I have a commit queued to fix all instances of explicit ranges,
>> however, there was something I found which I'm not sure is a wart or
>> not.  From dist/lib/UI/Primitive.pm:
>>
>> 45:$DECODE_CHARS = qq{&[<"\000-\037\177-\377};
>
> Provided we think it may still be needed, I think the best way to deal
> with this one is:
> $DECODE_CHARS = qq{&[<"[[:^print:]]};

Does [[:print:]] cover only traditional ASCII, or do Unicode code
points fall into this class as well?  My impression is that printable
characters beyond ASCII would count as printable, so the negated class
would not match them and they would not be decoded.  Without knowing
the calling context of the code that uses this variable, though, I
don't know how to verify that.  This also threw me off because the
value is a literal string and not (at least directly) a regex.
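
To poke at both points, here is a minimal sketch, not taken from
Interchange.  It assumes Perl 5.14 or later (for the /u modifier), and
it assumes the string ends up interpolated inside a bracketed
character class, which is only my guess at how $DECODE_CHARS is used.
The names is_print, $decode_chars and $decode_re, and the sample code
points, are purely illustrative.

```perl
use strict;
use warnings;

# Hypothetical helper (not from Interchange): does this code point match
# [[:print:]]?  With $unicode true the match is forced to Unicode rules
# via /u; otherwise Perl's default rules apply.
sub is_print {
    my ($cp, $unicode) = @_;
    my $ch = chr($cp);
    return $unicode ? ($ch =~ /[[:print:]]/u ? 1 : 0)
                    : ($ch =~ /[[:print:]]/  ? 1 : 0);
}

# A control character, an ASCII letter, DEL, a Latin-1 letter, and a
# code point above 255.
for my $cp (0x1F, 0x41, 0x7F, 0xE9, 0x263A) {
    printf "U+%04X  default=%d  unicode=%d\n",
        $cp, is_print($cp, 0), is_print($cp, 1);
}

# Assumption: the literal string is later wrapped in [...] by the caller,
# so only the inner POSIX form [:^print:] goes into the string itself.
my $decode_chars = q{&[<"[:^print:]};
my $decode_re    = qr/[$decode_chars]/;

for my $sample ("&", "A", "\x01") {
    printf "0x%02X %s\n", ord($sample),
        ($sample =~ $decode_re ? "would be decoded" : "left alone");
}
```

If I read perlrecharclass correctly, code points above 255 always get
Unicode rules, while 128-255 depend on whether the string is character
data or /u is in effect.  So the answer may hinge on whether the body
has already been decoded by the time the filter runs.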

Regards,

David
--
David Christensen
End Point Corporation
david at endpoint.com
212-929-6923
http://www.endpoint.com/