[ic] problem with new filter
New Media E.M.S.
ic_users at newmediaems.com
Fri May 7 16:30:18 EDT 2004
At 08:55 PM 5/7/2004 +0100, you wrote:
>At 20:22 07/05/2004, you wrote:
>>At 07:22 PM 5/7/2004 +0100, you wrote:
>>
>>
>>>Hi everyone,
>>>
>>>I'm having a spot of bother whilst trying to write a new [filter] op
>>>that's supposed to return anything before and including the first "."
>>>(dot) in the string passed to it.
>>>
>>>eg.
>>>
>>>[filter op=sentence]This is sentence one. This is sentence two.[/filter]
>>>
>>>returns: "This is sentence one."
>>>
>>>
>>>Following the working example in the online docs, I've put this in my
>>>interchange.cfg and tried to restart IC.
>>>
>>>GlobalSub <<EOR
>>>sub new_filter {
>>> BEGIN {
>>> package Vend::Interpolate;
>>> $Filter{sentence} = sub {
>>> my $val = shift;
>>> $val =~ m/^*\.//o;
>>> return $val;
>>> };
>>> }
>>>}
>>>EOR
>>>
>>>Once the above is added IC refuses to start up giving the following error:
>>>
>>>Starting Interchange: Bad GlobalSub 'new_filter': Bareword "o" not
>>>allowed while "strict subs" in use at (eval 124) line 6, <GLOBAL> line 17.
>>>BEGIN not safe after errors--compilation aborted at (eval 124) line 9,
>>><GLOBAL> line 17.
>>>In line 17 of the configuration file '/etc/interchange.cfg':
>>>GlobalSub <<EOR
>>>
>>>The original example for reverse I got from
>>>http://www.icdevgroup.org/i/dev/docfly.html?mv_arg=ictags04%2e28 is fine
>>>and IC loads up, so it must be my code that's at fault.
>>>
>>>Sadly my limited Perl knowledge doesn't permit me to understand the
>>>error to know what I'm doing wrong here.
>>>
>>>I'd be very grateful if someone could point out my silly mistake(s) :)
>>>
>>>Many thanks
>>>
>>>Mark
>>
>>I'm no regex guru, but this may work better:
>>
>>GlobalSub <<EOR
>>sub new_filter {
>> BEGIN {
>> package Vend::Interpolate;
>> $Filter{sentence} = sub {
>> my $val = shift;
>> $val =~ s/\..*$/\./;
>> return $val;
>> };
>> }
>>}
>>EOR
>>
>>Also, if you wrote this on a Windows box and uploaded it, you'll want to
>>strip out any carriage returns from the file, as they could cause a problem:
>>
>> perl -i -p -e 's/\r//g' interchange.cfg
>>
>>- Ed
>
>Hi Ed,
>
>Many thanks for your reply.
>
>It's a definite improvement, but it's still not quite right......
>
>The following appears to be matching words that have dots in them instead
>of matching the dots at the end of a sentence:
>
>$val =~ s/\..*$/\./;
>
>eg. "This handy 1.5ml single use sachet is......." gets truncated/matched
>to: "This handy 1."
>
>I've tried changing the pattern match to the following to allow for a
>space but this seems to return multiple matches and makes the situation
>worse :(
>
>$val =~ s/\. .*$/\./;
>
>I feel I should be matching from the start of the string and dumping
>anything after the first ". " (dot<space>). Perhaps with split() ?
>
>eg. ($val, $crap) = split (/. /, $val);
>
>Many thanks
>
>Mark
Mark -
It's a tough call. You're assuming throughout that you'll have clean
data, properly formatted with a '. ' between sentences. Unless you
control all the data entry, it's possible that the space will be omitted,
or that there will be two spaces, a newline, etc.
I guess something like:
$val =~ s/^(.+?)\.\s+.*$/$1./s;   # non-greedy: stop at the first dot followed by whitespace
...might work in most cases, but again, I'm not a regex guru. You might
want to test-drive some regexes with sample data using a [calc] in a page
first, since it is easier to tweak interactively that way, then
incorporate the best regex into your filter once you are satisfied.
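
If a [calc] block is not convenient, the same test-driving can be done
from the command line. What follows is only a rough sketch, using a
non-greedy form of the substitution so that it stops at the first dot
followed by whitespace. The helper name first_sentence and the sample
strings are made up for illustration and are not part of Interchange.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Interchange): keep everything up to and
# including the first dot that is followed by whitespace.
sub first_sentence {
    my $val = shift;
    # .+? is non-greedy, so the match ends at the FIRST ". " it can find;
    # the /s modifier lets "." match newlines so the whole tail is dropped.
    $val =~ s/^(.+?)\.\s+.*$/$1./s;
    return $val;
}

# Made-up sample data covering the awkward cases discussed in this thread.
my @samples = (
    "This is sentence one. This is sentence two.",
    "This handy 1.5ml single use sachet is great. Use once.",
    "Two spaces after the dot.  Second sentence.",
    "Newline after the dot.\nSecond sentence.",
    "No space.Second sentence.",
);

for my $text (@samples) {
    print "IN:  $text\nOUT: ", first_sentence($text), "\n\n";
}
```

A dot inside a word ("1.5ml") is skipped because it is not followed by
whitespace. For the same reason, a string with no space after the dot is
left untouched, so that case would still need its own handling. Once the
output looks right, the s/// line can be copied into the $Filter{sentence}
sub in place of the earlier one.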
- Ed
===============================================================
New Media E.M.S. Technology Solutions for Business
11630 Fair Oaks Blvd., #250 eCommerce | Consulting | Hosting
Fair Oaks, CA 95628 edl at newmediaems.com
(916) 961-0446 http://www.newmediaems.com
(866) 519-4680 Toll-Free (916) 961-0447 Fax
===============================================================