[ic] Prevent search from matching on html

Kevin Walsh kevin at cursor.biz
Thu Oct 26 18:29:52 EDT 2006


Josh Lavin <josh at myprivacy.ca> wrote:
> I am finding that when we use HTML in our product descriptions, the  
> search results will include products where an HTML tag matched the  
> search query.
> 
> Simple example: if my description contains "<h2>Features</h2>" and  
> someone searches for 'h2', then that product will be returned in the  
> results.
> 
> I would like to avoid this, and figured I needed a custom SearchOp,  
> but I'm having no luck with this one:
> 
> CodeDef not_tags SearchOp
> CodeDef not_tags Routine <<EOR
> sub {
>          my ($self, $i, $pat) = @_;
> 
>          return sub {
>              my $string = shift;
>              $string =~ s:<[/\w].*?\s?/?>::gi;
>              return $string;
>          };
> }
> EOR
> 
> The idea is to remove any HTML tags before searching. Any ideas?
> 
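Your coderef returns the stripped string rather than a match result, so
it is true for nearly every record (anything other than an empty string
or "0").  A SearchOp's coderef needs to return true if a match is found
or false if no match is found.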

Try something like this instead:

    CodeDef not_tags SearchOp
    CodeDef not_tags Routine <<EOR
    sub {
        my ($self, $i, $pat) = @_;
        $pat = qr/$pat/i;

        return sub {
            my $string = shift;

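            # Strip anything that looks like a tag before matching.
            # [^>]* also copes with one-letter tags such as <b> and <p>.
            $string =~ s:<[/\w][^>]*>::g;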
            return $string =~ $pat;
        };
    }
    EOR
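
If you want to check the matching logic outside Interchange, a minimal
standalone sketch along these lines should do.  The make_not_tags wrapper
and the sample description are made up for illustration; in Interchange
the Routine is called for you with the search object, the field index and
the search term.  The sketch strips tags with [^>]* so that one-letter
tags such as <b> are removed on their own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the Routine body; not part of Interchange.
sub make_not_tags {
    my ($self, $i, $pat) = @_;
    $pat = qr/$pat/i;    # compile the search term once, case-insensitively

    return sub {
        my $string = shift;
        $string =~ s{<[/\w][^>]*>}{}g;    # drop anything that looks like a tag
        return $string =~ $pat;           # true only if the visible text matches
    };
}

# Made-up product description for the test.
my $description = '<h2>Features</h2> Comes in <b>red</b> and blue';

for my $term (qw(h2 features red green)) {
    my $match = make_not_tags(undef, 0, $term);
    printf "%-8s %s\n", $term, $match->($description) ? 'match' : 'no match';
}
```

With that description, 'h2' and 'green' should report no match, while
'features' and 'red' should match.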

-- 
   _/   _/  _/_/_/_/  _/    _/  _/_/_/  _/    _/
  _/_/_/   _/_/      _/    _/    _/    _/_/  _/   K e v i n   W a l s h
 _/ _/    _/          _/ _/     _/    _/  _/_/    kevin at cursor.biz
_/   _/  _/_/_/_/      _/    _/_/_/  _/    _/

