[ic] Inktomi/Yahoo Search Engine Results include Session ID's
Kevin Walsh
kevin at cursor.biz
Sat Feb 21 13:58:41 EST 2004
Jon [prtyof5 at attglobal.net] wrote:
> > It is relatively easy to clean out the "invalid" search engine index
> > entries with a small change to the Interchange core. Once your website
> > has been re-crawled (perhaps a month later) and the indexes are clean,
> > the extra Interchange core code can be removed.
> >
> > At least, with Interchange 5, you will not see any new session IDs in
> > the indexes. Google, of course, is more sensible and tends to simply
> > not follow URIs with arguments at all.
> >
> Perhaps one other possibility to consider is that Yahoo may not be
> using the inktomi UA. One of my pages cached at Yahoo has a change
> that occurred after inktomi last visited. This is part of the reason
> I went looking down the IP path for Google. I've also read, but only
> seen once, that googlebot visited without identifying itself with the
> UA, although the IP address was googlebot's. Also, based on what
> others have stated in the RobotIPs email thread, would it be safe and
> appropriate to add 64.68.82? to the RobotIP list? I don't know the
> IPs of Yahoo, though.
>
Google own 64.68.80.0/21, which is 64.68.80.0 through 64.68.87.255.
I don't know how many of those are spiders, so I would be wary about
adding 2048 IP addresses to the RobotIP configuration. Even adding
just the 256 addresses in 64.68.82.0/24 would make me nervous.
The decision is yours, of course.
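
For anyone who wants to check the arithmetic above before touching
RobotIP, here is a minimal Python sketch using the standard ipaddress
module. The helper name describe_block is made up for illustration and
is not part of Interchange; the sketch only reports the size of each
block and says nothing about which addresses are actually spiders.

```python
import ipaddress

def describe_block(cidr):
    """Return (first address, last address, address count) for a CIDR block.

    Hypothetical helper for illustration only; not an Interchange function.
    """
    net = ipaddress.ip_network(cidr)
    return str(net[0]), str(net[-1]), net.num_addresses

# The two blocks discussed in this thread.
for block in ("64.68.80.0/21", "64.68.82.0/24"):
    first, last, count = describe_block(block)
    print(f"{block}: {first} through {last} ({count} addresses)")
```

The same module can test whether a single visitor address falls inside
a block (ipaddress.ip_address(addr) in ipaddress.ip_network(block)),
which may be a safer way to inspect your access logs before deciding
how much of the range to add.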
--
Kevin Walsh
kevin at cursor.biz