[ic] RobotUA Problems
Jamie Neil
jamie at versado.net
Fri Mar 19 08:15:56 EST 2004
Hi all,
Just been doing some site optimisation for spiders (disabling "more" in
search results etc.) and I've stumbled across a problem with the default
robot detection settings.
RobotUA matches on substrings in the HTTP User Agent. This is fine for
things like "Googlebot" or "Slurp", but I've noticed when trawling
through the logs that some users have customised user agent strings
after installing "branded" browsers or toolbars. A couple of examples:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; AskBar 3.00; YPC
3.0.2; yplus 4.3.01b)
Mozilla/4.0 (compatible; MSIE 6.0; AOL 9.0; Windows 98; sureseeker.com;
searchengine2000.com)
Both of these match the default RobotUA list ("Ask" hits "AskBar", "seek"
hits "sureseeker.com"), so these users won't get a sessionid, which I
assume means the basket won't work.
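To make the false positive concrete, here is a minimal sketch of substring matching against the two user agents above. It is only a simplified model, not Interchange's actual matching code: the function name and the two-entry pattern list are made up for illustration, and the comparison is assumed to be a plain case-sensitive substring test.

```python
# Simplified model of substring-based robot detection.
# Assumption: NOT Interchange's real implementation, just the idea.
ROBOT_SUBSTRINGS = ["Ask", "seek"]  # the two default entries discussed above


def matched_robot_substrings(user_agent, patterns=ROBOT_SUBSTRINGS):
    """Return every pattern that occurs anywhere in the User-Agent string."""
    return [p for p in patterns if p in user_agent]


branded_browsers = [
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; AskBar 3.00; "
    "YPC 3.0.2; yplus 4.3.01b)",
    "Mozilla/4.0 (compatible; MSIE 6.0; AOL 9.0; Windows 98; sureseeker.com; "
    "searchengine2000.com)",
]

for ua in branded_browsers:
    # A non-empty list means an ordinary browser would be treated as a robot.
    print(matched_robot_substrings(ua), "<-", ua)
```

Both strings come from ordinary MSIE installs, yet each one trips a pattern that was meant for a crawler.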
I'm not sure whether this is a widespread problem, but searching through
the usertrack log with:
tail -n 100000 usertrack | grep 'nsession.*ADD'
showed up 7 users in the last week without a sessionid who tried to add
stuff to the basket.
I've replaced "Ask" with "Ask?Jeeves?Teoma" (I assume spaces and / are
not allowed so I've used wildcards), but I'm not sure what to do with
the more generic matches like "seek" or "search".
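As a rough check of the narrowed pattern, the sketch below treats "?" as "any single character" and "*" as "any run of characters", and searches for the result anywhere in the user agent. That wildcard interpretation is an assumption on my part, the helper name is hypothetical, and the crawler string is only an illustrative example of a user agent containing "Ask Jeeves/Teoma".

```python
import re


def wildcard_to_regex(pattern):
    """Convert a simple wildcard pattern to a compiled, unanchored regex.

    Assumption: '?' means one arbitrary character and '*' means any run of
    characters; everything else is matched literally.
    """
    escaped = re.escape(pattern)
    escaped = escaped.replace(r"\?", ".").replace(r"\*", ".*")
    return re.compile(escaped)


narrowed = wildcard_to_regex("Ask?Jeeves?Teoma")

# Illustrative crawler string (assumed, not taken from the logs).
crawler_ua = "Mozilla/2.0 (compatible; Ask Jeeves/Teoma)"
# Branded browser from the logs quoted earlier in this message.
toolbar_ua = (
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; AskBar 3.00; "
    "YPC 3.0.2; yplus 4.3.01b)"
)

print(bool(narrowed.search(crawler_ua)))  # crawler is still detected
print(bool(narrowed.search(toolbar_ua)))  # toolbar user is no longer caught
```

The same approach does not help with "seek" or "search", since there is no longer crawler-specific string to narrow them to without knowing which robots they were meant to catch.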
--
Jamie Neil | <jamie at versado.net> | 0870 7777 454
Versado I.T. Services Ltd. | http://versado.net/ | 0845 450 1254