[ic] Maximum size for a session?

Jon Jensen jon at endpoint.com
Tue Nov 17 00:17:59 UTC 2009


On Sun, 15 Nov 2009, DB wrote:

> Recently I saw Yahoo's shopping bot hitting my site pretty hard. Apache
> access log has lines like this every 20 seconds or so:
>
> ...HTTP/1.0" 200 19391 "-" "YahooSeeker/1.2 (compatible; Mozilla 4.0;
> MSIE 5.5; yahooseeker at yahoo-inc dot com ;
> http://help.yahoo.com/help/us/shop/merchant/)"
>
> I saw no entry for this in my system's robots.cfg and I suspect (can't 
> prove) that this robot was obtaining a session which grew *very* large. 
> So I have two questions:
>
> What exactly should I add to my robots.cfg?

Are you sure that it was not being flagged as a robot? I'm pretty sure 
that the "Yahoo" entry in the default robots.cfg will catch "YahooSeeker" 
as well. Take a look at your interchange.structure file with debug 
enabled to see the regex created for the RobotUA directive. The "Yahoo" 
entry isn't anchored there, so it should match "YahooSeeker" too.

(In this case that's good, but in other cases you may find a RobotUA 
setting matches too loosely, such as "Google" matching "GoogleToolbar" or 
similar.)
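
To illustrate the matching behavior, here is a rough sketch in Python 
rather than Interchange's own Perl. It is not the actual RobotUA code. The 
two-entry name list, the case-sensitive matching, the GoogleToolbar agent 
string, and the word-boundary variant are all assumptions made for the 
example.

```python
import re

# Hypothetical subset of robot names; the real list lives in robots.cfg.
robot_names = ["Yahoo", "Google"]
alternation = "|".join(re.escape(name) for name in robot_names)

# Unanchored pattern: matches any user agent that contains one of the names.
loose = re.compile(alternation)

# Word-bounded variant (an assumption, not something Interchange is known
# to build): the name must stand alone as a whole word.
strict = re.compile(r"\b(?:" + alternation + r")\b")


def is_robot(pattern, user_agent):
    """Return True if the pattern matches anywhere in the user agent."""
    return pattern.search(user_agent) is not None


# User agent string quoted from the access log in the original question.
yahoo_seeker = (
    "YahooSeeker/1.2 (compatible; Mozilla 4.0; MSIE 5.5; "
    "yahooseeker at yahoo-inc dot com ; "
    "http://help.yahoo.com/help/us/shop/merchant/)"
)
# Made-up agent string standing in for a non-robot that contains "Google".
google_toolbar = "Mozilla/4.0 (compatible; GoogleToolbar 4.0; Windows XP)"

for agent in (yahoo_seeker, google_toolbar):
    print(agent[:20], is_robot(loose, agent), is_robot(strict, agent))
```

The loose pattern flags both agents, which is what you want for 
YahooSeeker but not for GoogleToolbar. The word-bounded pattern flags 
neither, so tightening a pattern this way means listing names such as 
"YahooSeeker" explicitly.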

> Is there a way to set a maximum size for sessions so that the next time 
> a robot that's not in my robots.cfg file comes along this problem won't 
> repeat?

I don't know of a way to limit the size of a session proactively.

Jon

-- 
Jon Jensen
End Point Corporation
http://www.endpoint.com/


