[ic] bad robots - undefined session ids?

DB db at m-and-d.com
Wed Dec 18 20:48:22 UTC 2013


> I've noticed that when I use top, I sometimes see IC processes with what
> looks like an undefined session id.
> 
> Ones like this I believe should be robots:
> interchange: Store 180.76.5.149 nsession - /page.html
> 
> Ones like this I believe should be regular users:
> interchange: Store 213.108.209.218 WydgPRgP - /page.html
> 
> But what about ones like this?
> interchange: Store 5.248.13.5  - /scan/se=....
> 
> The ones with an undefined session id often seem to be misbehaving:
> bad/evil robots making many frequent /scan/... requests. The IPs often
> seem to be in Russia, China, etc. Does anyone have thoughts about this,
> or about how to prevent it?
> 
> DB
> 

More info: judging by my web server access log, these might be spiders
of some sort following expired 'more' scan links:

188.112.168.206 www.domain.com - [18/Dec/2013:13:50:34 -0500] "GET
/scan/MM=56eff3856746c4a09774b09fe30f3107:1250:1299:50.html?mv_more_ip=1&mv_nextpage=page2&pf=sql&id=9qviycEr
HTTP/1.1" 200 30213 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT
6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET
CLR 3.0.30729; Media Center PC 6.0)"
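
To see which clients are responsible for most of these requests, the
access log can be tallied per IP. The following is only a rough sketch in
Python, not anything Interchange provides: the function names
(scan_hits_by_ip, heavy_scanners) and the threshold are made up for
illustration, and the regular expression assumes every log entry is a
single line laid out like the sample above (client IP, virtual host, "-",
bracketed timestamp, quoted request, status).

```python
import re
from collections import Counter

# Assumed layout, inferred from the sample entry above:
# ip vhost - [timestamp] "METHOD path PROTOCOL" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def scan_hits_by_ip(lines):
    """Count requests whose path starts with /scan/, keyed by client IP."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Lines that do not fit the assumed layout are silently skipped.
        if match and match.group("path").startswith("/scan/"):
            counts[match.group("ip")] += 1
    return counts


def heavy_scanners(lines, threshold=100):
    """Return (ip, hits) pairs with at least `threshold` /scan/ requests,
    most active first. The default threshold is an arbitrary placeholder."""
    ranked = scan_hits_by_ip(lines).most_common()
    return [(ip, hits) for ip, hits in ranked if hits >= threshold]
```

It could be run over a log with something like
heavy_scanners(open("access.log"), threshold=50), where the file name is
a placeholder. The resulting IPs would still need a manual look before
deciding whether to block any of them.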

I know this isn't a new issue, and for some time I've had 'Disallow:
/scan/' in my robots.txt. But not all robots behave. Any new ideas about
how to handle this? I know Racke will suggest moving to IC6 :)

DB



