[ic] bad robots - undefined session ids?
db at m-and-d.com
Wed Dec 18 20:48:22 UTC 2013
> I've noticed that when I use top, I sometimes see IC processes with what
> looks like an undefined session id.
> Ones like this I believe should be robots:
> interchange: Store 184.108.40.206 nsession - /page.html
> Ones like this I believe should be regular users:
> interchange: Store 220.127.116.11 WydgPRgP - /page.html
> But what about ones like this?
> interchange: Store 18.104.22.168 - /scan/se=....
> The ones with an undefined session id often seem to be misbehaving:
> bad/evil robots making many frequent /scan/... requests. The IPs often
> seem to be in Russia, China, etc. Does anyone have thoughts about this,
> or about how to prevent it?
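To make the three cases in the quoted message concrete, here is a minimal Python sketch that sorts process titles into "robot", "user" and "no-session". The title layout (catalog, IP, optional session id, a dash, then the path) is only inferred from the three samples above, and treating "nsession" as the robot marker follows the poster's own guess. The function name classify_title is hypothetical and not part of Interchange.

```python
import re

# Assumed layout, inferred from the three quoted samples only:
#   interchange: <catalog> <ip> [<session>] - <path>
TITLE_RE = re.compile(
    r"^interchange:\s+(?P<catalog>\S+)\s+(?P<ip>\S+)\s+"
    r"(?:(?P<session>\S+)\s+)?-\s+(?P<path>\S+)"
)


def classify_title(title):
    """Return 'robot', 'user', 'no-session' or 'unparsed' for a process title."""
    m = TITLE_RE.match(title.strip())
    if m is None:
        return "unparsed"
    session = m.group("session")
    if session is None:
        # No session id between the IP and the dash.
        return "no-session"
    if session == "nsession":
        # Assumption from the post: 'nsession' marks a recognized robot.
        return "robot"
    return "user"


if __name__ == "__main__":
    samples = [
        "interchange: Store 184.108.40.206 nsession - /page.html",
        "interchange: Store 220.127.116.11 WydgPRgP - /page.html",
        "interchange: Store 18.104.22.168 - /scan/se=....",
    ]
    for s in samples:
        print(classify_title(s), "<-", s)
```

Feeding it the titles captured from top or ps over a few minutes would show how many of the /scan/ requests arrive without a session id at all.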
More info... Looking at my web server access log, these might be spiders
of some sort following expired 'more' scan links:
22.214.171.124 www.domain.com - [18/Dec/2013:13:50:34 -0500] "GET
HTTP/1.1" 200 30213 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT
6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET
CLR 3.0.30729; Media Center PC 6.0)"
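As a rough way to see which clients are responsible, the access log can be tallied by IP for /scan/ requests. The following is a sketch only: it assumes one log entry per line in the field order shown above (IP, virtual host, dash, timestamp, request, status, size, referer, user agent). The request path is missing from the sample above, so the paths and addresses in the demo lines are hypothetical, as are the function names and the threshold value.

```python
import re
from collections import Counter

# Assumed field order, modelled on the sample entry above:
#   ip host - [time] "METHOD path PROTO" status bytes "referer" "agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) (?P<host>\S+) \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def scan_hits_by_ip(lines):
    """Count requests whose path starts with /scan/, keyed by client IP."""
    hits = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m and m.group("path").startswith("/scan/"):
            hits[m.group("ip")] += 1
    return hits


def heavy_scanners(lines, threshold=100):
    """Return the sorted IPs with at least `threshold` /scan/ requests."""
    counts = scan_hits_by_ip(lines)
    return sorted(ip for ip, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    # Hypothetical entries using documentation-only addresses.
    demo = [
        '192.0.2.10 www.example.com - [18/Dec/2013:13:50:34 -0500] '
        '"GET /scan/se=demo HTTP/1.1" 200 30213 "-" "Mozilla/4.0"',
        '192.0.2.10 www.example.com - [18/Dec/2013:13:50:35 -0500] '
        '"GET /scan/se=demo2 HTTP/1.1" 200 30213 "-" "Mozilla/4.0"',
        '192.0.2.20 www.example.com - [18/Dec/2013:13:50:36 -0500] '
        '"GET /page.html HTTP/1.1" 200 1200 "-" "Mozilla/4.0"',
    ]
    print(scan_hits_by_ip(demo))
    print(heavy_scanners(demo, threshold=2))
```

The resulting list could then feed whatever blocking mechanism is already in place, such as a firewall rule or a web server deny list. Which one fits is a site-specific choice.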
I know this isn't a new issue, and for some time I've had 'Disallow:
/scan/' in my robots.txt. But not all robots behave. Any new ideas about
how to handle this? I know Racke will suggest moving to IC6 :)