[ic] Possible bug: Too many new ID assignments for this IP address

John1 list_subscriber at yahoo.co.uk
Wed Aug 24 17:01:13 EDT 2005


On Wednesday, August 24, 2005 5:57 PM, mike at perusion.com wrote:

> Quoting John1 (list_subscriber at yahoo.co.uk):
>> On Wednesday, August 24, 2005 2:45 PM, mike at perusion.com wrote:
>>
>>> Quoting John1 (list_subscriber at yahoo.co.uk):
>>>> On Wednesday, August 24, 2005 2:29 AM, mike at perusion.com wrote:
>>>>
>>>> Consequently the addr_ctr/IP file will keep counting up unless
>>>> there is a *gap* of greater than "limit robot_expire" before a new
>>>> session id is requested by the same IP address.
>>>
>>> Yes, this is correct.
>>>
>>>>
>>>> i.e.  So if you use "Limit robot_expire 0.05", provided there are
>>>> at least 2 requests per hour for a new session id from the same IP
>>>> address the addr_ctr/IP file will keep counting up forever.
>>>
>>> Well, until it locks someone out for an hour.
>>>
>> Except it is highly likely to be a lot longer than an hour (possibly
>> indefinitely) if the IP in question is a large ISP's proxy server
>> (using NAT as do NTL and AOL in the UK - 2 of the biggest ISPs in
>> the UK).  Has anybody any idea why AOL operate these NAT proxies?
>
> Should not happen. Since you don't assign a new session, and the
> counter gets incremented only at that time, after an hour of no new
> session you can get one.
>
But what I am saying is that all UK AOL customers appear at our server from
only a handful of IP addresses, i.e. the IP addresses of their proxy servers
(and similarly for the major cable operator NTL).

I don't know why they don't use a standard pass-through proxy server
approach, but they don't seem to.  Indeed, our web stats always list AOL and
NTL proxy servers as the most popular visitors by IP address.

So this does mean that our server is being asked to hand out maybe hundreds
of session ids per hour to the same IP address (i.e. the proxy server's IP).
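
To make sure I have the scale of the problem straight, here is a
back-of-envelope sketch (my own illustration in Perl, not anything from the
Interchange code; the 200 sessions/hour figure is just an assumed rate for a
busy proxy IP):

    #!/usr/bin/perl
    # How quickly a busy proxy IP reaches RobotLimit when new-session requests
    # never pause long enough for robot_expire to clear the addr_ctr counter.
    use strict;
    use warnings;

    my $robot_limit       = 500;   # RobotLimit 500
    my $new_sessions_hour = 200;   # assumed rate for an AOL/NTL proxy IP

    printf "RobotLimit %d reached after roughly %.1f hours at %d new sessions/hour\n",
        $robot_limit, $robot_limit / $new_sessions_hour, $new_sessions_hour;

At that sort of rate the counter climbs to the limit within a few hours, and
every customer behind that proxy is then refused a new session.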

>> If RobotLimit is set to 500, then whilst it may take a little while
>> for the 500 to be reached, once it has been reached the shutter
>> comes down and the count_ip code operates like a latch as only *one*
>> new session id per hour is required to *keep* the latch closed, not
>> 500!
>
> You can't get a new session after you are locked out -- if you can,
> there is an error in the code.
>
Ah, OK, I now realise that I had made one wrong assumption.  I was thinking
that the mtime would still be updated even when the request for a new
session id was denied.  But I now understand that the mtime remains
untouched during the lockout period.

So I now see that the lockout should end once robot_expire has elapsed.
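
In code terms, my (corrected) mental model is something like the sketch below
- this is just my own standalone illustration of the behaviour, not the actual
count_ip routine, and the counter path is only indicative:

    #!/usr/bin/perl
    # A denied request does NOT rewrite the addr_ctr file, so its mtime stops
    # advancing; once the file is older than robot_expire the slate is wiped
    # and new session ids are handed out again.
    use strict;
    use warnings;

    my $robot_expire = 0.002;                 # days, i.e. just under 3 minutes
    my $ctr_file     = "addr_ctr/10.0.0.1";   # illustrative path only

    if (-f $ctr_file && -M $ctr_file > $robot_expire) {
        unlink $ctr_file;                     # lockout (if any) has expired
        print "counter cleared for this IP\n";
    }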

>>
>> And also note that RobotLimit 500 doesn't actually require traffic
>> of 500 per hour for addr_ctr/IP to eventually reach 500.  All that
>> is needed is at least *one* new session id per hour provided that it
>> never drops below *one* new session id per hour for the number of
>> hours it takes to reach a count of 500.
>>
>>> Looking at it, it may indeed be less than ideal. Perhaps someone can
>>> suggest an algorithm -- nothing clean and correct comes to my mind
>>> (new file every day, counting down instead of up if time >
>>> Limit->robot_expire * .1, etc.).
>>>
>>> In the interim, I would think
>>>
>>> Limit robot_expire 0.002
>>>
>>> would work in all but the most extreme cases, where again I suggest
>>> you need more than RobotLimit to defend you from the onslaught.
>>>
>> That's a fair point.  I hadn't given any thought to the use of Limit
>> robot_expire with very small values.  A value of 0.002 would mean
>> that addr_ctr/IP would be deleted if there were no accesses from the
>> same IP for 3 minutes.
>
> Not no accesses, no new sessions.
>
Oh yes, that's what I meant :-)
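
For anyone following along, the arithmetic (robot_expire is given as a
fraction of a day, if I have the units right):

    #!/usr/bin/perl
    # Convert the robot_expire values mentioned in this thread into minutes.
    use strict;
    use warnings;

    for my $days (0.05, 0.003, 0.002) {
        printf "Limit robot_expire %-5s  =  %.1f minutes\n", $days, $days * 24 * 60;
    }

which gives roughly 72, 4.3 and 2.9 minutes respectively - hence the
"3 minutes" above.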

>> I guess that would work most of the time as I suppose in the
>> middle of the night (if not during the day) requests for new session
>> ids are likely to drop below this level at least once and therefore
>> the addr_ctr/IP file will at least be deleted once every 24 hours.
>>
>> At the same time I suppose a 3 minute expiry limit is long enough to
>> provide protection against unrecognised and unruly robots causing
>> lots of new sessions to be spawned in quick succession - I guess this
>> would tend to happen over a timeframe of seconds rather than
>> minutes, so the 3 minutes should be sufficient to mitigate against
>> this.  Is this assumption correct? Do I understand the issue of
>> runaway robots correctly?
>>
>
> That is why I think it is probably good enough. In fact, so good
> that I may just pick 0.003 as the new value to put in the foundation
> catalog.cfg.
>
Great - I am glad something useful has come out of our discussion :-)

Thanks for your help and suggestions, Mike - you've persuaded me to put
RobotLimit back to 100 from 0, but this time with a "Limit robot_expire
0.002".

I will let you know if I see any "Too many new ID assignments" reappear in 
the error log :-) 


