[ic] RobotLimit

J van Dijk 'BV Kunststoffenindustrie Attema' j.vandijk at attema.nl
Thu Dec 16 10:00:13 EST 2004


Hi all,

After activating the free Webbuddy website monitoring tool
(http://www.webbuddy.nl/, sorry, Dutch only) to check the availability
of our Interchange-hosted website (http://www.attema.nl), we got a
message from Webbuddy that our website was down.
Surprised, because our website is always available, we immediately
checked it and asked our hosting provider to do the same, but
everything was OK.
After some research we found that Webbuddy accessed www.attema.nl
every 30 minutes (only one line in httpd.log each time) but that
Interchange 5.3.0 had blocked Webbuddy's IP address.
The block came after 50 hours of monitoring: 50 hours at 2 accesses per
hour = 100 accesses, which is the RobotLimit value.
So Webbuddy reported the site as unreachable, although it was
unreachable only for Webbuddy, not for the rest of the world.
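
The arithmetic can be checked with a few lines of Python. This is only
a sketch: the function name is made up for illustration, and it assumes
that every check by the monitor counts as one new Interchange session,
which is an assumption and not something confirmed from the logs.

```python
def hours_until_blocked(robot_limit: int, checks_per_hour: float) -> float:
    """Hours of monitoring before the number of checks reaches robot_limit.

    Assumes each check is counted as one new session for the same IP address.
    """
    return robot_limit / checks_per_hour


# One check every 30 minutes, RobotLimit = 100 (the values from this post)
print(hours_until_blocked(100, 2))   # 50.0

# One check every minute (the paid Webbuddy option): well under two hours
print(hours_until_blocked(100, 60))
```

Under the same assumption, checking every minute would reach the limit
in less than two hours instead of fifty.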

We had the same problem some time ago.
Our company is behind a proxy server, so to the outside world all
workstations have the same IP address, and the Interchange server is
located elsewhere (at the hosting provider).
After a lot of changes to the site, with a lot of colleagues checking
things, we reached this limit too (only humans, no robots...).
As a result, no one in our company could reach our own website.

RobotLimit = 100 means that:

(1) the number of accesses within 30 seconds in the same Interchange
session must stay below 100, otherwise the IP address is blocked;
(2) the number of Interchange sessions from one IP address without a
24-hour pause must stay below 100, otherwise the IP address is blocked.
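
To make the two rules concrete, here is a toy model of them in Python.
It follows my description above and is not taken from the Interchange
source; the class, method and variable names are all made up for
illustration, and the IP address is a documentation address.

```python
from collections import defaultdict

ROBOT_LIMIT = 100          # value from this post
ACCESS_WINDOW = 30         # seconds, rule (1) as described above
SESSION_PAUSE = 24 * 3600  # seconds, rule (2) as described above


class LimitModel:
    """Toy model of rules (1) and (2); not Interchange code."""

    def __init__(self, limit=ROBOT_LIMIT):
        self.limit = limit
        self.session_hits = defaultdict(list)  # session id -> recent access times
        self.ip_sessions = {}                  # ip -> (session count, last session start)
        self.blocked = set()

    def access(self, ip, session_id, now):
        """Record one access at time `now` (seconds). Return True if served."""
        if ip in self.blocked:
            return False
        hits = self.session_hits[session_id]
        if not hits:
            # Rule (2): a new session; the counter resets only after a 24-hour pause.
            count, last = self.ip_sessions.get(ip, (0, now))
            if now - last >= SESSION_PAUSE:
                count = 0
            count += 1
            self.ip_sessions[ip] = (count, now)
            if count >= self.limit:
                self.blocked.add(ip)
                return False
        # Rule (1): count the accesses of this session in the last 30 seconds.
        hits.append(now)
        recent = [t for t in hits if now - t < ACCESS_WINDOW]
        self.session_hits[session_id] = recent
        if len(recent) >= self.limit:
            self.blocked.add(ip)
            return False
        return True


# A monitor that checks every 30 minutes and starts a new session each time
model = LimitModel()
results = [model.access("192.0.2.10", f"s{i}", i * 1800) for i in range(100)]
print(results.count(True), results[-1])  # 99 False
```

In this model the slow monitor is locked out by rule (2) on its 100th
check, although it never comes anywhere near the limit of rule (1).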

So RobotLimit has a double function. We want to keep function (1)
to protect ourselves against heavy crawlers, but we also want to be able
to use Webbuddy, which can check our website as often as every minute
(paid version). That generates only one access per minute, far fewer
than a normal visitor. And of course we don't want to lock ourselves out
when we make a lot of changes and checks on our website.

The only way I can see to accomplish that is to split the
functionality of RobotLimit into two directives:

RobotLimit (default 100): the number of accesses within 30 seconds in
the same Interchange session must stay below this value, otherwise the
IP address is blocked. This is rule (1).
SessionLimit (default 60): the maximum number of sessions per hour per
IP address, otherwise the IP address is blocked.

Rule (2), the maximum number of Interchange sessions without a 24-hour
pause, could then perhaps be omitted?
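
A rough sketch of how the proposed per-hour session limit could behave,
again in Python. SessionLimit does not exist in Interchange; it is only
my proposal, and the class and method names below are hypothetical. For
simplicity the sketch only refuses the extra session instead of blocking
the IP address.

```python
from collections import defaultdict, deque

SESSION_LIMIT = 60  # proposed default; SessionLimit is not an existing directive
WINDOW = 3600       # one hour, in seconds


class SessionRateLimiter:
    """Sliding one-hour window of new sessions per IP address (hypothetical)."""

    def __init__(self, limit=SESSION_LIMIT, window=WINDOW):
        self.limit = limit
        self.window = window
        self.starts = defaultdict(deque)  # ip -> start times of recent sessions

    def new_session_allowed(self, ip, now):
        q = self.starts[ip]
        # Forget session starts that are an hour old or older.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True


limiter = SessionRateLimiter()

# A monitor opening one new session per minute for a full day is never refused
print(all(limiter.new_session_allowed("192.0.2.10", i * 60) for i in range(24 * 60)))  # True

# A burst of 61 new sessions within one second: the 61st is refused
burst = [limiter.new_session_allowed("192.0.2.99", i * 0.01) for i in range(61)]
print(burst.count(True), burst[-1])  # 60 False
```

With a limit per hour instead of a counter that needs a 24-hour pause to
reset, a monitor checking once per minute stays within the limit
indefinitely, while a burst of new sessions from one IP address is still
caught.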

Jan.


Kind regards,

Jan van Dijk
j.vandijk at attema.nl
B.V. Kunststoffenindustrie Attema
tel : 0183-650650 tst 674
fax: 0183-650751
www.attema.nl

