[interchange-cvs] interchange - kwalsh modified dist/robots.cfg
interchange-cvs at icdevgroup.org
Thu Mar 8 10:40:35 EST 2007
User: kwalsh
Date: 2007-03-08 15:40:35 GMT
Modified: dist robots.cfg
Log:
* Racke suggested (on IRC) that the file would be more version-control
friendly if each UA, IP, etc. were on its own line.
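To illustrate the reasoning in the log message, here is a minimal Python sketch that compares how much diff noise a single new entry produces in a packed list versus a one-entry-per-line list. The entry name `ExampleBot` and the helper `changed_lines` are hypothetical and used only for illustration; they are not part of robots.cfg or Interchange.

```python
import difflib


def changed_lines(old: str, new: str) -> list[str]:
    """Return only the +/- lines of a unified diff (no headers, no context)."""
    diff = list(difflib.unified_diff(old.splitlines(), new.splitlines(),
                                     n=0, lineterm=""))
    # The first two lines are the ---/+++ file headers; "@@" lines are hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]


# Packed layout: several entries share one line, so adding one rewrites the line.
packed_old = "Ask, Atomz, BackRub,\nBookdog, Builder, CMC,"
packed_new = "Ask, Atomz, BackRub, ExampleBot,\nBookdog, Builder, CMC,"

# One entry per line: adding an entry touches exactly one line.
split_old = "Ask,\nAtomz,\nBackRub,\nBookdog,"
split_new = "Ask,\nAtomz,\nBackRub,\nBookdog,\nExampleBot,"

if __name__ == "__main__":
    print("\n".join(changed_lines(packed_old, packed_new)))
    print("\n".join(changed_lines(split_old, split_new)))
```

In the packed layout the diff removes and re-adds a whole line of unrelated names, while the split layout shows only the one added entry.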
Revision  Changes    Path
2.2       +148 -26   interchange/dist/robots.cfg
rev 2.2, prev_rev 2.1
Index: robots.cfg
===================================================================
RCS file: /var/cvs/interchange/dist/robots.cfg,v
retrieving revision 2.1
retrieving revision 2.2
diff -u -r2.1 -r2.2
--- robots.cfg 8 Mar 2007 15:03:01 -0000 2.1
+++ robots.cfg 8 Mar 2007 15:40:35 -0000 2.2
@@ -1,36 +1,158 @@
-# $Id: robots.cfg,v 2.1 2007/03/08 15:03:01 kwalsh Exp $
-
RobotUA <<EOR
- ATN_Worldwide, AltaVista, Arachnoidea, Aranha, Architext, Argus, Ask,
- Atomz, BackRub, Bookdog, BookmarkSync, Builder, CFNetwork, CMC, Contact,
- Creep, Digital*Integrity, Directory, EZResult, Excite, FavOrg, Ferret,
- Fireball, GoogleBot, Google-Sitemaps, GetRight, Gromit, Gulliver, Harvest,
- Hubater, H?m?h?kki, INGRID, IncyWincy, Jack, JPluck, KIT*Fireball, Kototoi,
- Leech, LWP, Lycos, Mediapartners, MegaSheep, Mercator, MimeLive, Miva,
- Nazilla, NetMechanic, NetScoop, Nutch, Ocelli, ParaSite, Pokey, Pompos,
- Refiner, RoboDude, Rover, Rutgers, Scooter, Slurp, Snappy, Snoopy, Spyder,
- T-H-U-N-D-E-R-S-T-O-N-E, Toutatis, Tv*Merc, Valkyrie, Voyager,
- W3C_Validator, Walker, WhizBang, Wire, Wombat, WordPress, Yahoo, Yandex,
- ZyBorg, adressendeutschland, archive, appie, agent, asterias, bot, ccubee,
- cfetch, contact, crawl, collector, complex_network_group, dogpile, fido,
- find, gazz, gonzo, grab, griffon, holmes, index, larbin, legs, locator,
- marvin, mirago, moget, newscan, ozelot, pagebull, retrieve, search, seek,
- speedy, silk, sna, spider, suke, swish, tarantula, topiclink, urllib,
- voyager, wget, whowhere, winona, worm, wwwster, xtreme,
+ ATN_Worldwide,
+ AltaVista,
+ Arachnoidea,
+ Aranha,
+ Architext,
+ Argus,
+ Ask,
+ Atomz,
+ BackRub,
+ Bookdog,
+ BookmarkSync,
+ Builder,
+ CFNetwork,
+ CMC,
+ Contact,
+ Creep,
+ Digital*Integrity,
+ Directory,
+ EZResult,
+ Excite,
+ FavOrg,
+ Ferret,
+ Fireball,
+ GetRight,
+ Google-Sitemaps,
+ GoogleBot,
+ Gromit,
+ Gulliver,
+ H?m?h?kki,
+ Harvest,
+ Hubater,
+ INGRID,
+ IncyWincy,
+ JPluck,
+ Jack,
+ KIT*Fireball,
+ Kototoi,
+ LWP,
+ Leech,
+ Lycos,
+ Mediapartners,
+ MegaSheep,
+ Mercator,
+ MimeLive,
+ Miva,
+ Nazilla,
+ NetMechanic,
+ NetScoop,
+ Nutch,
+ Ocelli,
+ ParaSite,
+ Pokey,
+ Pompos,
+ Refiner,
+ RoboDude,
+ Rover,
+ Rutgers,
+ Scooter,
+ Slurp,
+ Snappy,
+ Snoopy,
+ Spyder,
+ T-H-U-N-D-E-R-S-T-O-N-E,
+ Toutatis,
+ Tv*Merc,
+ Valkyrie,
+ Voyager,
+ W3C_Validator,
+ Walker,
+ WhizBang,
+ Wire,
+ Wombat,
+ WordPress,
+ Yahoo,
+ Yandex,
+ ZyBorg,
+ adressendeutschland,
+ agent,
+ appie,
+ archive,
+ asterias,
+ bot,
+ ccubee,
+ cfetch,
+ collector,
+ complex_network_group,
+ contact,
+ crawl,
+ dogpile,
+ fido,
+ find,
+ gazz,
+ gonzo,
+ grab,
+ griffon,
+ holmes,
+ index,
+ larbin,
+ legs,
+ locator,
+ marvin,
+ mirago,
+ moget,
+ newscan,
+ ozelot,
+ pagebull,
+ retrieve,
+ search,
+ seek,
+ silk,
+ sna,
+ speedy,
+ spider,
+ suke,
+ swish,
+ tarantula,
+ topiclink,
+ urllib,
+ voyager,
+ wget,
+ whowhere,
+ winona,
+ worm,
+ wwwster,
+ xtreme,
EOR
RobotIP <<EOR
- 202.9.155.123, 204.152.191.41, 208.146.26.19,
- 208.146.26.233, 209.185.141.209, 209.185.141.211,
- 209.202.148.36, 209.202.148.41, 216.200.130.207,
- 216.35.103.6?, 216.35.103.70,
+ 202.9.155.123,
+ 204.152.191.41,
+ 208.146.26.19,
+ 208.146.26.233,
+ 209.185.141.209,
+ 209.185.141.211,
+ 209.202.148.36,
+ 209.202.148.41,
+ 216.200.130.207,
+ 216.35.103.6?,
+ 216.35.103.70,
EOR
RobotHost <<EOR
- *.ask.com, *.crawler*.com, *.csccorporatedomains.com,
- *.excite.com, *.analys.google.com, *.googlebot.com,
- *.infoseek.com, *.inktomi.com, *.inktomisearch.com,
- *.lycos.com, msnbot.msn.com, *.pa-x.dec.com,
+ *.ask.com,
+ *.crawler*.com,
+ *.csccorporatedomains.com,
+ *.excite.com,
+ *.analys.google.com,
+ *.googlebot.com,
+ *.infoseek.com,
+ *.inktomi.com,
+ *.inktomisearch.com,
+ *.lycos.com,
+ msnbot.msn.com,
+ *.pa-x.dec.com,
add-url.altavista.com,
westinghouse-rsl-com-usa.NorthRoyalton.cw.net,
EOR
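As a rough illustration of how lists like the ones in this diff can be consumed, the following Python sketch turns a comma-separated wildcard list into a single case-insensitive regular expression. It assumes that `*` matches any run of characters, that `?` matches exactly one character, and that matching is an unanchored substring search; these are assumptions made for the example, not a statement of how Interchange itself parses `RobotUA`, `RobotIP`, or `RobotHost`. The function name `compile_robot_list` is hypothetical.

```python
import re


def compile_robot_list(block: str) -> re.Pattern:
    """Build one case-insensitive regex from a comma-separated wildcard list."""
    # Split on commas and drop blanks; the trailing comma on the last entry
    # would otherwise yield an empty pattern that matches everything.
    items = [item.strip() for item in block.split(",")]
    items = [item for item in items if item]
    if not items:
        return re.compile(r"(?!)")  # a pattern that never matches
    parts = []
    for item in items:
        escaped = re.escape(item)  # makes '.', '-', etc. literal
        # Re-enable the two wildcards (assumed semantics: '*' any run, '?' one char).
        escaped = escaped.replace(r"\*", ".*").replace(r"\?", ".")
        parts.append(escaped)
    return re.compile("|".join(parts), re.IGNORECASE)


if __name__ == "__main__":
    robot_ua = compile_robot_list("""
        GoogleBot,
        Slurp,
        H?m?h?kki,
    """)
    robot_ip = compile_robot_list("216.35.103.6?,\n216.35.103.70,")
    print(bool(robot_ua.search("Mozilla/5.0 (compatible; Googlebot/2.1)")))
    print(bool(robot_ip.search("216.35.103.65")))
```

Because whitespace and empty items are discarded, the packed layout and the one-entry-per-line layout produce the same pattern under these assumptions, which is consistent with the change being purely a formatting one.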