Guy Ellis' Tech Blog: January 2012

Thursday, January 19, 2012

Kintiskton LLC IP Ranges

There's a company called Kintiskton LLC who either own or are owned by Mark Manager, and they provide a trademark protection service. They run a spider that crawls the web trying to find their customers' copyrighted material posted on sites other than their customers' own. In principle I don't have a problem with this, because I agree that copyright should be respected.

There's a good write-up about them here: http://endellion.me.uk/info/Kintiskton.html

The problem is that their spider crawls sites aggressively without respecting the robots.txt file. It hits the site hard and fast and ignores both the crawl-delay directive and the exclude (Disallow) directives. Ignoring the exclude directives is understandable (but not tolerable), as rogue web sites that are violating copyright could "hide" their content from respectful spiders by adding an exclude directive to the robots.txt file for that part of the site. This spider, however, also ignores the crawl-delay. It is not very well written either: it generates a fair number of errors in the log files, which makes it easy to spot.
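For comparison, here is a rough sketch of what a respectful spider is expected to do before it fetches anything, using Python's standard urllib.robotparser module. The robots.txt content, the "ExampleBot" agent name and the example.com URLs are made up for illustration and are not taken from any real site or crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a 10 second crawl-delay and one excluded path.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(url, agent="ExampleBot"):
    """Return True if robots.txt allows this agent to fetch the URL."""
    return parser.can_fetch(agent, url)


def delay_for(agent="ExampleBot"):
    """Return the crawl-delay in seconds for this agent, or None if unset."""
    return parser.crawl_delay(agent)
```

A polite crawler would skip any URL for which may_fetch() returns False and sleep for delay_for() seconds between requests. The spider described above does neither.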

If you want to keep this spider off your site you can block these IP ranges:

65.208.151.112 - 65.208.151.119
63.110.158.48 - 63.110.158.55
65.200.47.0 - 65.200.47.7
65.208.189.24 - 65.208.189.31
65.208.185.96 - 65.208.185.103
65.211.195.16 - 65.211.195.23
5.208.151.112 - 5.208.151.119 (probably a mistake - see Zap's comment below)
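Most firewalls and web servers want these ranges in CIDR notation rather than as start and end addresses. The following is a minimal sketch, using Python's standard ipaddress module, that converts the ranges listed above and checks whether a visitor address falls inside any of them. The names RANGES, to_networks and is_blocked are my own, and the 5.208.151.x range is left out because it is probably a mistake.

```python
import ipaddress

# Start and end addresses copied from the list above
# (the doubtful 5.208.151.x range is deliberately omitted).
RANGES = [
    ("65.208.151.112", "65.208.151.119"),
    ("63.110.158.48", "63.110.158.55"),
    ("65.200.47.0", "65.200.47.7"),
    ("65.208.189.24", "65.208.189.31"),
    ("65.208.185.96", "65.208.185.103"),
    ("65.211.195.16", "65.211.195.23"),
]


def to_networks(ranges):
    """Convert (first, last) address pairs into a list of CIDR networks."""
    networks = []
    for first, last in ranges:
        networks.extend(
            ipaddress.summarize_address_range(
                ipaddress.IPv4Address(first),
                ipaddress.IPv4Address(last),
            )
        )
    return networks


BLOCKED = to_networks(RANGES)


def is_blocked(ip):
    """Return True if the address falls inside any of the listed ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED)


if __name__ == "__main__":
    for net in BLOCKED:
        print(net)
```

Each range covers eight addresses starting on a multiple of eight, so each one collapses to a single /29 network, for example 65.208.151.112/29.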

If you discover another range that they are using, please post it as a reply to this blog post and I'll add it to the above list.