Never a panacea: ModSecurity drawbacks

I’ve written before about using ModSecurity to reduce bot traffic, especially from bots that are not important to the success of a site, like almost all of the so-called “marketing bots”. Unfortunately, installing ModSecurity with its default parameters can cause quite a few headaches, especially for technically-oriented applications. There are indeed quite a few drawbacks to using that module, in particular related to the Core Rule Set that ships with it: some of the rules are quite taxing on the web server (since they may end up parsing a lot of data), and others trigger false positives quite easily. For instance, the rule with id 960017 (Host header is a […]
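As a sketch of how one might deal with a false-positive-prone rule like 960017, ModSecurity 2.x provides the `SecRuleRemoveById` directive, which can be placed in the Apache configuration after the Core Rule Set has been included (the exact placement depends on how the CRS is loaded in your setup):

```apache
# Disable the rule with id 960017 for this host, assuming
# ModSecurity 2.x and that the Core Rule Set was already included.
# SecRuleRemoveById must come *after* the rule it removes.
<IfModule security2_module>
    SecRuleRemoveById 960017
</IfModule>
```

Removing individual rules this way keeps the rest of the Core Rule Set active while silencing only the checks that don't fit a given site.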

CrawlBot Wars

Everybody who has ever wanted to write a “successful website” (or, more recently, thanks to the Web 2.0 hype, a “successful blog”) knows the blessing and curse of crawlers, or bots, that are unleashed by all kinds of entities to scan the web and report the content back to their owners. Most of these crawlers are run by search engines, such as Google, Microsoft Live Search, Yahoo! and so on. With the widespread use of feeds, at least Google and Yahoo! have added feed-specific crawlers alongside their standard bots, used to aggregate blogs and other feeds into nice interfaces for their users (think Google Reader). Alongside these crawlers, though, there are less useful, sometimes nastier […]
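The well-behaved crawlers among these honor the Robots Exclusion Protocol, so a first line of defense is a `robots.txt` sketch like the following; note that the `BadBot` user-agent here is purely hypothetical, and that nastier bots simply ignore this file (which is where server-side measures like ModSecurity come in):

```
# robots.txt — only honored by well-behaved crawlers.
# "BadBot" is a hypothetical user-agent used for illustration.
User-agent: BadBot
Disallow: /

# Everyone else may crawl the whole site.
User-agent: *
Disallow:
```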