Never a panacea: ModSecurity drawbacks

I’ve written before of using ModSecurity for reducing bots traffic, especially for those bots that are not important to the success of a site, like almost all of the so-called “marketing bots”. Unfortunately, installing and setting up ModSecurity with the defaults parameter can cause quite a bit of headaches, especially for technically-oriented applications. There are indeed quite a few different drawbacks to the use of that module, in particular related to the Core Rule Set that ships with it; some of the rules are quite taxing to the web server (since it has to parse eventually a lot of data), and others are simply hitting false positives quite easily. For instance, the rule with id 960017 (Host header is a […]

(Val)grinding your code

One of of the most exoteric tool when working on software projects that developers should be using more often than they do, is certainly Valgrind. It’s a multi-faced tool: timing profiler, memory profiler, race conditions detector, … it can do all this because it’s mainly a simulated CPU, which executes the code, keeping track of what happens, and replacing some basic functions like the allocation routines. So this works pretty well, to a point; unfortunately it does not always work out that well because there are a few minor issues: it needs to know about all the instructions used in the programs (otherwise it raises SIGILL crashes), and right now it does not know about SSE4 on x86 and x86-64 […]

CrawlBot Wars

Everybody who ever wanted to write a “successful website” (or more recently, thanks to the Web 2.0 hype, a “successful blog”) knows the bless and curse of crawlers, or bots, that are unleashed by all kind of entities to scan the web, and report the content back to their owners. Most of these crawlers are handled by search engines, such as Google, Microsoft Live Search, Yahoo! and so on. With the widespread use of feeds, at least Google and Yahoo! added to their standard crawler bots also feed-specific crawlers that are used to aggregate blogs and other feeds into nice interfaces for their users (think Google Reader). Together with this kind of crawlers, though, there are less useful, sometimes nastier […]

Autotools Come Home

With our experience as Gentoo developers, me and Luca have had experience with a wide range of build systems; while there are obvious degrees of goodness/badness in build system worlds, we express our preference for autotools over most of the custom build systems, and especially over cmake-based build systems, that seem to be high on the tide thanks to KDE in the last two years. I have recently written my views on build systems: in which I explain why I dislike CMake and why I don’t mind it when it is replacing a very-custom bad build system. The one reason I gave for using CMake is the lack of support for Microsoft Visual C++ Compiler, which is needed by some […]

Profiling Proficiency

Trying to assess the impact of the Ragel state machine generator on software, I’ve been trying to come down with a quick way to seriously benchmark some simple testcases, making sure that the results are as objective as possible. Unfortunately, I’m not a profiling expert; to keep with the alliteration in the post’s title, I’m not proficient with profilers. Profiling can be done either internally to the software (manually or with GCC’s help), externally with a profiling software, or in the kernel with a sampling profiler. Respectively, you’d be found using the rdtsc instruction (on x86/amd64), the gprof command, valgrind and one of either sysprof or oprofile. Turns out that almost all these options don’t give you anything useful in […]