Luckily, the issue I was experiencing with gluster 2.0.0rc1 turned out to be just an ugly bug that was squashed in the 2.0.0rc2 release. For now I’m keeping the configuration I blogged about, and we are starting to think about topologies and expansion.
The big issue right now is providing enough bandwidth for replicated writes, since a single Gbit link isn’t enough. It’s too late to order InfiniBand, so I’m stuck working out the best topology given what we have: a single writer, 70 readers, 3 storage (gluster) nodes, about 4 24-port gigabit switches with their 10Gbit expansion links unused, and at least 2 gigabit interfaces per node.
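To put a rough number on why one link falls short, here is a back-of-the-envelope sketch in Python. It assumes the writer itself sends one full copy of the data to each replica, that several interfaces aggregate perfectly, and that protocol overhead is zero, so treat the result as an optimistic ceiling. The function name is made up for this illustration.

```python
def client_write_ceiling(link_gbit: float, replicas: int) -> float:
    """Upper bound on payload write rate (Gbit/s) when the client
    pushes one full copy to every replica over its own link(s).

    Assumptions: client-side replication, perfect link aggregation,
    no protocol overhead.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return link_gbit / replicas


# One writer, 3 replicas, with 1 or 2 gigabit interfaces in use.
for nics in (1, 2):
    ceiling = client_write_ceiling(link_gbit=1.0 * nics, replicas=3)
    print(f"{nics} x 1 Gbit, 3 replicas: at most {ceiling:.2f} Gbit/s of payload")
```

Under these assumptions even two bonded gigabit interfaces leave the writer below the payload rate of a single unreplicated gigabit link.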
More will follow soon.
PS: I’m wondering how hard it would be to write a round-robin translator that accelerates replicated writes by issuing each write from the client node to just one of the N replicating nodes and then letting them sync among themselves automatically…
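The round-robin idea can be modelled with a toy Python class. This is not an existing gluster translator or API, only a sketch of the dispatch logic. The class name, the node names, and the assumption that the chosen node forwards the write to its peers afterwards are all hypothetical.

```python
from itertools import cycle


class RoundRobinWriter:
    """Toy model: each write goes to exactly one replica, chosen in turn.

    The chosen node is assumed to forward the data to the other
    replicas itself; that part is not modelled here.
    """

    def __init__(self, replicas):
        self.replicas = list(replicas)
        if not self.replicas:
            raise ValueError("need at least one replica")
        self._next = cycle(self.replicas)
        # Bytes the client has sent to each replica so far.
        self.sent = {name: 0 for name in self.replicas}

    def write(self, nbytes: int) -> str:
        """Send one write to the next replica in turn; return its name."""
        target = next(self._next)
        self.sent[target] += nbytes
        return target


writer = RoundRobinWriter(["node1", "node2", "node3"])  # hypothetical names
targets = [writer.write(4096) for _ in range(6)]
print(targets)
print(writer.sent)
```

In this model the client sends every byte once instead of once per replica, and the replication traffic moves onto the storage nodes' own links. What happens to readers before the nodes have finished syncing is left open.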
Recently we started to dabble with clustered file systems, in particular a rather new and promising one called gluster.
So far, even though people suggest using the upcoming 2.0 version, we have already found some annoying glitches in 2.0.0rc1. Namely, the writebehind capability wasn’t working at all, which reduced the write speed to 3Mb/s (on a gigabit link to a cluster of 3 nodes, each with a theoretical peak speed of 180Mb/s). Luckily they fixed it in their git. Sadly, the peak speed for a single node is still only about 45Mb/s for a single transfer and around 75Mb/s when aggregating 5 concurrent transfers, while nfs on the same node reaches 95Mb/s on a single transfer.
Since it looks like a lot of time is being wasted waiting somewhere (as the experiment with concurrent transfers hints), we’ll probably investigate further and, obviously, look for advice.
The current setup uses iocache+writebehind as performance translators and maps the nodes as 6 bricks (2 bricks exported per node), replicated 3 times (one copy on each node), with dht joining the 2 replicating groups.
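The layout is easier to see when written out, so here is a small Python model of it. It is not a gluster volfile. The host and brick names are invented, and the CRC32 hash is only a stand-in for whatever dht does internally to pick a replicating group.

```python
import zlib

NODES = ["node1", "node2", "node3"]  # hypothetical hostnames

# Each node exports 2 bricks, giving 6 bricks in total.
BRICKS = {node: [f"{node}:brick{i}" for i in (1, 2)] for node in NODES}

# Replica group k is made of brick k from every node, so each group
# keeps one copy of a file on each of the 3 nodes.
GROUPS = [[BRICKS[node][i] for node in NODES] for i in range(2)]


def bricks_for(path: str) -> list:
    """Return the replica group that would hold `path`.

    Stand-in for dht: hash the name and pick one of the groups.
    """
    index = zlib.crc32(path.encode("utf-8")) % len(GROUPS)
    return GROUPS[index]


for name in ("results.dat", "input.cfg"):
    print(name, "->", bricks_for(name))
```

Every file lands in exactly one group and therefore on all three nodes, which is why each write has to reach every storage node.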