Redis and MongoDB insertion performance analysis

Tuesday, March 16th, 2010 | Computer Science

Recently we had to study a system where reads can be slow, but writes need to be as fast as possible. Starting from this requirement we thought about which one between Redis and MongoDB would better fit the problem. Redis should be the obvious choice, as its simpler data structures should make it lightning fast, and that is actually true, but we found a few interesting things that we would like to share.

This first graph is about MongoDB insert vs Redis RPUSH.
Up to 2000 entries the two are roughly equivalent, then Redis starts to pull ahead, usually ending up about twice as fast as MongoDB. I expected this, and I have to say that antirez did a good job in designing the Redis paradigm; in some situations it is the perfect match for the problem.
If anything, I would have expected MongoDB to be even slower, considering the features a MongoDB collection offers over a simple list.

This second graph is about Redis RPUSH vs Mongo $push vs Mongo insert, and I find it really interesting.
Up to 5000 entries MongoDB $push is faster even than Redis RPUSH, then it becomes incredibly slow, probably because the MongoDB array type has linear insertion time, so each push gets slower and slower. MongoDB might gain some performance by exposing a constant-time insertion list type, but even the linear-time array type (which can guarantee constant-time look-up) has its applications for small sets of data.
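
To make that look-up trade-off concrete, here is a minimal read-back sketch. It reuses the collection and field names from the snippets below, and assumes the data has already been inserted; the index 42 is arbitrary.

import pymongo

con = pymongo.Connection()
db = con.test_db

# Layout used by the $push benchmark: one document holding an embedded array.
# A single fetch returns the whole list, and values[42] is a plain in-memory index.
doc = db.testcol.find_one({'name': 'list'})
print doc['values'][42]

# Layout used by the insert benchmark: one document per value.
# Reading a specific value means a query (an index on 'v' keeps it fast).
print db.testlist.find_one({'v': 42})['v']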

As usual, I would like to point out that these benchmarks have no real value and have been performed just out of curiosity.

Here are the three benchmark snippets.

# Benchmark 1: Redis RPUSH
import redis, time
MAX_NUMS = 1000
 
r = redis.Redis(host='localhost', port=6379, db=0)
r.delete('list')  # clear any list left over from a previous run
 
nums = range(0, MAX_NUMS)
clock_start = time.clock()
time_start = time.time()
for i in nums:
    r.rpush('list', i)  # append each value to the tail of the Redis list
time_end = time.time()
clock_end = time.clock()
 
print 'TOTAL CLOCK', clock_end-clock_start
print 'TOTAL TIME', time_end-time_start

# Benchmark 2: MongoDB insert (one document per value)
import pymongo, time
MAX_NUMS = 1000
 
con = pymongo.Connection()
db = con.test_db
db.testcol.remove({})   # clear collections left over from previous runs
db.testlist.remove({})
 
nums = range(0, MAX_NUMS)
clock_start = time.clock()
time_start = time.time()
for i in nums:
    db.testlist.insert({'v':i})  # each value becomes its own document
time_end = time.time()
clock_end = time.clock()
 
print 'TOTAL CLOCK', clock_end-clock_start
print 'TOTAL TIME', time_end-time_start

# Benchmark 3: MongoDB $push (a single document holding a growing array)
import pymongo, time
MAX_NUMS = 1000
 
con = pymongo.Connection()
db = con.test_db
db.testcol.remove({})   # clear collections left over from previous runs
db.testlist.remove({})
oid = db.testcol.insert({'name':'list'})  # the one document the array will grow in
 
nums = range(0, MAX_NUMS)
clock_start = time.clock()
time_start = time.time()
for i in nums:
    db.testcol.update({'_id':oid}, {'$push':{'values':i}})  # append to the embedded array
time_end = time.time()
clock_end = time.clock()
 
print 'TOTAL CLOCK', clock_end-clock_start
print 'TOTAL TIME', time_end-time_start
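
The graphs above plot the elapsed time at increasing sizes; a small driver along these lines (a sketch, not the exact script used for the figures) does that for the Redis case, and can wrap the two Mongo snippets in the same way:

import redis, time

def redis_rpush(n):
    # Same body as the first snippet, parametrised on the number of entries.
    r = redis.Redis(host='localhost', port=6379, db=0)
    r.delete('list')
    for i in range(n):
        r.rpush('list', i)

for n in (1000, 2000, 5000, 10000):
    start = time.time()
    redis_rpush(n)
    print n, 'entries:', time.time() - start, 'seconds'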


  • Paweł Kobylak

    Server and client on one computer? Try doing that benchmark on two separate computers (not VMs).

  • antirez

    Pawel, this seems like a good idea at first, but it's not so easy. For instance, the redis benchmark figures are obtained using client and server on the same computer, so actually redis is faster than the 130k ops/sec we claim, but how do you easily get a link as fast as loopback between two separate boxes?

  • Paweł Kobylak

    As far as I know, redis is slower over the net than mongo. Today I'll be doing replication and performance tests with mongo (5 / 10 / 50 / 200 / 500 connections) over a 1 Gbps link to two pairs of servers.
    First pair (each has): 2x quad core, 12GB RAM and 6x 143GB SAS (RAID 10)
    Second pair (each has): 2x quad core, 12GB RAM and 4x 50GB SSD (RAID 10).
    I'll write about the results if you like.

    --
    sorry for my english ;)

  • Paweł Kobylak

    Label your axes!

