Python 3 Object Oriented Programming

Python 3 Object Oriented Programming has already been reviewed by a number of people in the Python community, and I’m lucky to have been given the chance to read it as well.

Like most of the other reviewers, I found the book pleasant, easy to follow and crystal clear. It gives a good overview of everything you can expect from Python.

The book is composed of 12 chapters:

  • The first 5 introduce the reader to basic concepts of object-oriented programming such as objects and classes, exceptions and inheritance:
    Object-oriented Design, Objects in Python, When Objects are Alike, Expecting the Unexpected, When to Use Object-oriented Programming
  • The next 2 delve into more Python-specific features and uses: Python Data Structures, Python Object-oriented Shortcuts
  • Two chapters follow on common design patterns and on the patterns commonly used in Python: Python Design Patterns I, Python Design Patterns II
  • The remaining three provide a basic I/O introduction (Files and Strings) and a good introduction to useful libraries and tools: Testing Object-oriented Programs and Common Python 3 Libraries

Each chapter shares the same Introduction-Details-Case Study-Exercise-Summary internal structure.

The book seems really good for teaching at university: it explains very clearly lots of basic concepts that are usually assumed to be already known, and it introduces concepts like UML, design patterns and test-driven development in quite a soft and easy way.
It is perfect for an introductory object-oriented programming course, especially if it runs alongside, or one term before, a software engineering course.

  • The usage of UML is to the point and isn’t heavy at all, making it good for people who have just learnt what UML is or who are going to learn it in other courses.
  • The chapters about design patterns touch on most of the patterns commonly used in Python and explain why some patterns aren’t used at all. In an academic context these are quite good for showing how abstract concepts get used in real code.
  • Since we use nose a lot, I would probably have preferred to see it used while explaining unittest; still, the chapter is quite good for getting somebody without previous knowledge to start playing with tests.
  • The end-of-chapter summary and exercise sections are quite useful for reviewing and for a bit of self-checking. An appendix with the exercises solved would probably have made the book even more student-friendly.

All in all, I consider this book quite well suited to universities (both professors and students will enjoy it) and to Python newcomers. More skilled readers will still find it a good book for reading about Python 3.

Android Motorola phones policy

As a previous HTC Dream owner, I have been quite unhappy with my Motorola Backflip over the past few months. I couldn’t upgrade it to unofficial ROMs, I couldn’t run root software, my bootloader was constrained to run only specific signed ROMs (probably by using IBM eFuse) and, last but not least, Motorola announced that it had no intention of providing an upgrade to Android 2.x for the Backflip in Europe. It’s 2010 and I am confined to a device running an Android version older than the one my two-year-old HTC Dream ran.

As I really liked Samsung’s policy of releasing the complete OS sources for their Android phones, and I really liked the design of the Galaxy S, my next phone will probably be a Samsung. In the meantime, I’m finally happy to discover that someone got a working root on the Motorola Backflip, and having tested it myself I can say that it actually works.

I can only hope that someone will also be able to find a way to produce custom ROMs for the Backflip, providing an unofficial upgrade to 2.x, as Motorola left me alone with my plain old Android 1.5.

Python Sequence VS Iterable protocols

Recently my colleague Luca got an extracted chapter of the Python 3 Object Oriented Programming book to review. As I am a long-time Python developer and lover, I couldn’t stop myself from taking a look at the chapter to satisfy my curiosity.

A little excerpt from the chapter, talking about the len, reversed and zip functions, illuminated me about the fact that, due to duck typing, Python developers usually tend to consider sequences and iterables to be quite the same thing.

The author of the book says that “The zip function takes two or more sequences and creates a new sequence of tuples”, and the help(zip) documentation also says “Return a list of tuples, where each tuple contains the i-th element from each of the argument sequences”.

Indeed, the zip function works on every iterable, not just sequences. Python tends to define non-formal protocols, and looking around in the docs one can discover that the sequence protocol is defined as implementing __len__() and __getitem__(), while the iterable protocol requires implementing __iter__(), which returns an iterator.
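A quick sketch of the difference (squares and Countdown are my own toy names, not from the book): zip happily consumes a generator, which is iterable but not a sequence, as well as an object that only implements the sequence protocol without any explicit __iter__:

```python
def squares(n):
    # a generator: iterable (it implements __iter__)
    # but not a sequence, as it has no __len__ and no __getitem__
    for i in range(n):
        yield i * i

class Countdown(object):
    # sequence protocol: __len__ plus __getitem__ indexed from 0;
    # Python can iterate it even without an explicit __iter__
    def __init__(self, n):
        self.n = n
    def __len__(self):
        return self.n
    def __getitem__(self, index):
        if index >= self.n:
            raise IndexError(index)
        return self.n - index

print(list(zip(squares(3), Countdown(3))))  # [(0, 3), (1, 2), (4, 1)]
```

Neither argument matches the “sequence” wording of the docstring literally, yet both work, which is exactly the point.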

This made me think that duck typing not only allows easier development of code, but also easier communication of concepts. In any other language commonly using formal protocols (i.e. interfaces, protocols, or any other formal definition of them), the author of the book would have been required to also specify the definition of the two protocols and the differences between them before the reader would actually have been able to use the zip function. In the Python case, the author just had to write the sentence in natural language, and the reader knows that he/she can actually call zip on any collection, container or even generator: anything that he/she unconsciously recognizes as a sequence, without even having to know about the existence of the protocol itself.

I do recognize that duck typing and non-formal protocols tend to be more error-prone, but it’s interesting to notice that they also help to simplify the communication of concepts, as humans tend to find intuition easier than formal definitions.

Mercurial “git grep” equivalent extension

As we are used to Git, when working with Mercurial we really miss the “git grep” command, as “hg grep” doesn’t do the same thing. So we managed to quickly create a Mercurial extension that adds an hgrep command to hg, which behaves a bit like git grep.

import itertools, os, os.path, pipes

def hgrep(ui, repo, what, **opts):
    """grep through the working-directory state of the repository files"""
    files = []
    status = repo.status(clean=True)

    # status is (modified, added, removed, deleted, unknown, ignored, clean):
    # pick every file actually present in the working directory
    for f in itertools.chain(status[0], status[1], status[4], status[6]):
        files.append(os.path.join(repo.root, f))

    # quote pattern and paths so spaces and shell metacharacters are safe
    os.system("grep %s %s" % (pipes.quote(what),
                              ' '.join(pipes.quote(f) for f in files)))

cmdtable = {"hgrep": (hgrep, [], "[what]")}

To get the “hg hgrep” command working, just save it as hgrep.py in your Python module path and add it to ~/.hgrc inside the extensions section like:

[extensions]
hgrep =

Then you will be able to run “hg hgrep what” inside a Mercurial repository: it will find each file whose current state contains what you were looking for, giving you the complete absolute path to the file.

Redis and MongoDB insertion performance analysis

Recently we had to study a piece of software where reads can be slow but writes need to be as fast as possible. Starting from this requirement, we thought about which of Redis and MongoDB would better fit the problem. Redis should be the obvious choice, as its simpler data structures should make it light-speed fast, and that is actually true, but we found a few interesting things that we would like to share.

This first graph is about MongoDB insertion vs Redis RPUSH.
Up to 2000 entries the two are quite equivalent; then Redis starts to get faster, usually twice as fast as MongoDB. I expected this, and I have to say that antirez did a good job in designing the Redis paradigm: in some situations it is the perfect-match solution.
Anyway, I would have expected MongoDB to be even slower, considering the features that a MongoDB collection has over a simple list.

This second graph is about Redis RPUSH vs Mongo $push vs Mongo insert, and I find this graph to be really interesting.
Up to 5000 entries MongoDB $push is faster even when compared to Redis RPUSH; then it becomes incredibly slow. Probably the MongoDB array type has linear insertion time, and so it becomes slower and slower. MongoDB might gain a bit of performance by exposing a constant-time insertion list type, but even with the linear-time array type (which can guarantee constant-time look-up) it has its applications for small sets of data.
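The linear-insertion hypothesis is easy to visualize with a pure-Python toy model (this illustrates only the asymptotic shape of the curve, not MongoDB internals): appending with amortized constant cost stays flat, while re-copying the whole array on every insert, which is roughly what a linear-time array type must do, degrades quadratically overall.

```python
import timeit

def append_constant(n):
    # amortized O(1) per insert, like a Redis RPUSH onto a list
    lst = []
    for i in range(n):
        lst.append(i)
    return lst

def append_linear(n):
    # O(n) per insert: the whole array is copied on every push,
    # so n inserts cost O(n^2) overall
    lst = []
    for i in range(n):
        lst = lst + [i]
    return lst

# doubling n roughly doubles the constant-time total,
# but roughly quadruples the linear-time total
for n in (1000, 2000, 4000):
    t = timeit.timeit(lambda: append_linear(n), number=1)
    print("n=%d linear total: %.4fs" % (n, t))
```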

I would like to stress that these benchmarks have no real value, as usual, and have been performed just out of curiosity.

You can find the three benchmark snippets below.

# Benchmark 1: Redis RPUSH
import redis, time
MAX_NUMS = 1000

r = redis.Redis(host='localhost', port=6379, db=0)
del r['list']

nums = range(0, MAX_NUMS)
clock_start = time.clock()
time_start = time.time()
for i in nums:
    r.rpush('list', i)
time_end = time.time()
clock_end = time.clock()

print 'TOTAL CLOCK', clock_end-clock_start
print 'TOTAL TIME', time_end-time_start

# Benchmark 2: MongoDB insert into a collection
import pymongo, time
MAX_NUMS = 1000

con = pymongo.Connection()
db = con.test_db
db.testcol.remove({})
db.testlist.remove({})

nums = range(0, MAX_NUMS)
clock_start = time.clock()
time_start = time.time()
for i in nums:
    db.testlist.insert({'v':i})
time_end = time.time()
clock_end = time.clock()

print 'TOTAL CLOCK', clock_end-clock_start
print 'TOTAL TIME', time_end-time_start

# Benchmark 3: MongoDB $push into a single document's array
import pymongo, time
MAX_NUMS = 1000

con = pymongo.Connection()
db = con.test_db
db.testcol.remove({})
db.testlist.remove({})
oid = db.testcol.insert({'name':'list'})

nums = range(0, MAX_NUMS)
clock_start = time.clock()
time_start = time.time()
for i in nums:
    db.testcol.update({'_id':oid}, {'$push':{'values':i}})
time_end = time.time()
clock_end = time.clock()

print 'TOTAL CLOCK', clock_end-clock_start
print 'TOTAL TIME', time_end-time_start

Canvas 3D and various modern web technologies reflections

Every ~10 years we face an incredible new technology that looks a lot like an old technology that spent its life in total oblivion.

Recently we have seen the return of clustered computing under the new brand of cloud computing, the return of time-sharing systems under the software-as-a-service paradigm, and 3D movies have come back to life from the 1970s. Technology improves, and sometimes we realize that something we already tried can be made better and released under a new, less geeky brand.

Recently HTML5, which is trying to remove the need for plugins for nowadays-common actions (like playing videos online) and to stop the HTML vs XHTML war by permitting SVG and MathML to be embedded in HTML, has caused the return of the idea of doing complex graphics online.

Indeed, being able to draw a few pixels using <canvas> doesn’t mean being able to draw complex 3D graphics, and so we have started to see the old idea of VRML getting new life under the flags of Canvas 3D and O3D. (Actually VRML and Canvas 3D/O3D have little in common, but we are still talking about 3D graphics on the web.)

This was quite inevitable, and I think that in HTML6 we will probably see some kind of 3D API integrated into the standard canvas; but for now I’m just curious to see whether they will be used for something good, or whether they will face the same doom as VRML and Flash, which have been used 90% of the time to create disturbing and useless web pages, animations and introductions.

Anyway, CADIE was already moving on this front and has provided the solution for the 3D web.

At least it would be cool to be able to retrieve your heavy machine gun inside Quake 6 with jQuery(‘#machinegun’), but, as the world hates me, 3D objects won’t be part of the DOM and will only live in your JavaScript heap. I’m very sad about this :/

Liskov Substitution Principle Reflection

I have found a quite interesting Shakespearean article about the Liskov Substitution Principle. It is mainly about C++, but since the principle is at the core of the object-oriented paradigm, I think it might be interesting for everyone.

Most of the problem is nowadays resolved with mixins and Policy-Based Design, but those move away a bit from pure object orientation, so it is still an interesting reflection.

So if you are interested, take a look at this Blog Post.

Remember to sync (or what the heck is that RAW C: partition on my Windows on mac?)

Our current computers use the MBR to store the table of primary partitions. By convention, the MBR allows only 4 primary partitions and then uses logical partitions inside Extended Boot Records for the subsequent ones. Apple computers instead use the new Extensible Firmware Interface (EFI) with the GUID Partition Table (GPT) to describe disk partitions, permitting any number of primary partitions to be allocated.

Mac OS X uses the GPT partition table, while Windows expects to find the MBR one (actually, Windows for Itanium can boot from GPT, but that is not what you are usually installing on your desktop). Fortunately, EFI keeps a “fake” MBR table for backward compatibility, and that is why we can install Windows on our MacIntel. But what happens when we change our GPT partitions from OS X or Linux, which support GPT? Our MBR becomes out of sync with the GPT.

Recently I had to install a triple boot with OS X + Windows + Linux on my MacBook Pro. This required 5 partitions (1 for EFI, 1 for OS X, 1 for Linux, 1 for Windows, 1 for Linux swap). After creating them from OS X, I synchronized the MBR with the GPT using the rEFIt tools, so as to have the first four partitions available also in the MBR (this way I lost only the swap partition), and then I installed Windows XP. Everything went well, but after booting, Windows turned out to be installed on D: and there was a C: partition of RAW type. I just thought that this was obviously because the unformatted Linux partition came before the Windows one, and restarted to install Linux.

Then I formatted my Linux partition as ext3 and installed Gentoo on it. Everything was fine until I had to restart Windows to install Service Pack 3 for XP. My wonderful RAW C: partition was still there… and I wasn’t able to install the service pack, because the SP3 installer refused to install on anything other than C:, even though my Windows installation lived on D: and C: wasn’t a valid partition (of course… it’s my Linux partition!).

Windows should have known that it wasn’t a valid partition for it and should have hidden it, but that didn’t happen… Why? After thinking a bit, I remembered that the partition type is also a field inside the MBR. When we format a partition, the system also sets the partition type inside the MBR to tell everyone that the partition has been reserved by some OS. Having initialized my ext3 partition from Linux meant that only the GPT was updated, and the MBR still thought that my third partition was unformatted.
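That type field is easy to inspect: the MBR partition table lives at offset 446 of the first 512-byte sector, holds four 16-byte entries, and each entry stores its type ID in its fifth byte (0x83 for Linux, 0x07 for NTFS, 0x00 for unused). A minimal sketch, assuming you can read the raw first sector of the disk:

```python
import struct

MBR_TABLE_OFFSET = 446  # the partition table starts at byte 446 of the sector
ENTRY_SIZE = 16         # four entries of 16 bytes each

def partition_types(sector):
    """Return the four partition type IDs stored in an MBR boot sector."""
    types = []
    for i in range(4):
        entry = MBR_TABLE_OFFSET + i * ENTRY_SIZE
        # entry layout: status(1), CHS first(3), type(1), CHS last(3),
        # first LBA(4), sector count(4)
        types.append(struct.unpack_from('B', sector, entry + 4)[0])
    return types

# e.g. on Linux, as root: partition_types(open('/dev/sda', 'rb').read(512))
```

As far as I understand, this per-entry type byte is exactly what rEFIt’s gptsync rewrites when it resynchronizes the MBR with the GPT.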

After losing about an hour fighting with the Service Pack installer, I realized that I just had to resynchronize the two tables… After doing that, the RAW C: partition disappeared from Windows and my service pack installed correctly.

New lesson: OK, Microsoft could make a better installer and stop preventing their OS from installing onto something different from what they call “C:”, but sometimes you have to think a bit more before getting angry at your OS 😀

The misunderstood one

I’m a regular follower of C++-related blogs and newsgroups, and I’m still impressed by how much this language is misunderstood after 10 years of a precise standard. I’m actually starting to wonder whether we are ready for C++0x while a lot of people still misunderstand C++98 and C++03.

I often end up discussing with my colleagues the fact that they prefer a less powerful but more reliable and less obscure language like C. I know that C++ isn’t one of the easiest languages to master correctly, and I also know that it has some problems with compilers often changing their behaviour and with slow template compilation times; all things that C++0x is trying to make easier or faster with extern templates, move constructors, delegating constructors and automatic type deduction.

C++0x will be a quite complex and advanced language which will give the programmer a lot of control over the compiler, but still, after 10 years of standard C++ and the formal STL library, you can find posts like this which confuse good design practices and theories with runtime optimizations.

Having the stack container’s pop method return void is about strong exception safety, not a runtime optimization. Strong exception safety means that the operation either completes successfully or throws an exception, leaving the program state exactly as it was before the operation started.

What would happen if we had a T pop() method returning the head after removing it? The copy constructor of T might fail by throwing an exception, and we would find that, after removing our head, we had also lost it forever. Having pop just remove the item means that our code can only either succeed in removing the item, or fail to remove it, leaving the state exactly as it was before.

This is one of the first issues raised years ago, an issue discussed in depth since Tom Cargill’s 1994 article, explained by Meyers and Sutter in their books and covered in a lot of community blogs like the Boost one. And still this isn’t clear even to long-time C++ developers.

C++ is a multipurpose language, and all that freedom of movement requires a deeper knowledge of the language itself. Shouldn’t we change how C++ is taught, and enforce some practices more strictly, before moving to a new standard? As an occasional C++ teacher, I often end up asking myself this question…