Successor to cloud computing, aka the new tweelter architecture

When you have to perform a really large number of operations you have two options:

  • Increase your computational power (like using a cloud solution or scale on more servers)
  • Move your computations to the most available cloud platform of the world: your users
To improve tweelter’s performance and avoid overloading the twitter API, we are studying a new computational architecture for tweelter that returns results to the user faster and puts less load on our servers.
The key to achieving this is to move most of the search overhead to the computers of the other users currently viewing tweelter, as SETI@home does. By using tweelter you would speed up other users’ searches as well as your own.

Tweelter, the twitter filter

While speaking with the top-ix people during a meeting we started to talk about the need for a way to filter out “noise” from twitter searches.

Probably everyone has found that searching for something on twitter returns a long list of retweets and duplicated tweets. As those make it harder to follow a discussion or an event on twitter, they are usually more of a problem than a useful result.

At the end of that meeting Tweelter was born.

Tweelter is a twitter search engine which filters out duplicated entries and retweets, and lets you search results older than one month on the most followed topics. More interestingly, tweelter performs those searches in parallel and on a distributed mongodb. While retrieving all the results of the same search through the twitter api would take 10-20 seconds or more, with tweelter you get the same results in 2-3 seconds, and the more often a search is performed the faster it gets.
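To give an idea of what filtering out the noise means, here is a minimal sketch in plain Python. It is not tweelter's actual code: the function names are made up for the example, retweets are assumed to be the old-style "RT @user" ones, and two tweets are considered duplicates when their text matches after lowercasing and collapsing whitespace.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so trivially different copies compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def is_retweet(text):
    """Assumption of this sketch: a retweet is a tweet starting with 'RT @'."""
    return normalize(text).startswith("rt @")


def filter_noise(tweets):
    """Return the tweets in their original order, without retweets or repeated texts."""
    seen = set()
    kept = []
    for text in tweets:
        if is_retweet(text):
            continue
        key = normalize(text)
        if key in seen:
            continue
        seen.add(key)
        kept.append(text)
    return kept


tweets = [
    "TurboGears 2 released!",
    "RT @someone: TurboGears 2 released!",
    "turbogears 2   released!",
    "Trying out mongodb today",
]
filtered = filter_noise(tweets)
```

Here filtered keeps only the first and the last tweet: the second is a retweet and the third is the first one again with different case and spacing.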

So give tweelter a try if you need to follow a discussion on twitter; it might make following it easier.

Rehearsing new ACR look and feel

As some turbogears projects are starting to use ACR as their CMS library, we have received the first few requests from real users, and the most prominent one is for a better administrative section. Currently the administration section is implemented using the great tgext.admin and sprox. Even though those are really good for quickly implementing a CRUD section, they are not a good fit when a more interactive and advanced user experience is required.

So a transition phase that will end with a totally new administration section for ACR has started. Currently the system implements a new user interface that still uses the same backend as before to handle the operations, but in the long run the backend itself will be rewritten to make content creation and management easier. In the meantime, support for multi-language, versioning and authors has also been added.

Injecting static content in TurboGears

Something that you usually need to do when providing a library or a reusable wsgi application is installing static data with the library itself.

This can be quickly performed by adding something like

package_data = {'': ['*.html', '*.js', '*.css', '*.png']}

to your setup.py

But then how can we let our turbogears application serve that?

  • For html files (genshi templates) the solution is quite simple: you can just expose them by using @expose('librarypackage.templatesdir.template'). For example, supposing we are installing libcool with its templates in libcool/templates, you can do @expose('libcool.templates.index')
  • For js and css files you can add them to your pages by creating a tw.api.JSLink or tw.api.CSSLink object. Just create inside your library something like cool_js = tw.api.JSLink(modname=__name__, filename='static/cool.js') and then call cool_js.inject() in the controller that exposes the view where you want to use that js file.
  • Exposing images can be more complex: you have to declare a widget which will render the img tag by using the ToscaWidgets resource exposure.
    class IconLink(tw.api.Link):
        """
        A link to an icon.
        """
        template = """<img src="$link" alt="$alt" />"""
    
        params = dict(alt="Alternative text when not displaying the image")
    
        def __init__(self, *args, **kw):
            super(IconLink, self).__init__(*args, **kw)
            self.alt = kw.get('alt')

    Then you can create one IconLink for each icon in your library with something like parent = IconLink(modname=__name__, filename='static/icons/parent.png', alt='Up') and place the icon inside your views with ${parent.display()}. This will add the image by exposing it from inside the static directory of your library package. Remember that you need to add an __init__.py inside the static directory and its subdirectories if you want setuptools to correctly install the static files.

Using SwfUpload with TurboGears 2

SwfUpload doesn’t let you upload through authenticated methods, because it doesn’t pass the cookies needed to identify your users.

This problem can be partly solved by using the swfupload.cookies.js plugin. This plugin fetches all your cookies and passes them as POST arguments. This way you can get your authtkt cookie and use it to identify your user.

from webob.exc import HTTPBadRequest
from paste.auth import auth_tkt

if 'authtkt' in kw:
    #by default it is usually configured not to use the remote address,
    #otherwise you can fetch it from request.environ['REMOTE_ADDR']
    remote_addr = '0.0.0.0'

    #cookie secret is usually defined in your config/app_cfg.py
    #as base_config.sa_auth.cookie_secret or in your development.ini
    cookie_secret = "some_random_string_like_BQQP+BeyrTzTHClBCEdW"

    try:
        data = auth_tkt.parse_ticket(cookie_secret, 
                                      kw.get('authtkt'), 
                                      remote_addr)
        username = data[1]
        user = DBSession.query(User).filter_by(username=username).one()
    except Exception:
        raise HTTPBadRequest()

filename = kw['Filename']
file = kw['Filedata'].file

With this code you can fetch the user that is uploading the file. This requires that the method does not use the @require decorator to check user permissions, as you will know the user only after entering the method. But you can create your own predicate if you really want to use @require.

ACR divided in ACRcms and libACR

As new projects started to use ACR to implement the content management part of their sites, we started to separate the ACR application from the content management framework, to let other people embed the cmf inside their own applications.

ACR has now been divided into libACR, which is the content management framework, and ACRcms, which is the cms application. This should make it easier to use libACR to implement your own CMS or extend your web applications, and also make it fast for anyone who needs a readily available CMS solution to just use ACRcms and tune the graphic theme and layout.

Caching in TurboGears 2

TurboGears 2 has quite good and complete caching support inherited from Pylons. As it is a Pylons feature it is not exposed by TurboGears itself, but you can import it from Pylons.

All you need are these three imports:

from pylons.decorators.cache import beaker_cache
from pylons.controllers.util import etag_cache
from pylons import cache

The first imports a decorator which makes it possible to cache entire controller methods, the second imports a function for client side caching, and the third makes available a caching repository where you can store whatever data you want.

The easiest caching mechanism is etag_cache: it tells the browser to use its own cached version of the page, if it has one, instead of requesting it again from the server. etag_cache requires only one parameter: the caching key. By placing etag_cache('mykey') as the first instruction inside your controller action you tell the browser to use its own cached version if it has one. You can use the same key inside each action, as the browser will check both the key and the url, so different urls won’t collide. Keep in mind that this will let the browser keep using the cached version of the page until the browser is restarted, and this is usually something that you don’t want. To avoid this behaviour I suggest changing the key argument each time you want the cache to decay; using a timestamp as the key might be a good idea.

For example you can add to your lib.base.BaseController.__call__ method something like

app_globals = tg.config['pylons.app_globals']
if app_globals.caching_key + datetime.timedelta(0, 10) < datetime.datetime.now():
    app_globals.caching_key = datetime.datetime.now()
self.caching_key = str(app_globals.caching_key)

and then use etag_cache(self.caching_key) inside the controller action. This will make your cache expire every 10 seconds.
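If you want to see the key rotation in isolation, the following is a small self-contained sketch of the same idea, without TurboGears or Pylons. The RotatingKey class and its injectable clock are made up for this example (the clock is there only to make the behaviour easy to check); in a real application the value would live in app_globals as described above.

```python
import datetime


class RotatingKey(object):
    """Hypothetical helper: a timestamp key that is renewed once it gets too old."""

    def __init__(self, lifetime_seconds=10, now=datetime.datetime.now):
        self.lifetime = datetime.timedelta(seconds=lifetime_seconds)
        self.now = now            # injectable clock, an assumption of this sketch
        self.value = self.now()   # timestamp currently used as the caching key

    def current(self):
        # Same comparison as in the controller snippet: renew the key
        # when it is older than the configured lifetime.
        if self.value + self.lifetime < self.now():
            self.value = self.now()
        return str(self.value)


# Simulate time passing with a fake clock instead of sleeping.
fake_time = [datetime.datetime(2010, 1, 1, 12, 0, 0)]
key = RotatingKey(lifetime_seconds=10, now=lambda: fake_time[0])

first = key.current()
fake_time[0] += datetime.timedelta(seconds=5)
after_5_seconds = key.current()    # still inside the lifetime, same key
fake_time[0] += datetime.timedelta(seconds=6)
after_11_seconds = key.current()   # lifetime exceeded, a new key is generated
```

The string returned by current() is what you would pass to etag_cache: as long as it stays the same the browser can reuse its copy, and when it changes the page is requested again.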

This might be enough in situations where you want to cache your whole page, but often you might want to cache only your controller and render your view again. This can be achieved with the @beaker_cache decorator. It uses Beaker to cache the values returned by your controller: if it finds cached data for your controller, it returns it without calling the controller method.

@expose()
@beaker_cache(expire=10)
def index(self):
    #Long and slow operation here
    return 'OK'

This way your action stays cached for 10 seconds, and different versions are cached if the action parameters change.

For more complex things you might want to cache only parts of an action. This can be achieved by using the cache object directly.

c = cache.get_cache('my_function')
result = c.get_value(key=function_args, createfunc=slow_function, type="memory", expiretime=10)

This gets the caching namespace for the current function and retrieves the value stored under the given key, if there is one (you might see that I have called the key “function_args”: this is because it is usually a good idea to build the key from the function arguments that have any effect on the result). If no value is found (or the value has expired), slow_function will be called to calculate the new value.

New versions of beaker have a nice @cache.cache decorator which saves you from having to get the cache namespace and the cached value yourself: once @cache.cache is applied to slow_function, each call to slow_function returns the cached value when one is available. More information can be found in the relevant section of the beaker documentation. Keep in mind that the @cache.cache decorator only works with positional arguments; it won’t work with keyword arguments.
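To make the get-or-create behaviour more concrete, here is a toy in-memory version written with only the standard library. It is not how Beaker is implemented: SimpleCache, slow_square and cached_square are names invented for this sketch, the clock is injectable only so that expiration can be shown without waiting, and createfunc is assumed to be called without arguments.

```python
import time


class SimpleCache(object):
    """Toy cache: returns a stored value until it expires, then rebuilds it."""

    def __init__(self, clock=time.time):
        self._clock = clock   # injectable clock, an assumption of this sketch
        self._values = {}     # key -> (expiry timestamp, value)

    def get_value(self, key, createfunc, expiretime):
        now = self._clock()
        entry = self._values.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # still valid: skip createfunc
        value = createfunc()                     # missing or expired: recompute
        self._values[key] = (now + expiretime, value)
        return value


calls = []  # records every real computation


def slow_square(n):
    calls.append(n)
    return n * n


def cached_square(cache, n):
    # The key is built from the arguments that affect the result.
    return cache.get_value(key=(n,), createfunc=lambda: slow_square(n), expiretime=10)


fake_now = [1000.0]
cache = SimpleCache(clock=lambda: fake_now[0])

a = cached_square(cache, 4)   # computed
b = cached_square(cache, 4)   # served from the cache
fake_now[0] += 11             # more than expiretime seconds later
c = cached_square(cache, 4)   # expired, computed again
```

After these three calls slow_square has actually run only twice, which is the whole point of passing createfunc instead of calling the slow function yourself.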

Personalize your Error pages in Turbogears 2

I was looking for a way to propagate exceptions from my tg2 app to the ErrorController, so that errors raised in controllers can be reported to the user. To generate errors and show them you usually have to redirect to /error/document and pass the error message and error code as parameters, but this isn’t really flexible, and modern languages already have a really good feature for propagating errors: exceptions.

So I was looking for a way to raise a webob.exc.HTTPForbidden, place a message inside it and let the ErrorController render the message. Usually you don’t want to tell the user what went wrong with your 500 server side exception, but you might want to tell the user why they can’t do what they are trying to do, and why it is forbidden to them.

The first thing you can easily do is check resp.status_int inside your ErrorController.document and handle only the 403 error (the forbidden one). This lets you create a specific message for Forbidden errors, but doesn’t tell the user much about why the action is forbidden. webob.exc.HTTPForbidden lets you set a detail message and also generates an error page, but when the turbogears stack gets a status code different from 200 it hooks the response and calls ErrorController.document to generate a new response. This way your HTTPForbidden exception is lost forever.

Actually it isn’t really lost, as you can access the previous error page through request.environ.get('pylons.original_response'). If you want a quick solution you can check in ErrorController whether status_int is 403 and return the original_response instead of rendering the ErrorController.document template.

But the original response isn’t really nice and you usually want to adapt it.

My solution has been to create a new HTTPMyAppError(webob.exc.HTTPForbidden) class, defined as follows:

from webob.exc import HTTPForbidden

try:
  from string import Template
except ImportError:
  from webob.util.stringtemplate import Template

class HTTPMyAppError(HTTPForbidden):
  body_template_obj = Template('''<div>${detail}</div>''')

  def __init__(self, msg):
    super(HTTPMyAppError, self).__init__(msg)

This by itself doesn’t change a lot, as you will get the same ugly page with just a div around your error. But now, by using BeautifulSoup, you are able to extract only your message from the original_response inside your ErrorController.document action.

if resp.status_int == 403:
    title = "Application Error"
    message = "We can't perform this, reason was:"
    details = str(BeautifulSoup(resp.body).find('div'))
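The round trip (wrap the detail in a div when the error page is generated, pull it back out later) can be tried without webob or BeautifulSoup. The sketch below uses only string.Template and a regular expression in place of BeautifulSoup; the surrounding page markup and the function names are made up for the example, and unlike the snippet above it returns only the text inside the div, not the div tag itself.

```python
import re
from string import Template

# Same body template as in the exception class.
body_template = Template('<div>${detail}</div>')


def render_error_body(detail):
    """Build a fake error page; the markup around the div is invented for this sketch."""
    wrapped = body_template.substitute(detail=detail)
    return '<html><body><h1>403 Forbidden</h1>%s</body></html>' % wrapped


def extract_detail(body):
    """Return the text inside the first <div>, or None if there is no div."""
    match = re.search(r'<div>(.*?)</div>', body, re.DOTALL)
    return match.group(1) if match else None


body = render_error_body('You have already used this registration code')
detail = extract_detail(body)
```

A regular expression is enough here only because the template is under our control; for anything less predictable a real HTML parser such as BeautifulSoup is the safer choice.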

Simply personalize your ErrorController.document template to display your details somewhere, and you will be able to report errors to your users by doing something like raise HTTPMyAppError('You have already used this registration code')

I know that this isn’t a really great solution, as you have to parse your already generated error page to fetch only the error message and then generate a new error page. If anyone has a better solution that gives direct access to the exception instance, feel free to tell me!

ACR got Google Maps view support

We have recently added support for the GMap view to the ACR svn; this means that you can now display both static and dynamic google maps by using ACR.

Using MapView is as simple as specifying the location to display and setting map as the view of the slice.

ACR Slice Preview support, remote disk, Comment and File views.

The latest version of ACR, our opensource cms for turbogears, got some interesting new features:

Now each view can have a “preview mode” which shows a minimized version of the slice it is bound to. For example, you can show a short version of a news item on your home page, linking to the complete one, or you can show the thumbnail of an image and link to the full version. This can be quite useful in some situations and can be triggered by setting preview=1 inside a slice group. Each slice inside the group will render in preview mode.

The Remote Disk feature has also been implemented and can be accessed at /rdisk. Inside this file manager you can upload any file, which can then be referenced and used inside your web pages. The File View itself has been implemented to let you quickly link to files and serve them: it will show images or videos, or provide a link to download files of unknown types.

ACR now also has a Comment view, cloned from our iJamix project, which lets you insert a comment thread inside any page.