VideoLAN Web Plugin: xpi vs crx

One of the main issues while preparing a streaming solution is answering the obnoxious question:

  • Question: Is it possible to use the service through a browser?
  • Answer: No, RTSP isn’t* HTTP; a browser isn’t a tool for accessing just any kind of network content.
  • * Actually it would be neat to have RTSP support within the video tag, but that’s yet another large can of worms.

Once you say that, half of your audience leaves. Non-technical people are too used to considering the browser the one and only key to the internet. The remaining ones will ask something along these lines:

  • Question: My target user is a complete -idiot- -technically impaired- naive and unaccustomed user who cannot be confronted with the hassle of a complex installation procedure; is there something that fits the bill?
  • Answer: VideoLAN Web Plugin

Usually that makes some people happy, since it’s something they actually know, or at least have heard about. Some might start complaining because they experienced an old version and, well, it crashed a lot. What you should beware of is the following one:

  • Question: So I actually need to install the VideoLAN Web Plugin, and it requires attention; isn’t there a quicker route?
  • Answer: Yes, xpi and crx for Firefox and Chrome.

Ok, that answer is more or less from the future, and it’s the main subject of this post: seamlessly bundling something as big and complex as vlc and making our non-technical and naive target user happy.

I picked the VideoLAN web plugin since it is actually quite good already, has a nice JavaScript interface that lets you do _lots_ of nice stuff, and there are people actively working on it. Additional points since it is available on Windows and MacOSX. Some time ago I investigated how to use the extension facility of Firefox to get the fabled “one click” install. The current way is quite straightforward and has already landed in the vlc git tree, for the curious and lazy:


<?xml version="1.0"?>
<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:em="http://www.mozilla.org/2004/em-rdf#">
  <Description about="urn:mozilla:install-manifest">
    <em:id>vlc-plugin@videolan.org</em:id>
    <em:name>VideoLAN</em:name>
    <em:version>1.2.0-git</em:version>
    <em:targetApplication>
      <Description>
        <em:id>{ec8030f7-c20a-464f-9b0e-13a3a9e97384}</em:id>
        <em:minVersion>1.5</em:minVersion>
        <em:maxVersion>3.6.*</em:maxVersion>
      </Description>
    </em:targetApplication>
  </Description>
</RDF>

Putting that as install.rdf in a zip containing a directory called plugins, with libvlc, its modules and obviously the NPAPI plugin, does the trick quite well.
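
For the lazy, here is a minimal sketch of how such a bundle could be assembled with Python’s zipfile module; only install.rdf and the plugins/ directory layout come from the description above, the file names are placeholders:

import os
import zipfile

# Hypothetical helper: pack install.rdf plus a plugins/ directory
# (libvlc, its modules and the NPAPI plugin) into an xpi, which is
# just a plain zip archive with a different extension.
def make_xpi(xpi_path, rdf_path, plugins_dir):
    with zipfile.ZipFile(xpi_path, "w", zipfile.ZIP_DEFLATED) as xpi:
        xpi.write(rdf_path, "install.rdf")
        for root, _dirs, files in os.walk(plugins_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store everything under plugins/ inside the archive.
                rel = os.path.relpath(full, plugins_dir)
                xpi.write(full, os.path.join("plugins", rel))

make_xpi("vlc-plugin.xpi", "install.rdf", "plugins")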

Chrome now has something similar, and it seems even easier; this is what I put in the manifest.json:

{
  "name": "VideoLAN",
  "version": "1.2.0.99",
  "description": "VideoLan Web Plugin Bundle",
  "plugins": [{ "path": "plugins/npvlc.dll", "public": true }]
}

Looks simpler and neater, doesn’t it? Now we get to the problematic part of Chrome extension packaging:

It is mostly a zip, BUT you have to prepend to it a small header containing more or less just the signature.

You can do that either by using Chrome’s built-in facility or with a small ruby script. Reimplementing the same logic in a Makefile using openssl is an option; for now I’ll stick with crxmake.
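
For reference, a Python sketch of the same logic, shelling out to openssl as hinted above; the header layout (a “Cr24” magic, a format version, then the length-prefixed DER public key and RSA/SHA-1 signature of the zip, all little-endian) is the crx format as I understand it today, and the key/file names are made up:

import struct
import subprocess

# Hypothetical reimplementation of the crxmake logic.
def make_crx(zip_path, key_pem, crx_path):
    # Public key in DER form, as Chrome expects it in the header.
    pubkey = subprocess.check_output(
        ["openssl", "rsa", "-pubout", "-outform", "DER", "-in", key_pem])
    # RSA/SHA-1 signature over the raw zip contents.
    sig = subprocess.check_output(
        ["openssl", "dgst", "-sha1", "-sign", key_pem, zip_path])
    with open(zip_path, "rb") as f:
        zip_data = f.read()
    with open(crx_path, "wb") as out:
        out.write(b"Cr24")  # magic number
        out.write(struct.pack("<III", 2, len(pubkey), len(sig)))
        out.write(pubkey)
        out.write(sig)
        out.write(zip_data)

make_crx("vlc-plugin.zip", "private.pem", "vlc-plugin.crx")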

The first test builds for win32 are available as xpi and crx, hosted on lscube.org as usual.

Sadly, the crx file layout and the not-so-tolerant Firefox xpi unpacker make it impossible to have a single zip, containing both the manifest.json and the install.rdf, served as both xpi and crx.

By the way, WordPress really sucks.

ACR got support for user permissions

ACR, our open-source TurboGears2 CMS, got support for user permissions, to allow users to edit only some pages and create child pages only in some sections. This should make it possible to split work between multiple people on ACR-based sites.

ACR also got support for a blog/news slice template. This makes it possible to create blogs in ACR with just two clicks, instead of having to declare the ACR slicegroup yourself.

As usual, you can download ACR from http://repo.axant.it/hg/acr using Mercurial.

Remote desktop, meet multimedia; school, meet remote desktop

Recently we got contacted about crafting some kind of solution for remote participation in school lessons. Hospitalized students may have a hard time catching up, and current technologies, even the overpriced and underused “interactive blackboards”, may help a bit in this picture.

What’s an “interactive blackboard”? It is more or less a projector plus some kind of tracking pointer, usually the IR flavour you can see in wide use thanks to the Nintendo Wii. Not exactly a breakthrough: it’s something you could craft with about 50€ of components, plus a projector at whatever price you like (such as the relatively inexpensive and highly portable ones from 3M). You might get some “value added software” that gives you a UI more “blackboardish” than your standard desktop, but that’s all.

How is it used during lessons? Pretty much like a normal blackboard; worst case, you have a dull teacher feeding his poor students badly made slides.

That said, it gets pretty easy to think of a way to keep the hospitalized student and the rest of the class linked: put a remote desktop solution (NX, VNC, whatever) on the system wired to the blackboard and arrange some controls so that the teacher can give the “chalk” to the remote student and take it back.

Simple enough, isn’t it?

Problems:

– What if the teacher would like to see and hear the remote student?

Well, there are plenty of streaming solutions (I’m currently eyeing sip-communicator, since they put on a great show at FOSDEM, but ekiga or skype could do as well).

– What if the teacher starts using the “interactive board” to show a DVD and wants the remote student to enjoy it too?

Ok, there we have a problem: a large surface updated quite often, a demand for _good_ quality, and a remote student expected to have just a wireless link such as UMTS, EDGE or GPRS. It gets really painful.

There are some solutions with heuristics in place to discover when a surface is holding a video, and they try to compress it with something not-so-lossy and with low enough delay. A bit suboptimal, but it should work somehow. I wonder if somebody has already thought about harnessing XV, XvMC and libVA capabilities and trying to wire the not-fully-decoded bitstream over this way. Given a vaapi implementation on both endpoints and the right codec, you may get a perfect movie and probably also spare some bandwidth. If I have time I’ll probably try to build a proof of concept using the efikamx as endpoint, given that it will get a vaapi bridge to its hardware accelerators.

For the audio, you can compress it quite well without many complaints, and wiring it from one desktop to another is relatively easy (hi, pulse!).

So far I have described some months of work; now the last problem:

– What if I want many remote students to interact with the same class and blackboard?

Oops. Given that the class may have a link no better than UMTS as well, if we are talking about bare remote desktop it might still be feasible.

If we could cut the video feeds and keep just the voices of the remote students, we could still survive with 3-4 of them at most.

If we want them to enjoy the video the teacher is about to show to the class, then… we need something else entirely. The whole blackboard-and-such software would be better off on a server with enough bandwidth and CPU to serve all the remote nodes with ease. The class could then use a thin client wired to the interactive blackboard, and more or less everybody could be happy. Sadly, such technologies aren’t quite ready. There is Spice, which is quite promising, but not ready yet.

That’s all for now, I have spent enough time rambling. We’ll see whether these “cloud”-y ideas will end up in an implementation or not. And I haven’t yet started thinking about which software would run on this contraption… Does anybody have any experience with educational software for middle/high schools?

Rehearsing new ACR look and feel

As some TurboGears projects are starting to use ACR as their CMS library, we received the first few requests from real users, and the most prominent one is for a better administrative section. Currently the administration section is implemented using the great tgext.admin and sprox; even though those are really good for quickly implementing a CRUD section, they don’t couple very well with a more interactive and advanced user experience.

So a transition phase has started that will end with a totally new administration section for ACR. Currently the system implements a new user interface still using the same backend as before to handle the operations, but in the long run the backend itself will be rewritten to make content creation and management easier. In the meantime, support for multi-language content, versioning and authors has also been added.

ACR divided into ACRcms and libACR

As new projects started to use ACR to implement the content management part of their sites, we began separating the ACR application from the content management framework, to let other people embed the CMF inside their own applications.

ACR has now been divided into libACR, which is the content management framework, and ACRcms, which is the CMS application. This should make it easier to use libACR to implement your own CMS or extend your web applications, while anyone who needs a quickly available CMS solution can just use ACRcms and tune the graphic theme and layout.

ACR got Google Maps view support

We have recently added support for the GMap view to the ACR svn; this means that you can now display both static and dynamic Google Maps using ACR.

Using the map view is as simple as specifying the location to display and setting map as the view of the slice.

ACR Slice Preview support, remote disk, Comment and File views.

The latest version of ACR, our open-source CMS for TurboGears, got some interesting new features:

Now each view can have a “preview mode”, which shows a minimized version of the slice to which it is bound. For example, you can show on your home page a short version of a news item linking to the complete one, or the thumbnail of an image linking to the full version. This can be quite useful in some situations, and it can be triggered by setting preview=1 inside a slice group; each slice inside the group will then render in preview mode.

The Remote Disk feature has also been implemented; it can be accessed at /rdisk. Inside this file manager you can upload any file, which can then be referenced and used inside your web pages. The File view itself has been implemented to let you quickly link to files and serve them: it will show images or videos, or provide a download link for files of unknown types.

Also, cloned from our iJamix project, ACR now has a Comment view, which lets you insert a comment thread inside any page.

CrawlBot Wars

Everybody who ever wanted to write a “successful website” (or more recently, thanks to the Web 2.0 hype, a “successful blog”) knows the blessing and curse of crawlers, or bots, that are unleashed by all kinds of entities to scan the web and report the content back to their owners.

Most of these crawlers are run by search engines, such as Google, Microsoft Live Search, Yahoo! and so on. With the widespread use of feeds, at least Google and Yahoo! added to their standard crawler bots feed-specific crawlers, used to aggregate blogs and other feeds into nice interfaces for their users (think Google Reader). Alongside this kind of crawler, though, there are less useful, sometimes nastier crawlers that either don’t answer to search engines at all, or answer to search engines whose ethics make you wonder.

Good or bad, at the end of the day you might not want some bots to crawl your site; some Free Software -bigots- activists some time ago wanted, for instance, to exclude the Microsoft bot from their sites (while I have some other ideas), but there are certain bots that are even more useful to block, like the so-called “marketing bots”.

You might like Web 2.0 or you might not, but certainly lots of people found the new paradigm of the Web a gold mine for making money out of content others have written. Incidentally, these are not, as RIAA, MPAA and SIAE insist, the “pirates” who copy music and movies, but rather companies whose objective is to provide other companies with marketing research and data based on the content of blogs and similar services. While some people might be interested in having their blog scanned by these crawlers anyway, I’d guess that for most users who host their own blog this is just a waste of bandwidth: the crawlers tend to be quite pernicious, since they don’t use If-Modified-Since or ETag headers in their requests, and even when they do, they tend to make quite a few requests to the feeds per hour (compare this with Google’s Feedfetcher bot, which requests at most one copy of the same feed per hour – well, unless it is confused by multiple compatibility redirects, as it unfortunately is with my main blog).
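
To make the complaint concrete, this is roughly what a well-behaved feed fetcher is expected to do: a conditional request that carries the validators from the previous poll and treats 304 as “nothing changed”. A quick sketch with Python’s standard library (the URL is illustrative):

import urllib.error
import urllib.request

def fetch_feed(url, etag=None, last_modified=None):
    req = urllib.request.Request(url)
    # Send back the validators we saved from the previous response.
    if etag:
        req.add_header("If-None-Match", etag)
    if last_modified:
        req.add_header("If-Modified-Since", last_modified)
    try:
        resp = urllib.request.urlopen(req)
    except urllib.error.HTTPError as err:
        if err.code == 304:  # our cached copy is still good, no payload
            return None, etag, last_modified
        raise
    # Remember the new validators for the next poll.
    return (resp.read(),
            resp.headers.get("ETag"),
            resp.headers.get("Last-Modified"))

body, etag, modified = fetch_feed("http://example.org/feed.atom")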

While there is a voluntary exclusion protocol (represented by the omnipresent robots.txt file), only actually “good” robots consider it, while evil or rogue robots can simply ignore it. Also, it might be counter-productive to block rogue robots even when they do look at it. Say a rogue robot wants your data and, to pass as a good one, advertises itself in the User-Agent string, complete with a link to a page explaining what it’s supposedly doing, and accepts the exclusion. If you exclude it in robots.txt, you give it enough information to choose a _different_ User-Agent string that is not listed in the exclusion protocol.

One way to deal with the problem is to block the requests at the source, answering straight away with an HTTP 403 (Access Denied) on the web server when the request is made. When using the Apache web server, the easiest way to do this is with modsecurity and a blacklist rule for rogue robots, similar to the antispam system I’ve been using for a few months already. The one problem I see with this is that Apache’s mod_rewrite seems to be executed _before_ mod_security, which means that for any request that is rewritten by compatibility rules (moved, renamed, …) there is first a 301 response and only after that the actual 403.

I’m currently working on compiling such a blacklist by analysing my server’s logs; the main problem is deciding which crawlers to block and which to keep. When the description page explicitly states they do marketing research, blocking them is quite straightforward; when they _seem_ to provide an actual search service, it’s more shady, and it comes down to checking the behaviour of the bot itself on the site. And then there are the vulnerability scanners.
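
As a starting point, this is the kind of analysis I mean, sketched in Python against Apache’s combined log format; the log path and the cut-off are of course made up:

import re
from collections import Counter

# In Apache's combined log format the User-Agent is the last quoted field.
AGENT = re.compile(r'"([^"]*)"\s*$')

hits = Counter()          # requests per User-Agent
robots_readers = set()    # agents that fetched robots.txt at least once

with open("/var/log/apache2/access_log") as log:
    for line in log:
        m = AGENT.search(line)
        if not m:
            continue
        agent = m.group(1)
        hits[agent] += 1
        if "GET /robots.txt" in line:
            robots_readers.add(agent)

# Heavy hitters that never looked at robots.txt are blacklist candidates.
for agent, count in hits.most_common(30):
    if agent not in robots_readers:
        print("%6d  %s" % (count, agent))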

Still, it doesn’t stop here: given that in the Google description of GoogleBot they provide a (quite longish, to be honest) method to verify that a bot actually is the GoogleBot it advertises itself to be, one has to assume there are rogue bots out there trying to pass for GoogleBot or another good and legit bot. This is very likely the case, because some websites that are usually visible only to registered users make an exception for search engine crawlers, so that they can access and index their content.
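
The method boils down to a reverse DNS lookup followed by a forward confirmation; a quick Python sketch of it, fine for forensic analysis on logs even if, as noted below, not something you can run inline in Apache:

import socket

# A bot claiming to be GoogleBot must reverse-resolve to a host under
# googlebot.com or google.com, and that host must resolve back to the
# very same IP address.
def is_real_googlebot(ip):
    try:
        host = socket.gethostbyaddr(ip)[0]
    except socket.herror:
        return False
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        return ip in socket.gethostbyname_ex(host)[2]
    except socket.gaierror:
        return False

print(is_real_googlebot("66.249.66.1"))  # an address taken from the logs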

Malware especially, looking for backdoors into a web application, is likely to forge the User-Agent of a known good search engine bot (one that is likely _not_ blocked by the robots.txt exclusion list), so that it doesn’t set off any alarm in the logs. So finding “fake” search engine bots is likely to be an important step in securing a webserver running web applications, be they trusted or not.

As far as I know, there is currently no way in Apache to check that a request actually comes from the bot it declares to come from. The nslookup method that Google suggests works fine for forensic analysis, but it’s almost impossible to perform properly within Apache itself, and not even modsecurity, by itself, can do much about it. On the other hand, there is one thing in the recent 2.5 versions of modsecurity that could probably be used to implement a working check: the loading of Lua scripts. Which is what I’m going to work on as soon as I find some extra free time.

Autotools Come Home

In our experience as Gentoo developers, Luca and I have dealt with a wide range of build systems; while there are obvious degrees of goodness/badness in the build system world, we prefer autotools over most custom build systems, and especially over cmake-based build systems, which seem to be riding high on the tide thanks to KDE over the last two years.

I have recently written up my views on build systems, in which I explain why I dislike CMake and why I don’t mind it when it replaces a very custom, bad build system. The one reason I gave for using CMake is the lack of support for the Microsoft Visual C++ compiler, which is needed by some types of projects under Windows (GCC still lacks way too many features); this is starting to become a moot point.

Indeed, if you look at the NEWS file for the latest release (unleashed yesterday), 1.11, there is this note:

– The `depcomp’ and `compile’ scripts now work with MSVC under MSYS.

This means that when running configure scripts under MSYS (which means having most of the POSIX/GNU tools available at the Windows terminal prompt), it’s possible to use the Microsoft compiler, thanks to the compile wrapper script. Of course this does not mean the features are on par with CMake yet, mostly because all the configure scripts I’ve seen up to now seem to expect GCC or compatible compilers, which means that more complex tests, and especially the macro archives, will need work before they can replace the Visual Studio project files. Also, since CMake has a fairly standard way to handle options and extra dependencies, it can offer a GUI to select them, where autotools are still tremendously fragmented in that regard.

Additionally, one of the most-recreated and probably useless features, the Linux-style quiet-but-not-entirely build output, is now implemented directly in automake through the silent-rules option. While I don’t see much point in calling that a killer feature, I’m sure there are people who are interested in seeing it.

While many people seem to think that autotools are dead and should disappear, there is actually fairly active development behind them, and the whole thing is likely to progress and improve over the next months. Maybe I should find the time to try making the compile wrapper script work with Borland’s compiler too, of which I have a license; it would be one feature that CMake is missing.

At any rate, I’ll probably extend my autotools guide for automake 1.11, together with a few extras, in the next few days. And maybe I can continue my Autotools Mythbuster series that I’ve been writing on my blog for a while.

Profiling Proficiency

Trying to assess the impact of the Ragel state machine generator on software, I’ve been trying to come up with a quick way to seriously benchmark some simple testcases, making sure that the results are as objective as possible. Unfortunately, I’m not a profiling expert; to keep with the alliteration in the post’s title, I’m not proficient with profilers.

Profiling can be done either internally to the software (manually or with GCC’s help), externally with profiling software, or in the kernel with a sampling profiler. Respectively, you’d find yourself using the rdtsc instruction (on x86/amd64), the gprof command, valgrind, and one of either sysprof or oprofile. It turns out that almost all these options don’t actually give you anything useful, and you have to decide whether to get theoretical data or practical timing; in both cases, you don’t end up with a very useful result.

Of that list of profiling software, the one that looked most promising was actually oprofile; it is especially interesting since the AMD Family 10h (Barcelona) CPUs supposedly have a way to report executed instructions accurately, which should combine the precise timing reported by oprofile with the invariant execution profile that valgrind provides. Unfortunately oprofile’s documentation is quite lacking, and I could find nothing to get that “IBS” feature working.

Since oprofile is a sampling profiler, a few important points should be noted: it requires kernel support (which in my case required a kernel rebuild, because I had profiling support disabled); it requires root to set up and start profiling, through a daemon process; and by default it profiles everything running on the system, which on a busy system running a tinderbox might actually be _too_ much. Support for oprofile in Gentoo also doesn’t seem to be perfect: for instance, there is no standardised way to start/stop/restart the daemon, which might not sound that bad to most people, but is actually useful because sometimes you forget the daemon is running and you try to start it time and time again, and it doesn’t work as you expect. Also, for some reason it stores its data in /var/lib instead of /var/cache; this wouldn’t be a problem, except that if you don’t pay enough attention you can easily see your /var/lib filesystem filling up (remember: it runs as root, so it bypasses the root-reserved space).

More worrisome, you won’t get proper profiling on Gentoo yet, at least on AMD64 systems. The problem is that all sampling profilers (so the same holds true for sysprof) require frame pointer information; the well-known -fomit-frame-pointer flag, which frees a precious register on x86 and used to break debug support, can become a problem, as Mart pointed out to me. The tricky issue is that, for a few GCC and GDB versions now, the frame pointer is no longer needed to get complete backtraces of processes being debugged; this means that, for instance on AMD64, the flag is now automatically enabled by -O2 and higher. On some architectures this is still not a problem for sample-based profiling, but on AMD64 it is. Now, the testcases I had to profile are quite simple and minimal and only call into the C library (which barely calls the kernel to read the input files), so I only needed the C library to be built with frame pointers to break down the functions; unfortunately this wasn’t as easy as I hoped.

The first problem is that the Gentoo ebuild for glibc does have a glibc-omitfp USE flag, but it does not seem to explicitly disable frame pointer omission; it only explicitly enables it. Changing that, thus, didn’t help; adding -fno-omit-frame-pointer also does not work properly, because flags are (somewhat obviously) filtered out; adding that flag to the whitelist in flag-o-matic does not help either. I have yet to see it working as it’s supposed to, so for now oprofile is just gathering dust.

My second choice was to use gprof together with GCC’s profiling support, but that only seems to provide a breakdown of execution time in percentages, which is also not what I was aiming for. The final option was to fall back to valgrind, in particular the callgrind tool. Unfortunately the output of that tool is not human-readable, and you usually end up using software like KCacheGrind to parse it; but since this system is KDE-free I couldn’t rely on that, and KCacheGrind does not provide an easy way to extract data from it to compile benchmark tables and tests anyway.

On the other hand, the callgrind_annotate script does provide the needed information, although still not in an exactly software-parsable fashion; indeed, to find the actual information I had to look at the main() calltrace and then select the function I’m interested in profiling, which gave me the instruction count for its execution. Unfortunately this does not really help me as much as I was hoping, since it tells me nothing about how much time the CPU will take to execute those instructions (which is related to the amount of cache and other things like that), but it’s at least something to start with.
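
Lacking anything ready-made, this is the sort of glue I’m sketching: run the testcase under callgrind and pull the instruction count (Ir) of a single function out of callgrind_annotate’s output. The parsing is naive and the binary/function names are placeholders:

import re
import subprocess

def instruction_count(binary, function):
    # Profile the testcase, writing the data to a predictable file name.
    subprocess.check_call(
        ["valgrind", "--tool=callgrind",
         "--callgrind-out-file=callgrind.out", binary])
    report = subprocess.check_output(
        ["callgrind_annotate", "callgrind.out"], text=True)
    # callgrind_annotate prints lines like "12,345,678  file.c:function".
    for line in report.splitlines():
        m = re.match(r"\s*([\d,]+)\s+(\S+)", line)
        if m and m.group(2).endswith(":" + function):
            return int(m.group(1).replace(",", ""))
    return None

print(instruction_count("./testcase", "main"))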

I’ll provide the complete script, benchmarks and hopefully results once I actually have a way to correlate them to real-life situations.