Speed up PyGTK and Cairo by reusing images
March 18th 2010

As you might have read in this blog, I have owned a Neo FreeRunner for a year now. I have used it far less than I should have, mostly because it’s a wonderful toy, but a lousy phone. The hardware is fine, although externally quite a bit less sexy than other smartphones such as the iPhone. The software, however, is not very mature. Because the platform is so open, several Linux-centric distros have been developed for it, but I haven’t been able to find one that turns the Neo into a phone for everyday use.

But let’s cut the rant, and stick to the issue: that the Neo is a nice playground for a computer geek. Following my desire to play, I installed Debian on it. Next, I decided to make some GUI programs for it, such as a screen locker. I found Zedlock, a program written in Python, using GTK+ and Cairo. Basically, Zedlock paints a lock on the screen, and refuses to disappear until you paint a big “Z” on the screen with your finger. Well, that’s what it’s supposed to do, because the 0.1 version available at the Openmoko wiki is not functional. However, with Zedlock I found just what I wanted: a piece of software capable of doing really cool graphical things on the screen of my Neo, while being simple enough for me to understand.

Using Zedlock as a base, I am starting to have real fun programming GUIs, but a problem has quickly arisen: their response is slow. My programs, like all GUIs, draw an image on the screen, and react to taps in certain places (that is, buttons) by doing things that require the image on the screen to be modified and repainted. This repainting, done as in Zedlock, is too slow. To speed things up, I googled the issue, and found a StackOverflow question that suggested the obvious route: caching the images. Let’s see how I did it, and how it turned out.

Material

You can download the three Python scripts, plus two sample PNGs, from: http://isilanes.org/pub/blog/pygtk/.

Version 0

You can download this program here. Its main loop follows:

C = Canvas()

# Main window:
C.win = gtk.Window()
C.win.set_default_size(C.width, C.height)

# Drawing area:
C.canvas = gtk.DrawingArea()
C.win.add(C.canvas)
C.canvas.connect('expose_event', C.expose_win)

C.regenerate_base()

# Repeat drawing of bg:
try:
  C.times = int(sys.argv[1])
except (IndexError, ValueError):
  C.times = 1

gobject.idle_add(C.regenerate_base)
C.win.show_all()

# Main loop:
gtk.main()

As you can see, it creates a GTK+ window (gtk.Window()) with a DrawingArea inside (gtk.DrawingArea()), and then schedules the regenerate_base() function to run every time the main loop is idle (gobject.idle_add()). Canvas is a class whose structure is not relevant to the discussion here; it basically holds all the variables and relevant functions. The regenerate_base() function follows:

def regenerate_base(self):

    # Base Cairo Destination surface:
    self.DestSurf = cairo.ImageSurface(cairo.FORMAT_ARGB32, self.width, self.height)
    self.target   = cairo.Context(self.DestSurf)

    # Background:
    if self.bg == 'bg1.png':
      self.bg = 'bg2.png'
    else:
      self.bg = 'bg1.png'

    self.i += 1

    image       = cairo.ImageSurface.create_from_png(self.bg)
    buffer_surf = cairo.ImageSurface(cairo.FORMAT_ARGB32, self.width, self.height)
    buffer      = cairo.Context(buffer_surf)
    buffer.set_source_surface(image, 0,0)
    buffer.paint()

    self.target.set_source_surface(buffer_surf, 0, 0)
    self.target.paint()

    # Redraw interface:
    self.win.queue_draw()

    if self.i > self.times:
      sys.exit()

    return True

As you can see, it paints the whole window with a PNG file (everything from create_from_png() to queue_draw()), alternating between bg1.png and bg2.png on each call (the if/else on self.bg). Since the repainting is done every time the main event loop is idle, images are painted to the screen as fast as possible. After a given number of repaints, the script exits.

You can run the code above by placing two suitable PNGs (480×640 pixels) in the same directory as the above code. If an integer argument is given to the script, it re-paints the window that many times, then exits (default, just once). You can time this script by executing, e.g.:

% /usr/bin/time -f %e ./p0.py 1000

Version 1

You can download this version here.

The first difference from Version 0 is that the regenerate_base() function has been split in two: the first part (generate_base()), which is executed only once at program startup (see below), and all the rest, which is executed every time the background is changed.

def generate_base(self):

    # Base Cairo Destination surface:
    self.DestSurf = cairo.ImageSurface(cairo.FORMAT_ARGB32, self.width, self.height)
    self.target   = cairo.Context(self.DestSurf)

The main difference, though, is that two new functions are introduced:

  def mk_iface(self):

    if not self.bg in self.buffers:
      self.buffers[self.bg] = self.generate_buffer(self.bg)

    self.target.set_source_surface(self.buffers[self.bg], 0, 0)
    self.target.paint()

  def generate_buffer(self, fn):

    image       = cairo.ImageSurface.create_from_png(fn)
    buffer_surf = cairo.ImageSurface(cairo.FORMAT_ARGB32, self.width, self.height)
    buffer      = cairo.Context(buffer_surf)
    buffer.set_source_surface(image, 0,0)
    buffer.paint()

    # Return buffer surface:
    return buffer_surf

The function mk_iface() is called within regenerate_base(), and draws the background. However, the actual generation of the background image (the Cairo surface) is done in the second function, generate_buffer(), and only happens once per background (i.e., twice in total), because mk_iface() reuses previously generated (and cached) surfaces.
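The caching idea in mk_iface() does not depend on Cairo at all, so it can be sketched in isolation. The following is a minimal illustration, not the code from the scripts: the SurfaceCache class and its loader argument are names made up for this example, and the loader stands in for generate_buffer().

```python
class SurfaceCache:
    """Generate each background once; return the cached object afterwards."""

    def __init__(self, loader):
        # loader is a hypothetical stand-in for generate_buffer():
        # it takes a file name and returns a ready-to-paint surface.
        self.loader = loader
        self.buffers = {}  # file name -> already generated surface

    def get(self, fn):
        # Only the first request for a given file pays the loading cost.
        if fn not in self.buffers:
            self.buffers[fn] = self.loader(fn)
        return self.buffers[fn]


# Demonstration with a fake loader that just records how often it is called.
calls = []

def fake_loader(fn):
    calls.append(fn)
    return "surface for " + fn

cache = SurfaceCache(fake_loader)
for i in range(1000):
    cache.get("bg1.png" if i % 2 == 0 else "bg2.png")

print(len(calls))  # the loader ran once per distinct background
```

With a real loader, the thousand alternating requests would trigger only two PNG decodes, which is the saving Version 1 is after.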

Version 2

You can download this version here.

The difference from Version 1 is that I eliminated some apparently redundant steps that created surfaces upon surfaces. As a result, the generate_base() function disappears again. I get rid of the DestSurf and target variables, so the mk_iface() and expose_win() functions end up as follows:

  def mk_iface(self):

    if not self.bg in self.buffers:
      self.buffers[self.bg] = self.generate_buffer(self.bg)

    buffer = self.canvas.window.cairo_create()
    buffer.set_source_surface(self.buffers[self.bg],0,0)
    buffer.paint()

  def expose_win(self, drawing_area, event):

    nm = 'bg1.png'

    if not nm in self.buffers:
      self.buffers[nm] = self.generate_buffer(nm)

    ctx = drawing_area.window.cairo_create()
    ctx.set_source_surface(self.buffers[nm], 0, 0)
    ctx.paint()

A side effect is that I can also get rid of the forced redraws done with self.win.queue_draw().

Results

I have run the three versions above, varying the C.times variable, i.e., making a varying number of repaints. The command used (actually inside a script) would be something like the one mentioned above:

% /usr/bin/time -f %e ./p0.py 1000
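For reference, a timing loop of this kind could look roughly like the sketch below. It is not the script I actually used: the function names are invented for the example, and it measures wall-clock time from Python instead of calling /usr/bin/time.

```python
import subprocess
import time


def time_command(cmd):
    """Return the wall-clock seconds taken by one run of cmd (a list of strings)."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def benchmark(script, repaint_counts):
    """Run script once per repaint count; map each count to its elapsed seconds."""
    return {n: time_command([script, str(n)]) for n in repaint_counts}
```

Calling benchmark("./p0.py", [1, 4, 16, 64, 256, 1024]) would then give one timing per repaint count for Version 0.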

The following tables summarize the results for Flanders and Maude (see my computers), a desktop P4 and my Neo FreeRunner, respectively. All times are in seconds.

Flanders
Repaints  Version 0  Version 1  Version 2
       1       0.26       0.43       0.33
       4       0.48       0.40       0.42
      16       0.99       0.43       0.40
      64       2.77       0.76       0.56
     256       9.09       1.75       1.15
    1024      37.03       6.26       3.44

Maude
Repaints  Version 0  Version 1  Version 2
       1       4.17       4.70       5.22
       4       8.16       6.35       6.41
      16      21.58      14.17      12.28
      64      75.14      44.43      35.76
     256     288.11     165.58     129.56
     512     561.78     336.58     254.73

The data in the tables above have been fitted to a linear equation of the form t = A + B n, where n is the number of repaints. In that equation, parameter A represents a startup time, whereas B represents the time taken by each repaint. The linear fits are quite good, and the values of the parameters are given in the following tables (units are milliseconds for A, and milliseconds/repaint for B):

Flanders
Parameter       Version 0  Version 1  Version 2
A (ms)                291        366        366
B (ms/repaint)         36          6          3

Maude
Parameter       Version 0  Version 1  Version 2
A (ms)                453       3218       4530
B (ms/repaint)       1092        648        487
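A fit of this kind can be reproduced with a few lines of plain Python. The sketch below applies an ordinary least-squares fit to the Flanders timings for Version 0; I am assuming plain least squares here, which may differ in detail from the tool used for the numbers above.

```python
def linear_fit(ns, ts):
    """Ordinary least-squares fit of t = A + B*n; returns (A, B)."""
    count = len(ns)
    mean_n = sum(ns) / count
    mean_t = sum(ts) / count
    sxx = sum((n - mean_n) ** 2 for n in ns)
    sxy = sum((n - mean_n) * (t - mean_t) for n, t in zip(ns, ts))
    slope = sxy / sxx
    return mean_t - slope * mean_n, slope


# Flanders, Version 0: number of repaints and total time in seconds.
repaints = [1, 4, 16, 64, 256, 1024]
seconds_v0 = [0.26, 0.48, 0.99, 2.77, 9.09, 37.03]

A, B = linear_fit(repaints, seconds_v0)
print("A = %.0f ms, B = %.1f ms/repaint" % (A * 1000, B * 1000))
```

The result should land close to the 291 ms and 36 ms/repaint listed for Flanders, Version 0.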

Darn it! I have mixed feelings about the results. On the desktop computer (Flanders), the gains are huge, but hardly noticeable. Caching the images (Version 1) makes for a 6x speedup, whereas Version 2 gives another twofold increase in speed (a total of 12x speedup!). However, from a user’s point of view, a 36 ms refresh is just as immediate as a 6 ms refresh.

On the other hand, on the Neo, the gains are less spectacular: the total gain in speed for Version 2 is a mere 2x. Anyway, half-a-second repaints instead of one-second ones are noticeable, so there’s that.

And at least I had fun and learned in the process! :^)

LWD – March 2010
March 4th 2010

This is a continuation post for my Linux World Domination project, started in this May 2008 post. You can read the previous post in the series here.

In the following data, T2D means “time to domination” (the expected time for Windows/Linux shares to cross, counting from the present date). DT2D means the difference (increase/decrease) in T2D with respect to the last report. CLP means “current Linux percent”, as given by the last logged data; DCLP means “difference in CLP” with respect to the previous logged data; and DD means “domination day” (in YYYY-MM-DD format).

Project         T2D              DT2D  DD              CLP    DCLP
Einstein        already crossed  -     September 2009  54.80  +3.45
MalariaControl  >10 years        -     -               12.12  +0.17
PrimeGrid       >10 years        -     -               11.78  +1.47
POEM            >10 years        -     -               11.52  +0.69
Rosetta         >10 years        -     -                8.61  +0.01
SETI            >10 years        -     -                8.12  +0.05
QMC             >10 years        -     -                8.11  -0.12
Spinhenge       >10 years        -     -                4.46  +0.09

The numbers (again) seem a bit discouraging, but the data is what it is. Now MalariaControl goes up (it went down in the previous report), but QMC goes slightly down. All the others go up. The Linux tide seems unstoppable, although its forward speed is not necessarily high.

As promised, today I’m showing the plots for Spinhenge@home. In the next issue, QMC@home.

[Figure: Number of hosts percent evolution for Spinhenge@home]

[Figure: Accumulated credit percent evolution for Spinhenge@home]

Avoiding time_increment_bits problem when encoding bad header MPEG4 videos to Ogg Theora
January 28th 2010

There is some debate going on lately about the migration of YouTube to HTML5, and whether they (i.e. YouTube’s owner, Google) should support H.264 or Theora as standard codecs for the upcoming <video> tag. See, for example, how the FSF asks for support for Theora.

The thing is, I discovered x264 not so long ago, and I thought it was a “free version” of H.264. I began using it to reencode the medium-to-low quality videos I keep (e.g., movies and series). The resulting quality/file size ratio stunned me. I could reencode most material downloaded from e.g. p2p sources to 2/3 of its size, keeping the copy indistinguishable from the original to the naked eye.

However, after realizing that x264 is just a free implementation of the proprietary H.264 codec, and in the wake of the H.264/Theora debate, I decided to give Ogg Theora a go. I expected a fair competitor to H.264, although still noticeably behind in quality/size ratio. And that is what I found. I for one do not care if I need a 10% larger file to attain the same quality, if it means using free formats, so I decided to adopt Theora for everyday reencoding.

After three paragraphs of introduction, let’s get to the point. Which is that reencoding some files with ffmpeg2theora I would get the following error:

% ffmpeg2theora -i example_video.avi -o output.ogg
[avi @ 0x22b7560]Something went wrong during header parsing, I will ignore it and try to continue anyway.
[NULL @ 0x22b87f0]hmm, seems the headers are not complete, trying to guess time_increment_bits
[NULL @ 0x22b87f0]my guess is 15 bits ;)
[NULL @ 0x22b87f0]looks like this file was encoded with (divx4/(old)xvid/opendivx) -> forcing low_delay flag
Input #0, avi, from 'example_video.avi':
  Metadata:
    Title           : example_video.avi
  Duration: 00:44:46.18, start: 0.000000, bitrate: 1093 kb/s
    Stream #0.0: Video: mpeg4, yuv420p, 624x464, 23.98 tbr, 23.98 tbn, 23.98 tbc
    Stream #0.1: Audio: mp3, 48000 Hz, 2 channels, s16, 32 kb/s
  [audio disabled].

[mpeg4 @ 0x22b87f0]hmm, seems the headers are not complete, trying to guess time_increment_bits
[mpeg4 @ 0x22b87f0]my guess is 16 bits ;)
[mpeg4 @ 0x22b87f0]hmm, seems the headers are not complete, trying to guess time_increment_bits
[mpeg4 @ 0x22b87f0]my guess is 16 bits ;)
[mpeg4 @ 0x22b87f0]looks like this file was encoded with (divx4/(old)xvid/opendivx) -> forcing low_delay flag
    Last message repeated 1 times
[mpeg4 @ 0x22b87f0]warning: first frame is no keyframe

I searched the web for solutions, but to no avail. Usually pasting literal errors into Google yields good results, but in this case I only found developer forums where this bug was discussed. What I couldn’t find were simple instructions on how to avoid it in practice.

Well, here goes my simple solution: pass the file through MEncoder first. Where the following fails:

% ffmpeg2theora -i input.avi -o output.ogg

the following succeeds:

% mencoder input.avi -ovc copy -oac copy -o filtered.avi
% ffmpeg2theora -i filtered.avi -o output.ogg

I guess that what happens is basically that mencoder takes the “raw” video data in input.avi and makes a copy into filtered.avi (which ends up being exactly the same video), building sane headers in the process.
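If you have many files to convert, the two steps can be wrapped in a small script. The following is only a sketch: the function names and the default temporary file name are made up for the example, and the command-line arguments are copied verbatim from the two commands above.

```python
import subprocess


def build_commands(src, dst, tmp="filtered.avi"):
    """Return the two command lines: remux with mencoder, then encode to Theora."""
    remux = ["mencoder", src, "-ovc", "copy", "-oac", "copy", "-o", tmp]
    encode = ["ffmpeg2theora", "-i", tmp, "-o", dst]
    return [remux, encode]


def reencode(src, dst, tmp="filtered.avi"):
    """Run both steps in order, stopping at the first one that fails."""
    for cmd in build_commands(src, dst, tmp):
        subprocess.run(cmd, check=True)


# Show what would be executed for one file, without running anything.
for cmd in build_commands("input.avi", "output.ogg"):
    print(" ".join(cmd))
```

Calling reencode("input.avi", "output.ogg") runs the pair for real, and requires both mencoder and ffmpeg2theora to be installed.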

Reverse SSH to thwart over-zealous firewalls
January 4th 2010

I guess it is not that uncommon, since it has happened to me twice, at two of the sites where I have worked. “Over-cautious” sysadmins decide that the University, Institute, Corporation, or whatever, would be safer if connections to the LAN from outside of it were banned, including those to port 22. In an effort to keep security from trampling service (how considerate!), the usual solution to allow remote connection is to use a VPN.

While a VPN might have some advantages over SSH, I prefer the latter by far, and don’t think a proper SSH setup is lacking in security, especially compared to poorly implemented VPNs. For example, I would never trust something as vital as VPN software to a private company, yet most popular VPNs are proprietary (at least the University of the Basque Country uses the Cisco VPN). It is at least paradoxical that a free and open SSH implementation such as OpenSSH, tested so thoroughly and for so long, is dumped, and a black-box solution developed by a profit-driven organization is used instead.

But I digress. I am not interested in justifying why I want SSH. What I want to show here is a trick I learned reading tuxradar.com. Essentially, it allows one to connect (with SSH) from machine A to machine B, even if machine B has all its ports closed (so SSH-ing through another port would be useless too).

The idea (see below) is to connect from machine B to A, which is allowed (and is also the exact reverse of what we actually want to do), in a way that opens a channel for a “reverse” connection from A to B:

(In machine_B)
% ssh -R 1234:localhost:22 username_in_A@machine_A

Then we will be able to use port 1234 (or whatever port we specify in the ssh -R command above) in machine A to connect to machine B, as long as the original ssh -R holds:

(In machine_A)
% ssh username_in_B@localhost -p 1234

The picture shows it better:

SSHing from A to B (dashed red arrow) is disallowed, but the reverse (in black) is not. The ssh -R command line (see code above), opens up the link between ports 22 and 1234 (two-headed black arrow), so that a ssh -p to port 1234 in machine A will redirect us to machine B. If we are asked for a password (at the ssh -p stage), they are requesting the one for machine B, since we are being redirected to machine B.
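To make the relationship between the two commands explicit, here is a small sketch that builds both of them from the same port number. The function names are invented for this example; the ssh options are exactly the ones used above (-R on machine B, -p on machine A).

```python
def tunnel_command(user_a, host_a, forwarded_port=1234, ssh_port=22):
    """Command to run on machine B: forward a port on A back to B's SSH port."""
    forward = "%d:localhost:%d" % (forwarded_port, ssh_port)
    return ["ssh", "-R", forward, "%s@%s" % (user_a, host_a)]


def connect_back_command(user_b, forwarded_port=1234):
    """Command to run on machine A: log in to B through the forwarded port."""
    return ["ssh", "%s@localhost" % user_b, "-p", str(forwarded_port)]


print("(In machine_B) " + " ".join(tunnel_command("username_in_A", "machine_A")))
print("(In machine_A) " + " ".join(connect_back_command("username_in_B")))
```

The only thing the two sides need to agree on is the forwarded port (1234 here); everything else is an ordinary SSH login.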

Please note that the above recipe is no less secure than a regular SSH from A to B (if it were allowed): anyone SSHing to port 1234 in machine A will be automatically redirected to machine B, but will undergo the same security checks as usual (password, public/private key…). Note also that I am talking about what is possible, not necessarily desirable or comfortable. It’s just another tool if you want to use it.

Hardware compatibility is better with Windows… not
January 3rd 2010

One of the (few, but legitimate) reasons given by some Windows users to not switch to Linux is that many pieces of hardware are not recognized by the latter. Sure enough, 99.9%, if not all, of the devices sold in shops are “Windows compatible”. The manufacturers of devices make damn sure their device, be it a pendrive or a printer, a computer screen or a keyboard, will work on any PC running Windows. They will even ship a CD with the drivers in the same package, so that installation of the device is as smooth as possible in Microsoft’s platform. Linux compatibility? Well, they usually just don’t care. Those hackers will make it work anyway, so why bother? And their market share is too small to take them into account.

Now, let’s move on to some personal experience with a webcam. I bought a webcam for my girlfriend’s laptop, which doesn’t have one integrated. The webcam was a cheap Logitech USB one, with “Designed for Skype” and “Windows compatible” written all around on the box. It even came with a CD, marked prominently as “Windows drivers”. My girlfriend’s laptop runs Windows Vista, and I decided to give it a chance, and plugged in the webcam without further consideration. A message from our beloved OS informed me that a new device had been plugged in (brilliant!) but Windows lacked the necessary drivers to make it work (bummer!). OK, no problem. We had the drivers, right? I unplugged the camera, inserted the CD, and followed the instructions to get the drivers installed. Everything went fine, except that the progress bar with the installation percent went on for more than 12 minutes (checked on the watch) before reaching 100%. After installation, Windows informed me that a system reboot was necessary, so I rebooted. After the reboot, the camera worked.

As I had my Asus Eee at hand, I decided to try the webcam on it. I plugged it in, and nothing happened. I just saw the green light on the camera turn on. Well, maybe it worked… I opened Cheese, a Linux program to show the output of webcams. I was a bit wary, because the Eee has an integrated webcam, so maybe there would be some interference or something. Not so. Cheese immediately showed me the output of the webcam I had just plugged in, and offered me a menu with two entries (USB webcam and integrated one), so I could choose. That’s it. No CD with drivers, no 12-minute installation, no reboot, no nothing. Just plug and play.

Perhaps it is worth mentioning that the next time I tried to use the webcam on the Vista laptop, it would ask me for driver installation again! I don’t know why… I must have done something wrong in the first installation… With Windows, who knows?

ChopZip: a parallel implementation of arbitrary compression algorithms
December 20th 2009

Remember plzma.py? I made a wrapper script for running LZMA in parallel. The script could be readily generalized to use any compression algorithm, following the principle of breaking the file in parts (one per CPU), compressing the parts, then tarring them together. In other words, chop the file, zip the parts. Hence the name of the program that evolved from plzma.py: ChopZip.

Introduction

Currently ChopZip supports lzma, xz, gzip and lzip. Of them, lzip deserves a brief comment. It was brought to my attention by a reader of this blog. It is based on the LZMA algorithm, as are lzma and xz. Apparently unlike them, multiple files compressed with lzip can be concatenated to form a single valid lzip-compressed file. Uncompressing the latter generates a concatenation of the original files.

To illustrate the point, check the following shell action:

% echo hello > head
% echo bye > tail
% lzip head
% lzip tail
% cat head.lz tail.lz > all.lz
% lzip -d all.lz
% cat all
hello
bye

However, I just discovered that gzip, bzip2 and xz all do that already! It seems that lzma is advertised as capable of doing it, but it doesn’t work for me. Sometimes it will uncompress the concatenated file to the original file just fine, other times it will decompress it to just the first chunk of the set, and yet other times it will complain that the “data is corrupt” and refuse to uncompress. For that reason, chopzip has two working modes: simple concatenation (gzip, lzip, xz) and tarring (lzma). The relevant mode is chosen transparently for the user.
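The same concatenation property can be checked from Python, whose standard library ships gzip, bz2 and lzma modules (the latter writes the xz format by default). Note that this sketch tests the Python modules, not the command-line tools, and that lzip is left out because the standard library has no module for it.

```python
import bz2
import gzip
import lzma


def roundtrip_concatenated(compress, decompress, parts):
    """Compress each part separately, concatenate the results, decompress the lot."""
    blob = b"".join(compress(part) for part in parts)
    return decompress(blob)


parts = [b"hello\n", b"bye\n"]
codecs = {
    "gzip": (gzip.compress, gzip.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
    "xz": (lzma.compress, lzma.decompress),
}

for name, (compress, decompress) in codecs.items():
    recovered = roundtrip_concatenated(compress, decompress, parts)
    print(name, recovered == b"".join(parts))
```

In each case the decompressed output is the concatenation of the original parts, just as in the lzip shell session above.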

Also, if you use Ubuntu, this bug will apply to you, making it impossible to have xz-utils, lzma and lzip installed at the same time.

The really nice thing about concatenability is that it allows for trivial parallelization of the compression, while maintaining compatibility with the serial compression tool, which can still uncompress the product of a parallel compression. Unfortunately, for non-concatenable compression formats, the output of chopzip will be a tar file of the compressed chunks, making it impossible to uncompress with the original compressor alone (first an untar would be needed, then uncompressing, then concatenation of the chunks; or just use chopzip to decompress).
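The “chop, then zip” principle for a concatenable format fits in a few lines. The sketch below is not ChopZip itself: it works on data in memory, uses Python’s gzip module and a thread pool instead of external compressors, and the function names are made up for the example.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def chop(data, n_chunks):
    """Split data into at most n_chunks pieces of (nearly) equal size."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]


def chop_zip(data, n_chunks=4):
    """Compress every chunk in its own thread and concatenate the results."""
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        return b"".join(pool.map(gzip.compress, chop(data, n_chunks)))


original = b"some sample data " * 100000
compressed = chop_zip(original, 4)

# The plain serial decompressor can read the parallel output.
print(gzip.decompress(compressed) == original)
```

Because gzip members can be concatenated, the ordinary decompressor needs to know nothing about how the file was produced.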

The rationale behind plzma/chopzip is simple: multi-core computers are commonplace nowadays, but still the most common compression programs do not take advantage of this fact. At least the ones that I know and use don’t. There are at least two initiatives that tackle the issue, but I still think ChopZip has a niche to exploit. The most consolidated one is pbzip2 (which I mention in my plzma post). pbzip2 is great, if you want to use bzip2. It scales really nicely (almost linearly), and pbzipped files are valid bzip2 files. The main drawback is that it uses bzip2 as its compression method. bzip2 has always been the “extreme” brother of gzip: it compresses more, but it’s so slow that you would only resort to it if compressed size is vital. LZMA-based programs (lzma, xz, lzip) are both faster and compress more, so for me bzip2 is out of the equation.

A second contender in parallel compression is pxz. As its name suggests, it compresses using xz. Drawbacks? It’s not in the official repositories yet, and I couldn’t manage to compile it, even though it consists of a single C file and a Makefile. It also lacks the ability to use different encoders (which is not necessarily bad), and it’s a compiled program, versus chopzip, which is a much more portable script.

Scalability benchmark

Anyway, let’s get into chopzip. I have run a simple test with a moderately large file (a 374MB tar file of the whole /usr/bin dir). A table follows with the speedup results for running chopzip on that file, using various numbers of chunks (and consequently, threads). The tests were conducted on an Intel Core 2 Quad Q8200 computer with 4GB of RAM. Speedups are calculated as how many times faster a run with a given number of chunks was than a run with just 1 chunk. It is noteworthy that in every case running chopzip with a single chunk is virtually identical in performance to running the original compressor directly. Also, decompression times (not shown) were identical, irrespective of the number of chunks. The ChopZip version was r18.

#chunks     xz   gzip   lzma   lzip
      1  1.000  1.000  1.000  1.000
      2  1.862  1.771  1.907  1.906
      4  3.265  1.910  3.262  3.430
      8  3.321  1.680  3.247  3.373
     16  3.248  1.764  3.312  3.451

Note how increasing the number of chunks beyond the amount of actual cores (4 in this case) can have a small benefit. This happens because N equal chunks of a file will not be compressed with equal speed, so the more chunks, the smaller overall effect of the slowest-compressing chunks.
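A toy model shows the effect. The sketch below is a deliberate simplification, not a description of how ChopZip schedules its threads: it assumes a fixed pool of 4 workers, each picking up the next chunk as soon as it is free, and a made-up file in which the first sixteenth is four times as expensive to compress as the rest.

```python
import heapq


def makespan(unit_costs, n_chunks, n_workers=4):
    """Total time when equal-sized chunks are handed, in order, to the first free worker.

    unit_costs gives the (hypothetical) compression cost of each equal slice
    of the file; len(unit_costs) must be divisible by n_chunks.
    """
    size = len(unit_costs) // n_chunks
    chunk_costs = [sum(unit_costs[i:i + size])
                   for i in range(0, len(unit_costs), size)]
    workers = [0] * n_workers  # time at which each worker becomes free
    heapq.heapify(workers)
    for cost in chunk_costs:
        free_at = heapq.heappop(workers)
        heapq.heappush(workers, free_at + cost)
    return max(workers)


# Made-up cost profile: one expensive slice followed by fifteen cheap ones.
costs = [4] + [1] * 15

for n in (1, 2, 4, 8, 16):
    print(n, makespan(costs, n))
```

In this model, going from 4 to 16 chunks still shortens the total time, because the expensive slice is no longer bundled with cheap ones in one big chunk that everybody else has to wait for.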

Conclusion

ChopZip quite noticeably speeds up the compression of arbitrary files, with arbitrary compressors. In the case of concatenable compressors (see above), the resulting compressed file is an ordinary compressed file, which can be decompressed with the regular compressor (xz, lzip, gzip), as well as with ChopZip. This makes ChopZip a valid alternative to them, with the advantage of parallelization.

LWD – December 2009
December 3rd 2009

This is a continuation post for my Linux World Domination project, started in this May 2008 post. You can read the previous post in the series here.

In the following data, T2D means “time to domination” (the expected time for Windows/Linux shares to cross, counting from the present date). DT2D means the difference (increase/decrease) in T2D with respect to the last report. CLP means “current Linux percent”, as given by the last logged data; DCLP means “difference in CLP” with respect to the previous logged data; and DD means “domination day” (in YYYY-MM-DD format). I have dropped the “Confidence” column, since it gave little or no information.

Project         T2D              DT2D  DD              CLP    DCLP
Einstein        already crossed  -     September 2009  51.35  +4.24
MalariaControl  >10 years        -     -               11.95  -0.32
POEM            83.4 months      -     2016-10-08      11.52  +0.69
PrimeGrid       >10 years        -     -               10.31  +0.46
Rosetta         >10 years        -     -                8.60  +0.10
QMC             >10 years        -     -                8.23  +0.15
SETI            >10 years        -     -                8.07  +0.05
Spinhenge       >10 years        -     -                4.37  +0.15

Except for the good news that Einstein@home has succumbed to the Linux hordes, the numbers (again) seem quite discouraging, but the data is what it is. All CLPs have gone up except that of MalariaControl (which goes down, although less than in the previous report). The Linux tide seems unstoppable, although its forward speed is not necessarily high.

As promised, today I’m showing the plots for Rosetta@home. In the next issue, Spinhenge@home.

[Figure: Number of hosts percent evolution for Rosetta@home]

[Figure: Accumulated credit percent evolution for Rosetta@home]

First impressions with Arch Linux
October 9th 2009

I have been considering for some time trying a Linux distro that would be a little faster than Ubuntu. I made the switch from Debian to Ubuntu some time ago, and I must say that I am very pleased with it, despite it being a bit bloated and slow. Ubuntu is really user-friendly. This term is often despised among geeks, but it does have a huge value. Oftentimes a distro will disguise poor dependency-handling, lack of package tuning and absence of wise defaults as not having “fallen” for user-friendliness and “allowing the user to do whatever she feels like”.

However comfortable Ubuntu might be, my inner geek wanted to get his hands a little bit dirtier with configurations, and obtain a more responsive OS in return. And that’s where Arch Linux fits in. Arch Linux is regarded as one of the fastest Linux distros, at least among the ones based on binary packages, not source code. Is this fame deserved? Well, in my short experience, it seems to be.

First off, let us clarify what one means by a “faster” Linux distro. There are as I see it, broadly speaking, three things that can be faster or slower in the users’ interaction with a computer. The first one, and a very often cited one, is the boot (and shutdown) time. Any period of time between a user deciding to use the computer and being able to do so is wasted time (from the user’s point of view). Many computers stay on for long periods of time, but for a home user, short booting times are a must. A second speed-related item would be the startup time of applications. Booting would be a sub-section of this, if we consider the OS/kernel as an “app”, but I refer here to user apps such as an e-mail client or text editor. Granted, most start within seconds at most, many below one second or apparently “instantly”, but some others are renowned for their sluggishness (OpenOffice.org, Firefox and Amarok come to mind). Even the not-very-slow apps that take a few seconds can become irritating if used with some frequency. The third speed-related item would be the execution of long-running CPU-intensive software, such as audio/video coding or scientific computation.

Of the three issues mentioned, it should be made clear that the third one (execution of CPU-intensive tasks) is seldom affected at all by the “speed” of the OS. Or it shouldn’t be. Of course having the latest versions of the libraries used by the CPU-intensive software should make a difference, but I doubt that encoding a video with MEncoder is any faster in Gentoo than Ubuntu (for the same version of mencoder and libraries). However, the first two (booting and start up of apps) are different from OS to OS.

Booting

I did some timings in Ubuntu and Arch, both in the same (dual boot) machine. I measured the time from GRUB to GDM, and then the time from GDM to a working desktop environment (GNOME in both). The exact data might not be that meaningful, as some details could be different from one installation to the other (different choice of firewall, or (minimally) different autostarted apps in the DE). But the big numbers are significant: where Ubuntu takes slightly below 1 minute to GDM, and around half a minute to GNOME, Arch takes below 20 seconds and 10 seconds, respectively.

App start up

Of the three applications mentioned, OpenOffice.org and Firefox start faster in Arch than in Ubuntu. I wrote down the numbers, but don’t have them now. Amarok, on the other hand, took equally long to start (an infamous 35 seconds) in both OSs. It is worth mentioning that all of them start up faster the second and successive times, and that the Ubuntu/Arch differences between second starts are correspondingly smaller (because both are fast). Still, Arch is a bit faster (except for Amarok).

ABS, or custom compilation

But the benefits of Arch don’t end with a faster boot, or a more responsive desktop (which it has). Arch Linux makes it really easy to compile and install any custom package the user wants, and I decided to take advantage of it. With Debian/Ubuntu, you can download the source code of a package quite easily, but the compilation is more or less left to you, and the installation is different from that of an “official” package. With Arch, generating a package from the source is quite easy, and then installing it with Pacman is trivial. For more info, refer to the Arch wiki entry for ABS.

I first compiled MEncoder (inside the mplayer package), and found out that the compiled version made no difference with respect to the stock binary package. I should have known that, because I say so in this very post, don’t I? However, one always thinks that he can compile a package “better”, so I tried it (and failed to get any improvement).

On the other hand, when I recompiled Amarok, I did get a huge boost in speed. A simple custom compilation produced an Amarok that took only 15 seconds to start up, less than half of the vanilla binary distributed with Arch (I measured the 15 seconds even after rebooting, which rules out any “second time is faster” effect).

Is it hard to use?

Leaving the speed issue aside, one of the possible drawbacks of a geekier Linux distro is that it could be harder to use. Arch is, indeed, but not much. A seasoned Linux user should hardly have any difficulty installing and configuring Arch. It is certainly not for beginners, but it is not super-hard either.

One of the few gripes I have with it regards the installation of a graphical environment. As it turns out, installing a DE such as GNOME does not trigger the installation of any X Window System, such as X.org Server, as dependencies are set only for really vital things. Well, that’s not too bad, Arch is not assuming I want something until I tell it I do. Fine. Then, when I do install Xorg, the tools for configuring it are a bit lacking. I am so spoiled by the automagic configurations in Ubuntu, where you are presented with a full-fledged desktop with almost no decision on your side, that I miss a magic script that will make X “just work”. Anyway, I can live with that. But something that made me feel like giving up was that, after following all the instructions in the really nice Arch Wiki, I was unable to start X (it would start as a black screen, then freeze, and I could only get out by rebooting the computer). The problem was that I have an Nvidia graphics card, and I needed the (proprietary) drivers. OK, of course I need them, but the default vesa driver should work as well!! In Ubuntu one can get a lower-resolution desktop without 3D effects with the default vesa driver. Then the proprietary Nvidia drivers allow for more eye-candy and fanciness. But not in Arch. When I decided to skip the test with vesa, and download the proprietary drivers, the X server started without any problem.

Conclusions

I am quite happy with Arch so far. Yes, one has to work around some rough edges, but it is a nice experience as well, because one learns more than with other, overly user-friendly distros. I think Arch Linux is a very nice distro that is worth using, and I recommend it to any Linux user willing to learn and “get their hands dirty”.

No market for Linux games? The Koonsolo case
September 19th 2009

I’ve read via Phoronix about the case of the indie PC game producer Koonsolo, which sells a game for Windows, Mac and Linux. The interesting thing is that, as you can read on Koonsolo’s blog, the Linux version is being sold in larger numbers than the Windows one!

Apparently 40% of the visitors to the Koonsolo site use Windows, vs less than 23% for Linux. However, despite most visitors using Windows (there are even more Mac visitors than Linux ones), the Linux version accounts for 34% of the total sales, whereas Windows sales are only 23%. Visit the site for some more numbers and comments.

LWD – September 2009
September 2nd 2009

This is a continuation post for my Linux World Domination project, started in this May 2008 post. You can read the previous post in the series here.

In the following data, T2D means “time to domination” (the expected time for Windows/Linux shares to cross, counting from the present date). DT2D means the difference (increase/decrease) in T2D with respect to the last report. CLP means “current Linux percent”, as given by the last logged data (the figure in parentheses being the change with respect to the previous data), and DD means “domination day” (in YYYY-MM-DD format).

Project         T2D        DT2D      DD          CLP            Confidence %
Einstein        38.6 days  -55 days  2009-10-10  47.11 (+2.60)  16.1
MalariaControl  >10 years  -         -           12.27 (-0.37)  -
POEM            >10 years  -         -           10.83 (+0.17)  -
PrimeGrid       >10 years  -         -            9.85 (+0.24)  -
Rosetta         >10 years  -         -            8.50 (+0.13)  -
QMC             >10 years  -         -            8.07 (+0.15)  -
SETI            >10 years  -         -            8.02 (+0.02)  -
Spinhenge       >10 years  -         -            4.22 (+0.37)  -

The numbers (again) seem quite discouraging, but the data is what it is. All CLPs but that of MalariaControl have gone up, with Spinhenge going up by almost 0.4% in 3 months. The Linux tide seems unstoppable, although its forward speed is not necessarily high.

As promised, today I’m showing the plots for QMC@home. In the next issue, Rosetta@home.

[Figure: Number of hosts percent evolution for QMC@home]

[Figure: Accumulated credit percent evolution for QMC@home]
