My public and open University

As the readers of this blog may know, I recently became a Doctor at the University of the Basque Country.

As a follow-up to the Thesis Defense, there is still some paperwork to be done, e.g. filling in a form called “Teseo”. Anyway, what I comment on here applies to all the paperwork I did before, during and after the Thesis Defense.

The thing is that this freaking “Teseo” form is available online as an RTF or a PDF. An original, hand-filled copy must be sent to the University, so I used a printout of the PDF for that. No problem.

The problem came when the University requested that an “electronic” version be sent by e-mail (for which a scanned copy of my hand-filled form would not do). These bright minds surely wanted me to fill in the RTF and send it. However, not everyone who wants to get a Ph.D. has accepted (or wants to accept) an expensive and abusive license for a proprietary product like MS Windows or MS Office. I certainly haven’t, so I had to make do with GNU/Linux and OpenOffice to fill in the RTF. The result was crappy, due to incompatibilities with the friggin’ proprietary RTF format… but that is what I sent.

Now the point is: does the public University of the Basque Country (or the public Spanish Ministry of Education) have any reason to discriminate in favor of the private and foreign company Microsoft? Do we, the taxpayers who pay their salaries, have to put up with being forced to use specific proprietary formats to communicate with public institutions? It disgusts me to no end.

Picture the following example: I want to attend the University, and they tell me that I have to wear shoes for that (e.g., use a computer). OK, this might be more or less arbitrary, but I can accept it. Now, imagine that they ask me to wear Gucci shoes (e.g., a proprietary file format, such as RTF). That would be unacceptable, because a public institution cannot favor a private company that way, at least not when there is any conceivable substitute (e.g. acceptable shoes of any other brand). And it doesn’t matter if they require some cheaper shoe brand instead of Gucci. The problem is not whether it is expensive, but that they are discriminating against other options. And they have no right to do so. They are there to serve us, not the other way around.

Someone could say that they have to use some electronic format, and that any choice would be equally arbitrary. No, not at all. There is something called “open standards”, to which “things” (e.g. electronic document formats) can adhere. One such standards body is the ISO, and one document format standardized by it (as ISO/IEC 26300) is the OpenDocument Format (ODF), so they can use that.

The basics are simple: readers and editors for open formats can be made by anyone, freely. No one can force me to pay royalties so that they allow me to make a program that reads these documents. With proprietary formats (such as DOC, RTF and others), the owner of the format (e.g. Microsoft) can ban anyone from making a program that writes documents in that format, or charge royalties as they please. Put bluntly: since the exchange of documents in my University depends on proprietary formats (RTF and DOC), Microsoft could decide tomorrow to disrupt its operations by denying further licenses for, e.g., MS Office. Of course, this will not happen, because the University will pay whatever is requested. I call this extortion, because the University cannot afford not to pay, so where do the “free competition” and “open market” ideas fit in here? Moreover, I call the University a bunch of fools, because they have put themselves in a position where they can be extorted. None of this is possible if one uses open formats, because free (not free of cost, but free as in freedom) document editors are, and will always be, available.

Comments

Parsing command line arguments

In UNIX-like environments, such as GNU/Linux, the command line is often used to operate on a bunch of files, for example:

rm -f *.dat

In the command above, “*.dat” is expanded by the shell (the command interpreter) to all matching files in the current directory (e.g. “file1.dat file2.dat dont_delete_me.dat this_file_is_rubbish.dat“). However, this expansion is performed as a first step, and only then is the expanded command line executed, e.g.:

rm -f file1.dat file2.dat dont_delete_me.dat this_file_is_rubbish.dat

This behaviour can fail if a lot of files match *.dat, because there is an upper limit on how long a command line can be (brutally high, but finite). This can happen, for example, if you try to delete the contents of a directory with 100,000 files using rm -f * (yes, this does happen). A plain ls in a directory with 100,000 files works fine (no glob expansion is involved), but rm * does not:

Bart[~/kk]: rm *
/bin/rm: Argument list too long.
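
As an aside, on Linux the maximum length can usually be queried with getconf (a quick check; the exact value, and how the environment counts against it, varies between systems):

getconf ARG_MAX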

To avoid this problem, we can make use of xargs and echo (echo is a shell builtin, so it is not subject to the kernel’s argument-length limit), in the following way:

echo * | xargs rm -f

Now xargs takes care of the argument list, splitting it into chunks so that each invocation of rm gets a command line of acceptable length.
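
Note that piping the output of echo assumes the file names contain no spaces or other funny characters. A more robust variant (just a sketch, and it assumes GNU find and xargs, which support null-separated lists) would be:

find . -maxdepth 1 -type f -name '*.dat' -print0 | xargs -0 rm -f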

xargs can also be put to other uses. Consider the following example: we want to convert some OGG files to MP2 (I won’t be caught dead near an MP3, due to its patent issues), so that a Windows luser can listen to them. We can use oggdec to convert OGG to WAV, then toolame to convert WAV to MP2. Now, oggdec accepts multiple input files as arguments, so the following works:

oggdec *.ogg

The above generates a .wav for each .ogg in the argument list. However, toolame does not work like that; it expects a single argument, or, if it finds two, the second one is interpreted as the desired output file, so the following fails (too many arguments):

toolame *.wav

This is where xargs can come in handy, with its -n option (see the xargs man page). This option tells xargs how many arguments to process at the same time. If we want each single argument to be processed separately, the following will do the trick:

echo *.wav | xargs -n 1 toolame

In the above example, toolame is called once per WAV file, and sees a single WAV file as argument each time, so it generates a .mp2 file per .wav file, as we wanted.
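
As a side note, GNU xargs can also run several of these conversions at once with the -P option (a sketch; -P is a GNU extension, and 4 is just an arbitrary number of parallel jobs):

echo *.wav | xargs -n 1 -P 4 toolame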

Comments

My music collection hits 5000

OK, OK, having 5,000 songs is not that much. I’ve heard of people with collections in the 15,000s and over. The eMule freaks, downloading 24/7 without ever actually listening to anything, can easily have collections in the 50,000s.

However, I have never downloaded a copyrighted song from eMule, even though my collection does not entirely consist of original CDs. I actually downloaded a lot of albums from the Internet… namely Creative Commons music from Jamendo.

Some info about my collection (statistics taken from the Amarok player, which I have been using since around June 2006):

Total songs:        5015
  - Commercial:     3944
  - Jamendo:        1055
  - Other CC:       31
Total playing time: 1 week and 6 days
Total file size:    22GB
Song playcount:     10910
Different artists:  627
Different albums:   413
In MP3:             1562
In OGG:             3468

There is some mismatch (5030 songs counted from the actual files (all the OGG and MP3 files I have in a given directory) vs. 5015 songs counted from the Amarok collection), but I see no easy way of filtering out the 15 rogue files (update: I did, some time after writing the post).
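
For reference, the file-based count can be obtained with something like the following (a sketch; ~/music is a hypothetical placeholder for whatever directory holds the collection):

find ~/music -type f \( -iname '*.ogg' -o -iname '*.mp3' \) | wc -l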

Comments (5)

Gospel in Donostia

Yesterday I attended a gospel concert with my friends, at the Kursaal Palace.

The choir was the London Community Gospel Choir, and a really fine ensemble it was. We Basques are well known for our coldness, and for how shy and quiet we are. However, the LCGC, under the command of its energetic director, managed to make us not only clap our hands, but also sing along, and even stand up and dance! My hands hurt from so much clapping, and time passed in a flash.

I never expected to be carried away by such music (I have little religious feeling, and little connection to black culture and music), but they managed it.

If they sing near you… go see them.

Comments

Little Miss Sunshine

This Wednesday I went to the cinema with my workmate Julen, and we watched “Little Miss Sunshine” (IMDb|FilmAffinity).

It is a hell of a good movie, humble and simple, but with a very good plot, and very good acting. Drama and comedy are intermixed, with the latter being more prominent. One does not spend the whole movie laughing, but when the moments come, one laughs out loud.

I wholeheartedly recommend this film to anyone not feeling that a movie needs lasers and explosions to be any good.

Comments

I’m a Doctor!

Let the world know: I defended my Ph.D. Thesis last Tuesday, and I passed! From now on, people may refer to me as “Dr. Silanes” :^)

The title of the thesis was:

Ethylene Polymerization by Group IV Transition Metal Metallocenes

The presentation can be freely downloaded as a 3.2MB PDF. It was done using the PowerDot class for LaTeX. The PowerDot style used was designed by me (from the default style by H. Adriaens and C. Ellison), and is available as a .sty file.

The Thesis book itself can be downloaded under a Creative Commons license as a 6.1MB PDF (the PDF I link here is a very slightly modified version, with errata corrected).

Some pics to illustrate the defense day:

Giving explanations.

Good vibrations during the Tribunal’s round of questions.

Serious, waiting for the grade.

Videos:

Comments (1)

TeX capacity exceeded error

I am definitely dumb. Well, LaTeX has its part in it, too.

It turns out that all of a sudden, I started having this error when compiling a .tex file:

! TeX capacity exceeded, sorry [input stack size=1500].

After googling for an answer, I found out that the “stack size” limit is defined in the following file:

/usr/share/texmf/web2c/texmf.cnf

However, changing the value did no good: any limit, no matter how large, would be “exceeded”. The usual reason (found after a little more banging my head against the wall) is an infinite loop in the input .tex (for example, \input{file.tex} inside file.tex itself, or some such). Ten hours (well, five minutes, actually) of head-banging later, when I was pretty sure there was no freaking infinite loop, I found the answer:

I had deleted the \end{document} tag!!

Now, yes, how stupid am I? And… how stupid is LaTeX to give that silly error, instead of:

TeX warning: You are too dumb, and forgot an \end{document}
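
For the record, the recursive-\input scenario mentioned above is easy to reproduce on purpose. A minimal sketch (with hypothetical file names) that blows the input stack:

% main.tex
\documentclass{article}
\begin{document}
\input{chapter}
\end{document}

% chapter.tex
Some text.
\input{chapter}  % the file keeps re-reading itself, one input level per pass, until the stack (1500 levels) overflows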

Comments (43)

Custom style in PowerDot

Remember I mentioned PowerDot for LaTeX? PowerDot is a LaTeX class for producing PowerPoint-like presentations. It creates PDFs that can be viewed full-screen with any PDF reader, and they turn out to be very nice-looking presentations.

I am now fiddling with it, and wanted to make a custom style. I read the PowerDot manual [PDF], and it says all you have to do is copy and rename an existing style, then modify it:

% cd /usr/share/texmf-texlive/tex/latex/powerdot/
% cp powerdot-default.sty powerdot-isilanes.sty
% vi powerdot-isilanes.sty

Then, put style=isilanes in your .tex, et voilà! Well, it fails miserably, saying (among the usual garbage):

! Class powerdot Error: unknown style `isilanes'.

But the .sty is there!

OK, the problem is that LaTeX “doesn’t know” you added the style. To let it know, on my Debian Etch box:

% dpkg-reconfigure tetex-base

or, much better (thanks to a comment by bjacquem):

% texhash

This “refreshes” the filename database (ls-R) that LaTeX uses to locate files, and now it works.
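
An alternative that avoids touching the system-wide tree (a sketch; it assumes your TeX distribution searches a per-user ~/texmf tree, as teTeX and TeX Live normally do) is to keep the custom style in your home directory:

% mkdir -p ~/texmf/tex/latex/powerdot
% cp /usr/share/texmf-texlive/tex/latex/powerdot/powerdot-default.sty ~/texmf/tex/latex/powerdot/powerdot-isilanes.sty
% vi ~/texmf/tex/latex/powerdot/powerdot-isilanes.sty

Styles placed there are normally picked up without running texhash at all, and they survive package upgrades.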

Comments

Popularity of Free Software generating bug exploitation?

[This entry is also available in Spanish|English PDF|PDF in Spanish]

It is often said (by FLOSS skeptics) that Free Software has fewer exploited bugs than proprietary software only because it is less popular. They argue that, since fewer people use FLOSS, crackers are less inclined to waste their time exploiting whatever bugs it may have. The larger user base of proprietary software would also, in their words, make its bugs more prominent, and their exploits spread faster. The corollary of this theory would be that the popularization of FLOSS applications (e.g. Firefox) would lead to an increase in the number of bugs discovered and exploited, eventually reaching a proprietary-like state (e.g. “Firefox will have as many bugs as IE when Firefox is as popular as IE”).

In this blog entry I will outline a simple mathematical model intended to show the utter nonsense of this theory. Specifically, I will argue that an increase in community size benefits a FLOSS project in at least three ways:

  1. Faster development
  2. Shorter average life of open bugs
  3. Shorter average life of exploited bugs

A more thorough explanation is available in PDF format. Bear in mind that math display in HTML is generally poor (all the more so when I have neither the time nor the skills to tune it). If you like pretty formulas, the PDF is for you.

Both this blog entry and the linked PDF are released under the following license:

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 License.

What this basically means is that you are free to copy and/or modify this work, and to redistribute it freely. The only limitations are that you cannot use it for commercial purposes, and that you have to credit its original author (or at least link to this blog).

1 – Propositions and derivations

We have a FLOSS project P, with new versions released every T units of time, each version introducing G new bugs. A new version is released when all the bugs from the previous release have been patched. At any point in time there are B open bugs (out of the original G).

The patching speed is assumed to be proportional to the size of the community of users (U), with proportionality constant Kp (the patching efficiency of the users):

dB/dt = -Kp U

1.1 – Faster development

From the above, the time dependence of the number of open bugs is:

B = G – Kp U t

The inter-release period T follows from setting B = 0:

T = G/(Kp U)

So the inter-release time T gets shorter as U grows.

1.2 – Shorter average life of open bugs

In a time interval dt, (-dB/dt)dt bugs are patched, their age being t. If we call τ the average lifetime of open bugs, we have, by definition:

τ = (∫ t (-dB/dt) dt) / (∫ (-dB/dt) dt)

From that it follows:

τ = T/2
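
Spelled out (a worked step, using the constant patching rate -dB/dt = Kp U defined above, with both integrals running from t = 0 to t = T):

τ = (∫ t Kp U dt) / (∫ Kp U dt) = (Kp U T²/2) / (Kp U T) = T/2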

So the average life of open bugs equals half the inter-release time, which (as stated above) is inversely proportional to U.

1.3 – Fraction of bugs exploited before being patched

We define the following bug-exploitation speed, where Bx is the total number of exploited bugs, Kx is the “exploiting efficiency” of the crackers (whose number is assumed to grow proportionally to U), and Bou is the number of open and unexploited bugs:

dBx/dt = Kx U Bou

We also define α = Box/B, where Box is the number of open and exploited bugs.

From this, the evolution of α with time can be derived:

α(t) = 1 – exp(-Kx U t)

We then define γ = Kp/(Kx G), and derive the fraction of the G bugs that have been exploited by time t:

Bx(t)/G = 1 – γ + (γ – 1 + t/T)exp(-Kx U t)

Evaluating at t = T, and taking into account that T = G/(Kp U), we get Fx, the fraction of the total bugs that ever gets exploited during the inter-release period:

Fx = 1 – γ + γ exp(-1/γ)

Note that Fx is independent of U: increasing the size of the user community does not increase the fraction of bugs that ever get exploited, even though the number of crackers grows along with the user base.
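
A quick sanity check of the limits (not part of the original derivation, but it follows directly from the expression for Fx): for very effective crackers (γ → 0) we get Fx → 1, i.e. essentially every bug gets exploited at some point before being patched; for very effective patchers (γ → ∞), expanding exp(-1/γ) ≈ 1 – 1/γ + 1/(2γ²) gives Fx ≈ 1/(2γ) → 0.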

1.4 – Shorter average life of exploited bugs

We want to find out how long exploited bugs stay unpatched; call this time τx. After some slightly convoluted algebra, always starting from the previously defined equations (see the PDF version), we obtain a fairly simple expression for τx:

τx = Fxτ

That is, the average time exploited bugs stay unpatched is proportional to τ, which is to say proportional to T, and therefore inversely proportional to U.
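
Combining this with the earlier results (just substituting τ = T/2 and T = G/(Kp U)): τx = Fx T/2 = Fx G/(2 Kp U), which makes the 1/U dependence explicit.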

2 – Conclusions

The “Increasing popularity = Increasing bugginess” motto is a non sequitur. According to the simple model outlined here, the broader the user community of a FLOSS program, the faster its bugs will be patched, even granting that an increase in the user base brings an equal increase in the number of crackers committed to dooming it. Even if the crackers are far more effective at cracking than the bona fide users are at patching (Kx >>> Kp), increasing the community size still reduces how long bugs stay unpatched, and also how long exploited bugs stay unpatched. No matter how clumsy the users and how rapacious the crackers, the free model (whereby users are granted access to the code, and thus empowered to contribute to the program) ensures that popularization is positive, both for the program and for the community itself.

Compare that with a closed model, in which an increased user base may boost the number of crackers attacking a program, but adds little, if anything, to the speed at which the code is patched and corrected. It is actually proprietary software that should fear popularization. It is easy to see that when a particular piece of proprietary software grows past a certain “critical mass” of users, the crackers could in principle disrupt its evolution (say, τx = T, Fx = 1), because G and the patching rate, and thus T, stay constant (they depend only on the sellers of the code).

Comments (1)

An inconvenient truth

95% screaming and 5% crying. That’s what I felt while watching An Inconvenient Truth (Spanish title: Una verdad incómoda) (IMDb|FilmAffinity).

Crying because some of the problems are huge, and real. And that is sad. Screaming because there are so many sons of bitches trying to get away with their dirty business, disguised as Global Warming “skeptics”.

The movie is a kind of documentary about Global Warming, presented by Al Gore. Good ol’ Al may ring a bell: yes, he’s the one who came out ahead of George Bush in the popular vote of the 2000 US Presidential Election, but was nonetheless declared the loser after highly controversial decisions by the US Supreme Court.

The movie is tagged by some people as “boring” or “not really saying anything new”. I disagree. I am no GW expert, but by no means illiterate either. I have a B.Sc. in Chemistry, and will soon have a Ph.D. (albeit in Quantum Chemistry, which has little direct link with environmental matters), and I did find the movie interesting.

Some of the data presented is, of course, redundant to some extent, but nonetheless it seems appropriate to mention it once again, even if “we all know it”. However, there is a non-trivial amount of data that, at least for me, was new. To name three such points:

  • Of over 900 scientific articles studied, NOT ONE cast any doubt on the human origin of the excess CO2 in the atmosphere, or on this excess being the culprit of global warming. There is no controversy among scientists, but rather complete agreement. On the other hand, a surprising 53% of the mass media pieces studied (newspapers,…) did mention “disagreements” or “doubts”, making it look like we don’t really know what causes global warming, or whether it even exists! Clearly someone is trying to poison the general public with doubts where there are none. Think about it.
  • The CO2/temperature increase of the last decades is alleged by some to be just “part of a trend of ups and downs”, because “temperatures have always fluctuated”. Bullshit. Gore shows studies of deep Antarctic ice cores that go back a friggin’ 650,000 years, with plots of atmospheric CO2 concentration and temperature over that whole period. There are fluctuations, and even several glaciations can be seen. However, the current deviation of CO2 concentration and temperature from the average is more than double anything in that record! Such a trend has never, ever happened before.
  • Underdeveloped countries have loose environmental policies that let them pollute a lot, while developed countries pollute more only because they have more industry, even if it is relatively cleaner. Not completely true. For example, the environmental policies regarding car manufacturing are tighter in China than in the USA. In fact, Chinese cars can be sold in the USA, but most North American cars cannot be sold in China, because they pollute too much. And that with China being a textbook example of reckless industrialization, with little care for the environment!



Image (taken from the Wikimedia), showing data much like what Gore presents in the movie. Notice the point at the extreme right: the CO2 concentration goes through the roof (380 ppm), and is not shown.

Go watch the movie; it’s quite informative (and no, not too boring).

Comments (1)
