Update: you can find the outcome of all this in a later post: Project BHS
The comments on my previous post provided me with hints for further increasing the efficiency of a script I am working on. Here I present the advice I have followed, and the speed gain it provided. I will speak of “speedup” instead of timing, because this second set of tests was made on a different computer. The “base” speed will be the last value of my previous test set (1.5 s on that computer, 1.66 s on this one). A speedup of “2” will thus mean half the execution time (0.83 s on this computer).
Version 6: Andrew Dalke suggested the substitution of:
line = re.sub('>','<',line)
with:
line = line.replace('>','<')
Avoiding the re module seems to speed things up when we are searching for fixed strings, since the additional features of the re module are not needed.
This is true, and I got a speedup of 1.37.
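To see the two approaches side by side, here is a small timeit sketch; the sample line is made up, but it mimics the format of the log lines discussed below:

```python
import re
import timeit

# Hypothetical sample line, shaped like the log lines in this post.
line = '<total_credit>123456.78</total_credit>'

# Version 5 approach: a regex substitution, even though the pattern is fixed.
t_re = timeit.timeit(lambda: re.sub('>', '<', line), number=100_000)

# Version 6 approach: plain str.replace, no regex machinery involved.
t_replace = timeit.timeit(lambda: line.replace('>', '<'), number=100_000)

# Both produce the same transformed line; only the cost differs.
print(f're.sub:      {t_re:.3f} s')
print(f'str.replace: {t_replace:.3f} s')
```

On typical CPython builds, str.replace wins for fixed strings because it skips compiling and interpreting a pattern.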
Version 7: Andrew Dalke also suggested substituting:
search_cre = re.compile(r'total_credit').search
if search_cre(line):
with:
if 'total_credit' in line:
This is more readable, more concise, and apparently faster. Doing it increases the speedup to 1.50.
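A minimal sketch of the equivalence, using hypothetical sample lines in the format the post describes:

```python
import re

# Hypothetical alternating log lines.
lines = [
    '<total_credit>123.4</total_credit>',
    '<os_name>Linux</os_name>',
]

# Version 6 approach: a compiled regex search for a fixed string.
search_cre = re.compile(r'total_credit').search
hits_re = [bool(search_cre(line)) for line in lines]

# Version 7 approach: the substring operator does the same membership test.
hits_in = ['total_credit' in line for line in lines]

assert hits_re == hits_in == [True, False]
```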
Version 8: Andrew Dalke also proposed flattening some variables, and specifically avoiding dictionary lookups inside loops. I went even further than his advice, and substituted:
stat['win'] = [0,0]
loop
stat['win'][0] = something
stat['win'][1] = somethingelse
with:
win_stat_0 = 0
win_stat_1 = 0
loop
win_stat_0 = something
win_stat_1 = somethingelse
This pushed the speedup further up, to 1.54.
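A sketch of the flattening, with made-up accumulation data standing in for the real log processing:

```python
# Dictionary-based accumulation (before): every update pays a hash lookup.
stat = {'win': [0.0, 0]}
for credit, results in [(10.0, 2), (5.5, 1)]:
    stat['win'][0] += credit
    stat['win'][1] += results

# Flattened locals (after): plain name lookups inside the loop, no hashing.
win_stat_0 = 0.0
win_stat_1 = 0
for credit, results in [(10.0, 2), (5.5, 1)]:
    win_stat_0 += credit
    win_stat_1 += results

# Same totals either way; the flat version just does less work per iteration.
assert [win_stat_0, win_stat_1] == stat['win']
```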
Version 9: Justin proposed reducing the number of times some patterns were matched, and extracting some info more directly. I achieved that by substituting:
loop:
if 'total_credit' in line:
line = line.replace('>','<')
aline = line.split('<')
credit = float(aline[2])
with:
pattern = r'total_credit>([^<]+)<'
search_cre = re.compile(pattern).search
loop:
if 'total_credit' in line:
cre = search_cre(line)
credit = float(cre.group(1))
This trick saved enough to increase the speedup to 1.62.
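Both extraction paths can be checked against each other on a hypothetical sample line:

```python
import re

# Hypothetical sample line in the format the post describes.
line = '<total_credit>123456.78</total_credit>'

# Versions 6-8: replace '>' with '<', split on '<', then index the field.
old = float(line.replace('>', '<').split('<')[2])

# Version 9: one compiled regex captures the number directly.
search_cre = re.compile(r'total_credit>([^<]+)<').search
new = float(search_cre(line).group(1))

assert old == new == 123456.78
```

The single capture avoids building an intermediate string and a list just to read one field.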
Version 10: The next tweak was an idea of mine. I was digesting a huge log file with zcat and grep, to produce a smaller intermediate file, which Python would process. The structure of this intermediate file is alternating lines: “total_credit”, then “os_name”, then “total_credit”, and so on. When processing this file with Python, I was searching each line for “total_credit” to differentiate between the two kinds of line, like this:
for line in f:
if 'total_credit' in line:
do something
else:
do somethingelse
But the alternating structure of my input would allow me to do:
odd = True
for line in f:
if odd:
do something
odd = False
else:
do somethingelse
odd = True
Presumably, checking the truth of a boolean is faster than searching a string for a substring, although in this case the gain was not huge: the speedup went up to 1.63.
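The toggle can be sketched on a small hypothetical input with the alternating structure the grep filter guarantees:

```python
# Hypothetical alternating input, as produced by the zcat/grep filter.
lines = [
    '<total_credit>10.5</total_credit>',
    '<os_name>Linux</os_name>',
    '<total_credit>2.25</total_credit>',
    '<os_name>Darwin</os_name>',
]

credits, names = [], []
odd = True  # the first line is always a total_credit line
for line in lines:
    if odd:
        credits.append(line)   # "do something"
        odd = False
    else:
        names.append(line)     # "do somethingelse"
        odd = True

# Every line was routed without a single substring search.
assert len(credits) == len(names) == 2
```

The trade-off is robustness: this only works while the input really does alternate strictly.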
Version 11: Another clever suggestion by Andrew Dalke was to avoid the intermediate file altogether, and use os.popen to connect to and read from the zcat/grep command directly. Thus, I substituted:
os.system('zcat host.gz | grep -F -e total_credit -e os_name > '+tmp)
f = open(tmp)
for line in f:
do something
with:
f = os.popen('zcat host.gz | grep -F -e total_credit -e os_name')
for line in f:
do something
This saves disk I/O time, and performance increases accordingly: the speedup goes up to 1.98.
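A runnable stand-in for the pattern (the real command needs host.gz on disk; here a printf pipeline plays the role of zcat, feeding the same grep filter):

```python
import os

# Stand-in for: zcat host.gz | grep -F -e total_credit -e os_name
# printf generates two fake log lines; grep keeps only the matching one.
cmd = "printf 'total_credit 1\\nother stuff\\n' | grep -F total_credit"

# os.popen returns a file-like object streaming the pipeline's stdout,
# so Python reads the data without a temporary file ever touching disk.
f = os.popen(cmd)
for line in f:
    print(line.strip())
f.close()
```

Reading from the pipe also overlaps the decompression/filtering with the Python processing, which is why the gain grows with input size.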
All the values I have given are for a sample log (from MalariaControl.net) with 7 MB of gzipped info (49 MB uncompressed). I also tested my scripts with a 267 MB gzipped (1.8 GB uncompressed) log (from SETI@home), and a plot of speedups vs. versions follows:
Execution speedup vs. version
(click to enlarge)
Notice how the last modification (avoiding the temporary file) matters much more for the bigger file than for the smaller one. Recall also that the odd/even modification (Version 10) makes very little difference for the small file, but is quite effective for the big file (compare it with Version 9).
The plot doesn’t show it (it compares versions on the same input, not one input against the other), but my eleventh version of the script processes the 267 MB log faster than Version 1 processed the 7 MB one! For the 7 MB input, the overall speedup from Version 1 to Version 11 is above 50.