Taking time to check your sums

I’ve got an old laptop that is currently running Windows XP, which was already installed when it was given to me a few months ago.  At long last I’m getting round to doing what I’ve intended ever since I received it and managed to get it to actually work: putting Linux on it instead.

Since it is a fairly old laptop (a Sony Vaio from either the late nineties or the early noughties, as far as I can tell) and consequently somewhat lacking in system resources by today’s standards, it is likely to work best with a relatively lightweight Linux distribution.  I’ve decided to try out #! (aka Crunchbang), a variant of Debian built around the Openbox window manager.  This is supposed to be fairly light in its demands on resources, so it should work OK with limited memory, processor speed and disk space.  If even this proves too much for the old machine, I’ll have to try one of the truly lightweight distros, such as DSL, instead.

I downloaded the ISO file for the latest version of #! last night, via bittorrent as this seems to be the way forward for large downloads.

Fortunately, I’ve adopted the habit of checking the MD5 checksum of any large download I make (I don’t always bother for smaller ones).  The Wikipedia page on MD5 will give you plenty more detail if you want it but, essentially, MD5 is a cryptographic hash function that produces a 128-bit value, usually written as a 32-digit hexadecimal number (the checksum), for any input datastream (such as a computer file), in such a way that even a tiny change in the data will, in practice, result in a completely different checksum.  By providing the MD5 checksum of the file as it is supposed to exist, the person sending you the file (or from whose website you download it) enables you to check that the file you have received is the same as the original.  Although MD5 is no longer cryptographically secure (i.e. it is vulnerable to deliberate attack), it still provides a pretty reliable way of checking whether your data has become corrupted in transit.
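For anyone who would rather do the check from a script than with a command-line tool, here is a minimal sketch in Python using the standard hashlib module.  The function names (md5_of_file and matches_published) are my own inventions for illustration, and the published checksum is assumed to be the 32-digit hexadecimal string copied from the download page.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 checksum of the file at `path` as 32 hex digits."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so a large ISO never has to fit in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_checksum):
    """Compare the file's checksum with the one published for it.

    Whitespace and letter case in the published string are ignored.
    """
    return md5_of_file(path) == published_checksum.strip().lower()
```

Calling matches_published with the path to the downloaded ISO and the checksum from the website returns True only if the two agree; anything else means the download should not be trusted.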

The reason I mention all this is that, on checking my shiny new Linux ISO file’s MD5 checksum against the one listed on the #! website for the same file, I discovered that it was different.  That meant that my download hadn’t been entirely successful, and I found this out before wasting time writing the file to a CD (my laptop is too old to support booting from USB drives) and trying to run it.  As I understand it, #! is a Live CD distribution that can be installed to the hard drive later if you so choose, as is common with many contemporary flavours of Linux.

At this point, I expected to have to download the whole file (nearly 1GB) again and hope for better luck next time.  However, I discovered that bittorrent (using the Transmission client on Linux) has a function to validate my local copy of the file; when it found a discrepancy, it was able to work out which pieces of the data were missing or corrupted and download just those again to create a working file.  After that, my MD5 checksum matched the expected one, so I am confident that the ISO I now have is a complete copy of the one from the #! website (which presumably should work OK).  Of course, bittorrent’s validate feature probably obviates the need to check the MD5 checksum separately, but it’s still nice to get independent confirmation that the file is sound (and the checksum is still useful for downloads that I haven’t got via bittorrent).
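The reason a bittorrent client can pinpoint the damaged parts is that the torrent file lists a separate SHA-1 hash for each fixed-size piece of the download, so each piece can be checked on its own.  The following Python sketch illustrates the idea only; it is not Transmission’s actual code, the function names are hypothetical, and it works on data held in memory rather than on a real torrent.

```python
import hashlib


def piece_hashes(data, piece_length):
    """Split data into fixed-size pieces and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]


def find_bad_pieces(local_data, expected_hashes, piece_length):
    """Return the indices of pieces that are missing or fail their hash check."""
    local = piece_hashes(local_data, piece_length)
    return [
        index
        for index, expected in enumerate(expected_hashes)
        # A piece is bad if the local copy is too short to contain it
        # or if its hash differs from the expected one.
        if index >= len(local) or local[index] != expected
    ]
```

Only the pieces whose indices come back from find_bad_pieces would need to be fetched again, which is why repairing my download was so much quicker than starting from scratch.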

The next stage is to burn the ISO file onto a CD and then have a go at booting my laptop with it.  That’s probably going to be my main task for later this evening.