Friday 16 March 2007

Codecs

By the mid-’90s, Internet usage had grown manifold, and with the dawn of the new millennium radical changes were under way in the world of digital audio. Audio CDs were by far the dominant format, and almost everybody had a CD player. Nonetheless, CDs had two fundamental problems. First, CDs scratch easily, and second, one cannot physically send a CD over the net.

With the Internet slowly becoming the hub of all entertainment, it was necessary to look for other, more cost-effective options for a good audio-visual experience that did not hog bandwidth yet provided decent quality, in terms of both audio and video. This was the basic idea behind the inception of audio and video codecs.

The word “codec” combines two words: coder (encoder) and decoder. Encoders and decoders can be implemented in either hardware or software, but we will only talk about software codecs here. Before we delve into the world of software codecs, let us get our basics on digital music right.

A software codec is a program containing an algorithm to compress and then decompress data. For instance, if an MP3 file is encoded at a bit rate of 128 kbps from an uncompressed, CD-quality source file that is 20MB in size, the resulting file will be approximately 2MB, going by the roughly 11:1 compression ratio that MP3 achieves at that bit rate. The compression ratio is the size of the original uncompressed audio file divided by the size of the resulting compressed file.
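The arithmetic behind that 11:1 figure can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this post, and the source is assumed to be standard CD audio (44,100 samples per second, 16 bits per sample, two channels), which the example above does not state explicitly.

```python
# Minimal sketch of the compression-ratio arithmetic.
# Assumption: the uncompressed source is CD audio (44.1 kHz, 16-bit, stereo).
# All function names here are hypothetical helpers, not part of any codec.

def cd_bitrate_kbps(sample_rate=44_100, bits_per_sample=16, channels=2):
    """Bit rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate * bits_per_sample * channels / 1000

def compression_ratio(original_size, compressed_size):
    """Original size divided by compressed size (same units for both)."""
    return original_size / compressed_size

def compressed_size_mb(original_mb, source_kbps, target_kbps):
    """Estimated size after re-encoding at a constant target bit rate."""
    return original_mb * target_kbps / source_kbps

source_kbps = cd_bitrate_kbps()                     # about 1411.2 kbps
mp3_mb = compressed_size_mb(20, source_kbps, 128)   # about 1.8 MB
ratio = compression_ratio(20, mp3_mb)               # about 11:1

print(f"source bit rate: {source_kbps:.1f} kbps")
print(f"estimated MP3 size: {mp3_mb:.2f} MB")
print(f"compression ratio: {ratio:.1f}:1")
```

The estimate ignores file headers and tags, so a real encoder's output will differ slightly.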

Next up are the two categories of codecs: lossless and lossy. An example of a lossless codec is WMA Lossless, and an example of a lossy codec is MP3. The term "lossless" means that the audio decoded from the compressed file is identical to the original in all respects. If you run a spectrum analysis of the original and overlay it on that of the decoded file, the two traces coincide as a single line, with all the peaks, valleys and spikes matching each other to the ‘T’. If they do not, some loss is taking place during compression.
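To make "identical to the original" concrete, here is a small round-trip sketch. It uses Python's general-purpose zlib compressor as a stand-in, since it is lossless in the same sense; it is not an audio codec, and the synthetic test tone and helper name are assumptions made purely for illustration.

```python
import math
import struct
import zlib

# Assumption: a short synthetic 440 Hz tone stands in for real music, and
# zlib stands in for a lossless audio codec. sine_pcm is a hypothetical helper.

def sine_pcm(freq_hz=440.0, seconds=0.1, sample_rate=44_100):
    """Return 16-bit little-endian mono PCM bytes for a sine tone."""
    n = round(seconds * sample_rate)
    samples = [
        int(16383 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n)
    ]
    return struct.pack(f"<{n}h", *samples)

original = sine_pcm()
packed = zlib.compress(original, 9)     # "encode"
restored = zlib.decompress(packed)      # "decode"

# Lossless means the decoded bytes match the original bit for bit.
is_lossless = restored == original

print("original bytes:  ", len(original))
print("compressed bytes:", len(packed))
print("bit-for-bit identical after decoding:", is_lossless)
```

A lossy codec would fail this equality check by design, because it throws information away during encoding.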

Encoding music using a lossless codec takes a lot of time and does not compress music a great deal. A 10MB file will shrink to about 7MB at best using WMA Lossless (Windows Media Audio 9 Lossless). There are other codecs such as AAC (Advanced Audio Coding) and Ogg Vorbis, which we will talk about in a little while. But before we talk about these codecs, let’s talk about the MP3 codec, the one piece of software that started it all.

An apt example of a lossy codec is MP3. The MP3 codec uses the "perceptual coding" technique. The Fraunhofer Institute in Germany, along with Prof. Dieter Seitzer of the University of Erlangen, developed an algorithm that was standardised as ISO-MPEG Audio Layer-3.

MP3 is far more popular than any other format or codec because it allows for exceptionally small file sizes without much audible difference in quality. Measured numerically, however, there is a large difference between the original uncompressed music file and the resulting MP3 file. This is where perceptual coding steps in. The trick that “perceptual coding” uses is to remove information from the audio that listeners are unlikely to notice. Users cannot "perceive" the loss of very high or very low frequencies that are present in the original file but are cut off in the compressed MP3 file. The perceived music quality barely suffers, but the file size shrinks because of this loss of information, and hence MP3 is called a "lossy codec".

If you run a spectrum analyser on a compressed MP3 file and compare it with the original uncompressed recording, you will find that there is no “one-to-one” match between the two spectra. Very high or very low frequencies that the codec judges to be beyond the listener's audible perception are cut off.
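The effect of cutting off frequencies can be illustrated with a toy example. The sketch below is not how MP3 works internally (a real encoder relies on a psychoacoustic model and far more elaborate transforms); it simply takes a tiny signal, zeroes every frequency bin above a cutoff, and rebuilds the audio. The signal, the cutoff and all names are assumptions chosen to keep the example small.

```python
import cmath
import math

# Toy illustration only: a crude frequency cutoff, NOT the MP3 algorithm.
# N, CUTOFF_BIN and the two test tones are arbitrary, hypothetical choices.
N = 64            # number of samples in the toy signal
CUTOFF_BIN = 16   # frequency bins above this are discarded

def dft(x):
    """Naive discrete Fourier transform (fine for a 64-sample toy signal)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
            for k in range(n)).real / n
        for t in range(n)
    ]

# A strong low tone (bin 4) plus a quiet high tone (bin 28).
low = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
high = [0.2 * math.sin(2 * math.pi * 28 * t / N) for t in range(N)]
original = [a + b for a, b in zip(low, high)]

# "Encode": drop every bin above the cutoff (mirrored bins included).
spectrum = dft(original)
kept = [c if min(k, N - k) <= CUTOFF_BIN else 0j
        for k, c in enumerate(spectrum)]

# "Decode": rebuild the waveform from the bins that survived.
decoded = idft(kept)

max_error_vs_low = max(abs(d - l) for d, l in zip(decoded, low))
max_error_vs_original = max(abs(d - o) for d, o in zip(decoded, original))

print("difference from the low tone alone:", max_error_vs_low)
print("difference from the original:      ", max_error_vs_original)
```

The decoded signal matches the low tone almost exactly but no longer matches the original, which is the "no one-to-one match" a spectrum analyser would show.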

Codecs such as AAC and Ogg Vorbis use different algorithms. For instance, AAC is defined in the MPEG-2 and MPEG-4 standards; it is primarily a lossy codec, although MPEG-4 also defines lossless audio coding. So which audio codec is right for you? Read on to find out.
