Sunday, November 6, 2011

Music files = MP3 ?!

    .AAC, .OGG, .flac: have you ever heard of these file formats? No? Then what about mp3? I am quite sure you have heard of it, even if you have never downloaded one. But what if I told you that all of them are audio files that serve much the same function as mp3, namely storing audio data? Here I will try to introduce what mp3 is, what drawbacks it has, and what alternatives exist.

    Mp3 is widely used as a synonym for music files. It is popular in portable and digital music playback and is well supported by a great deal of hardware and software. The term “MP3” refers to an audio format called “MPEG-1 Audio Layer III” or “MPEG-2 Audio Layer III”. It compresses raw audio data with a lossy method, producing a much smaller file than the original source with quality comparable to it. Typically, an mp3 file is 4 to 11 times smaller than the raw audio file, depending on the bit rate chosen. This notable reduction in file size made it popular, especially in the days when hard disks held at most tens of gigabytes and network speeds were slow. However, the reduction in file size comes at a cost. First, let’s look at how the compression works.

    Mp3 exploits a perceptual limitation of human hearing called auditory masking. Sometimes, when you listen to two sounds at once, one of them becomes inaudible; we say it is masked by the other. This is the basic principle of auditory masking. Of course, not every sound masks every other, but extensive listening experiments have revealed the general mechanism. Simply put, mp3 uses this mechanism to filter out the parts of a sound clip that humans cannot hear, which greatly reduces the amount of data and hence the file size.
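
    To make the idea concrete, here is a toy sketch in Python. It is not the real mp3 psychoacoustic model, which is far more elaborate. The two thresholds (`MASK_BANDWIDTH_HZ` and `MASK_MARGIN_DB`) and the function names are my own hypothetical choices, picked only for illustration: a tone is treated as masked when a much louder tone sits close to it in frequency.

```python
from typing import List, Tuple

Tone = Tuple[float, float]  # (frequency in Hz, level in dB)

# Hypothetical thresholds, chosen only for illustration.
MASK_BANDWIDTH_HZ = 200.0  # how close in frequency a masker must be
MASK_MARGIN_DB = 20.0      # how much louder the masker must be


def is_masked(tone: Tone, others: List[Tone]) -> bool:
    """Return True if some nearby tone is loud enough to hide this one."""
    freq, level = tone
    for other_freq, other_level in others:
        close = abs(other_freq - freq) <= MASK_BANDWIDTH_HZ
        much_louder = other_level - level >= MASK_MARGIN_DB
        if close and much_louder:
            return True
    return False


def drop_masked(tones: List[Tone]) -> List[Tone]:
    """Keep only the tones that no other tone masks."""
    return [t for t in tones if not is_masked(t, tones)]


if __name__ == "__main__":
    tones = [(1000.0, 70.0), (1100.0, 40.0), (5000.0, 45.0)]
    # The quiet 1100 Hz tone sits next to a loud 1000 Hz tone and is dropped;
    # the 5000 Hz tone is far away in frequency and survives.
    print(drop_masked(tones))
```

    A real encoder works on short frames of the signal and uses masking curves measured in listening tests, but the principle is the same: data that a listener would not hear anyway is not stored.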

    However, the above is only the theory. In practice, there are some constraints. No matter how well the mechanism works, mp3 is a lossy compression format, so it unavoidably loses some of the audio data during compression. The maximum amount of data an mp3 file can store per second of audio depends on its bit rate. A soundtrack on a CD has a bit rate of 1411.2 kbps. The bit rate of an mp3 file is confined to several fixed levels, from 8 kbps at the lowest to 320 kbps at the highest. The trick is to retain as little data as possible while using auditory masking to keep quality comparable to that of a CD. But of course no algorithm can rescue a file whose bit rate is very low; such an mp3 still sounds bad. It is often suggested that a 128 kbps mp3 file should sound like a CD. However, this varies with the circumstances and the listener, which brings out the second constraint of mp3. The algorithm depends on the perceptual limitations of humans, but different people have different hearing sensitivity. One person may find a 128 kbps mp3 acceptable while another finds it unbearable.
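
    The numbers above can be checked with a little arithmetic. The sketch below assumes the standard CD audio parameters (44,100 samples per second, 16 bits per sample, 2 channels) and takes 1 kbps to mean 1000 bits per second. The function names are my own, and the sizes ignore file headers and tags.

```python
# Standard CD audio parameters.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def cd_bit_rate_kbps() -> float:
    """Bit rate of uncompressed CD audio in kbps (1 kbps = 1000 bits/s)."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS / 1000


def size_megabytes(bit_rate_kbps: float, seconds: float) -> float:
    """Approximate audio data size in MB (1 MB = 1,000,000 bytes)."""
    return bit_rate_kbps * 1000 * seconds / 8 / 1_000_000


def compression_ratio(mp3_kbps: float) -> float:
    """How many times smaller an mp3 at this bit rate is than CD audio."""
    return cd_bit_rate_kbps() / mp3_kbps


if __name__ == "__main__":
    print(f"CD bit rate: {cd_bit_rate_kbps():.1f} kbps")
    for rate in (128, 192, 320):
        print(
            f"{rate} kbps mp3: {compression_ratio(rate):.1f}x smaller, "
            f"{size_megabytes(rate, 240):.2f} MB for a 4-minute track"
        )
```

    At 128 kbps the ratio is about 11, and at 320 kbps it is about 4.4, which is where the "4 to 11 times smaller" figure comes from. A 4-minute track takes roughly 42 MB as raw CD audio but under 4 MB as a 128 kbps mp3.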

    Moreover, there are technical and legal concerns about using mp3. Mp3 was developed about twenty years ago, and its algorithm has some technical limitations. For example, it handles frequencies above roughly 15 to 16 kHz poorly, and encoders often discard them, while humans can hear frequencies up to about 20 kHz. Legally, mp3 is a patented audio format, and using it incurs license fees. Bear in mind that these license fees are for the compression algorithm itself and are unrelated to the copyright on the music. The story is complicated by the fact that several companies claim ownership of patents related to mp3, while some argue that it should be patent-free because, in the United States, patents cannot claim inventions that were already publicly disclosed.

    Mp3, thanks to its compact file size, fairly good quality and, more importantly, its history, is now so popular that other audio formats are little known to the public. Here I just want to point out a few of them. AAC is a newer and more advanced audio format designed to be the successor of mp3; it generally achieves better sound quality than mp3 at similar bit rates. .OGG normally refers to audio files containing an audio format called Vorbis. It is worth mentioning because it is a patent-free, open-source format whose quality is often considered better than that of mp3 at similar bit rates. More and more applications now support ogg, and an increasing number of games use it for audio effects. Note that both of these are lossy audio formats, i.e. some data from the raw audio clip is lost irreversibly during compression. There is another type of compression called lossless compression, in which all the data is preserved while the resulting file is still relatively small. Of course, it cannot produce files as small as lossy methods can, but audiophiles adore it because it outputs audio exactly identical to the original CD. FLAC is one of the lossless compression formats. It provides a satisfactory size reduction while requiring relatively little computational effort. Again, it is patent-free and open-source.
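
    The difference between lossy and lossless can be shown in a few lines of Python. This is only a sketch with stand-ins: zlib from the Python standard library plays the role of a lossless codec (it is not FLAC), and a crude rounding step plays the role of a lossy codec (it is nothing like mp3 or Vorbis). The function names and the `step` parameter are my own.

```python
import zlib


def lossless_round_trip(samples: bytes) -> bytes:
    """Compress and decompress with zlib, a stand-in for a lossless codec."""
    return zlib.decompress(zlib.compress(samples))


def lossy_round_trip(samples: bytes, step: int = 16) -> bytes:
    """Crude lossy stand-in: round each 8-bit sample down to a multiple of step."""
    return bytes((s // step) * step for s in samples)


if __name__ == "__main__":
    original = bytes((i * 7) % 256 for i in range(1000))
    # Lossless: every byte comes back exactly as it went in.
    print(lossless_round_trip(original) == original)
    # Lossy: the fine detail has been thrown away and cannot be recovered.
    print(lossy_round_trip(original) == original)
```

    The first comparison holds and the second does not, which is the whole point: a lossless format gives back the original data bit for bit, while a lossy one gives back only an approximation.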

    I believe that digital music will be the main trend from now on. By now, you should have a better understanding of mp3 and other audio formats. I don’t mean to ask you to abandon mp3. I just want to remind you that what you hear may not be the whole truth, and to introduce some of the more advanced formats, especially those that are open-source and patent-free, which I think will dominate in the future.
