Understanding MP3s (and other compressed music) – Part 3… Finale

Welcome to the final installment of my 3-part series of posts about the pros and cons of compressed audio. If you haven’t read from the beginning, it’d be a good idea to start there. Here’s a link: Understanding MP3s (and other compressed music) – Part 1

By the end of Part 2 you’ll hopefully have an understanding of the process of compression (i.e. removing sounds that we theoretically won’t hear) and also the impact that this removal has on the overall “picture” created by the sound. For this final part of the article, keep that concept of a musical “picture” in mind, because what follows is all about the hidden magic within the picture, not the individual, identifiable details.

Harmonics

You might have heard of harmonics before. If you’ve played certain musical instruments (particularly stringed instruments), you might have even deliberately created pure harmonics. If you haven’t heard of harmonics, don’t worry – here’s a short explanation.

Anytime you play an instrument that uses a string or air to create sound (i.e. just about any instrument other than an electronic synthesizer), you are creating harmonics. Harmonics are the sympathetic vibrations that occur along with the note that you’re creating. Have you ever run your finger around the rim of a glass to create a musical note? That’s the same concept. Your finger running on the edge of the glass creates vibrations. If you get the speed of your finger movements right, the vibrations you create match the natural vibration frequency of the glass. As a result, the whole glass vibrates together and forms a beautiful clear note. Different glasses will vibrate at different speeds of movement and will create different notes as a result. This is the concept of harmonics.

If you were to walk up to a piano and strike the key known as “Middle C”, you would hear a note – just one single note, but that note will have a quality very different to the same note on another piano or on a violin. The reason for this is the creation of resonance and harmonics. To explain this, I’m going to talk about the note called “A”, which is a few notes above “Middle C”. I’m using the “A” because it makes the maths easier.

If you now strike the “A” you’ll hear a single note once again. This time, the note will sound higher than the previous “C”. What’s actually happening is that your ear is receiving vibrations moving 440 times every second (440 Hz). However, there are also other vibrations going on, and the majority of these are directly related to the 440 Hz we began with. As the “A” string inside the piano vibrates, it creates waves of vibration. The loudest of these move 440 times per second, but the string also creates other waves moving 880 times, 1320 times, 1760 times per second, etc.

Every note created by an acoustic instrument naturally creates these harmonics, which sit at whole-number multiples of the original frequency (i.e. 2, 3, 4, 5 times the fundamental, and so on). Old synthesizers sounded particularly fake because they didn’t recreate these harmonics and the output sounded flat and lifeless. Newer synthesizers create harmonics artificially and have come closer to the sound of the real thing, but there’s still a degree of difference thanks to the subtleties of acoustic instruments. A slight difference in strike pressure on a piano, plucking/strumming strength on a guitar or force of air through a trumpet can create a significantly different tone as a result of the different range of vibrations it creates. All of these subtleties are the “magic” that makes music so special and exciting.
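
If you’re curious what this looks like in practice, here’s a minimal sketch in Python (assuming the numpy library is available) that builds a note the way an acoustic instrument does: a fundamental plus progressively quieter harmonics at whole-number multiples. The 1/n amplitudes are an illustrative assumption – every real instrument has its own harmonic recipe, which is exactly where its character comes from.

```python
import numpy as np

SAMPLE_RATE = 44100                 # CD-quality samples per second
DURATION = 1.0                      # one second of audio
t = np.linspace(0, DURATION, int(SAMPLE_RATE * DURATION), endpoint=False)

fundamental = 440.0                 # the "A" above Middle C, in Hz

# A pure sine wave: just the fundamental and nothing else - the flat,
# lifeless "old synthesizer" sound.
pure_tone = np.sin(2 * np.pi * fundamental * t)

# A crude "acoustic" tone: the fundamental plus harmonics at whole-number
# multiples (880 Hz, 1320 Hz, 1760 Hz, ...). The 1/n amplitudes are an
# illustrative assumption, not a real instrument's profile.
rich_tone = np.zeros_like(t)
for n in range(1, 8):
    rich_tone += (1.0 / n) * np.sin(2 * np.pi * fundamental * n * t)
rich_tone /= np.abs(rich_tone).max()    # normalise to avoid clipping
```

Send pure_tone and rich_tone through any audio playback library and you’ll hear the same pitch with a completely different character.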

A quick note: this blog is not anti-electronic music. Electronic instruments (i.e. synthesizers, drum machines, etc.) can create amazing music that is impossible with traditional acoustic instruments. The discussion of acoustic versus electronic instruments is designed purely to illustrate the importance of keeping harmonics where they were originally intended/recorded.

Harmonics, Subtleties & Compression

In reading the section above, you might have wondered why you’ve never heard these harmonics. You might even choose to put on your favourite CD and try to listen for them. You can actually hear these harmonics if you listen carefully, but the key thing to recognise here is that we aren’t consciously aware of them in normal circumstances. The harmonics and subtleties happen “behind the scenes” of the music and are rarely noticed by anyone who isn’t actively listening for them.

If you now think back to my previous discussion of compression and the removal of sounds that we theoretically don’t hear, you might see the connection. The first things to be “compressed” (i.e. removed) are the harmonics and the subtle, quiet sounds that create the finest details and tonal qualities of the music. To the casual ear, nothing seems to be missing, but play the same song compressed and uncompressed through good speakers and you might notice a difference that you can’t quite put your finger on. Here’s another visual example.

The following picture is a high-resolution (1920 x 1200) desktop wallpaper image provided with Microsoft Windows 7. I’ve used it because it has a certain magic about it in terms of its depth and detail.

The next version of that image is at a lower resolution of 800 x 500 pixels (a bit like a lower bit-rate of compression).

Notice there’s a certain level of the “magic” missing from the second image? It’s hard to put a finger on exactly what’s missing, but the image isn’t as instantly captivating and engaging to the eye. It almost looks flatter somehow – less bright and alive.

Here’s one last version at 600 x 375 pixels, making it even lower resolution and stealing more of the “magic”.

Are you seeing a difference? Don’t worry if you’re not. Go back now and take a close look at the textures of the character’s face and the stitching on his costume. As the resolution drops, so does the detail. See it? That’s exactly what’s happening to your music.

Compressed Music in Real Life

Although it’s probably clear by now that my preference is always for lossless music (so called because no detail/information is lost), it’s not always practical. Understanding compression allows you to choose what suits your needs best. Here are some factors to consider when choosing your level of compression (or choosing no compression at all):

  • How much space do you have for your music on your computer, device hard drive, iPod, etc.? You’ll need to use compression if your space is limited and you want to store a large number of tracks. Here you need to weigh up quality, quantity and space. You can consider increasing storage space, decreasing the quantity of tracks or increasing the compression (and therefore decreasing the quality of the music).
  • Where and how do you listen to your music? If you listen in noisy environments, at very low volume (i.e. background music only) or use low quality speakers/headphones, then you might as well use slightly higher compression to maximise the quantity of tracks. The noisy environment issue can be overcome with in-ear or noise-cancelling earphones, but the other situations generally mean you can afford to sacrifice quality for quantity.
  • How much does it matter to you? After all, you’re the one doing the listening so if you’re happy with music at 128 kbps that’s all that matters. There’s no such thing as a right or wrong level of compression – it’s entirely up to you.

The best way to decide is actually quite simple. Take a well-recorded track (or two) that you really like and use your music player (iTunes, Windows Media Player, etc.) to compress it in different ways. Next, listen to the different versions on your favourite headphones and/or speakers and decide what you’re happy with. Weigh up the differences you noticed between the different levels of compression, think about how much space you have to store music, and then make a decision. One way to create those comparison versions is sketched below.
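
For example, here’s a minimal sketch of that experiment using the free ffmpeg command-line tool (my assumption – iTunes and Windows Media Player can do the same job through their import settings). It encodes one source file at three common MP3 bitrates so you can compare them back to back; the file name is hypothetical.

```python
import subprocess

SOURCE = "favourite_track.wav"      # hypothetical lossless source file

# Encode the same track at three common MP3 bitrates for comparison.
for bitrate in ["320k", "192k", "128k"]:
    subprocess.run(
        ["ffmpeg", "-i", SOURCE, "-codec:a", "libmp3lame",
         "-b:a", bitrate, f"favourite_track_{bitrate}.mp3"],
        check=True,
    )
```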

Summary

Compression is a fantastic tool for portable audio and convenience, but if you have no significant space restrictions, I highly recommend sticking with lossless audio (Apple Lossless Audio Codec – ALAC, Free Lossless Audio Codec – FLAC or Windows Media Audio 9.2 Lossless). You never know when you might upgrade your speakers or headphones, and even if you can’t hear a difference now, you might be amazed at the benefits you get with that next pair of speakers or the next set of headphones! Don’t give up the magic of the music unless you absolutely have to!

Understanding MP3s (and other compressed music) – Part 2

Welcome to Part 2 of my series of posts about the pros and cons of compressed audio. If you haven’t read Part 1, it’d be a good idea. Here’s a link: Understanding MP3s (and other compressed music) – Part 1

Wielding the Eraser

I explained in Part 1 that compression means pulling out sounds that we won’t actually hear, but think about this… The music is like a painting that we “see” with our ears. Compressing music is the equivalent of taking an eraser to the Mona Lisa. It’s like saying, “No-one will notice this brush stroke of stray colour or this tiny bit of shading.” Perhaps that’s true and, to a degree, no-one would notice, but at some point the whole painting’s just going to lose something. It’ll lose a little bit of soul. Sure, you might not pick exactly which parts are missing, but you’ll know something’s not right. Here’s an example:

Notice how the sky in the second image looks unnatural and full of lines? That’s because the process of compressing has removed some of the subtle shades of blue and replaced them with wider bands of other shades. For example, let’s number the different shades 1.1, 1.2, 1.3 and 1.4. During the compression process we would replace shade 1.2 with a second band of 1.1 and replace 1.4 with a second band of 1.3. Now that blue sky would be made of bands of shades 1.1, 1.1, 1.3, 1.3. You can see the evidence of this above in the second image.
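
That shade-merging process is essentially quantisation, and a toy sketch in Python (assuming numpy) shows the mechanics: a smooth run of sixteen shades is forced down to four allowed values, and the gradient turns into visible bands – exactly what happened to the sky.

```python
import numpy as np

# A smooth gradient of 16 "blue" intensities, like the sky in the photo.
sky = np.linspace(0.0, 1.0, 16)

# Quantise down to only 4 allowed values: each shade snaps to the nearest
# band, so neighbouring shades that used to differ become identical.
bands = 4
banded = np.round(sky * (bands - 1)) / (bands - 1)

print(sky.round(2))     # sixteen distinct shades
print(banded.round(2))  # four values repeated in blocks: visible banding
```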

So looking at the example photos, it’s clear that they’re both the same photo, but if you had to choose one to print and frame, I’m guessing you’d choose the first one because it’s closer to real life and therefore more pleasing to the eye. The same goes for music.

Think of music as a complex bunch of vibrations making a particular range of patterns. Any little detail you remove from those vibrations will permanently alter the overall “picture”. You’ll still recognise the sound or the song, but it won’t actually sound identical to the original.

Let’s talk about the ear again. Remember my description of how we hear? The ear perceives music like the eyes perceive a painting. You take it all in at once, you don’t pick out a particular colour here and a particular texture there, you just see it as a picture. When we compress sound we permanently alter the “picture” as if we had taken to it with an eraser. To our ears, the result is no different to the photo above on the right. It might not be as dramatic (depending on the level of compression), but it’s essentially the same. You don’t notice a loss of individual sounds, you notice a loss of overall quality and realism.

Here’s one final visual version to show you what I mean. The following charts are spectrograms that show sound as colour. The darker the colour, the louder the sound, and the higher up the colour appears, the higher the pitch of the sound. A bass guitar shows up near the bottom while a violin shows up further towards the top. There are two lines in each chart – these are the left and right stereo channels.

Spectrogram - lossless

"This is How a Heart Breaks" - no compression

"This is How a Heart Breaks" - moderate compression

"This is How a Heart Breaks" - mid-high compression (128 kbps)

Notice the density of the yellow and orange colours reduces as you get more compression? The more blue you see, the less of the musical “picture” is still intact. You might also notice that there is more variety and clarity in the colours on the top chart and the colours all get more “blurry” as you move down the charts. That’s the effect of averaging things out. If you look at the first spectrogram and then the second, you might notice that the second one looks like a slightly out-of-focus copy of the first one.

By the time we get to 128 kbps, nearly every high frequency sound is removed. That’s because we lose our hearing at these frequencies first and are less likely to notice the missing sound… or at least that’s the theory. The key thing to notice here is that the musical pictures are different. This is the most visual representation of sound that I can provide and it illustrates exactly how the musical “picture” is gradually erased by compression.
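
If you’d like to create charts like these from your own music, here’s a minimal sketch using Python’s scipy and matplotlib libraries (an assumption – the originals above may have been made with a different tool). The file name is hypothetical, the track would need to be decoded to WAV first, and it plots a single channel rather than the two stereo lines shown above.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

rate, samples = wavfile.read("track_lossless.wav")   # hypothetical file
if samples.ndim > 1:
    samples = samples[:, 0]          # keep just the left channel

# Frequency content over time, converted to decibels for display.
freqs, times, power = spectrogram(samples, fs=rate, nperseg=2048)
plt.pcolormesh(times, freqs, 10 * np.log10(power + 1e-12))
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.title("Spectrogram")
plt.show()
```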

In the Final Installment

Now that you know how we perceive sound and how compression works, you’re all ready to read about why compressed music loses its “magic”. In Part 3, I’ll explain a bit about harmonics and their role in creating the soul of the music. I’ll also sum up what this all means when it comes to choosing the level of compression that’s right for you.

As always, I hope you’re enjoying this information and I welcome any feedback or questions you might have.

Ready for Part 3?

Understanding MP3s (and other compressed music) – Part 1

Introduction & Context

As a music lover, I want to experience my music in its purest form. The true purest form is live performance, but we can’t always be at concerts so someone created recorded music. Then someone realised that you can’t take a record player or CD player wherever you go so they created compressed audio. There are many different compression formats including MP3, Microsoft’s WMA, Apple’s AAC, Sony’s ATRAC, and Ogg Vorbis. They all have different names and slightly different methods, but the overall concept is the same.

My aim in this series of posts is to explain what happens when you turn a CD into an MP3 or similar compressed format. If you put a CD in your computer, PlayStation, Xbox, etc. and “rip” that music to a hard drive or portable music player, there’s a very good chance the music’s been compressed.

Just like it sounds, compressing music is all about squishing the same length of song into a smaller amount of data. A music track of about 3 minutes 30 seconds takes up around 35MB as pure uncompressed CD audio. That same track can be compressed at “high” quality to about 7MB. That’s a massive reduction, but you might be wondering what you’re losing to get the file down to roughly a fifth of its original size. Over the next few posts I’ll explain the process and the pros/cons of compression in a simple, real-world way, so don’t worry if you’re not technically minded – you won’t need to be.
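
Here’s where those numbers come from – a quick back-of-the-envelope calculation (in Python) based on standard CD audio: 44,100 samples per second, 16 bits per sample, two stereo channels.

```python
# Rough size of a 3 minute 30 second track as uncompressed CD audio.
sample_rate = 44100            # samples per second
bytes_per_sample = 2           # 16 bits
channels = 2                   # stereo
seconds = 3 * 60 + 30          # 210 seconds

uncompressed = sample_rate * bytes_per_sample * channels * seconds
print(uncompressed / 1_000_000)     # about 37 MB

# The same track as a 256 kbps ("high" quality) compressed file.
compressed = 256_000 / 8 * seconds
print(compressed / 1_000_000)       # about 6.7 MB
```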

I should add that I’m not a fan of compressing music, but I recognise the need for it if we want portable music so the overall theme of these posts is to understand what you’re sacrificing when you choose compressed music. Once you know what you’re giving up, you can make an informed decision about what you’re willing to sacrifice in order to carry those extra songs. I hope the information is helpful and interesting.

Key Concepts

The Physics of Hearing: To understand the impact of compression you need to understand how we hear sound. The process begins with a sound source (like a musical instrument) that creates vibrations in the air. These vibrations travel through the air until they hit our ears. Inside our ears is a thin layer of skin that we know as the ear drum. When the vibrations hit the ear drum, it is pushed around and vibrates in time with the incoming sound. Behind the ear drum are some small bones and our inner ear. The bones get pushed by the ear drum and vibrate accordingly. As the bones vibrate, they pass the vibrations on to our inner ear. You can think of the bones in your ears like the string between two tin can telephones – they just carry a simple vibration.

The inner ear receives the vibrations next and the vibrations “tickle” a bunch of nerves which translate the vibration to a new type of signal for our brain. Don’t worry about the final signal to the brain though, just think about the vibrations until they hit the inner ear. These vibrations are chaotic. They aren’t clear and defined with separate little vibrations for the drums and another set of vibrations for the guitar and another set for the singer, etc. No, the vibrations all pile up and create a big mess of vibration.

A single, perfect note looks like this:

Sine Wave Graph

A graph of a perfect note

This type of vibration is impossible to create with a musical instrument (other than a synthesizer) or voice. Here’s the type of vibration created by instruments and voices:

Music Wave Graph

A graph of musical vibrations

Notice the mostly chaotic nature of the vibrations? There are definitely patterns there, but it’s a big mess of different vibrations. What this graph shows us is how our ear drum would move when receiving this music. The higher or lower each line is, the more our ear drum moves. Lines towards the top push our ear drum in. Lines towards the bottom pull our ear drum out. These movements are all tiny (if the music’s not too loud), but enough to send these crazy vibrations through to our ear nerves. The miracle of hearing is that our brain translates this crazy bunch of vibrations into beautiful melodies and harmonies.
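
To see the difference between the two graphs above for yourself, here’s a minimal sketch (Python, assuming numpy and matplotlib) that plots a single pure note next to several waves piled into one messy combined vibration. The frequencies and amplitudes are invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 0.02, 2000)      # 20 milliseconds of signal

# "A graph of a perfect note": one pure sine wave at 440 Hz.
perfect = np.sin(2 * np.pi * 440 * t)

# "A graph of musical vibrations": several waves piled on top of each
# other into one messy combined wave, like a real recording.
messy = (np.sin(2 * np.pi * 440 * t)
         + 0.5 * np.sin(2 * np.pi * 880 * t + 0.3)
         + 0.4 * np.sin(2 * np.pi * 587 * t + 1.1)
         + 0.2 * np.sin(2 * np.pi * 1760 * t + 2.0))

fig, (top, bottom) = plt.subplots(2, 1, sharex=True)
top.plot(t, perfect)
top.set_title("A perfect note")
bottom.plot(t, messy)
bottom.set_title("Musical vibrations")
bottom.set_xlabel("Time (s)")
plt.show()
```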

Masking: The second key concept to understand is the concept of masking. Masking is the effect of a louder sound making it difficult to hear a quieter sound played at the exact same time. Think about having dinner in a busy restaurant. You might find it difficult to hear what your friends are saying because of the noise in the restaurant – that’s masking. The combined noise of everyone else’s conversations is masking the voice of your friend across the table.

When some clever bunnies wanted to create a way to store music on computers and iPods (or similar devices) they needed to take some data out of our music. The only data in our music is sound, so they had to find a way to take some sounds out of the music. Sounds tricky, yes? That’s where masking comes into play.

Studies showed that people don’t notice when certain individual sounds are removed from the overall musical landscape. In basic terms, if two sounds occur simultaneously, the quieter one can be removed and we don’t really notice. That’s a slight over-simplification, but it sums up the concept. There are very complex mathematical algorithms and formulas that help determine which sounds will and won’t be missed. I don’t pretend to fully understand those algorithms so I won’t try to explain them. It also doesn’t really matter how the maths works, because the key thing to understand is that compression involves removing small pieces of the music that you won’t miss (in theory). The toy sketch below shows the flavour of the idea.
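
This is emphatically not the real psychoacoustic model (which weighs frequency bands, timing and much more) – just a made-up rule, with invented thresholds, that drops any tone sitting well below a louder near-neighbour.

```python
# Each tuple is (frequency in Hz, loudness in dB) for sounds that occur
# at the same instant. Values are invented for illustration.
sounds = [(440, 80), (450, 35), (1000, 60), (1020, 58)]

NEARBY_HZ = 100        # assumed: tones this close can mask each other
MASKING_MARGIN = 20    # assumed: drop a tone this many dB below a neighbour

kept = []
for freq, level in sounds:
    masked = any(
        abs(freq - other_freq) < NEARBY_HZ
        and level < other_level - MASKING_MARGIN
        for other_freq, other_level in sounds
    )
    if not masked:
        kept.append((freq, level))

print(kept)   # the quiet 450 Hz tone is dropped: the loud 440 Hz tone masks it
```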

End of Part 1

That’s the end of the first section. Hopefully you now understand how we hear and how masking works. In Part 2 I’ll explain how that knowledge applies to compressed sound and how it affects what we hear after the compression is done.