16 March 2013

Home Studio: Mixing - Part 1

Introduction

What I really love is playing guitar. I enjoy guitars, amps and effects pedals.
My problems started the very day I decided to "record one song".

If you have enough money, you can go straight to the best studio you can afford, perhaps even hire some session musicians there, and get your work done while doing only what you really like: playing guitar.

But people like me have already spent what little money we had on guitars, amps and effects pedals (because that is what we really love), so we don't have enough funds to pay for long studio sessions.

Things get even worse if, like me, you aren't a pro guitarist. You want to finish quickly, but that simple idea puts a lot of pressure on you, and in the end everything takes far longer than it would have at home, relaxed.
Recording is the most important phase of music production. We all know the saying: "Garbage In, Garbage Out". The higher the quality of our takes, the better the results we will get in the following phases: mixing and mastering.

Perhaps we had the money to go to a studio and we succeeded in recording good tracks.
Or maybe we invested in our own home studio, buying some mics, a preamp and an audio card, and we have a suitable room to record our amp's sound.
Or maybe we went the silent way, using speaker simulators, ISO boxes, DI boxes, preamps, rack amps, amp simulators or any other trick that gave us a good track to start our mixing work.

So, imagine that we already have a good set of well-recorded tracks and that we want to do the mixing ourselves, because we want to save money, because we want to learn, or for any other valid reason.
We have a nice DAW with lots of tools, but... how do we start (OMG!)?

Every phase of music production is, simultaneously, art and science.
Science, because the corrections and transformations that we apply to sound are rooted in technical knowledge about sound and about the audio gear used to represent and transform it.
Art, because the ways to process sound are infinite, and everyone tries as many paths as they can imagine to reproduce exactly the sound they have in mind, which seems so difficult to achieve.

Over several entries with this same title, I will try to share some basic knowledge and some tricks I've been collecting while fighting with my own home studio.


Digital Audio

If you are still reading, you are probably a guitarist trying to mix on your own PC and, due to the high cost of outboard gear, you are trying to mix ITB (In The Box).

A mix inside the PC is a digital mix, while the original sound is analog.

Analog sound is characterized by its continuity. The signal varies constantly, but without gaps.
The sound of a guitar (and of any other instrument) is usually recorded using mics, which are also analog devices.
The weak mic signal is then amplified by a quality preamp (one that reproduces the audible frequencies as faithfully as possible), and that amplified signal is routed to an analog mixing console (in studios) or to the input of an audio card.

Even in recording studios, the analog technique based on old multi-track tape recorders has given way to digital mixing, because it's infinitely easier to work with a DAW (such as Pro Tools, Logic, Nuendo, Sonar...).

Most issues when processing sound occur precisely when we cross the border between those two very distinct worlds: from analog to digital and from digital to analog.

When analog sound is recorded on our PC, it is converted to a digital format.
When we listen to our digital mix through our studio monitors, the sound is converted back to analog.
If we send a track to an external analog device (a compressor or an equalizer, for example) and record the processed sound back, we are converting twice: from digital to analog and from analog to digital.
Every time we cross that border, the sound degrades a little.

At this point, you will realize that the audio converters (A/D and D/A) are among the most important pieces of your studio gear. Every device in your studio that converts audio should have high-quality converters.
What quality level? The best you can afford.

In studios, you will find sophisticated outboard gear dedicated exclusively to A/D and D/A conversion. These units are really expensive, and every time the signal needs to move between the two worlds it is routed through them.

In our home studio, a good audio card with converters of proven quality is enough.


How do we represent analog audio in a digital format?

The task of the A/D converter is to sample the voltage of an input signal many times per second. You are probably already familiar with some typical sampling frequencies: 11 kHz, 22 kHz, 44.1 kHz or 48 kHz.
Why those values and not others? Are they arbitrary?

Research in sampling theory (the Nyquist-Shannon theorem) showed that a sampled signal can be rebuilt as a continuous one as long as the sampling frequency is at least twice the highest frequency it contains. Since human hearing reaches about 20 kHz, roughly 44,000 samples per second (44 kHz) are enough to cover it.
This is loosely similar to what happens with a cinema film: we "build" a continuous visual flow from pictures that are projected 24 times per second.

In the case of audio, the converters are the devices that "build" the continuous analog flow from the discrete pieces of the digital samples. Converters are also responsible for splitting the analog audio into the small pieces that are stored in digital format.

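The music industry settled on 44.1 kHz for audio CDs.
But if 44 kHz is enough, why 44.1 kHz?
The exact figure is a legacy of the video-based recorders used for early digital audio masters. What matters in practice is that the rate sits comfortably above 40 kHz, the minimum needed to capture frequencies up to 20 kHz: the filters used in the converters cannot cut off abruptly, and the extra margin gives them room to work without touching the audible range.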

With a sampling frequency of 44 kHz, we can represent the whole range of frequencies that humans can hear (from 20 Hz to 20 kHz). But you will see that many audio devices use much lower sampling frequencies, such as 22 kHz or even 11 kHz.

With 22 kHz, we can represent frequencies only up to about 11 kHz, half of what 44 kHz allows.
With 11 kHz, we can represent frequencies only up to about 5.5 kHz, a quarter of what 44 kHz allows.
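
To make the link between sampling frequency and frequency range concrete, here is a minimal Python sketch. The function names are mine and purely illustrative. It assumes the usual rule that the highest representable frequency is half the sampling frequency, and it also shows what happens to a tone above that limit: its samples become indistinguishable from those of a lower tone (aliasing).

```python
import math

def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a sampling rate can represent: half the rate."""
    return sample_rate_hz / 2.0

def sampled_cosine(freq_hz: float, sample_rate_hz: float, count: int) -> list[float]:
    """First `count` samples of a cosine tone at the given sampling rate."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]

for rate in (11_025, 22_050, 44_100, 48_000, 96_000):
    print(f"{rate:>6} Hz sampling -> frequencies up to {nyquist_hz(rate):,.0f} Hz")

# A 15 kHz tone sampled at 22.05 kHz lies above the 11.025 kHz limit.
# Its samples match those of a 7.05 kHz tone (22,050 - 15,000): aliasing.
high = sampled_cosine(15_000, 22_050, 50)
alias = sampled_cosine(7_050, 22_050, 50)
max_difference = max(abs(a - b) for a, b in zip(high, alias))
print(max_difference < 1e-9)  # True
```

This is why a converter has to filter out everything above half the sampling frequency before sampling: once the samples are taken, the two tones can no longer be told apart.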

A lower sampling frequency always removes the top of the spectrum; beyond that, which information is "deleted" depends on the design of the particular device or format.

Devices can use psychoacoustic tricks to force your brain to "fill in" the gaps in the sound information.
The frequencies that matter most to us are the mid frequencies, and many compression algorithms exploit this to reduce the size of the data, preserving the "important" frequencies and removing the less "relevant" ones.

Usually, the first thing I miss when working with lower sampling frequencies is the spatial information, the dimensionality of the sound (echoes, delays, reverbs...).
Professional audio works at 48 kHz and, for higher-definition environments, at 96 kHz.

Alright!
We know that the sampling frequency serves two purposes: the more samples we take, the closer the digital representation is to the original analog signal, and the wider the range of frequencies we can represent and reproduce.

But we still have to clarify one more aspect of the digital representation of sound: the bit depth (or resolution). What's that?

As you probably know, in the digital world any value can be expressed as a string of 0s and 1s. The length of that string limits how many different values we can represent with it.

For example, with 16 bits we can represent 2^16 values (65,536), half positive and half negative. Don't forget that an analog audio signal swings around zero like a sine wave, taking both positive and negative values.
With 24 bits, we can represent 2^24 values (16,777,216), half positive and half negative, which starts to be a significant number of values.
With 32 bits, we can represent 2^32 values (4,294,967,296), an impressive total of more than 4 billion different values.
But how important is this?

The sound picked up by the mic and then amplified by the preamp is a continuous electrical signal whose voltage varies. The difference between the lowest and the highest voltage determines the dynamic range of the sound. The voltage measured at a given instant is what gets stored in a digital sample, and the bit depth of that sample determines how many distinct voltage levels can be told apart.

With 32 bits, we can divide the continuous range of voltages into more than 4 billion steps.
With 24 bits, we can divide it into more than 16 million steps.
With 16 bits, we can divide it into just 65,536 steps.
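
The step counts are easy to check with a few lines of Python. This is only a sketch under two assumptions of mine: samples are stored as signed integers (many DAWs actually use floating point at 32 bits), and the theoretical dynamic range follows the common rule of thumb of roughly 6 dB per bit, which the text above does not state.

```python
import math

def quantization_levels(bits: int) -> int:
    """Number of distinct values a sample of `bits` bits can take."""
    return 2 ** bits

def dynamic_range_db(bits: int) -> float:
    """Ratio between the full range and one step, in decibels (about 6.02 dB per bit)."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24, 32):
    levels = quantization_levels(bits)
    lowest, highest = -(levels // 2), levels // 2 - 1
    print(f"{bits} bits: {levels:,} steps, from {lowest:,} to {highest:,}, "
          f"about {dynamic_range_db(bits):.1f} dB")
```

Each extra bit doubles the number of steps, so going from 16 to 24 bits does not add 50% more resolution: it multiplies it by 256.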

Since the number of steps (or values) in digital audio is finite, we need to assign one digital value to a whole range of analog values: we group together the infinitely many voltages between two given limits and assign a single digital value to that range. For instance, all analog values between +0.040 and +0.045 might be assigned the digital value +0.040.

So you can imagine that the more "boxes" we have for packing ranges of analog values, the more accurate the representation of each analog value can be.
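
The "boxes" idea can be tried out directly. The sketch below is illustrative only: it assumes the signal is normalized to the range -1.0 to +1.0 and that each value is rounded to the nearest box, and the `quantize` helper is a name I made up for the example.

```python
def quantize(value: float, bits: int) -> float:
    """Snap a value in [-1.0, 1.0) to the nearest level of a signed `bits`-bit grid."""
    scale = 2 ** (bits - 1)                    # number of steps on each side of zero
    code = round(value * scale)                # index of the nearest "box"
    code = max(-scale, min(scale - 1, code))   # stay inside the representable range
    return code / scale

x = 0.0423  # hypothetical analog value
for bits in (4, 8, 16, 24):
    q = quantize(x, bits)
    print(f"{bits:>2} bits: stored as {q:.8f} (error {abs(q - x):.2e})")
```

With only 4 bits the value falls into a very wide box and the error is large; the more bits there are, the narrower the boxes and the smaller the worst-case error.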

The D/A converters create a continuous sound from discrete digital values. Between two digital values, the converter has to fill the time gap with intermediate values. The closer each pair of samples is in time (higher sampling frequency) and the finer the distinction between levels (higher bit depth), the smaller the errors introduced when filling that gap.


How much does digital audio weigh?

Alright! Now we know that two important variables are needed to represent audio digitally: the sampling frequency and the bit depth. So, how much room does it take?
A LOT!

1 minute of audio at 16 bits and 22.05 kHz needs 21,168,000 bits (2.52 MB!).
1 minute of audio at 24 bits and 44.1 kHz needs 63,504,000 bits (7.57 MB!).
1 minute of audio at 16 bits and 44.1 kHz needs 42,336,000 bits (5.05 MB!).

And multiply those by two if we work in stereo (two channels), or by five or seven if we work in a multi-channel, multi-speaker surround environment.

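The audio CD format uses a bit depth of 16 bits and a sampling frequency of 44.1 kHz. Taking the capacity of a CD as around 650 MB, a simple calculation gives around 64 minutes of stereo music (650 divided by twice the per-minute size of a 16-bit, 44.1 kHz mono track). Real audio CDs hold somewhat more, about 74 minutes, because audio discs store more usable bytes per sector than data discs do.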
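
The arithmetic behind these sizes is simple enough to script. A minimal sketch, assuming uncompressed audio, 1 MB = 1024 × 1024 bytes, and the 650 MB capacity figure used above; the function names are illustrative.

```python
def audio_size_bits(seconds: int, sample_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Uncompressed size in bits: one sample of `bit_depth` bits per channel per tick."""
    return seconds * sample_rate_hz * bit_depth * channels

def bits_to_mb(bits: int) -> float:
    """Bits to megabytes, taking 1 MB = 1024 * 1024 bytes."""
    return bits / 8 / (1024 * 1024)

for depth, rate in ((16, 22_050), (24, 44_100), (16, 44_100)):
    bits = audio_size_bits(60, rate, depth)
    print(f"1 min mono, {depth} bits, {rate} Hz: {bits:,} bits = {bits_to_mb(bits):.2f} MB")

# Rough playing time of a 650 MB disc filled with 16-bit / 44.1 kHz stereo audio.
cd_capacity_mb = 650
stereo_minute_mb = bits_to_mb(audio_size_bits(60, 44_100, 16, channels=2))
cd_minutes = cd_capacity_mb / stereo_minute_mb
print(f"About {cd_minutes:.1f} minutes of stereo audio")
```

Changing `channels`, `bit_depth` or the number of seconds gives a quick estimate of how much disk space a whole multi-track project will need.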

Now imagine that you record several takes of a guitar track (to choose the best one, or to combine the best parts of each take), along with similar takes for the bass, plus a large number of takes for drums and vocals... yes, audio eats disk space really fast.

At this point you will realize that, to keep quality sound on your PC, you need a huge amount of disk space, either in the PC itself or on an external storage device where you can save your work at each stage.


So far so good, but which digital format do we need?

A long time ago, when I started learning about the audio world, I read some interesting work from Helsinki University that discussed the impact of sampling frequency and bit depth. It stated that sound quality benefits more from a higher bit depth than from a higher sampling frequency.
So you achieve better results working at 24 bits and 44.1 kHz than at 16 bits and 96 kHz.

Big studios with great resources can work at up to 48 bits and 96 kHz, but for more modest studios and home studios, 24 bits and 44.1 kHz is an excellent compromise.
Once again, it would be better to work at 32 bits and 48 kHz than at 16 bits and 96 kHz.

According to people who really know about all this (not me), such as Bob Katz, digital audio processing done in 32-bit floating point (or higher) suffers no significant loss of information and introduces no "digital artifacts". Moreover, it is the only way to represent spatial information (reverberation, echo, etc.) correctly.

When we go down from 24 bits to 16 bits, we lose 8 bits of information. In other words, we cut off the tail (the fine nuances) of each audio value.
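
"Cutting the tail" can be shown literally with integers. This sketch assumes signed integer samples and plain truncation (no rounding, no dither); the sample value is hypothetical.

```python
def truncate_24_to_16(sample_24: int) -> int:
    """Drop the 8 least significant bits of a signed 24-bit sample."""
    return sample_24 >> 8

sample = 0x123456                        # hypothetical 24-bit sample value
print(hex(truncate_24_to_16(sample)))    # 0x1234: the trailing "56" is gone

# 256 different 24-bit values collapse onto the same 16-bit value.
collapsed = {truncate_24_to_16(v) for v in range(0x123400, 0x123500)}
print(len(collapsed))                    # 1
```

Those 256 values that end up in the same place are exactly the fine nuances that the 16-bit version can no longer distinguish.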

There is a technique called dithering that basically consists of adding a very low-level noise floor to the signal before its resolution is reduced, so that the rounding errors caused by the loss of accuracy turn into a steady, gentle noise instead of distortion.
If you have digital outboard gear, it's very important that it works internally at 32 bits or higher. Dithering is only necessary when going from a higher resolution to a lower one, as a trick to preserve as much as possible of the original signal (part of whose information is being lost).

If you work at 24 bits in your DAW, and any digital outboard gear runs at the same resolution and frequency, you only need to dither once, at the very end of your mixing, when you render the final song to CD format (16 bits / 44.1 kHz). The fewer times you go from a higher to a lower resolution, the higher the quality of your mixes.
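
Here is a minimal sketch of what dithering does during that final 24-to-16-bit render. It is not how any particular DAW implements it: I am assuming triangular (TPDF) noise of plus or minus one 16-bit step, which is a common choice, and a hypothetical, very quiet sample value that is smaller than one 16-bit step.

```python
import random

def requantize(sample_24: int, dither: bool, rng: random.Random) -> int:
    """Reduce a signed 24-bit sample to 16 bits, optionally with TPDF dither."""
    value = sample_24 / 256.0      # the sample expressed in 16-bit steps
    if dither:
        # Triangular noise spanning +/- 1 step: the sum of two uniform draws.
        value += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    code = round(value)
    return max(-32768, min(32767, code))

rng = random.Random(0)             # fixed seed so the run is repeatable
quiet = 77                         # hypothetical sample: about 0.3 of one 16-bit step
runs = 50_000

plain = sum(requantize(quiet, False, rng) for _ in range(runs)) / runs
dithered = sum(requantize(quiet, True, rng) for _ in range(runs)) / runs

print(f"target in 16-bit steps: {quiet / 256:.3f}")
print(f"average without dither: {plain:.3f}")
print(f"average with dither:    {dithered:.3f}")
```

Without dither, the quiet detail is always rounded to zero and simply disappears. With dither, each individual sample is noisier, but on average the original level survives, which is why the ear hears a faint hiss instead of missing or distorted detail.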

If you have outboard gear (or even software plugins) working at a lower resolution than your project, you will lose information every time you send a track to one of those devices or plugins, and you will add the noise floor of the dithering operation. The more often this happens in your mix, the higher the noise floor, the further your signal drifts from the original analog signal, and the more digital artifacts are introduced.


To be continued ...

I wanted to start with these very basic concepts of digital audio because there are people, even some running small studios, who don't seem to be aware of these "little digital audio details" and work at 16 bits and 44.1 kHz in their mixes just because it takes less disk space. They don't even know anything about dithering or about what happens when you mix material at different resolutions and frequencies.
Whether you mix yourself or go to a small studio, you now have some valuable information to judge the technical level of your "Audio Engineer" and to guess the quality of the results.

I think that knowing the basic ideas of digital audio will help you choose your tools (DAW, plugins, outboard gear, etc.) better.

There are other critical concepts, such as jitter, that I will not discuss, since they arise in more sophisticated environments with lots of outboard gear and a mix of analog and digital electronic devices.

In the next blog entry on this subject, I will try to talk about the different phases of mixing, as well as the 3 dimensions of mixing.
