Chapter 3. Volume and Loudness
Once we are ready to play a sound, whether from an
AudioBuffer
or from other sources, one of the most basic
parameters we can change is the loudness of the sound.
The main way to affect the loudness of a sound is using
GainNodes
. As previously mentioned, these nodes have a gain
parameter, which acts as a multiplier on the incoming sound buffer. The
default gain value is one, which means that the input sound is unaffected.
Values between zero and one reduce the loudness, and values greater than one
amplify the loudness. Negative gain (values less than zero) inverts the
waveform (i.e., the amplitude is flipped).
Equal Power Crossfading
Often in a game setting, you have a situation where you want to crossfade between two environments that have different sounds associated with them. However, when to crossfade and by how much is not known in advance; perhaps it varies with the position of the game avatar, which is controlled by the player. In this case, we cannot do an automatic ramp.
In general, doing a straightforward, linear fade will result in the following graph. It can sound unbalanced because of a volume dip between the two samples, as shown in Figure 3-2.
To address this issue, we use an equal power curve, in which the corresponding gain curves are neither linear nor exponential, and intersect at a higher amplitude (Figure 3-3). This helps avoid a dip in volume in the middle part of the crossfade, when both sounds are mixed together equally.
The graph in Figure 3-3 can be generated with a bit of math:
function
equalPowerCrossfade
(
percent
)
{
// Use an equal-power crossfading curve:
var
gain1
=
Math
.
cos
(
percent
*
0.5
*
Math
.
PI
);
var
gain2
=
Math
.
cos
((
1.0
-
percent
)
*
0.5
*
Math
.
PI
);
this
.
ctl1
.
gainNode
.
gain
.
value
=
gain1
;
this
.
ctl2
.
gainNode
.
gain
.
value
=
gain2
;
}
Demo: to listen to an equal power crossfade, visit http://webaudioapi.com/samples/crossfade/.
Using Meters to Detect and Prevent Clipping
Since multiple sounds playing simultaneously are additive with no level reduction, you may find yourself in a situation where you are exceeding past the threshold of your speaker’s capability. The maximum level of sound is 0 dBFS, or 216, for 16-bit audio. In the floating point version of the signal, these bit values are mapped to [−1, 1]. The waveform of a sound that’s being clipped looks something like Figure 3-5. In the context of the Web Audio API, sounds clip if the values sent to the destination node lie outside of the range. It’s a good idea to leave some room (called headroom) in your final mix so that you aren’t too close to the clipping threshold.
In addition to close listening, you can check whether or not you are clipping your sound programmatically by putting a script processor node into your audio graph. Clipping may occur if any of the PCM values are out of the acceptable range. In this sample, we check both left and right channels for clipping, and if clipping is detected, save the last clipping time:
function
onProcess
(
e
)
{
var
leftBuffer
=
e
.
inputBuffer
.
getChannelData
(
0
);
var
rightBuffer
=
e
.
inputBuffer
.
getChannelData
(
1
);
checkClipping
(
leftBuffer
);
checkClipping
(
rightBuffer
);
}
function
checkClipping
(
buffer
)
{
var
isClipping
=
false
;
// Iterate through buffer to check if any of the |values| exceeds 1.
for
(
var
i
=
0
;
i
<
buffer
.
length
;
i
++
)
{
var
absValue
=
Math
.
abs
(
buffer
[
i
]);
if
(
absValue
>=
1.0
)
{
isClipping
=
true
;
break
;
}
}
this
.
isClipping
=
isClipping
;
if
(
isClipping
)
{
lastClipTime
=
new
Date
();
}
}
An alternative implementation of metering could poll a real-time
analyzer in the audio graph for getFloatFrequencyData
at
render time, as determined by requestAnimationFrame
(see
Chapter 5). This approach is more efficient, but misses a
lot of the signal (including places where it potentially clips), since
rendering happens most at 60 times a second, whereas the audio signal
changes far more quickly.
The way to prevent clipping is to reduce the overall level of the signal. If you are clipping, apply some fractional gain on a master audio gain node to subdue your mix to a level that prevents clipping. In general, you should tweak gains to anticipate the worst case, but getting this right is more of an art than a science. In practice, since the sounds playing in your game or interactive application may depend on a huge variety of factors that are decided at runtime, it can be difficult to pick the master gain value that prevents clipping in all cases. For this unpredictable case, look to dynamics compression, which is discussed in Dynamics Compression.
Demo: for an example of a very simple meter, visit http://webaudioapi.com/samples/metering/.
Dynamics Compression
Compressors are available in the Web Audio API as
DynamicsCompressorNodes
. Using moderate amounts of dynamics
compression in your mix is generally a good idea, especially in a game
setting where, as previously discussed, you don’t know exactly what sounds
will play and when. One case where compression should be avoided is when
dealing with painstakingly mastered tracks that have been tuned to sound
“just right” already, which are not being mixed with any other
tracks.
Implementing dynamic compression in the Web Audio API is simply a matter of including a dynamics compressor node in your audio graph, generally as the last node before the destination:
var
compressor
=
context
.
createDynamicsCompressor
();
mix
.
connect
(
compressor
);
compressor
.
connect
(
context
.
destination
);
The node can be configured with some additional parameters as described in the theory section, but the defaults are quite good for most purposes. For more information about configuring the compression curve, see the Web Audio API specification.
Get Web Audio API now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.