WEBVTT

1
00:00:08.252 --> 00:00:11.550
Hi, I'm Monty Montgomery from Red Hat and Xiph.Org.

2
00:00:11.550 --> 00:00:18.430
A few months ago, I wrote an article on digital audio and why 24bit/192kHz music downloads don't make sense.

3
00:00:18.430 --> 00:00:23.433
In the article, I mentioned--almost in passing--that a digital waveform is not a stairstep,

4
00:00:23.433 --> 00:00:28.680
and you certainly don't get a stairstep when you convert from digital back to analog.

5
00:00:29.865 --> 00:00:33.865
Of everything in the entire article, <b>that</b> was the number one thing people wrote about.

6
00:00:33.865 --> 00:00:37.221
In fact, more than half the mail I got was questions and comments

7
00:00:37.221 --> 00:00:39.663
about basic digital signal behavior.

8
00:00:39.894 --> 00:00:45.285
Since there's interest, let's take a little time to play with some <u>simple</u> digital signals.

9
00:00:49.747 --> 00:00:51.006
Pretend for a moment

10
00:00:51.006 --> 00:00:54.089
that we have no idea how digital signals really behave.

11
00:00:54.734 --> 00:00:56.841
In that case it doesn't make sense for us

12
00:00:56.841 --> 00:00:59.049
to use digital test equipment either.

13
00:00:59.049 --> 00:01:00.937
Fortunately for this exercise, there's still

14
00:01:00.937 --> 00:01:04.020
plenty of working analog lab equipment out there.

15
00:01:04.020 --> 00:01:05.972
First up, we need a signal generator

16
00:01:05.972 --> 00:01:08.190
to provide us with analog input signals--

17
00:01:08.190 --> 00:01:12.692
in this case, an HP3325 from 1978.

18
00:01:12.692 --> 00:01:14.153
It's still a pretty good generator,

19
00:01:14.153 --> 00:01:15.614
so if you don't mind the size,

20
00:01:15.614 --> 00:01:16.532
the weight,

21
00:01:16.532 --> 00:01:17.577
the power consumption,

22
00:01:17.577 --> 00:01:18.910
and the noisy fan,

23
00:01:18.910 --> 00:01:20.329
you can find them on eBay.

24
00:01:20.329 --> 00:01:23.863
Occasionally for only slightly more than you'll pay for shipping.

25
00:01:24.617 --> 00:01:28.500
Next, we'll observe our analog waveforms on analog oscilloscopes,

26
00:01:28.500 --> 00:01:31.550
like this Tektronix 2246 from the mid-90s,

27
00:01:31.550 --> 00:01:34.761
one of the last and very best analog scopes ever made.

28
00:01:34.761 --> 00:01:36.807
Every home lab should have one.

29
00:01:37.716 --> 00:01:40.852
And finally, we'll inspect the frequency spectrum of our signals

30
00:01:40.852 --> 00:01:43.177
using an analog spectrum analyzer.

31
00:01:43.177 --> 00:01:47.732
This HP3585 is from the same product line as the signal generator.

32
00:01:47.732 --> 00:01:50.615
Like the other equipment here it has a rudimentary

33
00:01:50.615 --> 00:01:52.905
and hilariously large microcontroller,

34
00:01:52.905 --> 00:01:56.276
but the signal path from input to what you see on the screen

35
00:01:56.276 --> 00:01:58.537
is completely analog.

36
00:01:58.537 --> 00:02:00.329
All of this equipment is vintage,

37
00:02:00.329 --> 00:02:01.993
but aside from its raw tonnage,

38
00:02:01.993 --> 00:02:03.844
the specs are still quite good.

39
00:02:04.536 --> 00:02:06.868
At the moment, we have our signal generator

40
00:02:06.868 --> 00:02:12.829
set to output a nice 1kHz sine wave at one volt RMS,

41
00:02:13.414 --> 00:02:15.220
we see the sine wave on the oscilloscope,

42
00:02:15.220 --> 00:02:21.428
can verify that it is indeed 1kHz at one volt RMS,

43
00:02:21.428 --> 00:02:24.108
which is 2.8V peak-to-peak,

44
00:02:24.308 --> 00:02:27.561
and that matches the measurement on the spectrum analyzer as well.
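
NOTE
Editor's sketch, not part of the narration: the 1 V RMS and 2.8 V peak-to-peak
figures quoted above are tied together by the usual identity for a pure sine
wave, Vpp = 2 * sqrt(2) * Vrms. The function name is made up for illustration.
```python
import math
def sine_rms_to_peak_to_peak(v_rms: float) -> float:
    """Peak-to-peak voltage of a pure sine wave with the given RMS voltage."""
    return 2.0 * math.sqrt(2.0) * v_rms
vpp = sine_rms_to_peak_to_peak(1.0)
print(f"{vpp:.3f} V peak-to-peak")
```
For 1 V RMS this prints a value that rounds to the 2.8 V read off the scope.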

45
00:02:27.561 --> 00:02:30.644
The analyzer also shows some low-level white noise

46
00:02:30.644 --> 00:02:32.190
and just a bit of harmonic distortion,

47
00:02:32.190 --> 00:02:36.649
with the highest peak about 70dB or so below the fundamental.

48
00:02:36.649 --> 00:02:38.612
Now, this doesn't matter at all in our demos,

49
00:02:38.612 --> 00:02:40.574
but I wanted to point it out now

50
00:02:40.574 --> 00:02:42.452
just in case you didn't notice it until later.

51
00:02:44.036 --> 00:02:47.142
Now, we drop digital sampling in the middle.

52
00:02:48.557 --> 00:02:51.024
For the conversion, we'll use a boring,

53
00:02:51.024 --> 00:02:53.374
consumer-grade, eMagic USB1 audio device.

54
00:02:53.374 --> 00:02:55.337
It's also more than ten years old at this point,

55
00:02:55.337 --> 00:02:57.257
and it's getting obsolete.

56
00:02:57.964 --> 00:03:02.676
A recent converter can easily have an order of magnitude better specs.

57
00:03:03.076 --> 00:03:07.924
Flatness, linearity, jitter, noise behavior, everything...

58
00:03:07.924 --> 00:03:09.353
you may not have noticed.

59
00:03:09.353 --> 00:03:11.604
Just because we can measure an improvement

60
00:03:11.604 --> 00:03:13.609
doesn't mean we can hear it,

61
00:03:13.609 --> 00:03:16.404
and even these old consumer boxes were already

62
00:03:16.404 --> 00:03:18.643
at the edge of ideal transparency.

63
00:03:20.244 --> 00:03:22.825
The eMagic connects to my ThinkPad,

64
00:03:22.825 --> 00:03:26.121
which displays a digital waveform and spectrum for comparison,

65
00:03:26.121 --> 00:03:28.788
then the ThinkPad sends the digital signal right back out

66
00:03:28.788 --> 00:03:30.921
to the eMagic for re-conversion to analog

67
00:03:30.921 --> 00:03:33.332
and observation on the output scopes.

68
00:03:33.332 --> 00:03:35.582
Input to output, left to right.

69
00:03:40.211 --> 00:03:41.214
OK, it's go time.

70
00:03:41.214 --> 00:03:43.924
We begin by converting an analog signal to digital

71
00:03:43.924 --> 00:03:47.347
and then right back to analog again with no other steps.

72
00:03:47.347 --> 00:03:49.268
The signal generator is set to produce

73
00:03:49.268 --> 00:03:52.649
a 1kHz sine wave just like before.

74
00:03:52.649 --> 00:03:57.428
We can see our analog sine wave on our input-side oscilloscope.

75
00:03:57.428 --> 00:04:01.694
We digitize our signal to 16 bit PCM at 44.1kHz,

76
00:04:01.694 --> 00:04:03.828
same as on a CD.

77
00:04:03.828 --> 00:04:07.156
The spectrum of the digitized signal matches what we saw earlier and...

78
00:04:07.156 --> 00:04:10.836
what we see now on the analog spectrum analyzer,

79
00:04:10.836 --> 00:04:15.154
aside from its high-impedance input being just a smidge noisier.

80
00:04:15.154 --> 00:04:15.956
For now

81
00:04:18.248 --> 00:04:20.798
the waveform display shows our digitized sine wave

82
00:04:20.798 --> 00:04:23.966
as a stairstep pattern, one step for each sample.

83
00:04:23.966 --> 00:04:26.388
And when we look at the output signal

84
00:04:26.388 --> 00:04:29.054
that's been converted from digital back to analog, we see...

85
00:04:29.054 --> 00:04:32.052
It's exactly like the original sine wave.

86
00:04:32.052 --> 00:04:33.483
No stairsteps.

87
00:04:33.914 --> 00:04:37.193
OK, 1kHz is still a fairly low frequency,

88
00:04:37.193 --> 00:04:40.633
maybe the stairsteps are just
hard to see or they're being smoothed away.

89
00:04:40.739 --> 00:04:49.492
Fair enough. Let's choose
a higher frequency, something close to Nyquist, say 15kHz.

90
00:04:49.492 --> 00:04:53.545
Now the sine wave is represented by fewer than three samples per cycle, and...

91
00:04:53.545 --> 00:04:55.838
the digital waveform looks pretty awful.

92
00:04:55.838 --> 00:04:59.798
Well, looks can be deceiving. The analog output...

93
00:05:01.876 --> 00:05:06.033
is still a perfect sine wave, exactly like the original.

94
00:05:06.633 --> 00:05:09.228
Let's keep going up.

95
00:05:17.353 --> 00:05:20.151
16kHz...

96
00:05:23.198 --> 00:05:25.616
17kHz...

97
00:05:28.201 --> 00:05:29.945
18kHz...

98
00:05:33.822 --> 00:05:35.548
19kHz...

99
00:05:40.457 --> 00:05:42.465
20kHz.

100
00:05:49.097 --> 00:05:52.350
Welcome to the upper limits of human hearing.

101
00:05:52.350 --> 00:05:54.377
The output waveform is still perfect.

102
00:05:54.377 --> 00:05:58.025
No jagged edges, no dropoff, no stairsteps.

103
00:05:58.025 --> 00:06:01.342
So where'd the stairsteps go?

104
00:06:01.342 --> 00:06:03.198
Don't answer, it's a trick question.

105
00:06:03.198 --> 00:06:04.318
They were never there.

106
00:06:04.318 --> 00:06:06.652
Drawing a digital waveform as a stairstep

107
00:06:08.712 --> 00:06:10.772
was wrong to begin with.

108
00:06:10.942 --> 00:06:11.998
Why?

109
00:06:11.998 --> 00:06:14.366
A stairstep is a continuous-time function.

110
00:06:14.366 --> 00:06:16.201
It's jagged, and it's piecewise,

111
00:06:16.201 --> 00:06:19.700
but it has a defined value at every point in time.

112
00:06:19.700 --> 00:06:22.004
A sampled signal is entirely different.

113
00:06:22.004 --> 00:06:23.337
It's discrete-time;

114
00:06:23.337 --> 00:06:27.337
it's only got a value right at each instantaneous sample point

115
00:06:27.337 --> 00:06:32.596
and it's undefined, there is no value at all, everywhere between.

116
00:06:32.596 --> 00:06:36.666
A discrete-time signal is properly drawn as a lollipop graph.

117
00:06:40.020 --> 00:06:42.974
The continuous, analog counterpart of a digital signal

118
00:06:42.974 --> 00:06:45.364
passes smoothly through each sample point,

119
00:06:45.364 --> 00:06:50.153
and that's just as true for high frequencies as it is for low.

120
00:06:50.153 --> 00:06:53.033
Now, the interesting and not at all obvious bit is:

121
00:06:53.033 --> 00:06:55.454
there's only one bandlimited signal that passes

122
00:06:55.454 --> 00:06:57.417
exactly through each sample point.

123
00:06:57.417 --> 00:06:58.708
It's a unique solution.

124
00:06:58.708 --> 00:07:01.246
So if you sample a bandlimited signal

125
00:07:01.246 --> 00:07:02.612
and then convert it back,

126
00:07:02.612 --> 00:07:06.462
the original input is also the only possible output.

127
00:07:06.462 --> 00:07:07.838
And before you say,

128
00:07:07.838 --> 00:07:11.721
"Oh, I can draw a different signal that passes through those points."

129
00:07:11.721 --> 00:07:14.283
Well, yes you can, but...

130
00:07:17.268 --> 00:07:20.521
if it differs even minutely from the original,

131
00:07:20.521 --> 00:07:24.905
it contains frequency content at or beyond Nyquist,

132
00:07:24.905 --> 00:07:26.185
breaks the bandlimiting requirement

133
00:07:26.185 --> 00:07:28.358
and isn't a valid solution.
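
NOTE
Editor's sketch, not part of the narration: one way to try the "only one
bandlimited signal fits the samples" claim numerically is Whittaker-Shannon
(sinc) interpolation. The code samples a 15 kHz sine at 44.1 kHz, rebuilds it at
instants between the samples, and compares the result with the original. The
sum is truncated to N samples per side (an assumption made here to keep it
finite), so the match is close rather than exact. All names are illustrative.
```python
import math
FS = 44100.0   # sample rate in Hz, as in the demo
F = 15000.0    # test tone in Hz, below the Nyquist frequency FS / 2
N = 2000       # samples kept on each side of time zero (assumption)
def sample(n: int) -> float:
    """Value of the sine at sample index n; this is all the digital side stores."""
    return math.sin(2.0 * math.pi * F * n / FS)
def sinc(x: float) -> float:
    """Normalized sinc, sin(pi x) / (pi x), with the value at zero filled in."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)
def reconstruct(t: float) -> float:
    """Truncated Whittaker-Shannon sum at time t, with t measured in sample periods."""
    return sum(sample(n) * sinc(t - n) for n in range(-N, N + 1))
# Compare the reconstruction with the original sine at instants between samples.
worst = 0.0
for t in (0.25, 0.5, 0.75, 3.1, 10.6):
    original = math.sin(2.0 * math.pi * F * t / FS)
    worst = max(worst, abs(reconstruct(t) - original))
print(f"worst error between samples: {worst:.2e}")
```
The remaining error comes from cutting the sum off; raising N should shrink it.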

134
00:07:28.574 --> 00:07:30.036
So how did everyone get confused

135
00:07:30.036 --> 00:07:32.702
and start thinking of digital signals as stairsteps?

136
00:07:32.702 --> 00:07:34.900
I can think of two good reasons.

137
00:07:34.900 --> 00:07:37.956
First: It's easy enough to convert a sampled signal

138
00:07:37.972 --> 00:07:39.294
to a true stairstep.

139
00:07:39.294 --> 00:07:42.409
Just extend each sample value forward until the next sample period.

140
00:07:42.409 --> 00:07:44.414
This is called a zero-order hold,

141
00:07:44.414 --> 00:07:47.913
and it's an important part of how some digital-to-analog converters work,

142
00:07:47.913 --> 00:07:50.089
especially the simplest ones.

143
00:07:50.089 --> 00:07:55.591
So, anyone who looks up digital-to-analog conversion

144
00:07:55.592 --> 00:07:59.550
is probably going to see a diagram of a stairstep waveform somewhere,

145
00:07:59.550 --> 00:08:01.982
but that's not a finished conversion,

146
00:08:01.982 --> 00:08:04.250
and it's not the signal that comes out.

147
00:08:04.944 --> 00:08:05.684
Second,

148
00:08:05.684 --> 00:08:07.529
and this is probably the more likely reason,

149
00:08:07.529 --> 00:08:09.449
engineers who supposedly know better,

150
00:08:09.449 --> 00:08:10.441
like me,

151
00:08:10.441 --> 00:08:13.193
draw stairsteps even though they're technically wrong.

152
00:08:13.193 --> 00:08:15.571
It's sort of like a one-dimensional version of

153
00:08:15.571 --> 00:08:17.395
fat bits in an image editor.

154
00:08:17.395 --> 00:08:19.241
Pixels aren't squares either,

155
00:08:19.241 --> 00:08:23.081
they're samples of a 2-dimensional function space and so they're also,

156
00:08:23.081 --> 00:08:26.366
conceptually, infinitely small points.

157
00:08:26.366 --> 00:08:28.500
Practically, it's a real pain in the ass to see

158
00:08:28.500 --> 00:08:30.804
or manipulate infinitely small anything.

159
00:08:30.804 --> 00:08:32.212
So big squares it is.

160
00:08:32.212 --> 00:08:35.966
Digital stairstep drawings are exactly the same thing.

161
00:08:35.966 --> 00:08:37.684
It's just a convenient drawing.

162
00:08:37.684 --> 00:08:40.404
The stairsteps aren't really there.

163
00:08:45.652 --> 00:08:48.233
When we convert a digital signal back to analog,

164
00:08:48.233 --> 00:08:50.900
the result is <u>also</u> smooth regardless of the bit depth.

165
00:08:50.900 --> 00:08:53.193
24 bits or 16 bits...

166
00:08:53.193 --> 00:08:54.196
or 8 bits...

167
00:08:54.196 --> 00:08:55.486
it doesn't matter.

168
00:08:55.486 --> 00:08:57.534
So does that mean that the digital bit depth

169
00:08:57.534 --> 00:08:58.953
makes no difference at all?

170
00:08:59.245 --> 00:09:00.521
Of course not.

171
00:09:02.121 --> 00:09:06.046
Channel 2 here is the same sine wave input,

172
00:09:06.046 --> 00:09:09.086
but we quantize with dither down to eight bits.

173
00:09:09.086 --> 00:09:14.174
On the scope, we still see a nice
smooth sine wave on channel 2.

174
00:09:14.174 --> 00:09:18.014
Look very close, and you'll also see a
bit more noise.

175
00:09:18.014 --> 00:09:19.305
That's a clue.

176
00:09:19.305 --> 00:09:21.273
If we look at the spectrum of the signal...

177
00:09:22.889 --> 00:09:23.732
aha!

178
00:09:23.732 --> 00:09:26.398
Our sine wave is still there unaffected,

179
00:09:26.398 --> 00:09:28.490
but the noise level of the eight-bit signal

180
00:09:28.490 --> 00:09:32.470
on the second channel is much higher!

181
00:09:32.948 --> 00:09:36.148
And that's the difference the number of bits makes.

182
00:09:36.148 --> 00:09:37.434
That's it!

183
00:09:37.822 --> 00:09:39.956
When we digitize a signal, first we sample it.

184
00:09:39.956 --> 00:09:42.366
The sampling step is perfect; it loses nothing.

185
00:09:42.366 --> 00:09:45.626
But then we quantize it,
and quantization adds noise.

186
00:09:47.827 --> 00:09:50.793
The number of bits determines how much noise

187
00:09:50.793 --> 00:09:52.569
and so the level of the
noise floor.
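
NOTE
Editor's sketch, not part of the narration: the link between bit depth and noise
floor can be put into numbers with the textbook figure for an ideal, undithered
quantizer driven by a full-scale sine, about 6.02 dB per bit plus 1.76 dB.
Dither raises the floor by a few dB, and real converters fall short of the
ideal, so treat these as upper bounds. The function name is illustrative.
```python
import math
def ideal_snr_db(bits: int) -> float:
    """SNR in dB of a full-scale sine against ideal uniform quantization noise."""
    # Sine power over noise power is 1.5 * 2**(2 * bits) for an ideal quantizer.
    return 10.0 * math.log10(1.5 * 2.0 ** (2 * bits))
for bits in (8, 16, 24):
    print(f"{bits:2d} bits: about {ideal_snr_db(bits):.1f} dB")
```
Each extra bit lowers the noise floor by roughly 6 dB relative to full scale.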

188
00:10:00.170 --> 00:10:03.646
What does this dithered quantization noise sound like?

189
00:10:03.646 --> 00:10:06.012
Let's listen to our eight-bit sine wave.

190
00:10:12.521 --> 00:10:15.273
It may have been hard to hear anything but the tone.

191
00:10:15.273 --> 00:10:18.740
Let's listen to just the noise after we notch out the sine wave

192
00:10:18.740 --> 00:10:21.683
and then bring the gain up a bit because the noise is quiet.

193
00:10:32.009 --> 00:10:35.049
Those of you who have used analog recording equipment

194
00:10:35.049 --> 00:10:36.670
may have just thought to yourselves,

195
00:10:36.670 --> 00:10:40.382
"My goodness! That sounds like tape hiss!"

196
00:10:40.382 --> 00:10:41.929
Well, it doesn't just sound like tape hiss,

197
00:10:41.929 --> 00:10:43.433
it acts like it too,

198
00:10:43.433 --> 00:10:45.225
and if we use a gaussian dither

199
00:10:45.225 --> 00:10:47.646
then it's mathematically equivalent in every way.

200
00:10:47.646 --> 00:10:49.225
It <u>is</u> tape hiss.

201
00:10:49.225 --> 00:10:51.774
Intuitively, that means that we can measure tape hiss

202
00:10:51.774 --> 00:10:54.196
and thus the noise floor of magnetic audio tape

203
00:10:54.196 --> 00:10:56.233
in bits instead of decibels,

204
00:10:56.233 --> 00:10:59.902
in order to put things in a digital perspective.

205
00:10:59.902 --> 00:11:03.028
Compact cassettes...

206
00:11:03.028 --> 00:11:05.449
for those of you who are old enough to remember them,

207
00:11:05.449 --> 00:11:09.161
could reach as
deep as nine bits in perfect conditions,

208
00:11:09.161 --> 00:11:11.209
though five to six bits was more typical,

209
00:11:11.209 --> 00:11:13.876
especially if it was a recording made on a tape deck.

210
00:11:13.876 --> 00:11:19.422
That's right... your mix tapes were only about six bits
deep... if you were lucky!

211
00:11:19.837 --> 00:11:22.345
The very best professional open reel tape

212
00:11:22.345 --> 00:11:24.553
used in studios could barely hit...

213
00:11:24.553 --> 00:11:26.473
any guesses?...

214
00:11:26.473 --> 00:11:27.604
13 bits

215
00:11:27.604 --> 00:11:28.980
<u>with</u> advanced noise reduction.

216
00:11:28.980 --> 00:11:32.062
And that's why seeing 'DDD' on a Compact Disc

217
00:11:32.062 --> 00:11:35.208
used to be such a big, high-end deal.

218
00:11:40.116 --> 00:11:42.825
I keep saying that I'm quantizing with dither,

219
00:11:42.825 --> 00:11:44.734
so what is dither exactly?

220
00:11:44.734 --> 00:11:47.284
More importantly, what does it do?

221
00:11:47.284 --> 00:11:49.876
The simple way to quantize a signal is to choose

222
00:11:49.876 --> 00:11:52.329
the digital amplitude value closest

223
00:11:52.329 --> 00:11:54.377
to the original analog amplitude.

224
00:11:54.377 --> 00:11:55.337
Obvious, right?

225
00:11:55.337 --> 00:11:57.545
Unfortunately, the exact noise you get

226
00:11:57.545 --> 00:11:59.220
from this simple quantization scheme

227
00:11:59.220 --> 00:12:02.174
depends somewhat on the input signal,

228
00:12:02.174 --> 00:12:04.596
so we may get noise that's inconsistent,

229
00:12:04.596 --> 00:12:06.142
or causes distortion,

230
00:12:06.142 --> 00:12:09.054
or is undesirable in some other way.

231
00:12:09.054 --> 00:12:11.764
Dither is specially-constructed noise that

232
00:12:11.764 --> 00:12:15.273
substitutes for the noise produced by simple quantization.

233
00:12:15.273 --> 00:12:18.025
Dither doesn't drown out or mask quantization noise,

234
00:12:18.025 --> 00:12:20.190
it actually replaces it

235
00:12:20.190 --> 00:12:22.612
with noise characteristics of our choosing

236
00:12:22.612 --> 00:12:24.794
that aren't influenced by the input.

237
00:12:25.256 --> 00:12:27.081
Let's <u>watch</u> what dither does.

238
00:12:27.081 --> 00:12:30.078
The signal generator has too much noise for this test

239
00:12:30.431 --> 00:12:33.161
so we'll produce a mathematically

240
00:12:33.161 --> 00:12:34.782
perfect sine wave with the ThinkPad

241
00:12:34.782 --> 00:12:38.205
and quantize it to eight bits with dithering.

242
00:12:39.006 --> 00:12:41.342
We see a nice sine wave on the waveform display

243
00:12:41.342 --> 00:12:43.452
and output scope

244
00:12:44.222 --> 00:12:44.972
and...

245
00:12:46.588 --> 00:12:49.375
once the analog spectrum analyzer catches up...

246
00:12:50.713 --> 00:12:53.588
a clean frequency peak with a uniform noise floor

247
00:12:56.864 --> 00:12:58.611
on both spectral displays

248
00:12:58.611 --> 00:12:59.646
just like before.

249
00:12:59.646 --> 00:13:01.549
Again, this is with dither.

250
00:13:02.196 --> 00:13:04.225
Now I turn dithering off.

251
00:13:05.779 --> 00:13:07.913
The quantization noise, which dither had spread out

252
00:13:07.913 --> 00:13:09.577
into a nice, flat noise floor,

253
00:13:09.577 --> 00:13:12.286
piles up into harmonic distortion peaks.

254
00:13:12.286 --> 00:13:16.030
The noise floor is lower, but the level of distortion becomes nonzero,

255
00:13:16.030 --> 00:13:19.668
and the distortion peaks sit higher than the dithering noise did.

256
00:13:19.668 --> 00:13:22.318
At eight bits this effect is exaggerated.

257
00:13:22.488 --> 00:13:24.200
At sixteen bits,

258
00:13:24.692 --> 00:13:25.929
even without dither,

259
00:13:25.929 --> 00:13:28.308
harmonic distortion is going to be so low

260
00:13:28.308 --> 00:13:30.708
as to be completely inaudible.

261
00:13:30.708 --> 00:13:34.581
Still, we can use dither to eliminate it completely

262
00:13:34.581 --> 00:13:36.489
if we so choose.

263
00:13:37.642 --> 00:13:39.273
Turning the dither off again for a moment,

264
00:13:40.934 --> 00:13:43.444
you'll notice that the absolute level of distortion

265
00:13:43.444 --> 00:13:47.070
from undithered quantization stays approximately constant

266
00:13:47.070 --> 00:13:49.033
regardless of the input amplitude.

267
00:13:49.033 --> 00:13:51.998
But when the signal level drops below half a bit,

268
00:13:51.998 --> 00:13:54.036
everything quantizes to zero.

269
00:13:54.036 --> 00:13:54.910
In a sense,

270
00:13:54.910 --> 00:13:58.557
everything quantizing to zero is just 100% distortion!

271
00:13:58.833 --> 00:14:01.588
Dither eliminates this distortion too.

272
00:14:01.588 --> 00:14:03.599
We reenable dither and...

273
00:14:03.599 --> 00:14:06.377
there's our signal back at 1/4 bit,

274
00:14:06.377 --> 00:14:09.076
with our nice flat noise floor.
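
NOTE
Editor's sketch, not part of the narration: the quarter-bit experiment can be
imitated in software. A sine with an amplitude of 0.25 quantizer steps rounds to
all zeros without dither; with dither added before rounding, the tone survives
and can be measured by correlating against the same sine. Triangular (TPDF)
dither spanning two steps is this sketch's choice, since the narration does not
say which dither the demo uses. The seed, lengths and names are assumptions.
```python
import math
import random
N = 100_000        # number of samples (assumption)
PERIOD = 100       # samples per cycle of the test sine (assumption)
AMPLITUDE = 0.25   # a quarter of one quantizer step, as in the demo
def quantize(x: float) -> int:
    """Round to the nearest integer step."""
    return math.floor(x + 0.5)
def measured_amplitude(samples: list) -> float:
    """Amplitude of the PERIOD-sample sine component, found by correlation."""
    acc = sum(y * math.sin(2.0 * math.pi * n / PERIOD) for n, y in enumerate(samples))
    return 2.0 * acc / len(samples)
rng = random.Random(1)  # fixed seed so the run is repeatable
signal = [AMPLITUDE * math.sin(2.0 * math.pi * n / PERIOD) for n in range(N)]
# Triangular (TPDF) dither: the sum of two uniform values, each spanning one step.
dither = [rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5) for _ in range(N)]
plain = [quantize(x) for x in signal]
dithered = [quantize(x + d) for x, d in zip(signal, dither)]
print("undithered amplitude:", measured_amplitude(plain))
print("dithered amplitude:  ", round(measured_amplitude(dithered), 3))
```
The undithered result is exactly zero, while the dithered one should land close
to 0.25, with the lost detail traded for a steady noise floor.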

275
00:14:09.630 --> 00:14:11.220
The noise floor doesn't have to be flat.

276
00:14:11.220 --> 00:14:12.798
Dither is noise of our choosing,

277
00:14:12.798 --> 00:14:15.006
so let's choose a noise as inoffensive

278
00:14:15.006 --> 00:14:17.017
and difficult to notice as possible.

279
00:14:18.142 --> 00:14:22.484
Our hearing is most sensitive in the midrange from 2kHz to 4kHz,

280
00:14:22.484 --> 00:14:25.438
so that's where background noise is going to be the most obvious.

281
00:14:25.438 --> 00:14:29.406
We can shape dithering noise away from sensitive frequencies

282
00:14:29.406 --> 00:14:31.241
to where hearing is less sensitive,

283
00:14:31.241 --> 00:14:33.910
usually the highest frequencies.

284
00:14:34.249 --> 00:14:37.460
16-bit dithering noise is normally much too quiet to hear at all,

285
00:14:37.460 --> 00:14:39.668
but let's listen to our noise shaping example,

286
00:14:39.668 --> 00:14:42.234
again with the gain brought way up...

287
00:14:56.020 --> 00:14:59.977
Lastly, dithered quantization noise <u>is</u> higher power overall

288
00:14:59.977 --> 00:15:04.276
than undithered quantization noise even when it sounds quieter.

289
00:15:04.276 --> 00:15:07.902
You can see that on a VU meter during passages of near-silence.

290
00:15:07.902 --> 00:15:10.537
But dither isn't only an on or off choice.

291
00:15:10.537 --> 00:15:14.712
We can reduce the dither's power to balance less noise against

292
00:15:14.712 --> 00:15:18.313
a bit of distortion to minimize the overall effect.

293
00:15:19.605 --> 00:15:22.790
We'll also modulate the input signal like this:

294
00:15:27.098 --> 00:15:30.206
...to show how a varying input affects the quantization noise.

295
00:15:30.206 --> 00:15:33.289
At full dithering power, the noise is uniform, constant,

296
00:15:33.289 --> 00:15:35.643
and featureless just like we expect:

297
00:15:40.937 --> 00:15:42.772
As we reduce the dither's power,

298
00:15:42.772 --> 00:15:46.356
the input increasingly affects the amplitude and the character

299
00:15:46.356 --> 00:15:47.977
of the quantization noise:

300
00:16:09.883 --> 00:16:13.844
Shaped dither behaves similarly,

301
00:16:13.844 --> 00:16:16.553
but noise shaping lends one more nice advantage.

302
00:16:16.553 --> 00:16:18.804
To make a long story short, it can use

303
00:16:18.804 --> 00:16:20.937
a somewhat lower dither power before the input

304
00:16:20.937 --> 00:16:23.662
has as much effect on the output.

305
00:16:49.172 --> 00:16:51.508
Despite all the time I just spent on dither,

306
00:16:51.508 --> 00:16:53.012
we're talking about differences

307
00:16:53.012 --> 00:16:56.372
that start 100 decibels below full scale.

308
00:16:56.372 --> 00:16:59.806
Maybe if the CD had been 14 bits as originally designed,

309
00:16:59.806 --> 00:17:01.513
dither <u>might</u> be more important.

310
00:17:01.989 --> 00:17:02.644
Maybe.

311
00:17:02.644 --> 00:17:05.438
At 16 bits, really, it's mostly a wash.

312
00:17:05.438 --> 00:17:08.019
You can think of dither as an insurance policy

313
00:17:08.019 --> 00:17:11.443
that gives several extra decibels of dynamic range

314
00:17:11.443 --> 00:17:12.804
just in case.

315
00:17:12.990 --> 00:17:14.196
The simple fact is, though,

316
00:17:14.196 --> 00:17:16.361
no one ever ruined a great recording

317
00:17:16.361 --> 00:17:19.182
by not dithering the final master.

318
00:17:24.414 --> 00:17:25.790
We've been using sine waves.

319
00:17:25.790 --> 00:17:28.254
They're the obvious choice when what we want to see

320
00:17:28.254 --> 00:17:32.212
is a system's behavior at a given isolated frequency.

321
00:17:32.212 --> 00:17:34.217
Now let's look at something a bit more complex.

322
00:17:34.217 --> 00:17:35.923
What should we expect to happen

323
00:17:35.923 --> 00:17:39.671
when I change the input to a square wave...

324
00:17:42.718 --> 00:17:45.921
The input scope confirms our 1kHz square wave.

325
00:17:45.921 --> 00:17:47.351
The output scope shows...

326
00:17:48.614 --> 00:17:51.102
Exactly what it should.

327
00:17:51.102 --> 00:17:53.900
What is a square wave really?

328
00:17:54.654 --> 00:17:57.982
Well, we can say it's a waveform that's some positive value

329
00:17:57.982 --> 00:18:00.788
for half a cycle and then transitions instantaneously

330
00:18:00.788 --> 00:18:02.910
to a negative value for the other half.

331
00:18:02.910 --> 00:18:05.076
But that doesn't really tell us anything useful

332
00:18:05.076 --> 00:18:07.241
about how this input

333
00:18:07.241 --> 00:18:09.378
becomes this output.

334
00:18:10.132 --> 00:18:12.713
Then we remember that any waveform

335
00:18:12.713 --> 00:18:15.508
is also the sum of discrete frequencies,

336
00:18:15.508 --> 00:18:18.302
and a square wave is a particularly simple sum:

337
00:18:18.302 --> 00:18:19.636
a fundamental and

338
00:18:19.636 --> 00:18:22.228
an infinite series of odd harmonics.

339
00:18:22.228 --> 00:18:24.597
Sum them all up, you get a square wave.

340
00:18:26.398 --> 00:18:27.433
At first glance,

341
00:18:27.433 --> 00:18:29.225
that doesn't seem very useful either.

342
00:18:29.225 --> 00:18:31.561
You have to sum up an infinite number of harmonics

343
00:18:31.561 --> 00:18:33.108
to get the answer.

344
00:18:33.108 --> 00:18:35.977
Ah, but we don't have an infinite number of harmonics.

345
00:18:36.960 --> 00:18:39.902
We're using a quite sharp anti-aliasing filter

346
00:18:39.902 --> 00:18:42.206
that cuts off right above 20kHz,

347
00:18:42.206 --> 00:18:44.158
so our signal is band-limited,

348
00:18:44.158 --> 00:18:46.421
which means we get this:

349
00:18:52.500 --> 00:18:56.468
...and that's exactly what we see on the output scope.

350
00:18:56.468 --> 00:18:59.550
The rippling you see around sharp edges in a bandlimited signal

351
00:18:59.550 --> 00:19:00.926
is called the Gibbs effect.

352
00:19:00.926 --> 00:19:04.137
It happens whenever you slice off part of the frequency domain

353
00:19:04.137 --> 00:19:07.006
in the middle of nonzero energy.
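
NOTE
Editor's sketch, not part of the narration: the bandlimited square wave can be
built directly from its Fourier series, a fundamental plus odd harmonics with
amplitudes falling as 1/k, stopping at the cutoff. An ideal brick-wall cutoff at
20 kHz stands in for the real anti-aliasing filter here (an assumption), which
keeps harmonics 1 through 19 of a 1 kHz square wave. Names are illustrative.
```python
import math
F0 = 1000.0       # square wave fundamental in Hz, as in the demo
CUTOFF = 20000.0  # idealized brick-wall cutoff in Hz (assumption)
def bandlimited_square(t: float) -> float:
    """Fourier series of a +/-1 square wave, keeping odd harmonics up to CUTOFF."""
    total = 0.0
    k = 1
    while k * F0 <= CUTOFF:
        total += math.sin(2.0 * math.pi * k * F0 * t) / k
        k += 2
    return 4.0 / math.pi * total
# Scan one cycle on a fine grid and record the highest point of the waveform.
STEPS = 20000
peak = max(bandlimited_square(i / (STEPS * F0)) for i in range(STEPS))
print(f"peak of the bandlimited +/-1 square wave: {peak:.3f}")
```
The peak should come out noticeably above 1: that overshoot is the ripple seen
next to each edge, present even though no filter was applied twice.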

354
00:19:07.006 --> 00:19:09.854
The usual rule of thumb you'll hear is the sharper the cutoff,

355
00:19:09.854 --> 00:19:11.188
the stronger the rippling,

356
00:19:11.188 --> 00:19:12.777
which is approximately true,

357
00:19:12.777 --> 00:19:14.900
but we have to be careful how we think about it.

358
00:19:14.900 --> 00:19:15.774
For example...

359
00:19:15.774 --> 00:19:19.529
what would you expect our quite sharp anti-aliasing filter

360
00:19:19.529 --> 00:19:23.181
to do if I run our signal through it a second time?

361
00:19:34.136 --> 00:19:37.588
Aside from adding a few fractional cycles of delay,

362
00:19:37.588 --> 00:19:39.348
the answer is...

363
00:19:39.348 --> 00:19:40.857
nothing at all.

364
00:19:41.257 --> 00:19:43.302
The signal is already bandlimited.

365
00:19:43.656 --> 00:19:46.590
Bandlimiting it again doesn't do anything.

366
00:19:46.590 --> 00:19:50.686
A second pass can't remove frequencies that we already removed.

367
00:19:52.070 --> 00:19:53.737
And that's important.

368
00:19:53.737 --> 00:19:56.233
People tend to think of the ripples as a kind of artifact

369
00:19:56.233 --> 00:19:59.945
that's added by anti-aliasing and anti-imaging filters,

370
00:19:59.945 --> 00:20:01.737
implying that the ripples get worse

371
00:20:01.737 --> 00:20:03.913
each time the signal passes through.

372
00:20:03.913 --> 00:20:05.950
We can see that in this case that didn't happen.

373
00:20:05.950 --> 00:20:09.492
So was it really the filter that added the ripples the first time through?

374
00:20:09.492 --> 00:20:10.537
No, not really.

375
00:20:10.537 --> 00:20:12.126
It's a subtle distinction,

376
00:20:12.126 --> 00:20:15.252
but Gibbs effect ripples aren't added by filters,

377
00:20:15.252 --> 00:20:18.836
they're just part of what a bandlimited signal <u>is</u>.

378
00:20:18.836 --> 00:20:20.798
Even if we synthetically construct

379
00:20:20.798 --> 00:20:23.508
what looks like a perfect digital square wave,

380
00:20:23.508 --> 00:20:26.206
it's still limited to the channel bandwidth.

381
00:20:26.206 --> 00:20:29.140
Remember the stairstep representation is misleading.

382
00:20:29.140 --> 00:20:32.222
What we really have here are instantaneous sample points,

383
00:20:32.222 --> 00:20:36.148
and only one bandlimited signal fits those points.

384
00:20:36.148 --> 00:20:39.614
All we did when we drew our apparently perfect square wave

385
00:20:39.614 --> 00:20:43.198
was line up the sample points just right so it appeared

386
00:20:43.198 --> 00:20:47.785
that there were no ripples if we played connect-the-dots.

387
00:20:47.785 --> 00:20:49.449
But the original bandlimited signal,

388
00:20:49.449 --> 00:20:52.742
complete with ripples, was still there.

389
00:20:54.004 --> 00:20:56.542
And that leads us to one more important point.

390
00:20:56.542 --> 00:20:59.550
You've probably heard that the timing precision of a digital signal

391
00:20:59.550 --> 00:21:02.409
is limited by its sample rate; put another way,

392
00:21:02.409 --> 00:21:05.140
that digital signals can't represent anything

393
00:21:05.140 --> 00:21:08.041
that falls between the samples...

394
00:21:08.041 --> 00:21:11.422
implying that impulses or fast attacks have to align

395
00:21:11.422 --> 00:21:14.473
exactly with a sample, or the timing gets mangled...

396
00:21:14.473 --> 00:21:16.219
or they just disappear.

397
00:21:16.711 --> 00:21:20.820
At this point, we can easily see why that's wrong.

398
00:21:20.820 --> 00:21:23.742
Again, our input signals are bandlimited.

399
00:21:23.742 --> 00:21:26.036
And digital signals are samples,

400
00:21:26.036 --> 00:21:29.340
not stairsteps, not 'connect-the-dots'.

401
00:21:31.572 --> 00:21:34.592
We most certainly can, for example,

402
00:21:36.777 --> 00:21:39.337
put the rising edge of our bandlimited square wave

403
00:21:39.337 --> 00:21:42.004
anywhere we want between samples.

404
00:21:42.004 --> 00:21:44.354
It's represented perfectly

405
00:21:47.508 --> 00:21:50.218
and it's reconstructed perfectly.

406
00:22:04.620 --> 00:22:06.526
Just like in the previous episode,

407
00:22:06.526 --> 00:22:08.393
we've covered a broad range of topics,

408
00:22:08.393 --> 00:22:10.868
and yet barely scratched the surface of each one.

409
00:22:10.868 --> 00:22:13.620
If anything, my sins of omission are greater this time around...

410
00:22:13.620 --> 00:22:16.286
but this is a good stopping point.

411
00:22:16.286 --> 00:22:17.833
Or maybe, a good starting point.

412
00:22:17.833 --> 00:22:18.708
Dig deeper.

413
00:22:18.708 --> 00:22:19.710
Experiment.

414
00:22:19.710 --> 00:22:21.374
I chose my demos very carefully

415
00:22:21.374 --> 00:22:23.668
to be simple and give clear results.

416
00:22:23.668 --> 00:22:26.217
You can reproduce every one of them on your own if you like.

417
00:22:26.217 --> 00:22:28.766
But let's face it, sometimes we learn the most

418
00:22:28.766 --> 00:22:30.516
about a spiffy toy by breaking it open

419
00:22:30.516 --> 00:22:32.553
and studying all the pieces that fall out.

420
00:22:32.553 --> 00:22:35.230
That's OK, we're engineers.

421
00:22:35.230 --> 00:22:36.350
Play with the demo parameters,

422
00:22:36.350 --> 00:22:37.972
hack up the code,

423
00:22:37.972 --> 00:22:39.774
set up alternate experiments.

424
00:22:39.774 --> 00:22:40.692
The source code for everything,

425
00:22:40.692 --> 00:22:42.398
including the little pushbutton demo application,

426
00:22:42.398 --> 00:22:44.361
is up at Xiph.Org.

427
00:22:44.361 --> 00:22:45.940
In the course of experimentation,

428
00:22:45.940 --> 00:22:47.401
you're likely to run into something

429
00:22:47.401 --> 00:22:49.950
that you didn't expect and can't explain.

430
00:22:49.950 --> 00:22:51.198
Don't worry!

431
00:22:51.198 --> 00:22:54.537
My earlier snark aside, Wikipedia is fantastic for

432
00:22:54.537 --> 00:22:56.788
exactly this kind of casual research.

433
00:22:56.788 --> 00:22:59.956
If you're really serious about understanding signals,

434
00:22:59.956 --> 00:23:03.337
several universities have advanced materials online,

435
00:23:03.337 --> 00:23:07.380
such as the 6.003 and 6.007 Signals and Systems modules

436
00:23:07.380 --> 00:23:08.798
at MIT OpenCourseWare.

437
00:23:08.798 --> 00:23:11.593
And of course, there's always the community here at Xiph.Org.

438
00:23:12.792 --> 00:23:13.929
Digging deeper or not,

439
00:23:13.929 --> 00:23:14.974
I am out of coffee,

440
00:23:14.974 --> 00:23:16.436
so, until next time,

441
00:23:16.436 --> 00:23:19.316
happy hacking!