Saturday, July 07, 2007

I spent nearly the entire week dithering...

If you'll pardon the horrific pun...

Obviously a mastering limiter has dithering. Right?
So, naturally we wanted to provide the best dithering in the world, because that's just what we do.

This is really interesting, go do this:

As many people have pointed out, the psychoacoustics behind dither rely on your hearing perception spectrum changing with volume, so unless you try it nice and quiet, you won't learn much from it.

That said, I tried it out at a higher volume... and I gotta say, things didn't change TOO much! The order of "goodness" remained -roughly- the same for me; the best stayed the best, at least... and that's MegaBitMax. The Waves IDR (or POW-R, for that is what it appears to be) comes a good second... but MBM definitely has it.

I stumbled onto (literally found by accident) the coefficients for the POW-R3 dither (yeah, they're on the net in some source code! Weird!).

I spent the start of the week figuring out how dither works. Today, the maths all made sense, arranged itself in my brain, and fell into place. I now have all the dither you want... :)

So, first up, I ought to start with the background... why do we dither things, how does it work, why, etc.

Well, we're going to reduce bit-depth to fit it onto a CD, and whatever we do to get rid of those excess bits is going to generate some form of distortion. That distortion sounds bad. We want to get rid of it.

Step1: Add some noise. If the noise is louder than the distortion (which is actually quite quiet), you won't hear the distortion.

Good one. A little noise, and the distortion is gone.
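
To make Step1 a bit more concrete, here's a minimal Python sketch. It's only an illustration, not what any of the plugins mentioned above actually do: it assumes triangular (TPDF) noise peaking at one 16-bit step, which is a common textbook choice, and the function names are made up for the example. The point it shows is that a signal smaller than half a step vanishes completely if you just round it, but survives (on average) if you add the noise first.

```python
import random

def quantize(x, bits=16):
    """Round a sample in [-1.0, 1.0) to the nearest step of a `bits`-bit grid."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale

def quantize_with_dither(x, bits=16, rng=random):
    """Add triangular noise (assumed: +/- 1 step peak), then round."""
    lsb = 1.0 / 2 ** (bits - 1)
    # The difference of two uniform values has a triangular distribution.
    noise = (rng.random() - rng.random()) * lsb
    return quantize(x + noise, bits)

if __name__ == "__main__":
    rng = random.Random(1)
    lsb = 1.0 / 2 ** 15
    quiet = 0.4 * lsb  # a level below half a 16-bit step
    plain = quantize(quiet)
    dithered = [quantize_with_dither(quiet, rng=rng) for _ in range(20000)]
    print("plain rounding, in steps:", plain / lsb)
    print("dithered average, in steps:", sum(dithered) / len(dithered) / lsb)
```

Plain rounding throws the quiet level away entirely, while the dithered version keeps it as the average of a noisy output. That's the trade: the error stops following the signal and becomes a steady hiss instead.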

Now we have this extra noise that we don't want... so..

Step2: Maybe, with some feedback, we can filter the noise and make it less audible.

Bingo! Now we've hidden the distortion under some noise... and then we hid the noise! ACE! :D
Does it get any better than that?? This is the magical thing about noise... you can bend it - so we fold it out of the way to places we can't hear!
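
Here's a rough sketch of what "with some feedback" can look like in Python. Big assumptions: it uses the simplest possible feedback filter (one coefficient, giving a plain first-order high-pass on the noise), not the POW-R3 or MegaBitMax curves, and the function and parameter names are invented for the example. Each sample's total error (dither plus rounding) is remembered and subtracted from the next sample before it gets rounded.

```python
import random

def noise_shaped_quantize(samples, bits=16, coeffs=(1.0,), seed=0):
    """Quantize with TPDF dither and error feedback.

    `coeffs` are the feedback filter taps (hypothetical values). With the
    default (1.0,), the output error is e[n] - e[n-1], i.e. it is pushed
    away from low frequencies and towards high ones.
    """
    rng = random.Random(seed)
    scale = 2 ** (bits - 1)
    history = [0.0] * len(coeffs)  # past errors in steps, newest first
    out = []
    for x in samples:
        # Subtract the filtered past error before quantizing.
        target = x * scale - sum(c * e for c, e in zip(coeffs, history))
        dither = rng.random() - rng.random()  # triangular, +/- 1 step peak
        q = round(target + dither)
        error = q - target  # everything we added: dither plus rounding
        history = [error] + history[:-1]
        out.append(q / scale)
    return out

if __name__ == "__main__":
    shaped = noise_shaped_quantize([0.0] * 10000)
    # For silence in, the output is pure shaped noise; its running sum
    # stays tiny because the low-frequency part has been moved elsewhere.
    print("sum of shaped noise, in steps:", sum(shaped) * 2 ** 15)
```

Longer `coeffs` give you a more detailed curve, and that's where the interesting work is: the loop stays the same and only the filter changes.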

"How is this all psycho-acoustic?"

Easy! We want to move the noise to where you can't hear it. Psychoacoustics tells me where your hearing isn't very sensitive... so... I EQ the noise away from where you CAN hear, and towards where you CAN'T hear... there are curves for this sort of thing, and all you need to do is invert that curve and you're home and dry... maybe...


Well, actually no! You're not! If that WAS completely true, POW-R3 would sound better than MegaBitMax. But it doesn't, does it? How strange! So... there's a little bit more going on than psychoacoustics wants us to know about... but hey, we can experiment and get it right.

"The way you're describing all this, it makes it sound like this is just about some EQ settings"
Well... that's the magic, it really is. Go find Alexey Lukin's page on dither, grab the pdf

see... curves! Magic.

"Is this really the only way to do it?"
No, but it is the best way.
We have a number of other options that we could explore, which I shall detail here:

1) We could not add dither noise.
Sure, and then the distortion comes back - and getting rid of that is the only reason we're doing this.

2) We could just shape the noise but not the quantize error
Sure, and then you'll have noise which isn't masking the error in the most sensitive part of your hearing.

3) We could just shape the quantize error but not the noise
Sure, but if you just want to add loud noise, why not record onto an old tape?

4) We could change the EQ to follow the shape of the music, and hide the noise in places that are loud so it's masked!
Good idea, but there are some real problems here that you must pay attention to. First and foremost, we add noise because it is specifically uncorrelated with the signal - if it WAS correlated, it would be harmonic distortion (that's the definition). Now, if you're planning to change the filter coefficients depending on the signal, then not only is that a time-variant filter (where you'll either be changing the curve too slowly to be useful, or too quickly and making bad noises - it's a fine line and not a fun game), but you would also be adding in a correlated change - so you'd actually be adding in more distortion. It's possible that a strategy based on this might one day find the balance with the time-variant filter and allow you to blend between distortion and noise, but it's a very complicated task, and you really wouldn't be gaining anything. (Although it would make for a very interesting paper.)

So, with these decisions made and justified, my task was reduced to finding a way to generate the perfect dither response. The literature hints at it, but the answer is obviously an optimised Levinson-Durbin recursion, since the problem reduces to solving a set of simultaneous equations with a symmetric Toeplitz matrix. I can do it in realtime now :)
I've been A/Bing with the POW-R3 and MegaBitMax, and what I have here sounds at least as good; I've reduced the problem down to calibrating one number, which is mad really. I think the final answer will be about 2.2. ;)
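
For anyone who hasn't met it, here's a minimal sketch of the plain textbook Levinson-Durbin recursion in Python. This is NOT the optimised realtime version mentioned above, and it says nothing about where the input numbers come from or about the one calibration number; the function name and the example values are made up. It just shows how a symmetric Toeplitz system (built from a sequence r[0], r[1], ...) gets solved one order at a time without ever forming the matrix.

```python
def levinson_durbin(r, order):
    """Solve sum_k a[k] * r[|i - k|] = -r[i] for i = 1..order.

    `r` is the first row of the symmetric Toeplitz matrix plus one extra
    value (order + 1 numbers in total). Returns (a, err), where a[0] == 1
    and err is the remaining prediction error.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # How badly the current solution fits the next equation.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

if __name__ == "__main__":
    # Example values only: r[k] = 0.5 ** k.
    coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], 2)
    print(coeffs, residual)
```

Each pass reuses the previous order's answer, so the cost grows with the square of the filter order rather than the cube you'd pay for a general solver. That's what makes this kind of thing plausible in realtime.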

I'll get some samples up sometime :)

[I will concede that finding the one number that tweaks it just right was a stroke of luck... I've seen no reference to it in the literature.]