I’m planning to encode some of my Blu-ray discs to AV1 with maximum quality in mind. After thinking I had a good set of settings nailed down, I became sensitive to the topic of banding and found that my encodes were suffering from it quite badly in certain frames.
I also found the biggest magnet for banding in an animated show: the very first episode of “The Eminence in Shadow” shows a purple blanket that has crazy banding even at 10-bit with high bit rates.
Here’s aom-av1-lavish, the “opmox mainline merge” branch as of November 14th, 2023, with --arnr-strength=0 --enable-dnl-denoising=0 --denoise-noise-level=1:
After seeing that another (x265) encode did it much better and even SVT-AV1 with mostly default settings performed well (see further down), I changed to --arnr-strength=1 --enable-dnl-denoising=0 --denoise-noise-level=6, and what a difference:
Finally, this is the result of SVT-AV1-psy as of January 22nd, 2024. The settings are --film-grain 6 --film-grain-denoise 0:
So how does one estimate a video’s noise / grain level? Do I just develop a feel for which setting corresponds to what look? That would likely involve quite a few failed encodes, however.
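For lack of a published reference, one crude way to at least compare sources is to measure how much a light denoiser strips off a representative frame. A rough sketch, assuming numpy, scipy and Pillow are installed and a frame has been exported to frame.png (the resulting number is only comparative; it is not on the scale that --denoise-noise-level uses):

    import numpy as np
    from PIL import Image
    from scipy.ndimage import gaussian_filter

    # Luma only; a single representative frame exported via a player screenshot
    # or ffmpeg will do.
    frame = np.asarray(Image.open("frame.png").convert("L"), dtype=np.float64)

    denoised = gaussian_filter(frame, sigma=1.0)   # stand-in for a light denoise
    residual = frame - denoised                    # what the "denoiser" stripped off

    # Standard deviation of the residual, in 8-bit code values: a crude,
    # comparative grain-strength number -- NOT aomenc's --denoise-noise-level scale.
    print(f"residual std dev: {residual.std():.2f}")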
If AV1 noise synthesis “removes” banding, that banding was never part of the video in the first place, but your video player system created it during bit depth reduction, since you’re viewing on an 8-bit display. This can be prevented with dithering, which AV1 noise synthesis can substitute for.
Yes, that’s obvious.
The dithering pattern is random in each frame, so it is hardly possible to distinguish dithered gradients from noise or film grain baked into the Blu-ray source.
For the encoder, randomly dithered gradients and film grain are just noise. Both aomenc and SVT-AV1 can remove this noise (thus causing banding) for better compression, but also record information about the noise to allow for statistically identical noise to be composited back on top of each frame during playback, hiding the bands again.
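A toy numpy illustration of that masking effect (not anything either encoder actually does internally): rounding a slow ramp to 8-bit code values produces long, flat bands, while a little noise before or after the rounding breaks them up:

    import numpy as np

    rng = np.random.default_rng(0)
    width = 1024
    # A slow ramp spanning only 16 8-bit code values across 1024 pixels --
    # the kind of dark, gentle gradient where banding is most visible.
    gradient = np.linspace(100.0, 116.0, width)

    banded   = np.round(gradient)                              # plain 8-bit rounding
    dithered = np.round(gradient + rng.normal(0, 0.5, width))  # noise before rounding
    grained  = banded + rng.normal(0, 0.5, width)              # noise put back on top

    def longest_flat_run(row):
        # Length of the longest run of identical (rounded) values: a crude "band width".
        changes = np.flatnonzero(np.diff(np.round(row)) != 0)
        edges = np.concatenate(([0], changes + 1, [len(row)]))
        return int(np.diff(edges).max())

    print("banded  :", longest_flat_run(banded), "px")    # long, flat steps
    print("dithered:", longest_flat_run(dithered), "px")  # steps broken up
    print("grained :", longest_flat_run(grained), "px")   # steps masked by the noise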
My issue here is simply that there is no reference for what noise that requires --denoise-noise-level=1 looks like, or how I should recognize noise that requires --denoise-noise-level=6, and so on. If my anime screenshot is level 6 already, then is “Alien (1979)” level 12? Level 18? Higher even?
Well, if you encode at high bit depth, the removal of the noise won’t create visible banding (at most barely visible at 10 bpc, completely invisible at 12 bpc), which was my point. But the generated noise can still prevent banding during playback in case the player doesn’t dither (which most don’t by default).
Denoise-noise-level sets a denoise strength: any value > 0 means every frame is denoised with that strength. The denoised frame is then compared with the unaltered frame in some (sadly very unsuitable) way, noise is generated based on that calculated difference, and that noise is applied to the frame after it is encoded. Because the implementation is so shitty, the visual energy removed during denoising and the visual energy added with noise synthesis can diverge drastically.
So, no matter what denoise-noise-level you choose, the result will be far from optimal. And stronger levels won’t just create unnecessary noise, but also create ugly grain patterns, which can become quite obvious beyond denoise-noise-level 10 or so.
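To make that concrete, here is a loose sketch of the idea with a plain Gaussian blur and white noise standing in for the real machinery (this is not libaom’s actual code): the frame is denoised, the residual is treated as “the noise”, and that much noise is synthesized back later. As soon as the denoiser starts eating real detail, the synthesized grain becomes heavier than anything the source ever had:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    rng = np.random.default_rng(0)
    x = np.arange(2048)
    detail = 4.0 * np.sin(x / 3.0)            # legitimate fine detail (texture)
    grain  = rng.normal(0.0, 1.0, x.size)     # the actual grain, std = 1.0
    source = 128.0 + detail + grain

    for sigma in (0.5, 2.0, 8.0):             # stand-in for increasing denoise strength
        denoised = gaussian_filter(source, sigma=sigma)
        residual = source - denoised          # what the comparison step calls "noise"
        # This measured std is what gets resynthesized on top of the decoded frame.
        # Once the denoiser starts eating real detail, it overshoots badly.
        print(f"denoise sigma {sigma:>3}: synthesized grain std {residual.std():.2f} "
              f"(actual grain std 1.00)")

A gentle setting undershoots the real grain, an aggressive one overshoots it badly, which is exactly why things get obvious beyond level 10 or so.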