Port the lmmse gamma step to the other demosaicers

linear gamma purist

LOL!!!

A log, a power function, or any other convex transfer function applied to the three channels individually has the known property of desaturating the picture at non-constant hue. So you don’t need a PhD to get a sense that desaturating is going to hide chromatic aberrations.
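To make the desaturation claim concrete, here is a minimal sketch. The exponent 1/2.2, the sample colour and the use of HSV saturation as a (crude) measure are all assumptions chosen for illustration; nothing here comes from the LMMSE code.

```python
import colorsys

def encode(rgb, exponent=1 / 2.2):
    """Apply the same power function to each channel independently."""
    return tuple(c ** exponent for c in rgb)

linear = (0.8, 0.4, 0.1)   # arbitrary saturated orange, linear light
encoded = encode(linear)

# HSV is only a rough proxy for perceived hue/saturation, but it is
# enough to see both effects of a channel-wise transfer function.
h_lin, s_lin, _ = colorsys.rgb_to_hsv(*linear)
h_enc, s_enc, _ = colorsys.rgb_to_hsv(*encoded)

print(f"saturation: {s_lin:.3f} -> {s_enc:.3f}")
print(f"hue (deg):  {h_lin * 360:.1f} -> {h_enc * 360:.1f}")
```

The channel ratios get compressed, so the saturation drops, and because the three channels are compressed by different amounts the hue angle moves as well.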

Now, if it has no link to either physics or physiology/psychology, it is shit. Period. Light knows no “gamma”. We are interpolating light. Ergo, gamma has no business there. How complicated is that?

I don’t need to test it to know it will mess up the variance upon which some demosaicing methods rely because I have read the papers before pissing shitty code.

Bonus question : how do you choose the exponent ???

Yes, light has no gamma, but every kind of interpolation has artifacts.

…

Right. Some are analysed, some are not.

So tackle the artifacts at their source by identifying what causes them, rather than using a nasty encoding trick to sweep the dust under the carpet and pretend it’s clean now.

You really don’t understand how science works, do you? Hint: it’s not randomly hacking code until you find an ad-hoc local solution to a local problem and pretend it solves the general problem.

s/gamma/variance stabilization transform/g and then it perhaps makes more sense:

Indeed, but note that in the context of denoising we try to do an unbiased back-transform afterwards. That means that, to remove the sqrt or power, we take into account the fact that we averaged non-linear values instead of linear ones; the inverse is not just the algebraic inverse.
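A small sketch of why the algebraic inverse is biased, assuming pure Poisson noise and the classic Anscombe transform 2·sqrt(x + 3/8). The "denoiser" is idealised as the exact expectation in the transformed domain; the function names are made up for this example, and this is not the optimal inversion from the literature, only a demonstration of the bias it corrects.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson variable, computed in the log domain."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def expected_anscombe(mean):
    """Exact E[2*sqrt(X + 3/8)], i.e. what perfect averaging in the
    transformed domain would converge to."""
    upper = int(mean + 20 * math.sqrt(mean) + 50)   # generous truncation
    return sum(poisson_pmf(k, mean) * 2.0 * math.sqrt(k + 0.375)
               for k in range(upper))

def algebraic_inverse(d):
    """Plain inverse of the Anscombe transform."""
    return (d / 2.0) ** 2 - 0.375

true_mean = 20.0
recovered = algebraic_inverse(expected_anscombe(true_mean))
print(f"true mean {true_mean}, algebraic inverse gives {recovered:.3f}")
```

The recovered value comes out below the true mean, because the mean of the square roots is smaller than the square root of the mean. An unbiased inverse has to add that difference back.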

That verbal escalation is not helping anyone!

Maybe it’s helpful to point out that by calculating the gradients differently one modifies the inter-channel correlation. While it would be nicer if this could be justified better, it is not the same as “completely void”. Correlation is not an “on-off” thing (and it’s a hypothesis to begin with, as such it is somewhat malleable, for lack of a better word).
My speculation: by transforming to gamma/log/sigmoidal, outliers of the noisy CFA data have less weight depending on how far away from the mean they are, which could make the gradient estimation “more robust” against salt-and-pepper noise (in gamma/log space, white pixels have smaller Euclidean distances; in sigmoid space it’s probably more complex).
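To put rough numbers on that speculation, the sketch below compares how large a hot-pixel step looks relative to an ordinary shadow edge in linear, square-root and log space. The pixel values are invented for the example.

```python
import math

def identity(v):
    return v

def step(a, b, transform):
    """Absolute difference between two pixel values after a transform."""
    return abs(transform(b) - transform(a))

# Hypothetical values: an ordinary shadow edge and a hot (white) pixel.
EDGE = (0.1, 0.2)
HOT = (0.2, 1.0)

ratios = {}
for name, transform in (("linear", identity),
                        ("sqrt", math.sqrt),
                        ("log", math.log)):
    ratios[name] = step(*HOT, transform) / step(*EDGE, transform)
    print(f"{name:>6}: hot-pixel step is {ratios[name]:.2f}x the edge step")
```

In linear space the outlier step dominates the real edge by the largest factor; the compressive encodings shrink that factor, which is consistent with the outlier weighing less in a gradient estimate.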
@kmilos and @rawfiner have more to add than my measly speculation.

Another thing to remember is that encoding and linear-light models are two different things. You can absolutely represent linear light with log/gamma-encoding. Whatever math you do in the different domains obviously has to change with the encoding if you want the same results.
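As a minimal illustration, here is the same operation (averaging two pixels) done on gamma-encoded storage in two ways. The exponent 2.2 and the pixel values are assumptions for the example.

```python
GAMMA = 2.2   # assumed encoding exponent, for illustration only

def encode(linear):
    return linear ** (1.0 / GAMMA)

def decode(encoded):
    return encoded ** GAMMA

# Two pixels stored gamma-encoded, representing linear values 0.05 and 0.80.
stored_a, stored_b = encode(0.05), encode(0.80)

# Naive: reuse the linear-light formula directly on the encoded values.
naive = (stored_a + stored_b) / 2.0

# Adapted: the math changes with the encoding (decode, average, re-encode).
proper = encode((decode(stored_a) + decode(stored_b)) / 2.0)

print(f"naive average represents linear  {decode(naive):.4f}")
print(f"adapted average represents linear {decode(proper):.4f}")
```

Only the second variant represents the true linear-light average of 0.425; the stored encoding itself is not the problem, reusing linear formulas on it is.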

I tend to read voraciously. Then I remind myself that I have other things to stress about and also remember that I could set notifications to Normal to make the thread less visible (hint for those like me). I will come back in a bit. In the meantime, carry on. :sunny:

So is it correct to say that a gamma equal to 2 is a fast approximation for stabilizing the variance?

It performs very well in the demosaicing context, IMHO: it looks very balanced for shadows, highlights and saturated colors.

… with all the gotchas about “fast & loose” approximations already pointed out by others.

Mind you that even the more “formal” variance stabilization transform fails for shadows (and possibly highlights if clipping is not taken into account), so it is just a better approximation in a sense (if variance is what you’re interested in).
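For the curious, a small sketch that computes the exact variance of pure Poisson data after a plain square root (the "gamma 2" shortcut) and after the Anscombe transform. It assumes Poisson noise only, so no read noise, gain or clipping, which is where the generalized version would come in; the helper names are made up.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson variable, computed in the log domain."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def variance_after(transform, mean):
    """Exact variance of transform(X) for X ~ Poisson(mean)."""
    upper = int(mean + 20 * math.sqrt(mean) + 50)   # generous truncation
    m1 = m2 = 0.0
    for k in range(upper):
        p = poisson_pmf(k, mean)
        y = transform(k)
        m1 += p * y
        m2 += p * y * y
    return m2 - m1 * m1

def plain_sqrt(k):
    return math.sqrt(k)

def anscombe(k):
    return 2.0 * math.sqrt(k + 0.375)

for mean in (0.5, 5.0, 20.0, 200.0):
    print(f"mean {mean:6.1f}: var(sqrt) = {variance_after(plain_sqrt, mean):.3f}, "
          f"var(anscombe) = {variance_after(anscombe, mean):.3f}")
```

At moderate and high counts both transforms give a variance that no longer depends on the signal level (about 1/4 and about 1 respectively); at very low counts the stabilization breaks down, which is the shadow failure mentioned above.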

It would be good to find out if, as per the title, the ‘other demosaicers’ also assume (implicitly or explicitly) a constant variance of their input data.
As Aurelien pointed out, some of them (including LMMSE) might rely heavily on this. If they do but do not include a VST step, what would be an appropriate thing to do?

  • A fast approximation step by means of a gamma/log transform?
  • Or the implementation of a reasonably true VST step, like a Generalized Anscombe Transform and its optimal inversion as per Foi et al.?

There’s also a method for finding the ‘best’ VST…but I just skimmed over that paper. It could be computationally much more intensive. This one. (I can’t find a DOI on that one)

Edits: paper links added

Also notice that all demosaicing methods rely heavily on white balance, which has the same de-saturating effect as any convex channel-wise transfer function. But… accurate chromatic adaptations all rely on a full 3D vector sent to some LMS cone space for that purpose, meaning they need demosaicing and color profiling before… That’s a nasty circular constraint : accurate WB needs demosaicing before but accurate demosaicing needs WB before.

Adobe seems to be doing a shitty 3×1D white balance fix before demosaicing (sensor RGB rescaling), then demosaics, then undoes the technical WB and applies a proper CAT + profiling. In any case, desaturating sure takes care of most chromatic aberrations.

And that’s my biggest gripe with all papers about demosaicing… They artificially mosaic image samples from the Kodak and IMAX datasets (that is… digitized film), demosaic them, then compute SSIM or PSNR over the difference. But of course, these are properly white-balanced from the start. So we have no metric of the WB-independence of such methods. We just know how they work for a D50 illuminant.
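For reference, the evaluation loop those papers use looks roughly like the sketch below: mosaic a full RGB reference, demosaic it, compute PSNR. Everything here is a stand-in written for this post. The RGGB layout is assumed, and the "demosaicer" is a deliberately crude superpixel replication so the example stays short.

```python
import math

def channel_at(y, x):
    """Channel index (0=R, 1=G, 2=B) of an RGGB Bayer mosaic at (y, x)."""
    if y % 2 == 0:
        return 0 if x % 2 == 0 else 1
    return 1 if x % 2 == 0 else 2

def mosaic_rggb(rgb):
    """Keep one channel per pixel of a full RGB image (even dimensions)."""
    return [[px[channel_at(y, x)] for x, px in enumerate(row)]
            for y, row in enumerate(rgb)]

def demosaic_superpixel(cfa):
    """Crude placeholder: each 2x2 cell gets its own R, mean G and B."""
    h, w = len(cfa), len(cfa[0])
    out = [[None] * w for _ in range(h)]
    for y in range(0, h, 2):
        for x in range(0, w, 2):
            px = (cfa[y][x],
                  0.5 * (cfa[y][x + 1] + cfa[y + 1][x]),
                  cfa[y + 1][x + 1])
            for dy in (0, 1):
                for dx in (0, 1):
                    out[y + dy][x + dx] = px
    return out

def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio in dB over all pixels and channels."""
    squared_error, count = 0.0, 0
    for ref_row, test_row in zip(reference, test):
        for ref_px, test_px in zip(ref_row, test_row):
            for r, t in zip(ref_px, test_px):
                squared_error += (r - t) ** 2
                count += 1
    mse = squared_error / count
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

flat = [[(0.5, 0.5, 0.5)] * 4 for _ in range(4)]
ramp = [[(x / 7, x / 7, x / 7) for x in range(8)] for _ in range(4)]
psnr_flat = psnr(flat, demosaic_superpixel(mosaic_rggb(flat)))
psnr_ramp = psnr(ramp, demosaic_superpixel(mosaic_rggb(ramp)))
print(psnr_flat, psnr_ramp)
```

Note that the reference goes in already white-balanced; to probe WB dependence one would have to scale the reference channels before `mosaic_rggb`, which this protocol never does.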

Correlation means the coordinates of the hills and the valleys in the laplacian are the same over all channels. But the laplacian is a second-order operator, aka the signal modulation around the local average, aka the texture of the image (including noise). Unfortunately, cheap demosaicing methods care about the first-order gradient only, whose magnitude depends on the magnitude of the signal itself (same as the variance, which was the starting point of the exposure-invariant guided filter @rawfiner started).

Fixing the scale of the gradient magnitude is partly taken care of by the WB. As long as you stay in linear space, it’s a matter of applying a coefficient. If you leave linear spaces, good luck… You lose any connection with meaningful stuff connected to real-life to enter the black magic of empiricism that looks good® until it doesn’t.

Actually, I’m starting to wonder if normalizing sensor RGB by the local average before demosaicing (kind of the “grey-world” assumption piece-wise) and then undoing it after demosaicing would not be a cleaner way of desaturating while retaining a better inter-channel correlation. Because the examples showcased here clearly spot colored highlights that don’t match the global scene illuminant. That would be equivalent to dividing the whole picture by its blurred version, with a blur radius that still needs to be chosen depending on the alignment of Jupiter with Saturn or something.
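A rough sketch of what that could look like, purely as a thought experiment: compute a per-channel local average from the CFA samples, divide by it before demosaicing, and multiply it back afterwards. The RGGB layout, the box window, the radius and all function names are assumptions, and the demosaicer itself is left out.

```python
EPS = 1e-6   # guards against division by zero in black areas

def channel_at(y, x):
    """Channel index (0=R, 1=G, 2=B) of an RGGB Bayer mosaic at (y, x)."""
    if y % 2 == 0:
        return 0 if x % 2 == 0 else 1
    return 1 if x % 2 == 0 else 2

def local_channel_means(cfa, radius):
    """Box-blurred average of each channel around every pixel.
    radius >= 1 guarantees that every window contains all three channels."""
    h, w = len(cfa), len(cfa[0])
    means = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sums, counts = [0.0] * 3, [0] * 3
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    c = channel_at(yy, xx)
                    sums[c] += cfa[yy][xx]
                    counts[c] += 1
            means[y][x] = [s / n for s, n in zip(sums, counts)]
    return means

def normalize(cfa, means):
    """Divide each CFA sample by the local average of its own channel."""
    return [[cfa[y][x] / max(means[y][x][channel_at(y, x)], EPS)
             for x in range(len(cfa[0]))] for y in range(len(cfa))]

def denormalize(rgb, means):
    """Undo the normalization on a demosaiced RGB image."""
    return [[tuple(v * max(m, EPS) for v, m in zip(rgb[y][x], means[y][x]))
             for x in range(len(rgb[0]))] for y in range(len(rgb))]

# A flat orange patch becomes neutral after normalization...
color = (0.8, 0.4, 0.1)
cfa = [[color[channel_at(y, x)] for x in range(6)] for y in range(6)]
means = local_channel_means(cfa, radius=2)
flat = normalize(cfa, means)
# ...and comes back once the (here trivial) demosaiced result is denormalized.
neutral_rgb = [[(v, v, v) for v in row] for row in flat]
restored = denormalize(neutral_rgb, means)
```

The open question from the post remains: the radius is a free parameter, and this sketch does nothing to pick it.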

in fact i can observe a good amount of impact on the kind of colour fringing when doing denoising before demosaicing. the denoising can aggressively lower the colour frequency and the demosaicing will then spread colour along luma edges… not convinced i found the perfect combination there yet.

That is exactly one of the problems. I searched several times for papers calculating some sort of noise but found nothing really good yet. If you know of one, please let me know.
We would need to have such an algo and agreed real-world test files to check.

Unfortunately we don’t have the raw file. Also don’t know the used demosaicer. I’ve seen such spots with ppg / amaze at such harsh transitions.

I thought long about this (emphasized) and I think I can only partly agree with the logic described here.

  1. A CAT should be picture scale/size invariant (at least to a first-order approximation, if I am not missing something), so what is holding one back from seeing a Bayer RGGB unit cell as one RGB superpixel? For Bayer CFAs take a quarter-resolution version of the sensor triplet, do your initial CAT and then do the demosaic. For X-Trans take 1/9th resolution.

  2. The circular constraint is certainly not nice, for sure. Two things can happen: the algo converges onto one solution or it doesn’t converge. To me it sounds like point one could be a way out of this loop if one subscribes to the idea that WB should be scale-invariant.

  3. I’d love to read a good argument why a CAT should be a crucial part of a spatial-sparse-sampling problem. The CFA-SSFs are not the human visual system. The demosaicing problem should remain the same if you shift all CFA-SSFs 500nm into the IR. Proper WB before demosaicing sounds like a backwards subjective quality control, an additional constraint to have the algo ‘behave’.
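The superpixel idea from point 1 can be sketched in a few lines. This assumes an RGGB Bayer layout with even dimensions and simply averages the two greens; the function name is made up, and whatever adaptation one then runs on the quarter-resolution triplets is left open.

```python
def superpixels_rggb(cfa):
    """Collapse each 2x2 RGGB cell of a Bayer mosaic into one RGB triplet,
    giving a quarter-resolution image (half width, half height)."""
    return [[(cfa[y][x],                                # R
              0.5 * (cfa[y][x + 1] + cfa[y + 1][x]),    # mean of both greens
              cfa[y + 1][x + 1])                        # B
             for x in range(0, len(cfa[0]), 2)]
            for y in range(0, len(cfa), 2)]

# Tiny 4x4 mosaic with made-up raw values.
cfa = [
    [0.40, 0.80, 0.42, 0.82],
    [0.78, 0.20, 0.80, 0.22],
    [0.10, 0.30, 0.12, 0.32],
    [0.28, 0.05, 0.30, 0.07],
]
small = superpixels_rggb(cfa)
print(small[0][0])
```

No interpolation is involved at this stage, so any white balance or adaptation estimated on `small` does not depend on a demosaicer having run first.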

I see two ways to solve this. A rendered full-spectral test scene which can be used to construct a ground truth AND the mosaiced data (METACOW comes to mind, or someone renders something specifically for the task, like here), or a forum member with a sensor-shift camera could supply test images to specifically have full resolution and mosaiced image data.

Sorry for drifting away from the question of whether the LMMSE gamma step is a poor person’s VST.

If you had perfect white balancing, achromatic (gray) edges would require no interpolation, i.e. there would be no demosaicing error and artifacts for achromatic objects.
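A toy check of that statement, under idealised assumptions: a purely gray scene, a noiseless sensor whose raw response to neutral is a known (here invented) triplet, and an RGGB layout.

```python
def channel_at(y, x):
    """Channel index (0=R, 1=G, 2=B) of an RGGB Bayer mosaic at (y, x)."""
    if y % 2 == 0:
        return 0 if x % 2 == 0 else 1
    return 1 if x % 2 == 0 else 2

# Hypothetical raw response of the sensor to a neutral surface.
SENSOR_WHITE = (0.5, 1.0, 0.7)

def capture(gray):
    """Noiseless CFA capture of a gray (single-channel) scene."""
    return [[gray[y][x] * SENSOR_WHITE[channel_at(y, x)]
             for x in range(len(gray[0]))] for y in range(len(gray))]

def white_balance(cfa, white):
    """Divide each CFA sample by the neutral response of its channel."""
    return [[cfa[y][x] / white[channel_at(y, x)]
             for x in range(len(cfa[0]))] for y in range(len(cfa))]

# Vertical gray step edge: dark on the left, bright on the right.
gray = [[0.2] * 3 + [0.9] * 3 for _ in range(4)]
balanced = white_balance(capture(gray), SENSOR_WHITE)
```

After balancing, every CFA sample already equals the gray value at its position, so R = G = B can be read off directly and there is nothing left to interpolate across the edge. With noise or a slightly wrong white point this no longer holds exactly.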

If that is meant in response to point 3 above:
Depending on the illuminant and the sensor SSFs, the ‘sensor achromatic’ is/can be very different to the ‘human observer achromatic’. My guess is that a von Kries transform is, in its simplicity, the exact thing one wants to do for the sensor in order to minimize artifacting in that case, but then the sensor WB is not a catch-22 to the post-demosaic observer WB.
In other words: a sensor-grey-world assumption would be straightforward, not circular and should not be tainted by HVS specific CAT ideas (cone specific nonlinearities etc.).
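As a sketch of how non-circular this is: a sensor gray-world estimate needs only the per-channel averages of the mosaic, and the correction is a diagonal (von Kries style) scaling applied directly to the CFA. The RGGB layout, normalizing the gains to green and the sample values are assumptions for the example.

```python
def channel_at(y, x):
    """Channel index (0=R, 1=G, 2=B) of an RGGB Bayer mosaic at (y, x)."""
    if y % 2 == 0:
        return 0 if x % 2 == 0 else 1
    return 1 if x % 2 == 0 else 2

def channel_means(cfa):
    """Average raw value of each channel, taken straight from the mosaic."""
    sums, counts = [0.0] * 3, [0] * 3
    for y, row in enumerate(cfa):
        for x, value in enumerate(row):
            c = channel_at(y, x)
            sums[c] += value
            counts[c] += 1
    return [s / n for s, n in zip(sums, counts)]

def gray_world_gains(cfa):
    """Diagonal gains that equalize the channel means (green stays at 1)."""
    means = channel_means(cfa)
    return tuple(means[1] / m for m in means)

def apply_gains(cfa, gains):
    return [[value * gains[channel_at(y, x)] for x, value in enumerate(row)]
            for y, row in enumerate(cfa)]

# Made-up 4x4 mosaic with a visible cast.
cfa = [
    [0.20, 0.50, 0.30, 0.70],
    [0.45, 0.25, 0.65, 0.40],
    [0.10, 0.30, 0.25, 0.55],
    [0.35, 0.15, 0.60, 0.30],
]
gains = gray_world_gains(cfa)
balanced_means = channel_means(apply_gains(cfa, gains))
print(gains, balanced_means)
```

Nothing here touches cone spaces or a demosaiced image, so it can run before demosaicing without the circular dependency; whether gray-world is a good enough estimate for a given scene is a separate question.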

So, my default processing in rawproc has a single whitebalance tool, before demosaic. Looks fine to me; what am I missing?