Finding a norm to preserve ratios across non-linear operations

Tone curves or HDR tone mapping curves handled with simple 1D transfer functions applied separately to the 3 RGB channels are inherently bad, because they can affect the color saturation differently in highlights or shadows, and shift hues. This video shows it all:

The solution presented in the conference talk is a ratio-preserving curve, using a well-chosen norm as the ratio denominator. In darktable’s filmic, I use max(RGB), but they show how it can darken colors. The problem is that their yellow-weighted norm is kind of black magic, and different norms yield different results, each with its own drawbacks and bottlenecks.
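For illustration, here is a minimal sketch of what such a ratio-preserving curve does. The norm and the toy curve below are placeholders, not darktable’s actual filmic implementation:

```python
import numpy as np

def tone_map_preserving_ratios(rgb, curve, norm=np.max):
    """Apply a 1D tone curve to a norm of the pixel, then rescale the RGB
    vector by curve(norm)/norm so channel ratios (hence hue) are preserved,
    instead of applying the curve separately to each channel."""
    rgb = np.asarray(rgb, dtype=float)
    n = float(norm(rgb))
    if n <= 0.0:
        return np.zeros_like(rgb)
    return rgb * (curve(n) / n)

def toy_curve(x):
    # Simple Reinhard-like placeholder; any monotonic transfer function works.
    return x / (1.0 + x)

print(tone_map_preserving_ratios([2.0, 1.0, 0.5], toy_curve))
```

The whole question below is which `norm` to feed into such a curve.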

My claim is that a good norm should be as close as possible to the physical energy of the light spectrum. (To be checked.)

The spectrum is a 2D graph linking wavelengths to their intensities, like this daylight spectrum:

(Source)

This energy could be expressed as:
E = \int_\Omega I(\lambda)^2 d\lambda
where I is the distribution of intensities over wavelengths, \lambda is the wavelength, and \Omega is the domain of visible wavelengths, \lambda \in [370; 750] nm.

Unfortunately, in a photo editing software, what we have is not a spectrum but a tristimulus vector (RGB). So I(\lambda) and E have no trivial expressions.

But a camera sensor is essentially 3 colorimeters stacked at once, doing:
\begin{cases} R &= \int_\Omega I(\lambda) \bar R(\lambda) d\lambda \\ G &= \int_\Omega I(\lambda) \bar G(\lambda) d\lambda \\ B &= \int_\Omega I(\lambda) \bar B(\lambda) d\lambda \\ \end{cases}
where R, G, B are 3 scalars representing the pixel components, and \bar R, \bar G, \bar B are the transmittances of the color filters. These are specific to each sensor and are usually not known, even though they can be profiled. However, RGB spaces have base vectors defined in XYZ space (known), and the \bar X, \bar Y, \bar Z transmittance functions are known too:

So we can translate RGB chromaticities to XYZ space using the primaries of the camera sensor RGB space, and then:
\begin{cases} X &= \int_\Omega I(\lambda) \bar X(\lambda) d\lambda \\ Y &= \int_\Omega I(\lambda) \bar Y(\lambda) d\lambda \\ Z &= \int_\Omega I(\lambda) \bar Z(\lambda) d\lambda \\ \end{cases}
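To make the numerical side concrete, here is a minimal sketch of that integration. The colour matching functions below are crude Gaussian stand-ins, only so the snippet is self-contained; real code would use the tabulated CIE 1931 data:

```python
import numpy as np

# Visible domain Omega, sampled at 1 nm.
wavelengths = np.arange(370.0, 751.0, 1.0)

def _gauss(mu, sigma, scale=1.0):
    return scale * np.exp(-0.5 * ((wavelengths - mu) / sigma) ** 2)

# Crude Gaussian stand-ins for the CIE 1931 colour matching functions,
# purely so this sketch runs on its own; substitute the real tabulated data.
xbar = _gauss(599.0, 38.0, 1.06) + _gauss(442.0, 25.0, 0.36)
ybar = _gauss(555.0, 45.0, 1.00)
zbar = _gauss(446.0, 23.0, 1.78)

def spectrum_to_xyz(intensity):
    """Numerically integrate I(lambda) against the matching functions."""
    X = np.trapz(intensity * xbar, wavelengths)
    Y = np.trapz(intensity * ybar, wavelengths)
    Z = np.trapz(intensity * zbar, wavelengths)
    return np.array([X, Y, Z])

xyz = spectrum_to_xyz(np.ones_like(wavelengths))  # flat spectrum example
```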

Now we know the transmittance functions, but we still don’t know the I(\lambda) function, so we have no way to isolate \int_\Omega I(\lambda)^2 d\lambda = E in the above equations (with an integration by parts, for example). The most we can do is use the Cauchy-Schwarz inequality, which gives an upper bound on each tristimulus value and therefore a lower bound on E:
X \leq \left(\int_\Omega I(\lambda)^2 d\lambda \right)^{1/2} \left(\int_\Omega\bar X(\lambda)^2 d\lambda\right)^{1/2} \Rightarrow \dfrac{X^2}{\int_\Omega\bar X(\lambda)^2 d\lambda} \leq E.

Propagating this result, we can use each tristimulus value to get a lower bound on E:
E \geq max\left(\dfrac{X^2}{\int_\Omega\bar X(\lambda)^2 d\lambda} ; \dfrac{Y^2}{\int_\Omega\bar Y(\lambda)^2 d\lambda} ; \dfrac{Z^2}{\int_\Omega\bar Z(\lambda)^2 d\lambda}\right) .
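As a sketch, this bound can be evaluated numerically by precomputing the denominators once, reusing the wavelengths, xbar, ybar, zbar stand-ins from the snippet above:

```python
# Squared L2 norms of the matching functions (constants, precomputed once).
x2 = np.trapz(xbar ** 2, wavelengths)
y2 = np.trapz(ybar ** 2, wavelengths)
z2 = np.trapz(zbar ** 2, wavelengths)

def energy_lower_bound_cs(X, Y, Z):
    """Cauchy-Schwarz lower bound on E = integral of I(lambda)^2 d lambda."""
    return max(X ** 2 / x2, Y ** 2 / y2, Z ** 2 / z2)
```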

Now, we need to find an upper bound on E. Does anyone have an idea? What do you think so far?

[Full disclosure: I wrote this trapped at home with a bad cold and fever.]


Get well soon. :sunny:

What RGB colour space do you use? As we all know, it depends on the space. The trouble is that we have to traverse many spaces, starting from the camera’s, decided by the manufacturer, each time abstracting a bit more. There is preliminary talk about camera spectral sensitivity data, but these are data gathered by other researchers. Until we can understand, verify and generate them ourselves, they aren’t very useful for our current workflows.

Watching the video, I am reminded of the paper. It seems to be an empirically derived method that is “suitable for” ACES AP0. It isn’t perfect, but it is better than before, according to the authors. So not quite black magic, but in the “just works” department, which I bet annoys you exceedingly.

I think it is a good start. Thinking through the problem is more important than finding a solution at this point because then you understand more of what is going on: the black box becomes a grey box, and hopefully becomes a box that is fully transparent, robust and well understood. My math went down the drain soon after university, so I can’t help you develop your mathematical ideas. :blush: All the best. Sometimes rest is all you need to make new discoveries. :+1:

I think that this is a great idea, Aurélien :slight_smile:
With bounds, we will at least be able to eliminate norms that give results outside the bounds.

I am not sure of the following step:
X \leq \left(\int_\Omega I(\lambda)^2 d\lambda \right)^{1/2} \left(\int_\Omega\bar X(\lambda)^2 d\lambda\right)^{1/2} \Rightarrow \dfrac{X^2}{\int_\Omega\bar X(\lambda)^2 d\lambda} \leq E
Squaring both sides gives \dfrac{X^2}{\int_\Omega\bar X(\lambda)^2 d\lambda} \leq \int_\Omega I(\lambda)^2 d\lambda
But according to your text, E = \int_\Omega I(\lambda) d\lambda , not \int_\Omega I(\lambda)^2 d\lambda
Unless we have \forall \lambda, I(\lambda) \leq 1 (and I don’t know at all if this is true or not), I don’t see how you get \dfrac{X^2}{\int_\Omega\bar X(\lambda)^2 d\lambda} \leq E

Searching for an upper bound, I got another lower bound:
\bar X , \bar Y, \bar Z should all be positive and should all have a maximum, respectively \bar X_{max}, \bar Y_{max}, \bar Z_{max}

I also suppose the I function only has positive values.

We have:
X = \int_\Omega I(\lambda) \bar X(\lambda) d\lambda
And: \forall \lambda, \bar X(\lambda) \leq \bar X_{max}
Thus: \int_\Omega I(\lambda) \bar X(\lambda) d\lambda \leq \int_\Omega I(\lambda) \bar X_{max} d\lambda
Which gives: X \leq \bar X_{max} \times \int_\Omega I(\lambda) d\lambda
Which gives: \dfrac{X}{\bar X_{max}} \leq E

Using a similar result with Y and Z, we get:
E \geq argmax \left( \dfrac{X}{\bar X_{max}} ; \dfrac{Y}{\bar Y_{max}} ; \dfrac{Z}{\bar Z_{max}} \right)
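As a numerical sketch, reusing the wavelengths and stand-in \bar X, \bar Y, \bar Z arrays from the earlier snippet, and keeping E = \int_\Omega I(\lambda) d\lambda as assumed in this post:

```python
# Peak values of the matching functions (reusing xbar, ybar, zbar from above).
xbar_max, ybar_max, zbar_max = xbar.max(), ybar.max(), zbar.max()

def energy_lower_bound_peaks(X, Y, Z):
    """Lower bound on E = integral of I(lambda), from the CMF peak values."""
    return max(X / xbar_max, Y / ybar_max, Z / zbar_max)
```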

For an upper bound, we could consider our \bar X, \bar Y, \bar Z functions over a narrower spectrum, so that we can find a significant minimum for the function \bar X + \bar Y + \bar Z.
For example, on your graph, we could consider the interval [420,650] instead of [370,750].
Of course, we would make an error, and I don’t know how problematic this is.

We can consider any linear combination of \bar X, \bar Y, \bar Z with non-zero coefficients, as such a combination will give strictly positive results as long as at least one of the functions is strictly positive on each part of our wavelength interval.

\bar f = \alpha_x \bar X + \alpha_y \bar Y + \alpha_z \bar Z
Let’s call \bar f_{min} the minimum of \bar f over the considered interval.
We have:
\alpha_x X + \alpha_y Y + \alpha_z Z = \int_\Omega I(\lambda) \times ( \alpha_x \bar X(\lambda) + \alpha_y \bar Y(\lambda) + \alpha_z \bar Z(\lambda) ) d\lambda
Which gives:
\alpha_x X + \alpha_y Y + \alpha_z Z \geq \bar f_{min} \times \int_\Omega I(\lambda) d\lambda

Thus:
E \leq \dfrac{\alpha_x X + \alpha_y Y + \alpha_z Z }{ \bar f_{min} }

We can try different \alpha values to get the tightest upper bound possible.

However, the whole problem with this upper bound is how to choose the interval in order to get a meaningful bound without introducing too much error. :confused:
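As a sketch, with the interval restriction made explicit (the [420, 650] nm interval and unit \alpha values are only the examples above, and the arrays are again the stand-ins from the earlier snippet):

```python
def energy_upper_bound_fmin(X, Y, Z, ax=1.0, ay=1.0, az=1.0,
                            lo=420.0, hi=650.0):
    """Upper bound on E = integral of I(lambda): (a . XYZ) / min(f_bar),
    where f_bar = ax*xbar + ay*ybar + az*zbar, minimised over [lo, hi] only,
    so that the minimum stays well above zero."""
    fbar = ax * xbar + ay * ybar + az * zbar
    mask = (wavelengths >= lo) & (wavelengths <= hi)
    fbar_min = fbar[mask].min()
    return (ax * X + ay * Y + az * Z) / fbar_min
```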

Yet another upper bound…
Same conditions as the previous one, with the function \bar f considered on an interval on which it is strictly positive.

(\alpha_x X + \alpha_y Y + \alpha_z Z) \times \int_\Omega \dfrac{1}{ \alpha_x \bar X(\lambda) + \alpha_y \bar Y(\lambda) + \alpha_z \bar Z(\lambda) } d\lambda
= \int_\Omega \dfrac{1}{ \alpha_x \bar X(\lambda) + \alpha_y \bar Y(\lambda) + \alpha_z \bar Z(\lambda) } d\lambda \times \int_\Omega I(\lambda) \times ( \alpha_x \bar X(\lambda) + \alpha_y \bar Y(\lambda) + \alpha_z \bar Z(\lambda) ) d\lambda
\geq \int_\Omega I(\lambda) \times ( \alpha_x \bar X(\lambda) + \alpha_y \bar Y(\lambda) + \alpha_z \bar Z(\lambda) ) \times \dfrac{1}{ \alpha_x \bar X(\lambda) + \alpha_y \bar Y(\lambda) + \alpha_z \bar Z(\lambda) } d\lambda (the product of the integrals is greater than or equal to the integral of the product, as all quantities are positive)
\geq E
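Numerically, and under the same assumptions and interval restriction as the previous sketch, this bound would look like:

```python
def energy_upper_bound_reciprocal(X, Y, Z, ax=1.0, ay=1.0, az=1.0,
                                  lo=420.0, hi=650.0):
    """Second proposed upper bound: (a . XYZ) times the integral of
    1 / (ax*xbar + ay*ybar + az*zbar) over the restricted interval,
    where the combined function stays strictly positive."""
    fbar = ax * xbar + ay * ybar + az * zbar
    mask = (wavelengths >= lo) & (wavelengths <= hi)
    recip_integral = np.trapz(1.0 / fbar[mask], wavelengths[mask])
    return (ax * X + ay * Y + az * Z) * recip_integral
```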

Hi,

Maybe a stupid question… why argmax and not max?

Argmax and not max because I copy-pasted what Aurélien wrote :laughing:


Actually, after some checking, the energy of a spectrum is indeed \int_\Omega I(\lambda)^2 d\lambda. So the Cauchy-Schwarz inequality holds true.

Thanks.

For now, ProPhoto RGB, but that will be extended to whatever RGB space for dt 2.8.

Local workarounds that just work backfire sooner or later.

Again, sorry, but I don’t understand why argmax and not max.

Just call it max if you prefer.

Anyway, a more general comment: thanks for starting this discussion. I find it quite interesting. Unfortunately I don’t have the necessary physics background to contribute to your analysis, but I’m certainly interested in the outcomes.
I can only contribute my empirical observations, which match more or less what you wrote in your initial post. Using max as a norm gives not-so-pleasing results (in particular, it tends to compress blue skies too much). The Euclidean norm works better in my experiments, and so does luminance. But I don’t have much physical justification, I’m afraid…
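For reference, the three norms being compared, as I understand them. The luminance weights below are the Rec.709/sRGB ones, used only as an example; a ProPhoto RGB working space would use different coefficients:

```python
import numpy as np

def norm_max(rgb):
    return float(np.max(rgb))

def norm_euclidean(rgb):
    return float(np.sqrt(np.sum(np.asarray(rgb, dtype=float) ** 2)))

def norm_luminance(rgb, weights=(0.2126, 0.7152, 0.0722)):
    # Rec.709/sRGB luminance weights, purely as an example.
    return float(np.dot(weights, rgb))
```

Any of these can be passed as the `norm` argument of the ratio-preserving sketch earlier in the thread.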

but they are two different functions – hence my question :slight_smile:

Ah. I found https://jo.dreggn.org/home/2015_spectrum.pdf by @hanatos (it’s a small world), so maybe we can compute an approximation of I(\lambda) after all (from XYZ) and then numerically integrate it, so the energy computation is direct (but expensive).

Indeed, it’s a simple max in this case.

… if you’re on the search for fast spectral upsampling, may i suggest you also look into this one: https://jo.dreggn.org/home/2019_sigmoid.pdf ? caveat may be that colours in images will encode emission and not reflectances, so you’d need to scale before and after upsampling.


thanks!

and yes, if i may add: i find this topic super interesting. the distortion in chroma that a tonecurve brings has always annoyed me and i think we can do better by doing things in spectral. the upsampling is really only using a couple fma + one reciprocal sqrt instruction and can be executed in SIMD on many wavelengths quite efficiently. i’ll be excited to hear about any more ideas or results in this direction.
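If I read the linked 2019 paper correctly, the per-wavelength evaluation boils down to something like this (c0, c1, c2 are per-colour coefficients fitted offline; treat this as my reading of the paper, not a reference implementation):

```python
import numpy as np

def sigmoid_spectrum(wavelength_nm, c0, c1, c2):
    """Evaluate a spectrum from three fitted coefficients: a sigmoid of a
    quadratic in wavelength, i.e. a couple of fused multiply-adds plus one
    reciprocal square root per wavelength."""
    x = (c0 * wavelength_nm + c1) * wavelength_nm + c2
    return 0.5 + x / (2.0 * np.sqrt(1.0 + x * x))
```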


Can I take a step back? For the norm, what is wrong with using luminosity or similar?

In other words, why not simply apply the tone curve to L of Lab or Luv, or Y of XYZ or xyY?

@snibgo There is a desire to move away from the Lab model. See: Darktable, filmic and saturation - #73 by aurelienpierre and related threads.

Would it be reasonable to estimate the spectral response of each sensor by taking a Gaussian around, let’s say, 454 nm, 540 nm and 610 nm? See e.g. https://www.dxomark.com/About/In-depth-measurements/Measurements/Color-sensitivity for a sketch.
That could give you a starting point to test whether the energy-oriented method actually works.
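Something like the following, perhaps; only the peak positions come from the post above, and the 30 nm width is an arbitrary guess:

```python
import numpy as np

wavelengths = np.arange(370.0, 751.0, 1.0)   # nm

def gaussian_response(center_nm, width_nm=30.0):
    return np.exp(-0.5 * ((wavelengths - center_nm) / width_nm) ** 2)

# Rough stand-ins for a sensor's R, G, B filter transmittances.
rbar = gaussian_response(610.0)
gbar = gaussian_response(540.0)
bbar = gaussian_response(454.0)
```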