This is yet another reminder of how color spaces are treated as black boxes in open-source image processing, with words like “perceptual” thrown everywhere in the underlying hope that they will fix everything since… color is perceptual, you know? So, if I’m doing perceptual, I’m doing it right?
It’s time to stop, people.
It’s nothing and everything at the same time. I have read time and time again that sRGB was perceptual because it has a gamma OETF. So, that perceptual bullshit is just that: bullshit. We need to add more words after “perceptual” to convey a meaning that actually ends up being specific enough to be usable.
“Perceptual” is not a blanket word to say “don’t worry, colors gonna be colors, everything is going to be fine”. As we will see below, “perceptual” color spaces are only able to nail one thing correctly at a time, meaning all the rest will suck big time. So, better choose the one thing they nail right according to the priorities and requirements of your particular application.
The beginning is light. Visible light is usually made of bursts of photons having all sorts of different wavelengths: the spectrum. A spectrum is a diagram that links wavelengths to their relative power.
Our eyes are nothing but a small lens projecting light onto a small screen: the retina. Actually, just the central 2° to 10° of the retina are useful for “accurate” vision of colors and shapes.
The fovea has 4 types of cells:
- the rods: activated only in low-light vision (called scotopic vision) and not really suited for color vision,
- the cones: activated only in daylight vision (called photopic vision):
  - L (long wavelengths),
  - M (medium wavelengths),
  - S (short wavelengths).
At the transition between scotopic and photopic vision, the rods start tainting the cone perception and are responsible for the Purkinje effect. It’s going to be important because recent CAMs account for this while older models like CIE Lab/Luv 1976 simply discarded it (because they are not CAMs…).
The cones behave like optical filters at the beginning, meaning they will attenuate some wavelengths more than others. Again, we can plot the attenuation with a wavelength-attenuation graph:
To form a tristimulus, the same spectrum is filtered by the 3 cones. To explain it simply, each wavelength of the spectrum gets its power attenuated by a certain amount by the filter, then the attenuated powers are summed for all wavelengths to form a single intensity per cone.
Imagine photons are balls travelling through a viscous fluid. The viscous fluid will slow them down depending on their initial speed. We then collect the balls in a flat bin and put the bin on a scale to measure the resulting weight. The final intensity is the final number shown on the scale, which is equivalent to the overall pressure exerted by the moving balls. (This analogy is mathematically true, not physically true, since photons are actually converted to electrons, but it is simpler than the weighted integral.)
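In code, the “filter then sum” operation is just a discrete weighted integral. Here is a toy sketch with a made-up spectrum and a made-up L-cone sensitivity (real cone fundamentals are tabulated datasets; these Gaussians are illustrative only):

```python
import numpy as np

# Toy spectrum and toy L-cone sensitivity (real cone fundamentals are
# tabulated datasets; these Gaussians are illustrative only).
wavelengths = np.arange(400.0, 701.0, 10.0)                    # nm, step = 10 nm
spectrum = np.exp(-0.5 * ((wavelengths - 550.0) / 60.0) ** 2)  # relative power
l_cone = np.exp(-0.5 * ((wavelengths - 566.0) / 50.0) ** 2)    # attenuation

# Each wavelength's power is attenuated by the cone filter, then all the
# attenuated powers are summed into one scalar intensity per cone
# (a discrete weighted integral with step 10 nm).
L_intensity = float(np.sum(spectrum * l_cone) * 10.0)
```

The same operation, run once per cone type with its own sensitivity curve, produces the tristimulus.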
But that’s only the first stage. In the 1960’s, Naka & Rushton plugged fish retina cells into an oscilloscope and blasted flashes of light at them while recording the potential in the optic nerve. In the 1980’s, Valeton & Van Norren did it again for rhesus monkeys. What they found is pretty similar: cone intensities get compressed following a sigmoidal transfer function:
From Valeton & Van Norren
This is the second stage of the retina response.
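The Naka-Rushton transfer function is usually written V / V_max = I^n / (I^n + σ^n), where σ is the semi-saturation intensity. A minimal sketch (the default σ and n values here are placeholders, not fits to the Naka & Rushton or Valeton & Van Norren data):

```python
import numpy as np

# Naka-Rushton sigmoid: V / V_max = I^n / (I^n + sigma^n).
# sigma (semi-saturation intensity) and n are placeholder defaults,
# not fits to a specific experimental dataset.
def naka_rushton(I, sigma=100.0, n=0.74):
    In = np.power(I, n)
    return In / (In + sigma ** n)
```

Whatever the exponent, the response is 0.5 at I = σ, grows monotonically, and saturates below 1 — exactly the compression visible on the oscilloscope traces.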
Painters have decomposed color into color order systems relying on 3 dimensions:
- hue, brightness, saturation,
- hue, lightness, chroma.
Too often, color is reduced to hue, but that’s a photographer’s mistake. Hue is the most obvious yet the least easy dimension to define. Let’s say it’s the attribute of color that makes it look close to “pure” red, yellow, green, cyan, blue, or magenta. That’s actually pretty much its official definition by the CIE.
Chroma and saturation are both interpretations of chromatic strengths, chroma being relative to white, saturation being relative to the achromatic color (grey) having the same lightness.
And then come lightness and brightness. Lightness comes more or less directly from luminance (what? See below…), while brightness includes both luminance and chromatic strength to account for the Helmholtz-Kohlrausch effect.
What is certain is that both lightness and brightness try to express the luminous feeling in a way that discards the contribution of the chromatic strength. That is, they represent only the achromatic part of the feeling.
The million-dollar question is: how do you link lightness (or, worse, brightness) to the tristimulus? Because that tristimulus is directly color. But it’s not directly hue. Yes, the madness has begun.
So, all that started in the 1920’s and led to the definition of the CIE 1931 XYZ space for the 2° observer. If we look at the XYZ filters (called color-matching functions, or CMF, in the slang):
It’s not quite our LMS functions… Fact is, the study of the LMS structure only started in the 1950’s, and the Hunt-Pointer-Estevez cone space comes from the 1980’s. So, the 1930’s CIE guys had something, they simply didn’t quite know what. Turns out that Z is pretty much S, but Y is more like L + M.
Y is the radiometric luminance, and we like it because it can be measured in a physical framework on the scene. In CIE Lab and Luv 1976, L* is taken as a cube root of Y, plus noise level and sweeteners.
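For reference, the CIE 1976 definition of L* from relative luminance — a cube root with a linear segment near black (those are the “noise level and sweeteners”) — in a minimal sketch:

```python
def lightness_from_luminance(Y, Y_n=1.0):
    # CIE 1976 L*: cube root of relative luminance, with a linear
    # segment near black (the "noise level and sweeteners").
    t = Y / Y_n
    d = 6.0 / 29.0
    f = t ** (1.0 / 3.0) if t > d ** 3 else t / (3.0 * d * d) + 4.0 / 29.0
    return 116.0 * f - 16.0
```

It maps the reference white to L* = 100, black to 0, and 18% grey to roughly 50 — which is why L* tracks the Munsell value so well.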
A nice property of CIE 1931 XYZ is that all visible colors have strictly positive coordinates. Visible colors may have negative values in LMS, which will not be computable through a cube root or a log or any non-integer power (which is actually evaluated from a log in the computer). Also, as of 2022, a large number of imaging applications still process images encoded as 8–16 bit integers by default, in which case… no negative values can be handled at all. They would have to be destructively clipped to zero.
Another nice property of CIE 1931 XYZ is that the LMS space can be computed from XYZ coordinates fairly accurately, at least for the Hunt-Pointer-Estevez variant (for CIE LMS 2006 and 2012, it’s more complicated).
CIE CAM 02, CIE CAM 16 and JzAzBz use a kind of LMS space, then apply the sigmoidal cone compression, which gives L’M’S’, then sum these together to get the achromatic signal, such as A = 2L' + M' + 0.05 S' − 0.305 for CIE CAM 16 (from Hellwig & Fairchild 2021).
Of course, the Naka-Rushton sigmoid needs negative handling, with absolute values and copy-pasting of the original sign, to deal with negative LMS.
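Putting the last two points together, here is a sketch of that LMS route to the achromatic signal. The Hunt-Pointer-Estevez matrix is the standard published one; the compression constants follow the CAM16-style post-adaptation formula with the luminance-adaptation factor F_L assumed equal to 1, so this is illustrative, not a full CAM16 implementation:

```python
import numpy as np

# Hunt-Pointer-Estevez XYZ -> LMS matrix (the standard published variant).
M_HPE = np.array([[ 0.38971, 0.68898, -0.07868],
                  [-0.22981, 1.18340,  0.04641],
                  [ 0.0,     0.0,      1.0    ]])

def compress(x):
    # CAM16-style sign-preserving compression: the sign is stripped before
    # the non-integer power and copied back after, exactly to survive the
    # negative LMS components discussed above. F_L is assumed = 1 here.
    ax = np.abs(x) ** 0.42
    return np.sign(x) * 400.0 * ax / (ax + 27.13)

def achromatic_signal(xyz):
    # XYZ expected on a [0; 100] scale; A = 2L' + M' + 0.05 S' - 0.305.
    L, M, S = compress(M_HPE @ np.asarray(xyz, dtype=float))
    return 2.0 * L + M + 0.05 * S - 0.305
```

Note that the sign-copying trick makes the math run, but it does not answer the question of what a negative achromatic signal *means* — which is the whole point of the next section.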
So we have 2 different approaches. Which one is best?
The Munsell solid, value on the vertical axis, chroma as distance to the grey spine, hue as the angle. CC Wikipedia
Let’s start by looking at the Munsell color system: it uses value, chroma and hue dimensions. Turns out that value is directly linked to the luminance Y taken from CIE XYZ 1931. So, even though XYZ is not an accurate model of physiology, as far as artistic applications are concerned, it’s not entirely off and produces a useful metric of lightness that also has the nice property of always having positive coordinates for visible colors.
If we take the LMS-based achromatic stimulus A from CIE CAM 16, well, it can produce a negative correlate of lightness for positive light stimuli. To make the next graph, I create a 3D LUT of all valid XYZ triplets in [0; 1] by steps of 0.05, compute both their luminance-based lightness L^* à la CIE Lab 1976 (L^* is fitted to closely match the Munsell value) and their CIE CAM 16 achromatic signal, then plot the first against the second:
The R² correlation factor says 0.84, so we are not looking at decorrelated objects. But different CIE CAM 16 A values will yield the same L^* value, and some positive sets of XYZ values will create a negative CIE CAM 16 A. What are we to do with the negative values? What is their meaning?
Brightening a positive stimulus is easy: just multiply it by something greater than 1. But how do you brighten or darken a negative stimulus? And even if you use some tricks like absolute values, how do you make sure that the resulting values exist in the visible locus and in the RGB working space you are using?
This is where we need to draw the distinction between a UCS, a CAM and, well, a color-grading space.
UCS stands for Uniform Color Space. It is a space designed for even delta E. Let’s have a closer look at the normalized CIE XYZ 1931 space, called xyY:
This is the chromaticity diagram with the MacAdam ellipses plotted on it. David MacAdam was a Kodak researcher who noticed that color differences were not uniform on this chromaticity graph. The ellipses encompass the colors that the average observer could not tell apart. In other words, pick 2 colours inside the same ellipse and show them to the average Joe DSLR: he will not see the difference. Note that the ellipses are magnified here for clarity. But the trend they show is pretty clear: a 0.1 metric deviation in blue will be noticeable, but a 0.1 metric deviation in green will go unnoticed. But then, it also depends in what direction you apply your deviation…
The purpose of an ideal UCS is to make these ellipses circular and to give them all the same radius. See one of the first attempts with CIE Luv 1976:
Therefore, a UCS aims only at warping the 3D cone space such that color differences (delta E) are uniform, and usually provides a 3D system of coordinates where the delta E can be computed as a simple Euclidean distance. Meaning a UCS is of interest only for color scientists and industrial applications that need a metric of color deviation from the observer’s vantage point.
CIE Lab, Luv, even IPT or JzAzBz are UCS. They are not designed to modify colors; they are designed to compute color differences, and massaged to spit out a meaningful Euclidean distance.
For this purpose, they pretty much all work the same: they have a correlate of the achromatic signal (called I, J, L, whatever) and 2 opponent color dimensions (blue-yellow, green-magenta) called u-v, a-b, P-T, Az-Bz, Ct-Cp, etc. The plane of the opponent color dimensions is orthogonal to the axis of the achromatic signal.
Now, you may think that the angle on the (a, b) plane (for CIE Lab) is the hue and that the radial distance is the chroma, such that the chroma would be c = sqrt(a² + b²) and the hue would be the angle h = atan2(b, a).
That would make sense, wouldn’t it? After all, it’s the very definition of the LCh space from CIE Lab 1976. L, c, h… it’s in the name: lightness, chroma, hue. Well, think again…
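That LCh reading is nothing more than a Cartesian-to-polar conversion of the (a, b) plane; a minimal sketch (the function names are mine):

```python
import math

# The polar reading of the CIE Lab (a, b) plane, aka LCh.
def lab_to_lch(L, a, b):
    c = math.hypot(a, b)                        # radial distance -> "chroma"
    h = math.degrees(math.atan2(b, a)) % 360.0  # angle -> "hue", in degrees
    return L, c, h

def lch_to_lab(L, c, h):
    hr = math.radians(h)
    return L, c * math.cos(hr), c * math.sin(hr)
```

The conversion itself is lossless; the problem, as shown next, is what that angle actually correlates with.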
Let me show you what it yields to degrade from sRGB primary blue to white at constant “hue” angle in CIE Lab 1976:
The reality is, that “hue” angle is not really linked to perceptual hue. At constant angle, from blue to white, we take a leap into purple. And remember, this is only in sRGB, which is a narrow space… Now, imagine that in larger spaces such as Adobe RGB, Rec 2020, etc.
Now, let me introduce you to a UCS that has been specifically designed for hue linearity: Ebner & Fairchild’s IPT (1998):
And have a look at how the Munsell chroma dataset spreads in that (P, T) chromaticity plane:
These should be near-circular. For comparison, this is the same data plotted in CIE CAM 16 UCS:
CAM stands for Color Appearance Model. CAMs are primarily designed to retain the color appearance of artwork when rendered on different media or viewed in different conditions. To do so, they introduce a second step of post-processing on top of the UCS that maps the opponent color coordinates (a, b) explicitly to the chroma correlate and other useful color attributes.
For example, in CIE CAM 16, the relation has been simplified by Hellwig & Fairchild 2021 (paper submitted):
where A_w is the achromatic signal of the white reference in the current viewing conditions, e(t) is a hue-wise eccentricity correction and N_c is linked to the background luminance in the viewing conditions.
TL;DR: no simple and accurate correlates of perceptual color attributes can be directly derived from the geometry of a UCS.
Closing on the matter of correlation with perceptual parameters, hue-linearity is a tangled knot.
From All Effects of Psychophysical Variables on Color Attributes: A Classification System
Above are drawn the lines of constant perceived hue in CIE xyY 1931 from various datasets. It’s important to note that all datasets are based on an average of at least a dozen observers. Nevertheless, no dataset agrees with the others. The only thing they agree on is that around 570 nm (yellow), the constant-hue locus is a straight line. For the rest, go figure…
All that comes from the Abney effect: adding white to a color makes its perceived hue shift. Since both LMS and XYZ spaces are additive in nature, they abide by the laws of light and don’t give a damn about your hue ring. So, blue light + white light is expected to degrade to purple; it’s consistent with how our perceptual system connects to light stimuli.
In that sense, the hue shift displayed by CIE Lab 1976 is consistent with psychophysics. But the last thing we expect from a “perceptual” space is to be physically-accurate…
So, Uniform Color Spaces are meant to warp the 3D cone space in a way that makes the Euclidean metric distance a direct correlate of the delta E. They do not presume anything about the uniformity of color attributes such as hue or chroma. Namely, they may twist the axes of constant chroma and hue, because they only care about distance, not about alignment with orthogonal perceptual correlates. They are meant for technical applications.
Color Appearance Models are meant to take a reference look and transport it into different viewing conditions while retaining the reference appearance. Doing so, they mostly start from a UCS, then add equations on top that account for various optical effects (Purkinje, Stevens, Hunt, Bartleson-Breneman, Bezold-Brücke, Helmholtz-Kohlrausch). The quantification of those effects is only motivated by the need to neutralize them when adapting the reference look to a different medium. So you need to mind what effects are tainting your application and choose a CAM that predicts them. You basically need to understand what the CAM you are using is actually doing (sorry, no black box).
Neither UCS nor CAMs are meant to modify colours in an artistic way, aka change the appearance of colors in a 3D setup aligned with perceptual color attributes. At the same time, changing colors does not need to care about the Stevens, Hunt or Purkinje effects, so if the CAM was fitted with trade-offs for these effects, we can remove both the model and the trade-offs, because we need neither.
When you want to modify colors in a “perceptual” way, what you actually want is for the Jacobian matrix of the change, expressed in a perceptual hue, brightness, saturation space, to be diagonal. That means modifying saturation at constant brightness and hue, modifying brightness at constant hue and saturation, and modifying hue at constant saturation and brightness. This means that:
- we need at least another layer on top of a well-chosen UCS to align the geometry of the UCS with the perceptual correlates,
- CAMs are overkill as this second layer, since they account for many effects we don’t need.
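Written out: if the color operation maps (H, B, S) to (H', B', S'), the requirement is that all off-diagonal terms of the Jacobian vanish, i.e. each perceptual attribute responds only to its own control:

```latex
% Diagonal Jacobian: each perceptual attribute responds only to its own control.
J =
\begin{pmatrix}
  \frac{\partial H'}{\partial H} & \frac{\partial H'}{\partial B} & \frac{\partial H'}{\partial S} \\
  \frac{\partial B'}{\partial H} & \frac{\partial B'}{\partial B} & \frac{\partial B'}{\partial S} \\
  \frac{\partial S'}{\partial H} & \frac{\partial S'}{\partial B} & \frac{\partial S'}{\partial S}
\end{pmatrix}
=
\begin{pmatrix}
  \frac{\partial H'}{\partial H} & 0 & 0 \\
  0 & \frac{\partial B'}{\partial B} & 0 \\
  0 & 0 & \frac{\partial S'}{\partial S}
\end{pmatrix}
```

The blue-to-purple sweep above is precisely a case where the ∂H'/∂S term of CIE Lab 1976 is non-zero.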
Unfortunately, we have also shown that the UCS with the best hue linearity have bad chroma uniformity, and the other way around. To alleviate this, CAMs use ugly maths tricks that are very likely to be implemented with mistakes by developers who wouldn’t know what to check to assert the validity of what they just coded (friendly advice: plot synthetic sweeps).
Moreover, CIE CAM 16 has a well-known numerical instability near 0, because the sigmoidal function has a near-zero derivative in this region, resulting in rounding issues. When you push colors, namely when you increase chroma, you have to plan for the case where the user pushes initially valid colors outside of the gamut, or even outside of the visible locus. If your space is already fragile in this regard, don’t expect things to go well by themselves. Factor in that positive XYZ triplets may produce negative “lightness” in CIE CAM 16, and you’ve got yourself a nice little nightmare.
For this reason, I developed my own UCS for artistic saturation changes: Color saturation control for the 21th century - Aurélien PIERRE Engineering
It sucks in a way that I have chosen to be the least cumbersome for the task at hand. It is basically designed like a CAM, but the numerical and experimental fitting trade-offs have been chosen for color changes at constant hue/brightness/saturation instead of a constant look in different conditions.
But I feel like the current trend among developers is to believe in the cargo cult of perceptual and implement whatever hot space is trending at the time, as long as “perceptual” is written on the label, with no care for what it is designed for or how it behaves in practice. Much like linking your program against LittleCMS and calling it “color-managed” while you actually have no clue how to test whether color is actually managed or not.
For image processing and pixel pushing, what you need is neither UCS nor CAM; you need spaces of uniform color correlates, which need to be specifically designed, because no CIE stuff cares about artists.
Both UCS and CAM aim ultimately at providing a psychological decomposition of light into hue, chroma and lightness (where chroma and hue are more or less accurate). But are chroma and lightness really relevant for artistic changes in the image ?
To understand the problem, I plot here a chroma (over the x axis) and hue (over the y axis) sweep at constant lightness (J = 25%). Lightness and luminance (this lightness is taken as a function of luminance, so it’s essentially a rescaled luminance) are constant for both the background and the color patches, and only colors that fit within the sRGB gamut are represented:
Does that really feel like even lightness to you ? Don’t you feel like there is a certain chroma threshold above which colors start feeling fluorescent ? That’s the combination of both Helmholtz-Kohlrausch effect (brightness depends on lightness AND colorfulness) and the greyness boundary.
Also… notice that the highest values of chroma are only available for blue. So, imagine you start at chroma = 3 (3rd patch from the left), and then apply a ×2 chroma gain to increase the colorfulness of your picture. What you get is chroma = 6. But there is no chroma = 6 available at lightness = 25% in sRGB for orange, yellow, green and cyan, so you end up straight out of gamut.
So, all in all, by boosting chroma at constant hue and lightness in your (honest) quest to provide a psychological interaction with color to your artists, what you are actually doing is pushing valid colors out of gamut (and not at the same rate for all hues…) and degrading colors that initially looked reflective into neon fluo. Of course, if you don’t plot sweeps, everything sounds right because… it’s all perceptual, right?
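That gamut escape is easy to reproduce numerically. A self-contained sketch, using the standard CIE Lab 1976 and sRGB formulas (the starting colour, the ×2 gain and the helper names are arbitrary choices of mine):

```python
import math

D65 = (0.95047, 1.0, 1.08883)  # reference white

def srgb_to_linear(c):
    # sRGB decoding (EOTF)
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def srgb_to_xyz(rgb):
    r, g, b = (srgb_to_linear(c) for c in rgb)
    return (0.4124 * r + 0.3576 * g + 0.1805 * b,
            0.2126 * r + 0.7152 * g + 0.0722 * b,
            0.0193 * r + 0.1192 * g + 0.9505 * b)

def xyz_to_lab(xyz):
    def f(t):
        d = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > d ** 3 else t / (3 * d * d) + 4.0 / 29.0
    fx, fy, fz = (f(c / n) for c, n in zip(xyz, D65))
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))

def lab_to_xyz(lab):
    L, a, b = lab
    fy = (L + 16.0) / 116.0
    fx, fz = fy + a / 500.0, fy - b / 200.0
    def finv(t):
        d = 6.0 / 29.0
        return t ** 3 if t > d else 3 * d * d * (t - 4.0 / 29.0)
    return tuple(n * finv(t) for t, n in zip((fx, fy, fz), D65))

def xyz_to_linear_srgb(xyz):
    x, y, z = xyz
    return (3.2406 * x - 1.5372 * y - 0.4986 * z,
            -0.9689 * x + 1.8758 * y + 0.0415 * z,
            0.0557 * x - 0.2040 * y + 1.0570 * z)

def boost_chroma(rgb, gain):
    # Chroma gain at constant L and hue angle: scaling (a, b) radially
    # is exactly the LCh "chroma x gain" operation.
    L, a, b = xyz_to_lab(srgb_to_xyz(rgb))
    return xyz_to_linear_srgb(lab_to_xyz((L, gain * a, gain * b)))

# A valid, saturated sRGB orange, pushed to x2 chroma, escapes the gamut:
pushed = boost_chroma((0.9, 0.4, 0.1), 2.0)
out_of_gamut = any(c < 0.0 or c > 1.0 for c in pushed)
```

Run it, and the perfectly valid input colour comes back with a negative linear channel: out of gamut, before you have even touched the display transform.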
As a comparison, this is the same sweep at constant brightness (25%), accounting for the Helmholtz-Kohlrausch effect, and we sweep over (colorimetric) saturation:
Not perfect, but better. Less fluo, more even brightness across hues. You know which UCS account for the Helmholtz-Kohlrausch effect? Not Lab, not CIE CAM 16 UCS, not JzAzBz; just IPT and darktable UCS 22, because I designed it myself.
Now, see another sweep of what I called “painter’s saturation”, with a formula I designed to degrade white into primary colors, and the other way around, in order to match the artistic definition of saturation (which is not exactly what the CIE uses):
No fluo in there. It feels a lot closer to pigments on reflective medium.
So, even with a perfect UCS, the question remains whether changing chroma at constant lightness is really relevant as a colorfulness setting for image makers, and my answer is: it’s not. Also, mind those out-of-gamut colors that you let users create from perfectly valid colors; they are future problems in your pipeline, come the display conversion.