Some thoughts about linear RGB editing and future PhotoFlow developments

I thought it would be useful to share here some considerations I’ve recently been making about color management in photo retouching software. This post reflects my current understanding of the topic of image editing in RGB colorspaces, and is actually the result of exchanges with people who have a much deeper knowledge of the topic than me… as such, I have most likely overlooked and/or misinterpreted certain details, so take what I’m saying with a grain of salt. Also, feel free to correct mistakes and/or add (counter)arguments to my reasoning…

Very often we edit our images in RGB mode. However, one has to be aware that RGB is not a unique representation of color: an ICC profile is always needed to interpret the RGB triplets and translate them into a well-defined color representation (usually in terms of XYZ triplets).

ICC profiles are basically composed of two parts: the color primaries, which define which colors are included in a given RGB representation, and a Tone Response Curve (TRC). Several different TRCs are commonly used: gamma=2.2 in the case of AdobeRGB1998, gamma=1.8 in the case of ProPhoto, and a more complex piecewise function in the case of sRGB. All these TRCs are introduced in order to encode the non-linear response of the human visual system into the RGB representation. However, this sort of encoding is mostly a relic of the past, when computations were done in 8 bits at best and it was therefore important to optimize the use of the few available bits.

Human vision is more sensitive to dark shades than to light ones. For example, a gray patch that reflects 18% of the incident light is perceived as mid-way between black and white, and is therefore called “mid-gray”. If gamma encoding is not applied, all the shades below mid-gray are compressed into the lower 18% of the available bit range. If only 8 bits are available, banding is therefore likely to appear…
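To make this concrete, here is a minimal sketch of the sRGB tone response curve, using the standard piecewise definition (the function names are mine). Notice how the 18% mid-gray reflectance gets pushed near the middle of the encoded range, which is exactly why the encoding makes good use of 8 bits:

```python
def srgb_encode(c):
    """Map a linear-light value in [0, 1] to its sRGB-encoded value."""
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1 / 2.4) - 0.055

def srgb_decode(v):
    """Inverse: map an sRGB-encoded value back to linear light."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4

# 18% linear reflectance lands near the middle of the encoded range...
print(round(srgb_encode(0.18), 3))   # 0.461
# ...while the 50% code value corresponds to only ~21% linear light:
print(round(srgb_decode(0.5), 3))    # 0.214
```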

However, gamma encoding is a mere technical “trick” that has no mathematical or physical justification other than optimizing low bit depths. As a sort of proof, the gamma encoding used in common colorspaces is not unique, and this generates all sorts of confusion and inaccuracy, as I will try to show a bit later.

Nowadays, image processing is performed at high bit depth (16-bit integers or 32-bit floating point), and the risk of banding or loss of shadow detail no longer exists. After all, RAW images have at best 14 bits and are linearly encoded, but no one is really complaining about lack of precision! On the other hand, many of the image processing algorithms and tools commonly used in photo retouching do assume linear RGB values as input, and produce incorrect results if fed with gamma-encoded values.

A simple example is grayscale conversion using luminance. The formula commonly found on the net is

L  =  0.2126f*R + 0.7152f*G + 0.0722f*B

However, what is sometimes not clearly stated is that the input R, G and B values are expected to be in linear sRGB colorspace. All sorts of bad things can happen if you use gamma-encoded RGB values instead… and even worse, this is what I’m wrongly doing in PhotoFlow at the moment!
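Here is a small sketch of the difference, comparing the luminance formula applied naively to gamma-encoded values with the correct linearize-mix-re-encode sequence (the helper names are my own):

```python
def srgb_decode(v):
    """sRGB-encoded value -> linear light."""
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

def srgb_encode(c):
    """Linear light -> sRGB-encoded value."""
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

def luminance(r, g, b):
    # Rec.709 weights; expects *linear* RGB values
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

# A bright, saturated pixel, given as sRGB-encoded values:
R, G, B = 0.9, 0.2, 0.1

naive = luminance(R, G, B)                        # wrong: encoded input
correct = srgb_encode(luminance(srgb_decode(R),
                                srgb_decode(G),
                                srgb_decode(B)))  # linearize, mix, re-encode

print(round(naive, 3), round(correct, 3))   # 0.342 0.475
```

The naive version produces a gray that is noticeably too dark for this pixel, which is exactly the kind of silent error I was referring to.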

A similar argument holds for the widely used channel mixer and white balance tools. More generally, any operation that mixes and/or multiplies RGB values should be performed in linear encoding, otherwise it will introduce hue and saturation shifts. This is also true for most layer blending modes. The only exception is when the three RGB components are multiplied by the same constant (like in the case of brightness and contrast adjustments), in which case the gamma encoding is irrelevant.
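A 50/50 normal blend of pure red and pure green illustrates this (again, a sketch with my own helper names): averaging the encoded values gives a result that is much darker than the physically correct mix of the two lights.

```python
def srgb_decode(v):
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

def srgb_encode(c):
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

red   = (1.0, 0.0, 0.0)   # sRGB-encoded
green = (0.0, 1.0, 0.0)

# Naive: average the encoded values directly
naive = tuple((a + b) / 2 for a, b in zip(red, green))

# Correct: linearize, average, re-encode for display
linear = tuple((srgb_decode(a) + srgb_decode(b)) / 2
               for a, b in zip(red, green))
correct = tuple(round(srgb_encode(c), 3) for c in linear)

print(naive)    # (0.5, 0.5, 0.0)     -> a dull, too-dark yellow
print(correct)  # (0.735, 0.735, 0.0) -> the brighter, correct mix
```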

In view of that, it is clear that it makes no sense to keep the internal RGB data of the processing pipeline in an encoding other than linear. Firstly, there is no unique choice of gamma encoding; secondly, most image processing operations require linear RGB values, and when linear encoding is not required the choice is actually irrelevant.

Moreover, all the colorspaces commonly used in photo retouching (sRGB, AdobeRGB1998, ProPhotoRGB…) are actually bad choices that often yield incorrect and confusing results. To give an example, let’s consider an S-shaped RGB curve, applied to increase the mid-tone contrast.

What is the expected result? The curve should darken the tones below mid-gray and lighten those above mid-gray (with some compression of the darkest shadows and brightest highlights, to preserve the black and white points). Since the curve crosses the center of the graph, mid-gray should stay unchanged. I’ve applied this curve to the image below (from David LaCivita). In the first case, the RGB data was encoded in the sRGB colorspace; in the second case, in ProPhoto. Can you see the difference? Which one is the correct result? NONE!




Top: original image. Middle: S-shaped curve applied to sRGB image. Bottom: same curve applied to ProPhoto image.

To understand why, let’s consider a much simpler image: a uniform mid-gray patch, obtained by a colorspace conversion from the Lab color L=50 a=0 b=0 to RGB (either sRGB or ProPhoto).




Top: sRGB version. Middle: initial patch. Bottom: ProPhoto version.

If we apply the S-shaped curve to this mid-gray patch, we notice a change in both cases, although it is more pronounced in the ProPhoto case. Why? Because, due to the different gamma encodings of sRGB and ProPhoto compared to Lab, L=50 is not mapped to R=G=B=50% in either case. Therefore, the mid-gray patch does not coincide with the center of the RGB curve, and the S-shaped adjustment changes the RGB values.
Conclusion: the result of any RGB contrast adjustment is gamma-encoding-dependent, and yields different results depending on the working colorspace of your choice. In fact, the only encoding in which the results match our intuitive expectations is the perceptually uniform encoding used for the L channel of the Lab colorspace, which divides the whole tonal range (from black to white) into intervals that are equally spaced from the perceptual point of view. In other words, the ten zones of Ansel Adams’ zone system span equal intervals along the L axis.
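The numbers behind the mid-gray patch example can be sketched in a few lines, using the CIE L* definition and the sRGB and ProPhoto (gamma 1.8) TRCs (function names are mine):

```python
def lab_L_to_Y(L):
    # CIE inverse companding; L=50 is well above the linear toe
    return ((L + 16) / 116) ** 3 if L > 8 else L / 903.3

def srgb_encode(c):
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

Y = lab_L_to_Y(50)                   # linear luminance of Lab mid-gray
print(round(Y, 3))                   # 0.184 -> the familiar ~18% gray
print(round(srgb_encode(Y), 3))      # 0.466 -> not 50% on the sRGB axis
print(round(Y ** (1 / 1.8), 3))      # 0.391 -> even lower on the ProPhoto axis
```

Mid-gray sits at about 46.6% in sRGB but only 39.1% in ProPhoto, so it is farther from the curve’s fixed point in ProPhoto, which is why the shift of the patch is more pronounced there.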

At this point, the solution to the problem of RGB representation is quite simple: keep the data in linear encoding, and REPRESENT certain adjustments in perceptual encoding (if that simplifies the user interaction). For example, the axes of RGB curves are better represented in perceptual encoding, with input mid-gray in the center of the horizontal axis. However, this graphical representation has no impact on the final result: an RGB curve can be easily mapped from perceptual to linear representation, and then applied to linearly-encoded RGB values, and the result will stay the same.
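This scheme can be sketched as follows: the curve is drawn on a perceptually uniform (L*) axis, but applied to linear data by mapping through the encoding and back. The toy smoothstep curve and all the names here are my own, not PhotoFlow’s actual implementation:

```python
def lstar_encode(Y):
    # CIE L* of a linear value, normalized to [0, 1]
    return (116 * Y ** (1 / 3) - 16) / 100 if Y > 0.008856 else 9.033 * Y

def lstar_decode(L):
    return ((100 * L + 16) / 116) ** 3 if L > 0.08 else L / 9.033

def s_curve(x):
    # Toy contrast curve with a fixed point at x = 0.5 (smoothstep)
    return x * x * (3 - 2 * x)

def apply_curve_linear(y):
    # Map to the perceptual axis, apply the curve, map back to linear
    return lstar_decode(s_curve(lstar_encode(y)))

mid_gray = ((50 + 16) / 116) ** 3       # linear Y of Lab L=50, ~0.184
print(round(apply_curve_linear(mid_gray), 3))   # 0.184: mid-gray is preserved
```

Because the curve’s fixed point now coincides with perceptual mid-gray, the adjustment behaves as intuition demands: mid-gray stays put, darker tones get darker, lighter tones get lighter.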

Of course, this statement is only valid for high-bit-depth processing; such mapping to and from linear encoding would significantly degrade the image quality at 8-bit precision… that’s what we call progress!

The last question that remains to be answered is: which linear working colorspace should we choose? As I already stated, the gamma encoding is a property that can be freely changed. It is perfectly valid to create a linear sRGB ICC profile and use it to internally represent the RGB data in the photo processing pipeline. However, it turns out that most of the widely used working colorspaces are not optimal choices, even when they have very large gamuts, for reasons that are outside the scope of this post and also partly outside my understanding.

As far as I understand, the image industry has recently introduced new RGB colorspaces that are better adapted to real-world image manipulation, like Rec.2020 and ACEScg. In particular, ACEScg comprises a very large fraction of the visible colors, and most likely all the colors that can be generated by present and near-future output devices (like displays and printers).

So my final conclusion is: use linear Rec.2020 or linear ACEScg for the internal representation of RGB data in the processing pipeline, and use perceptually uniform encoding to GRAPHICALLY REPRESENT the image adjustments whenever needed (contrast, RGB curves, opacity masks…).

Version 0.3.0 of PhotoFlow will follow those guidelines, and hopefully will also produce better images :wink:


The choice of doing everything in linear space is good in almost all circumstances. It’s not exactly the case in Filmulator, because Filmulator makes no promise at all that the result will match sRGB’s tone curve. When using Filmulator you also shouldn’t care what 18% gray is; you should just want to see what the result looks like. But I digress: for a maximum-control program like PhotoFlow you should definitely always keep the pipeline in linear space, display histograms in some perceptually uniform form, and at the end convert to the output colorspace’s tone curve.

However, I don’t think that using more widely separated primaries is necessarily a good idea: you should always edit in the same space as the output medium’s gamut.

In one sense they’re good, because they let you perform linear operations without going out of gamut (like Lab tone curves; maximum saturation in most colorspaces is reached only at a specific brightness, and you’ll lose it by shifting up or down unless there’s extra saturation available). But on the other hand I question the utility of linear operations like that when working in visual media, since our eyes are not linear in the first place. Why do you need to preserve colors that would end up out of gamut anyway? What good do they do you?

I haven’t really fully formed my thoughts on this topic despite having tried to multiple times before, but that’s kinda the gist of my opinion.

I may add more later.

This is definitely true. A sort of optimal compromise is probably to stick to Rec.2020 primaries: they are large enough to cover current and near-future displays, and probably also compatible with the gamuts of modern digital cameras and inkjet printers (but I still need to check this last statement).

As far as I understand, the point is that our eyes perceive the surrounding world in a non-linear way, but light from real objects combines linearly. So the correct approach is to do the math in linear representation, and then apply the gamma encoding just before displaying the result.
I’m probably over-simplifying a bit there, but I think what I’m saying is at least not totally wrong…

Gamut and gamma-encoding are two completely independent issues. Using a linear encoding will not preserve more colors than a gamma-encoded representation, because the gamuts are the same in both cases (provided that the same color primaries are used).

Here is a nice article that clearly shows the differences between linear and gamma-encoded editing: normal blend mode and gaussian blur in linear gamma.

They are independent, but it’s not irrelevant since you were talking about using Rec.2020 or ACEScg primaries…

I meant perceptually linear. It’s confusing to write about.

Some people really like working with Lab curves, but 1) they need a huge working colorspace and 2) as you point out, colors add linearly in intensity, not in our perception.

Since I don’t advocate using the perceptually linear operations, I don’t see the need for huge color spaces.
