I’m going to copy-paste here the answer I gave by email, because I think it will be useful to more than one person.
Your camera sensor converts photons into electrons with a piece of semiconductor underneath the color filter array (roughly 1 photon becomes 1 electron, except for some that get lost here and there – for the sake of this explanation, you can assume 1 photon => 1 electron exactly). It’s really like a photovoltaic solar cell you would use to produce electricity, except the amount of electricity is quite small.
Once we have a current, all we have to do is measure it (that is, count the electrons passing through the wire) at each photosite. It’s really just measuring how many (micro)amperes we have there, as you would do with a good old multimeter (but much more sensitive).
Using a piece of electronics called an analog-to-digital converter (ADC), that current measurement is converted to an integer code value inside some range. If you use an 8-bit ADC, your range is [0 ; 2^8 - 1], so you encode between 0 and 255. Most cameras use 12 or 14 bits, so they encode in [0 ; 4095] or [0 ; 16383].
These code values don’t mean much in themselves. They only mean that we split the measurement range of the sensor (between noise threshold and saturation threshold) into that many samples so, as the sampling gets finer, your lightness gradients are more continuous and less prone to staircasing effects (called posterization or quantization artifacts). Just imagine you want to represent a diagonal line with a staircase : the more steps you add, the finer the jumps get, and the smoother your line approximation gets.
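To make that concrete, here is a minimal Python/NumPy sketch (not real camera firmware, and the function names are mine) that quantizes the same normalized measurement with ADCs of different bit depths, so you can see the staircase getting finer as bits are added:

```python
import numpy as np

def quantize(signal, bits):
    """Map a normalized measurement in [0, 1] to integer codes in [0, 2^bits - 1]."""
    levels = 2 ** bits - 1
    return np.round(signal * levels).astype(int)

gradient = np.linspace(0.0, 1.0, 9)      # an idealized, continuous lightness ramp

for bits in (2, 8, 12):
    print(f"{bits:2d}-bit:", quantize(gradient, bits))

# With 2 bits, the ramp collapses into 4 big steps (the staircase);
# with 12 bits, the steps are so fine that the gradient looks continuous again.
```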
But these code values are a linear encoding, meaning if you double the amount of light on the sensor, you also double the code value issued by the measurement. That leads to a nice property : doubling the light amount, physically on the scene, or multiplying the code values by 2, digitally in the computer, has the same effect on the picture (if we put the signal/noise ratio aside). Linearity means the data you are working on is proportional to the intensity (or energy) of the light emission.
Mathematically, linearity of some 1D operation f is proven if a × f(b) = f(a × b), which means that you can multiply in the order you want, before or after applying f on b, and the result will not change (strictly speaking, you also need f(a + b) = f(a) + f(b), but the multiplication part is the one we care about here). We work on RGB (which is 3D), so it’s a tad more complicated, but the same principle holds.
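If you want to see that property with real numbers, here is a tiny Python/NumPy check (the functions are illustrative stand-ins, not darktable code): a plain multiplication commutes with the doubling, a gamma does not:

```python
import numpy as np

rgb = np.array([0.02, 0.10, 0.40])           # linear code values, normalized to [0, 1]
a = 2.0                                      # doubling the light, i.e. +1 EV

def exposure(x):                             # a linear operation: plain multiplication
    return 3.0 * x

def gamma_encode(x):                         # a non-linear operation
    return x ** (1.0 / 2.2)

# Linear case: a * f(b) == f(a * b), the order does not matter.
print(a * exposure(rgb), exposure(a * rgb))          # identical results

# Non-linear case: the property breaks, the order matters.
print(a * gamma_encode(rgb), gamma_encode(a * rgb))  # different results
```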
But here, you might smell an issue. Remember human vision is logarithmic and, therefore, non-linear. That means we have increased sensitivity in shadows, and decreased sensitivity in highlights. The “human light increment” is the EV (exposure value). From one EV to another, you double or halve the amount of light (depending on which way you go).
So, your camera code values, let’s say in 12 bits, encode the first EV below pure white between 4096 / 2 = 2048 and 4095. It means that half your encoding range is assigned to only the first EV below pure white, where your sensitivity is very low. Then, the second EV sits between 1024 and 2047, the third between 512 and 1023, the fourth between 256 and 511, the fifth between 128 and 255, the sixth between 64 and 127… until the twelfth, which can only take the values 0 or 1. That means the EV zones where you are the most sensitive are the ones that get the fewest code values.
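You can reproduce that accounting with a few lines of Python (just the arithmetic above, nothing more):

```python
bits = 12
high = 2 ** bits                    # 4096 code values in total, from 0 to 4095
for ev in range(1, bits + 1):
    low = high // 2
    print(f"EV {ev:2d} below clipping: codes [{low} ; {high - 1}] -> {high - low} values")
    high = low

# Code value 0 is the floor (black / noise), so the last EV is left with codes 0 and 1 only.
```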
That triggers a lot of problems, the most common being posterization in shadows (staircasing in the shadow gradients). We have 2 ways to deal with that :
- either ditch the integer encoding and switch to floating point representation, so we don’t care about sampling anymore and we can assume a continuous real encoding in the full range,
- or redistribute the human-defined EV steps around code values more evenly by applying a non-linear transform (the typical “2.2 gamma” applies a sort of square root, the Lab transfer function applies a cubic root, and modern video cameras apply a log directly), so each EV gets roughly the same number of code values (and the first one stops sucking half the values all for itself); see the quick comparison right after this list.
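To give an idea of what option #2 buys you, here is a rough Python sketch counting how many 8-bit code values each EV receives; the encodings are simplified pure power functions (not the exact sRGB or Lab curves), so treat the numbers as illustrative only:

```python
def codes_per_ev(encode, bits=8, evs=8):
    """Count the integer code values each EV below clipping receives, for an
    encoding that maps linear [0, 1] to encoded [0, 1]."""
    levels = 2 ** bits - 1
    counts, high = [], 1.0
    for _ in range(evs):
        low = high / 2.0
        counts.append(round(encode(high) * levels) - round(encode(low) * levels))
        high = low
    return counts

print("linear    :", codes_per_ev(lambda x: x))              # half the codes go to the first EV
print("gamma 2.2 :", codes_per_ev(lambda x: x ** (1 / 2.2))) # much more even spread
print("cube root :", codes_per_ev(lambda x: x ** (1 / 3)))   # Lab-like transfer function
```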
#1 is better to work on pictures, because… it preserves the linear connection between light emission and code values, so it keeps the multiplication property (along with many more that allow physically-accurate light transforms), and that’s how darktable’s pipeline works, but saving files in 32-bit float is super heavy and quite overkill. To save files, we will rather use #2, which is what modern “gamma”-encoded RGB spaces do (Adobe RGB, sRGB, etc.).
So, non-linear RGB spaces are just that : a maths trick to redistribute the code values more evenly between EVs, that should be used only for file saving or to send image buffers to your GPU (and then to your screen). If you plan on working on pictures saved with a gamma encoding (OETF), you should decode it first, then apply your image operations, then re-encode and save the result. But for some reason, the whole graphics industry has taken the bad habit of working directly on these gamma-encoded files through the whole pipeline, probably because it pegs the 18% middle grey around 50% (0.18^(1/2.2) ≈ 0.46), so it is more convenient to use with levels or curves GUI.
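As a sketch of that decode → edit → re-encode workflow, using a pure 2.2 power as a simplified stand-in for the real sRGB OETF (which actually has a small linear toe):

```python
import numpy as np

def decode(x):       # simplified inverse OETF: pure 2.2 power
    return x ** 2.2

def encode(x):       # simplified OETF
    return x ** (1 / 2.2)

print(encode(0.18))                          # ≈ 0.46: why 18% grey lands near mid-range

stored = np.array([0.46, 0.20, 0.75])        # pixel values as saved in a gamma-encoded file

linear = decode(stored)                      # 1. undo the OETF -> back to linear light
linear = np.clip(linear * 2.0, 0.0, 1.0)     # 2. physically meaningful edit: +1 EV exposure
result = encode(linear)                      # 3. re-encode before saving or display

print(result)
```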
And then, users can introduce non-linear transforms too, even in a linear pipeline, for example by applying a tone curve or a LUT. Basically, every lightness/contrast operator that is not a simple multiplication and/or addition (that is, not an exposure compensation) will de-linearize the RGB, which is fine for creative purposes as long as every physically-accurate transform comes earlier in your pipeline.
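Here is a toy Python example of why that ordering matters; the tone curve and the 3-tap blur are made-up stand-ins for any creative curve and any light-mixing operation (blur, resampling, blending):

```python
import numpy as np

def tone_curve(x):
    """A made-up contrast curve, standing in for any creative tone curve or LUT."""
    return x ** 0.8

def blur(x):
    """A 3-tap box blur: mixing light, which is only physically correct on linear data."""
    return np.convolve(x, np.ones(3) / 3, mode="same")

linear = np.array([0.05, 0.05, 0.9, 0.9, 0.05, 0.05])   # a hard edge, in linear light

good = tone_curve(blur(linear))   # physically-accurate mixing first, creative curve last
bad  = blur(tone_curve(linear))   # blurring the de-linearized values gives a different edge

print(good)
print(bad)
```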
And this is the very reason why I don’t like hiding pipelines from users, because it’s very important to know what operation you are doing on which signal, even if it means they have to wrap their heads around non-intuitive concepts, otherwise you could spend hours trying to figure out where those artifacts come from and why you can’t get rid of them.
TL; DR : linear RGB is what comes out of your camera sensor and means the RGB code values are directly (mathematically) connected to the light intensity. Performing multiplications and additions in linear RGB keeps the linearity of the RGB. Anything else turns it non-linear, which is useful for creative reasons and integer file encodings, but should happen after any operation relying on the physical consistency between code values and light emission, and should be reverted before applying physically-accurate image transforms.
Just be careful, because people usually call “linear RGB” any RGB space free of OETF/gamma encoding (what Elle Beth calls “linear gamma”), with no care given to what operations have been performed in those spaces. Using linear RGB spaces is only the first condition to preserve the consistency with light emissions, but you also need to ensure that nowhere in your pipeline do you apply a non-linear operation on your pixels.
Complement for the geeks : sensors are actually not truly linear to light emissions, and you see that clearly when your scene is not lit by a white-daylight illuminant. That’s why we need better input profiles than the bogus RGB → XYZ 3×3 matrix conversion.
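For illustration only, this is the kind of 3×3 conversion being criticized; the matrix below is the standard linear sRGB → XYZ (D65) matrix, used as a stand-in because an actual camera input profile ships its own matrix (and ideally something richer than a single matrix):

```python
import numpy as np

# A plain 3x3 matrix conversion from linear RGB to XYZ: exact only if the
# sensor data really is linear and really lives in the assumed RGB space.
M = np.array([[0.4124, 0.3576, 0.1805],
              [0.2126, 0.7152, 0.0722],
              [0.0193, 0.1192, 0.9505]])

rgb_linear = np.array([0.18, 0.18, 0.18])    # a neutral grey patch in linear RGB
xyz = M @ rgb_linear
print(xyz)
```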