Yes, it’s a bit of a head-hurter, even if one is familiar with some aspect of the tech. I’m a software person by trade, and I still had to do a lot of digging to piece it all together. Below is a missive that requires just a bit of math and digital-encoding understanding, so please don’t hesitate to ask clarifying questions. Also, apologies in advance for the English; I sense it may not be your first language, and even though it’s my first, I really don’t know it all that well…
It helps to start where the measurement is done, at the sensor. Without getting into all the hardware things, essentially a photo-sensitive thing gets hit by light, and the circuitry takes the electrical response of that thing and turns it into an integer number representing the light’s intensity. 0 means ‘no light’, or black, and the measurements progress linearly as light intensity increases, up to a point where the photo-sensitive thing can no longer resolve a difference, called the saturation point. Light may get stronger, but the measurements just pile up at that value.
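To make that concrete, here’s a tiny sketch of the idea in Python. Everything here is my own illustration, not any camera’s actual circuitry: a made-up linear `gain` converts light intensity into counts, and everything past the saturation point just piles up there.

```python
import numpy as np

# Hypothetical illustration of a linear sensor response that clips at a
# saturation point. The gain and saturation values are invented for the demo.
def measure(photon_flux, gain=100.0, saturation=16383):
    """Turn light intensity into an integer count, linear up to saturation."""
    counts = np.round(photon_flux * gain).astype(np.int64)
    return np.minimum(counts, saturation)  # stronger light just piles up here

light = np.array([0.0, 1.0, 50.0, 200.0, 500.0])
print(measure(light))  # -> [    0   100  5000 16383 16383]
```

Note the last two inputs differ by a lot, but the measurements are identical: that’s the pile-up at saturation.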
So, what the camera delivers in the raw file is an array of unsigned integers comprising the light measurement at each pixel (I’m going to ignore the Bayer or X-Trans filtration for this discussion). With today’s sensor technology, the measurements are usually encoded as 16-bit unsigned integers, even when the camera sensor resolves less. My Nikon D7000 and Z6 will both deliver up to 14-bit sensor data in 16-bit buckets, because that’s how computers are organized. So, even though a 16-bit bucket can express values from 0 to 65535, the stored values only go from 0 to 16383 (14 bits). Notice I’m not talking about stops yet, or the notion of white…
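A quick demo of that “14 bits in a 16-bit bucket” arrangement, using NumPy’s `uint16` as the bucket:

```python
import numpy as np

# 14-bit sensor values stored in 16-bit unsigned-integer buckets.
# The sample values are invented; a real raw file holds one per pixel.
raw = np.array([0, 8191, 16383], dtype=np.uint16)

print(np.iinfo(np.uint16).max)  # the bucket can hold up to 65535...
print(raw.max())                # ...but 14-bit data tops out at 16383
print(2**14 - 1)                # which is exactly 2^14 - 1
```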
After reading the raw file, the raw processor software has this array of 16-bit integers with which to start work. Some raw processors keep the data in this format through the entire workflow, and that’s not such a bad thing because, 1) 16 bits is a lot of precision with which to work images, and 2) there are at least two bits of headroom to allow the image values to be manipulated before they hit the top of the 16-bit ‘bucket’. The need for this headroom shows up in operations such as white balance, where the values recorded under the respective red, green, and blue CFA filters are multiplied by per-channel numbers, e.g., 0.899, 1.0, and 1.35 as the respective red, green, and blue multipliers. Two bits of headroom may not sound like much, but in the case of a 14-bit raw image a white balance multiplier of 4.0 (not common) would be needed to reach the 16-bit top.
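The arithmetic behind that headroom claim, using the example multipliers above (which are just plausible-looking numbers, not from any particular camera):

```python
# Why two bits of headroom is enough for typical white balance:
# the brightest possible 14-bit value, scaled by per-channel multipliers.
wb = {'red': 0.899, 'green': 1.0, 'blue': 1.35}
max_14bit = 16383                    # brightest value a 14-bit sensor records
max_16bit = 65535                    # top of the 16-bit bucket

for channel, mult in wb.items():
    scaled = max_14bit * mult
    print(channel, scaled, scaled <= max_16bit)  # all fit comfortably

# Only a multiplier of 4.0 or more would push a 14-bit maximum to the top:
print(max_14bit * 4.0)  # 65532.0 -- right at the 16-bit ceiling
```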
Still haven’t talked about ‘white’, but now seems like a good time to bring it in. In camera terms, there really isn’t a notion of “white”, but instead there’s that saturation point thing that might look like ‘white’. That’s only because when all three channels pile up there, it looks like white because all three channels are the same value. When we work with our raw data, we really want to defer our anchoring of white in the data until we go to display or export it, where white then becomes meaningful in terms of what the rendition media expects. Indeed, while we process, we ideally want our data to grow and shrink without bound so it retains all the energy relationships of the original light. Then, at display or export, we “put a pin in it” somewhere, and maybe use highlight reconstruction to drag data to the right back into the renderable range. This is what is referred to as “scene-referred” workflow, where all the work is done on the data in its original light energy relationships, or what’s called ‘linear’.
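Here’s a minimal float sketch of that deferred-white idea. The numbers are invented, and I’m using a plain clip at export where a real raw processor would offer highlight reconstruction or a tone curve:

```python
import numpy as np

# Scene-referred sketch: while editing, float values may exceed 1.0 freely.
scene = np.array([0.05, 0.4, 0.9], dtype=np.float32)
scene *= 2.0          # +1 stop of exposure; 0.9 becomes 1.8, nothing is lost

# "Put a pin in it" only at display/export time: anchor white at 1.0.
# (A plain clip here; real software would reconstruct highlights instead.)
display = np.clip(scene, 0.0, 1.0)
print(display)        # the 1.8 value now sits at display white
```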
The integer buckets we’ve talked about to date have a problem with that sort of processing, as those 16-bit buckets have a maximum value that’s not that far away. And, when we do math to data that pushes it past that maximum, the data “wraps around” to zero, which looks exceedingly bad in rendered images. This is where floating point representation has value, as its maximums are well beyond anything we’ll encounter in image processing for the foreseeable future (do NOT start thinking about IP addresses, please… ). Indeed, the common convention for using floating point to represent image data is to use the range 0.0 - 1.0. Doesn’t sound like a lot of space, but you can put a lot of digits to the right of a decimal point. The big advantage is that data can grow well past that 1.0; indeed, in scene-referred processing, 1.0 doesn’t have a real meaning (NOT white, get it?). Of note, some software uses a different range convention for its floating point representation, e.g., RawTherapee uses 0.0-65535.0, which I assume is to line up with the numeric values provided by the 16-bit integer raw files. G’MIC uses 0.0-255.0. Note they all have 0.0 as the lower anchor; black is black, no energy, so zero makes fundamental semantic sense.
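You can watch the wrap-around happen with NumPy’s `uint16`, and see how float sails right past the same boundary:

```python
import numpy as np

# The wrap-around problem: unsigned-integer math is modular, so pushing a
# uint16 past 65535 circles it back toward zero.
vals = np.array([60000], dtype=np.uint16)
print(vals * np.uint16(2))   # 120000 mod 65536 = 54464 -- ugly in an image

# Float with the 0.0-1.0 convention: the same doubling just exceeds 1.0,
# and no information is destroyed.
fvals = vals.astype(np.float32) / 65535.0
print(fvals * 2.0)           # ~1.83 -- past "white", but nothing is lost
```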