I think that’s why it helps to understand at least some of the entire pipeline. To allay your fears: cameras don’t capture negative values. Each pixel is a light sensor, and, simplistically, if there’s no light at a pixel, it’ll measure 0 or some positive value close to that (think “noise”). It won’t say, “gee, it’s SO DARK I’m going to report it as a negative number.”
So, the real raw data from a sensor is an array of light measurements that roughly range from 0 to the saturation limit of the sensor. Those measurements are usually presented by the sensor’s analog-to-digital converters (ADCs) as binary integers. For the dynamic range of our cameras, that is usually delivered as a 16-bit integer, even though the camera’s range may only be 12- or 14-bit. My camera, a Nikon D7000, delivers 14-bit raw data, so its maximum possible measurement is 16,383. The light at a pixel may be brighter, but the ADC won’t push out a number bigger than its limit. That is highlight clipping.
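If it helps to see that in numbers, here’s a toy sketch in Python (made-up values, nothing resembling real camera firmware) of what that ADC ceiling does:

```python
# A 14-bit ADC can only emit codes 0 through 16383; anything brighter
# collapses onto that ceiling, which is what we see as blown highlights.
ADC_BITS = 14
ADC_MAX = (1 << ADC_BITS) - 1  # 16383

def adc_read(light_level: float) -> int:
    """Quantize an arbitrary light measurement into an integer ADC code."""
    code = int(round(light_level))
    return min(max(code, 0), ADC_MAX)

print(adc_read(5000.7))   # 5001  -> a normally exposed pixel
print(adc_read(20000.0))  # 16383 -> highlight clipping; the extra light is gone
```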
So, when smart folk advocate “editing in 16-bit”, they’re talking about keeping room big enough for the data to move around while we toadies brighten, white balance, saturate, and do other math things to it. Oh, and one of the early math operations is usually some sort of scaling to spread the data “evenly” through the 16-bit range. So, from now on, I’m going to talk about 16-bit image data.
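For the curious, that early scaling step looks roughly like this (a sketch using my camera’s numbers; real raw converters do this with per-camera white-level metadata):

```python
RAW_MAX = 16383    # top of my 14-bit sensor's range
WORK_MAX = 65535   # top of the 16-bit working range

def scale_to_16bit(raw_value: int) -> int:
    """Stretch a 14-bit raw value across the full 16-bit working range."""
    return round(raw_value * WORK_MAX / RAW_MAX)

print(scale_to_16bit(0))      # 0     -> black stays black
print(scale_to_16bit(16383))  # 65535 -> sensor saturation maps to the top bucket
print(scale_to_16bit(2248))   # 8992  -> everything in between stretches upward
```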
Given the sensor delivers integer data, most processing software continues to work with it that way. Thing is, integer data represents the measurements in a series of “buckets”: 0, 1, 2, 3, … 65,535. There is no place for the ADC to put a measurement of 23.586, so it has to go either in 23 or 24. And there is the first loss of beautiful information from the scene. We continue to lose information as we process the image, because each math operation wants to give us precise rational numbers, but the results have to be truncated or rounded to the nearest integer bucket. There are ~65K buckets in 16-bit, so we don’t see much difference as we work, but when the data is glommed (that’s a generic math operation) down to the 8-bit range of, say, JPEG, there are only 256 buckets, and the wrath of all those math operations starts to show as posterized tone gradations.
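Here’s the bucket problem in a couple of lines (toy numbers, just to show the rounding and the 8-bit squeeze):

```python
# One brighten operation: the exact result has no bucket to live in.
value_16 = 23
print(round(value_16 * 1.37))  # 31.51 becomes 32 -> a sliver of information lost

# Squeeze a smooth 16-bit ramp into 8-bit: distinct tones pile into the
# same bucket, which is what posterization looks like.
ramp_16 = [1000, 1100, 1200, 1300, 1400]
ramp_8 = [round(v * 255 / 65535) for v in ramp_16]
print(ramp_8)  # [4, 4, 5, 5, 5] -> five different tones collapse into two
```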
Yeah, negative numbers, I’m getting to that…
So, black is black, a total absence of light, and we mostly want 0 to represent that. But math doesn’t respect that notion; for some operations the resulting values can drop below 0. Really, that’s okay, because letting the result go negative and keeping that value retains information that we might be able to use later to pull detail back into the visible range. If we just clip it, that data is gone. On the low end, that’s what “crushing” blacks refers to; on the high end, “clipping”.
Ideally, what you want your software to do is let the data “spill” out of what can be displayed as operations are applied, and save the clipping for the final output to accommodate the particular medium.
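A toy illustration of why that matters, with made-up numbers for two shadow pixels and a heavy-handed black-level subtraction:

```python
shadow_a, shadow_b = 180, 150   # two dark pixels that genuinely differ in the scene
black_level = 200               # an aggressive black-point adjustment

# Clip immediately: both pixels crush to 0 and become indistinguishable.
clipped = [max(p - black_level, 0) for p in (shadow_a, shadow_b)]
print(clipped)   # [0, 0] -> that shadow detail is gone for good

# Let the values spill negative, lift the shadows later, clip only at output.
spilled = [p - black_level for p in (shadow_a, shadow_b)]
lifted = [max(p + 60, 0) for p in spilled]
print(lifted)    # [40, 10] -> the difference between the two pixels survives
```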
Oh, to the goal of the thread: floating point numbers are a way we can restore the ability to deal with the fractional part of image data in the computer. Funny thing, the predominant convention for representing image data in floating point is to use the range 0.0 - 1.0, where a 16-bit integer value from an image such as 2248 would be about 0.034301758 in the equivalent floating point image. That has to do with maximizing precision, and that discussion hurts the heads of even computer scientists, so I’ll offer nothing about it here. So in computer floating point the numbers are still digital, but the buckets are arbitrarily and infinitesimally small, and for our purposes are about as close to analog as we can get in our processing. Zero is still our ‘goal black’, but 1.0 is the upper limit for viewing instead of 65,535 or (horrors) 255. Note that to use floating point even as early as the hardware delivery, there’s still an integer-to-floating-point conversion required, so we’ll never really escape the fundamental tyranny of the sensor ADC.
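The conversion itself is nothing exotic; a sketch (I’m assuming the divide-by-65536 convention here, since that’s what gives the 0.034301758 above; some software divides by 65535 instead):

```python
SCALE = 65536.0  # one common normalization convention; 65535.0 is the other

def to_float(value_16: int) -> float:
    """Map a 16-bit integer pixel into the nominal 0.0 - 1.0 working range."""
    return value_16 / SCALE

print(to_float(2248))    # 0.0343017578125 -> the value mentioned above
print(to_float(65535))   # 0.9999847... -> just under 'white'
print(to_float(98304))   # 1.5 -> an out-of-range intermediate a float can still hold
```

That last line is really the point of the “unbounded” idea: the float version doesn’t have to throw anything away until we decide to.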
My thought about “unbounded floating point pipelines” is to facilitate moving unbounded images between programs, retaining the out-of-display values, both positive and negative, so they can be worked in other software that may have particular capabilities to recover them to the visible. FWIW…
Wow, a lot of writing on just one cup of coffee. I don’t mean to tutorialize smart folks here, but I’ve only recently learned some of this and I think it helps to pull it together this way. If you got this far, thanks for reading…