More robust color picker in Darktable

I have an idea for an improvement to the color pickers in Darktable used by various modules…

Depending on which parameter in which module the color is being picked for, either the mean, min, or max of all pixels in the selected area is used, per color channel. The min and max are sensitive to outliers - a single pixel that’s very different from the others is enough to throw off the picked value.

I keep running into this problem when trying to use the Dmax picker in negadoctor. Either a speck of dust on the negative or a stray pixel with a negative RGB value causes it to pick 6.00 dB instead of a more reasonable value. I’m not sure whether this comes up often with other modules that pick the min/max value - has anyone else had trouble with this?

A possible solution is to ignore the few pixels with the highest or lowest values and take something like the pixel value at the 0.5th percentile as the min, and the value at the 99.5th percentile as the max.
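For reference, here’s a minimal sketch (not darktable code) of what the exact computation would look like; sorting a copy of one channel’s values makes the cost obvious:

```c
// Reference-only sketch: the exact percentile of one channel, computed by
// sorting a copy of the values.  O(n log n), so too slow for the real picker,
// but it defines the target any approximation is trying to hit.
#include <stdlib.h>
#include <string.h>
#include <math.h>

static int cmp_float(const void *a, const void *b)
{
  const float fa = *(const float *)a, fb = *(const float *)b;
  return (fa > fb) - (fa < fb);
}

// q in [0,1], e.g. 0.005f for the 0.5th percentile, 0.995f for the 99.5th
static float exact_percentile(const float *values, size_t n, float q)
{
  float *copy = malloc(n * sizeof(float));
  memcpy(copy, values, n * sizeof(float));
  qsort(copy, n, sizeof(float), cmp_float);
  const size_t idx = (size_t)lroundf(q * (float)(n - 1));
  const float result = copy[idx];
  free(copy);
  return result;
}
```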

It would obviously be inefficient to find the exact quantiles (it would require sorting all the pixels!!) but an approximation should be good enough. There has been a lot of research into efficient algorithms for approximating quantiles on large data sets, for example the t-digest. As a quick experiment I took this C implementation and tried calling it from color_picker_helper_4ch() in src/common/color_picker.c, replacing the existing min/max/mean calculation, and it seems to work well. Single-threaded it’s a little slow, but it parallelizes easily and runs quickly on an entire image on my 6-core machine. Once the t-digest has made a single pass over the entire area, its output can be queried for any arbitrary quantile.
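Roughly what the experiment looked like - a sketch only, with td_* as placeholder names for whatever the t-digest library actually exposes (create, add a sample, query a quantile), not the library’s real API:

```c
// Sketch of replacing the min/max scan for one channel with a t-digest.
// The td_* declarations below are placeholders standing in for the real
// t-digest library's create / add / query / free functions.
#include <stddef.h>

typedef struct td_histogram td_histogram;
td_histogram *td_new(double compression);
void td_add(td_histogram *td, double value, double weight);
double td_quantile(td_histogram *td, double q);
void td_free(td_histogram *td);

static void pick_channel_range(const float *pixels, size_t n_pixels, size_t stride,
                               float *out_min, float *out_max)
{
  td_histogram *td = td_new(100.0);          // compression factor: accuracy vs memory
  for(size_t i = 0; i < n_pixels; i++)
    td_add(td, pixels[i * stride], 1.0);     // one pass over the picked area
  // after the single pass, any quantile can be queried
  *out_min = (float)td_quantile(td, 0.005);  // 0.5th percentile as robust "min"
  *out_max = (float)td_quantile(td, 0.995);  // 99.5th percentile as robust "max"
  td_free(td);
}
```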

Maybe this could be calculated alongside the existing min/max, and any module that wants to reject outliers could query it instead of accessing picked_color_min or picked_color_max. Or it could be enabled optionally for all modules - for example, shift-drag instead of drag to select the picked area, and picked_color_min and picked_color_max get set to the 0.5th and 99.5th percentiles instead of the actual min and max.

I haven’t experimented with this much, so I’m not sure whether specific percentiles like 0.5% and 99.5% could be hardcoded (rather than being adjustable) and still work well in most cases.

Interesting… Although that C code would benefit from more parallelization in the case of image processing :wink:
I also wonder about the float/log histograms presented there. That too could be interesting. I haven’t had the time to dig into it. It seems that you did, so proposing a PR would be a good way to start the conversation IMO :slight_smile:

The existing t-digest C implementation can be parallelized by creating a separate t-digest instance for each thread that’s working on a subset of the pixels, and then merging all of them together into one t-digest after all of the threads are done.
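A sketch of that pattern with OpenMP, again using placeholder td_* names plus an assumed td_merge(dst, src) operation:

```c
// Per-thread digests merged at the end: no locking needed while adding samples.
#include <omp.h>
#include <stddef.h>

// placeholder t-digest API (same assumed operations as the earlier sketch)
typedef struct td_histogram td_histogram;
td_histogram *td_new(double compression);
void td_add(td_histogram *td, double value, double weight);
void td_merge(td_histogram *dst, td_histogram *src);
double td_quantile(td_histogram *td, double q);
void td_free(td_histogram *td);

static void pick_channel_range_parallel(const float *pixels, size_t n_pixels, size_t stride,
                                        float *out_min, float *out_max)
{
  const int n_threads = omp_get_max_threads();
  td_histogram *digests[n_threads];                 // one digest per thread
  for(int t = 0; t < n_threads; t++) digests[t] = td_new(100.0);

#pragma omp parallel for schedule(static)
  for(size_t i = 0; i < n_pixels; i++)
    td_add(digests[omp_get_thread_num()], pixels[i * stride], 1.0);

  // merge the per-thread digests into the first one after all threads are done
  for(int t = 1; t < n_threads; t++)
  {
    td_merge(digests[0], digests[t]);
    td_free(digests[t]);
  }
  *out_min = (float)td_quantile(digests[0], 0.005);
  *out_max = (float)td_quantile(digests[0], 0.995);
  td_free(digests[0]);
}
```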

But maybe the t-digest is overcomplicating this… if a histogram of the picked area already exists and the bin resolution is fine enough, it might be easier to just use that. Sum up all the bins to get the total, then accumulate from the left side and from the right side; when the running sum reaches something like 0.5% of the total, that bin becomes the min or max respectively. That’s exactly how the ‘enhance contrast’ function in ImageJ works (source).
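In code, the histogram walk could look something like this (a sketch; bin_to_value() stands in for however the histogram maps bin indices back to pixel values):

```c
// Walk a per-channel histogram inward from both ends, skipping the first
// 0.5% of pixels on each side, and use those bins as the robust min/max.
#include <stdint.h>

static void robust_min_max_from_histogram(const uint32_t *bins, int n_bins,
                                          float (*bin_to_value)(int bin),
                                          float *out_min, float *out_max)
{
  uint64_t total = 0;
  for(int b = 0; b < n_bins; b++) total += bins[b];
  const uint64_t threshold = (uint64_t)(0.005 * (double)total);

  // first bin from the left whose cumulative count exceeds the threshold
  uint64_t sum = 0;
  int lo = 0;
  while(lo < n_bins - 1 && (sum += bins[lo]) <= threshold) lo++;

  // first bin from the right whose cumulative count exceeds the threshold
  sum = 0;
  int hi = n_bins - 1;
  while(hi > 0 && (sum += bins[hi]) <= threshold) hi--;

  *out_min = bin_to_value(lo);
  *out_max = bin_to_value(hi);
}
```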

There might be a much easier solution to this than trying to figure out the pixel value at a certain percentile…

The specific problem I’m trying to solve is that there are usually a few random scattered outlier pixels in the scaled-down preview image used by the color picker. It picks one of these as the minimum value and ends up with a value that isn’t actually representative of the darkest tone in the image, so something like the negadoctor Dmax picker or the filmic rgb black relative exposure picker gets an implausible value. I think these outliers are probably demosaic artifacts.

If I set color smoothing in demosaic to five times (which applies a median filter) or put a module that blurs the image in the pipeline before the module with the picker, it behaves a lot more consistently and picks more reasonable min values.

So it might be worthwhile to modify the color picker so that it always picks from a filtered version of the preview image instead of the actual preview image, to avoid these outliers. As a quick experiment I tried a 3x3 blur (copied from the tone equalizer’s get_luminance_from_buffer) and a 3x3 median filter, and both seem to pick sane-looking minimum values. I’ve only tried this on a couple of images though; I’d need to test a lot of different images where the color picker fails to be sure it’s really an improvement.
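For illustration, a 3x3 box blur on a single channel is only a few lines - this is a sketch with a made-up buffer layout and function name, not the code copied from tone equalizer. A 3x3 median filter would be built the same way, with the average replaced by sorting the up-to-9 neighbours:

```c
// Average a pixel with its 3x3 neighbourhood in one channel of an
// interleaved float buffer, clamping at the image borders.
static float blurred_sample(const float *img, int width, int height,
                            int x, int y, int stride)
{
  float sum = 0.0f;
  int count = 0;
  for(int dy = -1; dy <= 1; dy++)
    for(int dx = -1; dx <= 1; dx++)
    {
      const int xx = x + dx, yy = y + dy;
      if(xx < 0 || yy < 0 || xx >= width || yy >= height) continue;  // skip out-of-bounds neighbours
      sum += img[(yy * width + xx) * stride];
      count++;
    }
  return sum / (float)count;
}
```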

Does this make sense, is it worth pursuing? Is anyone else having trouble with color pickers misbehaving?


Working on thousands of digitized negatives, I find the Dmax color picker in negadoctor almost unusable, so I don’t use it. About 50% of the images result in 6 dB due to outliers and/or negative RGB values. No problems with other modules so far.

Seems like a sane approach imo. Go for it!

Maybe the 0.5th-percentile approach would work better for negadoctor… in addition to single-pixel artifacts from demosaicing etc., the Dmax picker in negadoctor also has to deal with big opaque specks of dust on the negative that might be a few pixels across.

Something that rejects single-pixel artifacts would still be somewhat of an improvement though; at least it would work on negatives without any big dust spots.
