Histogram Confusion?

ajax · May 31, 2018, 1:49pm

While trying to learn about image editing I’ve spent a fair amount of time working with both GIMP & RawTherapee (RT). In the process, I found something that I haven’t been able to explain from the documentation I’ve found. It has to do with histograms, which seem to me to be a concept/tool that has the same purpose in both GIMP and RT. With that in mind it seems necessary to understand the differences observed between them.

I’ve included images below which show side-by-side views of histograms produced by both GIMP and RT for the same image. There are 3 such images. One for each of the 3 color channels (R, G, and B). What looks pretty clear is that the histograms for both the Red and Green channels are quite similar whereas the Blue channel is quite different (at least with respect to amplitude). Insofar as the GIMP Blue channel is nearly imperceptible it is not very useful. Also it turns out that with GIMP when the “Value” (? 3 channels combined) is specified rather than one of the 3 color choices the resulting histogram, which can be seen in this post, looks essentially the same as the one for the Blue channel.

Can someone explain this discrepancy between the histograms in GIMP and RT?

ggbutcher · May 31, 2018, 4:05pm

Without inspecting either’s code, it would appear RT is scaling the vertical plot of each channel to that channel’s maximum, and GIMP is scaling it to the max of all three channels or something similar. If this assumption holds, I’d say that GIMP’s presentation is actually more informative, in that the relative magnitudes of each channel can be considered. I’d simplistically surmise the image that these histograms represent just doesn’t have a lot of blue in it.

RT’s histogram does more prominently illustrate that more blue values are being clipped at the upper bound. GIMP also shows it (I think) in that little ‘notch’ in the bottom right corner.

Goes to show, selectable log/linear scaling of the x axis is important, but relative scaling of the y axis also has to be considered in how we construct histograms.
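To make the two guesses above concrete, here is a minimal Python sketch of "scale each channel to its own maximum" versus "scale every channel to the maximum across all three". The function names, the 100-pixel plot height, and the pixel data are made up for illustration; nobody in this thread has checked what RT or GIMP actually do.

```python
def channel_histogram(values, bins=256):
    """Count how many samples fall on each of the `bins` integer levels."""
    counts = [0] * bins
    for v in values:
        counts[v] += 1
    return counts


def scale_heights(counts, reference_max, plot_height=100):
    """Map raw counts to bar heights in pixels, relative to reference_max."""
    if reference_max == 0:
        return [0] * len(counts)
    return [round(c * plot_height / reference_max) for c in counts]


# Tiny made-up image as a list of (R, G, B) pixels. Hypothetical data only.
pixels = (
    [(200, 180, 40)] * 4
    + [(200, 180, 60)] * 2
    + [(200, 120, 250)] * 2
    + [(90, 120, 90)] * 2
)

red, green, blue = (
    channel_histogram([p[i] for p in pixels]) for i in range(3)
)

# Option A: the blue channel is scaled to its own tallest bin.
blue_own = scale_heights(blue, max(blue))

# Option B: every channel is scaled to the tallest bin of all three channels.
shared_max = max(max(red), max(green), max(blue))
blue_shared = scale_heights(blue, shared_max)

print(blue_own[40], blue_shared[40])
```

With this data the tallest blue bin fills the plot under option A but only reaches half height under option B, because the red channel has a taller bin. The counts are identical in both cases; only the reference used for the y axis differs.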

james · May 31, 2018, 5:23pm

You can enable/disable scaling in RawTherapee, using the small triangle icon next to the histogram.

afre · May 31, 2018, 5:29pm

Both RT and GIMP have scaling and representation options. In the Curves window (GIMP), there are four buttons on the top right.

ajax · May 31, 2018, 5:38pm

While this could explain the difference in the histograms, it would suggest that the RT method is deceiving. If this means GIMP’s is more accurate, I’d say it could mean it is better than RT but still not useful.

ajax · May 31, 2018, 5:52pm

Nearly the upper half of the scene is clear blue sky on a bright sunny day. The bright sun does make it whiter (washed out if you will). In fact, making that sky a bit bluer without distorting other colors is one of my primary objectives for this picture. Something I’ve read about, but must admit to minimal understanding of, has to do with human perception. Could it be that even though I’m seeing a lot of blue, fewer pixels are involved in doing it?

ggbutcher · May 31, 2018, 5:55pm

Statistics guy here, don’t know if I’d call it ‘deceiving’. Maybe, ‘not as useful’…

Need to remember what a histogram is, and that is a set of counts of value occurrences. The accuracy word applies to the relative heights of the bars representing the counts, and for the single-channel histograms above, that relativeness is apparent. Maybe not prominent in some cases.

Image histograms are challenged in a number of ways. The one pertinent to this discussion is the space available for depicting the bars of the largest ‘bucket’; there are just not enough pixels in any display to allow a unit-quantifiable scaling of our large image channel values. So, the bars are scaled, down to little windows at that. But, in all of the histograms presented, the relationships of the individual bars are apparent and mostly useful.

RT’s histogram has another abstraction, that being depiction as a line across the tops of the ‘invisible’ bars. This is not a proper histogram presentation, but it is extremely useful for three-channel histograms; the three lines are more visible than would be the stacked bars.

When I use a histogram visually, I’m mostly concerned with 1) the upper and lower bounds, and 2) the general R-G-B relationship through the range of data. I use the former to consider black-white point setting, and the latter to gauge color adjustment for white balance, etc. I also write algorithms to tease characteristics out of histograms to manipulate the image; in that case I don’t care what they look like…

ggbutcher · May 31, 2018, 5:58pm

Now, I’m curious to see the image behind these histograms. I understand if you can’t post, but inquiring minds…

Edit: Oh, in this case even a JPEG would be informative.

ajax · May 31, 2018, 6:00pm

I have fiddled with those controls but the one you mention seems to have, at best, a microscopic effect. Maybe I should point out that in order to have both GIMP and RT processing the same image it is not a raw file but rather a .tif file created by GIMP. Showing the “chromaticity” (5th button from the top) appears to cause the scaling to change a noticeable amount.

ajax · May 31, 2018, 6:13pm

As someone very much in learn mode it is playing with these buttons that caused me to discover the things that confused me enough to make this post. Switching from linear to logarithmic does cause these histograms to occupy more real estate. I’d say this helps for the Blue channel (i.e., you can at least see something) but the Red and Green channels are flattened in a way that shows much less variation.
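The linear versus logarithmic behaviour described above can be sketched in a few lines of Python. This assumes the log view simply plots log(1 + count) scaled to the plot height; that is a common choice but not confirmed to be GIMP's formula, and the bin counts are invented.

```python
import math


def linear_heights(counts, plot_height=100):
    """Bar heights proportional to the raw counts."""
    peak = max(counts)
    if peak == 0:
        return [0.0] * len(counts)
    return [c * plot_height / peak for c in counts]


def log_heights(counts, plot_height=100):
    """Bar heights proportional to log(1 + count), so empty bins stay at 0."""
    peak = math.log1p(max(counts))
    if peak == 0:
        return [0.0] * len(counts)
    return [math.log1p(c) * plot_height / peak for c in counts]


# Hypothetical bins: a few sparse ones next to two very full ones.
counts = [50, 400, 100_000, 90_000, 300]

lin = linear_heights(counts)
log = log_heights(counts)

for c, a, b in zip(counts, lin, log):
    print(f"{c:>7}  linear {a:6.2f}  log {b:6.2f}")
```

In this sketch the sparse bins go from a fraction of a pixel to roughly a third of the plot height, while the two large bins end up almost the same height. That is the trade-off noted above: faint data becomes visible, but variation among the tall bars is flattened.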

afre · May 31, 2018, 6:33pm

There are many ways to represent data. E.g., darktable has a waveform view.

Apple has docs on the common scopes of the video world. Wish RT, dt and PhotoRaw had as many options.

ajax · May 31, 2018, 7:44pm

I converted the 16bit uncompressed tif to a jpg and uploaded to Google Photos. Here is a link to that image.

ggbutcher · May 31, 2018, 8:17pm

[Image: 4comparison-histogram]

Here’s the histogram from rawproc, my hack software. It scales the vertical axis to the max tone, sorta. FWIW…

Edit: It appears most of your blue is at the far right. I scrolled a bit to find the top, never got there. Oh, that’s probably what GIMP is doing to the blue channel, it’s scaling it to the biggest blue bucket, which pushes the rest of the data into the floor.
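A small Python sketch of the "biggest bucket pushes the rest into the floor" idea, with invented counts and a hypothetical 150-pixel plot height. Whether GIMP really scales this way is only a guess in this thread.

```python
def bar_heights_px(counts, plot_height):
    """Integer bar heights when the tallest bin fills the whole plot."""
    peak = max(counts)
    if peak == 0:
        return [0] * len(counts)
    return [c * plot_height // peak for c in counts]


# Hypothetical blue channel: modest counts almost everywhere,
# plus one huge bin at level 255.
counts = [500] * 253 + [0, 0, 400_000]

heights = bar_heights_px(counts, plot_height=150)

# Bins that contain pixels but are drawn with zero height.
hidden = sum(1 for c, h in zip(counts, heights) if c > 0 and h == 0)
print(hidden, "of", len(counts), "bins hold data but draw at 0 px")
```

Here every bin except the last one rounds down to zero pixels, so the plot looks empty apart from a one-pixel-wide column at the far right edge, which is easy to miss.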

ajax · May 31, 2018, 8:22pm

Got to thinking a little more about scaling. Like ggbutcher, I haven’t looked at any code either. I must also admit that I don’t fully comprehend what GIMP is doing that it calls “Logarithmic”. However, in its most basic form the histogram is trying to show how many of the total pixels are encoded with each of the possible (?tonal) values. I think that means that the histogram for each channel has the same number of values as does each other channel. Some kind of scaling is needed because the maximum number of pixels found with any particular value can vary quite widely based on the size and resolution of the image, and the available space within which the histogram will be presented is limited. It would seem that this presents a somewhat different problem for the kind of presentation that displays several histograms simultaneously than for the case where only one histogram is displayed at a time.
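One way to picture the point about image size is to divide each count by the total number of pixels, so that the histogram no longer depends on resolution. The Python sketch below is only an illustration with made-up sample values and a hypothetical function name; it is not taken from GIMP or RT.

```python
from collections import Counter


def normalized_histogram(values, levels=256):
    """Fraction of all samples that fall on each level (0 .. levels - 1)."""
    total = len(values)
    if total == 0:
        return [0.0] * levels
    counts = Counter(values)
    return [counts.get(level, 0) / total for level in range(levels)]


# The same tonal distribution at two hypothetical image sizes.
small = [10, 10, 200, 255]
large = small * 1000

hist_small = normalized_histogram(small)
hist_large = normalized_histogram(large)

print(hist_small[10], hist_large[10])
```

Every channel gets the same number of bins (256 here), and the two images give the same normalized histogram even though the raw counts differ by a factor of 1000. The drawing code still has to pick a vertical scale, which is where the per-channel versus shared-maximum question comes back in.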

In the examples presented herein it is RT which is normally displaying several histograms within a single view. The graphics displayed herein for RT have removed the other channels in each instance in order to accentuate the comparison to GIMP, which only shows one histogram per view. Therefore, if each of these separate (i.e., GIMP style) histograms is individually scaled, it would become possible for most of the values on the x-axis to be represented by a small number or even none of the total pixels. However, this should require that a large portion of the total pixels have one, or few, of the remaining values. In that case, there should be a spike somewhere on the x-axis. This cannot be found in the Blue channel GIMP histogram shown above.

To make things even more confusing the linear GIMP histogram for the combined channels (i.e., “value” specified on GIMP’s color curves) shown below looks pretty much the same as the Blue channel histogram shown above.

[Image: GIMPcurves]

ajax · May 31, 2018, 8:30pm

That one looks very much like the RT one to me.

I just got done with a much more verbose way of making your point that a spike somewhere would explain how everything else is reduced to very minimal values. Problem is there is no spike on the GIMP blue channel histogram. I also uploaded another image which shows the combined histogram from GIMP. It is pretty hard to distinguish from the Blue channel histogram.

ggbutcher · May 31, 2018, 8:40pm

I recall when I wrote my histogram code I had to “scooch” the scale to fit in the window, specifically to include the upper and lower clipping.

afre · May 31, 2018, 8:58pm

The bars are there. Just incredibly faint and hard to see. I suggest that you make a request to the GIMP devs to make the histogram more visible.

ajax · May 31, 2018, 9:28pm

Is the meaning of “scooch” something like “squeeze”? Would squeezing the scale mean that the end points went out of view?

According to both GIMP and RT there is very little clipping in this image. RT puts a value on it, which in this case is less than .02%. I’m thinking that clipping would have the effect of allowing pixels that should not be the same to end up in the same position on either the extreme left or right end of the histogram. Could also be both ends.

When it comes to useful information knowing about such spikes on either end would seem to me to be much more important than anything you might learn about the sparsely distributed other pixels. That might mean show the spikes no matter what that does to the rest of the histogram.
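The clipping percentage mentioned above can be approximated by counting how many samples sit at the extreme levels. The Python sketch below assumes 8-bit values and treats anything at 0 or 255 as clipped; how RT actually computes its figure is not known here, and the sample data is invented.

```python
def clipped_percent(values, low=0, high=255):
    """Percentage of samples sitting at or beyond the low/high limits."""
    if not values:
        return 0.0
    clipped = sum(1 for v in values if v <= low or v >= high)
    return 100.0 * clipped / len(values)


# Hypothetical channel: one sample at each extreme among 10,000 samples.
channel = [0] + [255] + [128] * 9998

print(f"{clipped_percent(channel):.2f}% clipped")
```

Note that a pixel at 255 is not necessarily "wrong"; it is only indistinguishable from anything brighter that was squeezed into the same bin, which is the effect described above.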

ggbutcher · May 31, 2018, 9:35pm

It’s a technical term.
What it means for the histogram is that I have to horizontally scale the histogram graphic to the window dimensions minus a pixel or two, I forget now what it is. That way, those vertical ascents at the ends don’t end up in the window borders.

Can’t tell from the JPEG, but you have a lot of blue in the sky that falls in the range 253-255. From the raw, RT is in a better position to evaluate the clipping done for the output, so I’d put stock in that .02%.

This is exactly why I ‘scooched’ my histogram scaling.

ggbutcher · June 1, 2018, 2:51am

Just checked my histogram scaling, and I subtract 6 from the window width and height numbers before I set the scaling. IIRC, it was a ‘try, nope, try, nope, try, closer, try, Yes!’ thing…
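For readers wondering what "scooching" might look like in code, here is a hedged Python sketch. Only the "subtract 6 from the window width and height" detail comes from the post above; the function name, parameters, and example numbers are hypothetical, and this is not rawproc's actual code.

```python
def plot_scales(window_w, window_h, bins, peak, margin=6):
    """Return (x_scale, y_scale) in pixels per bin and pixels per count.

    The margin is removed from both dimensions before scaling so that
    tall bars at the first and last bins are not lost under the window
    border.
    """
    usable_w = window_w - margin
    usable_h = window_h - margin
    x_scale = usable_w / bins   # horizontal pixels available per bin
    y_scale = usable_h / peak   # vertical pixels per unit of count
    return x_scale, y_scale


# Hypothetical 262 x 106 pixel window, 256 bins, tallest bin = 1000 pixels.
x_scale, y_scale = plot_scales(262, 106, bins=256, peak=1000)
print(x_scale, y_scale)
```

With these numbers each bin gets exactly one pixel of width and the tallest bar is 100 pixels high, leaving a few pixels of slack around the plot for the clipping spikes at either end.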