Human perception

Hello @kofa,
How to achieve higher contrast at a desired point in the image is an exciting question.

During raw development, how do you decide where the contrast should be higher, at the expense of even lower contrast in the other areas?

Of course, I am primarily interested in ART because I want to work with it. I see that you can do this with Sigmoid with Skew and a little with White point.

What other options are there for increasing the contrast at a certain point? I think you can do anything with the curves, but it can quickly look artificial if you don’t do it very sensitively.

I’m curious.

There is a critical misconception in the original post: that we need to encode a log image in order to look natural. This is false.

The most natural encoding is linear. If we record linear light and display linear light, it looks perfectly natural. Our eyes do the log encoding: it is our perception that rises linearly for a multiplicative increase in stimulus, but the stimulus itself is still linear. The log does not need to be applied digitally.

The problem in photography is, we do not have displays or printers capable of displaying a linear scene. A sunset may contain as much as 20 EV of dynamic range. That is a 2^20 range of linear light energy. A staggering contrast. Of which our cameras will record as much as 12 or 14 EV. But our displays will be able to show 8 or 10 EV, and a print 4 to 6 EV.
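
To put numbers on that: each EV is a doubling of linear light energy, so a range of n EV is a contrast ratio of 2^n : 1. The arithmetic, with the values from above, in a few lines of Python:

```python
# Each EV doubles the linear light energy, so a dynamic range of
# n EV corresponds to a contrast ratio of 2**n : 1.
for ev in (20, 14, 10, 6):
    print(f"{ev:2d} EV -> {2**ev:>9,} : 1")
# 20 EV -> 1,048,576 : 1   (sunset scene)
# 14 EV ->    16,384 : 1   (good camera sensor)
# 10 EV ->     1,024 : 1   (display)
#  6 EV ->        64 : 1   (print)
```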

It’s this reduction in contrast that requires the sigmoid curve. We need to map the tones of the recorded light onto the limited dynamic range of the display or print. A naturally lossy process. But there’s nothing “natural” or “perceptual” about it.
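
As a toy illustration of such a mapping (just the general shape, not ART's or any editor's actual sigmoid): work in log2 space so EV steps are evenly spaced, then squash with a logistic function:

```python
import numpy as np

def sigmoid_tonemap(x, contrast=1.2, mid_gray=0.18):
    """Toy sigmoid tone mapper: squash scene-linear values of any
    dynamic range into display range (0, 1).  Working in log2 space
    makes equal EV steps equally spaced; `contrast` is the steepness
    around mid-gray (steeper = more midtone contrast, harder roll-off
    at both ends)."""
    ev = np.log2(np.maximum(x, 1e-6) / mid_gray)  # stops above/below mid-gray
    return 1.0 / (1.0 + np.exp(-contrast * ev))   # logistic curve -> (0, 1)

scene = 0.18 * 2.0 ** np.linspace(-8, 8, 5)       # 16 EV of scene-linear input
print(sigmoid_tonemap(scene).round(3))            # ≈ [0.    0.008 0.5   0.992 1.   ]
```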

We have learned to interpret a small two-dimensional projection of a scene (what we call an image), and can derive emotional meaning from that. Non-technological societies cannot do this. They do not see meaning in a photo beyond a colorful piece of paper. This is learned behavior. And as such, we have learned that white parts in an image represent bright things such as sunlight, while black parts represent darkness. And that the tones in between are an artist’s rendition of critical detail. This is what we’re trying to achieve with our tone-mapper: map the critical parts of the scene to middle tones, while maintaining a semblance of brightness and darkness otherwise.

8 Likes

Sorry, I did not mean to imply that. I agree with what you wrote above.

1 Like

All pixel-wise operations can be represented as curves: on the x-axis, the input, on the y-axis, the output.
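
For instance (a toy sketch, not any particular tool's code), a gamma adjustment and an exposure push are both pixel-wise, so each is fully described by sampling its function on a grid, which also gives you a 1D lookup table:

```python
import numpy as np

# A pixel-wise operation is just a function y = f(x); sampling it on a
# grid gives its curve, which doubles as a 1D lookup table (LUT).
xs = np.linspace(0.0, 1.0, 256)          # x-axis: input values
gamma_22 = xs ** (1.0 / 2.2)             # y-axis: a gamma adjustment
plus_1ev = np.clip(2.0 * xs, 0.0, 1.0)   # +1 EV exposure, clipped at white

# Applying the operation = looking every pixel up on the curve:
image = np.random.rand(4, 4)
out = np.interp(image, xs, gamma_22)
```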

Then there are operations which act on areas. I believe ART and RawTherapee call this ‘contrast by detail levels’, but I don’t use them actively, so I’m not sure.

However, did you not ask for this topic to be opened because you wanted background info? The original one was about ART; maybe that would be a better place to ask ART-related questions.

1 Like

Low contrast

High contrast

If the tone curve of the image has a steep slope (= a narrower input range is allocated), the image has high contrast.

So…

High contrast: steeper = a narrow input range allocated to a wide output range

Low contrast: less steep = a wide input range allocated to a narrow output range
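
A quick numerical check of that, with a made-up S-curve (any smooth tone curve would do):

```python
import numpy as np

# Contrast = local slope of the tone curve.  Probe a toy S-curve at
# shadows, midtones and highlights:
f = lambda x: 1.0 / (1.0 + np.exp(-8.0 * (x - 0.5)))   # toy S-curve on [0, 1]
for x in (0.1, 0.5, 0.9):
    slope = (f(x + 1e-4) - f(x - 1e-4)) / 2e-4          # numerical derivative
    print(f"x = {x}: slope ≈ {slope:.2f}")
# x = 0.1: slope ≈ 0.30   (wide input range -> narrow output range)
# x = 0.5: slope ≈ 2.00   (narrow input range -> wide output range)
# x = 0.9: slope ≈ 0.30
```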

3 Likes

Thanks, @yasuo, for putting some curve tutorial into the discourse.

Understanding tone manipulation is about understanding transfer functions, and understanding transfer functions is about understanding curves. A curve is really just a graphic representation of a transfer function; if you took any tone operator, filmic, sigmoid, log, etc., and plotted its output for inputs from, say, 0.0 to 1.0, you’d get a curve. The same sort of curve you mess with in the control-point curve tool. The difference between the two is that the named operators have some equation at their root, while control-point curves have a point list and some spline equation to smooth the plot between the points.

And, as @yasuo points out, the slope of the curve defines the contrast applied (or not) to that segment of the tone distribution. The only thing I’d add to the graph would be the straight line between the bottom-left and top-right, which represents the “identity” of a curve, where each input value = the output, or ‘no-change’. Then, you get a sense of which parts of the curve are increasing the tone value, and which parts are decreasing it, based on whether they are above or below the identity line…
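
To make that concrete, here is a sketch of a control-point curve compared against the identity line (using a monotone spline for illustration; actual curve tools use their own spline flavors):

```python
import numpy as np
from scipy.interpolate import PchipInterpolator   # monotone cubic spline

# A control-point curve vs. the identity line y = x.
px = np.array([0.00, 0.25, 0.50, 0.75, 1.00])
py = np.array([0.00, 0.18, 0.50, 0.82, 1.00])     # a gentle S through the points
curve = PchipInterpolator(px, py)

xs = np.linspace(0.0, 1.0, 5)
print(xs)                    # the identity: output == input
print(curve(xs).round(3))    # below identity in the shadows (darkened),
                             # above it in the highlights (brightened)
```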

2 Likes

@bastibe, thanks for articulating what I couldn’t find words for…

What we need to consider in our photographic endeavors is, end-to-end, how the tone distribution is manipulated to make the final rendition. It’s our rendition media that we’re accommodating, not our vision. The whole gamma thing is really a legacy of the cathode ray tube, whose tone response was decidedly non-linear, and our workflow ever since has been whipsawed to accommodate it. LCD screens could provide linear response, but they are usually skewed to CRT gammas to accommodate legacy sRGB encoding.
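
For reference, that legacy encoding is a fixed, published function; the sRGB transfer curve (IEC 61966-2-1) is roughly gamma 2.2 with a short linear toe:

```python
import numpy as np

def srgb_oetf(linear):
    """sRGB encoding, linear light -> display signal: approximately
    gamma 1/2.2 overall, with a linear segment near black (the
    piecewise definition from IEC 61966-2-1)."""
    linear = np.asarray(linear, dtype=float)
    return np.where(linear <= 0.0031308,
                    12.92 * linear,
                    1.055 * linear ** (1.0 / 2.4) - 0.055)

print(srgb_oetf([0.0, 0.18, 1.0]).round(3))   # mid-gray 0.18 encodes to ≈ 0.461
```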

When I process a raw in rawproc, the first operation is to just ingest the raw data. I can select that first operation for display, and the image looks quite dark. But, not as dark as it really is because the display pipe is pushing that raw data through the display transform, where the color side is meaningless as each pixel only has one value, but the tone side is being applied. And, its effect is visible, and indicative of what the display requires.

I then add operations one-by-one, until the display rendition of the last tool in the chain looks nice. The tone curve, at the end of the chain, presents whatever additional tone manipulation is needed, usually a lifting of the shadows to accommodate my highlight-weighted exposure; the major lifting already done by the display transform. Again, all to accommodate the rendition medium. Stopped using middle-gray exposure some time ago…

So, when I export a JPEG, I transform the export data to the sRGB colorspace and tone curve and embed that profile. This makes the image presentable on a non-color-managed setup, and also provides the information for a color-managed setup to do an appropriate transform. A bit of a crapshoot, but it provides the best chance for my rendition to look like I think it should to others.

Okay, a bit of a story, but what I want to get across here is that we are working to the vagaries of our rendition media, not necessarily our perception…

4 Likes

Yes, the curves determine where the contrast is high and where it is low.
Do you know of any other ways in ART to increase the contrast at certain tonal values - always at the cost of other areas, of course?

That is correct – but isn’t it the log-like behaviour of our perception that lets us ‘get away with’ such extreme compressions?

That last bit is interesting. I have never read about such experiments. Do you have a source?
I have read about for example perspective: people spending their whole lives in dense forests, like jungles, supposedly don’t experience perspective as we do; when taken to the plains, they are amazed that distant animals ‘grow’ when approached. However, I find that suspicious: even in a forest, you have tall trees, and birds that look small when perched up high in the tree and larger when you cook them…

3 Likes

You’ll need rawproc on a mobile computer to do this… :laughing:

Say, you’re looking at a sunset. Take a picture of it. Import it to rawproc, assign the camera color space, black-subtract, demosaic, and scale the data to the black-white boundaries. Depending on how you exposed the shot, the sun may be blown, or the shadows may be night-black. So, why doesn’t a linear rendition on a monitor look the same as the actual scene? Mainly, because you’re looking at a device with its own limited tone response and gamut, with a linear image shoehorned onto it. The behavior of your vision is working with the display, not the actual scene. So, I don’t think it’s that our wildly complex vision system lets us “get away” with tone compressions, it’s that we need to do that to accommodate the limited range of the medium.

When you look at a picture, you’re looking at an oh-so-coarse-and-compressed rendition, not the actual scene…

1 Like

Where can I read about this?

1 Like

When we fit the brightness range of the image onto the display by linear scaling, we (generally) darken everything. I think it’s important to keep midtones intact, so they look like the average surroundings of the display, as a reference. But that means we need non-linear compression, usually at both ends of the range.
If the resolution of the display were infinite, despite the limited max. brightness, looking at the display in complete darkness would resemble the original scene, I think (although at extremely low light levels, our vision also changes).
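
A sketch of the difference (the anchored curve here is a rational toe/shoulder I picked for illustration, not any particular editor's formula):

```python
import numpy as np

# A scene with ~4 EV of highlight headroom above mid-gray.
scene = np.array([0.01, 0.18, 0.50, 1.00, 2.88])
mid, peak = 0.18, scene.max()

# 1) Fit by linear scaling: everything darkens by the same factor.
print((scene / peak).round(3))             # [0.003 0.062 0.174 0.347 1.   ]
#                                            mid-gray lands at 0.062

# 2) Midtone-anchored compression: pick a curve g(x) = a*x / (x + b)
#    with g(mid) = mid and g(peak) = 1, so only the ends are squeezed.
b = peak * (1.0 - mid) / (peak - 1.0)
a = mid + b
print((a * scene / (scene + b)).round(3))  # [0.011 0.18  0.409 0.637 1.   ]
#                                            mid-gray stays at 0.18
```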

2 Likes

I don’t have a source, I’m afraid. I remember reading about it somewhere. The argument went something like this: cave paintings and early art were very abstract. Even early medieval art lacked a concept of perspective, as we understand it today. All of these things had to be learned/discovered at some point. There were experiments with remote tribespeople who didn’t recognize scenes or other people on photographs. They could learn it easily enough, but the jump from a flat image to “seeing” the three-dimensional scene implied therein was not automatic.

I may have gotten parts of that from a (somewhat boring) online course by MoMA, and a (fantastic) online Stanford course on digital photography by Marc Levoy. Highly recommended, that last one. Both resources should be free.

Good question. It’s sort of my understanding of how psychophysics works for images. I have studied audio psychophysics for several years, it was part of my PhD work. I find it endlessly fascinating how we derive meaning from sparse stimuli. Essentially, we perceive not our sensor responses, but an internal world model that is continuously updated to fit the sensor data. As such, we hear/perceive a clean conversation between two humans, even though there’s traffic noise and city noises all around us. Somehow we can stand in a forest, and perceive the absolute cacophony of birdsong and wind noise as “blissful silence”. I dabbled some in the science of visual perception as well, but don’t know of a good textbook about it.

As I understand it, when we see a picture, we perceive our world model’s reconstruction of what the original scene must have looked like, despite only seeing a flat projection. This can be measured by brain scans. However, obviously, a sunset on a print is nothing like a real sunset. The real thing is all about the bright luminous color, contrasted against the deep dark foreground. But the print really isn’t bright at all, being reflected light from a mere lamp, nor particularly black. There is no luminance whatsoever. If you held up the print during the real sunset, it would be barely legible at all, dim and bland as it is. Yet we recognize it and it evokes a similar emotional response as the real thing.

Some of the Chaos Communication Congress talks by Joscha Bach may be a good starting point for the philosophy behind this, if not the science. Also, they’re highly entertaining. Watch them in chronological order for best impact. There are various texts on constructivism that express a similar sentiment from a philosophical viewpoint. Constructivism seems a great fit for the scientific literature on psychophysics (the science of perception; how sensory stimuli are perceived). There are textbooks on that topic, though none that I know of that make the connection to constructivist philosophy. The closest is perhaps the auditory homunculus, where hearing someone speak activates the same brain regions as if we were speaking ourselves, implying that we not so much understand the sounds of speech, as much as we reconstruct what it would have been like to speak those words, and thereof derive the intended meaning. Speech processing and perception is wonderfully weird like that.

3 Likes

Indeed. S’why I struggled with dynamic range tests before I understood this. I’d light a target, shoot it, and wonder why I couldn’t deal with the “shadows” like I had to in my outdoor shots. I finally set up a scene where the bright part was lit up on my desk, and the darker part was under the desk… :crazy_face:

Those of you with hearing aids may resonate with this…

I found that there wasn’t a setting that would let me pick out the immediate-area conversation from the wider cacophony of a crowded room. Take them out and I immediately regained that “superpower”, at the expense of the high frequencies that are my bane. There’s something missing from the hearing aid presentation that enables that discrimination…

That’s the fun part of audio signal processing. The best algorithms are always in the brain. We can do beamforming and adaptive Wiener filters and fancy binaural correlations, but as soon as we touch too much of the phase response, we kill the brain’s auditory object formation, and lose the signal in the noise.

Sometimes you can train your brain to learn to “read”, and interpret the hearing aid’s output. Cochlear implants are given to newborns right after birth, to give them a fighting chance of that. It can happen later in life as well, I’ve seen it happen. But it’s hard work, and doesn’t work for everybody.

Some people therefore swear by simple “open” hearing aids that do nothing more than boost the higher frequencies. While technically nowhere near as impressive as the fancy closed, multi-microphoned gadgets, they retain more of the original sound, which works better for some people.

I didn’t know such a thing existed.

When I got my first hearing test, the audiologist told me, “you’re missing hearing at frequencies at which women speak.” Really!? Can I get a letter saying that? Laminated?.. :laughing:

I’m definitely in that category. Just got back from the audiologist, to procure some of those little domes that go over the speaker (ripped one cleaning), so for the past few days I’ve been aid-less. Without them I can hear just fine, except for those high bits that help discern speech, and putting the aids back in puts that little bit of peak back into the signal. So now, you’ve got me wondering if there’s something out there that’d put the peaks in the prose, but also let me discriminate conversations in public space…

Okay, bit off-topic, back to the wonder that is human vision… Okay, had cataract surgery last year, interesting how yellow-ish my vision had become. But now I’m wearing reader bifocals, no correction at distance, 1.5 diopters close in, and they filter UV frequencies to boot. Interesting thing there is that my vision with those glasses is now back to “yellow-ish”, and it’s interesting how now I do (and sometimes don’t) accommodate that in what I see. All that to surmise, you can’t even count on direct viewing of a scene to be “canonical”…

1 Like

I would be very, very careful with conclusions from this type of “test”. If you know audio psychophysics, then you know how hard it is to construct experiments that discern how close (in the chain of abstraction layers within the brain) a cognate/percept is to the sensory stimuli.

Which immediately brings up many questions as to why a “little bit of learning” to view 2D representations with vastly different dynamic ranges still “works”, or what this “works” actually means.

Because maybe it doesn’t work in the sense of percept-similarity; maybe DR is simply not important as a memory aid to relive/imagine a moment?

And yet we can see immediate differences in the electronics-store when SDR and HDR content is played back side by side.
Also, does immersion profit from percepts that are closer to real dynamic ranges?
As in: is immersion facilitated more easily by higher-DR but unimportant for cognition as cognition works already on low-DR images after a bit of learning?

Does it though? Or is it just helping us to remember the percepts we had during the sunset that the print is not evoking?
Would a better print (or display) with higher DR help somehow to quicker, easier more vividly remember the scene?
If we are updating our world model (or its reconstruction) with percepts, then the percept quality (how close picture-percepts are to real-world percepts) is somehow involved.

Yes.
We have tone curves, the Tone Equalizer, and Sigmoid in ART to control image contrast. Tone curves are the traditional way to adjust contrast; however, they may cause hue and saturation shifts along with the tone adjustment. The Tone Equalizer is a newer way to adjust image contrast that avoids such shifts. Sigmoid may also cause hue and saturation shifts in principle; however, in ART it is arranged to avoid hue shifts, IMHO.
With log tone mapping, we can control how the data range is allocated, but we cannot control the shape of the tone curve.
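
A toy sketch of that idea (in the spirit of a log tone mapping, not ART's actual code): you pick which EV range around mid-gray gets the output, but the shape stays a straight line in log space:

```python
import numpy as np

def log_tonemap(x, black_ev=-8.0, white_ev=4.0, mid=0.18):
    """Toy log tone mapping: allocate the chosen EV range around
    mid-gray to [0, 1].  black_ev/white_ev control *which* data range
    is kept; the curve shape is fixed (straight in log2 space)."""
    ev = np.log2(np.maximum(x, 1e-9) / mid)
    return np.clip((ev - black_ev) / (white_ev - black_ev), 0.0, 1.0)

scene = 0.18 * 2.0 ** np.array([-8.0, -4.0, 0.0, 4.0])
print(log_tonemap(scene).round(3))   # [0.    0.333 0.667 1.   ]
```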

2 Likes

Would this help dog photos?


1 Like