Module position with respect to display vs scene ref'd

Why?

I think this is one of the pitfalls of the “Scene-referred” terminology. People build a notion that the picture is a replication of the scene.
Replicating a scene is, and always will be, impossible, and it really should not be the goal of photography. In the end, it isn't even desirable, according to studies and anecdotes.

There is no “ground truth”. There is no “accurate representation”. There is only a picture, as a standalone, on its own.

A painter sitting “in the scene”, even one who focuses on photorealism, will attenuate pure colors the same way they get attenuated in film photography. This is not what we experience in real life; however, it is how we look at pictures and understand what is in them.

There are no artworks that are highly regarded just because they accurately represented the scene. Even the sculpture of David, a hyper-realistic ‘representation’ of the human body, isn’t a masterpiece because of that.

If I asked you to present “objectively the best picture in the world”, you would probably fail to do so. And if you could, it sure as heck would not be because it ‘copied the scene most accurately’.

I find it’s helpful to think about a white teapot in a blue-lit, white room. If you take a picture, you have to decide whether the subject of your picture is the blue lighting or that the teapot is white, and balance accordingly. There’s no ground truth, nor accuracy.

Trying to replicate the scene accurately (and forever failing) is a weird little box people put themselves in. It disregards art, intention, meaning, ideas, personal touch and so much more.

Hopefully we can move away from that.

I disagree. Linear processing, at least to me, is just a tool that gives me more flexibility to present a picture the way I intend it to be, with minimum artifacts and maximum predictability.

I would like to stress once again, that “display-referred” and “scene-referred” are not better or worse than each other. Some things must be done before the ‘tone-mapping’ happens, some things must be done after. Employing both is the only way to have a full grasp on the image making pipeline.


First, why do you ask? There might be a question behind this question. Second, the reason for this is to test whether things have been preserved and that the algorithm is not cheating, for lack of a better word. I am not saying that it should be done, but it shows the robustness of the algorithm. If it is hacked together and based on shaky theory and implementation, then there is no way we can put Humpty Dumpty back together again.


Re: @AD4K

Photography, in the sense of imaging, is a broad field. It depends on expectations and requirements. If the goal is replication, say for scientific pursuits, then it is commendable. For artistic reasons, perhaps not so much. However, there is value in understanding scene-referred workflows. Certainly, decisions need to be made within the workflow. There is no one answer, as you have hinted.

The greatest artists out there hone their craft so they know how to find expression. The David example is great because Michelangelo clearly understood human anatomy, the human form and the medium and tools applied to create this work. It is from this foundation that David was crafted. In this sense, scene-referred is an important concept that should be explored.


I am unsure of where your disagreement lies. I agree with your posts for the most part, though I am trying to address the nuances, though perhaps not as masterfully.

There are elements that are easier to understand and implement in display-referred fashion. Under the hood, however, there is a slow crawl toward linear workflows and ordering of modules, because it is not only superior but also easier to manage once understood and implemented properly. It is still a work in progress in dt, and the existence of modules with both linear and display-referred features is evidence of that. Sometimes the modules themselves are fully in the linear camp, but are presented in human terms so users can understand them more easily. That can be a challenge for devs, whose brains can be wired a little differently from the average user's.

Linearity is not the sole arbiter of order either. There is also a preferred order of operations even among linear steps, though some are not clear and would depend on preference or purpose. But reading the raw buffer and metadata of a raw file must certainly come first!

The thing is that, as @flannelhead noted, there is also the concept of linear encoding and EOTF. There are other guiding principles and systems that certain people and industries follow such as OCIO and ACES, but I digress. This is to ensure that what we see is display/human-friendly and how we work with the image is consistent and computer-friendly. Under the hood, it is still a linear and hopefully standardized workflow. All the science and engineering is there.

Now, masking would be affected by non-linearity because we would be directly interacting with the image in display-referred fashion, to the spatial extent influenced by whatever masking mode we happen to be in and how it is assisting us in an abstracted manner.
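
As a toy illustration of that point (pure Python, not darktable code; the gamma value, threshold and pixel values are arbitrary assumptions), the same luminance threshold selects different pixels depending on whether it is evaluated on linear values or on display-encoded ones:

```python
# Toy illustration, not darktable code: the same 50% threshold selects
# different pixels depending on whether the mask is evaluated on linear
# values or on display-encoded ones. Gamma and values are arbitrary.

def encode(v):
    return v ** (1 / 2.2)  # simple stand-in for a display encoding

pixels = [0.05, 0.18, 0.30, 0.60, 0.90]  # linear luminances

mask_linear = [v > 0.5 for v in pixels]
mask_encoded = [encode(v) > 0.5 for v in pixels]

print(mask_linear)   # [False, False, False, True, True]
print(mask_encoded)  # [False, False, True, True, True] -- the 0.30 mid-tone now passes
```

In other words, where a mask sits relative to the display transform changes which pixels it grabs, even with identical settings.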

Certainly, the order in which the masking occurs is important too. Your note about the linear workflow coming before the display-referred one applies here as well. Again, it is not so simple. Briefly, masking depends on where the module currently is in the pipeline. If the masking is not directly entangled with the module’s algorithm but only meant to selectively apply the processing, then it may be better to generate the mask elsewhere and then copy it over to apply within the module. For the most part, this is unnecessary, because if our colour management and profiling (what people tend to call calibration here) are done correctly, we already see things in a scene-like display-referred manner. Edit: okay, this section got muddled a bit. It is way past midnight now. I blame exhaustion and tiredness!

Anyway, it is way past midnight, so maybe I am not being clear here either, or am talking nonsense. Going back to pipeline order: yes, generally the linear stuff should precede the display stuff. Whether all of it absolutely should? Well, it depends, as @flannelhead and others hinted at earlier and have explained in much more detail elsewhere. What I am getting at is that what is considered the display stuff can be further broken down into parts, such as the EOTF and the look, i.e. the editing of an image for aesthetics. And there are elements that do not belong to either EOTF or look, which would belong in the realm of image correction, e.g., colour, noise, distortion and bad-data correction.

There are so many layers of complexity to image processing, which is why we have this forum to nerd out on! :smile_cat: Thanks for participating!

Developing a ‘look’ often begins very early in the pipeline, before the image ‘formation’ (as flannelhead put it). It starts from the point where you pick what the neutral gray is. Or take the new “primaries” module that allows you to change the hue angles and purities of separate primaries: that cannot be done after the image formation. It is a part of picture building, as pure colors will attenuate to white with intensity, and that cannot (or, at least, should not) be brought back after the ‘formation’.

Another example is monochromatic photography, where the picture is essentially B&W but has a tone (which is not the same as a ‘tint’). This “conversion” to monochromatic needs to happen before the point when the picture is built (for attenuation, etc.).

These are some examples off the top of my head; I’m sure there are plenty more.

I’m talking in a general sense, this would apply in any software, not just Darktable and its specific modules.

I remain with the notion that a person who has mastered Darktable and their picture-building pipeline will most likely use both modes of processing on the same photo, employing tools that do processing both before and after the picture is built. It’s a tandem that should not be cut in half.


Thank you for sharing your thoughts. I think I need to re-think and re-evaluate some of my workflow. So far I have tried to place the tone mapper as late as possible in the pipeline to avoid clipping because of bounds, out-of-gamut problems, etc. My mantra was kind of: Filmic or Sigmoid will finally map everything to the output profile, and I don’t have to worry a lot.

I will play around with those things if I have the feeling that I cannot get the result that I have in mind. And in particular when it comes to masking. I like the approach to ask oneself “Which steps could have happened between the scene and the sensor?” To your list, I would add e.g. lens deblur, chromatic aberration and so on.

This discussion, however, illustrates the crux of open and powerful software … you have an almost infinite space of options and on the one hand, you need somehow to limit your degrees of freedom if you want to get things done. On the other hand, I like playing and that’s why I like DT :slight_smile:.

I think this idea was ground into quite a few people by filmic’s Dev, as this was his belief too, “work in the scene ref’d part of the pipe.” You see that reflected in some of filmic’s design choices, like the gamut mapping in v7.

While that kind of rigor or ethos might be nice in some instances, if you’re getting a result from a module you like in a specific place in the pipeline, go for it.

We need to keep in mind that the so-called *-referred is about the state of the image data. When the camera records an image, the data is scene-referred, that is, the individual magnitudes correspond to the light energy at the scene. Using camera data with those magnitudes to make a rendition is not satisfying, and the resulting image is just too dark. So, a tone transform has to take place somewhere in the pipeline to spread the data according to what we need to perceive as well as accommodate the rendition medium.
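
A minimal sketch of that “too dark” effect (pure Python, not darktable code; the encoding is the standard sRGB piecewise curve from IEC 61966-2-1, and the numbers are illustrative):

```python
# Toy sketch, not darktable code: linear "scene" values sent straight to a
# display look too dark; an encoding such as the sRGB inverse EOTF
# (IEC 61966-2-1 piecewise curve) lifts them to perceptually sensible levels.

def encode_srgb(v):
    """sRGB inverse EOTF (piecewise curve from IEC 61966-2-1)."""
    if v <= 0.0031308:
        return 12.92 * v
    return 1.055 * v ** (1 / 2.4) - 0.055

middle_gray = 0.18  # classic mid-gray reflectance, relative scene luminance

# Sent as-is, middle gray drives the display at only 18%.
naive = middle_gray

# Encoded, it lands near 46%, close to a perceptual mid-tone.
encoded = encode_srgb(middle_gray)

print(f"raw linear: {naive:.2f}  sRGB-encoded: {encoded:.2f}")
```

This is the minimum “spreading” the post describes; filmic, sigmoid and friends do a more elaborate job on top of it.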

The whole discussion of *-referred originated in the order of operations: earlier software would do things to the data after it was slewed to display-referred. That has been shown to be a sub-optimal approach, so the recent effort has been to move operations to before the display-referred transform, to work on the scene-referred data.

Of note is that the primary tone transform to display-referred happens in the color profile used for export/display. All this filmic/sigmoid/whatever shenanigans is just for looks.

I’ve played a lot with order-of-operations in rawproc, my hack raw processor. I’ve gradually come to a workflow where I don’t do any tone or color transforms until just prior to export, and do all my “editing” on the camera data in its scene-referred and spectral glory. Works just fine with well-behaved display and export transforms…


I don’t think a regular user should worry much – trust your eyes to judge the end result. If you find it difficult to achieve an end result you want with the defaults / default module order / whatever, then a deviation from the defaults is in order.

It is a bit problematic to generalize like this. As already discussed in this thread, I genuinely believe that certain operations are better done in the pre-formation state (“scene-referred”) while others are better done on the already formed picture (“display-referred”).

One example was the one with color balance rgb above: it relies on the existence of a “white intensity” in the data, but such a notion does not reliably exist in the “scene-referred” data. On the other hand, once the picture has been formed, the notion of white readily exists.

Another example would be adjustments like color zones. At least I prefer to do such adjustments on the picture where the input data has already gone through the shaping of luminance and hue to arrive at a reasonable picture. Especially, targeting a particular hue or lightness range would be very difficult if such a module was placed in the scene-referred part of the pipe. But if you operate on the picture, then you can see the reference point of those zone adjustments.


Isn’t “referred” just that - a reference point?

Scene-referred meant you used ‘the Scene’ as a reference
Display-referred meant you used ‘what you see on your display’ as a reference.

You would apply transformations based on what you’re referring to/your reference point.

Your explanation, it seems, leans towards “Scene-referred = good” / “Display referred = bad”, when they both can and should be used in any picture processing pipeline.

Yes, but context matters, and in the context of darktable the fulcrum is the display transform.

I disagree and I think you’re oversimplifying.


Disagreeing with what exactly?

If I’m “oversimplifying”, it’s just because I’m trying my best to keep the conversation grounded and understandable to all.

I agree! I’m not really sure just how many DT users there are out there who like to follow these forum posts on a regular basis, maybe hundreds. I’m one of them. It just seems like what starts out as helpful soon becomes a difficult read, often going into an area of discussion that only a few of those hundreds of users would understand. I admit that I am not one of them.

My point is that I was hoping for some answers regarding the “Module position” of display vs scene ref’d, that would be easy to understand. I will continue to read in the hopes that my brain finally starts to understand things in a logical way and one which can be applied to my edits within DT.


With this:


So personally, for me nothing beats experimentation… move some modules that you typically use around in the pipeline and see how the results differ, if at all… Try some extremes, which can sometimes reveal the impact. I think it’s not so formulaic. Each photo will offer you what it does, and then you build the picture you desire. I think that is a good description offered above: “build the picture”… Perhaps the steps you like to use to build might define your workflow, and it might deviate from another’s. People might in certain cases be able to offer examples of how a module might behave better, or maybe just differently, if you were to move it, but I don’t think it’s that rigid.

I think the videos by @s7habo are really nice examples… He builds the edit and adds modules and instances as needed… When he gets the edit to a certain point, he moves a module (often color calibration or color balance) further along to build on the state of the image, rather than just adding another instance in the default position.

To some extent the answer really lies in the editor knowing what they want to do with the photo and how to apply the tools to do that… I think in the end most of the moving around is going to be to support color grading edits/decisions…


One thing I haven’t seen mentioned is the difference between the order in which you apply/use modules (“workflow”), and the order in which they are processed (“pipeline”).
Those two are completely unrelated (in theory at least).

But you still see the changes when you add a module (so it’s last in the workflow), wherever that module sits in the pipeline. So there’s no reason to change the processing order for a module, just because its place in your workflow doesn’t correspond with its place in the pixel pipeline.

Workflow (history stack) and processing (pixel pipeline) orders are completely unrelated

There are indeed cases where it’s useful to move modules within the pixel pipeline, but not all that many.

@flannelhead : what exactly do you mean with “pre-formation stage” as opposed to an “already formed picture”, and why should those be linked to “scene-referred” and “display-referred” parts of the pipeline, resp.?
In your example with color balance rgb: I usually use it after adjusting the image to a good base, so after adjusting filmic (and thus setting my white point). But it is still applied to my image before filmic…
Similar with colour adjustments with color calibration rgb: I use a 2nd copy after setting up white balance and adjusting filmic, so that second copy is (much) higher in the history stack. It’s still below filmic in the processing stack (aka pixel pipeline).

Same with any localised changes: I apply them after setting up a good base picture, but I don’t care where those operations sit in the pixel pipe.

@AD4K : The difference between scene-referred and display-referred was originally based on how light values are encoded: in scene-referred, the pixel values are proportional to light energy (in the scene), in display-referred they are proportional to the log of the light energy. That log transform makes all the difference.
That “display-referred” is in practice also bound to the range of 0…1 isn’t essential (although it has its consequences…)
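
A small sketch of that distinction (pure Python; the base value of 0.1 is an arbitrary choice): in a linear encoding each stop of exposure doubles the value, while in a log encoding each stop adds a constant step:

```python
import math

# Toy sketch (arbitrary base value of 0.1): in a linear, scene-referred
# encoding each stop of exposure doubles the value; in a log encoding
# each stop adds a constant step instead.

stops = [0, 1, 2, 3]                            # exposure above a base level
linear = [0.1 * 2 ** s for s in stops]          # proportional to light energy
log_enc = [math.log2(v / 0.1) for v in linear]  # constant step per stop

print(linear)   # [0.1, 0.2, 0.4, 0.8]
print(log_enc)  # [0.0, 1.0, 2.0, 3.0]
```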

It seems that you have assumed that we cannot differentiate between the history stack and the pipeline.

We are explicitly talking about the pipeline, and the order in which modules are processed. Let’s not confuse the readers by bringing in the history stack, which is more or less unrelated to the discussion.

Disagree. Examples were given in this very topic.

Agree completely. I just want to sort out the semantics of the whole thing, make sure the fundamental condition is understood before discussing the implications. The terms “scene-referred” and “display-referred” are about the condition of the data and one fundamental operation between the two, the tone curve. Order-of-operations works around that, to the benefit or detriment of said data…


Roughly, pre-formation data ≈ “scene-referred”, formed picture ≈ “display-referred” in the current darktable terminology. sigmoid or filmic rgb are the modules doing picture formation – taking the input data and shaping it so that it can become a picture.
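
To make the “shaping” concrete, here is a toy stand-in for a picture-formation curve (hypothetical parameters, not darktable’s actual sigmoid or filmic implementation): it squashes unbounded linear scene values into the 0..1 display range, pinning middle gray to the middle and rolling off highlights instead of clipping:

```python
import math

# Toy stand-in for a "picture formation" curve (hypothetical parameters,
# not darktable's actual sigmoid): unbounded linear scene values go in,
# bounded 0..1 picture values come out, with smooth highlight roll-off.

def toy_sigmoid(x, gray=0.18, slope=2.0):
    """Map linear scene luminance (0..inf) to a formed picture value (0..1)."""
    if x <= 0:
        return 0.0
    t = slope * math.log2(x / gray)    # log-space distance from middle gray
    return 1.0 / (1.0 + math.exp(-t))  # logistic squash into 0..1

print(toy_sigmoid(0.18))  # middle gray -> exactly 0.5
print(toy_sigmoid(10.0))  # strong highlight -> close to, but below, 1.0
print(toy_sigmoid(0.01))  # deep shadow -> close to 0
```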

As explained in this thread (at least in these replies by @AD4K), this terminology is an intentional departure from the “scene-referred” / “display-referred” terms. Scene vs display carries the implication that the display-referred state is somehow a simulacrum or a replication of the scene, which it is not. A picture exists in its own right, and the terminology around picture formation tries to respect this.

Thinking in terms of the picture helps also to think about the pipeline order. Certain operations act on the data that is used to form the picture, other operations will manipulate the picture that has been already formed. As already mentioned, things like tinting, framing, watermarking etc. belong to the latter category. And my argument was that tools like color balance rgb may give better results when used to act on the picture instead of the data that is used to form the picture. I believe a meaningful distinction between shadows, midtones and highlights (as required by color balance rgb) can’t be reliably made on the data prior to the picture formation.

It is not about encoding but rather shaping the intensities to form a picture. The data after filmic / sigmoid represents the picture and is (roughly) linearly proportional to the light that is emitted by the display.

As an example, blurs can be applied in both states of the data, before the picture formation or after it, but the blur then has a different meaning or purpose. If you would like to simulate bokeh, that is something that should be applied prior to picture formation: as real bokeh happens physically in the lens, this effect should be calculated on the data that resembles the light caught by the camera sensor. On the other hand, if you would like to add an effect resembling spilled watercolor, that is definitely something that needs to be applied on the picture.
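
A toy 1-D sketch of that difference (pure Python; the power-law “formation” curve and pixel values are arbitrary stand-ins, not darktable code):

```python
# Toy 1-D sketch, not darktable code: the tone curve and pixel values are
# arbitrary stand-ins chosen only to show the ordering effect.

def tone(v):
    return min(1.0, v ** (1 / 2.2))  # simple compressive "formation" stand-in

def box_blur(row):
    """3-tap box blur with edge clamping."""
    return [sum(row[max(0, i - 1):i + 2]) / len(row[max(0, i - 1):i + 2])
            for i in range(len(row))]

# A dim row of pixels with one very bright specular highlight (linear values).
scene = [0.02, 0.02, 8.0, 0.02, 0.02]

# Bokeh-like: blur the linear light first, then form the picture.
before = [tone(v) for v in box_blur(scene)]

# Watercolor-like: form the picture first, then blur the formed values.
after = box_blur([tone(v) for v in scene])

print(before[1], after[1])  # the neighbour stays far brighter in the first case
```

Blurred in linear light, the highlight’s energy spreads to its neighbours before being compressed, so they end up much brighter (the bokeh-like case); blurred after formation, the already-compressed highlight simply averages away.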

Encoding is another matter. For example, when saving the picture to a JPEG file, the data is encoded with the sRGB inverse EOTF before it is quantized to the target bit depth. Same for displays; they have their own EOTF too. This can be considered just a practicality of saving / transferring the picture with limited bit depth. The code values, encoded by the inverse EOTF, are something that you should not try to apply a blur on, for example. I believe this is what is demonstrated in the example images here.
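
A minimal sketch of that export step (the sRGB piecewise curve from IEC 61966-2-1, then quantization to 8 bits; real encoders additionally handle primaries, chroma subsampling, metadata, etc.):

```python
# Toy sketch of the export step: apply the sRGB inverse EOTF (piecewise
# curve from IEC 61966-2-1), then quantize to 8-bit code values. Real
# encoders additionally handle primaries, metadata, subsampling, etc.

def srgb_encode(v):
    if v <= 0.0031308:
        return 12.92 * v
    return 1.055 * v ** (1 / 2.4) - 0.055

def to_8bit(v):
    return round(max(0.0, min(1.0, v)) * 255)

picture = [0.0, 0.01, 0.18, 0.5, 1.0]  # formed, display-linear values
code_values = [to_8bit(srgb_encode(v)) for v in picture]
print(code_values)  # [0, 25, 118, 188, 255]
```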
