HDR, ACES and the Digital Photographer 2.0

@Carmelo_DrRaw @Elle I read through lots of PDFs today. I didn’t keep track of them, but here is one that seems to address much of it: https://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-BT.2390-1-2016-PDF-E.pdf, which I only skimmed just now, sorry. The obsessive reading spree happened on my small-screened and half-touch-broken phone while I was busy doing something else. :joy_cat: I might have gotten things mixed up because of that, and because there is so much info out there, but here are some points that might be relevant. Again, I am speaking in non-technical, possibly vague terms. My purpose in threads like these is to brainstorm and / or provide a sanity check on a very complex subject. I will leave the technical efforts and battles to the rest of you. :innocent:

1. There are two standard transfer functions: Perceptual Quantizer (PQ) and Hybrid Log-Gamma (HLG). Each has its strengths and weaknesses. Briefly, PQ is an absolute, display-referred signal; HLG is a relative, scene-referred signal. The former needs metadata and isn’t backwards compatible with SDR displays; the latter is. Depending on the rest of the specs, esp. for PQ, there is a preferable type of HDR display and surround. A common measure here is peak luminance, expressed in nits.

Looking into this would probably answer @Carmelo_DrRaw’s question. Many documents show what happens with various display and surround combinations. Pretty graphs and descriptions. Makes me want to root for one or the other as if it were a competition. :stuck_out_tongue: (I am leaning toward HLG :racing_car: :horse_racing: :soccer:).

2. Next we have @Elle’s link and comments.

Stupid GitHub now won’t let me search from its front page without logging in. MS’s handiwork? :angry:

These commits and their comments show how variable the standards can be. There were PDFs talking about the choices made by various entities, workflows and devices. The discussion varies depending on the perspective of the document or slideshow publisher, but you get a gist of what the common themes are from the infographics, tables and figures.

@Elle’s particular linked document HDR.pdf gives examples in the form of waveforms, which is very helpful from our perspective. Photographers tend to use the histogram (JPG); videographers use waveforms (and other scopes) to quickly gauge where the DR sits, among other things. As you look at the images, to me at least, it is easy to understand why “there is no agreed upon diffuse white point level”. It has to do with a lot of things, a few of which I will briefly list in the next paragraph.

Just as we need to make decisions when we look at the camera’s histogram (generally generated by the preview JPG, not the raw!), the videographer has to look at the scopes to determine and decide on the DR and the distribution of tones, among other things. Choices need to be made (edit: and we need to consider leaving some data and perceptual headroom too). Hopefully consistent ones per batch or project. These decisions are based on a number of factors including personal experience and tastes; client and product expectations; workflow; and ultimate output and viewing conditions. There is a lot to be said about point #2 but I have to rest after a tough day!


It’s definitely the second link: http://www.streamingmedia.com/Downloads/NetflixP32020.pdf

Linear to st2084, 10000 nits:

y=\big({c1 + c2* x^{m1} \over (1 + c3*x^{m1})}\big)^{m2}

Linear to st2084, 1000 nits:

y=\big({c1 + c2* (x/10)^{m1} \over 1 + c3*(x/10)^{m1}}\big)^{m2}
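For reference, the constants in both formulas are the standard SMPTE ST 2084 values (they match the numbers in the expanded expression further down):

m1 = 2610/16384 = 0.1593017578125, m2 = (2523/4096)*128 = 78.84375
c1 = 3424/4096 = 0.8359375, c2 = (2413/4096)*32 = 18.8515625, c3 = (2392/4096)*32 = 18.6875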

Tested in Vapoursynth https://github.com/vapoursynth/vapoursynth where

c=core.resize.Bicubic(clip=c, format=vs.RGBS, transfer_in_s="linear", transfer_s="st2084", nominal_luminance=1000)

is equivalent to this expression in reverse Polish notation:

c = core.std.Expr(c, expr=" 0.8359375 x 10 / 0.1593017578125 pow 18.8515625 * + 1 18.6875 x 10 / 0.1593017578125 pow * + / 78.84375 pow ",format=vs.RGBS)

y=\big({0.8359375 + 18.8515625 * (x/10)^{0.1593017578125} \over 1 + 18.6875 * (x/10)^{0.1593017578125}}\big)^{78.84375}
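For anyone who wants to play with this outside Vapoursynth, here is a minimal NumPy sketch of the same encode. The function name and the clipping behaviour are my own choices, not part of any existing API:

```python
import numpy as np

# Standard SMPTE ST 2084 (PQ) constants
m1 = 2610 / 16384        # 0.1593017578125
m2 = 2523 / 4096 * 128   # 78.84375
c1 = 3424 / 4096         # 0.8359375
c2 = 2413 / 4096 * 32    # 18.8515625
c3 = 2392 / 4096 * 32    # 18.6875

def linear_to_pq(x, nominal_luminance=1000.0):
    """Encode linear values (1.0 = nominal_luminance nits) to an ST 2084 signal in [0, 1].

    Values above 1.0, i.e. above nominal_luminance nits, are clipped.
    """
    y = np.clip(np.asarray(x, dtype=np.float64), 0.0, 1.0) * (nominal_luminance / 10000.0)
    return ((c1 + c2 * y**m1) / (1.0 + c3 * y**m1))**m2
```

With nominal_luminance=1000 this reduces to the (x/10) formula above.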

OK, that’s an equation that I can put into a spreadsheet - thanks! I’ll try to post the resulting ICC profile later today.

I don’t quite get it… this means that the same y value is obtained for an input value that is 10x bigger at 1000 nits than at 10000 nits. Shouldn’t it be the other way around?

EDIT: I have changed the formulas in @afre post to take advantage of the new math formatting, for better readability…


See the equation on page 20 here:

https://www.smpte.org/sites/default/files/23-1615-TS7-2-IProc02-Miller.pdf

The pdf is very readable.

I think the equation @age gave perhaps “goes the other way”.

The equation on page 20 is for 10,000 nits, fwiw.

Edit: Ok, here’s the equation on page 20 in a spreadsheet:

pq-luminance-equation.ods (29.2 KB)

@Carmelo_DrRaw or anyone - could you check my equations?

My spreadsheet just replicates the equation on page 20 and spits out the values for a point TRC with 101 points from 0 to 100, where 0 maps to 0 and 100 maps to 65535. Assuming my equations are correct, more TRC points would be needed for an actual ICC profile, and I’m absolutely sure the current equations in the spreadsheet aren’t right for a 1000-nit monitor; for one thing, the nits value is wrong.
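In case it helps with cross-checking, here is a small Python sketch of the same kind of table, assuming the page-20 equation is the standard ST 2084 EOTF (signal in [0, 1] to luminance in nits). The function name is mine, and this only evaluates the curve; it doesn’t do the 0–65535 scaling an ICC point TRC needs:

```python
import numpy as np

# Standard SMPTE ST 2084 constants
m1, m2 = 2610 / 16384, 2523 / 4096 * 128
c1, c2, c3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def pq_eotf(signal, peak_nits=10000.0):
    """Decode an ST 2084 signal in [0, 1] to absolute luminance in nits."""
    e = np.power(np.asarray(signal, dtype=np.float64), 1.0 / m2)
    return peak_nits * np.power(np.maximum(e - c1, 0.0) / (c2 - c3 * e), 1.0 / m1)

signal = np.linspace(0.0, 1.0, 101)  # 101 evenly spaced points, like the 0..100 index
luminance = pq_eotf(signal)          # luminance in nits at each of the 101 points
```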

I have converted this HDR image https://hdrihaven.com/hdri/?h=aerodynamics_workshop to linear RGB with 2020 primaries. It’s scene-referred, so it goes from 0 to inf (actually from 0 to +15.0 = 1500 nits).

Scaled down to 0-1 range

Y=x/15

Converted to st2084, 1500 nits:

y=\big({c1 + c2* (x/6.666666)^{m1} \over 1 + c3*(x/6.666666)^{m1}}\big)^{m2}

10000 nits / 1500 nits=6.66666666

While for st2084 1000 nits we have to scale down from scene-referred like this (some highlights will be clipped):

y=x/10

and this formula:

y=\big({c1 + c2* (x/10)^{m1} \over 1 + c3*(x/10)^{m1}}\big)^{m2}

10000 nits / 1000 nits = 10
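For what it’s worth, the two cases side by side, using the hypothetical linear_to_pq helper sketched earlier in the thread (img being the scene-linear data, 1.0 = 100 nits, values up to 15.0):

```python
# 1500-nit target: everything fits, nothing clips
pq_1500 = linear_to_pq(img / 15.0, nominal_luminance=1500)

# 1000-nit target: anything above 1000 nits (img > 10.0) gets clipped
pq_1000 = linear_to_pq(img / 10.0, nominal_luminance=1000)
```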

Very interesting! I think it is worth including the ST2084 formula in the tone mapping tool of PhotoFlow…

Hmm, given the above exchange I’m sure I don’t know what sort of profile @age actually wanted :slight_smile: and looking at my spreadsheet I already found a problem. But solving that problem won’t address whatever type of profile @age actually wanted - is the goal a “look” icc profile that transforms the colors in the linear scene-referred image?

For that image I’ve imported the .hdr file into darktable and exported it as a linear floating-point .tif with 2020 primaries, and I think that could be considered scene-referred.

I’m not sure how to use .icc profiles for HDR images.
Adobe has some .icc profiles with a PQ transfer curve:
https://forums.adobe.com/thread/2313742

I would also share an article from the Netflix tech blog about HDR for still images and .icc profiles:
https://medium.com/netflix-techblog/enhancing-the-netflix-ui-experience-with-hdr-1e7506ad3e8

Edit:
https://www.w3.org/TR/png-hdr-pq/#file
Here is an .icc profile for HDR .png files.

You’ve @'ed the wrong person but I may have linked to the formulas in one of my posts as well. :slight_smile:

Oh dear— PNG + ICC + HDR: I wonder what the compatibility for that is…


PS This might be helpful. I have been working through it.


@afre @Elle @age @dutch_wolf @gwgill
I hope my understanding of the HDR display format and involved transforms is improving…
If I understand correctly, there are at least two major standards for HDR displays: one is PQ and the other is HLG. They use two different types of transfer functions, and therefore they are not compatible with each other. On top of that, there is the varying maximum brightness of different display models that should be taken into account.
So, there might be two approaches for preparing an image for display on an HDR device:

  1. convert the scene-linear data to the appropriate format for a given HDR standard. Ideally, one should in this case save at least two images, one for PQ and one for HLG, and rely on the user to pick the right one depending on their hardware. How the actual display brightness is handled in this case, I still don’t know…
  2. save the image in scene-linear encoding, and let the image viewer do the required transforms for sending the data to screen. The image viewer should in this case detect the type of display and apply the appropriate OETF to encode the values to be sent to screen. While doing this, it could also take into account the actual display brightness, if that information can be retrieved somehow. However, I guess there is no such FLOSS image viewer for the moment… (a rough sketch of what such a transform could look like follows below this list)
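For illustration only, here is a rough, hypothetical sketch of what option 2 could look like in code. None of these names come from an existing viewer or API; the PQ and HLG formulas are the standard ST 2084 / BT.2100 ones, and a real viewer would also need the tone mapping step discussed in the next reply:

```python
import numpy as np

def encode_for_display(scene_linear, display_kind, peak_nits=1000.0):
    """Hypothetical viewer-side encode: scene-linear values in [0, 1] -> signal for the detected display."""
    x = np.clip(np.asarray(scene_linear, dtype=np.float64), 0.0, 1.0)
    if display_kind == "PQ":
        # SMPTE ST 2084, scaled so that 1.0 maps to the display's peak luminance
        m1, m2 = 2610 / 16384, 2523 / 4096 * 128
        c1, c2, c3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32
        y = x * (peak_nits / 10000.0)
        return ((c1 + c2 * y**m1) / (1.0 + c3 * y**m1))**m2
    elif display_kind == "HLG":
        # BT.2100 HLG OETF (relative signal, so peak_nits is not needed here)
        a = 0.17883277
        b, c = 1.0 - 4.0 * a, 0.5 - a * np.log(4.0 * a)
        return np.where(x <= 1.0 / 12.0,
                        np.sqrt(3.0 * x),
                        a * np.log(np.maximum(12.0 * x - b, 1e-12)) + c)
    else:
        # SDR fallback: plain 2.2 gamma
        return x ** (1.0 / 2.2)
```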

Am I still on track?


I think that is about right, but I haven’t looked into the standards myself yet (still working my way through the ingress side of things). Note that you do lose some (creative) freedom when going with option 2, since you lose the ability to set the tonemap operator and have to live with whatever the SW does[1].


[1] Even though an HDR display has quite a high dynamic range, it is still not comparable to what can potentially be stored in scene-linear imagery, so some kind of tonemap will still be necessary.


I have zero experience in this, like many topics I jump into with abandon. Just going by what I have read. :blush:

I would say that it depends on your goals. The DP2.0 would have a different capturing, creating and post-processing workflow, and a different delivery, encoding and decoding pipeline, than a broadcaster and / or producer (e.g., Netflix, Sony) would. The latter group would differ in many ways too, depending on who, what, where, why, how and when. It also isn’t as simple as PQ vs HLG. If you look at my last link, you will see that there are forward and backward conversions between PQ, HLG and scene-linear, in the normalized range of [0,1]. Moreover, it depends on the HDR standard in question; the quirks of the HDR pipeline and display; the nits involved in the devices and surround; and the viewing distance. All this while you have to consider the strengths and drawbacks of each method from start to finish and hope that the final product is worth sharing and consuming. :monkey:
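As a concrete example of one leg of that forward / backward conversion, here is the BT.2100 HLG OETF and its inverse for normalized values in [0,1]. The function names are mine, and the full PQ ↔ HLG round trip also involves the OOTF and display assumptions, which I’m leaving out:

```python
import numpy as np

# BT.2100 HLG constants
a = 0.17883277
b = 1.0 - 4.0 * a              # 0.28466892
c = 0.5 - a * np.log(4.0 * a)  # ~0.55991073

def hlg_oetf(e):
    """Normalized scene-linear E in [0, 1] -> HLG signal E' in [0, 1]."""
    e = np.asarray(e, dtype=np.float64)
    return np.where(e <= 1.0 / 12.0,
                    np.sqrt(3.0 * e),
                    a * np.log(np.maximum(12.0 * e - b, 1e-12)) + c)

def hlg_inverse_oetf(ep):
    """HLG signal E' in [0, 1] -> normalized scene-linear E in [0, 1]."""
    ep = np.asarray(ep, dtype=np.float64)
    return np.where(ep <= 0.5,
                    ep * ep / 3.0,
                    (np.exp((ep - c) / a) + b) / 12.0)
```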

That said, I do think a good starting point would be to do something similar to what we have already been doing on this forum; i.e., the PlayRaws’ Raw File → This is what I did. Do you like it?

Speaking of HDR, I normally witness great afternoon-evening skies walking by this window. I don’t have time to compose, still my shaky hands or avoid the reflection though. (This window has a coating of some sort.) My very LDR phone and basic camera app also white balances the photos. This pair of photos is supposed to have dozens of layers of clouds, each having a different colour palette and being blown to the left at different speeds. The closer clouds are brighter and warmer, providing some amazing contrast with both the foreground and background. But alas, you can experience none of that below. :disappointed:

There is also the ACES RRT, which sort of falls into the same category of standards for viewing HDR images. It’s a viewing transform anyway.

From reading around and following the links @age has posted, I’ve realized I know even less than I thought I knew about this whole HDR display stuff, and I didn’t think I knew very much to begin with :slight_smile: . But hopefully after some careful reading and experimenting with that nice HDR image @age linked to and with some working with spreadsheets for the equations, it will all start to make some sense.

Edit: Just now I was looking through the ArgyllCMS mailing list archives and found this thread about BT.2100 PQ:

Haven’t read through the thread yet, but it looks interesting.

Not new material for me but I like the coherence of the discussion. The last message is particularly fun.

Honestly it’s a gigantic mess, and I personally hope HDR either dies a horrible death or display hardware improves significantly (no dynamic contrast / dynamic backlight BS, just straight up static contrast using e.g. OLEDs or MLEDs). We might just have to accept the fact that we can’t calibrate HDR displays until that happens.

Not really the same thing, since as far as my understanding goes an ACES workflow will always go through the RRT even if the output is an HDR display. The RRT transforms ACES2065-1 to what is called OCES[1], which is supposed to represent a perfect display; from there it is mapped again to a normal screen (be that an sRGB screen or an HDR screen).


[1] No idea what it stands for. Since it is officially not a user-facing component, it isn’t documented in any of the freely available docs; it might be in the actual SMPTE standard, but that costs at least $120, so I have no idea.

OCES stands for “Output Color Encoding Specification”. This is the definition for the internal use in the IIF. No end-user ever handles OCES images directly. OCES is the output image color space targeted for the reference display device which would have a more than 1,000,000 to 1 dynamic range. OCES images can be acquired by rendering RRT (which will be explained later) to ACES images. Therefore, the same ACES images would be converted to the same OCES images every time.

Source: http://www.fujifilm.com/about/research/report/056/pdf/index/ff_rd056_009_en.pdf


Saw that, but it doesn’t tell me the info I would actually want (primaries; does it go from 0.0 to 1.0 or some other range; does it have linear gamma, log, something else?). Sadly, just a curious glance at the RRT doesn’t really give this info either, since it includes some film emulation elements (it has a ‘glow module’ and a ‘red modifier’, for example), which is one of the reasons it isn’t suitable for photography work (in my opinion).

Unfortunately, like yourself, I haven’t come across any details yet. :slight_smile: