Black image with Haze Removal + OpenCL

Background:
I recently purchased a new PC and moved the images + xmp from the old one to the new. I imported the folders and noticed that some random images were black in the lighttable. When I open them in the darkroom they remain black. After some investigation, it only happens with images where I used the Haze Removal module. If I turn it off, the image is restored. If I turn off OpenCL and keep Haze Removal on, the image is also restored.

The images show fine on the old PC (Windows 10, darktable 3.6, old Nvidia 710 card, Intel CPU).
The new PC is an AMD Ryzen 5700G (integrated AMD video card) + Nvidia 3060 with driver 497.29 (latest). It is running Win11 (a step back from Win10).

Issue:
Black image when using Haze Removal module + OpenCL

Troubleshooting so far:

  1. Started with a new image (no edits) and turned on Haze Removal at default settings: it has the problem. Turn off OpenCL or Haze Removal and the black image goes away.
  2. Installed 3.8 - same issue
  3. Installed 3.9 (windows insider program) - same issue
  4. Ran darktable-cltest - it finds 5 devices (output below)
    4a) I tried to run darktable -d opencl but I can't get it to work in Win11.

0.054099 [opencl_init] opencl: 1
0.054656 [opencl_init] opencl_scheduling_profile: ‘default’
0.055660 [opencl_init] opencl_library: ‘’
0.056368 [opencl_init] opencl_memory_requirement: 768
0.057422 [opencl_init] opencl_memory_headroom: 400
0.058359 [opencl_init] opencl_device_priority: ‘*/!0,*/*/*/!0,*’
0.059465 [opencl_init] opencl_mandatory_timeout: 200
0.060475 [opencl_init] opencl_size_roundup: 16
0.061297 [opencl_init] opencl_async_pixelpipe: 0
0.062126 [opencl_init] opencl_synch_cache: active module
0.063103 [opencl_init] opencl_number_event_handles: 25
0.064218 [opencl_init] opencl_micro_nap: 1000
0.065062 [opencl_init] opencl_use_pinned_memory: 0
0.065912 [opencl_init] opencl_use_cpu_devices: 0
0.066770 [opencl_init] opencl_avoid_atomics: 0
0.067595 [opencl_init]
0.068682 [opencl_init] found opencl runtime library ‘OpenCL.dll’
0.069776 [opencl_init] opencl library ‘OpenCL.dll’ found on your system and loaded
0.148594 [opencl_init] found 3 platforms
0.149369 [opencl_init] found 5 devices
0.150064 [opencl_init] device 0 `NVIDIA GeForce RTX 3060' has sm_20 support.
0.151376 [opencl_init] device 0 `NVIDIA GeForce RTX 3060' supports image sizes of 32768 x 32768
0.153097 [opencl_init] device 0 `NVIDIA GeForce RTX 3060' allows GPU memory allocations of up to 3071MB
[opencl_init] device 0: NVIDIA GeForce RTX 3060
CANONICAL_NAME: nvidiag
GLOBAL_MEM_SIZE: 12288MB
MAX_WORK_GROUP_SIZE: 1024
MAX_WORK_ITEM_DIMENSIONS: 3
MAX_WORK_ITEM_SIZES: [ 1024 1024 64 ]
DRIVER_VERSION: 497.29
DEVICE_VERSION: OpenCL 3.0 CUDA

5.273702 [opencl_init] OpenCL successfully initialized.
5.274622 [opencl_init] here are the internal numbers and names of OpenCL devices available to darktable:
5.276346 [opencl_init] 0 ‘NVIDIA GeForce RTX 3060’
5.277233 [opencl_init] 1 ‘gfx90c’
5.277810 [opencl_init] 2 ‘NVIDIA GeForce RTX 3060’
5.278719 [opencl_init] 3 ‘Microsoft Basic Render Driver’
5.279797 [opencl_init] 4 ‘AMD Radeon™ Graphics’
5.280669 [opencl_init] FINALLY: opencl is AVAILABLE on this system.
5.281875 [opencl_init] initial status of opencl enabled flag is ON.

I understand the AMD Radeon being there, but not the rest. Is this normal? This is the first time I've played around with cltest. In the past I just turned OpenCL on and it worked.
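To make the device list easier to compare between runs, here is a minimal Python sketch that pulls the numbered device lines out of darktable-cltest output and flags names that appear more than once. The function names `list_devices` and `duplicated_names` are made up for this illustration, and the regular expression only assumes the line layout seen in the output quoted in this post.

```python
import re
from collections import Counter

# Sample lines copied from the darktable-cltest output in this post.
LOG = """\
5.276346 [opencl_init] 0 ‘NVIDIA GeForce RTX 3060’
5.277233 [opencl_init] 1 ‘gfx90c’
5.277810 [opencl_init] 2 ‘NVIDIA GeForce RTX 3060’
5.278719 [opencl_init] 3 ‘Microsoft Basic Render Driver’
5.279797 [opencl_init] 4 ‘AMD Radeon™ Graphics’
"""

# "<timestamp> [opencl_init] <number> <quoted name>"; accepts straight,
# backtick and typographic quotes, since forum software often changes them.
DEVICE_LINE = re.compile(r"\[opencl_init\]\s+(\d+)\s+[‘'`\"](.+?)[’'\"]\s*$")


def list_devices(log_text):
    """Return {device number: device name} for every numbered device line."""
    devices = {}
    for line in log_text.splitlines():
        match = DEVICE_LINE.search(line)
        if match:
            devices[int(match.group(1))] = match.group(2)
    return devices


def duplicated_names(devices):
    """Return the device names that are listed under more than one number."""
    counts = Counter(devices.values())
    return sorted(name for name, n in counts.items() if n > 1)


devices = list_devices(LOG)
for number, name in sorted(devices.items()):
    print(number, name)
print("listed more than once:", duplicated_names(devices))
```

A name that shows up twice only tells you that two OpenCL platforms expose the same hardware; it does not say which of the two entries is the one that works.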

I found someone who had a similar issue in the past, but it doesn't seem to have been fixed. https://github.com/darktable-org/darktable/issues/8000

If there is something I should try first, please let me know. I can start a github issue, but I would like to provide as much info as possible and try to help find the root cause first.

You can perhaps block a device from use in order to see if one in particular is the culprit?
See the opencl section of the manual on how to do this.

Perhaps starting darktable with -d opencl would give useful info?

You should spend some time reading and understanding darktable 3.8 user manual - multiple devices,
and then do some tests to see whether you can increase performance by manually setting which GPU device should be used for which purpose.

You might get better performance if you just prioritize the full pixelpipe to be done on your faster GPU. If you set the same priorities for each pixelpipe then it’s first come, first served, so the full pixelpipe might be done on a slower device because the faster one is already in use for the preview or thumbnail pipe …
The custom prioritization only works if you’re using the default scheduling profile.

You could run some tests to find the best configuration by logging the processing time:

darktable -d perf -d opencl | grep -e’dev_process_’ -e’using device’ -e’
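Where grep is not available (for example on a stock Windows install), the same kind of filtering could be sketched in Python. This is only an illustration: it uses the two search strings that are readable in the command above, the function names are made up here, and `path` is a placeholder for wherever your darktable log ends up.

```python
# The two substrings that are legible in the grep command above.
PATTERNS = ("dev_process_", "using device")


def filter_lines(lines, patterns=PATTERNS):
    """Return the lines that contain at least one of the patterns."""
    return [line for line in lines if any(p in line for p in patterns)]


def filter_log_file(path, patterns=PATTERNS):
    """Apply the same filter to a saved log file (path is a placeholder)."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return filter_lines(fh.read().splitlines(), patterns)


# Two sample lines in the format darktable writes with -d opencl.
SAMPLE = [
    "16.415868 [pixelpipe_process] [full] using device 0",
    "16.430242 [opencl_rawprepare] couldn’t enqueue kernel! -1",
]
for line in filter_lines(SAMPLE):
    print(line)
```

Plain substring matching is used instead of regular expressions because the patterns are fixed strings, which is also what grep does with them here.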

I've read it and a couple of different threads.

I have one Nvidia card in my PC and one AMD GPU embedded in the CPU. The system is finding 5 devices (0 = Nvidia, 1 = gfx90c, 2 = Nvidia again, 3 = Microsoft, 4 = AMD Radeon). I would expect it to find only 2 instead of 5.

Regardless, I managed to find the log (Windows hides it in a hidden folder). It loads OpenCL for the 5 devices without errors. To troubleshoot the problem I forced the priority to only use device 0. The black image while using haze removal is still happening.

Further down in the log I found this:
16.415868 [pixelpipe_process] [full] using device 0
16.430242 [opencl_rawprepare] couldn’t enqueue kernel! -1
16.430528 [opencl_pixelpipe] could not run module ‘rawprepare’ on gpu. falling back to cpu path
16.460226 [opencl_white_balance] couldn’t enqueue kernel! -1
16.460526 [opencl_pixelpipe] could not run module ‘temperature’ on gpu. falling back to cpu path
16.493223 [opencl_highlights] couldn’t enqueue kernel! -1
16.493507 [opencl_pixelpipe] could not run module ‘highlights’ on gpu. falling back to cpu path
16.526699 [opencl_demosaic] rcd couldn’t enqueue kernel! -1
16.526940 [opencl_pixelpipe] could not run module ‘demosaic’ on gpu. falling back to cpu path
16.687733 [opencl_demosaic] can not identify resource limits for device 0
16.688085 [opencl_demosaic] can not identify resource limits for device 0
16.688411 [opencl_denoiseprofile] couldn’t enqueue kernel! -1
16.688677 [opencl_pixelpipe] could not run module ‘denoiseprofile’ on gpu. falling back to cpu path
16.904229 [opencl_lens] couldn’t enqueue kernel! -1
16.904499 [opencl_pixelpipe] could not run module ‘lens’ on gpu. falling back to cpu path
16.948236 [hazeremoval, transition_map_cl] unknown error: -1
16.948543 [hazeremoval, box_min_cl] unknown error: -1
16.948767 [guided filter] unknown error: -1
16.948929 [guided filter] fall back to cpu implementation due to insufficient gpu memory
16.987639 [hazeremoval, dehaze_cl] unknown error: -1
16.989778 [opencl_exposure] couldn’t enqueue kernel! -1
16.989975 [opencl_pixelpipe] could not run module ‘exposure’ on gpu. falling back to cpu path
17.008077 [opencl_colorin] couldn’t enqueue kernel! -1
17.008285 [opencl_pixelpipe] could not run module ‘colorin’ on gpu. falling back to cpu path
[dt_ioppr_transform_image_colorspace_cl] error -1 enqueue kernel for color transformation
17.025327 [opencl_pixelpipe] could not run module ‘channelmixerrgb’ on gpu. falling back to cpu path
17.050392 [opencl_colorbalancergb] couldn’t enqueue kernel! -1
17.050692 [opencl_pixelpipe] could not run module ‘colorbalancergb’ on gpu. falling back to cpu path
17.241408 [opencl_filmicrgb] couldn’t enqueue kernel! -1
17.241704 [opencl_pixelpipe] could not run module ‘filmicrgb’ on gpu. falling back to cpu path
[dt_ioppr_transform_image_colorspace_cl] error -1 enqueue kernel for color transformation
17.294280 [opencl_pixelpipe] could not run module ‘bilat’ on gpu. falling back to cpu path
17.383959 [opencl_colorout] couldn’t enqueue kernel! -1
17.384160 [opencl_pixelpipe] could not run module ‘colorout’ on gpu. falling back to cpu path

and then this:
27.204178 [opencl_summary_statistics] device ‘NVIDIA GeForce RTX 3060’ (0): 68 out of 68 events were successful and 0 events lost
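With a long log it helps to condense these lines into "which modules fell back to the CPU" and "which error codes were reported". A rough Python sketch, assuming only the two message formats visible in the log above (the helper names are made up for this example):

```python
import re
from collections import Counter

# A few lines copied from the log above.
LOG = """\
16.430242 [opencl_rawprepare] couldn’t enqueue kernel! -1
16.430528 [opencl_pixelpipe] could not run module ‘rawprepare’ on gpu. falling back to cpu path
16.460226 [opencl_white_balance] couldn’t enqueue kernel! -1
16.460526 [opencl_pixelpipe] could not run module ‘temperature’ on gpu. falling back to cpu path
16.493223 [opencl_highlights] couldn’t enqueue kernel! -1
16.493507 [opencl_pixelpipe] could not run module ‘highlights’ on gpu. falling back to cpu path
16.526699 [opencl_demosaic] rcd couldn’t enqueue kernel! -1
16.526940 [opencl_pixelpipe] could not run module ‘demosaic’ on gpu. falling back to cpu path
"""

# Quote characters vary between the raw log and forum copies, so accept both.
FALLBACK = re.compile(r"could not run module [‘'`](\w+)[’'] on gpu")
ENQUEUE_ERROR = re.compile(r"couldn.t enqueue kernel! (-?\d+)")


def modules_falling_back(log_text):
    """Module names that fell back to the CPU, in order of first appearance."""
    seen = []
    for match in FALLBACK.finditer(log_text):
        if match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


def enqueue_error_codes(log_text):
    """Count how often each error code follows 'couldn't enqueue kernel!'."""
    return Counter(int(m.group(1)) for m in ENQUEUE_ERROR.finditer(log_text))


print("fell back to CPU:", modules_falling_back(LOG))
print("enqueue error codes:", dict(enqueue_error_codes(LOG)))
```

If every module in the pipe shows up in the first list, the GPU is effectively not being used at all, which matches what the log above suggests.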

I could not run the grep you recommended. It gives me an error.

I’m not sure what else to try.

Are you able to disable the onboard graphics of the AMD chip, just to confirm that the two GPUs are playing nice? I'm also not sure about Win11, as I haven’t bumped up to it yet, but I think there are places where you can let either the software or the OS control the graphics settings, or set it to use the NVIDIA driver settings. It's been a while since I messed with it, but you could have something going on in that arena as well. Just some random thoughts. It does look like OpenCL is not happy with your 3060; it seems like it can’t find out how much memory is available, maybe? Others would know more. Maybe one of the OpenCL settings in your config file is weird, or maybe the OpenCL kernels are corrupt. I am not sure if those are something you can delete and have regenerated. Someone who actually knows about OpenCL will likely be able to surmise something from those log entries. Hope you get it sorted.

The compiled kernels are in the cache directory, and can be deleted; they are regenerated when you start darktable.
To find out the available memory, under Linux one can use the tool nvidia-smi. I don’t know about Windows. However, under Linux at least, I think a failed memory allocation is detected properly, and OpenCL is turned off (and we see "fall back to cpu implementation due to insufficient gpu memory" above).
opencl_memory_headroom could be increased to avoid allocation errors.
Perhaps the device priorities could also be tweaked to exclude devices.

Repeat this with each device number. It seems that two of them are just driver-based devices.
At least devices 2, 1 and 4 should be usable - 2 because 0 wasn’t usable :wink:
But that’s just a trial-and-error approach.

Update on the actions I’ve taken based on y'all's input:

  1. forced the priority to use one device at a time, but it still tried to use the others
  2. changed the Nvidia drivers from Game Ready to Studio
  3. deleted the kernels in the cache to regenerate them
  4. changed the memory headroom to 800, per "OpenCL analysis... Darktable... much faster with Opencl disabled...something wrong?? - #22 by kofa",
    and applied other settings from that thread
  5. checked performance with and without OpenCL

In summary, the only way to avoid the black image is to disable either OpenCL or Haze Removal. It seems that the Nvidia card is not being used by most modules when OpenCL is on. The system switches to the CPU, so there is no big performance improvement. I sure would like the system to take advantage of the fast 12 GB card.

The scheduling priorities are only taken into account if the scheduling profile is set to default, and not with very fast GPU or multiple GPUs (see darktable 3.8 user manual - scheduling profile). But OK, I see that is what you have:

0.054656 [opencl_init] opencl_scheduling_profile: ‘default’
0.058359 [opencl_init] opencl_device_priority: ‘*/!0,*/*/*/!0,*’

Do you see the final priorities in the log?
Those should appear after the text [opencl_priorities] these are your device priorities: - see https://github.com/darktable-org/darktable/blob/master/src/common/opencl.c#L1379-L1387.

Were you able to disable the AMD onboard GPU? And what was up with the 2 Nvidia entries? Were you able to figure that out?

You could check this setting and see what happens if you toggle it.

And this will likely also look the same in Win11…add DT to the list of apps and specify the GPU…

Just worth trying… https://www.howtogeek.com/351522/how-to-choose-which-gpu-a-game-uses-on-windows-10/

I could not figure out how to properly disable the AMD GPU. It is not really onboard, but on chip: it is part of the CPU, from what I read online. I did try going to Device Manager and just disabling the device there. I then deleted the kernel cache to force the regeneration. The AMD device was no longer found by DT's OpenCL, and no kernels were regenerated for it.

I don't get why I have 2 Nvidia entries. It creates two Nvidia kernel caches (3060_100 and 3060_51109).
Device 0 (the _100) says:
supports image sizes of 32768 x 32768
allows GPU memory allocations of up to 3071MB
GLOBAL_MEM_SIZE: 12288MB

The card has 12 GB of memory, so why allow only up to 3071MB?

The second device (_51109) says:
supports image sizes of 16384 x 16384
allows GPU memory allocations of up to 1024MB
GLOBAL_MEM_SIZE: 12142MB

I don't get what is going on with this second device (smaller sizes/memory).

Regardless, with only the Nvidia card, the haze removal = black image. Also it seems to fail to use the GPU.
7.180602 [pixelpipe_process] [full] using device 0
7.189400 [opencl_rawprepare] couldn’t enqueue kernel! -1
7.189596 [opencl_pixelpipe] could not run module ‘rawprepare’ on gpu. falling back to cpu path

That Windows 11 setting was on by default. Since DT is using OpenCL, I think the Windows settings should not matter.

You must be about ready to hit something. It does seem that your card is not being recognized clearly or is not reporting the right values. I wonder if anyone else has a 3060, and which driver version is working for them? Would any of the other OpenCL settings reduce the amount of memory that can be allocated? I occasionally use ON1 Photo RAW, which also uses OpenCL, and it was suggested to add it to the Windows graphics settings and specify the high setting, so I did the same with DT. I don’t know if it made any difference, and I am using Win10, not 11, but I did it anyway, figuring it could not hurt: if the OS ever did get involved, DT was manually set to use the most GPU it could rather than letting Windows decide for itself.

I think that’s the amount of memory that can be allocated in one chunk; however, all memory can be allocated (in several chunks). For my 6 GB card:

0.067616 [opencl_init] device 0 `GeForce GTX 1060 6GB' supports image sizes of 16384 x 32768
0.067620 [opencl_init] device 0 `GeForce GTX 1060 6GB' allows GPU memory allocations of up to 1519MB

(From: darktable 3.4/3.5 opencl slow on Windows 10 - #44 by kofa)
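In OpenCL terms, the largest single allocation (CL_DEVICE_MAX_MEM_ALLOC_SIZE) is a separate device property from the total memory (CL_DEVICE_GLOBAL_MEM_SIZE). Assuming that the "allows GPU memory allocations of up to" line reports the former, a quick calculation with the RTX 3060 numbers quoted earlier in this thread shows how the two values relate. The dictionary and function names below are made up for this illustration.

```python
# (largest single allocation in MB, GLOBAL_MEM_SIZE in MB), as quoted in this
# thread for the two RTX 3060 entries.
REPORTED = {
    "RTX 3060, first entry (_100)": (3071, 12288),
    "RTX 3060, second entry (_51109)": (1024, 12142),
}


def alloc_fraction(max_alloc_mb, global_mb):
    """Share of the total device memory that one allocation may use."""
    return max_alloc_mb / global_mb


for name, (max_alloc_mb, global_mb) in REPORTED.items():
    share = alloc_fraction(max_alloc_mb, global_mb)
    print(f"{name}: {max_alloc_mb} MB of {global_mb} MB = {share:.1%}")
```

The first entry comes out at roughly a quarter of the total memory, so a 3071MB limit on a 12 GB card is not by itself a sign of a problem. The second entry reports a much smaller share, consistent with it being a different driver path to the same card.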

Memory info while processing an image:

FB Memory Usage
    Total                             : 6077 MiB
    Used                              : 5762 MiB
    Free                              : 315 MiB

(From: Which benchmarks provide an estimate to enable me compare and decide on which GPU to buy, for Image Processing ? - #13 by kofa)

Did you already test prioritizing each GPU device?
I.e. in your AppData\Local\darktable\darktablerc file, using the line
opencl_device_priority=x,*/!x,*/*/*/*
x is the gpu device number 0 to 4
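To avoid typos while cycling through the devices, the five candidate lines can be generated from the template above. A small Python sketch (the function name `priority_line` is made up here; the template is copied from this post):

```python
# Template copied from this post; "x" stands for the device number to test.
TEMPLATE = "x,*/!x,*/*/*/*"


def priority_line(device, template=TEMPLATE):
    """Build the darktablerc line that puts `device` first for the full pipe."""
    return "opencl_device_priority=" + template.replace("x", str(device))


# Device numbers 0 to 4, as listed by darktable-cltest in this thread.
for device in range(5):
    print(priority_line(device))
```

Paste one line at a time into darktablerc while darktable is closed, then restart and check the log to see which device was actually used.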

If all fails, then you need support by someone who has experience with win11 specific handling of opencl.

I did. Now that I disabled AMD, I just tested with this: opencl_device_priority=!0,!1,*/!0,!1,*/!0,!1,*/!0,!1,*/!0,!1,*

Device 2 is the Microsoft Basic Render Driver. The execution falls back to the CPU and it does render the image, but this is the same as turning OpenCL off.

I tend to agree that there is an issue in the kernel cache process. There should only be one device found and one kernel cache generated. The Nvidia card / driver uses OpenCL 3.0. I don't know if that makes a difference in the kernel language.

I will try to mess with Windows settings to try something else.

If you did those tests with each device and all failed with the same error message, then you might try to get help in a Windows 11 support group.


I am no expert but I would think maybe this is your issue…based on what you described earlier

See page 19 …sounds like what you are experiencing…

I don’t think a fix is provided as these are shown as known issues…

I am not sure in this case whether this is for newer drivers or a Windows issue, so maybe you could look for a Windows-based solution, or google a bit and see if older drivers had this issue?

TAG error…

EDIT

Check if you have Windows 11 game mode enabled…not sure what the default is but you might actually want it off?? And set DT as a high priority graphics app…see if either helps??

Hi @g-man,

I see there are multiple devices reporting for OpenCL. I’ve had a similar issue in DT 3.8 where, even with a single GPU (RTX 3080), it was listed twice along with ‘Microsoft Basic Render Driver’.
Running darktable-cltest.exe resulted in multiple reports of opencl_create_kernel failing to create the kernel with a -5 error (CL_OUT_OF_RESOURCES).
Attempts at filtering for each device with ‘opencl_device_priority’ or settings that affect memory consumption failed.
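For reference, the numeric codes in these logs can be looked up in the standard OpenCL header. The sketch below lists only the first few values from `cl.h`; the `describe` helper is made up for this example. Keep in mind that an application can also use -1 as its own generic failure value, so the name is only a hint about what went wrong.

```python
# First few status codes as defined in the Khronos OpenCL header (cl.h).
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def describe(code):
    """Return the symbolic name for a code, or a marker if it is not listed."""
    return CL_ERROR_NAMES.get(code, f"not in this table ({code})")


# The two codes that appear in this thread.
for code in (-1, -5):
    print(code, describe(code))
```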

What did the trick was removing the OpenCL™ and OpenGL® Compatibility Pack (Get OpenCL™ and OpenGL® Compatibility Pack - Microsoft Store).
After this, the GPU was listed only once and ‘Microsoft Basic Render Driver’ was no longer present.

I’ve not observed any negative effect of removing that compatibility layer for OpenCL.

FIXED!!!

Thanks @swish. That solved the problem. It seems that the compatibility pack is pre-installed on my system (maybe part of Windows 11). I went to Windows System, Apps, Apps & Features and searched for OpenCL. I then selected it and uninstalled it.

Now DT only finds 2 devices (one Nvidia and one AMD). DT is using both GPU and CPU to process the modules. I’m also able to use Haze Removal without it turning into a black image.

Just for fun, I deleted the kernel cache to force a new compile. It is all working now.

So, is this something we should include in the DT manual? E.g. for Windows installs, uninstall the OpenCL compatibility pack from Apps if the same GPU is listed more than once.