You should spend some time reading and understanding the "multiple devices" section of the darktable 3.8 user manual,
and then run some tests to see whether you can increase performance by manually setting which GPU device is used for which purpose.
You might get better performance if you prioritize the full pixel pipe on your faster GPU. If you set the same priorities for each pixel pipe, it is first come, first served, so the full pixel pipe might run on a slower device because the faster one is already in use for the preview or thumbnail pipe …
The custom prioritization only works if you’re using the default scheduling profile.
You can run some tests to find the best configuration by logging the processing time:
I have one Nvidia card in my PC and one AMD GPU embedded in the CPU. The system is finding 5 devices (0 = Nvidia, 1 = gfx90c, 2 = Nvidia again, 3 = Microsoft, 4 = AMD Radeon). I would expect it to find only 2 instead of 5.
Regardless, I managed to find the log (Windows keeps it in a hidden folder). It loads OpenCL for the 5 devices without errors. To troubleshoot the problem, I forced the priority to use only device 0. The blank screen while using haze removal is still happening.
Further down in the log I found this:
16.415868 [pixelpipe_process] [full] using device 0
16.430242 [opencl_rawprepare] couldn’t enqueue kernel! -1
16.430528 [opencl_pixelpipe] could not run module ‘rawprepare’ on gpu. falling back to cpu path
16.460226 [opencl_white_balance] couldn’t enqueue kernel! -1
16.460526 [opencl_pixelpipe] could not run module ‘temperature’ on gpu. falling back to cpu path
16.493223 [opencl_highlights] couldn’t enqueue kernel! -1
16.493507 [opencl_pixelpipe] could not run module ‘highlights’ on gpu. falling back to cpu path
16.526699 [opencl_demosaic] rcd couldn’t enqueue kernel! -1
16.526940 [opencl_pixelpipe] could not run module ‘demosaic’ on gpu. falling back to cpu path
16.687733 [opencl_demosaic] can not identify resource limits for device 0
16.688085 [opencl_demosaic] can not identify resource limits for device 0
16.688411 [opencl_denoiseprofile] couldn’t enqueue kernel! -1
16.688677 [opencl_pixelpipe] could not run module ‘denoiseprofile’ on gpu. falling back to cpu path
16.904229 [opencl_lens] couldn’t enqueue kernel! -1
16.904499 [opencl_pixelpipe] could not run module ‘lens’ on gpu. falling back to cpu path
16.948236 [hazeremoval, transition_map_cl] unknown error: -1
16.948543 [hazeremoval, box_min_cl] unknown error: -1
16.948767 [guided filter] unknown error: -1
16.948929 [guided filter] fall back to cpu implementation due to insufficient gpu memory
16.987639 [hazeremoval, dehaze_cl] unknown error: -1
16.989778 [opencl_exposure] couldn’t enqueue kernel! -1
16.989975 [opencl_pixelpipe] could not run module ‘exposure’ on gpu. falling back to cpu path
17.008077 [opencl_colorin] couldn’t enqueue kernel! -1
17.008285 [opencl_pixelpipe] could not run module ‘colorin’ on gpu. falling back to cpu path
[dt_ioppr_transform_image_colorspace_cl] error -1 enqueue kernel for color transformation
17.025327 [opencl_pixelpipe] could not run module ‘channelmixerrgb’ on gpu. falling back to cpu path
17.050392 [opencl_colorbalancergb] couldn’t enqueue kernel! -1
17.050692 [opencl_pixelpipe] could not run module ‘colorbalancergb’ on gpu. falling back to cpu path
17.241408 [opencl_filmicrgb] couldn’t enqueue kernel! -1
17.241704 [opencl_pixelpipe] could not run module ‘filmicrgb’ on gpu. falling back to cpu path
[dt_ioppr_transform_image_colorspace_cl] error -1 enqueue kernel for color transformation
17.294280 [opencl_pixelpipe] could not run module ‘bilat’ on gpu. falling back to cpu path
17.383959 [opencl_colorout] couldn’t enqueue kernel! -1
17.384160 [opencl_pixelpipe] could not run module ‘colorout’ on gpu. falling back to cpu path
and then this:
27.204178 [opencl_summary_statistics] device ‘NVIDIA GeForce RTX 3060’ (0): 68 out of 68 events were successful and 0 events lost
I could not run the grep you recommended. It gives me an error.
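If grep is not available on Windows, the same kind of filtering can be done with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, the log path is something you have to supply yourself, and the pattern is based solely on the "could not run module … on gpu" lines quoted above.

```python
import re

# Matches lines such as:
#   16.430528 [opencl_pixelpipe] could not run module ‘rawprepare’ on gpu. falling back to cpu path
# \W? tolerates whatever quote character surrounds the module name.
FALLBACK = re.compile(r"could not run module\s+\W?(\w+)\W?\s+on gpu")


def modules_falling_back(lines):
    """Return the module names that fell back to the CPU path, in first-seen order."""
    seen = []
    for line in lines:
        match = FALLBACK.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


def read_log(path):
    """Read a log file into a list of lines; the path is supplied by the caller."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read().splitlines()


if __name__ == "__main__":
    # Two lines copied from the log excerpt above, used as a small demo.
    demo = [
        "16.430242 [opencl_rawprepare] couldn’t enqueue kernel! -1",
        "16.430528 [opencl_pixelpipe] could not run module ‘rawprepare’ on gpu. falling back to cpu path",
    ]
    print(modules_falling_back(demo))
```

To run it against a real log, pass the result of `read_log(<your log file>)` to `modules_falling_back`.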
Are you able to disable the onboard graphics of the AMD chip, just to confirm that the two GPUs are playing nice? Also, I’m not sure about Win11 (I haven’t bumped up to it yet), but I think there are places where you can let the software and the OS influence the settings, or you can set it to use the NVIDIA driver settings…it’s been a while since I messed with it…so you could have something going on in that arena as well…just some random thoughts… It does look like OpenCL is not happy with your 3060…seems like it can’t find how much memory is available, maybe? Others would know more…maybe one of the OpenCL settings in your config file is weird, or maybe the OpenCL kernels are corrupt…I am not sure if those are something you can delete so that they get regenerated. Someone who actually knows about OpenCL will likely be able to surmise something from those log entries…hope you get it sorted.
The compiled kernels are in the cache directory, and can be deleted; they are regenerated when you start darktable.
To find out the available memory under Linux, one can use the tool nvidia-smi. I don’t know about Windows. However, under Linux at least, I think a failed memory allocation is detected properly and OpenCL is turned off (and we do see "fall back to cpu implementation due to insufficient gpu memory" above). opencl_memory_headroom could be increased to avoid allocation errors.
Perhaps the device priorities could also be tweaked to exclude devices.
Repeat this with each device number. It seems that two of them are just driver-based devices.
At least devices 2, 1 and 4 should be usable (2 because 0 wasn’t usable).
But that’s just a trial-and-error approach.
In summary, the only way to avoid the black image is to disable either OpenCL or Haze Removal. It seems that the Nvidia card is not being used by most modules with OpenCL. The system switches to the CPU, so there is not a big performance improvement. I sure would like the system to take advantage of the fast 12 GB card.
The scheduling priorities are only taken into account if the scheduling profile is set to default (and not with "very fast GPU" or "multiple GPUs"; see darktable 3.8 user manual - scheduling profile). But OK, I see that is what you have:
I could not figure out how to disable the AMD GPU. It is really not onboard, but on chip. It is part of the CPU, from what I read online. I did try going to Device Manager and just disabled the device there. I then deleted the kernel cache to force the regeneration. The AMD was not regenerated or found by DT opencl.
I don’t get why I have 2 Nvidia entries. It creates two Nvidia kernel caches (3060_100 and 3060_51109).
Device 0 (the _100) says: supports image sizes of 32768 x 32768
allows GPU memory allocations of up to 3071MB
GLOBAL_MEM_SIZE: 12288MB
The card has 12 GB of memory, so why allow only up to 3071MB?
The second device (_51109) says: supports image sizes of 16384 x 16384
allows GPU memory allocations of up to 1024MB
GLOBAL_MEM_SIZE: 12142MB
I don’t get what is going on with this second device (smaller sizes/memory).
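On the 3071MB question: as far as I understand it, the "allows GPU memory allocations of up to" figure is a limit on a single allocation, which OpenCL devices report separately from their total memory, so it is not the total that darktable can use. That reading is an assumption on my part, not something the log confirms. A quick sketch of the arithmetic with the numbers quoted above (the function name is just for illustration):

```python
def alloc_fraction(max_alloc_mb, global_mem_mb):
    """Fraction of total device memory that one single allocation may use."""
    return max_alloc_mb / global_mem_mb


# (max single allocation in MB, GLOBAL_MEM_SIZE in MB), as reported in the log
devices = {
    "3060_100": (3071, 12288),
    "3060_51109": (1024, 12142),
}

for name, (max_alloc, global_mem) in devices.items():
    share = alloc_fraction(max_alloc, global_mem)
    print(f"{name}: one allocation may use {share:.1%} of device memory")
```

For the first entry this works out to about a quarter of the card’s memory, which is a common per-allocation limit. The second entry is capped much lower, which fits with it being a different driver path to the same card.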
Regardless, with only the Nvidia card, the haze removal = black image. Also it seems to fail to use the GPU.
7.180602 [pixelpipe_process] [full] using device 0
7.189400 [opencl_rawprepare] couldn’t enqueue kernel! -1
7.189596 [opencl_pixelpipe] could not run module ‘rawprepare’ on gpu. falling back to cpu path
That Windows 11 setting was on by default. Since DT is using OpenCL, I think the Windows settings should not matter.
You must be about ready to hit something…it does seem that your card is not being recognized clearly or is not reporting the right values. I wonder if anyone else has a 3060 and which driver version is working for them? Would any of the other OpenCL settings reduce the amount of memory that could be allocated? I occasionally use ON1 Photo RAW, and since it uses OpenCL as well, it was suggested to add it to the graphics settings and specify the high setting, so I did the same with DT…I don’t know if it made any difference, and I am using Win10, not 11, but I did it anyway, figuring it could not hurt: if the OS ever did get involved, DT was manually set to use as much GPU as it could rather than letting Windows decide for itself…
Did you already test prioritizing each GPU device in turn?
i.e. in your AppData\Local\darktable\darktablerc file, using the line opencl_device_priority=x,*/!x,*/*/*/*
where x is the GPU device number, 0 to 4.
If all else fails, you need support from someone who has experience with Win11-specific handling of OpenCL.
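To save some typing, the five test lines can be generated from the template above. This is just a sketch: it prints the lines, and you would still paste one at a time into darktablerc (with darktable closed) and restart to test it. The function name is made up for illustration.

```python
# Template copied from the suggestion above; {x} is the device number under test.
TEMPLATE = "opencl_device_priority={x},*/!{x},*/*/*/*"


def priority_lines(device_count):
    """Return one darktablerc line per device number, from 0 to device_count - 1."""
    return [TEMPLATE.format(x=x) for x in range(device_count)]


if __name__ == "__main__":
    # 5 devices were reported in this thread (numbers 0 to 4).
    for line in priority_lines(5):
        print(line)
```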
I tend to agree that there is an issue in the kernel cache process. There should be only one device found and one device kernel generated. The Nvidia card/driver uses OpenCL 3.0. I don’t know whether that makes a difference in the language.
I will try to mess with Windows settings to try something else.
I am no expert but I would think maybe this is your issue…based on what you described earlier
See page 19 …sounds like what you are experiencing…
I don’t think a fix is provided as these are shown as known issues…
I am not sure in this case if this is specific to newer drivers or is a Windows issue, so maybe you could look for a Windows-based solution, or google a bit and see whether older drivers had this issue?
TAG error…
EDIT
Check if you have Windows 11 game mode enabled…not sure what the default is but you might actually want it off?? And set DT as a high priority graphics app…see if either helps??
I see there are multiple devices reporting for OpenCL. I’ve had a similar issue in DT 3.8 where, even with a single GPU (RTX 3080), it was listed twice along with ‘Microsoft Basic Render Driver’.
Running darktable-cltest.exe resulted in multiple reports of opencl_create_kernel failing to create the kernel with -5 error (CL_OUT_OF_RESOURCES).
Attempts at filtering for each device with ‘opencl_device_priority’ or settings that affect memory consumption failed.
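For reading such numbers, here is a small lookup of the first few error codes from the standard OpenCL header (CL/cl.h). It is only a sketch and covers a subset of the codes. One caveat: I am assuming nothing about the "-1" that darktable prints in the "couldn’t enqueue kernel" lines earlier in this thread, since that may be darktable’s own generic failure value rather than an OpenCL code.

```python
# Subset of the error codes defined in the Khronos OpenCL header (CL/cl.h).
CL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def describe(code):
    """Translate a numeric OpenCL error code into its symbolic name."""
    return CL_ERRORS.get(code, f"unknown OpenCL error {code}")


if __name__ == "__main__":
    # The code reported by darktable-cltest in the post above.
    print(describe(-5))
```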
Thanks @swish. That solved the problem. It seems that the compatibility stuff is pre-installed on my system (maybe as part of Windows 11). I went to Windows System, Apps, Apps & Features and searched for OpenCL. I then selected it to be uninstalled.
Now DT only finds 2 devices (one Nvidia and one AMD) . DT is using both GPU and CPU to process the modules. I’m also able to use Haze Removal without it turning into a black image.
Just for fun, I deleted the kernel cache to force a new compile. It is all working now.
So, is this something we should include in the DT manual? E.g., for Windows installs, uninstall the OpenCL compatibility entry from Apps if more devices are found than are actually installed.
Not something for the user manual (which doesn’t cover this sort of thing - OS-specific issues/features) but perhaps might be useful on the github README file or even the Windows build instructions.