OpenCL problem DT 4.6 Windows

This has been a persistent problem on my computer. Neither updating the Nvidia drivers this week nor installing DT 4.6 has resolved it.

I recently bought a Canon R7, and its CR3 files cause DT to crash when I move the details threshold slider in the Diffuse or Sharpen module. I previously raised the issue in the thread "Open CL question darktable V4.5.0+816". My workaround has been to allocate all device memory to darktable. This let me keep using DT, but it is not a true fix. Today I opened a 52.3 MB NEF file from the Play Raw thread "My eyes! Too much bokeh!", and this file caused a crash when I used the details threshold slider, even with all device memory allocated to darktable. I have attached a crash log from today, made with a CR3 file after I reset OpenCL to its default values and then moved the details threshold slider. I hope it is informative.

darktable-log.txt (458.7 KB)

  1. delete your current logfile so we create a new one from now on
  2. disable the AMD GPU via the processing preference menu option
  3. try again with the same xmp

Can you share the xmp?

Also:

  1. Can you post how much available memory Windows reports for your system (with your normal web browser open)? A screenshot would be good.
  2. Use -d opencl -d memory instead of -d common.
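The housekeeping asked for above (start from a fresh log, then launch with narrower debug flags) could be scripted roughly like this. It is only a sketch: the function names are made up for illustration, the log path is something you have to supply yourself, and `darktable` as the executable name is an assumption about your install. Only the `-d opencl -d memory` flags come from this thread.

```python
from pathlib import Path


def fresh_log(log_path):
    """Delete an existing log file so the next run starts clean.

    Returns True if a file was removed, False if there was nothing to delete.
    log_path is supplied by the caller; no default location is assumed here.
    """
    path = Path(log_path)
    if path.is_file():
        path.unlink()
        return True
    return False


def debug_command(executable="darktable", flags=("opencl", "memory")):
    """Build the argument list for a debug run, one '-d' per flag.

    'darktable' is an assumed executable name; adjust it for your system.
    """
    command = [executable]
    for flag in flags:
        command += ["-d", flag]
    return command
```

The list returned by `debug_command()` can be passed to `subprocess.run` once the old log has been removed.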

Disabling AMD did not help. Here are the new crash file and the xmp file. I am not sure how to get the system memory report that you requested. Sadly, my computer is limited to 16 GB of RAM.

darktable-log.txt (490.3 KB)
2R0A0494.CR3.xmp (9.7 KB)

First, delete the logfile so we don’t get all of the old stuff.

It looks like AMD is still on. I think you need to restart dt after that change.

Just use -d opencl

Thanks for your patience. I thought a new text file was written every time and for some reason my computer takes nearly 5 minutes to find that file by searching the C drive. The INETCache folder seems hidden from me despite asking windows to show hidden folders. I have now deleted the crash file, set DT up without AMD and opened it via command line. Here is the crash report. I hope it can be helpful.
darktable-log.txt (8.7 KB)

You need to disable ‘hide protected operating system folders’.

image

But you don’t have to search: you can type. Maybe even bookmark it, once you have it open.

I was sure the iGPU from AMD was causing problems with using system memory. I was wrong.

  1. Keep the AMD off.
  2. Turn off the Nvidia (does it fault with just CPU path?)
  3. Go back to -d common

Can you share the associated CR3 to the xmp you posted above?

I will try on my windows build later today/tomorrow.

I just loaded the xmp onto the original CR3 and also applied it to the NEF mentioned above. There are two instances of D&S. One has a mask, so I assumed that was the one triggering the problem, but I could see no issue. I also bumped iterations all the way to 128, the maximum slider value, and went repeatedly back and forth from +100 to -100 on the details slider without any issues. My box is a 12th-gen Intel running Windows 11 with a 3060 Ti card and 32 GB of system RAM. So maybe this is an AMD or resources issue?

EDIT:

These are my settings. Not that long ago I switched to use all device memory to see if there was any issue. I have not had any crashes, so I have left it there. I did not check whether I got any benefit from doing so. Otherwise, everything is disabled except Nvidia, which is my GPU.

image

I managed to reproduce it. It is on the OpenCL path with limited memory resources. I used: resource_small=128 4 64 200. Using 300 and 400 works OK. Yes, I forgot about using mini. I will switch to Fedora to see if I can reproduce it there and start a GitHub issue.
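To get a feel for what lowering the last number does, here is a rough sketch. It assumes that the fourth `resource_*` value is the OpenCL memory share expressed in 1/1024ths of device memory, which is a reading of the darktable manual and not something checked against the code. The function names are illustrative, and any headroom darktable reserves on the device is ignored.

```python
def opencl_fraction(setting):
    """Parse a line such as 'resource_small=128 4 64 200'.

    Returns (name, fraction), where fraction is the fourth value divided
    by 1024. Treating that value as the OpenCL memory share is an assumption.
    """
    name, _, values = setting.partition("=")
    parts = [int(v) for v in values.split()]
    if len(parts) != 4:
        raise ValueError("expected four integers after '='")
    return name.strip(), parts[3] / 1024


def approx_opencl_budget_mb(setting, device_mb):
    """Rough OpenCL budget in MB for a device with device_mb of memory."""
    _, fraction = opencl_fraction(setting)
    return device_mb * fraction
```

Under that assumption, going from 400 to 200 halves whatever budget the device would otherwise get, which fits with 300 and 400 working while 200 crashes.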

     8.6980 blend with form CL         [full]           diffuse                (   0/   0) 1560x1041 scale=0.2234 --> (   0/   0) 1560x1041 scale=0.2234 IOP_CS_RGB, BLEND_CS_RGB_SCENE, no form
     8.6980 refine_detail_mask on GPU  [full]           diffuse                (   0/   0) 1560x1041 scale=0.2234 --> (   0/   0) 1560x1041 scale=0.2234 
     8.6980 [opencl memory] device 0: 125108204 bytes (119.3 MB) in use
     8.6980 [opencl memory] device 0: 190180444 bytes (181.4 MB) in use
     8.7085 [opencl memory] device 0: 190180496 bytes (181.4 MB) in use
     8.7112 [opencl memory] device 0: 190180444 bytes (181.4 MB) in use
     8.7247 [opencl memory] device 0: 125108204 bytes (119.3 MB) in use
     8.7249 [opencl memory] device 0: 60035964 bytes (57.3 MB) in use
     8.7252 distort detail mask        [full]           diffuse                (3490/   0) 3491x4660 scale=1.0000
     8.7252 distort detail mask        [full]           demosaic               (   0/   0) 6983x4660 scale=1.0000 --> (   0/   0) 1560x1041 scale=0.2234 
     8.7252 resample_1c_plain                                                  (   0/   0) 6983x4660 scale=1.0000 --> (   0/   0) 1560x1041 scale=0.2234 bicubic
Magick: caught exception 0xC0000005 "Access violation"...

Reproduced on current master (4.7.0~git97.fcbe7dd5), Fedora 39 KDE, 12 GB Nvidia GPU
resource_small=128 4 64 200
Changing from lanczos3 to bicubic still crashes

     6.3712 blend with form CL         [full]           diffuse                (   0/   0) 1657x1106 scale=0.2373 --> (   0/   0) 1657x1106 scale=0.2373 IOP_CS_RGB, BLEND_CS_RGB_SCENE, no form
     6.3713 refine_detail_mask on GPU  [full]           diffuse                (   0/   0) 1657x1106 scale=0.2373 --> (   0/   0) 1657x1106 scale=0.2373 
     6.3713 [opencl memory] device 0: 110662836 bytes (105.5 MB) in use
     6.3713 [opencl memory] device 0: 153777156 bytes (146.7 MB) in use
     6.3760 [opencl memory] device 0: 153777208 bytes (146.7 MB) in use
     6.3764 [opencl memory] device 0: 153777156 bytes (146.7 MB) in use
     6.3965 [opencl memory] device 0: 110662836 bytes (105.5 MB) in use
     6.3967 [opencl memory] device 0: 67548516 bytes (64.4 MB) in use
     6.3977 distort detail mask        [full]           diffuse                (4668/   0) 2313x4660 scale=1.0000
     6.3977 distort detail mask        [full]           demosaic               (   0/   0) 6981x4660 scale=1.0000 --> (   0/   0) 1657x1106 scale=0.2373 
     6.3977 resample_1c_plain                                                  (   0/   0) 6981x4660 scale=1.0000 --> (   0/   0) 1657x1106 scale=0.2373 lanczos3
Segmentation fault (core dumped)
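
The `[opencl memory]` lines in the excerpts above can be summarised with a short script when comparing runs. This is a minimal sketch that assumes the line format stays exactly as shown in these logs; the function name is illustrative.

```python
import re

# Matches lines like:
#   8.6980 [opencl memory] device 0: 125108204 bytes (119.3 MB) in use
MEM_LINE = re.compile(r"\[opencl memory\] device (\d+): (\d+) bytes")


def peak_opencl_memory(log_text):
    """Return {device_id: peak bytes in use} found in the log text."""
    peaks = {}
    for match in MEM_LINE.finditer(log_text):
        device = int(match.group(1))
        used = int(match.group(2))
        peaks[device] = max(peaks.get(device, 0), used)
    return peaks
```

Feeding it the text of a log (for example from `open(path).read()`) gives the highest reported usage per device, which makes it easier to see how much was allocated before the crash in `resample_1c_plain`.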

Please continue the conversation here:

Thanks for your help, @g-man. Do you need anything else from me at this stage? Thanks, @kofa, I have set up the folder view as you described. BTW, I replied here because I didn’t want to add noise to the bug report on GitHub.

@priort do you want to try my image with "use all device memory" unchecked? Using all device memory prevents crashes on my system.

Changed to

image

DT

image

128 iterations on the module and moving the details slider freely from -100 to + 100 with no crash…

Changing the darktable CPU/memory resources to large has also prevented the crashing on my computer.

EDIT: The NEF file still crashes even with CPU/memory set to large or unrestricted and OpenCL set to use all device memory. The crash occurs when using the details threshold slider. If I disable OpenCL, I get no crashes with this image.
image

I can use small as well with no issue…

@priort it is probably unique to my computer.

Do you need the Intel support on? Not sure if that messes with anything, but you don’t have any Intel GPU, do you?

The Intel option seems to make no difference. There is probably no reason to have it on. I am unsure why DT selected it by default.