OpenCL not available after 'suspend/resume' cycle. Is there an easy work around?

Running dt 4.6.1 on Mint 21.3 (Ubuntu 22.04 base). The GPU is an Nvidia RTX 3060.

For the past few months I have experienced multiple (i.e. 2 or 3 per day) instances where OpenCL becomes unavailable in dt. Although irritating, especially as I haven’t been making conscious changes to either dt or the display drivers, I could usually get OpenCL support activated again by restarting dt. But in the last week or so I have noticed that, more often than not, I have to restart Mint to get OpenCL support back. I’ve now spent time trying to correlate the loss of OpenCL with my own use of Mint. My front-runner opinion so far is that the loss of functionality correlates with a suspend/resume cycle on that PC while dt is still open, something which I do multiple times per day.

If this opinion is valid, and there is a correlation, I am inclined to accept that the problem is in the operating system rather than in dt. In which case, is there some command that I can invoke after a ‘resume’ that will allow dt to still regard the requirements for running OpenCL as having been met?

What does dmesg say when OpenCL dies?

I’m going to give a slightly off-target answer to this: I ran ‘darktable-cltest’ before I saw your reply; the result is in the screenshot. The lack of the file ‘libOpenCL.so’ bothered me greatly 2 days ago when I was reading the manual. I couldn’t understand how dt was ever able to activate OpenCL in the absence of this file, but it does, and the file is definitely not there: I asked Thunar to search for the file across the whole of my PC.

Screen shot:

Now I’ll run dmesg:

Huge amount of output to terminal: where should I be looking ?

Look for something related to your graphics driver.
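
One way to narrow the output down is to filter it for the driver's name. The sketch below assumes the NVIDIA kernel driver tags its messages with "nvidia" or "NVRM"; the function name nvidia_log_lines is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines that mention the NVIDIA
# driver. Reads log text on stdin and writes the matching lines to stdout.
nvidia_log_lines() {
    # -i: ignore case, -E: extended regex so "|" means "or"
    grep -i -E 'nvidia|nvrm'
}
```

Used as `sudo dmesg | nvidia_log_lines`, this should leave a much shorter list to read through, ideally captured right after a resume when OpenCL has just gone missing.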

OK; in the interim I rebooted Mint and re-ran darktable-cltest without having started darktable; the result says that OpenCL is available, just not activated yet. This seems to me to confirm my opinion that there is a correlation with a ‘suspend/resume’ cycle.

Nothing obvious to me - and that is probably to be expected: 99.9% of the output from dmesg is outside my understanding.

First, you mention 4.6.1, but the screenshot says 4.6.0.

Nvidia used to have issues with power management on suspend/hibernate. I haven’t seen this problem in Fedora for a while. I don’t use Ubuntu, so I can’t help. Do a search for fixes.

Suspend cycle problems have a long history of issues, check your distribution…

Yes, I notice that too but don’t have an explanation. Here is what darktable thinks it’s running:
[screenshot: 2024-03-24_17-54]

Further, I am confused about the resources OpenCl needs: the manual says:

“The principle OpenCL function flow is like this:
darktable > libOpenCL.so > libnvidia-opencl.so.1 > kernel driver module(s) > GPU
• darktable dynamically loads libOpenCL.so – a system library that must be accessible to the system’s dynamic loader (ld.so).
• libOpenCL.so reads the vendor-specific information file (/etc/OpenCL/vendors/nvidia.icd) to find the library that contains the vendor-specific OpenCL implementation.”

But my system does not have the file libOpenCL.so and neither do the other two Linux PCs that I have (not running Mint). All three PCs run darktable and OpenCl (most of the time). So, is the manual wrong ?

You might have multiple dt installed in your system in different folders.

The manual is correct. Libopencl is installed in your system. See 0.0299 seconds in your post.

By the way, you also have opencl turned off by your choice in settings.

You are right - thanks for this observation. Further mystified! Now corrected.

The reference at 0.0299 seconds is to libOpenCL.so.1, not to libOpenCL.so, and the manual clearly distinguishes between these. I do not have the file libOpenCL.so, and I have never had a problem finding the file libOpenCL.so.1. So I still don’t understand how OpenCL can be activated.

It’s working correctly when it doesn’t have a suspend/resume issue.

This issue is getting beyond being irritating. I normally run Mint for weeks at a time without restarting it (usually I restart only because there has been a kernel update). But now, unless I restart my PC every time before using darktable, and multiple times while processing a batch of images, I will never actually be able to use the expensive GPU I have fitted. Moreover, I experience delays in processing and exporting which add up to more time than it takes to restart the PC. This is not a sensible workflow.

Now I am finding that I can start processing an image in the darkroom with OpenCL enabled, but before I have finished a simple exposure adjustment I find that OpenCL is not available. I ran darktable-cltest immediately after the latest experience; the terminal output is as follows:

Again the issue indicated here is the inability of darktable to find libOpenCL and libOpenCL.so. I have looked for both of these in Synaptic. They are not listed. Perhaps somebody who understands why could explain to me where I must look to find these important objects and, possibly, explain to me why I didn’t have a problem with their absence in darktable 4.6.0 or 4.4 or 4.2, and so on.

I often, but not always, get the same problem on suspend/resume. The easiest solution is to close DT before suspend, but whenever I forget to do this, I do the following:

After suspend, DT is open but not using opencl.

close DT

sudo rmmod nvidia_uvm

sudo insmod nvidia_uvm

start DT

I think this solution came from this forum a couple of years ago

Sound advice, thank you. Sadly I normally remember this advice immediately after clicking the ‘suspend’ button… But now you have given me the work-around I was hoping for. Slightly annoyed with myself for failing to find that procedure when I searched through this forum before submitting my original post. More than ‘annoyed’; actually embarrassed since I started one of those discussions back in 2021, in which this solution was proposed, for exactly the same problem…

That’s what you get when old age visits you with that brain disease, whose name I forget, which causes people to forget the names of serious brain diseases, an effect of which makes you forget what you just typed…

I would suggest using modprobe rather than insmod, as it resolves module dependencies (and takes a module name rather than a path to the module file):

sudo rmmod nvidia_uvm
sudo modprobe nvidia_uvm
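
For anyone who wants the two commands above in one place, here is a minimal sketch of a wrapper. The function name reload_nvidia_uvm and the RUN variable are made up for this example (RUN defaults to sudo; setting RUN=echo only prints what would be run). The check for a running darktable reflects the advice earlier in the thread to close DT first.

```shell
#!/bin/sh
# Hypothetical helper: reload the nvidia_uvm kernel module after a resume.
# RUN is an assumption of this sketch: the prefix used to gain root.
reload_nvidia_uvm() {
    run=${RUN:-sudo}
    # The thread's procedure closes darktable before reloading the module,
    # so refuse to continue while a darktable process exists.
    if pgrep -x darktable >/dev/null 2>&1; then
        echo "darktable is still running; close it first" >&2
        return 1
    fi
    # Unload the module, then load it again with dependency resolution.
    $run rmmod nvidia_uvm && $run modprobe nvidia_uvm
}
```

Source the file and call reload_nvidia_uvm, then start darktable again. A dry run with RUN=echo is a harmless way to see the commands first.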

How about multiple versions of the nvidia drivers installed, or maybe both nvidia and nouveau? They could get confused about which one is the boss :slight_smile:

I have a system with nvidia too, and I stopped suspending completely because for me, it sometimes just won’t wake up. A long sleep, you might call it.

The big picture problem is the truly insane complexity that comes with adding a separate mostly proprietary component like a GPU. My distro is derived from ubuntu. Anyone else running a combo like this, I recommend looking up the source for the ubuntu-drivers-common package and especially the file share/hybrid/gpu-manager.c in the source tree.

That can never work.


Ian

Hi,

that has also been, and still is, an issue for me (Ubuntu). After a suspend with darktable open, OpenCL is gone.
But here, too, the already mentioned

sudo rmmod nvidia_uvm
sudo modprobe nvidia_uvm

does the trick to re-enable it. I just keep a short batch script with these commands and run it whenever the issue occurs for me …

Disclaimer: I have not tested this, as I do not use Linux Mint. I’m guessing it uses systemd-suspend.service.

What you can do is execute those commands automatically after leaving the suspend/hibernate state.

According to the systemd-suspend.service man page, whenever the system suspends or resumes it runs every executable found in /usr/lib/systemd/system-sleep/ with two arguments; we are interested only in the first one. It is either “pre” or “post”, meaning the system is about to suspend or has just resumed.

Then you can write a script like this one to run your commands:

#!/bin/sh

PATH=/sbin:/usr/sbin:/bin:/usr/bin

case "$1" in
    pre)
            #code execution BEFORE sleeping/hibernating/suspending
    ;;
    post)
            #code execution AFTER resuming
    ;;
esac

exit 0

Do not forget to set execution permission.
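
Filling in the post branch of the template above with the module reload suggested earlier in the thread might look like the sketch below. It is untested, like the template itself. The file name, the function name on_sleep_event and the RUN variable are assumptions made for this example (RUN is empty in real use, because the hook already runs as root; RUN=echo gives a dry run). The thread does not confirm that the module can be unloaded while darktable is still open, so the hook tolerates a failed rmmod.

```shell
#!/bin/sh
# Hypothetical file: /usr/lib/systemd/system-sleep/reload-nvidia-uvm

PATH=/sbin:/usr/sbin:/bin:/usr/bin

# $1 is "pre" before sleeping and "post" after resuming.
on_sleep_event() {
    case "$1" in
        post)
            # Reload the module after resume; modprobe only runs if the
            # unload succeeded. A failure here is deliberately ignored.
            ${RUN:-} rmmod nvidia_uvm && ${RUN:-} modprobe nvidia_uvm
            ;;
    esac
    # Always report success so a failed reload does not disturb resume.
    return 0
}

on_sleep_event "${1:-}"
```

If the reload fails because darktable still holds the GPU, the manual procedure (close DT, reload, start DT) remains the fallback.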

Mine does (Ubuntu 23.10) – but it’s set up to compile darktable from source.
Here is how you can find stuff from the terminal:

kofa@eagle:~$ ls -l `locate libOpenCL.so`
lrwxrwxrwx 1 root root    18 Jun 14  2023 /usr/lib/i386-linux-gnu/libOpenCL.so.1 -> libOpenCL.so.1.0.0
-rw-r--r-- 1 root root 84220 Jun 14  2023 /usr/lib/i386-linux-gnu/libOpenCL.so.1.0.0
lrwxrwxrwx 1 root root    18 Jun 14  2023 /usr/lib/x86_64-linux-gnu/libOpenCL.so -> libOpenCL.so.1.0.0
lrwxrwxrwx 1 root root    18 Jun 14  2023 /usr/lib/x86_64-linux-gnu/libOpenCL.so.1 -> libOpenCL.so.1.0.0
-rw-r--r-- 1 root root 73384 Jun 14  2023 /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0
lrwxrwxrwx 1 root root    14 Jun 14  2023 /usr/share/man/man7/libOpenCL.so.7.gz -> libOpenCL.7.gz

This shows that libOpenCL.so is a symbolic link pointing at libOpenCL.so.1.0.0, and so is libOpenCL.so.1. There are two versions of the library, one for 32 bits and one for 64 bits. libOpenCL.so only exists for 64 bits (there is /usr/lib/x86_64-linux-gnu/libOpenCL.so, but no /usr/lib/i386-linux-gnu/libOpenCL.so).

And

kofa@eagle:~$ cat /etc/OpenCL/vendors/nvidia.icd 
libnvidia-opencl.so.1

shows that the ICD file from Nvidia specifies the library libnvidia-opencl.so.1. And that library is (in 32 and 64 bit versions):

kofa@eagle:~$ ls -l `locate libnvidia-opencl.so.1`
lrwxrwxrwx 1 root root 30 Oct 30 12:17 /usr/lib/i386-linux-gnu/libnvidia-opencl.so.1 -> libnvidia-opencl.so.525.147.05
lrwxrwxrwx 1 root root 30 Oct 30 12:17 /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1 -> libnvidia-opencl.so.525.147.05

To find the packages providing them:

kofa@eagle:~$ dpkg -S /usr/lib/x86_64-linux-gnu/libOpenCL.so /usr/lib/x86_64-linux-gnu/libOpenCL.so.1 /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0
ocl-icd-opencl-dev:amd64: /usr/lib/x86_64-linux-gnu/libOpenCL.so
ocl-icd-libopencl1:amd64: /usr/lib/x86_64-linux-gnu/libOpenCL.so.1
ocl-icd-libopencl1:amd64: /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0

kofa@eagle:~$ dpkg -S /etc/OpenCL/vendors/nvidia.icd /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1 /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.525.147.05
libnvidia-compute-525:amd64: /etc/OpenCL/vendors/nvidia.icd
libnvidia-compute-525:amd64: /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
libnvidia-compute-525:amd64: /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.525.147.05

So:

  • libOpenCL.so is from a package, ocl-icd-opencl-dev, required only for development (incl. compiling darktable from source); that is probably why you don’t have it;

  • libOpenCL.so.1 and libOpenCL.so.1.0.0 come from the vendor-independent package ocl-icd-libopencl1;

  • while the Nvidia vendor package libnvidia-compute-525 provides libnvidia-opencl.so.1 and libnvidia-opencl.so.525.147.05. The latter version number corresponds to the version of my other NVidia packages (the 32-bit packages were filtered out below):

    kofa@eagle:~$ dpkg -l|grep nvidia|grep 147 | grep -v i386
    ii  libnvidia-cfg1-525:amd64                                 525.147.05-0ubuntu0.23.10.1                          amd64        NVIDIA binary OpenGL/GLX configuration library
    ii  libnvidia-common-525                                     525.147.05-0ubuntu0.23.10.1                          all          Shared files used by the NVIDIA libraries
    ii  libnvidia-compute-525:amd64                              525.147.05-0ubuntu0.23.10.1                          amd64        NVIDIA libcompute package
    ii  libnvidia-decode-525:amd64                               525.147.05-0ubuntu0.23.10.1                          amd64        NVIDIA Video Decoding runtime libraries
    ii  libnvidia-encode-525:amd64                               525.147.05-0ubuntu0.23.10.1                          amd64        NVENC Video Encoding runtime library
    ii  libnvidia-extra-525:amd64                                525.147.05-0ubuntu0.23.10.1                          amd64        Extra libraries for the NVIDIA driver
    ii  libnvidia-fbc1-525:amd64                                 525.147.05-0ubuntu0.23.10.1                          amd64        NVIDIA OpenGL-based Framebuffer Capture runtime library
    ii  libnvidia-gl-525:amd64                                   525.147.05-0ubuntu0.23.10.1                          amd64        NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
    ii  nvidia-compute-utils-525                                 525.147.05-0ubuntu0.23.10.1                          amd64        NVIDIA compute utilities
    ii  nvidia-dkms-525                                          525.147.05-0ubuntu0.23.10.1                          amd64        NVIDIA DKMS package
    ii  nvidia-driver-525                                        525.147.05-0ubuntu0.23.10.1                          amd64        NVIDIA driver metapackage
    ii  nvidia-kernel-common-525                                 525.147.05-0ubuntu0.23.10.1                          amd64        Shared files used with the kernel module
    ii  nvidia-kernel-source-525                                 525.147.05-0ubuntu0.23.10.1                          amd64        NVIDIA kernel source package
    ii  nvidia-utils-525                                         525.147.05-0ubuntu0.23.10.1                          amd64        NVIDIA driver support binaries
    ii  xserver-xorg-video-nvidia-525                            525.147.05-0ubuntu0.23.10.1                          amd64        NVIDIA binary Xorg driver
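
To check which of the two loader names exists on a given machine without relying on locate, a small sketch like the following could be used. The function name check_opencl_loader is made up for this example, and the default directory is simply the 64-bit path from the listing above; other distributions may keep the library elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: report whether the development symlink (libOpenCL.so)
# and the runtime library name (libOpenCL.so.1) exist in a directory.
check_opencl_loader() {
    # Default path taken from the Ubuntu listing above (an assumption
    # for any other system).
    dir=${1:-/usr/lib/x86_64-linux-gnu}
    for name in libOpenCL.so libOpenCL.so.1; do
        if [ -e "$dir/$name" ]; then
            echo "$name: present"
        else
            echo "$name: missing"
        fi
    done
}
```

Calling check_opencl_loader with no argument inspects the default directory. On a system without ocl-icd-opencl-dev installed, the expectation from the explanation above is that only libOpenCL.so.1 shows as present.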