opencl, oh opencl - Where art thou opencl?

Had nvidia opencl working out of the box…

A week ago - darktable was finding and using nvidia opencl.
I’ve just realized that as of today it no longer finds nvidia opencl and therefore falls back to the CPU.

I’m not aware of any major changes on my system.
I’ve done the typical googling and followed the various “reinstall nvidia drivers” steps I found.

The strange part is that clinfo shows nvidia opencl active and working, but darktable cannot see nvidia at all.

I have darktable debug logs from a week ago and today…
I could put them on a paste server or post them here. Which is best?

Linux Neon (Ubuntu clone)
Darktable 4.4.2 flatpak. No change.
NVIDIA-SMI 535.113.01 Driver Version: 535.113.01 CUDA Version: 12.2 No change.

Oh great opencl sages - any ideas?

Operating system?

What does

darktable-cltest

report?

Darktable just doesn’t find the nvidia…

$ clinfo -l
Platform #0: NVIDIA CUDA
 `-- Device #0: NVIDIA GeForce RTX 3060

From the logs of 15th Sept.:

     0.0619 [dt_get_sysresource_level] switched to 2 as `large'
     0.0619   total mem:       63648MB
     0.0619   mipmap cache:    7956MB
     0.0619   available mem:   43509MB
     0.0619   singlebuff:      994MB
     0.0619   OpenCL tune mem: OFF
     0.0619   OpenCL pinned:   OFF
[opencl_init] opencl related configuration options:
[opencl_init] opencl: ON
[opencl_init] opencl_scheduling_profile: 'very fast GPU'
[opencl_init] opencl_library: 'default path'
[opencl_init] opencl_device_priority: '*/!0,*/*/*/!0,*'
[opencl_init] opencl_mandatory_timeout: 400
     0.0648 [dt_dlopencl_init] could not find default opencl runtime library 'libOpenCL'
     0.0648 [dt_dlopencl_init] could not find default opencl runtime library 'libOpenCL.so'
[opencl_init] opencl library 'libOpenCL.so.1' found on your system and loaded
[opencl_init] found 2 platforms
[opencl_init] found 2 devices

[dt_opencl_device_init]
   DEVICE:                   0: 'NVIDIA GeForce RTX 3060'
   PLATFORM NAME & VENDOR:   NVIDIA CUDA, NVIDIA Corporation
   CANONICAL NAME:           nvidiacudanvidiageforcertx3060
   DRIVER VERSION:           535.86.05
   DEVICE VERSION:           OpenCL 3.0 CUDA, SM_20 SUPPORT
   DEVICE_TYPE:              GPU
   GLOBAL MEM SIZE:          12044 MB
   MAX MEM ALLOC:            3011 MB
   MAX IMAGE SIZE:           32768 x 32768
   MAX WORK GROUP SIZE:      1024
   MAX WORK ITEM DIMENSIONS: 3
   MAX WORK ITEM SIZES:      [ 1024 1024 64 ]
   ASYNC PIXELPIPE:          NO
   PINNED MEMORY TRANSFER:   NO
   MEMORY TUNING:            NO
   FORCED HEADROOM:          400
   AVOID ATOMICS:            NO
   MICRO NAP:                250
   ROUNDUP WIDTH:            16
   ROUNDUP HEIGHT:           16
   CHECK EVENT HANDLES:      128
   PERFORMANCE:              3.367
   TILING ADVANTAGE:         0.000
   DEFAULT DEVICE:           NO
   KERNEL BUILD DIRECTORY:   /app/share/darktable/kernels
   KERNEL DIRECTORY:         /home/akh/.var/app/org.darktable.Darktable/cache/darktable/cached_v1_kernels_for_NVIDIACUDANVIDIAGeForceRTX3060_5358605
   CL COMPILER OPTION:       -cl-fast-relaxed-math
   KERNEL LOADING TIME:       0.0505 sec

[dt_opencl_device_init]
   DEVICE:                   1: 'AMD Radeon Graphics (renoir, LLVM 15.0.7, DRM 3.49, 6.2.0-32-generic)'
   PLATFORM NAME & VENDOR:   Clover, Mesa
   CANONICAL NAME:           cloveramdradeongraphics
   DRIVER VERSION:           23.1.6
   DEVICE VERSION:           OpenCL 1.1 Mesa 23.1.6 (git-0697ac0d75)
   DEVICE_TYPE:              GPU
   *** insufficient device version ***
[opencl_init] OpenCL successfully initialized. Internal numbers and names of available devices:
[opencl_init]		0	'NVIDIA CUDA NVIDIA GeForce RTX 3060'
[opencl_init] FINALLY: opencl is AVAILABLE and ENABLED.
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	0	0	0	0
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		1	1	1	1	1
[opencl_synchronization_timeout] synchronization timeout set to 0
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	0	0	0	0
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		1	1	1	1	1
[opencl_synchronization_timeout] synchronization timeout set to 0

From today’s logs (29th Sept.):

     0.0741 [dt_get_sysresource_level] switched to 2 as `large'
     0.0741   total mem:       63648MB
     0.0741   mipmap cache:    7956MB
     0.0741   available mem:   43509MB
     0.0741   singlebuff:      994MB
     0.0741   OpenCL tune mem: OFF
     0.0741   OpenCL pinned:   OFF
[opencl_init] opencl related configuration options:
[opencl_init] opencl: ON
[opencl_init] opencl_scheduling_profile: 'very fast GPU'
[opencl_init] opencl_library: 'default path'
[opencl_init] opencl_device_priority: '*/!0,*/*/*/!0,*'
[opencl_init] opencl_mandatory_timeout: 400
     0.0787 [dt_dlopencl_init] could not find default opencl runtime library 'libOpenCL'
     0.0788 [dt_dlopencl_init] could not find default opencl runtime library 'libOpenCL.so'
[opencl_init] opencl library 'libOpenCL.so.1' found on your system and loaded
[opencl_init] found 1 platform
[opencl_init] found 1 device

[dt_opencl_device_init]
   DEVICE:                   0: 'AMD Radeon Graphics (renoir, LLVM 15.0.7, DRM 3.49, 6.2.0-33-generic)'
   PLATFORM NAME & VENDOR:   Clover, Mesa
   CANONICAL NAME:           cloveramdradeongraphics
   DRIVER VERSION:           23.1.6
   DEVICE VERSION:           OpenCL 1.1 Mesa 23.1.6 (git-0697ac0d75)
   DEVICE_TYPE:              GPU
   *** insufficient device version ***
[opencl_init] no suitable devices found.
[opencl_init] FINALLY: opencl is NOT AVAILABLE and NOT ENABLED.
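
When comparing two logs like the ones above, a small script can pull the device name, platform, and driver version out of each [dt_opencl_device_init] block and flag devices that darktable rejected. This is only a rough sketch in Python: `summarize_devices` and `SAMPLE_LOG` are hypothetical names, and the parsing assumes the log layout shown in this thread. One detail such a summary makes visible: the 15th Sept. log reports NVIDIA driver 535.86.05 inside the sandbox, while nvidia-smi on the host now reports 535.113.01, which suggests the host driver did change in between.

```python
import re

# Keys of interest in a [dt_opencl_device_init] block, mapped to output field names.
FIELDS = {
    "DEVICE": "name",
    "PLATFORM NAME & VENDOR": "platform",
    "DRIVER VERSION": "driver",
}


def summarize_devices(log_text):
    """Return one dict per [dt_opencl_device_init] block found in log_text."""
    devices = []
    current = None
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped == "[dt_opencl_device_init]":
            current = {"name": None, "platform": None, "driver": None, "rejected": False}
            devices.append(current)
            continue
        if current is None:
            continue  # not inside a device block
        if stripped.startswith("["):
            current = None  # another bracketed log line ends the block
            continue
        if stripped.startswith("***"):
            current["rejected"] = True  # e.g. "*** insufficient device version ***"
            continue
        key, _, value = stripped.partition(":")
        field = FIELDS.get(key.strip())
        if field is None:
            continue
        value = value.strip()
        if field == "name":
            # Drop the leading device index and the surrounding quotes.
            match = re.match(r"\d+:\s*'(.*)'$", value)
            if match:
                value = match.group(1)
        current[field] = value
    return devices


# Excerpt of the 15th Sept. log from this thread.
SAMPLE_LOG = """
[opencl_init] found 2 platforms
[opencl_init] found 2 devices

[dt_opencl_device_init]
   DEVICE:                   0: 'NVIDIA GeForce RTX 3060'
   PLATFORM NAME & VENDOR:   NVIDIA CUDA, NVIDIA Corporation
   DRIVER VERSION:           535.86.05
   DEVICE_TYPE:              GPU

[dt_opencl_device_init]
   DEVICE:                   1: 'AMD Radeon Graphics (renoir, LLVM 15.0.7, DRM 3.49, 6.2.0-32-generic)'
   PLATFORM NAME & VENDOR:   Clover, Mesa
   DRIVER VERSION:           23.1.6
   *** insufficient device version ***
[opencl_init] OpenCL successfully initialized.
"""

devices = summarize_devices(SAMPLE_LOG)
for device in devices:
    print(device)
```

Running the same function over the old and the new log and comparing the two lists shows at a glance which platform disappeared.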

Try to see if the chain described here is broken:
https://darktable-org.github.io/dtdocs/en/special-topics/opencl/setting-up/

Check that the flatpak has the proper drivers and access to them. Start with a flatpak update -y

Yes - I (mostly) followed that page in my initial nvidia driver reinstall attempt.

# ls -l /dev/nvidia*
crw-rw-rw- 1 root root 195,   0 Sep 29 14:01 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Sep 29 14:01 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 Sep 29 14:01 /dev/nvidia-modeset
crw-rw-rw- 1 root root 507,   0 Sep 29 14:01 /dev/nvidia-uvm
crw-rw-rw- 1 root root 507,   1 Sep 29 14:01 /dev/nvidia-uvm-tools
# lsmod |grep nvidia
nvidia_uvm           1761280  0
nvidia_drm             90112  2
nvidia_modeset       1314816  3 nvidia_drm
nvidia              56721408  99 nvidia_uvm,nvidia_modeset
drm_kms_helper        249856  4 drm_display_helper,amdgpu,nvidia_drm
drm                   696320  29 gpu_sched,drm_kms_helper,drm_display_helper,nvidia,drm_buddy,amdgpu,drm_ttm_helper,nvidia_drm,ttm
video                  69632  2 amdgpu,nvidia_modeset

The flatpak update -y was the 10-second fix… after 5 hours of googling and various attempts.

Sir - you deserve a drink of your choice!

So, for the after-action debrief: how often should this be done? It seems it doesn’t get picked up in the day-to-day update cycle.

I run Fedora 38, and Discover (the application) sends a notification whenever an update is available for a system package (dnf) or a flatpak. Check how Linux Neon manages this.

The flatpak approach is good and bad. The nvidia drivers have to be available inside the sandbox environment.
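
To make "available inside the sandbox" concrete, here is a minimal sketch of the version match involved. It assumes the usual naming convention for the Flatpak NVIDIA runtime extension, `org.freedesktop.Platform.GL.nvidia-` followed by the driver version with dots replaced by dashes; treat that as an assumption and check it against your own `flatpak list --runtime` output. The function names and the example list of installed runtimes are hypothetical; the two version numbers are the ones that appear in this thread.

```python
# Assumed prefix of the Flatpak NVIDIA GL extension name (verify on your system).
EXTENSION_PREFIX = "org.freedesktop.Platform.GL.nvidia-"


def expected_nvidia_extension(driver_version):
    """Build the extension name expected for a host driver version such as '535.113.01'."""
    return EXTENSION_PREFIX + driver_version.replace(".", "-")


def sandbox_has_matching_driver(driver_version, installed_runtimes):
    """Return True if a runtime matching the host driver version is installed."""
    return expected_nvidia_extension(driver_version) in set(installed_runtimes)


# Hypothetical inputs: the host version as reported by nvidia-smi, and runtime
# IDs as they might be listed by `flatpak list --runtime` (not called here).
host_driver = "535.113.01"
installed = [
    "org.freedesktop.Platform",
    "org.freedesktop.Platform.GL.default",
    "org.freedesktop.Platform.GL.nvidia-535-86-05",
]

print(expected_nvidia_extension(host_driver))
print(sandbox_has_matching_driver(host_driver, installed))
```

If the check fails, the sandboxed application cannot load an NVIDIA OpenCL library that matches the kernel module on the host, even though clinfo on the host works.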

Neon uses the same… which I understand relies on appstream et al. However, in this case there were no notifications of pending updates, nor was I aware of anything being updated automagically.

Thanks for the pointer. Much appreciated.

Maybe Neon has something similar

Exactly the same… (well not pointing to fedoraproject - but you get the idea).

I am trying to use the Mesa rusticl drivers on my Ubuntu system, but as soon as I enable them darktable crashes … :frowning:

Without enabling Mesa rusticl, the nvidia drivers are found and can be used, though. It would be great to see both GPUs being used.

RUSTICL_ENABLE=radeonsi /opt/darktable/bin/darktable-cltest 
this is darktable 4.5.0+777~g83758c7d9c
copyright (c) 2009-2023 johannes hanika
https://github.com/darktable-org/darktable/issues/new/choose

compile options:
  bit depth is 64 bit
  normal build
  SSE2 optimizations enabled
  OpenMP support enabled
  OpenCL support enabled
  Lua support enabled, API version 9.2.0
  Colord support enabled
  gPhoto2 support disabled
  G'MIC support enabled (compressed LUTs will be supported)
  GraphicsMagick support enabled
  ImageMagick support disabled
  libavif support enabled
  libheif support disabled
  libjxl support disabled
  OpenJPEG support enabled
  OpenEXR support enabled
  WebP support enabled

     0.0605 [dt_get_sysresource_level] switched to 2 as `large'
     0.0605   total mem:       31443MB
     0.0605   mipmap cache:    3930MB
     0.0605   available mem:   21494MB
     0.0605   singlebuff:      1965MB
     0.0610 [opencl_init] opencl library 'libOpenCL' found on your system and loaded, preference 'default path'
     0.1169 [opencl_init] found 2 platforms
[opencl_init] found 2 devices

[dt_opencl_device_init]
   DEVICE:                   0: 'RENOIR (renoir, LLVM 15.0.7, DRM 3.49, 6.2.0-33-generic)'
   PLATFORM NAME & VENDOR:   rusticl, Mesa/X.org
   CANONICAL NAME:           rusticlrenoir
   DRIVER VERSION:           23.2.1 - kisak-mesa PPA
   DEVICE VERSION:           OpenCL 3.0 
   DEVICE_TYPE:              GPU, dedicated mem
   GLOBAL MEM SIZE:          15722 MB
   MAX MEM ALLOC:            2048 MB
   MAX IMAGE SIZE:           16384 x 16384
   MAX WORK GROUP SIZE:      1024
   MAX WORK ITEM DIMENSIONS: 3
   MAX WORK ITEM SIZES:      [ 1024 1024 1024 ]
   ASYNC PIXELPIPE:          NO
   PINNED MEMORY TRANSFER:   NO
   AVOID ATOMICS:            NO
   MICRO NAP:                250
   ROUNDUP WIDTH & HEIGHT    16x16
   CHECK EVENT HANDLES:      128
   TILING ADVANTAGE:         0.000
   DEFAULT DEVICE:           NO
   KERNEL BUILD DIRECTORY:   /opt/darktable/share/darktable/kernels
   KERNEL DIRECTORY:         /home/ds/.cache/darktable/cached_v2_kernels_for_rusticlRENOIR_2321kisakmesaPPA
   CL COMPILER OPTION:       -cl-fast-relaxed-math
thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', ../src/gallium/frontends/rusticl/core/program.rs:228:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
fatal runtime error: failed to initiate panic, error 5

The rusticl driver is not considered to be stable yet.

Still, it would be good if dt didn’t crash. Could you post what you get by using the “-d opencl” option please?

I am getting the same output as I already posted before …

Without setting the RUSTICL_ENABLE environment variable, I am getting:

/opt/darktable/bin/darktable -d opencl
this is darktable 4.5.0+777~g83758c7d9c
copyright (c) 2009-2023 johannes hanika
https://github.com/darktable-org/darktable/issues/new/choose

compile options:
  bit depth is 64 bit
  normal build
  SSE2 optimizations enabled
  OpenMP support enabled
  OpenCL support enabled
  Lua support enabled, API version 9.2.0
  Colord support enabled
  gPhoto2 support disabled
  G'MIC support enabled (compressed LUTs will be supported)
  GraphicsMagick support enabled
  ImageMagick support disabled
  libavif support enabled
  libheif support disabled
  libjxl support disabled
  OpenJPEG support enabled
  OpenEXR support enabled
  WebP support enabled

     0.6060 [dt_get_sysresource_level] switched to 2 as `large'
     0.6061   total mem:       31443MB
     0.6061   mipmap cache:    3930MB
     0.6061   available mem:   21494MB
     0.6061   singlebuff:      1965MB
     0.6096 [opencl_init] opencl library 'libOpenCL' found on your system and loaded, preference 'default path'
     0.6506 [opencl_init] found 2 platforms
     0.6506 [opencl_init] no devices found for Mesa/X.org (vendor) - rusticl (name)
[opencl_init] found 1 device

[dt_opencl_device_init]
   DEVICE:                   0: 'NVIDIA GeForce RTX 3060 Laptop GPU'
   PLATFORM NAME & VENDOR:   NVIDIA CUDA, NVIDIA Corporation
   CANONICAL NAME:           nvidiacudanvidiageforcertx3060laptopgpu
   DRIVER VERSION:           525.125.06
   DEVICE VERSION:           OpenCL 3.0 CUDA, SM_20 SUPPORT
   DEVICE_TYPE:              GPU, dedicated mem
   GLOBAL MEM SIZE:          5938 MB
   MAX MEM ALLOC:            1484 MB
   MAX IMAGE SIZE:           32768 x 32768
   MAX WORK GROUP SIZE:      1024
   MAX WORK ITEM DIMENSIONS: 3
   MAX WORK ITEM SIZES:      [ 1024 1024 64 ]
   ASYNC PIXELPIPE:          NO
   PINNED MEMORY TRANSFER:   NO
   AVOID ATOMICS:            NO
   MICRO NAP:                250
   ROUNDUP WIDTH & HEIGHT    16x16
   CHECK EVENT HANDLES:      128
   TILING ADVANTAGE:         0.000
   DEFAULT DEVICE:           NO
   KERNEL BUILD DIRECTORY:   /opt/darktable/share/darktable/kernels
   KERNEL DIRECTORY:         /home/ds/.cache/darktable/cached_v2_kernels_for_NVIDIACUDANVIDIAGeForceRTX3060LaptopGPU_52512506
   CL COMPILER OPTION:       -cl-fast-relaxed-math
   KERNEL LOADING TIME:       0.0902 sec
[opencl_init] OpenCL successfully initialized. internal numbers and names of available devices:
[opencl_init]		0	'NVIDIA CUDA NVIDIA GeForce RTX 3060 Laptop GPU'
     0.8193 [opencl_init] FINALLY: opencl is AVAILABLE and ENABLED.
[opencl_init] opencl_scheduling_profile: 'multiple GPUs'
[opencl_init] opencl_device_priority: '!0,*/!1,*/1,*/1,*'
[opencl_init] opencl_mandatory_timeout: 2000
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	0	0	0	0
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	0	0	0	0
[opencl_synchronization_timeout] synchronization timeout set to 20
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	0	0	0	0
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	0	0	0	0
[opencl_synchronization_timeout] synchronization timeout set to 20

Generally the flatpak drivers are more or less in sync with the distro ones, at least in my experience. As long as the user updates both at the same time, there should be no trouble. I guess the problem is with automatic updates, as we see here…

Maybe I could suggest a short paragraph for the user manual.

Something like "If you’re using a Flatpak version, the graphics drivers must also be available and up to date within the Flatpak sandbox environment.

If you have checked the OpenCL function flow, loaded drivers, and device nodes, then before going any further try:
flatpak update -y"

I’m not sure it would be merged. The user manual should be about how to use the software. Keeping your OS up to date should be out of scope. We do have a FAQ with a flatpak section, so maybe something could go there if this is an issue others experience (frequently…).

Indeed flatpak is out of scope for the user manual… But how is the flatpak nvidia driver not a part of the automatic update?

It is, but if and only if you actually run the update via the terminal command or Discover.

I don’t understand the automatic updates comments… What am I missing?
