darktable 5.4.1. I upgraded to the NVIDIA 590 driver a couple of days ago. Now it takes forever (27 sec) to export a 1000px image; it used to take less than 2 sec.
It passes darktable-cltest, but the debug run ends with this message:
'NVIDIA CUDA NVIDIA GeForce RTX 4060 Laptop GPU' id=0: NOT utilized
$ darktable-cltest
darktable 5.4.1
Copyright (C) 2012-2026 Johannes Hanika and other contributors.
Compile options:
Bit depth -> 64 bit
Exiv2 -> 0.28.7
Lensfun -> 0.3.4
Debug -> DISABLED
SSE2 optimizations -> ENABLED
OpenMP -> ENABLED
OpenCL -> ENABLED
Lua -> ENABLED - API version 9.6.0
Colord -> ENABLED
gPhoto2 -> ENABLED
OSMGpsMap -> ENABLED - map view is available
GMIC -> ENABLED - Compressed LUTs are supported
GraphicsMagick -> ENABLED
ImageMagick -> DISABLED
libavif -> ENABLED
libheif -> ENABLED
libjxl -> ENABLED
LibRaw -> ENABLED - Version 0.22.0-Release
OpenJPEG -> ENABLED
OpenEXR -> ENABLED
WebP -> ENABLED
See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.
0.3641 [opencl_init] opencl disabled via darktable preferences
0.3642 [opencl_init] opencl library 'libOpenCL' found on your system and loaded, preference 'default path'
0.4982 [opencl_init] found 1 platform
[opencl_init] found 1 device
[dt_opencl_device_init]
DEVICE: 0: 'NVIDIA GeForce RTX 4060 Laptop GPU'
CONF KEY: cldevice_v5_nvidiacudanvidiageforcertx4060laptopgpu
PLATFORM, VENDOR & ID: NVIDIA CUDA, NVIDIA Corporation, ID=4318
CANONICAL NAME: nvidiacudanvidiageforcertx4060laptopgpu
DRIVER VERSION: 590.48.01
DEVICE VERSION: OpenCL 3.0 CUDA, SM_20 SUPPORT
DEVICE_TYPE: GPU, dedicated mem
GLOBAL MEM SIZE: 7808 MB
MAX MEM ALLOC: 1952 MB
MAX IMAGE SIZE: 32768 x 32768
MAX CONSTANT BUFFER: 64 KB
ADDRESS ALIGN: 512
COMPUTE UNITS: 24
MAX WORK GROUP SIZE: 1024
MAX WORK ITEM DIMENSIONS: 3
MAX WORK ITEM SIZES: [ 1024 1024 64 ]
ASYNC PIXELPIPE: NO
PINNED MEMORY TRANSFER: NO
AVOID ATOMICS: NO
MICRO NAP: 250
ROUNDUP WIDTH & HEIGHT 16x16
CHECK EVENT HANDLES: 128
TILING ADVANTAGE: 0.000
DEFAULT DEVICE: NO
KERNEL BUILD DIRECTORY: /usr/share/darktable/kernels
KERNEL DIRECTORY: /home/froggy/.cache/darktable/cached_v5_kernels_for_NVIDIACUDANVIDIAGeForceRTX4060LaptopGPU_5904801
CL COMPILER OPTION: -cl-fast-relaxed-math
CL COMPILER COMMAND: -w -cl-fast-relaxed-math -DNVIDIA_SM_20=1 -DNVIDIA=1 -I"/usr/share/darktable/kernels"
CL EXCEPTION: DT_OPENCL_ONLY_CUDA
KERNEL LOADING TIME: 0.0129 sec
[opencl_init] OpenCL successfully initialized. internal numbers and names of available devices:
[opencl_init] 0 'NVIDIA CUDA NVIDIA GeForce RTX 4060 Laptop GPU'
0.5832 [opencl_init] FINALLY: opencl PREFERENCE=OFF is AVAILABLE and NOT ENABLED.
[opencl_update_priorities] these are your device priorities:
[opencl_update_priorities] image preview export thumbs preview2
[dt_opencl_update_priorities] 0 -1 0 0 -1
[opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[opencl_update_priorities] image preview export thumbs preview2
[opencl_update_priorities] 0 0 0 0 0
[opencl_synchronization_timeout] synchronization timeout set to 200
[opencl_update_priorities] these are your device priorities:
[opencl_update_priorities] image preview export thumbs preview2
[dt_opencl_update_priorities] 0 -1 0 0 -1
[opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[opencl_update_priorities] image preview export thumbs preview2
[opencl_update_priorities] 0 0 0 0 0
[opencl_synchronization_timeout] synchronization timeout set to 200
What am I missing, or is the problem with the 590 driver?
Here's the debug output:
$ darktable -d opencl -d perf
darktable 5.4.1
Copyright (C) 2012-2026 Johannes Hanika and other contributors.
Compile options:
Bit depth -> 64 bit
Exiv2 -> 0.28.7
Lensfun -> 0.3.4
Debug -> DISABLED
SSE2 optimizations -> ENABLED
OpenMP -> ENABLED
OpenCL -> ENABLED
Lua -> ENABLED - API version 9.6.0
Colord -> ENABLED
gPhoto2 -> ENABLED
OSMGpsMap -> ENABLED - map view is available
GMIC -> ENABLED - Compressed LUTs are supported
GraphicsMagick -> ENABLED
ImageMagick -> DISABLED
libavif -> ENABLED
libheif -> ENABLED
libjxl -> ENABLED
LibRaw -> ENABLED - Version 0.22.0-Release
OpenJPEG -> ENABLED
OpenEXR -> ENABLED
WebP -> ENABLED
See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.
0.0000 [dt starting]
darktable -d opencl -d perf
0.2970 [opencl_init] opencl disabled via darktable preferences
0.2974 [opencl_init] opencl library 'libOpenCL' found on your system and loaded, preference 'default path'
0.4294 [opencl_init] found 1 platform
[opencl_init] found 1 device
[dt_opencl_device_init]
DEVICE: 0: 'NVIDIA GeForce RTX 4060 Laptop GPU'
CONF KEY: cldevice_v5_nvidiacudanvidiageforcertx4060laptopgpu
PLATFORM, VENDOR & ID: NVIDIA CUDA, NVIDIA Corporation, ID=4318
CANONICAL NAME: nvidiacudanvidiageforcertx4060laptopgpu
DRIVER VERSION: 590.48.01
DEVICE VERSION: OpenCL 3.0 CUDA, SM_20 SUPPORT
DEVICE_TYPE: GPU, dedicated mem
GLOBAL MEM SIZE: 7808 MB
MAX MEM ALLOC: 1952 MB
MAX IMAGE SIZE: 32768 x 32768
MAX CONSTANT BUFFER: 64 KB
ADDRESS ALIGN: 512
COMPUTE UNITS: 24
MAX WORK GROUP SIZE: 1024
MAX WORK ITEM DIMENSIONS: 3
MAX WORK ITEM SIZES: [ 1024 1024 64 ]
ASYNC PIXELPIPE: NO
PINNED MEMORY TRANSFER: NO
AVOID ATOMICS: NO
MICRO NAP: 250
ROUNDUP WIDTH & HEIGHT 16x16
CHECK EVENT HANDLES: 128
TILING ADVANTAGE: 0.000
DEFAULT DEVICE: NO
KERNEL BUILD DIRECTORY: /usr/share/darktable/kernels
KERNEL DIRECTORY: /home/froggy/.cache/darktable/cached_v5_kernels_for_NVIDIACUDANVIDIAGeForceRTX4060LaptopGPU_5904801
CL COMPILER OPTION: -cl-fast-relaxed-math
CL COMPILER COMMAND: -w -cl-fast-relaxed-math -DNVIDIA_SM_20=1 -DNVIDIA=1 -I"/usr/share/darktable/kernels"
CL EXCEPTION: DT_OPENCL_ONLY_CUDA
KERNEL LOADING TIME: 0.0132 sec
[opencl_init] OpenCL successfully initialized. internal numbers and names of available devices:
[opencl_init] 0 'NVIDIA CUDA NVIDIA GeForce RTX 4060 Laptop GPU'
0.5155 [opencl_init] FINALLY: opencl PREFERENCE=OFF is AVAILABLE and NOT ENABLED.
[opencl_update_priorities] these are your device priorities:
[opencl_update_priorities] image preview export thumbs preview2
[dt_opencl_update_priorities] 0 -1 0 0 -1
[opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[opencl_update_priorities] image preview export thumbs preview2
[opencl_update_priorities] 0 0 0 0 0
[opencl_synchronization_timeout] synchronization timeout set to 200
[opencl_update_priorities] these are your device priorities:
[opencl_update_priorities] image preview export thumbs preview2
[dt_opencl_update_priorities] 0 -1 0 0 -1
[opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[opencl_update_priorities] image preview export thumbs preview2
[opencl_update_priorities] 0 0 0 0 0
[opencl_synchronization_timeout] synchronization timeout set to 200
1.2963 [lib_load_module] failed to open `midi': libportmidi.so.2: cannot open shared object file: No such file or directory
4.9977 [dt_dev_load_raw] loading the image. took 0.080 secs (0.084 CPU)
5.0390 [export] creating pixelpipe took 0.038 secs (0.066 CPU)
5.0395 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
5.0436 [dev_pixelpipe] took 0.004 secs (0.046 CPU) [export] processed `rawprepare' on CPU, blended on CPU
5.0494 [dev_pixelpipe] took 0.006 secs (0.042 CPU) [export] processed `temperature' on CPU, blended on CPU
5.0606 [dev_pixelpipe] took 0.011 secs (0.115 CPU) [export] processed `highlights' on CPU, blended on CPU
5.1729 [dev_pixelpipe] took 0.112 secs (1.300 CPU) [export] processed `cacorrect' on CPU, blended on CPU
5.2506 [dev_pixelpipe] took 0.078 secs (0.864 CPU) [export] processed `demosaic' on CPU, blended on CPU
5.4662 [dev_pixelpipe] took 0.216 secs (3.051 CPU) [export] processed `lens' on CPU, blended on CPU
5.5424 [dev_pixelpipe] took 0.076 secs (1.061 CPU) [export] processed `ashift' on CPU, blended on CPU
5.5602 [dev_pixelpipe] took 0.018 secs (0.262 CPU) [export] processed `flip' on CPU, blended on CPU
5.5730 [dev_pixelpipe] took 0.013 secs (0.151 CPU) [export] processed `exposure' on CPU, blended on CPU
5.6344 [dev_pixelpipe] took 0.061 secs (0.713 CPU) [export] processed `toneequal' on CPU, blended on CPU
5.6435 [dev_pixelpipe] took 0.009 secs (0.038 CPU) [export] processed `crop' on CPU, blended on CPU
5.6672 [dev_pixelpipe] took 0.024 secs (0.170 CPU) [export] processed `colorin' on CPU, blended on CPU
5.6798 [dt_ioppr_transform_image_colorspace] IOP_CS_LAB-->IOP_CS_RGB took 0.013 secs (0.114 CPU) [channelmixerrgb]
5.7723 [dev_pixelpipe] took 0.105 secs (1.324 CPU) [export] processed `channelmixerrgb' on CPU, blended on CPU
18.0094 [dev_pixelpipe] took 12.237 secs (192.682 CPU) [export] processed `diffuse' on CPU, blended on CPU
18.7467 [dev_pixelpipe] took 0.737 secs (10.794 CPU) [export] processed `diffuse.1' on CPU, blended on CPU
31.3278 [dev_pixelpipe] took 12.581 secs (203.196 CPU) [export] processed `diffuse.2' on CPU, blended on CPU
31.8447 [dev_pixelpipe] took 0.517 secs (8.526 CPU) [export] processed `colorbalancergb' on CPU, blended on CPU
32.1881 [dev_pixelpipe] took 0.343 secs (4.465 CPU) [export] processed `agx' on CPU, blended on CPU
32.2281 [resample_plain] plan 0.001 secs (0.000 CPU) resample 0.039 secs (0.637 CPU)
32.2281 [dev_pixelpipe] took 0.040 secs (0.637 CPU) [export] processed `finalscale' on CPU, blended on CPU
32.2289 [dt_ioppr_transform_image_colorspace] IOP_CS_RGB-->IOP_CS_LAB took 0.001 secs (0.005 CPU) [colorout]
32.2307 [dev_pixelpipe] took 0.003 secs (0.019 CPU) [export] processed `colorout' on CPU, blended on CPU
32.2307 [dev_process_export] pixel pipeline processing took 27.192 secs (429.462 CPU)
32.2928 [export_job] exported to `/evo/temp/darktable/darktable_exported/dogs-vancouver-20140101-0337_02.jpg'
[opencl_summary_statistics] device 'NVIDIA CUDA NVIDIA GeForce RTX 4060 Laptop GPU' id=0: NOT utilized
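To make the log above easier to read, here is a rough sketch of a Python helper that ranks modules by wall time and checks for the "opencl disabled via darktable preferences" line. The function names (`summarize`, `opencl_disabled_by_preference`) are my own, and the regular expression assumes the `-d perf` line layout shown in this log. It is not part of darktable and may need adjusting for other versions.

```python
import re

# Assumed line layout, copied from the "-d perf" output above, e.g.:
#   18.0094 [dev_pixelpipe] took 12.237 secs (192.682 CPU) [export] processed `diffuse' on CPU, blended on CPU
PIPE_LINE = re.compile(
    r"\[dev_pixelpipe\] took (?P<wall>[\d.]+) secs \((?P<cpu>[\d.]+) CPU\) "
    r"\[(?P<pipe>\w+)\] processed `(?P<module>[^']+)' on (?P<device>\w+)"
)


def summarize(log_text):
    """Return (module, wall_seconds, device) tuples, slowest module first."""
    rows = []
    for line in log_text.splitlines():
        m = PIPE_LINE.search(line)
        if m:
            rows.append((m["module"], float(m["wall"]), m["device"]))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def opencl_disabled_by_preference(log_text):
    """True if the log contains the 'disabled via preferences' message."""
    return "opencl disabled via darktable preferences" in log_text


# A few lines taken verbatim from the log in this post.
SAMPLE = """\
0.2970 [opencl_init] opencl disabled via darktable preferences
5.4662 [dev_pixelpipe] took 0.216 secs (3.051 CPU) [export] processed `lens' on CPU, blended on CPU
18.0094 [dev_pixelpipe] took 12.237 secs (192.682 CPU) [export] processed `diffuse' on CPU, blended on CPU
18.7467 [dev_pixelpipe] took 0.737 secs (10.794 CPU) [export] processed `diffuse.1' on CPU, blended on CPU
31.3278 [dev_pixelpipe] took 12.581 secs (203.196 CPU) [export] processed `diffuse.2' on CPU, blended on CPU
"""

if __name__ == "__main__":
    for module, wall, device in summarize(SAMPLE):
        print(f"{module:12s} {wall:8.3f} s on {device}")
    print("OpenCL disabled by preference:", opencl_disabled_by_preference(SAMPLE))
```

In the full log, the three `diffuse` instances account for about 25.6 of the 27.192 seconds, all on the CPU, and the `opencl_init` lines report that OpenCL is disabled via the darktable preferences.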