Help with lag on high-end GPU

I’m running the latest version of darktable on Windows with an RTX 3080 Ti (OpenCL enabled), and every slider change and subsequent edit takes at least 1-2 seconds to register.

The weird part is that my GPU/VRAM usage stays very low while I’m editing, even though darktable-cltest detects my GPU and reports that OpenCL is available and enabled.

The full OpenCL test results are as follows:

Compile options:
Bit depth → 64 bit
Debug → DISABLED
SSE2 optimizations → ENABLED
OpenMP → ENABLED
OpenCL → ENABLED
Lua → ENABLED - API version 9.2.0
Colord → DISABLED
gPhoto2 → ENABLED
GMIC → ENABLED - Compressed LUTs are supported
GraphicsMagick → ENABLED
ImageMagick → DISABLED
libavif → ENABLED
libheif → ENABLED
libjxl → ENABLED
OpenJPEG → ENABLED
OpenEXR → ENABLED
WebP → ENABLED

See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.

 0.0715 [dt_get_sysresource_level] switched to 3 as `unrestricted'
 0.0737   total mem:       16339MB
 0.0748   mipmap cache:    2042MB
 0.0756   available mem:   261425MB
 0.0767   singlebuff:      16339MB
 0.0797 [opencl_init] opencl library 'OpenCL.dll' found on your system and loaded, preference 'default path'
 0.1098 [opencl_init] found 1 platform

[opencl_init] found 1 device

[dt_opencl_device_init]
DEVICE: 0: 'NVIDIA GeForce RTX 3080 Ti'
PLATFORM, VENDOR & ID: NVIDIA CUDA, NVIDIA Corporation, ID=4318
CANONICAL NAME: nvidiacudanvidiageforcertx3080ti
DRIVER VERSION: 555.85
DEVICE VERSION: OpenCL 3.0 CUDA, SM_20 SUPPORT
DEVICE_TYPE: GPU, dedicated mem
GLOBAL MEM SIZE: 12287 MB
MAX MEM ALLOC: 3072 MB
MAX IMAGE SIZE: 32768 x 32768
MAX WORK GROUP SIZE: 1024
MAX WORK ITEM DIMENSIONS: 3
MAX WORK ITEM SIZES: [ 1024 1024 64 ]
ASYNC PIXELPIPE: NO
PINNED MEMORY TRANSFER: NO
USE HEADROOM: 600Mb
AVOID ATOMICS: NO
MICRO NAP: 250
ROUNDUP WIDTH & HEIGHT 16x16
CHECK EVENT HANDLES: 128
TILING ADVANTAGE: 0.000
DEFAULT DEVICE: NO
KERNEL BUILD DIRECTORY: C:\Program Files\darktable\share\darktable\kernels
KERNEL DIRECTORY: C:\Users\mkw27\AppData\Local\Microsoft\Windows\INetCache\darktable\cached_v3_kernels_for_NVIDIACUDANVIDIAGeForceRTX3080Ti_55585
CL COMPILER OPTION: -cl-fast-relaxed-math
CL COMPILER COMMAND: -w -cl-fast-relaxed-math -DNVIDIA_SM_20=1 -DNVIDIA=1 -I"C:\Program Files\darktable\share\darktable\kernels"
KERNEL LOADING TIME: 0.0329 sec
[opencl_init] OpenCL successfully initialized. internal numbers and names of available devices:
[opencl_init] 0 'NVIDIA CUDA NVIDIA GeForce RTX 3080 Ti'
0.2632 [opencl_init] FINALLY: opencl is AVAILABLE and ENABLED.
[opencl_init] opencl_scheduling_profile: 'very fast GPU'
[opencl_init] opencl_device_priority: '*/!0,*/*/*/!0,*'
[opencl_init] opencl_mandatory_timeout: 400
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities]   image   preview   export   thumbs   preview2
[dt_opencl_update_priorities]   0       0         0        0        0
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities]   image   preview   export   thumbs   preview2
[dt_opencl_update_priorities]   1       1         1        1        1
[opencl_synchronization_timeout] synchronization timeout set to 0
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities]   image   preview   export   thumbs   preview2
[dt_opencl_update_priorities]   0       0         0        0        0
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities]   image   preview   export   thumbs   preview2
[dt_opencl_update_priorities]   1       1         1        1        1
[opencl_synchronization_timeout] synchronization timeout set to 0

Any help would be greatly appreciated.

Set your resource level to default.

Just tried, no change in performance whatsoever.

What version of dt?

Share the output from darktable -d perf after moving a slider.

Version 4.6.1

When I run it through the console and move the sliders, there’s no output at all.

Delete your kernels and see if that helps; it might refresh things. What OS are you running? In Windows they are in a somewhat obscure subfolder of the user folder…

Delete the kernels folders and DT will recreate them…
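
If you'd rather script the cleanup than hunt for the folder by hand, here is a minimal Python sketch. It assumes the cache sits in the folder shown on the KERNEL DIRECTORY line of the cltest output above, and that the kernel folders match the pattern cached_*kernels_for_* (as cached_v3_kernels_for_... does there). The function name remove_kernel_caches and its dry_run parameter are made up for this example and are not part of darktable. Close darktable before running it.

```python
import shutil
from pathlib import Path


def remove_kernel_caches(cache_root, dry_run=True):
    """Find (and optionally delete) cached OpenCL kernel folders.

    cache_root is the darktable cache folder; the folder-name pattern
    below is an assumption based on the KERNEL DIRECTORY line in the log.
    Returns the list of matching folders.
    """
    root = Path(cache_root)
    matches = sorted(p for p in root.glob("cached_*kernels_for_*") if p.is_dir())
    if not dry_run:
        for folder in matches:
            # darktable rebuilds these folders on its next start.
            shutil.rmtree(folder)
    return matches


# Example call (path is an assumption modelled on the log above):
#   import os
#   root = Path(os.environ["LOCALAPPDATA"]) / "Microsoft" / "Windows" / "INetCache" / "darktable"
#   print(remove_kernel_caches(root))                 # dry run: only lists the folders
#   remove_kernel_caches(root, dry_run=False)         # actually deletes them
```

Running it with the default dry_run=True first only lists what would be removed, so you can check the matches before deleting anything.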

That did the trick, thank you for the help!
