I’m running Ubuntu 24.04.1 LTS on a Framework Laptop 16 (AMD Ryzen™ 7040 Series with dGPU (AMD Radeon™ RX 7700S)).
I’m completely new to OpenCL, and $ clinfo --list currently gives me an empty output. I’d like to get OpenCL running so I can use it with darktable (deb installation). As I understand it, there are (at least) three ways to use OpenCL:
If I understand the manual correctly, the darktablerc line to disable the iGPU and use the dGPU for everything should read: opencl_device_priority=0/0/0/0/0,0
Correct?
I’m wondering if the AMD Ryzen™ 9 7940HS is really so slow that waiting for and using only the dGPU is faster than using CPU + dGPU in parallel.
dt processes most modules in sequence, since the output of one module feeds the next, so there is really not much parallelism within a single processing pipe. Telling it to use the dGPU normally yields faster processing because a GPU has more processing units.
Where parallelism can help is the preview versus the full pipe: the two can execute in parallel. But an iGPU uses the same memory and processing as the CPU. Something has to wait if you funnel too many things through the CPU.
You can use -d perf to evaluate how your system behaves when it exports or when you do a change in the darkroom (ideally at the start of the pipe).
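To compare runs without reading every log line by hand, the timings can be pulled out of saved -d perf output with a short script. This is only a minimal sketch: the line shape it matches ("[dev_process_...] ... took N secs") is an assumption about how darktable prints these timings and may differ between versions, and the function name summarize is made up for illustration.

```python
import re
from collections import defaultdict
from statistics import mean

# Assumed line shape (not confirmed in this thread, may vary by version):
#   "[dev_process_export] pixel pipeline processing took 1.234 secs (2.345 CPU)"
TIMING = re.compile(
    r"\[(?P<pipe>dev_process_\w+)\].*?took\s+(?P<secs>\d+(?:\.\d+)?)\s+secs"
)


def summarize(log_text):
    """Return {pipe_label: (count, mean_secs, min_secs, max_secs)}."""
    samples = defaultdict(list)
    for line in log_text.splitlines():
        match = TIMING.search(line)
        if match:  # lines that do not look like timing lines are skipped
            samples[match.group("pipe")].append(float(match.group("secs")))
    return {
        pipe: (len(values), mean(values), min(values), max(values))
        for pipe, values in samples.items()
    }
```

Calling summarize() on the text of a saved log gives one entry per pipe label, which makes it easier to repeat the same edit or export several times and look at the spread rather than a single number.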
I might be mistaken, but to my understanding, with the above configuration (very fast GPU & forbidding the iGPU) the dGPU is the only device left for darktable to use. So I was wondering whether a configuration similar to opencl_device_priority=0/*/0/*/*,* (to allow offloading to the iGPU for everything except the center image and export pipelines) and/or not using the very fast GPU profile (to allow offloading to the CPU) wouldn’t make more sense.
Memory shouldn’t be an issue here: I have 96 GB of it and, if I recall correctly, 8 GB assigned to the iGPU in the BIOS. The iGPU and CPU using the same processing sounds counter-intuitive to me (what would the iGPU consist of, then?), but I’m no expert in that field.
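To make the priority strings in this thread easier to compare, here is a small sketch that splits an opencl_device_priority value into one device list per pipe. It assumes the five '/'-separated fields stand for image, preview, export, thumbnail and preview2, in that order, which matches the reading above (first field = center image, third = export) but should be checked against the manual. The names PIPES and parse_priority are made up for illustration and are not part of darktable.

```python
# Assumed pipe order for the '/'-separated fields; verify against the manual.
PIPES = ("image", "preview", "export", "thumbnail", "preview2")


def parse_priority(value):
    """Split an opencl_device_priority value into a device list per pipe."""
    fields = value.split("/")
    if len(fields) != len(PIPES):
        raise ValueError(
            f"expected {len(PIPES)} '/'-separated fields, got {len(fields)}"
        )
    # Each field is a comma-separated list; entries are kept verbatim,
    # including '*' (wildcard) and '!N' (exclusion) entries.
    return {
        pipe: [entry for entry in field.split(",") if entry]
        for pipe, field in zip(PIPES, fields)
    }


if __name__ == "__main__":
    for setting in ("0/0/0/0/0,0", "0/*/0/*/*,*", "*/!0,*/*/*/!0,*"):
        print(setting)
        for pipe, devices in parse_priority(setting).items():
            print(f"  {pipe:10s} {devices}")
```

Split this way, the trailing ",0" and ",*" in the first two strings show up as a second entry in the last field rather than as a sixth field, so it may be worth checking whether that extra entry was intended.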
So I tried 10 different combinations of opencl_device_priority and opencl_scheduling_profile (default, very fast GPU), and the results were very close to each other, with differences typically <0.01 secs on process_image and ~0.1 secs on process_export.
The only two exceptions were when I forced use of the iGPU (priority: 1/1/1/1/1) with profile default (export time more than tripled, process image time more than doubled) and when I used */!0,*/*/*/!0,* with profile default (process image time more than doubled).
Note: Probably the numbers are not trustworthy anyway as I did not