Multiple GPU, OpenCL Priority setting NOT working as desired

I have two GPUs: 0 = Intel HD 630 and 1 = AMD RX-570. I want DT to always use device 1 and, when it is not available, to fall back to device 0 and then to the CPU.

So these are my OpenCL settings:

opencl=TRUE
opencl_async_pixelpipe=false
opencl_avoid_atomics=false
opencl_checksum=766871645
opencl_device_priority=/!1,/1,/1,/1,*
opencl_library=
opencl_mandatory_timeout=200
opencl_memory_headroom=400
opencl_memory_requirement=768
opencl_micro_nap=1000
opencl_number_event_handles=25
opencl_scheduling_profile=multiple GPUs
opencl_size_roundup=16
opencl_synch_cache=active module
opencl_use_cpu_devices=false
opencl_use_pinned_memory=false

But when I run the DT Benchmark
darktable-cli arecibo.orf arecibo.orf.xmp test.jpg --core -d perf -d opencl

it uses only the Intel 630.

So what should my opencl_device_priority be if I want to use mainly the AMD RX-570?
The guide says that there are 4 settings, but then what is the purpose of the first “*”?

DT recognizes both GPUs but uses only the Intel HD 630. The relevant output follows:

0.073026 [opencl_init] opencl: 1
0.073034 [opencl_init] opencl_scheduling_profile: ‘multiple GPUs’
0.073043 [opencl_init] opencl_library: ‘’
0.073053 [opencl_init] opencl_memory_requirement: 768
0.073062 [opencl_init] opencl_memory_headroom: 400
0.073069 [opencl_init] opencl_device_priority: ‘/!1,/1,/1,/1,*’
0.073087 [opencl_init] opencl_mandatory_timeout: 200
0.073103 [opencl_init] opencl_size_roundup: 16
0.073117 [opencl_init] opencl_async_pixelpipe: 0
0.073131 [opencl_init] opencl_synch_cache: active module
0.073148 [opencl_init] opencl_number_event_handles: 25
0.073166 [opencl_init] opencl_micro_nap: 1000
0.073181 [opencl_init] opencl_use_pinned_memory: 0
0.073195 [opencl_init] opencl_use_cpu_devices: 0
0.073212 [opencl_init] opencl_avoid_atomics: 0

[opencl_init] device 0: Intel(R) UHD Graphics 630 [0x3e92]
CANONICAL_NAME: intelru
GLOBAL_MEM_SIZE: 12710MB
MAX_WORK_GROUP_SIZE: 256
MAX_WORK_ITEM_DIMENSIONS: 3
MAX_WORK_ITEM_SIZES: [ 256 256 256 ]
DRIVER_VERSION: 21.40.21182
DEVICE_VERSION: OpenCL 3.0 NEO

1.185912 [opencl_init] device 1 ‘Ellesmere’ supports image sizes of 16384 x 16384
1.185914 [opencl_init] device 1 ‘Ellesmere’ allows GPU memory allocations of up to 3481MB
[opencl_init] device 1: Ellesmere
CANONICAL_NAME: ellesme
GLOBAL_MEM_SIZE: 4096MB
MAX_WORK_GROUP_SIZE: 256
MAX_WORK_ITEM_DIMENSIONS: 3
MAX_WORK_ITEM_SIZES: [ 1024 1024 1024 ]
DRIVER_VERSION: 3302.5 (PAL,HSAIL)
DEVICE_VERSION: OpenCL 2.0 AMD-APP (3302.5)

1.647490 [opencl_init] kernel loading time: 0.0616
1.647495 [opencl_init] OpenCL successfully initialized.
1.647496 [opencl_init] here are the internal numbers and names of OpenCL devices available to darktable:
1.647498 [opencl_init] 0 ‘Intel(R) UHD Graphics 630 [0x3e92]’
1.647499 [opencl_init] 1 ‘Ellesmere’
1.647501 [opencl_init] FINALLY: opencl is AVAILABLE on this system.
1.647502 [opencl_init] initial status of opencl enabled flag is ON.

1.648834 [opencl_priorities] these are your device priorities:
1.648836 [opencl_priorities] image preview export thumbs preview2
1.648841 [opencl_priorities] 0 0 0 0 0
1.648845 [opencl_priorities] 1 1 1 1 1
1.648849 [opencl_priorities] show if opencl use is mandatory for a given pixelpipe:
1.648851 [opencl_priorities] image preview export thumbs preview2
1.648855 [opencl_priorities] 0 0 0 0 0
1.648858 [opencl_synchronization_timeout] synchronization timeout set to 20

10.430849 [dev_process_export] pixel pipeline processing took 7.354 secs (8.258 CPU)
[export_job] exported to `test.jpg’
10.984775 [opencl_summary_statistics] device ‘Intel(R) UHD Graphics 630 [0x3e92]’ (0): 221 out of 221 events were successful and 0 events lost
10.986769 [opencl_summary_statistics] device ‘Ellesmere’ (1): NOT utilized

I’m using the default setting */!0,*/*/*, but there is only one GPU available on my system…
Did you have a look at the darktable 3.7 (development) user manual, section “multiple devices”? It also contains an example.

Yes, I have read it, but something is amiss in the guide: it talks about 4 settings.
center image
preview pixelpipe
the export pixelpipes
the thumbnail pixelpipes
But if you run DT with -d opencl from a terminal, it shows

1.648834 [opencl_priorities] these are your device priorities:
1.648836 [opencl_priorities] image preview export thumbs preview2
1.648841 [opencl_priorities] 0 0 0 0 0
1.648845 [opencl_priorities] 1 1 1 1 1
1.648849 [opencl_priorities] show if opencl use is mandatory for a given pixelpipe:
1.648851 [opencl_priorities] image preview export thumbs preview2
1.648855 [opencl_priorities] 0 0 0 0 0

So now there are 5 settings, with the addition of preview2.
Has the guide not been updated?

My DT version is
darktable 3.7.0~git1263.b9383373cc-1
Ubuntu 20.04.3

opencl_device_priority=*/!1,*/1,*/1,*/1,*

Better visible like this (the forum thought your * characters were meant to indicate italics).

You may want to give +1 a try:

You can enforce GPU processing by prefixing the list of allowed GPUs with a plus sign +. In this case darktable will not use the CPU but rather suspend processing until the next permitted OpenCL device is available.

Also, this looks weird to me:

1.648834 [opencl_priorities] these are your device priorities:
1.648836 [opencl_priorities] image preview export thumbs preview2
1.648841 [opencl_priorities] 0 0 0 0 0
1.648845 [opencl_priorities] 1 1 1 1 1

I think this is one line per device (so, first device #0, then #1). Why is device #1 listed for preview, when that is configured as !1,*, meaning anything but #1?
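
To make the syntax discussed here easier to reason about, below is a minimal Python sketch that splits a priority string into per-pipe rules. It is only an illustration of the format as described in this thread (sections separated by /, entries separated by commas, ! to exclude a device, * for any device, a leading + to make OpenCL mandatory). The function name parse_priority and the PIPES list are made up for this example, and this is not darktable's actual parser.

```python
# Pipe order as printed by "darktable -d opencl" in this thread.
PIPES = ["image", "preview", "export", "thumbs", "preview2"]


def parse_priority(value):
    """Split a priority string into one rule dict per pixelpipe.

    Hypothetical helper for illustration only; darktable's own parsing
    may differ in details.
    """
    rules = {}
    for pipe, section in zip(PIPES, value.split("/")):
        # A leading "+" is assumed to mean: never fall back to the CPU.
        mandatory = section.startswith("+")
        entries = [e.strip() for e in section.lstrip("+").split(",") if e.strip()]
        rules[pipe] = {
            "mandatory": mandatory,
            "excluded": [e[1:] for e in entries if e.startswith("!")],
            "allowed": [e for e in entries if not e.startswith("!")],
        }
    return rules


rules = parse_priority("*/!1,*/1,*/1,*")
for pipe, rule in rules.items():
    print(pipe, rule)
```

With the string from this thread, the preview section comes out as "exclude 1, then anything", which is why a device 1 entry in the preview column of the log looks suspicious.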

You are correct: when I pasted from darktablerc it got jumbled. Your setting is what I have.

but that is what I have

Er…

To me this looks as if darktable ignores the config string if the profile is set to multiple GPUs or very fast GPU: dt_conf_get_string_const("opencl_device_priority") is only used if some other value (e.g. default) is set for opencl_scheduling_profile.

So is it a bug? Should it be rectified?

No, not a bug in the code (it automatically sets some reasonable default values if those profiles are used – that is their purpose). In fact, the documentation says it, see https://darktable-org.github.io/dtdocs/special-topics/opencl/multiple-devices/:

[…] you need to select the “default” scheduling profile and change the settings in the “opencl_device_priority” configuration parameter.
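
A quick way to catch this situation is to check the two keys together. The following is a rough Python sketch, assuming the plain key=value layout shown in the settings pasted earlier in this thread; read_rc and priority_is_honored are hypothetical helper names, and the rule they encode is simply the one quoted from the manual above, not something read out of darktable's source.

```python
def read_rc(text):
    """Parse key=value lines (as pasted from darktablerc) into a dict."""
    settings = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings


def priority_is_honored(settings):
    """Assumption from this thread: the custom priority string is only
    used when the scheduling profile is 'default'."""
    return settings.get("opencl_scheduling_profile") == "default"


sample = """opencl_device_priority=*/!1,*/1,*/1,*
opencl_scheduling_profile=multiple GPUs
"""

settings = read_rc(sample)
if not priority_is_honored(settings):
    print(
        "opencl_device_priority is ignored with profile:",
        settings["opencl_scheduling_profile"],
    )
```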

You might get better performance if you just prioritize the full pixel pipe to run on your faster GPU. If you set the same priorities for each pixel pipe, then it’s first come, first served, so the full pixel pipe might end up on the slower device because the faster one is already in use for the preview or thumbnail pipe …
The custom prioritization only works if you’re using the default scheduling profile.

I’ll do that, but is my setting right?

So what should my setting be, according to you?

Also, I noticed this behavior when running the benchmark, so there were no thumbnails or other UI involved.

You might run some tests to find the best configuration by logging the processing time:

darktable -d perf -d opencl | grep -e'dev_process_' -e'using device' -e'

Put <your fastest device>,* in the first section.
Maybe exclude your faster device in the preview pipe: !<your fastest device>,*
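
If you save the debug output to a file, comparing runs by hand gets tedious. Here is a small Python sketch that pulls out the pipeline timings and the per-device summary. The regular expressions are written against the log lines pasted in this thread only, so they are an assumption about the format and may need adjusting for other darktable versions; summarize is a made-up name.

```python
import re

# Patterns modelled on the log lines quoted in this thread (assumed format).
TOOK = re.compile(r"\[(dev_process_\w+)\] pixel pipeline processing took ([\d.]+) secs")
STATS = re.compile(r"\[opencl_summary_statistics\] device [‘'`](.+?)[’'] \((\d+)\): (.+)")


def summarize(log_text):
    """Return (timings, devices): pipeline times and whether each device was used."""
    timings, devices = [], {}
    for line in log_text.splitlines():
        m = TOOK.search(line)
        if m:
            timings.append((m.group(1), float(m.group(2))))
        m = STATS.search(line)
        if m:
            devices[m.group(1)] = m.group(3).strip() != "NOT utilized"
    return timings, devices


sample_log = """10.430849 [dev_process_export] pixel pipeline processing took 7.354 secs (8.258 CPU)
10.984775 [opencl_summary_statistics] device ‘Intel(R) UHD Graphics 630 [0x3e92]’ (0): 221 out of 221 events were successful and 0 events lost
10.986769 [opencl_summary_statistics] device ‘Ellesmere’ (1): NOT utilized
"""

timings, devices = summarize(sample_log)
for name, seconds in timings:
    print(f"{name}: {seconds} s")
for name, used in devices.items():
    print(f"{name}: {'used' if used else 'NOT utilized'}")
```

Run it once per candidate priority string and compare the export time together with which device actually did the work.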

Given the manual’s example:

As the GTS450 is slower than the HD7950, an optimized “opencl_device_priority” could look like: !GeForce GTS450,*/!Tahiti,*/Tahiti,*/Tahiti,*.

And ‘Intel(R) UHD Graphics 630 [0x3e92]’ being slower than ‘Ellesmere’, I’d set

opencl_scheduling_profile=default
opencl_device_priority=!Intel(R) UHD Graphics 630 [0x3e92],*/!Ellesmere,*/Ellesmere,*/Ellesmere,*/*

Note that the manual recommends the use of names over numerical IDs. Also, cards have ‘canonical’ names, maybe those would be a better choice:

opencl_device_priority=!intelru,*/!ellesme,*/ellesme,*/ellesme,*/*
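
For anyone who wants to generate such a string for their own pair of cards, here is a tiny Python sketch that follows the same pattern as the suggestion above: keep the slow card off the image pipe, keep the fast card off the preview pipe, prefer the fast card for export and thumbs, and leave preview2 open. The function name build_priority is hypothetical, and the pattern is just the one proposed in this thread, not an official recommendation.

```python
def build_priority(fast, slow):
    """Build a five-section priority string (image/preview/export/thumbs/preview2).

    'fast' and 'slow' are device names or canonical names as printed by
    "darktable -d opencl". Illustrative helper only.
    """
    sections = [
        f"!{slow},*",  # image: anything but the slow device
        f"!{fast},*",  # preview: keep the fast device free
        f"{fast},*",   # export: fast device first, then anything
        f"{fast},*",   # thumbs: fast device first, then anything
        "*",           # preview2: no preference
    ]
    return "/".join(sections)


print("opencl_device_priority=" + build_priority("ellesme", "intelru"))
```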

0.021208 [opencl_init] opencl_scheduling_profile: ‘default’
0.021210 [opencl_init] opencl_library: ‘’
0.021212 [opencl_init] opencl_memory_requirement: 768
0.021213 [opencl_init] opencl_memory_headroom: 400
0.021215 [opencl_init] opencl_device_priority: ‘!Intel(R) UHD Graphics 630 [0x3e92],*/!Ellesmere,*/Ellesmere,*/Ellesmere,*/*’

0.317430 [opencl_priorities] these are your device priorities:
0.317432 [opencl_priorities] image preview export thumbs preview2
0.317437 [opencl_priorities] 0 0 0 0 0
0.317442 [opencl_priorities] 1 1 1 1 1
0.317446 [opencl_priorities] show if opencl use is mandatory for a given pixelpipe:
0.317448 [opencl_priorities] image preview export thumbs preview2
0.317451 [opencl_priorities] 0 0 0 0 0

8.912390 [dev_process_export] pixel pipeline processing took 7.339 secs (8.153 CPU)
[export_job] exported to `test.jpg’
9.465691 [opencl_summary_statistics] device ‘Intel(R) UHD Graphics 630 [0x3e92]’ (0): 221 out of 221 events were successful and 0 events lost
9.467400 [opencl_summary_statistics] device ‘Ellesmere’ (1): NOT utilized

I don’t know. Maybe you should report this on github.

@kofa I am sorry but I don’t know how to report it on github.

But you may want to connect to the channel where developers hang out, and ask them if this is indeed a problem or if we don’t understand the documentation:
https://webchat.oftc.net/?channels=%23darktable

Simply ask them to check out the issue here (copy the link of your reply where you have already updated the config but still get the ‘not utilized’ message).

Raised the issue at GitHub
Thanks