CPU device 2 discarded

When opening darktable 2.6.2 in debug mode to study the internal numbering of GPUs, I noticed that CPU device 2 is being discarded:


Is this ok?
If yes, does it mean that darktable is single-threaded? If no, how can I prevent the CPU from being discarded?

That just means that OpenCL won’t use your CPU’s built-in GPU. The CPU will still be used. Most desktop Intel processors have a built-in GPU.

Not actually what it is saying. It’s saying that it won’t use OpenCL to target execution on the CPU itself.

It appears that he has one of the “OpenCL on CPU” implementations (Such as Intel’s CPU Runtime - see the SECOND half of https://software.intel.com/en-us/articles/opencl-drivers , not the first half) installed, and this is detected as device 2. Darktable blacklists any OCL implementation that falls into this category, preferring its own internal CPU-targeted code instead.

Device 0 is NVIDIA discrete graphics, Device 1 is Intel integrated graphics (the GPU on the CPU die - first half of the link above), and DT does not appear to be discarding this (if it’s Beignet it should be blacklisted; if it’s NEO then DT will allow it). As to whether it’s faster on this platform, that’s unknown. DT has a basic benchmarking suite to try to evaluate this and will disable OCL by default if the GPU fails it, but the user can turn it back on in the GUI. I’ve done that on my machine because while the “test” gives 0.45 seconds for CPU and 0.65 for GPU, my actual workflows were more like 3 seconds GPU, 12 GPU for an i5-7200U.

People confuse these two potential OCL execution paths time and time again…

Hi’ @Entropy512

I have studied the darktable manual and other texts in order to comprehend the meaning of your post. I think I understand the following:

In order to make use of the GPUs (NVIDIA and Intel) on my laptop, darktable loads OpenCL code into the GPUs; this code is compiled at runtime. The loading and compiling was successful on both GPUs. If loading of the OpenCL code fails, darktable can’t make use of the GPU(s) and will do all the graphics processing on the CPU.

Darktable will also try to load OpenCL code into the CPU, but this was not successful on my CPU (“discarding CPU device 2”), and darktable will use its own code to do graphics processing on the CPU. The “device 2” message has nothing to do with the number of cores on the CPU.

Is the above correct?

Do you mean whether OpenCL is faster than some native code in NVIDIA or Intel GPUs? Or is it native code on the CPU?

How do I know if OCL is disabled by darktable, and how do I turn it on? It’s not the “activate OpenCL support” option, is it?

There must be a typing error. The last “GPU” should be a “CPU”?

How do you monitor the performance of your workflow? Is it by running darktable with the “perf” option?
Is it possible easily to determine if Beignet or NEO is installed on my pc?
Where can I read more about all this?

Sorry for all the questions, but the topic is really interesting…

Some basics can be found here:


Pretty close, other than that Darktable won’t even try to load OCL code into the CPU. The OCL drivers on the system list a set of given possible targets - in your case, it looked like NVIDIA GPU, Intel GPU, and Intel CPU. Darktable didn’t even consider the third option here because, in general, native code almost always works much better in that use case.

Although do note that since you have an NVIDIA GPU, Darktable is likely going to prefer that, and you won’t see it using your CPU much, because the GTX 850M is almost surely going to be faster.

OCL on GPU is almost always faster with a discrete GPU for almost all workloads. OCL on GPU with an integrated GPU (such as Intel integrated graphics) is not quite so clear-cut. It depends on the exact CPU/GPU model (Intel has significant variation in their graphics capability tiers), and also depends on the workload. One disadvantage of OpenCL is that if you want to run something on the GPU, you must send the data to the GPU, and for some workloads, that winds up making the total workflow slower.

There’s significant debugging info if you use darktable -d perf -d opencl - but in general if the checkbox is greyed out, dt found no suitable candidate OCL devices that were not blacklisted. If it’s present but unchecked, either you unchecked it, or dt found that the GPU was slower for its benchmark - but for some workloads you MAY benefit from overriding it.
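For reference, the invocation above with both debug domains enabled can be captured to a file like this (the log file name is just an example; exact log contents vary by darktable version):

```
darktable -d opencl -d perf > dt-debug.log 2>&1
```

The `-d opencl` output shows which devices were detected, accepted, or discarded, and `-d perf` prints per-module timings you can compare between runs.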

You can get more info as to what OpenCL capabilities are present on a Linux system with the clinfo command. Not sure about Windows. The fact that a CPU target for OCL is being seen makes me think you might be on Windows? That honestly also means you likely don’t want to try using the Intel GPU target either - Beignet is blacklisted across the board for all OSes, and NEO is still blacklisted on Windows because it seems to be highly immature there. (A big red flag: Intel’s Linux releases are the only ones claimed to pass the Khronos certification tests…)
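On Linux, for example, the platform/device inventory can be dumped like this (clinfo usually comes from the distribution’s `clinfo` package):

```
# full dump of every OpenCL platform and device the drivers expose
clinfo

# compact one-line-per-device listing
clinfo -l
```

Anything that shows up here with a CPU device type is one of the “OpenCL on CPU” runtimes that darktable will discard.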

In your case you have a GTX 850M so most likely, you want to be using that anyway. It’s virtually guaranteed to significantly outperform any Intel integrated GPU solution.

Yes, that last GPU should have been CPU. Further complicating things, if DT does NOT use OpenCL, it uses OpenMP to try to make full use of all of your cores (see OpenMP - Wikipedia).

Nearly all modules have OpenMP support, not all (but most) have OpenCL support. If a module doesn’t have OCL capability, darktable will use OpenMP to fill up your CPU cores.

Hi’ @Entropy512
Thank you for your response. I really appreciate your efforts in explaining all this to me and I hope other users also benefit.

When studying the topic it strikes me how impressive it is that developers put such a huge effort into developing and maintaining complicated free software like darktable.
It’s awesome, thank you guys out there!

Yes I’m on windows.

Now, I think I almost got it right, see below…

…and darktable displayed the message “discarding CPU device 2……” to tell me this. The CPU is device 2 because darktable has three possible targets and the internal numbering starts with 0. Correct?

The Intel GPU is accepted as device 1, and using the standard setup, `opencl_device_priority=*/!0,*/*/*`, the Intel GPU is used the most. NVIDIA is used very little. That is the reason why I got into this issue following this thread:

But everything is apparently working fine in darktable in the standard setup using the Intel GPU. How would I experience a malfunction?

You suggest that I exclude the Intel GPU altogether, maybe by setting `opencl_device_priority=!1,*/!1,*/!1,*/!1,*`? This also works very well, and now the NVIDIA GPU is used almost exclusively. Should I prefer this?
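Spelled out in darktablerc with literal asterisks, the two configurations being compared are below. Per the darktable manual, the four slash-separated fields are the device lists for the center view, preview, export, and thumbnail pipelines; `!N` excludes device N and `*` allows any remaining device:

```
# default: any device for the center view; preview avoids device 0
opencl_device_priority=*/!0,*/*/*

# variant that excludes the Intel iGPU (device 1) everywhere
opencl_device_priority=!1,*/!1,*/!1,*/!1,*
```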

Hi’ @pk5dark
Thanks. I have read the darktable user manual. A fine description.

But it is more a question of understanding the Intel CPU versus the Intel GPU versus the NVIDIA GPU, and OpenCL plus other software. This is not covered in the manual…