OpenCL multiple GPUs, memory and tuning

While it is great to have all these parameters available to tweak, I don't think we should mess with them. The defaults were chosen as a balance between being safe and being optimal for most systems. The mandatory timeout is the only one you should increase if you use a lot of iterations. When we only had the GL HLR, it was beneficial to let it run for more than 2 seconds (mainly during export).
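For reference, that timeout lives in darktablerc (edit it while darktable is closed). A sketch of raising it, using the value quoted later in this thread; the default and the exact meaning of the number may differ between darktable versions, so check the manual for your release:

```ini
# darktablerc: how long a pixelpipe will wait for a busy OpenCL
# device before giving up, under the "mandatory" GPU profiles.
# 20000 is the (large) value another poster in this thread uses;
# only raise it if you run modules with many iterations.
opencl_mandatory_timeout=20000
```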


I guess that we all want as much performance as possible when using darktable. The subject is complex and there seem to be differences of opinion as to the optimal settings. I think one should know exactly what one is doing before attempting to tune a graphics card for darktable. I, for one, do not.

I upgraded my graphics card from a GTX 1050 (2 GB) to an RTX 2060 (6 GB) after noticing tiling and generally slow performance. The change was huge, and I now find working in darktable an enjoyable experience.

I noted the darktablerc entries above in yesterday's post. If there is a specific setting that should be changed, I would do so, but only on the recommendation of someone in the know like @kofa, @priort or @hannoschwalm. Otherwise, I think it's best to leave the settings as they are rather than mess things up.

Oh, I’m not “in the know” at all. It was only a few weeks ago that I crashed darktable exactly as Hanno described above: I let it use all GPU memory. It worked like a charm (see Everyone is happy and thinks “whow, cool, got it” above), and then came the moment that he predicted: Bang - darktable allocates graphics memory it won’t get and the code won’t work.

The good thing is, one can restore the working config (if one realises (or is told, as it happened in my case) that it’s a config problem originating between the seat and the keyboard), learn a lesson – and find another creative way to break darktable the week after. :stuck_out_tongue:


You are really good at pushing darktable to its limits, or, one could also say, at detecting bugs :grin: How often did you insist: "Hanno, that's a problem"?

Adding to that, there are also OS and video driver settings that can limit darktable's performance even when darktable itself is optimized. So to fully optimize things you really have to start with the NVIDIA settings, for example, and then work through the OS settings and make sure nothing is holding back performance. It isn't always straightforward: if the driver is set one way and the OS is set to, say, request maximum performance, does the OS take priority over the driver settings? I think you can work through this to be sure. The example I noted previously with ON1 Photo illustrates this: the software had a setting to use the GPU, but unless the OS (Windows, in that case) was also configured for it, by selecting ON1 and setting high performance as the option for it, performance was reduced. I never tested this with darktable; I just set it that way in case, figuring it can't hurt.

The best performance I have with my system’s NVIDIA graphic card (GeForce 940M, 2GB), is with the following changed settings:

In darktable preferences
OpenCL scheduling profile - default
tune OpenCL performance – nothing

in “darktablerc”
opencl_device_priority=0,*/0,*/0,*/0,*/0,*
opencl_mandatory_timeout=20000
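For anyone copying that priority line (note the `*` characters; the forum software may mangle them into `+`): as I understand the syntax, so verify against the darktable manual for your version, each `/`-separated field is a device list for one pixelpipe type, and `0,*` means "try OpenCL device 0 first, then any other device". An annotated sketch:

```ini
# darktablerc: one device list per pixelpipe, separated by "/".
# I believe the order is roughly:
#   center view / preview / export / thumbnail / (second preview,
#   in newer darktable versions)
# "0,*" = prefer device 0, fall back to any other OpenCL device.
opencl_device_priority=0,*/0,*/0,*/0,*/0,*
```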

Thank you for a good explanation.

I have read the documentation several times and you are right, it is misleading in some respects.

There is a big difference between giving darktable unrestricted access to system resources or to GPU resources.

As I understand your post: being safe with respect to GPU resources is a matter of probability, and the worst that can happen is that darktable falls back to processing on the CPU. Is this correct?

I thought that my NVIDIA card was never used by processes other than darktable. Watching the performance tab of Task Manager, I have noticed that the NVIDIA GPU is used only in special situations. Therefore, if I don't do other work while exporting, could I let darktable use all NVIDIA memory?

As it turns out, it's not a big problem. Letting darktable use all GPU memory only brings a noteworthy benefit in special situations, like running diffuse or sharpen with the dehaze preset (10 iterations).

Yes, that’s what I did. If I’m correct, when using the fast GPU setting the device_priority settings are no longer being used? If so, is it okay to leave those settings in the darktablerc file?

I would have to go back and read the manual, but as for scheduling, I think that when you use "default", those settings are the ones used. Setting "very fast GPU" is like manually changing all the pipelines to use or prefer the GPU. I think… I could be wrong too.

‘Very fast GPU’ sets ‘mandatory’ GPU execution (subject to the opencl_mandatory_timeout); ‘multiple GPUs’ simply sets all pipelines to use whatever GPU is available, falling back to the CPU immediately if there are none. That setting does not know about GPU speed differences, so if you have a ‘real’ GPU and a much slower one integrated into the APU, you’re still probably better off with ‘default’, and manually specifying the ID of the fast card.
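If you prefer to pin the profile in the config file rather than the GUI, it is stored in darktablerc too. A sketch, with key and value names as I remember them, so verify against your own darktablerc before relying on it:

```ini
# darktablerc: scheduling profile, one of
#   default       - honours the opencl_device_priority line
#   multiple GPUs - all pipelines use any available GPU
#   very fast GPU - mandatory GPU execution, subject to
#                   opencl_mandatory_timeout
# The device_priority line is (as discussed above) ignored by the
# two non-default profiles, but it is harmless to leave it in place.
opencl_scheduling_profile=default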