Advice on GPU "strategy" (ROCm)

Hello,

Are there currently any GPU cards that provide a significant speedup to, say, Darktable, and fulfill the following requirements?

  • Use only free (open source) drivers.
  • All necessary software is already present in a mainstream distribution (say Debian “stable” with some backports or bits from “testing”).

Here is the background:

For the last 9 years or so, a docked ThinkPad X220 has been my main work machine. Only recently have modern CPUs reached the point where I consider an upgrade worthwhile (~3 times faster single-thread performance). I will keep my trusty ThinkPad for now, but for stationary usage I’m thinking of replacing it with a new desktop machine.

I am rather clueless as far as GPUs go. For now, my main application for a GPU would be graphics software (mostly darktable) and perhaps some programming of my own.

From what I have seen (searching this forum and elsewhere), ROCm seems to be the catchword. But as far as I can see at this point in time ROCm is not at all at a stage where things work out of the box. Special drivers need to be installed that are supported only for specific distributions. The choice of GPU cards also seems rather limited.

If these observations are indeed correct, my tendency would be to build myself a box with good CPU performance (I would probably choose a Ryzen 5000 series CPU), and only put a cheap and silent graphics card into it. (For multi-threaded workloads this will already provide an awesome speedup over my current setup.) Then, in 2 or 3 years, when the ROCm situation has stabilized, I could upgrade the machine with a decent GPU card.

Does this sound like a reasonable plan?

Advice is very welcome.

I had less than successful results using an AMD RX580 GPU.

While it’s good that it uses free, open source drivers, I’m actually finding my Nvidia GTX 1660, using the proprietary drivers, a lot more stable.

There is an interesting discussion here: New GPU for Darktable/ video encoding - #29 by Brian_Innes

Thanks a lot for pointing me to this thread. I did not find it by myself when I researched the topic on my own.

Oh, I did not imagine that support for AMD GPUs on Linux is that catastrophic today.

I feel too old for all that tinkering (doing too much tinkering already in too many domains). I will get a GPU card when Debian/Ubuntu supports it out of the box with free drivers.

Given that AMD is seeing increased use in supercomputers (see LUMI for example), I hope that this will translate to more reliable GPGPU support on Linux sooner rather than later. But I might be too optimistic given that mainstream GPGPU has been around for more than 10 years already.

Until this time has come I think I’ll stick with CPUs. I tried darktable today on my workstation at work (Some oldish double Xeon with 2*8 cores), and it runs well enough even without GPU. It should be even better on a recent Ryzen 5000 series CPU.

However, for such a CPU without integrated graphics, I do need some graphics card. I just spent some time searching for a suitable one (passively cooled, free drivers, DisplayPort output, cheap), but there does not seem to be anything!

Choosing a slower/more expensive Intel CPU only because of their integrated graphics seems bizarre.

Prior to upgrading my GPU last year to the AMD RX580, I used a passively cooled Nvidia GT 1030 GPU with 2 GB.

No difficulties installing the Nvidia drivers using the driver manager in Mint or Ubuntu. I haven’t used Debian, but there should be guides available for installing the Nvidia drivers.

I guess the decision is this: use an AMD card, which has issues with ROCm/OpenCL but has open-source drivers, or use Nvidia, which requires proprietary drivers (which some distros actually include and install for you when you install the system) but is generally more stable.

One thing worth mentioning is that there are open-source drivers for Nvidia GPUs included in the Linux kernel (the nouveau drivers). They work for general use, but do not include OpenCL.
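For anyone who wants to check whether their current driver stack exposes any OpenCL platform at all, here is a minimal sketch. It assumes the `clinfo` utility is installed and on the PATH (it is packaged by most distributions, but that is an assumption about your system); the function name `list_opencl_platforms` is just made up for illustration.

```python
import shutil
import subprocess


def list_opencl_platforms(timeout=30):
    """Return the non-empty lines printed by `clinfo -l`.

    Returns None when clinfo is not installed or cannot be run, and an
    empty list when it ran but printed nothing on stdout.
    """
    exe = shutil.which("clinfo")
    if exe is None:
        return None
    try:
        # `-l` asks clinfo for a short listing of platforms and devices.
        result = subprocess.run(
            [exe, "-l"], capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return [line for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    lines = list_opencl_platforms()
    if lines is None:
        print("clinfo not available; cannot tell whether OpenCL is exposed")
    elif not lines:
        print("clinfo ran, but listed no OpenCL platform")
    else:
        print("\n".join(lines))
```

An empty listing only tells you that no OpenCL runtime is installed for the driver in use; it does not say whether darktable would accept a device that does show up. As far as I know, darktable ships its own `darktable-cltest` tool for that second question.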

Overall it’s not that difficult to install Nvidia drivers on most distros nowadays.

Certainly after my experience with AMD, I’m sticking with NVIDIA.

I saw that many people recommend Nvidia GT710 cards.

They are passively cooled, and apparently the last model from Nvidia with truly free drivers (i.e. without binary blobs, which has advantages beyond purism, see the phoronix article above).

DisplayPort does not seem to be available, but since the monitor situation on Linux (HDR, HiDPI) is not at all clear at the moment, I will stick with my old 16:10 24" IPS display anyway.

Seems strange to pay 60 EUR for such an old card, but that seems to be the situation. Apparently non-gamer non-laptop non-server machines without integrated graphics are truly a niche market!

AMD is good until you need AMDGPU-PRO, ROCm, or OpenCL. Both AMDGPU-PRO and ROCm seem to work at the moment, but only with kernel 5.4.0-66, which is creating a lot of problems for a lot of users. Even then, apps like DaVinci Resolve show the viewers only with AMDGPU-PRO, but Fairlight timelines are just black. With ROCm, Fairlight timelines are visible but the video viewers are not rendered. I guess that’s a problem with OpenGL.

I have an AMD Radeon RX Vega 64 Liquid and OpenCL performance on Linux seems to be worse than an M1 Mac Mini.

Darktable works completely fine. I don’t know if the performance is as good as with a comparable Nvidia gpu but I’ve never had an issue with Darktable not working or crashing with OpenCL issues.

Natron doesn’t launch, Blender has issues, OBS snap doesn’t launch. Tests done under X11. However Natron and OBS snap will launch under Wayland which brings me to the worst part.

The worst part is that you can’t use Wayland with AMDGPU-PRO or ROCm. I get terrible screen tearing when I install either of those drivers, so I’m forced to stay with X11 forever. Even if I’d like to use Wayland for the apps that won’t launch under X, I can’t. Besides the screen tearing, the Wayland session crashes very often. Sometimes you’ll get 15 min of a session with terrible tearing, sometimes just 20 seconds before the crash. I once didn’t realize I was under Wayland and it crashed on me about 15 times in 10-15 min. However, if you use the drivers from the kernel, all of those issues are gone, but so is OpenCL.

Now, some good news seems to be that AMD is hiring new kernel developers at the moment, and we all might think that everything will be better now. But I really think they might not even employ anyone and that this is just a marketing strategy aimed at their Linux customers, because the AMD Linux user base has been through a lot with them in the past 2-3 years and things are starting to boil over very rapidly now. People are fed up; they’ve spent a great deal of money just to deal with AMD’s incompetence and issues. It’s just a matter of time until the press picks this up, so I think this might be their attempt to calm some folks (like myself) down.

What they don’t realize is that it ain’t gonna work if the bug reports and issues still get ignored. They are just buying a month or two. I seriously hope they actually employ the people that send their CVs and that they get to testing and fixing quickly.

Don’t buy an AMD GPU now. AMD sounds like a great buy for us FOSS folk on paper, but the real-world experience is something entirely different. And this is the worst moment to buy a GPU, as prices are very high. I think you should consider delaying your purchase at least until the summer or later. By then we should see whether AMD is worth investing in or whether to just go with Nvidia and have a nice life, and prices should have dropped somewhat.

Something that tells me that AMD might be doing damage control with this hiring news is that I saw AMD staff on various online forums and message boards engaging with people and trying to do some discussion management. And while I appreciate them explaining some things, I don’t think anybody cares about that. You can’t explain away 3 years of neglect and constant issues. Only testing and bug fixes can do that. I hope I’m wrong, though. I hope they just want to engage with the community and that they intend to fix this mess. I’d be the first one to buy 5 more AMD GPUs if that happens.

I’m sorry to report that my worst fears have come true. AMD just confirmed all my fears and basically f***ed their every Linux user and supporter.

"
Hi All,

As per the latest information and clarity provided in our Documentation that ROCm does not support GUI applications officially.

Docs also updated accordingly @ https://github.com/RadeonOpenCompute/ROCm#hardware-and-software-support

Hardware and Software Support
ROCm is focused on using AMD GPUs to accelerate computational tasks such as machine learning, engineering workloads, and scientific computing. In order to focus our development efforts on these domains of interest, ROCm supports a targeted set of hardware configurations which are detailed further in this section.
Note: The AMD ROCm™ open software platform is a compute stack for headless system deployments. GUI-based software applications are currently not supported.
"

https://github.com/RadeonOpenCompute/ROCm/issues/1345

They think they are so clever, but they forgot that the problem is rendering, which can also be done in headless mode. I’ll write them 50 bug reports for the same stuff, just in headless mode, by tomorrow…

Instead of attacking the cause of the problem, they apply the saying: dead dogs don’t bite.

I’m certainly glad I went back to Nvidia! (GTX 1660)

Lol, I tipped off Phoronix and this is what he wrote:
https://www.phoronix.com/scan.php?page=news_item&px=Radeon-ROCm-Non-GUI

I hope AMD paid him well for that article.

He didn’t even mention that they not only dropped GUI app support from ROCm but at the same time also removed PAL OpenCL from AMDGPU-PRO and put in ROCr, so I guess no driver now supports GUI apps.

As for Clover, it doesn’t even have image support and won’t have it any time soon, even though it seems to be “in progress”. Scammed by the company, that’s what we are.

I hope they are paying him well…

Finally, at least it’s getting some attention on Reddit. It only took 3 YEARS! :rage: :rage:
If AMD doesn’t make some drastic changes in response to this I’m seriously done with them. FOREVER! :rage:
https://www.reddit.com/r/linux/comments/lvbc5v/rocm_opencl_library_for_amd_gpus_now_no_longer/

What is actually the difference between general purpose GPU computation when performed by GUI applications and by headless applications?

I’ve never used OpenCL myself, but I imagine that the kind of computations that are done by Darktable or other graphics software is not different in principle from what is done in headless scientific applications. Why does it matter whether there’s a GUI or not?

Because of OpenGL but it was afaik bundled in ROCm at one point. Other than that, nothing is different afaik.

But that’s not the point: all of AMD’s recent consumer GPUs are marketed as “work by day, game by night” cards. So why do I need 3 different stacks to achieve that?!

I don’t game but if I did, I’d need AMDGPU for gaming, AMDGPU-PRO for “GUI” OpenCL accelerated apps and ROCm for ML and dev stuff. And what about HIP?

It’s a large mess.

Another way to run darktable with OpenCL acceleration seems to be the recent “Rusticl” addition to Mesa: Mesa's "Rusticl" Implementation Now Manages To Handle Darktable OpenCL - Phoronix

I’m considering getting a machine with a Ryzen 5700G APU. Unless I am mistaken, this hardware should (eventually) allow running OpenCL-accelerated darktable using Mesa/Rusticl. Or am I overlooking any problems? Would someone who has a better overview of free graphics drivers be so kind as to confirm?

I still haven’t bought the new machine that was the motivation for starting this thread, but now I can no longer wait. I would be tempted to get a Ryzen 7000 series processor with its superior single thread performance, but I fear that all that brand-new hardware and software (especially on the Linux side) will not be mature and stable.

Another reason to avoid Ryzen 7000/AM5 at the moment seems that while I’d like to build a quiet machine that has low idle power consumption, the new AM5 stuff (boards and CPUs) is geared towards highest performance with very high TDPs.

So, all in all, I’m tempted to get a Ryzen 5700G. Does this sound like a reasonable choice in the fall of 2022?

I’d advise you to buy what works now, not what might work in the future.

Thanks! I see your point, but that would mean getting a nvidia graphics card and running it with proprietary drivers. I’d like to avoid this because I prefer free drivers, and also because I would like to avoid a separate GPU card if I can.

As far as I can tell, with bleeding-edge software it is already possible to run darktable with OpenCL using Mesa/Rusticl on an AMD Vega GPU (and the Ryzen 5700G’s integrated GPU seems to be AMD Vega).

Finally, I did some tests running darktable CPU-only on a similarly powerful machine, and I was satisfied with the result. So my question is more about the probability to get an extra boost in the future.

If you’re talking about the tweet from a week or so ago, then yes. But it isn’t stable or merged into Mesa at all yet. So if you want to bet on that, then go for it. If you want usable OpenCL now, then Nvidia is the only option. AMD can’t make up their corporate mind, and Intel doesn’t have anything powerful enough (though hopefully that changes soon).

I’m not sure you will get an extra boost from a CPU with an integrated GPU vs. the CPU alone. If you want a boost, I think you should look at a dedicated GPU card.

I posted yesterday comparing Fedora vs. Win11 (with and without GPU). This conversation reminded me that I have a 5700G. I normally keep its integrated GPU disabled and use only the Nvidia card. I just disabled the Nvidia card to test the 5700G’s integrated GPU on Windows 11. This is the same image as yesterday with the same edits.

darktable 4.1.0+491~ge5537f318 (almost the same as yesterday with the PR)
  • Win11 5700G CPU/GPU - 14.076 sec
    log (Pastebin.com): 75.845051 [export] creating pixelpipe took 0.154 secs (0.156 CPU) 75.850773 [de…

Current master + PR12605, darktable 4.1.0+483~gee07a9f21
Ryzen 5700G 16 GB, Nvidia 3060 12 GB

Export of Canon 60D image with my default processing in all 3 tests:
  • Fedora 36 KDE X11 - 2.064 sec
    log (Pastebin.com): 61.208044 [export] creating pixelpipe took 0.064 secs (0.082 CPU) 61.213031 [de…
  • Windows 11 GPU - 2.692 sec
    log (Pastebin.com): 33.145194 [export] creating pixelpipe took 0.150 secs (0.141 CPU) 33.150333 [de…
  • Windows 11 CPU only - 20.23 sec
    log (Pastebin.com): 17.754234 [export] creating pixelpipe took 0.144 secs (0.156 CPU) 17.759793 [de…
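If you want to pull such numbers out of your own logs instead of reading them by eye, a small regex is enough. This is only a sketch: it assumes your log lines look like the "[export] creating pixelpipe took ..." excerpts above, and the names `PIXELPIPE_RE` and `parse_pixelpipe` are made up for illustration.

```python
import re

# Matches e.g. "[export] creating pixelpipe took 0.154 secs (0.156 CPU)"
PIXELPIPE_RE = re.compile(
    r"\[export\] creating pixelpipe took ([0-9.]+) secs \(([0-9.]+) CPU\)"
)


def parse_pixelpipe(line):
    """Return (wall_seconds, cpu_seconds) from a log line, or None if absent."""
    match = PIXELPIPE_RE.search(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


if __name__ == "__main__":
    sample = "75.845051 [export] creating pixelpipe took 0.154 secs (0.156 CPU)"
    print(parse_pixelpipe(sample))
```

Note that this line only covers setting up the pixelpipe, not the whole export; the total export times are the ones listed in the results.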

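With the modules I use, the Nvidia GPU is roughly 7.5x to 10x faster than CPU only (20.23 sec vs. 2.692 sec on Windows 11 and 2.064 sec on Fedora). The 5700G’s integrated GPU gave only a slight gain over CPU only: about 6 sec less (14.076 sec vs. 20.23 sec).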
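For reference, here is the arithmetic behind those factors as a short sketch. The timings are the export times reported above; the dictionary labels and the `speedups` helper are just names picked for illustration. Keep in mind that the integrated-GPU run used a slightly newer darktable build and that the Fedora run is compared against a Windows CPU-only baseline, so treat the ratios as rough.

```python
# Export times in seconds, as reported in this post.
timings = {
    "Fedora 36, Nvidia 3060": 2.064,
    "Windows 11, Nvidia 3060": 2.692,
    "Windows 11, 5700G integrated GPU": 14.076,
    "Windows 11, CPU only": 20.23,
}


def speedups(times, baseline):
    """Return how many times faster each run is than the baseline run."""
    base = times[baseline]
    return {name: base / seconds for name, seconds in times.items()}


if __name__ == "__main__":
    for name, factor in speedups(timings, "Windows 11, CPU only").items():
        print(f"{name}: {factor:.2f}x")
    # Fedora 36, Nvidia 3060: 9.80x
    # Windows 11, Nvidia 3060: 7.51x
    # Windows 11, 5700G integrated GPU: 1.44x
    # Windows 11, CPU only: 1.00x
```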

Your results may vary depending on raw file size and modules you use.