Advice on GPU "strategy" (ROCm)

Hello,

Are there currently any GPU cards that provide a significant speedup to, say, Darktable, and fulfill the following requirements?

  • Use only free (open source) drivers.
  • All necessary software is already present in a mainstream distribution (say Debian “stable” with some backports or bits from “testing”).

Here is the background:

For the last 9 years or so a docked ThinkPad X220 has been my main work machine. Only recently have modern CPUs reached the point where I consider an upgrade worthwhile (~3 times faster single-thread performance). I will keep my trusty ThinkPad for now, but for stationary usage I’m thinking of replacing it with a new desktop machine.

I am rather clueless as far as GPUs go. For now, my main application for a GPU would be graphics software (mostly darktable) and perhaps some programming of my own.

From what I have seen (searching this forum and elsewhere), ROCm seems to be the catchword. But as far as I can see at this point in time ROCm is not at all at a stage where things work out of the box. Special drivers need to be installed that are supported only for specific distributions. The choice of GPU cards also seems rather limited.

If these observations are indeed correct, my tendency would be to build myself a box with good CPU performance (I would probably choose a Ryzen 5000 series CPU), and only put a cheap and silent graphics card into it. (For multi-threaded workloads this will already provide an awesome speedup over my current setup.) Then, in 2 or 3 years, when the ROCm situation has stabilized, I could upgrade the machine with a decent GPU card.

Does this sound like a reasonable plan?

Advice is very welcome.

I had less than successful results using an AMD RX580 GPU.

While it’s good that it uses free, open source drivers, I’m actually finding my Nvidia GTX 1660, using the proprietary drivers, a lot more stable.

There is an interesting discussion here: New GPU for Darktable/ video encoding

Thanks a lot for pointing me to this thread. I did not find it when I researched the topic on my own.

Oh, I did not imagine that support for AMD GPUs on Linux is that catastrophic today.

I feel too old for all that tinkering (doing too much tinkering already in too many domains). I will get a GPU card when Debian/Ubuntu supports it out of the box with free drivers.

Given that AMD is seeing increased use in supercomputers (see LUMI for example), I hope that this will translate to more reliable GPGPU support on Linux sooner rather than later. But I might be too optimistic given that mainstream GPGPU has been around for more than 10 years already.

Until that time has come I think I’ll stick with CPUs. I tried darktable today on my workstation at work (an oldish dual Xeon with 2×8 cores), and it runs well enough even without a GPU. It should be even better on a recent Ryzen 5000 series CPU.

However, for such a CPU without integrated graphics, I do need some graphics card. I just spent some time searching for a suitable one (passively cooled, free drivers, DisplayPort output, cheap), but there does not seem to be anything!

Choosing a slower/more expensive Intel CPU only because of their integrated graphics seems bizarre.

Prior to upgrading my GPU last year to the AMD RX580, I used a passively cooled Nvidia GT 1030 GPU with 2 GB.

No difficulties installing the Nvidia drivers using the driver manager in Mint or Ubuntu. I haven’t used Debian, but there should be guides available for installing the Nvidia drivers.

I guess the decision is this: use an AMD card, which has issues with ROCm / OpenCL but has open source drivers, or use Nvidia, which requires proprietary drivers (which some distros actually include and install for you when you install the system) but is generally more stable?

One thing to mention is that there are open source drivers for Nvidia GPUs included in the Linux kernel (the nouveau drivers) that work for general use but do not include OpenCL.
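If you are unsure which kernel driver is actually bound to your card (nouveau, nvidia, amdgpu, ...), `lspci -k` reports it in a "Kernel driver in use:" line per device. Below is a minimal Python sketch that pulls that line out for graphics devices only. The function name and the sample text are made up for illustration (the sample is hand-written, not captured from a real machine), so treat it as a starting point rather than a finished tool.

```python
import re

# Device classes that lspci uses for graphics hardware.
GPU_CLASS = re.compile(r"VGA compatible controller|3D controller|Display controller")
DRIVER_LINE = re.compile(r"\s*Kernel driver in use:\s*(\S+)")


def kernel_drivers_in_use(lspci_k_output):
    """Return the kernel driver names listed for graphics devices.

    Expects the text printed by `lspci -k`. Hypothetical helper, not part
    of any existing package.
    """
    drivers = []
    in_gpu_block = False
    for line in lspci_k_output.splitlines():
        if line and not line[0].isspace():
            # A non-indented line starts a new device entry.
            in_gpu_block = bool(GPU_CLASS.search(line))
        elif in_gpu_block:
            match = DRIVER_LINE.match(line)
            if match:
                drivers.append(match.group(1))
    return drivers


# Hand-written sample in the shape of `lspci -k` output (illustrative only).
SAMPLE = (
    "01:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)\n"
    "\tKernel driver in use: nouveau\n"
    "\tKernel modules: nouveau\n"
    "01:00.1 Audio device: NVIDIA Corporation GP108 High Definition Audio Controller (rev a1)\n"
    "\tKernel driver in use: snd_hda_intel\n"
)

print(kernel_drivers_in_use(SAMPLE))
```

On a real system you would feed it the live output instead, e.g. `subprocess.run(["lspci", "-k"], capture_output=True, text=True).stdout`. If it reports nouveau, you are on the free driver and, as said above, without OpenCL.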

Overall it’s not that difficult to install Nvidia drivers on most distros nowadays.

Certainly after my experience with AMD, I’m sticking with NVIDIA.

I saw that many people recommend Nvidia GT 710 cards.

They are passively cooled, and apparently the last model from Nvidia with truly free drivers (i.e. without binary blobs, which has advantages beyond purism, see the Phoronix article above).

DisplayPort does not seem to be available, but since the monitor situation on Linux (HDR, HiDPI) is not at all clear at the moment, I will stick with my old 16:10 24" IPS display anyway.

Seems strange to pay 60 EUR for such an old card, but that seems to be the situation. Apparently non-gamer non-laptop non-server machines without integrated graphics are truly a niche market!

AMD is good until you need AMDGPU-PRO, ROCm, or OpenCL. Both AMDGPU-PRO and ROCm seem to work at the moment, but only with kernel 5.4.0-66, which is creating a lot of problems for a lot of users. Even then, apps like DaVinci Resolve show the viewers only with AMDGPU-PRO, but Fairlight timelines are just black. With ROCm, Fairlight timelines are visible but the video viewers are not rendered. I guess that’s a problem with OpenGL.

I have an AMD Radeon RX Vega 64 Liquid, and OpenCL performance on Linux seems to be worse than on an M1 Mac Mini.

Darktable works completely fine. I don’t know if the performance is as good as with a comparable Nvidia GPU, but I’ve never had an issue with Darktable not working or crashing because of OpenCL.

Natron doesn’t launch, Blender has issues, OBS snap doesn’t launch. Tests done under X11. However Natron and OBS snap will launch under Wayland which brings me to the worst part.

The worst part is that you can’t use Wayland with AMDGPU-PRO or ROCm. I get terrible screen tearing when I install one of those drivers, so I’m forced to stay with X11 forever. Even if I’d like to use Wayland for the apps that won’t launch under X, I can’t. Besides the screen tearing, the Wayland session crashes very often. Sometimes you’ll get 15 minutes of a session with terrible tearing, sometimes just 20 seconds before the crash. I once didn’t realize I was under Wayland and it crashed on me about 15 times in 10-15 minutes. However, if you use the drivers from the kernel, all of those issues are gone, but so is OpenCL.

Now some good news seems to be that AMD is hiring new kernel developers at the moment, and while we all might think that everything will be better now, I really think they might not even employ anyone and this is just a marketing strategy for their Linux customers, because the AMD Linux user base has been through a lot with them in the past 2-3 years and things are starting to boil over very rapidly now. People are fed up; they’ve spent a great deal of money just to deal with AMD’s incompetence and issues. It’s just a matter of time until the press picks this up, so I think this might be their attempt to calm some folks (like myself) down.

What they don’t realize is that it ain’t gonna work if the bug reports and issues still get ignored. They are just buying a month or two. I seriously hope they actually employ the people that send their CVs and that they get to testing and fixing quickly.

Don’t buy an AMD GPU now. AMD sounds like a great buy for us FOSS folk on paper, but the real-world experience is something entirely different. And this is the worst moment to buy a GPU, as prices are very high. I think you should consider delaying your purchase at least until the summer or after. By then we should see whether AMD is worth investing in or whether it’s better to just go with Nvidia and have a nice life, and prices should have dropped somewhat.

Something that tells me that AMD might be doing damage control with this hiring news is that I saw AMD staff on various online forums and message boards engaging with people and trying to do some discussion management. And while I appreciate them explaining some things, I don’t think anybody cares about that. You can’t explain away 3 years of neglect and constant issues. Only testing and bug fixes can do that. I hope I’m wrong tho. I hope they just want to engage with the community and that they intend to fix this mess. I’d be the first one to buy 5 more AMD GPUs if that happens.

I’m sorry to report that my worst fears have come true. AMD just confirmed all my fears and basically f***ed their every Linux user and supporter.

"
Hi All,

As per the latest information and clarity provided in our Documentation that ROCm does not support GUI applications officially.

Docs also updated accordingly @ https://github.com/RadeonOpenCompute/ROCm#hardware-and-software-support

Hardware and Software Support
ROCm is focused on using AMD GPUs to accelerate computational tasks such as machine learning, engineering workloads, and scientific computing. In order to focus our development efforts on these domains of interest, ROCm supports a targeted set of hardware configurations which are detailed further in this section.
**Note: The AMD ROCm™ open software platform is a compute stack for headless system deployments. GUI-based software applications are currently not supported.**
"

They think they are so clever, but they forgot that the problem is rendering, which can be done in headless mode too. I’ll write them 50 bug reports for the same stuff, just in headless mode, by tomorrow…

Instead of attacking the cause of the problem, they apply the saying: dead dogs don’t bite.

I’m certainly glad I went back to Nvidia! (GTX 1660)

Lol, I tipped off Phoronix and this is what he wrote:
https://www.phoronix.com/scan.php?page=news_item&px=Radeon-ROCm-Non-GUI

I hope AMD paid him well for that article.

He didn’t even mention that they don’t support GUI apps with ROCm anymore, and that at the same time they also removed PAL OpenCL from AMDGPU-PRO and put in ROCr, so I guess no driver supports GUI apps now.

As for Clover, it doesn’t even have image support and won’t have it any time soon, even tho it seems to be “in progress”. Scammed by the company, that’s what we are.

I hope they are paying him well…

Finally, at least it’s getting some attention on Reddit. It only took 3 YEARS! :rage: :rage:
If AMD doesn’t make some drastic changes in response to this I’m seriously done with them. FOREVER! :rage:

What is actually the difference between general purpose GPU computation when performed by GUI applications and by headless applications?

I’ve never used OpenCL myself, but I imagine that the kind of computations that are done by Darktable or other graphics software is not different in principle from what is done in headless scientific applications. Why does it matter whether there’s a GUI or not?

Because of OpenGL, but afaik it was bundled in ROCm at one point. Other than that, nothing is different afaik.

But that’s not the point. All of AMD’s recent consumer GPUs are marketed as “work by day, game by night” style GPUs. So why do I need 3 different stacks to achieve that?!
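To make the "nothing is different" part concrete, here is a rough sketch in plain Python, standing in for what an OpenCL kernel would do on the GPU. It is not darktable code and not real OpenCL; all function names are invented for illustration. The point is only that the compute step is the same function whether it is called from a batch job or from a GUI callback. What differs is what happens to the result afterwards (written to disk versus drawn on screen, which is where OpenGL comes in).

```python
def exposure_kernel(pixel, gain):
    """Per-pixel work, the part a GPU would run once per work-item."""
    return min(1.0, pixel * gain)


def run_kernel(pixels, gain):
    """CPU stand-in for launching the kernel over every pixel."""
    return [exposure_kernel(p, gain) for p in pixels]


def headless_batch(images, gain):
    """Headless use: process a list of images, no display involved."""
    return [run_kernel(image, gain) for image in images]


def gui_slider_callback(current_image, slider_value):
    """GUI use: the same compute call, triggered by a slider event.

    A real application would hand the result to the display path next.
    """
    return run_kernel(current_image, slider_value)


image = [0.25, 0.5, 0.75]
print(headless_batch([image], 2.0)[0])
print(gui_slider_callback(image, 2.0))
```

Both calls go through `run_kernel` and return the same values, so from the compute side there is no GUI-specific code path in this sketch.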

I don’t game but if I did, I’d need AMDGPU for gaming, AMDGPU-PRO for “GUI” OpenCL accelerated apps and ROCm for ML and dev stuff. And what about HIP?

It’s a large mess.