GPGPU - ArchWiki section 1.1.1 AMD/ATI states which module is intended for which range of GPUs. I can’t remember where I saw the list of all of them.
It seems my problems are not all resolved. I can import only one folder of images; then the import function hangs and I cannot open any more folders.
Also, I have only recently begun to open DT from the CL. In the console I’m now seeing a list of errors. But no idea for how long that has been happening. So I’ll start a new thread in case these are not entirely OpenCL related.
I did a couple of image processes using each of “scenic rgb” and “color balance rgb” without noticing anything untoward. Not sure if the console output reflects this activity. I suspect not.
Oh yeah, that page that you sent earlier. I tried each of the options that seemed relevant, with ‘yay -R -s’ to remove dependencies between them.
rocm-opencl-runtime:
Crashes on darktable startup:
PHI node has multiple entries for the same basic block with different incoming values!
%967 = phi float [ %largephi.extractslice0, %sw.default ], [ %largephi.extractslice055, %sw.bb667 ], [
...
label %if.end
%largephi.extractslice0187 = extractelement <4 x float> %div, i64 0
%largephi.extractslice0191 = extractelement <4 x float> %div, i64 0
in function blendop_Lab
LLVM ERROR: Broken function found, compilation aborted!
opencl-clover-mesa, opencl-legacy-amdgpu-pro:
No device found (my device is too new for these)
opencl-amd:
Darktable opens with opencl and sets profile to “Very fast gpu”. I set it back to Default to test.
The video card crashes when I open an image in darkroom. Although, occasionally I can open one, close it, then it crashes on the next one. There are no error messages with ‘darktable -d opencl’, but dmesg has those errors above. I tried reseating the card, but that didn’t help.
opencl-rusticl-mesa:
Darktable opens with opencl enabled (using ‘RUSTICL_ENABLE=radeonsi darktable -d perf -d opencl’).
However, ‘filmic rgb’ and/or ‘color balance rgb’ modules result in a black image. When I scroll in and out on the image, it appears briefly, then goes back to black. This isn’t just a display issue either, as the jpegs are black when I export them.
I checked out and built master, but I still have the issue with filmic rgb.
I’m fairly experienced with software debugging (it’s kind of my day job), but my C++ skills are at least 10 years in the back of my memory. Almost everything I do is Java, and mostly single-threaded.
I was able to get darktable hooked up to a debugger using vscode yesterday, but I haven’t had a lot of time to play with it. I tried it with rusticl since at least it doesn’t crash my video card. In my very preliminary testing, it looks like the filmic rgb module finishes successfully.
It seems to me to be a synchronization issue: I was sometimes unable to reproduce it when I put a breakpoint in “develop.c” where it raises the UI_PIPE_FINISHED signal and let it sit there for a few seconds before continuing.
I’ll try to keep testing, and if I find something useful I’ll raise it on the darktable git page. I’m pretty much out of my element with multithreading/synchronization issues though.
There don’t seem to be any logs about it. If there’s a way to enable more logging, please let me know.
Andrew, did you try using the AMD proprietary drivers?
In the filmic module, does it happen with filmic HR on or off?
I haven’t tried the proprietary drivers yet.
The settings inside filmic rgb don’t make any difference, only whether it is enabled or not.
I’m confident you won’t get black images with proprietary drivers.
If you want to test in a better way
- set the OpenCL preference to “very fast GPU” if you have just one GPU card. This forces all pipelines to use the card, so you will hit your issue sooner
- use the ‘-d pipe -d opencl’ debugging options. That will point you to the modules that don’t work correctly
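For anyone repeating this test, the two suggestions above can be scripted so that every run leaves a log file behind. This is only a rough sketch: the helper names `darktable_command` and `run_with_log` and the log file name are made up for illustration, and it assumes `darktable` is on the PATH. The `RUSTICL_ENABLE=radeonsi` variable is the one used earlier in this thread for rusticl.

```python
import os
import subprocess


def darktable_command(debug_channels):
    """Build the argv list: one '-d <channel>' pair per debug channel."""
    argv = ["darktable"]
    for channel in debug_channels:
        argv += ["-d", channel]
    return argv


def run_with_log(debug_channels, log_path, extra_env=None):
    """Start darktable, send stdout and stderr to log_path, return the exit code."""
    env = dict(os.environ)
    env.update(extra_env or {})
    with open(log_path, "w") as log:
        result = subprocess.run(
            darktable_command(debug_channels),
            env=env,
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return result.returncode


# Example call (hypothetical log name), only needed for the rusticl runtime:
# run_with_log(["pipe", "opencl"], "darktable-opencl.log",
#              {"RUSTICL_ENABLE": "radeonsi"})
```

Keeping one log per OpenCL runtime makes it easier to compare runs afterwards.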
As for the race condition idea: I have never observed such an issue and I wouldn’t even know how that could happen. What you observed is very likely a fallback to CPU code; see the options mentioned above.
There is one more idea:
Could you edit your darktablerc file manually? There will be an option like
cldevice_v5_xyz_building=-cl-fast-relaxed-math with xyz as your device. Could you make that an empty string, cldevice_v5_xyz_building=, and check again? You have to remove all OpenCL kernels to test that …
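The manual edit described above could also be done with a small script. This is a minimal sketch: the function name `clear_build_options` is hypothetical, and the key pattern is taken from the `cldevice_v5_xyz_building` example in this post. The file paths in the trailing comment are assumptions (the usual Linux defaults), so check them on your system. Close darktable first, since it rewrites darktablerc when it exits.

```python
import re

# Matches lines such as: cldevice_v5_xyz_building=-cl-fast-relaxed-math
BUILD_KEY = re.compile(r"^(cldevice_v5_.*_building)=.*$")


def clear_build_options(config_text):
    """Return config_text with every cldevice_v5_*_building value emptied."""
    out = []
    for line in config_text.splitlines():
        match = BUILD_KEY.match(line)
        # Keep the key, drop the value; leave all other lines untouched.
        out.append(match.group(1) + "=" if match else line)
    return "\n".join(out) + "\n"


# Usage sketch (assumed default location, verify before running):
#   path = os.path.expanduser("~/.config/darktable/darktablerc")
#   text = open(path).read()
#   open(path, "w").write(clear_build_options(text))
# Afterwards remove the cached kernels (assumed to live under
# ~/.cache/darktable/) so they are rebuilt without the option.
```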
I haven’t actually been able to reproduce my race condition, so I’m going to chalk that one up to my non-systematic testing at that point.
I captured the logs running with the rusticl implementation (-d perf -d opencl -d pipe) both before and after clearing the building= config and deleting the cached kernel. I don’t see any errors, but the output image is all black. When I remove filmic, the image appears normally.
I’ll upload the logs, but I realize that rusticl is not really supported.
I’ll try testing the opencl-amd implementation with those debug logs enabled, but that’s the one where I keep losing the display, and that gets old very fast.
Thanks for all your help so far. It could be that my hardware combination just isn’t supported.
basicConfig.txt (22.6 KB)
modifiedConfig.txt (22.7 KB)
Have you tried to disable the -cl-fast-relaxed-math before building the kernels?
Yes. “modifiedConfig.txt” is the log where I disabled that (“CL COMPILER OPTION” is blank). I removed the cached kernels between each attempt.
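To see what actually changed between two such logs (for example basicConfig.txt and modifiedConfig.txt), a line diff that ignores timestamps may help. This is a sketch only: it assumes each log line starts with a numeric timestamp, which may not hold for every line, and the names `strip_timestamp` and `log_diff` are made up for illustration.

```python
import difflib
import re

# Assumed line prefix: optional spaces, a decimal number, then whitespace.
TIMESTAMP = re.compile(r"^\s*\d+[.,]\d+\s+")


def strip_timestamp(line):
    """Drop the trailing newline and a leading timestamp, if one is present."""
    return TIMESTAMP.sub("", line.rstrip("\n"))


def log_diff(before_lines, after_lines):
    """Return only the changed lines, prefixed with '-' (before) or '+' (after)."""
    a = [strip_timestamp(line) for line in before_lines]
    b = [strip_timestamp(line) for line in after_lines]
    diff = list(difflib.unified_diff(a, b, "before", "after", n=0, lineterm=""))
    # Skip the two file-header lines and the '@@' hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]


# Usage sketch:
#   with open("basicConfig.txt") as f1, open("modifiedConfig.txt") as f2:
#       for line in log_diff(f1.readlines(), f2.readlines()):
#           print(line)
```

If the only differences are the compiler option line and timing figures, the black image is probably not caused by -cl-fast-relaxed-math.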