darktable 3.6.1 crashing when OpenCL activated - Manjaro [solved]

Looking for some help with an odd issue that started after the latest Manjaro update (darktable is installed from the official community repository). I’ve been using darktable with OpenCL enabled for a while without problems, but after the update it began crashing. darktable starts fine and loads the first few thumbnails, then crashes as soon as I scroll down the lighttable. Changing the thumbnail quality and format settings made no difference; the only thing that helped was turning off OpenCL.
Now darktable starts normally and stays stable until I re-enable OpenCL and either scroll through thumbnails in the lighttable or export photos.

So far, I’ve only seen this problem when OpenCL is on. It crashes and exits with this error message when I run it from the terminal:

/usr/include/c++/11.1.0/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = hsa_signal_s; _Alloc = std::allocator<hsa_signal_s>; std::vector<_Tp, _Alloc>::reference = hsa_signal_s&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion ‘__n < this->size()’ failed.
zsh: abort (core dumped) darktable -d cache

darktable --version output:

this is darktable 3.6.1
copyright (c) 2009-2021 johannes hanika
darktable-dev@lists.darktable.org

compile options:
bit depth is 64 bit
normal build
SSE2 optimized codepath enabled
OpenMP support enabled
OpenCL support enabled
Lua support enabled, API version 7.0.0
Colord support enabled
gPhoto2 support enabled
GraphicsMagick support enabled
ImageMagick support disabled
OpenEXR support enabled

neofetch output for system information:
OS: Manjaro Linux x86_64
Kernel: 5.15.7-1-MANJARO
Shell: bash 5.1.12
Resolution: 3840x2160
DE: Plasma 5.23.4
WM: KWin
Theme: Breath2 2021 Dark [Plasma], Breeze [GTK2/3]
Icons: breeze [Plasma], breeze [GTK2/3]
CPU: AMD Ryzen 7 3700X (16) @ 3.600GHz
GPU: AMD ATI Radeon RX 5500/5500M / Pro 5500M
Memory: 3982MiB / 32070MiB

clinfo output attached
clinfo.txt (15.9 KB)

Thank you to everyone involved with developing this amazing piece of software! Any help would be appreciated.

You can start darktable from the terminal with darktable -d opencl and paste the output here after it crashes.
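A minimal sketch of one way to keep that output for posting, assuming a POSIX shell with tee available. The helper name capture_log and the file name opencl.txt are only illustrative; the darktable -d opencl command is the one suggested above.

```shell
# Run a command, show its output on screen, and save a copy
# (stdout and stderr together) to a log file.
# capture_log is a hypothetical helper name, not part of darktable.
capture_log() {
    log_file=$1
    shift
    "$@" 2>&1 | tee "$log_file"
}

# Example (not executed here):
#   capture_log opencl.txt darktable -d opencl
```

The assertion message from the C++ runtime is written to stderr, so redirecting 2>&1 is what gets it into the file. The "abort (core dumped)" line is printed by the shell itself and will not appear in the log.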

Here’s the output from darktable -d opencl
opencl.txt (46.8 KB)

The last line before it crashed was
[pixelpipe_process] [thumbnail] using device 0

And I still get this weird error message in the terminal:
/usr/include/c++/11.1.0/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = hsa_signal_s; _Alloc = std::allocator<hsa_signal_s>; std::vector<_Tp, _Alloc>::reference = hsa_signal_s&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion ‘__n < this->size()’ failed.

Morning, @botheringbirds, and welcome!

What graphic-card drivers do you use?
I have an Nvidia – so my settings naturally
will differ a bit from yours, but still: my
old settings can be found here:

Have fun!
Claes in Lund, Sweden

@Claes thank you for the suggestion! I’m using an AMD Radeon RX 5500 XT.
I actually saw your post before making my own and made sure that I had the matching programs for AMD.

Hmmmm… Have you seen this?

@Claes I have not. That thread looks to be about the newer 6900 XT and more focused on mining, but a driver issue could make sense. I looked around for something to help diagnose this and came across clpeak, which crashes with the same kind of error.

   ~  clpeak
Platform: Clover
Device: AMD Radeon RX 5500 XT (NAVI14, DRM 3.42.0, 5.15.7-1-MANJARO, LLVM 13.0.0)
Driver version : 21.2.5 (Linux x64)
Compute units : 22
Clock frequency : 1895 MHz
Build Log: fatal error: cannot open file ‘/usr/share/clc/gfx1012-amdgcn-mesa-mesa3d.bc’: No such file or directory

Platform: AMD Accelerated Parallel Processing
Device: gfx1012:xnack-
Driver version : 3361.0 (HSA1.1,LC) (Linux x64)
Compute units : 11
Clock frequency : 1900 MHz

Global memory bandwidth (GBPS)
/usr/include/c++/11.1.0/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = hsa_signal_s; _Alloc = std::allocator<hsa_signal_s>; std::vector<_Tp, _Alloc>::reference = hsa_signal_s&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion ‘__n < this->size()’ failed.
zsh: abort (core dumped) clpeak
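
The clpeak output above shows two OpenCL platforms on the same GPU: Clover (Mesa's implementation) and AMD Accelerated Parallel Processing (the HSA/ROCm one, where the assertion fires). A small sketch for pulling just the platform names out of such output, assuming lines of the form "Platform: NAME" as shown above; list_platforms is a hypothetical helper name.

```shell
# Print the name of every OpenCL platform found in clpeak-style
# output read from stdin (lines that start with "Platform:").
# list_platforms is a hypothetical helper name.
list_platforms() {
    sed -n 's/^Platform:[[:space:]]*//p'
}

# Example (not executed here):
#   clpeak 2>&1 | list_platforms
```

More than one name in the result means more than one OpenCL implementation is installed for the device.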

Aha! That was an important clue.
See suggestion at the end of this page:
https://bugs.archlinux.org/task/70740

/careful: there might be one space too many in his link/

Fixed! That was an adventure. I had both ROCm and the opencl-mesa package installed, but needed the opencl-amd package instead. I uninstalled opencl-mesa and the ROCm-related packages and ran “pamac install opencl-amd”. There was an issue with the ncurses5 dependency, but importing the PGP key as advised on the AUR page (AUR (en) - ncurses5-compat-libs) did the trick. Thanks, @Claes!
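
For anyone landing here later, the steps in this post can be sketched as a short dry-run script. It only prints the commands unless APPLY=1 is set. The run and fix_opencl names are illustrative. The ROCm package names and the PGP key ID are not given in this thread, so they are left out rather than guessed.

```shell
# Dry run by default: print each command instead of executing it.
# Set APPLY=1 in the environment to really run the commands.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

fix_opencl() {
    # Remove Mesa's OpenCL implementation (the "Clover" platform).
    run pamac remove opencl-mesa
    # Also remove whichever ROCm packages you installed; their names
    # are not listed in the thread, so none are hard-coded here.
    # Install AMD's OpenCL package, as quoted in the post above.
    run pamac install opencl-amd
    # If the ncurses5-compat-libs dependency fails on a PGP check,
    # import the key named on that package's AUR page first.
}

fix_opencl
```

Review the printed commands first, then rerun with APPLY=1 once they match your setup.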

Super!
Now you can go on bothering birds again.

Have fun!
Claes in Lund, Sweden