ART 1.18.1 segfaulting on launch

I am running ART on Slackware Linux. I’m actually the community maintainer of the ART package for SlackBuilds.org. Last night I upgraded from 1.17.2 to 1.18.1. The build went fine, but it segfaults immediately when I try to run it. No information is provided other than “Segmentation fault.” Are there any other reports like this, or recommendations for how to debug it? I guess the first thing to do is a debug build, to see if it will give me a traceback. I did see a thread where someone is having a possibly similar issue on Windows 10.

Hi,
Yes, a backtrace from a debug build is your best bet.

Same issue with the debug build. I have a hunch this is related to mimalloc, which was a new requirement for 1.18.1. Maybe I can do a debug build of that.

Great, so now you can produce a gdb backtrace and we can see what is going on :)
BTW, does the “official” linux64 binary crash too?
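
For anyone following along, a backtrace like the one requested here can be captured without an interactive gdb session. The following is a minimal sketch in Python, assuming gdb is installed; the helper names `gdb_backtrace_command` and `capture_backtrace` are hypothetical, and the path to the binary is whatever applies on your system.

```python
import subprocess


def gdb_backtrace_command(program, *program_args):
    """Build a gdb command line that runs `program` and prints a backtrace.

    -batch exits gdb once the commands finish, each -ex supplies one gdb
    command, and --args passes everything after it to the debugged program.
    """
    return ["gdb", "-batch", "-ex", "run", "-ex", "bt",
            "--args", program, *program_args]


def capture_backtrace(program, *program_args):
    """Run the program under gdb and return gdb's standard output as text."""
    result = subprocess.run(
        gdb_backtrace_command(program, *program_args),
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `capture_backtrace("/usr/bin/ART")` would return the text to paste into a report. For a multi-threaded program, replacing `"bt"` with `"thread apply all bt"` shows every thread rather than only the one that crashed.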

Looks like my hunch was correct. Here is the gdb output:

Starting program: /usr/bin/ART 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff320d640 (LWP 14219)]
[New Thread 0x7ffff2a0c640 (LWP 14220)]
[New Thread 0x7ffff220b640 (LWP 14221)]
[New Thread 0x7ffff1a0a640 (LWP 14222)]
[New Thread 0x7ffff1209640 (LWP 14223)]
[New Thread 0x7ffff0a08640 (LWP 14224)]
[New Thread 0x7ffff0207640 (LWP 14225)]
[New Thread 0x7fffefa06640 (LWP 14226)]
[New Thread 0x7fffef205640 (LWP 14227)]
[New Thread 0x7fffeea04640 (LWP 14228)]
[New Thread 0x7fffee203640 (LWP 14229)]
[New Thread 0x7fffeda02640 (LWP 14230)]
[New Thread 0x7fffed201640 (LWP 14231)]
[New Thread 0x7fffeca00640 (LWP 14232)]

Thread 15 "ART" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffeca00640 (LWP 14232)]
0x00007ffff52e89dd in mi_malloc () from /usr/lib64/libmimalloc.so.2

The mimalloc version is 2.0.7. Any idea if that is supported?

The linux64 binary doesn’t work, because it’s linked to libsystemd.so.0, and Slackware doesn’t have that.
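
A missing dependency of this kind shows up as `not found` in the output of `ldd`. As a rough sketch, the Python helper below (the name `missing_libraries` is hypothetical) picks those entries out of `ldd` output that has already been captured as text. The sample input is illustrative only and is not taken from this thread.

```python
def missing_libraries(ldd_output):
    """Return the names of shared libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Resolved entries look like "libfoo.so.1 => /path/libfoo.so.1 (0x...)";
        # unresolved ones look like "libfoo.so.1 => not found".
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing


# Illustrative input only (addresses are made up).
sample = (
    "\tlibsystemd.so.0 => not found\n"
    "\tlibc.so.6 => /lib64/libc.so.6 (0x00007f0000000000)\n"
)
print(missing_libraries(sample))
```

The text to feed in can be obtained with `subprocess.run(["ldd", path], capture_output=True, text=True).stdout`.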

Thanks for the info. I’m not sure I’ve tried 2.0.7 but I will upgrade locally and see if there’s anything strange going on there.

@montagdude 2.0.9 works fine here. Did you build mimalloc yourself? Did you try one of the env vars to see if any useful message is displayed? (mi-malloc: Environment Options)
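
As a sketch of what trying the env vars could look like from a script: the Python snippet below runs a command with `MIMALLOC_VERBOSE` and `MIMALLOC_SHOW_ERRORS` set, both of which are listed among the mimalloc environment options. The helper name `run_with_mimalloc_diagnostics` is hypothetical, and nothing here is specific to ART.

```python
import os
import subprocess

# Documented mimalloc options: verbose startup info and error/warning messages.
MIMALLOC_DEBUG_ENV = {
    "MIMALLOC_VERBOSE": "1",
    "MIMALLOC_SHOW_ERRORS": "1",
}


def run_with_mimalloc_diagnostics(command):
    """Run `command` (a list of strings) with mimalloc diagnostics enabled.

    Returns the CompletedProcess; mimalloc writes its messages to stderr.
    """
    env = dict(os.environ)          # keep the existing environment
    env.update(MIMALLOC_DEBUG_ENV)  # add the diagnostic switches on top
    return subprocess.run(command, env=env, capture_output=True, text=True)
```

On POSIX systems a negative `returncode` means the process was killed by a signal, so `result.returncode == -signal.SIGSEGV` would indicate a segfault.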

mimalloc is another from the community SlackBuilds repo on Slackware, so it was built from source but presumably has had at least one person check that it works. I don’t have time for further debugging right now, but I will definitely dig into it and update this thread with my findings. Thanks for your help.

Partly related: when I was experimenting with compiling ART on macOS Ventura the other day, I sometimes had crashes, and the macOS bug report seemed to indicate something related to mimalloc as well. But without GDB on this OS, I couldn’t go any further hunting the bug.
I also have mimalloc 2.0.9 I think.

If you want to try without mimalloc, this patch should be enough:

diff --git a/CMakeLists.txt b/CMakeLists.txt
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -607,7 +607,7 @@
         set(ART_MIMALLOC_VERSION_INFO "V${mimalloc_VERSION}")
     endif()
 else()
-    message(FATAL_ERROR "ART requires the mimalloc library. Please install it (see https://microsoft.github.io/mimalloc/)")
+    message(WARNING "ART requires the mimalloc library. Please install it (see https://microsoft.github.io/mimalloc/)")
 endif()
 
 if(ENABLE_LIBRAW)

FYI, ART 1.18.1 works fine on openSUSE Tumbleweed (mimalloc 2.0.9)

I was away for the last few days. It just so happens that mimalloc was updated to 2.0.9 on my OS the other day, but unfortunately that doesn’t solve the problem. I tried MIMALLOC_VERBOSE=1, but it didn’t give any information that seems particularly useful. Here is the output:

dello@Latitude:~$ ART
mimalloc: process init: 0x7faddc958dc0
mimalloc: secure level: 0
mimalloc: mem tracking: none
mimalloc: using 1 numa regions
mimalloc: option 'show_errors': 0
mimalloc: option 'show_stats': 0
mimalloc: option 'verbose': 1
mimalloc: option 'eager_commit': 1
mimalloc: option 'deprecated_eager_region_commit': 0
mimalloc: option 'deprecated_reset_decommits': 0
mimalloc: option 'large_os_pages': 0
mimalloc: option 'reserve_huge_os_pages': 0
mimalloc: option 'reserve_huge_os_pages_at': -1
mimalloc: option 'reserve_os_memory': 0
mimalloc: option 'deprecated_segment_cache': 0
mimalloc: option 'page_reset': 0
mimalloc: option 'abandoned_page_decommit': 0
mimalloc: option 'deprecated_segment_reset': 0
mimalloc: option 'eager_commit_delay': 1
mimalloc: option 'decommit_delay': 25
mimalloc: option 'use_numa_nodes': 0
mimalloc: option 'limit_os_alloc': 0
mimalloc: option 'os_tag': 100
mimalloc: option 'max_errors': 16
mimalloc: option 'max_warnings': 16
mimalloc: option 'max_segment_reclaim': 8
mimalloc: option 'allow_decommit': 1
mimalloc: option 'segment_decommit_delay': 500
mimalloc: option 'decommit_extend_delay': 1
mimalloc: option 'destroy_on_exit': 0
Segmentation fault

I would like to get it working with mimalloc rather than disabling it. I’ll do a debug build of mimalloc next.

The plot thickens…

When I do a debug build of mimalloc, ART runs, but it gives an error when I close the window. Here is the mimalloc verbose output and gdb backtrace. Problem is, now I don’t know if this is the same as the error that causes the segfault with the regular build. I guess there’s a good chance it is the same, and it’s just manifesting differently in the debug build.

mimalloc: error: buffer overflow in heap block 0x200020c0390 of size 7: write after 7 bytes

Thread 1 "ART" received signal SIGABRT, Aborted.
0x00007ffff47ab868 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff47ab868 in raise () at /lib64/libc.so.6
#1  0x00007ffff4792546 in abort () at /lib64/libc.so.6
#2  0x00007ffff52e2748 in  () at /usr/lib64/libmimalloc-debug.so.2
#3  0x00007ffff52f4887 in  () at /usr/lib64/libmimalloc-debug.so.2
#4  0x00007ffff617fd64 in lfMount::~lfMount() () at /usr/lib64/liblensfun.so.2
#5  0x00007ffff6164250 in lfDatabase::~lfDatabase() ()
    at /usr/lib64/liblensfun.so.2
#6  0x00007ffff6164349 in lfDatabase::Destroy() ()
    at /usr/lib64/liblensfun.so.2
#7  0x0000000000bee383 in rtengine::LFDatabase::~LFDatabase() ()
#8  0x00007ffff47ae4c7 in __run_exit_handlers () at /lib64/libc.so.6
#9  0x00007ffff47ae66a in  () at /lib64/libc.so.6
#10 0x00007ffff4794044 in __libc_start_main () at /lib64/libc.so.6
#11 0x000000000071f1aa in _start ()

I figured out the problem. I saw lensfun pretty high in the traceback, so I decided to look into it. It turns out Slackware 15 has a non-standard version of lensfun, 0.3.95, which I couldn’t even find on the project’s website or GitHub. Switching to 0.3.3, the actual latest version, solved the segfault. However, this isn’t a real solution for me at the moment, since things in the SlackBuilds community repo are required to be tested against a stock Slackware installation. I’ll just use the patch @agriggio provided for now to remove mimalloc as a requirement.
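
Spotting which libraries appear in a backtrace, as was done by eye here, can also be done mechanically. The Python sketch below lists the shared objects named in gdb frames in order of first appearance; the function name `libraries_in_backtrace` and the regular expression are assumptions made for illustration, and the sample frames are copied from the backtrace posted above.

```python
import re

# Matches "from /path/libX.so.N" or "at /path/libX.so.N" in gdb frames,
# including frames where the library is wrapped onto the next line.
_LIB_RE = re.compile(r"\b(?:from|at)\s+(/\S+\.so(?:\.\d+)*)")


def libraries_in_backtrace(backtrace):
    """Return shared-library paths named in a gdb backtrace, first seen first."""
    seen = []
    for path in _LIB_RE.findall(backtrace):
        if path not in seen:
            seen.append(path)
    return seen


frames = """\
#1  0x00007ffff4792546 in abort () at /lib64/libc.so.6
#3  0x00007ffff52f4887 in  () at /usr/lib64/libmimalloc-debug.so.2
#5  0x00007ffff6164250 in lfDatabase::~lfDatabase() ()
    at /usr/lib64/liblensfun.so.2
#7  0x0000000000bee383 in rtengine::LFDatabase::~LFDatabase() ()
"""
print(libraries_in_backtrace(frames))
```

Frames that belong to the main executable, such as #7, name no library and are skipped.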

The good news is that lensfun-0.3.3 is in the Slackware development tree, so it will be in the next release. I’ll just have to wait until then to re-enable mimalloc in my build.

If it helps, I am OK with making mimalloc optional again, but enabled by default and with a clear warning that disabling it is not recommended.

Either way is fine. Just as long as it doesn’t become actually required by the code (other than in CMakeLists.txt), it’s not a problem to keep patching it this way.

We had a similar issue a while back when Arch updated to a dev-version of lensfun, see for reference: In Arch Linux I noticed a problem with lensfun after upgrade - #6 by gaaned92

I assume your best bet would be to reach out to the package maintainer in Slack and make them move back to the official version.

That’s not really an option, since lensfun is part of the main Slackware distribution, and at least in the stable releases things only get updated for security fixes. It is an option in the development branch, but, as I said, lensfun has already been updated there. I may still upgrade it on my own system, though.