Segfault with tcmalloc

I’ve had a strange problem for a while now: when I compile ART from git source on my computer (Manjaro Linux) against tcmalloc, I get a segmentation fault on start. But when the program is built as the AUR package (which basically does the same thing, and also builds against tcmalloc), it doesn’t segfault.

Here’s the error message from gdb running a debug build:

(gdb) run
Starting program: /home/sguyader/Photo-apps/art-master-debug/ART 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7ffff2460700 (LWP 138683)]
[New Thread 0x7ffff1c5f700 (LWP 138684)]
[New Thread 0x7ffff145e700 (LWP 138685)]

Thread 1 "ART" received signal SIGSEGV, Segmentation fault.
0x00007ffff58a876a in tc_newarray () from /usr/lib/libtcmalloc.so.4
(gdb) bt full
#0  0x00007ffff589870b in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned int, int) ()
    at /usr/lib/libtcmalloc.so.4
#1  0x00007ffff5898a10 in tcmalloc::ThreadCache::Scavenge() () at /usr/lib/libtcmalloc.so.4
#2  0x00005555560a1950 in cJSON_Delete (item=0x5555572ccac0) at /home/sguyader/Sources/art/rtengine/cJSON.c:233
        next = 0x5555572ccb00
#3  0x00005555560a18de in cJSON_Delete (item=0x5555572cc8c0) at /home/sguyader/Sources/art/rtengine/cJSON.c:223
        next = 0x0
#4  0x00005555560a18de in cJSON_Delete (item=0x5555572cc840) at /home/sguyader/Sources/art/rtengine/cJSON.c:223
        next = 0x5555572ccb40
#5  0x00005555560a18de in cJSON_Delete (item=0x5555571c8940) at /home/sguyader/Sources/art/rtengine/cJSON.c:223
        next = 0x0
#6  0x00005555560a18de in cJSON_Delete (item=0x5555571c8880) at /home/sguyader/Sources/art/rtengine/cJSON.c:223
        next = 0x0
#7  0x00005555563df10e in rtengine::CameraConstantsStore::parse_camera_constants_file(Glib::ustring)
    (this=0x55555685aac0 <rtengine::CameraConstantsStore::getInstance()::instance_>, filename_=...)
    at /home/sguyader/Sources/art/rtengine/camconst.cc:789
        filename = 0x555557083840 "/home/sguyader/Photo-apps/art-master-debug/dcraw.json"
        stream = 0x5555570b6000
        bufsize = 114688
        increment = 131072
        datasize = 67520
        ret = 0
        buf = 0x555557262000 ""
        jsroot = 0x5555571c8880
        js = 0x0
#8  0x00005555563df377 in rtengine::CameraConstantsStore::init(Glib::ustring, Glib::ustring)
    (this=0x55555685aac0 <rtengine::CameraConstantsStore::getInstance()::instance_>, baseDir=..., userSettingsDir=...)
    at /home/sguyader/Sources/art/rtengine/camconst.cc:833
        f = {static npos = 18446744073709551615, string_ = "/home/sguyader/Photo-apps/art-master-debug/dcraw.json"}
        i = 0
        builtin_files = 
          {0x55555657846c "dcraw.json", 0x555556578477 "rt.json", 0x55555657847f "camconst.json", 0x55555657848d "cammatrices.json"}
        userFile = {static npos = 18446744073709551615, string_ = "/home/sguyader/Photo-apps/art-master-debug/dcraw.json"}
#9  0x0000555556154d7e in rtengine::_ZN8rtengine4initEPKNS_8SettingsEN4Glib7ustringES4_b._omp_fn.0(void) ()
    at /home/sguyader/Sources/art/rtengine/init.cc:87
        s = 0x555556852fd0 <options+1264>
        baseDir = @0x7fffffffd880: {static npos = 18446744073709551615, string_ = "/home/sguyader/Photo-apps/art-master-debug"}
        userSettingsDir = @0x7fffffffd8a0: {static npos = 18446744073709551615, string_ = "/home/sguyader/.config/ART"}
        loadAll = true
#10 0x00007ffff5207df3 in GOMP_parallel_sections
    (fn=0x555556154c39 <rtengine::_ZN8rtengine4initEPKNS_8SettingsEN4Glib7ustringES4_b._omp_fn.0(void)>, data=0x7fffffffd740, num_threads=4, count=7, flags=0) at /build/gcc/src/gcc/libgomp/sections.c:235
        team = <optimized out>
#11 0x00005555561549f1 in rtengine::init(rtengine::Settings const*, Glib::ustring, Glib::ustring, bool)
    (s=0x555556852fd0 <options+1264>, baseDir=..., userSettingsDir=..., loadAll=true) at /home/sguyader/Sources/art/rtengine/init.cc:52
#12 0x0000555555ed2f90 in Options::load(bool, int) (lightweight=false, verbose=-1) at /home/sguyader/Sources/art/rtgui/options.cc:2467
        path = 0x0
        dPath = {static npos = 18446744073709551615, string_ = ""}
        defaultTranslation = {static npos = 18446744073709551615, string_ = "/home/sguyader/Photo-apps/art-master-debug/languages/default"}
        languageTranslation = {static npos = 18446744073709551615, string_ = "/home/sguyader/Photo-apps/art-master-debug/languages/English"}
        localeTranslation = 
          {static npos = 18446744073709551615, string_ = "/home/sguyader/Photo-apps/art-master-debug/languages/English (US)"}
#13 0x0000555555e85b93 in main(int, char**) (argc=1, argv=0x7fffffffdda8) at /home/sguyader/Sources/art/rtgui/main.cc:520
        exname = "/home/sguyader/Photo-apps/art-master-debug/ART", '\000' <repeats 465 times>
        exePath = {static npos = 18446744073709551615, string_ = "/home/sguyader/Photo-apps/art-master-debug"}
        fatalError = {static npos = 18446744073709551615, string_ = ""}
        ret = 32767

Any idea what’s going on?

No idea, sorry. Did you try building tcmalloc yourself and see if that helps? Also, what happens with a different allocator?

I haven’t tried compiling tcmalloc myself, but what puzzles me is that the AUR package uses the same tcmalloc library but doesn’t crash.
With tcmalloc disabled (so using the generic malloc, I suppose), it doesn’t crash.
I’ll try compiling tcmalloc anyway and report back here.

Using a freshly compiled tcmalloc doesn’t change the problem.
I also tried mimalloc; it crashes too.

Thanks, I’ll take a closer look when I can. If you have time/interest in helping further, you can try using git bisect to nail down the first commit when this occurs, and also compile with address sanitizer and see if it reports something (that’s what I’m going to do basically…)

Ok Alberto I’ll try that.

Before using git bisect, I wanted to check an old commit (ae4b70e) that I think used to work, but now it also crashes. Here’s the address sanitizer output:

[sguyader@sg-lenovo build]$ ./debug/ART      
=================================================================
==188559==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200007a278 at pc 0x7f874f24d1ca bp 0x7f8746318080 sp 0x7f8746317828
WRITE of size 9 at 0x60200007a278 thread T1
    #0 0x7f874f24d1c9 in __interceptor_memcpy /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:806
    #1 0x7f874d493074 in lfLens::AddMount(char const*) (/usr/lib/liblensfun.so.2+0x20074)
    #2 0x7f874d48a0c2  (/usr/lib/liblensfun.so.2+0x170c2)
    #3 0x7f874e260369 in g_markup_parse_context_parse (/usr/lib/libglib-2.0.so.0+0x55369)
    #4 0x7f874d48c102 in lfDatabase::Load(char const*, char const*, unsigned long) (/usr/lib/liblensfun.so.2+0x19102)
    #5 0x7f874d48ddb5 in lfDatabase::Load(char const*) (/usr/lib/liblensfun.so.2+0x1adb5)
    #6 0x7f874d48ddef in lfDatabase::Load(char const*) (/usr/lib/liblensfun.so.2+0x1adef)
    #7 0x7f874d48dbb1 in lfDatabase::Load() (/usr/lib/liblensfun.so.2+0x1abb1)
    #8 0x5591f7869782 in rtengine::LFDatabase::init(Glib::ustring const&) /home/sguyader/Sources/art/rtengine/rtlensfun.cc:334
    #9 0x5591f752479a in rtengine::init(rtengine::Settings const*, Glib::ustring, Glib::ustring, bool) [clone ._omp_fn.0] /home/sguyader/Sources/art/rtengine/init.cc:61
    #10 0x7f874c4a03ed in gomp_thread_start /build/gcc/src/gcc/libgomp/team.c:123
    #11 0x7f874c451421 in start_thread (/usr/lib/libpthread.so.0+0x9421)
    #12 0x7f874c380bf2 in __GI___clone (/usr/lib/libc.so.6+0xffbf2)

0x60200007a278 is located 0 bytes to the right of 8-byte region [0x60200007a270,0x60200007a278)
allocated by thread T1 here:
    #0 0x7f874f2c5459 in __interceptor_malloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x7f874d49305f in lfLens::AddMount(char const*) (/usr/lib/liblensfun.so.2+0x2005f)

Thread T1 created by T0 here:
    #0 0x7f874f26b1c7 in __interceptor_pthread_create /build/gcc/src/gcc/libsanitizer/asan/asan_interceptors.cpp:214
    #1 0x7f874c4a0a0b in gomp_team_start /build/gcc/src/gcc/libgomp/team.c:839
    #2 0x7f874c498ded in GOMP_parallel_sections /build/gcc/src/gcc/libgomp/sections.c:234

SUMMARY: AddressSanitizer: heap-buffer-overflow /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:806 in __interceptor_memcpy
Shadow bytes around the buggy address:
  0x0c04800073f0: fa fa fd fa fa fa fd fd fa fa fd fd fa fa fd fa
  0x0c0480007400: fa fa fd fa fa fa fd fa fa fa 04 fa fa fa 04 fa
  0x0c0480007410: fa fa 06 fa fa fa 06 fa fa fa 06 fa fa fa 06 fa
  0x0c0480007420: fa fa fd fa fa fa 06 fa fa fa fd fa fa fa fd fa
  0x0c0480007430: fa fa fd fa fa fa fd fa fa fa 00 fa fa fa 06 fa
=>0x0c0480007440: fa fa fd fa fa fa 02 fa fa fa 07 fa fa fa 00[fa]
  0x0c0480007450: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480007460: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480007470: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480007480: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480007490: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==188559==ABORTING
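
For readers unfamiliar with this kind of report: "WRITE of size 9" landing "0 bytes to the right of 8-byte region" means 9 bytes were copied into a heap block that only had room for 8. One common way to get exactly that shape is to size a string buffer without room for the terminating NUL. The sketch below only illustrates that general pattern; it is not lensfun's code, and the names `bytes_needed` and `copy_name` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Bytes required to store s on the heap, including the terminating NUL. */
size_t bytes_needed(const char *s) {
    assert(s != NULL);
    return strlen(s) + 1;
}

/* Hypothetical helper: returns a heap copy of s, or NULL if malloc fails.
   The buggy variant of this pattern allocates strlen(s) bytes but still
   copies strlen(s) + 1, which writes one byte past the end of the block. */
char *copy_name(const char *s) {
    size_t n = bytes_needed(s);
    char *p = malloc(n);      /* allocate room for the NUL as well */
    if (p == NULL) {
        return NULL;
    }
    memcpy(p, s, n);          /* copies exactly n bytes into an n-byte block */
    return p;
}
```

With an 8-character string, the correct allocation is 9 bytes; an 8-byte allocation followed by the same 9-byte copy would produce the kind of overflow ASan flags here.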

Hi,
Did you upgrade lensfun recently? Or are you using a different version than the aur build?

I saw these messages regarding lensfun. I have only 1 lensfun installed, so the AUR build should use the same as far as I know.
Anyways I just installed lensfun-git from AUR… and now it seems to work (I only tried the older commit from May, I’ll try the latest now).

Edit: the version of the lensfun binary installed on Manjaro was 0.3.95-2, and building from the AUR provides the same 0.3.95 version…

Just a wild guess: there might have been some link-time incompatibility that caused lensfun to use the “wrong” malloc/free. If you mix different allocators, then bad things happen… but as I said, this is just a wild guess.
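
To make the "mixed allocators" idea concrete, here is a toy sketch in C. It assumes two allocators that each stamp a tag in a small header in front of every block; `TAG_A`, `TAG_B`, `tagged_alloc` and `tagged_free` are made up for illustration and are not part of tcmalloc, mimalloc or glibc. Real allocators generally do not detect a foreign pointer like this; they interpret the bytes in front of the block as their own bookkeeping, which is how the heap ends up corrupted.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tags identifying which toy allocator owns a block. */
enum { TAG_A = 0xA110CA, TAG_B = 0xA110CB };

/* Header placed in front of every block; the union keeps the user
   pointer suitably aligned for any type. */
typedef union {
    uint32_t tag;
    max_align_t align;
} header_t;

/* Allocate n bytes on behalf of the allocator identified by tag. */
void *tagged_alloc(uint32_t tag, size_t n) {
    header_t *h = malloc(sizeof *h + n);
    if (h == NULL) {
        return NULL;
    }
    h->tag = tag;
    return h + 1;             /* user memory starts right after the header */
}

/* Free p through the allocator identified by tag.
   Returns 0 on success, -1 if the block belongs to another allocator
   (in which case nothing is freed). */
int tagged_free(uint32_t tag, void *p) {
    header_t *h = (header_t *)p - 1;
    if (h->tag != tag) {
        return -1;            /* mismatch: a real allocator would not notice */
    }
    free(h);
    return 0;
}
```

A block obtained with `tagged_alloc(TAG_A, ...)` is rejected by `tagged_free(TAG_B, ...)` and accepted by `tagged_free(TAG_A, ...)`. The point is only that a block must go back to the allocator that produced it.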

I still don’t understand why compiling from the AUR worked, using the same lensfun binary…
Anyways, now my problem is solved. Thanks for the help.

That’s a funny thing… I disabled tcmalloc for the AUR package some time ago as I experienced segfaults as well. I tried again after a few days and the segfaults were gone… I didn’t check any further. So far the segfaults haven’t come back…