ART crashes on perspective correction

Hi, I have been using RawTherapee for several years and recently came across ART. I compile it myself on Xubuntu from git. I have noticed it will crash with regularity on Nikon NEF files when I apply perspective correction. I was wondering if this forum is the appropriate place to provide the details such as the gdb backtrace and a sample NEF file or is there another preferred way to do that? Thanks.

hi, thanks for reporting. bitbucket is the preferred place, but here is also fine. can you share a raw please? thanks!

I created myself a login on bitbucket but it doesn’t allow me to open an issue there. So I will here :slight_smile:

Xubuntu 18.04, fully patched. AMD Ryzen 2700X. 32 GB RAM.

Its pretty easy to replicate the crashing. Happens with most any of my Nikon D600 NEF files as soon as perspective correction is enabled. Note that if there is an existing .arp it seems to not crash and correction works. So with the attached DSC_9344.NEF just delete any existing DSC_9344.NEF.arp, start ART, open DSC_9344.NEF and enable perspective correction and it should crash immediately. Here is AboutThisBuild.txt:

Version: 1.1-14-g753b6b3b9
Branch: master
Commit: 753b6b3b9
Commit date: 2020-02-19
Compiler: cc 7.4.0
Processor: x86_64
System: Linux
Bit depth: 64 bits
Gtkmm: V3.22.2
Lensfun: V0.3.2.0
Exiv2: V0.25
Build type: debug
Build flags: -std=c++11 -march=native -Werror=unused-label -Wall -Wuninitialized -Wno-deprecated-declarations -Wno-unused-result -fopenmp -Werror=unknown-pragmas -g -ftree-vectorize
Link flags: -march=native
OpenMP support: ON
MMAP support: ON
Build OS: Linux
Build date: 2020-02-20T01:54:19Z

This is 1.1-14 but its been crashing on 1.0* also.

Uploading: DSC_9344.NEF…

Hummm, does the upload work? Let me try again…
log.txt (90.6 KB)
DSC_9344.NEF (29.6 MB)

thanks, I’ll take a look

@jtompkins66 I can’t reproduce the crash here, but the backtrace seems to indicate the problem seems related to parallelization. I’m not really familiar with openmp, so I’ll have to guess a little bit… (maybe @heckflosse can help?). You can try the following two things:

  1. try applying the following patch:
diff --git a/rtengine/iplab2rgb.cc b/rtengine/iplab2rgb.cc
--- a/rtengine/iplab2rgb.cc
+++ b/rtengine/iplab2rgb.cc
@@ -90,7 +90,7 @@
 
         // cmsDoTransform is relatively expensive
 #ifdef _OPENMP
-        #pragma omp parallel firstprivate(img, data, W, H)
+#       pragma omp parallel //firstprivate(img, data, W, H)
 #endif
         {
             AlignedBuffer<float> pBuf(3 * W);
  1. if that doesn’t work, try rebuilding without openmp and see if that helps – at least that will give us some clue.

Thanks!

Forgive my slow response, I am in the United States and so 6 or 7 hours behind you.

I first disabled OPENMP in the cmake by setting “-DOPTION_OMP=OFF”, it runs a lot slower but still crashes as soon as perspective correction is enabled. Then next while still disabling OPENMP I made the source code change you suggested, it still crashes. Also on both attempts I set the compile to generic rather than native code.

Here is the AboutThisBuild.txt:

Version: 1.1-21-g3ff525dec
Branch: master
Commit: 3ff525dec
Commit date: 2020-02-21
Compiler: cc 7.4.0
Processor: generic x86
System: Linux
Bit depth: 64 bits
Gtkmm: V3.22.2
Lensfun: V0.3.2.0
Exiv2: V0.25
Build type: debug
Build flags: -std=c++11 -mtune=generic -Werror=unused-label -Wall -Wuninitialized -Wno-deprecated-declarations -Wno-unused-result -g -ftree-vectorize
Link flags: -mtune=generic
OpenMP support: OFF
MMAP support: ON
Build OS: Linux
Build date: 2020-02-21T23:01:00Z

The attached log1.txt is the first crash, OPENMP off but no source code change, then log2.txt adds the source code change.

log1.txt (27.6 KB) log2.txt (27.0 KB)

I would not pretend to know a lot about C++ or the details of this software, but I have programmed in plain C. According to gdb in copyAndClamp H is 1340 and W is 2009, these are the loop limits. When it crashes i, the H loop counter, is 1220. If I go up a level to the caller, lab2monitorRgb I see that the dst parameter to copyAndClamp is image->data. If I examine the contents of image (the destination) I see that its width is 2009 (same as W) and its height is 1220 (smaller than H). That height, 1220, is smaller than the H value in copyAndClamp (1340) and that is also the value of i in the loop when it crashes (i.e. when its starts accessing invalid memory addresses). Perhaps the loop limits are wrong for the height loop in copyAndClamp?

can you also send me the arp sidecar file? thanks!

Thanks for the new logs, they are much clearer now without the openmp clutter. Thanks also for your analysis, you are definitely correct, there was a buffer size mismatch problem. I think this patch should fix it, can you please test and confirm?

diff --git a/rtengine/iplab2rgb.cc b/rtengine/iplab2rgb.cc
--- a/rtengine/iplab2rgb.cc
+++ b/rtengine/iplab2rgb.cc
@@ -81,6 +81,7 @@
 void ImProcFunctions::lab2monitorRgb(Imagefloat *img, Image8* image)
 {
     img->setMode(Imagefloat::Mode::LAB, multiThread);
+    image->allocate(img->getWidth(), img->getHeight());
     
     if (monitorTransform) {

I applied your patch and it works much better! Congratulations! I tried perspective correction on a number of files and it is much better. However I have found one file it crashes on. Let me investigate that one more and report back. Still, this is a great improvement. Thanks!

This new type crash is a lot more difficult to reproduce. But so far I can reproduce it eventually. I don’t know the exact details but so far I have been able to trigger it 5 or 6 times. Basically I have to open roughly 10 files applying perspective correction to each. Eventually one of them will crash. So far its always crashed on an image in portrait mode. It always crashes like this:

#0 0x00007ffff230b7e0 in __memcpy_ssse3 () at …/sysdeps/x86_64/multiarch/memcpy-ssse3.S:127
#1 0x0000555556013689 in rtengine::Crop::update(int) (this=0x5555844dd000, todo=4095) at /usr/local/arc/art/dev/rtengine/dcrop.cc:374

It crashes in the ssse3 version of memcpy called in dcrop.cc.

Attached is the gdb log.txt. I did not include any NEF files because it take about 10 or so and the one it crashes on changes, although again so far its been on a file in portrait mode. For the same reason I did not include any .arp files. See thread 1782.

log.txt (208.4 KB)

can you try this patch?

diff --git a/rtengine/dcrop.cc b/rtengine/dcrop.cc
--- a/rtengine/dcrop.cc
+++ b/rtengine/dcrop.cc
@@ -369,9 +369,12 @@
         Image8* final = new Image8(finalW, finalH);
         Image8* finaltrue = new Image8(finalW, finalH);
 
+        const int cW = cropImg->getWidth();
+        const int ctW = cropImgtrue->getWidth();
+
         for (int i = 0; i < finalH; i++) {
-            memcpy(final->data + 3 * i * finalW, cropImg->data + 3 * (i + upperBorder)*cropw + 3 * leftBorder, 3 * finalW);
-            memcpy(finaltrue->data + 3 * i * finalW, cropImgtrue->data + 3 * (i + upperBorder)*cropw + 3 * leftBorder, 3 * finalW);
+            memcpy(final->data + 3 * i * finalW, cropImg->data + 3 * (i + upperBorder)*cW + 3 * leftBorder, 3 * finalW);
+            memcpy(finaltrue->data + 3 * i * finalW, cropImgtrue->data + 3 * (i + upperBorder)*ctW + 3 * leftBorder, 3 * finalW);
         }
 
         cropImageListener->setDetailedCrop(final, finaltrue, params.icm, params.crop, rqcropx, rqcropy, rqcropw, rqcroph, skip);

I tried that, sadly it did not help. I have been testing more, it definitely seems related to images in portrait mode where they are cropped. I am able to trigger the crash with only two such images, that is it crashes on the second reliably. The crop happens to be in landscape mode even though they are portrait images. In gdb when it crashes I did an “up” to get into the Crop::update from the memcpy. From there this is what I have concerning local variables:

(gdb) up
#1 0x00005555560136da in rtengine::Crop::update (this=0x55555c8f26a0, todo=4095) at /usr/local/arc/art/dev/rtengine/dcrop.cc:383
383 memcpy(finaltrue->data + 3 * i * finalW, cropImgtrue->data + 3 * (i + upperBorder)*ctW + 3 * leftBorder, 3 * finalW);
(gdb) p finalW
$1 = 1340
(gdb) p finalH
$2 = 1977
(gdb) p i
$3 = 1692
(gdb) p upperBorder
$4 = 32
(gdb) p leftBorder
$5 = 0
(gdb) p cropw
$6 = 1340
(gdb) p cW
$7 = 1340
(gdb) p ctW
$8 = 1340

Note that the cW and ctW (which you added) are the same as the cropw used previously.

In gdb I also examined finaltrue and its width is 1340 with a height of 1977. But, examining cropImgtrue its width is 1340 while its height is only 1220, that is less then the height of finalH which is the loop limit. The loop counter “i” at the crash is 1692 which is higher than 1220 the height of cropImgtrue which is the source data so its accessing memory outside the source.

thanks for the detailed analysis – this is certainly helpful. I’ll dig into this more later

hi again, can you try this (on top of the current master)? Thanks!

diff --git a/rtengine/dcrop.cc b/rtengine/dcrop.cc
--- a/rtengine/dcrop.cc
+++ b/rtengine/dcrop.cc
@@ -352,7 +352,7 @@
         // Computing the internal image for analysis, i.e. conversion from lab->Output profile (rtSettings.HistogramWorking disabled) or lab->WCS (rtSettings.HistogramWorking enabled)
 
         // internal image in output color space for analysis
-        Image8 *cropImgtrue = parent->ipf.lab2rgb(bufs_[2], 0, 0, cropw, croph, params.icm);
+        Image8 *cropImgtrue = parent->ipf.lab2rgb(bufs_[2], 0, 0, cropImg->getWidth(), cropImg->getHeight(), params.icm);
 
         int finalW = rqcropw;
 
@@ -370,11 +370,10 @@
         Image8* finaltrue = new Image8(finalW, finalH);
 
         const int cW = cropImg->getWidth();
-        const int ctW = cropImgtrue->getWidth();
 
         for (int i = 0; i < finalH; i++) {
             memcpy(final->data + 3 * i * finalW, cropImg->data + 3 * (i + upperBorder)*cW + 3 * leftBorder, 3 * finalW);
-            memcpy(finaltrue->data + 3 * i * finalW, cropImgtrue->data + 3 * (i + upperBorder)*ctW + 3 * leftBorder, 3 * finalW);
+            memcpy(finaltrue->data + 3 * i * finalW, cropImgtrue->data + 3 * (i + upperBorder)*cW + 3 * leftBorder, 3 * finalW);
         }
 
         cropImageListener->setDetailedCrop(final, finaltrue, params.icm, params.crop, rqcropx, rqcropy, rqcropw, rqcroph, skip);

Version: 1.1-25-g0bfedcadd
Applied the above patch to the above newest version. It works! I am unable to get it to crash, congratulations! Now get some sleep, you must be up late. :slight_smile:

1 Like