Help for testing latest PhotoFlow version (particularly on Windows)

Carmelo_DrRaw · January 26, 2017, 12:33pm

Ok, there are clearly some bad coordinates… do you still have the full log? I would need to see the messages before the “point @” lines. Those lines, by the way, seem to be too many, and repeating…

heckflosse · January 26, 2017, 12:42pm

Here’s a full log
log.7z (65.2 KB)

Carmelo_DrRaw · January 26, 2017, 12:57pm

Is this the log that corresponds to the crash? I’m asking because I cannot find the lines like

point @ x0=71170300  y0=71170313

in your full log…

Thanks!

heckflosse · January 26, 2017, 1:02pm

No, I had to make a new one. But it corresponds to a crash.

Carmelo_DrRaw · January 26, 2017, 3:43pm

I am starting to suspect a problem related to standard C++ containers (std::list and/or std::vector). All crashes seem to happen when accessing container elements… I need to check whether I am doing something wrong there.

Silvio_Grosso · January 26, 2017, 6:54pm

Hello,

I have downloaded today’s version : Unzipped on Windows 7 (64 bit).
As usual it is extremely easy to crash PhotoFlow with the clone stamp tool : tried on several images (jpeg - Nef from Nikon D700)

I have created the usual backtrace through gdb [1] and with the messages printed on prompt console [2]

I have further investigated the crash which easily occurs while opening different big D810 Nef images (around 40-50 Mb each). PhotoFlow always crashes when you open a second, different Nef image.
Tried on 3 different computers and different Windows versions (7 - 8.1 - 10).
Here you can take a look at the messages I got tonight from the prompt [3].
As soon as I tried to open a new, different D810 Nef image, PhotoFlow crashed suddenly (the first Nef image always worked great instead).
For this crash I tried to get a full backtrace too, but it was never “produced”.
Here you can take a look at the gdb prompt console:

THANKS a lot indeed !

[1] https://dl.dropboxusercontent.com/u/3095134/PC_TRUCCHI/BUG_REPORTS/2016-01-26-logs/Backtrace_stamp_clone_tool_crash.txt
[2] https://dl.dropboxusercontent.com/u/3095134/PC_TRUCCHI/BUG_REPORTS/2016-01-26-logs/Prompt_messaga_crash_clone_tool.txt
[3] https://dl.dropboxusercontent.com/u/3095134/PC_TRUCCHI/BUG_REPORTS/2016-01-26-logs/prompt_message_nef_d810_crash.txt

floessie · January 26, 2017, 7:53pm

Hm, you are working on refs to refs to refs (std::list< std::pair<int, int> >& ← PF::Stroke<PF::Stamp>& ← PF::StrokesGroup& ← …) which is okay and efficient, but are you sure that no other thread is modifying the base instance? I must admit, I only had a quick look at the link to GitHub you provided…

Carmelo_DrRaw · January 26, 2017, 8:02pm

@floessie I just came more or less to the same conclusion… I am messing up with threads, and I need to protect the modifications to the container objects with mutexes.

Hopefully it is as simple as that

Thanks for looking!

heckflosse · January 26, 2017, 8:22pm

That’s the first step to getting completely confused. Maybe a different threading model would be clearer…

Carmelo_DrRaw · January 26, 2017, 8:31pm

Well, in fact it is a bit more complicated… all tools in PhF process the image data chunk-wise in several parallel threads, and each thread has its own private copy of the parameters, therefore there is no need for mutexes. Also, tool parameters are usually atomic (integer or float values, or fixed-size arrays at worst), with two notable exceptions: the draw tool and the clone/stamp tool, which require complex sort-of-lists-of-vectors-of-objects-with-lists-of-pairs-… in other words, quite complex structures with dynamic sizes.

So those two tools need some “special” treatment.

By the way, a different threading model is out of the question for the moment. The parallel processing is handled very well by the underlying VIPS library, and I have no good reason to change this part.

floessie · January 27, 2017, 8:15am

Maybe, instead of serializing the whole processing on a mutex for these two tools, it could be beneficial to work on copies of those sort-of-lists-of-vectors-of-objects-with-lists-of-pairs-… and merge them under a mutex later, if that’s applicable(?).

Carmelo_DrRaw · January 27, 2017, 8:27am

Maybe we are thinking about the same solution… my idea is to make a local copy of the sort-of-lists-of-vectors-of-objects-with-lists-of-pairs-… structure at the beginning of the processing of each chunk, and only protect this copying phase with a mutex, so that the mutex can be released as soon as possible.
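The copy-then-release idea above could be sketched roughly as follows. This is only a minimal illustration, not PhotoFlow's actual code: `Stroke`, `SharedStrokes`, `add_point`, `snapshot` and `count_points` are hypothetical names, and the real structures are more deeply nested.

```cpp
#include <cstddef>
#include <list>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the clone/stamp parameters (not PhotoFlow's real types).
using Point = std::pair<int, int>;

struct Stroke {
    std::list<Point> points;
};

class SharedStrokes {
public:
    // Writer side (e.g. the GUI thread while the user paints): modifies under the lock.
    void add_point(std::size_t stroke_index, Point p) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (stroke_index >= strokes_.size()) strokes_.resize(stroke_index + 1);
        strokes_[stroke_index].points.push_back(p);
    }

    // Reader side: called once at the start of each chunk. The lock is held
    // only while the deep copy is made and is released on return.
    std::vector<Stroke> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return strokes_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<Stroke> strokes_;
};

// Chunk processing works on the private snapshot only, so it needs no lock.
std::size_t count_points(const std::vector<Stroke>& strokes) {
    std::size_t n = 0;
    for (const Stroke& s : strokes) n += s.points.size();
    return n;
}
```

A worker would call `snapshot()` once per chunk and then iterate over its own copy; points added afterwards only show up in later snapshots. The trade-off is the cost of the copy, which grows with the number of strokes.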

assaft · January 27, 2017, 11:13am

I’m trying to understand the arch - are you saying that the image is divided into chunks (E.g. for simplicity, 4 quarters) and then a thread is created per chunk to process each chunk’s data throughout the sequence of layers, and at the end the chunks are merged?

This sounds like a good threading model to me. Obviously, for layers that require a global view on the image (e.g. the stamp-clone), the chunks need to be merged before processing that layer by a single thread, and then there will be a split to chunks with multiple threads carrying on the processing as before until the end.

Usually it is recommended to set the number of processing threads based on the number of cores in the machine. For example, if it has dual-CPU, dual-core hardware, then 4 processing threads and a division into 4 chunks.
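In standard C++ the core count can be queried as in the small sketch below. `worker_count` is a hypothetical helper name, and this says nothing about how PhotoFlow or VIPS actually size their thread pools.

```cpp
#include <thread>

// Hypothetical helper: choose a worker count from the detected core count.
// std::thread::hardware_concurrency() may return 0 when the value cannot be
// determined, so fall back to a single worker in that case.
unsigned worker_count(unsigned detected = std::thread::hardware_concurrency()) {
    return detected == 0 ? 1u : detected;
}
```

On the dual-CPU, dual-core machine from the example, the detected value would typically be 4.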

About STL, the docs of SGI STL say that simultaneous read accesses to shared containers are safe, but it could be that the Windows implementation doesn’t follow suit, for example if some structures use smart pointers without protecting the update of the reference counter. In that case, if you have 2 threads reading from the same structure at the same time, then you’ll have a race condition on the reference counter. If the memory footprint of these structures isn’t big, then duplicating them is indeed a simple solution. Alternatively, you can look at STLport (STLport: SGI STL Overview), which is thread-safe and provides an identical implementation for both Unix-based and Windows versions, so there may be less chance of platform variability.

Carmelo_DrRaw · January 27, 2017, 2:20pm

Yes, this is basically the philosophy which is followed.

No, this is exactly the kind of thing which is avoided in the code. PhF never stores the whole image into memory, only the pieces that are being processed. For the clone/stamp tool, each output chunk is analysed in terms of which strokes do actually modify it, and for each of those strokes the corresponding input areas are computed and properly copied to the output region.
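To illustrate the per-chunk analysis described above, here is a rough sketch that, for one output chunk, finds the strokes overlapping it and computes the input areas to read. All names (`Rect`, `StampStroke`, `intersect`, `source_regions`) are hypothetical, and strokes are simplified to a bounding box plus a fixed source offset, which is an assumption and not PhotoFlow's actual representation.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical types for illustration only.
struct Rect {
    int left, top, width, height;
};

struct StampStroke {
    Rect dest;   // bounding box of the painted area, in output coordinates
    int dx, dy;  // offset from the destination to the cloned source area
};

// Computes the overlap of a and b; returns false if they do not overlap.
bool intersect(const Rect& a, const Rect& b, Rect& out) {
    const int l = std::max(a.left, b.left);
    const int t = std::max(a.top, b.top);
    const int r = std::min(a.left + a.width, b.left + b.width);
    const int bt = std::min(a.top + a.height, b.top + b.height);
    if (r <= l || bt <= t) return false;
    out = Rect{l, t, r - l, bt - t};
    return true;
}

// For one output chunk, list the input regions that need to be read:
// one region per stroke that actually touches the chunk.
std::vector<Rect> source_regions(const Rect& chunk,
                                 const std::vector<StampStroke>& strokes) {
    std::vector<Rect> result;
    for (const StampStroke& s : strokes) {
        Rect hit;
        if (intersect(chunk, s.dest, hit)) {
            result.push_back(Rect{hit.left + s.dx, hit.top + s.dy,
                                  hit.width, hit.height});
        }
    }
    return result;
}
```

Only the listed regions have to be fetched for that chunk, which is why the whole image never needs to be held in memory at once.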

As such, PhF is, at least in principle, completely scalable in terms of the size of the input data being processed. One can process a 10000x10000 pixels image consuming almost as much memory as what is needed for a 1000x1000 pixels image…

This is exactly what the underlying VIPS library does.

I think my code is doing much worse: simultaneous read/write accesses to shared containers, which I’m pretty sure should be avoided without exceptions… I’ll circulate a new thread-safe version in a short while, hopefully this issue will be finally fixed!

Carmelo_DrRaw · January 27, 2017, 10:46pm

@assaft @Silvio_Grosso @heckflosse I have a new Windows version for testing, with mutex-protected structures in the clone/stamp tool: Filebin | abke2gg0y19tsbxw

Could you check if this solves the crashes, whenever you have the opportunity to do so?

A BIG THANKS for the help!!!

Silvio_Grosso · January 28, 2017, 7:32am

Hello Carmelo_DrRaW,

Just downloaded the new version: 20170127

Now PhotoFlow is much more stable with the Stamp Clone tool !

Unfortunately, it still crashes, but much later than before. As a consequence, the console messages produce a huge log text file before the crash this time: 30 Mb [1]

Another problem is strictly related to my personal hardware setup:
Windows 7 - 64 bit
CPU : Intel I7
RAM : 8 gb
GPU : Nvidia graphic card
In short, with my Raw Nef images (Nikon D700 - 8-9 Mb each) there is always a big lag between the click to apply the stamp clone and the final result on the image. But I suppose I should buy a much more powerful computer to avoid this lag with big Raw Nef images.

EDIT:
Tested on Windows 10 - 64 bit as well (to confirm the problem on a different computer)
CPU : Intel I7
RAM: 8 gb
GPU: Nvidia graphic card

On Windows 10 today’s version is indeed much more stable!
It still crashes, though, but only if you really keep going with the stamp clone tool to make it do so…
I am not linking the log txt that I got from the prompt messages because it is huge (210 Mb; with 7000000 rows…)

BTW, on both Windows 7 and Windows 10 I have noticed that sometimes you lose some clicks when you clone the image. In short, the stamp clone action is not applied to the image.
It occurs at random, but it is annoying because you are forced to repeat your action. At first I supposed it was a lag due to my big Nef images (8 - 9 Mb), but now I have realized that the clicks really are lost and you have to repeat them.

Thanks a lot for your work.
Aside from this particular crash PhotoFlow is indeed amazing !

[1] https://dl.dropboxusercontent.com/u/3095134/PC_TRUCCHI/BUG_REPORTS/2016_01_28_logs/log_stamp.txt

Carmelo_DrRaw · January 28, 2017, 11:38am

@Silvio_Grosso I have prepared a new version that is again optimized and without all the verbose output from the clone/stamp tool: Filebin | fs3gyu6ql2ixdz2l

If it works, I will add it to the continuous github release.

Concerning the lag at the beginning of the strokes, I need to investigate it and see if it can be avoided.

Silvio_Grosso · January 28, 2017, 12:10pm

Hello,

Just tested today’s build on:
Windows 7 - 64 bit
CPU : Intel I7
RAM: 8 gb
GPU: Nvidia graphic card
The Raw Nef D700 image tested is around 9 Mb big

As usual, the crash occurs only if you really want to trigger it, because PhotoFlow is now much more robust in this regard.

Here are the txt log [1] and the video [2] I recorded of my steps (they are completely random on the image, just to force PhotoFlow to crash…)

I suppose the lag in applying the clone action to the image is due to my hardware, which is simply not powerful enough. With smaller jpeg images I would opt for Gimp…

[1] https://dl.dropboxusercontent.com/u/3095134/PC_TRUCCHI/BUG_REPORTS/2016_01_28_logs/crash_clone_stamp_windows_7.txt
[2] https://dl.dropboxusercontent.com/u/3095134/PC_TRUCCHI/BUG_REPORTS/2016_01_28_logs/PHOTOFLOW_20170128_STAMP_CLONE_CRASH.avi

Silvio_Grosso · January 28, 2017, 6:01pm

Hello,

Another, worse problem related to PhotoFlow (today’s version 20170128):
It is always unable to restore any image which has previously crashed due to the clone stamp tool.
In essence, none of my images can be restored with the actions made with the clone tool before it crashed PhotoFlow.

To reproduce this (Windows 7 - 64 bit):

1. Open a Raw Nef image (D700) and work with the clone stamp tool until PhotoFlow crashes;
2. Restart PhotoFlow and open the same image;
3. Click Ok to restore the same image;
4. Do something with other different tools (e.g. > basic editing): sooner or later PhotoFlow always crashes.

To avoid this crash and finally save your image you must delete all changes done with the clone stamp tool (by simply unchecking the stamp tool option in the layer stack).

Here is the log text file [1] and the ancillary video with all my steps [2].

[1] https://dl.dropboxusercontent.com/u/3095134/PC_TRUCCHI/BUG_REPORTS/2016_01_28_logs/crash_recovery_failing_log.txt
[2] https://dl.dropboxusercontent.com/u/3095134/PC_TRUCCHI/BUG_REPORTS/2016_01_28_logs/PHOTOFLOW_20170128_CLONE_STAMP_RESTORING_FAILING.avi

assaft · January 28, 2017, 11:58pm

OK, and how does the caching work? Do you cache the output of each chunk of each layer?

Regarding the number of threads being used - last time I tried to run with gdb there were still hundreds of messages about thread creation and termination.

About the latest Windows build, I’ll try to find time to check it soon.