Starting with G’MIC’s default settings for Details/Dcp dehaze, I just changed Output to New layer(s) and Gamma to 35.00 and clicked OK. The result was quite interesting – but my computer needed 51 seconds to carry out the command.
Is my computer really that slow, or do you think I have missed something in the setup of GIMP or G’MIC?
On my Computer:
Windows 7 - 64 bit
CPU: I7 2630Qm
GPU: Nvidia 540M
RAM: 8 GB
With Gamma as 35.00 it takes 45 seconds on my image.
Judging by Task Manager (taskmgr), this specific filter appears to be multithreaded: it uses all the cores of my i7 CPU while running.
As far as I am aware, G’MIC filters are unable to use the GPU (to perform faster).
Thank you, @Jacal
Hm… I wonder if we dare to ask @David_Tschumperle about his execution time?
Presumably, he is running something super, high-fidelity de-luxe…?
Like @paperdigits, I also use an AMD FX-8350 8 cores @ 4 GHz. I don’t know anything about the implementation of Dcp dehaze, but I optimized the RT retinex stuff some time ago. 39 seconds for an 8 MP file sounds like a basic implementation without optimizations. RT retinex is at about 2 to 4 seconds (depending on settings) for a 36 MP file on the same CPU. Though I still don’t know whether I am comparing apples to pears here…
knowing that even small optimizations sum up if you can make a lot of them
but the main thing is, if I’m stuck somewhere in the optimization process, I go to bed and the next morning I know how to solve it or I know that I can’t optimize it any further (maybe that’s a kind of wizardry)
Edit: To help in understanding the process of optimization I wrote some lines here in gamcurve_opt.txt. It doesn’t cover all the points I mentioned above (only 1., 2., 4. and 6.)
I’ll be able to say how my computer performs tomorrow, at the lab
Also, some reminders:
G’MIC filters are developed using the G’MIC script language, which is not compiled but interpreted. This is of course a good reason why G’MIC execution time is almost always slower when you compare it with an equivalent algorithm that has been compiled in C/C++, for instance. On the other hand, that’s also why the 450+ filters in libgmic take less than 6 MB, and why we have so many filters (they are usually easier/shorter to develop).
The dcp dehaze filter in G’MIC is not the same as the Retinex filter. I can’t say precisely what algorithm is used behind it (as I said earlier, I didn’t develop this particular filter), but its algorithmic complexity is maybe of a higher order than Retinex, so comparing the two algorithms is probably unfair too. As far as I know, Retinex seems to be a quite basic algorithm in terms of complexity.
The time consuming part most likely stems from the loop containing “-median 3”. Whether the transmission map can be obtained without it is perhaps the problem…
If I read the G’MIC/CImg code correctly, the -median 3 applies a 3x3 median (median of 9 values) to the image.
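To make "median of 9 values" concrete, here is a minimal Python sketch of a 3x3 median on one channel plane. It is not the CImg implementation; the function name median3x3 and the border handling (coordinates clamped to the image) are assumptions of this sketch.

```python
from statistics import median

def median3x3(plane):
    """Return a new 2D list where each value is the median of its 3x3 neighbourhood.

    Assumption of this sketch: coordinates are clamped to the image,
    so pixels on the border are repeated.
    """
    h, w = len(plane), len(plane[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                plane[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            # 9 values, so the median is the 5th smallest one.
            out[y][x] = median(window)
    return out

# A single outlier surrounded by a flat area.
noisy = [
    [10, 10, 10, 10],
    [10, 99, 10, 10],
    [10, 10, 10, 10],
]
print(median3x3(noisy))
```

Every output pixel needs 9 reads and a small selection step, which is why the per-pixel selection is usually the part worth optimizing.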
The parallelization is at channel level, which means it will not use more than c cores, where c is the number of channels in the image.
I don’t know how the channel data is arranged in the CImg class, but in case it is arranged like e.g. RGBRGBRGB… the parallelization will mostly become ineffective because of cache conflicts when writing to res. In the worst case it can happen that the parallelized version takes a lot more time than a single-threaded version because of these cache conflicts.
But even when using only one core, the CPU would read and write c times the amount of memory that is necessary.
If the channel data is arranged as separate planes, the first point (no more than c cores are used) is still valid.
In RT we reviewed our median code some time ago and the fastest we could get for median of 9 values was this. Using this we could reduce the time to run median9 on a float 36 MP file with 3 separate channels to about 210 ms (70 ms per channel), measured on the above mentioned AMD FX-8350 8 cores @ 4 GHz.
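For readers wondering how a median of 9 can avoid a full sort: one well-known approach is a fixed sequence of 19 compare-exchange steps. The Python sketch below shows that classic network for illustration only. It is not claimed to be the RT code referred to above, and the names median9 and _EXCHANGES are made up for this example.

```python
# Compare-exchange pairs (i, j): after each step p[i] <= p[j].
# Rows 1-3 sort the three triples (0,1,2), (3,4,5), (6,7,8); the rest picks
# the median of (largest minimum, median of medians, smallest maximum).
_EXCHANGES = [
    (1, 2), (4, 5), (7, 8),
    (0, 1), (3, 4), (6, 7),
    (1, 2), (4, 5), (7, 8),
    (0, 3), (5, 8), (4, 7),
    (3, 6), (1, 4), (2, 5),
    (4, 7), (4, 2), (6, 4),
    (4, 2),
]

def median9(values):
    """Return the median of exactly 9 values using 19 compare-exchange steps."""
    p = list(values)
    if len(p) != 9:
        raise ValueError("median9 expects exactly 9 values")
    for i, j in _EXCHANGES:
        if p[i] > p[j]:
            p[i], p[j] = p[j], p[i]
    return p[4]

window = [12, 7, 3, 99, 5, 8, 41, 1, 6]
print(median9(window), sorted(window)[4])
```

Because the sequence of comparisons is fixed and branch-light, a compiled version of such a network can be written with min/max operations only, which is typically what makes it fast.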
Right. Jérome could then use the command '-apply_parallel_overlap', which can split the median filter across N image blocks (where N is equal or close to the number of threads). Anyway, I’ve tested it quickly on a decent image (res 3000x2135) on my 4-core machine, and it appears it becomes a bit slower for the 3x3 median. This could probably be interesting for even larger images, but probably without a big difference:
Without the spatial splitting of the image:
$ gmic image.jpg -tic -median 3 -toc -q
[gmic]-0./ Start G'MIC interpreter.
[gmic]-0./ Input file 'image.jpg' at position 0 (1 image 3000x2135x1x3).
[gmic]-1./ Initialize timer.
[gmic]-1./ Apply median filter of size 3, on image [0].
[gmic]-1./toc/ Set status to '0.332'.
[gmic]-1./ Elapsed time: 0.332 s.
[gmic]-1./ Quit G'MIC interpreter.
With the spatial splitting (here, my machine has 4 cores):
$ gmic image.jpg -tic -apply_parallel_overlap \"-median 3\" -toc -q
[gmic]-0./ Start G'MIC interpreter.
[gmic]-0./ Input file 'image.jpg' at position 0 (1 image 3000x2135x1x3).
[gmic]-1./ Initialize timer.
[gmic]-1./ Apply parallelized command '-median 3' on image [0], with overlap 0 and 4 threads.
[gmic]-1./toc/ Set status to '0.449'.
[gmic]-1./ Elapsed time: 0.449 s.
[gmic]-1./ Quit G'MIC interpreter.
So yes, with more cores you should get something better. Anyway, even 332 ms to apply the median filter on such an image doesn’t seem to be excessive. I’ve tested with a 36 MP image, and I get a computation time of 723 ms on my 4-core machine. Doesn’t sound so bad.
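The idea behind the spatial splitting tested above can be sketched in Python as follows. This is only an illustration of the principle, not how G’MIC implements it: the helper names (split_rows, filter_in_bands), the row-wise banding and the clamped borders are all assumptions of this sketch, and the bands are processed one after another here rather than in threads.

```python
from statistics import median

def median3x3(plane):
    """3x3 median of a 2D list; borders are clamped (an assumption of this sketch)."""
    h, w = len(plane), len(plane[0])
    def at(y, x):
        return plane[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]
    return [[median(at(y + dy, x + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for x in range(w)] for y in range(h)]

def split_rows(height, parts, overlap):
    """Yield (read_start, read_stop, keep_start, keep_stop) for each row band."""
    bounds = [height * k // parts for k in range(parts + 1)]
    for k in range(parts):
        keep_start, keep_stop = bounds[k], bounds[k + 1]
        if keep_start == keep_stop:
            continue  # more parts than rows: skip empty bands
        yield (max(keep_start - overlap, 0), min(keep_stop + overlap, height),
               keep_start, keep_stop)

def filter_in_bands(plane, parts, overlap=1):
    """Filter each band independently, then keep only its non-overlapping rows."""
    out = []
    for r0, r1, k0, k1 in split_rows(len(plane), parts, overlap):
        band = median3x3(plane[r0:r1])  # each band could go to its own worker
        out.extend(band[k0 - r0:k1 - r0])
    return out

plane = [[(7 * x + 13 * y * y) % 31 for x in range(6)] for y in range(10)]
print(filter_in_bands(plane, parts=4) == median3x3(plane))
```

With a 3x3 neighbourhood, an overlap of one row gives each band the context it needs, so the stitched result matches the unsplit filter. In this sketch, an overlap of 0 can change the rows next to a band boundary, and the extra splitting and stitching work is one plausible reason why a very cheap filter does not get faster when split.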
In CImg and G’MIC, image data are arranged by channel planes: RRRRRRRRRR…GGGGGGGGGG…BBBBBBB, so no interlacing of the channel data.
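To illustrate the difference between the two layouts discussed in this thread, here is a small Python sketch of the buffer offsets for a 2D image (depth is ignored). The function names are made up for this example; the planar formula follows the RRR…GGG…BBB arrangement described above.

```python
def planar_offset(x, y, c, width, height):
    """Offset of (x, y, channel c) when each channel is stored as its own plane."""
    return x + y * width + c * width * height

def interleaved_offset(x, y, c, width, channels):
    """Offset of (x, y, channel c) when samples are stored as RGBRGBRGB..."""
    return (x + y * width) * channels + c

width, height, channels = 4, 2, 3

# First row of the green channel (c = 1) in both layouts.
print([planar_offset(x, 0, 1, width, height) for x in range(width)])
print([interleaved_offset(x, 0, 1, width, channels) for x in range(width)])
```

In the planar case, one thread per channel touches a contiguous block of memory, whereas in the interleaved case the threads would write to neighbouring addresses, which is where the cache-conflict concern comes from.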
CImg also uses a special case for 3x3 median filtering. At this point, considering the computation times I get for the median filter (less than 1s for an image with a decent resolution), I don’t think the median filtering is the bottleneck of the dehaze algorithm.
@heckflosse, @David_Tschumperle
Thanks for the interesting insights. What I meant was that the entire loop may have been where most time was spent; I didn’t mean to imply that -median specifically was at fault. Perhaps this assumption is also wrong; it needs proper testing. I hope I haven’t wasted too much of your time!