I have a bit of expertise in deep learning and the frameworks around it, so I can share some of it here. I confess I'm not very familiar with G'mic, but I am a developer of Kdenlive, and such features are also of interest to us (though if they don't make it into G'mic, we can consider other ways to bring them in).
So basically, as mentioned, there is usually a training phase before a network can be used (though not always: for example, there are very impressive artistic style transfer results that don't need one). What you have to realize here is that this training is done once and for all. You only need to compute the parameters of the network once, share them with your users, and they can use them for their task. Note also that these parameters can be computed in one framework and used in any other, so there is no need to implement training in G'mic.
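To make the "compute once, use anywhere" point concrete, here is a minimal sketch: trained parameters are just arrays of numbers, so any framework can export them and any other program can read them back. The file name and JSON layout here are purely illustrative (real exchange formats such as HDF5 or ONNX are more compact, but the principle is the same):

```python
import json

# Trained parameters are just numbers; any framework can export them
# and any other program (in any language) can read them back.
weights = {
    "layer1/W": [[0.2, -0.5], [0.7, 0.1]],  # toy 2x2 weight matrix
    "layer1/b": [0.0, 0.3],                 # toy bias vector
}

# Export once, after training...
with open("model.json", "w") as f:
    json.dump(weights, f)

# ...and load in a completely different program.
with open("model.json") as f:
    restored = json.load(f)
```

So the user-facing tool only ever needs to read such a file, never to produce it.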
To use the model, you indeed need to be able to make a forward pass through the network, that is, compute its output. This is an order of magnitude easier than training, so if the net is not too complicated, it can be manageable with whatever you already have implemented in terms of matrix multiplication.
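For a fully-connected net, a forward pass really is just alternating matrix products and element-wise non-linearities. A minimal sketch in plain Python, with made-up toy weights (a real net would load its parameters from a file and use optimized matrix routines):

```python
def matvec(W, x):
    # Plain matrix-vector product: one row of W per output unit.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu(v):
    # Element-wise non-linearity.
    return [max(0.0, a) for a in v]

def forward(layers, x):
    # A forward pass: for each layer, multiply by the weight matrix,
    # add the bias, apply the non-linearity, and feed the result onward.
    for W, b in layers:
        x = relu([a + bi for a, bi in zip(matvec(W, x), b)])
    return x

# Two tiny layers with made-up weights.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.1]),
    ([[2.0, 1.0]], [0.0]),
]
output = forward(layers, [1.0, 2.0])
```

Convolutional layers add bookkeeping but no fundamentally new operation, which is why inference is so much simpler than training (no gradients, no backpropagation).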
Otherwise, reinventing the wheel is probably a bad idea. The deep learning frameworks often expose a Python interface, but they all have a C++ backend that can be imported and tailored to fit your needs. There are also libraries made especially for fast inference under tight computational constraints (e.g. mobile phones), such as Caffe2.
Using GPUs can help a great deal with this kind of operation. May I ask if you already use them in your pipeline to speed up computation?