Feature request: Image enhancement via Deep Learning

Some words about the possibility of having neural network-based methods in G’MIC.
I’ve started studying these kinds of methods, and from what I’ve read so far, here is what I can say:

  • There is everything in G’MIC to create artificial neural networks, including convolutional layers, pooling, and so on…
  • Methods relying on neural networks have two main aspects: 1. learning, and 2. evaluation.
    Concerning the evaluation aspect: I’m still not sure how fast G’MIC can be at evaluating a feature with a neural network, particularly if the network is deep. Basically, the evaluation consists of a lot of image convolutions and matrix operations (mostly multiplications), as sketched right after this list. Both operations are implemented in G’MIC, and are even parallelized, so it may well be that neural network evaluation is fast enough in G’MIC when run on a machine with several cores.
    Concerning the learning phase: it is definitely slow. People writing scientific papers about NNs report that it sometimes requires several weeks of training, even with GPU-based convolutions and matrix multiplications. So, even when GPUs are used, it is slow as hell. I therefore don’t expect to have fast learning methods in G’MIC. No way.
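To make this more concrete, here is a minimal sketch (in plain Python/NumPy, not G’MIC code; the image size, the 3×3 kernel and the layer sizes are invented for illustration) of what such a forward evaluation boils down to:

```python
# Minimal sketch of a CNN "evaluation": only convolutions and matrix products.
# All sizes and weights here are made up for illustration.
import numpy as np

def conv2d(img, kernel):
    """Naive 'valid' 2D convolution, the core operation of a CNN layer."""
    kh, kw = kernel.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(img[y:y + kh, x:x + kw] * kernel)
    return out

img = np.random.rand(64, 64)                # a small grayscale image
kernel = np.random.rand(3, 3)               # one learned 3x3 filter
feat = np.maximum(conv2d(img, kernel), 0)   # convolution + ReLU non-linearity
pooled = feat[::2, ::2]                     # crude 2x2 "pooling" by subsampling
W = np.random.rand(10, pooled.size)         # a dense (fully connected) layer...
b = np.random.rand(10)
out = W @ pooled.ravel() + b                # ...is just a matrix multiplication
```

A real network stacks many such layers, but nothing else is needed: only convolutions, element-wise non-linearities and matrix products.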

So, only if the neural network evaluation phase turns out to be fast enough in G’MIC (and I still cannot tell, because G’MIC does not rely on GPUs for this kind of task) will I maybe be able to implement some of the interesting image processing methods based on NNs. At this stage, I can only hope that this is possible.

In any case, this will require a lot of work and testing, so I would say you shouldn’t expect to see such things in G’MIC before 2018 at the earliest. All the code for the NN-based algorithms proposed on GitHub relies on external machine learning libraries (often used from Python), which are definitely not easy to integrate into G’MIC. This means the best way to go is probably to recode those machine learning abilities directly as G’MIC code. That seems possible; what I don’t know is whether it will be fast enough.

Anyway, that is something I’d like to explore in the coming year. But it is not as easy as ‘take some code from GitHub and integrate it as a G’MIC command’.

6 Likes

Thanks for taking the time to give your feedback. The future at least looks exciting for G’MIC. I’ll be sure to stick around for that :slight_smile:

2018? That’s gonna take a while, but I will stick around. 2020 looks like the year when open source graphics programs finally get noticed.

By 2020, computers will be generating the code for themselves.

I think one could train a network to adjust G’MIC parameters and attain some desired results.

For my part, I would like things such as learning palettes, or using an “intelligent criterion”.

I am presently attempting to install “Interactive Deep Colorization”, which uses AI to colorize black and white images. An acquaintance of mine is using it with good results to colorize a large number of historical photographs. If I can get it working, I might try this too.

Having said this, unless the installation process can be greatly simplified, this would never work as a G’MIC plugin. It also requires a very good computer.

2 Likes

Hi there,
I have a bit of expertise in deep learning and the associated frameworks, so I can share it here. I confess that I’m not very familiar with G’MIC, but I am a developer of Kdenlive, and such features are also of interest to us (though if they don’t make it into G’MIC, we can consider other ways to bring them in).

So basically, as mentioned, there is usually a training part before being able to use a network (though it is not always necessary; for example, there are very impressive results of artistic style transfer that don’t need it). What you have to realize here is that this training is done once and for all. You only need to compute the parameters of the network and share them with your users, and they can then use the network for their task. Note also that these parameters can be computed in any framework and used in any other, so there is no need to implement training in G’MIC.
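To illustrate that hand-off, here is a hypothetical sketch, assuming PyTorch as the training framework (the file name, key names and layer sizes are invented): the learned parameters are dumped once to a framework-neutral file that any other tool, G’MIC included, could parse.

```python
# Hypothetical sketch: train once in one framework (here PyTorch), then dump
# the learned parameters to a neutral .npz file that any other tool can read.
import numpy as np
import torch.nn as nn

net = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 3, 3))
# ... training would happen here, once, on the developer's machine ...

arrays = {name.replace(".", "_"): t.detach().numpy()
          for name, t in net.state_dict().items()}
np.savez("pretrained_weights.npz", **arrays)   # keys like "0_weight", "0_bias"

# On the user's side, no deep learning framework is needed to read it back:
loaded = np.load("pretrained_weights.npz")
print({k: loaded[k].shape for k in loaded.files})
```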

To use the model, you indeed need to be able to make a forward pass through the network, that is, to compute its output. This is an order of magnitude easier than training a net, so if the net is not too complicated, it can be manageable to do it with whatever you already have implemented in terms of matrix multiplication.
Otherwise, reinventing the wheel is probably a bad idea. The deep learning frameworks do often use a Python interface, but they always have a backend in C++ that can be imported and tailored to fit your needs. There are also libraries made especially for fast inference with tight computational resources (mobile phones), like Caffe2.
Using GPUs can help a great deal on that kind of operations. May I ask if you already use them in your pipeline to speed up computation?

Indeed, I’m aware of these libs. But as there is already all the needed stuff in G’MIC to compute the forward pass of a CNN, I’m definitely not keen on adding yet another dependency on a (relatively large) deep learning library for G’MIC if I can avoid it. In my opinion, the only gain of such a dependency would be to allow a GPU-based forward evaluation, which is cool but maybe not strictly necessary. And as you said, we probably don’t need to do the training in G’MIC either, so we are only interested in a very small portion of the DL library features.
From what I read, evaluating a CNN forward pass can be done in a few milliseconds with GPUs, so it looks like doing the same with multi-core CPUs should still complete in a reasonable amount of time.
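As a rough and purely illustrative estimate (the layer size below is an assumption, not a measurement): a single 3×3 convolution layer with 64 input and 64 output channels applied to a 1024×1024 image needs about 1024 × 1024 × 64 × 64 × 9 ≈ 3.9×10¹⁰ multiply-adds. A multi-core CPU sustaining a few tens of GFLOP/s on well-parallelized code would thus spend on the order of one or two seconds on such a layer, so a moderately deep network would run in seconds rather than milliseconds: slower than a GPU, but possibly still acceptable for a filter.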

I think I only have to figure out how to reproduce the forward-evaluation step implemented in these libs. That’s a matter of knowing how the neuron coefficients are stored and in what order the operations are applied, but I think this should be possible.
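For reference, the storage conventions are fairly uniform: in Caffe/PyTorch-style files a convolution layer’s coefficients are a 4-D array laid out as (output channels, input channels, kernel height, kernel width) plus a 1-D bias (TensorFlow orders the axes differently), and the forward pass simply replays the layers in order: convolution, bias, non-linearity. Here is a hypothetical NumPy sketch of one such layer, reusing the invented .npz file from the earlier example:

```python
# Hypothetical sketch of replaying one stored convolution layer.
# Weight layout assumed: (out_channels, in_channels, kh, kw), Caffe/PyTorch style.
# File and key names ("0_weight", "0_bias") come from the invented export above.
import numpy as np

data = np.load("pretrained_weights.npz")
W, b = data["0_weight"], data["0_bias"]     # shapes (16, 3, 3, 3) and (16,)

img = np.random.rand(3, 64, 64)             # channels-first RGB input
out_c, in_c, kh, kw = W.shape
h, w = img.shape[1] - kh + 1, img.shape[2] - kw + 1
out = np.zeros((out_c, h, w))
for o in range(out_c):                      # one output channel at a time
    for y in range(h):
        for x in range(w):
            # no kernel flip: DL frameworks actually compute a cross-correlation
            out[o, y, x] = np.sum(W[o] * img[:, y:y + kh, x:x + kw]) + b[o]
out = np.maximum(out, 0)                    # then apply the layer's non-linearity
```

Deeper networks just repeat this, feeding `out` into the next stored layer.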

No, I don’t use them yet. I actually believe that multi-core CPUs (>256 cores) will be more useful in the future for image processing operations, but that is my personal opinion.

1 Like

Why not see things on a larger scale? We are many users, and we have many computers.

So, concerning the slowness of the learning, an idea: why not develop a specific tool that scans a local image base (from which you could even exclude some images if you wished) and sends the results over the internet? And only that: no complete images or parts of them, only statistics. Anonymized. Even if it is slow, a massive number of users could definitely improve it. But as far as I’m concerned, it MUST be free (GPL or similar) software, mainly for privacy and security reasons.

But that requires: 1. this software (@David_Tschumperle: faucon, yaka…); 2. a server or P2P architecture (which could also help with anonymization…).
We could also choose what CPU/GPU ratio we allow, and could process data for others (roughly like a BOINC client).

To put it plainly: we’re a community; we support a system (sharing computing power, submitting data from our own images, …), and as long as we do that, we benefit from it.
Even if it doesn’t need to run forever for this learning task, the architecture could run other calculations on our images…
The important aspect IMHO is to keep privacy & security. Forever.

++P2P GPU :smiley:


I think one can simulate a neural network with “Inpaint [Multi-scale]”, using a mask and filling up the mask with textures.

Similar to neural-doodle, but without the neural network.

1 Like

[image: 4-tile example]

Generated using Resynthetize texture FFT and mask area color.

@bazza could you explain a bit more about what’s going on in the 4-tiles picture?

It is the example from the neural network project GitHub - alexjc/neural-doodle: Turn your two-bit doodles into fine artworks with deep neural networks, generate seamless textures from photos, transfer style from one image to another, perform example-based upscaling, but wait... there's more! (An implementation of Semantic Style Transfer.)

I think it could do something like this: resynthesize textures, replace the masked areas with those textures, and, for greater realism, join them with a blend.

But you wrote this above the 4-tile image:

So that 4-tile image was not a demonstration of using “Inpaint [Multi-scale]”?

Obviously it is not going to be a neural network, but the result can be much better than this example:

This was generated with “Resynthetize texture FFT”, without modifying the inpaint code; it just uses Resynthetize and blend. I am only trying to explain an idea.

[image: Poster Edges]

[image: inpaint [multi-scale]]

Can you share the gmic command to do this?

I’m going to make a video.
I did it in several steps.