By 2020, computers will be generating the code for themselves.
I think one could train a network to adjust G’MIC parameters and attain the desired results.
I am presently attempting to install “Interactive Deep Colorization”, which uses AI to colorize black and white images. An acquaintance of mine is using it with good results to colorize a large number of historical photographs. If I can get it working, I might try this too.
Having said this, unless the installation process can be greatly simplified, this would never work as a G’MIC plugin. It also requires a very good computer.
I have a bit of expertise in deep learning and its frameworks, so I can share it here. I confess that I’m not very familiar with G’MIC, but I am a developer of Kdenlive, and such features are also of interest to us (though if they don’t make it into G’MIC, we can consider other ways to bring them in).
So basically, as mentioned, there is usually a training phase before a network can be used (though not always; for example, there are very impressive artistic style transfer results that don’t need one). What you have to realize here is that this training is done once and for all: you only need to compute the parameters of the network once, share them with your users, and they can use them for their task. Note also that these parameters can be computed in any framework and used in any other, so there is no need to implement training in G’MIC.
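To make the “train in any framework, use in any other” point concrete, here is a toy sketch (the filename and the flat binary layout are my own assumptions, not any standard format): trained coefficients are just arrays, so they can be dumped to a plain file that any other runtime, including a C++ one, could re-read:

```python
import numpy as np

# Hypothetical trained parameters of a single dense layer (W, b).
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3)).astype(np.float32)  # 4 inputs -> 3 outputs
b = np.zeros(3, dtype=np.float32)

# Export to a flat, framework-agnostic binary layout:
# an int32 header [n_in, n_out], then W (row-major), then b.
with open("layer0.bin", "wb") as f:
    np.array(W.shape, dtype=np.int32).tofile(f)
    W.tofile(f)
    b.tofile(f)

# Any other runtime can read the same bytes back.
with open("layer0.bin", "rb") as f:
    n_in, n_out = np.fromfile(f, dtype=np.int32, count=2)
    W2 = np.fromfile(f, dtype=np.float32, count=n_in * n_out).reshape(n_in, n_out)
    b2 = np.fromfile(f, dtype=np.float32, count=n_out)
```

The only thing both sides must agree on is the layout (shapes, element order, dtype); the training framework is irrelevant once the numbers are on disk.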
To use the model, you indeed need to be able to make a forward pass through the network, that is, compute its output. This is an order of magnitude easier than training, so if the net is not too complicated, it can be manageable with whatever you already have implemented in terms of matrix multiplication.
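To show just how little machinery a forward pass needs, here is a minimal sketch in Python/NumPy of a small fully-connected net (toy random weights, not from any real model; in practice they would be loaded from a trained network):

```python
import numpy as np

def relu(x):
    # Standard non-linearity: clamp negatives to zero.
    return np.maximum(x, 0.0)

def forward(x, layers):
    """Forward pass of a small fully-connected net: alternate
    matrix-multiply-plus-bias with a ReLU non-linearity."""
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:  # no activation on the output layer
            x = relu(x)
    return x

# Toy weights standing in for a trained model.
rng = np.random.default_rng(42)
layers = [(rng.standard_normal((8, 16)), np.zeros(16)),
          (rng.standard_normal((16, 4)), np.zeros(4))]
x = rng.standard_normal((1, 8))  # one 8-dimensional input
y = forward(x, layers)
print(y.shape)  # (1, 4)
```

Everything here is plain matrix products and element-wise maxima, operations an image-processing toolkit typically already has.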
Otherwise, reinventing the wheel is probably a bad idea. The deep learning frameworks often expose a Python interface, but they all have a C++ backend that can be imported and tailored to your needs. There are also libraries designed specifically for fast inference under tight computational budgets (e.g. mobile phones), like Caffe2.
Using GPUs can help a great deal with this kind of operation. May I ask if you already use them in your pipeline to speed up computation?
Indeed, I’m aware of these libs. But since G’MIC already has everything needed to compute the forward pass of a CNN, I’m definitely not keen on adding yet another dependency on a (relatively large) deep learning library, if I can avoid it. In my opinion, the only gain of such a dependency would be GPU-based forward evaluation, which is cool but maybe not strictly necessary. And as you said, we probably don’t need to do the training in G’MIC either, so we are only interested in a very small portion of a DL library’s features.
From what I read, evaluating a CNN forward pass takes a few milliseconds on a GPU, so doing the same on multi-core CPUs should still finish in a reasonable amount of time.
I think I only have to figure out how to reproduce the forward evaluation step implemented in these libs. That’s a matter of knowing how the neuron coefficients are stored and the order of the operations, but I think this should be possible.
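As an illustration of what “how the coefficients are stored and the order of the operations” means in practice, here is a naive sketch of a convolutional layer’s forward step in Python/NumPy. The `[out_ch, in_ch, kh, kw]` weight layout and the `[channels, H, W]` image layout are assumptions for this sketch; layouts differ between frameworks, and matching them is exactly the part one has to get right when reusing stored coefficients:

```python
import numpy as np

def conv2d(img, weights, bias):
    """Naive 'valid' cross-correlation, i.e. the operation most DL
    frameworks call a convolution.  Assumed layouts (they vary per
    framework): weights [out_ch, in_ch, kh, kw], image [in_ch, H, W]."""
    out_ch, in_ch, kh, kw = weights.shape
    _, H, W = img.shape
    out = np.empty((out_ch, H - kh + 1, W - kw + 1))
    for o in range(out_ch):
        for y in range(H - kh + 1):
            for x in range(W - kw + 1):
                # Dot product between one filter and one image patch.
                patch = img[:, y:y + kh, x:x + kw]
                out[o, y, x] = np.sum(patch * weights[o]) + bias[o]
    return out

rng = np.random.default_rng(0)
img = rng.standard_normal((3, 8, 8))   # 3-channel 8x8 image
w = rng.standard_normal((5, 3, 3, 3))  # 5 filters of size 3x3
b = np.zeros(5)
out = conv2d(img, w, b)
print(out.shape)  # (5, 6, 6)
```

The triple loop is deliberately simple; a real implementation would vectorize or parallelize it, but the arithmetic being reproduced is only this.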
No, I don’t use them yet. I actually believe many-core CPUs (>256 cores) will be more useful for image processing operations in the future, but that is my personal opinion.
Why not see things at a larger scale? We are many users, and we have many computers.
So, concerning the slowness of learning, an idea: why not develop a specific tool that scans a local image base (from which you could exclude some images if you wished) and sends the results over the internet? And only that: no complete images or parts of them, only statistics, anonymized. Even if it is slow, a massive number of users could definitely improve it. But for my part, it MUST be free (GPL or similar) software, mainly for privacy and security reasons.
But for that there is a need for: 1) this software (@David_Tschumperle: faucon, yaka…), and 2) a server or P2P architecture (which could also help with anonymization…).
We could also choose how much CPU/GPU time we allow, and process data for others (roughly like a BOINC client).
In short: we’re a community, we support a system (sharing computing power, submitting statistics from our own images, …), and as long as we do that, we benefit from it.
Even if it doesn’t need to run forever for this learning task, the same architecture could run other calculations on our images…
The important aspect IMHO is to keep privacy & security. Forever.
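For what it’s worth, the scheme described above is close to what the literature calls federated averaging: each participant computes a parameter update on their own images, only those numbers (never the images) are sent out, and a server averages them. A toy simulation of the idea, using a linear model (all names and the setup are hypothetical, not a real protocol):

```python
import numpy as np

def local_update(weights, data, targets, lr=0.1):
    """One gradient step of least squares on a user's own data;
    only the updated weights (statistics) leave the machine."""
    grad = 2 * data.T @ (data @ weights - targets) / len(data)
    return weights - lr * grad

def federated_average(updates):
    """Server side: combine the anonymized per-user parameters."""
    return np.mean(updates, axis=0)

rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0])  # the 'knowledge' to be learned
w = np.zeros(2)
for _ in range(50):              # communication rounds
    updates = []
    for _ in range(4):           # 4 simulated community members
        X = rng.standard_normal((32, 2))  # each member's private data
        y = X @ true_w
        updates.append(local_update(w, X, y))
    w = federated_average(updates)
print(np.round(w, 2))  # converges to approximately [2., -1.]
```

The design point is the same as in the post above: what crosses the network is a short vector of numbers per round, not the underlying images.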
I think one can simulate a neural network with “Inpaint [Multi-scale]”, using a mask and filling it up with textures.
Similar to neural-doodle, but without a neural network.
Generated using “Resynthetize texture FFT” and the color of the mask area.
@bazza could you explain a bit more about what’s going on in the 4-tiles picture?
It is an example from a neural network: https://github.com/alexjc/neural-doodle
I think it could do something like this: resynthesize textures, replace the masked areas with them, and, for greater realism, join them with blend.
But you wrote this above the 4-tile image:
So that 4-tile image was not a demonstration of using “Inpaint [Multi-scale]”?
Obviously it is not going to be a neural network, but the result can be much better than this example:
This was generated with “Resynthetize texture FFT”; it doesn’t modify the inpaint code, but uses Resynthetize and blend. I am only trying to explain an idea.
Can you share the gmic command to do this?
I am going to make a video.
I did it in several steps.