For some time now, I have been writing a machine learning library for G’MIC.
The idea is not necessarily to compete with the existing reference libraries, which are well established (e.g. TensorFlow, Caffe, …), but to recode everything from scratch. There are several reasons for this, but the main ones are:
- I want to understand how it all works, down to the last decimal place. I’m also a researcher in the field of image processing, and I simply want to know.
- I want to be able to write this ML library in the G’MIC language, to take advantage of the flexibility it gives to write new image processing filters easily.
- I want to avoid having yet another dependency of G’MIC on an external library that weighs tons of megabytes.
For now, all the library functions I wrote fit in 651 lines of G’MIC scripting.
With that number of lines, things are of course still a bit limited, and what I can basically do is:
- Create a neural network by chaining different computation modules, which so far can be chosen among: input (2D images or vectors), convolutional layer (with stride and dilation), fully-connected layer, non-linearity, 2D pooling, batch normalization, cloning, addition, MSE loss.
- Optimize the network weights by gradient back-propagation, using batches of inputs/outputs.
- Save/load the current state of a neural network.
- And of course, evaluate the network for new data once the training is done.
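To make the list above more concrete, here is a minimal sketch of the general idea of chaining modules and training them by back-propagation with an MSE loss. It is written in plain Python rather than in the G’MIC language, and it is not the library’s actual code: the names `Linear`, `ReLU`, `mse` and `train_step` are hypothetical, and for brevity the sketch updates the weights one sample at a time with plain gradient descent instead of using batches.

```python
import random

class Linear:
    """Fully-connected module: y = W x + b (hypothetical name)."""
    def __init__(self, n_in, n_out, rng):
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                  for _ in range(n_out)]
        self.b = [0.0] * n_out

    def forward(self, x):
        self.x = x  # keep the input, it is needed for the backward pass
        return [sum(wi * xi for wi, xi in zip(row, x)) + b
                for row, b in zip(self.w, self.b)]

    def backward(self, grad_out, lr):
        # Gradient w.r.t. the input, computed before the weights change.
        grad_in = [sum(self.w[j][i] * grad_out[j] for j in range(len(self.w)))
                   for i in range(len(self.x))]
        # Plain gradient-descent update of weights and biases.
        for j, g in enumerate(grad_out):
            for i, xi in enumerate(self.x):
                self.w[j][i] -= lr * g * xi
            self.b[j] -= lr * g
        return grad_in

class ReLU:
    """Non-linearity module with no trainable weights."""
    def forward(self, x):
        self.mask = [v > 0 for v in x]
        return [v if m else 0.0 for v, m in zip(x, self.mask)]

    def backward(self, grad_out, lr):
        return [g if m else 0.0 for g, m in zip(grad_out, self.mask)]

def mse(pred, target):
    """Mean squared error and its gradient w.r.t. the prediction."""
    n = len(pred)
    loss = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    grad = [2.0 * (p - t) / n for p, t in zip(pred, target)]
    return loss, grad

def train_step(network, x, target, lr):
    """One forward pass, then one backward pass through the chain."""
    out = x
    for module in network:
        out = module.forward(out)
    loss, grad = mse(out, target)
    for module in reversed(network):
        grad = module.backward(grad, lr)
    return loss
```

A network is then just a list such as `[Linear(2, 4, rng), ReLU(), Linear(4, 1, rng)]`, and training consists of calling `train_step` repeatedly on input/target pairs.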
That’s it for now (and I’d say it’s not too bad for 651 lines).
Of course, a lot of work still has to be done:
- Improve the network optimizer (I’ve read about the well-known Adam optimizer). The fact is that simple gradient descent just does not work well in practice, for a lot of reasons, so a lot of hacks are necessary to get the network to learn properly.
- Add more interesting modules to the network: this already turned out to be easier than expected, as the library architecture has been designed for that.
- Optimize the learning procedure to take advantage of the maximum number of cores. It’s already multi-threaded, but there are probably still improvements to get.
- Experiment with a lot of different network architectures, in order to validate the library capabilities.
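For readers who have not met the Adam optimizer mentioned in the list above, here is a minimal sketch of its standard update rule in plain Python. It is only an illustration, not something that exists in the G’MIC library: the class name `Adam`, the flat list-of-floats parameter layout and the default hyper-parameters (the commonly used ones) are all assumptions.

```python
import math

class Adam:
    """Standard Adam update on a flat list of parameters (illustrative)."""
    def __init__(self, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = None  # running mean of the gradients
        self.v = None  # running mean of the squared gradients
        self.t = 0     # number of steps done so far

    def step(self, params, grads):
        """Update `params` in place from `grads` and return them."""
        if self.m is None:
            self.m = [0.0] * len(params)
            self.v = [0.0] * len(params)
        self.t += 1
        for i, g in enumerate(grads):
            self.m[i] = self.beta1 * self.m[i] + (1.0 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1.0 - self.beta2) * g * g
            # Bias correction: both averages start at zero.
            m_hat = self.m[i] / (1.0 - self.beta1 ** self.t)
            v_hat = self.v[i] / (1.0 - self.beta2 ** self.t)
            params[i] -= self.lr * m_hat / (math.sqrt(v_hat) + self.eps)
        return params
```

Compared with simple gradient descent, each weight gets its own effective step size, scaled by the recent magnitude of its gradient, which is one reason this kind of optimizer tends to be less sensitive to the choice of learning rate.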
Just to say it once again: I’d like to keep G’MIC as light as possible, so I don’t plan to experiment with monster GANs that create photorealistic images from nothing (such as the ThisPersonDoesNotExist webpage)… but who knows?
I’ve already done a few experiments with simple neural networks, and I find the latest one interesting enough for me to present it to you:
I’ve recently tried to implement a ResNet, in order to learn an image denoising task. For that, I selected a few images, and taught the network to denoise random patches taken from these images and randomly corrupted by noise. After a night(!) of learning, I was able to test the network on new noisy images, and it seems to me that the result is not that bad, considering the network is quite shallow (5 layers, 33 Kb of weights, compressed).
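The way such training pairs can be built is sketched below in plain Python. This is only an assumption-laden illustration, not the actual G’MIC training script: the function names are hypothetical, images are taken to be grayscale 2D lists with values in [0, 255], and the corruption is modeled as additive Gaussian noise with a randomly chosen standard deviation, which is not necessarily the noise model used in the experiment.

```python
import random

def random_patch(image, size, rng):
    """Crop a size x size patch at a random position of a 2D image."""
    h, w = len(image), len(image[0])
    y = rng.randrange(h - size + 1)
    x = rng.randrange(w - size + 1)
    return [row[x:x + size] for row in image[y:y + size]]

def add_gaussian_noise(patch, sigma, rng):
    """Add Gaussian noise to each pixel and clamp to [0, 255]."""
    return [[min(255.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in row]
            for row in patch]

def make_training_pair(images, size, max_sigma, rng):
    """Return (noisy, clean): the network input and its expected output."""
    image = rng.choice(images)
    clean = random_patch(image, size, rng)
    sigma = rng.uniform(0.0, max_sigma)  # noise level varies per patch
    noisy = add_gaussian_noise(clean, sigma, rng)
    return noisy, clean
```

Each call produces a fresh pair, so even a handful of source images yields a practically unlimited stream of training samples.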
Here are a few results:
This is of course a preliminary result. I’m currently training a larger network, to see if it performs better than this one. What I already like anyway is that the process is quite subtle and often doesn’t make the image look too flat, as is often the case with regular denoisers.
For sure, there are things to improve: how the added synthetic noise is modeled, what the best network architecture could be, etc. But I think I’m on the right track.
More generally, if this ML library starts to work as expected, I expect some fun and interesting stuff coming into G’MIC in the future.
I don’t think there are that many people who try to program their own machine learning library, and this is a challenge I set myself for G’MIC. I don’t know where it will lead yet, but I really like the idea of being able to easily apply AI-based image processing filters directly from the command line or from a plug-in for GIMP and Krita!
I’ll post some news in this thread when I have new interesting things to show.