Machine Learning Library in G'MIC

Hello everyone,

For some time now, I have been writing a machine learning library for G’MIC.

The idea is not necessarily to compete with the existing reference libraries, which are well established (e.g. TensorFlow, Caffe, …), but to recode everything from scratch. There are several reasons for this, but the main ones are:

  • I want to understand how it all works, down to the decimal point: I’m also a researcher in the field of image processing, and I just want to know.
  • I want to be able to write this ML library in the G’MIC language, to take advantage of the flexibility it gives to write new image processing filters easily.
  • I want to avoid having yet another dependency of G’MIC on an external library that weighs tons of megabytes.

For now, all the library functions I wrote fit in 651 lines of G’MIC scripting.
With that number of lines, things are of course still a bit limited, and what I can basically do is:

  • Create a neural network by chaining different computation modules, which so far can be chosen among: input (2D images or vectors), convolutional layer (with stride and dilation), fully-connected layer, non-linearity, 2D pooling, batch-normalization, cloning, addition, MSE loss.
  • Optimize the network weights by gradient back-propagation, using batches of inputs/outputs.
  • Save/load the current state of a neural network.
  • And of course, evaluate the network for new data once the training is done.
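To make the "chain of modules + back-propagation" idea above concrete, here is a minimal sketch in plain Python. It is not the G’MIC library code or its API: the class and function names (`FullyConnected`, `ReLU`, `mse_loss`, `train_step`) are hypothetical, and the weight update is plain gradient descent done inside `backward` for brevity.

```python
import random

class FullyConnected:
    """Hypothetical fully-connected module: y = W x + b."""
    def __init__(self, n_in, n_out, rng):
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        self.b = [0.0] * n_out

    def forward(self, x):
        self.x = x  # keep the input, it is needed by the backward pass
        return [sum(wi * xi for wi, xi in zip(row, x)) + b
                for row, b in zip(self.w, self.b)]

    def backward(self, grad_out, lr):
        # Gradient w.r.t. the input, computed before the weights are changed.
        grad_in = [sum(self.w[j][i] * grad_out[j] for j in range(len(self.w)))
                   for i in range(len(self.x))]
        # Plain gradient-descent update of weights and biases.
        for j, g in enumerate(grad_out):
            for i, xi in enumerate(self.x):
                self.w[j][i] -= lr * g * xi
            self.b[j] -= lr * g
        return grad_in

class ReLU:
    """Hypothetical non-linearity module (no learned parameters)."""
    def forward(self, x):
        self.mask = [v > 0 for v in x]
        return [v if m else 0.0 for v, m in zip(x, self.mask)]

    def backward(self, grad_out, lr):
        return [g if m else 0.0 for g, m in zip(grad_out, self.mask)]

def mse_loss(pred, target):
    """Return the MSE loss and its gradient w.r.t. the prediction."""
    n = len(pred)
    loss = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    grad = [2.0 * (p - t) / n for p, t in zip(pred, target)]
    return loss, grad

def train_step(modules, x, target, lr):
    """One forward pass through the chain, then one backward pass in reverse order."""
    for m in modules:
        x = m.forward(x)
    loss, grad = mse_loss(x, target)
    for m in reversed(modules):
        grad = m.backward(grad, lr)
    return loss
```

A network is then just a list such as `[FullyConnected(2, 4, rng), ReLU(), FullyConnected(4, 1, rng)]`, and training amounts to calling `train_step` repeatedly on input/target pairs.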

That’s it for now (and I’d say it’s not too bad for 651 lines :slight_smile: ).

Of course, a lot of work still has to be done:

  • Improve the network optimizer (I’ve read about the well-known Adam optimizer). The fact is that simple gradient descent just does not work in practice, for a lot of reasons, so a lot of hacks are necessary to get the network to learn correctly.
  • Add more interesting modules to the network: this already turned out to be easier than expected, as the library architecture has been designed for that.
  • Optimize the learning procedure to take advantage of the maximum number of cores. It’s already multi-threaded, but there are probably still improvements to get.
  • Experiment with a lot of different network architectures, in order to validate the library capabilities.

Just to say it once again: I’d like to keep G’MIC as light as possible, so I don’t plan to experiment with monster GANs that create photorealistic images from nothing (such as the ThisPersonDoesNotExist webpage)… but who knows? :slight_smile:


I’ve already done a few experiments with simple neural networks, and I find the latest one interesting enough for me to present it to you:
I’ve recently tried to implement a ResNet to learn an image denoising task. For that, I selected a few images and taught the network to denoise random patches taken from these images and randomly corrupted by noise. After a night(!) of training, I was able to test the network on new noisy images, and the result seems not bad at all to me, considering the network is quite shallow (5 layers, 33Kb of weights, compressed).

Here are a few results:




This is of course a preliminary result. I’m currently training a larger network, to see if it performs better than this one. What I already like anyway is that the process is quite subtle and usually doesn’t make the image look too flat, as is often the case with regular denoisers.

For sure, there are things to improve: how the added synthetic noise is modeled, what the best network architecture could be, etc. But I think I’m on the right track.


More generally, if this ML library starts to work as expected, I expect some fun and interesting stuff coming into G’MIC in the future.
I don’t think there are so many people who try to program their own machine learning library, and this is a challenge I gave myself for G’MIC. I don’t know where it will lead yet, but I really like the idea of being able to easily apply AI-based image processing filters directly from the command line or from a plug-in for GIMP and Krita!

I’ll post some news in this thread when I have new interesting things to show.
Stay tuned! :radio:

MSE = ‘Mean Square Error’, I trust.
Might be that less-than-monstrous Generative Adversarial Networks could operate within the inpainting realm. Tyrannical Despots always need to properly manage the visual records of the Glorious Revolution, particularly when certain followers (exhibiting imperfect loyalties) need to be elided from history, their ghosts in Glorious Revolutionary videos and pictures replaced with anodyne backgrounds, brick walls, shrubs, flowering and otherwise, cement walls, with or without machine gun bullet pock marks, providing grist for the training mill.

Note that my tongue is only halfway in my cheek.

Incredible. So here it starts, the future :wink:. And yes, denoising looks much more natural than what I am able to achieve with traditional denoising.

Too funny that you added text to make sure everybody recognizes which image is the noisy one :grin:.

Good stuff, that looks very interesting.

In the second example, orange seems to have changed to magenta. What has happened?

Sweet. I think most of my commands / filters would benefit from ML. Optimization of any kind is definitely welcome.

Probably because in this first test, I’ve used a very limited set of noisy color samples for the training, meaning the network shifts colors towards what it has seen most during training.
A slightly larger network is currently being trained (with more noisy samples); hopefully it will improve the result.

Today’s news:

I spent some time yesterday and today fixing bugs in the library, adding new network layers, and testing more complex networks.

New network layers have been implemented:

  • maxpool2d: max pooling layer.
  • append: appends the channels of two input images.
  • split: splits the channels of an input into two images.
  • upsample2d: nearest-neighbor or linear x2 upsampling.
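For readers unfamiliar with these layers, here is a rough sketch in plain Python of what max pooling and nearest-neighbor x2 upsampling compute on a single-channel image stored as a list of rows. The function names and the 2x2 non-overlapping window are assumptions for illustration, not the actual G’MIC implementation (which also handles the backward pass).

```python
def maxpool2d(img, size=2):
    """Max over non-overlapping size x size windows (assumed window layout)."""
    h, w = len(img), len(img[0])
    return [[max(img[y + dy][x + dx] for dy in range(size) for dx in range(size))
             for x in range(0, w - size + 1, size)]
            for y in range(0, h - size + 1, size)]

def upsample2d_nearest(img):
    """x2 nearest-neighbor upsampling: each pixel becomes a 2x2 block."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                     # duplicate the row
    return out
```

With these definitions, pooling an upsampled image gives back the original, which is the kind of resolution round trip a U-Net relies on between its encoder and decoder.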

I’ve also experimented with U-Nets and VGG-like networks, without much success right now for denoising, but at least it seems to work as expected (i.e. I also get denoising results with complex networks, but with worse quality than with simpler networks).

The loss optimizer has also been slightly improved.

The Neural Network Library in G’MIC now approaches 900 lines of code, which is still quite light I think. I believe this will be a really useful addition to the G’MIC framework.

Is a network layer a new form of image in the context of G’MIC coding?

No, it’s really something internal to the neural network library, although adding a new layer may insert a new image into the image list (basically if this layer has parameters to be learnt).

Nice to see so much progress. Do you have more time in your schedule now? :stuck_out_tongue:

There are things I want to try, but as with everything G’MIC, I would like working examples to know what is possible. The ideas are there yet I have only tapped into a few percent of them because I am still not as proficient in the language and math as I would like to be.

You can always ask on the exercise thread though I can only help if there is some clarification and sample on what it is that you want. I wish I could figure out how to solve the complex Popcorn Fractal which is one of my dream filters.

News (2021/07/13):

  • I’ve extended the library architecture to allow the use of different network optimizers.
  • I’ve implemented the Adam optimizer, and honestly, this makes a huge difference in the convergence rate. The loss reaches (minimal) values that I was not reaching with previous (simpler) optimizers.
  • I’ve also implemented a variant of Adam, named Adamax, which doesn’t seem to perform particularly better, at least with the few experiments I’ve tried.
  • I’ve implemented the U-Net architecture, and am currently trying to use it for image denoising. So far, not much success compared to my first trial, but the network is clearly larger (approx. 2M parameters), so it may have to do with the fact that I haven’t let it train long enough.
  • I’m also experimenting with a new loss (gradient-weighted MSE) that seems to perform better for image denoising.
  • Plus some additional code cleanup.
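For reference, the Adam and Adamax update rules mentioned above can be sketched as follows in plain Python. This follows the standard published formulas with the usual default hyperparameters. It is not taken from the G’MIC code, and the class names are hypothetical. The small `eps` added to the Adamax denominator is only a guard against division by zero (an assumption, some implementations do it, some don’t).

```python
import math

class Adam:
    """Standard Adam update on a flat list of parameters."""
    def __init__(self, n_params, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = [0.0] * n_params  # running mean of the gradients
        self.v = [0.0] * n_params  # running mean of the squared gradients
        self.t = 0                 # step counter, used for bias correction

    def step(self, params, grads):
        self.t += 1
        c1 = 1.0 - self.beta1 ** self.t
        c2 = 1.0 - self.beta2 ** self.t
        for i, g in enumerate(grads):
            self.m[i] = self.beta1 * self.m[i] + (1.0 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1.0 - self.beta2) * g * g
            m_hat = self.m[i] / c1
            v_hat = self.v[i] / c2
            params[i] -= self.lr * m_hat / (math.sqrt(v_hat) + self.eps)

class Adamax(Adam):
    """Adamax variant: the second moment is replaced by an infinity-norm estimate."""
    def step(self, params, grads):
        self.t += 1
        c1 = 1.0 - self.beta1 ** self.t
        for i, g in enumerate(grads):
            self.m[i] = self.beta1 * self.m[i] + (1.0 - self.beta1) * g
            self.v[i] = max(self.beta2 * self.v[i], abs(g))  # v holds max-norm here
            params[i] -= (self.lr / c1) * self.m[i] / (self.v[i] + self.eps)
```

The key difference with plain gradient descent is that each parameter gets its own step size, normalized by the recent gradient magnitude, which is one plausible reason for the much better convergence reported above.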

The networks I’m trying are getting bigger and bigger, and this has a clear impact on the computation time. I’m interested in any help optimizing the CImg convolution/correlation operators (GPU experts, you are welcome).
At this point, this is the main bottleneck I have. Apart from that, everything seems to work fine.

I misread Adam as ADMM, which is also a thing. :slight_smile:

Some (good) news:

I had some time last week and this weekend to make progress on the G’MIC Machine-Learning library, and I’m very happy to announce that I’ve finally been able to set up a first filter that uses ML for the G’MIC-Qt plug-in!

This new filter is simply named Repair / Denoise. It uses a convolutional neural network to denoise images. This filter can be found in the latest development version of the G’MIC-Qt plug-in (version 3.0.0_pre, posted yesterday at: Index of /files/prerelease). It looks like this at the moment:

It’s quite a CPU-demanding filter, so do not use it if you don’t have at least 4 cores :wink: And even then, it will take a lot of processing time if your image resolution is large.

There is also an associated command denoise_cnn that can be used from the command line as well (e.g. to batch-process several noisy images):

$ gmic sp colorful,256 noise 15 cut 0,255 +denoise_cnn 0

This example renders the following pair of before/after images:

The convolutional neural network used in this filter comes in two flavors:

  • One for processing “soft” noise, trained with images where synthetic Gaussian noise has been added in the RGB channels independently (so this is mainly a colored noise).
  • One for processing “heavier” noise, trained with images where synthetic noise has been added, but this time in the HSV channels independently.

In both cases, thousands of natural images have been used for the training (I’m actually using the Lorem Picsum webservice to build a training set). The synthetic noise added for the training has random amplitude, so that the network is able to adapt itself to different levels of noise.
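As an illustration of the "soft" noise model described above, here is a small sketch in plain Python that corrupts an RGB patch with Gaussian noise of random amplitude. Several details are assumptions and not stated in the post: the amplitude is drawn uniformly in [0, max_sigma], the values are clipped to [0, 255], and the function name `add_rgb_noise` is made up.

```python
import random

def add_rgb_noise(patch, max_sigma, rng):
    """Corrupt a patch (list of (r, g, b) tuples in [0, 255]) with Gaussian noise.

    One amplitude is drawn per patch (assumed uniform in [0, max_sigma]),
    then independent noise is added to each channel of each pixel.
    """
    sigma = rng.uniform(0.0, max_sigma)
    noisy = [tuple(min(255.0, max(0.0, c + rng.gauss(0.0, sigma))) for c in px)
             for px in patch]
    return noisy, sigma
```

A training pair is then `(noisy, patch)`: the network receives the noisy patch as input and the clean one as target. The "heavy" flavor would apply the same idea after converting the pixels to HSV.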

The network training has been achieved only by using the functionalities of the integrated G’MIC ML Library, which was quite a challenge!
The neural network is basically a simple ResNet, with 11 convolutional layers (3x3 convolutions, with widths varying from 64 to 8). Each of the two flavors of this network has been trained for at least 8 hours.

As these neural networks are quite shallow, they have fewer than 100k learned parameters, which means they don’t require a lot of storage (both networks are stored, compressed, in a 720K file).
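To get a feeling for where the parameter budget goes, here is the usual parameter count for a convolutional layer, as a tiny Python helper. The helper name is hypothetical and it assumes one bias per output channel. The exact widths of each layer are not given in the post, so only single-layer examples are shown.

```python
def conv_params(kernel, c_in, c_out, bias=True):
    """Number of learned parameters of a kernel x kernel convolution layer."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)

# A 3x3 convolution from an RGB input to 64 channels:
first_layer = conv_params(3, 3, 64)    # 3*3*3*64 + 64 = 1792
# A single 3x3 convolution from 64 to 64 channels:
wide_layer = conv_params(3, 64, 64)    # 3*3*64*64 + 64 = 36928
```

A single 64-to-64 layer already uses about 37k parameters, so staying under 100k with 11 layers is only possible because most layers are much narrower than 64 channels.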
The network files are then downloaded directly from the G’MIC server when the command denoise_cnn is used for the first time.

The inference of the network is done “patch by patch” (with patch size 64x64), so image patches can be processed in parallel.
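The patch-by-patch idea can be sketched like this in plain Python, for a single-channel image stored as a list of rows. This is only an illustration under assumptions: the tiles do not overlap (the post does not say whether the real filter overlaps patches), `fn` stands for any function that returns a patch of the same size, and the names `patch_boxes` and `process_by_patches` are made up. Threads are used for brevity; a real CPU-bound implementation would rely on native multi-threading.

```python
from concurrent.futures import ThreadPoolExecutor

def patch_boxes(width, height, patch=64):
    """Boxes (x0, y0, x1, y1) tiling the image; border tiles are clipped."""
    return [(x, y, min(x + patch, width), min(y + patch, height))
            for y in range(0, height, patch)
            for x in range(0, width, patch)]

def process_by_patches(img, fn, patch=64, workers=4):
    """Apply fn to each tile independently, then paste the results back."""
    h, w = len(img), len(img[0])
    boxes = patch_boxes(w, h, patch)
    crops = [[row[x0:x1] for row in img[y0:y1]] for x0, y0, x1, y1 in boxes]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fn, crops))  # results keep the order of boxes
    out = [[None] * w for _ in range(h)]
    for (x0, y0, x1, y1), res in zip(boxes, results):
        for dy, row in enumerate(res):
            out[y0 + dy][x0:x1] = row
    return out
```

Since each tile is processed independently, the work scales with the number of available cores, which matches the remark above about needing several cores for this filter.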

Well, that’s it! I’m very happy because all this is the result of hundreds of hours thinking about the design of the ML library structures, learning how neural network training works, implementing the whole thing from scratch, and finally testing and debugging for hours… But finally, with a result!

A lot of things remain to be done, but for me, this is a first milestone for having ML-based image processing algorithms in G’MIC.

Stay tuned :+1:

This seems like a very good time to keep a close eye on g’mic developments! Maybe some parts will be useable in filters without advanced knowledge eventually? Even without using the ML directly, I noticed there are a lot of new “support” commands and math functions which can be useful too.

Interesting development! I would only recommend removing the cached file in case of errors, or allowing a forced reload from the internet.

That’s great David, thank you!

It runs surprisingly fast even on my old laptop - although I haven’t tested with really large images - under 30s most of the time. Perhaps somebody will find the time to do comparisons vs other algorithms, but at least with artificial noise the output is clean (only some faint blotches visible). Seems like a great start :slight_smile:

This is great work that I highly appreciate. Keep on!

I’m deeply impressed by how such a quality tool/framework is freely (as in freedom) available for all of us. Thank you David!

You, your colleagues and your research institute rock!

P.S.: A very interesting filter would be one doing “automatic” background removal or some kind of “shape mask”, maybe providing some guidance with three colors/masks/markers: background, foreground and ambiguity zones (where the NN will have its primary role).
