I’m thinking of training a small convolutional neural network for debayering and integrating it into a filter that can be used within the G’MIC-Qt plug-in.
I have already trained CNNs for denoising (the “Repair/Denoise [CNN]” filter) and x2 upscaling (the “Upscale [CNN2x]” filter). I am by no means a specialist in debayering, but I feel that, with a little relevant information from specialists, it should be possible (and even useful).
However, I would like to make sure it is worthwhile before I start. If so, could someone knowledgeable give me some advice on best practices for debayering, and on the problems encountered with more traditional debayering methods (i.e. those not using neural networks)?
Training a CNN for debayering, independent of G’MIC, is useful preparation for future sensors. The 2x2 RGGB Bayer layout is simple and does not really require a CNN to convert to co-sited RGB, but a CNN is not the wrong tool for it either. Future phone sensors, however, will not have this simple layout. The following layout, already announced in phone-class 200MP sensors from Sony, definitely calls for AI, since the physical chroma resolution is much lower than the raw pixel density, yet reconstruction of the full color at the full resolution is expected:
R R R R G G G G
R R R R G G G G
R R R R G G G G
R R R R G G G G
G G G G B B B B
G G G G B B B B
G G G G B B B B
G G G G B B B B
And the lessons learned with a simple 2x2 RGGB Bayer layout will likely transfer.
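To make the layout above concrete, here is a minimal sketch (assuming numpy; the function name is mine) that builds per-channel sampling masks for this 4x4-block CFA. Each 4x4 block of same-colored pixels tiles in the classic R G / G B arrangement, giving an 8x8 repeating pattern like the one shown:

```python
import numpy as np

def quad_bayer_mask(h, w):
    """Per-channel boolean sampling masks for a 4x4-block quad-style CFA.

    Returns an (h, w, 3) boolean array: mask[y, x, c] is True where
    the sensor physically samples channel c (0=R, 1=G, 2=B).
    """
    yy, xx = np.mgrid[0:h, 0:w]
    by = (yy // 4) % 2            # which 4x4 block row (0 or 1)
    bx = (xx // 4) % 2            # which 4x4 block column (0 or 1)
    mask = np.zeros((h, w, 3), dtype=bool)
    mask[..., 0] = (by == 0) & (bx == 0)   # R: top-left block
    mask[..., 1] = by != bx                # G: top-right and bottom-left
    mask[..., 2] = (by == 1) & (bx == 1)   # B: bottom-right block
    return mask
```

Multiplying a ground-truth RGB image by such a mask is one way to generate the network's mosaiced input during training.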
If anything, I’d say I struggle with spectral bias / the softness you also observed in your MLP image-approximation experiments. It’s a game of training data, probably (who knows).
Another problem is moiré for sub-Nyquist detail, especially without optical low-pass filters. Here, softness is a feature, I suppose:
Yes, artificial neural networks offer far superior visual quality when demosaicing.
It doesn’t necessarily need to be a CNN, but basically all SOTA models provide great results if trained properly.
In my experience, the results produced by decently large CNNs trained on sufficiently large datasets are pretty good. Recently, for an internship, I trained a 25 GFLOP and a 128 GFLOP CNN on a couple of passes over a synthetically Bayered ImageNet. I’ve attached results for the 19th image of the Kodak dataset from the 25 GFLOP CNN.
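“Synthetically Bayering” a dataset just means dropping two of the three color samples at each pixel according to the CFA pattern. A minimal sketch (assuming numpy; the function name is mine) for a standard 2x2 RGGB mosaic:

```python
import numpy as np

def mosaic_rggb(img):
    """Simulate a 2x2 RGGB Bayer sensor from a full RGB image.

    img: float array of shape (H, W, 3) with even H and W.
    Returns a single-channel (H, W) mosaic where each pixel keeps
    only the color the CFA would have sampled; (mosaic, img) then
    forms an input/target training pair.
    """
    h, w, _ = img.shape
    mosaic = np.empty((h, w), dtype=img.dtype)
    mosaic[0::2, 0::2] = img[0::2, 0::2, 0]  # R at even row, even col
    mosaic[0::2, 1::2] = img[0::2, 1::2, 1]  # G at even row, odd col
    mosaic[1::2, 0::2] = img[1::2, 0::2, 1]  # G at odd row, even col
    mosaic[1::2, 1::2] = img[1::2, 1::2, 2]  # B at odd row, odd col
    return mosaic
```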
I have played around with debayering in the past. The main issue, I think, is determining the correct orientation of edges/details with ambiguous data. Good results can be obtained by linear interpolation of the green channel in the correct direction (horizontal/vertical) at each missing pixel. You don’t get as much detail as using information from the other channels, but you get 80% of the way there. If the green channel is correct, the red and blue channels can be reconstructed from the difference from the green channel with great results. If the green channel is wrong, then reconstructing the red and blue channels will give strong false colours and moiré patterns.
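The direction-selection idea above can be sketched in a few lines. This is a minimal, hypothetical illustration in the Hamilton-Adams spirit (assuming numpy; the function name and gradient weights are mine, not the poster's exact method): at a red or blue site, compare horizontal and vertical gradient estimates and average the two green neighbors along the smoother direction:

```python
import numpy as np

def interp_green_at(mosaic, y, x):
    """Estimate green at a red/blue site of an RGGB mosaic by linear
    interpolation along the direction with the smaller gradient.
    Assumes an interior pixel: 2 <= y < H-2 and 2 <= x < W-2.
    """
    # Gradient estimate per direction: green-neighbor difference plus a
    # second-derivative term of the same-color channel two pixels away.
    dh = abs(mosaic[y, x-1] - mosaic[y, x+1]) \
       + abs(2 * mosaic[y, x] - mosaic[y, x-2] - mosaic[y, x+2])
    dv = abs(mosaic[y-1, x] - mosaic[y+1, x]) \
       + abs(2 * mosaic[y, x] - mosaic[y-2, x] - mosaic[y+2, x])
    if dh <= dv:  # smoother horizontally: average left/right greens
        return (mosaic[y, x-1] + mosaic[y, x+1]) / 2
    return (mosaic[y-1, x] + mosaic[y+1, x]) / 2
```

Once green is filled in everywhere, red and blue can be reconstructed by interpolating the color differences R-G and B-G, as described above; a wrong direction choice here is exactly what produces the false colors and moiré.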