Looking for a PhD jury member - thesis on image denoising and compression w/AI

Hello all,

Would anyone here who holds a PhD be interested in being part of the jury for my PhD defense (most likely in January)? My thesis is about image denoising and compression using neural networks.

This is a pretty unconventional place to ask, but my main motivation has been to build really effective and widely applicable neural-network image denoising for my use case: image development using exclusively open-source software (and ideally I would like my research to be integrated into such software). So I think this is where I am most likely to find the right audience.

Below is the abstract (a toy code sketch of the prior-competition idea follows it):

This thesis addresses two fundamental challenges in image processing—image compression and image denoising—using deep learning techniques to improve both visual quality and computational efficiency.

The first challenge is image compression, which aims to minimize the storage and transmission cost of images while preserving visual quality. Recent methods based on convolutional autoencoders have shown superior results, but their reliance on complex entropy models for predicting the probability distribution of each feature leads to higher computational costs. This work proposes a simplified compression scheme that uses a single convolutional autoencoder with a set of multiple learned prior distributions stored in static tables of cumulative distribution functions, as an alternative to computationally expensive single-feature parametric priors. During inference, these static priors allow the entropy coder to efficiently compress spatial features across all channels. The proposed method achieves comparable rate-distortion performance to other state-of-the-art models, while significantly reducing entropy coding and decoding complexity.

The second challenge, image denoising, involves the removal of unwanted noise from images captured under suboptimal conditions. Noise not only degrades image quality but also impairs the performance of both standard and learned compression algorithms, as it is inherently non-compressible. This thesis proposes a unified model that performs joint denoising and compression. By training the model on noisy-clean image pairs across a wide range of noise levels, it learns to denoise images effectively as part of the compression process, all while maintaining the computational cost of compression alone. This joint approach improves rate-distortion performance compared to compressing noisy images or using separate denoising and compression models. Additionally, the model is capable of producing decompressed images with visual quality superior to that of the noisy uncompressed inputs.

The final part of this thesis focuses on raw input images, specifically Bayer pattern images produced by most digital cameras. It is demonstrated that processing raw or minimally processed images (e.g., debayered and converted to the standard linear Rec. 2020 color profile) offers substantial gains in both compression efficiency and denoising quality compared to working with fully processed sRGB images. Treating Bayer images as 4-channel inputs reduces computational complexity by a factor of four, while also improving compression performance at lower bitrates. Moreover, denoising raw or linear RGB images early in the processing pipeline enables greater generalization. Unlike models trained on processed sRGB images, which perform well only on data processed by specific camera image signal processors (ISPs) or software pipelines, early-stage denoising of raw or linear RGB data ensures compatibility across diverse imaging systems, development software, and processing styles.

To facilitate further research and development, a novel dataset of raw clean-noisy image pairs is introduced, with each scene containing at least one ground-truth clean image paired with multiple noisy versions captured at varying ISO levels and exposure times. This dataset supports academic research and the development of denoising models integrated into the pixel pipeline of image processing software, providing a valuable resource for advancing denoising and compression techniques in real-world applications.
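Not part of the abstract, but to give a concrete feel for the “competition of prior distributions” idea: the toy sketch below (hypothetical code, not my actual implementation) scores one latent channel against K static probability tables and keeps whichever table codes it cheapest, so the decoder only needs the winning index plus the shared tables.

```python
# Toy illustration of prior competition (hypothetical, simplified):
# each latent channel is entropy-coded with whichever of K static
# priors yields the shortest code; the winning prior's index is
# signalled to the decoder as a few bits of side information.
import numpy as np

rng = np.random.default_rng(0)

K = 8                # number of learned prior distributions
num_symbols = 17     # quantized latent values, e.g. [-8, 8] shifted to [0, 16]

# Stand-ins for the learned priors: K probability tables over the alphabet.
# In the real scheme these are trained, then frozen into static CDF tables.
priors = rng.dirichlet(np.ones(num_symbols), size=K)   # shape (K, num_symbols)

def estimated_bits(symbols, pmf):
    """Ideal code length of `symbols` under probability table `pmf`."""
    return -np.log2(pmf[symbols]).sum()

def choose_prior(symbols, priors):
    """Pick the prior that minimizes the estimated code length."""
    costs = [estimated_bits(symbols, p) for p in priors]
    best = int(np.argmin(costs))
    return best, costs[best]

# A fake quantized latent channel in index space [0, num_symbols).
latent = rng.integers(0, num_symbols, size=32 * 32)

idx, bits = choose_prior(latent, priors)
print(f"channel coded with prior {idx}: ~{bits / latent.size:.2f} bits/symbol")
```

The actual entropy coder then uses the selected static CDF table directly, which is what avoids evaluating a parametric prior per feature at coding time.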

List of publications:

[0] B. Brummer and C. De Vleeschouwer, “Natural image noise dataset,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2019 (as part of my master’s thesis work).

[1] B. Brummer and C. De Vleeschouwer, “Adapting JPEG XS gains and priorities to tasks and contents,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2020, pp. 629–633.

[2] B. Brummer and C. De Vleeschouwer, “End-to-end optimized image compression with competition of prior distributions,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2021, pp. 1890–1894.

[3] B. Brummer and C. De Vleeschouwer, “On the importance of denoising when learning to compress images,” in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), January 2023, pp. 2440–2448.

[4] B. Brummer and C. De Vleeschouwer, “Joint learned compression and denoising of raw images,” ongoing work.

7 Likes

Interesting thesis.

A little confused about something in the fourth paragraph of your post, regarding using raw images. In the second sentence you refer to the input as “debayered”, then in the next sentence you refer to “4-channel inputs”. Bear-of-little-brain here understands the result of debayering to be 3-channel RGB; what am I missing?

2 Likes

Thank you :slight_smile:

processing {raw} or {minimally processed images (e.g., debayered and converted to the standard linear Rec. 2020 color profile)}

refers to two different input types (and two different models).
We can either use raw (Bayer) input images and treat them as 4 channels, which reduces complexity by a factor of 4 (the model works at halved H×W dimensions, and debayering/upscaling happens at the end), or we can minimally process the input images so that they are debayered first and use a standard RGB color profile (more computational complexity, but potentially better generalization with respect to different sensors and their unique color profiles, or even X-Trans sensors).
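To make the 4-channel trick concrete, here’s a minimal numpy sketch (a simplification for illustration, not my actual code): a 2×2 RGGB tile becomes one 4-channel pixel, so the network sees half the height and width, i.e. 4× fewer spatial positions.

```python
import numpy as np

def pack_bayer(raw):
    """(H, W) RGGB mosaic -> (4, H/2, W/2) planes: R, G1, G2, B."""
    assert raw.shape[0] % 2 == 0 and raw.shape[1] % 2 == 0
    r  = raw[0::2, 0::2]
    g1 = raw[0::2, 1::2]
    g2 = raw[1::2, 0::2]
    b  = raw[1::2, 1::2]
    return np.stack([r, g1, g2, b])

def unpack_bayer(planes):
    """Inverse: (4, H/2, W/2) planes -> (H, W) RGGB mosaic."""
    _, h, w = planes.shape
    raw = np.empty((2 * h, 2 * w), dtype=planes.dtype)
    raw[0::2, 0::2], raw[0::2, 1::2] = planes[0], planes[1]
    raw[1::2, 0::2], raw[1::2, 1::2] = planes[2], planes[3]
    return raw

raw = np.random.rand(8, 8).astype(np.float32)
assert np.array_equal(unpack_bayer(pack_bayer(raw)), raw)  # lossless round trip
```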

1 Like

Thanks, I feel much better now… :crazy_face:

I don’t think I have the math chops to jury this one.

You are welcome, glad it helped :slight_smile:
I don’t think it’s too math heavy (math is not my strong suit either). The most math-heavy paper would be “End-to-end optimized image compression with competition of prior distributions”, where I had to learn a few things about the cumulative distribution function. But I completely understand, and I appreciate your interest :slight_smile:

@hanatos is probably competent but probably also very busy

great to hear you’re wrapping up! i’m hopelessly incompetent when it comes to neural networks, so i won’t be able to review your work… but i’d be interested in delivering some of your results as open source. i got as far as implementing an integrated GPU/tensor core convolutional neural network that can jointly demosaic and denoise a 24MP image in about 80ms on an RTX 3080.

unfortunately so far it’s at 10/80: about 10x more expensive than the conventional code for 80% of the quality… so whenever you are done working on the thesis and defense i’ll be glad to get some of your input.

2 Likes

Excellent, I look forward to working with you on this! :slight_smile:

1 Like

I would also be interested in reading the thesis when it’s done, so I hope you would consider posting it here. In the meantime, best of luck!

1 Like

Will do :slight_smile: Thank you

2 Likes

Fascinating work, and kudos for dedicating it to FOSS applications!

I have some relevant work experience with neural networks and image processing, but my academic work was in signal processing, so I don’t think I qualify.

2 Likes

nice. this is how far i got, if you want to have a peek already: vkdt/src/pipe/modules/jddcnn at master · hanatos/vkdt · GitHub . i think it’s mostly the training with my synthetic noise that doesn’t give me great results (see the ipython notebook).

2 Likes

Thanks! I definitely can’t play with it at the moment, but I look forward to training models with your architecture and comparing/optimizing denoising and runtime performance.
This is the architecture I have been using for both linear RGB and Bayer denoising: a modified U-Net with a final PixelShuffle upsampling (rough sketch below). I haven’t done any runtime performance testing, but it’s probably nowhere near 80 ms / 24 MP.
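For reference, a rough PyTorch sketch of the general shape I mean (depths and widths here are placeholders, not my actual model): a small U-Net on the packed 4-channel input, with a PixelShuffle head that restores full resolution at the very end.

```python
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, in_ch=4, out_ch=3, width=64):
        super().__init__()
        self.enc1 = block(in_ch, width)
        self.enc2 = block(width, 2 * width)
        self.mid  = block(2 * width, 2 * width)
        self.dec2 = block(4 * width, 2 * width)   # concatenated with enc2 skip
        self.dec1 = block(3 * width, width)       # concatenated with enc1 skip
        self.down = nn.MaxPool2d(2)
        self.up   = nn.Upsample(scale_factor=2, mode="nearest")
        # The head emits out_ch * 4 features; PixelShuffle(2) rearranges them
        # into out_ch channels at 2x resolution, so packed Bayer input comes
        # out as full-size RGB without a separate debayering step.
        self.head = nn.Sequential(nn.Conv2d(width, out_ch * 4, 3, padding=1),
                                  nn.PixelShuffle(2))

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.down(e1))
        m  = self.mid(self.down(e2))
        d2 = self.dec2(torch.cat([self.up(m), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up(d2), e1], dim=1))
        return self.head(d1)

x = torch.randn(1, 4, 64, 64)   # packed RGGB planes from a 128x128 mosaic
print(TinyUNet()(x).shape)      # torch.Size([1, 3, 128, 128])
```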
Synthetic noise typically yields very poor results; it should work much better with my dataset :slight_smile:

I think I have found an external jury member (my previous employer).

4 Likes

awesome, thanks for the pointer. this looks very similar. fewer layers, more channels. i’ll hope for your training data set to make the difference :slight_smile:

1 Like