Image Processing – Recommended Reading

elstoc · December 23, 2019, 8:06pm

I’ve been a user of darktable for a few years now and I try to contribute where I can (raising enhancement suggestions, helping people use the software) but, watching @anon41087856’s darktable 3.0 video series (https://www.youtube.com/playlist?list=PL4EYo8VotTsiZLr3BqGeBRj-qYGO63bIv) has whetted my appetite to learn the tool in more depth.

I’ve read the manual a couple of times but a lot of the module parameters really need a bit of underlying knowledge to use properly and I really want to learn more about what they do and, more importantly, how they work. What I really need is more information about the theory behind it all.

Are there any books, lectures, youtube videos etc. that anyone can recommend, more oriented towards post-processing theory (i.e. without being geared towards a specific application)?

So far I’ve found the following…

Digital Photography Lectures by Marc Levoy (CS 178 - Digital Photography)
Exposing Digital Photography Lectures by Dan Armendariz (http://digitalphotography.exposed)
Science for the Curious Photographer by Charles S. Johnson Jr. (https://www.amazon.co.uk/Science-Curious-Photographer-Introduction-Photography/dp/0415793262/)

Can you suggest anything else

chris · December 23, 2019, 8:29pm

As I do not think there is a comprehensive and easy solution, this would be my approach:

I suggest to be part of the dev communication channels: Bug tracker/pull request tracker on github, mailing list, irc, … There’s a lot to learn and devs are all kind enough to answer questions of they come up on one of these channels.
Taking one or the other signal processing lecture may help. First understanding 1d such as audio processing may decrease the steepness of the learning curve. First deterministic such as digital filters, Fourier transform, …, second stochastic signal processing, and maybe then the step to 2d.
Understanding the whole system from a physical perspective may help as well, maybe basic communications lectures that teach communication channels/wave propagation, amplifiers, …, and/or optical communications and/or electronics?

This would be one approach. For sure there are other approaches that dive more or less deep into these topics.

hanatos · December 23, 2019, 8:50pm

maybe also worth a look:

Color Imaging: Fundamentals and Applications
by Erik Reinhard, Erum Arif Khan, Ahmet Oguz Akyuz, Garrett Johnson
CRC Press, 2008

ggbutcher · December 23, 2019, 9:57pm

Oh, deja vu…

I was where you are about 4 years ago; long story, won’t bore you with, but I ended up writing my own processing software and learning at least the rudiments of all this along the way. While doing that, I wrote my own image processing library, and in that I kept a set of, for lack of a better term, “monologues” about the fundamentals of things like resizing, tone curving, and such. They’re comments interspersed through the code, and also included are references to more scholarly work that either or both instructed me or provided actual code for the relevant operation. Here are some links to peruse, no particular order except their occurrence in gimage.cpp:

Convolutions and Sharpening: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub
Convolution Kernels and Blurring: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub
(This business of using matrices to muck with images is intriguing…)

Image Rotation: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub
(You will be mortified at the nature of this transform… )

Curves: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub
(Oriented to the interactive tool, but curves are really the embodiment of a “transfer function” which is what mos image processing boils down to…)

Subtraction and Addition: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub

Tone Mapping: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub
(Kind of a “zoo” of algorithms, I have recently focused on the filmic one…)

White Balance: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub

Demosaic: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub
(Studying the half algorithm is most useful to grokking the concept)

Chromatic Saturation: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub
(Probably of no use except to illustrate a rather arcane way to do color saturation)

Exposure Compensation: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub

GrayScaling: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub

Denoising:
NLMeans: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub
dcraw Wavelet: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub
(You’ll probably only get a headache reading this code, but the few words up front might shape a perspective…)

Resizing: rawproc/gimage.cpp at master · butcherg/rawproc · GitHub
(This one introduces interpolation, which is used in a bunch of other places where some aspect of an image have to be transformed (not just its size…)

Definitely, For What It’s Worth…

elstoc · December 23, 2019, 10:06pm

Wow. Thanks @ggbutcher. This will keep me busy for a while!

snibgo · December 23, 2019, 10:51pm

For a book, I suggest “Digital Image Processing”, Gonzalez and Woods.

pphoto · December 23, 2019, 11:00pm

Found this website with many good articles about image processing in digital cameras.
Just click through the posts.

elstoc · December 23, 2019, 11:09pm

Thanks. Just started watching a Digital Image Processing course (https://www.youtube.com/playlist?list=PLuh62Q4Sv7BUf60vkjePfcOQc8sHxmnDX) which has this as its recommended textbook. Though based on what @chris suggested I maybe should have started with the Digital Signal Processing lectures (https://www.youtube.com/playlist?list=PLuh62Q4Sv7BUSzx5Jr8Wrxxn-U10qG1et).

afre · December 23, 2019, 11:39pm

The key is to read widely and participate as much. I have been here maybe 3 years? Started out knowing almost nothing. But I have gotten the image processing bug, and it isn’t going away!

Jossie · December 24, 2019, 8:31am

To learn theory and basic operations I can highly recommend the book by Burger & Burge (2016) “Digital image processing”.

Hermann-Josef

elstoc · December 26, 2019, 3:32pm

Thanks all for your suggestions. I got part through the Digital Image Processing course above and, a couple of lectures in, realised there was a series of prerequisites. I’m therefore now working through a a set of 3 lecture series from Rich Radke, in the following order:

Engineering Probability (https://www.youtube.com/playlist?list=PLuh62Q4Sv7BU1dN2G6ncyiMbML7OXh_Jx)
Digital Signal Processing (https://www.youtube.com/playlist?list=PLuh62Q4Sv7BUSzx5Jr8Wrxxn-U10qG1et)
Digital Image Processing (https://www.youtube.com/playlist?list=PLuh62Q4Sv7BUf60vkjePfcOQc8sHxmnDX)

That’ll keep me occupied for a bit!

anon41087856 · December 26, 2019, 4:30pm

Just a few general remarks about ressources, from someone who reads image processing academic papers every week…

Image processing is an area that interests several professions: mathematicians, electrical engineers, computer scientists, colour scientists (psychophysicists ?), and artists. Most of image processing research is aimed at medical imagery (MRI, X-rays), telescopes and microscopes imagery (biology and astronomy), satellite imagery (GPS/cartography), or smartphone-oriented and fully automated photography (auto colour, auto exposure, auto tonemapping, optimised contrast, etc). Only a small subset is aimed at actual artistic photography.

As such, many “groundbreaking” algorithms (denoising, deblurring, etc.) are aimed at monochrome images, and perform quite bad in RGB (creating chromatic aberrations), because they don’t care about the consistency between channels. The are fine, however, for X-rays or radio-telescopes.

Also, mathematicians and electrical engineers tend to be clueless about psychophysics. Colour scientists tend to be sloppy on maths. Computer scientists tend to lack some physics and treat images as random numbers with no relationship with reality. And all of them usually lack chemistry background and treat photography as if it was born with computers, disregarding 160 years of film legacy (except the guys at Kodak and Fuji).

As a consequence, you see people treating RGB data as “colour” or slopily apply colour models without asserting their conditions of validity (the most “funny” being the guys testing the robustness of their demosaicing algos in YUV on white-balanced corrected sRGB files artificially mosaiced from film scans, because you need to go through XYZ first to go to YUV later, but XYZ needs full RGB vectors, meaning it needs the already-demosaiced picture in input - so you demosaic in a colour space that is valid only for already-demoisaiced data).

Coming from mechanical engineering, where we can actually kill people if we turn the corners round on the theory and calculations, all that sloppiness and lack of due-process in image processing is just killing me.

I really wish research teams were more multi-disciplinary and embedded real-life photographers and painters, because the results that pass for “the best” in experimental results are often surprising (PSNR and RMSE don’t make an image guys). It seems everyone is solving local problems at which they look from their narrow specialty, and nobody seems to care about photography as an ordered pipeline.

Back to the topic, using academic ressources, you need to check what the specialty of the guy is and what kind of image processing he does (for cameras, microscopes, telescopes, X-rays, etc. ?). Because there are a few topics that are 100% signal processing (noise & blur), but most of them interleave physics, psychology and ergonomics in non-trivial ways and end-up with some practically unusable Matlab code solving an ill-posed problem.

elstoc · December 26, 2019, 4:47pm

Thanks @anon41087856. Your comments kind of illustrate my problem though. I don’t know how to assess resources because I don’t have enough information to do so. You suggest “checking the speciality of the lecturer” but this is more likely going to lead to me dismissing potentially useful resources than finding better ones. There’s a danger I spend days searching only to dismiss everything I find. It’ll be easy for the steep learning curve to just put me off.

Hence my request for suggestions. Can you recommend some resources that do fit what I need, while also giving me a grounding in the basics that will help me understand what I’m reading.

Right now I’m starting from a degree in Physics (20 years ago) plus some lesser skills in coding so I’m probably pretty much starting from scratch.

Denis_Thibeault · December 26, 2019, 5:01pm

Following this post with strong interest as I would also like to understand what I am doing and why. Thank you.

afre · December 26, 2019, 6:02pm

@elstoc I would say play to your strengths. Start there and branch out.

Each discipline has its blindsides. The interdisciplinary approach was big when I attended school and I enrolled in one of the most cross-disciplinary program that I could find, and I still attended classes that weren’t a part of my curricula to broaden it out even more.

Of course, there are opinions and assertions that are plain wrong but more often than not it is just another perspective of a very complex subject. A good analogy that I think people here would get is colour constancy. Just as colour looks different depending on the lighting situation, the problems of image processing look different depending on your background.

The reason that “most” papers suck is that people just want to get their Masters degree done. Yes, it is the fault of the supervisors and journals for encouraging and publishing misinformation. But my point is that many people approach their subject as a means to an end, which doesn’t help the discipline or contribute to our collective knowledge base coming from research.

ggbutcher · December 26, 2019, 6:42pm

The scholarly work can be a bit tedious to plow through. I’ve come at all this a bit sideways; disclosure: while I have three university degrees, latest is a doctorate in computer science, I have only had four math courses in the whole thing. Long story, sums up to the ‘there are more ways than one to skin a cat’ adage… So, anything that smacks of signal processing is literally Greek to me; as a computer programmer I essentially look for the summations and think “for-loop”…

There are some fundamental mechanics to understand that help. Numeric representation of image data is foundational, and can trip one up if not considered in the overall mechanism. A lot of the papers I’ve read devolve to the 0-255 8-bit range between black and white for their examples, but that really doesn’t help to understand the dynamics of 11-, 12-, and 14-bit resolutions of current sensors delivered in 16-bit containers and often converted to a floating point representation for the precision. G’MIC’s floating point representation 0.0-255.0 had me tearing my hair out a few years ago, as I’d been condition to the 0.0 - 1.0 convention used by a lot of other software.

There’s nothing like studying implementing code. Nothing that can help better. And, nothing that can vex you more. Most of what I’ve come to understand about raw processing came from studying David Coffin’s dcraw.c, but his C code is bewilderingly “Elegant”. @Elle Stone has a really great web page on it, dcraw annotated & outlined, the entire source annotated from the perspective of a non-programmer. Hat’s off to here from bear-of-little-brain here… Another reference that ties into dcraw.c, from the perspective of the Canon CR2 format: Inside the Canon RAW format version 2, understanding .CR2 file format and files produced by Canon EOS Digital Camera

Hope all this is helping…

afre · December 26, 2019, 7:06pm

Yep, code is the practical implementation of what is written. I believe that @Elle isn’t a programmer by training so her code is very hack-y but as such she knows how to communicate well to the rest of us what is going on. Keep in mind her articles are out-of-date but we can still glean a lot from her.

guillermoluijk.com is awesome too. He inspired lots of people and code.

anon41087856 · December 26, 2019, 9:09pm

What is it that you want to understand ? A sensible way to push pixels for photographers ? The theoretical background of algos used in darktable ? All of it ?

I learned everything the hard way… My original trade is solving heat-transfer PDE by finite elements. So, using my maths and some epistemology background, I binge-watched Google Scholar and tried to reproduce the results (lot of papers come with Matlab code) until things started to make sense. After a while, just looking at the equations (good maths are elegant) and the results (halos mean bad algorithms in bad colour spaces), with years of practical photo editing experience, you can recognize good ideas from geeky BS.

Then, all you have is your judgement. Whenever people squeeze in perceptual frameworks to solve 100% signal processing issues (interpolation, denoising, reconstruction), you know it stinks. There is a lot of BS in academia these days, people pushed to publish whatever, low-rank universities going to image processing because it doesn’t cost much (you don’t need to build a super hadron collider to do research in image processing, that’s for sure), so at the end your judgment is key anyway. It really falls-back to straight epistemology.

elstoc · December 26, 2019, 9:34pm

This is the key question isn’t it? I’m not sure I have a good answer, but I can give you a brief flow of consciousness to try and give you an idea… My initial motivation is to understand complex darktable modules on a deeper level than ‘push the sliders until the image looks good’. So that when, for example, I look at the new (darktable 3, non-local-means-) version of profiled denoise, I can understand what’s happening when I move each of the 7 sliders and to choose which one to adjust when and why (the auto mode seems a bit like cheating to me!). The manual is a good starting point but, being written by people who know the subject inside-out, can often assume knowledge that the average photographer doesn’t have, and at some point I usually start to get a bit lost and revert back to ‘play with the sliders and see what happens’. Transferring the information in the manual to a reasonable workflow is sometimes hard to do. Some of the theoretical background would probably help me to do this in a less ‘random’ way and allow to adapt my workflow based on the image and my intent.

Ultimately I would love to contribute back to the project (to coding/testing or to the user manual) but to have a workflow that I’m fully in control of (because I understand it) is my initial goal.

However, I’m basically interested in everything, so I’m happy to get pulled down some academic rabbit-holes on the way.

paperdigits · December 26, 2019, 9:40pm

Here is the chance for you to help solve what you’ve railed against for some time, photographers who don’t know the difference between the tip of their lenses and the holes in their ass