This AI Learned To See In The Dark | Two Minute Papers #253

Wowee…

The paper “Learning to See in the Dark” and its source code are available here:
http://web.engr.illinois.edu/~cchen156/SID.html

I wonder if something like this could find its way into RawTherapee and darktable?


PS: first time posting here, so hopefully this is the correct section. :confused:
4 Likes

Keep in mind that all these deep learning techniques use a huge amount of memory.
The paper tells us they use a U-Net architecture to generate the result, and the page https://spark-in.me/post/unet-adventures-part-one-getting-acquainted-with-unet says a bit more about how much GPU memory such an architecture needs:

Relatively high GPU memory footprint for larger images:

  • 640x959 image => you can fit 4-8 images in one batch with 6GB GPU;
  • 640x959 image => you can fit 8-16 images in one batch with 12GB GPU;
  • 1280x1918 image => you can fit 1-2 images in one batch with 12GB GPU;

12 GB to process a 1280x1918 image…
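
Back-of-envelope, most of that goes into the intermediate feature maps rather than the weights themselves. Here is a rough sketch, assuming float32 activations and 32 channels at the first U-Net level (illustrative numbers, not taken from the paper):

```python
# Rough back-of-envelope for U-Net activation memory (illustrative numbers).
# Assumes float32 activations and 32 feature channels at the first level;
# real implementations differ, and training additionally stores gradients.

def feature_map_mb(height, width, channels, bytes_per_value=4):
    """Memory for one feature map of one image, in megabytes."""
    return height * width * channels * bytes_per_value / 1024 ** 2

h, w = 1280, 1918
first_level = feature_map_mb(h, w, 32)  # roughly 300 MB for a single layer
print(f"one 32-channel map at {w}x{h}: {first_level:.0f} MB")

# A U-Net keeps several such maps alive at once (encoder skips + decoder),
# so a handful of them per image already adds up to gigabytes.
batch = 2
kept_maps = 10  # crude guess at how many full/half-resolution maps coexist
print(f"batch of {batch}, ~{kept_maps} maps: "
      f"{batch * kept_maps * first_level / 1024:.1f} GB")
```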

So with my 24 MP Nikon, I'd only need something like 128 GB of GPU RAM? :stuck_out_tongue_winking_eye:

Probably more, as the memory a net needs usually grows more than linearly with the input resolution of the image… Of course, the image can be split into tiles, processed, then merged back (rough sketch below).
I just wanted to make sure people realize that deep nets are indeed nice, but always require an insane amount of memory to run. That's why we often see these algorithms implemented only as external web services rather than running on the client's computer.
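
Something like this, as a minimal NumPy sketch (the "network" here is just a dummy brightening function standing in for the real model; a real pipeline would need padded/overlapping tiles to hide seam artifacts at the boundaries):

```python
# Minimal sketch of tiled processing with NumPy (no overlap handling,
# so real use would need padded/overlapping tiles to avoid visible seams).
import numpy as np

def process_in_tiles(image, net, tile=512):
    """Run `net` (any HxWxC -> HxWxC callable) over an image tile by tile."""
    h, w, c = image.shape
    out = np.zeros_like(image)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            patch = image[y:y + tile, x:x + tile]
            out[y:y + tile, x:x + tile] = net(patch)
    return out

# Example with a dummy "network" that just brightens the patch:
img = np.random.rand(1280, 1918, 3).astype(np.float32)
result = process_in_tiles(img, net=lambda p: np.clip(p * 4.0, 0, 1))
```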

1 Like

Or the GPU memory could be handled as a (tile) cache. The point is fair enough, though. Future-tech!

Utilizing Optane for swap in such situations could also be interesting…

I have a bad memory and I can see in the dark. :wink:

I skimmed the paper. It is still unclear to me what the FCN does. I know it is trained on pairs of input and reference images, but how do we get from that to a processing pipeline? I guess that is something a 10-page summary won't discuss.

As far as I can tell, the idea is to train per specific camera model so it learns the noise patterns specific to that camera's sensor and derives information from that. But then I'm not a programmer, a glorified script kiddie at best.

You could also check out the GitHub repo; it's MIT licensed. It does have TensorFlow as a requirement, though.
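
If I read the paper right, the network replaces the whole traditional raw pipeline: the Bayer mosaic is packed into 4 half-resolution channels, normalized, amplified by the exposure ratio, and the U-Net directly outputs the processed sRGB image. Roughly like this NumPy-only sketch of the preprocessing (the black/white levels and ratio are illustrative; the actual repo does this with rawpy before feeding the TensorFlow network):

```python
# Rough NumPy-only sketch of the preprocessing described in the paper
# (black_level, white_level and ratio are illustrative values).
import numpy as np

def pack_bayer(raw, black_level=512, white_level=16383, ratio=100.0):
    """Pack a 2D Bayer mosaic into a 4-channel half-resolution tensor."""
    norm = np.maximum(raw.astype(np.float32) - black_level, 0)
    norm /= (white_level - black_level)
    # Channel order depends on the sensor's CFA pattern (RGGB assumed here).
    packed = np.stack([norm[0::2, 0::2],   # R
                       norm[0::2, 1::2],   # G1
                       norm[1::2, 0::2],   # G2
                       norm[1::2, 1::2]],  # B
                      axis=-1)
    return packed * ratio  # amplify the dark short exposure

# raw = load it from the camera file (the repo uses rawpy for this)
raw = np.random.randint(512, 16383, size=(2848, 4256), dtype=np.uint16)
x = pack_bayer(raw)  # -> input to the fully convolutional U-Net
# The network's output is the final sRGB image: demosaicing, denoising,
# white balance and tone mapping are all learned end-to-end.
```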

@afre

1 Like

I pictured the same thing but not the underpants part. :underage: I didn’t get the reference because I haven’t watched South Park. Ref: Gnomes (South Park) - Wikipedia.

It would be cool if we could link to an online supercomputer to do the heavy work. G'MIC can already communicate online to retrieve information; maybe one day, if David can get hold of a supercomputer, he can use it to do the calculations and then pass the result on to G'MIC. Who knows. :slight_smile:

@lylejk Unsure how super this computer would have to be in order to take requests from, say, 5000 people. :wink:

Yeah; just trying to get your result from the Deep Dream project can take way too long. lolol

:slight_smile:

Are you sure you’re not mixing up the training phase with the network that you run afterwards? Because the latter does not need that much memory IIRC.

The largest part of the memory is used to store the coefficients of the network, which are needed for both training and evaluation. I don't see why training would take a lot more memory than evaluation.

Well, usually when training a network you have to process a lot more data, and you also have to keep the intermediate activations, gradients and optimizer state around for backpropagation. Afterwards, you can prune the network, and all you're left with is said network.
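
A rough back-of-envelope of the difference (the parameter count is illustrative, not the paper's actual U-Net size; Adam's two moment buffers are the standard ones):

```python
# Back-of-envelope: why training takes more memory than inference
# (parameter count is illustrative, not the paper's actual U-Net size).
params = 8_000_000                      # weights of a smallish U-Net
bytes_f32 = 4

weights = params * bytes_f32            # needed for inference too
grads = params * bytes_f32              # training only
adam_moments = 2 * params * bytes_f32   # training only (m and v buffers)

print(f"inference: ~{weights / 1e6:.0f} MB of weights "
      f"(+ activations for one forward pass)")
print(f"training:  ~{(weights + grads + adam_moments) / 1e6:.0f} MB "
      f"before counting the stored activations of every layer,")
print("which dominate at these image sizes and scale with batch size.")
```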