Samsung Nona-Cell CFAs

I’ve been looking into the state of cameras in mobile phones again, and the sensor of the upcoming Samsung S20 Ultra looks fairly interesting. Its size is approaching that of my RX100, the smallest-sensor camera I’ve actually enjoyed using (and processing).

The Samsung sensor seems to have an interesting color filter arrangement:

So if I read around the marketing fluff correctly, it’s a standard Bayer arrangement, but with each pixel subdivided into 9 pixels (sharing the same color filter).
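To make that concrete, here is a minimal sketch of what I think the layout looks like (my own guess from the marketing material, not a published spec), assuming the 3x3 blocks follow a normal RGGB Bayer super-pattern:

```python
import numpy as np

def nonacell_cfa(height, width):
    """Color index (0=R, 1=G, 2=B) per pixel of a nona-cell CFA, assuming
    3x3 blocks of identical filters laid out in an RGGB Bayer super-pattern
    (my reading of the marketing material, not a spec)."""
    bayer = np.array([[0, 1],   # R G
                      [1, 2]])  # G B
    # which 3x3 block a pixel belongs to, then that block's Bayer color
    by = (np.arange(height) // 3) % 2
    bx = (np.arange(width) // 3) % 2
    return bayer[by[:, None], bx[None, :]]

print(nonacell_cfa(12, 12))  # one full 12x12 period of the pattern
```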

I can see how such an arrangement could be a nice compromise. It should make it relatively easy to get decent color information in low-light conditions while still being able to get decent spatial resolution using a super-resolution approach like in the Google Pixel phones.

They then mention a process called remosaicing where they seem to turn the fully demosaiced image back into a mosaiced image at full resolution but I can’t come up with any reason why anyone would want that. :slight_smile:

Does anyone know a bit more about the tradeoffs of this color filter arrangement than the marketing fluff linked above? Do we have support/algorithms for dealing with these raws in any open source package?


Without doing any research into this, I am guessing that it is straightforward to remosaic. Doing so gives the user more resolution to use however he or she sees fit, in advanced or conventional ways.

Why nona instead of tetra is easy to explain as well. The smallest radius for a normal filtering kernel is r=1, which gives you 3x3. I find a 3x3 kernel faster and more natural to use in my own G’MIC processing than an even-sized one or a larger odd one.

very interesting topic, i was wondering about this too.

apparently what they call “tetracell” here has been standard for a while now. to tell the truth i don’t really see how this is a great idea. i think the 2x2 (or 3x3) pixels of the same colour can be switched to different ISO, so you’ll effectively get like one HDR sample for a single colour out of the whole block. of course you can then go ahead and demosaic as usual, at lower resolution (which means in this case you get an honest resolution that is 6x6 lower than the actual pixel count, or 3x3 lower than what we currently get with bayer).

the superresolution algorithm works by attaching anisotropic gaussians at every sample location, regardless where it comes from (remapped second exposure, bayer, xtrans, whatever). i have some code that does something similar, and indeed it works both with bayer and xtrans, and is surprisingly simple and has okay quality. could probably be extended to work with this layout too very easily. to get the blue information over the image plane without gaps you’d probably want gaussians of at least 6x6 radius here, right?
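to make the splatting idea concrete, here’s a rough sketch of just the accumulation step, with isotropic gaussians and invented names for simplicity; the actual papers (and my code) put anisotropic kernels and robustness weights on top of this:

```python
import numpy as np

def splat_samples(samples, out_shape, sigma=1.5):
    """kernel-regression style accumulation: every raw sample adds a
    gaussian-weighted vote to nearby output pixels of its own colour channel.
    samples: iterable of (y, x, channel, value); out_shape: (height, width).
    isotropic gaussians only; the real thing replaces sigma by a per-pixel
    anisotropic covariance derived from local image structure."""
    h, w = out_shape
    num = np.zeros((h, w, 3))
    den = np.zeros((h, w, 3))
    r = int(np.ceil(3 * sigma))
    for y, x, c, v in samples:
        y0, y1 = max(0, int(y) - r), min(h, int(y) + r + 1)
        x0, x1 = max(0, int(x) - r), min(w, int(x) + r + 1)
        yy, xx = np.mgrid[y0:y1, x0:x1]
        wgt = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma ** 2))
        num[y0:y1, x0:x1, c] += wgt * v
        den[y0:y1, x0:x1, c] += wgt
    return num / np.maximum(den, 1e-6)
```

with blue samples up to ~6 pixels apart in the nona layout, the kernel support would indeed have to be on that order so the denominator never drops to zero between blue blocks.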

i see how this may help dynamic range, not so sure about resolution. probably easier to build a CFA with one big absorption filter over 3x3 pixels instead of one with a finer mosaic (say xtrans style), potentially less issues with interference/cross talk.

I just looked at the Samsung page, and it appears they’re saying the sensels are “re-binned” to accommodate different lighting. Implies they’re not using a color dye-based CFA on top of the sensel array, as a sensel in different binnings could be R, G, or B…

where did you find that exactly? sure they don’t mean different sensitivity by that? i.e. there is still the CFA/absorption filter for colour, but then you can get different full well capacities for each of the individual pixels under it.

This diagram indicates that the different binnings are used with different light situations…

No words that describe that specifically, just positing…

My interpretation is that they are using the nonacell config and remosaicing the 3x3 for good conditions. In any case, I think it would be easy to find out if we did a patent search. Edit: Took a quick look just now. I could find a paper behind a paywall (can’t access) that talks about the older 2x2 sensors: https://doi.org/10.2352/J.ImagingSci.Technol.2019.63.6.060410. Using 3x3 instead of 2x2 might overcome some of the obstacles discussed.

Abstract

Latest trend in image sensor technology allowing submicron pixel size for high-end mobile devices comes at very high image resolutions and with irregularly sampled Quad Bayer color filter array (CFA). Sustaining image quality becomes a challenge for the image signal processor (ISP), namely for demosaicing. Inspired by the success of deep learning approach to standard Bayer demosaicing, we aim to investigate how artifacts-prone Quad Bayer array can benefit from it. We found that deeper networks are capable to improve image quality and reduce artifacts; however, deeper networks can be hardly deployed on mobile devices given very high image resolutions: 24MP, 36MP, 48MP. In this article, we propose an efficient end-to-end solution to bridge this gap—a duplex pyramid network (DPN). Deep hierarchical structure, residual learning, and linear feature map depth growth allow very large receptive field, yielding better details restoration and artifacts reduction, while staying computationally efficient. Experiments show that the proposed network outperforms state of the art for standard and Quad Bayer demosaicing. For the challenging Quad Bayer CFA, the proposed method reduces visual artifacts better than state-of-the-art deep networks including artifacts existing in conventional commercial solutions. While superior in image quality, it is 2‐25 times faster than state-of-the-art deep neural networks and therefore feasible for deployment on mobile devices, paving the way for a new era of on-device deep ISPs.

Finally reading the copy, it is unclear what they are doing to exposure times and ISO exactly. They are likely being deliberately vague about it.

For example, the HM1’s Smart-ISO technology produces vivid and vibrant images by intelligently selecting the optimal ISO. High ISOs are used in darker settings while low ISOs are better for brighter environments to control light saturation.

In challenging mixed-light environments for photo-taking, the HM1’s real-time HDR technology optimizes exposures, producing more natural looking videos and still photographs. By assigning the most appropriate exposure lengths to each pixel, the HM1 is able to capture scenes in multiple exposures simultaneously.

This really interests me, but I don’t yet understand why the gaussians for each sample would need to be anisotropic. At least in my limited understanding, the samples are drawn from microlenses that look like something between hemispheres and squares, which should be isotropic. If you have some pointers on where to find more information, I’d be very interested.

I saw that too but my suspicion is that that’s just a ‘marketing’ image. I’d be surprised if they could actually reconfigure the CFA. :slight_smile:

i thought you were referring to this paper with the superresolution above:

they use anisotropic gaussian kernels (see fig. 2 for overview, fig. 6 for the pretty artifacts it produces, sec. 5.1.2 how to compute covariance… etc, it’s pretty clear i think).

for the reconstruction you want anisotropic because your CFA is incomplete. while incoming lighting would potentially result in something close to an isotropic blotch on a pixel after passing through a CFA lenslet, you don’t know whether the gaps you have in the blue channel actually just blur whatever you’ve seen at your sample points or whether there is an edge going through. so you fit gaussians to structure in the image and then use this for the interpolation.
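roughly how the anisotropy is derived (my own paraphrase of sec. 5.1, with placeholder scalings rather than the paper’s exact tuning): eigen-decompose the local structure tensor, then make the gaussian narrow across strong gradients and stretched along the edge.

```python
import numpy as np

def kernel_covariance(gradients, k_detail=0.25, k_stretch=4.0):
    """anisotropic gaussian covariance from local gradients.
    gradients: (N, 2) array of (gy, gx) sampled around an output pixel.
    the mapping from eigenvalues to kernel widths below is a placeholder,
    not the paper's exact tuning."""
    g = np.asarray(gradients, dtype=float)
    st = g.T @ g / len(g)                    # 2x2 structure tensor
    evals, evecs = np.linalg.eigh(st)        # eigenvalues ascending
    lam_edge, lam_grad = evals               # small: along edge, big: across
    k_across = k_detail / (1.0 + np.sqrt(lam_grad))   # narrow across the edge
    k_along = min(k_stretch * k_across, 3.0)          # stretched along it
    widths = np.diag([k_along ** 2, k_across ** 2])   # evecs[:, 0] = edge dir
    return evecs @ widths @ evecs.T

def sample_weight(offset, cov):
    """gaussian weight of a raw sample at offset (dy, dx) from the output pixel."""
    offset = np.asarray(offset, dtype=float)
    return float(np.exp(-0.5 * offset @ np.linalg.solve(cov, offset)))
```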


The FWC of the individual pixel cannot be changed, but when you group (‘bin’) them (say 2x2 or 3x3 together), you get the increased FWC of the larger ‘super-pixel’ while only paying the read-noise penalty of a single pixel, if the binning is performed in the analog (charge) domain:

https://harvestimaging.com/blog/?p=1560
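A back-of-envelope comparison (made-up numbers, just to illustrate the point) of charge-domain binning versus summing the individual readouts digitally:

```python
import numpy as np

# back-of-envelope SNR for a 3x3 bin in low light, example numbers only
signal = 9 * 20.0      # total photo-electrons collected by the block
read_noise = 2.0       # electrons RMS per readout
n = 9                  # pixels binned

shot_noise = np.sqrt(signal)

# charge-domain binning: charge summed first, read out once
snr_charge = signal / np.hypot(shot_noise, read_noise)

# digital summation: each pixel read separately, read noise adds in quadrature
snr_digital = signal / np.hypot(shot_noise, np.sqrt(n) * read_noise)

print(f"charge binning SNR: {snr_charge:.1f}")   # ~ 13.3
print(f"digital sum SNR:    {snr_digital:.1f}")  # ~ 12.2
```

Same photon count in both cases; the only difference is whether you pay the read noise once or nine times.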

So this would indeed make a nice, always on hand pocket camera for capturing decent 12Mpix raws :wink:

Interesting, I didn’t consider binning the pixels in hardware, but it makes sense that it would lower the read noise.

After thinking this through a bit more, I wonder how they low-pass the image before sampling it with binning. 12 Mpix without any optical low-pass filter sounds like a lot of aliasing. Maybe they ‘hack it’ using the optical stabilization system to vibrate the sensor a bit. :slight_smile:

Anyways I guess we’ll have to wait and see how it actually works and performs.

i really find the linked paper above insightful… sounds like these quad bayer sensors cause more trouble than good for the software trying to make sense of them… i mean: “In this section, we show that Quad Bayer CFA has more aliasing than standard Bayer CFA.”

also i doubt binning is better than just using one big pixel. one pixel has got to catch more photons, also it has 1x readout noise with presumably a simpler circuit attached to it.

so unless someone convinces me otherwise i’ll store these sensors in the “marketing bs” drawer of my brain… i mean if we’re doing esoteric things i think x-trans is very good. and the cheaper manufacturing argument sounds more like mobile phones than progress to me.

So does this mean it is generating so-called Super-pixels out of the regular quad Bayer?

@Jonas_Wagner
You don’t lower the read noise per se, you increase SNR by adding the FWCs.

@hanatos
Can you please share a link where it is claimed that binning is better than a single big pixel?! I think the whole point here is that with this quad and nona binning of 0.8 µm pixels you can get better dynamic range than the now-mature 12 Mpix of single 1.25/1.4 µm pixels, not that it would be better than the equivalent single large 1.6 µm or 2.4 µm pixel. But I agree the jury is still out on the supposed benefits of the higher-resolution mode and the HDR mode, though algorithms are getting better by the day…