Hi,
just wanted to report on this issue.
I’ve been shooting mostly with the electronic shutter (ES) lately, even though I knew that noise would be a bit higher than with the mechanical shutter (MS). What I didn’t know is that the difference comes from the bit depth dropping from 14 to 12 in ES to speed up pixel readout.
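To put the 14 vs 12 bit difference in numbers, here is a rough Python sketch. It is only illustrative arithmetic on the bit depths mentioned above, not anything taken from the camera or its firmware, and the helper name `levels` is made up for this example.

```python
# Illustrative arithmetic only: how many distinct raw values each bit depth can encode.
# The 14-bit (MS) and 12-bit (ES) figures are the ones stated in this post.

def levels(bits: int) -> int:
    """Number of distinct values that can be encoded with `bits` bits."""
    return 2 ** bits

ms_levels = levels(14)  # mechanical shutter
es_levels = levels(12)  # electronic shutter
ratio = ms_levels // es_levels

print(f"MS: {ms_levels} levels, ES: {es_levels} levels, ratio: {ratio}x")
```

So a 12-bit file has a quarter of the tonal steps of a 14-bit one. As far as I understand, that coarser quantization would show up mostly in the shadows, which is where I'd expect to see the extra noise.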
Here is a simple test shot of the same scene with the same settings, the only change being the switch from MS to ES. MS (14 bit) is on the left.
EDIT:
I always forget to mention my camera: a Fuji X-T30.