Ingo, the xy arrangement is as you describe, but there is no depth difference. The a/b pair sits under a single microlens, so each of a and b sees only a part of the aperture (pupil). For the OOF areas, each one therefore sees (and records) a different picture.
Try covering one half of the back of your lens, then the other half, without changing anything else.
Bobn2 illustrated this on the DPReview forums, but I cannot find the posts.
The result shows up as the displacement in the gif I linked.
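
To make the displacement concrete, here is a rough 1-D sketch in Python. It is only a geometric toy model, not how the camera or any raw converter works: the function names, the 41-pixel strip and the "blur disc split in two halves" assumption are all mine. A point source is imaged through each half of the pupil; in focus, both halves land on the same pixel, while out of focus each half fills only one side of the blur disc, so the a and b pictures shift apart.

```python
def half_aperture_images(defocus_px, width=41):
    """Toy 1-D image of a point source seen through each half of the pupil.

    defocus_px: blur-disc diameter in pixels (0 = in focus);
                the sign stands for front vs. back focus (assumed convention).
    Returns (a, b): intensity lists for the two sub-pixel images.
    """
    centre = width // 2
    a = [0.0] * width
    b = [0.0] * width
    half = abs(defocus_px) // 2
    sign = 1 if defocus_px >= 0 else -1
    # Assumption: each pupil half fills exactly one half of the blur disc.
    for k in range(half + 1):
        a[centre - sign * k] += 1.0 / (half + 1)
        b[centre + sign * k] += 1.0 / (half + 1)
    return a, b


def centroid(img):
    """Intensity-weighted mean position of a 1-D image."""
    total = sum(img)
    return sum(i * v for i, v in enumerate(img)) / total


def displacement(defocus_px):
    """Shift between the b and a images, in pixels."""
    a, b = half_aperture_images(defocus_px)
    return centroid(b) - centroid(a)


if __name__ == "__main__":
    for d in (0, 10, -10):
        print(d, displacement(d))
```

In this model the in-focus case gives no shift, a 10 px blur disc gives a shift of about 5 px, and flipping the sign of the defocus flips the direction of the shift, which is the same behaviour you get from covering one half of the lens and then the other.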
If the pixels were normal (just not square, like your old Nikon), we could get double the horizontal resolution from the 5D4.