New Canon .CR3 File Specification

Folks:

My friends at darktable have requested that Exiv2 support Canon’s new CR3 file format, which will appear with the EOS M50 mirrorless camera: This is the Canon EOS M50, specifications and image leaked (4K, new raw file format, viewfinder)

I have carried out a preliminary investigation, located sample images and made positive progress. The file is written as ISO BMFF which is well documented and for which I have found code to search/dump the structure of the file. Canon CR3 support. · Issue #236 · Exiv2/exiv2 · GitHub

The file contains a couple of uuid boxes which I have not yet identified. There’s no question that the Exif metadata is embedded in the first of these boxes. I’ve been unable to find a specification for this data and wonder if anyone knows where to obtain the specification.

I have a friend who spent 20 years at Canon and recently retired. However, before I drag my friend into this investigation, perhaps a member of the community knows where to find the specification of these Canon-specific uuid boxes.


Cool! Thanks for investigating. I noticed this development as well. Based on your first impressions, what are the major differences between CR2 and CR3 formats? One or two sentences would do :slight_smile:.

  1. It’s not TIFF, but ISO Base Media Format. That is actually not bad, because of TIFF’s absolute offset nature.
  2. Not sure yet about the actual raw encoding.

The reason I am curious is that from what I have read from news and rumor releases, I cannot decide whether this is a full raw or a compromise of some sort to output smaller file sizes.

I didn’t implement CR2 in Exiv2 (src/cr2image.cpp), so I don’t know the difference.

CR3 is in ISO BMFF format, which is very similar to JPEG2000. However, you should think of ISO BMFF as a “container” specification which Canon will use for their own purposes. Roman (my darktable buddy) is surprised to discover there seems to be an embedded JPEG in there.

More mysteries ahead. Thank You for your encouragement.

Well, the samples we have right now (please contribute to RPU!!!) are all 30…40MBytes, so they should be actual raws. At least i don’t see the point of replacing actual lossless raw compression algorithm with some other lossy compression algorithm, while keeping the same filesize…

Not surprised, it makes sense. But as i’m being corrected, i was looking at the wrong “sub-frame”.
So still not sure what is the actual compression algo. (i’m wishing for JPEG w/arithmetic coding, not some yet another sub-par NIH-fueled homemade algo)

Definitely, esp. when they are naming their new file format CR3!

There’s something that does NOT smell good here. The JPEG at the end is 98.7% of the data and is NOT raw (although it could be losslessly compressed). I’ve identified the mdat with isobmffdump (available on GitHub):

749 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ isobmffdump  ~/Downloads/canon_eos_m50_01.cr3 
@0         | ftyp [24]
@24        | moov [28440]
@32        |   uuid [26216]
@26248     |   mvhd [108]
@26356     |   trak [484]
...
@28464     | uuid [65560]
@94024     | uuid [416007]
@510031    | mdat [38025680]  <------ this is the BIG JPEG
@38535711  | end
750 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $
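
For anyone who wants to reproduce a listing like the one above without building isobmffdump, here is a minimal Python sketch of a box walk at one nesting level. It is not the isobmffdump code. It relies only on the generic ISO BMFF box header: a 32-bit big-endian size, a four-character type, a 64-bit size when the 32-bit field is 1, and "runs to end of file" when it is 0. The name `walk_boxes` and the synthetic sample bytes are made up for illustration, and nothing here interprets Canon's uuid boxes.

```python
import io
import struct


def walk_boxes(stream, start=0, end=None):
    """Yield (offset, type, size, header_len) for each box at one nesting level."""
    offset = start
    while end is None or offset < end:
        stream.seek(offset)
        header = stream.read(8)
        if len(header) < 8:
            return  # ran out of data
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            ext = stream.read(8)
            if len(ext) < 8:
                return
            size = struct.unpack(">Q", ext)[0]
            header_len = 16
        elif size == 0:
            # The box extends to the end of the stream.
            size = stream.seek(0, io.SEEK_END) - offset
        if size < header_len:
            return  # malformed box, stop rather than loop forever
        yield offset, box_type.decode("latin-1"), size, header_len
        offset += size


if __name__ == "__main__":
    # Synthetic two-box stream (NOT a real CR3): a 24-byte ftyp and an mdat
    # that uses the 16-byte header form with a 64-bit size.
    sample = (struct.pack(">I4s", 24, b"ftyp") + bytes(16)
              + struct.pack(">I4sQ", 1, b"mdat", 20) + b"\xff\xd8\xff\xdb")
    for offset, box_type, size, header_len in walk_boxes(io.BytesIO(sample)):
        print(f"@{offset:<10}| {box_type} [{size}] header={header_len}")
```

To descend into a container such as moov, the same function can be called again with `start` set to the box offset plus its header length and `end` set to the box offset plus its size.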

I’ve extracted the mdat/JPEG from the image: 38025680-16 bytes starting at byte 510,031+16 (16 = the header bytes skipped at the start of the box).
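
The same byte range can also be copied with a few lines of Python, reading in large chunks instead of one byte at a time. This is only a sketch: `copy_range` is an illustrative helper name and the 1 MiB chunk size is an arbitrary choice.

```python
def copy_range(src_path, dst_path, offset, length, chunk=1 << 20):
    """Copy `length` bytes starting at `offset` from src to dst; return bytes written."""
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(offset)
        while written < length:
            block = src.read(min(chunk, length - written))
            if not block:
                break  # source ended before `length` bytes were available
            dst.write(block)
            written += len(block)
    return written
```

For the sample discussed here the call would be `copy_range("canon_eos_m50_01.cr3", "foo.jpg", 510031 + 16, 38025680 - 16)`.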

513 rmills@rmillsmbp:~/gnu/github $ dd bs=1 skip=$((510031+16)) count=$((38025680-16)) if=~/Downloads/canon_eos_m50_01.cr3 > foo.jpg
38025664+0 records in
38025664+0 records out
38025664 bytes (38 MB) copied, 418.962 s, 90.8 kB/s
514 rmills@rmillsmbp:~/gnu/github $ exiv2 -pR foo.jpg 
STRUCTURE OF JPEG FILE: foo.jpg
 address | marker       |  length | data
       0 | 0xffd8 SOI  
       2 | 0xffdb DQT   |     132 
     136 | 0xffc0 SOF0  |      17 
     155 | 0xffc4 DHT   |     418 
     575 | 0xffda SOS  
515 rmills@rmillsmbp:~/gnu/github $ ls -alt ~/Downloads/canon_eos_m50_01.cr3 foo.jpg 
-rw-r--r--@ 1 rmills  staff  38025664  4 Mar 18:45 foo.jpg
-rw-r--r--@ 1 rmills  staff  38535711  3 Mar 12:17 /Users/rmills/Downloads/canon_eos_m50_01.cr3
516 rmills@rmillsmbp:~/gnu/github $ calc "38025664/38535711"
0.986764302856641
517 rmills@rmillsmbp:~/gnu/github $
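
The exiv2 -pR listing above can be cross-checked by hand, since each JPEG header segment is a two-byte marker followed by a two-byte big-endian length that counts itself. Below is a rough Python sketch, assuming there are no fill bytes between segments and stopping at SOS as the listing does. `jpeg_segments` and `sof0_info` are illustrative names, not Exiv2 functions.

```python
import struct


def jpeg_segments(data):
    """Yield (offset, marker, length) for header segments up to and including SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    yield 0, 0xFFD8, None  # SOI carries no length field
    pos = 2
    while pos + 4 <= len(data):
        marker, length = struct.unpack_from(">HH", data, pos)
        if marker >> 8 != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        if marker == 0xFFDA:
            yield pos, marker, None  # entropy-coded data follows; stop here
            return
        yield pos, marker, length
        pos += 2 + length  # 2 marker bytes plus the self-inclusive length


def sof0_info(data):
    """Return (precision, height, width, components) from SOF0, or None."""
    for pos, marker, _length in jpeg_segments(data):
        if marker == 0xFFC0:
            return struct.unpack_from(">BHHB", data, pos + 4)
    return None
```

With the offsets in the listing, 2 + 2 + 132 = 136 and 136 + 2 + 17 = 155, which matches where SOF0 and DHT are reported. A SOF0 length of 17 also corresponds to three components (8 fixed bytes plus 3 bytes per component).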

It’s a plain JPEG of 4000x6000 pixels:

522 rmills@rmillsmbp:~/gnu/github $ calc "4000*6000*3/(1024*1024)"
68.66455078125
523 rmills@rmillsmbp:~/gnu/github $

It must be compressed. It isn’t raw! However the compression could be lossless and therefore a raw equivalent.

I believe I read that the CR3 image is 14 bits/channel. There isn’t enough data for 5 bits/channel.
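
The arithmetic behind that last claim, as a quick sketch. It uses only the figures quoted in this thread (4000x6000 pixels and the 38025664-byte payload). The assumption that an uncompressed image packs three samples per pixel at a fixed bit depth is an illustrative one, and `packed_bytes` is just a helper name.

```python
PIXELS = 4000 * 6000          # dimensions quoted above
PAYLOAD_BYTES = 38025664      # size of the extracted mdat payload


def packed_bytes(bits_per_sample, samples_per_pixel=3):
    """Bytes needed to store the image uncompressed at a fixed bit depth."""
    return PIXELS * samples_per_pixel * bits_per_sample // 8


# Average number of bits the payload can spend on each pixel.
bits_per_pixel = PAYLOAD_BYTES * 8 / PIXELS

if __name__ == "__main__":
    print(f"bits available per pixel: {bits_per_pixel:.2f}")
    for bits in (5, 8, 14):
        need = packed_bytes(bits)
        print(f"{bits:2d} bits/channel needs {need} bytes, "
              f"fits in payload: {need <= PAYLOAD_BYTES}")
```

Three channels at 5 bits each would need 45,000,000 bytes, more than the 38,025,664 available, so whatever is stored in there has to be compressed.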

@clanmills do note Canon CR3 support. · Issue #236 · Exiv2/exiv2 · GitHub / Canon CR3 support. · Issue #121 · darktable-org/rawspeed · GitHub

Would Ghidra be of any use? For example to open up DPP or Adobe Raw Converter and check how Adobe and Canon handle CR3?

R6 raw samples at Dpreview now Canon EOS R6 sample gallery: Digital Photography Review