How to analyze a corrupt raw file?

Not my raw file, so we can talk about the methods but not re-upload pictures from the file.

Edit: it was reported that the user had malware before; see the thread "Cannot Open CR2 Raw Photo" in the Beginners Questions Forum on Digital Photography Review.

I loaded the file in darktable with darktable -d perf and saw that Rawspeed rejected it.
Exiv2 and ExifTool also reported an unknown file type. I couldn’t even extract the embedded JPEG with dcraw. So what is the best method to analyze what is wrong?

I notice that that DPR forum user has had a virus problem in the past… maybe should be careful how one handles that file? :woozy_face:


Hope it was a Windows one…


Well, he was using Windows… I didn’t read the whole thread. It seemed a bit vague actually. Are there any/many Linux or Mac viruses these days? I guess I’m being vague now :slightly_smiling_face:

Perhaps the raw file was encrypted by malware.

Malware encryption programs tend to make their presence known, to extort ransom…
Then again, that should be very quick to spot with a binary editor, esp. when you have a valid CR2 file to compare with.

But from the information available, we can only guess about the cause(s) of those files being unreadable…

Oh, and there’s no reason to believe Linux or macOS are more resistant to malware attacks than Windows, as the weakest part in the defenses tends to be the wetware… But as they have a much smaller market share, they are perhaps a less attractive target.

At least VirusTotal didn’t flag the file.

Depends on where the corruption came from. Was it corrupted by the camera, or after the file was stored?
Corruption in the camera could occur in two ways: a hardware issue that damaged the file, or damage that happens while storing it on the flash memory. On the computer side, the hard drive could be failing and the file was corrupted there. Is there more info about where the file corruption happened?

From what that user wrote, it seems to be the hard drive.

Then you really need to use a disk repair utility. It may be able to recover the data for you. At worst, that particular file may have one or more unrecoverable blocks. Is it spinning rust or an SSD? How big is the drive? And how much time can you devote to recovery if this is your only system?

With a hex browser:

CR2 is actually an extension of TIFF, so a CR2 file starts with a TIFF header. This means that the first 4 bytes are either 49 49 2A 00 or 4D 4D 00 2A. If you browse the file with a hex editor there are some recognizable things in the first bytes of the file:

If you see completely random junk or swathes of zeroes instead, the file is probably FUBAR.
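The header check above can also be scripted. This is only a minimal sketch: the function name `describe_header` is made up for illustration, and it relies on the fact that CR2 files carry the ASCII letters "CR" plus a version number at offsets 8 to 11, right after the 8-byte TIFF header.

```python
import struct

def describe_header(data: bytes) -> str:
    """Classify the first bytes of a file as CR2, plain TIFF, or neither."""
    if len(data) < 8:
        return "too short"
    order = data[:2]
    if order == b"II":        # 49 49: little-endian ("Intel") byte order
        prefix = "<"
    elif order == b"MM":      # 4D 4D: big-endian ("Motorola") byte order
        prefix = ">"
    else:
        return "not TIFF"
    # Bytes 2..3 hold the TIFF magic number 42, bytes 4..7 the first IFD offset.
    magic, ifd_offset = struct.unpack(prefix + "HI", data[2:8])
    if magic != 42:
        return "not TIFF"
    # CR2 signature: "CR" followed by major and minor version bytes.
    if len(data) >= 12 and data[8:10] == b"CR":
        return f"CR2 v{data[10]}.{data[11]}, first IFD at {ifd_offset}"
    return f"plain TIFF, first IFD at {ifd_offset}"
```

To use it, read the first 16 bytes of the suspect file in binary mode and pass them in, e.g. `describe_header(open(path, "rb").read(16))` where `path` is whatever file you are checking.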

With tools you probably have at home:

  • The file should of course have a reasonable size for the camera (around 25-30 MB for my 20Mpx camera).
  • A good CR2 file doesn’t compress well. ZIPping a CR2 file yields a ZIP of about the same size (98-99%). If the ZIP file is significantly smaller, then the data in the file isn’t as random as it should be (probably long sequences of zeroes) and likely invalid and not worth recovering.
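The ZIP test can be approximated without creating an archive. A rough sketch, assuming zlib's DEFLATE behaves closely enough to a typical ZIP tool for this purpose; the helper name `compression_ratio` is hypothetical, and where you draw the line between "good" and "bad" is a judgment call:

```python
import zlib

def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (1.0 = no gain)."""
    if not data:
        raise ValueError("empty input")
    return len(zlib.compress(data, 6)) / len(data)
```

A healthy raw file should come out close to 1.0 (the 98-99% mentioned above), while a file padded with long runs of zeroes drops well below that.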

Didn’t compress well.

And the size is about what you would expect for a RAW file from that camera?

The raw file is downloadable in the link. 23MB. I don’t know the camera model.

Weird… the CR2 header(*) can be found deep down in the file (after about 20M of data):

Of course one can copy the tail of the file (from the TIFF II* marker to the end), but this makes a RAW file of 2.5MB, a bit small for the output of a 700D.

This looks like a filesystem corruption. If the disk hasn’t been tinkered with, the rest of the data could possibly be in the sectors following those of the end of the file, but in the general case they could be anywhere, in one or more pieces, with some pieces already overwritten by later disk activity.

(*) In truth, not the file header but a file header. Maybe the data the owner is looking for is what is before that header, and is missing its header, which could be somewhere else on the disk, and perhaps have been included at the tail of another file. But looking for this would require a dump of the whole media.
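Looking for a stray header like this doesn't have to be done by eye. A small sketch, assuming little-endian CR2 data (the `II*` marker followed by "CR" eight bytes after the start of the header); `find_cr2_headers` is a name invented here:

```python
def find_cr2_headers(data: bytes) -> list[int]:
    """Return every offset where a little-endian CR2 header appears to start."""
    offsets = []
    pos = data.find(b"II*\x00")
    while pos != -1:
        # A bare "II*\0" can occur by chance; require the "CR" signature too.
        if data[pos + 8 : pos + 10] == b"CR":
            offsets.append(pos)
        pos = data.find(b"II*\x00", pos + 1)
    return offsets
```

On an intact file this should return `[0]`. Any other offset marks a candidate start for copying out the tail with `data[offset:]`. The same scan works on a raw dump of the whole media, memory permitting.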


I could read EF-S 24 mm and firmware version. Couldn’t find any embedded JPEG.

Maybe look harder:

image

The FFxx are JPEG format markers. And the FFD8 (“Start Of Image”) right after where something else terminates can’t be a coincidence.
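The same search can be scripted. A minimal sketch: it looks for FF D8 immediately followed by another FF, since a JPEG stream normally continues with a further marker right after Start Of Image, which cuts down on chance hits. The function name `find_jpeg_starts` is just for illustration.

```python
def find_jpeg_starts(data: bytes) -> list[int]:
    """Return offsets where FF D8 is directly followed by another FF marker byte."""
    signature = b"\xff\xd8\xff"
    starts = []
    pos = data.find(signature)
    while pos != -1:
        starts.append(pos)
        pos = data.find(signature, pos + 1)
    return starts
```

Each hit is only a candidate; it still needs a look in the hex editor to see whether sensible JPEG segments follow.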


Right after what looks like an XMP packet

But given that the JPEG is offset WAY deeper into the file than where it should be, the JPEG may be truncated.

0x1632930 - 0x13CB6D0 = 0x267260 = 2519648 bytes (so about 2.4 MiB) before the end. The embedded thumbnails in the CR2 files of my 70D (slightly larger sensor size) go from a little under 2M to 4M, depending on the picture, so tight, but not hopeless.
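For anyone who wants to redo the subtraction, here it is in Python, taking 0x1632930 as the end-of-file offset as the post implies:

```python
file_end = 0x1632930     # offset of the end of the file, per the post above
jpeg_start = 0x13CB6D0   # offset of the first FF D8 marker

remaining = file_end - jpeg_start
print(hex(remaining), remaining, round(remaining / 2**20, 2))
# prints: 0x267260 2519648 2.4
```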

There are actually two images: a 160x120 one (starting at 0x13CB6D0) and a 5184x3456 one (which is consistent with the sensor size) at 0x13CFA70:

image

with still significant space for a thumbnail. And since there is an End Of Image marker well after (0x1600720) (about 2.2 MiB further) that coincides with a change in the type of content (so it’s likely not a random pair of bytes):

image

… there is hope that the preview image is still complete.
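Once the two offsets are known, copying the preview out is a plain byte slice. A sketch only: `carve_jpeg` is a hypothetical helper, and it assumes both offsets point at the FF byte of their marker, which should be double-checked against the hex view.

```python
def carve_jpeg(data: bytes, soi: int, eoi: int) -> bytes:
    """Return the bytes from the FF D8 at `soi` through the FF D9 at `eoi`."""
    chunk = data[soi : eoi + 2]   # +2 so the End Of Image marker is included
    if not chunk.startswith(b"\xff\xd8"):
        raise ValueError("no Start Of Image marker at soi")
    if not chunk.endswith(b"\xff\xd9"):
        raise ValueError("no End Of Image marker at eoi")
    return chunk
```

With the offsets quoted in this thread, the call would be `carve_jpeg(data, 0x13CFA70, 0x1600720)`, with the result written to a new `.jpg` file. Whether an image viewer accepts it depends on the data in between being intact.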

But of course, we are talking about recovering the thumbnail now, not the raw data, and the author could have shot RAW+JPEG and still have a full size JPEG.
