Why bother with these two steps when you can read into a numpy array directly from a raw file using rawpy?
This works as expected for my Canon CR2 files, but for some reason, libraw can’t open my Sony ARW and Panasonic RW2 files.
I’m using the latest version as of this writing (0.20), and my cameras have been supported for a while now. So, I’ll have to investigate.
btw, fopen should be using argv[2] in your code snippet.
I did see that, but most of the parameters I saw in the API were calls to functions that modified the data. It did appear to be something you could do with rawpy. I’m not a programmer, so it was not immediately apparent to me. I was going from a link I found on Google and this reference (https://rcsumner.net/raw_guide/RAWguide.pdf). Clearly you have more experience and can comment and direct a lot better than me…
Here’s one for you…I might get time to go through it…looks like it could be interesting…
Hmmm, I just opened an RW2 (Panasonic DC-LX100M2) and an ARW (Sony ILCA-77M2) with rawproc, which uses the git master branch of libraw, pulled about two weeks ago. Libraw has decent error reporting, just need to hook into it…
Thanks; wrote that before I actually looked at the code… 
I haven’t actually sat down and learned python, yet. I can copy/paste others’ recipes with aplomb, however… 
Upon examination of the output from your rawdata program, it appears as if the original raw image data has been converted to PGM format.
The file contents are essentially identical to the output of libraw’s bin/unprocessed_raw, but without the PGM header and with one additional trailing character.
Since the raw data has been modified, this isn’t suitable for what I’m trying to do since the output is dependent on how libraw generates this data.
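For anyone who wants to repeat that comparison, here is a minimal sketch that splits a binary PGM (magic number P5) into its header fields and pixel payload, so the payload can be compared byte-for-byte against another dump. The function name split_pgm is made up for illustration, and it assumes a plain P5 file in which comments only appear between header tokens.

```python
def split_pgm(data):
    """Split binary PGM bytes into (width, height, maxval, payload).

    Hypothetical helper: assumes a well-formed 'P5' file.
    """
    pos = 0
    tokens = []
    while len(tokens) < 4:
        # Skip whitespace and '#' comment lines between header tokens.
        while True:
            if data[pos:pos + 1].isspace():
                pos += 1
            elif data[pos:pos + 1] == b"#":
                pos = data.index(b"\n", pos) + 1
            else:
                break
        # Collect one token up to the next whitespace byte.
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    if tokens[0] != b"P5":
        raise ValueError("not a binary PGM (expected magic 'P5')")
    width, height, maxval = (int(t) for t in tokens[1:])
    # Exactly one whitespace byte separates maxval from the pixel data.
    payload = data[pos + 1:]
    return width, height, maxval, payload
```

The payload should be width * height bytes long when maxval is below 256, and twice that otherwise (PGM stores 16-bit samples most significant byte first), which is a quick way to spot a stray trailing byte.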
I found this comment and code snippet on libraw’s website which uses libraw’s internal data unpacker, and I think this is closer to what I’m trying to do:
If you’re concerned about compression being modified, I get it.
libraw_internal_data is a protected member of the LibRaw class; you’ll need to get the LibRaw source, move it to public:, compile LibRaw, and link to it instead of the libraw version installed by the OS.
Ugh, that sounds kludgy, but it may be the only way to do what I want unless you can think of a better way.
I wonder if the libraw maintainers would consider adding a public analog for libraw_internal_data.
Did a little more digging, and there’s a way to get a pointer to the data structure with this function:
libraw_internal_data_t *get_internal_data_pointer()
Line 303 in libraw/libraw.h. So:
libraw_internal_data_t * libraw_internal = RawProcessor.get_internal_data_pointer();
Rob Sumner’s guide is indeed a fantastic tutorial on how to access raw pixel values and understand/toy with basic processing steps for those interested, and still applicable to Matlab/Octave AFAIK.
Btw, as dcraw seems to no longer be maintained, people might have better luck w/ dcraw_emu or unprocessed_raw tools from the LibRaw package.
import rawpy
with rawpy.imread('image.nef') as raw:
    raw_image = raw.raw_image.copy()
This, however, doesn’t help the OP, as the data is taken after unpacking/decompression, and I don’t think rawpy exposes that internal data structure as discussed…
So, I re-organized the rawdata repo to include what we’ve discussed here. I renamed rawdata.cpp to raw2tiff.cpp, and made two new programs, raw2dat.cpp and reallyraw2dat.cpp.
I also replaced the Makefile with a CMakeLists.txt. Funny, making the program changes took about 15 minutes. Writing that danged CMakeLists.txt took the better part of a day…
Good stuff. On that linked page:
reallyraw2dat: Uses an undocumented access method to the Libraw
internals to retrieve the unmodified (uncompressed, etc...) image data
from the camera raw file and write it to a .dat file.
I think you mean that your program doesn’t modify the data, and doesn’t decompress it. The word “uncompressed” suggests that your program decompresses it.
Fixed: ‘uncompressed’ to ‘compressed’. I think that conveys it properly; let me know if it reads another form of ‘funny’… 
Also, I need to add a disclaimer for reallyraw2dat.cpp: I have no real idea what is retrieved from that libraw location, except what’s asserted in the libraw thread post. YMMV…
Although if someone is being that strict - why not just hash the entire file?
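For reference, hashing the whole file is only a few lines. This is a sketch using Python's standard hashlib; the function name file_sha256 and the choice of SHA-256 are just for illustration. Note that a whole-file hash also covers the metadata, so editing the EXIF changes the digest even when the image data is untouched.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of every byte in the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large raw files are not loaded into memory at once.
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```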
When I run your reallyraw2dat program, I’m getting an empty output file.
I was having similar issues when I was trying to figure this out earlier. Even if I debug by using sufficiently large numbers for size and count for fwrite, the output file is still always empty.
fwrite (rawimage, 1000, 1000, f);
Also, libraw_internal_data->unpacker_data.data_size is giving inconsistent sizes for nearly identical Canon CR2 raw files (though I’m not sure if there’s anything wrong with that), identical sizes for Sony ARW raw files, and sometimes zero for others.
Hmmm… I ran it against one of my NEFs and got a decent-sized file, couldn’t find a way to verify the size with either exiftool or exiv2.
If the inconsistent sizes are close, within ~50 pixels or so, that could be the difference between the uncropped raw image and the image cropped of the sensor’s masked borders. I was contemplating outputting to stdout various essential libraw-supplied metadata like the width, height, rawwidth, rawheight, but became subsumed with figuring out CMake… 
I won’t be able to do anything else on this for about a week, but I’ll be thinking about it…
I’m not sure how to verify that the size (or raw data) is correct, either.
Maybe test by modifying the EXIF data for some raw files and make sure the reallyraw2dat output is still the same. At the very least, we would know that it isn’t necessarily doing the wrong thing. 
I figured out what’s wrong: the data_offset is supposed to be for the input file, but your program is using it as a data pointer in memory.
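To illustrate the difference, here is a rough sketch of treating the offset as a position in the input file: seek there, read the given number of bytes, and write them out. It is plain Python rather than a patch to reallyraw2dat.cpp; the function name copy_byte_range is hypothetical, and the offset and size arguments stand in for the data_offset and data_size values libraw reports.

```python
def copy_byte_range(src_path, dst_path, offset, size):
    """Copy `size` bytes starting at file position `offset` from src to dst.

    Hypothetical helper: `offset` is a position in the input file,
    not an address in memory.
    """
    with open(src_path, "rb") as src:
        src.seek(offset)
        payload = src.read(size)
    if len(payload) != size:
        # The requested range runs past the end of the file.
        raise ValueError(
            "expected %d bytes at offset %d, got %d" % (size, offset, len(payload))
        )
    with open(dst_path, "wb") as dst:
        dst.write(payload)
    return len(payload)
```

This only makes sense when the image data sits in one contiguous block of the file.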
btw, I asked the libraw maintainers about the apparent inconsistencies in data_size, and they said that is to be expected because many raw formats use compression.
Furthermore, they also noted that this low-level method won’t work for raw formats that use more complex data structures involving chunked data like for tiling or striping, and recommended using libraw’s unpack functionality which can already do this.
I don’t have a good understanding of the various data structures, so I’ll need to dig into this further.