Smithsonian Open Access: great collection, not so great quality?

Today I read that the Smithsonian recently put a huge archive of CC0 images of a variety of artwork and objects online: https://www.si.edu/openaccess

I believe this is a truly wonderful collection, in terms of visibility and accessibility of a vast collection of things, and an enormous preservation effort. From a photo enthusiast’s perspective it’s also fun: for a lot of images they let you download a high-resolution TIFF file. The downside is that the image quality is, oddly enough, well… pretty shitty, somehow…

Take for example this image of a quilt: https://www.si.edu/object/acm_1995.5009.0003
And this image of a painting: https://www.si.edu/object/fsg_F1906.269

The TIFF files you can download contain the original metadata, which is interesting in itself. The quilt was shot with a Nikon D800, the painting with a PhaseOne IQ180 (a better choice for archival purposes imo). Also, the quilt was shot at f/14.0, which is a questionable aperture for the lens used.
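For what it's worth, here is a rough sketch of why f/14 looks questionable, assuming the concern is diffraction (the post does not say so explicitly). It compares the size of the diffraction blur spot with the D800's pixel spacing. The 550 nm wavelength is an assumption, and the function names are just for illustration.

```python
# Rough diffraction estimate for a Nikon D800 at several apertures.
# Assumptions (not taken from the image metadata): green light at 550 nm,
# sensor width of 35.9 mm spread across 7360 pixels.

WAVELENGTH_UM = 0.55      # assumed wavelength, in micrometres
SENSOR_WIDTH_MM = 35.9    # D800 sensor width
PIXELS_ACROSS = 7360      # D800 horizontal resolution


def pixel_pitch_um(sensor_width_mm: float, pixels: int) -> float:
    """Centre-to-centre pixel spacing in micrometres."""
    return sensor_width_mm * 1000.0 / pixels


def airy_disk_diameter_um(f_number: float, wavelength_um: float = WAVELENGTH_UM) -> float:
    """Diameter of the Airy disk (out to the first dark ring) in micrometres."""
    return 2.44 * wavelength_um * f_number


pitch = pixel_pitch_um(SENSOR_WIDTH_MM, PIXELS_ACROSS)
for n in (5.6, 8.0, 14.0):
    d = airy_disk_diameter_um(n)
    print(f"f/{n:g}: Airy disk {d:.1f} um, about {d / pitch:.1f} pixels wide")
```

Under these assumptions the blur spot at f/14 is close to 19 µm, just under four pixels wide, so fine thread detail would be softened before the sensor even records it. It would not by itself explain the smeared look of the crops.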

But the image quality… well. Just take a look at this 200% crop of the center of the quilt.


And remember, this is from a TIFF file. There aren’t supposed to be any JPEG artifacts (other than the ones introduced while pasting the snippet here). But where is the texture in the fabric?! It almost looks as if they exported a JPEG first and then converted that to a TIFF.
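One rough way to test the "JPEG first, then TIFF" suspicion is to look for the 8x8 block grid that baseline JPEG compression leaves behind. The sketch below is only an illustration, not a forensic tool. `blockiness_ratio` is a hypothetical helper, it works on a plain list of grayscale rows (loading the actual TIFF is left out), and it assumes the block grid is aligned to the left edge of the data, which a crop would break.

```python
def blockiness_ratio(gray, block=8):
    """Mean horizontal step across block boundaries divided by the mean step elsewhere.

    `gray` is a list of rows, each a list of grayscale values.
    Returns NaN when the ratio cannot be computed.
    """
    boundary_sum = 0.0
    boundary_n = 0
    inner_sum = 0.0
    inner_n = 0
    for row in gray:
        for x in range(len(row) - 1):
            step = abs(row[x + 1] - row[x])
            if (x + 1) % block == 0:
                # This step crosses from one 8-pixel block into the next.
                boundary_sum += step
                boundary_n += 1
            else:
                inner_sum += step
                inner_n += 1
    if boundary_n == 0 or inner_n == 0 or inner_sum == 0:
        return float("nan")
    return (boundary_sum / boundary_n) / (inner_sum / inner_n)


# Synthetic demo data (made up for illustration, not taken from the quilt image).
smooth = [[float(x) for x in range(64)] for _ in range(4)]
# Same ramp, plus an extra jump of 3 at every 8-pixel boundary.
blocky = [[float(x + 3 * (x // 8)) for x in range(64)] for _ in range(4)]

print("smooth ramp:", blockiness_ratio(smooth))
print("blocky ramp:", blockiness_ratio(blocky))
```

A ratio close to 1 means block boundaries look like any other pixel step, while a clearly higher ratio hints at block artifacts. Heavy denoising or resampling could smear detail without leaving this signature, so a low ratio would not prove the TIFF came straight from a raw file.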

And even when viewed at 100% I think it is quite clear a lot of detail is lost.

How about the medium format image? Well, there are plenty of pixels (5272x9582), that’s for sure. But a 100% crop of the painting also shows rather poor quality imo.


Compare this to the available JPEG image on the website, where I added a little sharpening for comparison’s sake:

It really seems to me that the TIFF is not a properly processed raw file. The patterns/artifacts that are present in both images are simply too similar. Could it be they simply extracted an embedded image and exported that as a TIFF? Surely they must have been smarter than that? … The Exif information even tells me the images were processed in Photoshop, which has had raw processing capabilities for a long time.

Any thoughts from the community?

You’re doing better than me. I clicked on a random item:

https://www.si.edu/object/nmnhpaleobiology_3122097

then clicked on what I’m guessing is a download icon, although there’s nothing to say it actually IS a download icon (if only they’d used text instead of icons I’d have known what they were for), but I just get a random white splodge next to the icon. I suppose if I were bothered I could try a different browser/PC etc., but I’m just not that bored. But certainly, clicking on the plus icon zooms in and it’s not an especially clear photo.

For the quilt photographed with the D800, I can’t see any JPEG artifacts. The biggest problem is that the cloth texture is often about the same size as the pixels, causing interference or cancellation – gray smears.
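A toy model of that cancellation, as a sketch under simplifying assumptions: one dimension only, a sinusoidal "thread" pattern, and each pixel averaging the light across its full width. Lens blur, the anti-aliasing filter and the Bayer pattern are all ignored, and the function names are made up for the example.

```python
import math


def pixel_values(period_px, n_pixels=64, subsamples=64):
    """Average a sinusoidal stripe pattern over the width of each pixel.

    `period_px` is the stripe period measured in pixel widths.
    """
    values = []
    for k in range(n_pixels):
        total = 0.0
        for s in range(subsamples):
            # Midpoint sampling inside pixel k, which spans [k, k + 1).
            x = k + (s + 0.5) / subsamples
            total += 0.5 + 0.5 * math.sin(2.0 * math.pi * x / period_px)
        values.append(total / subsamples)
    return values


def contrast(values):
    """Difference between the brightest and darkest recorded pixel."""
    return max(values) - min(values)


for period in (8.0, 2.0, 1.1, 1.0):
    c = contrast(pixel_values(period))
    print(f"thread period {period:g} px -> recorded contrast {c:.3f}")
```

In this model a pattern that repeats exactly once per pixel averages out to a flat mid-gray, and a pattern slightly coarser than one pixel keeps only a small fraction of its contrast, which fits the gray smears described above.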

The main light is from the right and slightly below. An unusual choice.

Depends on who is on the job I suppose. Probably some poor unpaid intern. Maybe us nerds should write them our demands for quality. :nerd_face::roll_eyes:

The Smithsonian is under a lot of budget pressure and relies heavily on volunteers, who may or may not know their way around a camera. It would be great to have premium downloads of their extensive collections, but I think it’s admirable that they have been able to make their holdings available to the public in the first place.

Indeed!

How would you all go about shooting accurate photos of the blanket?

Well, on the weblink, it says “Welcome to Smithsonian Open Access, where you can download, share, and reuse…right now, without asking…What will you create?” It seems like the intent IS for people to reuse for artistic purposes.

I’d shoot raw, and allow download of a thumbnail, web-res and full-res medium-quality JPEGs, and the raw file, and not bother with a useless TIFF file. First, the raw file will actually be smaller than even an eight-bit TIFF, because demosaicing a single-channel color filter array into three channels roughly doubles the data. Second, camera manufacturers are usually pretty good at coming up with lossless raw compression algorithms that are much smarter than the old-school TIFF compression algorithms. Third, TIFF files have lots baked in, as the original poster has already said.
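The size argument can be checked with some back-of-the-envelope arithmetic. This is a sketch that assumes a D800 frame of 7360 x 4912 pixels and 14-bit raw samples (the camera can also record 12-bit), and it ignores metadata, embedded previews and any compression.

```python
def uncompressed_megabytes(pixels: int, channels: int, bits_per_sample: int) -> float:
    """Size of the bare pixel data in megabytes (10**6 bytes), with no compression."""
    return pixels * channels * bits_per_sample / 8 / 1e6


D800_PIXELS = 7360 * 4912  # about 36 megapixels

# Raw: one sample per pixel (the color filter array), 14 bits assumed.
raw_14bit = uncompressed_megabytes(D800_PIXELS, channels=1, bits_per_sample=14)
# Demosaiced TIFF: three samples per pixel at 8 bits each.
tiff_8bit = uncompressed_megabytes(D800_PIXELS, channels=3, bits_per_sample=8)

print(f"14-bit raw, uncompressed:     {raw_14bit:.1f} MB")
print(f"8-bit RGB TIFF, uncompressed: {tiff_8bit:.1f} MB")
print(f"TIFF / raw ratio:             {tiff_8bit / raw_14bit:.2f}")
```

With these assumptions the pixel data comes to roughly 63 MB for the raw and 108 MB for the eight-bit TIFF, a factor of about 1.7 before any lossless raw compression is applied.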

In terms of lighting, I think I’d go out of my way to use low-powered, daylight-balanced tungsten, so that the smooth spectral response of the light source minimizes color errors.

Camera-wise I think their PhaseOne IQ would suffice. I’d shoot at the critical sharpness aperture of a bellows-mounted, longer focal length large format lens to ensure that the optics are not pushed to their limits in terms of corner performance. If the DOF wasn’t deep enough to get the whole artifact in focus, say for a 3D sculpture, I’d use Helicon Focus, a focus stacking software with a raw output capability explicated here. Yes, the result would still be demosaiced, but that is about the best that would be practical, compared with the server expense of offering a stack of raw frames for download.


The problem with raw is that maybe 1% of the public would know what to do with it. However, also from their FAQ site:

What if I want an image size or format that is not on the website?

For image sizes or formats that are not available online, please contact the specific museum or program associated with the asset listed on the Rights Contacts page.

So you may be able to get a raw file on a case by case basis.


The images might be good enough for AI adversarial training, machine-learning use.

Well, they are already offering full-res lossless eight-bit TIFF files, and those are already overkill for the Joe Schmoe who would not know what a raw is. My point is that raw files are more flexible, higher quality AND smaller than what they are already offering on their website. The 99% of people who do not know how to use raw files would be just as well served by a full-res JPEG, and if they download the TIFF, then they are wasting the Smithsonian’s web server resources.
The 1% wanting the raw would put less pressure on the server by downloading a 40 megabyte compressed raw, vs a 100+ megabyte tiff, and get more out of it.