Slow writing of TIFF files

Hello,

I work with 3D images, typically TIFF files with many directories/slices. Saving these with gmic takes a long while, minutes for a file with 400+ slices, and with compression much longer, which is understandable. tiffcp does a very quick job writing the same file from the slices (each in its own file), with or without compression, so I assume we don’t have a hardware/disk issue. Given that gmic uses libtiff (I assume), I would expect similar writing speeds, but this is not the case.

Is there any way to significantly speed up TIFF writing? Would I need to build gmic from source to achieve a speed-up?

gmic: GREYC’s Magic for Image Computing: Command-line interface
Version 3.5.3
(https://gmic.eu)

Thanks for your help.
Alex

I have two questions:

  • Have you compared the size of the files generated by gmic and tiffcp? G’MIC tends to output float-valued TIFF files as it works with float-valued images internally, so as soon as you apply a filter to your image, chances are high that you get a float-valued output file (and those are usually larger).
  • What are you doing with G’MIC before saving the TIFF file? Do you apply filters?

Thanks.

Thanks David for the reply and questions.

I usually just specify the pixel type for the TIFF output, as in foo.tif,uint8 or foo.tif,uint16. In the past I also added compression, e.g. foo.tif,uint8,lzw, but that just took too long to save the file. tiffcp generates smaller files, but I usually use aggressive compression with it, e.g. tiffcp -c zip:p9, which I was not able to achieve with gmic, and I have not actually done any meaningful comparison.

There are multiple types of operations involved, including filtering, e.g. normalize_local. I don’t recall the slow writing being specific to the types of operations executed. I will be happy to run any tests you recommend to track that behavior, or anything else that can be useful.

Below is a timing comparison between tiffcp and gmic, with no filtering or other operations in gmic before saving the desired range of slices from a large TIFF stack.

tiffcp operates on single slices previously saved by gmic, so some extra time should be added to tiffcp to account for the gmic I/O that saves the desired stack slices, which is ‘4.742u 1.091s 0:06.33 92.1% 0+0k 0+590400io 0pf+0w’. Removing the ‘uint8’ specifier below (the input is already 8-bit quantized) does not help; the gmic timing is then 157.425u 2.196s 2:40.56. For this test, gmic is almost 38× slower than tiffcp.

time tiffcp -a -c zip:p9 crop_e3D_000*.tif crop_e3D.tif
4.190u 0.196s 0:04.40 99.5% 0+0k 96+188792io 1pf+0w

time gmic stack-1-crop-xzyc_e3D_xy.tif -keep'[0-599]' a z -o crop_gmic_e3D.tif,uint8
[gmic]./ Start G’MIC interpreter (v.3.5.3).
[gmic]./ Input all frames of TIFF file 'stack-1-crop-xzyc_e3D_xy.tif' at position 0 (1268 images [0] = 820x612x1x1, (…),[1267] = 820x612x1x1).
[gmic]./ Keep images [0,1,2,(…),597,598,599] (600 images left).
[gmic]./ Append images [0,1,2,(…),597,598,599] along the 'z'-axis, with alignment 0.
[gmic]./ Output image [0] as tif file 'crop_gmic_e3D.tif', with pixel type 'uint8', no compression and bigtiff support (1 image 820x612x600x1).
[gmic]./ End G’MIC interpreter.
158.194u 2.094s 2:40.98 99.5% 0+0k 0+588848io 0pf+0w

That’s interesting, thank you.
The main piece of the tiff saving function used by G’MIC is the one defined at l. 62823 of file CImg.h.

    // [internal] Save a plane into a tiff file.
    template<typename t>
    const CImg<T>& _save_tiff(TIFF *tif, const unsigned int directory, const unsigned int z, const t& pixel_t,
                              const unsigned int compression_type, const float *const voxel_size,
                              const char *const description) const {
(...)

It was written by Jérôme Boulanger (so, not me :slight_smile: ).

I don’t see anything suspect in there right now, and I’m not really familiar with how libtiff works internally, so I can’t really tell what is going on here, or why it is so slow.

A first test I did with G’MIC shows that it’s indeed quite slow!

$ gmic 300,300,300,3 noise 10,2 n 0,255 round tic o large.tiff,uint8 toc
[gmic]./ Start G'MIC interpreter (v.3.5.6).
[gmic]./ Input black image at position 0 (1 image 300x300x300x3).
[gmic]./ Add salt&pepper noise to image [0], with standard deviation 10.
[gmic]./ Normalize image [0] in range [0,255], with constant-case ratio 0.
[gmic]./ Round values of image [0] by 1 and nearest rounding.
[gmic]./ Initialize timer.
[gmic]./ Output image [0] as tiff file 'large.tiff', with pixel type 'uint8', no compression and bigtiff support (1 image 300x300x300x3).
[gmic]./ Elapsed time: 18.112 s.
[gmic]./ End G'MIC interpreter.

18.112 s for saving an 80 MB file on an SSD indeed seems a bit long.

I quickly wrote a small C++ program that does the same as my gmic example above, but without using G’MIC at all. So I can confirm the slowdown is related to the save_tiff() function of the CImg Library. I’ll investigate further.

Too many strips/tiles maybe?

Thanks so much for taking the time to check on this and confirming the slow writing.

Is it possible that writing each image plane requires thrashing the CPU cache? If the image data is not contiguous in memory, there would be some overhead in reloading the cache for each plane. With few planes this is not noticeable, but for a large number of them, writing may become really slow.

The number of slices seems to be a problem for gmic, but it doesn’t seem to be the problem by itself. If it were, I would expect tiffcp, which uses exactly the same libtiff as gmic (see below), to also take a long time to write out the file, which is not the case.

> ldd /bin/gmic | grep libtiff
 libtiff.so.6 => /lib/x86_64-linux-gnu/libtiff.so.6 (0x00007b0a6ee2f000)
> ldd /bin/tiffcp | grep libtiff
libtiff.so.6 => /lib/x86_64-linux-gnu/libtiff.so.6 (0x000078facf07f000)

Pertinent text below, from https://cimg.eu/CImg_reference.pdf. Maybe cimg_use_tiff was not defined when building the gmic_3.5.3_ubuntu24-04_noble_amd64.deb package, and save_other(...) is not that efficient? But given that gmic links to libtiff, this might not be relevant…

libtiff support is enabled by defining the precompilation directive cimg_use_tiff.
• When libtiff is enabled, 2D and 3D (multipage) images with several channels per pixel are supported for char, uchar, short, ushort, float and double pixel types.
• If cimg_use_tiff is not defined at compile time, the function uses CImg<T>::save_other(const char*).

I can confirm it is indeed defined.
Playing a bit with function save_tiff() from the CImg Library, it seems that it takes a bit less than 60 ms per slice to save a volumetric image.
This quickly becomes a problem for a large number of slices: at ~60 ms per slice, the 300-slice test above already accounts for the ~18 s measured.

EDIT: And after further investigation, it’s not even the writing of the data to the file that takes time, ~~but all the calls to function TIFFSetField()~~.
→ What takes time is the finding of the min/max values for each image.
I’ll see how to fix that.

OK, so, to sum up.
The function CImg::_save_tiff() saves one plane (image) in a multi-image TIFF file.
For each image, it sets a lot of metadata (TIFF fields) related to the image content.
Setting most of these does not take much time, except for two of them:

  • TIFFTAG_SMAXSAMPLEVALUE
  • TIFFTAG_SMINSAMPLEVALUE

These store the min and max values of each image, which means that for every plane saved, the whole image is scanned to find its min/max values before the corresponding TIFF fields are set.

If I just remove this step for these two fields, the file saving goes from 18s to 110ms.

Now the question is: are these fields important or not?
If not, I can just ignore them when saving the tiff file.
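To illustrate the behavior described above, here is a minimal sketch in plain C++ (hypothetical names and functions, not the actual CImg code): the slow path rescans the whole volume once per saved plane, i.e. planes × volume-size element visits, while the fast path computes the min/max once and reuses the cached result when setting each plane’s TIFF fields.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch (not the actual CImg code): the volume is stored
// contiguously, one plane after another, as CImg does internally.
struct MinMax { float min, max; };

// One full pass over the data to find the min and max values.
MinMax scan(const std::vector<float>& v) {
  const auto p = std::minmax_element(v.begin(), v.end());
  return {*p.first, *p.second};
}

// Slow path: rescan the *whole* volume for every plane saved,
// which costs planes * volume_size element visits in total.
MinMax save_slow(const std::vector<float>& volume, std::size_t planes) {
  MinMax mm{};
  for (std::size_t z = 0; z < planes; ++z)
    mm = scan(volume); // same result every time, recomputed per plane
  return mm;
}

// Fast path: compute once, then reuse the cached result when setting
// TIFFTAG_SMINSAMPLEVALUE / TIFFTAG_SMAXSAMPLEVALUE for each plane.
MinMax save_fast(const std::vector<float>& volume, std::size_t planes) {
  const MinMax mm = scan(volume);
  for (std::size_t z = 0; z < planes; ++z) {
    (void)z; // e.g. TIFFSetField(tif, TIFFTAG_SMINSAMPLEVALUE, mm.min);
  }
  return mm;
}
```

Both paths produce identical field values; only the amount of scanning work differs, which is why the slowdown grows with the number of slices.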

Here is the (trivial) patch to CImg.h:

I’m currently building new binary packages for 3.5.6_pre with this patch included. I’ll let you know when it’s ready.

It’s ready for testing!

Index of /files/prerelease

Couldn’t you just calculate the min/max while converting the data for writing out, instead of having to rescan the whole image at the end?

Also, could it be that it ran the scan over the whole image data for each pixel? Because 18 s down to 100 ms looks suspiciously like it ran that scan more than once.

Yes, actually I’ve just seen that I was recalculating the min/max for the whole image when saving each plane… I’m currently fixing this.
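The single-pass idea suggested above could be sketched like this (hypothetical names and buffer layout, not the actual patch): the running min/max is updated in the same loop that converts the float pixels to the 8-bit output buffer, so no separate scan over the data is needed at all.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch of folding the min/max search into the conversion pass:
// while copying float pixels into the 8-bit output buffer, keep a
// running min/max so no second scan over the data is required.
struct Converted {
  std::vector<std::uint8_t> out;
  float min, max;
};

Converted convert_with_minmax(const std::vector<float>& in) {
  Converted r{std::vector<std::uint8_t>(in.size()),
              in.empty() ? 0.f : in[0],
              in.empty() ? 0.f : in[0]};
  for (std::size_t i = 0; i < in.size(); ++i) {
    const float v = in[i];
    r.min = std::min(r.min, v); // running minimum
    r.max = std::max(r.max, v); // running maximum
    // Truncating clamp-and-cast conversion, for the sketch only.
    r.out[i] = static_cast<std::uint8_t>(std::clamp(v, 0.f, 255.f));
  }
  return r;
}
```

The min/max are then already available when the TIFF fields are set, for the cost of two comparisons per pixel inside a loop that has to touch every pixel anyway.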

I was looking at the git history to find the commits I would need to backport… but yeah, I’m not even going to try that with a git history like that. I’ll just wait for a new release.

I understand.
But for the way I develop code, writing commit messages is generally useless. I consider a git repo just as a giant USB stick containing the latest sources of my code. I don’t use any of git’s other features, nor do I care to. I’m developing the core of G’MIC on my own for the moment, and I’m fine with that.

It’s actually wiser to use the .tar.gz archives that are regularly updated on the G’MIC website (e.g. the pre-release versions).

If github disappeared tomorrow, I wouldn’t mind at all.

You’d still need a “plan B” to host the community files though?

I have the domain https://gmic.eu with a file hosting service.
As long as I can put the sources on the G’MIC webpages, that’s fine.