Looking for a possible optimization of disk usage during stacking beyond compression.

I stack lots of fairly short-exposure lights and appreciate Siril’s speed. I almost exclusively use the OSC scripts, although I’ve created a couple dozen specific to my needs.

One thing Siril does that I try to avoid is using huge quantities of disk space for intermediate file storage. To minimize that, I usually use set16bits for everything up to stacking and set32bits for the stacking itself, saving half the disk space, or I use RICE compression. I have a big, fast SSD, so I can deal with it, but I’m just wondering if I can improve further.

Sometimes I stack highly overlapping “mini mosaics” together as one job with “-framing=max”, and this causes a big data explosion, which got me thinking:

Since I use Siril’s two-pass registration to get the “-framing=max” feature, is there a possibility of it doing one pass through the lights to calculate everything it needs for registration and stacking, and then going back and doing the registration and stacking in one sweep as a memory-resident pipeline?

Is there a way this partially happens already if configured correctly, so it doesn’t have to write out registered lights and then read them back in to stack them?

I know, I’m probably talking about big changes and I’m the only person asking, but I thought I would toss this out in case I’ve missed some way to already get some of what I’m asking.

Hi Steven,

It is indeed a big change and it would require significant rework in the way stacking is done… coming with a serious (negative) impact on speed.
I’ll explain a little bit more. At the stacking step, after computing the normalisation factors for each image (if input normalisation is enabled), Siril computes the memory available to spread X “bands” across the N images, N being the number of images to stack and X the number of threads. It then determines the size (= the number of rows) of each band so as to use the available memory. Why bands, i.e. a fixed number of rows? Because these bands are easy to read (= their address is known) from each input file handle, and we know where they will end up in the final stack. For each pixel in the final image, Siril can then compute the distribution of the N input (renormalised) pixels, reject the outliers and compute the final value.
If the input pixels were coming from unregistered images, we could not use this strategy, because for each input image the pixels ending up in a given output band would not originate from an input band at the same address (it would depend on each image’s linear transformation matrix). Instead, we would need to read a larger input band and apply interpolation before feeding it to the stack. The memory used for these copies would reduce how much memory is usable for stacking, and that would end up being slow.
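The band strategy could be sketched roughly like this (illustrative Python/NumPy only, not Siril’s actual code; the in-memory arrays standing in for file handles, the single-pass sigma clipping and the mean combination are all simplifying assumptions):

```python
import numpy as np

def stack_in_bands(images, band_rows=64, sigma=3.0):
    """Mean-stack a list of already-registered images band by band,
    rejecting per-pixel outliers beyond `sigma` standard deviations.
    `images` stands in for the per-file handles Siril reads from."""
    h, w = images[0].shape
    out = np.empty((h, w), dtype=np.float32)
    for y0 in range(0, h, band_rows):
        y1 = min(y0 + band_rows, h)
        # Because inputs are registered, the same row range can be read
        # from every file and lands at the same place in the final stack.
        band = np.stack([img[y0:y1] for img in images]).astype(np.float32)
        mean = band.mean(axis=0)
        std = band.std(axis=0)
        # Simple sigma clipping: mask outliers, average the surviving pixels.
        mask = np.abs(band - mean) <= sigma * std
        kept = np.maximum(mask.sum(axis=0), 1)
        out[y0:y1] = np.where(mask, band, 0.0).sum(axis=0) / kept
    return out
```

Only one band per thread needs to be held in memory at a time, which is the point of the design.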

Mind you, there is one exception to what I’m describing here. If your images are shifted by a simple translation, and your sampling is small enough to live with pixel-wise shifts, you can feed that sequence directly to the stack (the one that has the registration info written by a two-pass registration step). The reason this works is that we can still track the input band position with respect to the output band position using the dX/dY shifts. But that’s the only case.
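A toy illustration of that translation-only exception (a hypothetical Python helper, not Siril code): with integer dX/dY shifts, the input rows feeding a given output band can still be read at a predictable address in each image, no interpolation buffer needed.

```python
import numpy as np

def read_band_with_shift(img, y0, y1, dx, dy):
    """Read the rows of `img` that land in output rows y0:y1 after an
    integer (dx, dy) shift, padding with NaN where the image has no data.
    A stand-in for reading a band from a file at a shifted address."""
    h, w = img.shape
    band = np.full((y1 - y0, w), np.nan, dtype=np.float32)
    for yo in range(y0, y1):
        yi = yo - dy                      # source row for this output row
        if 0 <= yi < h:
            xo0, xo1 = max(dx, 0), min(w + dx, w)
            band[yo - y0, xo0:xo1] = img[yi, xo0 - dx:xo1 - dx]
    return band
```

With a rotation or scaling this simple row-to-row mapping breaks down, which is why pure translation is the only case that can bypass exporting registered images.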

Other strategies to decrease storage space:

  • delete the calibrated images once you have exported the registered images
  • use filtering during the “apply existing registration data” step to export only the images that will make it into the final stack.

Hope this helps,

Cheers,

Cecile


Thank you for such a thorough answer; I think I understand now how you’ve chosen to optimize it. Very clever.

And also thanks for the tips.

Oh, just a curious question: does dealing with compressed data formats like RICE interfere with this, given that files may not be regularly laid out when compressed, or are they still indexed in a way that allows your optimized technique to work?

Keep up the good work!

Steven

All the credit for this beautiful strategy goes to @vinvin , I’m just telling you what’s in the code 🙂
I’m afraid I don’t know enough about how cfitsio handles indexing compressed arrays to answer your question…


I’m not sure it’s only my fault 🙂

The compression is done row by row, so it should not impact performance much.
