I agree. It sounds silly not to use the memory if it’s available. If you have few cores and darktable switches to tiling, those few cores will have to work even harder. (Threads may not map one-to-one to cores because of hyperthreading, but I haven’t looked into how the thread count is determined.)
Update
I think they check both installed memory and thread count because they don’t just tweak memory-usage parameters; they also decide on settings that affect the choice of algorithms: the demosaicer for zoomed-out mode in the darkroom, and ui/performance (‘prefer performance over quality’), which seems to affect:
I was going to open an issue or pull request to modify the code, but I’m in no rush. I started reading the actual code in master to see whether there have been more changes since that last pull request. I think I noticed some other places that use the >= 2 check, so it needs more investigation. I’m currently busy with work.
diffuse or sharpen uses a wavelet decomposition with a maximum of 10 scales, and it needs to store the high-frequency buffer for each scale plus the residual, because the diffusion process works coarse to fine.
Each scale has a blur size equal to 2^scale px, so for the last scale the radius is 1024 px.
For the tiled variant, a tile overlap equal to the blur radius is necessary for numerical consistency with the untiled approach, meaning a padding of 1024 px is needed on each side for the largest scale. So the tile size is defined by that mandatory padding first, and then the centre region is made as large as possible until RAM is saturated. The problem is that each padding region gets computed 2 to 4 times instead of once.
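To make the cost of that overlap concrete, here is a small sketch of the arithmetic described above. The function names and the square-tile assumption are mine for illustration; this is not darktable’s actual tiling code.

```python
# Illustrative sketch (not darktable code): how the coarsest wavelet
# scale dictates the tile padding, and how much of each tile that
# padding eats for a given centre size.

def blur_radius(scale: int) -> int:
    """Blur size doubles with each wavelet scale: 2^scale pixels."""
    return 2 ** scale

MAX_SCALE = 10
PADDING = blur_radius(MAX_SCALE)  # 1024 px of overlap on each side

def padding_overhead(center: int, padding: int = PADDING) -> float:
    """Fraction of a square tile's pixels spent on the padding region,
    i.e. work that is redone instead of computed once."""
    full = center + 2 * padding
    return 1.0 - (center / full) ** 2

print(blur_radius(MAX_SCALE))            # 1024
print(padding_overhead(2048))            # 0.75
```

With a 2048 px centre region, the padded tile is 4096 px wide, so 75 % of the pixels in each tile are padding, which is consistent with the padding being computed several times over across neighbouring tiles.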
When the image is downscaled, for example in the preview, its highest frequencies are removed, so the wavelet decomposition discards the first n scales and processes only the last ones. The blur radii are also scaled by the zoom factor, so the coarsest-scale radius is 1024 px * zoom, and so is the padding, which explains the performance boost.
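A quick sketch of that scaling, under the assumptions stated above (the helper names and the log2 formula for discarded scales are mine, inferred from the description, not taken from darktable’s source):

```python
# Illustrative sketch: downscaling both drops the finest wavelet
# scales and shrinks the coarsest blur radius (hence the padding).
import math

MAX_SCALE = 10

def scales_kept(zoom: float, max_scale: int = MAX_SCALE) -> int:
    """Downscaling by `zoom` removes detail finer than 1/zoom px,
    so roughly the first log2(1/zoom) scales can be discarded."""
    discarded = max(0, int(math.log2(1 / zoom)))
    return max_scale - discarded

def coarsest_padding(zoom: float, max_scale: int = MAX_SCALE) -> int:
    """Coarsest blur radius, and therefore tile padding, scaled by zoom."""
    return int(2 ** max_scale * zoom)

print(coarsest_padding(1.0))   # 1024 px at 1:1 (export)
print(coarsest_padding(0.25))  # 256 px in a 25 % preview
print(scales_kept(0.25))       # 8 of 10 scales left to process
```

At 1:1 the padding stays at the full 1024 px and no scale is skipped, which matches the remark below that a 1:1 export gets no speed-up.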
Unfortunately, at export time, if you export at 1:1, there is no speed-up for you.
Diffusion is an iterative process and there is no way around that. Actually, the wavelet scheme is already a speed-up: it gets you, in ~32 iterations, results similar to what is achieved in the literature with 100 to 150 iterations.
Also, diffusion is kind of a convolutional neural network, borderline AI. People have been asking for AI shit in dt for years; these are the runtimes you get with such methods.
…and magic indeed it is, and high praise for the magician! The sharpen demosaicing preset, for one, is fantastic: the best results I have ever seen (LR is not even close). I am not complaining at all, except a bit with myself for having bought a MacBook; it seems darktable runs much better on Linux. I am fine with the current export times of a minute or so per image on my machine: it was the hours that were bugging me.