As far as I know, the driver cannot report free memory, only total memory on the card (we don’t know how much the operating system uses). The rest is just conjecture:
- darktable asks for the total memory, reads the headroom value, and assumes that the difference is available
- then, when it actually tries to use/allocate the memory, the attempt fails
- darktable falls back to the CPU path
With a higher headroom, darktable tries to allocate less memory, and that allocation succeeds. You may want to experiment: with a value like 600 you might be able to avoid tiling and speed up contrast EQ (‘atrous’) even more (but don’t expect as big a jump as the one from CPU to GPU).
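To make the guess above concrete, here is a rough sketch of the arithmetic in Python. It is only my reading of the behaviour, not darktable's actual code: the function names, the assumption that headroom is in MB, and all the numbers (a 2048 MB card with 500 MB taken by the OS and other programs) are made up for illustration.

```python
def assumed_available_mb(total_mb: int, headroom_mb: int) -> int:
    """What the program believes it may use: total minus headroom (never negative)."""
    return max(total_mb - headroom_mb, 0)


def pick_path(total_mb: int, headroom_mb: int, used_by_others_mb: int) -> str:
    """Return "GPU" if the assumed amount fits into what is really free, else "CPU".

    used_by_others_mb is hypothetical: the driver does not report it,
    which is exactly why the headroom setting has to be guessed.
    """
    assumed = assumed_available_mb(total_mb, headroom_mb)
    actually_free = total_mb - used_by_others_mb
    return "GPU" if assumed <= actually_free else "CPU"


if __name__ == "__main__":
    # Hypothetical card: 2048 MB total, 500 MB used by the OS and other programs.
    for headroom in (300, 800):
        print(headroom, pick_path(2048, headroom, 500))
```

In this toy model a headroom of 300 asks for more than is really free, so the allocation fails and the CPU path is used, while 800 leaves enough room for the GPU path. The flip side is that any headroom beyond what is really needed is memory the GPU never gets to use, which is why it is worth experimenting with the value.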
Performance tuning tips: https://elstoc.github.io/dtdocs/special-topics/opencl/performance/