I tried opencl_memory_headroom=600 and 500.
600 worked OK, but with tiling. 500 caused the contrast equalizer to be processed on the CPU. So tiling can’t be avoided, but never mind: the process took only 4.4 sec on the GPU. There is not much room for further tuning.
Thank you for all your help and clarification.
….