One direction linear blur?

Speaking of which…
I’ve noticed that the convolve_fft command did not produce the same result as the convolve command with the same input image and same kernel.
This is now fixed.

Also, I’ve added a boundary_conditions argument to the convolve_fft command, so you can choose between different boundary conditions rather than only the periodic one.
Should be good now (after a $ gmic update).
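
For readers wondering why boundary conditions matter for an FFT-based convolution: multiplying spectra gives a periodic (wrap-around) result by construction, so other boundary conditions have to be emulated, for instance by extending the signal before the transform. Here is a minimal 1-D sketch in Python of that idea. It is not how G'MIC implements convolve_fft; the helper names (`sample`, `convolve_direct`, `convolve_spectral`) are made up for illustration, and a naive DFT stands in for a real FFT to keep the sketch dependency-free.

```python
import cmath

def sample(signal, i, boundary):
    """Read signal[i], resolving out-of-range indices per the boundary condition."""
    n = len(signal)
    if 0 <= i < n:
        return signal[i]
    if boundary == "dirichlet":   # values outside the signal are zero
        return 0.0
    if boundary == "neumann":     # the nearest edge value is repeated
        return signal[min(max(i, 0), n - 1)]
    if boundary == "periodic":    # the signal wraps around
        return signal[i % n]
    raise ValueError(f"unknown boundary: {boundary}")

def convolve_direct(signal, kernel, boundary):
    """Direct convolution, kernel centered at index len(kernel) // 2."""
    c = len(kernel) // 2
    return [
        sum(kernel[j] * sample(signal, i + c - j, boundary) for j in range(len(kernel)))
        for i in range(len(signal))
    ]

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]
    return [v / n for v in out] if inverse else out

def convolve_spectral(signal, kernel, boundary):
    """Convolution via spectrum multiplication, after boundary-aware padding."""
    c = len(kernel) // 2
    pad = len(kernel)  # wide enough that wrap-around never reaches the kept samples
    # Extend the signal on both sides according to the boundary rule.
    ext = [sample(signal, i, boundary) for i in range(-pad, len(signal) + pad)]
    m = len(ext)
    # Zero-pad the kernel and shift it circularly so its center sits at index 0.
    kext = [0.0] * m
    for j, v in enumerate(kernel):
        kext[(j - c) % m] = v
    product = [a * b for a, b in zip(dft(ext), dft(kext))]
    full = dft(product, inverse=True)
    # Crop the padding away again.
    return [full[i + pad].real for i in range(len(signal))]

signal = [1.0, 2.0, 4.0, 8.0, 16.0, 3.0]
kernel = [0.25, 0.5, 0.25]
for boundary in ("dirichlet", "neumann", "periodic"):
    direct = convolve_direct(signal, kernel, boundary)
    spectral = convolve_spectral(signal, kernel, boundary)
    diff = max(abs(a - b) for a, b in zip(direct, spectral))
    print(boundary, "max difference:", diff)
```

The padding step is extra work that the purely periodic case does not need, which is one plausible reason for the non-periodic modes to cost more on the FFT path.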

Another thing: I’ve tested the convolution of a 1024x1024 image with square kernels of increasing sizes, using both convolve and convolve_fft, and measured the timings.
On my 24-core machine, the kernel size at which convolve_fft becomes faster depends on the boundary conditions chosen. With periodic conditions, convolve_fft wins from 13x13 upward; with dirichlet it only wins at 15x15 and 16x16, and with neumann only at 16x16.
Here is what I get:

Input image is 1024x1024.

Convolution 3x3:
  > 'convolve' (dirichlet): 0.016 s.
  > 'convolve' (neumann): 0.002 s.
  > 'convolve' (periodic): 0.054 s.
  > 'convolve_fft' (dirichlet): 0.403 s.
  > 'convolve_fft' (neumann): 0.431 s.
  > 'convolve_fft' (periodic): 0.393 s.

Convolution 4x4:
  > 'convolve' (dirichlet): 0.05 s.
  > 'convolve' (neumann): 0.044 s.
  > 'convolve' (periodic): 0.085 s.
  > 'convolve_fft' (dirichlet): 0.403 s.
  > 'convolve_fft' (neumann): 0.413 s.
  > 'convolve_fft' (periodic): 0.359 s.

Convolution 5x5:
  > 'convolve' (dirichlet): 0.022 s.
  > 'convolve' (neumann): 0.02 s.
  > 'convolve' (periodic): 0.117 s.
  > 'convolve_fft' (dirichlet): 0.718 s.
  > 'convolve_fft' (neumann): 0.707 s.
  > 'convolve_fft' (periodic): 0.384 s.

Convolution 6x6:
  > 'convolve' (dirichlet): 0.085 s.
  > 'convolve' (neumann): 0.082 s.
  > 'convolve' (periodic): 0.13 s.
  > 'convolve_fft' (dirichlet): 0.735 s.
  > 'convolve_fft' (neumann): 0.73 s.
  > 'convolve_fft' (periodic): 0.415 s.

Convolution 7x7:
  > 'convolve' (dirichlet): 0.124 s.
  > 'convolve' (neumann): 0.094 s.
  > 'convolve' (periodic): 0.181 s.
  > 'convolve_fft' (dirichlet): 0.419 s.
  > 'convolve_fft' (neumann): 0.408 s.
  > 'convolve_fft' (periodic): 0.385 s.

Convolution 8x8:
  > 'convolve' (dirichlet): 0.119 s.
  > 'convolve' (neumann): 0.104 s.
  > 'convolve' (periodic): 0.202 s.
  > 'convolve_fft' (dirichlet): 0.444 s.
  > 'convolve_fft' (neumann): 0.433 s.
  > 'convolve_fft' (periodic): 0.377 s.

Convolution 9x9:
  > 'convolve' (dirichlet): 0.155 s.
  > 'convolve' (neumann): 0.119 s.
  > 'convolve' (periodic): 0.254 s.
  > 'convolve_fft' (dirichlet): 2.529 s.
  > 'convolve_fft' (neumann): 2.751 s.
  > 'convolve_fft' (periodic): 0.433 s.

Convolution 10x10:
  > 'convolve' (dirichlet): 0.175 s.
  > 'convolve' (neumann): 0.135 s.
  > 'convolve' (periodic): 0.284 s.
  > 'convolve_fft' (dirichlet): 2.483 s.
  > 'convolve_fft' (neumann): 2.592 s.
  > 'convolve_fft' (periodic): 0.433 s.

Convolution 11x11:
  > 'convolve' (dirichlet): 0.196 s.
  > 'convolve' (neumann): 0.148 s.
  > 'convolve' (periodic): 0.365 s.
  > 'convolve_fft' (dirichlet): 0.62 s.
  > 'convolve_fft' (neumann): 0.588 s.
  > 'convolve_fft' (periodic): 0.415 s.

Convolution 12x12:
  > 'convolve' (dirichlet): 0.225 s.
  > 'convolve' (neumann): 0.177 s.
  > 'convolve' (periodic): 0.425 s.
  > 'convolve_fft' (dirichlet): 0.591 s.
  > 'convolve_fft' (neumann): 0.601 s.
  > 'convolve_fft' (periodic): 0.451 s.

Convolution 13x13:
  > 'convolve' (dirichlet): 0.281 s.
  > 'convolve' (neumann): 0.228 s.
  > 'convolve' (periodic): 0.532 s.
  > 'convolve_fft' (dirichlet): 0.431 s.
  > 'convolve_fft' (neumann): 0.751 s.
  > 'convolve_fft' (periodic): 0.436 s.

Convolution 14x14:
  > 'convolve' (dirichlet): 0.299 s.
  > 'convolve' (neumann): 0.276 s.
  > 'convolve' (periodic): 0.627 s.
  > 'convolve_fft' (dirichlet): 0.432 s.
  > 'convolve_fft' (neumann): 0.405 s.
  > 'convolve_fft' (periodic): 0.426 s.

Convolution 15x15:
  > 'convolve' (dirichlet): 0.359 s.
  > 'convolve' (neumann): 0.302 s.
  > 'convolve' (periodic): 0.685 s.
  > 'convolve_fft' (dirichlet): 0.343 s.
  > 'convolve_fft' (neumann): 0.33 s.
  > 'convolve_fft' (periodic): 0.435 s.

Convolution 16x16:
  > 'convolve' (dirichlet): 0.409 s.
  > 'convolve' (neumann): 0.332 s.
  > 'convolve' (periodic): 0.767 s.
  > 'convolve_fft' (dirichlet): 0.326 s.
  > 'convolve_fft' (neumann): 0.29 s.
  > 'convolve_fft' (periodic): 0.407 s.

Convolution 17x17:
  > 'convolve' (dirichlet): 0.399 s.
  > 'convolve' (neumann): 0.331 s.
  > 'convolve' (periodic): 0.779 s.
  > 'convolve_fft' (dirichlet): 0.432 s.
  > 'convolve_fft' (neumann): 0.396 s.
  > 'convolve_fft' (periodic): 0.411 s.

Convolution 18x18:
  > 'convolve' (dirichlet): 0.457 s.
  > 'convolve' (neumann): 0.37 s.
  > 'convolve' (periodic): 0.934 s.
  > 'convolve_fft' (dirichlet): 0.491 s.
  > 'convolve_fft' (neumann): 0.461 s.
  > 'convolve_fft' (periodic): 0.455 s.

Convolution 19x19:
  > 'convolve' (dirichlet): 0.519 s.
  > 'convolve' (neumann): 0.396 s.
  > 'convolve' (periodic): 0.905 s.
  > 'convolve_fft' (dirichlet): 0.948 s.
  > 'convolve_fft' (neumann): 0.952 s.
  > 'convolve_fft' (periodic): 0.402 s.

Convolution 20x20:
  > 'convolve' (dirichlet): 0.583 s.
  > 'convolve' (neumann): 0.429 s.
  > 'convolve' (periodic): 1.017 s.
  > 'convolve_fft' (dirichlet): 0.929 s.
  > 'convolve_fft' (neumann): 0.939 s.
  > 'convolve_fft' (periodic): 0.431 s.

@David_Tschumperle Thanks for adding the boundary argument. As for speed, it seems that one would benefit from a dynamic switch to convolve for small kernels. I haven’t done a rigorous test like that, but I did note that at 10x10 and higher, convolve_fft becomes faster than regular convolve on my 6-core machine.
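
To make the "dynamic switch" idea concrete, here is a small Python sketch of a lookup-based dispatcher. It only illustrates the selection logic and is not an existing G'MIC feature; `TIMINGS` and `pick_method` are hypothetical names, and the numbers are a handful of entries copied from the timings posted above (1024x1024 image, 24-core machine), so they would have to be re-measured on any other machine.

```python
# Timings in seconds, copied from the measurements posted above.
# Only a few kernel sizes are included to keep the sketch short.
TIMINGS = {
    ("periodic", 12): {"convolve": 0.425, "convolve_fft": 0.451},
    ("periodic", 13): {"convolve": 0.532, "convolve_fft": 0.436},
    ("dirichlet", 16): {"convolve": 0.409, "convolve_fft": 0.326},
    ("dirichlet", 17): {"convolve": 0.399, "convolve_fft": 0.432},
    ("neumann", 15): {"convolve": 0.302, "convolve_fft": 0.33},
    ("neumann", 16): {"convolve": 0.332, "convolve_fft": 0.29},
}

def pick_method(boundary, kernel_size, timings=TIMINGS, default="convolve"):
    """Return the name of the faster command for a measured configuration.

    Falls back to `default` when no measurement exists for that configuration.
    """
    entry = timings.get((boundary, kernel_size))
    if entry is None:
        return default
    return min(entry, key=entry.get)

for key in sorted(TIMINGS):
    print(key, "->", pick_method(*key))
```

A lookup table is used here instead of a single size threshold because the posted dirichlet and neumann timings are not monotonic: convolve_fft wins at 16x16 but loses again at 17x17.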

As for the earlier question, I’ve now realized that I can code in a dynamic kernel size depending on the angle. This will reduce computation time, and in theory should be much faster than the pdn version.
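
A rough Python sketch of what an angle-dependent size could look like, assuming the blur kernel is a straight line of a given length and that the kernel only needs to cover that line's bounding box. The function name `blur_kernel_size` and the bounding-box rule are assumptions made for illustration, not the actual filter code.

```python
import math

def blur_kernel_size(length, angle_deg):
    """Bounding box (width, height) for a line of `length` pixels at `angle_deg`."""
    theta = math.radians(angle_deg)
    # Round away floating-point noise (cos(90 deg) is about 6e-17, not 0) before ceil.
    w = math.ceil(round(length * abs(math.cos(theta)), 9))
    h = math.ceil(round(length * abs(math.sin(theta)), 9))
    # A kernel needs at least one pixel along each axis.
    return max(1, w), max(1, h)

for angle in (0, 45, 90):
    w, h = blur_kernel_size(10, angle)
    print(f"angle {angle}: kernel {w}x{h}, {w * h} taps instead of {10 * 10}")
```

Under these assumptions the saving is largest for axis-aligned blurs, where the kernel collapses to a single row or column, and smallest near 45 degrees.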

EDIT: I mistook threads for cores.

2 cores. :roll_eyes:

Actually, I mistook threads for cores. I have 6 cores here: 3.6 GHz, 6 cores.