Export times on iMac and Mac mini with diffuse&sharpen

Well, the Mac mini is less capable when it comes to OpenCL, at least as far as memory is concerned:

   DEVICE_TYPE:              GPU, unified mem
   GLOBAL MEM SIZE:          5461 MB

vs

   DEVICE_TYPE:              CPU, unified mem
   GLOBAL MEM SIZE:          40960 MB
   
   DEVICE_TYPE:              GPU, dedicated mem
   GLOBAL MEM SIZE:          8192 MB

If you also pass -d tiling, the log shows in more detail how each module is tiled. For example, on my system (using an Nvidia GTX 1060 with 6 GB):

    56.7530 process tiled             CL0 [export]         diffuse                (   0/   0) 7728x5152 scale=1.0000 --> (   0/   0) 7728x5152 scale=1.0000  34 IOP_CS_RGB
    56.7530 [default_process_tiling_cl_ptp] [export] **** tiling module 'diffuse' for image with size 7728x5152 --> 7728x5152
    56.7530 [default_process_tiling_cl_ptp] [export] buffer exceeds singlebuffer, corrected to 3764x5152
    56.7530 [default_process_tiling_cl_ptp] [export] (5x1) tiles with max dimensions 3764x5152, pinned=OFF, good 1716x3104 and overlap 1024
    56.7530 [default_process_tiling_cl_ptp] [export] tile (0,0) size 3764x5152 at origin [0,0]
...
    64.4636 [default_process_tiling_cl_ptp] [export] tile (3,0) size 2580x5152 at origin [5148,0]
    66.2384 process tiled             CL0 [export]         diffuse.1              (   0/   0) 7728x5152 scale=1.0000 --> (   0/   0) 7728x5152 scale=1.0000  35 IOP_CS_RGB
    66.2384 [default_process_tiling_cl_ptp] [export] **** tiling module 'diffuse.1' for image with size 7728x5152 --> 7728x5152
    66.2384 [default_process_tiling_cl_ptp] [export] buffer exceeds singlebuffer, corrected to 4993x5152
    66.2384 [default_process_tiling_cl_ptp] [export] (2x1) tiles with max dimensions 4993x5152, pinned=OFF, good 4865x5024 and overlap 64
    66.2384 [default_process_tiling_cl_ptp] [export] tile (0,0) size 4993x5152 at origin [0,0]
    77.9512 [default_process_tiling_cl_ptp] [export] tile (1,0) size 2863x5152 at origin [4865,0]

That was for the X100VI image, and export time was ~30 s.

    90.9632 process tiled             CL0 [export]         diffuse                (   0/   0) 11662x8744 scale=1.0000 --> (   0/   0) 11662x8744 scale=1.0000  34 IOP_CS_RGB
    90.9632 [default_process_tiling_cl_ptp] [export] **** tiling module 'diffuse' for image with size 11662x8744 --> 11662x8744
    90.9632 [default_process_tiling_cl_ptp] [export] buffer exceeds singlebuffer, corrected to 5085x3813
    90.9632 [default_process_tiling_cl_ptp] [export] (4x5) tiles with max dimensions 5084x3813, pinned=OFF, good 3036x1765 and overlap 1024
    90.9632 [default_process_tiling_cl_ptp] [export] tile (0,0) size 5084x3813 at origin [0,0]
...
   128.1086 [default_process_tiling_cl_ptp] [export] tile (3,3) size 2554x3449 at origin [9108,5295]
   129.1109 pipe cache get                [export]         diffuse.1              IOP_CS_RGB line  1( 2) at 0x75a931a4c040. hash=af0f78c8d1063851
   129.1112 process tiled             CL0 [export]         diffuse.1              (   0/   0) 11662x8744 scale=1.0000 --> (   0/   0) 11662x8744 scale=1.0000  35 IOP_CS_RGB
   129.1112 [default_process_tiling_cl_ptp] [export] **** tiling module 'diffuse.1' for image with size 11662x8744 --> 11662x8744
   129.1112 [default_process_tiling_cl_ptp] [export] buffer exceeds singlebuffer, corrected to 5857x4392
   129.1112 [default_process_tiling_cl_ptp] [export] (3x3) tiles with max dimensions 5856x4392, pinned=OFF, good 5728x4264 and overlap 64
   129.1112 [default_process_tiling_cl_ptp] [export] tile (0,0) size 5856x4392 at origin [0,0]
...
   174.1183 [default_process_tiling_cl_ptp] [export] tile (2,2) size 206x216 at origin [11456,8528]

Export time for the GFX100S image was ~85 s.

Notice the lines reporting max dimensions 5084x3813, pinned=OFF, good 3036x1765 and overlap 1024. Of the ~19 MPx in each full tile, only ~5.4 MPx were useful; the rest was overlap that had to be computed again for the neighbouring tiles. More GPU memory would have allowed larger tiles with less wasted overlap, and so much faster processing.
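To make that arithmetic easy to repeat for other logs, here is a rough Python sketch that pulls the numbers out of the "tiles with max dimensions" lines and works out how much of each tile is useful. The regex and the helper names (`useful_fraction`, `tile_stats`) are my own, not anything from darktable, and the rule "good = max dimension minus twice the overlap" is only what the four lines above happen to show, so treat it as an assumption.

```python
import re

# Tiling summary lines copied from the logs above (timestamp/prefix trimmed).
LINES = [
    "(5x1) tiles with max dimensions 3764x5152, pinned=OFF, good 1716x3104 and overlap 1024",
    "(2x1) tiles with max dimensions 4993x5152, pinned=OFF, good 4865x5024 and overlap 64",
    "(4x5) tiles with max dimensions 5084x3813, pinned=OFF, good 3036x1765 and overlap 1024",
    "(3x3) tiles with max dimensions 5856x4392, pinned=OFF, good 5728x4264 and overlap 64",
]

# Hypothetical pattern written for these lines only; other darktable
# versions may format the message differently.
PATTERN = re.compile(
    r"max dimensions (\d+)x(\d+), .*good (\d+)x(\d+) and overlap (\d+)"
)


def useful_fraction(width, height, overlap):
    """Share of a full tile left after removing the overlap on every side."""
    return (width - 2 * overlap) * (height - 2 * overlap) / (width * height)


def tile_stats(line):
    """Return (tile MPx, useful MPx, useful fraction) for one summary line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a tiling summary line: " + line)
    w, h, good_w, good_h, overlap = map(int, m.groups())
    # Assumption taken from the logs above: good = max dimension - 2 * overlap.
    if (good_w, good_h) != (w - 2 * overlap, h - 2 * overlap):
        raise ValueError("good area does not match max dimension - 2 * overlap")
    return w * h / 1e6, good_w * good_h / 1e6, useful_fraction(w, h, overlap)


if __name__ == "__main__":
    for line in LINES:
        tile_mpx, good_mpx, frac = tile_stats(line)
        print(f"{tile_mpx:5.1f} MPx tile, {good_mpx:5.1f} MPx useful ({frac:.0%})")
```

For the two first-instance diffuse lines (overlap 1024) this gives a useful share of roughly 27-28 %, while the diffuse.1 lines (overlap 64) come out at about 95 %. You can also call `useful_fraction` with a made-up larger tile size to get a feel for how the wasted share shrinks as tiles grow, assuming the overlap stays the same.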