Shall I open a new thread for this?
I can confirm that exporting is extremely slow when using this preset (applied to Warm colorful day in fall), because the diffuse or sharpen module fails on OpenCL and falls back to the CPU. I have 6 GB of RAM on the NVIDIA GTX 1060 card and 64 GB of RAM in the machine; the CPU is an AMD Ryzen 5 5600X. Note the "[opencl_diffuse] couldn't enqueue kernel! -4" messages below.
176.436056 [export] creating pixelpipe took 0.053 secs (0.159 CPU)
176.436073 [pixelpipe_process] [export] using device 0
176.436095 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
176.446887 [dev_pixelpipe] took 0.011 secs (0.007 CPU) processed `raw black/white point' on GPU, blended on GPU [export]
176.451013 [dev_pixelpipe] took 0.004 secs (0.004 CPU) processed `white balance' on GPU, blended on GPU [export]
176.458158 [dev_pixelpipe] took 0.007 secs (0.003 CPU) processed `highlight reconstruction' on GPU, blended on GPU [export]
176.546673 [dev_pixelpipe] took 0.089 secs (0.048 CPU) processed `demosaic' on GPU, blended on GPU [export]
176.560071 [dev_pixelpipe] took 0.013 secs (0.009 CPU) processed `lens correction' on GPU, blended on GPU [export]
176.576557 [dev_pixelpipe] took 0.016 secs (0.013 CPU) processed `exposure' on GPU, blended on GPU [export]
176.862751 [dev_pixelpipe] took 0.286 secs (1.330 CPU) processed `tone equalizer' on CPU, blended on CPU [export]
176.938311 [dev_pixelpipe] took 0.076 secs (0.764 CPU) processed `tone equalizer 1' on CPU, blended on CPU [export]
176.995765 [dev_pixelpipe] took 0.057 secs (0.057 CPU) processed `input color profile' on GPU, blended on GPU [export]
image colorspace transform Lab-->RGB took 0.015 secs (0.008 GPU) [channelmixerrgb ]
177.041929 [dev_pixelpipe] took 0.046 secs (0.027 CPU) processed `color calibration' on GPU, blended on GPU [export]
177.087537 [default_process_tiling_cl_ptp] use tiling on module 'diffuse' for image with full size 7374 x 4924
177.087542 [default_process_tiling_cl_ptp] (3 x 1) tiles with max dimensions 4648 x 4924 and overlap 1024
177.087544 [default_process_tiling_cl_ptp] tile (0, 0) with 4648 x 4924 at origin [0, 0]
187.978417 [opencl_diffuse] couldn't enqueue kernel! -4
187.982514 [default_process_tiling_opencl_ptp] couldn't run process_cl() for module 'diffuse' in tiling mode: 0
187.982519 [opencl_pixelpipe] could not run module 'diffuse' on gpu. falling back to cpu path
283.749282 [dev_pixelpipe] took 106.707 secs (1127.128 CPU) processed `diffuse or sharpen' on CPU, blended on CPU [export]
283.749305 [default_process_tiling_cl_ptp] use tiling on module 'diffuse' for image with full size 7374 x 4924
283.749308 [default_process_tiling_cl_ptp] (2 x 1) tiles with max dimensions 7368 x 4924 and overlap 16
283.749309 [default_process_tiling_cl_ptp] tile (0, 0) with 7368 x 4924 at origin [0, 0]
291.479139 [opencl_diffuse] couldn't enqueue kernel! -4
291.483153 [default_process_tiling_opencl_ptp] couldn't run process_cl() for module 'diffuse' in tiling mode: 0
291.483158 [opencl_pixelpipe] could not run module 'diffuse' on gpu. falling back to cpu path
330.331615 [dev_pixelpipe] took 46.582 secs (446.771 CPU) processed `diffuse or sharpen 1' on CPU, blended on CPU [export]
330.331641 [default_process_tiling_cl_ptp] use tiling on module 'diffuse' for image with full size 7374 x 4924
330.331644 [default_process_tiling_cl_ptp] (2 x 1) tiles with max dimensions 6168 x 4924 and overlap 64
330.331646 [default_process_tiling_cl_ptp] tile (0, 0) with 6168 x 4924 at origin [0, 0]
350.124052 [opencl_diffuse] couldn't enqueue kernel! -4
350.128109 [default_process_tiling_opencl_ptp] couldn't run process_cl() for module 'diffuse' in tiling mode: 0
350.128115 [opencl_pixelpipe] could not run module 'diffuse' on gpu. falling back to cpu path
467.724100 [dev_pixelpipe] took 137.392 secs (1337.772 CPU) processed `diffuse or sharpen 2' on CPU, blended on CPU [export]
467.802900 [dev_pixelpipe] took 0.079 secs (0.065 CPU) processed `color balance rgb' on GPU, blended on GPU [export]
467.833439 [dev_pixelpipe] took 0.031 secs (0.014 CPU) processed `filmic rgb' on GPU, blended on GPU [export]
image colorspace transform RGB-->Lab took 0.016 secs (0.008 GPU) [colorout ]
467.892337 [dev_pixelpipe] took 0.059 secs (0.035 CPU) processed `output color profile' on GPU, blended on GPU [export]
468.245263 [dev_pixelpipe] took 0.353 secs (0.353 CPU) processed `dithering' on CPU, blended on CPU [export]
468.342838 [dev_pixelpipe] took 0.098 secs (1.093 CPU) processed `display encoding' on CPU, blended on CPU [export]
468.342945 [opencl_profiling] profiling device 0 ('NVIDIA GeForce GTX 1060 6GB'):
468.342948 [opencl_profiling] spent 0.2184 seconds in [Write Image (from host to device)]
468.342950 [opencl_profiling] spent 0.0019 seconds in rawprepare_1f
468.342953 [opencl_profiling] spent 0.0021 seconds in whitebalance_1f
468.342955 [opencl_profiling] spent 0.0036 seconds in highlights_1f_lch_bayer
468.342957 [opencl_profiling] spent 0.0009 seconds in border_interpolate
468.342958 [opencl_profiling] spent 0.0038 seconds in rcd_border_green
468.342960 [opencl_profiling] spent 0.0053 seconds in rcd_border_redblue
468.342961 [opencl_profiling] spent 0.0049 seconds in rcd_populate
468.342963 [opencl_profiling] spent 0.0038 seconds in rcd_step_1_1
468.342965 [opencl_profiling] spent 0.0029 seconds in rcd_step_1_2
468.342966 [opencl_profiling] spent 0.0018 seconds in rcd_step_2_1
468.342968 [opencl_profiling] spent 0.0050 seconds in rcd_step_3_1
468.342969 [opencl_profiling] spent 0.0027 seconds in rcd_step_4_1
468.342970 [opencl_profiling] spent 0.0015 seconds in rcd_step_4_2
468.342972 [opencl_profiling] spent 0.0043 seconds in rcd_step_5_1
468.342973 [opencl_profiling] spent 0.0070 seconds in rcd_step_5_2
468.342974 [opencl_profiling] spent 0.0070 seconds in rcd_write_output
468.342976 [opencl_profiling] spent 0.0309 seconds in [Copy Image (on device)]
468.342978 [opencl_profiling] spent 0.0117 seconds in exposure
468.342979 [opencl_profiling] spent 0.2296 seconds in [Read Image (from device to host)]
468.342981 [opencl_profiling] spent 0.0094 seconds in colorin_unbound
468.342982 [opencl_profiling] spent 0.0085 seconds in colorspaces_transform_lab_to_rgb_matrix
468.342983 [opencl_profiling] spent 0.0108 seconds in channelmixerrgb_CAT16
468.342985 [opencl_profiling] spent 18.1803 seconds in diffuse_blur_bspline
468.342986 [opencl_profiling] spent 19.9586 seconds in diffuse_pde
468.342988 [opencl_profiling] spent 0.0177 seconds in colorbalancergb
468.342990 [opencl_profiling] spent 0.0064 seconds in filmic_mask_clipped_pixels
468.342991 [opencl_profiling] spent 0.0088 seconds in filmicrgb_chroma
468.342993 [opencl_profiling] spent 0.0087 seconds in colorspaces_transform_rgb_matrix_to_lab
468.342995 [opencl_profiling] spent 0.0166 seconds in colorout
468.342997 [opencl_profiling] spent 38.7750 seconds totally in command queue (with 3 events missing)
468.343011 [dev_process_export] pixel pipeline processing took 291.907 secs (2915.494 CPU)
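To put numbers on it: the three diffuse or sharpen instances account for 106.707 + 46.582 + 137.392 = 290.681 secs of the 291.907 secs total, all of it on the CPU. Below is a minimal sketch of how such a log could be summarized per device. The regular expression is my own guess based on the line format shown above, not something darktable provides, and SAMPLE holds only a few lines copied from the log.

```python
import re
from collections import defaultdict

# Assumed line format, taken from the log above, e.g.:
# 176.546673 [dev_pixelpipe] took 0.089 secs (0.048 CPU) processed `demosaic' on GPU, ...
LINE_RE = re.compile(
    r"\[dev_pixelpipe\] took (?P<wall>[\d.]+) secs \((?P<cpu>[\d.]+) CPU\) "
    r"processed `(?P<module>[^']+)' on (?P<device>CPU|GPU)"
)


def summarize(log_text):
    """Return ({device: total wall seconds}, [(module, wall seconds, device), ...])."""
    totals = defaultdict(float)
    modules = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # tiling, profiling and other lines are ignored
        wall = float(match.group("wall"))
        totals[match.group("device")] += wall
        modules.append((match.group("module"), wall, match.group("device")))
    return dict(totals), modules


# A few lines copied from the export log above.
SAMPLE = """\
176.546673 [dev_pixelpipe] took 0.089 secs (0.048 CPU) processed `demosaic' on GPU, blended on GPU [export]
283.749282 [dev_pixelpipe] took 106.707 secs (1127.128 CPU) processed `diffuse or sharpen' on CPU, blended on CPU [export]
330.331615 [dev_pixelpipe] took 46.582 secs (446.771 CPU) processed `diffuse or sharpen 1' on CPU, blended on CPU [export]
467.724100 [dev_pixelpipe] took 137.392 secs (1337.772 CPU) processed `diffuse or sharpen 2' on CPU, blended on CPU [export]
"""

if __name__ == "__main__":
    totals, modules = summarize(SAMPLE)
    # Slowest modules first.
    for name, wall, device in sorted(modules, key=lambda item: -item[1]):
        print(f"{wall:10.3f} s  {device}  {name}")
    print(totals)
```

Feeding it the whole log instead of SAMPLE should work the same way, since lines that do not match are skipped.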
Card info:
0.051513 [opencl_init] device 0 `NVIDIA GeForce GTX 1060 6GB' has sm_20 support.
0.051716 [opencl_init] device 0 `NVIDIA GeForce GTX 1060 6GB' supports image sizes of 16384 x 32768
0.051720 [opencl_init] device 0 `NVIDIA GeForce GTX 1060 6GB' allows GPU memory allocations of up to 1519MB
[opencl_init] device 0: NVIDIA GeForce GTX 1060 6GB
CANONICAL_NAME: nvidiag
GLOBAL_MEM_SIZE: 6077MB
MAX_WORK_GROUP_SIZE: 1024
MAX_WORK_ITEM_DIMENSIONS: 3
MAX_WORK_ITEM_SIZES: [ 1024 1024 64 ]
DRIVER_VERSION: 470.82.00
DEVICE_VERSION: OpenCL 3.0 CUDA
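As far as I can tell from the OpenCL headers, error code -4 is CL_MEM_OBJECT_ALLOCATION_FAILURE, which would be consistent with the module running out of device memory, though I have not checked this against the darktable source. For a rough sense of scale, here is a small sketch that estimates the size of a single tile buffer for the tile dimensions in the log. The 4 channels of 32-bit float per pixel is my assumption, and the log does not show how many such buffers the diffuse module allocates at once.

```python
# Assumption (not from the log): each pixel is stored as 4 channels of 32-bit float.
BYTES_PER_PIXEL = 4 * 4

# Values reported by opencl_init above. Whether darktable's "MB" means
# MiB or 10^6 bytes is not clear from the log; it is treated as MiB here.
MAX_ALLOC_MB = 1519
GLOBAL_MEM_MB = 6077

# Tile dimensions reported by default_process_tiling_cl_ptp for each instance.
TILES = {
    "diffuse or sharpen": (4648, 4924),
    "diffuse or sharpen 1": (7368, 4924),
    "diffuse or sharpen 2": (6168, 4924),
}


def buffer_mib(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size in MiB of one full-tile buffer under the assumed pixel format."""
    return width * height * bytes_per_pixel / (1024 * 1024)


if __name__ == "__main__":
    for name, (width, height) in TILES.items():
        size = buffer_mib(width, height)
        print(
            f"{name}: {width} x {height} -> {size:.0f} MiB per buffer, "
            f"{GLOBAL_MEM_MB / size:.1f} such buffers would fill the card"
        )
```

Under that assumption a single buffer comes to roughly 349, 554 and 463 MiB for the three instances, each well below the 1519MB per-allocation limit, so if memory is the problem it would be the number of buffers held at the same time rather than any single one.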
According to nvidia-smi, with the browser open and darktable idling in the background, 465MiB of 6077MiB is in use.
Is there some setting I could change to allow the operation to run on the GPU? In the manual, the only setting I found that mentions OpenCL and tiling is opencl_use_pinned_memory, which only concerns the speed of memory copies; I found nothing about OpenCL and tiling memory sizes. Starting with an empty darktable config did not help.