Testing with Phoronix vs. darktable-cli

In another thread it was said that Phoronix is problematic with dt 5.

Can you explain please why or point me to a discussion?

openbenchmarking.org / result / 2505127-NE-BEELINKDA70

The link says 6.012 for the boat image (note: OpenCL cannot be used with this PC)

$ darktable-cli bench.SRW bench.SRW.xmp bench.jpg --core --disable-opencl -d perf

darktable 5.0.1
Copyright (C) 2012-2025 Johannes Hanika and other contributors.

Compile options:
  Bit depth              -> 64 bit
  Debug                  -> DISABLED
  SSE2 optimizations     -> ENABLED
  OpenMP                 -> ENABLED
  OpenCL                 -> ENABLED
  Lua                    -> ENABLED  - API version 9.4.0
  Colord                 -> ENABLED
  gPhoto2                -> ENABLED
  GMIC                   -> ENABLED  - Compressed LUTs are supported
  GraphicsMagick         -> ENABLED
  ImageMagick            -> DISABLED
  libavif                -> DISABLED
  libheif                -> ENABLED
  libjxl                 -> ENABLED
  LibRaw                 -> ENABLED  - Version 0.22.0-Devel202403
  OpenJPEG               -> ENABLED
  OpenEXR                -> ENABLED
  WebP                   -> ENABLED

See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.

     0.8851 [dt_dev_load_raw] loading the image. took 0.104 secs (0.190 CPU)
     0.9509 [export] creating pixelpipe took 0.062 secs (0.664 CPU)
     0.9550 [dev_pixelpipe] took 0.003 secs (0.005 CPU) initing base buffer [export]
     0.9614 [dev_pixelpipe] took 0.006 secs (0.042 CPU) [export] processed `rawprepare' on CPU, blended on CPU
     0.9676 [dev_pixelpipe] took 0.006 secs (0.058 CPU) [export] processed `temperature' on CPU, blended on CPU
     1.0055 [dev_pixelpipe] took 0.038 secs (0.580 CPU) [export] processed `highlights' on CPU, blended on CPU
     1.2321 [dev_pixelpipe] took 0.227 secs (3.273 CPU) [export] processed `demosaic' on CPU, blended on CPU
     2.2496 [dev_pixelpipe] took 1.017 secs (3.316 CPU) [export] processed `tonemap' on CPU, blended on CPU
     2.6157 [dev_pixelpipe] took 0.366 secs (3.634 CPU) [export] processed `lens' on CPU, blended on CPU
     2.6467 [dev_pixelpipe] took 0.031 secs (0.463 CPU) [export] processed `basecurve' on CPU, blended on CPU
     2.6684 [dev_pixelpipe] took 0.022 secs (0.305 CPU) [export] processed `colorin' on CPU, blended on CPU
     2.7298 [dev_pixelpipe] took 0.061 secs (0.953 CPU) [export] processed `colorreconstruct' on CPU, blended on CPU
     5.4749 [dev_pixelpipe] took 2.745 secs (43.449 CPU) [export] processed `nlmeans' on CPU, blended on CPU
     5.5678 [dev_pixelpipe] took 0.092 secs (1.446 CPU) [export] processed `globaltonemap' on CPU, blended on CPU
     5.7367 [dev_pixelpipe] took 0.169 secs (2.141 CPU) [export] processed `shadhi' on CPU, blended on CPU
     6.7369 [dev_pixelpipe] took 1.000 secs (14.916 CPU) [export] processed `atrous' on CPU, blended on CPU
     6.7834 [dev_pixelpipe] took 0.046 secs (0.719 CPU) [export] processed `bilat' on CPU, blended on CPU
     6.8683 [dev_pixelpipe] took 0.085 secs (1.281 CPU) [export] processed `colorzones' on CPU, blended on CPU
     6.8976 [dev_pixelpipe] took 0.029 secs (0.412 CPU) [export] processed `levels' on CPU, blended on CPU
     6.9422 [dev_pixelpipe] took 0.045 secs (0.636 CPU) [export] processed `sharpen' on CPU, blended on CPU
     6.9588 [dev_pixelpipe] took 0.017 secs (0.240 CPU) [export] processed `colorcontrast' on CPU, blended on CPU
     6.9999 [dev_pixelpipe] took 0.041 secs (0.636 CPU) [export] processed `colorout' on CPU, blended on CPU
     7.0249 [resample_plain] took 0.025 secs (0.345 CPU) 1:1 copy/crop of 5490x3660 pixels
     7.0249 [dev_pixelpipe] took 0.025 secs (0.345 CPU) [export] processed `finalscale' on CPU, blended on CPU
     7.0249 [dev_process_export] pixel pipeline processing took 6.074 secs (78.894 CPU)
     7.3212 [export_job] exported to `bench.jpg'

darktable-cli says 6.074 secs. That is nearly the exact same value.
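
To compare runs without reading the log by eye, the `-d perf` output can be parsed. The following is a minimal sketch that assumes the line format shown in the log above; the function name `parse_perf_log` and the regular expressions are illustrative and not part of darktable.

```python
import re

# Matches per-module lines such as:
#   1.2321 [dev_pixelpipe] took 0.227 secs (3.273 CPU) [export] processed `demosaic' on CPU, ...
MODULE_RE = re.compile(
    r"\[dev_pixelpipe\] took (?P<wall>[\d.]+) secs \((?P<cpu>[\d.]+) CPU\) "
    r"\[export\] processed `(?P<name>\w+)' on (?P<device>CPU|GPU)"
)
# Matches the summary line printed at the end of the export pipeline.
TOTAL_RE = re.compile(
    r"\[dev_process_export\] pixel pipeline processing took (?P<wall>[\d.]+) secs"
)


def parse_perf_log(text):
    """Return ([(module, wall_seconds, device), ...], total_seconds or None)."""
    modules = []
    total = None
    for line in text.splitlines():
        m = MODULE_RE.search(line)
        if m:
            modules.append((m["name"], float(m["wall"]), m["device"]))
            continue
        t = TOTAL_RE.search(line)
        if t:
            total = float(t["wall"])
    return modules, total


# A few lines copied from the boat-image run above.
SAMPLE = """\
     1.2321 [dev_pixelpipe] took 0.227 secs (3.273 CPU) [export] processed `demosaic' on CPU, blended on CPU
     2.2496 [dev_pixelpipe] took 1.017 secs (3.316 CPU) [export] processed `tonemap' on CPU, blended on CPU
     5.4749 [dev_pixelpipe] took 2.745 secs (43.449 CPU) [export] processed `nlmeans' on CPU, blended on CPU
     6.7369 [dev_pixelpipe] took 1.000 secs (14.916 CPU) [export] processed `atrous' on CPU, blended on CPU
     7.0249 [dev_process_export] pixel pipeline processing took 6.074 secs (78.894 CPU)
"""

modules, total = parse_perf_log(SAMPLE)
# Sort by wall-clock time so the most expensive modules come first.
slowest = sorted(modules, key=lambda m: m[1], reverse=True)
for name, wall, device in slowest:
    print(f"{name:10s} {wall:6.3f} s on {device}")
print(f"pipeline total: {total} s")
```

Saving the full output of each run to a file and feeding it to `parse_perf_log` gives one comparable number per machine (the pipeline total) plus a list of the modules that dominate it.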

How can I compare with other CPUs / GPUs?

Why is my link to openbenchmarking blocked? In another posting it was not.

The xmp Phoronix is using has older modules. It does not reflect a typical use case anymore. I think they should update their xmp like Sarunas did (GPU benchmarks in darktable)

test-files can be found here:
https://math.dartmouth.edu/~sarunas/bench_raw/

Would it be possible to add these files to

.phoronix-test-suite/installed-tests/system/darktable-1.0.5/

Maybe rename it to “bench”?

At the moment I want to compare machines / systems. So I don’t care much about a typical use case.

I would like to be able to say machine A is 2x faster than machine B.

In the end I would like to be able to say graphics adapter X with CPU Y is best for my needs.

I have tested the setubal image with my mini PC. It takes 11.6 sec, while the boat image takes about 6 sec. arecibo.orf (not for dt 4) takes 4.3 sec.

But how can this help me to compare the speed?

I am not interested in how long it takes to render 1 photo; it takes as long as it takes. I want to know how much faster another machine is.

Let’s say a typical use case is 50 photos to render; in the worst case it is 350 photos.

So if it takes 10 min to render them it would be OK. Faster is always better, but 10 hrs would be unacceptable.
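
The batch sizes above can be turned into rough numbers. This is only a back-of-the-envelope sketch: it assumes every photo costs the same as the setubal export measured on the mini PC (11.6 sec) and that the batch scales linearly, which real photos and edits will not follow exactly. The helper names are made up for illustration.

```python
def batch_minutes(seconds_per_photo, photos):
    """Total export time in minutes, assuming identical cost per photo."""
    return seconds_per_photo * photos / 60.0


def max_seconds_per_photo(budget_minutes, photos):
    """Per-photo time that would just fit the given time budget."""
    return budget_minutes * 60.0 / photos


SETUBAL_SECONDS = 11.6  # setubal export time on the mini PC, from the post

for photos in (50, 350):
    minutes = batch_minutes(SETUBAL_SECONDS, photos)
    print(f"{photos:3d} photos: about {minutes:.1f} min")

# Per-photo time needed to finish the worst case within 10 minutes.
print(f"needed for 350 photos in 10 min: {max_seconds_per_photo(10, 350):.2f} s per photo")
```

Under these assumptions the typical batch of 50 photos already fits into about 10 minutes, while the worst case of 350 photos takes a bit over an hour, far from the 10 hours considered unacceptable.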

My goal is to set up 2 PCs:

1 mini PC (at the moment I have a Beelink SER 6).
Use case: mobile phone JPGs with 27 MP; raw is too large, at about 50 MB per photo.
and
1 mid-tower with a good graphics adapter.
Use case: DNGs from a full-frame DSLR.

I want to get a feeling for the “sweet spot”.

$ darktable-cli setubal.orf setubal.orf.xmp bench.jpg --core --disable-opencl -d perf
darktable 5.0.1
Copyright (C) 2012-2025 Johannes Hanika and other contributors.

Compile options:
  Bit depth              -> 64 bit
  Debug                  -> DISABLED
  SSE2 optimizations     -> ENABLED
  OpenMP                 -> ENABLED
  OpenCL                 -> ENABLED
  Lua                    -> ENABLED  - API version 9.4.0
  Colord                 -> ENABLED
  gPhoto2                -> ENABLED
  GMIC                   -> ENABLED  - Compressed LUTs are supported
  GraphicsMagick         -> ENABLED
  ImageMagick            -> DISABLED
  libavif                -> DISABLED
  libheif                -> ENABLED
  libjxl                 -> ENABLED
  LibRaw                 -> ENABLED  - Version 0.22.0-Devel202403
  OpenJPEG               -> ENABLED
  OpenEXR                -> ENABLED
  WebP                   -> ENABLED

See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.

     1.2741 [xmp_import] creating tag: darktable|format|orf
     1.8760 [dt_dev_load_raw] loading the image. took 0.593 secs (0.600 CPU)
     1.9467 [export] creating pixelpipe took 0.067 secs (0.705 CPU)
     1.9470 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
     1.9632 [dev_pixelpipe] took 0.016 secs (0.116 CPU) [export] processed `rawprepare' on CPU, blended on CPU
     1.9783 [dev_pixelpipe] took 0.015 secs (0.064 CPU) [export] processed `temperature' on CPU, blended on CPU
     2.0156 [dev_pixelpipe] took 0.037 secs (0.455 CPU) [export] processed `highlights' on CPU, blended on CPU
     2.0399 [dev_pixelpipe] took 0.024 secs (0.251 CPU) [export] processed `hotpixels' on CPU, blended on CPU
     2.2326 [dev_pixelpipe] took 0.193 secs (2.492 CPU) [export] processed `demosaic' on CPU, blended on CPU
     6.0234 [dev_pixelpipe] took 3.791 secs (48.388 CPU) [export] processed `denoiseprofile' on CPU, blended on CPU
     6.9767 [dev_pixelpipe] took 0.953 secs (9.588 CPU) [export] processed `lens' on CPU, blended on CPU
     7.1755 [dev_pixelpipe] took 0.199 secs (3.029 CPU) [export] processed `ashift' on CPU, blended on CPU
     7.2378 [dev_pixelpipe] took 0.062 secs (0.910 CPU) [export] processed `exposure' on CPU, blended on CPU
     7.2895 [dev_pixelpipe] took 0.052 secs (0.717 CPU) [export] processed `colorin' on CPU, blended on CPU
     7.3328 [dt_ioppr_transform_image_colorspace] IOP_CS_LAB-->IOP_CS_RGB took 0.043 secs (0.672 CPU) [channelmixerrgb]
     7.6155 [dev_pixelpipe] took 0.326 secs (5.173 CPU) [export] processed `channelmixerrgb' on CPU, blended on CPU
     7.6643 [dt_ioppr_transform_image_colorspace] IOP_CS_RGB-->IOP_CS_LAB took 0.049 secs (0.692 CPU) [atrous]
    10.1438 [dev_pixelpipe] took 2.528 secs (37.394 CPU) [export] processed `atrous' on CPU, blended on CPU
    10.1895 [dt_ioppr_transform_image_colorspace] IOP_CS_LAB-->IOP_CS_RGB took 0.046 secs (0.638 CPU) [colorbalancergb]
    11.3599 [dev_pixelpipe] took 1.216 secs (19.293 CPU) [export] processed `colorbalancergb' on CPU, blended on CPU
    11.4067 [dev_pixelpipe] took 0.047 secs (0.657 CPU) [export] processed `rgblevels' on CPU, blended on CPU
    11.8058 [dev_pixelpipe] took 0.399 secs (6.325 CPU) [export] processed `sigmoid' on CPU, blended on CPU
    11.8548 [dt_ioppr_transform_image_colorspace] IOP_CS_RGB-->IOP_CS_LAB took 0.049 secs (0.728 CPU) [bilat]
    13.3339 [dev_pixelpipe] took 1.527 secs (11.874 CPU) [export] processed `bilat' on CPU, blended on CPU
    13.4261 [dev_pixelpipe] took 0.092 secs (1.296 CPU) [export] processed `colorout' on CPU, blended on CPU
    13.4900 [resample_plain] took 0.062 secs (0.884 CPU) 1:1 copy/crop of 8065x6046 pixels
    13.4900 [dev_pixelpipe] took 0.063 secs (0.898 CPU) [export] processed `finalscale' on CPU, blended on CPU
    13.4915 [dev_process_export] pixel pipeline processing took 11.545 secs (148.952 CPU)
    14.3820 [export_job] exported to `bench.jpg'

arecibo.orf

$ darktable-cli arecibo.orf arecibo.orf.xmp bench.jpg --core --disable-opencl -d perf
output file already exists, it will get renamed
darktable 5.0.1
Copyright (C) 2012-2025 Johannes Hanika and other contributors.

Compile options:
  Bit depth              -> 64 bit
  Debug                  -> DISABLED
  SSE2 optimizations     -> ENABLED
  OpenMP                 -> ENABLED
  OpenCL                 -> ENABLED
  Lua                    -> ENABLED  - API version 9.4.0
  Colord                 -> ENABLED
  gPhoto2                -> ENABLED
  GMIC                   -> ENABLED  - Compressed LUTs are supported
  GraphicsMagick         -> ENABLED
  ImageMagick            -> DISABLED
  libavif                -> DISABLED
  libheif                -> ENABLED
  libjxl                 -> ENABLED
  LibRaw                 -> ENABLED  - Version 0.22.0-Devel202403
  OpenJPEG               -> ENABLED
  OpenEXR                -> ENABLED
  WebP                   -> ENABLED

See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.

     1.0862 [exif] Warning: lens "OLYMPUS M.12-200mm F3.5-6.3" unknown as "0 0 50 16 0 0"
     1.0902 [xmp_import] creating tag: Arecibo
     1.0926 [xmp_import] creating tag: Puerto Rico
     1.0946 [xmp_import] creating tag: observatory
     1.0966 [xmp_import] creating tag: panorama
     1.3621 [dt_dev_load_raw] loading the image. took 0.254 secs (0.213 CPU)
     1.4081 [export] creating pixelpipe took 0.043 secs (0.336 CPU)
     1.4092 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
     1.4150 [dev_pixelpipe] took 0.006 secs (0.033 CPU) [export] processed `rawprepare' on CPU, blended on CPU
     1.4224 [dev_pixelpipe] took 0.007 secs (0.045 CPU) [export] processed `temperature' on CPU, blended on CPU
     1.4619 [dev_pixelpipe] took 0.039 secs (0.537 CPU) [export] processed `highlights' on CPU, blended on CPU
     1.5352 [dev_pixelpipe] took 0.073 secs (1.017 CPU) [export] processed `demosaic' on CPU, blended on CPU
     2.9918 [dev_pixelpipe] took 1.457 secs (20.466 CPU) [export] processed `denoiseprofile' on CPU, blended on CPU
     3.3596 [dev_pixelpipe] took 0.368 secs (3.643 CPU) [export] processed `lens' on CPU, blended on CPU
     3.8463 [dev_pixelpipe] took 0.487 secs (5.143 CPU) [export] processed `hazeremoval' on CPU, blended on CPU
     3.9133 [dev_pixelpipe] took 0.067 secs (1.052 CPU) [export] processed `ashift' on CPU, blended on CPU
     3.9345 [dev_pixelpipe] took 0.021 secs (0.331 CPU) [export] processed `exposure' on CPU, blended on CPU
     3.9487 [dev_pixelpipe] took 0.014 secs (0.058 CPU) [export] processed `crop' on CPU, blended on CPU
     3.9647 [dev_pixelpipe] took 0.016 secs (0.239 CPU) [export] processed `colorin' on CPU, blended on CPU
     3.9801 [dt_ioppr_transform_image_colorspace] IOP_CS_LAB-->IOP_CS_RGB took 0.015 secs (0.225 CPU) [channelmixerrgb]
     4.0765 [dev_pixelpipe] took 0.112 secs (1.768 CPU) [export] processed `channelmixerrgb' on CPU, blended on CPU
     4.0951 [dt_ioppr_transform_image_colorspace] IOP_CS_RGB-->IOP_CS_LAB took 0.019 secs (0.264 CPU) [atrous]
     4.9048 [dev_pixelpipe] took 0.828 secs (12.434 CPU) [export] processed `atrous' on CPU, blended on CPU
     4.9208 [dt_ioppr_transform_image_colorspace] IOP_CS_LAB-->IOP_CS_RGB took 0.016 secs (0.231 CPU) [colorbalancergb]
     5.3155 [dev_pixelpipe] took 0.411 secs (6.538 CPU) [export] processed `colorbalancergb' on CPU, blended on CPU
     5.4484 [dev_pixelpipe] took 0.133 secs (2.123 CPU) [export] processed `sigmoid' on CPU, blended on CPU
     5.4658 [dt_ioppr_transform_image_colorspace] IOP_CS_RGB-->IOP_CS_LAB took 0.017 secs (0.257 CPU) [shadhi]
     5.5887 [dev_pixelpipe] took 0.140 secs (1.797 CPU) [export] processed `shadhi' on CPU, blended on CPU
     5.6271 [dev_pixelpipe] took 0.038 secs (0.599 CPU) [export] processed `bilat' on CPU, blended on CPU
     5.6579 [dev_pixelpipe] took 0.031 secs (0.477 CPU) [export] processed `colorout' on CPU, blended on CPU
     5.6787 [resample_plain] took 0.021 secs (0.324 CPU) 1:1 copy/crop of 4665x3521 pixels
     5.6787 [dev_pixelpipe] took 0.021 secs (0.325 CPU) [export] processed `finalscale' on CPU, blended on CPU
     5.6788 [dev_process_export] pixel pipeline processing took 4.271 secs (58.637 CPU)
     5.9934 [export_job] exported to `bench_01.jpg'

I don’t understand what you are trying to accomplish. I suggest you grab an image from your camera, process it the way you like it with the modules you want, and then use the xmp on both machines to compare their performance, if that’s what you are after.

And that is exactly the problem: it is impossible, because I want to buy new components which I don’t own at the moment.

I can compare my Beelink SER 6 Max Ryzen 7 7735HS (probably no OpenCL; rusticl not tried) with my Ryzen 7 3700X and GeForce GTX 1660.

The minipc takes about 12sec while the GTX 1660 takes about 5sec.

I want to get an idea of how fast a graphics adapter in a midtower-pc of 2025 can be, and how fast a mini PC using OpenCL can be with darktable.

GPU benchmarks in darktable shows very old graphics adapters; that doesn’t help me.

$  darktable-cli setubal.orf setubal.orf.xmp test.jpg --core -d perf -d opencl
darktable 5.0.1
Copyright (C) 2012-2025 Johannes Hanika and other contributors.

Compile options:
  Bit depth              -> 64 bit
  Debug                  -> DISABLED
  SSE2 optimizations     -> ENABLED
  OpenMP                 -> ENABLED
  OpenCL                 -> ENABLED
  Lua                    -> ENABLED  - API version 9.4.0
  Colord                 -> ENABLED
  gPhoto2                -> ENABLED
  GMIC                   -> ENABLED  - Compressed LUTs are supported
  GraphicsMagick         -> ENABLED
  ImageMagick            -> DISABLED
  libavif                -> DISABLED
  libheif                -> ENABLED
  libjxl                 -> ENABLED
  LibRaw                 -> ENABLED  - Version 0.22.0-Devel202403
  OpenJPEG               -> ENABLED
  OpenEXR                -> ENABLED
  WebP                   -> ENABLED

See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.

     0.2727 [dt_get_sysresource_level] switched to 1 as `default'
     0.2727   total mem:       32009MB
     0.2727   mipmap cache:    4001MB
     0.2727   available mem:   16004MB
     0.2727   singlebuff:      250MB
     0.3232 [dt_dlopencl_init] could not find default opencl runtime library 'libOpenCL'
     0.3233 [dt_dlopencl_init] could not find default opencl runtime library 'libOpenCL.so'
     0.3235 [opencl_init] opencl library 'libOpenCL.so.1' found on your system and loaded, preference 'default path'
     0.4491 [opencl_init] found 1 platform
[opencl_init] found 1 device

[dt_opencl_device_init]
   DEVICE:                   0: 'NVIDIA GeForce GTX 1660'
   CONF KEY:                 cldevice_v5_nvidiacudanvidiageforcegtx1660
   PLATFORM, VENDOR & ID:    NVIDIA CUDA, NVIDIA Corporation, ID=4318
   CANONICAL NAME:           nvidiacudanvidiageforcegtx1660
   DRIVER VERSION:           570.133.07
   DEVICE VERSION:           OpenCL 3.0 CUDA, SM_20 SUPPORT
   DEVICE_TYPE:              GPU, dedicated mem
   GLOBAL MEM SIZE:          5744 MB
   MAX MEM ALLOC:            1436 MB
   MAX IMAGE SIZE:           32768 x 32768
   MAX WORK GROUP SIZE:      1024
   MAX WORK ITEM DIMENSIONS: 3
   MAX WORK ITEM SIZES:      [ 1024 1024 64 ]
   ASYNC PIXELPIPE:          NO
   PINNED MEMORY TRANSFER:   NO
   AVOID ATOMICS:            NO
   MICRO NAP:                250
   ROUNDUP WIDTH & HEIGHT    16x16
   CHECK EVENT HANDLES:      128
   TILING ADVANTAGE:         0.000
   DEFAULT DEVICE:           NO
   KERNEL BUILD DIRECTORY:   /usr/share/darktable/kernels
   KERNEL DIRECTORY:         /home/ab/.cache/darktable/cached_v5_kernels_for_NVIDIACUDANVIDIAGeForceGTX1660_57013307
   CL COMPILER OPTION:       -cl-fast-relaxed-math
   CL COMPILER COMMAND:      -w -cl-fast-relaxed-math -DNVIDIA_SM_20=1 -DNVIDIA=1 -I"/usr/share/darktable/kernels"
   KERNEL LOADING TIME:       0.1452 sec
[opencl_init] OpenCL successfully initialized. internal numbers and names of available devices:
[opencl_init]		0	'NVIDIA CUDA NVIDIA GeForce GTX 1660'
     0.7481 [opencl_init] FINALLY: opencl PREFERENCE=ON is AVAILABLE and ENABLED.
[opencl_init] opencl_scheduling_profile: 'default'
[opencl_init] opencl_device_priority: '*/!0,*/*/*/!0,*'
[opencl_init] opencl_mandatory_timeout: 1000
[opencl_update_priorities] these are your device priorities:
[opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	-1	0	0	-1
[opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[opencl_update_priorities] 		image	preview	export	thumbs	preview2
[opencl_update_priorities]		0	0	0	0	0
[opencl_synchronization_timeout] synchronization timeout set to 200
[opencl_update_priorities] these are your device priorities:
[opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	-1	0	0	-1
[opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[opencl_update_priorities] 		image	preview	export	thumbs	preview2
[opencl_update_priorities]		0	0	0	0	0
[opencl_synchronization_timeout] synchronization timeout set to 200
     1.5445 [xmp_import] creating tag: darktable|format|orf
     2.2869 [dt_dev_load_raw] loading the image. took 0.729 secs (0.770 CPU)
     2.3467 [export] creating pixelpipe took 0.056 secs (0.446 CPU)
     2.3467 [dt_opencl_check_tuning] use 3516MB (headroom=OFF, pinning=OFF) on device `NVIDIA CUDA NVIDIA GeForce GTX 1660' id=0
     2.3472 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
     2.3595 [dev_pixelpipe] took 0.012 secs (0.090 CPU) [export] processed `rawprepare' on GPU, blended on GPU
     2.3633 [dev_pixelpipe] took 0.004 secs (0.003 CPU) [export] processed `temperature' on GPU, blended on GPU
     2.3753 [dev_pixelpipe] took 0.012 secs (0.011 CPU) [export] processed `highlights' on GPU, blended on GPU
     2.5082 [dev_pixelpipe] took 0.133 secs (0.272 CPU) [export] processed `hotpixels' on CPU, blended on CPU
     2.8865 [dev_pixelpipe] took 0.378 secs (0.222 CPU) [export] processed `demosaic' on GPU with tiling, blended on CPU
     3.7928 [dev_pixelpipe] took 0.906 secs (0.656 CPU) [export] processed `denoiseprofile' on GPU with tiling, blended on CPU
     4.2717 [dev_pixelpipe] took 0.479 secs (1.466 CPU) [export] processed `lens' on GPU, blended on GPU
     4.2945 [dev_pixelpipe] took 0.023 secs (0.022 CPU) [export] processed `ashift' on GPU, blended on GPU
     4.3123 [dev_pixelpipe] took 0.018 secs (0.016 CPU) [export] processed `exposure' on GPU, blended on GPU
     4.3325 [dev_pixelpipe] took 0.020 secs (0.018 CPU) [export] processed `colorin' on GPU, blended on GPU
     4.3483 [dt_ioppr_transform_image_colorspace_cl] IOP_CS_LAB-->IOP_CS_RGB took 0.015 secs (0.014 GPU) [channelmixerrgb]
     4.3784 [dev_pixelpipe] took 0.046 secs (0.041 CPU) [export] processed `channelmixerrgb' on GPU, blended on GPU
     4.4849 [dt_ioppr_transform_image_colorspace] IOP_CS_RGB-->IOP_CS_LAB took 0.044 secs (0.692 CPU) [atrous]
     5.3902 [dev_pixelpipe] took 1.012 secs (1.702 CPU) [export] processed `atrous' on GPU with tiling, blended on CPU
     5.5021 [dt_ioppr_transform_image_colorspace_cl] IOP_CS_LAB-->IOP_CS_RGB took 0.012 secs (0.010 GPU) [colorbalancergb]
     5.5327 [dev_pixelpipe] took 0.142 secs (0.137 CPU) [export] processed `colorbalancergb' on GPU, blended on GPU
     5.5551 [dev_pixelpipe] took 0.022 secs (0.019 CPU) [export] processed `rgblevels' on GPU, blended on GPU
     5.5733 [dev_pixelpipe] took 0.018 secs (0.016 CPU) [export] processed `sigmoid' on GPU, blended on GPU
     5.6794 [dt_ioppr_transform_image_colorspace] IOP_CS_RGB-->IOP_CS_LAB took 0.044 secs (0.682 CPU) [bilat]
     6.8473 [dev_pixelpipe] took 1.274 secs (9.282 CPU) [export] processed `bilat' on CPU, blended on CPU
     6.9313 [dev_pixelpipe] took 0.084 secs (0.081 CPU) [export] processed `colorout' on GPU, blended on GPU
     6.9318 [resample_cl] took 0.000 secs (0.000 CPU) 1:1 copy/crop of 8065x6046 pixels
     6.9464 [dev_pixelpipe] took 0.015 secs (0.014 CPU) [export] processed `finalscale' on GPU, blended on GPU
     7.0089 [opencl_profiling] profiling device 0 ('NVIDIA CUDA NVIDIA GeForce GTX 1660'):
     7.0089 [opencl_profiling] spent  0.5144 seconds in [Write Image (from host to device)]
     7.0089 [opencl_profiling] spent  0.0022 seconds in rawprepare_1f
     7.0089 [opencl_profiling] spent  0.0031 seconds in whitebalance_1f
     7.0089 [opencl_profiling] spent  0.0018 seconds in highlights_initmask
     7.0089 [opencl_profiling] spent  0.0019 seconds in highlights_dilatemask
     7.0089 [opencl_profiling] spent  0.1562 seconds in [Write Buffer (from host to device)]
     7.0089 [opencl_profiling] spent  0.0040 seconds in highlights_chroma
     7.0089 [opencl_profiling] spent  0.0000 seconds in [Read Buffer (from device to host)]
     7.0089 [opencl_profiling] spent  0.0027 seconds in highlights_opposed
     7.0089 [opencl_profiling] spent  0.9045 seconds in [Read Image (from device to host)]
     7.0089 [opencl_profiling] spent  0.0008 seconds in border_interpolate
     7.0089 [opencl_profiling] spent  0.0037 seconds in rcd_border_green
     7.0089 [opencl_profiling] spent  0.0056 seconds in rcd_border_redblue
     7.0089 [opencl_profiling] spent  0.0072 seconds in rcd_populate
     7.0089 [opencl_profiling] spent  0.0052 seconds in rcd_step_1_1
     7.0089 [opencl_profiling] spent  0.0050 seconds in rcd_step_1_2
     7.0089 [opencl_profiling] spent  0.0019 seconds in rcd_step_2_1
     7.0089 [opencl_profiling] spent  0.0053 seconds in rcd_step_3_1
     7.0089 [opencl_profiling] spent  0.0044 seconds in rcd_step_4_1
     7.0089 [opencl_profiling] spent  0.0018 seconds in rcd_step_4_2
     7.0089 [opencl_profiling] spent  0.0043 seconds in rcd_step_5_1
     7.0089 [opencl_profiling] spent  0.0074 seconds in rcd_step_5_2
     7.0089 [opencl_profiling] spent  0.0089 seconds in rcd_write_output
     7.0089 [opencl_profiling] spent  0.0109 seconds in denoiseprofile_precondition_Y0U0V0
     7.0089 [opencl_profiling] spent  0.3282 seconds in denoiseprofile_decompose
     7.0090 [opencl_profiling] spent  0.0347 seconds in denoiseprofile_reduce_first
     7.0090 [opencl_profiling] spent  0.0002 seconds in denoiseprofile_reduce_second
     7.0090 [opencl_profiling] spent  0.1151 seconds in denoiseprofile_synthesize
     7.0090 [opencl_profiling] spent  0.0618 seconds in [Copy Image (on device)]
     7.0090 [opencl_profiling] spent  0.0109 seconds in denoiseprofile_backtransform_Y0U0V0
     7.0090 [opencl_profiling] spent  0.0167 seconds in lens_vignette
     7.0090 [opencl_profiling] spent  0.0372 seconds in lens_distort_bicubic
     7.0090 [opencl_profiling] spent  0.0213 seconds in ashift_bicubic
     7.0090 [opencl_profiling] spent  0.0165 seconds in exposure
     7.0090 [opencl_profiling] spent  0.0151 seconds in colorin_unbound
     7.0090 [opencl_profiling] spent  0.0223 seconds in colorspaces_transform_lab_to_rgb_matrix
     7.0090 [opencl_profiling] spent  0.0167 seconds in channelmixerrgb_CAT16
     7.0090 [opencl_profiling] spent  0.5048 seconds in eaw_decompose
     7.0090 [opencl_profiling] spent  0.1528 seconds in eaw_synthesize
     7.0090 [opencl_profiling] spent  0.0164 seconds in colorbalancergb
     7.0090 [opencl_profiling] spent  0.0150 seconds in rgblevels
     7.0090 [opencl_profiling] spent  0.0168 seconds in sigmoid_loglogistic_per_channel
     7.0090 [opencl_profiling] spent  0.0187 seconds in colorout
     7.0090 [opencl_profiling] spent  3.0843 seconds totally in command queue (with 0 events missing)
     7.0090 [dev_process_export] pixel pipeline processing took 4.662 secs (14.130 CPU)
     7.9862 [export_job] exported to `test.jpg'
 [opencl_summary_statistics] device 'NVIDIA CUDA NVIDIA GeForce GTX 1660' (0): 240 out of 240 events were successful and 0 events lost. max event=239
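
The two setubal logs can be compared module by module. The sketch below copies the per-module times of the heaviest modules from the CPU-only run and the OpenCL run above; note that the runs come from two different machines (the mini PC and the Ryzen 7 3700X with the GTX 1660), so the ratios compare machine against machine rather than CPU against GPU on one system. The dictionary layout and the helper `speedups` are illustrative only.

```python
# Wall-clock seconds per module, copied from the two setubal logs above.
cpu_run = {
    "denoiseprofile": 3.791,
    "atrous": 2.528,
    "bilat": 1.527,
    "colorbalancergb": 1.216,
    "lens": 0.953,
}
opencl_run = {
    "denoiseprofile": 0.906,
    "atrous": 1.012,
    "bilat": 1.274,  # still processed on the CPU in the OpenCL run
    "colorbalancergb": 0.142,
    "lens": 0.479,
}
# "pixel pipeline processing took ..." totals from the same logs.
CPU_TOTAL, OPENCL_TOTAL = 11.545, 4.662


def speedups(slow, fast):
    """Ratio slow/fast for every module present in both runs."""
    return {name: slow[name] / fast[name] for name in slow if name in fast}


ratios = speedups(cpu_run, opencl_run)
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:16s} {ratio:4.1f}x")
print(f"{'whole pipeline':16s} {CPU_TOTAL / OPENCL_TOTAL:4.1f}x")
```

Modules that stay on the CPU (here `bilat`) or that run with tiling gain much less than the others, which is why the overall factor is smaller than the best per-module factor.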

It’s not just the modules used; the required compiler has also changed over the years. That means that the optimisations for the CPU code may have changed, e.g. by using newer assembler instructions when available. It gets worse if you can use an even more recent compiler for your own binary…
(The same may happen for the GPU code, which is typically compiled on your machine with the compiler you have installed…)

And newer modules may have been written to make better use of those possibilities.

So yes, the older benchmarks can show you which system is better for the dt version and image tested, but if your edits are markedly different from the test images, and you use a different dt version, the value of the benchmarks goes down (fast).

Btw, just adding outputs from your runs where they are not relevant to the question only lengthens your posts, and discourages replies…

Then the old benchmarks are completely irrelevant…

Also, the amount of memory on the graphics card is important, to avoid tiling. Tiling will kill your performance, especially for any module that works on a group of input pixels to calculate the output (the larger that group, the more tiles needed once tiling starts to be necessary). Transfer speed to and from the GPU memory plays a role as well.
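
A rough calculation shows why memory matters here. The sketch assumes that a full-resolution pipeline buffer holds 4 channels of 32-bit floats per pixel (an assumption about darktable internals, not something stated in this thread) and uses the setubal image size and the 3516MB that the OpenCL log reports as usable on the GTX 1660, treating that figure as MiB. How many buffers a given module really needs is not modelled.

```python
def buffer_mib(width, height, channels=4, bytes_per_channel=4):
    """Size of one full-resolution buffer in MiB (assumed 4 x float32 per pixel)."""
    return width * height * channels * bytes_per_channel / 2**20


# Image size from the setubal log: "1:1 copy/crop of 8065x6046 pixels".
one_buffer = buffer_mib(8065, 6046)

# Memory darktable reported as usable on the GTX 1660 ("use 3516MB").
usable_mib = 3516
buffers_that_fit = int(usable_mib // one_buffer)

print(f"one buffer: {one_buffer:.0f} MiB")
print(f"full-size buffers that fit on the device: {buffers_that_fit}")
```

With only a handful of full-size buffers fitting on the card, any module that needs input, output and several temporary buffers at once has to fall back to tiling, which matches the `on GPU with tiling` lines for `demosaic`, `denoiseprofile` and `atrous` in the log above.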

So asking for “the speed of a graphics adapter in a midtower-pc of 2025” is about the same as asking for “the airspeed velocity of an unladen swallow”.

So how can I find out if it makes sense for me to buy a faster pc?

My PC is 5 years old, runs Ubuntu 24.04, and is fast enough for most things. The bottlenecks are programs like darktable and HandBrake.

My idea is to use a public testfile like setubal.orf and compare it with newer cpus and gpus.

As a starting point I have values from my old pcs.

If I understood everything right, so many things influence the speed that it matters less which file I use for the comparison than that I always use the same one.
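
Using the same file every time also makes it easy to repeat the run and take the median, which smooths out disk caching and background load. The following sketch times an arbitrary command several times; the `darktable-cli` arguments are the ones used earlier in this thread, while `time_command`, `summarize` and the number of runs are illustrative choices. It measures the wall-clock time of the whole process (start-up, loading and JPEG writing included), so the numbers will be higher than the "pixel pipeline processing" figure in the log.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times and return the wall-clock seconds of each run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return (median, fastest, slowest) of the measured runs."""
    return statistics.median(samples), min(samples), max(samples)


# Command line as used in the posts above (CPU-only setubal export).
DT_CMD = ["darktable-cli", "setubal.orf", "setubal.orf.xmp", "bench.jpg",
          "--core", "--disable-opencl", "-d", "perf"]

# Stand-in command so the sketch runs on machines without darktable;
# replace it with DT_CMD for a real measurement.
demo_cmd = [sys.executable, "-c", "pass"]
median, fastest, slowest = summarize(time_command(demo_cmd, runs=3))
print(f"median {median:.3f} s (min {fastest:.3f} s, max {slowest:.3f} s)")
```

Running the same script with the same raw and xmp on each machine gives one median per machine that can be compared directly.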

On my PCs there is almost no difference in time between dt 5 and 4.x, as long as I use the same file. There is also no noticeable difference between Ubuntu 24.04, 22.04 and 25.04. I am now on 24.04.

Maybe I should start a hardware topic instead?

Maybe you can ask whether anyone has recently updated their PC to something fairly fast, and share an image and an xmp file so that you can see what the potential boost might be… maybe it’s a new topic

There is probably no sense anymore in testing with boat or arecibo: darktable evolves quickly, and the processing order and modules used there might be outdated. Also, computer hardware becomes more powerful and those old test cases are just too “easy”, i.e. they are processed very quickly and it’s harder to see the difference. That’s why I added setubal, which includes more processing and has many details to be worked on. Even then, the xmp has already been updated several times for new versions of darktable and its modules (though not the set of modules or the parameters in them).

I lost interest in Nvidia’s “consumer” GPUs, so yes, no contemporary ones were tested. But the AMD and Intel ones tested are quite recent.

Well, Ryzen 7 9700X 8C/16T @5.4GHz, 32GB @6.4GT/s RAM, X870 chipset, Linux 6.12.20, Debian Sid as of 2025.03 does setubal.orf in 7s without OpenCL.

With OpenCL kernels running on an Intel Arc B580 GPU, setubal.orf takes 1.4 s.
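
Put side by side, the setubal.orf figures mentioned in this thread give a first rough ranking. The sketch below expresses each result relative to the mini PC. The thread does not say whether the 7 s and 1.4 s figures are pipeline times or total run times, and the darktable versions may differ, so the factors are indicative at best. The labels and the variable names are illustrative.

```python
# setubal.orf times in seconds as reported in this thread.
results = {
    "Beelink, Ryzen 7 7735HS, no OpenCL": 11.545,
    "Ryzen 7 3700X + GTX 1660, OpenCL": 4.662,
    "Ryzen 7 9700X, no OpenCL": 7.0,
    "Intel Arc B580, OpenCL": 1.4,
}

baseline = results["Beelink, Ryzen 7 7735HS, no OpenCL"]

# Fastest first; the factor says how many times faster than the mini PC.
ranking = sorted(results.items(), key=lambda kv: kv[1])
for name, seconds in ranking:
    print(f"{name:36s} {seconds:7.3f} s  {baseline / seconds:4.1f}x")
```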