Looking for benchmarks with a GeForce RTX 3060

Does anyone use a GeForce RTX 3060 and would like to do a benchmark?

There are a few possibilities.

Install Phoronix or do something like this:

darktable-cli setubal.orf setubal.orf.xmp setubal.jpg --core -d perf -d opencl

Test files are from Index of /~sarunas/bench_raw/setubal

If you need any help installing Phoronix, let me know.

Clearly this won't be an exact comparison, but it gives a feeling for how fast the card is compared to another card.

I can do a 3060 Ti for you… I have a 12th gen Intel i5 running Win11 with 64GB of DDR5, with the OS and DT on NVMe… It might not compare to your Linux box, but it's a number… I have a Linux install on another hard drive that I don't often use… if I get time I will check there…

EDIT: Windows result on my PC as configured: about 6.8 seconds…

dtperformance.txt (11.1 KB)


Thanks! Which file did you test?

Edit: I see at the end, it was setubal.

Strange. Here are my values with setubal and a GTX 1660 on Ubuntu 24.04. Full log here: Testing with phoronix vs. darktable-cli - #6 by linuxuser

     7.0090 [opencl_profiling] spent  3.0843 seconds totally in command queue (with 0 events missing)
     7.0090 [dev_process_export] pixel pipeline processing took 4.662 secs (14.130 CPU)
     7.9862 [export_job] exported to `test.jpg'
 [opencl_summary_statistics] device 'NVIDIA CUDA NVIDIA GeForce GTX 1660' (0): 240 out of 240 events were successful and 0 events lost. max event=239

I am really interested in the values of others. It looks too slow for me. My card is very very old.
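To make results from different posters easier to compare, the headline numbers can be pulled out of a `-d perf -d opencl` log automatically. This is only a minimal sketch: the regular expressions are written against the log lines quoted in this thread, and the function name `summarize` and the dictionary keys are made up for illustration. Other darktable versions may word these lines differently.

```python
import re

# Matches e.g. "[dev_process_export] pixel pipeline processing took 4.662 secs (14.130 CPU)"
PIPELINE_RE = re.compile(
    r"\[dev_process_export\] pixel pipeline processing took "
    r"(?P<wall>\d+\.\d+) secs \((?P<cpu>\d+\.\d+) CPU\)"
)
# Matches e.g. "[opencl_profiling] spent  3.0843 seconds totally in command queue"
QUEUE_RE = re.compile(
    r"\[opencl_profiling\] spent\s+(?P<queue>\d+\.\d+) seconds totally in command queue"
)


def summarize(log_text):
    """Return the headline timings found in a darktable-cli perf log.

    Keys are only present if the corresponding line was found, so a
    CPU-only log simply has no "opencl_queue_s" entry.
    """
    summary = {}
    m = PIPELINE_RE.search(log_text)
    if m:
        summary["pipeline_wall_s"] = float(m.group("wall"))
        summary["pipeline_cpu_s"] = float(m.group("cpu"))
    m = QUEUE_RE.search(log_text)
    if m:
        summary["opencl_queue_s"] = float(m.group("queue"))
    return summary


# The three GTX 1660 lines quoted above.
sample = """\
     7.0090 [opencl_profiling] spent  3.0843 seconds totally in command queue (with 0 events missing)
     7.0090 [dev_process_export] pixel pipeline processing took 4.662 secs (14.130 CPU)
     7.9862 [export_job] exported to `test.jpg'
"""
print(summarize(sample))
```

The "pixel pipeline processing took" value is probably the fairest number to compare between machines, since it excludes loading the raw file and writing the JPEG.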

The resource size option may influence the results. With a 12 GB card, it should probably be set to large.

Also, the "always use LCMS2" option can have a large (negative) impact.


The last module (color out) took 3 s on the CPU. Are you using the default settings for that module?

A slow output color profile step is typical when "always use LCMS2" is enabled.

always use LittleCMS 2 to apply output color profile
If this option is activated, darktable will use the LittleCMS 2 system library to apply the output color profile instead of its own internal routines. This is significantly slower than the default but might give more accurate results in some cases.
(darktable user manual - processing)
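Before benchmarking, it may be worth checking how these two preferences are currently set. The sketch below reads `key=value` lines in the style of a darktablerc file and reports the two settings. The key names `resourcelevel` and `plugins/lighttable/export/force_lcms2` are my assumption of how the preferences are stored, so please verify them against your own darktablerc. The helper names are hypothetical as well.

```python
def parse_darktablerc(text):
    """Parse `key=value` lines (darktablerc style) into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key] = value
    return settings


# ASSUMPTION: these key names are how the "resource size" and
# "always use LittleCMS 2" preferences are stored. Verify locally.
BENCHMARK_KEYS = ("resourcelevel", "plugins/lighttable/export/force_lcms2")


def benchmark_settings(text):
    """Return the benchmark-relevant settings, marking missing keys."""
    settings = parse_darktablerc(text)
    return {key: settings.get(key, "<not set>") for key in BENCHMARK_KEYS}


# Hypothetical excerpt, not taken from any poster's configuration.
example = """\
resourcelevel=default
plugins/lighttable/export/force_lcms2=TRUE
"""
print(benchmark_settings(example))
```

On Linux the file typically lives at ~/.config/darktable/darktablerc, so the text can be read from there with `pathlib.Path.read_text()` and passed to `benchmark_settings`.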

No, likely not, and I know it says it is slower. I'm not sure in the end why I chose it. Maybe I was troubleshooting something… I'll try with it off…

As noted by you and @kofa…changing to default… is much better…

4.8163 [opencl_profiling] spent 1.6073 seconds totally in command queue (with 0 events missing)
4.8164 [dev_process_export] pixel pipeline processing took 3.534 secs (4.344 CPU)
5.9655 [export_job] exported to `setubal_01.jpg'

Pop OS on a secondary drive in my system fares a little better…

4.2499 [opencl_profiling] spent 1.8208 seconds totally in command queue (with 0 events missing)
4.2500 [dev_process_export] pixel pipeline processing took 2.972 secs (11.086 CPU)
5.1178 [export_job] exported to `setubal_01.jpg'
[opencl_summary_statistics] device 'NVIDIA CUDA NVIDIA GeForce RTX 3060 Ti' (0): 166 out of 166 events were successful and 0 events lost. max event=165
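For a rough feeling of how the cards compare, the pipeline times quoted so far can be turned into speed factors relative to the GTX 1660. This is just arithmetic on the "pixel pipeline processing took" values from the posts above. The function name `relative_speed` is made up for illustration, and the factors ignore differences in OS, driver, CPU and darktable version.

```python
# Pipeline wall-clock seconds, copied from the log lines quoted in this thread.
pipeline_s = {
    "GTX 1660, Ubuntu 24.04": 4.662,
    "RTX 3060 Ti, Windows 11": 3.534,
    "RTX 3060 Ti, Pop OS": 2.972,
}


def relative_speed(times, baseline):
    """Return how many times faster each entry is than the baseline entry."""
    base = times[baseline]
    return {name: round(base / t, 2) for name, t in times.items()}


factors = relative_speed(pipeline_s, "GTX 1660, Ubuntu 24.04")
for name, factor in factors.items():
    print(f"{name}: {factor}x")
```

A factor above 1 means the run was faster than the GTX 1660 baseline.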


I was bitten by the same issue. Many years ago, there was a problematic photo where LCMS2 worked much better. I left it on and, as my export speed remained acceptable, never turned it back off and completely forgot about it. Then, while doing some benchmarking, I got much slower exports than the other person, despite our cards having comparable specs.


I have a Ryzen 9 7900 12-core // 48 GB // RTX 3060 12 GB at home. For setubal, my results are 10.4 sec with CPU only and 5.2 sec with GPU.


I have a Ryzen 9 5900HX (8 cores/16 threads), 32GB RAM and an RTX 3060 Laptop GPU, and for setubal my results are:

6.6365 [opencl_profiling] spent  3.0609 seconds totally in command queue (with 0 events missing)
6.6366 [dev_process_export] pixel pipeline processing took 4.941 secs (14.449 CPU)
7.5141 [export_job] exported to setubal.jpg
[opencl_summary_statistics] device rusticl AMD Radeon Graphics id=0: NOT utilized
[opencl_summary_statistics] device NVIDIA CUDA NVIDIA GeForce RTX 3060 Laptop GPU id=1: 166 out of 166 events were successful and 0 events lost. max event=165

That’s exactly what I would like to buy.

Do you use Windows?

Do you use Windows?

I do not understand your high values.

Compare with GPU benchmarks in darktable
Arc B580 12GB
Ryzen 7 9700X
1.4 sec

I would expect something in this range with the 3060 too.

I mentioned my current system above and tested again. It would be pointless to buy a new PC or graphics adapter and get slower performance than I have now.

This is from my 5-year-old system with:

AMD Ryzen 7 3700X 8-Core Processor
32GB RAM
nVidia TU116 [GeForce GTX 1660]
Xubuntu 24.04

# reboot

# AMD Ryzen 7 3700X 8-Core Processor
# 32GB RAM
# nVidia TU116 [GeForce GTX 1660]
# Xubuntu 24.04

$ darktable-cli setubal.orf setubal.orf.xmp test.jpg --core -d perf -d opencl
darktable 5.0.1
Copyright (C) 2012-2025 Johannes Hanika and other contributors.

Compile options:
  Bit depth              -> 64 bit
  Debug                  -> DISABLED
  SSE2 optimizations     -> ENABLED
  OpenMP                 -> ENABLED
  OpenCL                 -> ENABLED
  Lua                    -> ENABLED  - API version 9.4.0
  Colord                 -> ENABLED
  gPhoto2                -> ENABLED
  GMIC                   -> ENABLED  - Compressed LUTs are supported
  GraphicsMagick         -> ENABLED
  ImageMagick            -> DISABLED
  libavif                -> DISABLED
  libheif                -> ENABLED
  libjxl                 -> ENABLED
  LibRaw                 -> ENABLED  - Version 0.22.0-Devel202403
  OpenJPEG               -> ENABLED
  OpenEXR                -> ENABLED
  WebP                   -> ENABLED

See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.

     0.2830 [dt_get_sysresource_level] switched to 1 as `default'
     0.2830   total mem:       32009MB
     0.2830   mipmap cache:    4001MB
     0.2830   available mem:   16004MB
     0.2830   singlebuff:      250MB
     0.3345 [opencl_init] opencl library 'libOpenCL' found on your system and loaded, preference 'default path'
     0.4587 [opencl_init] found 1 platform
[opencl_init] found 1 device

[dt_opencl_device_init]
   DEVICE:                   0: 'NVIDIA GeForce GTX 1660'
   CONF KEY:                 cldevice_v5_nvidiacudanvidiageforcegtx1660
   PLATFORM, VENDOR & ID:    NVIDIA CUDA, NVIDIA Corporation, ID=4318
   CANONICAL NAME:           nvidiacudanvidiageforcegtx1660
   DRIVER VERSION:           570.133.07
   DEVICE VERSION:           OpenCL 3.0 CUDA, SM_20 SUPPORT
   DEVICE_TYPE:              GPU, dedicated mem
   GLOBAL MEM SIZE:          5744 MB
   MAX MEM ALLOC:            1436 MB
   MAX IMAGE SIZE:           32768 x 32768
   MAX WORK GROUP SIZE:      1024
   MAX WORK ITEM DIMENSIONS: 3
   MAX WORK ITEM SIZES:      [ 1024 1024 64 ]
   ASYNC PIXELPIPE:          NO
   PINNED MEMORY TRANSFER:   NO
   AVOID ATOMICS:            NO
   MICRO NAP:                250
   ROUNDUP WIDTH & HEIGHT    16x16
   CHECK EVENT HANDLES:      128
   TILING ADVANTAGE:         0.000
   DEFAULT DEVICE:           NO
   KERNEL BUILD DIRECTORY:   /usr/share/darktable/kernels
   KERNEL DIRECTORY:         /home/ab/.cache/darktable/cached_v5_kernels_for_NVIDIACUDANVIDIAGeForceGTX1660_57013307
   CL COMPILER OPTION:       -cl-fast-relaxed-math
   CL COMPILER COMMAND:      -w -cl-fast-relaxed-math -DNVIDIA_SM_20=1 -DNVIDIA=1 -I"/usr/share/darktable/kernels"
   KERNEL LOADING TIME:       0.1464 sec
[opencl_init] OpenCL successfully initialized. internal numbers and names of available devices:
[opencl_init]		0	'NVIDIA CUDA NVIDIA GeForce GTX 1660'
     0.7510 [opencl_init] FINALLY: opencl PREFERENCE=ON is AVAILABLE and ENABLED.
[opencl_init] opencl_scheduling_profile: 'default'
[opencl_init] opencl_device_priority: '*/!0,*/*/*/!0,*'
[opencl_init] opencl_mandatory_timeout: 1000
[opencl_update_priorities] these are your device priorities:
[opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	-1	0	0	-1
[opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[opencl_update_priorities] 		image	preview	export	thumbs	preview2
[opencl_update_priorities]		0	0	0	0	0
[opencl_synchronization_timeout] synchronization timeout set to 200
[opencl_update_priorities] these are your device priorities:
[opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	-1	0	0	-1
[opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[opencl_update_priorities] 		image	preview	export	thumbs	preview2
[opencl_update_priorities]		0	0	0	0	0
[opencl_synchronization_timeout] synchronization timeout set to 200
     2.6284 [dt_dev_load_raw] loading the image. took 1.035 secs (0.779 CPU)
     2.6898 [export] creating pixelpipe took 0.057 secs (0.463 CPU)
     2.6899 [dt_opencl_check_tuning] use 3516MB (headroom=OFF, pinning=OFF) on device `NVIDIA CUDA NVIDIA GeForce GTX 1660' id=0
     2.6903 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
     2.7026 [dev_pixelpipe] took 0.012 secs (0.083 CPU) [export] processed `rawprepare' on GPU, blended on GPU
     2.7064 [dev_pixelpipe] took 0.004 secs (0.003 CPU) [export] processed `temperature' on GPU, blended on GPU
     2.7184 [dev_pixelpipe] took 0.012 secs (0.012 CPU) [export] processed `highlights' on GPU, blended on GPU
     2.8503 [dev_pixelpipe] took 0.132 secs (0.262 CPU) [export] processed `hotpixels' on CPU, blended on CPU
     3.2285 [dev_pixelpipe] took 0.378 secs (0.216 CPU) [export] processed `demosaic' on GPU with tiling, blended on CPU
     4.1352 [dev_pixelpipe] took 0.907 secs (0.652 CPU) [export] processed `denoiseprofile' on GPU with tiling, blended on CPU
     4.6203 [dev_pixelpipe] took 0.485 secs (1.462 CPU) [export] processed `lens' on GPU, blended on GPU
     4.6432 [dev_pixelpipe] took 0.023 secs (0.021 CPU) [export] processed `ashift' on GPU, blended on GPU
     4.6610 [dev_pixelpipe] took 0.018 secs (0.017 CPU) [export] processed `exposure' on GPU, blended on GPU
     4.6813 [dev_pixelpipe] took 0.020 secs (0.018 CPU) [export] processed `colorin' on GPU, blended on GPU
     4.6970 [dt_ioppr_transform_image_colorspace_cl] IOP_CS_LAB-->IOP_CS_RGB took 0.015 secs (0.014 GPU) [channelmixerrgb]
     4.7272 [dev_pixelpipe] took 0.046 secs (0.042 CPU) [export] processed `channelmixerrgb' on GPU, blended on GPU
     4.8336 [dt_ioppr_transform_image_colorspace] IOP_CS_RGB-->IOP_CS_LAB took 0.043 secs (0.676 CPU) [atrous]
     5.7416 [dev_pixelpipe] took 1.014 secs (1.697 CPU) [export] processed `atrous' on GPU with tiling, blended on CPU
     5.8541 [dt_ioppr_transform_image_colorspace_cl] IOP_CS_LAB-->IOP_CS_RGB took 0.012 secs (0.012 GPU) [colorbalancergb]
     5.8848 [dev_pixelpipe] took 0.143 secs (0.139 CPU) [export] processed `colorbalancergb' on GPU, blended on GPU
     5.9074 [dev_pixelpipe] took 0.023 secs (0.019 CPU) [export] processed `rgblevels' on GPU, blended on GPU
     5.9256 [dev_pixelpipe] took 0.018 secs (0.016 CPU) [export] processed `sigmoid' on GPU, blended on GPU
     6.0323 [dt_ioppr_transform_image_colorspace] IOP_CS_RGB-->IOP_CS_LAB took 0.044 secs (0.687 CPU) [bilat]
     7.2061 [dev_pixelpipe] took 1.280 secs (9.245 CPU) [export] processed `bilat' on CPU, blended on CPU
     7.2902 [dev_pixelpipe] took 0.084 secs (0.082 CPU) [export] processed `colorout' on GPU, blended on GPU
     7.2907 [resample_cl] took 0.000 secs (0.000 CPU) 1:1 copy/crop of 8065x6046 pixels
     7.3053 [dev_pixelpipe] took 0.015 secs (0.014 CPU) [export] processed `finalscale' on GPU, blended on GPU
     7.3684 [opencl_profiling] profiling device 0 ('NVIDIA CUDA NVIDIA GeForce GTX 1660'):
     7.3684 [opencl_profiling] spent  0.5166 seconds in [Write Image (from host to device)]
     7.3685 [opencl_profiling] spent  0.0022 seconds in rawprepare_1f
     7.3685 [opencl_profiling] spent  0.0031 seconds in whitebalance_1f
     7.3685 [opencl_profiling] spent  0.0018 seconds in highlights_initmask
     7.3685 [opencl_profiling] spent  0.0019 seconds in highlights_dilatemask
     7.3685 [opencl_profiling] spent  0.1563 seconds in [Write Buffer (from host to device)]
     7.3685 [opencl_profiling] spent  0.0040 seconds in highlights_chroma
     7.3685 [opencl_profiling] spent  0.0000 seconds in [Read Buffer (from device to host)]
     7.3685 [opencl_profiling] spent  0.0027 seconds in highlights_opposed
     7.3685 [opencl_profiling] spent  0.9055 seconds in [Read Image (from device to host)]
     7.3685 [opencl_profiling] spent  0.0008 seconds in border_interpolate
     7.3685 [opencl_profiling] spent  0.0037 seconds in rcd_border_green
     7.3685 [opencl_profiling] spent  0.0055 seconds in rcd_border_redblue
     7.3685 [opencl_profiling] spent  0.0074 seconds in rcd_populate
     7.3685 [opencl_profiling] spent  0.0052 seconds in rcd_step_1_1
     7.3685 [opencl_profiling] spent  0.0049 seconds in rcd_step_1_2
     7.3685 [opencl_profiling] spent  0.0019 seconds in rcd_step_2_1
     7.3685 [opencl_profiling] spent  0.0053 seconds in rcd_step_3_1
     7.3685 [opencl_profiling] spent  0.0044 seconds in rcd_step_4_1
     7.3685 [opencl_profiling] spent  0.0018 seconds in rcd_step_4_2
     7.3685 [opencl_profiling] spent  0.0043 seconds in rcd_step_5_1
     7.3685 [opencl_profiling] spent  0.0074 seconds in rcd_step_5_2
     7.3685 [opencl_profiling] spent  0.0089 seconds in rcd_write_output
     7.3685 [opencl_profiling] spent  0.0109 seconds in denoiseprofile_precondition_Y0U0V0
     7.3685 [opencl_profiling] spent  0.3284 seconds in denoiseprofile_decompose
     7.3685 [opencl_profiling] spent  0.0347 seconds in denoiseprofile_reduce_first
     7.3685 [opencl_profiling] spent  0.0002 seconds in denoiseprofile_reduce_second
     7.3685 [opencl_profiling] spent  0.1154 seconds in denoiseprofile_synthesize
     7.3686 [opencl_profiling] spent  0.0616 seconds in [Copy Image (on device)]
     7.3686 [opencl_profiling] spent  0.0109 seconds in denoiseprofile_backtransform_Y0U0V0
     7.3686 [opencl_profiling] spent  0.0167 seconds in lens_vignette
     7.3686 [opencl_profiling] spent  0.0374 seconds in lens_distort_bicubic
     7.3686 [opencl_profiling] spent  0.0213 seconds in ashift_bicubic
     7.3686 [opencl_profiling] spent  0.0165 seconds in exposure
     7.3686 [opencl_profiling] spent  0.0151 seconds in colorin_unbound
     7.3686 [opencl_profiling] spent  0.0224 seconds in colorspaces_transform_lab_to_rgb_matrix
     7.3686 [opencl_profiling] spent  0.0167 seconds in channelmixerrgb_CAT16
     7.3686 [opencl_profiling] spent  0.5051 seconds in eaw_decompose
     7.3686 [opencl_profiling] spent  0.1527 seconds in eaw_synthesize
     7.3686 [opencl_profiling] spent  0.0164 seconds in colorbalancergb
     7.3686 [opencl_profiling] spent  0.0150 seconds in rgblevels
     7.3686 [opencl_profiling] spent  0.0168 seconds in sigmoid_loglogistic_per_channel
     7.3686 [opencl_profiling] spent  0.0188 seconds in colorout
     7.3686 [opencl_profiling] spent  3.0886 seconds totally in command queue (with 0 events missing)
     7.3686 [dev_process_export] pixel pipeline processing took 4.679 secs (14.063 CPU)
     8.4313 [export_job] exported to `test.jpg'
 [opencl_summary_statistics] device 'NVIDIA CUDA NVIDIA GeForce GTX 1660' (0): 240 out of 240 events were successful and 0 events lost. max event=239
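A log like this is easier to read once it is reduced to one entry per module. The sketch below extracts the module name, where it ran and its wall-clock time from the `[dev_pixelpipe]` lines, then lists the slowest modules and those that ran on the CPU. The regular expression is written against the lines in the log above and may need adjusting for other darktable versions. The function names are made up for illustration.

```python
import re

# Matches e.g.
# "[dev_pixelpipe] took 0.378 secs (0.216 CPU) [export] processed `demosaic' on GPU with tiling, blended on CPU"
MODULE_RE = re.compile(
    r"\[dev_pixelpipe\] took (?P<wall>\d+\.\d+) secs \((?P<cpu>\d+\.\d+) CPU\) "
    r"\[export\] processed `(?P<module>\w+)' on (?P<device>GPU with tiling|GPU|CPU)"
)


def module_timings(log_text):
    """Return (module, device, wall_seconds) tuples in pipeline order."""
    return [
        (m.group("module"), m.group("device"), float(m.group("wall")))
        for m in MODULE_RE.finditer(log_text)
    ]


def slowest(timings, n=3):
    """Return the n entries with the largest wall-clock time."""
    return sorted(timings, key=lambda t: t[2], reverse=True)[:n]


# A few lines copied from the GTX 1660 log above.
sample = """\
     2.8503 [dev_pixelpipe] took 0.132 secs (0.262 CPU) [export] processed `hotpixels' on CPU, blended on CPU
     3.2285 [dev_pixelpipe] took 0.378 secs (0.216 CPU) [export] processed `demosaic' on GPU with tiling, blended on CPU
     4.1352 [dev_pixelpipe] took 0.907 secs (0.652 CPU) [export] processed `denoiseprofile' on GPU with tiling, blended on CPU
     5.7416 [dev_pixelpipe] took 1.014 secs (1.697 CPU) [export] processed `atrous' on GPU with tiling, blended on CPU
     7.2061 [dev_pixelpipe] took 1.280 secs (9.245 CPU) [export] processed `bilat' on CPU, blended on CPU
     7.2902 [dev_pixelpipe] took 0.084 secs (0.082 CPU) [export] processed `colorout' on GPU, blended on GPU
"""
timings = module_timings(sample)
print(slowest(timings))
print([name for name, device, _ in timings if device == "CPU"])
```

In the log above, `bilat` ran on the CPU and was the single slowest module at 1.280 s, followed by `atrous` and `denoiseprofile`, which both ran on the GPU with tiling.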

And the test with CPU only:

# reboot

# AMD Ryzen 7 3700X 8-Core Processor
# 32GB RAM
# Xubuntu 24.04

$ darktable-cli setubal.orf setubal.orf.xmp setubal.jpg --core --disable-opencl -d perf

darktable 5.0.1
Copyright (C) 2012-2025 Johannes Hanika and other contributors.

Compile options:
  Bit depth              -> 64 bit
  Debug                  -> DISABLED
  SSE2 optimizations     -> ENABLED
  OpenMP                 -> ENABLED
  OpenCL                 -> ENABLED
  Lua                    -> ENABLED  - API version 9.4.0
  Colord                 -> ENABLED
  gPhoto2                -> ENABLED
  GMIC                   -> ENABLED  - Compressed LUTs are supported
  GraphicsMagick         -> ENABLED
  ImageMagick            -> DISABLED
  libavif                -> DISABLED
  libheif                -> ENABLED
  libjxl                 -> ENABLED
  LibRaw                 -> ENABLED  - Version 0.22.0-Devel202403
  OpenJPEG               -> ENABLED
  OpenEXR                -> ENABLED
  WebP                   -> ENABLED

See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.

     2.1971 [dt_dev_load_raw] loading the image. took 1.036 secs (0.794 CPU)
     2.2600 [export] creating pixelpipe took 0.059 secs (0.491 CPU)
     2.2605 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
     2.2749 [dev_pixelpipe] took 0.014 secs (0.067 CPU) [export] processed `rawprepare' on CPU, blended on CPU
     2.2915 [dev_pixelpipe] took 0.017 secs (0.083 CPU) [export] processed `temperature' on CPU, blended on CPU
     2.3311 [dev_pixelpipe] took 0.040 secs (0.481 CPU) [export] processed `highlights' on CPU, blended on CPU
     2.3548 [dev_pixelpipe] took 0.024 secs (0.245 CPU) [export] processed `hotpixels' on CPU, blended on CPU
     2.5442 [dev_pixelpipe] took 0.189 secs (2.527 CPU) [export] processed `demosaic' on CPU, blended on CPU
     6.5679 [dev_pixelpipe] took 4.024 secs (50.857 CPU) [export] processed `denoiseprofile' on CPU, blended on CPU
     7.6525 [dev_pixelpipe] took 1.085 secs (10.766 CPU) [export] processed `lens' on CPU, blended on CPU
     7.8433 [dev_pixelpipe] took 0.191 secs (3.045 CPU) [export] processed `ashift' on CPU, blended on CPU
     7.9086 [dev_pixelpipe] took 0.065 secs (1.023 CPU) [export] processed `exposure' on CPU, blended on CPU
     7.9509 [dev_pixelpipe] took 0.042 secs (0.669 CPU) [export] processed `colorin' on CPU, blended on CPU
     7.9929 [dt_ioppr_transform_image_colorspace] IOP_CS_LAB-->IOP_CS_RGB took 0.042 secs (0.664 CPU) [channelmixerrgb]
     8.2903 [dev_pixelpipe] took 0.339 secs (5.424 CPU) [export] processed `channelmixerrgb' on CPU, blended on CPU
     8.3329 [dt_ioppr_transform_image_colorspace] IOP_CS_RGB-->IOP_CS_LAB took 0.042 secs (0.665 CPU) [atrous]
    10.9040 [dev_pixelpipe] took 2.613 secs (39.172 CPU) [export] processed `atrous' on CPU, blended on CPU
    10.9468 [dt_ioppr_transform_image_colorspace] IOP_CS_LAB-->IOP_CS_RGB took 0.043 secs (0.667 CPU) [colorbalancergb]
    12.2463 [dev_pixelpipe] took 1.342 secs (21.422 CPU) [export] processed `colorbalancergb' on CPU, blended on CPU
    12.2888 [dev_pixelpipe] took 0.042 secs (0.656 CPU) [export] processed `rgblevels' on CPU, blended on CPU
    12.7448 [dev_pixelpipe] took 0.456 secs (7.272 CPU) [export] processed `sigmoid' on CPU, blended on CPU
    12.7876 [dt_ioppr_transform_image_colorspace] IOP_CS_RGB-->IOP_CS_LAB took 0.043 secs (0.683 CPU) [bilat]
    13.9621 [dev_pixelpipe] took 1.217 secs (9.130 CPU) [export] processed `bilat' on CPU, blended on CPU
    14.0481 [dev_pixelpipe] took 0.086 secs (1.323 CPU) [export] processed `colorout' on CPU, blended on CPU
    14.1135 [resample_plain] took 0.065 secs (1.033 CPU) 1:1 copy/crop of 8065x6046 pixels
    14.1136 [dev_pixelpipe] took 0.065 secs (1.034 CPU) [export] processed `finalscale' on CPU, blended on CPU
    14.1136 [dev_process_export] pixel pipeline processing took 11.854 secs (155.219 CPU)
    15.5526 [export_job] exported to `setubal.jpg'
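Putting the two runs side by side shows where the GTX 1660 actually helps. The numbers below are copied from the two logs above (with OpenCL and with `--disable-opencl`). The selection of modules and the function name `speedups` are my own choices for illustration.

```python
# Wall-clock seconds per module, copied from the two logs above.
gpu_run = {
    "denoiseprofile": 0.907,
    "atrous": 1.014,
    "colorbalancergb": 0.143,
    "bilat": 1.280,  # processed on the CPU in both runs
    "pipeline total": 4.679,
}
cpu_run = {
    "denoiseprofile": 4.024,
    "atrous": 2.613,
    "colorbalancergb": 1.342,
    "bilat": 1.217,
    "pipeline total": 11.854,
}


def speedups(cpu, gpu):
    """Return CPU time divided by GPU time for every module in both runs."""
    return {name: round(cpu[name] / gpu[name], 2) for name in cpu if name in gpu}


result = speedups(cpu_run, gpu_run)
for name, factor in result.items():
    print(f"{name}: {factor}x")

# CPU seconds divided by wall seconds in the CPU-only run gives the
# average number of busy threads during the pipeline.
cpu_only_parallelism = round(155.219 / 11.854, 1)
print(f"average busy threads (CPU only): {cpu_only_parallelism}")
```

The whole pipeline is about 2.5 times faster with OpenCL on this machine, while `bilat` gains nothing because it stays on the CPU in both runs.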

Yes, it's Windows 11.