OpenCL on Fedora

Hi

I can’t open darktable (Flatpak package) with OpenCL enabled on Fedora 35. I also tried Blender (RPM package), without success either.

I installed OpenCL-AMD with this package. This is what I get from clinfo:

Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.0 AMD-APP (3314.0)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback 
  Platform Extensions function suffix             AMD

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 1
  Device Name                                     gfx902:xnack+
  Device Vendor                                   Advanced Micro Devices, Inc.
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 2.0 
  Driver Version                                  3314.0 (HSA1.1,LC)
  Device OpenCL C Version                         OpenCL C 2.0 
  Device Type                                     GPU
  Device Board Name (AMD)                         Renoir
  Device PCI-e ID (AMD)                           0x1636
  Device Topology (AMD)                           PCI-E, 0000:04:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               7
  SIMD per compute unit (AMD)                     4
  SIMD width (AMD)                                16
  SIMD instruction width (AMD)                    1
  Max clock frequency                             1600MHz
  Graphics IP (AMD)                               9.0
  Device Partition                                (core)
    Max number of sub-devices                     7
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             256
  Preferred work group size (AMD)                 256
  Max work group size (AMD)                       1024
  Preferred work group size multiple (kernel)     64
  Wavefront width (AMD)                           64
  Preferred / native vector sizes                 
    char                                                 4 / 4       
    short                                                2 / 2       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 1 / 1        (cl_khr_fp16)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             No
    Round to nearest                              No
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              536870912 (512MiB)
  Global free memory (AMD)                        524288 (512MiB) 524288 (512MiB)
  Global memory channels (AMD)                    4
  Global memory banks per channel (AMD)           4
  Global memory bank width (AMD)                  256 bytes
  Error Correction support                        No
  Max memory allocation                           456340272 (435.2MiB)
  Unified memory for Host and Device              No
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   Yes
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics                 
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Max size for global variable                    456340272 (435.2MiB)
  Preferred total size of global vars             536870912 (512MiB)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        16384 (16KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             5686
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 8192 images
    Base address alignment for 2D image buffers   256 bytes
    Pitch alignment for 2D image buffers          256 pixels
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             16384x16384x8192 pixels
    Max number of read image args                 128
    Max number of write image args                8
    Max number of read/write image args           64
  Max number of pipe args                         16
  Max active pipe reservations                    16
  Max pipe packet size                            456340272 (435.2MiB)
  Local memory type                               Local
  Local memory size                               65536 (64KiB)
  Local memory size per CU (AMD)                  65536 (64KiB)
  Local memory banks (AMD)                        32
  Max number of constant args                     8
  Max constant buffer size                        456340272 (435.2MiB)
  Preferred constant buffer size (AMD)            16384 (16KiB)
  Max size of kernel argument                     1024
  Queue properties (on host)                      
    Out-of-order execution                        No
    Profiling                                     Yes
  Queue properties (on device)                    
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                262144 (256KiB)
    Max size                                      8388608 (8MiB)
  Max queues on device                            1
  Max events on device                            1024
  Prefer user sync for interop                    Yes
  Number of P2P devices (AMD)                     0
  Profiling timer resolution                      1ns
  Profiling timer offset since Epoch (AMD)        0ns (Thu Jan  1 01:00:00 1970)
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Thread trace supported (AMD)                  No
    Number of async queues (AMD)                  8
    Max real-time compute queues (AMD)            8
    Max real-time compute units (AMD)             7
  printf() buffer size                            4194304 (4MiB)
  Built-in kernels                                (n/a)
  Device Extensions                               cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program 

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [AMD]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx902:xnack+
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx902:xnack+
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx902:xnack+

Extra info

Graphics Platform: X11

Processors: 16 × AMD Ryzen 7 4800H with Radeon Graphics

Graphics Processor: AMD RENOIR

Thanks in advance.

There is no OpenCL Flatpak package; it currently only works with NVIDIA cards.

Ok, so how can I have darktable with CR3 support and OpenCL enabled on Fedora 35?

Do you get OpenCL activated with the standard Fedora-provided darktable package?
Just running /usr/bin/darktable-cltest should tell you whether that works.

If it works, then I think the best way is either to compile your own darktable version with CR3 support, or to report a bug to the darktable and libexif maintainers to get a CR3-enabled version (with a bit of waiting time).

Tell me if that works for you; I could even see whether I can revive my Copr build with CR3 enabled (I don’t own Canon hardware, so I never bothered with that).

No, I don’t get OpenCL activated. When I run the command, I get this:

0.013928 [opencl_init] opencl related configuration options:
0.013940 [opencl_init] 
0.013942 [opencl_init] opencl: 1
0.013944 [opencl_init] opencl_scheduling_profile: 'default'
0.013947 [opencl_init] opencl_library: ''
0.013950 [opencl_init] opencl_memory_requirement: 768
0.013952 [opencl_init] opencl_memory_headroom: 400
0.013954 [opencl_init] opencl_device_priority: '*/!0,*/*/*/!0,*'
0.013957 [opencl_init] opencl_mandatory_timeout: 200
0.013959 [opencl_init] opencl_size_roundup: 16
0.013962 [opencl_init] opencl_async_pixelpipe: 0
0.013964 [opencl_init] opencl_synch_cache: active module
0.013966 [opencl_init] opencl_number_event_handles: 25
0.013969 [opencl_init] opencl_micro_nap: 1000
0.013971 [opencl_init] opencl_use_pinned_memory: 0
0.013973 [opencl_init] opencl_use_cpu_devices: 0
0.013975 [opencl_init] opencl_avoid_atomics: 0
0.013977 [opencl_init] 
0.014104 [opencl_init] could not find opencl runtime library 'libOpenCL'
0.014139 [opencl_init] could not find opencl runtime library 'libOpenCL.so'
0.014514 [opencl_init] found opencl runtime library 'libOpenCL.so.1'
0.014539 [opencl_init] opencl library 'libOpenCL.so.1' found on your system and loaded
0.096409 [opencl_init] found 1 platform
0.096432 [opencl_init] found 1 device
0.096445 [opencl_init] discarding device 0 `gfx902:xnack+' due to insufficient global memory (512MB).
0.096448 [opencl_init] no suitable devices found.
0.096450 [opencl_init] FINALLY: opencl is NOT AVAILABLE on this system.
0.096452 [opencl_init] initial status of opencl enabled flag is OFF.

@42578
One thing I noticed in the output from darktable-cltest was the following:

0.096445 [opencl_init] discarding device 0 `gfx902:xnack+' due to insufficient global memory (512MB).

I don’t know whether the BIOS of your computer lets you increase the amount of memory devoted to the onboard GPU, but could it be that the 512MB allocated to the onboard graphics isn’t sufficient for OpenCL to be enabled? Try raising the RAM reserved for the onboard GPU to 2GB.
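
To put numbers on this, here is a minimal sketch using only values that appear in this thread: the "Global memory size" from clinfo and the opencl_memory_requirement from the darktable-cltest output. The helper name and the assumption that darktable compares the two in MiB are mine; the thread does not show darktable's actual check.

```python
# Values copied from the clinfo and darktable-cltest output in this thread.
GLOBAL_MEM_BYTES = 536870912    # "Global memory size" from clinfo
MEMORY_REQUIREMENT_MB = 768     # "opencl_memory_requirement" from darktable-cltest

MIB = 1024 * 1024


def meets_requirement(global_mem_bytes, requirement_mb):
    """Hypothetical helper: True if the device has at least requirement_mb MiB.

    Assumption: the requirement is interpreted in MiB; the exact comparison
    inside darktable is not shown in this thread.
    """
    return global_mem_bytes >= requirement_mb * MIB


device_mib = GLOBAL_MEM_BYTES // MIB
print(f"device reports {device_mib} MiB, requirement is {MEMORY_REQUIREMENT_MB}")
if meets_requirement(GLOBAL_MEM_BYTES, MEMORY_REQUIREMENT_MB):
    print("device would be accepted")
else:
    print("device would be discarded")
```

Under that assumption, the 512 MiB reported for the Renoir GPU falls short of the configured 768, which matches the "insufficient global memory" message.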

I can’t increase it, but I also don’t think that’s the problem, because I got OpenCL activated on Arch with the same computer.

The OBS (openSUSE Build Service) package supports CR3 and will work if your driver supports OpenCL.

Ok. I installed it from here and still don’t get OpenCL activated. I’m assuming that the package I mentioned didn’t install OpenCL properly.

What can I do?

On my laptop with an AMD CPU and a separate GPU (RX5700M), on Fedora 35, I just used the official AMD-provided OpenCL package; only that one, not the graphics driver.
Basically, I went to their support download area to find the link to their latest drivers.
The current driver installation package is: http://repo.radeon.com/amdgpu-install/21.40.2/rhel/8.5/amdgpu-install-21.40.2.40502-1.el8.noarch.rpm
That package contains two configs for AMD-provided package repositories and a shell script to install various parts of their graphics stack.
If you look at the install script, it can be simplified to dnf install rocm-opencl-runtime.
So, for me, the first thing to do is to make sure OpenCL works on your system.
If it works with neither darktable nor Blender, don’t blame those programs; blame the OpenCL driver install.
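
One quick way to see what the OpenCL loader has to work with is to list the registered ICD files. The sketch below assumes the common Linux ICD loader convention of one .icd file per vendor under /etc/OpenCL/vendors, each naming the vendor library to load; the function name is hypothetical and an install may use a different location.

```python
from pathlib import Path


def list_icd_files(vendors_dir="/etc/OpenCL/vendors"):
    """Return {icd file name: library it points to} for a vendors directory.

    Assumption: ICD files are small text files whose content is the name
    or path of the vendor's OpenCL library.
    """
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return {}
    return {
        icd.name: icd.read_text().strip()
        for icd in sorted(directory.glob("*.icd"))
    }


if __name__ == "__main__":
    found = list_icd_files()
    if not found:
        print("no ICD files found; no OpenCL driver seems to be registered")
    for name, library in found.items():
        print(f"{name} -> {library}")
```

An empty result would point at the driver install rather than at darktable or Blender. A non-empty one only shows that a driver is registered, not that applications accept the device.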

To back up what @Oleastre is saying: in the README of the GitHub project you linked, it says

This package is intended to work along with the free amdgpu stack for fedora. Similiar to AUR (en) - opencl-amd

So it only works if the driver stack is installed. The comment above is one way to install it; another one would be https://amdgpu-install.readthedocs.io/en/latest/install-prereq.html

I guess you might have done something similar on Arch.

And from the darktable manual:

A sufficient amount of graphics memory (1GB+) needs to be available for darktable to take advantage of the GPU.

https://docs.darktable.org/usermanual/3.8/en/special-topics/opencl/activate-opencl/

Definitely, your system’s GPU does not have access to enough memory to be used by darktable, even if the driver is correctly installed.

I tried both methods and I got this:

Errors during downloading metadata for repository 'amdgpu':
  - Status code: 404 for https://repo.radeon.com/amdgpu/21.40.2/rhel//main/x86_64/repodata/repomd.xml (IP: 13.82.220.49)
Error: Failed to download metadata for repo 'amdgpu': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
Ignoring repositories: amdgpu
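
Note that the failing URL contains rhel//main, i.e. an empty path segment where the installer URL quoted earlier has a release number (rhel/8.5/). A small sketch to spot this kind of gap follows; the function name is hypothetical, and the idea that a version placeholder in the repo definition expanded to nothing is only a guess that this thread does not confirm.

```python
from typing import List
from urllib.parse import urlsplit


def empty_path_segments(url: str) -> List[int]:
    """Return the 0-based positions of empty segments inside the URL path."""
    segments = urlsplit(url).path.strip("/").split("/")
    return [index for index, segment in enumerate(segments) if segment == ""]


# URL copied from the dnf error message above.
FAILING_URL = (
    "https://repo.radeon.com/amdgpu/21.40.2/rhel//main/x86_64/"
    "repodata/repomd.xml"
)

positions = empty_path_segments(FAILING_URL)
print(f"empty path segments at positions: {positions}")
```

A non-empty list means some component of the repository path is missing, which by itself would be enough to explain a 404.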

I wasn’t aware of that, but what I can say is that darktable’s editing process is slower on Fedora than on Arch. I don’t remember what I did to make it work there, but I think I installed the AUR package mentioned by @sushey.

My laptop is the Slimbook Pro X 15 with 32 GB of RAM. I don’t know if it matters, but it can run DaVinci Resolve.

With 32 GB of RAM, check this: https://github.com/darktable-org/darktable/issues/10922

Set your host_memory_limit to 0 in preferences to ensure DT can use all of that memory.

That was commented upon in your (?) pull request as being dangerous, so perhaps just set it at 16Mb (or 4 as a first try, increase if that’s stable)

Oh yes, I remember that one of the two repositories installed was not working. Just remove that entry in /etc/yum.repos.d and ignore it.

Why don’t you install the darktable packages provided either by Fedora itself or the ones in the openSUSE Build Service for Fedora?

I removed it and installed the package, but I still don’t get OpenCL available in darktable.

Thanks, @g-man and @rvietor. Changing that setting improved darktable’s performance.

I tried it, but I wasn’t getting CR3 support or OpenCL out of the box. With Flatpak, I have CR3 support. See this topic.

Given that @asn is the maintainer of those packages and a Fedora developer, you can be fairly sure it works.

In your other thread you said you used the Fedora package and the Fedora Flatpak.

I uninstalled darktable and then reinstalled it via the Flatpak and RPM from Fedora, and via the Flatpak from Flathub. Each time, I uninstalled darktable before installing it from another package. As I said in the other thread, I can only view CR3 files if I install darktable’s Flatpak package from Flathub.