darktable 4.4.2 with OpenCL crashes Linux

Hello,

I’m running darktable 4.4.2 on EndeavourOS (a rolling-release, Arch-based distro):

[andrew@andrew-desktop darktable]$ darktable --version
this is darktable 4.4.2
copyright (c) 2009-2023 johannes hanika
https://github.com/darktable-org/darktable/issues/new/choose

compile options:
  bit depth is 64 bit
  normal build
  SSE2 optimized codepath enabled
  OpenMP support enabled
  OpenCL support enabled
  Lua support enabled, API version 9.1.0
  Colord support enabled
  gPhoto2 support enabled
  GraphicsMagick support enabled
  ImageMagick support disabled
  libavif support enabled
  libheif support enabled
  libjxl support enabled
  OpenJPEG support enabled
  OpenEXR support enabled
  WebP support enabled

When I enable OpenCL support in darktable, processing speeds up noticeably. However, after a few moments, the entire desktop freezes (not just a darktable crash), goes black, and the system reboots itself.

Here is the output of clinfo:

[andrew@andrew-desktop darktable]$ clinfo
Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (3590.0)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback 
  Platform Extensions function suffix             AMD
  Platform Host timer resolution                  1ns

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 1
  Device Name                                     gfx1010:xnack-
  Device Vendor                                   Advanced Micro Devices, Inc.
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 2.0 
  Driver Version                                  3590.0 (HSA1.1,LC)
  Device OpenCL C Version                         OpenCL C 2.0 
  Device Type                                     GPU
  Device Board Name (AMD)                         AMD Radeon RX 5600 OEM
  Device PCI-e ID (AMD)                           0x731f
  Device Topology (AMD)                           PCI-E, 0000:04:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               16
  SIMD per compute unit (AMD)                     4
  SIMD width (AMD)                                32
  SIMD instruction width (AMD)                    1
  Max clock frequency                             1780MHz
  Graphics IP (AMD)                               10.1
  Device Partition                                (core)
    Max number of sub-devices                     16
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             256
  Preferred work group size (AMD)                 256
  Max work group size (AMD)                       1024
  Preferred work group size multiple (kernel)     32
  Wavefront width (AMD)                           32
  Preferred / native vector sizes                 
    char                                                 4 / 4       
    short                                                2 / 2       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 1 / 1        (cl_khr_fp16)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              6425673728 (5.984GiB)
  Global free memory (AMD)                        5865472 (5.594GiB) 5865472 (5.594GiB)
  Global memory channels (AMD)                    6
  Global memory banks per channel (AMD)           4
  Global memory bank width (AMD)                  256 bytes
  Error Correction support                        No
  Max memory allocation                           5461822664 (5.087GiB)
  Unified memory for Host and Device              No
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   Yes
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics                 
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Max size for global variable                    5461822664 (5.087GiB)
  Preferred total size of global vars             6425673728 (5.984GiB)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        16384 (16KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 8192 images
    Base address alignment for 2D image buffers   256 bytes
    Pitch alignment for 2D image buffers          256 pixels
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             16384x16384x8192 pixels
    Max number of read image args                 128
    Max number of write image args                8
    Max number of read/write image args           64
  Max number of pipe args                         16
  Max active pipe reservations                    16
  Max pipe packet size                            1166855368 (1.087GiB)
  Local memory type                               Local
  Local memory size                               65536 (64KiB)
  Local memory size per CU (AMD)                  65536 (64KiB)
  Local memory banks (AMD)                        32
  Max number of constant args                     8
  Max constant buffer size                        5461822664 (5.087GiB)
  Preferred constant buffer size (AMD)            16384 (16KiB)
  Max size of kernel argument                     1024
  Queue properties (on host)                      
    Out-of-order execution                        No
    Profiling                                     Yes
  Queue properties (on device)                    
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                262144 (256KiB)
    Max size                                      8388608 (8MiB)
  Max queues on device                            1
  Max events on device                            1024
  Prefer user sync for interop                    Yes
  Number of P2P devices (AMD)                     0
  Profiling timer resolution                      1ns
  Profiling timer offset since Epoch (AMD)        0ns (Wed Dec 31 19:00:00 1969)
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Thread trace supported (AMD)                  No
    Number of async queues (AMD)                  8
    Max real-time compute queues (AMD)            8
    Max real-time compute units (AMD)             16
  printf() buffer size                            4194304 (4MiB)
  Built-in kernels                                (n/a)
  Device Extensions                               cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program 

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [AMD]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx1010:xnack-
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx1010:xnack-
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx1010:xnack-

And here is the output of ‘darktable -d opencl’:

     0.0558 [dt_get_sysresource_level] switched to 1 as `default'
     0.0558   total mem:       15667MB
     0.0558   mipmap cache:    1958MB
     0.0558   available mem:   7833MB
     0.0558   singlebuff:      122MB
     0.0558   OpenCL tune mem: OFF
     0.0558   OpenCL pinned:   OFF
[opencl_init] opencl related configuration options:
[opencl_init] opencl: ON
[opencl_init] opencl_scheduling_profile: 'very fast GPU'
[opencl_init] opencl_library: 'default path'
[opencl_init] opencl_device_priority: '*/!0,*/*/*/!0,*'
[opencl_init] opencl_mandatory_timeout: 400
[opencl_init] opencl library 'libOpenCL' found on your system and loaded
[opencl_init] found 1 platform
[opencl_init] found 1 device

[dt_opencl_device_init]
   DEVICE:                   0: 'gfx1010:xnack-'
   PLATFORM NAME & VENDOR:   AMD Accelerated Parallel Processing, Advanced Micro Devices, Inc.
   CANONICAL NAME:           amdacceleratedparallelprocessinggfx1010xnack
   DRIVER VERSION:           3590.0 (HSA1.1,LC)
   DEVICE VERSION:           OpenCL 2.0 
   DEVICE_TYPE:              GPU
   GLOBAL MEM SIZE:          6128 MB
   MAX MEM ALLOC:            5209 MB
   MAX IMAGE SIZE:           16384 x 16384
   MAX WORK GROUP SIZE:      256
   MAX WORK ITEM DIMENSIONS: 3
   MAX WORK ITEM SIZES:      [ 1024 1024 1024 ]
   ASYNC PIXELPIPE:          NO
   PINNED MEMORY TRANSFER:   NO
   MEMORY TUNING:            NO
   FORCED HEADROOM:          400
   AVOID ATOMICS:            NO
   MICRO NAP:                250
   ROUNDUP WIDTH:            16
   ROUNDUP HEIGHT:           16
   CHECK EVENT HANDLES:      128
   PERFORMANCE:              6.895
   TILING ADVANTAGE:         0.000
   DEFAULT DEVICE:           NO
   KERNEL BUILD DIRECTORY:   /usr/share/darktable/kernels
   KERNEL DIRECTORY:         /home/andrew/.cache/darktable/cached_v1_kernels_for_AMDAcceleratedParallelProcessinggfx1010xnack_35900HSA11LC
   CL COMPILER OPTION:       -cl-fast-relaxed-math
   KERNEL LOADING TIME:       0.0148 sec
[opencl_init] OpenCL successfully initialized. Internal numbers and names of available devices:
[opencl_init]		0	'AMD Accelerated Parallel Processing gfx1010:xnack-'
[opencl_init] FINALLY: opencl is AVAILABLE and ENABLED.
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	0	0	0	0
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		1	1	1	1	1
[opencl_synchronization_timeout] synchronization timeout set to 0
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	0	0	0	0
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		1	1	1	1	1
[opencl_synchronization_timeout] synchronization timeout set to 0
     0.8299 [lib_load_module] failed to open `midi': libportmidi.so.2: cannot open shared object file: No such file or directory
    13.1597 [dt_opencl_check_tuning] use 3915MB (tunemem=OFF, pinning=OFF) on device `AMD Accelerated Parallel Processing gfx1010:xnack-' id=0
    13.2652 [pixelpipe_process_CL]       [full]         colorout               (   0/   0)  985x 740 scale=0.1892 --> (   0/   0)  985x 740 scale=0.1892 cl input data to host
    13.4469 [pixelpipe_process_CL]       [preview]      colorout               (   0/   0) 1198x 900 scale=1.0000 --> (   0/   0) 1198x 900 scale=1.0000 cl input data to host
    23.9051 [pixelpipe_process_CL]       [full]         colorout               (   0/   0)  985x 740 scale=0.1892 --> (   0/   0)  985x 740 scale=0.1892 cl input data to host
    24.0366 [pixelpipe_process_CL]       [preview]      colorout               (   0/   0) 1198x 900 scale=1.0000 --> (   0/   0) 1198x 900 scale=1.0000 cl input data to host
    46.4283 [pixelpipe_process_CL]       [full]         crop                   ( 229/ 112)  985x 740 scale=0.2333 --> (   0/   0)  985x 740 scale=0.2333 cl input data to host
    46.4522 [pixelpipe_process_CL]       [full]         colorout               (   0/   0)  985x 740 scale=0.2333 --> (   0/   0)  985x 740 scale=0.2333 cl input data to host
    46.5291 [pixelpipe_process_CL]       [preview]      colorout               (   0/   0)  971x 729 scale=1.0000 --> (   0/   0)  971x 729 scale=1.0000 cl input data to host

Nothing further is written on stdout when it crashes.

Is there any other debugging I can enable that might help?

It seems that the crashes mostly occur while loading images. I’ve basically been leaving OpenCL off until I’m ready to export, then just enabling it for the export process. It hasn’t crashed during export. Obviously I’d like to leave it on all the time.


What driver are you using?

04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] (rev cb) (prog-if 00 [VGA controller])
	Subsystem: Dell Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT]
	Flags: bus master, fast devsel, latency 0, IRQ 160
	Memory at b0000000 (64-bit, prefetchable) [size=256M]
	Memory at c0000000 (64-bit, prefetchable) [size=2M]
	I/O ports at 4000 [size=256]
	Memory at c2100000 (32-bit, non-prefetchable) [size=512K]
	Expansion ROM at c2180000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: amdgpu
	Kernel modules: amdgpu

So, “amdgpu”? Is that what you mean by driver?

These things don’t look particularly promising:

Oct 09 20:11:14 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu: Failed to disallow df cstate
Oct 09 20:11:14 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
Oct 09 20:11:14 andrew-desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -62
Oct 09 20:11:14 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset end with ret = -62
Oct 09 20:11:14 andrew-desktop kernel: amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption
Oct 09 20:11:14 andrew-desktop kernel: amdgpu: qcm fence wait loop timeout expired
...
Oct 09 20:11:05 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset(1) failed
Oct 09 20:11:05 andrew-desktop kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62
Oct 09 20:11:05 andrew-desktop kernel: [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
Oct 09 20:11:05 andrew-desktop kernel: [drm:psp_hw_start [amdgpu]] *ERROR* PSP load kdb failed!
Oct 09 20:11:03 andrew-desktop kernel: [drm] PSP is resuming...
Oct 09 20:11:03 andrew-desktop kernel: [drm] VRAM is lost due to GPU reset!
Oct 09 20:11:03 andrew-desktop kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000800000).
Oct 09 20:11:03 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset succeeded, trying to resume
Oct 09 20:11:01 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu: BACO reset
Oct 09 20:11:01 andrew-desktop kernel: amdgpu: Failed to suspend process 0x8005
Oct 09 20:11:01 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
Oct 09 20:11:01 andrew-desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
Oct 09 20:11:01 andrew-desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=9506, emitted seq=9508
...
Oct 09 20:10:50 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu:          RW: 0x0
Oct 09 20:10:50 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu:          MAPPING_ERROR: 0x1
Oct 09 20:10:50 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu:          PERMISSION_FAULTS: 0xb
Oct 09 20:10:50 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu:          WALKER_ERROR: 0x5
Oct 09 20:10:50 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu:          MORE_FAULTS: 0x1
Oct 09 20:10:50 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu:          Faulty UTCL2 client ID: CPF (0x4)
Oct 09 20:10:50 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00F009BB
Oct 09 20:10:50 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu:   in page starting at address 0x0000000000000000 from client 0x1b (UTCL2)
Oct 09 20:10:50 andrew-desktop kernel: amdgpu 0000:04:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:221 vmid:15 pasid:0, for process  pid 0 thread  pid 0)

Can you roll back to a previous version of the driver?

@apkerr no, that is the AMD GPU driver, not the OpenCL driver.

As an ordinary user (whichever user you log in as to run DT) please run
ls /lib/dri
and post the output here.

[andrew@andrew-desktop ~]$ ls /lib/dri
crocus_dri.so  i915_dri.so  kms_swrast_dri.so  r300_dri.so  radeonsi_dri.so  virtio_gpu_dri.so  zink_dri.so
d3d12_dri.so   iris_dri.so  nouveau_dri.so     r600_dri.so  swrast_dri.so    vmwgfx_dri.so
[andrew@andrew-desktop ~]$

Which tells me you have the radeonsi driver installed for OpenCL. Read below and delete it if the links don’t show it as apt for the driver you need (as your OpenCL is broken I doubt it will be).

You also have nouveau installed. Do you have any Nvidia chipsets for it to drive? If not, I suggest you get rid of it. Use sudo or su to become root and run
pacman -Rs nouveau
to delete it and any dependencies which do not affect any other software.

Next, read these links carefully. There is an awful lot of good information in there but it is easy to skip over some and some of it is not easy to interpret.
https://wiki.archlinux.org/title/GPGPU#OpenCL
SOLVED OpenCL not working on Desktop / Multimedia and Games / Arch Linux Forums
Opencl error (may be PHI node, LLVM problem) / Multimedia and Games / Arch Linux Forums

You are using an Arch fork distro. I suggest you bookmark the Arch web site and use its very thorough information. None of the forks I have come across are so well documented. The forum and wiki are especially good sources for reference.

A very common issue with amdgpu is buggy power management causing driver crashes. Try adding amdgpu.runpm=0 as a kernel boot parameter to disable it.
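
For anyone trying this, here is a minimal sketch of how to confirm after a reboot that the parameter actually reached the kernel. The helper name has_kernel_param is made up for illustration. /proc/cmdline is standard, but I am assuming the amdgpu module exposes runpm under /sys/module/amdgpu/parameters, which may differ on your kernel. How the parameter gets added depends on your boot loader, so that part is left out.

```sh
# Return success if a kernel command line contains an exact parameter.
# $1: command line string (e.g. the contents of /proc/cmdline)
# $2: parameter to look for (e.g. amdgpu.runpm=0)
# has_kernel_param is a hypothetical helper, not part of any package.
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;   # matched as a whole, space-delimited word
        *) return 1 ;;
    esac
}

# Usage on the affected machine (left as comments so nothing runs by accident):
#   has_kernel_param "$(cat /proc/cmdline)" amdgpu.runpm=0 && echo "set at boot"
#   cat /sys/module/amdgpu/parameters/runpm   # assumed path; value the module uses
```

Matching on a space-delimited word avoids a false positive from a longer value such as amdgpu.runpm=01.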


I am able to get OpenCL to not crash by using “opencl-rusticl-mesa”.

However, now I’m running into this: [rusticl] [radeonsi] [darktable4] [ppc64le] Darktable always renders black images despite not throwing any error (#7746) · Issues · Mesa / mesa · GitLab

The images look black if either Filmic RGB or Color Balance RGB modules are enabled. I can use Sigmoid instead of Filmic, but I do tend to use Color Balance RGB a lot.

I will keep investigating.

The rusticl drivers are not considered stable for now.

@apkerr rusticl does work well with some AMD GPUs according to various posts around the web. For me the criterion is not whether it is considered “stable” but whether it works. As it isn’t working for you I suggest you delete it and its dependencies. It did not work for me either.

I’ve just installed opencl-amd from the AUR. Running DT from the command line with “~$ darktable -d opencl”, it runs flawlessly but in some aspects painfully slowly. That is because DT’s settings were altered during the installation (by the software, not by me). It has improved a bit with just a couple of tweaks.

That makes me reasonably confident that, with a thorough review of my DT settings, I should be able to get it running quickly.

Shout if you need any help using the Arch AUR.

Thanks.

opencl-amd is the one that causes my entire screen to go black shortly after opening an image.

I tried the boot parameter suggested by @K-1, but that did not seem to have any effect.

@paperdigits, how would I do that?

Thanks

I don’t know “who” is commenting “it works” and on what test basis. Yes - we could have a dt bug being seen only for that driver, as we had quite a number for the amd driver. And yes - the driver is not considered to be stable / fully conforming to the standard. So we “allow” it to be used but added a preference in master to at least make the user think about it.

I would be pretty reluctant to advise users to use it, as the first impression might be “fine for me so let’s go” and after a while we have reports like “darktable just crashes permanently and it never did before” :-)


My experience too, Jens. Where I see people reporting “it works”, that has happened across a variety of hardware configurations and is usually stated at the end of a thread where they originally reported a problem.

The Arch wiki does have good information and links to more, as cited in my earlier post. The problem here is that we need to be able to understand it to make full use of it. The whole mesa, opencl and amd-specific stuff was very new to me when my DT first began playing silly games. It was not easy for me to explain clearly because I didn’t understand either the inner workings of DT or what OpenCL was about.

@apkerr Andrew, are you clearing out your previous opencl driver instances including dependencies before trying the next? Have you compared your AMD GPU version to the list? Recommended solutions are version-dependent.
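
A rough sketch of one way to check what is still registered before installing the next driver. It assumes the usual ICD loader layout, where each installed OpenCL implementation drops a *.icd file into /etc/OpenCL/vendors. The function name list_opencl_icds is invented for this example.

```sh
# Print the name of every *.icd file in an ICD registry directory, one per line.
# $1: directory to inspect (defaults to /etc/OpenCL/vendors, the usual location)
# list_opencl_icds is a hypothetical helper, not a tool shipped by any package.
list_opencl_icds() {
    dir=${1:-/etc/OpenCL/vendors}
    for f in "$dir"/*.icd; do
        [ -e "$f" ] || continue   # the glob matched nothing
        basename "$f"
    done
}

# Usage on an Arch-based system (comments only, nothing is executed here):
#   list_opencl_icds                      # implementations the loader will see
#   pacman -Qs opencl                     # installed packages mentioning opencl
#   pacman -Qo /etc/OpenCL/vendors/*.icd  # which package owns each ICD file
```

If more than one ICD shows up after a driver has supposedly been removed, the leftover file's owning package is a reasonable thing to remove next.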

It was rusticl or one of its dependencies which caused my images to appear as black but I didn’t have a complete black screen. The only “tests” I can offer are that I tried various drivers and eventually found one which for now at least is working for me.

In case my experience may help anybody else, my GPU is the integrated graphics of an AMD Ryzen 5 5600G (6 cores, 12 threads).

With opencl-amd I have made a couple of tweaks in “darktable preferences > processing > CPU / GPU / memory”. No more black images and DT is now lightning fast. Switching “darktable resources” from “small” to “default” helped to clear up the remaining blackened images.

That is nice for you :-) but for me it is quite something to worry about. It just shouldn’t happen at all. A small memory setting should slow down the process - often quite a lot - but it should never ever generate a different result.


The black images were left after the foray with rusticl. Installing opencl-amd didn’t clean them all up. But switching from lighttable to darkroom after changing the setting from small to default, then clicking on each black image, did.

I think I’m cleaning up the dependencies, but maybe not. I’ll try to ensure that I start fresh next time (I’m at work at the moment). To which list of GPUs are you referring?

I also had the black images using rusticl with certain modules active (Filmic RGB & Color Balance RGB). It seemed to work fine without those modules, but I use them pretty extensively.

As for my current issue - the computer itself is not crashing. I ssh’d into it from another machine and followed the dmesg logs while triggering a crash. I saw error messages similar to the ones I posted above. So the computer is alive, but the gpu died and I’m not sure how to bring it back other than a hard reset.
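
For reference, a small sketch of how the kernel log can be narrowed down to the relevant amdgpu lines while watching it over ssh. The pattern is only an assumption based on the messages quoted earlier in this thread, not a complete list of amdgpu errors, and filter_amdgpu_errors is a made-up name.

```sh
# Read kernel log lines on stdin and print only those that look like amdgpu
# faults, timeouts or reset messages. The pattern list is an assumption drawn
# from the log excerpts in this thread; extend it as needed.
filter_amdgpu_errors() {
    grep -E 'amdgpu.*(\*ERROR\*|GPU reset|page fault|timeout)'
}

# Usage from a second machine (comments only; user@host is a placeholder):
#   ssh user@host 'sudo dmesg --follow' | filter_amdgpu_errors
# After a forced reboot, the previous boot's kernel log can be filtered too:
#   journalctl -k -b -1 | filter_amdgpu_errors
```

Capturing the first matching line before the reset messages start (here, the page fault) is usually more useful for a bug report than the reset cascade that follows it.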

The dt bug in filmicrgb has likely been fixed in master. About the colorbalancergb module, there have been reports but none with logs showing what might be wrong.