A GPU for vkdt...

Okay, now that Ethereum no longer corners the GPU market, I’m going to shop for a GPU specifically to run vkdt. This is probably quite sad, but I’m a former distributed-systems prof who knows next to nothing about the current nomenclature surrounding these devices. Specific makes/models are good, but I’m also interested in the generic hardware requirements, e.g., quantity of memory, etc.

I have a GPU my son gave me to run multiple monitors, but it doesn’t seem to be up to the need. Here’s what it does with vkdt:

glenn@bena:~/ImageStuff/vkdt/src$ ./vkdt
[pipe] [global init] cannot open modules directory!
[gui] glfwGetVersionString() : 3.3.6 X11 GLX EGL OSMesa clock_gettime evdev shared
[gui] monitor [0] DVI-I-2 at 1920 0
[gui] monitor [1] DVI-I-1 at 0 0
[gui] vk extension required by GLFW:
[gui]   VK_KHR_surface
[gui]   VK_KHR_xcb_surface
[ERR] init vulkan failed
[ERR] failed to init gui/swapchain

Here’s the GPU info:

glenn@bena:~$ lspci | grep ' VGA ' | cut -d" " -f 1 | xargs -i lspci -v -s {}
09:00.0 VGA compatible controller: NVIDIA Corporation GF104 [GeForce GTX 460] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Gigabyte Technology Co., Ltd GF104 [GeForce GTX 460]
	Flags: bus master, fast devsel, latency 0, IRQ 65
	Memory at fa000000 (32-bit, non-prefetchable) [size=32M]
	Memory at d0000000 (64-bit, prefetchable) [size=128M]
	Memory at d8000000 (64-bit, prefetchable) [size=64M]
	I/O ports at f000 [size=128]
	Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: nouveau
	Kernel modules: nvidiafb, nouveau


That might be one of the issues. OpenCL, at least, needs the proprietary drivers. See “Red Hat Experimenting With ‘NVK’ Nouveau Open-Source Vulkan Driver” (Phoronix) and “nouveau (software)” (Wikipedia).

The GTX 460 is also ancient (12 years old). See “GeForce 400 series” (Wikipedia).



sounds like it maybe doesn’t have vulkan support? doing a quick web search i can only find N/A in the vulkan column for this card… so indeed maybe something slightly newer would be more fun to work with.

memory plays a big role if you have a camera with a lot of megapixels. i’m mostly on 16/20MP so i can do a lot with little memory. for instance, the graph reports (vkdt -d mem)

[mem] images : peak rss 202.984 MB vmsize 229.044 MB
[mem] staging: peak rss 31.729 MB vmsize 31.729 MB

when opening a 16MP RAF file. multiply that by the megapixel scale factor for the rock-bottom minimum requirement. from what i hear, vkdt starts to run stably for processing thumbnails of larger directories from 4 GB of video ram.

prices currently are ridiculous. if i was going to buy for myself i’d probably start looking around for an RTX 3060 or similar (has 12 GB ram, plenty).

amd cards are okay if you don’t need tensor cores or ray tracing (both of which i’ll be using for other things). tensor cores might find their way into vkdt. the more severe thing is floating point atomics: i have to do silly dances to get amd compatibility in some places (remind me to ask money from nvidia for such a statement next time).


The prices on GPUs should come down pretty soon. Nvidia is about to release the 40 series and apparently there is a bunch of the 30 series sitting around. It is likely that most gamers will not want to buy a 30 series.

I suggest going with Nvidia, and having at least 6 GB of video memory, since that will help avoid tiling. More than 12 GB is likely overkill. The 30 series has more processing units inside the chip, allowing it to do more mathematical calculations in parallel than the 20 series. All of this helps with OpenCL.

I’m not sure how much of that translates to Vulkan.


… maybe to add to this: this webpage lists a lot of user reports about specific vulkan features for a multitude of GPU devices.

i’m using some features which are mostly mainstream, plus floating point atomics, in particular shaderImageFloat32AtomicAdd, which doesn’t seem to be generally supported, at least not by amd. that said, if this one isn’t there, i’ll work around it, but there will be a speed impact.

pretty much all of it. maybe with the caveat that so far i don’t even do tiling. so if your graph doesn’t fit the device it’ll just not run.

and yes, let’s hope for better prices for the 30xx series soon.


Some of these are beasts… it can be more than just buying a card. I think at around the Nvidia 3070 or 3080 series you will need a pretty serious power supply, and that can also mean having a decent motherboard to handle all that heat and power. I recall I couldn’t wait due to HW failure, so I have a 3060 Ti. I’m going from memory, but I wouldn’t pay more at that point; the size and power needed by cards higher than that was substantial. So in your research, be sure the rest of your kit is up to the task of what you buy.


I can report that the RX 6600 with 8 GB (which seems to be $240 at Newegg) that I bought for vkdt works well with my 42MP RAWs. There used to be an issue, but hanatos fixed it a while ago.

Generally I guess going with nvidia is the safer bet, since the only dev uses that, unless you want to run wayland (that’s still not possible on nvidia, right?).


Wayland works on Nvidia; I’ve used it on Fedora 36 KDE. But color management is not working for me, so I’m still using X11 to be able to load my monitor calibration profile.


For rawproc I decided early on to eschew the OS color management in both Linux and Windows; it does its own color-managed display rendering. Glad I did…


i do the same. it’s the only way to be sure :slight_smile:


Thanks everyone for the insight; got a better feel for what I need.

Now, some more prosaic questions. I’m staring at my current configuration, a rather old dual-port DVI GPU which drives two rather old DVI monitors. I’m not especially interested in messing with the displays, and then I pondered: how does a GPU differentiate between display work and software work? For that, and the DVI port requirement, will I need to keep the old GPU running alongside the new one? Also, I’ll have to check the PSU; I don’t remember what wattage I procured.

Geesh, this might turn into a ‘distro mania’ sort of campaign… :crazy_face:

it does. i mean you can always run vkdt-cli on the command line and tell it which gpu should schedule the compute, easy (vulkan makes this very explicit, you pick the device). the gui, you can’t. vkdt does not copy image buffers back to the cpu after rendering, it directly outputs them on the screen, so it has to have the wires attached to it. that’s part of what makes the ui run at high speeds.

it’s possible to code your way around this (by various ways of copying stuff around) but so far that didn’t appeal to me (sounds messy, multi-thready, and slower in any case). i can see how it would save you a lot of money though… but all the better ways of doing it (device groups) are probably reserved for newer devices, and your current card doesn’t even do vulkan in the first place (to initiate the copy).

seeing that i can’t even seem to find a 1080 with dual dvi ports, the only software solution/workaround would be to copy the images to the cpu and run a cpu gui. :confused:

Just out of curiosity, how slow would that be? You would only need to copy the buffer at screen resolution, do you think it could be a significant bottleneck? (I’m talking in absolute – i.e. human-perceived – terms, not relative)


Okay, the monitors have VGA ports, so probably cheap HDMI->VGA dongles…

Okay, so vkdt is acting just like any other rendering program, using the cores to process the resident image, and then that buffer is just directed to the display. Easier than I thought.

So now, my trade space axes look to be 1) brand, 2) power requirements.

Just to be on the safe side and moderately future proof, I suggest the RTX 4090.

puahaha yeah i suspect you may charge more for your subtle advertising here :smiley:

Yeow, $1600US. I’m in my 60’s, so there’s not that much future left to protect… :laughing:

hehe i would not call human perception absolute. but in terms of slow: you need to load some software pixel-pushing stack… i think drawing anything full screen will cost you several tens of milliseconds. the copy itself i don’t know. in this case we’re probably also not talking pcie 5.0. certainly very slow as compared to not doing the dance. plus the synchronisation nightmare.

that said it’ll probably still feel much faster than dt today. the overall laggy user interface is probably within the tolerance of non-gamers.


yes, there seems to be a point of diminishing returns in terms of awesomeness/watts, especially at the top of the price/performance segment. not too much help, but nvidia-smi allows you to set an upper limit on the power the device draws when clocking up (or not doing that). there’s probably a similar tool for amd.


i just remembered that amd is about to release a new generation. hoping they will not set their prices quite as astronomically as nvidia does, that may be an alternative. also i’d wait until the intel arc series is available, maybe they will cover the price segments for mortals? if not, i think nvidia’s RTX 2060 [super] is old but not obviously terrible and still available.