vkdt dev diary, pt2

danny · July 20, 2022, 11:23am

ahh, i didn’t notice the new package, it works now
it’s great, thank you very much for your help

okke · July 23, 2022, 12:54pm

Are artifacts like below currently expected in some situations? I think I saw them while testing on an integrated intel GPU but also now with an AMD GPU as well on some pictures (especially in high contrast ones). At least the clarity slider affects them somewhat.

mikae1 · July 24, 2022, 9:06pm

Honestly, that’s quite beautiful.

hanatos · July 25, 2022, 6:33am

looks brilliant. you should print and sell

more seriously, would you mind sharing a raw + cfg (maybe in private if you don’t want it public)? i’m not sure what i’m looking at here but there may be an easy fix.

okke · July 25, 2022, 9:20am

The cfg should be the one automatically applied without any changes. Disconnecting llap node removes the artifacts.

DSC01594.ARW (23.8 MB)
DSC01594.ARW.cfg.txt (2.7 KB)

Maybe related: if llap is removed but I increase contrast to e.g. >1.5 I get this:

But I like the B&W artifacts better

hanatos · July 25, 2022, 9:27am

awesome. thanks for the image, looking into it.

hanatos · July 25, 2022, 11:56am

okay thanks for bringing this up.

it appears rawspeed had/has a bit of an aggressive black level for this camera. vkdt does not clamp negative values (after colour transform these are valid numbers and they are important for unbiased denoising of black). i did two things and pushed to github:

1. updated rawspeed to upstream/develop (may need to rm -rf built/ext to trigger rebuild, but the xml change may be enough here). this did indeed correct the black level to a range where it looks almost right.
2. introduced a bias parameter in the filmcurv module. if you lift this just a little (say to 1e-3 in your case) it will correct the few remaining artefacts:

this is stupid and i hope it doesn’t come up in too many images. may need to think about a more automatic way of detecting this, or probably discuss with our rawspeed friends whether the black level is indeed correct or needs another 1e-3 nudge.
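
The remark above that negative values matter for unbiased denoising of black can be illustrated with a small simulation. This is only a sketch and not vkdt code: the Gaussian noise model, the function names and the sample count are all assumptions made for illustration. It compares the average of a truly black pixel after black-level subtraction with and without clamping at zero.

```python
import random
import statistics


def simulate_black_pixels(n=100_000, read_noise=1.0, seed=0):
    """Samples of a truly black pixel after black-level subtraction.

    Assumption: zero-mean Gaussian noise, so about half the samples are negative.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, read_noise) for _ in range(n)]


def mean_signed(samples):
    """Average that keeps negative values: stays close to the true level, 0."""
    return statistics.fmean(samples)


def mean_clamped(samples):
    """Average after clamping at 0: the negative half is lost, so black drifts up."""
    return statistics.fmean([max(0.0, s) for s in samples])


samples = simulate_black_pixels()
print("signed mean :", mean_signed(samples))
print("clamped mean:", mean_clamped(samples))
```

Under this noise model the clamped average settles near read_noise / sqrt(2*pi), roughly 0.4 times the noise level, instead of zero. That lifted black is the bias that clamping early would introduce.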

okke · July 25, 2022, 7:22pm

Great, thanks!
Indeed the updated rawspeed seems to mostly help here. You probably didn’t push the bias change to vkdt yet, so I can’t say for sure whether it would help more, but I can still get the magenta & red artifacts if I push contrast.

The artifacts were not visible in most pictures anyway, but it would be good to have a way to remove them if needed.

Great job btw in general, vkdt seems to work otherwise pretty nicely (and even better with better pictures)!

hanatos · July 26, 2022, 6:39am

ouch indeed i did not (but have now). thanks for pointing this out.

qosch · July 27, 2022, 9:47pm

Hey,
I’ve received a notebook with a Ryzen 6800U processor with the most powerful iGPU that supports Linux today (680M). In games it’s between a desktop 1050ti and a desktop 1650.
How about an integrated benchmark for vkdt, or should we just agree on a few play raw files + cfg?
I am currently fighting a few issues so it could take a few days until I’ll get to testing…

hanatos · July 28, 2022, 3:33pm

ah great idea. probably vkdt-cli and a set of playraw + .cfg files is all it would take. probably need to run 10x in a row to make sure the gpu even throttles up.
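
A minimal sketch of such a repeated-run timing harness, assuming nothing about vkdt-cli's command line: the command is passed in as a plain argument list, and PLACEHOLDER_CMD below is a stand-in that only starts the Python interpreter. The function names and the warm-up count are hypothetical.

```python
import statistics
import subprocess
import sys
import time

# Placeholder only: replace with the real vkdt-cli invocation for one raw + cfg.
PLACEHOLDER_CMD = [sys.executable, "-c", "pass"]


def time_command(cmd, runs=10, warmup=2):
    """Run cmd `runs` times in a row; return wall-clock seconds of the runs after warm-up.

    The first `warmup` runs are executed but not recorded, so the GPU has a
    chance to clock up before measurements start.
    """
    timings = []
    for i in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        elapsed = time.perf_counter() - start
        if i >= warmup:
            timings.append(elapsed)
    return timings


def summarize(timings):
    """Reduce the recorded timings to a few comparable numbers."""
    return {
        "runs": len(timings),
        "median_s": statistics.median(timings),
        "min_s": min(timings),
    }
```

Calling summarize(time_command(cmd)) once per playraw file gives one median per image. Note that wall-clock time of a full process includes startup and file loading, not just GPU work.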

qosch · July 30, 2022, 10:30am

First quick report on vkdt on a convertible with ryzen 6000 iGPU:

- it works with both X11 and Wayland
- I have not decided whether to keep this laptop so I haven’t installed anything on the internal SSD. Instead everything (OS, vkdt, raws) is currently running from a fast (but surely not as fast as a PCIe SSD) USB 3.0 stick. This could be limiting but I hope that after opening an image in the darkroom, everything is kept in RAM/VRAM. Is that the case?
- it is unfortunately quite slow, slower than darktable 4 with CPU processing only (dt 4 with opencl is even slower than CPU only processing and also slower than vkdt) with the 42MP raws from my a7RII. By slow I mean it’s very far from the responsiveness that vkdt has on my desktop PC and more comparable to darktable without opencl acceleration a few years ago. The default pipeline without any of the slow, fancy stuff takes about 1s for the mentioned 42MP raws to update the view after moving a slider (not measured, could be 0.5s to 1.5s I’d say). I find this a bit surprising as the difference between this and the RX6600 in my desktop PC should be a factor of ~3.5 when comparing frame rates in gaming-centered reviews. But it feels more like a factor of 10.
- the implementation on the Lenovo Yoga 7 gen 7 only allows a maximum of 2GB of RAM to be allocated as “vram”. I had the idea that the slow processing is partly due to tiling because of limited VRAM. Is that possible? I tried allocating more RAM by using the kernel parameter vramlimit but it has no effect, only the BIOS setting has an impact. The AMD website also mentions a BIOS setting “auto” that supposedly allocates dynamically but the BIOS version I have does not have that option (only 512MB, 1GB or 2GB). Maybe the 32GB RAM version of this device has a 4GB option? If you think the 2GB are likely a big issue I’ll contact Lenovo and ask.
- there are three vulkan implementations for AMD GPUs: amdvlk, radv and amdgpu-pro. I’ll investigate if there are performance differences for vkdt.
- one last sidenote: the UI is better than dt on a touchscreen: sliders are easier to manipulate (but cannot be dragged) and panning in the image works nicely. Pinch to zoom would be great (not only on the touch screen but on the touchpad as well) but as far as I can tell imgui has no support for this yet.

cheers

hanatos · July 30, 2022, 1:51pm

great, thanks for testing. yes 2gb is tight. i think 4g should be the minimum requirement for vkdt, certainly for 42mp raws. there is -d mem for memory info.
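
To get a feeling for why 2GB is tight, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that each full-resolution intermediate buffer holds 4 channels of 16-bit values per pixel. The real formats and the number of buffers alive at once depend on the graph, and -d mem reports the actual figures.

```python
def buffer_bytes(megapixels, channels=4, bytes_per_channel=2):
    """Size of one full-resolution buffer in bytes.

    Assumption (not taken from vkdt): 4 channels of 16-bit values per pixel.
    """
    return int(megapixels * 1_000_000) * channels * bytes_per_channel


def buffers_that_fit(vram_bytes, megapixels, **buffer_format):
    """How many such buffers fit into a given amount of VRAM, ignoring all overhead."""
    return vram_bytes // buffer_bytes(megapixels, **buffer_format)


GIB = 1024 ** 3
print(buffer_bytes(42))               # 336000000 bytes per 42MP buffer
print(buffers_that_fit(2 * GIB, 42))  # 6
print(buffers_that_fit(4 * GIB, 42))  # 12
```

Under these assumptions only a handful of full-resolution intermediates fit into 2GB at the same time, before counting the display, the input raw itself or any driver overhead.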

input data is loaded once and kept on device for subsequent refresh, yes. plus, kernel caches will help cpu side.

as to comparison to std dt: vkdt always processes the full image full res with all modules from beginning to end. consider this if you want to gauge code performance. vkdt has a LOD setting to process low res instead, i don’t test it much though.

okke · August 7, 2022, 8:23am

I was seeing some occasional crashes and thus tried to build with make debug -j12 but then I get a direct crash

[gui] glfwGetVersionString() : 3.3.8 X11 GLX EGL OSMesa clock_gettime evdev shared
[gui] monitor [0] HDMI-A-1 at 0 0
[gui] vk extension required by GLFW:
[gui]   VK_KHR_surface
[gui]   VK_KHR_xcb_surface
[ERR] init vulkan failed
[ERR] failed to init gui/swapchain

The log in /tmp contains only the line "this is vkdt 0.0.1-356-ge10f357 reporting a crash:" followed by an empty line.

Without the debug build I sometimes get "corrupted double-linked list" when trying to exit, and I have to force-close vkdt with Ctrl+C (the gui does disappear first). Any pending changes to vkdt.db are lost.

qosch · August 9, 2022, 8:09pm

Of course I am aware that vkdt works with full resolution images. That becomes apparent when zooming in is as fluid as in a jpeg viewer. What exactly is the LOD mode? Is it a lower fixed resolution or does it make the pipeline run at screen resolution (with recalculation on zooming in)? The latter would be great for mobile devices I think but would probably come with lots of development and testing work.
Good news from further testing: with the vulkan radeon mesa driver vkdt was usable for my 42MP raws. I’d say maybe 0.2s for the pipeline to run, so moving sliders is pretty fluid. I’m also running this laptop with the “power saving” configuration enabled in the BIOS, which costs ~20% performance but makes the fans stay off almost always.
Because you didn’t answer directly: is it possible that a lack of VRAM will cause vkdt to run slower or will it just refuse to do some stuff (for example on this 2GB machine I can’t crop my 42MP images, it crashes vkdt)?
One more addition: panning the image using the touchscreen only works in X11 mode, not with wayland.

hanatos · August 21, 2022, 1:22pm

heya, when building the debug build, make sure you make clean before that. as in regular dt, debug or not will change the size of a few central structs (to attach debug stuff) and lead to crashes in incremental/inconsistent builds.

your direct crash sounds like the GPU that is chosen by vkdt does not have a monitor attached?

hanatos · August 21, 2022, 1:31pm

this is set here: vkdt/src/gui/render_darkroom.cc at master · hanatos/vkdt · GitHub and influences the output region. lod=1 means it is processed full res (full raw image). lod=2 means it is processed at output resolution (screen pixels). lod>2 means it is scaled even more, e.g. lod=3 is half output res (2x2 blocks in screen display). the scale applies between full res input (raw image) and full res output (screen window size).
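
The lod values described above can be sketched as a small function. This is not the code from render_darkroom.cc: the function name is made up, and the divisor lod - 1 is an assumption that matches the stated cases (lod=2 is output resolution, lod=3 is half of it) and simply extrapolates for larger values.

```python
def processing_resolution(lod, raw_size, screen_size):
    """Resolution at which the image would be processed for a given lod.

    lod == 1: full raw resolution.
    lod == 2: output (screen) resolution.
    lod == 3: half the output resolution (2x2 blocks on screen).
    lod  > 3: assumed to continue as screen / (lod - 1); this extrapolation
              is a guess, not confirmed by the description above.
    """
    if lod < 1:
        raise ValueError("lod must be >= 1")
    if lod == 1:
        return raw_size
    divisor = lod - 1
    width, height = screen_size
    return (max(1, width // divisor), max(1, height // divisor))


raw = (7952, 5304)      # example size of a 42MP raw
screen = (1920, 1080)   # example window size
for lod in (1, 2, 3):
    print(lod, processing_resolution(lod, raw, screen))
```

With these example sizes, lod=1 keeps the raw dimensions, lod=2 gives 1920x1080 and lod=3 gives 960x540.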

oh yeah this sounds good.

yes because i don’t know about the driver and amd in particular. possible that they start some paging in the background that will just be slow? for my part, i just cause a friendly crash and don’t do anything about it.

hm okay that might be on glfw’s end. not sure i’m using the event system incorrectly? sounds implausible. more like imgui doesn’t redraw on wayland, that used to be the case in the past. is the new panning position in effect after moving sliders? if so it might just be a mouse-move redraw issue.

okke · August 21, 2022, 6:27pm

Yep, I just checked: I ran make clean and then rebuilt with make debug, and running the resulting vkdt still gives the error. Strangely, when I build with just make I get a vkdt that does work with the same monitor setup (1 display at monitor [0] HDMI-A-1 at 0 0).

hanatos · August 22, 2022, 8:03am

even stranger. did you try to run directly through

gdb --args ./vkdt -d all
r

and then maybe use bt to find out where it crashes? sounds implausible that the extensions are sometimes found and sometimes not. maybe the actual error is something after initing the extensions?

g-man · August 24, 2022, 1:52am

Feature Request for vkdt: