vkdt devel diary

i just merged a branch that implements support for evaluating gmic’s gaussian/poissonian resnet as gpu shaders. i’ve been working together with @David_Tschumperle on this. really he’s done all the work and i just ported it over to the gpu, which wouldn’t have been very easy without his support in debugging my broken implementation.

a few initial observations. my implementation is stupid:

  • it runs out of memory very quickly and does not attempt any tiling
  • it fetches everything from global memory and is thus very slow
  • it’s split into way too many small kernels

and currently it evaluates a 1080p image in ~300ms on a low-end GTX 1650 (laptop version).
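
for reference, tiling as mentioned in the first point could look roughly like the sketch below. this is not what vkdt does, just a minimal illustration in python: the image is cut into fixed-size output tiles, and each tile reads a slightly larger input region (a halo) so the convolutions see valid neighbours at tile borders. the names `Tile` and `split_into_tiles`, the tile size of 512 and the halo of 16 pixels are all made up for the example; the real halo would have to match the receptive field of the network.

```python
from typing import Iterator, NamedTuple


class Tile(NamedTuple):
    # region this tile is responsible for writing (output regions never overlap)
    out_x0: int
    out_y0: int
    out_x1: int
    out_y1: int
    # region to read as network input: the output region grown by the halo,
    # clamped to the image borders
    in_x0: int
    in_y0: int
    in_x1: int
    in_y1: int


def split_into_tiles(width: int, height: int, tile: int, halo: int) -> Iterator[Tile]:
    """yield tiles covering a width x height image exactly once.

    `tile` is the (hypothetical) output tile edge length in pixels,
    `halo` the number of extra input pixels read on every side.
    """
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            # the last row/column of tiles may be smaller than `tile`
            x1 = min(x0 + tile, width)
            y1 = min(y0 + tile, height)
            yield Tile(
                x0, y0, x1, y1,
                max(x0 - halo, 0), max(y0 - halo, 0),
                min(x1 + halo, width), min(y1 + halo, height),
            )


# example: a 1080p frame with assumed tile size and halo
tiles = list(split_into_tiles(1920, 1080, tile=512, halo=16))
# largest input region any single tile needs, in pixels
peak = max((t.in_x1 - t.in_x0) * (t.in_y1 - t.in_y0) for t in tiles)
print(len(tiles), peak)
```

with these assumed numbers the frame splits into 12 tiles, and intermediate buffers only need to be sized for the largest tile input instead of the full frame. the price is that halo pixels are processed more than once.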

this is really just a starting point. next i need to write a faster implementation and then likely tweak the network architecture for real-time performance and temporal stability (video).