One of the advantages of using something like Julia is that you get instant access to good math and machine-learning libraries. I'm afraid that at some point ML-based processing, which is now mostly found in commercial software or hard-to-use research tools, will become almost a requirement.
The CUDA stack looks good, but I'm not so sure about anything outside of it. Another issue is the GUI.
For the pipeline side of things, I think a node-based approach (à la Blender) works best for graphics.
… in vkdt (as in any real-time rendering framework, i think it's the first feature people implement) you can hotswap your modules. it'll reload the changed shader code and rerun it in place, keeping your camera position/parameters/etc as they were. i know this is not what you're asking, but it's great fun for debugging.
If I understand correctly, a pipeline is just a special case of a node system (a linear chain of nodes). If so, nodes seem the better idea.
Basically, a plugin takes as input image data (in some format, encoding, color space, etc) and a mask, and it outputs a new set of image data.
This seems unnecessarily restrictive. Perhaps a plugin could take any number of inputs, and create any number of outputs. Similarly, any number of UI inputs, and metadata input/output.
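That generalised interface could be sketched roughly as follows. Everything here is hypothetical (the `Plugin`, `PluginResult` and `Invert` names are made up for illustration, not an existing API); the point is just named inputs, named outputs, and metadata passing through:

```python
from dataclasses import dataclass, field

@dataclass
class Image:
    pixels: list                      # stand-in for a real pixel buffer
    color_space: str = "linear-rec2020"

@dataclass
class PluginResult:
    images: dict                      # any number of named outputs
    metadata: dict = field(default_factory=dict)

class Plugin:
    """A node with any number of named image inputs and outputs."""
    def run(self, images: dict, params: dict, metadata: dict) -> PluginResult:
        raise NotImplementedError

class Invert(Plugin):
    """Trivial example: one input, one output, metadata passed through."""
    def run(self, images, params, metadata):
        src = images["in"]
        out = Image([1.0 - p for p in src.pixels], src.color_space)
        return PluginResult(images={"out": out}, metadata=metadata)
```

A pipeline is then just the special case where each plugin has one input and one output wired in a chain.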
I suggest thinking carefully about version control, both of the core system and of plugins. For example, what happens when changes to one plugin break another plugin?
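One minimal way to handle that is to have plugins declare versions and check compatibility before loading. A toy sketch of one possible policy (a hypothetical semantic-version rule, not any particular plugin system's):

```python
def compatible(required: str, provided: str) -> bool:
    """Same major version, and the provided minor must be >= the required minor."""
    rmaj, rmin = (int(x) for x in required.split(".")[:2])
    pmaj, pmin = (int(x) for x in provided.split(".")[:2])
    return pmaj == rmaj and pmin >= rmin
```

So a plugin requiring interface "1.2" would load against "1.3" but refuse "2.0", which at least makes a break explicit instead of silent.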
Do you want to restrict this to processing raws? Why not generalise to allow any image?
Node based editing is fun.
Natron, Nuke and Fusion all have communities of people who develop nodes, be it nodes that contain node sub-trees (it's not strictly a tree, but for brevity's sake) or "plugin nodes" that contain custom code for speed.
So if the user likes a particular assortment of nodes in a particular structure, you can save it and expose only the parameters you want (and still go in at any time and change all the parameters).
I bet a node wrapping libraw could easily open those node editors up to exactly what you want, @mbs.
the guys over at acescentral develop their "new" display-rendering transforms partly in nuke, iirc.
What I don't like so much about node-based pipelines is that the usual flow-control constructs every programmer knows (tests, loops, conditional loops, breaks, …) become a pain to use with nodes.
For instance, doing a simple loop with a break inside will be translated into a very complicated graph of nodes.
To be honest, I must admit I've never seen the appeal of node-based pipelines compared to a basic scripting language, where flow control is usually straightforward.
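As a concrete illustration of the "loop with a break" case, the script version is trivial, while a graph needs explicit iterator/compare/switch machinery for the same thing:

```python
# the script version: three lines of ordinary flow control
def first_over_threshold(values, threshold):
    for v in values:
        if v > threshold:
            return v          # the "break" out of the loop
    return None

# the node-graph equivalent of the same logic typically needs an
# iterator node, a compare node, a switch/merge node and a feedback
# edge back to the iterator -- four nodes plus wiring, for what the
# script expresses in three lines.
```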
In the late '90s, we were taught image processing with a node-based graphical system.
It became hell, as soon as you wanted to do more than connect simple processing nodes together.
I've seen plenty of videos of people using Blender who design quite complicated effects with nodes. They really end up with huge graphs. I admire them for their patience, but I don't understand how they can manage/maintain that in the long term
(I'm not even sure you can attach comments to individual nodes).
that's the structure in vkdt too. i allow cycles to process animations/temporal feedback loops.
yes.
indeed. i think nodes are fun for illiterate users who just want to quickly bash something together. they'll leave the actual programming (inside the node) to someone who knows what they're doing. that said, for a workflow-based application i'd probably want to hide as much of the graph processing from users as i can, and just work with very coarse presets/fixed blocks/a set of expert-curated graphs to pick from.
the node processing maps 1:1 to gpu processing. every vkdt node has exactly one compute/draw shader and can have multiple textures bound as input and multiple storage images as output. the gpu texture unit does all the work of managing pixel formats. to implement even slightly interesting control flow, as you pointed out above, i sometimes need quite a bunch of nodes. i hide these behind "modules", which are what is exposed to the user currently. that would be something like "local contrast", which actually consists of some 20 nodes internally. on top of that there are "presets" and "blocks", which consist of multiple modules with interconnections and can be inserted into an existing graph as a whole.
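a toy sketch of that module/node split (hypothetical python, not vkdt's actual code; the "local contrast" maths below is made up purely to show the grouping of many internal nodes behind one exposed parameter set):

```python
class Node:
    """One internal processing step (in vkdt this would be a shader)."""
    def __init__(self, name, fn):
        self.name, self.fn = name, fn
    def run(self, x, params):
        return self.fn(x, params)

class Module:
    """User-facing unit: a few exposed params, many internal nodes."""
    def __init__(self, name, nodes, exposed):
        self.name, self.nodes, self.exposed = name, nodes, exposed
    def run(self, x):
        for node in self.nodes:        # internal linear sub-graph
            x = node.run(x, self.exposed)
        return x

# a fake "local contrast": two internal nodes, one exposed "strength"
blur   = Node("blur",   lambda x, p: x * 0.5)   # stand-in for a real blur
recomb = Node("recomb", lambda x, p: x + p["strength"] * (x - x * 0.5))
local_contrast = Module("local-contrast", [blur, recomb], {"strength": 1.0})
```

the user only ever sees `local_contrast` and its `strength` slider, never the internal wiring.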
the thing with gpu programming is that control flow/branches/complex logic doesn't really work well there in the first place (because of the heavy SIMT processing and branch divergence, as well as incoherent memory accesses). so usually i'd turn my problem upside down to match a more simplistic processing logic before i even start putting it into code/nodes.
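as a toy example of that kind of restructuring, here is a per-pixel branch rewritten as a select (hypothetical python; on a real gpu the mask would come from `step()`/`mix()` rather than a conditional, so all lanes execute the same instructions):

```python
# branchy per-pixel logic: diverges on a GPU when neighbouring
# pixels take different sides of the knee
def tonemap_branchy(v, knee):
    if v < knee:
        return v
    return knee + (v - knee) * 0.5

# the same logic branch-free: compute both sides, blend with a mask
def tonemap_branchless(v, knee):
    m  = 1.0 if v < knee else 0.0   # on a GPU: step(), not a jump
    lo = v
    hi = knee + (v - knee) * 0.5
    return m * lo + (1.0 - m) * hi
```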
afaik most directed graph concepts let you put comments somewhere within/around the structure. Sure, code is more practical and efficient from a programming side.
I think there is a distinction between āregular usage of codeā for processing images and ādeveloping efficient codeā for processing images.
A directed graph is leaning more into the āregular usageā side with limited capabilities of ādeveloping codeā.
Since a node could (or maybe even must, as a requirement) also contain code, I see no massive downsides to directed graphs, and they would address some/most of the issues that @mbs detailed.
A directed graph can be expressed in a structured text language, and a text language can be expressed as a DG (as a graphical image).
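A toy example of that equivalence, using a made-up one-line-per-edge text form (the `->` syntax is invented here for illustration):

```python
# a directed graph written as structured text, one edge per line
text = """\
load -> denoise
denoise -> sharpen
sharpen -> export
"""

def parse_edges(src):
    """Parse the text form back into an edge list."""
    edges = []
    for line in src.strip().splitlines():
        a, b = (s.strip() for s in line.split("->"))
        edges.append((a, b))
    return edges

edges = parse_edges(text)
# and back again: the same structure serialises to the same text
roundtrip = "\n".join(f"{a} -> {b}" for a, b in edges)
```

The two forms carry identical information; only the presentation (drawing vs. text) differs.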
I last used DGs seriously about 30 years ago. They were invaluable for communicating complex technical relationships within engineering teams, and how those relationships changed as the project developed. They helped us understand how relationships could or should be rearranged for performance, system integrity and so on.
When flattened into stupidly large and complex graphics, we used them to impress, baffle and confuse senior management, ie project sponsors.
Last year I made a perspective matching system with Blender and Sverchok:
Keep in mind that both Blender and Sverchok are not my everyday thing.
Butā¦, it does function.
I could make this work without knowledge of complicated matrix calculations. I used a lot of nodes for translations and rotations (especially the use of quaternions).
I would love to do something like this with G'MIC, but G'MIC is unfortunately too complicated for me to program in.
This is what audio/video pipelines, like DirectShow, have been doing for years, and it works well! In this ecosystem, plugins are typically called codecs.
You have to be careful about interface "contracts"; maybe define categories, for example, to ensure a kind of consistency. A well-thought-out framework is mandatory.
GNU Radio is another very good example.
Well, in the vkdt module example given, it seems a module is almost a plugin already.
The GLSL code can be compiled by the program itself and passed to the GPU.
The interface seems to be described by a simple text file, so vkdt would be able to read it and create the UI for the module, maybe with some limits in functionality, but enough for testing.
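To illustrate the idea, here is a sketch of generating slider specs from such a description (this is a hypothetical parameter file format invented for the example, not vkdt's actual one):

```python
# made-up parameter description: name, type, default, min, max
desc = """\
param: exposure float 0.0 -5.0 5.0
param: contrast float 1.0  0.0 2.0
"""

def build_ui_spec(src):
    """Turn each 'param:' line into a slider description for the GUI."""
    sliders = []
    for line in src.strip().splitlines():
        _, name, typ, default, lo, hi = line.split()
        sliders.append({"name": name, "type": typ,
                        "default": float(default),
                        "min": float(lo), "max": float(hi)})
    return sliders
```

The host application could build a generic slider per entry, which is limited but enough to exercise a third-party module.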
As you say, it would be great to be able to test a contribution like sigmoid without waiting for the developers to add it to the mainstream (if they ever do).
As long as you are using an unsupported plugin, any problem it causes would not be attributed to the mainstream modules.
A way to wrap those plugins in a security shell and catch their exceptions, so the user can be told the problem comes from the plugin, would be needed.