Considering a plugin architecture for the next raw processor

Thanks for the feedback! I’m afraid I may have given the impression that I was offering a fully-fledged design proposal. All I have is a “utopian ideal” which I threw out there to see if it inspires anyone. That’s all.

I’d just like to expand on something I had already said, but maybe not very clearly: a framework with the ability to use plugins is not incompatible with offering tried-and-true workflows out of the box, designed by experts and known to work well in most cases.

Making an analogy with Lego: you buy a Lego set and it comes with clear, well-designed instructions to build a particular model. If you want, though, you can deviate from the instructions. (You can also buy a Lego bucket with assorted pieces and no instructions; this, I agree, is useful only to a small subset of model builders.)

To give a more concrete example of a potential benefit of plugins: some people here were (are?) working on a sigmoid mapping curve. If I wanted to try it out for myself, I’d need to pull and compile a patched darktable in parallel with the main version. If their code is not accepted upstream, it becomes very hard for me to keep using it. A plugin architecture avoids all these problems, both for developers and for users.

3 Likes

Thanks for the detailed explanation. I need to take a closer look at vkdt, it looks awesome!

I’d just like to add that I’m thinking of plugins as one step further than modules: the difference is that plugins can be easily added to the framework without recompiling, and they can be distributed and shared on their own. An example is a web browser, where you can easily add/remove plugins, which may be developed independently of the browser framework.
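To make that a bit more concrete, here is a minimal sketch of what runtime loading could look like on a POSIX system. Every name in it (the `rawproc_plugin` descriptor, the `plugin_info` symbol, the `.so` file) is invented for illustration; a real design would need a much richer interface.

```cpp
#include <dlfcn.h>   // POSIX dynamic loading: dlopen/dlsym/dlclose
#include <cstdio>

// Hypothetical descriptor each plugin would export. Names are
// illustrative, not taken from any existing raw processor.
struct rawproc_plugin {
  const char *name;
  int api_version;                         // checked against the host
  void (*process)(float *rgba, int width, int height);
};

int main() {
  // Load a plugin that was built and distributed independently of the
  // host application: no recompilation of the host required.
  void *handle = dlopen("./sigmoid_plugin.so", RTLD_NOW);
  if (!handle) { std::fprintf(stderr, "%s\n", dlerror()); return 1; }

  // Each plugin exports a single well-known symbol describing itself.
  auto *plugin = static_cast<rawproc_plugin *>(dlsym(handle, "plugin_info"));
  if (plugin && plugin->api_version == 1)
    std::printf("loaded plugin: %s\n", plugin->name);

  dlclose(handle);
  return 0;
}
```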

1 Like

Doesn’t this ignore the concept of a pipeline and having things controlled? If, for example, one module controlled gamut or made some adjustments, and subsequent modules were built to expect that, couldn’t you introduce all kinds of issues just by throwing in a random plug-in? I can’t think of a concrete example, and I know you can already mess things up by moving modules around willy-nilly, but wouldn’t you have to know quite well what each module does with respect to input and output, and how that ties in with the other modules? Just thinking out loud…

One of the advantages of using something like Julia is that you have instant access to good math and machine-learning libraries. I’m afraid that at some point ML-based processing, which is now mostly available in commercial software or hard-to-use research tools, will become almost a requirement.

The CUDA stack looks good, but I’m not so sure about support outside of that. Another issue is the GUI.

For the pipeline side of things, I think a node-based approach (à la Blender) works best for graphics.

Yes – that’s the price of increased flexibility, and the reason I listed “well-defined interfaces” as one of the main challenges.

1 Like

Indeed, Julia has great (and improving) support for GPU and distributed computing. Their scientific ML modeling package is probably the best of its kind.

… in vkdt (as in any real-time rendering framework; I think it’s the first feature people implement) you can hot-swap your modules. It’ll reload the changed shader code and rerun it in place, keeping your camera position/parameters/etc. as they were. I know this is not what you’re asking, but it’s great fun for debugging.

2 Likes

If I understand correctly, a pipeline is a special case of a node system. If so, nodes seem a better idea.

Basically, a plugin takes as input image data (in some format, encoding, color space, etc) and a mask, and it outputs a new set of image data.

This seems unnecessarily restrictive. Perhaps a plugin could take any number of inputs, and create any number of outputs. Similarly, any number of UI inputs, and metadata input/output.
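As a sketch of that generalisation (every name below is hypothetical):

```cpp
#include <string>
#include <vector>

// Invented types for illustration only.
struct image_buffer {
  int width = 0, height = 0, channels = 0;
  std::vector<float> pixels;
};

struct node {
  virtual ~node() = default;
  // N buffers in, M buffers out. The classic one-image-in/one-image-out
  // filter is just the special case N = M = 1, and a linear pipeline is
  // just a node graph with no branches.
  virtual std::vector<image_buffer>
  process(const std::vector<image_buffer> &inputs) = 0;
  // UI inputs and metadata could travel through a side channel like this.
  virtual void set_parameter(const std::string &key, float value) {}
};
```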

I suggest thinking carefully about version control, both of the core system and of plugins. For example, what happens when changes to one plugin break another plugin?

Do you want to restrict this to processing raws? Why not generalise to allow any image?

1 Like

Quoting myself:

In other words, go wild! :grinning:

Node-based editing is fun.
Natron, Nuke and Fusion all have communities of people who develop nodes, be it nodes that contain node sub-trees (it’s not really a tree, but for brevity’s sake) or “plugin nodes” that contain custom code for speed.
So if the user likes a particular assortment of nodes in a particular structure, they can save it and expose only the parameters they want (and go in at any time and change all the parameters).
I bet a node wrapping libraw could easily open those node editors up to exactly what you want, @mbs.

The folks over at ACESCentral develop their “new” display rendering transforms partly in Nuke, IIRC.

2 Likes

On nodes: we used to call the general structure a Directed Graph, not necessarily Acyclic. And each node can itself be a DG, of course.

I see no reason why nodes couldn’t use the GPU, with images retained in GPU memory across the arcs between nodes. But I know almost nothing about GPU coding.

What I don’t like so much about node-based pipeline design is that the usual flow-control constructs every programmer knows (tests, loops, conditional loops, breaks, …) become a pain to use with nodes.
For instance, a simple loop with a break inside translates into a very complicated graph of nodes.
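A toy illustration of that blow-up (the node API below is entirely invented):

```cpp
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// As a script: the stopping condition is a single line.
float sharpen_step() {                   // stand-in for a real filter step
  static float delta = 1.0f;
  return delta *= 0.5f;                  // pretend we converge
}

void sharpen_until_converged() {
  for (int i = 0; i < 100; ++i)
    if (sharpen_step() < 1e-4f) break;   // trivial to express in code
}

// As nodes: the same break needs a comparison node, a switch node and a
// feedback edge, all wired explicitly.
struct graph {
  std::vector<std::string> nodes;
  std::vector<std::pair<int, int>> edges;
  int add(const std::string &name) { nodes.push_back(name); return int(nodes.size()) - 1; }
  void connect(int a, int b) { edges.push_back({a, b}); }
};

int main() {
  sharpen_until_converged();
  graph g;
  int step = g.add("sharpen_step");
  int test = g.add("delta < 1e-4");      // the `if` becomes a node
  int gate = g.add("switch");            // the `break` becomes a node
  g.connect(step, test);
  g.connect(test, gate);
  g.connect(gate, step);                 // the loop becomes a feedback edge
  std::printf("%zu nodes, %zu edges just for one loop with a break\n",
              g.nodes.size(), g.edges.size());
  return 0;
}
```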

To be honest, I must admit I’ve never seen the appeal of node-based pipelines compared to a basic scripting language, where flow control is usually straightforward.
In the late ’90s, we were taught image processing with a node-based graphical system.
It became hell as soon as you wanted to do more than connect simple processing nodes together.

I’ve seen plenty of videos of people using Blender who design quite complicated effects with nodes. They end up with really huge graphs. I admire their patience :slight_smile: But I don’t understand how they manage/maintain that in the long term
(I’m not even sure whether comments can be attached to each node).

1 Like

That’s the structure in vkdt too. I allow cycles, to process animations/temporal feedback loops.

Yes.

Indeed. I think nodes are fun for programming-illiterate users who just want to quickly bash something together; they’ll leave the actual programming (inside the node) to someone who knows what they’re doing. That being said, for a workflow-based application I’d probably want to hide as much of the graph processing as I can from users too, and just work with very coarse presets/fixed blocks/a set of expert-curated graphs to pick from.

The node processing maps 1:1 to GPU processing. Every vkdt node has exactly one compute/draw shader and can have multiple textures bound as input and multiple storage images as output; the GPU texture unit does all the work of managing pixel formats. To implement even slightly interesting control flow, as you pointed out above, I sometimes need quite a bunch of nodes. I hide these behind “modules”, which are what is exposed to the user currently. That would be something like “local contrast”, which actually consists of some 20 nodes internally. On top of that there are “presets” and “blocks”, which consist of multiple modules with interconnections and can be inserted into an existing graph as a whole.

The thing with GPU programming is that control flow/branches/complex logic don’t really work well there in the first place (because of the heavy SIMT processing and branch divergence, as well as incoherent memory accesses). So usually I’d turn my problem upside down to match a simpler processing logic before I even start putting it into code/nodes.
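To give a flavour of what turning a problem upside down can mean in practice, here is a generic illustration in plain C++ standing in for shader code (not actual vkdt code): the same per-pixel decision written with a branch, and rewritten as arithmetic selection so that every SIMT lane executes the same instructions.

```cpp
#include <cstdio>

// Divergent version: neighbouring pixels can take different branches,
// which serialises SIMT execution on a GPU.
float tonemap_branchy(float x) {
  if (x < 0.5f) return 2.0f * x * x;
  return 1.0f - 2.0f * (1.0f - x) * (1.0f - x);
}

// Reformulated version: evaluate both sides and blend with a predicate,
// so all lanes run the same instruction stream.
float tonemap_branchless(float x) {
  float lo = 2.0f * x * x;
  float hi = 1.0f - 2.0f * (1.0f - x) * (1.0f - x);
  float t  = x < 0.5f ? 0.0f : 1.0f;     // predicate as a number
  return lo + t * (hi - lo);
}

int main() {
  for (float x = 0.0f; x <= 1.0f; x += 0.25f)
    std::printf("%.2f: %.4f %.4f\n", x, tonemap_branchy(x), tonemap_branchless(x));
  return 0;
}
```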

1 Like

This might spark some ideas: [2008.11476] HipaccVX: Wedding of OpenVX and DSL-based Code Generation

AFAIK most directed-graph systems let you put comments somewhere within/around the structure. Sure, code is more practical and efficient from the programming side.

I think there is a distinction between the “regular use of code” for processing images and “developing efficient code” for processing images.

A directed graph leans more toward the “regular use” side, with limited capability for “developing code”.

Since a node can (maybe even must, as a requirement) also contain code, I see no massive downsides to directed graphs, and they would address some/most of the issues that @mbs detailed.

A directed graph can be expressed in a structured text language, and a text language can be expressed as a DG (as a graphical image).
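A minimal illustration of that round trip, with invented node names: the same three-edge graph as structured text and as an in-memory structure, either of which can be regenerated from the other by a simple parse/traversal.

```cpp
#include <map>
#include <string>
#include <vector>

int main() {
  // Text form, one "from -> to" edge per line:
  //   demosaic -> exposure
  //   exposure -> tonemap
  //   tonemap  -> output
  // The equivalent in-memory adjacency list:
  std::map<std::string, std::vector<std::string>> dg = {
      {"demosaic", {"exposure"}},
      {"exposure", {"tonemap"}},
      {"tonemap",  {"output"}},
  };
  (void)dg;  // a real tool would serialise this back to text or a drawing
  return 0;
}
```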

I last used DGs seriously about 30 years ago. They were invaluable for communicating complex technical relationships within engineering teams, and how those relationships changed as the project developed. They helped us understand how relationships could or should be rearranged for performance, system integrity and so on.

When flattened into stupidly large and complex graphics, we used them to impress, baffle and confuse senior management, i.e. project sponsors.

2 Likes

For reference, DaVinci Resolve already does this.

And it was my impression that vkdt was node-graph-based? (Edit: @hanatos confirmed that.)

Last year I made a perspective-matching system with Blender and Sverchok.

Keep in mind that both Blender and Sverchok are not my everyday thing.

But… it does function. :grin:

I could make this work without knowledge of complicated matrix calculations. I used a lot of nodes for translations and rotations (especially quaternions).

I would love to do something like this with G’MIC, but G’MIC is unfortunately too complicated for me to program in.

The most fun was learning about perspective matching and the use of the trirectangular tetrahedron therein.

1 Like

That is what the Olive video editor does (free/libre).

1 Like

This is what audio/video pipelines, like DirectShow, have been doing for years, and it works well! In that ecosystem, plugins are typically called codecs.
You have to be careful about interface “contracts”; you might, for example, define categories to ensure a kind of consistency. A well-thought-out framework is mandatory.
GNU Radio is another very good example.
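Here is a rough sketch of the kind of interface “contract” meant here, loosely inspired by how such pipelines negotiate formats between stages (all types and names are hypothetical):

```cpp
#include <optional>
#include <vector>

// Invented format descriptor, in the spirit of DirectShow media types
// or GNU Radio stream signatures.
struct pixel_format { int channels; bool linear; };

// A port advertises the formats it accepts or produces; the framework
// connects two nodes only when a mutually acceptable format exists.
struct port_contract { std::vector<pixel_format> formats; };

std::optional<pixel_format>
negotiate(const port_contract &out, const port_contract &in) {
  for (const auto &f : out.formats)
    for (const auto &g : in.formats)
      if (f.channels == g.channels && f.linear == g.linear)
        return f;                 // first mutually acceptable format wins
  return std::nullopt;            // incompatible: connection rejected
}

int main() {
  port_contract src{{{3, true}}}, dst{{{3, true}, {4, false}}};
  return negotiate(src, dst) ? 0 : 1;   // exit 0: contract satisfied
}
```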

1 Like