AI Model Integration Policy

While it is true that “the more said, the more can be argued”, your very short statement is also easily questionable, as non-ML modules such as composite or retouch (or even masked exposure, for that matter) can already be used to alter images.

I think that what Pascal is trying to do here is to build a reference framework to reduce the perceived arbitrariness of future judgement calls that will have to be made on these subjects. He has tried (quite successfully, imho) to stay general while providing enough examples to make the criteria operationally useful.

5 Likes

@Pascal_Obry
Perhaps due to the division and general back-and-forth around AI, it would be helpful to indicate in this document that it is a reflection of your guiding principles.

I have seen a couple of replies here pointing out inconsistencies in phrases or word usage. It is tempting to try to perfect this document as some kind of universal guide, but I think it is ultimately a glimpse into your guiding principles, and that is helpful in and of itself.

Ultimately, you are the maintainer, and even if your logic is not perfect, it is helpful to see it written out and made clear for reference.

Personally, I would be happy with a long write-up of your feelings about photo editing and what you enjoy/dislike, a bit of your history regarding photography, technology adoption, etc. I think it would be as helpful as, if not more helpful than, this more narrowly scoped document when it comes to understanding the actual mind behind darktable’s maintenance.

It was nice to see the Execution and Privacy section!

@Pascal_Obry When you use the word “scene” do you refer to the sensor data in the imported image file? Or, do you mean the real world scene captured by the sensor?

The draft so far looks good, with caveats on specificity and word definitions that others have brought up.

I think it’s also completely fair to treat this as a living document that can / will change as technology and contributions change.

3 Likes

I was going to say something similar myself. We don’t need to perfect it from the get-go. It will undoubtedly go through many revisions, but I think it’s important to have a statement of intent and to create a starting point for the overall philosophy.

As with lots of other software, a hard line is being drawn at “generative AI”, while accepting that AI used for other tasks is ok. As long as this is clear, other examples and clarifications can be tackled on a case-by-case basis as they arise.

1 Like

This exactly.

2 Likes

That is a high standard. I am not even sure many of the non-ai modules can get that far. This makes me rethink much of the content of the policy.

Not sure it is a high standard. It is the exact definition of photography as it was created, that is, “write with light”: at the time on glass negatives, then negatives, and now digital sensors. The way to capture the scene has always been the same: record the light. That it is easy today to change the scene does not make doing so a good thing (or a bad thing), but at the very least we move away from photography when we change too much of the content of the picture.

2 Likes

I think we simply can’t write a text that will catch all cases now and in the future (without it becoming as incomprehensible as legal texts).

But this is now very good and makes clear what we want and where we see severe problems.

8 Likes

@Pascal_Obry Given things like ‘Spirit’ photography, the ‘solarization’ of Lee/Ray (which was discovered in the 1800s), and the bevy of effects and processes from the darkroom (Jerry Uelsmann being an extreme example), I am not sure fidelity to the real-life scene has ever been a must. Even Ansel Adams’ later prints of El Capitan made significant alterations to what was ‘written’ in light at the film plane. But reasonable people can differ as to what counts as photography.

Philosophical discussions aside, from a pragmatic point of view:

  1. I do not know how a tool which only performs mathematical operations on a set of values could align to the reality that is supposed to be captured.

  2. I am not sure that all our existing tooling can live up to the standards/Guiding Principle.

  3. Including Content Aware fill (even with its restrictions) makes an exception to the principles.

If the policy is meant to help people know what can and cannot be done, but the policy makes (seemingly) arbitrary rules and exceptions, I am not sure if it can accomplish its end.

Maybe just forgo everything but the use case examples, which have already been fleshed out in more detail since the first iteration. Note that the list is not exhaustive, and consider every candidate on a case-by-case basis.

1 Like

For clarity, it might be good to change some of the language around what’s not permitted or not allowed. That means a use case isn’t supported or a code/model contribution won’t be accepted, depending on the context. But I don’t think it means disallowing user behavior.

I want to ask about another kind of content-aware fill that is not currently allowed by the policy (“removing objects may be acceptable when the resulting area is reconstructed exclusively from existing image data”). The scenario is when data not present in the image is needed to complete the fill, but there is only one valid answer. For example, if a scanned photo has a defect that obscures someone’s eye, would we allow a model that can add a correct-looking eye? The point is that any correct fill would look very similar, but (if it’s a side view) the fill can’t be accomplished with the type of data that’s already in the photo. Is this use case supported? IMO this is different from a content-aware fill of a skyline that could add a hill, a building, or a tree. The latter is clearly not supported.

What you describe sounds like generative AI to me because it’s creating and adding a feature that was never in the original scene.
The Retouch module in Darktable can already be quite powerful for retouching work (and in conjunction with the Enlarge Canvas module can also add details that weren’t in the original scene), but at some point we should remember that Darktable is a RAW editor, and some jobs are best left to pixel editors like Gimp.

1 Like

This is a fun case. Here is an argument for such an ai-tool:

  1. The reason why the job is best left to the Gimp is that it requires less effort and time, relative to dt.
  2. Reducing the time/energy of tasks that dt is already able to perform is a valid reason to use an ai-enabled version of the tool (see ai masking).
  3. The composite module already enables you to fix the defect via importing pixels from another image.
  4. The portion of the image affected by the defect was found in the original scene (using Pascal’s conception of ‘scene’).
    Therefore,
  5. An ai-enabled tool that fixes the defect without being limited to the existing pixels in the image should be allowed.

But @europlatus is right that such a tool would probably not be merged.

I have commented that the policy is looking really nice.
But a couple of conversations last week about photography and AI got me thinking…

And maybe @AtaraxicShrug was also hinting at this… Maybe he is not.

Photography is an art form.

But from a pure art perspective, why would it be a problem that someone replaces the sky of an image? Why would it be an issue if unwanted objects, like people, are removed? Why would it be a problem if we create very surreal images with darktable? Why would ‘painting-like’ generating modules be a problem?

So from that perspective, why would ‘we’ create a policy that in the end limits the freedom of the user to do with the images what he wants?

So is there a real need for a policy? I mean, even if these modules were in darktable, a user would not have to use them.

Using these techniques only becomes an issue when the user who creates these images claims they are real… But that is something that darktable cannot stop. And the majority of us using it are not creating World Press photos, are we?

So what are we fussing about?

2 Likes

After following this new thread, I somehow get the impression that some topics act as a kind of trigger, followed by the full disassembly of an idea, perhaps driven by the wish to optimize and polish each and every thing to its very sad end. Sorry if I’m stepping on someone’s toes, but it is getting really annoying from my perspective.

Therefore I would like to kindly suggest that we just take a breath, stop iterating, and simply watch how @Pascal_Obry develops this policy over time. Perhaps it will be nourished by other discussions here and there. Perhaps he will ask for further input. I really appreciate that he, as the maintainer of this wonderful project, started this kind of clarifying policy and openly shares his vision regarding future development and some boundaries within this project. And from my perspective the current state of the policy is indeed already very fine, since a vision is a vision and doesn’t need to spell out each and every thing.

Sorry for the noise.

4 Likes

FIAP and PSA rules (as well as those of other such organizations) explicitly prohibit the use of generative AI in photo contests under their patronage. That said, many prize-winning images are composites of multiple photographs, and it’s sometimes hard to tell which was the main photograph in the final image. But all the underlying photographs must be captured by the author. That’s the line they draw.

Well said. There is always the danger of over-policing the use of AI in some modules.
Darktable has the ability to turn our photos into works of art (at least in our eyes). You’ve only got to see the variety of attempts in Play Raw. People use DT and other software to add their own artistic touches.
Very early on in these discussions the idea was floated of labeling modules that were produced with assistance from AI.
Then the choice is with the user.
AI is changing daily and it could well be that all these posts on a number of threads will seem to be pointless in the near future.
I recently saw a photo that had a large number of details changed simply by asking ChatGPT, completely removing the need for a raw developer. That could be the future for a number of people.
As long as I can play with gimp, DT and others, then I’m happy.

I think the idea is to limit scope creep more than to limit artistic intent. Darktable is a RAW photo editor; there are some boundaries inherent to that, but there’s also a lot of grey area. Pascal, as maintainer, also has to balance against the project’s other goals like backwards compatibility and reproducibility. This type of document is really just to narrow some of the grey area and make sure that work is focused on and in alignment with the stated goals of the project.

If I put in feature requests to have darktable also be an mp3 player, a video editor, a website creator, a notes app, and more, I think nobody would be arguing that saying no to those is limiting my freedom. Those are all obviously beyond the scope of darktable.

3 Likes

First off, thanks to @Pascal_Obry for taking on the task of setting a policy for what AI tools should be accepted or rejected in darktable. It will provide a clear reference that can guide future decisions. The policy doesn’t say what someone can do to their images, just what they can do to their images using darktable.

To me, one of the most important statements in the policy is:

All AI models must run entirely locally on the user’s machine. Integration of models that require access to external network services (e.g., cloud-based inference, remote APIs, or online dependencies) is not permitted.

darktable has a long history of maintaining compatibility for older edits, and eliminating dependencies on external resources is an important part of that.

The guiding principle is a line in the sand, and makes it pretty clear what will or will not be accepted as a part of the package. Anyone who wants to color outside of those lines, whatever the final lines are, can always do that work in Gimp or Krita.

2 Likes

That’s the right attitude.

The bar should be what’s useful for a raw image editor.