I wrote an AI-powered command-line PP3 generator.

I’m excited to share a new tool I’ve been working on that combines the power of AI with RawTherapee’s advanced RAW photo processing capabilities. It’s called AI-PP3.

What is AI-PP3?
AI-PP3 is a command-line tool that uses artificial intelligence to analyze RAW photos and generate optimized processing profiles (PP3 files) for RawTherapee. Whether you’re working with DNG, NEF, CR2, or ARW files, AI-PP3 can automatically create profiles tailored to your images, saving you time and effort in post-processing.

Key Features:

  • :robot: AI-Powered Analysis: Leverages AI to analyze your RAW photos and generate optimized PP3 profiles.
  • :camera_flash: Multi-Format Support: Works with popular RAW formats like DNG, NEF, CR2, and ARW.
  • :art: Seamless RawTherapee Integration: Automatically processes images or generates PP3 files for manual fine-tuning.
  • :zap: Flexible AI Providers: Supports OpenAI, Anthropic, Google, and xAI, so you can choose the best AI for your needs.
  • :hammer_and_wrench: Simple CLI Interface: Easy to use with straightforward commands and options.

Check out the GitHub repo tychenjiajun/art for more details: AI-PP3 (AI RawTherapee) is a command-line tool that leverages artificial intelligence to analyze RAW photos and generate optimized processing profiles (PP3 files) for RawTherapee.


Hello, please change the name of your generator, as ART is a fork of RawTherapee. It stands for Another RawTherapee.

https://art.pixls.us/

Renaming is on the way


Hello, I don’t have API keys for those AI machines but I’m interested in your work. Can you upload here a raw photo that is opened in RT with the standard profile and another one with your AI optimized profile? Curious!

All pp3 files are AI-optimized based on DSC00503-DxO_DeepPRIME.dng.pp3. The original raw image is DSC00503-DxO_DeepPRIME.dng.

gemma3.pp3 was generated with the Gemma 3 model via OpenRouter
qwen.pp3 was generated with the Qwen model via OpenRouter
mistral.pp3 was generated with the Mistral model via OpenRouter

mistral.pp3 (14.0 KB)


gemma3.pp3 (14.0 KB)
DSC00503-DxO_DeepPRIME.dng.pp3 (14.0 KB)
DSC00503-DxO_DeepPRIME.dng (63.7 MB)

qwen.pp3 (14.0 KB)

Sounds interesting… any chance of support for other raw formats? I fear that if I have to convert my raws into DNG before using your software, managing both the raws and the DNGs will take as much time as the old manual way :wink:

Under the hood, it uses rawtherapee-cli to convert the raw file to a JPEG, and feeds the converted JPEG to the chosen multimodal model. So it should support every format that rawtherapee-cli supports.
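For what it’s worth, that two-stage flow can be sketched roughly like this (my own illustration, not the actual ai-pp3 source; the rawtherapee-cli flags used here are real: -o for output, -p for a profile, -j for JPEG output, and -c, which introduces the input file and must come last):

```python
# Sketch of the pipeline described above: rawtherapee-cli renders a
# preview JPEG, and that JPEG is what the multimodal model actually sees.

def build_preview_cmd(raw_path, out_dir, base_pp3=None):
    cmd = ["rawtherapee-cli", "-o", out_dir]
    if base_pp3:
        cmd += ["-p", base_pp3]  # start from an existing profile
    cmd += ["-j", "-c", raw_path]  # -c takes the input file, last
    return cmd

print(build_preview_cmd("DSC00503-DxO_DeepPRIME.dng", "previews", "base.pp3"))
```

The resulting JPEG would then be uploaded to whichever provider was selected with --provider.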


Hello, thanks for uploading those photos.

I compared the three pp3s with diff and I saw (mostly slight) differences with exposure, color balance, green balance and highlights and shadows.
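For anyone who wants to repeat that comparison, here is a section-aware sketch of it in Python (just an illustration; pp3 files are INI-style, so configparser can load them if key case is preserved):

```python
# Section-aware version of a plain `diff` between two pp3 files.
# optionxform=str keeps pp3's case-sensitive keys intact, and
# interpolation=None avoids choking on literal '%' characters.
import configparser

def load_pp3(text):
    cp = configparser.ConfigParser(interpolation=None)
    cp.optionxform = str
    cp.read_string(text)
    return cp

def diff_pp3(a_text, b_text):
    a, b = load_pp3(a_text), load_pp3(b_text)
    changes = []
    for section in a.sections():
        if not b.has_section(section):
            continue
        for key in a[section]:
            if b[section].get(key) != a[section][key]:
                changes.append((section, key, a[section][key], b[section].get(key)))
    return changes

base = "[Exposure]\nCompensation=0\nBrightness=0\n"
ai = "[Exposure]\nCompensation=-0.3\nBrightness=15\n"
print(diff_pp3(base, ai))
# → [('Exposure', 'Compensation', '0', '-0.3'), ('Exposure', 'Brightness', '0', '15')]
```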

I’m an amateur photographer and no programmer or engineer. I am just wondering where that AI part is, what it does. Or in other words, you stated above:

Why are those three pp3s optimized? I don’t see that at first sight.

On your GitHub page you say:

By combining computer vision AI with RawTherapee’s advanced processing capabilities, AI-PP3 helps you achieve professional-quality results automatically.

You have much more experience with your software. Can you confirm the above quote? Perhaps the photos above are not the best examples to show the power of your AI-PP3 machine?

PS. I’m not an AI hater at all, it can be very useful. (I recently used an AI machine to translate a large part of RawTherapee in Dutch).

Best regards,
Paul.

The result depends on two things: the model’s ability and the prompt. If the result isn’t good enough, it’s because most models are trained to describe images, not write pp3 settings.

The original prompt is located at art/src/prompts.ts at main · tychenjiajun/art · GitHub

Below is an example of using a different prompt to get better differences in the final pp3 file.

ai-pp3 DSC00503-DxO_DeepPRIME.dng --provider openai-compatible --model google/gemma-3-27b-it:free --verbose -k --base DSC00503-DxO_DeepPRIME.dng.pp3 -o dramatic.pp3 -p "Unleash the full potential of RAW image pp3 settings with a dramatic flair. Elevate your creative precision by providing:

1. ANALYSIS: A profound exploration into the image's essence, identifying its strengths and areas for transformation.
2. PLAN: Set ambitious goals for adjustments and articulate a vivid creative vision that transcends ordinary imagery.
3. CHANGES: Deliver precise SEARCH/REPLACE blocks to breathe life into the envisioned transformation.

RULES:
- Respect the original structure & headers as the foundation upon which to build your masterpiece.
- Proceed thoughtfully, modifying only those values necessary to achieve the desired dramatic effect.
- Maintain the document's integrity by never adding or removing sections.

CHANGES FORMAT:
\`\`\`
<<<<<<< SEARCH
[Exposure]
Auto=false
Clip=0.02
Compensation=0
Brightness=0
=======
[Exposure]
Auto=false
Clip=0.02
Compensation=-0.5
Brightness=25
>>>>>>> REPLACE
\`\`\`

Prepare to embark on this transformative journey as the pp3 content to be optimized follows below:
"

The AI analyzes the converted JPEG image using DSC00503-DxO_DeepPRIME.dng.pp3 and responds as follows:

====

RAW Image PP3 Settings - Dramatic Enhancement

1. ANALYSIS:

The image presents a silhouetted cityscape against a muted, slightly cloudy sky. The primary strengths are the strong shapes of the buildings and trees, and the potential for a dramatic mood. The image currently lacks contrast and feels somewhat washed out. The airplane is a small detail that could be emphasized or de-emphasized depending on the desired effect. The goal is to create a more impactful and atmospheric image, leaning towards a slightly surreal or cinematic look. We’ll focus on deepening the shadows, enhancing the contrast, and subtly adjusting the color balance to evoke a sense of mystery and scale.

2. PLAN:

The plan is to transform the image into a more dramatic and visually striking composition. This will be achieved through the following:

  • Increased Contrast: Sharpen the distinction between the silhouetted elements and the sky.

  • Shadow Deepening: Enhance the darkness of the silhouettes to create a stronger sense of depth and mystery.

  • Subtle Color Shift: Introduce a slight cool tone to the sky to complement the dark silhouettes.

  • Airplane Emphasis: Slightly brighten the airplane to draw the eye, but not make it the focal point.

  • Retinex Application: Use Retinex to bring out details in the sky without blowing out highlights.

3. CHANGES:


```
<<<<<<< SEARCH
[Exposure]
Auto=false
Clip=0.02
Compensation=0
Brightness=0
=======
[Exposure]
Auto=false
Clip=0.02
Compensation=-0.3
Brightness=15
>>>>>>> REPLACE
```


```
<<<<<<< SEARCH
[Retinex]
Enabled=false
Str=20
Scal=3
Iter=1
Grad=1
Grads=1
Gam=1.3
Slope=3
Median=false
Neigh=80
Offs=0
Vart=200
Limd=8
highl=4
skal=3
complexMethod=normal
RetinexMethod=high
mapMethod=none
viewMethod=none
Retinexcolorspace=Lab
Gammaretinex=none
CDCurve=0;
MAPCurve=0;
CDHCurve=0;
LHCurve=0;
Highlights=0
HighlightTonalWidth=80
Shadows=0
ShadowTonalWidth=80
Radius=40
TransmissionCurve=1;0;0.5;0.34999999999999998;0.34999999999999998;0.59999999999999998;0.75;0.34999999999999998;0.34999999999999998;1;0.5;0.34999999999999998;0.34999999999999998;
GainTransmissionCurve=1;0;0.10000000000000001;0.34999999999999998;0;0.25;0.25;0.34999999999999998;0.34999999999999998;0.69999999999999996;0.25;0.34999999999999998;0.34999999999999998;1;0.10000000000000001;0;0;
=======
[Retinex]
Enabled=true
Str=25
Scal=3
Iter=2
Grad=1
Grads=1
Gam=1.4
Slope=4
Median=false
Neigh=80
Offs=0
Vart=220
Limd=8
highl=5
skal=3
complexMethod=normal
RetinexMethod=high
mapMethod=none
viewMethod=none
Retinexcolorspace=Lab
Gammaretinex=none
CDCurve=0;
MAPCurve=0;
CDHCurve=0;
LHCurve=0;
Highlights=10
HighlightTonalWidth=70
Shadows=10
ShadowTonalWidth=70
Radius=45
TransmissionCurve=1;0;0.5;0.34999999999999998;0.34999999999999998;0.59999999999999998;0.75;0.34999999999999998;0.34999999999999998;1;0.5;0.34999999999999998;0.34999999999999998;
GainTransmissionCurve=1;0;0.10000000000000001;0.34999999999999998;0;0.25;0.25;0.34999999999999998;0.34999999999999998;0.69999999999999996;0.25;0.34999999999999998;0.34999999999999998;1;0.10000000000000001;0;0;
>>>>>>> REPLACE
```


```
<<<<<<< SEARCH
[White Balance]
Enabled=true
Setting=Camera
Temperature=6504
Green=1
=======
[White Balance]
Enabled=true
Setting=Camera
Temperature=6800
Green=2
>>>>>>> REPLACE
```


```
<<<<<<< SEARCH
[Shadows & Highlights]
Enabled=false
Highlights=0
HighlightTonalWidth=70
Shadows=0
ShadowTonalWidth=30
Radius=40
Lab=false
=======
[Shadows & Highlights]
Enabled=true
Highlights=-10
HighlightTonalWidth=60
Shadows=20
ShadowTonalWidth=40
Radius=45
Lab=false
>>>>>>> REPLACE
```

These changes aim to create a more dramatic and atmospheric image, enhancing the silhouettes and adding a subtle mood shift. Further adjustments can be made based on personal preference.

====

Output pp3 and jpeg


dramatic.pp3 (14.0 KB)

The Analysis and Plan are really impressive! So that’s how the AI works, thanks for clarifying.

Last practical point. I’m not sure if many users of RawTherapee have those API keys for AI machines or know how to obtain them. I’m afraid that might be beyond their scope, as most of them are “ordinary” photographers. That might limit the potential user base of your pp3 generator…

How does the LLM know exactly what value it should use for HighlightTonalWidth, for example?

Using the option --base DSC00503-DxO_DeepPRIME.dng.pp3 includes specific sections of the file DSC00503-DxO_DeepPRIME.dng.pp3 in the prompt. This tells the model that the current HighlightTonalWidth is 60 and shows how the image looks with this setting. The model then tries to change this value to 70.

But 70 might not be the best setting for this image. The result depends on the prompt, the analysis, and the plan. If you use a different base .pp3 file, you might get a different result, and you will not get a HighlightTonalWidth change at all if your .pp3 doesn’t include this parameter.
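To illustrate that last point (again just a sketch of the behaviour described above, not ai-pp3’s code): a parameter can only be adjusted if its (section, key) pair exists in the base pp3, which is easy to check:

```python
def pp3_keys(pp3_text):
    # Collect (section, key) pairs from INI-style pp3 text.
    keys, section = set(), None
    for line in pp3_text.splitlines():
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
        elif "=" in line and section:
            keys.add((section, line.split("=", 1)[0]))
    return keys

base = "[Shadows & Highlights]\nHighlights=0\nHighlightTonalWidth=70\n"
print(("Shadows & Highlights", "HighlightTonalWidth") in pp3_keys(base))  # True
print(("Retinex", "Str") in pp3_keys(base))  # False: absent from this base
```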

It’s fine for me not to have a large user base since this is a free, open-source, experimental program.

You can obtain free API keys from Google and OpenRouter. While these services aren’t very fast, they are still usable. Once you’ve set them up, they can be used indefinitely.

Even without any API keys, users can still manually copy my prompts and paste them into chatbots like ChatGPT to see the results. This command-line program simply makes it much easier to handle batch processing tasks.

Have you considered training your own model, instead of just feeding things to an existing LLM?

Very cool. I’ll take a closer look at the code in the next few days. I’ve actually been working on an end-user “technical” document showing how badly implemented, or flat-out vapor, most “AI” tools for global raw editing are (no offense, I really do hope your tool is elegantly implemented); I absolutely wanted to avoid learning the intricacies of yet another piece of software with functioning “AI” analysis and profile generation. I won’t guess what percentage of stand-alone software promising “AI” tools are just continuing with established algorithms and tools, but I can say with certainty that most cameras, even the most modern, don’t have any meaningful “AI” functionality. The most obvious reason is the lack of internal storage space for the logic to reference and adapt. Canon, for example, advertises “AI enhanced noise reduction”, and it’s just static algorithms. At least they’re being honest about their autofocus now; in their marketing documents they say it’s “trained with data derived from AI”. I guess they found out that more photographers understand how generative and evaluative models work, the amount of storage even the smallest compressed datasets need, and the processing power, volatile memory, and user-specific discrete data needed to make something like a useful AI model operate. The least you would expect from a properly integrated denoise/noise-minimizing logic would be an improvement over time (personally, I think they should provide config tips, like seeing the lens at f/8 and suggesting a wider aperture, a lower EV+/-, or a slower shutter speed, especially in cases where sensor/camera temps are high due to direct sunlight or video recording). But all that aside, the only way to improve the denoise and AF performance on any mirrorless camera is to update the firmware.
Not very AI-y :slight_smile: Anyway, a lot of stuff like this will be in the document. I just wanted an “AI” tool I can run with software I know very well, to show an accurate comparison by processing the same image manually and comparing the results.

Thanks again for putting in this work, and best of luck with your code! I typically find that AI processing is very heavy-handed: shadows, dark areas, and night skies end up very black, with weird artifacts like strangely colored donuts around clipped highlights when there’s a color or luminance gradient and a light or sign that gets clipped in the middle of it.

Oh haha. I guess I can move on and not waste my time setting this up. I was assuming it was much more integrated into the software, with logic to read and react to specific images without user intervention. Not that I had much faith in the code I’ve seen in the AI integrations for GIMP or previous open-source standalone applications, but the way you have this set up, I can likely denoise an image comparably, and faster, without having to run bat files to query off-site models. For reference, manually denoising an image typically takes me less than 60 seconds, maybe 90 if the shadows are really ragged; then I’ll go into RGB curves and taper shadows to zero to fix any persistent RGB noise. (I will check through a few demosaic algorithms first; the native denoise controls definitely react to the way the image is generated.) Don’t know if you need info on how different users approach manual denoising, but I figured I’d offer up some of the more general steps I take.

Also, my actual “compute” time per image, outputting a 32-bit FP TIFF (a 700 MB file) from a 75 MB raw file (36.3 MP Nikon D810, 14-bit lossless uncompressed raws), even with quite a few functions and adjustments running, is about 5 seconds per image. That’s on an HP Z440: Xeon E5-2697 v3, 128 GB DDR4-2133 (quad channel), a PCIe Gen3 NVMe drive running from a PCIe slot for the OS, and a second NVMe SSD for my image files. Hope some of this info helps you benchmark your code or make it more functional. :slight_smile:
Also, I’m not sure what percentage of users export straight to .jpg using RawTherapee. When I first started using it (a LONG time ago now), I really, really didn’t like how the compression functioned in RawTherapee. The difference between a 7 MB 1080x1350 PNG and a 1.5 MB 1080x1350 .jpg is pretty obvious. Once I’m finished with my global edits, I export a 32-bit FP .TIFF file, open it with GIMP, crop/resize as needed, occasionally use the six-color saturation/hue adjustment, then export a 16-bit ARGB .PNG file. With many social media sites restricting images mostly at the pixel level (1350x1080 for anything Facebook/Meta), it should be quite rare for someone to use a lossy format like .jpg when supported “lossless-ish” formats like 16-bit PNG or .TIFF are there.


Training a model needs two main things: a set of training data and a GPU. The training data usually includes thousands of input images and their corresponding output pp3 files. As a single developer, it’s hard for me to get such a large amount of data and the expensive GPUs needed.

Thank you for this work and this information. It is very enriching, and this approach deserves to be known and valued.
I am not (at all) an expert in artificial intelligence, even though it interests me.

I’m speaking for myself; this may not be the opinion of others… For your information and reminder, I’m sick and old (77 years old), an old Frenchman who discovered computers and C++ upon retirement.

As its name suggests, Rawtherapee is primarily Raw processing software… which poses many problems when evaluating the image, since there is no native image. I won’t list the entire process, but will focus on the key points that would be affected by AI (of course there are others…):

  • Which demosaicing method to choose, particularly based on noise.
  • How to optimize Capture Sharpening, which will basically improve the image’s edges if it is noisy.
  • How to denoise (where to evaluate noise), based on what criteria, and with what tools to reduce noise where it is needed and not where it isn’t.
  • How to restore sharper edges to the image after denoising without excessive effects.
  • How to differentiate actions depending on the image, for example highlighting the main subject and blurring/softening less important areas.
  • How to evaluate white balance.
  • How to evaluate image dynamics (DR), black point and white point values, and which algorithm(s) to use from all those available (Log encoding, Sigmoid, TRC, GHS, etc.).
  • How to evaluate the user’s wishes for the final rendering: contrast, saturation, color fidelity, etc.
  • How to evaluate possible cropping and deletions, etc.
  • Etc.

When I was younger (1993), I worked on the cognitive aspect of expert systems. What makes Mr. X a recognized expert? At the time, it wasn’t about making an exhaustive inventory of his knowledge and skills, but rather trying to model (this is very close to AI) how he goes about it, how he approaches the subject (in fact, how to tell AI to approach the subject). In France, some people call this REX (Retour d’Expérience), which is different from “Feedback” because the key is to identify what makes him an expert and then allow other users (or AI) to learn from this approach.

In March 2024, I tried this, but I think it was misunderstood - my English is poor, and without exchanges and discussions with real people, misunderstandings quickly lead to confusion. You can find the (incomplete) summary of this approach on Rawpedia.
Rawtherapee processing challenge

I think that, as Bill Gates rightly says, there is one area - at least in the near future - where AI will not replace humans: algorithms.
What I wrote above is just algorithms: what to do, what to decide, how to guide, etc.

For those interested, I’ve opened a “captur-noise” branch that covers some of the points mentioned above (Capture Sharpening Raw, Denoise, Capture Sharpening after, etc.).

Developing AI in open source software will face major challenges, because to create AI, you need three things:

  • one or more algorithms (the key of the system)
  • a computer that knows how to perform/optimize parallelization. As a reminder, Rawtherapee doesn’t use the GPU but optimizes the CPU based on the number of cores.
  • a knowledge base (data), which we obviously don’t have.

Thank you for reading, and excuse my bad english.

Jacques


Thank you for sharing your workflow. I notice there’s some misunderstanding that needs clarification: ai-pp3 doesn’t handle AI calculations or RAW processing. The AI work is done by the provider, and RawTherapee handles the processing. There’s little to no room for me to optimize the running time of ai-pp3.

Since ai-pp3 just passes the processing tasks to rawtherapee-cli, I’ve added support for TIFF and PNG output formats.

I’ve also included an FAQ document and examples. As shown in the examples, current AI models aren’t great at denoising yet. However, models are improving quickly. GPT-1 wasn’t very capable, but GPT-4 is much better and can now solve Olympiad questions. I believe that one day these models will be able to do much more.

I see that ART has a feature called User Command. I plan to make ai-pp3 a User Command for ART one day, which will integrate it more closely with the raw image processing program’s user interface.


Thank you so much for sharing the work on the Rawtherapee Processing Challenge! I hadn’t encountered this before, but after reviewing it, I downloaded the RAW images and used ai-pp3 to tackle the challenge. You can see the results in the examples section.

Today, when people refer to AI, they typically mean large language models (LLMs) powered by deep neural networks. These are fundamentally different from older expert systems. The AI behind ai-pp3 is a Multimodal Large Language Model (MLLM), which combines text and visual capabilities. This means it can analyze photos, thanks to models developed by companies like OpenAI and Google. For a deeper understanding of how these models work, you can watch this video: Deep Dive into LLMs like ChatGPT.

Before starting the ai-pp3 project, I had only two years of photography experience and no prior knowledge of RawTherapee. However, I had strong expertise in software engineering and neural networks. For about a year, I explored ways to merge modern AI with photography. One day, I noticed that RawTherapee uses the .pp3 file format, which is both human-readable and compatible with LLMs. This inspired me to experiment with using LLMs to process pp3 files—and that’s how ai-pp3 was born.

One major challenge for ai-pp3 is that existing AI models are trained on specialized datasets designed for specific tasks. For example, some datasets focus on Optical Character Recognition (OCR), like this one, while others prioritize image description, such as this dataset. Unfortunately, no existing dataset is tailored for generating photo sidecar files (like pp3), which is the core function of ai-pp3. This gap highlights the need for future work to create or adapt datasets for this purpose.