Working with the afre_cleantext filter and the G'MIC plugin

Here are two test sets. Reference books are often printed on very thin paper, so the print on the back of a page shows through. Some areas on both the front and the back of a page are highlighted with a black background; the front highlighting must be preserved and the back highlighting removed. The side “finger” areas marked “Code” must remain untouched. On some pages the back text also overlaps a photo or illustration on the front, so it is desirable to remove that text while keeping the illustration fully untouched.

https://workupload.com/file/jyxGvv3G
https://workupload.com/file/npm5Tx8h

Binarization is still often applied when converting to DjVu to keep the book size small: about 60 MB for a book of 1000 clean, content-rich pages, as in the linked example. Print showing through from the back side can be removed without damaging the front-side print by an algorithm that identifies mirrored letters.
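To make the idea concrete, here is a rough sketch in plain Python. It does not recognize mirrored letter shapes; it swaps in a simpler technique, comparing the front scan with the horizontally mirrored scan of the back side, and it assumes both scans are available, the same size, and already aligned. The function name and the `ink` and `paper` thresholds are my own placeholders, not anything taken from afre_cleantext.

```python
def remove_show_through(front, back, ink=80, paper=200):
    """Whiten front-page pixels that look like show-through from the back.

    front, back: equally sized grayscale scans as lists of rows
                 (0 = black, 255 = white).
    ink, paper:  hypothetical thresholds, chosen only for illustration.
    """
    cleaned = []
    for front_row, back_row in zip(front, back):
        # Print on the back appears left-right mirrored when seen from the front.
        mirrored = back_row[::-1]
        new_row = []
        for f, b in zip(front_row, mirrored):
            # Faint grey on the front exactly where the back has real ink:
            # treat it as show-through and replace it with white.
            if ink < f < paper and b <= ink:
                new_row.append(255)
            else:
                new_row.append(f)
        cleaned.append(new_row)
    return cleaned
```

Dark front pixels are never touched, so front text and front highlighting survive; faint grey with no matching ink on the back is also left alone, since it may belong to a front illustration.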

The original small-font text does not seem to be exactly black, yet I was able to improve it a little with your filter. I hope that, if the full color is preserved, you can improve the code so that the letters stay fully visible while the background is fully removed. One way to achieve that is to improve the letters with unsharp masking, thickening, smoothing, and despeckle algorithms, and then make the colored letters black within their entire outline.
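A minimal sketch of part of that chain, in plain Python on a grayscale page stored as lists of rows. It covers only thresholding, despeckling, thickening, and filling the letters with pure black; unsharp masking and smoothing are left out to keep it short. All function names and the threshold of 128 are assumptions for illustration and are not how afre_cleantext works internally.

```python
def binarize(gray, threshold=128):
    """True where a pixel is dark enough to count as ink."""
    return [[v < threshold for v in row] for row in gray]

def _neighbours(mask, y, x):
    """Count ink pixels among the 8 neighbours of (y, x)."""
    h, w = len(mask), len(mask[0])
    return sum(
        mask[j][i]
        for j in range(max(0, y - 1), min(h, y + 2))
        for i in range(max(0, x - 1), min(w, x + 2))
        if (j, i) != (y, x)
    )

def despeckle(mask, min_neighbours=1):
    """Drop ink pixels that have fewer than min_neighbours ink neighbours."""
    return [[mask[y][x] and _neighbours(mask, y, x) >= min_neighbours
             for x in range(len(mask[0]))] for y in range(len(mask))]

def thicken(mask):
    """Dilate by one pixel: a pixel is ink if it or any neighbour is ink."""
    return [[mask[y][x] or _neighbours(mask, y, x) > 0
             for x in range(len(mask[0]))] for y in range(len(mask))]

def clean_letters(gray, threshold=128):
    """Return a page with letters pure black (0) and background white (255)."""
    mask = thicken(despeckle(binarize(gray, threshold)))
    return [[0 if ink else 255 for ink in row] for row in mask]
```

Despeckling runs before thickening on purpose: if the order were reversed, isolated specks would be grown into blobs that no longer look isolated.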

After cleanup, the text can be flattened by a different restoration package, and the white background can then be replaced, when converting to DjVu, with an easier-to-read grey or light brown background.
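The background swap itself is just a color mapping. The sketch below shows it on a black-and-white raster in plain Python; a DjVu encoder would normally handle the background layer in its own way, so this only illustrates the idea. The function name and the light-brown RGB value are placeholders I picked.

```python
def tint_background(binary, paper_rgb=(240, 230, 210), ink_rgb=(0, 0, 0)):
    """Map a black/white page to RGB, replacing white with a paper tint.

    binary:    rows of 0 (ink) or 255 (background).
    paper_rgb: hypothetical light-brown value; pick whatever reads best.
    """
    return [[ink_rgb if v == 0 else paper_rgb for v in row] for row in binary]
```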