Working with afre_cleantext filter and G'MIC plugin

This is how it looks flattened and binarized, at a 127 KB file size. It is readable, which is the ultimate goal. But some of the text outline is missing, and the font outline needs more improvement beforehand so it’s not partially lost at binarization.

Ideally the font outline should be filled without defects, and during cleanup its pixel color range should be narrowed to values close to black, so that parts of it are not cut off to white later at binarization, leaving small visible defects in the outline. As a side effect, the background might also end up with close-to-black pixels, and removing those may then require a different filter such as despeckle. :sweat_smile:
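
To make the idea concrete, here is a minimal sketch in plain Python of narrowing the tonal range before binarization. It is not what afre_cleantext does internally; the function names, the tiny sample "page", and the cut-off values 100 and 220 are all assumptions picked for illustration.

```python
def stretch_levels(gray, black_point, white_point):
    """Map black_point..white_point linearly onto 0..255, clipping outside."""
    span = white_point - black_point
    out = []
    for row in gray:
        new_row = []
        for v in row:
            scaled = (v - black_point) * 255 / span
            new_row.append(int(min(255, max(0, round(scaled)))))
        out.append(new_row)
    return out


def binarize(gray, threshold=128):
    """Pixels darker than the threshold become ink (0), the rest paper (255)."""
    return [[0 if v < threshold else 255 for v in row] for row in gray]


# Hypothetical 4x4 grayscale patch: solid ink (35, 40) next to a faint,
# washed-out part of the outline (140, 145) on light paper.
page = [
    [230, 225, 228, 231],
    [226, 140,  40, 229],
    [229,  35, 145, 227],
    [232, 228, 226, 230],
]

# Plain thresholding turns the faint outline pixels white.
plain = binarize(page)

# Pulling mid-dark tones toward black first keeps them as ink.
# The cut-offs 100 and 220 are assumed values, not tuned settings.
narrowed = binarize(stretch_levels(page, black_point=100, white_point=220))
```

The same stretch also darkens any mid-gray background noise, which is why a despeckle step may be needed afterwards.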

To the site admin: since this site can’t display DjVu files when they are uploaded, there is little sense in blocking links to other hosting sites for DjVu files, unless you enable showing them here. :blush:

It is too late here in India now. I will try to post another TIFF file tomorrow, if you are interested.

No problem. I’m definitely interested. :sleeping:

OK. Here is a TIFF file obtained using the RawTherapee 5.7 programme. The advantage is that you can apply the attached processing file to all images without having to open each one of them.
It is a 16-bit TIFF file generated from an 8-bit JPG file, so the same processing applied to an original 16-bit file may yield a better result.

Processing file: 484RawTherapee-1.tif.out.pp3 (11.9 KB)

The text recovery is quite noticeable, but I was unable to remove enough background noise without losing some of the text outline, even with intermediate processing in a specialized book restoration package before extra processing in a DjVu cleaner and converter. In other words, we’re facing the same task: narrowing the color range of all the text outline pixels closer to black before binarization.

Besides, this RawTherapee TIFF can’t be converted directly by the DjVu tools for some reason, so it has to be re-saved as TIFF in another graphics package first.

As mentioned, there are many ways to reach the goal. This time I am not using afre_cleantext but afre_contrastfft (I haven’t written the GUI part yet).

It looks like you’re closer to what @sambul81 wants than I thought.

It may be; here’s a 50 KB DjVu page to support that.

Actually, there is a large community of book lovers who want “what sambul81 wants”. :joy: And it was obvious from the start: the man is highly intelligent and a bright talent. You guys rock!

Looks like I don’t need the collab after all. Don’t get too excited: I haven’t released the GUI yet.

I know, because… here there are 2 test sets. Hope you won’t forget about… :crazy_face:

There are a few more processing steps that could be added: lightness/contrast and erode/dilate. Then afre would have a working alternative to afre_cleantext.
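
For anyone unfamiliar with erode/dilate, here is a rough sketch in plain Python of what those two steps do to dark text on light paper. It is only an illustration of the general technique, not the code of any afre filter; the 3x3 window, the helper names and the sample images are assumptions. Note that "erode" here is a minimum filter on pixel values, so it thickens dark ink, while "dilate" is a maximum filter and thins it.

```python
def _filter3x3(img, pick):
    """Apply pick (min or max) over each pixel's 3x3 neighbourhood.

    Windows are simply truncated at the image border.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[yy][xx]
                for yy in range(max(0, y - 1), min(h, y + 2))
                for xx in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(pick(window))
        out.append(row)
    return out


def erode(img):
    """Minimum filter: dark ink grows by one pixel in every direction."""
    return _filter3x3(img, min)


def dilate(img):
    """Maximum filter: dark ink shrinks by one pixel in every direction."""
    return _filter3x3(img, max)


INK, PAPER = 0, 255

# Hypothetical stroke with a one-pixel break in the middle (x = 3).
stroke = [[PAPER] * 7 for _ in range(5)]
for x in (0, 1, 2, 4, 5, 6):
    stroke[2][x] = INK

# Erode then dilate: the ink grows over the break, then shrinks back,
# leaving the break filled.
repaired = dilate(erode(stroke))

# Hypothetical isolated dark speck on clean paper.
speck = [[PAPER] * 5 for _ in range(5)]
speck[2][2] = INK

# Dilate then erode: the speck is wiped out before the ink can grow back.
cleaned = erode(dilate(speck))
```

The order matters: erode-then-dilate closes small gaps in strokes, while dilate-then-erode removes small specks but can also eat thin strokes, so the two usually need balancing against each other.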

I wonder if there has been any progress on the new “immature” plugin lately? :yum:

I have found that the easiest way to eliminate bleed-through of text from the other side of the page is to place a black sheet of paper behind the page being scanned. I use it all the time and it works great.

Could you point to a suitable black material on eBay or similar? Or where did you get your sheets?

@Bilbo Yes, it is all about technique as I have been saying all along.

@sambul81 I have decided not to. You should improve your scanning technique first. Call it tough love. :slight_smile:

And just one more thing: if you ever want to improve on the current technique, you could try to learn G’MIC and contribute. Ever since learning G’MIC, I’m far more independent in that respect. On a side note, I still need to learn C++ to finish some of my own needs, since no one else wants to take them up.

I participate on other forums too. Some folks solicit commercial services for a fee on these forums despite this being strictly prohibited by the rules. I was openly objecting to this abuse. In retaliation, the guy contacted every software developer around the world whose work we discussed on that forum, asking them not to improve their free packages. The motive is clear - he was facing a loss of illegal income. He might be a member of this forum too. However, he’s not the only one.

Some “free” software devs abuse their access to users sharing knowledge on such forums to develop improved software that they then offer for a fee to some companies. The technique is primitive: complain that you’re poor and unable to survive, solicit samples and detailed info about the problem and the expected resolution from the few people who hold that knowledge, do all this by extending fake promises, and once you have collected enough info, show your real face to the forum. That works well for some too, if such conduct is allowed by the forum admins.

Another interesting possibility is using this forum to advertise skills you don’t actually have. For example, one can use a popular graphics editor to clean up someone’s sample, and then say they developed their own code to do that. Of course, when asked to show such code, what will a suitable answer be? Empty-handed… substituting some rude redirection for proof.

I want to tell you, btw, that you should address people in the manner they address you. You can’t dictate to people what to do, you can’t be rude - it’s against forum rules - and you shouldn’t give unsolicited advice when asked about totally different matters. I did not scan these pages at all, because the source doesn’t have an accessible scanner. That’s clear from my notes above and the photos, so your comments about scans are not only unsolicited, but also irrelevant to the conversation and quite stupid, to be honest.

This is a very bad habit. Don’t invite people to be rude to you by abusing your access to forum members and their polite, honest attitude.

@sambul81 it isn’t clear what conspiracy theories you’re floating here and it doesn’t matter, really. Nobody is obliged to provide you anything at all, just as you are not obliged to provide them anything either. Whatever you provide is of your own volition. What you do on other forums and how people treat one another on those forums is not of consequence here.

We are all here because we have a common interest and generally enjoy solving problems around imaging. If someone happens to solve your problem 100%, that is fantastic. If they only solve it 1% or even zero, but share along that way, that is also good.

There is no floor or ceiling to sharing and everyone should share what they want, whether that is 100%, 1%, or 0%.

Seems like you got somewhere between 1 and 99%; you should be thankful.

Rudeness is not welcome here. You’ve already been messaged once; let’s not go for more.

May I politely ask what I should be thankful to @afre for? Pls don’t try to misrepresent my reply as rude, since it’s a straight question. This forum’s declared goal is promoting free software and free graphics advice. In this post he claimed that he had developed a plugin and showed a corresponding picture, but failed to give any guidance or explanation - and that’s the ONLY reason members share info here. Then later, again without any explanation of why he failed to deliver on his earlier plugin promise, he suddenly assumed the role of a teacher telling minors how to behave in school.

Who is this guy to begin with, to be so rude to others? His input to resolving my problem is exactly ZERO, not between 1 and 99% as you claim. The fake picture he posted can’t be used for anything at all, since no comments were provided on how to achieve that result.

I started this thread to discuss how to improve a photograph, and he is suddenly teaching me how to scan papers? The real substance of the issue under discussion is replaced by some rude “advice” after fake plugin advertising. What should I be grateful to him for? By encouraging fake ads for non-existent skills and changing the discussion topic, you can’t promote an honest attitude here. Such conduct should not be supported by forum staff!

Most of the reference code is available in his gmic filter, called afre_cleantext, which is available here: gmic-community/afre.gmic at master · dtschump/gmic-community · GitHub

Have you tried it?