Semi-Automated Cleaning of Photographed Book Pages

Since my son started school, I am often confronted with the following situation: some book needed for homework is missing, and we get the required pages sent by other pupils’ parents. Typically, my wife receives them via WhatsApp: low resolution, low dynamic range, photographed at a weird angle, and usually far from flat. Now I wonder if there is some semi-automated processing chain that could handle this situation.

I’ve simulated such a photograph with Blender, as I cannot post an original school book page due to copyright. Note that the school books are often of the type where you write directly into the book (like a form).

There are solutions such as unpaper (and lots of proprietary apps as well as some open source solutions) that are made to handle badly scanned pages. However, I have not yet found anything that works well with this type of photographed page.

What I would like to solve semi-automatically is:

  • un-warp the page
  • crop to the page borders
  • generate a gain map from the background to undo the uneven lighting impact

Is there already something that could be used, or does somebody have some implementation ideas? By semi-automated I mean it would be perfectly acceptable to select the page corners and align the outline with splines manually, maybe even to define sample points for the gain map. I tried a fully manual approach with the GIMP warp and cage transform tools, but that did not work well and solved only parts of the issue. Manually aligning gradients to solve the uneven lighting is, however, close to impossible, and sometimes there are shadows on the pages …

I know that I will have to do this for quite a while (my daughter enters school in 1.5 years), so I see real benefit in automating this …

I expect there is a package somewhere to do all that. Meanwhile, I can glue some scripts together that use ImageMagick. The bad news is: they are Windows BAT scripts. Yeah, I know.

Taking just the first problem: “un-warp the page”. Your input image is named warpedPage.png.

%IMG7%magick ^
  warpedPage.png ^
  -fuzz 10%% ^
  -fill Black -draw "color 0,0 floodfill" ^
  -fill White +opaque Black ^
  wonb.png

call %PICTBAT%find4cornSub wonb.png corners wonb_test.png

set corners.

%IMG7%magick ^
  wonb.png ^
  -fill None -stroke Blue ^
  -draw "line %corners.TL.x%,%corners.TL.y% 0,%corners.TL.y%" ^
  -draw "line %corners.TR.x%,%corners.TR.y% %%[fx:w-1],%corners.TR.y%" ^
  -draw "line %corners.BL.x%,%corners.BL.y% 0,%corners.BL.y%" ^
  -draw "line %corners.BR.x%,%corners.BR.y% %%[fx:w-1],%corners.BR.y%" ^
  -stroke None -fill Blue ^
  -draw "color 0,0 floodfill" ^
  -draw "color 0,%%[fx:h-1] floodfill" ^
  -fill Black -opaque White ^
  lined.png

call %PICTBAT%str2lines3 lined.png warpedPage.png undist.png

call %PICTBAT%find4cornSub undist.png corners2 undist_test.png

set corners2.

set X0=%corners2.TL.x%
if %X0% GTR %corners2.BL.x% set X0=%corners2.BL.x%

set Y0=%corners2.TL.y%
if %Y0% GTR %corners2.TR.y% set Y0=%corners2.TR.y%

set X1=%corners2.TR.x%
if %X1% GTR %corners2.BR.x% set X1=%corners2.BR.x%

set Y1=%corners2.BL.y%
if %Y1% GTR %corners2.BR.y% set Y1=%corners2.BR.y%

echo %X0%,%Y0%, %X1%,%Y1%

set COORDS=^
  %corners2.TL.x%,%corners2.TL.y% %X0%,%Y0% ^
  %corners2.TR.x%,%corners2.TR.y% %X1%,%Y0% ^
  %corners2.BL.x%,%corners2.BL.y% %X0%,%Y1% ^
  %corners2.BR.x%,%corners2.BR.y% %X1%,%Y1%

%IMG7%magick ^
  undist.png ^
  -distort Perspective "%COORDS%" ^
  undist2.png

find4cornSub.bat is published at Find four corners, and str2lines3.bat is published at Straightening two lines.
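
For anyone porting the batch logic to another platform, the choice of the target rectangle and the control-point string passed to -distort Perspective could be sketched in Python as below. This is only a sketch that mirrors the comparisons in the batch script; the function names and the corner values are made up for illustration and are not part of find4cornSub.bat.

```python
def target_rectangle(c):
    """Pick the rectangle the way the batch script does.

    c maps 'TL', 'TR', 'BL', 'BR' to (x, y) tuples. Each `if A GTR B set A=B`
    in the batch script keeps the smaller of the two values.
    """
    x0 = min(c["TL"][0], c["BL"][0])
    y0 = min(c["TL"][1], c["TR"][1])
    x1 = min(c["TR"][0], c["BR"][0])
    y1 = min(c["BL"][1], c["BR"][1])
    return x0, y0, x1, y1


def perspective_coords(c):
    """Build the 'srcX,srcY dstX,dstY ...' string for -distort Perspective."""
    x0, y0, x1, y1 = target_rectangle(c)
    targets = {"TL": (x0, y0), "TR": (x1, y0), "BL": (x0, y1), "BR": (x1, y1)}
    pairs = []
    for key in ("TL", "TR", "BL", "BR"):
        sx, sy = c[key]
        dx, dy = targets[key]
        pairs.append(f"{sx},{sy} {dx},{dy}")
    return " ".join(pairs)


# Hypothetical corner positions, only for demonstration.
corners = {"TL": (12, 20), "TR": (590, 8), "BL": (5, 790), "BR": (600, 800)}
print(target_rectangle(corners))
print(perspective_coords(corners))
```

The resulting string can be passed as the argument of -distort Perspective in place of %COORDS%.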

Here is the result, undist2.png:

The next steps are simpler. We have made a rectangle, coordinates (X0,Y0) at top-left and (X1,Y1) at bottom-right, so we crop to that. The image has dark text on a light background, so for the gain map we replace each pixel value with the lightest value in a 5x5 sliding window. This is an approximation to the paper background. Divide the image by the paper background to remove the uneven lighting. Finally, stretch the contrast to make a small percentage black and a large percentage white.
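
To make the arithmetic of these steps concrete, here is a minimal pure-Python sketch on a tiny synthetic grayscale image (values 0.0 to 1.0). It is an illustration under assumptions, not what ImageMagick does internally: the window is simply clipped at the image border, and contrast_stretch is only a rough stand-in for -contrast-stretch. All function names are hypothetical.

```python
def max_filter(img, size=5):
    """Replace each pixel with the lightest value in a size x size window."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = []
    for y in range(h):
        ys = range(max(0, y - r), min(h, y + r + 1))
        row = []
        for x in range(w):
            xs = range(max(0, x - r), min(w, x + r + 1))
            row.append(max(img[j][i] for j in ys for i in xs))
        out.append(row)
    return out


def flatten_lighting(img, size=5):
    """Divide the image by its estimated paper background (the gain map)."""
    gain = max_filter(img, size)
    return [[p / g if g > 0 else 0.0 for p, g in zip(prow, grow)]
            for prow, grow in zip(img, gain)]


def contrast_stretch(img, black_frac=0.01, white_frac=0.80):
    """Send roughly the darkest black_frac to 0 and the lightest white_frac to 1."""
    values = sorted(v for row in img for v in row)
    n = len(values)
    lo = values[int(n * black_frac)]
    hi = values[n - 1 - int(n * white_frac)]
    span = hi - lo
    if span <= 0:  # degenerate case: fall back to a plain threshold
        return [[1.0 if v >= hi else 0.0 for v in row] for row in img]
    return [[min(1.0, max(0.0, (v - lo) / span)) for v in row] for row in img]


# Synthetic page: lighting rises from left to right, one thin dark stroke.
W, H = 12, 8
page = [[0.5 + 0.02 * x for x in range(W)] for _ in range(H)]
for y in range(2, 6):
    page[y][6] *= 0.2  # ink is 20% of the local paper brightness

flat = flatten_lighting(page)
final = contrast_stretch(flat)
print(round(flat[0][0], 3), round(flat[0][W - 1], 3), round(flat[3][6], 3))
```

On this toy page the background ends up close to uniform after the division while the stroke stays dark. The result is not exactly flat, because the lightest pixel in a window sits slightly to the bright side of the lighting gradient.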

Replace my final command above with:

%IMG7%magick ^
  undist.png ^
  -distort Perspective "%COORDS%" ^
  +write undist2.png ^
  -crop %%[fx:%X1%-%X0%]x%%[fx:%Y1%-%Y0%]+%X0%+%Y0% ^
  +repage ^
  ( +clone ^
    -statistic Maximum 5x5 ^
    +write gainMap.png ^
  ) ^
  -compose DivideSrc -composite ^
  -contrast-stretch 1x80%% ^
  undist3.png

The result, undist3.png, is:

In the real world, the hardest part is often the first: find the edges of the paper. A photo will often include edges of other pages, the book cover, a dusty white background behind the book, and a heavy dark shadow in the gutter between pages.


Have you tried ScanTailor? (ScanTailor Advanced, 4lex4/scantailor-advanced on GitHub, is the version that merges the features of ScanTailor Featured and ScanTailor Enhanced and adds new features and fixes.)
I use it regularly to geometrically remap pictures of lab books and it’s quite powerful, including nonlinear edges and projections. However, it involves a bit of manual fiddling and I don’t think it has gain map support.

I did try it indeed, but was not successful. Maybe I should try harder … If gain maps are not possible, I could maybe use it as a first step and then Alan’s approach for the final touch. Thanks for the suggestion.

For this example (dark text on light background), a gain map is very easy. If ScanTailor otherwise does what you need, then you might suggest to the ScanTailor author that a gain map facility would be a useful addition.

This looks really promising. I have to study it in detail (I will do this next, but I did not want to leave your reply unanswered for too long). In particular, I need to translate it into something that can run on Linux. Thank you very much for putting this together! Once I have something that runs on Linux, I’ll post it here.

I would guess the window size depends on the page contents. I’ll have a look at how it performs with real worksheets and book pages, and whether I have to tune this size.

@chris: Yes, the best size of the “5x5” window depends on the thickness of the text strokes. Larger text will need larger windows. Making the window too large will reduce the accuracy of the estimated gain map, which probably won’t matter much.
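
A one-dimensional toy example may help show why the stroke thickness matters. This is a hedged sketch with made-up brightness values (paper 0.9, ink 0.1) and hypothetical helper names: when a stroke is wider than the window, the window at its centre sees only ink, so the estimated background goes dark there.

```python
def max_filter_1d(row, size):
    """Lightest value in a sliding window of the given size (clipped at the ends)."""
    r = size // 2
    return [max(row[max(0, i - r): i + r + 1]) for i in range(len(row))]


def stroke(width, total=21, paper=0.9, ink=0.1):
    """A row of paper with one centred ink stroke of the given width."""
    start = (total - width) // 2
    return [ink if start <= i < start + width else paper for i in range(total)]


thin = max_filter_1d(stroke(3), 5)         # stroke narrower than the window
thick = max_filter_1d(stroke(7), 5)        # stroke wider than the window
thick_wide = max_filter_1d(stroke(7), 9)   # same stroke, larger window

print(min(thin), min(thick), min(thick_wide))
```

With the 5-pixel window, the 3-pixel stroke vanishes from the background estimate, while the 7-pixel stroke leaves a dark spot that would be divided out as if it were a shadow. Enlarging the window to 9 pixels removes that spot.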
