newbie help, histogram-binning percentage threshold ('auto levels')

Hi,

I’m no math expert, and not native English, so I’m bound to use some terms wrong. Sorry in advance :).

But I’m dipping in g’mic command-line things again, to see if I can convert some imagemagick + vips pipeline to g’mic. Maybe to see how it performs, maybe just to learn g’mic more to add it as a tool in my arsenal.

What I’m looking for is sort of what Photoshop and other tools do with ‘auto levels’. You set a percentage on the lower side and on the upper side.

Let’s assume a 16-bit (0 to 65535) range of a single (grayscale) channel.
What I think I want, is to get a histogram with 65536 bins, so one bin for each possible value.
If I want to know the ‘threshold of 20%’, I mean I want to know the value at which point 20% of the pixels in the image are at or below that value.

So, let’s say I have a 200x200 image, that is 40000 pixels. 20% of that is 8000.
So if I take a histogram with 65536 bins, I want to start at the left and go summing bin by bin until I’ve reached 8000. If that’s bin 13124 as an example, that means that 20% (8000) of the pixels have a value of 13124 or less.
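
For what it's worth, the bin-summing idea above can be sketched in plain Python (not G'MIC). The name `percentile_threshold` and the synthetic pixel list are made up for illustration; this only shows the counting logic, assuming integer pixel values in [0, 65535].

```python
from itertools import accumulate

def percentile_threshold(pixels, fraction, levels=65536):
    """Smallest value v such that at least `fraction` of the pixels are <= v."""
    hist = [0] * levels                 # one bin per possible value
    for p in pixels:
        hist[p] += 1
    target = fraction * len(pixels)     # e.g. 20% of 40000 pixels = 8000
    # Walk the running sum from the left until the target count is reached.
    for value, running in enumerate(accumulate(hist)):
        if running >= target:
            return value
    return levels - 1

# Synthetic stand-in for a 200x200 single-channel 16-bit image.
pixels = [(i * 7919) % 65536 for i in range(40000)]
t = percentile_threshold(pixels, 0.20)
print(t, sum(1 for p in pixels if p <= t))
```

The printed count is the number of pixels at or below the returned value, which should be the first running sum to reach the 20% target.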

I think tools like Photoshop use this for ‘auto levels’. For example, use 0.1% as percentage for the shadow side, then use 99.9% for the percentage at the highlight side, and then you have two values you can use to scale the image in between (a normal levels adjustment in gfx tools).
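
Once the two values are known, the 'levels' step itself is just a linear stretch with clipping. A minimal Python sketch, assuming 16-bit output; `auto_levels` and the sample numbers are hypothetical, and this is not a claim about how Photoshop actually implements it.

```python
def auto_levels(pixels, low, high, max_value=65535):
    """Linearly map [low, high] onto [0, max_value], clipping values outside."""
    if high <= low:
        raise ValueError("high must be greater than low")
    out = []
    for p in pixels:
        clipped = min(max(p, low), high)          # clamp into [low, high]
        out.append(round((clipped - low) * max_value / (high - low)))
    return out

# low/high would come from the 0.1% and 99.9% thresholds; these are made-up values.
stretched = auto_levels([500, 1000, 20000, 39000, 60000], low=1000, high=39000)
print(stretched)
```

Everything at or below `low` becomes 0, everything at or above `high` becomes the maximum, and values in between are spread evenly.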

Is there a command in g’mic to give me those values, or how would I go about doing it otherwise?

I would say something like this, written as a G’MIC command
(I assume your input image has value range [0,65535] there):

valmin :
  +histogram_cumul 65536,0,0,65535
  eval "for (x = 0, x<w, i[x++]>=8000?break()); x" rm.

Then, in bash:

$ gmic sp colorful,200 blur 1 mul 257 'val=${valmin.}' echo 'VAL=$val'
[gmic]-0./ Start G'MIC interpreter.
[gmic]-1./ Input sample image 'colorful' (1 image 200x200x1x3).
[gmic]-1./ Blur image [0] with standard deviation 1, neumann boundary conditions and gaussian kernel.
[gmic]-1./ Multiply image [0] by 257.
[gmic]-1./ Set local variable 'val=3665'.
[gmic]-1./ VAL=3665
[gmic]-1./ Display image [0] = 'colorful'.

So histogram_cumul is a histogram where every value is the sum of all previous values? (That aligns nicely).

That ‘valmin’ command… is that placed in a common folder somewhere that’s included automatically?

I’m just using really long commandlines at the moment :s.

I did manage to rewrite the for loop to a pipeline on the CLI and it appears to be working, so great!

Maybe I need to start looking into writing the scripts as files instead of commandlines :P.

Yes definitely, because you won’t have to deal with both the cli syntax and the G’MIC syntax at the same time, it can be cumbersome sometimes to mix the two.
Just create your own G’MIC script file myscript.gmic containing the valmin definition, then :

$ gmic myscript.gmic [do whatever you want here, including calling 'valmin']

Instead of doing the for() loop, we can do the histogram_cumul twice, then directly read the value. This is useful if we want to find many thresholds.

For example: using for() loop:

f:\web\im>%GMIC% sp colorful,200 blur 1 mul 257 histogram_cumul 65536,0,0,65535 eval "for (x = 0, x<w, i[x++]>=8000?break()); x" echo ${} rm.
[gmic]-0./ Start G'MIC interpreter.
[gmic]-1./ Input sample image 'colorful' (1 image 200x200x1x3).
[gmic]-1./ Blur image [0] with standard deviation 1, neumann boundary conditions and quasi-gaussian kernel.
[gmic]-1./ Multiply image [0] by 257.
[gmic]-1./ Compute cumulative histogram of image [0], using 65536 levels.
[gmic]-1./ Evaluate expression 'for (x = 0, x<w, i[x++]>=8000?break()); x' and assign it to status.
[gmic]-1./ 3676
[gmic]-1./ Remove image [0] (0 images left).
[gmic]-0./ End G'MIC interpreter.

Using the cumulative histogram twice:

f:\web\im>%GMIC% sp colorful,200 blur 1 mul 257 histogram_cumul 65536,0,0,65535 histogram_cumul 65536,0,0,65535 echo {i[8000]} rm.
[gmic]-0./ Start G'MIC interpreter.
[gmic]-1./ Input sample image 'colorful' (1 image 200x200x1x3).
[gmic]-1./ Blur image [0] with standard deviation 1, neumann boundary conditions and quasi-gaussian kernel.
[gmic]-1./ Multiply image [0] by 257.
[gmic]-1./ Compute cumulative histogram of image [0], using 65536 levels.
[gmic]-1./ Compute cumulative histogram of image [0], using 65536 levels.
[gmic]-1./ 3675
[gmic]-1./ Remove image [0] (0 images left).
[gmic]-0./ End G'MIC interpreter.

(Of course, doing this all on the command line is clumsy.)
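
To show why the second cumulative histogram works as a direct lookup, here is a rough Python sketch on a toy 8-bit image. The helper `cumulative_histogram` is my own, and its binning convention (out-of-range values ignored, the maximum placed in the last bin) is an assumption that may not match G'MIC's histogram_cumul exactly.

```python
from itertools import accumulate

def cumulative_histogram(values, bins, vmin, vmax):
    """Cumulative histogram of integer `values`, `bins` equal bins over [vmin, vmax].

    Assumption: values outside the range are ignored; vmax falls in the last bin.
    """
    hist = [0] * bins
    for v in values:
        if vmin <= v <= vmax:
            hist[min((v - vmin) * bins // (vmax - vmin), bins - 1)] += 1
    return list(accumulate(hist))

levels = 256
pixels = [(i * i) % levels for i in range(1000)]    # toy 8-bit 'image'
n = len(pixels)

# first[v]  = number of pixels with value <= v
first = cumulative_histogram(pixels, levels, 0, levels - 1)
# second[k] = number of levels v whose cumulative count first[v] is <= k
second = cumulative_histogram(first, n + 1, 0, n)

k = 200                                                     # 20% of 1000 pixels
by_loop = next(v for v, c in enumerate(first) if c >= k)    # the for() approach
by_lookup = second[k]                                       # direct read, no loop
print(by_loop, by_lookup)
```

In this sketch the loop counts the levels whose cumulative count is below `k`, while the lookup counts those at or below `k`, so the two agree unless some level has a cumulative count of exactly `k`. Note that the second histogram here spans the range 0 to `n` (the pixel count), so any count can be used as an index.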

But if I have a 3000x2000 image, that is 6,000,000 pixels. Let’s say I want the threshold to be at 75% (4,500,000). That would mean that in the ‘for loop style’ method, I would do i[x++] >= 4500000.

Using the histogram-twice like you said, means I can’t do i[4500000] because the 2nd histogram is only 65536 pixels wide. So something doesn’t seem to add up there. Or am I missing something?

I normally do this in ImageMagick, not G’MIC, because I don’t know G’MIC so well. And I normally use normalised histograms, so the counts are proportions of the total numbers of pixels (0 <= proportion <= 1) rather than absolute counts.

So after the double cumulative histogram, I would examine the pixel at 75% of the width. The value at that pixel (on a scale of 0 to 1) would tell me the level, as a proportion of maximum, at or below which 75% of the pixels fall.

I expect the equivalent can be done in G’MIC, but I don’t know how.

I’ve changed the second cumulative histogram to 100,0,0,{wh}. It creates a 100-pixel wide image with all the possible values of the previous one (wh is width x height, so total number of pixels). I can then take i[0] to take the 1%, or i[98] to take the 99% (so it’s off by 1 :P) but the values seem right up there with the for-loop method, and quite a bit faster.
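
The 'off by 1' can be reproduced in a small Python sketch of the same idea: a second cumulative histogram with 100 bins spanning 0 to the pixel count. As before, `cumulative_histogram` is my own helper and its binning convention is an assumption, not necessarily what G'MIC does internally.

```python
from itertools import accumulate

def cumulative_histogram(values, bins, vmin, vmax):
    """Cumulative histogram of integer `values`, `bins` equal bins over [vmin, vmax].

    Assumption: values outside the range are ignored; vmax falls in the last bin.
    """
    hist = [0] * bins
    for v in values:
        if vmin <= v <= vmax:
            hist[min((v - vmin) * bins // (vmax - vmin), bins - 1)] += 1
    return list(accumulate(hist))

levels = 256
pixels = [(i * 37) % levels for i in range(1000)]   # toy 8-bit 'image'
n = len(pixels)                                      # plays the role of {wh}

first = cumulative_histogram(pixels, levels, 0, levels - 1)
table = cumulative_histogram(first, 100, 0, n)       # 100 entries, one per percent

def threshold(percent):
    """Loop method: first level whose cumulative count reaches `percent` % of n."""
    return next(v for v, c in enumerate(first) if c * 100 >= percent * n)

print(table[0], threshold(1))     # entry 0 corresponds to the 1% threshold
print(table[98], threshold(99))   # entry 98 corresponds to the 99% threshold
```

Entry `p` counts the levels whose cumulative count is still below `(p + 1)` percent of the pixels, which is why index 0 gives the 1% value and index 98 gives the 99% value.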

How would you do stuff like this in ImageMagick? Its histogram-generating code seems so slow if you want the full 65536 bins, and I don’t know of any way to do cumulative mode.

I wrote a little script that took the output of the ‘text mode histogram’ to get the cumulative values, but it was far from speedy. That’s why I started using libvips for my coding, but on the CLI I was wanting something else. That’s why I took to g’mic (the fact there is a gimp + photoshop filter also makes it interesting to write your own stuff).

ImageMagick has a “-process” mechanism, where we can write our own modules in C, and call them. One of my modules is called “mkhisto”. Using it at the command line:

magick toes.png -process 'mkhisto cumul norm' -process 'mkhisto cumul norm' info:

toes.png PNG 65536x1 65536x1+0+0 16-bit sRGB 320268B 0.376u 0:00.542

This gives information about the resulting image, 65536 pixels wide and 1 high. It takes about 2 seconds on my laptop. (toes.png is 267x233 pixels.)

Another purpose for double-cumulative histograms is in adjusting an image so that its histogram matches anything we want, such as the histogram of another image, or a Gaussian distribution, or an equalized distribution.
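
A rough Python sketch of histogram matching, to make the idea concrete. This is not the mkhisto implementation: in place of the double-cumulative lookup it uses a binary search over the target's cumulative histogram, and the names `cdf` and `match_histogram` are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate

def cdf(pixels, levels):
    """Cumulative histogram: entry v = number of pixels with value <= v."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return list(accumulate(hist))

def match_histogram(source, target, levels=256):
    """Remap `source` so its cumulative histogram follows that of `target`."""
    src_cdf, tgt_cdf = cdf(source, levels), cdf(target, levels)
    n_src, n_tgt = len(source), len(target)
    # Cross-multiply instead of dividing, so comparisons stay in exact integers.
    tgt_scaled = [c * n_src for c in tgt_cdf]
    # For each source level, find the first target level with the same or a
    # larger cumulative proportion.
    lookup = [min(bisect_left(tgt_scaled, src_cdf[v] * n_tgt), levels - 1)
              for v in range(levels)]
    return [lookup[p] for p in source]

dark = [0, 0, 1, 1]
spread = [10, 20, 30, 40]
matched = match_histogram(dark, spread, levels=64)
print(matched)
```

Matching against an equalized or Gaussian distribution would follow the same pattern, with the target cumulative histogram computed from that distribution instead of from a second image.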

My “mkhisto” is documented at Process modules: mkhisto, with source code.

I like
– the scripting of gmic
– the process of magick

Went with gmic because I struggle more with C than gmic scripting, but the process method would definitely be much snappier. I don’t know if gmic has an equivalent of process.

exec ? https://gmic.eu/reference/exec.html

Code within the command of gmic’s “exec” has no knowledge of any images in gmic’s memory (as far as I know).

As far as I know, gmic has no equivalent to IM’s “-process”. The closest would be by modifying gmic source code, and adding new commands that were implemented in C.

gmic has less need for this, as it has a fast expression evaluation for each pixel, and the ability to build user-defined commands that call any other commands.

That’s interesting thanks. I always thought I should dive deeper in IM syntax to see what differences there are between these tools, and see the particular advantages of IM.

PS : I also wonder whether it would be interesting to have some live interaction to discuss different G’MIC tricks. It is sometimes a bit cumbersome to explain something on a forum.

Here are some useful features of ImageMagick:

IM’s “-process” gives me access, in C code, to IM’s data and metadata for all the images currently in memory. In that C code, I can create new images that the rest of the IM command can access. And my C code can call any of IM’s internal functions.

IM can convert between ICC profiles, so I can easily use Elle’s profiles for different primaries and transfer response curves.

IM knows what colour model is used for each image: RGB or XYZ or Lab or HSL or whatever. So I can convert an image to a given colour model without needing to know which model it came from.

IM has the concept of “maximum white”, which we can override if we want. So if I read three files that represent the same image stored as 8-bit integer or 16-bit integer or 32-bit integer, IM will store them internally with the same values (aside from quantisation), and I don’t need to explicitly normalise them to some common standard. In gmic, if I append an 8-bit image to a 16-bit image without explicitly normalising them, I get an unexpected result.

When GMIC has read a file and the pixel values are in the range [0…200] the information about the bit-depth of the source is lost, and we don’t know if the nominal range was [0…255] or [0…65535] or something else.

Similar comments apply when writing files.
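
The explicit normalisation mentioned above (needed when mixing 8-bit and 16-bit data without a shared notion of 'maximum white') might look like this in Python. `normalise` is a made-up helper, and the bit depth has to be supplied by the caller, since it cannot be recovered from the values alone.

```python
def normalise(values, bit_depth):
    """Scale integer samples of the given bit depth to floats in [0, 1]."""
    max_white = (1 << bit_depth) - 1      # 255 for 8-bit, 65535 for 16-bit
    return [v / max_white for v in values]

eight = normalise([0, 128, 255], 8)
sixteen = normalise([0, 32896, 65535], 16)   # the same samples multiplied by 257
print(eight == sixteen)
```

Multiplying 8-bit samples by 257 is the usual way to expand them to the 16-bit range, since 255 x 257 = 65535, which is what the 'mul 257' in the earlier commands does.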

Some gmic commands can take many arguments. IM’s approach is instead to define settings, which influence the behaviour of subsequent operations. This is a simple language difference rather than a fundamental difference. There are many other language differences.

IM can process images that are too large to fit into memory. As far as I know, gmic can’t.

IM can distribute processing between multiple computers. As far as I know, gmic can’t.

(Please don’t read this as “IM is superior to gmic”. I could write a longer list of useful gmic features.)

No worry :slight_smile: I don’t think I’m that kind of guy…

Your list is very informative.

What I understand here is that IM’s main target is 2D color images (managing of color profiles, color spaces and so on, is made transparent to the user). As you know, G’MIC has a completely different approach : the input/output data can be anything (just considered as an array of values), it just assumes the user knows exactly what kind of data he manipulates (particularly what the values of the “image(s)” mean). So, definitely not targeted on color images only.

I think G’MIC is indeed less convenient for the user interested in color image conversion. Besides, it’s true that G’MIC only deals with quite basic image formats.

But on the contrary, I feel G’MIC allows more flexibility when it comes to writing custom scripts to process various image data (multi-spectral, volumetric, …). That’s why for instance, it is even possible to write things like “a parser that converts markdown files into HTML” as a G’MIC script :slight_smile:

I’ve never been a fan of such approaches in software or programming languages. Often, the proposed user API ends up as a mix that forces you to use both settings and arguments, which is sometimes hard to follow and seems unnatural (at least to me).

Indeed.

Indeed.

Something that has always scared me in ImageMagick is the syntax of the language which seems to follow quite different rules depending on the type of command you want to use.
My own biased opinion : The rules of the G’MIC language may look a bit esoteric at first glance, but once you get them, you feel like you can do whatever you want :slight_smile:

Actually, I’m so biased that I’d be curious to know why some people would prefer to use G’MIC rather than IM to write image processing pipelines. @afre @Reptorian @garagecoder @grosgood and others.

Is that only because scripts can be easily converted into filters for the GIMP plug-in ?

If so, why not create a similar GIMP plug-in that embeds IM features ?
@snibgo, has it been considered by the IM development team ?
Just curious!

@David_Tschumperle I prefer to use G’MIC as it is available for many image-processing applications, like Krita and GIMP. Also, after learning it, I find it easier than using C++ or C# to create filters, and from what I have seen in processing and pals, I prefer gmic due to its flexibility.

I concur with both your points. I see them as strengths for either platform. As said above, I love the gmic scripting. IM on Windows BAT processing is awkward, even though it comes so naturally for @snibgo.

I would take you up on that if more people are interested. If they can have beer over dt, why not for cimg / gmic? :wink: (I don’t drink beer though. :stuck_out_tongue_closed_eyes:)

Difficult for me to say why I would choose one over the other when I’ve barely used IM. My impression is that G’MIC has a syntax more like a generic interpreter (suited to building very long pipelines), versus IM appearing more like a classic command line tool with some higher level syntax. I’m not a fan of sharing global state between commands either - debugging nightmare for anything lengthy.

My initial impression of G’MIC was that it was arcane. The wise among you know that such terms tell as much about the person using them as the things they are used upon: that the person, in fact, knows little about such things so termed and thus is deeply struck by mystery. Black magic is arcane. So are calculus, linear algebra and Emacs. Become familiar with them, the mystery falls away and the thing becomes simply useful. But it is worth noting why some things become useful while others just sit.

I’m not qualified to compare the present-tense ImageMagick with G’MIC; I have not used ImageMagick for some time. However, ten to fifteen years ago, I used ImageMagick a great deal in projects for which I now harness G’MIC. The transition from one to the other was gradual, unplanned and without much conscious consideration. Perhaps over the course of six months or a year I found myself using G’MIC rather more than ImageMagick and that occurred primarily because G’MIC command structure would stick with me, while I needed to frequently refresh my memory about ImageMagick settings.

Some time ago, @David_Tschumperle wrote a précis on his design approach to G’MIC — one of a number, I am sure, but this is the one that happened to stick with me. He wrote:

That’s it for me: conciseness and coherence. To which, I may add, staying in the working context. I’m not going to congratulate myself for saving a nickel of processing time if a dollar is spent switching between G’MIC, C or Python. My time that is available for exploration is limited. When I’m exploring, I want to stay in one environment, stick with one toolset and rue minutes spent in decamping and regrouping. Life is short. Mind the overhead.

That said, I tell myself from time to time that I should really give ImageMagick another go, and perhaps I shall. Thank you, @snibgo, for your report from other realms beyond the blue hills and far valleys.

That was a good point indeed.
Next version of G’MIC will improve that : command input will return the bit depth when loading .png and .tiff files.