G'MIC exercises

Your request is a bit vague as it is. Could you be more precise about what the algorithm should do?

E.g., use apply_gamma to make the mean equal 130, with its argument set to a precision of 3 decimal places. I could guess the argument, or have a command find it for me.

>gmic sp tiger p
  min = 3, max = 248, mean = 98.0154, std = 48.22

>gmic sp tiger apply_gamma 1.52 p
  min = 3, max = 248, mean = 129.994, std = 45.6714

This is actually what the command balance_gamma does:

$ gmic sp tiger p balance_gamma 130 p
[gmic]-1./ Input sample image 'tiger' (1 image 750x500x1x3).
[gmic]-1./ Print image [0] = 'tiger'.
[0] = 'tiger':
  size = (750,500,1,3) [4394 Kio of floats].
  min = 3, max = 248, mean = 98.0154, std = 48.22, coords_min = (305,74,0,2), coords_max = (707,471,0,0).
[gmic]-1./ Apply gamma-corrected color balance of image [0], with reference color (130).
[gmic]-1./ Print image [0] = 'tiger'.
[0] = 'tiger':
  size = (750,500,1,3) [4394 Kio of floats].
  min = 8.55469, max = 250.374, mean = 127.262, std = 42.182, coords_min = (305,74,0,0), coords_max = (175,39,0,2).
[gmic]-1./ End G'MIC interpreter.

To adjust the image mean by raising the values to a power, we often use this formula:

new_value = old_value ^ P

where:

new_value and old_value are on the scale 0.0 to 1.0;
P = log(new_mean) / log(old_mean), where new_mean and old_mean are also on the 0.0 to 1.0 scale.

This is only approximate, but usually good enough. It exactly maps pixels that were at old_mean to new_mean, but that doesn’t guarantee the overall average will become new_mean.
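The formula above can be sketched in a few lines of Python (illustrative only — a plain list of values on a 0..255 scale, not the G'MIC implementation):

```python
import math

def apply_power_mean(values, new_mean, vmax=255.0):
    """Approximate mean adjustment: normalize to 0..1, raise to
    P = log(new_mean)/log(old_mean), then rescale back."""
    old_mean = sum(values) / len(values) / vmax   # mean on the 0..1 scale
    P = math.log(new_mean / vmax) / math.log(old_mean)
    return [vmax * (v / vmax) ** P for v in values]
```

As described: a pixel sitting exactly at the old mean lands exactly on the new mean, while the overall mean only lands *near* the target, which is why an iterative refinement is still needed.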

My iterating scripts and C-code functions are mostly variations on how we manually search for a word in a paper dictionary, “divide and conquer”:

  1. If the word is in the dictionary, it is between the first and last pages. This gives us bottom and top limits for the search.

  2. Choose a page between the two limits.

  3. This page either is the correct one for the word, or contains “larger” words, or contains “smaller” words. (E.g. “magick” is smaller, in dictionary terms, than “wand”.)

  4. So this page is the new limit for the top or bottom of the search area.

  5. Repeat from (2) until we have the correct page.

When we have numbers instead of words, this is simple to implement. For some purposes, variations can be faster, e.g. the Fibonacci search technique (see Wikipedia).
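The dictionary steps above map directly onto code. A minimal Python sketch, using a sorted list in place of dictionary pages:

```python
def find_page(sorted_items, word):
    """Divide-and-conquer lookup, following the five steps above."""
    lo, hi = 0, len(sorted_items) - 1          # step 1: first/last page limits
    while lo <= hi:
        mid = (lo + hi) // 2                   # step 2: pick a page between the limits
        if sorted_items[mid] == word:          # step 3: is this the correct page?
            return mid
        elif sorted_items[mid] < word:         # step 4: the page becomes the new
            lo = mid + 1                       #         bottom limit...
        else:
            hi = mid - 1                       #         ...or the new top limit
    return -1                                  # word is not in the dictionary
```

Each pass halves the search area, so even a huge "dictionary" takes only log2(N) steps.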

I see that if I solve for x I get the factor.


Would the same apply to an arbitrary function? If so, the approximation would provide the iteration command a closer starting point.

The approximation “P = log(new_mean) / log(old_mean)” could be used as the first guess for an iteration that sets the new mean to any required precision. However, my iteration methods require two initial guesses. At each iteration, one of the guesses moves closer to the other. Iterations stop when the two guesses are sufficiently close.

Note that when a simple mathematical function is known, solving for an unknown variable is either very easy (by algebra) or fairly easy (e.g. by Newton-Raphson). But the mathematical function that calculates the overall mean of an image after a power function is too complex for me to consider using those methods.

The command I wrote searches in one direction until it passes the target, then keeps backtracking in smaller increments until it reaches the goal. Looking it up, it is called a binary search algorithm. (I usually make stuff up and find out what it is called later. :slight_smile:) Unfortunately, it takes a long time on my ageing, hot-as-pancakes laptop. I haven’t grasped making multi-step iterative calculations without messing around with the image buffer. :nerd_face:


Update: Each iteration was too costly, so I ended up reducing the image to 10% of its size, which works well with the right interpolation. I’m able to save 24s of processing. :blush:

I think that what you are looking for in this particular case is the so-called Dichotomic Search.
Basically, for any increasing function y = f(x), it is able to find the value xt such that f(xt) = target_y, where target_y is the value you want to obtain. The function f can be very complex to compute, but has to be monotonically increasing.
More info on the Wikipedia page about this algorithm.

Below is an implementation of the dichotomic search in G’MIC, with two different uses.
The first one, to find the value of log(pi) by inverting the exp() function.
The other one, by finding an optimal power to apply to the image pixels so that the image J = cut(I^gamma,0,255) has an average of 128. This last example shows a complex use case where the function is clearly not simple to invert mathematically (because of the cut operation). In this case, a dichotomic search is quite efficient.

# search_dichotomic : "increasing_command_y(x)",target_y,precision>0
# Return 'nan' if search failed.
search_dichotomic : skip ${3=1e-3}
  v -
  m "_sdc : $1"
  target_value,epsilon=$2,$3

  # Find acceptable parameter bounds.
  mpos,nb_attempts=-1,20
  do mvalue=${_sdc\ $mpos} if {$mvalue<$target_value} break fi mpos*=2 nb_attempts-=1 while $nb_attempts
  if {!$nb_attempts} u nan v + return fi
  Mpos,nb_attempts=1,20
  do Mvalue=${_sdc\ $Mpos} if {$Mvalue>$target_value} break fi Mpos*=2 nb_attempts-=1 while $nb_attempts
  if {!$nb_attempts} u nan v + return fi

  # Start dichotomic search.
  nb_attempts=100
  do
    cpos={($mpos+$Mpos)/2}
    cvalue=${_sdc\ $cpos}
    if {abs($cvalue-$target_value)<$epsilon} u $cpos v + return
    elif {$cvalue<$target_value} mpos=$cpos
    else Mpos=$cpos
    fi
    nb_attempts-=1
  while $nb_attempts
  u {$nb_attempts?$cpos:nan} v +


# Show different uses of the 'search_dichotomic' command.
test_dichotomic :
  # Find logarithm of pi.
  search_dichotomic "u {exp($""1)}",pi logpi=${}
  v + e "log(pi) = "$logpi" (-> exp -> "{exp($logpi)}")." v -

  # Find gamma correction of an image, such that its average is 128.
  sp tiger
  search_dichotomic "+pow $""1 c. 0,255 res={ia} rm. u $res",128 gamma=${}
  +pow $gamma c. 0,255

If you have any questions about it, just let me know.

With a minor variation, a dichotomic search can be used for monotonically decreasing functions. With a more significant change, monotonicity is not required at all.

Sometimes a function does not change monotonically, but does contain a maximum (or minimum) that we want to find. For example: what is the power number that maximises the overall contrast of an image? Assuming the function is roughly symmetrical about the maximum, we can solve this by calculating the function at two places between the two limits instead of just one place. Test which result is largest, and use this to determine which limit to replace. Iterate until the desired precision is obtained.
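The two-sample scheme described above is essentially a ternary search. A minimal Python sketch (illustrative, not G'MIC; the quadratic test function is made up):

```python
def find_maximum(f, lo, hi, eps=1e-6):
    """Locate the maximum of a unimodal function f on [lo, hi] by
    evaluating it at two interior points per iteration."""
    while hi - lo > eps:
        a = lo + (hi - lo) / 3        # two places between the limits
        b = hi - (hi - lo) / 3
        if f(a) < f(b):
            lo = a                    # maximum must lie in [a, hi]
        else:
            hi = b                    # maximum must lie in [lo, b]
    return (lo + hi) / 2
```

Each iteration discards a third of the interval, so like the dichotomic search it converges in a logarithmic number of function evaluations.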

I wanted to keep it simple. Of course, this works with decreasing functions too (just consider finc(x) = -fdec(x) and you get an increasing function :slight_smile: ).

For functions with min/max to find, gradient descent algorithms are usually good, except in the case where you cannot easily compute the function derivative, or at least a close approximation of it.

@David_Tschumperle this is probably the worst day to ask G’MIC questions due to certain events elsewhere (good luck! :slight_smile: ), but I’d like to make better use of vectors in the math parser.

First question: In general what’s the fastest way to loop over a local window, treating each [R,G,B] location as a vector? Do we use crop(), nested for() loops, read/write other images on the stack, or something else? For example, one possible approach using a secondary single-channel image to avoid looping over all channels:

for (j = -N, j<=N, ++j,
  for (k = -N, k<=N, ++k,
    currentPixelVector = J(#0,j,k);
    # Do stuff with currentPixelVector
  );
);

Second question: are there good ways to do certain operations on the whole local window of vectors treated as a set of vectors? For example if we want to do a linear combination of them, is there a way to do that with sum()?

Edit: thinking further about it, I suppose many operations like that could be done by treating the local window as a matrix… so the question becomes: can we easily obtain an M x N matrix from the local window, where M = channels and N = (window w*h)?

I guess these two options are equivalent. It highly depends on what you want to do with your data afterwards. Copying image data into an image on the stack could also be a good solution if you want to apply some complex operation that already exists as an optimized G’MIC command (e.g. resize, which you can call with ext() in the math parser).

Yes, if you treat the neighborhoods as vectors themselves, it is quite easy (and probably efficient) to do this, e.g.

N = 0.1*crop(x-2,y-2,z,0,5,5,1,1) + 0.2*crop(x-2,y-2,z,1,5,5,1,1) + 0.3*crop(x-2,y-2,z,2,5,5,1,1);

will compute a weighted sum of R,G,B neighbor data.
Using a single crop would also be possible; then multiply it by a specific matrix.
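That single-crop idea is exactly the M x N matrix formulation from the question above: each column of the matrix is one pixel's vector, so any linear combination of the window's pixels is a matrix-vector product. A plain-Python sketch (the 2-channel, 3-pixel window and the weights are made-up illustration values, not G'MIC code):

```python
def combine_window(M, w):
    """Linear combination of the columns of M.
    M is a (channels x pixels) matrix stored row by row;
    w holds one weight per pixel (column)."""
    return [sum(row[j] * w[j] for j in range(len(w))) for row in M]

# Toy 2-channel, 3-pixel "window": columns are pixel vectors.
window = [[1.0, 2.0, 3.0],    # channel 0 of the 3 pixels
          [4.0, 5.0, 6.0]]    # channel 1 of the 3 pixels
weights = [0.25, 0.5, 0.25]   # e.g. a small blur kernel, flattened
combined = combine_window(window, weights)  # one value per channel
```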


@David_Tschumperle below is a geometric median based on a modified Weiszfeld algorithm. Do you have some ideas for speeding it up? It’s ok for a prototype but still painfully slow for even medium size images:

geometric_median : skip ${1=3},${2=12}
  repeat $! l[$>]
  [0] +boxfilter. {$1+1-$1%2} sh. 0
  f. "
    init(
      const boundary = 1;
      const N = int($1/2);
      const W = N+1;
      weightedSum = I(#1);
      totalDist = weightedSum;
    );

    Y = I(#2,x,y); # centroid
    
    for (iters = 0, iters<$2, ++iters,
      weightedSum = 0;
      totalDist = 0;
      distSum = 0;
      flag = 0;
      
      for (j = -N, j<W, ++j,
        for (k = -N, k<W, ++k,
          X = J(#1,j,k);
          diff = X - Y;
          dist = norm2(diff);
          if(dist==0, flag=1,
            R = 1/dist;
            weightedSum += R * X;
            totalDist += R * diff;
            distSum += R;
          );
        );
      );
      
      if(totalDist!=0,
        bal = flag ? 1/norm2(totalDist) : 0;
        Y = max(0, 1 - bal) * weightedSum / distSum + min(1, bal) * Y
      )
    );
    
    I(#0,x,y) = Y; 1" k[0]
  endl done
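For reference, the core (unmodified) Weiszfeld update can be sketched in plain Python — a simplified illustration of the idea, not a translation of the G'MIC command above. The geometric median y of a set of points minimises the sum of distances ||x_i - y||, and each step replaces y by a distance-weighted average of the points:

```python
import math

def weiszfeld(points, iters=50):
    """Plain Weiszfeld iteration for the geometric median of a point set."""
    y = [sum(c) / len(points) for c in zip(*points)]   # start at the centroid
    for _ in range(iters):
        num = [0.0] * len(y)
        den = 0.0
        for x in points:
            d = math.dist(x, y)
            if d == 0:               # y coincides with a sample point: skip it
                continue
            w = 1.0 / d              # weight = inverse distance
            for i, xi in enumerate(x):
                num[i] += w * xi
            den += w
        if den == 0:
            break
        y = [v / den for v in num]   # distance-weighted average
    return y
```

Note that the crude `d == 0` skip here is precisely the degenerate case that the modified algorithm in the command above handles more carefully.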

Motivation: an example of comparison with “normal” median (original, median, geometric median):

It can be seen that a normal median produces “false” colours at certain edges, while the geometric median doesn’t.

Edit: there’s an alternative version of that paper which has a nicer representation of the mapping function (taken from citations of this wikipedia page).

Edit2: spotted one obvious change - amended slightly!


The result is pretty cool, GC. I can see using it to uprez icons and clipart. :slight_smile:

I’ll take a quick look today.
Probably one good thing to add is a * as the first character of the fill expression: it will ensure multi-threading is activated, which seems to be possible here.
Usually, the G’MIC math parser tries to “auto-guess” if an expression can be evaluated in multi-threaded mode, but if the expression contains instructions to explicitly write pixels in images (as I(#0,x,y) = Y; ), then multi-threading is usually disabled by default, as it could lead to undefined behaviors.
In your case anyway, activating multi-threading seems to be OK, as you won’t write a color at any location other than (x,y).
Doing this already boosts the speed in a visible manner.

Just looked at your code, and I must admit I don’t have ideas right now to improve it.
What I see is that you have three nested loops: 12 iterations of (x,y) loops over the whole image. I’m not really surprised this is somewhat “slow” to compute, particularly when you consider that in the end it is evaluated by a kind of ‘bytecode’ interpreter (the math expression is ‘compiled’ into bytecode, but the bytecode sequence itself is interpreted).

I wonder what @garagecoder is up to. Maybe he is a Nintendo dev. :stuck_out_tongue:


I still have a backlog of loose ends to look into in this thread, but I already have two questions that I have been waiting a very long time to ask.

1. In the paper, from what I could gather, the guided filter (which I love to use) could be used to transfer structures from image a to image b. My issue with the current implementation is that it is a blurring filter first. Any structure transferred is lost to the blur even when I use 1,1e-10. (As an aside, I see that apps like RawTherapee use radii that are less than one. Is it enlarging, filtering, then shrinking the image?) Another application is to refine masks. Again, I don’t know how to do what the paper shows.

2. Another question surfaced from my involvement with the PlayRaws. Take the recent one, e.g. [PlayRaw] View from Fort Carré in France. There is text on the crane. When I downsample the image, it turns into an indiscernible blob. It looks like there are two groups that are equivalent: 1,4 and 2,3,5,6.

1=nearest | 2=average | 3=linear | 4=grid | 5=bicubic | 6=lanczos

Looks like IM has more possibilities but my head hurts from the reading.
Resizing or Scaling -- IM v6 Examples
Resampling Filters -- IM v6 Examples
Nicolas Robidoux Resampling -- IM v6 Examples

I also found this paper that uses what is called detail-preserving image downscaling (DPID) from another paper. Their implementation doesn’t look complicated but I don’t have the math or dev chops to figure it out. :blush:

@David_Tschumperle one possible speedup has more to do with the algorithm. Because some vectors can converge early, it will be simple to add an early exit from the iterations when totalDist = 0 (there may be some precision problems to deal with). Thanks for the tip about threads!

Edit: indeed G’MIC is in no way to blame for the speed. With the “early exit” added it’s doing 7x7 12 iters of the “parrots” sample in under 30s on this ageing core2. For a JIT interpreter this is actually amazing!

@afre sorry, I shouldn’t have hijacked the thread this way… next time I’ll start a new post. If I have time at the weekend I’ll have a look at your questions (can’t promise good answers though :slight_smile: ). My time in game dev is long past, ETL/databases is my thing now!

No problem. It isn’t off topic, actually. :sunny:

@garagecoder, I’ve moved your nice geometric median into the ‘Repair/’ category. Thanks for this cool contribution! :beers:
