Min-heap management

David_Tschumperle · November 3, 2023, 4:54pm

This post is targeted to people who develop G’MIC filters. It probably won’t be very interesting for others

Today, I’ve added two new functions to the G’MIC maths evaluator, which allow developers to work with min-heap structures. Min-heaps are useful structure for modeling priority queues. In G’MIC, they are stored as dynamic arrays.
The new functions are:

da_push_heap(_#ind,elt1,_elt2,...,_eltN): Insert the N given elements in the min-heap #ind.
da_pop_heap(_#ind): Remove and return the min element from the min-heap.

They basically act like the already existing functions da_push() and da_pop(), except that they ensure the dynamic array keep a min-heap structure, meaning the value at position [0] is always minimal. And each has a quite nice algorithm complexity of O(log(N)), where N is the number of element in your heap.

An example is worth 1000 words:

foo : 
  1x2 # [0] and [1] are two dynamic arrays (initially empty). [1] will be used as a min-heap.
  eval "
    const N = 300; # Number of values to push

    # Push random values in min-heap [1].
    repeat (N,
      value = round(u(100));
      da_push_heap(#1,value);
    );

    # Pop values from min-heap [1] and insert them in dynamic array [0].
    # -> As the pop_heap() function always return the minimal value in the min-heap [1],
    # you'll end up with a sorted array.
    repeat (N,
      elt = da_pop_heap(#1);
      da_push(#0,elt);
    );
  "
  rm. da_freeze
  plot      # Plot sequence of values in dynamic array [0].

which gives:

I hear you saying: “Hey! but I could use the sort command, after pushing all those values to get the same result!”. That’s true, but that was a toy example to illustrate the property of min-heaps

Now, another use case, where min-heaps can be useful : Suppose you have a set of random points and you want to draw, for each point, the K-closest connections with the other points.
That’s simple: For each point, just compute all the distances to the other points, and push them in your min-heap. Then, pop only K values, and you’ll get it :

foo : check "isint(${1=3}) && $1>=0"

  # Generate a list of random points (but wih a minimal distance between them).
  # [0]: list of points (random xy coordinates)
  srand 0
  600,600 noise_poissondisk. 50 1,1,1,2 eval.. "i?da_push([x,y])" da_freeze. rm..

  # [1]: matrix of distances between all pairs of points.
  {0,[h,h]} f. ">y>=x?i(y,x):norm(I[#0,x] - I[#0,y])"

  # [2]: canvas for visualization.
  600,600,1,3

  # Find K-nearest neighbors for each point.
  # [3]: min-heap
  1,1,1,2
  eval[0] "
    const K = $1;
    xy = I;

    # Push distances from current point to all others in min-heap.
    i[#3,h(#3)-1] = 0; # Fast way to clear min-heap
    repeat(h,k,k!=y?da_push_heap([i(#1,k,y),k]));

    # Draw the K-closest connections in canvas.
    repeat (K,k,
      ind = da_pop_heap();
      nxy = I[ind[1]];
      color = k==0?[255,255,64]:k==1?[255,128,16]:[255,0,0];
#      polygon(#2,2,xy[0],xy[1],nxy[0],nxy[1],1,color);
      run('line_aa[2] ',xy[0],',',xy[1],',',nxy[0],',',nxy[1],',1,',v2s(color)); # Lot slower but prettier
    );
    I"
  eval[0] "ellipse(#2,i0,i1,5,5,0,1,255); I"
  k[2] r2dx 400

anim_closest

In the example above, the colors for the connecting lines are : yellow=closest, orange=2nd-closest and red=3rd-closest (or +).

Heap structures are in fact helpful when developing algorithms that involve graphs, such as the famous Djikstra algorithm. G’MIC already had the djikstra command, but now we can imagine that someone could implement its own custom version of Djikstra more easily than before.

I’ve built new binary packages for 3.3.2 pre-releases, including these two new commands, so feel free to test.

Cheers.

EDIT: I forgot to say that of course, you can manage max-heaps too, just insert negative values in your heap.

Reptorian · November 4, 2023, 6:05am

This makes me wonder where should I use this after I return from a long hiatus. But, I’ll probably only use it in new filters, or if it comes up during a refactor. Great work.

garagecoder · November 4, 2023, 9:48am

Good news indeed, there were some algorithms in the past I decided not to implement that I’m much more likely to try with this! Makes me wonder if there are plans to add any others… one which interested me in the past was skip lists

David_Tschumperle · November 4, 2023, 5:56pm

Update:

I’ve finally recoded from scratch the command dijkstra, so it won’t be a native command anymore in the next G’MIC version.
This command allows to find a minimal path between two vertices of a graph, as shown in the example below:
gmic_dijkstra

For those interested, here is the G’MIC script that generates the image above;

foo :

  # [0]: list of vertices V (random points).
  srand 0
  600,600 noise_poissondisk. 70 1,1,1,2 eval.. "i?da_push([x,y])" da_freeze. rm.. => V

  # [1]: Adjacency matrix A.
  {0,[h,h]} f. ">y>=x?i(y,x):(d = norm(I[#0,x] - I[#0,y]); d<90?d:inf)" => A

  # [2] : Graph representation.
  600,600,1,3 => canvas

  # Find shortest path with the Dijkstra algorithm.
  start=3
  end=46
  +dijkstra[A] $start,$end => D

  # Draw shortest path on canvas.
  eval "
    current = $end;
    while (current!=$start,
      parent = i(#$D,0,current,0,1);
      parent>=0?(
        xyc = I[#$V,current][0,2];
        xyp = I[#$V,parent][0,2];
        run('thickline[canvas] ',v2s([xyc[0],xyc[1],xyp[0],xyp[1]]),',5,0.75,255,128,255');
        current = parent;
      ):run('error[] \"Vertex ',current,' has no parent!\"');
    )"

  # Draw graph vertices and edges.
  eval[A] ">x>y && !isinf(i)?(
    xy0 = I[#$V,x];
    xy1 = I[#$V,y];
    run('line_aa[canvas] ',v2s([xy0[0],xy0[1],xy1[0],xy1[1]]),',1,128');
  )"
  eval[V] "
    ellipse(#$canvas,i0,i1,3,3,0,1,255);
    run('t[canvas] ',y,',',i0-5,',',i1+5,',14,1,0,255,0');
  "
  k[canvas] r2dx 450

afre · November 6, 2023, 3:32am

How about skip graphs?

David_Tschumperle · November 6, 2023, 7:27am

I’ve hardcoded the functions for heap management because I had the need for heaps a few times in my own filters (that’s why there were also functions for heap management in the math_lib command before).
Heaps are useful for managing priority queues, and this kind of structure appears quite often, even in image processing algorithms

I’m not aware of classical algorithms that use skip lists or skip graphs in image processing. There are probably a few. I think that the dynamic arrays we have in G’MIC are flexible enough to allow the implementation and management of almost any kind of exotic structures in a custom command. After all, dynamic arrays are just memory buffers with variable sizes, so they can be used to store anything you want.