Hello there.
I was a bit bored yesterday so I’ve decided to try implementing Huffman coding directly in G’MIC language. This is a classical non-destructive compression algorithm.
It’s not really because I wanted to compress things (we already have the native command serialize
for this purpose, and it uses a smarter algorithm), but because I wanted to test how easy/hard it is to manage tree structures inside a G’MIC command, and Huffman relies on the construction of a binary tree, and the algorithm is simple enough. So that was a good occasion!
(side note: I’ll may have to rely on tree structures later to optimize my WIP raytracer…).
What is fun is actually to add some bit-inversion noise in the compressed Huffman code, before decoding : it creates nice compression glitch effects. Could be even better doing this patch by patch, this clearly gives some ideas for a future Huffman Glitch
filter !
For those who are interested, here is the code:
Huffman coding in G'MIC
# Simple implementation of huffman compression, in G'MIC.
# $1: Number of bit inversion done in the compressed code.
huffman : check "${1=0}>=0"
sp colorful,480 n. 0,255 round.
W,H,S:=w,h,s y => input
# Bottom-up construction of the Huffman tree.
siz={input,whds}
+histogram[input] 256,0,255 1,1,1,6 eval.. "begin(id = 0); i?da_push([id++,i,x,-1,-1,-1])" rm..
# -> list of active leafs/nodes ( id,occurence,value,parent,child1,child2 )
eval "
active = 0;
id = da_size();
do (
# Find the two minimal occurrences in active nodes.
omin1 = omin2 = inf; oind1 = oind2 = 0;
for (k = da_size() - 1, k>=active, --k,
occ = i(0,k,0,1);
occ<=omin1?(oind2 = oind1; omin2 = omin1; oind1 = k; omin1 = occ):
occ<=omin2?(oind2 = k; omin2 = occ);
);
oind1>oind2?swap(oind1,oind2);
# Merge them as a new active node, and set children as inactive.
i(0,oind1,0,3) = i(0,oind2,0,3) = id;
da_push([ id++,omin1 + omin2,-1,-1,i[oind1],i[oind2] ]);
swap(I[oind1],I[active++]);
swap(I[oind2],I[active++]);
,_(while) active<da_size()-1);
da_freeze()"
sort. +,y channels. 2,5 => tree # ( value,parent,child1,child2 )
# Construct codebook for faster encoding.
1,256,1,2 # codebook ( code,nb_bits )
eval[tree] "i2<0?( # Leaf
ind = y;
code = nb_bits = 0;
while (ind>=0, # Iteratively go up to root node
par = i(0,ind,0,1);
par>=0?((code<<=1)|=(i(0,par,0,3)==ind); ++nb_bits);
ind = par;
);
ncode = 0; repeat (nb_bits,(ncode<<=1)|=(code&1); code>>=1); # Reverse bit order
I[#-1,i0] = [ ncode,nb_bits ];
); I" => codebook
# Encode input data.
0 eval[input] ">begin(out_stream = 0; nb_bits = 0);
val = i;
C = I[#$codebook,i];
(out_stream<<=C[1])|=C[0];
nb_bits+=C[1];
while (nb_bits>=8, # Write completed bytes
da_push(out_stream>>(nb_bits-=8));
out_stream&=2^nb_bits-1;
);
end(nb_bits?da_push(out_stream<<(8 - nb_bits)); da_freeze())"
=> compressed
e " > Size(original) = "$siz", Size compressed = "{compressed,h}
# Add bit artefacts in compressed code.
eval "
repeat ($1,
ind = round(u(0,h#$compressed - 1));
bind = int(u(0,8))%8;
mask = 1<<bind;
val = i[#$compressed,ind];
i[#$compressed,ind] = (val&mask)?val&(255 - mask):val|mask;
)"
# Decode compressed data.
0 eval "
off = mask = byte = bytes_written = 0;
ind = h#$tree - 1;
while (bytes_written<$siz,
!mask?(byte = i[#$compressed,off++]; mask = 128);
bit = (byte&mask)!=0;
mask>>=1;
child = i(#$tree,0,ind,0,bit?3:2);
i(#$tree,0,child,0,2)<0?( # Leaf
da_push(i(#$tree,0,child,0,0));
ind = h#$tree - 1;
++bytes_written;
):(ind = child);
); da_freeze()"
=> decompressed
r[input,decompressed] $W,$H,1,$S,-1
+-[decompressed] [input] norm. e " > Reconstruction error : "{ia} rm.
It’s not that long (and of course a bit slow compared to a C++ implementation, but still usable for compressing/decompressing images).
Here are the results of adding bit-inversion noise in the compressed code. First image has 0 noise, then I gradually increase the level of noise.
No noise:
100 bits inverted:
1000 bits inverted:
10.000 bits inverted:
100.000 bits inverted:
I’ll probably try to clean the code and create a filter from it in the next days.
Feel free to play with it!