Reptorian G'MIC Filters

@prawnsushi

I’m sorry for mentioning you. I found out the bug was in the bitshift: it should have been 1. I’m so dumb.


Anyway, I have updated an old filter of mine called Statistical Mode, and deleted some non-generic versions of the mode filters too. So I don’t need the 8-bit/16-bit options anymore. And it’s faster too.


More news on Statistical Mode: it seems I can make the fill version faster. I tested a double-dictionary approach, and it is 2.6 times faster than what I derived from David’s work.

EDIT:
I think I did it.

C:\>gmic sp dog tic rep_mode , toc
[gmic]./ Start G'MIC interpreter (v.3.4.3).
[gmic]./ Input sample image 'dog' (1 image 1024x685x1x3).
[gmic]./ Initialize timer.
[gmic]./ Elapsed time: 0.188 s.
[gmic]./ Display image [0] = '[[2,2,2]]'.
[0] = '[[2,2,2]]':
  size = (1024,685,1,3) [8220 Kio of float32].
  data = (2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2, ... ,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2).
  min = 2, max = 2, mean = 2, std = 0, coords_min = (0,0,0,0), coords_max = (0,0,0,0).
[gmic]./ End G'MIC interpreter

C:\>gmic sp dog tic new_rep_mode , toc
[gmic]./ Start G'MIC interpreter (v.3.4.3).
[gmic]./ Input sample image 'dog' (1 image 1024x685x1x3).
[gmic]./ Initialize timer.
[gmic]./ Elapsed time: 0.063 s.
[gmic]./ Display image [0] = '[[2,2,2]]'.
[0] = '[[2,2,2]]':
  size = (1024,685,1,3) [8220 Kio of float32].
  data = (2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2, ... ,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2).
  min = 2, max = 2, mean = 2, std = 0, coords_min = (0,0,0,0), coords_max = (0,0,0,0).
[gmic]./ End G'MIC interpreter.

Color me puzzled :upside_down_face: I have no idea what you’re talking about…

Well, I deleted some messages. For context: I had forgotten a detail of the GUI programming, namely that GUI states can only be 0, 1, or 2. I inserted something using (!$var)<<2 as the state. That is a bitshift by 2, i.e. a multiply by 4, so it couldn’t work. Missing something so obvious was a very silly mistake on my part.
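
A trivial check of the two shifts, with a boolean $var (0 here):

var=0
echo {[(!$var)<<1,(!$var)<<2]} # prints "2,4": only the shift by 1 stays within the 0-2 state range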


I’m thinking of changing how colormap 0 works.

If you have been following this thread: I recoded rep_mode to use a double dictionary, and given that the filter finds the most common color, it seems that some of the techniques used there can also be transferred to colormap.

How that code works: it uses multiple threads, each with its own dictionary, and then performs a single-threaded merge of the color+occurrence pairs. So there are two dictionaries here. And the time to find all of the colors is half that of colormap 0.
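
Roughly, the structure looks like this (a toy sketch only: colormap 0 per strip stands in for the per-thread hashed dictionaries, and occurrence counts are ignored):

sp dog
s y,4                  # pretend each strip is one thread's share of the image
foreach { colormap 0 } # phase 1: each strip builds its own list of unique colors
a x colormap 0         # phase 2: one merge pass over the much smaller combined list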

But before I do that, I’ll wait to see what David thinks of this. I’m not pinging him because I’d still prefer he take his time, given recent news.

I could always make rep_colormap, copying some of the code except for the colormap 0 part, and then tests could be performed there.

On this idea, I’m thinking it definitely has real potential:

# [gmic]./ Elapsed time: 2.227 s.
#@cli rep_new_bin2dec: binary_number
#@cli : New algorithm to convert binary to decimal.
rep_new_bin2dec:
scale_factor:=log(1<<24)/log(1e7)
chunk_size:=0x1000

('$1')

if im<_'0'||iM>_'1' error inv_char_det fi
crop. {xM},100%

if w<53
	u {s2v([('0b'),crop(#-1)])}
	rm.
	return
fi

imgs=$!
{ceil(w/24)},1,1,1,"begin(
		const offset=w*24-w#-1;
		const dist=24-offset;
		pos=started=output=0;
	);
	started?(
		pos+=24;
		output=s2v([('0b'),crop(#-1,pos,0,0,0,24,1,1,1)]);
	):(
		pos=x*24-offset;
		pos>=0?(
			output=s2v([('0b'),crop(#-1,pos,0,0,0,24,1,1,1)]);
		):(
			output=s2v([('0b'),crop(#-1,0,0,0,0,dist,1,1,1)]);
		);
		started=1;
	);
	output;"

rm[-2]
mirror[-1] x

if w#-1>$chunk_size
	s x,-$chunk_size
	ap[$imgs--1] __rep_new_bin2dec_convert_base_float2base_10M # Apply command to different chunk in parallel threads
else
	__rep_new_bin2dec_convert_base_float2base_10M[-1]
fi
__rep_new_bin2dec_convert_base_float2base_10M:
crop 0,{for(p=w#-1-1,p>=0,--p,if(i[#-1,p],break()));p;}
scale_factor:=log(1<<24)/log(1e7)
iterations,new_size:=w-1,ceil(w#-1*$scale_factor)
$new_size

eval.. >"begin(
		const converted_base=1e7;
		const number_of_digits_in_base_float=w;
		multiplier_vector=vector(#number_of_digits_in_base_float+1);
		multiplier_vector[0]=1;
		number_of_digits_per_multiplier_vector=1;
		started=0;
	);
	started?(
		remainder=secondary_remainder=0;
		multiplier=i;
		for(p=0,p<number_of_digits_per_multiplier_vector||remainder,++p,
			new_val=(multiplier_vector[p]<<24)+remainder;
			multiplier_vector[p]=new_val%converted_base;
			remainder=int(new_val/converted_base);
			if(multiplier,
				secondary_new_val=i[#-1,p]+multiplier_vector[p]*multiplier+secondary_remainder;
				i[#-1,p]=secondary_new_val%converted_base;
				secondary_remainder=int(secondary_new_val/converted_base);
			);
		);
		if(secondary_remainder,i[#-1,p]=secondary_remainder);
		number_of_digits_per_multiplier_vector=p;
	):(
		i[#-1,0]=i%converted_base;
		if(i>=converted_base,i[#-1,1]=int(i/converted_base););
		started=1;
	);

	"
rm[-2] crop 0,{for(p=w#-1-1,i[#-1,p]<=0,--p)}
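
A hedged usage sketch for the short-input path (fewer than 53 binary digits, which returns the value via the status), assuming this work-in-progress command is loaded, e.g. from a local .gmic file:

gmic rep_new_bin2dec 101101 echo ${}   # the returned status should be 45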

Of course, I still need to do more things, but the elapsed time tells me I might be able to speed up the conversion of a 1M-digit binary number to decimal by more than 5x. So it might even do that in 15 s. That’s slow, but much, much better than all the alternatives I tried.
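
For context on the sizes involved (my own back-of-the-envelope figures, not output of the filter): the scale_factor above is just the ratio of digit counts between base 2^24 and base 1e7, and a 1M-digit binary input corresponds to roughly 43k base-1e7 limbs, i.e. about 301k decimal digits.

gmic echo {[log(1<<24)/log(1e7),round(1e6*log(2)/log(1e7)),round(1e6*log10(2))]}
# -> ~1.03, 43004, 301030 (limbs per 24-bit chunk, limbs for 1M bits, decimal digits for 1M bits)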

OK, I tested it: execution time ranges from 10-15 s for 1M binary digits. The current public implementation takes 55 s with 96 GB of RAM. Yeah, it’s still bad, but all the other methods I tried are even worse than the current implementation; they take more than 3 minutes, and Double Dabble took more than 5 minutes. The new implementation is a more than 3x improvement over the current public implementation.

I made an interesting change. I was talking with a Python coder, and he had the idea of using a fractional base. So we chatted a bit and shared code.

Here’s Reverse Digits using base 1.27:

So, check out Reverse Digits again after update.

Made a slight twist to the Square Chair filter. You can see the linear gradient progression:



Even more work on the Square Chair:




A little update: I think this will take a little longer than expected, but I did do some things to make it easier.

It’s looking like I might break the previous record of 1800 input variables; the GUI filter will definitely use over 2000 input variables.

Taking a break from that thing. Went to revisit an old problem.

As @afre was in that thread, I’ll let him know it has been solved - Sort images according to N-Sized Channels

This can work on 20 channels too.

I do think it can be extended, though it’s slow.

@David_Tschumperle How would I extend this per axis, akin to what you did with autocrop?

#@cli rep_sort_colors:
#@cli : Sort pixels by their color.
rep_sort_colors:
m __$0_store_level:"
	if s>1
		N:=$!
		+channels {s-1}
		store[-$N--1] v$""1
		channels 0,{s-2}
	else
		store v$""1
	fi
	"

foreach {
	n,w,h,d,s,whd={n},{w},{h},{d},{s},{whd}
	mirror. c
	resize $whd,1,1,100%,-1

	repeat $s {
		foreach {
			+channels {s-1}
			pixelsort.. +,x,[-1]
			sort.
			+discard. x
			resize. 100%,1,1,2,-1

			start_value:=i[#-2,0]

			eval[-2] >"begin(
					test_value=$start_value;
					ix=ex=p=0;
				);
				i==test_value?(ex=x):(
					I[#-1,p++]=[ix,ex];
					ix=ex=x;
					test_value=i;
				);
				end(
					I[#-1,p]=[ix,ex];
				);
				"

			rm[-2]

			repeat w {
				+crop[0] {I[#1,$>]}
			}

			rm[0,1]
		}
		__$0_store_level $<
	}

	repeat $s {
		N:=$!
		${v$<}
		a[$N--1] x
	}

	a c

	resize $w,$h,$d,100%,-1
}
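
A quick usage sketch (assuming the command above is loaded, e.g. from my .gmic file); the output keeps the original dimensions, with the pixels rearranged by color:

gmic sp tiger rep_sort_colors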

I think I’m closer to finishing that now that I’ve figured out the merge-sort part. Hmm, back to the code derived from colormap 0: how come it’s slower than colormap 0 for large images with way more colors, yet more than twice as fast on sp cat and the like?

#@cli rep_uniq_cols: _preserve_frequency={ 0=false | 1=true },seed
#@cli : Find all colors, with the option to preserve frequency
#@cli : Default values: '_preserve_frequency=0'
rep_uniq_cols:
check isbool($1) _rep_colors_freqstat {$1-2}
#@cli rep_mode: fill_with_mode_color={ 0=false | 1=true },seed
#@cli : Fill image with the most common color.
rep_mode:
check isbool($1) _rep_colors_freqstat $1
_rep_colors_freqstat:
check "isint($1,-2,1)"

use_color_counts:=$1>-2

original_status=${}

bitshift_factor=10
buckets_per_threads,mt_dictionary_buckets:=[1,$_cpus]<<$bitshift_factor

if $use_color_counts
	m __$0_process_dictionary:"permute yzcx s c,-{s-1}"
else
	m __$0_process_dictionary:"permute yzcx"
fi

if isbool($1)
	highest_color_frequency_mode=1

	if $1 m __$0_header:"output_dimensions:=[w#-1,h#-1,d#-1,s#-1]"
	else  m __$0_header:"output_dimensions:=[1,1,1,s#-1]"
	fi

	m __$0_finalize_single_thread:"
		eval[-1] begin(m=0);if(i==iM#-1,++m;);end(merge(m,+);set('t_color',m););
		if $t_color>1 $output_dimensions,[{vector(#s#0,nan)}]
		else     $output_dimensions,[{I[#0,xM#1]}]
		fi
		keep[-1]"

	m __$0_finalize_multi_threads:"rm $output_dimensions,[${}]"
else
	highest_color_frequency_mode=0

	m __$0_header:"skip ,"

	if $use_color_counts m __$0_finalize_single_thread:"a[-2,-1] c"
	else  m __$0_finalize_single_thread:"skip ,"
	fi

	m __$0_finalize_multi_threads:"a[-]"
fi

foreach {
	spec_size:=s
	__$0_header
	mt_mode:=whd>0x8000

	if $mt_mode
		dictionary_buckets,p_mode=$mt_dictionary_buckets,:
	else
		dictionary_buckets,p_mode=$buckets_per_threads,>
	fi

	# Create Hashmap
	100%,100%,100%,1,$p_mode"
		begin_t(
			const bf=$bitshift_factor;
			seed=v(0x1234,0xabcd);
			offset=2+(t<<bf);
		);
		channels=I#0;
		hash_value=0;
		repeat($spec_size,p,
			hash_value=xor(((hash_value<<1)-hash_value)*seed,channels[p]);
			hash_value&=0xff;
		);
		hash_value+offset;"

	# Find Frequency of Colors

	0x$dictionary_buckets
	eval[0] $p_mode"
		begin_t(
			const use_color_counts=$use_color_counts;
			const bf=$bitshift_factor;
			const spec_size=$spec_size;
			const vector_insertion_size=spec_size+use_color_counts;
			sizes=found_vectors=vector(#$mt_dictionary_buckets);
			count_of_found_vector=0;
			last_color=vectors(nan);
			offset=2+(t<<bf);
			use_color_counts?(
				const dec_spec_size=spec_size-1;
				last_p=nan;
				process()=(
					(H?(p=same_color?last_p:find(#key,I,0,vector_insertion_size)+1))?(
						if(!same_color,p+=dec_spec_size;);
						++i[#key,p];
						last_p=p;
					):(
						new_size=H+vector_insertion_size;
						resize(#key,1,new_size,1,1,-1);
						copy(i(#key,0,H,0,0),[I,1]);
						++sizes[key-offset];
						last_p=new_size-1;
					);
				);
			):(
				process()=(
					!(H?(same_color?1:find(#key,I,0,vector_insertion_size)+1))?(
						new_size=H+vector_insertion_size;
						resize(#key,1,new_size,1,1,-1);
						copy(i(#key,0,H,0,0),I);
						++sizes[key-offset];
					);
				);
			);
		);
		key=i#1;
		H=h(#key);

		if(!H,
			found_vectors[count_of_found_vector++]=key;
		);

		same_color=same(I,last_color);
		process();
		last_color=I;
		I;

		end_t(
			repeat(count_of_found_vector,p,
				img_pos=found_vectors[p];
				resize(#img_pos,vector_insertion_size,sizes[img_pos-offset],1,1,-1);
			);
		);"

	rm[0,1]
	a y
	__$0_process_dictionary

	if !$mt_mode
		__$0_finalize_single_thread
		continue
	fi

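	# Multi-threaded path: re-hash each merged per-thread entry, then merge everything into fresh buckets in a single-threaded pass below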
	{w#0},1,1,1,"begin(
			const spec_size=$spec_size;
			const offset=1+$!;
			seed=v(0x1234,0xabcd);
		);
		channels=I#0;
		hash_value=0;
		repeat(spec_size,p,
			hash_value=xor(((hash_value<<1)-hash_value)*seed,channels[p]);
			hash_value&=0xff;
		);
		hash_value+offset;" => hashmap

	0x$buckets_per_threads

	if $highest_color_frequency_mode
		eval[0] >"begin(
				const hashmap=$hashmap;
				const step_s=s+1;
				const back_s=s-1;
				current_maximum_color=vectors(nan);
				maxima_a=maxima_b=0;
				chval(value)=if(value>=maxima_a,
					current_maximum_color=I;
					maxima_b=maxima_a;
					maxima_a=value;
				);
				last_color=vectors(nan);
				last_p=nan;
			);
			key=i#hashmap;
			H=h(#key);

			same_color=same(I,last_color);
			(H?(p=same_color?last_p:find(#key,I,0,step_s)+1))?(
				if(!same_color,p+=back_s;);
				current_number_of_color=i[#key,p]+=i#1;
				chval(current_number_of_color);
				last_p=p;
			):(
				new_size=H+step_s;
				resize(#key,1,new_size,1,1,-1);
				current_number_of_color=i#1;
				chval(current_number_of_color);
				copy(i(#key,0,H,0,0),[I,current_number_of_color]);
				last_p=new_size-1;
			);
			I;

			end(
				maxima_a>maxima_b?(
					set('{}',v2s(current_maximum_color));
				):(
					set('{}',v2s(vectors(nan)););
				);
			);"

		rm
		$output_dimensions,[${}]
	else
		eval[0] >"begin(
				const hashmap=$hashmap;
				const offset=hashmap+1;
				const spec_size=$spec_size;
				const use_color_counts=$use_color_counts;
				const vector_insertion_size=spec_size+use_color_counts;

				sizes=found_vectors=vector(#$buckets_per_threads);
				count_of_found_vector=0;
				last_color=vectors(nan);
				use_color_counts?(
					const color_counts_pos=hashmap-1;
					const dec_spec_size=spec_size-1;
					last_p=nan;
					process()=(
						(H?(p=same_color?last_p:find(#key,I,0,vector_insertion_size)+1))?(
							if(!same_color,p+=dec_spec_size;);
							i[#key,p]+=i#color_counts_pos;
							last_p=p;
						):(
							new_size=H+vector_insertion_size;
							resize(#key,1,new_size,1,1,-1);
							copy(i(#key,0,H,0,0),[I,i#color_counts_pos]);
							++sizes[key-offset];
							last_p=new_size-1;
						);
					);
				):(
					process()=(
						!(H?(p=same_color?1:find(#key,I,0,vector_insertion_size)+1))?(
							new_size=H+vector_insertion_size;
							resize(#key,1,new_size,1,1,-1);
							copy(i(#key,0,H,0,0),I);
							++sizes[key-offset];
						);
					);
				);
			);

			key=i#hashmap;
			H=h(#key);

			if(!H,
				found_vectors[count_of_found_vector++]=key;
			);

			same_color=same(I,last_color);
			process();
			last_color=I;
			I;

			end(
				repeat(count_of_found_vector,p,
					img_pos=found_vectors[p];
					resize(#img_pos,vector_insertion_size,sizes[img_pos-offset],1,1,-1);
				);
			);
			"

		rm[0-$hashmap]
		a y
		permute yzcx
	fi
}

status $original_status

um __$0_process_dictionary,__$0_header,__$0_finalize_single_thread,__$0_finalize_multi_threads

Where’s the bottleneck? The bottleneck is in eval[0], but why? I can only see that it has to do with the expensive resize operations; that’s my guess. David’s algorithm does not resize as much. Maybe I can remedy that.

Yeah, I was able to fix it. It’s faster on larger images with way more colors, and definitely much faster with 256 colors regardless of size. So, my code is an improvement based on David’s code. Now I’m using geometric growth for resizing the array, to avoid as many resizes as possible, just like how David implemented his colormap 0 code. Which makes me think about doing this for other cases where I use a dictionary structure.
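
The change is essentially the classic amortized-growth trick: only resize the dictionary image when it is actually full, doubling its height, and trim once at the end. A toy, self-contained illustration of the idea (not the actual rep_uniq_cols code; the variable name is mine):

# Append 100000 entries to a 1-column image, doubling its height only when it is
# full (~17 resizes instead of 100000), then trim once at the end.
1,1
eval "used=0;
  repeat(100000,k,
    if(used==h#0,resize(#0,1,h#0*2,1,1,-1));
    i[#0,used]=k;
    ++used;
  );
  resize(#0,1,used,1,1,-1)"
echo "entries: "{h}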

Real color JPG photo:

D:\Pictures\A Year Folders Pictures\2007 Folders Pictures\Recovered Photograph>gmic 100_1090.jpg v - tic colormap 0 toc v +
[gmic]./ Start G'MIC interpreter (v.3.5.0).
[gmic]./ Input file '100_1090.jpg' at position 0 (1 image 2855x1955x1x3).
[gmic]./ Elapsed time: 1.357 s.
[gmic]./ Display image [0] = '[colormap of 100_1090]'.
[0] = '[colormap of 100_1090]':
  size = (262638,1,1,3) [3077.8 Kio of float32].
  data = (0,1,0,1,1,0,0,2,0,1,1,0,2,0,2,0,3,0,1,0,1,2,3,0,2,3,0,2,1,3,0,0,4,1,0,4,4,0,3,3,0,2,4,2,1,2,0,0,3,0,3,4,4,0,5,5,1,0,4,0,3,5,3,4, ... ,232,227,227,226,225,239,239,236,235,234,234,234,241,240,232,231,231,233,229,229,229,227,238,238,238,235,235,233,231,230,229,237,240,237,234,233,243,240,237,240,236,235,235,235,233,232,239,239,239,236,236,241,238,238,234,241,241,238,236,236,240,237,237,239).
  min = 0, max = 255, mean = 113.888, std = 69.4318, coords_min = (0,0,0,0), coords_max = (179481,0,0,0).
[gmic]./ End G'MIC interpreter.
D:\Pictures\A Year Folders Pictures\2007 Folders Pictures\Recovered Photograph>gmic 100_1090.jpg tic rep_uniq_cols 0 toc
[gmic]./ Start G'MIC interpreter (v.3.5.0).
[gmic]./ Input file '100_1090.jpg' at position 0 (1 image 2855x1955x1x3).
[gmic]./ Initialize timer.
[gmic]./ Elapsed time: 1.176 s.
[gmic]./ Display image [0] = '[empty]'.
[0] = '[empty]':
  size = (262638,1,1,3) [3077.8 Kio of float32].
  data = (0,237,101,152,153,83,158,98,147,188,222,173,255,79,188,194,32,104,59,182,12,126,66,11,213,87,124,162,198,94,217,119,186,68,50,227,104,215,172,35,18,214,156,59,171,101,169,164,199,16,43,102,149,160,78,146,164,136,181,26,22,24,73,184, ... ,27,27,19,47,51,11,27,7,35,7,15,39,99,23,27,15,35,47,11,83,67,31,31,31,23,27,23,87,71,79,67,51,55,23,15,31,59,75,27,55,71,91,39,19,67,15,79,55,71,63,19,63,47,43,31,11,31,15,35,27,31,19,27,43).
  min = 0, max = 255, mean = 113.888, std = 69.4318, coords_min = (0,0,0,0), coords_max = (12,0,0,0).
[gmic]./ End G'MIC interpreter.

Sample Cat Result (256 colors on 600x550):

D:\Pictures\A Year Folders Pictures\2007 Folders Pictures\Recovered Photograph>gmic sp cat tic colormap 0 toc
[gmic]./ Start G'MIC interpreter (v.3.5.0).
[gmic]./ Input sample image 'cat' (1 image 600x550x1x3).
[gmic]./ Initialize timer.
[gmic]./ Estimate full colormap for image [0], sorted by increasing norm.
[gmic]./ Elapsed time: 0.09 s.
[gmic]./ Display image [0] = '[colormap of cat]'.
[0] = '[colormap of cat]':
  size = (256,1,1,3) [3072 b of float32].
  data = (4,16,30,24,43,42,56,60,54,72,69,79,74,65,87,87,49,87,100,77,98,97,111,98,112,87,108,122,123,108,121,131,97,119,133,131,116,140,143,134,127,141,149,91,115,148,125,144,153,141,155,152,139,161,135,151,161,156,150,161,134,160,167,131, ... ,102,62,140,154,128,86,109,51,115,99,69,102,128,122,158,79,146,133,92,165,113,67,139,105,85,158,126,169,136,147,115,100,176,140,87,149,161,131,182,117,105,157,151,174,165,189,117,129,143,174,184,165,196,137,181,153,191,202,165,199,207,185,223,214).
  min = 4, max = 244, mean = 118.258, std = 66.143, coords_min = (0,0,0,0), coords_max = (245,0,0,0).
[gmic]./ End G'MIC interpreter.
D:\Pictures\A Year Folders Pictures\2007 Folders Pictures\Recovered Photograph>gmic sp cat tic rep_uniq_cols 0 toc
[gmic]./ Start G'MIC interpreter (v.3.5.0).
[gmic]./ Input sample image 'cat' (1 image 600x550x1x3).
[gmic]./ Initialize timer.
[gmic]./ Elapsed time: 0.044 s.
[gmic]./ Display image [0] = '[empty]'.
[0] = '[empty]':
  size = (256,1,1,3) [3072 b of float32].
  data = (157,231,134,171,197,150,188,212,112,165,131,202,219,98,206,167,205,168,212,174,49,234,87,167,198,189,182,161,184,56,162,211,74,240,192,91,139,243,140,198,24,219,188,210,144,182,238,182,127,203,212,219,196,159,210,195,198,158,188,228,189,217,211,184, ... ,13,115,14,149,25,12,5,110,35,17,4,169,8,52,21,161,174,131,51,55,103,31,140,165,19,7,46,51,75,14,49,112,117,77,15,40,4,105,40,31,80,4,51,21,6,53,133,104,30,9,4,17,35,105,102,90,25,31,15,79,25,41,8,8).
  min = 4, max = 244, mean = 118.258, std = 66.143, coords_min = (170,0,0,0), coords_max = (67,0,0,0).
[gmic]./ End G'MIC interpreter.

It’s not much faster, but hey, the base code allows for other things too.

In a more extreme case, it is definitely faster because of the use of multi-threading:

D:\Pictures\A Year Folders Pictures\2007 Folders Pictures\Recovered Photograph>gmic sp cat,5000 tic colormap 0 toc
[gmic]./ Start G'MIC interpreter (v.3.5.0).
[gmic]./ Input sample image 'cat' (1 image 5000x4583x1x3).
[gmic]./ Initialize timer.
[gmic]./ Estimate full colormap for image [0], sorted by increasing norm.
[gmic]./ Elapsed time: 3.114 s.
[gmic]./ Display image [0] = '[colormap of cat]'.
[0] = '[colormap of cat]':
  size = (256,1,1,3) [3072 b of float32].
  data = (4,16,30,24,43,42,56,60,54,72,69,79,74,65,87,87,49,87,100,77,98,97,111,98,112,87,108,122,123,108,121,131,97,119,133,131,116,140,143,134,127,141,149,91,115,148,125,144,153,141,155,152,139,161,135,151,161,156,150,161,134,160,167,131, ... ,102,62,140,154,128,86,109,51,115,99,69,102,128,122,158,79,146,133,92,165,113,67,139,105,85,158,126,169,136,147,115,100,176,140,87,149,161,131,182,117,105,157,151,174,165,189,117,129,143,174,184,165,196,137,181,153,191,202,165,199,207,185,223,214).
  min = 4, max = 244, mean = 118.258, std = 66.143, coords_min = (0,0,0,0), coords_max = (245,0,0,0).
[gmic]./ End G'MIC interpreter.

D:\Pictures\A Year Folders Pictures\2007 Folders Pictures\Recovered Photograph>gmic sp cat,5000 tic rep_uniq_cols 0 toc
[gmic]./ Start G'MIC interpreter (v.3.5.0).
[gmic]./ Input sample image 'cat' (1 image 5000x4583x1x3).
[gmic]./ Initialize timer.
[gmic]./ Elapsed time: 0.752 s.
[gmic]./ Display image [0] = '[empty]'.
[0] = '[empty]':
  size = (256,1,1,3) [3072 b of float32].
  data = (194,125,151,119,194,193,101,169,131,142,87,133,184,167,142,212,65,174,195,177,163,4,223,227,217,171,174,219,141,188,177,219,185,205,74,220,189,229,210,243,116,168,188,214,154,201,206,177,152,42,167,142,183,162,210,244,54,167,150,173,98,212,235,166, ... ,53,223,128,51,60,28,31,14,95,54,37,40,62,109,214,19,39,117,165,24,146,106,54,15,40,207,5,77,19,26,16,161,153,21,61,5,143,99,5,94,56,47,12,140,68,42,11,5,102,128,100,85,7,4,115,32,41,95,61,69,51,174,45,7).
  min = 4, max = 244, mean = 118.258, std = 66.143, coords_min = (21,0,0,0), coords_max = (55,0,0,0).
[gmic]./ End G'MIC interpreter.

Note to self: avoid excessive resize() in the single-threaded merging of the multiple dictionaries. That seems to be the reason why the first result doesn’t show a difference.

[gmic]./rep_uniq_cols/_rep_colors_freqstat/*foreach#115/ Elapsed time: 0.3 s.
[gmic]./rep_uniq_cols/_rep_colors_freqstat/*foreach#115/*if#229/ Elapsed time: 0.759 s.

The first part is multiple threads each using their own dictionary. The second part is the single-threaded merging of those dictionaries. So, all I need to do is avoid excessive resize() and I get a huge boost in speed.

EDIT:

OK, I did that, and the overall execution time was reduced by more than half when applied to a 2855x1955 8-bit JPG image with 262638 colors.

[gmic]./rep_uniq_cols/_rep_colors_freqstat/*foreach#116/ Elapsed time: 0.306 s.
[gmic]./rep_uniq_cols/_rep_colors_freqstat/*foreach#116/*if#231/ Elapsed time: 0.241 s.

So, the total execution time is 0.547 s, which is better than the 1.147 s provided by colormap 0.

On sample cat, the total execution time is 1/3 that of colormap 0. With sp cat,5000, my version is 8x faster.

A brief conclusion of my G’MIC coding last year.

Well, I did successfully improve a few of my old scripts: they execute faster and are more readable. But I haven’t really done much other than refactoring and improving, and I’m still at that stage.

The only code I pushed lately is for Color Modulo Texture, which is now ~20% faster. That’s not much. But then again, all I’m doing is making code faster and building “library” code to make my refactoring easier.

Also, I lowered the priority of G’MIC coding at some point.


@David_Tschumperle I can’t seem to find the thread about AI coding in G’MIC, and I know you posted one. So now I feel like experimenting with it a bit before calling it a night.

EDIT: Found it; I had to search for the less memorable term nn_lib rather than AI.

Happy new year. :clinking_glasses: Optimization is an important task. Great to see that you made inroads. nn_lib is more searchable than AI. :wink:

I went back to my old code, and it seems like I have a new version of Fibonacci coming up.

And the code is very readable too (not the current code!):

#@cli new_rep_fibonacci_fill: fibonacci_form={ 0=corner | 1=spiral},stacking={ 0=Right-Down/Down-Right | 1=Left-Down/Down-Left | 2=Right-Up/Up-Right | 3=Left-Up/Up-Left },orientation,start_square_size>=1,border_size>=0,0<=_pos_x<=1,0<=_pos_y<=1
new_rep_fibonacci_fill:
skip ${6=50%},${7=50%}

check "isbool($1)      &&
       isint($2, 0, 3) &&
       isbool($3)      &&
       isint($4,0)     &&
       isint($5,0)      "

if $1
	check "isfinite($6) &&
	       isfinite($7)  "
	pos_x,pos_y=$6,$7
	if !ispercentage($6) pos_x:=($6+1)/2 fi
	if !ispercentage($7) pos_y:=($7+1)/2 fi
fi

foreach {
	od:=w,h rm
	$od

	eval "
		const mode_spiral = 1;
		const fibonacci_form = $1;
		const stacking = $2;
		const orientation = $3;
		const start_size = int($4);
		const border_size = int($5);
		const reverse_append_x = stacking & 1;
		const reverse_append_y = stacking > 1;
		const X = w#-1 - 1;
		const Y = h#-1 - 1;

		fibonacci_form==mode_spiral?(
			area_to_be_filled = wh#-1;
			const start_offset = start_size >> 1;
			x_pos = int(X * "$pos_x" ) - start_offset;
			y_pos = int(Y * "$pos_y" ) - start_offset;
		):(
			area_to_be_filled = max( (w#-1 - (!reverse_append_x ? border_size)  ) , 0 ) * max( (h#-1 - (!reverse_append_y ? border_size)) , 0 ) ;
			x_pos = reverse_append_x? X - border_size : border_size;
			y_pos = reverse_append_y? Y - border_size : border_size;
		);
		value = bool(border_size);

		square_coordinates(x_coords, y_coords, size) = (
			off = size - 1;
			[
				x_coords, y_coords,              x_coords + off, y_coords,
				x_coords + off, y_coords + off,  x_coords, y_coords + off
			];
		);

		calc_coordinates_area(arg_vector)=(
			ref(arg_vector,test_vector);
			coordinates_area = ( test_vector[2] - test_vector[0] + 1) * ( test_vector[7] - test_vector[1] + 1 );
			unref(test_vector);
			coordinates_area;
		);

		cut_coordinates(arg_vector) = (
			ref(arg_vector,vector2trim);
			vector2trim = vmin(vmax(0, vector2trim), [X, Y, X, Y, X, Y, X, Y]);
			unref(vector2trim);
		);

		detect_if_quadrilateral_intersect_with_image_boundary(var_coords) = (
			ref(var_coords,coords_to_test);
			out=!(coords_to_test[0] > X ||
			      coords_to_test[1] > Y ||
			      coords_to_test[4] < 0 ||
			      coords_to_test[5] < 0);
			unref(coords_to_test);
			out;
		);

		last_size = 0;
		current_size = start_size + border_size;

		temp=fibonacci_fill_area=square_coordinates(x_pos,y_pos,current_size);
		if(detect_if_quadrilateral_intersect_with_image_boundary(temp),
			if(border_size,polygon(#-1,4,square_coordinates(x_pos,y_pos,start_size),1,1););
			cut_coordinates(temp);
			area_to_be_filled-=calc_coordinates_area(temp);
		);
		unref(temp);

		++value;
		border_size ? (
			attempt_insert_fibonacci_square_into_image(x_coords, y_coords, size)=(
				test_coordinates = square_coordinates(x_coords, y_coords, size) ;

				if(detect_if_quadrilateral_intersect_with_image_boundary(test_coordinates),
					cut_coordinates(test_coordinates);

					area_to_be_filled -= calc_coordinates_area(test_coordinates) ;

					square_to_insert = square_coordinates(x_coords, y_coords, size - border_size);

					if(detect_if_quadrilateral_intersect_with_image_boundary(square_to_insert),
						cut_coordinates(square_to_insert) ;
						polygon(#-1,4,square_to_insert,1,value) ;
					);
				);

				last_size = current_size ;
				current_size = size ;

				++value ;
			);
		):(
			attempt_insert_fibonacci_square_into_image(x_coords, y_coords, size)=(
				square = square_coordinates(x_coords, y_coords, size) ;

				if(detect_if_quadrilateral_intersect_with_image_boundary(square),
					cut_coordinates(square);
					area_to_be_filled -= calc_coordinates_area(square) ;
					polygon(#-1,4,square,1,value);
				);

				last_size = current_size ;
				current_size = size ;

				++value;
			);
		);

		new_size()=(
			new_size = last_size + current_size;
		);

		insert_north()=(
			new_size();
			fibonacci_fill_area[1] -= new_size;
			fibonacci_fill_area[3] -= new_size;
			attempt_insert_fibonacci_square_into_image(fibonacci_fill_area[0],fibonacci_fill_area[1],new_size);
		);

		insert_south()=(
			new_size();
			insertion_x = fibonacci_fill_area[6];
			insertion_y = fibonacci_fill_area[7] + 1;
			fibonacci_fill_area[5] += new_size;
			fibonacci_fill_area[7] += new_size;
			attempt_insert_fibonacci_square_into_image(insertion_x,insertion_y,new_size);
		);

		insert_west()=(
			new_size();
			fibonacci_fill_area[0] -= new_size;
			fibonacci_fill_area[6] -= new_size;
			attempt_insert_fibonacci_square_into_image(fibonacci_fill_area[0],fibonacci_fill_area[1],new_size);
		);

		insert_east()=(
			new_size();
			insertion_x = fibonacci_fill_area[2] + 1;
			insertion_y = fibonacci_fill_area[3];
			fibonacci_fill_area[2] += new_size;
			fibonacci_fill_area[4] += new_size;
			attempt_insert_fibonacci_square_into_image(insertion_x,insertion_y,new_size);
		);

		fibonacci_form==mode_spiral?(
			reverse_append_x?(
				x0() = insert_east();
				x1() = insert_west();
			):(
				x0() = insert_west();
				x1() = insert_east();
			);

			reverse_append_y?(
				y0() = insert_south();
				y1() = insert_north();
			):(
				y0() = insert_south();
				y1() = insert_north();
			);

			orientation?(
				step_0() = y1();
				step_1() = x0();
				step_2() = y0();
				step_3() = x1();
			):(
				step_0() = x0();
				step_1() = y0();
				step_2() = x1();
				step_3() = y1();
			);

			step=0;

			process()=(
				step == 3 ?( step_3() ):
				step == 2 ?( step_2() ):
				step == 1 ?( step_1() ):
				             step_0()  ;
				if(++step==4,step=0;);
			);
		):(
			reverse_append_x?(
				step_0()=insert_west();
			):(
				step_0()=insert_east();
			);

			reverse_append_y?(
				step_1()=insert_north();
			):(
				step_1()=insert_south();
			);

			step=orientation;

			process()=(
				step?(step_1()):(step_0());
				step=!step;
			);
		);

		while(area_to_be_filled,
			process();
		);
		"
}
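
A hypothetical invocation of the draft above (untested against this exact revision; arguments follow the #@cli line: form, stacking, orientation, start square size, border size), normalized for display:

gmic 400,400 new_rep_fibonacci_fill 0,0,0,8,2 n 0,255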

Would be fun if there was an option to have Fibonacci fills within the Fibonacci fills, or make them a long staircase downward. :stuck_out_tongue:

I think this can be done. You’d have to start with a larger start size, extract the blocks of non-zero values, call the filter on each block, then insert them back with the image command. It’s out of scope though; it’s a nice idea.

I played a little with your previous version; I can say that the outline makes it look a lot better (at least to me). It would also be nice if we could choose a single color (instead of palettes) but still retain the gradient look:


You should do it :slight_smile: It would be nice to play with this. And have each sequence filled with a different color shade?
Hell, it would be nice in 3d too.


Yes, it does look better with a variable start square size and an outline. And yes, I will add more options than in the previous version.

What I might do is a separate filter which lets you use a layer with grayscale values, where the grayscale value is used to map the image; then you could move points around to shift the image position, scale it, and so on. That way, you can stack several Fibonacci images together.

I thought about this. Well, you can’t stack equal-sized cubes together the way you can with squares, so a compromise would be stacking different rectangular prisms. It seems like a harder version of the Fibonacci work. There’s also the Padovan sequence, which can be generated by this code:

$ echo ${-rep_padavon_sequence\ 22}
[gmic_math_parser] sequence = [ 1,1,1 ] (size: 3)
[gmic_math_parser] current_value = 2
[gmic_math_parser] current_value = 2
[gmic_math_parser] current_value = 3
[gmic_math_parser] current_value = 4
[gmic_math_parser] current_value = 5
[gmic_math_parser] current_value = 7
[gmic_math_parser] current_value = 9
[gmic_math_parser] current_value = 12
[gmic_math_parser] current_value = 16
[gmic_math_parser] current_value = 21
[gmic_math_parser] current_value = 28
[gmic_math_parser] current_value = 37
[gmic_math_parser] current_value = 49
[gmic_math_parser] current_value = 65
[gmic_math_parser] current_value = 86
[gmic_math_parser] current_value = 114
[gmic_math_parser] current_value = 151
[gmic_math_parser] current_value = 200
[gmic_math_parser] current_value = 265

#----------------------

rep_padavon_sequence:
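# Padovan recurrence: P(n) = P(n-2) + P(n-3), starting from [1,1,1]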
eval "
	sequence=[1,1,1];
	print(sequence);
	repeat($1-3,
		current_value=sequence[0]+sequence[1];
		copy(sequence[0],sequence[1],2,1,1);
		sequence[2]=current_value;
		print(current_value);
	);
	"