A guide about computing the temporal average/median of video frames with G'MIC

(David Tschumperlé) #1

I’d like to sum up some of the coding work I’ve done over the last few days on G’MIC, about computing the temporal average and median of video sequences. Video frame averaging is something @patdavid has been interested in for quite a long time now (see his interesting blog post about it), and I’ve already tried to make him use G’MIC instead of FFMPEG+ImageMagick for this task :slight_smile: , by writing a G’MIC script that does the job for him. Time has passed, and there are now officially supported commands in the G’MIC standard library to do that, at least in recent versions (commands -average_video and -average_files).

The Smashing Pumpkins - “Tonight, Tonight” averaged (6475 frames)

@patdavid asked me once if I could do the same for computing the temporal median (pixel by pixel) instead of the average. Averaging is an operation that can be done easily with streaming videos: each time you load a new frame, you just add it to an accumulation buffer, and then you can deallocate the frame buffer immediately. At the end, you divide the accumulation buffer by the number of added frames and you’re done.
But computing a pointwise median in a streamed way is far more complex: estimating the median usually requires sorting all the temporal values for each pixel, and this is something you can only do once all the temporal data has been read. For long videos, that means a lot of data (e.g. approx. 5600 frames for a 3:40 video, which quickly fills all your available memory, even with an SD video!).
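To make the streamed averaging described above concrete, here is a minimal Python sketch. It is only an illustration, not G’MIC’s actual code: the function name `streamed_average` is made up, and each frame is assumed to be a flat list of pixel values of the same length.

```python
def streamed_average(frames):
    """Average equally sized frames while holding only one frame at a time."""
    acc = None      # accumulation buffer (one float per pixel)
    count = 0       # number of frames added so far
    for frame in frames:
        if acc is None:
            acc = [0.0] * len(frame)
        for i, value in enumerate(frame):
            acc[i] += value
        count += 1  # the frame itself can be discarded after this point
    if count == 0:
        raise ValueError("no frames to average")
    # Final step: divide the accumulated sums by the number of frames.
    return [total / count for total in acc]


# Toy "video": three frames of three pixels each, read as a stream.
frames = ([10, 20, 30], [20, 40, 60], [30, 60, 90])
average = streamed_average(iter(frames))
print(average)
```

Memory use here is one accumulation buffer plus the current frame, whatever the length of the video.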

I’m happy to say I’ve finally found two simple solutions to this problem, and thus I can proudly announce that two new commands, -median_video and -median_files, have been added to the G’MIC standard library. They work exactly like -average_video and -average_files, but they compute the temporal median instead of the temporal average.

The Smashing Pumpkins - “Tonight, Tonight” pixelwise median (6475 frames)

What are the tricks? Nothing really complicated:

  1. Instead of loading all the video frames into memory at once, I load only a few rows of each frame so that the memory doesn’t fill up. This is done iteratively: the more memory you have, the more rows you can read in one iteration. The median is then computed for all these rows, the result is saved in a buffer, and the process starts again with the next rows.

  2. Now comes the real trick: if you consider that your video data is encoded as integers with no more than 8 bits per channel (which is very often the case), then there is a nice optimization for computing the median. In this case, all we need is to estimate the temporal histogram of the data for each pixel of the video frame. So basically, we have to allocate a width x height x 256 x nb_channels buffer that will contain our histogram. The median can then be computed from the cumulative histogram (which is simple to compute, and can be done in-place without additional memory overhead), see this page for a math description of how it works. And the second really cool thing is that this histogram can be computed in a ‘streamed’ way: a histogram just counts how many times each value appears in the video, so this counting can be done frame by frame, and only a single frame at a time has to be kept in memory (+ the accumulation buffer).
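The histogram trick of point 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the G’MIC implementation: the name `streamed_median_8bit` is invented, frames are flat lists of integers in [0,255], and for an even number of frames the sketch returns the lower of the two middle values (the convention used by G’MIC may differ).

```python
def streamed_median_8bit(frames, levels=256):
    """Per-pixel temporal median of integer frames, computed from histograms."""
    hist = None   # one histogram of `levels` counters per pixel
    count = 0     # number of frames seen
    for frame in frames:
        if hist is None:
            hist = [[0] * levels for _ in frame]
        for i, value in enumerate(frame):
            hist[i][value] += 1   # streamed update: only counters are kept
        count += 1
    if count == 0:
        raise ValueError("no frames")
    # The (lower) median is the smallest value whose cumulative count
    # reaches half of the number of frames.
    target = (count + 1) // 2
    result = []
    for pixel_hist in hist:
        cumulative = 0
        for value, n in enumerate(pixel_hist):
            cumulative += n
            if cumulative >= target:
                result.append(value)
                break
    return result


# Toy "video": five frames of two pixels each.
frames = [[0, 200], [255, 10], [3, 10], [7, 90], [5, 30]]
median = streamed_median_8bit(iter(frames))
print(median)
```

The histogram buffer has a fixed size (number of pixels x 256), so memory use no longer depends on the number of frames.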

So, that is exactly what has been done in the newly added commands -median_video and -median_files.
And of course, the commands take advantage of multiple CPU cores when possible :slight_smile:
Let me show you how to use these commands now.

You need version 1.7.2 or later of the G’MIC command line tool gmic installed on your system (if this is not the case, download it now!).
Here is a quick look at the help for the commands -average_video and -median_video:

$ gmic -h average_video

                        input_filename,_first_frame>=0,_last_frame={ >=0 | -1=last },_frame_step>=1,

        Compute the average frame of a video file.
        If a display window is opened, the frames are displayed in it during processing.
        Default values: 'first_frame=0', 'last_frame=-1', 'frame_step=1' and 


$ gmic -h median_video

                        input_filename,_first_frame>=0,_last_frame={ >=0 | -1=last },_frame_step>=1,
                          _frame_rows[%]>=1,_is_fast_approximation={ 0 | 1 }

        Compute the median frame of a video file, doing it by blocks of rows to save memory usage.
        If a display window is opened, the frames are displayed in it during processing.
        Default values: 'first_frame=0', 'last_frame=-1', 'frame_step=1', 'frame_rows=100%' and 

Now, if you have a video file, you are ready to go. For the example, I’ve chosen this video clip from Pink, which already has a respectable size of 1280x720 (for 5972 frames). First, make sure you get the latest version of the G’MIC standard library:

$ gmic -update

And now, let’s go:

$ gmic -average_video pink.mp4 -n 0,255 -o average.jpg
$ gmic -median_video pink.mp4 -n 0,255 -o median.jpg

As you may have noticed, I’m also normalizing the result to the range [0,255] to be sure I don’t lose too much of the image’s dynamic range. After a few seconds, I get these results (it may take a few minutes for a longer/bigger video, of course).

Comparison of average/median frame from the video clip of P!nk - Try.

Of course, you can also play with the parameters of the commands. For instance, you can compute the average/median only for a subset of video frames (here, from frame 500 to frame 1499):

$ gmic -average_video pink.mp4,500,1499 -n 0,255 -o sub_average.jpg
$ gmic -median_video pink.mp4,500,1499 -n 0,255 -o sub_median.jpg

Comparison of average/median frame from a subset of the video clip of P!nk - Try (frames 500->1499).

If you don’t have a lot of memory, it may be safer to reduce the number of rows processed simultaneously for the median computation (you should never have issues with averaging though):

$ gmic -median_video pink.mp4,500,1499,1,20% -n 0,255 -o sub_median.jpg
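To illustrate why a smaller row block lowers memory use (20% means the frames are traversed in 5 passes), here is a hedged Python sketch of median computation by blocks of rows. It is not G’MIC’s code: `median_by_row_blocks` and `load_rows` are hypothetical names, and `load_rows(t, y0, y1)` is assumed to return only rows y0..y1-1 of frame t.

```python
from statistics import median_low


def median_by_row_blocks(load_rows, n_frames, height, block_rows):
    """Temporal median computed band by band, to bound memory usage."""
    result = []
    for y0 in range(0, height, block_rows):
        y1 = min(y0 + block_rows, height)
        # Only this band of rows, for every frame, is held in memory at once.
        bands = [load_rows(t, y0, y1) for t in range(n_frames)]
        for r in range(y1 - y0):
            width = len(bands[0][r])
            result.append([
                median_low([bands[t][r][x] for t in range(n_frames)])
                for x in range(width)
            ])
    return result


# Toy "video": 3 frames of 4 rows x 2 columns.
video = [
    [[1, 9], [2, 8], [3, 7], [4, 6]],
    [[5, 5], [6, 4], [7, 3], [8, 2]],
    [[9, 1], [1, 9], [2, 8], [3, 7]],
]


def load_rows(t, y0, y1):
    # Stand-in for a partial read of frame t from disk.
    return video[t][y0:y1]


blocked = median_by_row_blocks(load_rows, 3, 4, block_rows=1)  # 4 passes
whole = median_by_row_blocks(load_rows, 3, 4, block_rows=4)    # 1 pass
print(blocked == whole, blocked)
```

The result is the same whatever the block size; only the peak memory and the number of passes over the video change.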

You may want to preview the process. In this case, just add a -w (short for -window) before invoking -average_video or -median_video:

$ gmic -w -average_video pink.mp4,500,1499 -n 0,255 -o sub_average.jpg
$ gmic -w -median_video pink.mp4,500,1499,1,20% -n 0,255 -o sub_median.jpg

I’ve described how to proceed with a single video file. Now, maybe you have your video frames already decomposed as a set of distinct image files. No problem, just use -average_files and -median_files instead:

$ gmic -w -average_files \*.jpg,500,1499 -n 0,255 -o sub_average.jpg
$ gmic -w -median_files frame_\*.png,500,1499,1,20% -n 0,255 -o sub_median.jpg

(on the default Windows terminal, I guess you won’t need to escape the *, so write frame_*.png rather than frame_\*.png).

It should be as simple as that! I’m planning to add an additional parameter to these commands, allowing you to specify a processing pipeline to be applied to each streamed frame before computing the average/median. I guess this could be useful to get even more details in the final result (for instance, if we apply the Freaky details filter to the streamed frames rather than to the final result).

That’s it for now. Any thoughts?

Maybe the following idea is total BS, but couldn’t this method even work for float data if you use a 2-pass or n-pass approach? In the first pass, use a coarse quantization and determine which histogram bin holds the median. Then, use only the image data belonging to this bin, together with the counts of values above and below it, to compute the exact median from a much smaller data set (either using some sparse array mapping or by just iterating again and throwing unused data away). If that is still too much data, it could be reduced further by repeating step 1 on that bin only. Again, this may be total nonsense, but it just came to my mind.
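A rough Python sketch of this 2-pass idea, for the temporal values of a single pixel. It rests on assumptions that are not part of the post: the name `two_pass_median` is invented, the value range [lo, hi] is known in advance, the data can be read twice through `read_values()`, and the lower median is returned for an even count.

```python
def two_pass_median(read_values, lo, hi, bins=16):
    """Exact median of float data using a coarse histogram, then a refinement."""
    if hi == lo:
        return lo   # all values are identical
    width = (hi - lo) / bins

    def bin_of(v):
        # Values equal to `hi` are clamped into the last bin.
        return min(int((v - lo) / width), bins - 1)

    # Pass 1: coarse histogram, streamed (no value is stored).
    hist = [0] * bins
    n = 0
    for v in read_values():
        hist[bin_of(v)] += 1
        n += 1
    if n == 0:
        raise ValueError("no values")
    rank = (n - 1) // 2   # 0-based rank of the (lower) median
    below = 0             # number of values in the bins before the median bin
    for b, c in enumerate(hist):
        if below + c > rank:
            break
        below += c

    # Pass 2: keep only the values that fall into the median bin.
    kept = sorted(v for v in read_values() if bin_of(v) == b)
    return kept[rank - below]


values = [0.12, 0.95, 0.40, 0.41, 0.39, 0.05, 0.77]
m = two_pass_median(lambda: iter(values), 0.0, 1.0)
print(m)
```

Only the values of one bin are ever stored; if that bin is still too large, the same two steps could be repeated on its sub-range.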

(Jan Moren) #3

As an optional extra, you could also return the index of the “median input frame”, defined as the frame with the minimal pixel-wise squared distance to the median you calculate here.
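A possible sketch of that extra, in Python. The function name `closest_frame_index` is hypothetical, frames are assumed to be flat lists of pixel values, and the median frame is assumed to have been computed beforehand, so this amounts to one more streamed pass over the video.

```python
def closest_frame_index(frames, median_frame):
    """Index of the frame with the smallest sum of squared differences."""
    best_index, best_dist = -1, float("inf")
    for index, frame in enumerate(frames):
        dist = sum((a - b) ** 2 for a, b in zip(frame, median_frame))
        if dist < best_dist:   # ties keep the earliest frame
            best_index, best_dist = index, dist
    return best_index


frames = [[0, 200], [255, 10], [3, 10], [7, 90]]
median_frame = [5, 30]   # assumed to come from a previous median pass
index = closest_frame_index(iter(frames), median_frame)
print(index)
```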

(Pat David) #4

Normally I find the median to be not as interesting or visually pleasing as the mean. That is, until I tried to process this mean blend of all of the U.S. Presidential portraits a little further:

While I like the mean blend, I started fiddling years ago with blending (pun intended) a median over the image as well. I ended up including a median over the mean in “Value” blend mode in GIMP:

I think the addition of the harder edges in this added an almost painterly feel to the result which was much more pleasing to me personally.

Looking forward to trying this on some other videos! Thank you so much @David_Tschumperle for all the hard work! :smiley:

(Pat David) #5

What exactly is going on with the command after *.png? Are those frame start/end points? Also, what is 1,20%?

(David Tschumperlé) #6

Pat, according to the documentation of the arguments (see gmic -h median_video):

                    input_filename,_first_frame>=0,_last_frame={ >=0 | -1=last },_frame_step>=1,
                      _frame_rows[%]>=1,_is_fast_approximation={ 0 | 1 }

In G’MIC, command arguments are always separated by commas (,).
So the first argument is the filename, the second (optional) argument is the index of the first frame (500 in my example), the third is the index of the last frame (1499 in my example), the fourth is the frame step (1 in my example), the fifth is the number of frame rows processed during a single iteration (it can be specified as a percentage of the frame height, e.g. 20% in my example, which means you’ll need 5 iterations to complete), and the last argument tells whether the fast approximation algorithm is used (1 by default).
Invoking -h median_video also tells you about the values of the default arguments used when they are not specified.

Note that you can also specify, for instance, the argument frame_rows without having to specify the other arguments, like this:

$ gmic -w -median_files frame_\*.png,,,,50% -n 0,255 -o sub_median.jpg

Isn’t that simple? :smiley:

(Pat David) #7

Look - you simply can’t expect me to be going around reading documentation that answers my questions! I’m a busy guy! I’ve got things happening! :smiley: :smiley: :smiley:

(David Tschumperlé) #8

I know, I know. All I expect is that you finally admit the G’MIC syntax is quite logical after all :smiley:

(Ismaël Joffroy Chandoutis) #9

Hey there,
Thank you for this great guide!!
I’m currently looking to mimic analog (film) techniques like multiple exposure and long exposure with digital animated video and automation techniques. Do you think we can achieve this with averaging techniques and blend modes? I liked this animation here: https://vimeo.com/81126520, but I’m looking for something more dynamic (see example below).

For the long exposure effect, I’m looking to animate video in a way similar to this: https://www.youtube.com/watch?v=nsjDnYxJ0bo (I know it is a GAN generation, a very different way to obtain it, but I find the result pretty similar and more lively, despite the fact that it is low-res).

And for multiple exposures, I’m looking to replicate this

The walkthrough is recorded 100 times on the same reel, at different exposures.

Do you think we can achieve that with automation?

Sorry for my English (you know French people…) and thank you for your advice and tips :wink: