I’d like to sum up some of the coding work I’ve done these last few days on G’MIC, about the computation of the temporal average and median of video sequences. Video frame averaging is something @patdavid has been interested in for quite a long time now (see his interesting blog post about it), and I’ve already tried to get him to use G’MIC instead of FFMPEG+ImageMagick for this task, by writing a G’MIC script that does the job for him. Time has passed, and there are now officially supported commands in the G’MIC standard library to do that, at least in recent versions (commands -average_video and -average_files).
The Smashing Pumpkins - “Tonight, Tonight” averaged (6475 frames)
@patdavid asked me once if I could do the same for computing the temporal median (pixel by pixel) instead of the average. Averaging is an operation that can be done easily with streaming videos: each time you load a new frame, you just add it to an accumulation buffer, and then you can deallocate the frame buffer immediately. At the end, you divide the accumulation buffer by the number of added frames and you’re done.
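The streaming average loop described above can be sketched in a few lines of Python/NumPy (a minimal illustration of the idea, not the actual G’MIC implementation; the `frames` iterable stands in for whatever video reader you use):

```python
import numpy as np

def streaming_average(frames):
    """Average frames one at a time: only the current frame plus the
    accumulation buffer ever need to be held in memory."""
    acc = None
    count = 0
    for frame in frames:            # iterable of H x W x C arrays
        f = frame.astype(np.float64)
        acc = f if acc is None else acc + f  # add to accumulator
        count += 1                  # remember how many frames were added
    return acc / count              # divide once, at the end

# Tiny example with two 1x1x1 "frames" of values 0 and 10:
frames = [np.array([[[0.0]]]), np.array([[[10.0]]])]
print(streaming_average(frames)[0, 0, 0])  # -> 5.0
```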
But computing a pointwise median in a streamed way is far more complex: estimating the median usually requires sorting all the temporal values for each pixel, and this is something you can only do once all the temporal data have been read. And for long videos, that means a lot of data (e.g. approx. 5600 frames for a 3:40 video; even with an SD video, it quickly fills up all your available memory!).
I’m happy to say I’ve finally found two simple solutions to this problem, and thus I can proudly announce that two new commands -median_video and -median_files have been added to the G’MIC standard library. They work exactly like -average_video and -average_files, but they compute the temporal median instead of the temporal average.
The Smashing Pumpkins - “Tonight, Tonight” pixelwise median (6475 frames)
What are the tricks? Nothing really complicated:
- Instead of loading all the video frames at once into memory, I allow loading only a few rows of each frame, so that memory doesn’t get exhausted. This is done iteratively: the more memory you have, the more rows you can read in one iteration. The median is then computed for all these rows, the result is saved in a buffer, and the process starts again with the next rows.
- Now comes the real trick: if you consider that your video data is encoded as integers with no more than 8 bits/channel (which is very often the case), then there is a nice optimization for computing the median. In this case, all we need is to estimate the temporal histogram of the data for each pixel of the video frame. So basically, we have to allocate a width x height x 256 x nb_channels buffer that will contain our histograms. Then the median can be computed from the cumulative histogram (which is simple to compute, and can be done in-place without additional memory overhead); see this page for a mathematical description of how it works. And the second really cool thing is that this histogram can be computed in a ‘streamed’ way: a histogram just counts the number of occurrences of each value appearing in the video, so counting these values can be done frame by frame, and only a single frame at a time has to be kept in memory (+ the accumulation buffer).
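The histogram trick can be sketched in Python/NumPy as follows (again, an illustration under the 8-bit assumption, not the actual G’MIC code): one 256-bin histogram per pixel and per channel is filled frame by frame, then the median is read off the cumulative histogram.

```python
import numpy as np

def streaming_median_8bit(frames):
    """Pointwise temporal median of 8-bit frames: fill one 256-bin
    histogram per pixel/channel, one frame at a time, then read the
    median off the cumulative histogram."""
    hist = None
    n = 0
    for frame in frames:                     # H x W x C uint8 arrays
        idx = frame.astype(np.int64)[..., None]
        if hist is None:
            h, w, c = frame.shape
            hist = np.zeros((h, w, c, 256), dtype=np.int64)
        # Increment, for each pixel/channel, the bin matching its value.
        counts = np.take_along_axis(hist, idx, axis=-1)
        np.put_along_axis(hist, idx, counts + 1, axis=-1)
        n += 1
    cum = hist.cumsum(axis=-1)               # cumulative histogram
    half = (n + 1) // 2
    # Lower median = first bin whose cumulative count reaches half.
    return (cum >= half).argmax(axis=-1).astype(np.uint8)

# Three 1x1x1 "frames" with values 0, 10, 200 -> median 10:
frames = [np.full((1, 1, 1), v, dtype=np.uint8) for v in (0, 10, 200)]
print(streaming_median_8bit(frames)[0, 0, 0])  # -> 10
```

Note that the width x height x 256 x nb_channels histogram dominates the memory cost, which is exactly why the commands also process the frames by blocks of rows, as described above.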
So, that is exactly what has been done in the newly added commands -median_video and -median_files. And of course, the commands take advantage of multiple CPU cores when possible.
Let me show you how to use these commands now.
You need version 1.7.2+ of the G’MIC command-line tool gmic installed on your system (if this is not the case, download it now!).
A quick view of the help for the commands -average_video and -median_video:
$ gmic -h average_video
-average_video:
input_filename,_first_frame>=0,_last_frame={ >=0 | -1=last },_frame_step>=1,
_output_filename
Compute the average frame of a video file.
If a display window is opened, the frames are displayed in it during processing.
Default values: 'first_frame=0', 'last_frame=-1', 'frame_step=1' and
'output_filename=(undefined)'.
and
$ gmic -h median_video
-median_video:
input_filename,_first_frame>=0,_last_frame={ >=0 | -1=last },_frame_step>=1,
_frame_rows[%]>=1,_is_fast_approximation={ 0 | 1 }
Compute the median frame of a video file, doing it by blocks of rows to save memory usage.
If a display window is opened, the frames are displayed in it during processing.
Default values: 'first_frame=0', 'last_frame=-1', 'frame_step=1', 'frame_rows=100%' and
'is_fast_approximation=1'.
Now, if you have a video file, you are ready to go. For the example I’ve chosen this video clip from P!nk, which already has a respectable size of 1280x720 (for 5972 frames). First, make sure you get the latest version of the G’MIC standard library:
$ gmic -update
And now, let’s go:
$ gmic -average_video pink.mp4 -n 0,255 -o average.jpg
$ gmic -median_video pink.mp4 -n 0,255 -o median.jpg
As you may have noticed, I’m also normalizing the result to the range [0,255] to be sure I don’t lose too much of the image dynamics. After a few seconds, I get these results (it may take a few minutes for a longer/bigger video, of course).
Comparison of average/median frame from the video clip of P!nk - Try.
Of course, you can also play with the parameters of the commands, for instance you can compute the average/median only for a subset of video frames (here, from frame 500 to frame 1499):
$ gmic -average_video pink.mp4,500,1499 -n 0,255 -o sub_average.jpg
$ gmic -median_video pink.mp4,500,1499 -n 0,255 -o sub_median.jpg
Comparison of average/median frame from a subset of the video clip of P!nk - Try (frames 500->1499).
If you don’t have a lot of memory, it may be safer to reduce the number of rows processed simultaneously for the median computation (you should never have issues with averaging, though):
$ gmic -median_video pink.mp4,500,1499,1,20% -n 0,255 -o sub_median.jpg
You may want to preview the process. In this case, just add a -w (as in -window) before invoking -average_video or -median_video:
$ gmic -w -average_video pink.mp4,500,1499 -n 0,255 -o sub_average.jpg
$ gmic -w -median_video pink.mp4,500,1499,1,20% -n 0,255 -o sub_median.jpg
I’ve described how to do this with a simple video file. Now, maybe you already have your video frames decomposed as a set of distinct image files. No problem, just use -average_files and -median_files instead:
$ gmic -w -average_files \*.jpg,500,1499 -n 0,255 -o sub_average.jpg
$ gmic -w -median_files frame_\*.png,500,1499,1,20% -n 0,255 -o sub_median.jpg
(on the default Windows terminal, I guess you won’t need to escape the *, so write frame_*.png rather than frame_\*.png).
It should be as simple as that! I’m planning to add an extra parameter to these commands, allowing you to specify a processing pipeline to be applied to each streamed frame before computing the average/median. I guess this could be useful to get even more details in the final result (for instance, by applying the Freaky Details filter on the streamed frames rather than on the final result).
That’s it for now. Any thoughts?