Making Videos from Figures

visual
Author

Hongyang Zhou

Published

October 25, 2019

Modified

July 31, 2024

An animation is simply a sequence of figure frames. It is quite useful to be proficient at generating a set of figures and combining them into a video. However, naively combining raw figures into animations ends up in huge files: for a 1920x1080 video at a frame rate of 24, one minute takes 24 × 60 × 1920 × 1080 × 3 bytes ≈ 9 GB! Video files stored in computers take advantage of multiple compression techniques to reduce the storage requirements, including intra prediction, inter prediction, transform, quantization, deblocking filter, and entropy encoding. These can be summarized into three main modules as described in H.265/HEVC: prediction, transform and quantization, and entropy coding.
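The back-of-the-envelope arithmetic above can be checked in a few lines (a sketch assuming 3 bytes per RGB pixel and no compression at all):

```python
# Raw (uncompressed) size of one minute of 1920x1080 video at 24 fps,
# assuming 3 bytes (RGB) per pixel.
fps, seconds = 24, 60
width, height, bytes_per_pixel = 1920, 1080, 3
raw_bytes = fps * seconds * width * height * bytes_per_pixel
print(f"{raw_bytes / 1e9:.1f} GB")  # -> 9.0 GB
```

Compression buys back orders of magnitude: a typical H.264 encode of the same minute is tens of megabytes.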

Making Figures

Guidelines in making figures from time-series data:

  • Reuse whatever possible. The figure canvas, axis, and colorbar can often be reused. Instead of creating and deleting objects, think about the possibility of modifying the properties of objects or replacing the data. Many practical examples can be found in Vlasiator.jl using Matplotlib.
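The reuse guideline can be sketched in Matplotlib as follows. The sine-wave series here is made up for illustration; the point is that the figure, axes, and line artist are created once and only the data changes per frame:

```python
# Sketch: create the figure, axes, and line artist once, then only swap
# the data for each frame. `frames` is a made-up sine-wave time series.
import numpy as np
import matplotlib
matplotlib.use("Agg")            # headless backend for batch figure export
import matplotlib.pyplot as plt
import os, tempfile

x = np.linspace(0, 2 * np.pi, 200)
frames = [np.sin(x + p) for p in np.linspace(0, np.pi, 5)]

outdir = tempfile.mkdtemp()
fig, ax = plt.subplots()
(line,) = ax.plot(x, frames[0])  # artist created once, reused below
ax.set_ylim(-1.1, 1.1)           # fixed limits so the axes never change

for i, y in enumerate(frames):
    line.set_ydata(y)            # replace the data, not the artist
    fig.savefig(os.path.join(outdir, f"frame_{i:04d}.png"))
plt.close(fig)
```

For dense time series this avoids the dominant cost of rebuilding the whole canvas on every frame.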

Concatenating Figures Into Videos

I have encountered the task of combining figures into a video several times. In the early days, I used built-in MATLAB functions for concatenating figures into videos. There are some shortcomings with this method:

  1. It requires a MATLAB license.
  2. The simple brute force algorithm generates large video files, and you cannot control the output resolution.

Then I tried the VideoIO package in Julia, but unfortunately, it currently lacks support for the RGBA encoding format.

After searching more on the web, I found a neat solution to this kind of task: ffmpeg. I installed it through MacPorts on Mac, but you can also download it directly from the website for installation on other platforms.

Using FFmpeg

Although it seems easy to make videos from figures, it is actually not. You need some basic understanding of how figures are saved and how different video formats are structured. The best tool for video format conversion and filtering is FFmpeg. Here is a nice introduction to FFmpeg in Chinese.

I have encountered several issues when using ffmpeg:

  1. Image size must be a multiple of 2.

My png files generated from Matplotlib have odd pixel numbers for both width and height. From one of the answers posted on Stack Overflow:

> As required by x264, the “divisible by 2 for width and height” requirement is needed for YUV 4:2:0 chroma subsampled outputs. 4:2:2 would need “divisible by 2 for width”, and 4:4:4 does not have these restrictions. However, most non-FFmpeg based players can only properly decode 4:2:0, so that is why you often see ffmpeg commands with the -pix_fmt yuv420p option when outputting H.264 video.

There is a -2 option for specifying the size. For example:

> -vf scale=1280:-2 — set the width to 1280; the height will be calculated automatically to preserve the aspect ratio, and it will be divisible by 2.
> -vf scale=-2:720 — same as above, but with a declared height instead, leaving the width to be dealt with by the filter.
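What scale=1280:-2 computes can be sketched in a few lines (a simplification; ffmpeg's exact rounding may differ slightly, and the function name is made up):

```python
# Miniature version of ffmpeg's scale=1280:-2: pick a target width,
# derive a height that preserves aspect ratio, then force it to be even.
def even_height(src_w, src_h, target_w=1280):
    h = round(target_w * src_h / src_w)
    return h if h % 2 == 0 else h + 1

# Odd-sized Matplotlib output (1921x1081) scaled to an even-height 1280-wide frame:
print(even_height(1921, 1081))  # -> 720
```

The same trick with width and height swapped corresponds to scale=-2:720.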

  2. The movie generated by ffmpeg is not playable.

From ffmpeg wiki: >You may need to use -vf format=yuv420p (or the alias -pix_fmt yuv420p) for your output to work in QuickTime and most other players. Some players only support the YUV planar color space with 4:2:0 chroma subsampling for H.264 video. Otherwise, depending on your source, ffmpeg may output to a pixel format that may be incompatible with these players.

Use -vf format=yuv420p or -pix_fmt yuv420p in the command line options.

  3. Missing frames from the input figure list.

If one of your figure files is corrupted (e.g. empty), ffmpeg will stop reading all the subsequent figures.
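A quick sanity check before invoking ffmpeg can catch the most common case, a zero-byte file (a sketch; `find_empty_pngs` is a hypothetical helper, and an empty file is only one way a png can be corrupted):

```python
# Flag empty (zero-byte) png files, which would make ffmpeg stop at
# that frame and silently drop everything after it.
import os, tempfile

def find_empty_pngs(directory):
    return sorted(
        name
        for name in os.listdir(directory)
        if name.endswith(".png")
        and os.path.getsize(os.path.join(directory, name)) == 0
    )

# Demo with throwaway files: one good frame, one empty frame.
d = tempfile.mkdtemp()
with open(os.path.join(d, "frame_0000.png"), "wb") as f:
    f.write(b"\x89PNG fake data")
open(os.path.join(d, "frame_0001.png"), "wb").close()
print(find_empty_pngs(d))  # -> ['frame_0001.png']
```

Regenerate or remove any flagged frames before encoding.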

  4. The order of files is messed up.

The best practice is to pad zeros such that all the file names have the same width.
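Zero-padding can be retrofitted onto existing files with a small script (a sketch; the `frame_N.png` naming pattern and the `pad_names` helper are assumptions for illustration):

```python
# Zero-pad numeric file names so lexicographic order matches numeric
# order (frame_2.png would otherwise sort after frame_10.png).
import os, re, tempfile

def pad_names(directory, width=4):
    for name in os.listdir(directory):
        m = re.fullmatch(r"frame_(\d+)\.png", name)
        if m:
            new = f"frame_{int(m.group(1)):0{width}d}.png"
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, new))

# Demo with throwaway files named frame_2.png and frame_10.png.
d = tempfile.mkdtemp()
for i in (2, 10):
    open(os.path.join(d, f"frame_{i}.png"), "wb").close()
pad_names(d)
print(sorted(os.listdir(d)))  # -> ['frame_0002.png', 'frame_0010.png']
```

With padded names, both glob patterns and ffmpeg's sequence input (`-i frame_%04d.png`) see the frames in the right order.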

H.264 vs. H.265

H.265 is superior in most cases, but it is not fully supported on all video players and platforms. Try to use H.265 if possible to save half the storage space compared with H.264.

Compared with H.264/AVC, H.265/HEVC offers more tools for lowering the bit rate. Its coding units range from a minimum of 8x8 to a maximum of 64x64 pixels. Regions carrying little information (where color varies slowly, e.g., the red body of a car and the gray road in a photo) are partitioned into larger blocks that encode into fewer bits, while detail-rich regions (e.g., the tires) are partitioned into smaller, more numerous blocks that encode into more bits. The image is thus encoded with emphasis where the detail is, lowering the overall bit rate and improving the coding efficiency. In addition, H.265 intra prediction supports 33 directional modes (H.264 supports only 8), and it provides better motion compensation and motion vector prediction.

At the same image quality, an H.265-encoded bitstream is roughly 39%-44% smaller than its H.264 counterpart. Subjective visual tests show that even with the bit rate reduced by 51%-74%, H.265-encoded video can look as good as, or better than, H.264-encoded video. In essence, this means a better signal-to-noise ratio than expected.

TL;DR

Finally, the following command works for converting figures into videos:1

ffmpeg -r 12 -pattern_type glob -i '*.png' -vcodec libx265 -vf scale=1080:-2 -pix_fmt yuv420p pi.mp4

If libx265 is not supported, you can fall back to libx264.

To transcode into H.265 using FFmpeg,

ffmpeg -i input.mp4 -map 0 -c copy -c:v libx265 output.mp4

Merging videos

Note that merging videos side by side is different from concatenating videos in time. We have three convenient filters to use: vstack, hstack, and xstack.

  • Merge two videos horizontally

From this thread:

ffmpeg -i left.mp4 -i right.mp4 -filter_complex hstack -vcodec libx265 output.mp4

By default FFmpeg encodes mp4 output with H.264; the -vcodec libx265 option selects H.265 instead.

  • Merge three videos horizontally
ffmpeg -i input0 -i input1 -i input2 -filter_complex "[0:v][1:v][2:v]hstack=inputs=3[v]" -map "[v]" -vcodec libx265 output

[0:v] and [1:v] refer to the first and second video streams from the first and second inputs. The -map option determines which streams should be included in the output file.

Footnotes

  1. Since this command uses glob, it does not work on Windows.↩︎