The ffmpeg manifesto

Creating a blurred overlay

2023 Oct 29

We believe video editing software should be free and open without freemium limits on resolution.

We believe GUIs are bloatware which do not lend themselves to reusable scriptable workflows.

We believe CLIs are the ideal type of interface.

Just kidding, I’m not going to write a whole blog post in this style.

The gargantuan one-liner

Let’s say you want to take a widescreen video in 16:9 aspect ratio and post it somewhere like TikTok or Reels. The only problem is that those are mobile-friendly vertical video platforms, so a widescreen video isn’t ideal. Instead of a short fat widescreen video, you need a tall skinny vertical video.

One common hack is to overlay the original widescreen video centered overtop of a blurred and cropped background in the correct vertical aspect ratio, 9:16 instead of 16:9. Here’s what that looks like with some content from Fallout 4:

Figure 1a: 16:9 input Figure 1b: blurred overlay 9:16 output

You can do this with a single gargantuan one-liner ffmpeg command:

ffmpeg \
	-i "input.mp4" \
	-vf "split [fg][bg];
	[bg]        crop=ceil(ih*9/16/2)*2:ih, boxblur=30 [bg];
	[fg]       scale=ceil(ih*9/16/2)*2:ceil(ih*9*9/16/16/2)*2 [fg];
	[bg][fg] overlay=y=main_h/2-overlay_h/2" \
	"output.mp4"

We’ll break this down step by step below.

The deceptive allure of ffmpeg

ffmpeg is intimidating. You get lured in with straightforward conversion commands with just a couple arguments, e.g. avi to mp4:

ffmpeg -i "input.avi" "output.mp4"

And then there are moderately more advanced commands to apply video filters, e.g. scaling an input resolution down or up to HD resolution 1920:1080:

ffmpeg -i "input.mp4" -filter:video "scale=1920:1080" "output.mp4"

And finally there are gargantuan one-liners like ours, similar to what you might find on stack overflow posts from ffmpeg wizards.

ffmpeg \
	-i "input.mp4" \
	-vf "split [fg][bg];
	[bg]        crop=ceil(ih*9/16/2)*2:ih, boxblur=30 [bg];
	[fg]       scale=ceil(ih*9/16/2)*2:ceil(ih*9*9/16/16/2)*2 [fg];
	[bg][fg] overlay=y=main_h/2-overlay_h/2" \
	"output.mp4"

How does anyone come up with these? What seemed like a problem of crafting a few command line arguments has evolved into a task of writing a weird interpretted programming language, complete with variables like [fg] and [bg], mathematical operations and the ceil() function, and bizarrely abbreviated constant variables like ih and main_h.

But before we dive into the complexity, let’s help you help yourself with ffmpeg.

Getting help with ffmpeg

To get top-level help, use ffmpeg --help or ffmpeg -h:

ffmpeg -h

# Hyper fast Audio and Video encoder
# usage: ffmpeg [options] [[infile options] -i infile]... {[outfile options] outfile}...
# 
# Getting help:
#     -h      -- print basic options
#     -h long -- print more options
# 
# Video options:
# -vframes number     set the number of video frames to output
# -r rate             set frame rate (Hz value, fraction or abbreviation)
# -s size             set frame size (WxH or abbreviation)
# -vf filter_graph    set video filters

There’s a long verbose banner with version and build info before the actual help starts, and a lot of other “basic” options which I’ve omitted. For someone used to keyword arguments which don’t depend on position, it can be confusing that there are separate [infile options] and [outfile options] which do depend on where they are positioned in the argument list.

To get help with filters, use ffmpeg -filters:

ffmpeg -filters

# Filters:
#   T.. = Timeline support
#   .S. = Slice threading
#   ..C = Command support
#   A = Audio input/output
#   V = Video input/output
#   N = Dynamic number and/or type of input/output
#   | = Source or sink filter
#  T.. boxblur           V->V       Blur the input.
#  ..C crop              V->V       Crop the input video.
#  TSC overlay           VV->V      Overlay a video source on top of the input.
#  ..C scale             V->V       Scale the input video size and/or convert the image format.

It feels inconsistent, but notice there’s no -h in the above help command for filters. Again, I’ve omitted everything that’s irrelevant to this blog.

Note that most of these filters map a single video input to a single video output V->V. The exception is overlay, which maps two inputs to one output VV->V.

Finally, to get help with a specific filter, e.g. crop, use ffmpeg -h filter=crop:

ffmpeg -h filter=crop

# Filter crop
#   Crop the input video.
#     Inputs:
#        #0: default (video)
#     Outputs:
#        #0: default (video)
# crop AVOptions:
#   out_w             <string>     ..FV..... set the width crop area expression (default "iw")
#   w                 <string>     ..FV..... set the width crop area expression (default "iw")
#   out_h             <string>     ..FV..... set the height crop area expression (default "ih")
#   h                 <string>     ..FV..... set the height crop area expression (default "ih")
#   x                 <string>     ..FV..... set the x crop area expression (default "(in_w-out_w)/2")
#   y                 <string>     ..FV..... set the y crop area expression (default "(in_h-out_h)/2")
#   keep_aspect       <boolean>    ..FV..... keep aspect ratio (default false)
#   exact             <boolean>    ..FV..... do exact cropping (default false)

The online ffmpeg help can be better at providing examples for filters like this. Putting all the help together, here’s an example crop command:

ffmpeg -i "input.mp4" -filter "crop=608:1080" "crop.mp4"

This crops the input to a width of 608 and a height of 1080.

Breaking down the one-liner

I’ll be honest, it took me a couple hours of trial and error to settle on this one-liner. I started step by step with a workflow that used 4 separate commands:

  1. Crop the background to a vertical aspect ratio
  2. Blur the background
  3. Scale the foreground down to match the background’s width
  4. Overlay the foreground onto the background

Let’s first translate those steps in the most straightforward way to ffmpeg commands, trading elegance for simplicity. We’ll get it working before we get it working elegantly, efficiently, and generally.

Fixed 1920:1080 resolution

Here are those steps translated into ffmpeg commands, assuming the input has dimensions 1920:1080:

ffmpeg -i "input.mp4" -vf "crop=608:1080" "crop.mp4"
ffmpeg -i  "crop.mp4" -vf "boxblur=30" "blur.mp4"
ffmpeg -i "input.mp4" -vf "scale=608:342" "scale.mp4"
ffmpeg -i "blur.mp4" -i "scale.mp4" -filter_complex "overlay=y=369" "output.mp4"

The short option -vf is an alias of the long options -filter, -filter:v, or -filter:video. Why specify video? Because ffmpeg can filter audio too!

Every video filter has arguments. For example, the crop filter takes width and height arguments with the syntax crop=width:height, e.g. crop=608:1080. This filter takes the 1920:1080 input (16:9 aspect) and crops to the middle 608:1080 (9:16 aspect). The crop filter has other optional arguments listed in the help above that can crop other areas besides the middle. We’ll figure out how to generalize this to other input dimensions with the same aspect ratio below.

The boxblur filter has an argument to control the blurriness.

The scale filter downsizes the input to 608:342, the same width as the crop (608) with the original 16:9 aspect ratio.

Finally, the overlay filter lays the scaled video over the middle of the blurred video, at a vertical y position 369. We have to do a little math to figure out the middle position. Here it is half the blur height minus half the overlay height, i.e. 0.5 * 1080 - 0.5 * 342 = 369. If you’re worried about whether y is measured from the top or from the bottom, it doesn’t matter.

General 16:9 aspect ratio

Instead of hard-coding dimensions for a 1920:1080 input, these can be generalized to work with any 16:9 aspect ratio, e.g. 1280:720, HD 1920:1080, 4K 3840:2160, etc.

This can be achieved by using the named variables accepted by many filters in ffmpeg, e.g. ih for the height of the input frame. I might call these constants, as they don’t really change on the fly and they’re distinct from labels which we’ll get to later, but ffmpeg calls them variables anyway instead of constants.

ffmpeg -i "input.mp4" -vf "crop=ceil(ih*9/16/2)*2:ih" "crop.mp4"

ffmpeg -i "crop.mp4" -vf "boxblur=30" "blur.mp4"

ffmpeg -i "input.mp4" \
	-vf "scale=ceil(ih*9/16/2)*2:ceil(ih*9*9/16/16/2)*2" \
	"scale.mp4"

ffmpeg -i "blur.mp4" -i "scale.mp4" \
	-filter_complex "overlay=y=main_h/2-overlay_h/2" "output.mp4"

Many filters accept ih for the input height and iw for the input width. The overlay filter is slightly more complicated because there are two inputs and thus two heights: main_h and overlay_h. If you get the main and overlay inputs mixed up like I do, just remember that the overlay input gets layed over the main input.

I should explain the funny ceil() function in the crop and scale filters, e.g. crop=ceil(ih*9/16/2)*2:ih. Ideally we would want a cropped width of just ih*9/16, but some formats and codecs will error out if you try to use an odd numbered dimension. To guarantee an even number, we can divide by 2, round up, and multiply by 2 again, hence ceil(ih*9/16/2)*2. The scale filter is similar.

Further generalization to other aspect ratio inputs and outputs like 4:3 is left as an exercise for the reader.

Chaining steps together into a filtergraph

The workflow above leaves a few temporary files crop.mp4, blur.mp4, and scale.mp4 which can be cleaned up. The gargantuan one-liner is better because it does not waste any time reading, writing, and transcoding temporary files. In a benchmark that I ran, the whole process takes 1m57s as four separate commands but just 0m44s as a one-liner. If you want the details of that benchmark, the input video was a 1 minute long, 60 fps, 1920:1080, 20 MB mp4 file.

Again, take things one step at a time when chaining together these gargantuan one-liners. To start, the first two steps of cropping and blurring can be combined:

ffmpeg -i "input.mp4" -vf "crop=ceil(ih*9/16/2)*2:ih" "crop.mp4"
ffmpeg -i "crop.mp4" -vf "boxblur=30" "blur.mp4"

The same filters in a single step look like this:

ffmpeg -i "input.mp4" -vf "crop=ceil(ih*9/16/2)*2:ih, boxblur=30" "blur.mp4"

The video filters -vf are quoted in a single string, separated by commas, e.g. "filter1, filter2", which becomes "crop=ceil(ih*9/16/2)*2:ih, boxblur=30".

The ffmpeg filtering introduction explains this excellently:

Filters in the same linear chain are separated by commas, and distinct linear chains of filters are separated by semicolons.

Again, the overlay filter is more complicated, because it takes two inputs. To apply this filter without any intermediate files, we have to split the input stream into two streams: (1) the cropped and blurred background [bg] and (2) the scaled foreground [fg]. The labels [bg] and [fg] are dummy variables. We can name them anything we want using the [bracket] syntax, unlike the named constants such as ih or main_h.

ffmpeg \
	-i "input.mp4" \
	-vf "split [fg][bg];
	[bg]        crop=ceil(ih*9/16/2)*2:ih, boxblur=30 [bg];
	[fg]       scale=ceil(ih*9/16/2)*2:ceil(ih*9*9/16/16/2)*2 [fg];
	[bg][fg] overlay=y=main_h/2-overlay_h/2" \
	"output.mp4"

In two distinct linear chains, the background [bg] is cropped and blurred, while the foreground [fg] is scaled. Finally, foreground is layed over the background.

Et voilà! Here’s what it looks like in action: https://www.tiktok.com/@jeff.irwin/video/7294441886876618030