Understanding the FFmpeg Filter Hierarchy
In FFmpeg, video and audio processing relies on three fundamental concepts:
- Filter: The atomic unit that performs a specific transformation on input frames, such as scaling (scale), cropping (crop), or overlaying (overlay).
- FilterChain: A linear sequence of filters where the output of one filter serves as the input for the next. Filters in a chain are separated by commas.
- FilterGraph: A complex structure containing one or more FilterChains. Chains can be connected in parallel or series, allowing for complex branching and merging logic. Chains are separated by semicolons.
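To make the chain syntax concrete, here is a minimal sketch of a FilterChain applying two filters in sequence (the filter string and output filename are illustrative; the built-in testsrc generator is used so no input file is assumed):

```shell
# FilterChain: scale, then horizontally flip — comma-separated, applied in order.
# testsrc is a synthetic video source, so the command is self-contained.
ffmpeg -f lavfi -i testsrc=duration=1:size=320x240:rate=10 \
       -vf "scale=iw/2:ih/2,hflip" -y chain.mp4
```

Because the two filters are joined by a comma, the scaled frames flow directly into hflip with no intermediate file.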
To view all available filters in your FFmpeg installation, use the command:
ffmpeg -filters
Filter Syntax and Graph Configuration
A filter definition typically follows this pattern:
[in_link]filter_name=parameters[out_link]
Labels in square brackets identify input and output pads. Parameters can be provided as key-value pairs or positional arguments. For example, scaling a video can be written in multiple ways:
# Key-value pairs
ffmpeg -i input.mp4 -vf scale=w=iw/2:h=ih/2 output.mp4
# Positional arguments
ffmpeg -i input.mp4 -vf scale=iw/2:ih/2 output.mp4
A common example of a FilterGraph creates a "mirror" effect by splitting the input, flipping one stream, and overlaying it back:
ffmpeg -i INPUT -vf "split [main][tmp]; [tmp] crop=iw:ih/2:0:0, vflip [flip]; [main][flip] overlay=0:H/2" OUTPUT
This graph performs the following:
1. split: Duplicates the input into two streams, labeled [main] and [tmp].
2. crop & vflip: The [tmp] stream is cropped to the top half (crop=iw:ih/2:0:0 keeps ih/2 rows starting at y=0) and vertically flipped, labeled [flip].
3. overlay: The processed [flip] stream is overlaid onto the [main] stream at the vertical midpoint.
C API Implementation
Programmatically implementing filters involves managing the AVFilterGraph, which acts as the container for the entire processing pipeline. The workflow requires creating source and sink endpoints—specifically buffer for input and buffersink for output—and linking them through the desired filter chain.
1. Initializing the Filter Graph
The first step is allocating the graph context:
AVFilterGraph *graph = avfilter_graph_alloc();
if (!graph) {
return AVERROR(ENOMEM);
}
2. Configuring the Source (Buffer)
The buffer filter acts as the entry point for raw video frames into the graph. It must be initialized with parameters matching the source video stream (dimensions, pixel format, time base).
char args[512];
AVFilterContext *src_ctx;
const AVFilter *buffersrc = avfilter_get_by_name("buffer");
snprintf(args, sizeof(args),
"video_size=%dx%d:pix_fmt=%d:time_base=%d/%d:pixel_aspect=%d/%d",
codec_ctx->width, codec_ctx->height, codec_ctx->pix_fmt,
stream->time_base.num, stream->time_base.den,
codec_ctx->sample_aspect_ratio.num, codec_ctx->sample_aspect_ratio.den);
int ret = avfilter_graph_create_filter(&src_ctx, buffersrc, "in",
args, NULL, graph);
if (ret < 0) {
// Handle error
}
3. Configuring the Sink (Buffersink)
The buffersink filter is the exit point, used to retrieve processed frames from the graph. It often requires setting acceptable output pixel formats.
AVFilterContext *sink_ctx;
const AVFilter *buffersink = avfilter_get_by_name("buffersink");
ret = avfilter_graph_create_filter(&sink_ctx, buffersink, "out",
NULL, NULL, graph);
if (ret < 0) {
// Handle error
}
enum AVPixelFormat pix_fmts[] = { AV_PIX_FMT_YUV420P, AV_PIX_FMT_NONE };
av_opt_set_int_list(sink_ctx, "pix_fmts", pix_fmts,
AV_PIX_FMT_NONE, AV_OPT_SEARCH_CHILDREN);
4. Linking and Configuration
Once endpoints are created, intermediate filters are instantiated and linked, or the graph is parsed from a string. Finally, the graph must be configured to validate the links and negotiate formats.
/* intermediate_ctx: any filter instance created with
 * avfilter_graph_create_filter(), e.g. a scale filter */
avfilter_link(src_ctx, 0, intermediate_ctx, 0);
avfilter_link(intermediate_ctx, 0, sink_ctx, 0);
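Alternatively, as noted above, the intermediate portion of the graph can be parsed from the same textual syntax used on the command line. A sketch, assuming the src_ctx, sink_ctx, graph, and ret variables from the previous steps (the chain string here is illustrative):

```c
/* Describe how the parsed chain connects to the existing endpoints:
 * "outputs" names the buffer source's output pad, "inputs" names the
 * buffersink's input pad. */
AVFilterInOut *outputs = avfilter_inout_alloc();
AVFilterInOut *inputs  = avfilter_inout_alloc();

outputs->name       = av_strdup("in");
outputs->filter_ctx = src_ctx;
outputs->pad_idx    = 0;
outputs->next       = NULL;

inputs->name        = av_strdup("out");
inputs->filter_ctx  = sink_ctx;
inputs->pad_idx     = 0;
inputs->next        = NULL;

/* Builds and links the intermediate filters from the string. */
ret = avfilter_graph_parse_ptr(graph, "scale=iw/2:ih/2,vflip",
                               &inputs, &outputs, NULL);
avfilter_inout_free(&inputs);
avfilter_inout_free(&outputs);
```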
ret = avfilter_graph_config(graph, NULL);
if (ret < 0) {
// Handle error
}
The AVFilter Structure
The AVFilter struct defines the capabilities and properties of a filter type. It contains fields for the name, description, and input/output pads (inputs and outputs). It also holds function pointers for initialization and uninitialization. Every filter instance (AVFilterContext) maintains a reference to its defining AVFilter structure, while the context holds the specific state for that instance.