Demo

Overview

The WebCodecs API is focused on encoding and decoding "chunks" of media, but media is often stored in "containers" (e.g. song.mp4). To decode such media, apps must first "demux" the file to extract the chunks. The reverse operation (combining chunks into a container file) is known as "muxing".

This demo uses FFmpeg (compiled to WebAssembly) to demux an MP4 file.

The main advantage of using FFmpeg is its support for muxing and demuxing across a large (perhaps the largest) breadth of container formats. This support is well tested, actively maintained, and includes numerous options for customization.

The main disadvantages of using FFmpeg are app complexity and a slightly larger binary size vs. a demuxer written purely in JavaScript. For MP4 in particular, MP4Box.js is an excellent JavaScript library that should generally be preferred if all you need is an MP4 demuxer.

The guide below shows how to configure FFmpeg for MP4 demuxing and compile it using WebAssembly.

Building the FFmpeg demuxer

Preparation

The basic ingredients are
Steps
  1. Make a project directory

    mkdir wasm_demuxer
    cd wasm_demuxer

  2. Get emsdk (per their docs, NOTE: slight modifications needed on Windows)

    git clone https://github.com/emscripten-core/emsdk
    cd emsdk
    ./emsdk install latest
    ./emsdk activate latest
    source ./emsdk_env.sh
    cd ..

  3. Get FFmpeg

    git clone https://git.ffmpeg.org/ffmpeg.git ffmpeg
    cd ffmpeg

FFmpeg Configuration

Before getting started with WASM it's helpful to configure and "make" a native build to debug any underlying issues (e.g. things missing from your toolchain). While in your ffmpeg/ directory, run:

mkdir out_native
cd out_native

The following command configures the build to our specification. While in out_native/, run:

../configure --disable-everything --disable-all --disable-doc --disable-htmlpages --disable-manpages --disable-podpages --disable-txtpages --disable-debug --disable-bzlib --disable-error-resilience --disable-iconv --disable-lzo --disable-network --disable-schannel --disable-sdl2 --disable-symver --disable-xlib --disable-zlib --disable-securetransport --disable-faan --disable-alsa --disable-autodetect --disable-linux-perf --disable-asm --enable-small --enable-static --enable-avformat --enable-avutil --enable-avcodec --enable-demuxer=mov --arch=x86_32 --target-os=none --enable-cross-compile

If the above step fails, check ffbuild/config.log to see what went wrong.

Arguments explained:

Assuming configure passed, run the following command to ensure your configuration compiles:

make -j4

Assuming that passed, you're ready to prepare a WASM build! Let's make a new directory to hold the wasm build artifacts.

cd ..
mkdir out_wasm
cd out_wasm

Set CFLAGS to -Oz (recognized by emcc, but not gcc) so that the Emscripten compiler optimizes for size.

CFLAGS="-Oz"

Now, from the out_wasm directory, run configure wrapped by Emscripten's emconfigure command. Note that this command appends a handful of arguments specific to building with Emscripten.

emconfigure ../configure --disable-everything --disable-all --disable-doc --disable-htmlpages --disable-manpages --disable-podpages --disable-txtpages --disable-debug --disable-bzlib --disable-error-resilience --disable-iconv --disable-lzo --disable-network --disable-schannel --disable-sdl2 --disable-symver --disable-xlib --disable-zlib --disable-securetransport --disable-faan --disable-alsa --disable-autodetect --disable-linux-perf --disable-asm --enable-small --enable-static --enable-avformat --enable-avutil --enable-avcodec --enable-demuxer=mov --enable-protocol=file --arch=x86_32 --target-os=none --enable-cross-compile --extra-cflags="$CFLAGS" --extra-cxxflags="$CFLAGS" --ar=emar --ranlib=emranlib --cc=emcc --cxx=em++ --objcc=emcc --dep-cc=emcc

Compiling

Now generate the FFmpeg libraries by running the make command, this time wrapped by Emscripten's emmake utility.

emmake make -j4

Finally, it's time to put it all together and build the WASM component!

Copy the following files into your ffmpeg/out_wasm directory:

Now run the following to build the WASM module:

emcc "$CFLAGS" -s INITIAL_MEMORY=33554432 --closure=1 -s WASM_BIGINT -s ASSERTIONS=0 -s ALLOW_TABLE_GROWTH -s MODULARIZE=1 -s EXPORT_ES6=1 -s 'EXPORT_NAME=createWasmModule' -s EXPORTED_FUNCTIONS=@exported_functions.txt -s EXPORTED_RUNTIME_METHODS=@exported_runtime_methods.txt -I. -Isrc/ -Llibavformat -Llibavcodec -Llibavutil -lavformat -lavcodec -lavutil glue.c -o ffmpeg_wasm.out.js

Arguments explained

You should now see ffmpeg_wasm.out.js and ffmpeg_wasm.out.wasm in your out_wasm folder. All that's left to do is invoke FFmpeg APIs from JavaScript.

JavaScript
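With MODULARIZE=1, EXPORT_ES6=1, and EXPORT_NAME=createWasmModule, the generated ffmpeg_wasm.out.js default-exports a factory function that returns a promise for the module instance. A minimal sketch of loading it is below. The helper name loadFFmpegModule and the baseUrl parameter are assumptions for illustration, and the factory is passed in as an argument so the sketch does not depend on where the build output lives.

```javascript
// Hypothetical helper: `createWasmModule` is the factory exported by
// ffmpeg_wasm.out.js; `baseUrl` is wherever the build artifacts are hosted.
async function loadFFmpegModule(createWasmModule, baseUrl) {
  const module = await createWasmModule({
    // Emscripten calls locateFile() to resolve the .wasm file it fetches.
    locateFile: (path) => new URL(path, baseUrl).href,
  });
  return module;
}
```

In a module worker this might be used as `import createWasmModule from './ffmpeg_wasm.out.js'` followed by `await loadFFmpegModule(createWasmModule, import.meta.url)`. Which functions are callable on the resulting module depends on the contents of exported_functions.txt and exported_runtime_methods.txt.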

This is regrettably a bit complex. In particular, using FFmpeg's AVIO to facilitate streaming demuxing requires a number of DedicatedWorkers and extra signalling. Here we go...

The root of the demuxing code lives in ffmpeg_demuxer.js. This class creates a new blocking_demuxer_worker.js and proxies all API calls to that worker.
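The proxying pattern can be sketched roughly as follows. This is not the code from ffmpeg_demuxer.js: the WorkerProxy class, the message shape ({ id, method, args } out, { id, result, error } back), and the method names are all assumptions chosen for illustration.

```javascript
// Hypothetical sketch: forwards method calls to a worker-like `port`
// (anything with postMessage and an onmessage property) and resolves a
// promise when the reply with the matching id arrives.
class WorkerProxy {
  constructor(port) {
    this.port = port;
    this.nextId = 0;
    this.pending = new Map(); // request id -> { resolve, reject }
    this.port.onmessage = (event) => {
      const { id, result, error } = event.data;
      const entry = this.pending.get(id);
      if (!entry) return;
      this.pending.delete(id);
      if (error !== undefined) entry.reject(new Error(error));
      else entry.resolve(result);
    };
  }

  call(method, ...args) {
    const id = this.nextId++;
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.port.postMessage({ id, method, args });
    });
  }
}
```

A caller would construct this with a real Worker, e.g. `new WorkerProxy(new Worker('blocking_demuxer_worker.js'))`, and await each `call(...)`. The worker side is expected to echo the id back with its result.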

Those proxied calls are passed to the ffmpeg_demuxer_blocking_helper.js, which is where FFmpeg APIs are actually invoked (i.e. where demuxing actually occurs). This class uses FFmpeg's AVIO interfaces to facilitate streaming demuxing. The tricky bit is that the AVIO "read" callbacks are synchronous, so we use Atomics.wait() to block them while we fetch the media file from the network. This motivates us to do network downloading on another worker! Enter: download_worker.js.
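The blocking step can be sketched with a single shared flag. The layout below (slot 0 of an Int32Array as a state flag) and the function names are assumptions for illustration, not the actual protocol used by the demo. Note that browsers only allow Atomics.wait() off the main thread, which is one reason the demuxer runs in a worker.

```javascript
// Hypothetical layout: slot 0 of an Int32Array over a SharedArrayBuffer
// holds a state flag shared by the demuxer worker and the download worker.
const STATE_INDEX = 0;
const READ_PENDING = 0;
const READ_DONE = 1;

// Called on the demuxer worker from a synchronous read callback.
// Blocks until the other worker flips the flag to READ_DONE.
// Returns false if `timeoutMs` elapses first.
function waitForRead(state, timeoutMs) {
  // Atomics.wait() returns "not-equal" right away if the flag has already
  // changed, so a reply that arrives early is never missed.
  while (Atomics.load(state, STATE_INDEX) === READ_PENDING) {
    const result = Atomics.wait(state, STATE_INDEX, READ_PENDING, timeoutMs);
    if (result === 'timed-out') return false;
  }
  return true;
}

// Called on the download worker once the requested bytes are in place.
function completeRead(state) {
  Atomics.store(state, STATE_INDEX, READ_DONE);
  Atomics.notify(state, STATE_INDEX);
}
```

Both workers would create their Int32Array over the same SharedArrayBuffer, which is sent once via postMessage. A real implementation also needs to reset the flag to READ_PENDING before issuing the next request.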

The download worker runs download_reader.js to fetch and buffer the download and to respond to read requests from ffmpeg_demuxer_blocking_helper.js. The reads are passed between the workers using a SharedArrayBuffer wrapped by shared_read_buffer.js.
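As a rough sketch of the buffering side, the class below collects downloaded bytes and copies a requested range into a destination array. The class and method names are hypothetical and this is not the code in download_reader.js or shared_read_buffer.js. The destination may be a Uint8Array view over a SharedArrayBuffer, so the copy is visible to the demuxer worker.

```javascript
// Hypothetical sketch: buffers downloaded bytes and serves positional reads.
class BufferedDownload {
  constructor() {
    this.chunks = []; // Uint8Array pieces in arrival order
    this.length = 0;  // total bytes buffered so far
  }

  append(chunk) {
    this.chunks.push(chunk);
    this.length += chunk.byteLength;
  }

  // Copies up to dest.length bytes starting at byte offset `position`.
  // Returns the number of bytes copied (0 if nothing is buffered there yet).
  read(position, dest) {
    let copied = 0;
    let chunkStart = 0;
    for (const chunk of this.chunks) {
      const chunkEnd = chunkStart + chunk.byteLength;
      const from = position + copied;
      if (from < chunkEnd && copied < dest.length) {
        const offset = from - chunkStart;
        const count = Math.min(chunk.byteLength - offset, dest.length - copied);
        dest.set(chunk.subarray(offset, offset + count), copied);
        copied += count;
      }
      chunkStart = chunkEnd;
    }
    return copied;
  }
}
```

A short read (fewer bytes than requested) tells the caller that more data must be downloaded before the request can be completed in full.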

Demuxed chunks are ultimately fed into the WebCodecs AudioDecoder for decoding. Decoded AudioData outputs are then buffered and rendered using WebAudio's AudioWorklet. All of this is orchestrated by the AudioRenderer (audio_renderer.js) component. The details of this component are covered in more depth in this talk.
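One detail worth illustrating is the timestamp conversion: FFmpeg reports packet times in stream time base units, while WebCodecs chunks use microseconds. The sketch below assumes a hypothetical packet shape ({ pts, duration, data }) and a time base given as { num, den }. The demo's actual data structures may differ.

```javascript
// Hypothetical packet shape: { pts, duration, data }, with pts and duration
// in units of timeBase.num / timeBase.den seconds. WebCodecs expects
// timestamps and durations in microseconds.
function toChunkInit(packet, timeBase) {
  const toMicros = (ticks) =>
    Math.round((ticks * timeBase.num * 1e6) / timeBase.den);
  return {
    type: 'key', // assumes every audio packet can be decoded independently
    timestamp: toMicros(packet.pts),
    duration: toMicros(packet.duration),
    data: packet.data,
  };
}
```

In the browser, the result can be passed along as `decoder.decode(new EncodedAudioChunk(toChunkInit(packet, timeBase)))`, where `decoder` is a configured AudioDecoder.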

Each of the modules mentioned above includes verbose debug logs that are disabled by default. Enable their logs using the flags near the top of each file.

Running locally

The audio rendering features of this demo require cross-origin isolation in order to use SharedArrayBuffer. Use the provided server.js Node script to serve the files with the required HTTP headers for local testing (requires Node.js).

node server.js
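For reference, cross-origin isolation is enabled by two response headers. The sketch below shows one way to add them to any Node request handler. It is an illustration with hypothetical names (withIsolation, serveStaticFiles), not a copy of the provided server.js.

```javascript
// Response headers that opt a page into cross-origin isolation.
const ISOLATION_HEADERS = {
  'Cross-Origin-Opener-Policy': 'same-origin',
  'Cross-Origin-Embedder-Policy': 'require-corp',
};

// Hypothetical wrapper: sets the headers on every response before
// delegating to an ordinary (req, res) request handler.
function withIsolation(handler) {
  return (req, res) => {
    for (const [name, value] of Object.entries(ISOLATION_HEADERS)) {
      res.setHeader(name, value);
    }
    return handler(req, res);
  };
}
```

With Node's http module this would be wired up as `http.createServer(withIsolation(serveStaticFiles)).listen(8000)`, where serveStaticFiles is whatever static file handler you use. In the page, `self.crossOriginIsolated` reports whether isolation took effect.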

Demuxer performance

Local profiling shows average demuxer reads generally take less than a tenth of a millisecond with occasional outliers in the realm of 1 millisecond. It's plenty fast.
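Numbers like these can be gathered with a small timing loop. The helper below is a hypothetical sketch, not the profiling code used for this demo. readOnce stands in for whatever call performs a single demuxer read.

```javascript
// Hypothetical helper: calls `readOnce` `iterations` times and reports the
// mean and worst-case duration in milliseconds.
function profileReads(readOnce, iterations) {
  let total = 0;
  let max = 0;
  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    readOnce();
    const elapsed = performance.now() - start;
    total += elapsed;
    max = Math.max(max, elapsed);
  }
  return { meanMs: total / iterations, maxMs: max };
}
```

Keep in mind that browsers may coarsen performance.now() resolution, so sub-millisecond averages are best taken over many iterations.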

Performance is great, but not unique to this approach. You should expect similar performance from any muxer or demuxer implementation, including purely JavaScript-based demuxers. Muxing and demuxing are not resource intensive.

Demuxer binary size

The total size is 144 KB after brotli compression. This should work fine for most use cases. Media applications will often download media assets with much larger sizes. For example, a typical 3.5 minute song using AAC in MP4 will be at least 3 MB, or ~20x the size of this demuxer.

Adding additional formats to the FFmpeg WASM demuxer only increases the size by a small amount. For example, adding webm, mp3, and ogg support to our configuration (--enable-demuxer=mov,matroska,mp3,ogg) adds 28 KB after brotli compression.

For applications that are especially sensitive to binary size, a purely JavaScript demuxer (or perhaps a lighter WASM library) may be preferred. For example, the brotli-compressed size of mp4box.js totals 26 KB, or about 1/5 the size of the FFmpeg WASM demuxer.

File size breakdown

  1. ffmpeg_wasm.out.wasm: 130 KB brotli compressed (349 KB raw)
  2. ffmpeg_wasm.out.js: 14 KB brotli compressed (46 KB raw)
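To check these numbers against your own build, the compressed size can be measured with Node's built-in zlib module. This is a sketch under one assumption worth flagging: it compresses at maximum brotli quality, and the settings used for the figures above are not stated here, so results may differ slightly.

```javascript
const zlib = require('node:zlib');

// Returns the brotli-compressed size in bytes of `bytes` (a Buffer or
// Uint8Array), compressed at maximum quality (an assumed setting).
function brotliSize(bytes) {
  const compressed = zlib.brotliCompressSync(bytes, {
    params: {
      [zlib.constants.BROTLI_PARAM_QUALITY]: zlib.constants.BROTLI_MAX_QUALITY,
    },
  });
  return compressed.byteLength;
}
```

For example, `brotliSize(require('node:fs').readFileSync('ffmpeg_wasm.out.wasm'))` reports the compressed size of the WASM binary when run from the out_wasm directory.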

Licensing

FFmpeg

Copyright (c) 2000-2022 the FFmpeg developers.

FFmpeg is licensed under the LGPLv2.1 license and its source can be downloaded here. See FFmpeg's "Legal" page for additional info.

Emscripten

Copyright (c) 2010-2014 Emscripten authors, see AUTHORS file.

Emscripten is licensed under the MIT license and the University of Illinois/NCSA Open Source License.

ringbuf.js

ringbuf.js is licensed under the Mozilla Public License 2.0.

Original files

Code authored specifically for this demo is licensed under the W3C software and document license.

Other references

This demo was my first time using WebAssembly. I found these resources extremely helpful!