I was trying to watch a video on a Chromecast, but it kept temporarily pausing/stuttering every 20 seconds. The weird thing is, it played continuously in the phone browser, so it wasn’t a buffering issue. Well no matter, I can probably just download the video and cast with VLC media player.
To download, I tried the Video DownloadHelper browser extension.
It shows media on the current page.
In this case, there was a lot of media, roughly 128 different video fragments at sequentially-numbered URLs like
That explains the pauses: Chromecast had to keep loading new videos, and either the website’s casting code didn’t handle this well or the machine itself couldn’t multitask.
Either way, this became a problem of downloading video fragments and concatenating them into one big video.
The command that eventually worked is mostly taken from the FFmpeg wiki page on concatenation:
ffmpeg -f concat -safe 0 -i fragments.txt -shortest -c copy merged.mp4
-shortest: Ensures that the audio and video of each fragment match each other’s length. Without this, the merged video briefly froze every 20 seconds while the audio kept playing.
-safe 0 -i fragments.txt: Allow unsafe characters in the filenames of of
fragments.txt. In my case, file has lines like
file pathto/video-fragment-X.mp4, where the slash is considered unsafe.
-c copy: Copy input video rather than re-encoding it.
The full script goes in a few stages:
aria2c. Its default is 5 simultaneous downloads.
# Create the list of URLs. for i in $(seq 1 128); do printf "https://cdn.grercez.dev/coolvideo-%04d.mp4\n" $i done > urls.txt # Download video fragments to the fragments/ directory. mkdir -p fragments aria2c --dir=fragments --input-file=urls.txt # Create list of downloaded files for ffmpeg. cat urls.txt | while read url ; do printf "file fragments/%s\n" "$(basename "$url")" >> fragments.txt done > urls.txt # Merge video fragments. ffmpeg -f concat -safe 0 -i fragments.txt -shortest -c copy merged.mp4 # Clean up everything but merged.mp4. rm -f urls.txt fragments.txt fragments/* rmdir -f fragments
A few days later, I wanted to make a simple meme out of a screen recording. The basic idea here was to merge audio into a silent video. But before that, I also needed to crop the video, extract the funny segment, and re-encode it to be a reasonable size.
Let’s look at these steps individually. However, it’s best to keep re-encoding to a minimum, so we’ll combine all the steps into one command at the end.
As a first step, we want to extract audio track from a video obtained with
Since this won’t be the final product, we can be a bit sloppy and re-encode it:
ffmpeg -nostdin -i audio_original.mkv -vn -b:a 128k -y audio_track.m4a
-nostdin: Ignore standard input. Without this, copy/pasting the commands doesn’t work as expected since parts of them are consumed on stdin.
-vn: No video.
-b:a 128k: (optional) Use a 128k bitrate for audio. This is the default, but it’s nice to be explicit.
-y: Allow overwriting the output file.
We could also have been careful and extracted the original audio.
ffprobe audio_original.mkv shows that the audio codec is
opus, which is customarily contained in an
Extracting the original would be:
ffmpeg -nostdin -i audio_original.mkv -vn -c copy -y audio_original.oga
However, we don’t go this route!
Our final video will be a
.mp4, which in not a well-supported container for Opus audio, so we’ll have to re-encode the audio anyway.
The first step is cropping.
In this case, the original video was 1440x2960 and I ended up removing the top 540 pixels and the bottom 420 pixels (420=2960-2000-540).
Also, it made sense to output a 700x1000 video instead of 1400x2000.
This took some trial and error of course, so previewed various options using
My final preview command looks a lot like the
ffmpeg command to encode it:
ffplay -i video_original.mp4 -vf crop=1400:2000:0:540,scale=700:1000 ffmpeg -nostdin -i video_original.mp4 -vf crop=1400:2000:0:540,scale=700:1000 -an -b:v 1M -y video_track.mp4
-vf crop=1400:2000:0:540,scale=700:1000: Video filters to crop and scale.
-an: Remove audio if it exists.
-b:v 1M: (optional) Encode using an average of 1 million bits per second of video.
Trim audio and video samples. Whatever you want to call this operation, it’s pretty easy.
ffmpeg -nostdin -ss 00:03:28.5 -t 00:00:46.5 -i audio_track.m4a -c copy -y audio_sample.m4a ffmpeg -nostdin -ss 00:03:31.5 -t 00:00:46.5 -i video_track.mp4 -c copy -y video_sample.mp4
-ss: Start time.
Merge audio and video.
Finally the merging of audio and video.
I had trouble keeping the audio and video in sync but eventually found the
Unfortunately, I couldn’t figure out how to use this without re-encoding the video:
ffmpeg -nostdin -i audio_sample.m4a -i video_sample.mp4 -fflags +genpts -acodec copy -b:v 1M -y meme.mp4
-fflags +genpts: Add timestamps to keep audio and video in sync.
-acodec copy: Copy input audio rather than re-encoding it. This doesn’t save much time.
In the steps above, the video is re-encoded twice. To avoid extra artifacts, the final product should be made with one command:
ffmpeg -nostdin \ -ss 00:03:28.5 -i audio_original.mkv \ -ss 00:03:31.5 -i video_original.mp4 \ -t 00:00:46.5 \ -map 0:a -map 1:v \ -vf crop=1400:2000:0:540,scale=700:1000 \ -fflags +genpts \ -b:a 128k -b:v 1M -y meme.mp4
-map 0:a: Take audio from first input.
-map 1:v: Take video from second input.
Instead of video bitrate
-b:v 1M, you could play around with the H.264 Constant Rate Factor.
For me, the default CRF of 23 yielded a video bitrate of 1236k, but after using slower presets and tuning as an animation, I was able to obtain bitrate of 1008k with a lower (better) CRF of 22.
The final command being:
ffmpeg -nostdin \ -ss 00:03:28.5 -i audio_original.mkv \ -ss 00:03:31.5 -i video_original.mp4 \ -t 00:00:46.5 \ -map 0:a -map 1:v \ -vf crop=1400:2000:0:540,scale=700:1000 \ -fflags +genpts \ -b:a 128k -crf 22 -preset veryslow -tune animation -y meme.mp4
-crf 22: H.264 constant rate factor.
-preset veryslow: Use the slowest presets to get the best compression.
-tune animation: Specify that the video is an animation-like. For my gameplay screen recording, this gave crisper edges with less bitrate.