When I was a kid, the state of the art for home movies was 8mm film. My parents had a movie camera that used a film cartridge that contained a 16mm film strip. They would insert the cartridge one way and record a few minutes of film, and then flip the cartridge over and record another few minutes. To develop the film, the processing company would open the film cartridge and split the long reel of 16mm wide film into two reels of 8mm wide film. They only took 25 reels of film over a span of 12 years. Back in 2012, I found a place that would convert the movies to DVD, and I kept an MP4 version on my computer, and I gave copies to everyone in my family. It was pretty awesome.

By the time my kids came along, the world had moved to videotape, and we were just moving from analog to digital. We went top-of-the-line with a Sony MiniDV (NTSC) recorder. The NTSC DV video files record 720×480 pixels, interlaced, at roughly 30fps (29.97, strictly speaking)… far from the 1080p videos that you’d shoot on your iPhone today, but pretty hot for 2001. Like my parents, we ended up recording 20 MiniDV tapes over a period of 12 years, although our tapes were 60 minutes each instead of about 5.

This is the story of converting those tapes to a modern video format.

My 20 tapes sat on a shelf for several years, and then in 2012 I finally had the tools needed to copy the raw digital DV files to a computer hard disk. I used a tool called “dvgrab” on a laptop that had a FireWire connector. One side effect of dvgrab was that it saved every scene in a separate timestamped file. This turned out to be quite fortunate. When I was done, I had 250GB of raw DV files on an external hard drive. Unfortunately, I did not have enough disk space to do any processing on these files, so this external USB hard disk sat on a shelf for five years.
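
As a rough sanity check on that 250GB figure, here is a small sketch of the arithmetic. It assumes the standard NTSC DV layout of 120,000 bytes per frame (audio included) at 30000/1001 frames per second; the function name dv_bytes is made up for this illustration and is not part of dvgrab or any other tool.

```bash
#!/bin/bash
# Back-of-the-envelope size of raw NTSC DV footage.
# Assumption: each NTSC DV frame is 120,000 bytes and the
# frame rate is 30000/1001 (about 29.97) frames per second.

FRAME_BYTES=120000

# dv_bytes MINUTES -> approximate size in bytes of that much NTSC DV
dv_bytes() {
    local minutes=$1
    # whole frames in that many minutes (integer arithmetic, rounds down)
    local frames=$(( minutes * 60 * 30000 / 1001 ))
    echo $(( frames * FRAME_BYTES ))
}

one_tape=$(dv_bytes 60)
echo "one 60-minute tape: $(( one_tape / 1000000 )) MB"
echo "twenty full tapes:  $(( 20 * one_tape / 1000000 )) MB"
```

Under those assumptions a full 60-minute tape comes to about 12.9 GB and twenty full tapes to about 259 GB, so 250GB is what you would expect from twenty tapes that were not all recorded to the very end.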

In 2017 I re-discovered this USB hard disk, and I decided to finish the job. I wanted to encode them in MP4 format, and I wanted them to be separated into “episodes” (such as “2003 beach”, “4th birthday”, “zoo” and so on).

The first step was to divide the 5000+ separate scene files into folders for episodes. I started with a script that looked at the timestamps in the filenames, and it moved them into folders based on that. This got me 90% of the way there. Here’s the script. It used a single variable for the “gap” in time that would mean a break to the next episode.

#!/bin/bash

GAP=$((60*90))

previous_timestamp=0
# The glob expands in sorted order, which is chronological for these names.
for x in ALL/dvgrab-20*.dv ; do
    [[ -e $x ]] || continue   # nothing matched the pattern
    file=$(basename "$x")
    # dvgrab-2011.02.12_19-04-58.dv
    year=${file:7:4}
    mon=${file:12:2}
    day=${file:15:2}
    hour=${file:18:2}
    min=${file:21:2}
    sec=${file:24:2}
    # seconds since the epoch (the -d option needs GNU date)
    timestamp=$(date +%s -d "$year-$mon-$day $hour:$min:$sec")
    # a gap longer than $GAP seconds starts a new episode folder
    if [[ $(( timestamp - previous_timestamp )) -gt $GAP ]] ; then
        newfolder=$(date "+%Y-%m-%d_%H-%M-%S" -d "@$timestamp")
        mkdir -p "$newfolder"
    fi
    mv -v "$x" "$newfolder/$file"
    previous_timestamp=$timestamp
done
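
To see how the “gap” rule behaves before moving any files, here is a minimal dry-run sketch of the same logic. The helper names file_epoch and count_episodes are hypothetical, and the sketch parses the timestamps as UTC (an assumption, made so that the result does not depend on the local time zone), whereas the script above uses local time. Like the script, it needs GNU date.

```bash
#!/bin/bash
# Dry run of the grouping rule: nothing is moved or created.
# Assumes GNU date and the dvgrab-YYYY.MM.DD_HH-MM-SS.dv naming.

GAP=$((60*90))   # same 90-minute threshold as the script

# file_epoch NAME -> seconds since the epoch, parsed from the file name
file_epoch() {
    local f=$1
    date -u +%s -d "${f:7:4}-${f:12:2}-${f:15:2} ${f:18:2}:${f:21:2}:${f:24:2}"
}

# count_episodes NAME... -> how many episodes the (sorted) names split into
count_episodes() {
    local previous=0 episodes=0 now f
    for f in "$@" ; do
        now=$(file_epoch "$f")
        # a start-to-start gap longer than GAP begins a new episode
        if (( now - previous > GAP )) ; then
            episodes=$(( episodes + 1 ))
        fi
        previous=$now
    done
    echo "$episodes"
}

# Three scenes: 25 minutes apart, then 105 minutes apart.
count_episodes dvgrab-2011.02.12_19-04-58.dv \
               dvgrab-2011.02.12_19-30-00.dv \
               dvgrab-2011.02.12_21-15-00.dv
```

The three sample names fall into two episodes: the first two scenes start 25 minutes apart and stay together, while the third starts 105 minutes after the second. Note that the gap is measured between the start times of consecutive scenes, because the start time is all the file name records.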

After I had the scenes grouped into episodes, I did an initial encoding of the entire thing. This ran overnight, but it could go unattended. The basic process was to concatenate the DV files and then use “ffmpeg” to encode the episode into an MP4 file. I used the following script to do this in a loop.

I ran this script many times, and over time I tweaked the ffmpeg options to get a better output. This is the final cut.

#!/bin/bash

TMP="/tmp/encode"
srcdir="/media/alan/sandisk248GB/MINIDV"
destdir="/home/alan/media/videos/minidv"
wildcard="20*"  # directories starting with a date from 2001 onwards

ffopts=""

# FILTERS
ffopts="$ffopts -vf yadif"   # de-interlacing

# VIDEO ENCODING OPTIONS
ffopts="$ffopts -vcodec libx264"
ffopts="$ffopts -preset medium"  # balance encoding speed vs compression ratio
ffopts="$ffopts -profile:v main -level 3.0 "  # compatibility, see https://trac.ffmpeg.org/wiki/Encode/H.264
ffopts="$ffopts -pix_fmt yuv420p"  # pixel format of MiniDV is yuv411, x264 supports yuv420
ffopts="$ffopts -crf 23"  # The constant quality setting. Higher value = less quality, smaller file. Lower = better quality, bigger file. Sane values are [18 - 24]
ffopts="$ffopts -x264-params ref=4"

# AUDIO ENCODING OPTIONS
ffopts="$ffopts -acodec aac"
ffopts="$ffopts -ac 2 -ar 24000 -ab 80k"  # 2 channels, 24k sample rate, 80k bitrate

# GENERIC OPTIONS
ffopts="$ffopts -movflags faststart"  # Run a second pass moving the index (moov atom) to the beginning of the file.

for folder in $(cd "$srcdir" ; ls -1d $wildcard) ; do
    echo ; echo ; echo ; echo ; date ; echo "$folder" ; echo
    # do not overwrite existing files
    if [[ ! -f "$destdir/$folder.mp4" ]] ; then
        mkdir -p "$TMP"
        # raw DV is a plain stream of frames, so the scene files can simply be concatenated
        cat "$srcdir/$folder"/*.dv > "$TMP/$folder.dv"
        # $ffopts is deliberately unquoted so that it splits into separate arguments
        ffmpeg -i "$TMP/$folder.dv" $ffopts "$destdir/$folder.mp4"
        rm -frv "$TMP"
    else
        ls -l "$destdir/$folder.mp4"
    fi
done
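
Because the loop skips any episode that already has an MP4, deleting an MP4 is what queues that episode for re-encoding. A small sketch for previewing what the next run would encode is shown here; pending_episodes is a hypothetical helper, not part of the script above, and it assumes the same layout (one folder per episode in the source directory, one folder-name.mp4 per episode in the destination).

```bash
#!/bin/bash
# pending_episodes SRCDIR DESTDIR [PATTERN]
# Prints the episode folders in SRCDIR (matching PATTERN, default "20*")
# that do not yet have a DESTDIR/<folder>.mp4.
pending_episodes() {
    local srcdir=$1 destdir=$2 pattern=${3:-20*}
    local path folder
    # the trailing slash restricts the glob to directories
    for path in "$srcdir"/$pattern/ ; do
        [[ -d $path ]] || continue      # pattern matched nothing
        folder=$(basename "$path")
        if [[ ! -f $destdir/$folder.mp4 ]] ; then
            echo "$folder"
        fi
    done
}
```

For example, pending_episodes "$srcdir" "$destdir" "2003*" would list only the 2003 episodes that still need encoding, which mirrors narrowing the “wildcard” variable in the script.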

The next step was the most time-consuming (but fun) part. I wanted to curate all of the “episodes” to make sure that they each contained a single subject in its entirety. I found a few variations:

  • A single file contained two subjects: this happened if two things occurred without a 90-minute “gap” between them.
  • A single episode spanned two files: this happened when there was a gap of more than 90 minutes in the action.
  • A lead-in or fade-out that fell outside of the main timespan of the episode: this happened if I had taped an intro graphic (usually I just wrote on an index card and taped a few seconds of that) the day before an event, or if I started the next event by fading out the last image from the previous event.
  • Occasionally, a single DV file needed to be split into two. Although dvgrab usually broke scenes into files of their own, sometimes it would concatenate two scenes.
  • There were a few scenes that needed to be deleted: mis-takes, “blank filler” at the end of a tape, and so on.

To do this curation step, I loaded up “VLC” video player with a playlist of all of the episodes, and I simply watched them at 4x speed. I’d skip through predictable bits, and pay very close attention to the beginning and end of each episode. When I found something wonky, like a fade-out in its own separate directory, or a fade-out at the beginning of the next episode, I would find that DV file in the original directories and move it to the proper one.
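
One way to build such a playlist is sketched below. The function make_playlist is a hypothetical helper, and it assumes that all of the encoded episodes sit in a single directory with date-based names, so that sorting by name is the same as sorting by time.

```bash
#!/bin/bash
# make_playlist DIR -> prints an M3U playlist of DIR/*.mp4 on stdout.
# The glob expands in name order, which is chronological for
# date-based file names.
make_playlist() {
    local dir=$1 f
    echo "#EXTM3U"
    for f in "$dir"/*.mp4 ; do
        [[ -f $f ]] || continue     # no MP4 files in DIR
        echo "$f"
    done
}
```

Something like make_playlist "$destdir" > episodes.m3u followed by vlc --rate 4 episodes.m3u should then start playback at 4x speed, assuming your VLC build accepts the --rate option on the command line; the speed can also be changed from the playback menu.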

When I was done, I simply deleted the MP4 files and ran the encoding script again.

I noticed that the files would not play on my iPhone, and so I spent some time tweaking the ffmpeg options and re-encoding a few files (I limited it by changing the “wildcard” variable). Once I found the right options, I changed the wildcard back, deleted the MP4 files, and re-ran the encoder over all of the files again.

When it was all over, I ended up with 349 “episode” files in MP4 format, taking up 9.2 GB of disk space (much less than the 250 GB of the original DV files).
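
To put those numbers in perspective, here is the arithmetic as a tiny sketch. The ratio helper is made up for this illustration, and it assumes 1 GB = 1000 MB.

```bash
#!/bin/bash
# ratio A B -> A divided by B, to one decimal place
# (awk does the floating-point division that bash lacks)
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

echo "compression ratio:    $(ratio 250 9.2) to 1"
echo "average episode size: $(ratio 9200 349) MB"
```

That works out to roughly 27 to 1 overall, with an average episode of about 26 MB.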