

I have a Nissan Altima with a BOSE radio that lets me plug in a USB thumb drive containing mp3 files. The problem is that most of my audiobooks are in m4b format. Previously I’ve used tools like mp3splt to split on ‘silence’ or at timed increments (say 15 min), but I kept getting mp3 files that were split mid-sentence and sometimes mid-word. It became very annoying after a while.

So I came up with a really simple script to convert an audiobook (m4b) into mp3 files, splitting on the chapters. It depends on FFmpeg::Command and a modified FFprobe Perl module.

In the following example, we are converting a Ben Bova audiobook, but we tell the script to start the track numbering at “10” because the 2nd file ended with track “9”.

jason@jason-Inspiron-1545 ~/bin $ ./test_mp4_info.pl -i "/home/jason/Audiobooks/Ben Bova/Mars/Mars 3.m4b" -o mp3 -a "Mars" -t 10
Converting "Mars 3.m4b" to "mp3/010 Mars.mp3"
  album: Mars
  artist: Ben Bova
  title: 010 – Mars
  genre: Audiobook
  track: 10
  … COMPLETE
Converting "Mars 3.m4b" to "mp3/011 Mars.mp3"
  album: Mars
  artist: Ben Bova
  title: 011 – Mars
  genre: Audiobook
  track: 11
  … COMPLETE
Converting "Mars 3.m4b" to "mp3/012 Mars.mp3"
  album: Mars
  artist: Ben Bova
  title: 012 – Mars
  genre: Audiobook
  track: 12
  … COMPLETE
Converting "Mars 3.m4b" to "mp3/013 Mars.mp3"
  album: Mars
  artist: Ben Bova
  title: 013 – Mars
  genre: Audiobook
  track: 13
  … COMPLETE
Converting "Mars 3.m4b" to "mp3/014 Mars.mp3"
  album: Mars
  artist: Ben Bova
  title: 014 – Mars
  genre: Audiobook
  track: 14
  … COMPLETE
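
Rather than looking up the last track number by hand, the value for -t could be derived from what is already in the output directory. This is only a minimal sketch: next_track is a hypothetical helper, and it assumes the "NNN Album.mp3" naming shown in the output above.

```shell
# next_track DIR: print one more than the highest leading track number
# among DIR/*.mp3, or 1 when there are none.
# Hypothetical helper; assumes names like "009 Mars.mp3".
next_track() (
  max=0
  for f in "$1"/*.mp3; do
    [ -e "$f" ] || continue            # the glob matched nothing
    name=${f##*/}                      # drop the directory part
    num=${name%%[!0-9]*}               # keep only the leading digits
    [ -n "$num" ] || continue          # no track number in this name
    # Strip leading zeros so "009" is not read as a (bad) octal number.
    while [ "${num#0}" != "$num" ]; do num=${num#0}; done
    [ -n "$num" ] || num=0
    if [ "$num" -gt "$max" ]; then max=$num; fi
  done
  echo $((max + 1))
)
```

With that in place, the call above could pass -t "$(next_track mp3)" instead of a hard-coded 10.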

Source code:

#!/usr/bin/perl

use strict;
use warnings;

use lib qw(/home/jason/bin);

use Getopt::Std;
use File::Basename;
use FFmpeg::Command;
use FFprobe;

$|++;

###############################
sub _encode_mp3 {
  my ($input_file, $output_dir, $album, $starting_track) = @_;

  my %tags = ();
  my $track_number;
  my $mp4 = FFprobe->probe_file($input_file);
  my $base_output_file = basename($input_file);
  $base_output_file =~ s/\.\w+$//;

  # The m4b comment serves as a fallback genre; a real genre tag overrides it.
  if (exists $mp4->{format}->{'TAG:comment'}) {
    $tags{genre} = $mp4->{format}->{'TAG:comment'};
    $tags{genre} =~ s/["']//g;    # strip quote characters from the tag value
  }

  if (exists $mp4->{format}->{'TAG:genre'}) {
    $tags{genre} = $mp4->{format}->{'TAG:genre'};
    $tags{genre} =~ s/["']//g;
  }

  if (exists $mp4->{format}->{'TAG:artist'}) {
    $tags{artist} = $mp4->{format}->{'TAG:artist'};
    $tags{artist} =~ s/["']//g;
  }

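  if ($album) {
    $tags{album} = $album;
  } elsif (exists $mp4->{format}->{'TAG:album'}) {
    $tags{album} = $mp4->{format}->{'TAG:album'};
  } else {
    # No album given or tagged: fall back to the input file's base name.
    $tags{album} = $base_output_file;
  }

  $tags{album} =~ s/["']//g;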
  $track_number = $starting_track if $starting_track;

  # Sort numerically so that chapter 10 comes after chapter 9, not after 1.
  foreach my $chapter (sort { $a <=> $b } keys %{$mp4->{chapters}}) {
    unless ($starting_track) {
      $track_number = $chapter;
    }

    my $output_file = sprintf "%s/%03d %s.mp3", $output_dir, $track_number, $tags{album};
    my $start = $mp4->{chapters}->{$chapter}->{start};
    my $duration = $mp4->{chapters}->{$chapter}->{end} - $start;
    my @options = ();

    if ($album) {
      $tags{title} = sprintf "%03d – %s", $track_number, $album;
    } else {
      if (exists $mp4->{format}->{'TAG:title'}) {
        $tags{title} = sprintf "%03d – %s", $track_number, $mp4->{format}->{'TAG:title'};
      } else {
        $tags{title} = sprintf "%03d – %s", $track_number, $base_output_file;
      }
    }

    $tags{title} =~ s/["']//g;

    my $ffmpeg = FFmpeg::Command->new;

    $ffmpeg->input_options({
        file => $input_file,
     });

    $ffmpeg->output_options({
      'file' => $output_file,
      'audio_codec' => 'libmp3lame',
      'audio_bit_rate' => 64,
    });

    printf "Converting \"%s\" to \"%s\"\n", basename($input_file), $output_file;

    foreach my $tag (keys %tags) {
      push @options, '-metadata', $tag . "=" . $tags{$tag};
      printf "\t%s: %s\n", $tag, $tags{$tag};
    }

    push @options,
      '-metadata' => 'track=' . $track_number,
      '-ss' => $start,
      '-t' => $duration;

    printf "\ttrack: %d\n", $track_number;

    $ffmpeg->options(
      @options
    );

    $ffmpeg->exec();
    print "\t… COMPLETE\n";
    $track_number++ if $starting_track;
  }
}
###############################

my %arg_options = ();
getopts('a:i:o:t:', \%arg_options);

if ($arg_options{i} && $arg_options{o}) {
  my $input_file = $arg_options{i};
  my $output_dir = $arg_options{o};
  my $starting_track = $arg_options{t};
  my $album = $arg_options{a};

  if (-f $input_file && -d $output_dir) {
    _encode_mp3($input_file, $output_dir, $album, $starting_track);
  } else {
    warn ("Unable to find file: \"" . $input_file . "\"\n") unless -f $input_file;
    warn ("Unable to find dir: \"" . $output_dir . "\"\n") unless -d $output_dir;
  }
}
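
Each pass through the chapter loop boils down to a single ffmpeg call with a start offset and a duration. As a rough illustration, the sketch below prints (but does not run) a comparable command line. chapter_cmd is a hypothetical helper, the 64k bit rate is assumed to match the intent of audio_bit_rate above, and the -c:a/-b:a spellings assume a reasonably recent ffmpeg (older builds use -acodec/-ab). The exact arguments FFmpeg::Command generates may differ.

```shell
# chapter_cmd INPUT OUTPUT START END TRACK
# Print (not run) an ffmpeg command that encodes one chapter to MP3.
# Hypothetical helper; the quoting is naive and will not survive file
# names that themselves contain double quotes.
chapter_cmd() (
  # Chapter times are fractional seconds, so let awk do the subtraction.
  duration=$(awk -v s="$3" -v e="$4" 'BEGIN { printf "%.6f", e - s }')
  printf 'ffmpeg -i "%s" -ss %s -t %s -vn -c:a libmp3lame -b:a 64k -metadata track=%s "%s"\n' \
    "$1" "$3" "$duration" "$5" "$2"
)
```

Printing the command first makes it easy to eyeball the offsets before committing to a long encode.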

I have multiple audiobook files (m4b) whose chapters ffprobe can read just fine… except that the chapter information is printed to stderr and never appears in the formatted (STDOUT) output. The Perl module FFprobe doesn’t handle the chapters, so I submitted feature request #73803.

The feature request asks for the chapter output to be formatted as well.

jason@jason-Inspiron-1545 ~/bin $ ffprobe "/home/jason/Audiobooks/Ben Bova/Mars/Mars 1.m4b" 1>/dev/null
….
  libavutil    51.  7. 0 / 51.  7. 0
  libavcodec   53.  5. 0 / 53.  5. 0
  libavformat  53.  2. 0 / 53.  2. 0
  libavdevice  53.  0. 0 / 53.  0. 0
  libavfilter   2.  4. 0 /  2.  4. 0
  libswscale    2.  0. 0 /  2.  0. 0
  libpostproc  52.  0. 0 / 52.  0. 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0xddfac0] max_analyze_duration reached
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/jason/Audiobooks/Ben Bova/Mars/Mars 1.m4b':
  Metadata:
    major_brand     : M4B
    minor_version   : 0
    compatible_brands: M4B mp42isom
    creation_time   : 2009-09-08 16:19:29
    album           : Mars
    artist          : Ben Bova
    genre           : Audiobook
  Duration: 03:51:23.41, start: 0.000000, bitrate: 81 kb/s
    Chapter #0.0: start 0.000000, end 2779.567914
    Metadata:
      title           : Mars – 01 of 24
    Chapter #0.1: start 2779.567914, end 5555.049161
    Metadata:
      title           : Mars – 02 of 24
    Chapter #0.2: start 5555.049161, end 8334.617075
    Metadata:
      title           : Mars – 03 of 24
    Chapter #0.3: start 8334.617075, end 11110.098322
    Metadata:
      title           : Mars – 04 of 24
    Chapter #0.4: start 11110.098322, end 13883.419864
    Metadata:
      title           : Mars – 05 of 24
    Stream #0.0(und): Audio: aac, 44100 Hz, stereo, s16, 80 kb/s
    Metadata:
      creation_time   : 2009-09-08 16:19:29
    Stream #0.1(eng): Subtitle: text / 0x74786574
    Metadata:
      creation_time   : 2009-09-08 17:31:00
Unsupported codec with id 94213 for input stream 1
jason@jason-Inspiron-1545 ~/bin $

Patch to add m4b chapter support:

82c82
< my ($tree, $branch, $tag, $stream);
---
>     my ($tree, $branch, $tag, $stream, $chapter);
100c100,108
< }
---
>   } elsif ($line =~ m/Chapter \#(\d+\.*\d+): start (\d+\.*\d+)\, end (\d+\.*\d+)/i) {
>       my ($start, $end) = ($2, $3);
>       $chapter = $1;
>       $chapter =~ s/\.//g;
>       $chapter =~ s/^0+(\d)/$1/;
>
>       $$tree{chapters}{$chapter} = { start => $start, end => $end };
>     } elsif ($line =~ /title\s+: (.+)$/) {
>       $$tree{chapters}{$chapter}{title} = $1;
101a110
>   }

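The same chapter-matching idea can be tried without touching the Perl module. The sketch below is not the patch itself: parse_chapters is a hypothetical awk-based helper that assumes the "Chapter #0.N: start S, end E" layout shown in the ffprobe output above, with each title on a later "title :" line. Newer ffprobe builds also offer a -show_chapters option, which may make this kind of scraping unnecessary.

```shell
# parse_chapters: read ffprobe's human-readable (stderr) output on stdin
# and print one "index<TAB>start<TAB>end<TAB>title" row per chapter.
# Hypothetical helper; assumes the layout shown in the sample output.
parse_chapters() {
  awk '
    function emit() {
      if (have) printf "%s\t%s\t%s\t%s\n", idx, start, stop, title
      have = 0; title = ""
    }
    /Chapter #[0-9]+\.[0-9]+: start/ {
      emit()                                   # finish the previous chapter
      idx = $2; sub(/^#[0-9]+\./, "", idx); sub(/:$/, "", idx)
      start = $4; sub(/,$/, "", start)
      stop = $6
      have = 1
      next
    }
    /Stream #/ { emit() }                      # the chapter list is over
    have && /title[ \t]*:/ {
      title = $0; sub(/^[^:]*:[ \t]*/, "", title)
    }
    END { emit() }
  '
}
```

Something like ffprobe "Mars 1.m4b" 2>&1 >/dev/null | parse_chapters would feed it the stderr stream only.
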
In the following example, I have 21 files that make up a bzip2-compressed image of a raw partition, split at 10GB intervals. I concatenate them with cat, pipe the result to parallel bzip2 (pbzip2) to decompress it, and write the output to a raw partition (logical volume).

 cat /mnt/DBADEV1/DBADEV1.disk.bz2.00 /mnt/DBADEV1/DBADEV1.disk.bz2.01 /mnt/DBADEV1/DBADEV1.disk.bz2.02 /mnt/DBADEV1/DBADEV1.disk.bz2.03 /mnt/DBADEV1/DBADEV1.disk.bz2.04 /mnt/DBADEV1/DBADEV1.disk.bz2.05 /mnt/DBADEV1/DBADEV1.disk.bz2.06 /mnt/DBADEV1/DBADEV1.disk.bz2.07 /mnt/DBADEV1/DBADEV1.disk.bz2.08 /mnt/DBADEV1/DBADEV1.disk.bz2.09 /mnt/DBADEV1/DBADEV1.disk.bz2.10 /mnt/DBADEV1/DBADEV1.disk.bz2.11 /mnt/DBADEV1/DBADEV1.disk.bz2.12 /mnt/DBADEV1/DBADEV1.disk.bz2.13 /mnt/DBADEV1/DBADEV1.disk.bz2.14 /mnt/DBADEV1/DBADEV1.disk.bz2.15 /mnt/DBADEV1/DBADEV1.disk.bz2.16 /mnt/DBADEV1/DBADEV1.disk.bz2.17 /mnt/DBADEV1/DBADEV1.disk.bz2.18 /mnt/DBADEV1/DBADEV1.disk.bz2.19 /mnt/DBADEV1/DBADEV1.disk.bz2.20 | pbzip2 -dcv -p4 > /dev/mapper/VG_VMH1-LV_DBADEV1

Output:

Parallel BZIP2 v1.0.5 - by: Jeff Gilchrist [http://compression.ca]
[Jan. 08, 2009]             (uses libbzip2 by Julian Seward)

         # CPUs: 4
-------------------------------------------
         File #: 1 of 1
     Input Name:
    Output Name:

 BWT Block Size: 900k
Decompressing data (no threads)
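
The long file list in the cat command can probably be shortened with a glob. A minimal sketch, where cat_parts is a hypothetical helper and the only assumption is the zero-padded two-digit suffix used above:

```shell
# cat_parts PREFIX: concatenate PREFIX.00, PREFIX.01, ... in order.
# Hypothetical helper; relies on the shell expanding globs in sorted
# order, which matches numeric order only because the suffixes are
# zero-padded to two digits.
cat_parts() {
  cat "$1".[0-9][0-9]
}
```

It would be used as cat_parts /mnt/DBADEV1/DBADEV1.disk.bz2 | pbzip2 -dcv -p4 > /dev/mapper/VG_VMH1-LV_DBADEV1.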

This will take a while, so let’s determine which file it is currently working on:

$ lsof|grep DBADEV1
cat       24137      root    3r      REG                8,1   10737418240         80 /mnt/DBADEV1/DBADEV1.disk.bz2.07

Oh boy, it’s only on file #8. Oh well, we can watch it a little more easily with the “watch” command set to run the lsof command every 5 seconds:

watch -n5 'lsof|grep DBADEV1'
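
To save counting suffixes, the lsof line can be turned into a "part N of M" figure. This is just a sketch: current_part is a hypothetical helper that assumes the two-digit ".NN" suffixes used here and a path without spaces, and it reports the first matching line only.

```shell
# current_part TOTAL: read lsof output on stdin and report which split
# file (1-based) is currently open, e.g. "8 of 21".
# Hypothetical helper; assumes the path is the last field on the line.
current_part() {
  awk -v total="$1" '
    match($NF, /\.bz2\.[0-9][0-9]$/) {
      part = substr($NF, RSTART + 5) + 1   # "07" -> 7, then make it 1-based
      printf "%d of %d\n", part, total
      exit
    }'
}
```

For example: lsof | grep DBADEV1 | current_part 21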

For those of you who are thinking about using libvirt/kvm on Linux… here is a discussion on proposed best practices.

I’m a little annoyed that setting a ‘default’ connect string works differently in virt-top and virsh:

virsh uses the environment variable VIRSH_DEFAULT_CONNECT_URI
export VIRSH_DEFAULT_CONNECT_URI='qemu:///system'

virt-top uses the config file .virt-toprc
connect qemu:///system