O'Reilly Hacks
Digital Video Hacks
By Josh Paul
May 2005

HACK #45
Convert a Closed Caption File to a Script
If you've closed-captioned your project, a small amount of Perl code can extract a script for others to read.

The Code

Perl works well with text. Its syntax can be intimidating to look at and difficult to read, but a few lines of it can handle tedious reformatting jobs and save a lot of time. If you are using Mac OS X or Linux, Perl is most likely already installed on your computer. If you are using Windows, you can download Perl from ActiveState (http://www.activestate.com/Products/ActivePerl/; free).

The following is a Perl script to reformat .tds closed caption files:

	#!/usr/bin/perl
	# for Caption Center (.tds) files
	while (<>) {
		# remove everything before and including BeginData
		$_ =~ s/.*BeginData//g;
		# remove all of the >>
		$_ =~ s/>> //g;
		# remove all of the úF (byte 0x9C followed by F)
		$_ =~ s/\x9cF//g;
		# remove all of the Û (byte 0x9E) followed by 3 characters
		$_ =~ s/\x9e(...)/ /g;
		# reformat the timecodes, e.g. ù01005108 (byte 0x9D plus eight digits) to 01:00:51;08
		# (the pattern must stay on one line, and captures are written $1 to $4)
		$_ =~ s/\x9d([0-9][0-9])([0-9][0-9])([0-9][0-9])([0-9][0-9])/\r$1:$2:$3;$4\t\t/g;
		# replace all of the places where there is a tab-tab-return with a single tab
		$_ =~ s/\t\t\r/\t/g;
		# print everything back out
		print $_;
	}

That's it. That's the entire application. Fin. Done. Out.

Save the file to your computer and name it CCConverter.pl—or whatever you want; it's your application after all.
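
To see what the substitutions do without a real caption file at hand, the same steps can be wrapped in a function and run on a test string. This is only a sketch: the name convert_line and the sample line are made up for illustration, and the sample is not taken from an actual .tds file, so real data may look different.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original hack): applies the same
# substitutions as the script above to one string and returns the result.
sub convert_line {
    my ($line) = @_;
    $line =~ s/.*BeginData//g;    # drop everything up to and including BeginData
    $line =~ s/>> //g;            # drop the >> markers
    $line =~ s/\x9cF//g;          # drop byte 0x9C followed by F
    $line =~ s/\x9e.../ /g;       # byte 0x9E plus 3 characters becomes a space
    # byte 0x9D plus eight digits becomes return, HH:MM:SS;FF, tab, tab
    $line =~ s/\x9d(\d\d)(\d\d)(\d\d)(\d\d)/\r$1:$2:$3;$4\t\t/g;
    $line =~ s/\t\t\r/\t/g;       # collapse tab-tab-return into a single tab
    return $line;
}

# Made-up input, assembled only to exercise each substitution once.
my $sample = "headerBeginData\x9d01005108\x9cF>> Hello\x9eabcworld";
my $result = convert_line($sample);

# Make the carriage return and tabs visible before printing.
(my $visible = $result) =~ s/\r/\\r/g;
$visible =~ s/\t/\\t/g;
print "$visible\n";    # prints \r01:00:51;08\t\tHello world
```

The printed line shows the timecode rewritten as 01:00:51;08, followed by two tabs and the caption text with the control bytes stripped out.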

