Using PowerShell to convert VTT to SRT
Converting between VTT and SRT
So I have to convert between VTT and SRT a few times a year. I’m lazy and forgetful so I decided to script the tasks with PowerShell.
We have a quarterly developer conference at work. With 7500+ employees, we do 4-5 tracks of 3-4 sessions each. Like everything else these days, it’s a virtual conference and is presented through Microsoft Teams. The sessions are recorded and we are building up a library as a reference. And yes, we are hiring.
When we get the videos from Teams, we also get a caption file, in VTT format. We have employees who are hearing impaired and they make use of closed captions. When you have a session with multiple presenters, Teams’ ability to tag the speaker’s name to the text is very helpful for the viewers. That alone makes it worthwhile to use the caption files instead of captioning in realtime as the viewer watches the video.
VTT is the shortened version of WebVTT, which stands for Web Video Text Tracks. It’s a W3C standard designed to be used by web browsers. It’s a flexible format that supports a lot of functionality.
I’m on the team that puts on the conference and one of my tasks is to clean up the videos. I’ll go in and trim off leading and trailing dead time on each video. Some presenters will trigger the recording 5, 10, even 15 minutes before the session starts.
When you trim the video, you have to make sure the caption file is also edited. Otherwise the timestamps will be off. The presenter will say “Hello” and the caption would show up minutes later.
My tool for editing the videos is Adobe Premiere. I have it and I know enough about how to use it that I can make quick work of trimming the videos. After I edit the video, I submit it to Adobe’s batch encoder and move to the next video while the last one is being rendered.
Premiere doesn’t recognize VTT files. Or at least, I couldn’t get it to recognize them. Premiere uses the old SRT format. SRT (SubRip Text) was the format used by SubRip, a program that could scan rendered videos and extract subtitles from them.
The formats
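The VTT and SRT formats are similar, but not interchangeable. VTT starts with a WEBVTT header, uses a period before the milliseconds, and can leave off the hours. SRT has no header, uses a comma before the milliseconds, and always includes the hours.
VTT | SRT |
---|---|
WEBVTT | |
1 | 1 |
00:00.000 --> 00:00.900 | 00:00:00,000 --> 00:00:00,900 |
Some dialog here | Some dialog here |
2 | 2 |
00:05:16.400 --> 00:05:25.300 | 00:05:16,400 --> 00:05:25,300 |
This is an example of | This is an example of |
a subtitle - 2nd subtitle. | a subtitle - 2nd subtitle. |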
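To make the timestamp difference concrete, here is a minimal sketch of converting one VTT timing line to the SRT form. The function name Convert-VttTimingToSrt is made up for this example and is not taken from the scripts in the Gist. It assumes the timing line has no cue settings after the second timestamp.

```powershell
# Hypothetical helper, not from the original scripts: converts the timestamps
# on a single WebVTT timing line to the SRT layout.
function Convert-VttTimingToSrt {
    param (
        [string]$inputString
    )
    # WebVTT timestamp: optional hours, then mm:ss.ttt
    $pattern = '(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})'
    [regex]::Replace($inputString, $pattern, {
        param($m)
        # SRT always carries the hours, so default to 00 when VTT left them off
        $hours = if ($m.Groups[1].Success) { $m.Groups[1].Value } else { '00' }
        # SRT uses a comma before the milliseconds
        '{0}:{1}:{2},{3}' -f $hours, $m.Groups[2].Value, $m.Groups[3].Value, $m.Groups[4].Value
    })
}

Convert-VttTimingToSrt -inputString '00:00.000 --> 00:00.900'
# 00:00:00,000 --> 00:00:00,900
Convert-VttTimingToSrt -inputString '00:05:16.400 --> 00:05:25.300'
# 00:05:16,400 --> 00:05:25,300
```

Everything outside the two timestamps, including the arrow, passes through unchanged.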
I would need to convert the VTT files to SRT, make the edits, then convert the edited SRT files to VTT. In addition to the format conversion, I was going to make a formatting change.
The VTT files generated by Teams use HTML-like tags to mark the speaker name. They use both of the following formats:
<v LastName, Firstname>Spoken dialog goes here, could have a comma and the & would be escaped</v>
<v Firstname Lastname>Spoken dialog goes here, could have a comma and the & would be escaped</v>
That is how the text would appear if you played the video back with an app like VLC. We decided to change that to look like
LastName, Firstname: Spoken dialog goes here, could have a comma and the & would be escaped
Firstname Lastname: Spoken dialog goes here, could have a comma and the & would be escaped
PowerShell to the rescue
I wrote a pair of PowerShell scripts to convert from VTT to SRT and back again. The VTT to SRT conversion would also handle the <v> tags.
Converting VTT to SRT
This is the part of the post where I walk through the code. I commented the PowerShell code and it’s all simple. Both scripts are available as GitHub Gist links.
Closer look
Let’s take a close look at the Update-NameFormat function:
function Update-NameFormat {
    param (
        [string]$inputString
    )
    $vpattern = "^<v "
    $trailingPattern = "</v>$"
    # Does the string start with "<v "?
    if ($inputString -match $vpattern) {
        # Remove the leading "<v "
        $outputString = $inputString -replace $vpattern, ''
        # Look for the ">" that closes the opening <v> tag and swap it for ": "
        $index = $outputString.IndexOf('>')
        if ($index -ge 0) {
            $outputString = $outputString.Substring(0, $index) + ": " + $outputString.Substring($index + 1)
        }
    } else {
        $outputString = $inputString
    }
    # Remove any trailing </v>
    $outputString = $outputString -replace $trailingPattern, ''
    # Decode HTML entities
    $outputString = Convert-HtmlEntities -inputString $outputString
    return $outputString.Trim()
}
In the Update-NameFormat function, I have the code that replaces the <v></v> tags. I tried doing the <v> replacements with just RegEx, but getting it to match commas and spaces for the name separators made the code clunky. I love a good RegEx, but I prefer simple code that I can read 6 months from now and still understand.
Instead of matching on the <v Last, First> or <v First Last> patterns, I broke it up into partial matches and replacements. With the code above, I look for ^<v followed by a space, which matches “<v ” at the start of the line. If a match is found, it’s removed. Then we look for the first occurrence of “>” and replace it with “: ”. Finally, we do a RegEx replacement for the </v> pattern.
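To see those steps on a real line, here is a short trace that runs each one by hand on a made-up caption line. Convert-HtmlEntities isn’t shown in this post, so this sketch assumes .NET’s [System.Net.WebUtility]::HtmlDecode as a stand-in for the decoding step. The actual function in the Gist may do it differently.

```powershell
# Sample caption line (invented for this example)
$line = '<v Lastname, Firstname>Q&amp;A starts now, thanks</v>'

# Step 1: strip the leading "<v "
$step1 = $line -replace '^<v ', ''
# Lastname, Firstname>Q&amp;A starts now, thanks</v>

# Step 2: replace the first ">" with ": "
$index = $step1.IndexOf('>')
$step2 = $step1.Substring(0, $index) + ': ' + $step1.Substring($index + 1)
# Lastname, Firstname: Q&amp;A starts now, thanks</v>

# Step 3: drop the trailing </v>
$step3 = $step2 -replace '</v>$', ''
# Lastname, Firstname: Q&amp;A starts now, thanks

# Step 4: decode HTML entities (assumption: HtmlDecode stands in for Convert-HtmlEntities)
$result = [System.Net.WebUtility]::HtmlDecode($step3).Trim()
$result
# Lastname, Firstname: Q&A starts now, thanks
```

Because only the first “>” is touched, the comma in the name and any commas in the dialog never get in the way.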
Converting SRT back to VTT
With the caption files in SRT format, I can load the video and the caption file into Premiere. When I trim off the video, Premiere updates the timestamps in the caption file. When I’m done, I just need to convert the SRT files to VTT.
The script to convert SRT to VTT is simpler. Since it’s not doing any <v> replacements or HTML decoding, it’s “read a line, write a line”.
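As a rough illustration of that “read a line, write a line” shape, here is a minimal sketch. The function name Convert-SrtTextToVtt is hypothetical and this is not the script from the Gist. It assumes the only changes needed are the WEBVTT header and the comma-to-period swap on timing lines.

```powershell
# Hypothetical sketch, not the script from the Gist
function Convert-SrtTextToVtt {
    param (
        [string[]]$Lines
    )
    # A WebVTT file starts with the WEBVTT signature followed by a blank line
    'WEBVTT'
    ''
    foreach ($line in $Lines) {
        if ($line -match '^\d{2,}:\d{2}:\d{2},\d{3} --> \d{2,}:\d{2}:\d{2},\d{3}') {
            # Timing line: SRT puts a comma before the milliseconds, VTT a period
            $line -replace '(\d{2}),(\d{3})', '$1.$2'
        } else {
            # Counters, dialog, and blank lines pass through untouched
            $line
        }
    }
}

$srt = @(
    '1'
    '00:05:16,400 --> 00:05:25,300'
    'Lastname, Firstname: This is an example of'
    'a subtitle, 2nd subtitle.'
)
Convert-SrtTextToVtt -Lines $srt
```

Checking for a timing line first keeps commas in the dialog from being changed by the replacement.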
Running the scripts
To run either script, you just pass the file name in. It can be an exact match or you can use wildcards. I wrote these scripts for Windows, using PowerShell 7.5.0. They should work on other platforms with any recent version of PowerShell.
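For anyone curious how one parameter can cover both an exact name and a wildcard, here is a minimal sketch that leans on Get-ChildItem to do the expansion. The function name Get-CaptionTarget and its output shape are invented for this example. The scripts in the Gist may handle it another way.

```powershell
# Hypothetical helper, not from the original scripts: expands a file name or
# wildcard and pairs each VTT file with the SRT name it would be written to.
function Get-CaptionTarget {
    param (
        [string]$Path
    )
    # Get-ChildItem accepts an exact file name or a wildcard such as *.vtt
    foreach ($file in Get-ChildItem -Path $Path -File) {
        [pscustomobject]@{
            Source = $file.FullName
            Target = [System.IO.Path]::ChangeExtension($file.FullName, '.srt')
        }
    }
}

# Usage, assuming some .vtt files in the current folder:
# Get-CaptionTarget -Path '*.vtt'
```

A single exact file name goes through the same loop, it just yields one item.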