Workflow
Overview
YouTube Subtitler takes a slightly different approach to subtitling videos, so the workflow may differ from what you're used to in other subtitling applications or sites. Here's a general description of the workflow this site promotes.
The workflow is divided into two main steps, transcribing and synchronizing, with a small step between them: processing. Here's a brief description of each step:
- Transcribing
  This is the first step. At this point, you play the video and type everything you want to appear in the subtitles. Don't worry about timing the transcript, just type everything. A non-professional who types reasonably fast and understands the language spoken in the video should be able to transcribe 10 minutes of video in 40-60 minutes.
- Processing
  Once you're done with the transcript, you process it to convert it into subtitles. This conversion is done by the system. You only review the processed subtitles and approve them (or not).
- Synchronizing
  Once the transcript has been processed into subtitles, there's the small matter of synchronizing the subtitles with the video. This is done by playing the video and marking the start- and end-time (also called in- and out-time) for each subtitle. While this may sound scary, it usually takes 10-15 minutes to synchronize a 10-minute video.
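To get a rough feel for the total effort, the figures above can be turned into a quick estimate. The sketch below assumes the time scales linearly with video length, which the figures above don't actually promise; the function name `estimate_minutes` is made up for this illustration.

```python
from typing import Tuple


def estimate_minutes(video_minutes: float) -> Tuple[float, float]:
    """Return a (low, high) estimate of total subtitling time in minutes.

    Assumption: effort scales linearly with video length.
    """
    # Ratios derived from the figures quoted for a 10-minute video.
    transcribe_low, transcribe_high = 40 / 10, 60 / 10
    sync_low, sync_high = 10 / 10, 15 / 10
    low = video_minutes * (transcribe_low + sync_low)
    high = video_minutes * (transcribe_high + sync_high)
    return low, high


print(estimate_minutes(10))  # (50.0, 75.0)
```

So under that assumption, a 10-minute video takes roughly 50-75 minutes from start to finish, not counting the short processing step.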
Workflow Example
For this example, let's add subtitles to the intro of Star Trek (the original TV series from 1966, starring William Shatner and Leonard Nimoy).
Transcribing
During the transcribing phase, you watch the video and type everything that's being said into a large text field. Keep typing while the video is playing. If you can't keep up, pause the video, or rewind a bit and continue typing. It's easier if you use the shortcut keys for common actions like play and pause (see Tips and Tricks). In the end, the transcript could look something like this:

Notice that there are empty lines separating the 4 paragraphs in the transcript.
Each paragraph represents a consecutive flow of spoken words. Whenever there's
a pause, you should leave an empty line and start a new paragraph. Later when
processing the transcript, empty lines mean the next paragraph should start a
new subtitle (which makes sense if there's a pause in the dialog).
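The empty-line rule described above can be pictured with a small sketch. This is not the site's actual code, only an illustration of the idea; the function name `split_paragraphs` and the sample text are made up.

```python
import re
from typing import List


def split_paragraphs(transcript: str) -> List[str]:
    """Split a transcript into paragraphs separated by one or more empty lines."""
    # An "empty line" here is any line containing only whitespace.
    blocks = re.split(r"\n\s*\n", transcript.strip())
    # Join the lines inside each paragraph into one run of text.
    return [" ".join(block.split()) for block in blocks if block.strip()]


sample = "First thing said.\n\nSecond thing said,\ncontinued here.\n\n\nThird."
for paragraph in split_paragraphs(sample):
    print(paragraph)
```

A single line break inside a paragraph is treated as ordinary whitespace, so only empty lines mark a pause.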
Processing
Once the transcript is ready, we ask the system to process it (convert it to subtitles). The system will show us a preview of this process:

As you can see, processing converted each paragraph into one or more subtitles.
For example, the third paragraph (It's five year mission...) was split into two
subtitles. Also notice that the first paragraph takes up only a single line of
the first subtitle (each subtitle has two lines), yet the second paragraph
doesn't start on the second line of that subtitle. There was an empty line
between the paragraphs, which means there might be a long pause between the
first and second paragraphs, so they can't share a single subtitle.
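The behaviour described above can be sketched roughly as follows. The two-lines-per-subtitle limit comes from the description; the 40-character line width, the function name `paragraphs_to_subtitles`, and the sample paragraphs are assumptions for illustration, and the site's real splitting rules may well differ.

```python
import textwrap
from typing import List


def paragraphs_to_subtitles(
    paragraphs: List[str],
    line_width: int = 40,         # assumed maximum characters per line
    lines_per_subtitle: int = 2,  # each subtitle has two lines
) -> List[List[str]]:
    """Turn each paragraph into one or more subtitles (lists of lines)."""
    subtitles = []
    for paragraph in paragraphs:
        lines = textwrap.wrap(paragraph, width=line_width)
        # Group lines per paragraph, so a subtitle never mixes two paragraphs.
        for i in range(0, len(lines), lines_per_subtitle):
            subtitles.append(lines[i:i + lines_per_subtitle])
    return subtitles


paragraphs = [
    "Short opening line.",
    "This second paragraph is long enough that it needs more than "
    "two lines once wrapped to the assumed width.",
]
for number, lines in enumerate(paragraphs_to_subtitles(paragraphs), start=1):
    print(number, lines)
```

Because the grouping restarts for every paragraph, a short paragraph leaves the second line of its subtitle empty instead of borrowing text from the next paragraph.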
Synchronizing
Once the transcript is split into subtitles, we just need to synchronize them, that is, specify the start- and end-time (also called in- and out-time) for each subtitle. At the end of this process we'll have a list of timed subtitles that can be shown at the right moments while the video is playing.
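As a rough picture of what a list of timed subtitles amounts to, here is a sketch that writes in- and out-times in the widely used SubRip (.srt) layout. Whether this site stores or exports subtitles in that format is an assumption; the function names and sample times are invented for the example.

```python
from typing import List, Tuple


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(subtitles: List[Tuple[float, float, List[str]]]) -> str:
    """Render (in_time, out_time, lines) tuples, with times in seconds, as SRT."""
    entries = []
    for index, (start, end, lines) in enumerate(subtitles, start=1):
        if end <= start:
            raise ValueError(f"subtitle {index}: out-time must be after in-time")
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        entries.append(f"{index}\n{timing}\n" + "\n".join(lines))
    return "\n\n".join(entries) + "\n"


timed = [
    (1.0, 3.5, ["Hello there."]),           # made-up in/out times
    (4.0, 6.0, ["Second", "subtitle"]),
]
print(to_srt(timed))
```

Each entry carries exactly the information gathered while synchronizing: a sequence number, the in- and out-time, and the one or two lines of text.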
The result looks like this: