Batch Transcript

Fetch transcripts for entire playlists with concurrency control, progress tracking, and partial failure handling

Overview

transcribePlaylist() fetches transcripts for every video in a YouTube playlist. It combines the YouTube Data API v3 (to list playlist videos) with the Innertube transcript module (to fetch each transcript).

API key required

Unlike single-video transcripts, batch operations require a YouTube Data API key to fetch the playlist's video list. Each video's transcript still uses the Innertube API (no additional quota).

Import from the main entry point:

import { transcribePlaylist } from 'lyra-sdk'

Or from the transcript subpath:

import { transcribePlaylist } from 'lyra-sdk/transcript'

transcribePlaylist(playlistUrlOrId, options)

Fetch transcripts for all videos in a playlist. Accepts a playlist ID or URL.

import { transcribePlaylist } from 'lyra-sdk'

const result = await transcribePlaylist('PLrAXtmErZgOeiKm4sgNOknGvNjby9efdf', {
  apiKey: process.env.YOUTUBE_API_KEY!,
})

Also accepts playlist URLs:

const result = await transcribePlaylist(
  'https://www.youtube.com/playlist?list=PLrAXtmErZgOeiKm4sgNOknGvNjby9efdf',
  { apiKey: process.env.YOUTUBE_API_KEY! }
)
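
How the SDK turns a URL into a playlist ID is not described on this page. A minimal sketch of one way to do it, assuming a hypothetical helper named resolvePlaylistId (not part of lyra-sdk) that reads the list query parameter and otherwise treats the input as a bare ID:

```typescript
// Hypothetical helper, not part of lyra-sdk: extracts a playlist ID from
// either a bare ID or a URL that carries a `list` query parameter.
function resolvePlaylistId(input: string): string {
  const trimmed = input.trim()

  let url: URL | undefined
  try {
    // new URL() throws on strings that are not absolute URLs, such as bare IDs.
    url = new URL(trimmed)
  } catch {
    url = undefined
  }

  // Not a URL: assume the caller passed a playlist ID directly.
  if (!url) return trimmed

  const list = url.searchParams.get('list')
  if (!list) throw new Error(`URL has no "list" parameter: ${trimmed}`)
  return list
}
```

Both call styles shown above would resolve to the same ID under this sketch.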

Options

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| apiKey | string | | Required. YouTube Data API v3 key |
| lang | string | | BCP 47 language code for transcripts |
| concurrency | number | 3 | Max parallel transcript fetches |
| from | number | | 1-indexed start position in the playlist |
| to | number | | 1-indexed end position in the playlist |
| onProgress | (done: number, total: number, videoId: string, status: string) => void | | Progress callback fired after each video |
| cache | CacheStore | | Cache instance for transcript data |
| retries | number | 0 | Max retry attempts per video's transcript fetch |
| retryDelay | number | 1000 | Base delay in ms for exponential backoff |
| customFetch | (url: string, init?: RequestInit) => Promise<Response> | fetch | Custom fetch function for proxy/networking support |

All options except apiKey, concurrency, from, to, and onProgress are forwarded to the transcript fetch for each video.
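
The retries and retryDelay options describe exponential backoff, but the exact formula is not given here. The sketch below assumes the common form where the delay doubles after each failed attempt (retryDelay × 2^attempt) with no jitter; backoffDelay and withRetries are hypothetical names, not SDK exports.

```typescript
// Assumption: the delay doubles after each failed attempt. Whether the SDK
// uses this exact formula or adds jitter is not stated in this document.
function backoffDelay(retryDelay: number, attempt: number): number {
  return retryDelay * 2 ** attempt
}

// Hypothetical wrapper: one initial attempt plus up to `retries` retries.
async function withRetries<T>(
  task: () => Promise<T>,
  retries = 0,
  retryDelay = 1000,
): Promise<T> {
  let attempt = 0
  while (true) {
    try {
      return await task()
    } catch (err) {
      // Out of retries: surface the last error to the caller.
      if (attempt >= retries) throw err
      const delay = backoffDelay(retryDelay, attempt)
      await new Promise<void>((resolve) => setTimeout(resolve, delay))
      attempt++
    }
  }
}
```

Under this assumption, retries: 2 with retryDelay: 500 would wait 500 ms and then 1000 ms before giving up on a video.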

Response

Returns a PlaylistTranscriptResult:

| Property | Type | Description |
| --- | --- | --- |
| playlistId | string | Resolved playlist ID |
| totalVideos | number | Total videos in the playlist |
| requestedRange | [number, number] | Actual range fetched [from, to] |
| succeeded | number | Number of successfully transcribed videos |
| failed | number | Number of videos that failed |
| results | VideoTranscriptResult[] | Per-video results (see below) |

VideoTranscriptResult

Each item in results is either a success or a failure:

Success:

| Property | Type | Description |
| --- | --- | --- |
| videoId | string | YouTube video ID |
| title | string | Video title from the playlist |
| position | number | 1-indexed position in the playlist |
| status | "success" | Result status |
| lines | TranscriptLine[] | Transcript lines |

Failed:

| Property | Type | Description |
| --- | --- | --- |
| videoId | string | YouTube video ID |
| title | string | Video title from the playlist |
| position | number | 1-indexed position in the playlist |
| status | "failed" | Result status |
| error | string | Error message |

Concurrency control

The concurrency option controls how many transcripts are fetched in parallel. A bounded concurrency pool ensures at most N requests are in-flight at any time.

// Fetch 6 videos at a time
const result = await transcribePlaylist(playlistId, {
  apiKey,
  concurrency: 6,
})

How the pool works

Worker creation

The pool spawns min(concurrency, totalVideos) workers. For a 50-video playlist with concurrency: 3, only 3 workers are created.

Shared index counter

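Each worker claims the next video index from a shared counter (next++). JavaScript runs the read-and-increment synchronously on a single thread, so no two workers can claim the same index. Every video is therefore processed exactly once, and fetches start in playlist order regardless of when they finish.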

Result placement

Results are placed into a pre-allocated array at the original index. The output order always matches the playlist order, even though fetches complete out of order.
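
The three steps above can be sketched as a small generic pool. This is an illustration under stated assumptions, not the SDK's source: runPool and its parameters are hypothetical names, and concurrency is assumed to be at least 1.

```typescript
// Minimal sketch of a bounded concurrency pool. Names are illustrative and
// not taken from the lyra-sdk source.
async function runPool<T, R>(
  items: readonly T[],
  concurrency: number,
  task: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  // Pre-allocated so each result lands at its original index.
  const results = new Array<R>(items.length)
  let next = 0 // shared index counter

  async function worker(): Promise<void> {
    while (true) {
      // Synchronous read-and-increment: no other worker can run between
      // the read and the write, so each index is claimed exactly once.
      const index = next++
      if (index >= items.length) return
      results[index] = await task(items[index], index)
    }
  }

  // Never spawn more workers than there are items.
  const workerCount = Math.min(concurrency, items.length)
  await Promise.all(Array.from({ length: workerCount }, () => worker()))
  return results
}
```

In this sketch a rejected task rejects the whole pool, so a caller that wants per-item failure records would catch inside task and return a failure value instead of throwing.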

Choosing a concurrency value

| Value | Best for |
| --- | --- |
| 1 | Strict sequential processing, minimal network load |
| 3 | Default; balances speed and YouTube rate limits |
| 5–10 | Large playlists (100+ videos) on stable connections |
| 10+ | Only with caching or proxy rotation |

Higher concurrency increases the chance of being rate-limited by YouTube. If you see frequent failures, lower the concurrency or enable caching and retries.


Range filtering

Use from and to to fetch transcripts for a subset of the playlist. Both are 1-indexed (matching YouTube's playlist position numbering).

// Fetch only videos 5 through 15
const result = await transcribePlaylist(playlistId, {
  apiKey,
  from: 5,
  to: 15,
})

Omitting from starts at position 1. Omitting to goes to the end of the playlist.

Validation

| Condition | Error |
| --- | --- |
| from < 1 | TranscriptPlaylistError: "from" must be >= 1 |
| to < 1 | TranscriptPlaylistError: "to" must be >= 1 |
| to < from | TranscriptPlaylistError: "to" must be >= "from" |

If from exceeds the playlist length, an empty result set is returned (no error).
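
The validation rules and defaults above can be expressed as one small function. This is a sketch only: resolveRange is a hypothetical helper, the error class is a local stand-in so the block is self-contained, and clamping to at the playlist length is an assumption based on requestedRange being described as the actual range fetched.

```typescript
// Local stand-in for the SDK's error class so this sketch is self-contained.
class TranscriptPlaylistError extends Error {}

// Hypothetical helper: validates from/to and returns 1-indexed inclusive
// bounds, or null when the range selects no videos.
function resolveRange(
  totalVideos: number,
  from?: number,
  to?: number,
): [number, number] | null {
  if (from !== undefined && from < 1) {
    throw new TranscriptPlaylistError('"from" must be >= 1')
  }
  if (to !== undefined && to < 1) {
    throw new TranscriptPlaylistError('"to" must be >= 1')
  }
  if (from !== undefined && to !== undefined && to < from) {
    throw new TranscriptPlaylistError('"to" must be >= "from"')
  }

  const start = from ?? 1 // omitted from: start at position 1
  if (start > totalVideos) return null // past the end: empty result, no error

  // Assumption: an oversized `to` is clamped to the playlist length.
  const end = Math.min(to ?? totalVideos, totalVideos)
  return [start, end]
}
```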


Progress callback

Track progress as transcripts are fetched. The callback fires after each video completes (success or failure).

const result = await transcribePlaylist(playlistId, {
  apiKey,
  onProgress(done, total, videoId, status) {
    const pct = ((done / total) * 100).toFixed(1)
    const icon = status === 'success' ? '✓' : '✗'
    console.log(`[${icon}] ${done}/${total} (${pct}%) — ${videoId}`)
  },
})

Callback parameters

| Parameter | Type | Description |
| --- | --- | --- |
| done | number | Videos completed so far (including this one) |
| total | number | Total videos in the requested range |
| videoId | string | ID of the video that just completed |
| status | string | "success" or "failed" |
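
Because the callback receives each video's status, it can keep a running tally before the final result is available. A small sketch, assuming a hypothetical helper named createProgressTally (not an SDK export) that returns a callback with the onProgress signature:

```typescript
type ProgressFn = (done: number, total: number, videoId: string, status: string) => void

interface ProgressTally {
  success: number
  failed: number
  failedIds: string[]
}

// Hypothetical helper, not part of lyra-sdk: returns a callback matching the
// onProgress signature and a tally object that the callback updates in place.
function createProgressTally(): { onProgress: ProgressFn; tally: ProgressTally } {
  const tally: ProgressTally = { success: 0, failed: 0, failedIds: [] }

  const onProgress: ProgressFn = (_done, _total, videoId, status) => {
    if (status === 'success') {
      tally.success++
    } else {
      tally.failed++
      tally.failedIds.push(videoId)
    }
  }

  return { onProgress, tally }
}
```

Pass onProgress in the options and read tally at any point during the run. Since the callback fires as each video completes, failedIds is in completion order, which may differ from playlist order.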

Partial failure handling

Individual video failures are caught and recorded, not thrown. This means a 100-video playlist where 3 videos fail still returns results for the other 97.

const result = await transcribePlaylist(playlistId, { apiKey })

console.log(`Succeeded: ${result.succeeded}`)
console.log(`Failed: ${result.failed}`)

// Check which videos failed
const failures = result.results.filter(r => r.status === 'failed')
for (const f of failures) {
  console.log(`  #${f.position} ${f.videoId}: ${f.error}`)
}

// Process successful transcripts
const successes = result.results.filter(
  (r): r is Extract<typeof r, { status: 'success' }> => r.status === 'success'
)
for (const s of successes) {
  console.log(`#${s.position} ${s.title}: ${s.lines.length} lines`)
}
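
One possible follow-up is to re-run only the failed stretches using from and to. The sketch below assumes a hypothetical helper named toRanges (not an SDK export) that collapses the failed 1-indexed positions into contiguous [from, to] pairs:

```typescript
// Hypothetical helper, not part of lyra-sdk: collapses 1-indexed positions
// into contiguous inclusive [from, to] ranges.
function toRanges(positions: readonly number[]): Array<[number, number]> {
  // De-duplicate and sort numerically so neighbours end up adjacent.
  const sorted = Array.from(new Set(positions)).sort((a, b) => a - b)

  const ranges: Array<[number, number]> = []
  for (const p of sorted) {
    const last = ranges[ranges.length - 1]
    if (last && p === last[1] + 1) {
      last[1] = p // extends the current run
    } else {
      ranges.push([p, p]) // starts a new run
    }
  }
  return ranges
}

// Positions 3-5, 9 and 12-13 collapse into three ranges.
const ranges = toRanges([3, 4, 5, 9, 12, 13])
// ranges is [[3, 5], [9, 9], [12, 13]]
```

Each pair can then be passed as from and to in a follow-up transcribePlaylist call, for example with a lower concurrency or with retries enabled.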

Usage with TranscriptClient

The TranscriptClient class also supports transcribePlaylist(). Constructor defaults are merged with per-call overrides:

import { TranscriptClient, InMemoryCache } from 'lyra-sdk/transcript'

const client = new TranscriptClient({
  retries: 2,
  retryDelay: 500,
  cache: new InMemoryCache(),
})

const result = await client.transcribePlaylist(playlistId, {
  apiKey: process.env.YOUTUBE_API_KEY!,
  concurrency: 5,
})

Error handling

import { transcribePlaylist, TranscriptPlaylistError } from 'lyra-sdk'

try {
  const result = await transcribePlaylist(playlistId, { apiKey, from: -1 })
} catch (error) {
  if (error instanceof TranscriptPlaylistError) {
    console.log('Invalid range:', error.message)
  }
}

Errors

| Error | When it's thrown |
| --- | --- |
| TranscriptPlaylistError | Invalid from/to range |
| AuthError | Invalid or missing API key |
| NotFoundError | Playlist does not exist |
| YTError | YouTube Data API failure (quota, network, etc.) |

Individual transcript failures for specific videos are not thrown — they appear as { status: "failed", error: "..." } in the results array. Only playlist-level errors (auth, not found, invalid range) throw.
