A Swift 6 library for parsing, converting, resyncing, and saving subtitle files.
SubtitleKit normalizes every supported format into a single SubtitleDocument model, letting you parse once and convert to any output format from the same object.

- Object-first API -- parse into a `Subtitle` value, then convert, resync, inspect, or save from that same value.
- 9 built-in formats with a unified timing/text model (`SubtitleDocument`).
- Auto-detection by file extension, filename, and content sniffing.
- Round-trip fidelity -- metadata, styles, and cue attributes survive parse/serialize cycles within the same format.
- Extensible -- add new formats by conforming to `SubtitleFormat` and registering at runtime.
- Concurrency-ready -- all model types are value types and `Sendable`. The global registry is thread-safe.
- Zero dependencies -- only Foundation.

| Format | Extension | Type | Notes |
|---|---|---|---|
| SubRip | `.srt` | Time-based | Most widely used subtitle format |
| WebVTT | `.vtt` | Time-based | W3C web standard; supports cue IDs, settings, and metadata blocks |
| SubViewer | `.sbv` | Time-based | YouTube caption format |
| MicroDVD | `.sub` | Frame-based | Requires a frame rate (fps) to convert to/from time-based formats |
| Sub Station Alpha | `.ssa` | Time-based | SSA v4; preserves styles and script info metadata |
| Advanced SSA | `.ass` | Time-based | ASS v4+; superset of SSA with richer styling |
| LRC | `.lrc` | Time-based | Synchronized lyrics; end times are inferred from the next cue |
| SAMI | `.smi` | Time-based | Microsoft format; HTML-based cue content |
| JSON | `.json` | Time-based | Generic array-of-objects interchange format |
| Custom | any | any | User-defined via SubtitleFormat protocol |
Add the package to your Package.swift:

    dependencies: [
        .package(url: "https://github.com/dioKaratzas/swift-subtitle-kit.git", from: "1.0.0")
    ]

Then add the library product to your target dependencies:

    .target(
        name: "MyApp",
        dependencies: [
            .product(name: "SubtitleKit", package: "swift-subtitle-kit")
        ]
    )

Requirements: Swift tools-version 6.0, iOS 13+, macOS 10.15+, tvOS 13+, watchOS 6+.

Parse with auto-detection:

    import SubtitleKit

    let subtitle = try Subtitle.parse(rawSRTText)
    print(subtitle.formatName)  // "srt"
    print(subtitle.cues.count)  // number of timed cues

Parse with an explicit format:

    let subtitle = try Subtitle.parse(rawText, format: .vtt)

Parse with a filename or extension hint:

    let subtitle = try Subtitle.parse(rawText, fileName: "episode.srt")
    // or
    let subtitle = try Subtitle.parse(rawText, fileExtension: "vtt")

Load from a file URL:

    let subtitle = try Subtitle.load(from: fileURL)
    // Extension of the URL is used for format detection

Read cues and entries:

    let subtitle = try Subtitle.parse(content, format: .srt)

    for cue in subtitle.cues {
        print("[\(cue.startTime)ms - \(cue.endTime)ms] \(cue.plainText)")
    }

    // SubtitleEntry includes cues, metadata, and styles
    for entry in subtitle.entries {
        switch entry {
        case .cue(let cue): print(cue.plainText)
        case .metadata(let meta): print("\(meta.key)")
        case .style(let style): print(style.name)
        }
    }

`SubtitleCue` fields:

| Property | Type | Description |
|---|---|---|
| `id` | `Int` | Cue sequence number |
| `cueIdentifier` | `String?` | Optional format-specific ID (e.g., WebVTT cue IDs) |
| `startTime` | `Int` | Start time in milliseconds |
| `endTime` | `Int` | End time in milliseconds |
| `duration` | `Int` | Computed: `endTime - startTime` |
| `rawText` | `String` | Original cue text (may contain formatting tags) |
| `plainText` | `String` | Tag-stripped and normalized cue text |
| `frameRange` | `FrameRange?` | Frame numbers for frame-based formats (MicroDVD) |
| `attributes` | `[SubtitleAttribute]` | Format-specific key/value attributes |

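Because cue times are plain millisecond integers, display formatting is left to the caller. The following is a minimal sketch of that idea. `CueTiming` and `srtTimestamp(fromMilliseconds:)` are hypothetical names for illustration only; they are not part of SubtitleKit, and the library may format timestamps differently internally.

```swift
// Hypothetical stand-in for the timing fields of SubtitleCue; not the library type.
struct CueTiming {
    var startTime: Int  // milliseconds
    var endTime: Int    // milliseconds

    // Mirrors the computed `duration` described in the table above.
    var duration: Int { endTime - startTime }
}

// Left-pads a non-negative integer with zeros to the given width.
func zeroPadded(_ value: Int, width: Int) -> String {
    let digits = String(value)
    return String(repeating: "0", count: max(0, width - digits.count)) + digits
}

// Formats a millisecond count as an SRT-style "HH:MM:SS,mmm" timestamp.
// Negative inputs are clamped to zero (an assumption of this sketch).
func srtTimestamp(fromMilliseconds ms: Int) -> String {
    let clamped = max(0, ms)
    let hours = clamped / 3_600_000
    let minutes = (clamped / 60_000) % 60
    let seconds = (clamped / 1_000) % 60
    let millis = clamped % 1_000
    return "\(zeroPadded(hours, width: 2)):\(zeroPadded(minutes, width: 2)):"
        + "\(zeroPadded(seconds, width: 2)),\(zeroPadded(millis, width: 3))"
}

let timing = CueTiming(startTime: 61_500, endTime: 64_250)
print(timing.duration)                                   // 2750
print(srtTimestamp(fromMilliseconds: timing.startTime))  // 00:01:01,500
```
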
Convert between formats or serialize back to text:

    // Serialize to another format as text
    let vttText = try subtitle.text(format: .vtt, lineEnding: .lf)

    // Convert to a new Subtitle value
    let vttSubtitle = try subtitle.convert(to: .vtt, lineEnding: .lf)
    print(vttSubtitle.formatName)  // "vtt"

    // Convert raw text in one call
    let output = try Subtitle.convert(
        rawSRTText,
        from: .srt,
        to: .vtt,
        lineEnding: .lf
    )

    let text = try subtitle.text()  // uses source format and line ending

Resync timing:

    // Shift all cues forward by 2 seconds
    let shifted = subtitle.resync(.init(offset: 2_000))

    // Speed up by 5%
    let faster = subtitle.resync(.init(ratio: 1.05))

    // Offset and ratio combined
    let adjusted = subtitle.resync(.init(offset: 500, ratio: 0.98))

    // Custom transform
    let custom = subtitle.resync { start, end, frame in
        (start + 100, end + 300, frame)
    }

    // Mutating variant
    var mutable = subtitle
    mutable.applyResync(.init(offset: -500))

Clean cue text:

    let cleaned = subtitle.clean([
        .removeSDH,
        .removeWatermarks,
        .removeSpeakerLabels,
        .removeCuesContainingMusicNotes,
        .removeAllLineBreaks,
        .mergeCuesWithSameText,
        .fixUppercaseText,
        .removeCurlyBracketTags,
        .removeHTMLTags
    ])

Use `clean()` with no arguments to apply all built-in operations in default order, or `applyClean(...)` for the mutating variant.

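To give a rough idea of what tag-removal operations such as `.removeHTMLTags` and `.removeCurlyBracketTags` involve, here is a small standalone sketch. `stripTags(_:)` is a hypothetical helper written for this illustration; the regular expressions are assumptions and are not taken from SubtitleKit, whose actual rules may be stricter or cover more cases.

```swift
import Foundation

// Hypothetical helper, not part of SubtitleKit: removes <...> and {...} spans from cue text.
func stripTags(_ text: String) -> String {
    // Assumed patterns: any angle-bracket tag, then any curly-bracket override block.
    let patterns = ["<[^>]*>", "\\{[^\\}]*\\}"]
    var result = text
    for pattern in patterns {
        result = result.replacingOccurrences(
            of: pattern,
            with: "",
            options: .regularExpression
        )
    }
    return result.trimmingCharacters(in: .whitespacesAndNewlines)
}

print(stripTags("<i>Hello</i> {\\an8}world"))  // Hello world
```

A production cleaner would likely also collapse the double spaces that tag removal can leave behind; this sketch only trims the ends.
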
If you need change statistics (remaining, removed, and modified cue counts, plus a cue-by-cue status), use `cleanWithReport`:

    let result = subtitle.cleanWithReport([.removeSDH, .removeSpeakerLabels])
    print(result.report.originalCueCount)
    print(result.report.remainingCueCount)
    print(result.report.modifiedCueCount)
    print(result.report.removedCueCount)

    for change in result.report.changes {
        print(change.cueID, change.status, change.changedBy)
    }

Save to a file:

    // Infer format from the destination file extension
    try subtitle.save(to: outputURL)

    // Explicit format and line ending
    try subtitle.save(to: outputURL, format: .srt, lineEnding: .crlf)

Detection follows a strict priority order:

- Explicit format in `SubtitleParseOptions.format`
- File extension argument (`fileExtension:`)
- Filename extension extracted from `fileName:`
- Content sniffing via registered `canParse` checks

Content sniffing order: VTT, LRC, SMI, ASS, SSA, SUB, SRT, SBV, JSON.
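
The priority chain above can be sketched as a single function that returns on the first hint that yields an answer. Everything here (`Sniffer`, `extensionOf(_:)`, `detectFormatName`) is hypothetical and only models the described order. In particular, this sketch returns a hint as-is, whereas a real registry would presumably check that the hint resolves to a registered format before accepting it.

```swift
import Foundation

// Hypothetical content sniffer: a format name plus a cheap "does this look like me?" check.
struct Sniffer {
    let name: String
    let canParse: (String) -> Bool
}

// Returns the lowercased extension of a file name without the dot, or nil if there is none.
func extensionOf(_ fileName: String) -> String? {
    guard let dot = fileName.lastIndex(of: "."),
          fileName.index(after: dot) < fileName.endIndex else { return nil }
    return String(fileName[fileName.index(after: dot)...]).lowercased()
}

// Walks the hints in priority order and falls back to content sniffing.
func detectFormatName(
    explicit: String? = nil,
    fileExtension: String? = nil,
    fileName: String? = nil,
    content: String,
    sniffers: [Sniffer]
) -> String? {
    // 1. An explicit format always wins.
    if let explicit { return explicit.lowercased() }
    // 2. Then the extension argument (a leading dot is tolerated).
    if let ext = fileExtension, !ext.isEmpty {
        return (ext.hasPrefix(".") ? String(ext.dropFirst()) : ext).lowercased()
    }
    // 3. Then the extension extracted from the file name.
    if let fileName, let ext = extensionOf(fileName) { return ext }
    // 4. Finally, the first sniffer that accepts the content, in list order.
    return sniffers.first { $0.canParse(content) }?.name
}

// Only two simplified sniffers are shown; VTT is listed before SRT, as in the order above.
let sniffers = [
    Sniffer(name: "vtt") { $0.hasPrefix("WEBVTT") },
    Sniffer(name: "srt") { $0.contains("-->") },
]

let sample = "WEBVTT\n\n00:00.000 --> 00:01.000\nHi"
print(detectFormatName(content: sample, sniffers: sniffers) ?? "unknown")  // vtt
print(detectFormatName(fileName: "track.srt", content: sample, sniffers: sniffers) ?? "unknown")  // srt
```

The sample text would also satisfy the simplified SRT check, which is why sniffing order matters: the more specific format has to be tried first.
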

    // Detect without parsing
    let format = Subtitle.detectFormat(in: rawText)
    // or, with a filename hint
    let format = Subtitle.detectFormat(in: rawText, fileName: "track.srt")

Add a custom format by conforming to `SubtitleFormat`, then register it:

    import SubtitleKit

    struct PipeFormat: SubtitleFormat {
        let name = "pipe"
        let aliases = ["pipe", "pip"]

        func canParse(_ content: String) -> Bool {
            content.split(whereSeparator: \.isNewline).contains { line in
                line.split(separator: "|", maxSplits: 2).count == 3
            }
        }

        func parse(_ content: String, options: SubtitleParseOptions) throws(SubtitleError) -> SubtitleDocument {
            var entries: [SubtitleEntry] = []
            for (i, line) in content.split(whereSeparator: \.isNewline).enumerated() {
                let parts = line.split(separator: "|", maxSplits: 2, omittingEmptySubsequences: false)
                guard parts.count == 3,
                      let start = Int(parts[0]),
                      let end = Int(parts[1])
                else {
                    throw SubtitleError.malformedBlock(format: name, details: String(line))
                }
                entries.append(.cue(.init(
                    id: i + 1, startTime: start, endTime: end,
                    rawText: String(parts[2]), plainText: String(parts[2])
                )))
            }
            return SubtitleDocument(formatName: name, entries: entries)
        }

        func serialize(_ document: SubtitleDocument, options: SubtitleSerializeOptions) throws(SubtitleError) -> String {
            let lines = document.cues.map { "\($0.startTime)|\($0.endTime)|\($0.rawText)" }
            return lines.joined(separator: options.lineEnding.value)
                + (lines.isEmpty ? "" : options.lineEnding.value)
        }
    }

    // Expose the format as a static member
    extension SubtitleFormat where Self == PipeFormat {
        static var pipe: SubtitleFormat { PipeFormat() }
    }

    // Register it, then use it like any built-in format
    SubtitleFormatRegistry.register(.pipe)
    let parsed = try Subtitle.parse("0|1000|Hello\n", format: .pipe)
    let srt = try parsed.text(format: .srt)

Format-specific serialization options are grouped in `SubtitleSerializeOptions` rather than polluting every method signature. For SAMI output:

    let options = SubtitleSerializeOptions(
        format: .smi,
        lineEnding: .crlf,
        sami: .init(title: "My Subtitles", languageName: "English", closeTags: true)
    )
    let smiText = try subtitle.text(using: options)
    try subtitle.save(to: outputURL, using: options)

- Model types (`Subtitle`, `SubtitleDocument`, `SubtitleCue`, `SubtitleEntry`, etc.) are all value types conforming to `Sendable`. They are safe to pass across actor/task boundaries.
- Registry -- `SubtitleFormatRegistry.current` is protected by an internal lock. The static `register(_:)` method is atomic.
- Parsers and serializers are stateless struct methods. As long as custom format implementations are also `Sendable`, concurrent parsing across different tasks is safe.
- In tests, call `SubtitleFormatRegistry.resetCurrent()` in a `defer` block to prevent cross-test leakage.

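The lock-protected registry and the `defer` reset pattern described above can be pictured with a miniature stand-in. `FormatNameRegistry` is a hypothetical class that stores only names; it is not SubtitleKit's implementation, which this README does not show, and the use of `NSLock` is an assumption of the sketch.

```swift
import Foundation

// Hypothetical miniature registry guarded by a lock; not SubtitleKit's implementation.
final class FormatNameRegistry: @unchecked Sendable {
    static let shared = FormatNameRegistry()
    private static let builtIns = ["srt", "vtt"]

    private let lock = NSLock()
    private var names = FormatNameRegistry.builtIns

    // Registration is atomic: the duplicate check and the append share one critical section.
    func register(_ name: String) {
        lock.lock()
        defer { lock.unlock() }
        if !names.contains(name) { names.append(name) }
    }

    // Restores the built-in set, similar in spirit to resetting a registry between tests.
    func reset() {
        lock.lock()
        defer { lock.unlock() }
        names = FormatNameRegistry.builtIns
    }

    // Returns a copy of the current names taken under the lock.
    var snapshot: [String] {
        lock.lock()
        defer { lock.unlock() }
        return names
    }
}

// Test-style usage: register from several threads, then reset in `defer`
// so the shared state does not leak into the next test.
func registeredNamesDuringTest() -> [String] {
    defer { FormatNameRegistry.shared.reset() }
    DispatchQueue.concurrentPerform(iterations: 8) { i in
        FormatNameRegistry.shared.register("custom\(i % 2)")
    }
    return FormatNameRegistry.shared.snapshot.sorted()
}

print(registeredNamesDuringTest())          // ["custom0", "custom1", "srt", "vtt"]
print(FormatNameRegistry.shared.snapshot)   // ["srt", "vtt"]
```
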
All errors are typed as SubtitleError, which conforms to LocalizedError:
| Case | Meaning |
|---|---|
| `unsupportedFormat(String)` | Named format not registered |
| `unableToDetectFormat` | No format matched by hints or content |
| `malformedBlock(format:details:)` | Structural parse error in a format block |
| `invalidTimestamp(format:value:)` | Unparseable timestamp string |
| `unsupportedVariant(format:details:)` | Recognized but unsupported format variant |
| `invalidFrameRate(Double)` | Non-positive or invalid FPS value |

    do {
        let subtitle = try Subtitle.parse(brokenText)
    } catch let error as SubtitleError {
        print(error.errorDescription ?? "Unknown subtitle error")
    }

| Type | Role |
|---|---|
| `Subtitle` | Primary entry point: parse, convert, resync, save |
| `SubtitleDocument` | Unified document model (entries array + format name) |
| `SubtitleEntry` | Enum: `.cue`, `.metadata`, `.style` |
| `SubtitleCue` | Timed cue with text, timestamps, and attributes |
| `SubtitleMetadata` | Key/value metadata from format headers |
| `SubtitleStyle` | Named style with key/value fields |
| `SubtitleAttribute` | Key/value pair used across cues, metadata, styles |

| Type | Role |
|---|---|
| `SubtitleParseOptions` | Format hints, FPS, whitespace preservation |
| `SubtitleSerializeOptions` | Target format, line ending, FPS, SAMI-specific options |
| `SubtitleResyncOptions` | Offset, ratio, frame-value mode |
| `LineEnding` | `.lf` or `.crlf` |

| Type | Role |
|---|---|
| `SubtitleFormat` | Protocol for format adapters |
| `SubtitleFormatRegistry` | Registration, resolution, and detection |
| `SRTFormat`, `VTTFormat`, ... | Built-in format implementations |

- Cross-family conversion normalizes style/metadata to the lowest common denominator. ASS/SSA styles are best preserved when staying in the ASS/SSA family.
- SAMI HTML semantics (classes, nested tags) are simplified to plain text when converting to non-SAMI formats.
- MicroDVD (`.sub`) stores frame numbers; converting to/from time-based formats requires an accurate `fps` value. The default is 25 fps.
- LRC has no explicit end times; SubtitleKit infers end times from the start of the next cue (final cue gets a 2-second default duration).
- JSON format uses `JSONSerialization` for broad compatibility; the schema follows the subsrt-ts convention (array of objects with `type`, `start`, `end`, `content`, `text` fields).
- BOM handling -- UTF-8 byte order marks are stripped during normalization. The `sourceHadByteOrderMark` property records whether one was present.
- Line endings -- source line endings (LF vs CRLF) are detected and preserved by default. Override with the `lineEnding:` parameter on any serialization method.
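
Two of the notes above come down to simple arithmetic: MicroDVD frame/time conversion and LRC end-time inference. The sketch below illustrates both. The function names are hypothetical, and the rounding mode and the sorting of start times are assumptions; SubtitleKit's own conversion may round or order differently.

```swift
// Hypothetical helpers; names, rounding, and sorting are assumptions, not SubtitleKit API.

// MicroDVD: converts a frame number to milliseconds at the given frame rate.
func milliseconds(forFrame frame: Int, fps: Double = 25) -> Int {
    precondition(fps > 0, "fps must be positive")
    return Int((Double(frame) / fps * 1_000).rounded())
}

// MicroDVD: converts milliseconds back to the nearest frame number.
func frame(forMilliseconds ms: Int, fps: Double = 25) -> Int {
    precondition(fps > 0, "fps must be positive")
    return Int((Double(ms) / 1_000 * fps).rounded())
}

// LRC: each cue ends where the next one starts; the final cue gets a default duration.
func inferEndTimes(
    startTimes: [Int],
    finalDuration: Int = 2_000
) -> [(start: Int, end: Int)] {
    let sorted = startTimes.sorted()
    return sorted.enumerated().map { index, start in
        let end = index + 1 < sorted.count ? sorted[index + 1] : start + finalDuration
        return (start: start, end: end)
    }
}

print(milliseconds(forFrame: 50))              // 2000 (at the default 25 fps)
print(milliseconds(forFrame: 24, fps: 23.976)) // 1001
print(frame(forMilliseconds: 2_000))           // 50

for cue in inferEndTimes(startTimes: [0, 1_500, 4_000]) {
    print(cue.start, cue.end)  // 0 1500, then 1500 4000, then 4000 6000
}
```

The 23.976 fps line shows why an accurate frame rate matters: the same frame number maps to a different timestamp than it would at 25 fps, and the drift grows over the length of a film.
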