Media Prepare

Architecture

How the transcoding pipeline works

@shelby-protocol/media-prepare is a declarative orchestration layer for FFmpeg. It generates FFmpeg command-line arguments but does not run FFmpeg itself. This separation lets the same plan run on Node.js (native FFmpeg) and in browsers (FFmpeg.wasm).

Pipeline Overview

┌─────────────────┐     ┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│ CmafPlanBuilder │ ──▶ │  CmafPlan   │ ──▶ │ PlanExecutor │ ──▶ │   FFmpeg    │
│   (fluent API)  │     │ (validated) │     │  (platform)  │     │ (execution) │
└─────────────────┘     └─────────────┘     └──────────────┘     └─────────────┘
  1. CmafPlanBuilder - Fluent builder API for defining transcoding parameters
  2. CmafPlan - Validated intermediate representation (Zod schemas)
  3. PlanExecutor - Platform-specific executor (Node.js or browser)
  4. FFmpeg - Actual transcoding and packaging
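
The separation between planning and execution can be illustrated with a minimal sketch. All names here (SketchPlan, SketchExecutor, planToArgs, RecordingExecutor) are hypothetical stand-ins rather than the library's real types, and the argument list is deliberately tiny. The point is that argument generation is a pure function, so any executor that accepts a list of arguments can be swapped in.

```typescript
// Hypothetical, simplified stand-ins; the real CmafPlan and PlanExecutor are richer.
interface SketchPlan {
  readonly input: string;
  readonly outputDir: string;
  readonly segmentDuration: number;
}

// An executor only needs to know how to run FFmpeg with a list of arguments.
interface SketchExecutor {
  run(args: readonly string[]): Promise<void>;
}

// Pure function: plan in, argument list out. No process is started here.
function planToArgs(plan: SketchPlan): string[] {
  return [
    "-i", plan.input,
    "-f", "hls",
    "-hls_time", String(plan.segmentDuration),
    `${plan.outputDir}/playlist.m3u8`,
  ];
}

// A dry-run executor that records arguments instead of running FFmpeg.
class RecordingExecutor implements SketchExecutor {
  readonly calls: string[][] = [];

  async run(args: readonly string[]): Promise<void> {
    this.calls.push([...args]);
  }
}

// The same plan can be handed to any executor implementation.
async function execute(plan: SketchPlan, executor: SketchExecutor): Promise<void> {
  await executor.run(planToArgs(plan));
}
```

A Node.js executor would spawn a process with these arguments and a browser executor would pass them to FFmpeg.wasm; neither needs to know how the plan was built.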

Core Components

Plan Builder

The CmafPlanBuilder provides a fluent API for constructing transcoding plans:

import { CmafPlanBuilder, videoLadderPresets } from "@shelby-protocol/media-prepare/core";

const plan = new CmafPlanBuilder()
  .withInput("input.mp4")                           // Source file
  .withOutputDir("output")                          // Output directory
  .withVideoLadder(videoLadderPresets.vodHd_1080p)  // Bitrate ladder
  .withVideoCodec({ kind: "x264" })                 // Encoder settings
  .addAudioTrack({ ... })                           // Audio configuration
  .withSegmentDuration(4)                           // Segment length
  .withHlsOutput()                                  // Output format
  .build();                                         // Validate and return plan

All inputs are validated against Zod schemas when build() is called, so an invalid configuration fails with a descriptive error before the plan reaches an executor.
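
For readers unfamiliar with the fluent builder pattern, the following is a minimal sketch of how such an API can be structured. MiniPlanBuilder and MiniPlan are hypothetical and far smaller than CmafPlanBuilder; the default segment duration and the required-field checks are assumptions made for illustration, and the real library validates with Zod rather than hand-written checks.

```typescript
// Hypothetical miniature of a fluent plan builder; not the real CmafPlanBuilder.
interface MiniPlan {
  readonly input: string;
  readonly outputDir: string;
  readonly segmentDuration: number;
}

class MiniPlanBuilder {
  private input?: string;
  private outputDir?: string;
  private segmentDuration = 4; // assumed default, for illustration only

  withInput(path: string): this {
    this.input = path;
    return this; // returning `this` is what makes the calls chainable
  }

  withOutputDir(dir: string): this {
    this.outputDir = dir;
    return this;
  }

  withSegmentDuration(seconds: number): this {
    this.segmentDuration = seconds;
    return this;
  }

  // Nothing is checked until build(), so the setters can be called in any order.
  build(): MiniPlan {
    const missing: string[] = [];
    if (!this.input) missing.push("input");
    if (!this.outputDir) missing.push("outputDir");
    if (missing.length > 0) {
      throw new Error(`Invalid plan, missing: ${missing.join(", ")}`);
    }
    // Freeze the result so the plan cannot be changed after validation.
    return Object.freeze({
      input: this.input as string,
      outputDir: this.outputDir as string,
      segmentDuration: this.segmentDuration,
    });
  }
}
```

Deferring all checks to build() means a half-configured builder is never an error by itself; only the finished plan has to be valid.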

Transcoder

The FfmpegTranscoder generates FFmpeg arguments for video and audio encoding:

  • Video encoding with configurable codecs (x264, x265, aom-av1)
  • Multi-rung bitrate ladders for adaptive streaming
  • Audio encoding with multiple language tracks
  • Codec-specific optimizations (presets, profiles, tuning)
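
As a rough illustration of what generating arguments for a bitrate ladder can look like, the sketch below emits one mapped video stream per rung using FFmpeg's per-stream option specifiers. The Rung shape, the videoLadderArgs function, and the mapping from codec kind to FFmpeg encoder name are assumptions for this example; the arguments FfmpegTranscoder actually produces are not shown in this document and are likely more detailed (presets, profiles, rate control).

```typescript
// Hypothetical shapes for illustration; the library's own rung and codec types may differ.
type CodecKind = "x264" | "x265" | "aom-av1";

interface Rung {
  width: number;
  height: number;
  bitrateKbps: number;
}

// FFmpeg encoder names assumed to correspond to each codec kind.
const ENCODERS: Record<CodecKind, string> = {
  "x264": "libx264",
  "x265": "libx265",
  "aom-av1": "libaom-av1",
};

// Emit one output video stream per rung. Each rung maps the first input video
// stream again and sets its encoder, frame size, and target bitrate.
function videoLadderArgs(codec: CodecKind, ladder: readonly Rung[]): string[] {
  const args: string[] = [];
  ladder.forEach((rung, i) => {
    args.push(
      "-map", "0:v:0",
      `-c:v:${i}`, ENCODERS[codec],
      `-s:v:${i}`, `${rung.width}x${rung.height}`,
      `-b:v:${i}`, `${rung.bitrateKbps}k`,
    );
  });
  return args;
}
```

Using stream specifiers such as -b:v:1 keeps every rung in a single FFmpeg invocation, so the source is decoded once and encoded several times.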

Packager

The CmafPackager generates FFmpeg arguments for CMAF + HLS packaging:

  • Fragmented MP4 (fMP4) segments
  • HLS playlists with variant streams
  • Segment duration configuration
  • Stream mapping for multi-track output
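
The packaging step can be sketched in the same way. The function below assembles options of FFmpeg's HLS muxer for fMP4 segments and a master playlist; HlsPackagingOptions and hlsPackagingArgs are hypothetical names, and this is not necessarily the argument set CmafPackager emits. Init segment naming is left at FFmpeg's default, and audio rendition grouping is omitted to keep the example short.

```typescript
// Hypothetical options object for illustration only.
interface HlsPackagingOptions {
  outputDir: string;
  segmentDuration: number;
  videoNames: readonly string[]; // one name per encoded video stream, e.g. "1080p"
  audioNames: readonly string[]; // one name per encoded audio stream, e.g. "audio-eng"
}

function hlsPackagingArgs(opts: HlsPackagingOptions): string[] {
  // Each entry describes one variant stream; "name:" replaces %v in output paths.
  const variants = [
    ...opts.videoNames.map((name, i) => `v:${i},name:${name}`),
    ...opts.audioNames.map((name, i) => `a:${i},name:${name}`),
  ];
  return [
    "-f", "hls",
    "-hls_time", String(opts.segmentDuration),
    "-hls_playlist_type", "vod",
    "-hls_segment_type", "fmp4", // fragmented MP4 instead of MPEG-TS
    "-hls_segment_filename", `${opts.outputDir}/%v/segment-%03d.m4s`,
    "-master_pl_name", "master.m3u8",
    "-var_stream_map", variants.join(" "),
    `${opts.outputDir}/%v/playlist.m3u8`,
  ];
}
```

The -var_stream_map value is what ties encoded streams to variant playlists, which is the stream mapping role described in the list above.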

Platform Adapters

Node.js

The Node.js adapter uses the system FFmpeg installation:

import { NodeCmafPlanExecutor } from "@shelby-protocol/media-prepare/node";

const executor = new NodeCmafPlanExecutor({
  shaka: true,   // Shaka packager (DRM + DASH)
  ffprobe: true, // Auto-detect frame rate
  verbose: false,
});

Features:

  • SystemFfmpegExecutor - Spawns native FFmpeg process
  • FFprobe integration - Auto-detects source media properties
  • Shaka Packager - Optional DRM support (Widevine)
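
To make the frame rate auto-detection concrete: ffprobe reports frame rates as rational strings such as 30000/1001, which have to be converted to a number before use. The helper below is a sketch of that conversion only; parseFrameRate is a hypothetical name, and how the Node.js adapter invokes ffprobe internally is not described here.

```typescript
// One way to obtain the raw value from the command line (shown for context):
//   ffprobe -v error -select_streams v:0 -show_entries stream=r_frame_rate \
//     -of default=noprint_wrappers=1:nokey=1 input.mp4

// Parse a rational frame rate string such as "30000/1001" or "25/1".
// Returns undefined when the value is malformed or the rate is unknown.
function parseFrameRate(raw: string): number | undefined {
  const match = /^(\d+)(?:\/(\d+))?$/.exec(raw.trim());
  if (!match) return undefined;
  const numerator = Number(match[1]);
  const denominator = match[2] === undefined ? 1 : Number(match[2]);
  // A zero on either side (for example "0/0") means no usable rate.
  if (numerator === 0 || denominator === 0) return undefined;
  return numerator / denominator;
}
```

Keeping the rate as a fraction until the last moment avoids rounding 30000/1001 to 29.97 too early, which matters when computing keyframe intervals that must align with segment boundaries.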

Browser

The browser adapter uses FFmpeg.wasm:

import { WasmFfmpegExecutor } from "@shelby-protocol/media-prepare/browser";

// ffmpegInstance is an FFmpeg.wasm instance from @ffmpeg/ffmpeg, created by the caller
const executor = new WasmFfmpegExecutor(ffmpegInstance);

Features:

  • WasmFfmpegExecutor - Wraps @ffmpeg/ffmpeg
  • Virtual filesystem - Works with FFmpeg.wasm's in-memory FS
  • Progress callbacks - Report transcoding progress
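
One common way to derive a progress value is to read the time= field from FFmpeg's status lines and divide it by the known source duration. The sketch below shows that calculation in isolation; progressFromLogLine is a hypothetical helper, and whether WasmFfmpegExecutor computes progress this way is not stated in this document.

```typescript
// Extract elapsed output time from an FFmpeg status line and express it as a
// fraction of the total duration. Returns undefined when the line carries no
// usable timestamp (for example "time=N/A") or the duration is not positive.
function progressFromLogLine(line: string, totalSeconds: number): number | undefined {
  const match = /time=(\d+):(\d{2}):(\d{2}(?:\.\d+)?)/.exec(line);
  if (!match || totalSeconds <= 0) return undefined;
  const elapsed =
    Number(match[1]) * 3600 + Number(match[2]) * 60 + Number(match[3]);
  // Clamp to [0, 1] because the last timestamp can slightly exceed the duration.
  return Math.min(1, Math.max(0, elapsed / totalSeconds));
}
```

A caller could feed each log line through this function and forward any defined result to a progress callback.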

Output Format

The library produces CMAF (Common Media Application Format) output:

  • Container: Fragmented MP4 (.m4s segments, .mp4 init segments)
  • Playlists: HLS (.m3u8) with DASH support planned
  • Segments: Configurable duration (1-30 seconds)

Example output structure:

output/
├── master.m3u8              # Master playlist
├── 1080p/
│   ├── init.mp4             # Initialization segment
│   ├── playlist.m3u8        # Variant playlist
│   └── segment-000.m4s      # Media segments
├── 720p/
│   └── ...
├── 480p/
│   └── ...
└── audio-eng/
    ├── init.mp4
    ├── playlist.m3u8
    └── segment-000.m4s
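
The layout above follows a regular naming scheme, which the following sketch reproduces. The helper names (segmentName, trackPaths) are hypothetical, and the three-digit zero padding is inferred from the segment-000.m4s example rather than from a documented rule.

```typescript
// Build a segment file name with a zero-padded, three-digit index.
function segmentName(index: number): string {
  return `segment-${String(index).padStart(3, "0")}.m4s`;
}

interface TrackPaths {
  init: string;
  playlist: string;
  segment(index: number): string;
}

// Compute the files belonging to one track directory, e.g. "1080p" or "audio-eng".
function trackPaths(outputDir: string, trackName: string): TrackPaths {
  const dir = `${outputDir}/${trackName}`;
  return {
    init: `${dir}/init.mp4`,
    playlist: `${dir}/playlist.m3u8`,
    segment: (index: number) => `${dir}/${segmentName(index)}`,
  };
}
```

A helper like this is mainly useful on the consuming side, for example when uploading or verifying the files a transcode produced.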

Validation

All configuration is validated using Zod schemas:

  • VideoRung - Resolution and bitrate validation
  • AudioTrack - Language codes and bitrate limits
  • SegmentDuration - Range constraints (1-30 seconds)
  • VideoCodec - Codec-specific option validation

Invalid configurations throw descriptive errors when build() is called, not later during transcoding.
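
To show what descriptive errors can mean in practice, here is a dependency-free sketch that reports every problem together with the path of the offending field, similar in spirit to the issue lists Zod produces. It is a plain TypeScript stand-in and does not use Zod. The type names and all rules other than the 1-30 second segment range are assumptions for illustration.

```typescript
// Hypothetical input shapes; the library's real schemas are defined with Zod.
interface VideoRungInput {
  width: number;
  height: number;
  bitrateKbps: number;
}

interface PlanInput {
  segmentDuration: number;
  ladder: VideoRungInput[];
}

interface Issue {
  path: string;
  message: string;
}

function isPositiveInt(n: number): boolean {
  return Number.isInteger(n) && n > 0;
}

// Collect every problem instead of stopping at the first one.
function validatePlan(input: PlanInput): Issue[] {
  const issues: Issue[] = [];
  if (!(input.segmentDuration >= 1 && input.segmentDuration <= 30)) {
    issues.push({ path: "segmentDuration", message: "must be between 1 and 30 seconds" });
  }
  if (input.ladder.length === 0) {
    issues.push({ path: "ladder", message: "needs at least one rung" }); // assumed rule
  }
  input.ladder.forEach((rung, i) => {
    // Assumed rules: dimensions and bitrate must be positive integers.
    if (!isPositiveInt(rung.width)) issues.push({ path: `ladder[${i}].width`, message: "must be a positive integer" });
    if (!isPositiveInt(rung.height)) issues.push({ path: `ladder[${i}].height`, message: "must be a positive integer" });
    if (!isPositiveInt(rung.bitrateKbps)) issues.push({ path: `ladder[${i}].bitrateKbps`, message: "must be a positive integer" });
  });
  return issues;
}
```

Reporting all issues with their paths lets a caller fix a configuration in one pass rather than discovering problems one at a time.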