Timeline Request Structure
Timeline requests use a full render manifest shape.
Need the short mental model first? See the timing inference model in Timeline Overview.
Endpoint
- Render: `POST https://api.reelforger.com/v1/videos/render`
- Validate: `POST https://api.reelforger.com/v1/videos/validate`
Top-level structure
{
  "version": "v1",
  "output": { "width": 1080, "height": 1920, "fps": 30 },
  "assets": [],
  "composition": {}
}
Authoring invariants
These rules are enforced by the contract and are worth treating as non-negotiable when generating manifests:
- `output.fps` is fixed at `30`
- `output.width` and `output.height` must be even integers
- `composition.duration_seconds` cannot exceed `300`
- `assets[].id` values must be unique
- `composition.timeline[].id` values must be unique
- `composition.text_overlays[].id` values must be unique when present
- every `composition.timeline[].asset_id` must reference an existing entry in `assets[]`
- caption `words[].start` and `words[].end` are in milliseconds, not seconds
- `metadata` must be a flat key/value object with at most `10` keys; keys and string values are capped at `500` characters
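A local pre-flight check can catch most of these invariants before a request is sent. The following is a minimal sketch, not ReelForger's own validator: the function name `check_invariants` and the error strings are hypothetical, and it assumes `metadata` sits at the top level of the manifest. It does not check caption word units, since milliseconds cannot be told apart from seconds by inspection.

```python
from typing import Any

MAX_DURATION_SECONDS = 300
MAX_METADATA_KEYS = 10
MAX_METADATA_CHARS = 500


def _duplicates(values: list[Any]) -> set[Any]:
    """Return the values that occur more than once."""
    seen: set[Any] = set()
    dupes: set[Any] = set()
    for value in values:
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return dupes


def check_invariants(manifest: dict[str, Any]) -> list[str]:
    """Collect human-readable violations of the authoring invariants."""
    errors: list[str] = []
    output = manifest.get("output", {})
    composition = manifest.get("composition", {})
    assets = manifest.get("assets", [])
    timeline = composition.get("timeline", [])

    if output.get("fps") != 30:
        errors.append("output.fps must be 30")
    for key in ("width", "height"):
        value = output.get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value % 2 != 0:
            errors.append(f"output.{key} must be an even integer")

    duration = composition.get("duration_seconds")
    if duration is not None and duration > MAX_DURATION_SECONDS:
        errors.append("composition.duration_seconds cannot exceed 300")

    id_lists = {
        "assets[].id": [asset.get("id") for asset in assets],
        "composition.timeline[].id": [layer.get("id") for layer in timeline],
        "composition.text_overlays[].id": [
            overlay.get("id") for overlay in composition.get("text_overlays", [])
        ],
    }
    for label, ids in id_lists.items():
        for dupe in sorted(_duplicates(ids), key=str):
            errors.append(f"{label} value {dupe!r} is not unique")

    asset_ids = {asset.get("id") for asset in assets}
    for layer in timeline:
        if layer.get("asset_id") not in asset_ids:
            errors.append(
                f"timeline layer {layer.get('id')!r} references unknown asset "
                f"{layer.get('asset_id')!r}"
            )

    # Assumption: metadata is a top-level field of the manifest.
    metadata = manifest.get("metadata", {})
    if len(metadata) > MAX_METADATA_KEYS:
        errors.append("metadata allows at most 10 keys")
    for key, value in metadata.items():
        if isinstance(value, (dict, list)):
            errors.append(f"metadata[{key!r}] must be a flat value")
        if len(key) > MAX_METADATA_CHARS or (
            isinstance(value, str) and len(value) > MAX_METADATA_CHARS
        ):
            errors.append(f"metadata[{key!r}] exceeds 500 characters")
    return errors
```

An empty list means none of the checked invariants were violated; the validate endpoint remains the authoritative check.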
Required fields
- `version`
- `output.width`, `output.height`, `output.fps`
- `composition` object
At least one visual/audio path must be represented through composition.timeline and corresponding assets.
Optional but common fields
- `idempotency_key` (safe retry dedupe)
- `composition.auto_stitch` (stitch untimed video by layer order; untimed audio in mixed timelines defaults to `start_seconds: 0` unless explicitly timed)
- `composition.text_overlays`
- `composition.captions`
- `webhook_url`, `webhook_headers`, `webhook_secret`
- `metadata`
Assets and layer linkage
- Each timeline layer references an asset via `asset_id`.
- Every `asset_id` must exist in `assets[]`.
- Layer `type` and asset `type` should match intended usage.
- Asset `type` must be one of `video`, `audio`, or `image`.
Time rules
- `image` layers require `time.start_seconds`.
- `image.time.duration_seconds` can be omitted when composition duration is inferable:
  - from explicit `composition.duration_seconds`,
  - from max timed end across timeline/text overlays,
  - or at render-time when `composition.auto_stitch` is enabled and media durations are probed.
- `video`/`audio` layers require `time` unless `composition.auto_stitch` is `true`.
- `trim.start_seconds` is optional for audio/video.
- Explicit layer timing always wins when provided.
- When `composition.auto_stitch: true`, untimed `video` layers are sequenced in `composition.timeline` order.
- In mixed timelines, untimed `audio` layers default to `start_seconds: 0` and are aligned to the stitched video duration unless explicit `time` is provided.
- `composition.text_overlays` also contribute to inferred composition duration for image-layer timing.
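The duration inference for image layers can be approximated client-side. This is a rough sketch under stated assumptions, not the service's actual algorithm: the helper names are hypothetical, text overlays are assumed to carry the same `time.start_seconds` / `time.duration_seconds` shape as timeline layers, an explicit `composition.duration_seconds` is assumed to take precedence over the max timed end, and an image without a duration is assumed to run until the end of the composition. The render-time `auto_stitch` probing path is not modeled.

```python
from typing import Any, Optional


def timed_end(item: dict[str, Any]) -> Optional[float]:
    """Return start + duration for a fully timed item, else None."""
    time = item.get("time") or {}
    start = time.get("start_seconds")
    duration = time.get("duration_seconds")
    if start is None or duration is None:
        return None
    return start + duration


def infer_composition_duration(composition: dict[str, Any]) -> Optional[float]:
    """Explicit duration first, otherwise the max timed end, otherwise None."""
    explicit = composition.get("duration_seconds")
    if explicit is not None:
        return explicit
    items = composition.get("timeline", []) + composition.get("text_overlays", [])
    ends = [end for end in map(timed_end, items) if end is not None]
    return max(ends) if ends else None


def image_duration(
    layer: dict[str, Any], composition: dict[str, Any]
) -> Optional[float]:
    """Resolve an image layer's duration; explicit layer timing always wins."""
    time = layer.get("time") or {}
    start = time.get("start_seconds")
    if start is None:
        raise ValueError("image layers require time.start_seconds")
    if time.get("duration_seconds") is not None:
        return time["duration_seconds"]
    total = infer_composition_duration(composition)
    if total is None:
        return None  # not inferable locally; may still resolve at render time
    # Assumption: the image fills the remainder of the composition.
    return max(total - start, 0)
```

For a composition with a 12-second video starting at 0, a text overlay running from 10 to 15 seconds, and an image starting at 4 seconds with no duration, this sketch infers a 15-second composition and an 11-second image.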
Shared enum values and defaults
Layout
| Field | Allowed values | Default / note |
|---|---|---|
| `layout.fit` | `cover`, `contain` | Defaults to `cover` |
| `layout.x`, `layout.y` | typically percent or pixel strings | If expressed as percentages, keep them in sane bounds |
| `layout.width`, `layout.height` | typically percent or pixel strings | Defaults are full-frame when `layout` is omitted |
Percent guardrails when using % values:
- `x`, `y`: between `-100%` and `100%`
- `width`, `height`: between `0%` and `100%`
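These guardrails are simple to check before submitting a layout. The sketch below assumes percent values are strings such as `"50%"` or `"-12.5%"` and that the bounds are inclusive; neither detail is confirmed here, and `percent_in_bounds` is a hypothetical helper name. Non-percent values (for example pixel strings) are passed through unchecked.

```python
import re

# Matches strings like "50%", "-12.5%"; anything else is treated as non-percent.
_PERCENT = re.compile(r"^(-?\d+(?:\.\d+)?)%$")

_BOUNDS = {
    "x": (-100.0, 100.0),
    "y": (-100.0, 100.0),
    "width": (0.0, 100.0),
    "height": (0.0, 100.0),
}


def percent_in_bounds(field: str, value: str) -> bool:
    """True if value is not a percent string, or is a percent inside the guardrail."""
    match = _PERCENT.match(value.strip())
    if match is None:
        return True  # pixel strings and other formats are out of scope here
    low, high = _BOUNDS[field]
    return low <= float(match.group(1)) <= high
```

With these assumptions, `percent_in_bounds("x", "-50%")` passes while `percent_in_bounds("width", "120%")` does not.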
Layer visuals
| Field | Allowed values | Default / note |
|---|---|---|
| `background_mode` | `blurred`, `transparent`, `solid` | Defaults to `blurred` |
| `motion` | `zoom_in`, `zoom_out`, `pan_left`, `pan_right`, `none` | Defaults to `none` |
Text and captions
| Field | Allowed values | Default / note |
|---|---|---|
| `text_align` | `left`, `center`, `right`, `justify` | Applies inside the text overlay bounding box |
| `captions.mode` | `word_only`, `phrase`, `phrase_karaoke` | Defaults to `phrase_karaoke` |
| `captions.provider` | `assemblyai` | Current provider enum |
Caption preset values:
- `tiktok_classic`
- `bold_outline`
- `karaoke_yellow`
- `neon_glow`
- `soft_pill`
- `typewriter`
- `handwriting`
- `luxury_serif`
Captions and alignment
- If `composition.captions.words` is provided, ReelForger uses those words/timestamps directly.
- If `composition.captions.words` is omitted, ReelForger may transcribe automatically.
- If ReelForger cannot determine a single speech source, you must provide `composition.captions.transcription_source_asset_id`.
- Add `composition.captions.correct_text` when you need improved punctuation/casing alignment.
- `correct_text` remains valid even when words are auto-transcribed by ReelForger.
- Current behavior: auto-transcription runs on the full selected source duration sent for transcription.
- If `correct_text` aligns poorly to the detected words, the render can fail with `caption_alignment_failed`.
- Keep caption placement in safe lower-third regions for social readability.
Caption configuration quick matrix
| Field | Required | Notes |
|---|---|---|
| `captions.provider` | Yes | Currently `assemblyai` |
| `captions.preset` | Yes | Choose from the supported preset enum above |
| `captions.mode` | Yes | `phrase_karaoke` is the default / most social-friendly |
| `captions.words` | Optional | Supply when you already have timed words in milliseconds |
| `captions.transcription_source_asset_id` | Conditionally required | Required when words are omitted and speech source is ambiguous |
| `captions.correct_text` | Optional | Helps punctuation/casing, but can fail alignment if the text is too different |
| `captions.max_chars_per_segment` | Optional | Chunking control for phrase-based captioning |
| `captions.time_overrides` | Optional | Apply style/layout overrides to specific time windows |
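The matrix can be turned into a small pre-submit check. The following sketch treats `provider`, `preset`, and `mode` as required, as the matrix states, and uses the enum values listed on this page. How ReelForger decides that a speech source is ambiguous is not documented here, so the `speech_asset_ids` parameter (the caller's own list of assets that could carry speech) and the "exactly one candidate is unambiguous" rule are assumptions, as is the function name.

```python
from typing import Any

CAPTION_PROVIDERS = {"assemblyai"}
CAPTION_MODES = {"word_only", "phrase", "phrase_karaoke"}
CAPTION_PRESETS = {
    "tiktok_classic", "bold_outline", "karaoke_yellow", "neon_glow",
    "soft_pill", "typewriter", "handwriting", "luxury_serif",
}


def check_captions(captions: dict[str, Any], speech_asset_ids: list[str]) -> list[str]:
    """Return a list of problems found in a composition.captions object."""
    errors: list[str] = []
    enums = (
        ("provider", CAPTION_PROVIDERS),
        ("preset", CAPTION_PRESETS),
        ("mode", CAPTION_MODES),
    )
    for field, allowed in enums:
        value = captions.get(field)
        if value is None:
            errors.append(f"captions.{field} is required")
        elif value not in allowed:
            errors.append(f"captions.{field} has unsupported value {value!r}")

    # Without supplied words, transcription needs one identifiable speech source.
    if not captions.get("words"):
        source = captions.get("transcription_source_asset_id")
        # Assumption: exactly one candidate asset counts as unambiguous.
        if source is None and len(speech_asset_ids) != 1:
            errors.append(
                "captions.transcription_source_asset_id is needed: "
                "words are omitted and the speech source is ambiguous"
            )
    return errors
```

A configuration that passes this check can still fail at render time, for example with `caption_alignment_failed` when `correct_text` diverges from the detected words.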
Timeline caption example (omitted words + explicit source)
{
  "version": "v1",
  "output": { "width": 1080, "height": 1920, "fps": 30 },
  "assets": [
    { "id": "talking-head", "type": "video", "url": "https://example.com/talking-head.mp4" }
  ],
  "composition": {
    "timeline": [
      {
        "id": "layer-1",
        "type": "video",
        "asset_id": "talking-head",
        "time": { "start_seconds": 0, "duration_seconds": 12 }
      }
    ],
    "captions": {
      "provider": "assemblyai",
      "preset": "karaoke_yellow",
      "mode": "phrase_karaoke",
      "transcription_source_asset_id": "talking-head"
    }
  }
}
Validate first
Use https://api.reelforger.com/v1/videos/validate with the exact same body before rendering in production.
Warnings highlight common readability/layout/timing risks before credits are spent.
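One way to guarantee the "exact same body" is to build both requests from a single helper. This is a minimal sketch using only the Python standard library; `build_request` is a hypothetical name, and authentication is not covered on this page, so any credentials are passed in by the caller through `headers` rather than assumed. The response format of either endpoint is not modeled.

```python
import json
import urllib.request

BASE_URL = "https://api.reelforger.com/v1/videos"


def build_request(
    action: str, manifest: dict, headers: dict[str, str]
) -> urllib.request.Request:
    """Build a POST request for the validate or render endpoint."""
    if action not in ("validate", "render"):
        raise ValueError(f"unknown action: {action!r}")
    # Serialize once per call with sorted keys so both endpoints
    # receive byte-identical bodies for the same manifest.
    body = json.dumps(manifest, sort_keys=True).encode("utf-8")
    all_headers = {"Content-Type": "application/json", **headers}
    return urllib.request.Request(
        f"{BASE_URL}/{action}", data=body, headers=all_headers, method="POST"
    )
```

A caller would send `build_request("validate", manifest, headers)` with `urllib.request.urlopen`, review any warnings in the response, and only then send `build_request("render", manifest, headers)`.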
Less common but schema-visible fields
- `motion` on `video` and `image` layers is supported and can be used for Ken Burns style movement over the layer duration.
- `style.transform` is supported on media layers when you need explicit CSS-like transform control.
- `video.media_settings.muted` is available when the video should contribute no audible output.
- `video.media_settings.crossfade_seconds` is a more advanced video-only setting. Use it cautiously, validate first, and prefer simpler timing/layout patterns unless you specifically need overlapping transitions.
- `transitions` is currently schema-visible but not a recommended primary authoring surface in these docs. Unless you have a validated known-good pattern, prefer explicit timing and layer composition instead.