Imagine building videos like coding - declarative, composable, and infinitely reusable.
VideoDB Editor lets you create videos programmatically using code instead of clicking timelines. You define what you want (assets, effects, timing), and the engine handles the rendering.
This guide is your complete conceptual introduction. By the end, you’ll understand how to compose anything from simple clips to complex multi-layer productions - all through code.
Why Code-First Video Editing?
Traditional video editors are built for one-off productions. But what if you need to:
Generate 100 personalized videos from a template
Build a TikTok content pipeline that runs daily
Create video variations for A/B testing
Automate highlight reels from live streams
Code changes everything:
Reusability – One video asset, infinite variations
Scalability – Loop over data to generate hundreds of videos
Version control – Git-track your compositions
Automation – Integrate with AI, databases, APIs
The 4-Layer Architecture
VideoDB Editor uses a hierarchy where each layer has one job. Understanding this structure is the key to mastering composition:
Asset → Clip → Track → Timeline
Let’s walk through each layer using the simplest possible example: one video asset playing for 10 seconds. This is the “Hello World” of Editor - understanding this foundation lets you build anything.
Installing VideoDB in your environment
VideoDB is available as a Python package; install it in your environment with pip install videodb.
Layer 1: Assets – Your Raw Materials
Assets are your content library. They reference media that exists in your VideoDB collection but don’t define how or when it plays.
VideoAsset
Your main video content. Each VideoAsset points to a video file via its ID.
Key parameters:
id (required) – The VideoDB media ID
start (optional) – Trim point in seconds (e.g., start=10 skips first 10s of source)
volume (optional) – Audio level: 0.0 (muted) to 2.0 (200%), default 1.0
Real example:
from videodb.editor import Timeline, Track, Clip, VideoAsset
# Create a VideoAsset pointing to a video file in your collection
video_asset = VideoAsset(
    id=video.id,
    start=0,
    volume=1
)
# Ready to use in a Clip
This says: “Use the video from your VideoDB collection, start from the beginning (start=0), and keep original volume (volume=1).”
Important distinction: VideoAsset.start trims the source file. Where it appears on the timeline is controlled later at the Track layer. This “double start” concept is critical - we’ll explore it more in Layer 3 (Tracks).
AudioAsset
Background music, voiceovers, or sound effects. Works exactly like VideoAsset.
Key parameters:
id (required) – The VideoDB audio file ID
start (optional) – Same trim behavior as VideoAsset
volume (optional) – 0.0-2.0 range (0.2 = 20% volume)
ImageAsset
Logos, watermarks, title cards, or static backgrounds.
Key parameters:
id (required) – The VideoDB image ID
crop (optional) – Rarely used; trims the sides of an asset by a relative amount before rendering. The crop size is a scale between 0 and 1: a left crop of 0.5 removes the left half of the asset, and a top crop of 0.25 removes the top quarter.
Images are static by nature - duration, position, and size are controlled at the Clip layer.
TextAsset
Custom text overlays with full typography control.
Key parameters:
text (required) – The string to display
font (optional) – Font object with family, size, color
border, shadow, background (optional) – Styling objects
Color format: ASS-style &HAABBGGRR in hex (e.g., &H00FFFFFF = white)
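The &HAABBGGRR layout is easy to get wrong because the channels are reversed relative to familiar RGB hex. A small helper (hypothetical, not part of the VideoDB SDK) shows the byte order:

```python
def rgb_to_ass(r: int, g: int, b: int, alpha: int = 0) -> str:
    """Build an ASS-style &HAABBGGRR color string.

    Note the reversed channel order (blue before red) and that
    alpha 0x00 means fully opaque in the ASS format.
    """
    return "&H{:02X}{:02X}{:02X}{:02X}".format(alpha, b, g, r)

print(rgb_to_ass(255, 255, 255))  # &H00FFFFFF (white, as in the example above)
print(rgb_to_ass(255, 0, 0))      # &H000000FF (red: the FF lands at the end)
```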
CaptionAsset
Auto-generated subtitles synced to speech. This is where VideoDB gets magical.
Important: CaptionAsset is a separate asset type from TextAsset. While TextAsset is for custom text overlays you write yourself, CaptionAsset automatically generates subtitles from video speech.
Key parameters:
src (required) – Set to "auto" to generate captions from video speech
animation (optional) – How words appear: reveal, karaoke, supersize, box_highlight
primary_color, secondary_color (optional) – ASS-style colors
font, positioning, border, shadow styling (optional)
Critical requirement: Before using CaptionAsset(src="auto"), you must call video.index_spoken_words() on the source video. This indexes the speech for auto-caption generation. Without it, captions won’t generate.
Supported Fonts for Text and Caption Assets:
Supported Indic fonts:
Noto Sans Kannada
Noto Sans Devanagari
Noto Sans Gujarati
Noto Sans Gurmukhi
Recap: Assets answer “What content exists?” They don’t define timing, size, position, or effects - that comes next, at the Clip layer.
Layer 2: Clips – The Presentation Engine
Clips wrap Assets and define how and how long they appear. This is your effects layer.
Every Clip must have an asset and a duration. Everything else is optional.
Duration – How Long It Plays
duration is a float in seconds. It defines how long the clip plays on the timeline.
Real example:
from videodb.editor import Clip
clip = Clip(
    asset=video_asset,
    duration=10
)
“Play this VideoAsset for 10 seconds.”
Key insight: Duration is independent of the source file’s length. If your source is 2 minutes but you set duration=10, only 10 seconds play (starting from VideoAsset.start).
Note: you’ll get an error if the clip duration is greater than the source video/audio length.
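A hypothetical validation sketch (not SDK code) makes the constraint concrete - once the source is trimmed with Asset.start, only the remainder is available to the clip:

```python
def check_clip(source_length: float, asset_start: float, duration: float) -> None:
    """Raise if a clip asks for more media than the trimmed source can supply."""
    available = source_length - asset_start
    if duration > available:
        raise ValueError(
            f"duration {duration}s exceeds the {available}s available "
            f"after trimming {asset_start}s from a {source_length}s source"
        )

check_clip(120, 0, 10)   # fine: 10s from a 2-minute source
check_clip(120, 30, 40)  # fine: plays 0:30-1:10 of the source
# check_clip(120, 100, 40) would raise ValueError: only 20s remain
```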
Fit – How It Scales to Canvas
When your asset’s aspect ratio doesn’t match the timeline’s, fit controls scaling behavior.
Four modes:
Fit.crop (most common) – Fills the canvas completely, cropping edges if needed
Use when: Filling the frame is priority, cropping is acceptable
Example: 16:9 video on a 9:16 (vertical) timeline
Fit.contain – Fits the entire asset inside the canvas, adding bars if needed
Use when: Showing all content is priority, bars are acceptable
Example: Preserving widescreen footage in a square format
Fit.cover – Stretches to fill canvas (distortion possible)
Use when: Artistic effect or abstract content
Fit.none – Uses native pixel dimensions (no scaling)
Use when: Precise pixel control needed (e.g., 1:1 pixel mapping)
Real example:
from videodb.editor import Fit

clip = Clip(
    asset=video_asset,
    duration=10,
    fit=Fit.crop
)
“Fill the canvas completely, crop edges if aspect ratios don’t match.”
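Conceptually, crop and contain differ only in which scale factor they pick. This standalone sketch (not the engine’s actual code) shows the math:

```python
def fit_scale(asset_w, asset_h, canvas_w, canvas_h, mode):
    """Scale factor conceptually applied to the asset for crop vs contain."""
    sx, sy = canvas_w / asset_w, canvas_h / asset_h
    if mode == "crop":     # fill the canvas; overflow gets cropped
        return max(sx, sy)
    if mode == "contain":  # show everything; leftover space becomes bars
        return min(sx, sy)
    raise ValueError(f"unknown mode: {mode}")

# 16:9 source (1920x1080) on a 9:16 vertical canvas (1080x1920)
print(round(fit_scale(1920, 1080, 1080, 1920, "crop"), 4))     # 1.7778
print(round(fit_scale(1920, 1080, 1080, 1920, "contain"), 4))  # 0.5625
```

With crop, the asset is scaled up until it covers the full canvas height and the sides are cut off; with contain, it is scaled down until the whole frame fits and bars fill the rest.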
Position – Where It Appears
Position uses a 9-zone grid system:
top_left top top_right
center_left center center_right
bottom_left bottom bottom_right
Real example:
from videodb.editor import Position

logo_clip = Clip(
    asset=logo,
    duration=30,
    position=Position.top_right
)
“Place the logo in the top-right corner.”
Offset – For fine-tuned positioning
from videodb.editor import Offset

clip = Clip(
    asset=logo,
    duration=30,
    position=Position.center,
    offset=Offset(x=0.3, y=-0.2)
)
This shifts the logo 30% right, 20% up from center.
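Assuming the offsets are fractions of the canvas dimensions (an assumption - check the SDK reference for the exact semantics), the pixel shift works out like this:

```python
def offset_to_pixels(x, y, canvas_w, canvas_h):
    """Pixel shift for a relative Offset (assumed semantics: x/y are fractions
    of canvas width/height; screen y grows downward, so negative y moves up)."""
    return (x * canvas_w, y * canvas_h)

# Offset(x=0.3, y=-0.2) on a 1080x1920 vertical canvas
print(offset_to_pixels(0.3, -0.2, 1080, 1920))  # (324.0, -384.0)
```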
Scale – Size Adjustment
scale is a multiplier applied after fit. Default is 1.0.
Real example:
pip_clip = Clip(
    asset=overlay_video,
    duration=15,
    scale=0.3
)
“Shrink this video to 30% of its fitted size” (perfect for picture-in-picture).
Opacity – Transparency
opacity ranges from 0.0 (invisible) to 1.0 (opaque).
Real example:
watermark_clip = Clip(
    asset=logo,
    duration=30,
    opacity=0.6
)
“Make the logo 60% opaque (semi-transparent).”
Filter – Visual Effects
Apply color/blur effects:
from videodb.editor import Filter
clip = Clip(
    asset=VideoAsset(id=video.id),
    duration=10,
    filter=Filter.greyscale
)
Available filters: greyscale, blur, boost (saturation), contrast, darken, lighten, muted, negative.
Transition – Fades
Fade in/out at clip start/end:
from videodb.editor import Transition
clip = Clip(
    asset=VideoAsset(id=video.id),
    duration=10,
    transition=Transition(
        in_="fade",
        out="fade",
        duration=2
    )
)
“Fade in over 2 seconds at the start and fade out over 2 seconds at the end.”
Recap: A Clip wraps an Asset and defines how long it plays (duration) and how it appears (fit, position, scale, opacity, filter, transition). Now let’s see how to place clips on the timeline.
Layer 3: Tracks – Sequencing and Layering
Tracks are timeline lanes. They control when clips play and how they stack.
The Track Object
A Track is a container you add clips to:
from videodb.editor import Track
track = Track()
track.add_clip(0, clip) # Add clip at 0 seconds
track.add_clip(start, clip) has two parameters:
start (float, seconds) – When the clip begins on the timeline
clip (Clip object) – The clip to add
Sequential Playback (Same Track)
Clips on the same track play one after another:
track = Track()
track.add_clip(0, clip1) # 0s-5s
track.add_clip(5, clip2) # 5s-10s
track.add_clip(10, clip3) # 10s-15s
This creates a montage - three clips in sequence.
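When durations vary, computing start times by hand gets tedious. A small helper (hypothetical, not part of the SDK) derives them from the clip durations:

```python
def sequential_starts(durations):
    """Start time for each clip so they play back-to-back on one track."""
    starts, t = [], 0.0
    for d in durations:
        starts.append(t)
        t += d
    return starts

# Three 5-second clips, as in the montage above
print(sequential_starts([5, 5, 5]))  # [0.0, 5.0, 10.0]
```

You could then pair the results with your clips, e.g. for start, clip in zip(sequential_starts(durations), clips): track.add_clip(start, clip).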
Simultaneous Playback (Different Tracks)
Clips on different tracks at the same timestamp play simultaneously:
track1 = Track()
track1.add_clip(0, clip1) # First layer
track2 = Track()
track2.add_clip(0, clip2) # Second layer (plays at same time)
Both start at 0 seconds, so they play together. This is how you create layered compositions.
Z-Order (Layering)
Later tracks render on top of earlier tracks.
timeline.add_track(track1) # Bottom layer
timeline.add_track(track2) # Renders above track1
timeline.add_track(track3) # Renders above track2
This is how you create overlays: put background content on track1, overlays on track2.
The “Double Start” Concept
There are two separate “start” parameters:
Asset.start – Trims the source file
track.add_clip(start=...) – Places clip on the timeline
Real example:
# Source video is 2 minutes long
video_asset = VideoAsset(
    id=video.id,
    start=30
)  # Skip first 30s of source
clip = Clip(
    asset=video_asset,
    duration=40
)  # Use 40s (from 0:30 to 1:10 of source)
track = Track()