Add TwelveLabs Pegasus clip finder as opt-in alternative #18

Open
mohit-twelvelabs wants to merge 1 commit into ClipsAI:main from mohit-twelvelabs:feat/twelvelabs-integration

@mohit-twelvelabs

Hi! I'm Mohit, I work at TwelveLabs (@mohit-twelvelabs).

This PR adds PegasusClipFinder, an opt-in alternative to the existing transcript-only ClipFinder. The default ClipFinder finds clips from the words alone using TextTiling; PegasusClipFinder instead uses TwelveLabs Pegasus, a video-language model that analyzes the actual video (visuals, audio, pacing). Pegasus proposes highlight time ranges, which are then aligned to the transcript's character offsets, so the returned Clip objects are fully interchangeable with those produced by ClipFinder.
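To illustrate the alignment step, here is a minimal sketch of mapping a proposed time range onto character offsets. It assumes per-word timing records; the `WordTiming` class and `align_range_to_chars` function are hypothetical names for this example and are not the clipsai Transcription API or the code in this PR.

```python
from dataclasses import dataclass


@dataclass
class WordTiming:
    """Hypothetical word record used only for this sketch."""

    text: str
    start_time: float  # seconds
    end_time: float  # seconds
    start_char: int  # offset of the first character in the transcript
    end_char: int  # offset one past the last character


def align_range_to_chars(words, start, end):
    """Return (start_char, end_char) covering the words that overlap
    the time range [start, end], or None if no word overlaps it."""
    overlapping = [w for w in words if w.end_time > start and w.start_time < end]
    if not overlapping:
        return None
    return overlapping[0].start_char, overlapping[-1].end_char


words = [
    WordTiming("hello", 0.0, 0.4, 0, 5),
    WordTiming("there", 0.5, 0.9, 6, 11),
    WordTiming("world", 1.0, 1.5, 12, 17),
]
span = align_range_to_chars(words, 0.45, 1.2)
```

Snapping to whole words keeps a clip from starting or ending mid-word; a real implementation might prefer sentence boundaries instead.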

Why this helps clipsai
The README notes the library is tuned for "audio-centric, narrative-based videos." Pegasus extends the library's usefulness to videos where highlights are visual rather than purely verbal, without changing the existing pipeline or output type.

Non-breaking / opt-in

  • Nothing in the existing path changes. ClipFinder and its defaults are untouched.
  • The twelvelabs SDK is an optional extra (pip install "clipsai[twelvelabs]"), imported lazily only inside find_clips, so importing clipsai pulls in no new dependency for anyone not using it.
  • Follows the repo's conventions: a ConfigManager subclass for validation, ClipFinderError for errors, NumPy-style docstrings, returns list[Clip].
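The lazy-import pattern described in the list above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `load_optional_sdk` is a hypothetical helper, and `ClipFinderError` is redefined locally as a stand-in so the snippet runs on its own.

```python
import importlib


class ClipFinderError(Exception):
    """Local stand-in for the repo's error type (assumption for this sketch)."""


def load_optional_sdk(module_name="twelvelabs"):
    """Import an optional dependency only when it is actually needed.

    Calling this from inside a method, rather than importing at module
    top level, means users without the extra installed are unaffected
    until they call that method.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ClipFinderError(
            f"'{module_name}' is not installed; "
            'try: pip install "clipsai[twelvelabs]"'
        ) from exc
```

Raising the library's own error type with an install hint gives users one exception class to catch and a clear next step.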

Usage

from clipsai import PegasusClipFinder, Transcriber

transcription = Transcriber().transcribe(audio_file_path="/abs/path/to/video.mp4")
clips = PegasusClipFinder().find_clips(
    transcription=transcription,
    video_url="https://example.com/video.mp4",  # or video_id=... for an indexed video
)

How it was tested

  • 17 unit tests in tests/test_pegasus_clipfinder.py (no network): config validation, API-key handling, response parsing (incl. JSON wrapped in prose/markdown), time-range -> Clip mapping (filtering, clamping to media end, sorting, malformed-range skipping), and find_clips wiring with a mocked SDK client.
  • black and flake8 clean against the repo's setup.cfg config.
  • Live smoke test against the TwelveLabs API: confirmed SDK reachability and the exact analyze(model_name=, video={"type":"url","url":...}, prompt=, max_tokens=) request shape and .data response field. The Pegasus request envelope was verified end-to-end (auth + model + video URL contract all accepted).
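For readers unfamiliar with the parsing and mapping cases the unit tests cover, here is a rough sketch of the two ideas: extracting a JSON array from a reply that wraps it in prose or a markdown fence, then filtering, clamping, and sorting the time ranges. The function names, the `"start"`/`"end"` keys, and the `min_duration` parameter are assumptions made for this example; the PR's actual implementation may differ.

```python
import json
import re


def extract_json_array(text):
    """Pull a JSON array out of a reply that may surround it with prose
    or a markdown fence. Returns [] if nothing parseable is found."""
    match = re.search(r"\[.*\]", text, flags=re.DOTALL)
    if match is None:
        return []
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    return data if isinstance(data, list) else []


def normalize_ranges(items, media_end, min_duration=1.0):
    """Skip malformed entries, clamp to [0, media_end], drop ranges
    shorter than min_duration, and sort by start time."""
    ranges = []
    for item in items:
        try:
            start = float(item["start"])
            end = float(item["end"])
        except (KeyError, TypeError, ValueError):
            continue  # malformed range: missing key or non-numeric value
        start = max(0.0, start)
        end = min(end, media_end)
        if end - start >= min_duration:
            ranges.append((start, end))
    return sorted(ranges)


fence = "`" * 3  # built at runtime to avoid a literal fence in this example
reply = (
    "Here are the highlights:\n" + fence + "json\n"
    '[{"start": 12.5, "end": 30}, {"start": 5, "end": 8}, '
    '{"start": "x"}, {"start": 95, "end": 140}]\n' + fence
)
ranges = normalize_ranges(extract_json_array(reply), media_end=100.0)
```

The greedy regex is deliberately simple; a reply with extra square brackets in the surrounding prose would fail to parse and yield an empty list rather than raise.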

You can grab a free API key at https://twelvelabs.io; there's a generous free tier.
