AI Transcription in Premiere Pro with Echoe Scribe
•8 min read•By Phantom Editor Team
If you edit in Adobe Premiere Pro, you already know the pain: you want accurate captions, but you don’t want to bounce between websites, exports, and re-import steps every time you tweak formatting.
Echoe Scribe is built to keep transcription inside Premiere while giving you multiple AI engines to choose from, plus quick caption “reflow” controls so you can change line breaks without having to re-transcribe.
It’s also a strong option when you’re working with languages that Premiere’s native Speech-to-Text doesn’t support. Adobe publishes the official list on its “Languages supported by Speech-to-Text” help page.
Featured Snippet Summary: With Echoe Scribe, you can transcribe a Premiere sequence using local Whisper (offline) or cloud engines like Deepgram, Speechmatics, and AssemblyAI (BYOK). After transcription, use Reflow Lines → Reflow & Import to change the number of words/lines per caption without re-transcribing.
This post is a written version of the full walkthrough video.
What Echoe Scribe does (in plain English)
Echoe Scribe helps you:
Transcribe a sequence using the engine you prefer (local or cloud).
Import the transcription as captions back into your Premiere timeline.
Reflow captions (change words-per-line / lines-per-caption) without running transcription again.
Step 1: Open Echoe Scribe and choose your sequence
Open Echoe Scribe inside Premiere (left panel), then choose what you want to transcribe (in the video, the demo uses a full sequence).
Step 2: Pick your transcription engine and language
Echoe Scribe lets you select from multiple engines:
Local Whisper (downloadable models that run on your computer)
Deepgram (cloud)
Speechmatics (cloud)
AssemblyAI (cloud)
Then pick your language before you transcribe. Each provider has different language coverage, so if you work across languages, it’s useful to have multiple options available in one workflow.
Step 3: Set caption formatting (words per line + lines per caption)
Before transcription, set how you want captions to be structured:
Words per line (e.g., 6 or 8)
Lines per caption (e.g., 1, 2, or 3)
This is the “editor-facing” part of captions that affects readability the most, especially for social formats.
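To get a feel for how these two settings interact, here is a minimal Python sketch of the arithmetic. It assumes every line is filled to its word limit, which is a simplification: the function names are hypothetical, and Echoe Scribe itself may break lines differently (for example at pauses or punctuation).

```python
import math


def caption_capacity(words_per_line: int, lines_per_caption: int) -> int:
    """Maximum number of words one caption card can hold."""
    if words_per_line < 1 or lines_per_caption < 1:
        raise ValueError("both settings must be at least 1")
    return words_per_line * lines_per_caption


def cards_needed(total_words: int, words_per_line: int, lines_per_caption: int) -> int:
    """Rough count of caption cards, assuming every card is filled completely."""
    capacity = caption_capacity(words_per_line, lines_per_caption)
    return math.ceil(total_words / capacity)
```

Under that assumption, a 100-word clip at 6 words per line and 2 lines per caption needs about 9 cards, while 8 words on a single line needs about 13.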
Step 4: Click Transcribe (what happens behind the scenes)
When you hit Transcribe:
For cloud models, Echoe Scribe exports the sequence audio to the selected provider and receives back timed text.
For local Whisper, you can download the Whisper model and transcribe locally (useful if you want offline transcription).
In the demo, Deepgram is called out as being particularly fast, with strong word and timing accuracy.
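"Timed text" generally means a list of words, each with a start and end time. If you ever handle such results in your own scripts, one practical detail is that providers do not all use the same field names or time units. The sketch below normalizes two hypothetical payload shapes into one structure. The key names (`word`, `text`) and the seconds/milliseconds split are assumptions for illustration, so check each provider's API reference for the actual schema.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class TimedWord:
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


def from_seconds_payload(
    words: list[dict[str, Any]], text_key: str = "word"
) -> list[TimedWord]:
    """Normalize a payload whose timestamps are already in seconds (assumed shape)."""
    return [TimedWord(w[text_key], float(w["start"]), float(w["end"])) for w in words]


def from_milliseconds_payload(
    words: list[dict[str, Any]], text_key: str = "text"
) -> list[TimedWord]:
    """Normalize a payload whose timestamps are in milliseconds (assumed shape)."""
    return [
        TimedWord(w[text_key], w["start"] / 1000.0, w["end"] / 1000.0)
        for w in words
    ]
```

Converting everything to one shape up front means later steps, such as building caption cards, do not need to know which engine produced the words.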
Step 5: Reflow lines and import (without re-transcribing)
This is the feature that makes Echoe Scribe feel “Premiere-native.”
After you’ve already transcribed, you can go to Reflow Lines, change your formatting (words/line, lines/caption), then hit Reflow & Import.
The key point from the video:
Echoe Scribe re-uses the existing transcription and adjusts the caption line breaks based on your new settings.
You don’t have to transcribe again just to change formatting.
Examples shown in the video:
Switching to two lines per caption card
Trying one word per line (not always practical, but demonstrates the control)
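The idea behind reflow can be sketched in a few lines of Python: keep the word-level timings from the original transcription and only regroup them into lines and cards. This is an illustrative sketch, not Echoe Scribe's actual implementation. The `Word`, `Caption`, and `reflow` names are hypothetical, and the grouping simply fills each line to the word limit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass(frozen=True)
class Caption:
    start: float
    end: float
    lines: tuple[str, ...]


def reflow(words: list[Word], words_per_line: int, lines_per_caption: int) -> list[Caption]:
    """Regroup already-transcribed words into caption cards without re-transcribing."""
    if words_per_line < 1 or lines_per_caption < 1:
        raise ValueError("both settings must be at least 1")
    per_card = words_per_line * lines_per_caption
    captions: list[Caption] = []
    for i in range(0, len(words), per_card):
        chunk = words[i:i + per_card]
        # Split the card's words into lines of at most words_per_line words.
        lines = tuple(
            " ".join(w.text for w in chunk[j:j + words_per_line])
            for j in range(0, len(chunk), words_per_line)
        )
        # A card spans from its first word's start to its last word's end.
        captions.append(Caption(chunk[0].start, chunk[-1].end, lines))
    return captions
```

Because the timings travel with the words, calling `reflow` again with different settings only changes the grouping, which is why changing the layout does not require another transcription pass.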
Choosing the right model (quick guide)
Different engines have different trade-offs. Here’s a practical summary based on what’s explained in the walkthrough:
| Engine | Runs where? | When to use |
| --- | --- | --- |
| Deepgram (Nova 3) | Cloud | Great default when you want fast turnaround with strong timing. |
| Speechmatics | Cloud | Strong accuracy; can be slower than Deepgram. |
| AssemblyAI | Cloud | A balanced option (often “in the middle” on speed vs Deepgram/Speechmatics). |
| OpenAI Whisper (local) | Your computer | Useful for offline workflows and broad language support; larger models tend to be more accurate but heavier. The walkthrough notes timing may be less tight than some cloud engines. |
Add your API keys (BYOK) for cloud engines
Deepgram, Speechmatics, and AssemblyAI require an API key.
In Echoe Scribe, go to:
Settings
API Keys
Paste your provider key(s)
Important security note from the walkthrough: API keys are basically passwords. Don’t share them publicly, and treat them like any other credential.
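The same habit applies if you reuse a provider key in your own scripts outside the plugin. A minimal Python sketch, assuming you keep the key in an environment variable rather than in source code: the variable name used in the usage note is hypothetical, and this does not describe how Echoe Scribe stores keys internally.

```python
import os


def load_api_key(env_var: str) -> str:
    """Read a key from the environment so it never appears in source files."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key


def mask(key: str, visible: int = 4) -> str:
    """Return a log-safe version of the key that shows only the last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

For example, `mask(load_api_key("EXAMPLE_PROVIDER_API_KEY"))` gives you something safe to print while debugging, without exposing the full key in logs or screenshots.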
If you’re deciding whether to set this up: most providers offer a free tier or trial, but the details change over time, so it’s best to check each provider’s current terms directly.
How to get an API key (provider links)
The exact UI changes over time, but the flow is usually the same: create an account → open your dashboard/portal → generate an API key → copy it once → paste it into Echoe Scribe.
Deepgram: Start at Deepgram → sign up → open your dashboard → generate an API key.
Speechmatics: Start at Speechmatics → sign up → open your dashboard → generate an API key.
AssemblyAI: Start at AssemblyAI → sign up → open your dashboard → generate an API key.
Once you’ve generated the key, return to Echoe Scribe and paste it into Settings → API Keys.
Multi-language tips (formatting matters)
The video shows a few language examples:
Mandarin/Chinese: “one word” may map differently than in English (and can display as multiple characters). If captions feel too dense, try fewer words per line and/or fewer lines per caption.
Dutch: the demo switches language and transcribes successfully with Deepgram.
Hindi: the demo uses AssemblyAI on a Hindi podcast and imports caption cards back into the timeline.
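The Chinese case is easier to see with a tiny Python example. Chinese text is written without spaces, so a count based on whitespace and the number of characters actually shown on screen can diverge sharply. How each engine segments Chinese into "words" is not covered in the walkthrough, so this sketch only illustrates why a words-per-line setting can produce denser lines than expected; the sample sentences and helper names are made up for illustration.

```python
def whitespace_tokens(text: str) -> list[str]:
    """Split on whitespace, the usual notion of a 'word' in English text."""
    return text.split()


def longest_line(lines: list[str]) -> int:
    """Character count of the longest line, a rough proxy for on-screen width."""
    return max((len(line) for line in lines), default=0)


english = "the weather is nice today"
chinese = "今天天气很好"

print(len(whitespace_tokens(english)), len(english))  # 5 tokens, 25 characters
print(len(whitespace_tokens(chinese)), len(chinese))  # 1 token, 6 characters
```

If a line looks crowded, checking its character count rather than its word count is a quick way to decide whether to lower the words-per-line setting.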
If you regularly work in multiple languages, the most practical workflow is:
pick the best engine for the language and speed you need
transcribe once
iterate on caption formatting using Reflow & Import until it matches your style
If you run into bugs or want a feature added, send us a note. Feedback from real editing workflows shapes what we build next.
Phantom Editor FAQ
Phantom Editor is an AI-powered plugin for Adobe Premiere Pro that reduces repetitive and time-consuming editing tasks to speed up post-production. It includes 13+ tools for captions, AI transcription, AI long-form clipping, podcast multi-cam editing, repeat removal, and more.
Phantom Editor is exclusively designed for Adobe Premiere Pro. It integrates directly into your Premiere Pro workspace and works seamlessly with your existing editing workflow. Maybe we'll add DaVinci Resolve to the mix in the future.
macOS (Apple Silicon M1 and newer) and Windows PCs. Not compatible with Intel-based Macs.
Premiere Pro Version 2023 and above.
Yes, we support all Premiere Pro language versions.
The Phantom Plugin is a subscription. However, we do have some tools available as non-subscription one-time purchases.
Yes! Phantom Editor offers a 7-day free trial so you can test all the features before committing to a purchase.
Three tools are available for free in Phantom Editor:
Dopple Copy & Paste - Paste images directly from your clipboard into Premiere Pro
Silence Remover - Automatically detect and remove silent segments
Transition Assistant - Apply, remove, and change transition effects at scale
Our one-click installers should make it fairly simple to install. If you have any issues, you can also use the ZXP installer for a more robust install. Visit the Installation page on the Phantom Editor website for detailed installation instructions specific to your operating system and Premiere Pro version.
Phantom Editor works with Adobe Premiere Pro Version 2023 and above. It's compatible with macOS (Apple Silicon M1 and newer) and Windows PCs. For AI-powered tools like Banshee Captioner and Echoe Scribe, having adequate RAM and processing power will improve performance, especially for local AI transcription.
You can, however we only allow 1 license key to be active at a time.
We currently only allow 1 active session per license key.
For non-AI tools, yes. Some AI features require internet access to work.
Most features work offline. However, some AI-powered features require internet connectivity. Banshee Captioner offers unlimited local transcription using OpenAI Whisper models (no internet needed). Features like Charon Video Downloader and Phantom Stock Media require internet access.
Yes, we do! Users have access to different AI models and providers, but mostly we use Gemini!
Most of our tools that use AI are done via cloud processing. However, our Banshee Captioner tool does have local AI transcription using OpenAI Whisper.
Your video and audio files are not used for AI training. Files are temporarily uploaded to our AI provider only for processing, and your data remains secure and private at all times.
We will be adding a payment option for additional AI credits in the future. For the time being, email us and we'll work something out.
Wraith Multi-Cam is an intelligent multi-camera editing tool that automatically switches between up to 8 camera angles based on who's speaking. It can edit a 1-hour multi-cam podcast with 3 cameras and 2 speakers in approximately 1.5 minutes, making it perfect for podcasts, interviews, and panel discussions.
AI Repeat Removal (Beta) analyzes your audio sequence using AI to identify and remove repeated lines and mistakes. It scans your content, identifies the best positions to cut, and provides a transcription window where you can review, keep, remove, or add sections as needed.
Viral Spectre uses AI to automatically find viral moments in your long-form content, reframe footage for social media platforms, and cut clips based on your settings. It's similar to Opus Clip but built directly into Premiere Pro. It supports over 100 languages and is perfect for turning podcasts, YouTube videos, and live streams into short-form content for TikTok, Reels, and YouTube Shorts.
Banshee Captioner offers:
Unlimited AI transcription using local OpenAI Whisper models (no cloud processing required)
Support for 99+ languages including RTL languages (Arabic, Hebrew, etc.)
12 fully customizable caption presets with custom font support (more will be added per update)
Built-in transcription editor for text, timing, and line breaks
BYOK (Bring Your Own Key) support for AssemblyAI integration
Import/Export of both Premiere .json and .srt caption files
Both Banshee Captioner and Echoe Scribe support 99+ languages, including right-to-left (RTL) languages like Arabic and Hebrew. Viral Spectre supports over 100 different languages.
Yes! Banshee Captioner includes 12 fully customizable caption presets with support for custom fonts, allowing you to create social-media ready captions that match your brand style. More will be added per update.
Casper Folder Manager automatically imports new files the moment they're added to your chosen watch folder or subfolder. It mirrors your folder structure inside Premiere Pro, eliminating the need to manually import files one by one. You can save presets for project structures, making it ideal for reusing assets.
The Silence Remover automatically detects and removes or disables silent segments from your sequence. You have precise control with adjustable threshold, duration, and padding settings, and can view the audio waveform for visual confirmation.
Clip & Reframe makes it easy to cut sections of your sequence and reformat them into any aspect ratio (perfect for Instagram, TikTok, YouTube Shorts). It includes options to use Adobe's AI Auto Reframe to keep subjects in focus or manually adjust framing. It's non-destructive and preserves scale keyframes.
Current partners include Pexels, Giphy, Pixabay, and Unsplash. You can import high-quality stock footage directly into your Premiere Pro project with different quality options. More partners are coming soon.
Echoe Scribe is an AI transcription tool that transcribes selected sequences into accurate SRT captions. It offers unlimited local transcription using OpenAI Whisper, BYOK support for AssemblyAI, flexible caption reflow, and generates clean, export-ready SRT files.
To give you an idea, when we first launched on Oct 22nd, 2024, we had 8 tools. In the 4 months after that, we added 5 new tools and pushed an update once a week.