Guide
How to isolate audio from video
This guide explains a practical workflow for extracting voices, instruments, and ambience from the mixed audio track of a video. The goal is speed and usable results, not lab-grade separation in every case.
Step-by-step workflow
- Start with the cleanest source clip you have: If you have multiple takes, choose the one with the closest mic placement and the least clipping. Better inputs produce better separations.
- Write a specific target prompt: Use descriptive prompts like 'main speaker voice', 'acoustic guitar strumming', or 'soft crowd ambience'. Avoid broad prompts like 'good audio'.
- Process and preview both tracks: Listen to the isolated and background tracks separately before exporting. This confirms the extraction quality and lets you spot artifacts early.
- Balance levels for your target platform: Raise the isolated track for clarity. For social clips or interviews, keep a small amount of background when the isolated track alone sounds unnatural.
- Export and continue your edit: Send results into your broader workflow for captions, transitions, and mastering. AudioPrompt is strongest as a fast front-end isolator.
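The level-balancing step can be sketched in a few lines if you prefer to do the mix yourself after export. This is a minimal illustration, not part of AudioPrompt: the helper names `db_to_gain` and `mix_tracks`, the -12 dB default for the background, and the assumption that both tracks are already decoded to float samples between -1.0 and 1.0 at the same sample rate are all hypothetical choices made for the example.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def mix_tracks(isolated, background, isolated_db=0.0, background_db=-12.0):
    """Mix two tracks of float samples in [-1.0, 1.0] with per-track gain.

    Hypothetical helper: assumes both tracks have the same sample rate
    and the same number of samples.
    """
    if len(isolated) != len(background):
        raise ValueError("tracks must have the same number of samples")

    iso_gain = db_to_gain(isolated_db)
    bg_gain = db_to_gain(background_db)

    mixed = []
    for iso_sample, bg_sample in zip(isolated, background):
        sample = iso_sample * iso_gain + bg_sample * bg_gain
        # Hard-limit to the valid range so the mix itself cannot exceed full scale.
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


if __name__ == "__main__":
    voice = [0.5, -0.4, 0.3]
    room = [0.5, 0.2, -0.1]
    # Keep the voice at its exported level and tuck the background 20 dB lower.
    print(mix_tracks(voice, room, isolated_db=0.0, background_db=-20.0))
```

A lower `background_db` keeps less of the room tone. The hard limit is only a safety net: if it triggers often, lowering both gains will likely sound better than relying on it.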
Prompt examples that usually work well
- "Primary host voice at center"
- "Lead vocal with minimal reverb"
- "Hi-hat and snare only"
- "Street ambience with passing cars"
Common mistakes to avoid
- Using vague prompts that do not describe the target sound source
- Expecting perfect isolation from clipped or severely distorted recordings
- Skipping preview checks and discovering artifacts only after export
- Treating the first prompt result as final instead of quickly iterating on the prompt
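To catch clipped recordings before processing, a rough pre-check like the one below can help when choosing between takes. It is a sketch under stated assumptions, not an AudioPrompt feature: the function names, the 0.999 threshold, and the input format (samples already decoded to floats between -1.0 and 1.0) are hypothetical. How much clipping is too much remains a judgment call for your material.

```python
def clipped_fraction(samples, threshold=0.999):
    """Return the share of samples at or above the threshold in magnitude."""
    if not samples:
        return 0.0
    clipped = sum(1 for s in samples if abs(s) >= threshold)
    return clipped / len(samples)


def longest_clipped_run(samples, threshold=0.999):
    """Return the length of the longest run of consecutive clipped samples.

    Long runs tend to indicate flattened peaks rather than a single loud sample.
    """
    longest = 0
    current = 0
    for s in samples:
        if abs(s) >= threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


if __name__ == "__main__":
    take = [0.1, 1.0, 1.0, -1.0, 0.2, 0.999, 0.0, 0.5]
    print(clipped_fraction(take))
    print(longest_clipped_run(take))
```

Comparing these two numbers across takes gives a quick, if crude, way to pick the cleanest source clip.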