Upload Audio + Character
Audio is required for transcription. Character reference is required before image generation so identity stays locked.
Build ILLCO-ready lyric videos from uploaded audio and uploaded character references. Each run covers Realtime-first transcription, user-approved lyrics, the selected image count, dissolve visuals, readable ASS subtitles, strict rhyme coloring, and FFmpeg QC notes.
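As a rough illustration of rhyme coloring in ASS subtitles, the sketch below wraps rhyming words in ASS color override tags. The helper names, the rhyme-group mapping, and the palette are hypothetical and not taken from the app. The only fixed facts it relies on are that ASS timestamps use H:MM:SS.cc and that ASS colors are written in blue-green-red order.

```python
def ass_time(seconds: float) -> str:
    """Format seconds as an ASS timestamp, H:MM:SS.cc (centiseconds)."""
    cs = int(round(seconds * 100))
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"


def rgb_to_ass(rgb_hex: str) -> str:
    """Convert 'RRGGBB' to the ASS color literal '&HBBGGRR&'."""
    r, g, b = rgb_hex[0:2], rgb_hex[2:4], rgb_hex[4:6]
    return f"&H{b}{g}{r}&".upper()


def dialogue_line(start, end, words, rhyme_groups, palette):
    """Build one ASS Dialogue line, coloring words that belong to a rhyme group.

    rhyme_groups maps a lowercased word to a group id (hypothetical structure);
    palette maps a group id to an 'RRGGBB' color.
    """
    parts = []
    for word in words:
        key = word.lower().strip(".,!?")
        group = rhyme_groups.get(key)
        if group is None:
            parts.append(word)
        else:
            # \c sets the primary color; \r resets to the style default.
            parts.append("{\\c" + rgb_to_ass(palette[group]) + "}" + word + "{\\r}")
    text = " ".join(parts)
    return f"Dialogue: 0,{ass_time(start)},{ass_time(end)},Default,,0,0,0,,{text}"
```

Words without a rhyme group keep the style's default color, so the base subtitle style stays responsible for readability.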
Transcribe first, confirm lyrics, generate only the credit-selected image count, then cross-dissolve the visuals. No fake exact-sync claims.
The app uses a rap-specialist transcription prompt and blocks image generation until the user confirms the lyrics are correct.
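The gating described here can be sketched as a small state object. This is a minimal illustration, not the app's code: the class, its fields, and the `transcribe_fn` and `generate_fn` callables are hypothetical stand-ins for whatever transcription and image services are actually used.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


class StepBlocked(Exception):
    """Raised when a step is attempted before its prerequisites are met."""


@dataclass
class LyricVideoRun:
    audio_path: Optional[str] = None
    character_ref: Optional[str] = None
    transcript: Optional[str] = None
    lyrics_confirmed: bool = False

    def transcribe(self, transcribe_fn: Callable[[str], str]) -> None:
        if not self.audio_path:
            raise StepBlocked("Audio is required for transcription.")
        self.transcript = transcribe_fn(self.audio_path)
        # A fresh transcript always needs a fresh confirmation.
        self.lyrics_confirmed = False

    def confirm_lyrics(self, edited: Optional[str] = None) -> None:
        if self.transcript is None:
            raise StepBlocked("Transcribe before confirming lyrics.")
        if edited is not None:
            self.transcript = edited
        self.lyrics_confirmed = True

    def can_generate_images(self) -> bool:
        return bool(self.character_ref) and self.lyrics_confirmed

    def generate_images(self, count: int,
                        generate_fn: Callable[[str, str, int], List[str]]) -> List[str]:
        if not self.character_ref:
            raise StepBlocked("Character reference is required before image generation.")
        if not self.lyrics_confirmed:
            raise StepBlocked("Lyrics must be confirmed before image generation.")
        if count < 1:
            raise ValueError("count must be at least 1")
        return generate_fn(self.character_ref, self.transcript, count)
```

Resetting `lyrics_confirmed` on every new transcription means an approval can never carry over to lyrics the user has not seen.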
The selected image count controls credit use. Generated stills are dissolved together instead of hard-cut.
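One way to dissolve stills is FFmpeg's `xfade` filter, chained once per adjacent pair. The sketch below only builds the `-filter_complex` string. It assumes each still is supplied as its own looped input of `hold` seconds (for example `-loop 1 -t 4 -i still.png`) and that all inputs share resolution, frame rate, and pixel format, which `xfade` requires. The function names and default durations are illustrative, not the app's settings.

```python
def build_dissolve_filter(n_images: int, hold: float = 4.0, fade: float = 1.0) -> str:
    """Return a filter_complex string that cross-dissolves n_images inputs.

    hold: duration of each still input in seconds (assumed equal for all).
    fade: duration of each cross-dissolve in seconds.
    """
    if n_images < 1:
        raise ValueError("need at least one image")
    if fade >= hold:
        raise ValueError("fade must be shorter than hold")
    if n_images == 1:
        return "[0:v]format=yuv420p[vout]"

    steps = []
    prev = "[0:v]"
    for k in range(1, n_images):
        # Each dissolve starts `fade` seconds before the running output ends,
        # so the k-th transition begins at k * (hold - fade).
        offset = k * (hold - fade)
        out = "[vout]" if k == n_images - 1 else f"[x{k}]"
        steps.append(
            f"{prev}[{k}:v]xfade=transition=fade:duration={fade:g}:offset={offset:g}{out}"
        )
        prev = out
    return ";".join(steps)


def dissolve_total_duration(n_images: int, hold: float = 4.0, fade: float = 1.0) -> float:
    """Length of the dissolved video: overlaps shorten the plain sum of holds."""
    return n_images * hold - (n_images - 1) * fade
```

Because each transition overlaps two stills, a higher image count lengthens the video by `hold - fade` per extra image rather than by the full `hold`.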
Upload audio, upload a character reference, transcribe first, approve the lyrics, then generate the selected number of visuals and dissolve them.

Credits are metered for transcription, Agent SDK planning, the number of images generated, timing-plan passes, and full render-plan runs.
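A credit estimate over those five metered items could be sketched as below. The item keys and the function are hypothetical, and no per-item rates are given in the text above, so the caller supplies them. Nothing here reflects the app's actual pricing.

```python
# Hypothetical keys for the five metered items listed above.
METERED_ITEMS = (
    "transcription",
    "agent_sdk_planning",
    "image_generation",
    "timing_plan_pass",
    "render_plan_run",
)


def estimate_credits(usage: dict, rates: dict) -> int:
    """Sum credits as usage count times the caller-supplied rate per item.

    usage: item -> how many times it will run (images: the selected count).
    rates: item -> credits per unit; must cover every item in usage.
    """
    total = 0
    for item, count in usage.items():
        if item not in METERED_ITEMS:
            raise KeyError(f"unknown metered item: {item}")
        if count < 0:
            raise ValueError(f"negative usage for {item}")
        total += count * rates[item]
    return total
```

Keeping the rates outside the function makes the selected image count the only input a user changes when comparing the cost of two runs.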