AI Voice
Voice generation that behaves like a production layer.
14 audio bindings, 5 metered audio actions, and connected dubbing, cleanup, and narration surfaces from 5 credits upward.
Voice routes: 7
Voice models: 14
Providers: 6
Narration floor: 5 credits
Voice Signal
Voice Routes
Keep spoken work inside one calm production layer.
Voice should scan faster and quieter than video. The route map below is intentionally cleaner, so narration, dubbing, and support lanes feel connected instead of noisy.
01
Core lane
Text to Speech
Natural synthesis with routed models for narration, explainers, and production voiceover.
Generate narration from the live audio registry with direct handoff into queue, dubbing, and publish.
02
Route
Speech to Speech
Transform performances while preserving timing, phrasing, and emotional structure.
Transform a working take without losing timing, then keep the output inside the same spoken-audio workflow.
03
Route
Voice Cloning
Build reusable voice identities with policy-aware access and cleaner studio handoff.
Build reusable voice identities with clearer operational context instead of treating cloning as a detached magic feature.
04
Route
Voice Effects
Shape, texture, and stylize spoken performance for branded or narrative delivery.
Treat spoken texture and stylization as part of production, not a sandbox disconnected from the rest of the stack.
05
Route
Dubbing
Run multilingual line replacement and performance sync inside the same workflow.
Run multilingual speech replacement inside the same metered audio system that already handles narration and adjacent sync work.
06
Route
Transcription
Turn spoken material into review-ready text that stays close to captions, dubbing, and publish.
Convert spoken material into review-ready text without leaving the audio workflow or inventing a fake standalone model shelf.
07
Route
Cleanup
Remove noise and recover clarity before final delivery, review, or publish.
Support imperfect takes before replacing them outright, while keeping review and publish continuity intact.
Why ELYSIO Voice
Spoken audio behaves like part of production.
The voice pillar is stronger when narration, speech transformation, dubbing, transcription, and cleanup are treated as one connected production layer instead of a loose collection of audio toys.
Narration to dub in one lane
Text-to-speech, speech-to-speech, and dubbing remain adjacent so delivery changes do not break the rest of the spoken workflow.
Operational support routes
Transcription and cleanup are framed honestly as support lanes that feed review, captions, and publish rather than fake standalone model families.
Voice identity continuity
Cloning and effects sit close to the same audio workspace, making reuse and governance easier to reason about.
Real audio stack
The public voice family shows which routed providers and access modes are live before the team commits to narration or localization work.
Live Model Stack
Voice models currently wired into the audio lane.
Lead provider: OpenArt Collection (7 routes)
Primary access: Balanced (12 routes)
Profile spread: 1 active profile label
Fish Audio TTS
Natural vocal expression profile with rich tone.
Fish Audio V1.5
Balanced profile for clean and expressive spoken audio.
GPT 4o Audio Preview
Experimental profile for multimodal voice quality tests.
GPT 4o Mini TTS
Conversational profile optimized for short-form narration.
Mini TTS
Lower-latency profile for quick previews and drafts.
Playground TTS
Creative voice experimentation profile.
Standard TTS
General-purpose text-to-speech profile.
FAI Narrator Pro
FAI.ai narration profile focused on polished long-form delivery and premium pacing.
Metered Operations
The real priced actions underneath this pillar.
These are the actual metered operations from the billing catalog. The page stays honest about what is charged, who can access it, and which provider family it routes through.
Operation             ID             Credits  Tier   Provider
TTS Standard          tts_standard   5        Free+  Gemini
TTS Premium           tts_premium    15       Pro+   FAI.ai
Audiobook Generation  audiobook_gen  20       Pro+   FAI.ai
Dubbing Scene         dubbing_scene  25       Pro+   FAI.ai
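For budgeting, the listed prices can be turned into a rough credit estimate. The sketch below is illustrative only: the operation IDs, credit values, tiers, and providers are copied from the listing above, while the `MeteredOp` type, the `CATALOG` mapping, and the `estimate_credits` helper are hypothetical names and not part of any ELYSIO API. It also assumes each run is billed at the flat listed price, which this page does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeteredOp:
    """One priced action, as listed under Metered Operations."""
    op_id: str
    credits: int
    tier: str
    provider: str


# Values copied from the listing on this page; the structure itself is hypothetical.
CATALOG = {
    "tts_standard": MeteredOp("tts_standard", 5, "Free+", "Gemini"),
    "tts_premium": MeteredOp("tts_premium", 15, "Pro+", "FAI.ai"),
    "audiobook_gen": MeteredOp("audiobook_gen", 20, "Pro+", "FAI.ai"),
    "dubbing_scene": MeteredOp("dubbing_scene", 25, "Pro+", "FAI.ai"),
}


def estimate_credits(usage: dict[str, int]) -> int:
    """Sum credits for a batch of runs.

    Assumption: every run costs the flat listed price. The real billing
    unit (per request, per scene, per minute) is not stated on this page.
    """
    total = 0
    for op_id, runs in usage.items():
        if runs < 0:
            raise ValueError(f"run count for {op_id!r} must be non-negative")
        total += CATALOG[op_id].credits * runs
    return total


if __name__ == "__main__":
    # Ten standard narrations plus two dubbed scenes.
    print(estimate_credits({"tts_standard": 10, "dubbing_scene": 2}))
```

Under the flat-price assumption, ten standard narrations (10 x 5) plus two dubbed scenes (2 x 25) come to 100 credits. An unknown operation ID raises `KeyError`, which keeps typos from silently costing zero.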
Use Cases
Where this pillar earns its place.
01
Narration, dubbing, and creator voiceover inside one routed audio workspace.
02
Speech transfer and cloning for multilingual, character, and branded performance lanes.
03
Audiobook, long-form spoken content, and line-by-line scene dubbing.
04
Cleanup and restoration before publish, review, or client delivery.
05
Music and spoken audio systems that remain visible in queue, usage, and Flow.
06
Audio production with clearer plan, tier, and route awareness.
System Handoff
Move into the next surface without losing context.
Models / Voice
Linked product route inside the ELYSIO platform surface map.
/models/voice
Audio Studio
TTS, dubbing, music, and audio workflows in one place.
/studio/audio
AI / Video / Lipsync
Linked product route inside the ELYSIO platform surface map.
/ai/video/lipsync
Pricing
Plan, credit, and bundle logic for moving from exploration into actual production usage.
/pricing
Voice routes in this family start at 5 credits in the live billing catalog.