muapi/suno-voice-clone

Text to Audio

Clone your singing voice in two takes for use with Suno music generation. Submit a 10-second sample, then read back a fresh random phrase the system generates (anti-deepfake liveness check), and receive a reusable voice_id you can drop into Suno music creation. Free during preview.

Overview

About this model

Clone your own singing voice from a short recording and reuse it in AI music generation. A two-take liveness flow protects against deepfake misuse: you upload a 10-second sample, then read aloud a fresh random phrase that the system generates for that request. The result is a reusable voice_id you can pass in music creation requests.

  1. Original music: write a song and have it sung in your own voice.
  2. Creator branding: maintain a consistent vocal identity across released tracks.
  3. Demos and covers: hear how a melody sits in your range without recording yourself singing every line.
  4. Multilingual releases: clone once, sing in any of 10 supported languages.

Pricing & Value

Cost analysis

muapiapp: Free during preview

Two-stage voice clone with anti-deepfake liveness check, usable directly in music generation.

Fal.ai: Not available

No equivalent singing-voice cloning product.

Replicate: Not available

No equivalent singing-voice cloning product gated to music output.

* Competitor pricing is estimated based on similar model architectures and usage tiers.

Technical Details

Configuration schema

Voice Sample URL: string

URL of a clean 10-second recording of the voice to clone. Mono is fine. The provider extracts a vocal segment between vocal_start_s and vocal_end_s.

Default value: https://d3adwkbyhxyrtq.cloudfront.net/webassets/videomodels/minimax-voice-clone-in.wav

Voice Name: string

A short label for your voice, shown in the voice picker (optional).

Default value: none

Description: string

Free-form description of this voice (optional).

Default value: none

Style Tags: string

Comma-separated style hints used at music generation time (optional).

Default value: none

Language: enum (10 options)

Language the voice sample is spoken in.

Default value: en

Vocal Start (seconds): int

Start time of the vocal segment within the sample.

Default value: 0

Vocal End (seconds): int

End time of the vocal segment within the sample. Must be greater than Vocal Start.

Default value: 10
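
The schema above can be turned into a small payload builder that applies the documented defaults and the one stated rule (Vocal End must be greater than Vocal Start). This is a minimal sketch, not the provider's client: `build_clone_payload` is a hypothetical helper, and it only emits the JSON keys that appear elsewhere on this page (`audio_url`, `voice_name`, `language`, `vocal_start_s`, `vocal_end_s`). The keys for Description and Style Tags are not shown in the source, so they are left out rather than guessed.

```python
from typing import Any, Dict, Optional


def build_clone_payload(
    audio_url: str,
    voice_name: Optional[str] = None,
    language: str = "en",
    vocal_start_s: int = 0,
    vocal_end_s: int = 10,
) -> Dict[str, Any]:
    """Build the JSON body for the sample request (hypothetical helper)."""
    # The only constraint stated in the schema: end must be after start.
    if vocal_end_s <= vocal_start_s:
        raise ValueError("vocal_end_s must be greater than vocal_start_s")

    payload: Dict[str, Any] = {
        "audio_url": audio_url,
        "language": language,
        "vocal_start_s": vocal_start_s,
        "vocal_end_s": vocal_end_s,
    }
    if voice_name:  # optional label, omitted when empty
        payload["voice_name"] = voice_name
    return payload
```

Validating the window locally avoids spending a request on a payload the schema already rules out.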

Implementation Guide

Developer documentation

How to Use Custom Voice Cloning

This flow takes two recordings. The phrase step is a liveness check that prevents cloning someone else's voice from found audio — there is no way to skip it.

  1. Record sample (10 seconds): Capture a clean recording of the voice in a quiet room. Mono is fine. Trim silence at the start. Upload it via /api/v1/upload_file and keep the returned URL.

  2. Submit sample:

    curl -X POST https://api.muapi.ai/api/v1/suno-voice-clone \
      -H "x-api-key: $MUAPI_KEY" \
      -H 'Content-Type: application/json' \
      -d '{"audio_url": "https://.../sample.wav", "voice_name": "My Voice", "language": "en"}'
    

    Response: {"request_id": "...", "status": "processing"}.

  3. Poll for the phrase: Hit /api/v1/predictions/{request_id}/result until the response shows stage: "awaiting_phrase" with a phrase field. The phrase is generated freshly per request and cannot be reused.

  4. Record the phrase: Read the phrase aloud in the same voice, in one take, 5–15 seconds. Upload the recording.

  5. Submit confirmation:

    curl -X POST https://api.muapi.ai/api/v1/suno-voice-clone/{request_id}/confirm \
      -H "x-api-key: $MUAPI_KEY" \
      -H 'Content-Type: application/json' \
      -d '{"audio_url": "https://.../phrase.wav"}'
    
  6. Poll for completion: Same poll endpoint as before; final response carries voice_id. That is the durable identifier you pass to music generation.
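
Steps 2 through 6 can be sketched in Python as two calls with a polling loop in between. This is an illustration under stated assumptions, not an official client: the helper names (`http_transport`, `poll_until`, `start_clone`, `finish_clone`) are hypothetical, the poll endpoint is assumed to accept GET, and the completed response is assumed to expose `voice_id` at the top level. The upload step is left out because the request shape of /api/v1/upload_file is not documented here, and failure states are not described on this page, so the sketch only gives up after a fixed number of attempts.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Optional, Tuple

BASE_URL = "https://api.muapi.ai/api/v1"

# A transport takes (method, path, body) and returns the decoded JSON response.
Transport = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def http_transport(api_key: str) -> Transport:
    """Real transport built on urllib, sending the x-api-key header."""
    def send(method: str, path: str, body: Optional[Dict[str, Any]]) -> Dict[str, Any]:
        data = json.dumps(body).encode("utf-8") if body is not None else None
        req = urllib.request.Request(
            BASE_URL + path,
            data=data,
            method=method,
            headers={"x-api-key": api_key, "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=30) as resp:
            return json.loads(resp.read().decode("utf-8"))
    return send


def poll_until(send: Transport, request_id: str,
               done: Callable[[Dict[str, Any]], bool],
               interval_s: float = 2.0, max_attempts: int = 60) -> Dict[str, Any]:
    """Poll the result endpoint until done(result) is true or attempts run out."""
    for _ in range(max_attempts):
        # Assumption: the result endpoint is read with GET.
        result = send("GET", f"/predictions/{request_id}/result", None)
        if done(result):
            return result
        time.sleep(interval_s)
    raise TimeoutError(f"request {request_id} did not reach the expected state")


def start_clone(send: Transport, sample_url: str, voice_name: str = "My Voice",
                language: str = "en", interval_s: float = 2.0) -> Tuple[str, str]:
    """Steps 2-3: submit the sample, then wait for the liveness phrase."""
    created = send("POST", "/suno-voice-clone",
                   {"audio_url": sample_url, "voice_name": voice_name, "language": language})
    request_id = created["request_id"]
    result = poll_until(
        send, request_id,
        lambda r: r.get("stage") == "awaiting_phrase" and "phrase" in r,
        interval_s=interval_s,
    )
    return request_id, result["phrase"]


def finish_clone(send: Transport, request_id: str, phrase_url: str,
                 interval_s: float = 2.0) -> str:
    """Steps 5-6: submit the phrase recording, then wait for voice_id."""
    send("POST", f"/suno-voice-clone/{request_id}/confirm", {"audio_url": phrase_url})
    # Assumption: the final response carries voice_id as a top-level field.
    result = poll_until(send, request_id, lambda r: "voice_id" in r, interval_s=interval_s)
    return result["voice_id"]
```

With a real key, `send = http_transport(os.environ["MUAPI_KEY"])` would be passed to `start_clone`, the returned phrase recorded and uploaded, and the resulting URL handed to `finish_clone`. Passing the transport in as a parameter keeps the flow testable without network access.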

Library, availability, and refresh

  • GET /api/v1/suno-voices — list your voices (most recent first), each with stage, is_available, and (if mid-flow) the active phrase.
  • POST /api/v1/suno-voices/{voice_db_id}/check — re-check availability before using a voice (voices have a finite validity window).
  • POST /api/v1/suno-voices/{voice_db_id}/refresh — if a voice has expired, request a new phrase and re-record. Your original sample stays on file.
  • DELETE /api/v1/suno-voices/{voice_db_id} — remove a voice from your library.
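
The four library endpoints above map naturally onto a small routing table. The sketch below is illustrative only: `LIBRARY_ROUTES`, `library_request`, and `newest_available` are hypothetical names, `voice_db_id` is assumed to be usable as a string in the path, and the list response is assumed to decode to a list of objects carrying the documented `is_available` field.

```python
from typing import Any, Dict, List, Optional, Tuple

BASE_URL = "https://api.muapi.ai/api/v1"

# action -> (HTTP method, path template), taken from the endpoint list
LIBRARY_ROUTES: Dict[str, Tuple[str, str]] = {
    "list": ("GET", "/suno-voices"),
    "check": ("POST", "/suno-voices/{voice_db_id}/check"),
    "refresh": ("POST", "/suno-voices/{voice_db_id}/refresh"),
    "delete": ("DELETE", "/suno-voices/{voice_db_id}"),
}


def library_request(action: str, voice_db_id: Optional[str] = None) -> Tuple[str, str]:
    """Return (method, full URL) for a library action."""
    method, template = LIBRARY_ROUTES[action]
    if "{voice_db_id}" in template:
        if not voice_db_id:
            raise ValueError(f"{action!r} needs a voice_db_id")
        template = template.format(voice_db_id=voice_db_id)
    return method, BASE_URL + template


def newest_available(voices: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """First entry flagged is_available; the list is documented as most recent first."""
    return next((v for v in voices if v.get("is_available")), None)
```

Keeping method and path together in one table makes it harder to send, say, a GET to the `check` endpoint by mistake.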

Common Questions

Frequently asked

Why do I have to record twice?

The second recording is a liveness check. By making you read a fresh random phrase that you could not have prepared in advance, the provider verifies you are the actual speaker — not someone uploading a YouTube clip of a celebrity or a coworker. This is enforced upstream and cannot be skipped.

What audio format and length should the sample be?

10 seconds of clean speech or singing is ideal. Mono or stereo, mp3 or wav both work. Use vocal_start_s and vocal_end_s to mark the vocal segment within a longer file if needed.

How long does a voice stay usable?

Voices created by the provider have a limited validity window. Call the `check` endpoint before each music generation, and the `refresh` endpoint to get a new verification phrase if a voice has expired.
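
The check-then-refresh routine described in this answer might look like the following sketch. It rests on two assumptions that this page does not confirm: that the `check` response includes an `is_available` field (as the voice list does), and that the `refresh` response returns the new phrase in a `phrase` field. `voice_status` and the injected `call` function are hypothetical.

```python
from typing import Any, Callable, Dict

# call(method, path) -> decoded JSON response; inject a real HTTP client here.
Call = Callable[[str, str], Dict[str, Any]]


def voice_status(call: Call, voice_db_id: str) -> Dict[str, Any]:
    """Check a voice and, if it is no longer available, request a new phrase."""
    checked = call("POST", f"/api/v1/suno-voices/{voice_db_id}/check")
    # Assumption: the check response exposes is_available.
    if checked.get("is_available"):
        return {"usable": True, "phrase": None}

    refreshed = call("POST", f"/api/v1/suno-voices/{voice_db_id}/refresh")
    # Assumption: the refresh response carries the new phrase in "phrase".
    return {"usable": False, "phrase": refreshed.get("phrase")}
```

When `usable` is false, the returned phrase still has to be recorded and confirmed before the voice can be used again.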

Can I use the same voice_id across multiple songs?

Yes — that is the whole point. Once cloning succeeds, the voice_id is yours to reuse in any music generation request until it expires.

What does it cost?

Free during preview. Pricing may change once the upstream provider begins charging.