How To Use Google Lyria 3

Built on Google Lyria 3

48kHz stereo audio

Use prompts, lyrics, timing, BPM, and images the right way

This guide condenses the official Google Lyria 3 capabilities into a creator-friendly workflow. It covers Clip vs Pro, custom lyrics, timestamp structure, image-to-music, instrumental prompts, language control, output parsing, and practical guardrails.

Why this page exists

The builder is powered by Google Lyria 3, but the workflow is shaped by our agent layer: structured prompting, cleaner lyric and timing controls, stronger generation defaults, async orchestration, and reusable track management. It is not a thin shell around a single API call.

Lyria 3 Clip (lyria-3-clip-preview)

Best for: Fast tests, hooks, loops, previews
Duration: Always 30 seconds
Output: MP3

Lyria 3 Pro (lyria-3-pro-preview)

Best for: Fuller songs with verses, choruses, bridges
Duration: A couple of minutes, guided by your prompt
Output: Model-selected MP3 or WAV

1. Start with the right model

Use Clip when you want to explore ideas fast. Use Pro when you already know the direction and want a longer, more structured piece.

Clip is fixed at 30 seconds, so it is ideal for testing genres, moods, and hooks.

Pro is better when you need verses, choruses, bridges, or a longer emotional arc.

A strong workflow is Clip first, Pro second.
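
The Clip-first, Pro-second routing can be sketched as a small helper. The two model IDs are the ones listed earlier in this guide; the function name and its two boolean parameters are hypothetical and exist only to illustrate the decision.

```python
CLIP_MODEL = "lyria-3-clip-preview"  # fixed 30-second output
PRO_MODEL = "lyria-3-pro-preview"    # longer, more structured pieces


def choose_model(needs_song_structure: bool, direction_is_settled: bool) -> str:
    """Return a model ID following the Clip-first, Pro-second workflow.

    Hypothetical helper: Pro is picked only when the direction is already
    known AND the piece needs verses, choruses, or a longer arc.
    Everything else goes to Clip for fast exploration.
    """
    if needs_song_structure and direction_is_settled:
        return PRO_MODEL
    return CLIP_MODEL
```

In this sketch an unsettled idea always goes to Clip, even if the final piece will need full song structure, which mirrors the advice to test cheaply before committing to a longer generation.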

2. Write a musically specific prompt

Lyria performs best when you describe the actual musical brief instead of a vague vibe.

Mention genre or genre blend: lo-fi hip hop, cinematic orchestral, indie pop, jazz fusion.

Name instruments: Rhodes, strings, brass, 808, acoustic guitar, vocal harmonies.

Set tempo and key when relevant: 85 BPM, D minor, G major.

Describe the mood and energy: nostalgic, aggressive, dreamy, uplifting, tense.

For Pro, mention desired length in the prompt when duration matters.
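
One way to keep prompts musically specific is to fill in a small brief and render it to text. This is a minimal sketch: the `MusicBrief` class, its field names, and the labeled-line layout are all assumptions made for illustration, not a format required by Lyria.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MusicBrief:
    """Hypothetical container for the prompt ingredients listed above."""
    genre: str
    instruments: list[str] = field(default_factory=list)
    bpm: Optional[int] = None
    key: Optional[str] = None
    mood: Optional[str] = None
    length_hint: Optional[str] = None  # only relevant for Pro

    def to_prompt(self) -> str:
        # Emit one labeled line per field that was actually provided.
        lines = [f"Genre: {self.genre}"]
        if self.instruments:
            lines.append("Instruments: " + ", ".join(self.instruments))
        if self.bpm is not None:
            lines.append(f"Tempo: {self.bpm} BPM")
        if self.key:
            lines.append(f"Key: {self.key}")
        if self.mood:
            lines.append(f"Mood: {self.mood}")
        if self.length_hint:
            lines.append(f"Length: {self.length_hint}")
        return "\n".join(lines)


brief = MusicBrief(
    genre="lo-fi hip hop",
    instruments=["Rhodes", "808"],
    bpm=85,
    key="D minor",
    mood="nostalgic",
)
print(brief.to_prompt())
```

Leaving a field empty simply drops its line, so the same structure works for a quick Clip test and for a fuller Pro brief with a length hint.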

3. Use custom lyrics when words matter

If you already have lyrics, paste them in as written and keep them separate from the production instructions.

Use section tags such as [Verse], [Chorus], [Bridge], [Intro], [Outro].

Keep your musical direction above the lyrics so the model sees both intent and words.

If you want no vocals, do not provide lyrics and explicitly say instrumental only.
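
The layout described above (musical direction first, tagged lyric sections below) can be assembled with a short helper. The function name and the "Lyrics:" separator line are assumptions for illustration; only the bracketed section tags come from this guide.

```python
def build_lyric_prompt(direction: str, sections: list[tuple[str, str]]) -> str:
    """Place the musical direction above the tagged lyric sections.

    `sections` is a list of (tag, lyric_text) pairs, for example
    ("Verse", "..."). Hypothetical helper, not part of any SDK.
    """
    blocks = [direction.strip(), "Lyrics:"]
    for tag, text in sections:
        blocks.append(f"[{tag}]\n{text.strip()}")
    # Blank lines keep production notes and lyrics visibly separate.
    return "\n\n".join(blocks)


prompt = build_lyric_prompt(
    "Indie pop, acoustic guitar and vocal harmonies, uplifting.",
    [
        ("Verse", "Morning light across the floor"),
        ("Chorus", "We are wide awake"),
    ],
)
print(prompt)
```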

4. Control timing and structure with timestamps

When you need precise pacing, tell the model what should happen in each time window.

Example: [0:00 - 0:10] Intro, [0:10 - 0:30] Verse, [0:30 - 0:50] Chorus.

Use timestamps to control energy lifts, instrument entrances, vocal timing, and fade-outs.

This is especially useful for trailers, scene music, and directed builds.
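
Timestamp windows like the example above are easy to get wrong by hand, so it can help to generate them from seconds. The sketch below is a hypothetical helper (the names `fmt` and `build_timeline` are illustrative) that formats each window in the `[m:ss - m:ss]` style and rejects overlapping or empty windows.

```python
def fmt(seconds: int) -> str:
    """Format a second count as m:ss, e.g. 70 -> '1:10'."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


def build_timeline(windows: list[tuple[int, int, str]]) -> str:
    """Render (start_seconds, end_seconds, description) windows as prompt lines."""
    lines = []
    previous_end = 0
    for start, end, description in windows:
        if start < previous_end or end <= start:
            raise ValueError(f"bad window: {start}-{end} ({description})")
        lines.append(f"[{fmt(start)} - {fmt(end)}] {description}")
        previous_end = end
    return "\n".join(lines)


timeline = build_timeline([
    (0, 10, "Intro"),
    (10, 30, "Verse"),
    (30, 50, "Chorus"),
])
print(timeline)
```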

5. Add images when visuals should influence the song

Google Lyria 3 supports multimodal music generation. You can provide up to 10 images and ask the music to follow their mood, colors, and story.

Use moodboards, concept art, cover sketches, scene stills, or product visuals.

Only add images when visual direction really matters. Otherwise keep the request simpler.

Images work best when your prompt also explains what musical feeling the visuals should produce.
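
Before sending an image-guided request, a small pre-flight check can enforce the two points above: the 10-image limit stated in this guide, and a text prompt that explains the intended musical feeling. The function name and error messages are hypothetical; how the images are actually attached to a request is outside this sketch.

```python
from pathlib import Path

MAX_IMAGES = 10  # limit stated in this guide


def check_image_request(image_paths: list[str], prompt: str) -> list[Path]:
    """Validate an image-guided request before it is sent.

    Hypothetical pre-flight helper: it only checks counts and that a
    prompt accompanies the images; it does not read or upload files.
    """
    if len(image_paths) > MAX_IMAGES:
        raise ValueError(f"at most {MAX_IMAGES} images, got {len(image_paths)}")
    if image_paths and not prompt.strip():
        raise ValueError("describe the musical feeling the images should produce")
    return [Path(p) for p in image_paths]
```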

6. Force instrumental output when needed

For background music, trailers, games, and beats, tell Lyria explicitly that you want no vocals.

Use a phrase like: Instrumental only, no vocals.

This should appear directly in the prompt, not just as an implied preference.

Clip is often enough for instrumental concept testing before moving to Pro.
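
To make sure the instruction is stated rather than implied, a prompt can be passed through a guard that appends the phrase when it is missing. This is a minimal sketch with a hypothetical function name; the case-insensitive substring check is a simplification.

```python
INSTRUMENTAL_PHRASE = "Instrumental only, no vocals."


def force_instrumental(prompt: str) -> str:
    """Append the explicit no-vocals phrase unless it is already present."""
    if "instrumental only" in prompt.lower():
        return prompt
    return prompt.rstrip() + " " + INSTRUMENTAL_PHRASE
```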

7. Match the prompt language to the lyric language

Lyria adapts vocal style and pronunciation to the language of your prompt.

If you want French lyrics, prompt in French.

If you want English vocals with Japanese section tags or notes, make that explicit.

Language control works better when you avoid mixing too many languages in one request.

8. Understand the response correctly

The model returns multiple parts. Some parts are text and some parts are audio bytes.

Do not assume the first part is always lyrics or always audio.

Iterate through all returned parts and detect text versus inline audio data.

The text output can contain lyrics, structure notes, or other written material alongside the audio.
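
The parsing rule above can be sketched without tying it to a specific SDK. In the example below, `Part` and `InlineData` are stand-in classes: their field names (`text`, `inline_data`, `mime_type`, `data`) and the MIME strings in the extension table are assumptions about the response shape, so adapt them to whatever objects your client library actually returns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InlineData:  # stand-in for an inline audio payload (assumed shape)
    mime_type: str
    data: bytes


@dataclass
class Part:  # stand-in for one response part (assumed shape)
    text: Optional[str] = None
    inline_data: Optional[InlineData] = None


# Assumed MIME types; Pro may return MP3 or WAV, so do not hard-code one.
EXTENSIONS = {"audio/mpeg": ".mp3", "audio/mp3": ".mp3",
              "audio/wav": ".wav", "audio/x-wav": ".wav"}


def split_parts(parts: list[Part]) -> tuple[list[str], list[InlineData]]:
    """Walk every part and sort it into text or audio, in any order."""
    texts, audio = [], []
    for part in parts:
        inline = part.inline_data
        if inline is not None and inline.mime_type.startswith("audio/"):
            audio.append(inline)
        elif part.text:
            texts.append(part.text)
    return texts, audio


def extension_for(mime_type: str) -> str:
    """Pick a file extension from the MIME type, falling back to .bin."""
    return EXTENSIONS.get(mime_type.lower(), ".bin")


parts = [
    Part(text="[Verse] ..."),
    Part(inline_data=InlineData("audio/mpeg", b"\x00\x01")),
]
texts, audio = split_parts(parts)
```

Choosing the extension from the reported MIME type, rather than from the model name, avoids writing WAV bytes into a file named `.mp3` when Pro selects the format.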

Best practices

Iterate with Clip first, then send the strongest prompt to Pro.
Use concrete musical language instead of generic adjectives alone.
Separate lyric content from production notes for cleaner guidance.
Use timestamps when structure matters more than vibe.
Prompt in the language you want sung.
Keep copyrighted lyrics and artist-imitation requests out of the prompt.

Limits and safety notes

Clip always returns 30 seconds.
Pro usually produces a couple of minutes, but exact length can vary.
Results are non-deterministic, so the same prompt can return different music.
Lyria 3 generation is single-turn: each request produces a new track rather than editing an existing one in place.
Google states generated audio includes a SynthID watermark.
Safety filters can block copyrighted lyrics or artist-voice imitation requests.

Compliance and anti-infringement guidance

This tool is built on Google Lyria 3 and is subject to its safety filters. Do not request copyrighted lyrics, imitation of a named artist, or a clone of a recognizable performer's voice. Focus on original briefs: genre, arrangement, instrumentation, emotion, language, lyrics, and structure.