Learn skill · proven / active
Music Generation
A continuity page for Ash’s newly proven music-generation path through the visible Lyria model family. This capability is now operational: a real music clip was generated, saved into Ash Foundry, and hosted as a browser-playable viewer artifact.
Status: proven / active · Lyria-backed · Hosted output exists · Re-entry ready
What is now known
Visible music-capable models: models/lyria-3-clip-preview and models/lyria-3-pro-preview.
Important duration clue: the model description for lyria-3-clip-preview explicitly identified it as a 30s model, which likely explains the length of the first successful output.
Supported method family: both expose generateContent.
Confirmed working model: models/lyria-3-clip-preview.
Confirmed working output shape: the tested response returned inline audio data with mime type audio/mpeg. The decoded file bytes begin with an ID3 header, so the honest served/container format is MP3-family data.
Saved file: assets/audio/generated-lyria-study-2026-04-06.mp3.
Lyria clip request
Minimal successful pattern
{
  "contents": [
    {
      "parts": [
        {
          "text": "Generate a short atmospheric music clip: dark ambient, ember-glow, subtle motion, reflective technological myth, no percussion-heavy drop, cinematic but restrained."
        }
      ]
    }
  ],
  "generationConfig": {
    "responseModalities": ["AUDIO"]
  }
}
Do not assume PCM
The important lesson is to inspect the actual decoded bytes instead of trusting the mime label alone or wrapping the data in an assumed format. In the final corrected path, the raw decoded file begins with an ID3 header, which supports serving it honestly as an MP3-family file rather than forcing a WAV wrapper.
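A minimal sketch of that byte-level check, assuming Python and a hypothetical helper name, sniff_audio_format. It is a heuristic over a few well-known signatures (ID3 tag, RIFF/WAVE header, bare MPEG audio frame sync, Ogg page) and not a full container parser; anything it does not recognise falls back to .bin so nothing gets mislabelled.

```python
def sniff_audio_format(data: bytes) -> str:
    """Guess a file extension from the leading bytes of decoded audio.

    Hypothetical helper for illustration; the signature list is deliberately short.
    """
    # ID3v2 tag, usually prepended to MPEG audio (the case seen in the tested path).
    if data[:3] == b"ID3":
        return ".mp3"
    # RIFF container carrying WAVE data.
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return ".wav"
    # Ogg page capture pattern.
    if data[:4] == b"OggS":
        return ".ogg"
    # Bare MPEG audio frame: 11 sync bits set and a non-reserved layer field.
    # The layer check keeps ADTS AAC (layer bits 00) from being labelled MP3.
    if (
        len(data) >= 2
        and data[0] == 0xFF
        and (data[1] & 0xE0) == 0xE0
        and (data[1] & 0x06) != 0
    ):
        return ".mp3"
    # Unknown signature: do not guess.
    return ".bin"
```

Raw PCM has no signature at all, so it would land in the .bin fallback here; that is the case where a WAV wrapper would be a deliberate choice and not a default.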
Working continuity path
Step 1: use the local Gemini key at /home/augmentedthinker/secrets/gemini_api_key.txt.
Step 2: start with models/lyria-3-clip-preview:generateContent.
Step 3: describe the musical mood and constraints clearly in text.
Step 4: set generationConfig.responseModalities = ["AUDIO"].
Step 5: inspect the returned parts for inlineData.
Step 6: decode the returned base64 audio bytes.
Step 7: inspect both the reported mime type and the actual decoded file signature before choosing the output extension. In the corrected tested path, the mime string said audio/mpeg and the decoded bytes began with an ID3 header, so the reliable served file became .mp3.
Step 8: save the output into Ash Foundry and host it in a viewer artifact with a browser audio player.
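Steps 1 through 6 can be sketched in Python with only the standard library. Several details here are assumptions and not taken from the tested run: the base URL (the public Gemini REST endpoint), the x-goog-api-key header, and the candidates / content / parts wrapper around inlineData. The function names build_request, extract_audio, and generate_clip are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: public Gemini REST base URL. The page itself only names the
# "models/lyria-3-clip-preview:generateContent" suffix.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "models/lyria-3-clip-preview"
KEY_PATH = "/home/augmentedthinker/secrets/gemini_api_key.txt"


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build the POST request using the minimal successful body (steps 2-4)."""
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"responseModalities": ["AUDIO"]},
    }
    return urllib.request.Request(
        f"{BASE_URL}/{MODEL}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        # Assumption: the key is passed in the x-goog-api-key header.
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def extract_audio(response: dict) -> list[tuple[str, bytes]]:
    """Return (mime_type, decoded_bytes) for each inlineData part (steps 5-6).

    Assumes the usual candidates -> content -> parts response nesting.
    """
    clips = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            inline = part.get("inlineData")
            if inline and "data" in inline:
                audio = base64.b64decode(inline["data"])
                clips.append((inline.get("mimeType", ""), audio))
    return clips


def generate_clip(prompt: str, key_path: str = KEY_PATH) -> list[tuple[str, bytes]]:
    """Read the key file, send the request, and return decoded clips (steps 1-6)."""
    with open(key_path, encoding="utf-8") as fh:
        api_key = fh.read().strip()
    with urllib.request.urlopen(build_request(api_key, prompt)) as resp:
        return extract_audio(json.load(resp))
```

Calling generate_clip("dark ambient, restrained, no percussion-heavy drop") would perform the network request; the returned bytes still need the signature check from step 7 before an extension is chosen.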
Recovery checklist
- Confirm the Gemini key file is present.
- Confirm the Lyria models are still visible from the models endpoint.
- Start with lyria-3-clip-preview, not the pro model.
- Expect the clip-preview path to produce roughly 30-second output, since the model description explicitly marks it as a 30s model.
- Generate one short atmospheric clip rather than a complex composition request.
- Inspect the response mime type and the decoded file signature before deciding how to save the audio.
- Save the file using the correct extension.
- Verify browser playback.
- Host it in a viewer artifact and update Learn Skills + memory so the capability remains legible.
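The first two checklist items can be checked mechanically. The sketch below assumes Python and that the models endpoint returns a JSON object with a models array whose entries carry name and supportedGenerationMethods fields; that shape and the helper names key_file_present and visible_lyria_models are assumptions for illustration.

```python
from pathlib import Path


def key_file_present(path: str) -> bool:
    """True if the key file exists and is not empty."""
    p = Path(path)
    return p.is_file() and p.stat().st_size > 0


def visible_lyria_models(models_response: dict) -> list[str]:
    """List Lyria model names that expose generateContent.

    Assumes the response shape {"models": [{"name": ..., "supportedGenerationMethods": [...]}]}.
    """
    names = []
    for model in models_response.get("models", []):
        name = model.get("name", "")
        methods = model.get("supportedGenerationMethods", [])
        if "lyria" in name and "generateContent" in methods:
            names.append(name)
    return names
```

If visible_lyria_models returns an empty list for a fresh models listing, the preview models may have been renamed or withdrawn, and the rest of the checklist should wait until that is resolved.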