Audio input

Gemini multimodal models accept audio input. There are two ways to send it:

Path 1: OpenAI-shape `input_audio` on `/v1/chat/completions`

The gateway automatically translates the OpenAI `input_audio` content part to Gemini’s `inline_data`. The `format` field maps to the corresponding MIME type (`mp3` → `audio/mp3`, `wav` → `audio/wav`, etc.).

curl https://api.clearmaas.com/v1/chat/completions \
  -H "Authorization: Bearer sk-clearmaas-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-flash",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is happening in this audio clip?"},
        {"type": "input_audio", "input_audio": {"data": "<base64>", "format": "mp3"}}
      ]
    }]
  }'
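The same request body can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name `build_chat_request` is hypothetical, and it assumes the `data` field expects standard base64 of the raw file bytes. It only builds the JSON body; sending it is left to whatever HTTP client you already use.

```python
import base64
import json


def build_chat_request(audio_bytes: bytes, audio_format: str, prompt: str,
                       model: str = "google/gemini-2.5-flash") -> dict:
    """Build an OpenAI-shape chat completions body with one text and one audio part.

    Hypothetical helper for illustration; not part of any SDK.
    """
    # Assumption: "data" carries the raw file bytes as standard base64 text.
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "input_audio",
                 "input_audio": {"data": encoded, "format": audio_format}},
            ],
        }],
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real file read in binary mode.
    body = build_chat_request(b"\x00\x01\x02", "mp3",
                              "What is happening in this audio clip?")
    print(json.dumps(body, indent=2))
```

The resulting dictionary serializes to the same shape as the `-d` payload in the curl example above.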

Path 2: Native `/v1beta/` with `inline_data`

If you’re already on Gemini’s native protocol, pass `inline_data` directly; no translation is involved.

curl "https://api.clearmaas.com/v1beta/models/google/gemini-2.5-flash:generateContent" \
  -H "Authorization: Bearer sk-clearmaas-..." \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "role": "user",
      "parts": [
        {"text": "What is happening in this audio clip?"},
        {"inline_data": {"mime_type": "audio/mp3", "data": "<base64>"}}
      ]
    }]
  }'
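To see how the two request shapes relate, the translation described for Path 1 can be pictured with a short Python sketch. This is an illustration only, not the gateway's actual code: the function name `to_gemini_part` is hypothetical, and building the MIME type as `audio/` plus the format is an assumption that generalizes the two documented cases (mp3 and wav).

```python
def to_gemini_part(openai_part: dict) -> dict:
    """Translate one OpenAI-shape content part into a Gemini-shape part.

    Illustrative sketch; the real gateway logic may differ.
    """
    kind = openai_part["type"]
    if kind == "text":
        return {"text": openai_part["text"]}
    if kind == "input_audio":
        audio = openai_part["input_audio"]
        # Assumption: MIME type is "audio/" + format, matching the documented
        # examples mp3 -> audio/mp3 and wav -> audio/wav.
        return {"inline_data": {"mime_type": f"audio/{audio['format']}",
                                "data": audio["data"]}}
    # Other part types are outside the scope of this sketch.
    raise ValueError(f"unsupported content part type: {kind}")


if __name__ == "__main__":
    part = {"type": "input_audio",
            "input_audio": {"data": "<base64>", "format": "mp3"}}
    print(to_gemini_part(part))
```

Applied to the `input_audio` part from Path 1, the sketch yields the `inline_data` part shown in the Path 2 request.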

Supported model families

Gemini multimodal models accept inline audio, for example `google/gemini-2.5-flash` and the Gemini 3.x line. Behavior matches Google’s published Gemini API exactly.

Limits

Inline audio payloads are size-capped by the upstream provider. For longer files, the provider’s own File API (uploaded outside ClearMaas against the provider directly) is the typical workaround. Check Google’s Gemini API documentation for current size and duration limits.
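Because inline audio travels as base64, the encoded payload is roughly a third larger than the file on disk, which matters when checking against a size cap. A minimal Python sketch of that check follows. The function names are hypothetical, the limit is supplied by the caller rather than taken from this page, and whether the provider measures the raw or the encoded size is an assumption to verify against Google’s documentation.

```python
def base64_encoded_size(raw_size: int) -> int:
    """Return the length in bytes of the padded base64 encoding of raw_size bytes."""
    # Base64 emits 4 output characters for every 3 input bytes, rounding up.
    return 4 * ((raw_size + 2) // 3)


def fits_inline(raw_size: int, max_encoded_bytes: int) -> bool:
    """Check an audio file's encoded size against a caller-supplied limit.

    Assumption: the cap is compared against the base64-encoded size.
    """
    return base64_encoded_size(raw_size) <= max_encoded_bytes


if __name__ == "__main__":
    raw = 3_000_000  # example file size in bytes
    print(base64_encoded_size(raw))
```

If the check fails, the File API route mentioned above is the alternative to inline data.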
