# API reference

> Every endpoint, field and error — generated from the committed OpenAPI contract.

Base URL: `https://api.kalpalabs.ai`. Every request and response body is JSON. Authenticated
endpoints take an `Authorization: Bearer $KALPA_API_KEY` header (or an `X-API-Key` header). Every
error, on every endpoint, uses the same envelope:

```json
{ "error": { "type": "rate_limit_exceeded", "message": "…", "request_id": "…" } }
```
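
Because the envelope is the same everywhere, a client can decode errors in one place. The following is a minimal sketch, not part of the API: `ApiError` and `parse_error` are hypothetical names, and the code assumes the three fields shown above are always present.

```python
import json


class ApiError(Exception):
    """An error decoded from the documented envelope (hypothetical helper)."""

    def __init__(self, status: int, type_: str, message: str, request_id: str):
        super().__init__(f"{status} {type_}: {message} (request_id={request_id})")
        self.status = status          # HTTP status code of the response
        self.type = type_             # e.g. "rate_limit_exceeded"
        self.message = message
        self.request_id = request_id  # useful when reporting a problem


def parse_error(status: int, body: str) -> ApiError:
    """Decode the {"error": {...}} envelope from a raw response body."""
    err = json.loads(body)["error"]
    return ApiError(status, err["type"], err["message"], err["request_id"])
```

Branching on `type` rather than on `message` is likely the more robust choice, since `type` is a machine-readable identifier.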

## POST /v1/tts

**Synthesize speech from text.**

Render the given text as speech (24 kHz mono WAV) in the requested speaker's voice.

### Request — `TtsRequest`

| Field | Type | Default | Constraints | Description |
|---|---|---|---|---|
| `text` *(required)* | `string` |  | 1 – 8000 chars | Text to speak. |
| `model` | `string \| null` |  |  | Public model id (see GET /v1/models). Omit/null for the default model. |
| `params` | `GenParamsModel` |  |  |  |
| `params.depth_temperature` | `number \| null` |  | 0 – 1.5 | Acoustic temperature; null = follow `params.temperature`. |
| `params.max_new_tokens` | `integer` | `512` | 16 – 2048 |  |
| `params.penalty_window` | `integer` | `20` | 1 – 80 |  |
| `params.quantizers` | `integer \| null` |  | ≥ 1 | Decode only the first N RVQ levels; null = full depth. |
| `params.repetition_penalty` | `number` | `3` | 0 – 6 |  |
| `params.temperature` | `number` | `0.7` | 0 – 1.5 |  |
| `params.top_k` | `integer \| null` |  | ≥ 1 | Backbone top-k; null = full vocabulary. |
| `speaker` | `string` | `"0"` |  | Speaker role to render the text as (one of the model's `speakers`; see GET /v1/models). |

### Response 200 — `TtsResponse`

| Field | Type | Default | Constraints | Description |
|---|---|---|---|---|
| `audio` *(required)* | `AudioPayload` |  |  |  |
| `audio.data_b64` *(required)* | `string` |  |  | Base64-encoded 16-bit PCM WAV (mono). |
| `audio.num_quantizers` *(required)* | `integer` |  |  | Number of RVQ levels decoded into this audio. |
| `audio.sample_rate` *(required)* | `integer` |  |  | Sample rate of the audio in Hz. |
| `audio.format` | `string` | `"wav"` |  | Container/encoding of `data_b64` (16-bit PCM WAV). |
| `model` *(required)* | `string` |  |  |  |
| `request_id` *(required)* | `string` |  |  |  |
| `text` *(required)* | `string` |  |  | The text that was spoken (echoes the request). |
| `usage` *(required)* | `Usage` |  |  |  |
| `usage.input_audio_seconds` | `number` | `0` |  | Seconds of input audio supplied (converse). |
| `usage.input_chars` | `integer` | `0` |  | Characters of input text billed for this request. |
| `usage.output_audio_seconds` | `number` | `0` |  | Seconds of audio generated. |
| `meta` | `object` |  |  | Backend-specific diagnostics (latency, frames, …). |

Errors: `401`, `429`, `502` (+ `422` on schema violations); see [Rate limits & errors](/rate-limits-and-errors).

```bash
curl -s https://api.kalpalabs.ai/v1/tts \
  -H "Authorization: Bearer $KALPA_API_KEY" -H 'Content-Type: application/json' \
  -d '{"text": "Hey there! How are you doing today?", "speaker": "0"}'
```
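
The same call can be sketched in Python using only the standard library. The helper names (`build_tts_body`, `decode_audio`, `synthesize`) are illustrative and not part of the API. The length check assumes the server counts characters the way Python's `len` does, which the contract does not state.

```python
import base64
import json
import os
import urllib.request

TTS_URL = "https://api.kalpalabs.ai/v1/tts"


def build_tts_body(text: str, speaker: str = "0", **params) -> dict:
    """Build a TtsRequest body, checking the documented `text` length first."""
    if not 1 <= len(text) <= 8000:
        raise ValueError("text must be 1 to 8000 characters")
    body = {"text": text, "speaker": speaker}
    if params:
        body["params"] = params  # e.g. temperature=0.7, max_new_tokens=512
    return body


def decode_audio(response: dict) -> tuple[bytes, int]:
    """Return (wav_bytes, sample_rate) from a parsed TtsResponse."""
    audio = response["audio"]
    return base64.b64decode(audio["data_b64"]), audio["sample_rate"]


def synthesize(text: str, out_path: str = "out.wav") -> dict:
    """POST to /v1/tts and write the returned WAV file to `out_path`."""
    req = urllib.request.Request(
        TTS_URL,
        data=json.dumps(build_tts_body(text)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['KALPA_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    wav_bytes, _ = decode_audio(payload)
    with open(out_path, "wb") as fh:
        fh.write(wav_bytes)
    return payload
```

Since `data_b64` already holds a complete WAV file, the decoded bytes can be written to disk unchanged; no header needs to be added.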

## POST /v1/converse

**Complete the open (final) turn of a conversation.**

Given a conversation, complete its last (open) turn. If the open turn has only a `speaker`, the model authors it, producing both text and audio. If the open turn also has `text`, that text is rendered in the speaker's voice, conditioned on the prior turns (contextual TTS).

### Request — `ConverseRequest`

| Field | Type | Default | Constraints | Description |
|---|---|---|---|---|
| `conversation` *(required)* | `ConversationTurnModel[]` |  | 1 – 64 items | The conversation, oldest turn first; the last turn is the open turn to complete. |
| `conversation[].audio_wav_b64` | `string \| null` |  |  | Base64 16-bit PCM WAV of this turn's audio, if any. |
| `conversation[].speaker` | `string` | `"0"` |  | Role label for this turn (one of the model's `speakers`). |
| `conversation[].text` | `string \| null` |  | ≤ 8000 chars | Text spoken in this turn, if known. |
| `model` | `string \| null` |  |  | Public model id (see GET /v1/models). Omit/null for the default model. |
| `params` | `GenParamsModel` |  |  |  |
| `params.depth_temperature` | `number \| null` |  | 0 – 1.5 | Acoustic temperature; null = follow `params.temperature`. |
| `params.max_new_tokens` | `integer` | `512` | 16 – 2048 |  |
| `params.penalty_window` | `integer` | `20` | 1 – 80 |  |
| `params.quantizers` | `integer \| null` |  | ≥ 1 | Decode only the first N RVQ levels; null = full depth. |
| `params.repetition_penalty` | `number` | `3` | 0 – 6 |  |
| `params.temperature` | `number` | `0.7` | 0 – 1.5 |  |
| `params.top_k` | `integer \| null` |  | ≥ 1 | Backbone top-k; null = full vocabulary. |

### Response 200 — `ConverseResponse`

| Field | Type | Default | Constraints | Description |
|---|---|---|---|---|
| `model` *(required)* | `string` |  |  |  |
| `reply` *(required)* | `ConverseReply` |  |  |  |
| `reply.speaker` *(required)* | `string` |  |  |  |
| `reply.text` *(required)* | `string` |  |  |  |
| `reply.audio` | `AudioPayload \| null` |  |  |  |
| `reply.audio.data_b64` *(required)* | `string` |  |  | Base64-encoded 16-bit PCM WAV (mono). |
| `reply.audio.num_quantizers` *(required)* | `integer` |  |  | Number of RVQ levels decoded into this audio. |
| `reply.audio.sample_rate` *(required)* | `integer` |  |  | Sample rate of the audio in Hz. |
| `reply.audio.format` | `string` | `"wav"` |  | Container/encoding of `data_b64` (16-bit PCM WAV). |
| `request_id` *(required)* | `string` |  |  |  |
| `usage` *(required)* | `Usage` |  |  |  |
| `usage.input_audio_seconds` | `number` | `0` |  | Seconds of input audio supplied (converse). |
| `usage.input_chars` | `integer` | `0` |  | Characters of input text billed for this request. |
| `usage.output_audio_seconds` | `number` | `0` |  | Seconds of audio generated. |
| `meta` | `object` |  |  |  |

Errors: `401`, `429`, `502` (+ `422` on schema violations); see [Rate limits & errors](/rate-limits-and-errors).

```bash
curl -s https://api.kalpalabs.ai/v1/converse \
  -H "Authorization: Bearer $KALPA_API_KEY" -H 'Content-Type: application/json' \
  -d '{"conversation": [{"speaker": "0", "text": "Hi, who are you?"}, {"speaker": "1"}]}'
```
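
The two kinds of open turn differ only in whether the final turn carries `text`. As a rough sketch, the request body can be assembled as follows. `turn` and `build_converse_body` are hypothetical helpers, and omitting unset fields (rather than sending `null`) is an assumption that matches the curl example above.

```python
import base64
from typing import Optional


def turn(speaker: str, text: Optional[str] = None,
         wav_bytes: Optional[bytes] = None) -> dict:
    """One ConversationTurnModel; fields left as None are omitted."""
    t = {"speaker": speaker}
    if text is not None:
        if len(text) > 8000:
            raise ValueError("turn text must be at most 8000 characters")
        t["text"] = text
    if wav_bytes is not None:
        # wav_bytes is expected to be a 16-bit PCM WAV file's raw bytes.
        t["audio_wav_b64"] = base64.b64encode(wav_bytes).decode("ascii")
    return t


def build_converse_body(history: list, open_speaker: str,
                        open_text: Optional[str] = None) -> dict:
    """Append the open turn to `history` (oldest first) and wrap it."""
    conversation = [*history, turn(open_speaker, open_text)]
    if not 1 <= len(conversation) <= 64:
        raise ValueError("conversation must have 1 to 64 turns")
    return {"conversation": conversation}


# Speaker-only open turn: the model authors both text and audio.
authored = build_converse_body([turn("0", "Hi, who are you?")], "1")

# Open turn with text: rendered as speaker "1", conditioned on the history.
rendered = build_converse_body([turn("0", "Hi, who are you?")], "1",
                               "I'm a voice model.")
```

Here `authored` is the same body the curl example sends.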

## GET /v1/models

**List available public models.**

### Response 200 — `ModelsResponse`

| Field | Type | Default | Constraints | Description |
|---|---|---|---|---|
| `data` *(required)* | `ModelCard[]` |  |  | The available public models. |
| `data[].display_name` *(required)* | `string` |  |  | Human-readable model name. |
| `data[].id` *(required)* | `string` |  |  | Stable public model id used in the `model` request field. |
| `data[].modes` *(required)* | `string[]` |  |  | Supported modes: subset of ["converse", "tts"]. |
| `data[].speakers` *(required)* | `string[]` |  |  | Valid role labels for a turn's `speaker`, in turn order (e.g. ["0", "1"]). |
| `data[].default` | `boolean` | `false` |  | True for the model used when `model` is omitted. |
| `data[].description` | `string` | `""` |  | What this model is for. |

```bash
curl -s https://api.kalpalabs.ai/v1/models -H "Authorization: Bearer $KALPA_API_KEY"
```
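
A client may want to check a model's `modes` and `speakers` before sending a request. The sketch below works on an already-parsed `ModelsResponse`; the function names and the model ids in the sample data are invented for illustration.

```python
def default_model(models_response: dict) -> dict:
    """Return the ModelCard flagged `default` (the flag is false when absent)."""
    for card in models_response["data"]:
        if card.get("default", False):
            return card
    raise LookupError("no model is flagged as default")


def supports(card: dict, mode: str, speaker: str) -> bool:
    """True if the model offers `mode` and accepts `speaker` as a role label."""
    return mode in card["modes"] and speaker in card["speakers"]


# Sample data with hypothetical ids, shaped like the table above.
sample = {
    "data": [
        {"id": "example-a", "display_name": "Example A",
         "modes": ["tts"], "speakers": ["0"]},
        {"id": "example-b", "display_name": "Example B",
         "modes": ["converse", "tts"], "speakers": ["0", "1"], "default": True},
    ]
}

card = default_model(sample)
print(card["id"], supports(card, "converse", "1"))
```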

## GET /v1/info

**Backend info, default params, and limits.**

### Response 200 — `InfoResponse`

| Field | Type | Default | Constraints | Description |
|---|---|---|---|---|
| `backend` *(required)* | `object` |  |  | Active backend description (name, kind, sample_rate, …). |
| `defaults` *(required)* | `object` |  |  | Default generation params. |
| `limits` *(required)* | `object` |  |  | Request-validation caps the gateway enforces. |
| `param_schema` *(required)* | `object[]` |  |  | UI metadata for the generation knobs. |

```bash
curl -s https://api.kalpalabs.ai/v1/info -H "Authorization: Bearer $KALPA_API_KEY"
```

## GET /v1/usage

**Your metered usage.**

Running totals (requests, input characters, audio seconds) for the calling API key.

### Response 200 — `UsageSummaryResponse`

| Field | Type | Default | Constraints | Description |
|---|---|---|---|---|
| `input_audio_seconds` *(required)* | `number` |  |  |  |
| `input_chars` *(required)* | `integer` |  |  |  |
| `key_id` *(required)* | `string` |  |  |  |
| `output_audio_seconds` *(required)* | `number` |  |  |  |
| `requests` *(required)* | `integer` |  |  |  |
| `last_request_ts` | `number \| null` |  |  |  |

Errors: `401` (+ `422` on schema violations); see [Rate limits & errors](/rate-limits-and-errors).

```bash
curl -s https://api.kalpalabs.ai/v1/usage -H "Authorization: Bearer $KALPA_API_KEY"
```

## GET /health

**Liveness probe.** No authentication.

### Response 200 — `HealthResponse`

| Field | Type | Default | Constraints | Description |
|---|---|---|---|---|
| `backend` *(required)* | `string` |  |  |  |
| `ready` *(required)* | `boolean` |  |  |  |
| `status` | `string` | `"ok"` |  |  |

```bash
curl -s https://api.kalpalabs.ai/health
```
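
A deployment script might poll this endpoint before sending traffic. The sketch below is one way to do that. Treating `ready: true` together with `status: "ok"` as the readiness condition is an assumption, and `probe` is a caller-supplied function that returns the parsed `HealthResponse`.

```python
import time
from typing import Callable


def is_ready(health: dict) -> bool:
    """Interpret a HealthResponse; `status` defaults to "ok" when omitted."""
    return health.get("status", "ok") == "ok" and health["ready"] is True


def wait_until_ready(probe: Callable[[], dict], attempts: int = 10,
                     delay: float = 1.0) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay` seconds between tries."""
    for i in range(attempts):
        if is_ready(probe()):
            return True
        if i < attempts - 1:
            time.sleep(delay)
    return False
```

Passing the probe in as a function keeps the polling logic independent of any particular HTTP client.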
