Kalpa Speech API

Speech models that speak in context — one API for text-to-speech and multi-speaker conversation.

The Kalpa Speech API serves multi-speaker conversational speech models over plain HTTP. It does two things:

- Text-to-speech: `POST /v1/tts` turns text into spoken audio.
- Conversation: `POST /v1/converse` takes a conversation (a sequence of turns) and speaks the open (final) turn in the voice and rhythm of the conversation so far. The model either authors that turn outright, writing the text and voicing it, or renders text you supply.

There is one conversation shape and no special cases: TTS is just a one-turn conversation. Everything is JSON; audio crosses the wire as base64-encoded 16-bit PCM WAV (mono, 24 kHz).

```bash
export KALPA_API_KEY=...   # provisioned per team; see Authentication

curl -s https://api.kalpalabs.ai/v1/tts \
  -H "Authorization: Bearer $KALPA_API_KEY" -H 'Content-Type: application/json' \
  -d '{"text": "Hey there! How are you doing today?", "speaker": "0"}'
```
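Because audio comes back as base64-encoded 16-bit PCM WAV (mono, 24 kHz), a client has to decode it before playback. The sketch below is one way to do that in the shell. The helper names `b64_to_wav` and `wav_ms` are illustrative and not part of the API. This page does not name the JSON field that holds the audio, so the helper simply reads base64 text from stdin. The duration estimate assumes a canonical 44-byte WAV header, which may not hold if the file carries extra chunks.

```bash
# Decode base64 text from stdin into a WAV file and check the RIFF magic bytes.
# (Helper name is hypothetical; it is not part of the Kalpa API.)
b64_to_wav() {
  local out="$1" magic
  base64 --decode > "$out" || return 1
  magic=$(head -c 4 "$out")
  if [ "$magic" != "RIFF" ]; then
    echo "not a RIFF/WAV file: $out" >&2
    return 1
  fi
}

# Approximate duration in milliseconds of a 16-bit mono 24 kHz PCM WAV.
# Assumption: a plain 44-byte header followed only by sample data.
wav_ms() {
  local bytes
  bytes=$(wc -c < "$1")
  # 24000 samples per second * 2 bytes per sample = 48000 bytes per second.
  echo $(( (bytes - 44) * 1000 / (24000 * 2) ))
}
```

To use it, extract the audio field from the response with your JSON tool of choice (the field name is in the API reference) and pipe the base64 text into `b64_to_wav reply.wav`, then run `wav_ms reply.wav` as a quick sanity check on the length.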

At a glance

| Endpoint | What it does |
| --- | --- |
| `POST /v1/tts` | Text in, speech out |
| `POST /v1/converse` | Complete the open turn of a conversation |
| `GET /v1/models` | The public model registry |
| `GET /v1/info` | Backend info, default params, request limits |
| `GET /v1/usage` | Your key's metered usage |
| `GET /health` | Liveness (no auth) |

Base URL: https://api.kalpalabs.ai. All /v1/* endpoints require a key (see Authentication); every error comes back in one envelope (see Rate limits & errors).
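The auth rule above (a key for every /v1/* path, none for /health) can be captured in a small wrapper for the read-only endpoints. This is a minimal sketch: `kalpa_get` and `KALPA_BASE_URL` are hypothetical names chosen for illustration, and the response bodies are not shown because their shape is defined in the API reference, not on this page.

```bash
# Hypothetical convenience variable; the base URL itself is from this page.
KALPA_BASE_URL="https://api.kalpalabs.ai"

# GET a path. The bearer key is attached only for /v1/* paths,
# since /health is documented as needing no auth.
kalpa_get() {
  local path="$1"
  case "$path" in
    /v1/*)
      # Fail early with a message if the key is missing.
      : "${KALPA_API_KEY:?set KALPA_API_KEY first}"
      curl -sS -H "Authorization: Bearer $KALPA_API_KEY" "$KALPA_BASE_URL$path"
      ;;
    *)
      curl -sS "$KALPA_BASE_URL$path"
      ;;
  esac
}

# Usage (these make real requests):
#   kalpa_get /health
#   kalpa_get /v1/models
#   kalpa_get /v1/info
#   kalpa_get /v1/usage
```

Keeping the key out of the unauthenticated branch means a liveness probe never sends credentials it does not need.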

Built for agents

These docs assume your first reader may be a model. Every page is plain markdown at a stable URL — append .md to any path (this page is /index.md). The whole site is indexed in /llms.txt, concatenated in /llms-full.txt, and the full contract is machine-readable at /openapi.json — the same committed artifact the API reference and our clients are generated from. Point your agent at any of them.
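For an agent or script, the "append .md" rule can be turned into a small URL helper. The sketch below is illustrative only: `md_url` is a hypothetical name, and the docs host is not stated on this page, so `DOCS_BASE` is a placeholder you would set yourself. Mapping the root path to /index.md follows the note above; stripping a trailing slash from other paths is an assumption.

```bash
# Build the markdown URL for a docs page by appending .md to its path.
# Root ("/" or empty) maps to /index.md, as this page notes for itself.
md_url() {
  local base="${1%/}" path="${2%/}"
  if [ -z "$path" ]; then
    path="/index"
  fi
  printf '%s%s.md\n' "$base" "$path"
}

# Usage (DOCS_BASE is a placeholder; set it to the real docs host):
#   DOCS_BASE="https://docs.example.com"
#   curl -s "$(md_url "$DOCS_BASE" /)"       # this page as markdown
#   curl -s "$DOCS_BASE/llms.txt"            # site index
#   curl -s "$DOCS_BASE/llms-full.txt"       # whole site, concatenated
#   curl -s "$DOCS_BASE/openapi.json"        # machine-readable contract
```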

Getting access

The API is in early access. Write to [email protected] for a key, or try the models interactively first at studio.kalpalabs.ai.