Kalpa Speech API
Speech models that speak in context — one API for text-to-speech and multi-speaker conversation.
The Kalpa Speech API serves multi-speaker conversational speech models over plain HTTP. It does two things:
- Text-to-speech: POST /v1/tts turns text into spoken audio.
- Conversation: POST /v1/converse takes a conversation of turns and speaks the open (final) turn, either authoring it outright (the model writes text and voices it) or rendering text you supply, in the voice and rhythm of the conversation so far.
There is one conversation shape and no special cases: TTS is just a one-turn conversation. Everything is JSON; audio crosses the wire as base64-encoded 16-bit PCM WAV (mono, 24 kHz).
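To make the audio format concrete, here is a minimal Python sketch that decodes a base64 string into WAV bytes and reads the header with the standard library. This page does not name the response field that carries the audio, so the sketch takes the base64 string directly. `make_silence_b64` is a hypothetical stand-in that fabricates a silent clip so the example runs offline.

```python
import base64
import io
import wave


def decode_wav(audio_b64: str) -> dict:
    """Decode a base64-encoded WAV string and report its basic format."""
    wav_bytes = base64.b64decode(audio_b64)
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        info = {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }
    # Keep the raw bytes so the caller can write them to a .wav file as-is.
    return {"wav_bytes": wav_bytes, **info}


def make_silence_b64(seconds: float = 0.5) -> str:
    """Hypothetical stand-in payload: a silent mono 16-bit 24 kHz WAV, base64-encoded."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 16-bit samples
        wav.setframerate(24000)  # 24 kHz
        wav.writeframes(b"\x00\x00" * int(24000 * seconds))
    return base64.b64encode(buf.getvalue()).decode("ascii")
```

With a real response, you would pass the base64 string from the JSON body to `decode_wav` and write `wav_bytes` to disk. Checking the reported channel count, sample width, and sample rate against the values stated above is a cheap sanity test.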
export KALPA_API_KEY=... # provisioned per team — see Authentication
curl -s https://api.kalpalabs.ai/v1/tts \
-H "Authorization: Bearer $KALPA_API_KEY" -H 'Content-Type: application/json' \
-d '{"text": "Hey there! How are you doing today?", "speaker": "0"}'
At a glance
| Endpoint | What it does |
|---|---|
| POST /v1/tts | Text in, speech out |
| POST /v1/converse | Complete the open turn of a conversation |
| GET /v1/models | The public model registry |
| GET /v1/info | Backend info, default params, request limits |
| GET /v1/usage | Your key's metered usage |
| GET /health | Liveness (no auth) |
Base URL: https://api.kalpalabs.ai. All /v1/* endpoints require a key (Authentication); every error comes back in one envelope (Rate limits & errors).
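The base URL and key requirement translate to Python along these lines. This is a sketch using only the standard library, not an official client. `build_request` and `call` are hypothetical helper names, and the sketch assumes GET for requests without a body and POST for requests with one, matching the endpoint table.

```python
import json
import os
import urllib.request
from typing import Optional

BASE_URL = "https://api.kalpalabs.ai"


def build_request(path: str, api_key: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a GET (no payload) or JSON POST request with the bearer key attached."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    method = "GET" if payload is None else "POST"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def call(path: str, payload: Optional[dict] = None) -> dict:
    """Send the request and parse the JSON reply (needs network access and a real key)."""
    req = build_request(path, os.environ["KALPA_API_KEY"], payload)
    # urlopen raises urllib.error.HTTPError for non-2xx replies; the error
    # body can be read from that exception if you need the error envelope.
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

For example, `call("/v1/models")` would fetch the model registry, and `call("/v1/tts", {"text": "Hey there!", "speaker": "0"})` would mirror the curl request at the top of this page.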
Built for agents
These docs assume your first reader may be a model. Every page is plain markdown at a stable URL — append .md to any path (this page is /index.md). The whole site is indexed in /llms.txt, concatenated in /llms-full.txt, and the full contract is machine-readable at /openapi.json — the same committed artifact the API reference and our clients are generated from. Point your agent at any of them.
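For an agent that crawls these pages, the ".md" rule can be captured in a small helper. This is a sketch: `markdown_path` is a hypothetical name, and the trailing-slash handling is an assumption rather than documented behavior. It is meant for page paths only, not for files such as /llms.txt or /openapi.json.

```python
def markdown_path(path: str) -> str:
    """Map a docs page path to its plain-markdown counterpart."""
    trimmed = path.rstrip("/")
    if trimmed == "":
        return "/index.md"  # the landing page, as stated above
    if trimmed.endswith(".md"):
        return trimmed      # already a markdown path
    return trimmed + ".md"
```

Calling `markdown_path("/")` gives `/index.md`, and a page path such as `/authentication` (an illustrative path, not confirmed by this page) would map to `/authentication.md`.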
Getting access
The API is in early access. Write to [email protected] for a key, or try the models interactively first at studio.kalpalabs.ai.