KancyAI Docs
Build with every frontier model through one stable API.
KancyAI provides an OpenAI-compatible gateway for text, image, video, and voice models. Use one API key, one base URL, and one operational layer for model switching, usage control, and billing visibility.
Base URL: https://api.kancy.ai/v1
Quickstart
If your application already uses the OpenAI SDK, point the client at the KancyAI base URL
and replace the API key. You can then choose any provider-specific model such as
gpt-4o or claude-3-opus.
curl https://api.kancy.ai/v1/chat/completions \
  -H "Authorization: Bearer $KANCY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Draft a concise API launch note"}
    ]
  }'
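For readers who prefer Python to curl, the same request can be sketched with only the standard library. This is a minimal illustration, not an official client: the helper names `build_chat_request` and `send` are invented for this example, and only the base URL, path, headers, and body fields come from this page.

```python
import json
import urllib.request

BASE_URL = "https://api.kancy.ai/v1"  # base URL given on this page


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /chat/completions."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would then look like `send(build_chat_request(os.environ["KANCY_API_KEY"], "gpt-4o", "Draft a concise API launch note"))`, with `os` imported by the caller.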
Authentication
Every request uses bearer-token authentication. Create separate API keys for production, staging, and local development so usage and permissions remain easy to audit.
Authorization: Bearer $KANCY_API_KEY
Project keys
Scope keys to a product, environment, or team so quotas and spend can be managed independently.
Rotation
Rotate keys without changing provider credentials inside your application code.
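One way to keep per-environment keys separate in application code is to resolve the key from an environment variable chosen by deployment stage. The sketch below assumes three variable names (`KANCY_API_KEY_PROD`, `KANCY_API_KEY_STAGING`, `KANCY_API_KEY_LOCAL`) that are purely illustrative; KancyAI does not prescribe them.

```python
import os

# Hypothetical variable names: one key per environment, as recommended above.
KEY_VARS = {
    "production": "KANCY_API_KEY_PROD",
    "staging": "KANCY_API_KEY_STAGING",
    "local": "KANCY_API_KEY_LOCAL",
}


def auth_headers(environment: str, env=None) -> dict:
    """Return the bearer-token header for the given environment."""
    env = os.environ if env is None else env
    if environment not in KEY_VARS:
        raise ValueError(f"unknown environment: {environment!r}")
    var_name = KEY_VARS[environment]
    key = env.get(var_name)
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    return {"Authorization": f"Bearer {key}"}
```

Because the key is read at call time, rotating it only requires updating the variable's value, not the code.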
Chat Completions
Use chat completions for reasoning, summarization, extraction, tool planning, and agent
workflows. The endpoint supports provider-specific models — switch between them by
changing the model field in each request.
POST /v1/chat/completions
{
  "model": "claude-3-opus",
  "messages": [
    {"role": "system", "content": "You are a precise technical assistant."},
    {"role": "user", "content": "Compare these two vendor contracts."}
  ],
  "temperature": 0.2
}
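The request body above can be assembled in code, and the reply pulled out of the response. The sketch below assumes the response follows the OpenAI chat completions shape (`choices[0].message.content`), which is plausible for an OpenAI-compatible gateway but is not spelled out on this page. Both function names are illustrative.

```python
def chat_payload(model: str, system: str, user: str, temperature: float = 0.2) -> dict:
    """Build a chat completions body with one system and one user message."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "temperature": temperature,
    }


def first_reply(response: dict) -> str:
    """Extract the assistant text, assuming an OpenAI-style response shape."""
    return response["choices"][0]["message"]["content"]
```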
Image Generation
Generate product visuals, article covers, marketing assets, and design references through a unified media endpoint. Choose an image model from a provider such as Google, OpenAI, MiniMax, or ByteDance by specifying its model ID, for example google-image.
POST /v1/images/generations
{
  "model": "google-image",
  "prompt": "A clean enterprise API dashboard, realistic product UI",
  "size": "1536x1024"
}
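A small builder can catch malformed `size` values before the request leaves the client. This sketch only checks the `WIDTHxHEIGHT` format seen in the example above; which sizes each model actually accepts is not documented here, so that check is left to the gateway. The function name is illustrative.

```python
import re

# Matches strings such as "1536x1024"; the set of supported sizes is unknown.
_SIZE_PATTERN = re.compile(r"\d+x\d+")


def image_payload(model: str, prompt: str, size: str = "1536x1024") -> dict:
    """Build an image generation body after a basic format check on size."""
    if not _SIZE_PATTERN.fullmatch(size):
        raise ValueError(f"size must look like WIDTHxHEIGHT, got {size!r}")
    return {"model": model, "prompt": prompt, "size": size}
```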
Video Generation
Video generation is asynchronous. Create a job, store the returned job ID, then poll the job status or configure a webhook once your workspace enables callbacks.
POST /v1/videos/generations
{
  "model": "kling-video",
  "prompt": "A calm walkthrough of a global model gateway network",
  "duration": 6,
  "aspect_ratio": "16:9"
}
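The create-then-poll flow could be sketched as a loop like the one below. The status endpoint and the job's field names are not documented on this page, so the caller supplies a `fetch_status` function, and the `status` field with values `succeeded` and `failed` is an assumption made for illustration.

```python
import time
from typing import Callable

# Assumed terminal status values; check the real job schema before relying on them.
TERMINAL_STATUSES = {"succeeded", "failed"}


def wait_for_job(
    job_id: str,
    fetch_status: Callable[[str], dict],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll fetch_status(job_id) until the job reaches a terminal status."""
    deadline = clock() + timeout
    while True:
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop easy to test; in production the defaults are enough.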
Model Switching
KancyAI does not perform automatic dynamic routing. Instead, you stay in full control:
every request runs against the exact model you specify. Switch between providers and
model families at any time by changing the value of the model field. No other code
changes, provider SDK swaps, or fallback configuration are required.
Explicit selection
Pass a provider-specific model ID such as gpt-4o or claude-3-opus to target the exact model you want.
Switch on demand
Change the model ID from one request to the next to compare results, control cost, or match each task to the best model.
One consistent API
Request and response shapes stay the same across every provider, so switching models never breaks your application code.
No vendor lock-in
Keep a single base URL and API key. Add or remove models from your stack without re-wiring authentication or billing.
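To illustrate switching on demand, the sketch below builds one request body per model for a side-by-side comparison. Only the `model` value differs between the bodies. The function name is illustrative, and sending the bodies is left to whatever HTTP client the application already uses.

```python
def payloads_for_models(models: list, messages: list, **params) -> list:
    """Return one chat completions body per model ID, identical except for model."""
    return [
        {"model": model_id, "messages": list(messages), **params}
        for model_id in models
    ]
```

For example, `payloads_for_models(["gpt-4o", "claude-3-opus"], messages, temperature=0.2)` yields two bodies that can be posted to the same endpoint with the same key.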
Model IDs
Use direct model IDs to choose the exact provider and model for each request. Switch
between models at any time by changing the model field — KancyAI keeps
the request and response format identical across every provider.
| Family | Example IDs | Common use |
|---|---|---|
| Text and reasoning | gpt-4o, claude-3-opus, gemini-1.5-pro, qwen-max, kimi-k2, doubao-pro | Chat, analysis, extraction, agent workflows |
| Image | google-image, openai-image, minimax-image, bytedance-image | Creative generation, product visuals, covers |
| Video | minimax-video, kling-video, bytedance-video, veo | Short-form video, demos, multimodal campaigns |
| Voice | whisper-asr, openai-tts | Speech recognition, dubbing, voice agents |
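The table can be mirrored in client code to pick the right endpoint for a model ID. The sketch below uses only the example IDs and the three endpoints shown on this page; the real catalog is likely larger, and no voice endpoint is documented here, so voice models map to `None`. All names in the block are illustrative.

```python
from typing import Optional

# Example IDs copied from the table above; not an exhaustive catalog.
MODEL_FAMILIES = {
    "text": ("gpt-4o", "claude-3-opus", "gemini-1.5-pro", "qwen-max", "kimi-k2", "doubao-pro"),
    "image": ("google-image", "openai-image", "minimax-image", "bytedance-image"),
    "video": ("minimax-video", "kling-video", "bytedance-video", "veo"),
    "voice": ("whisper-asr", "openai-tts"),
}

# Endpoints shown on this page; the voice endpoint is not documented here.
ENDPOINTS = {
    "text": "/v1/chat/completions",
    "image": "/v1/images/generations",
    "video": "/v1/videos/generations",
}


def family_of(model_id: str) -> str:
    """Return the family a known example model ID belongs to."""
    for family, ids in MODEL_FAMILIES.items():
        if model_id in ids:
            return family
    raise KeyError(f"unknown model ID: {model_id!r}")


def endpoint_for(model_id: str) -> Optional[str]:
    """Return the documented endpoint path, or None if this page lists none."""
    return ENDPOINTS.get(family_of(model_id))
```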
Errors
KancyAI returns consistent HTTP status codes across providers. Provider-specific details are normalized into stable error categories whenever possible.
| Status | Meaning | Recommended action |
|---|---|---|
| 400 | Invalid request | Check required fields, model IDs, and media parameters. |
| 401 | Invalid API key | Verify the bearer token and workspace access. |
| 429 | Rate limited | Retry with backoff or request a quota adjustment. |
| 500 | Gateway or provider failure | Switch to another model ID and inspect request logs. |
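The recommended actions in the table could be combined into a client-side helper along these lines. This is a sketch under stated assumptions: `send` is a caller-supplied function returning `(status, body)`, the exponential backoff schedule is one common choice rather than a KancyAI requirement, and any status of 500 or above is treated like the 500 row. Since the gateway does not route automatically, the fallback list is managed by the client.

```python
import time
from typing import Callable, Sequence, Tuple


class GatewayError(Exception):
    """Raised when a request fails and no recovery step applies."""

    def __init__(self, status: int, body: dict):
        super().__init__(f"HTTP {status}")
        self.status = status
        self.body = body


def call_with_recovery(
    send: Callable[[dict], Tuple[int, dict]],
    payload: dict,
    fallback_models: Sequence[str] = (),
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Retry 429 with exponential backoff; on 5xx, try the next model ID."""
    last_error = None
    for model_id in [payload["model"], *fallback_models]:
        attempt_payload = {**payload, "model": model_id}
        for attempt in range(max_retries + 1):
            status, body = send(attempt_payload)
            if status == 429 and attempt < max_retries:
                sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s with the defaults
                continue
            break
        if status < 400:
            return body
        last_error = GatewayError(status, body)
        if status < 500:
            # 400, 401, or 429 after all retries: another model will not help.
            raise last_error
        # 5xx: fall through and try the next model ID, if any.
    raise last_error
```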
Rate Limits and Usage
Limits can be configured per workspace, project, API key, and modality. Enterprise workspaces can set spend caps, per-team quotas, and alert thresholds for production usage.