Guide

Use Cadreen as a Model Provider

Cadreen exposes a standard OpenAI-compatible endpoint at /api/v1/cadreen/chat/completions. Point any OpenAI client at it and every request gets Cadreen's tool governance, persistent memory, and intelligence traces.

1. Quick start

POST /api/v1/cadreen/chat/completions
Python
from openai import OpenAI

client = OpenAI(
    base_url="https://accomplishanything.today/api/v1/cadreen",
    api_key="sk_cadreen_..."
)

response = client.chat.completions.create(
    model="cadreen",
    messages=[{"role": "user", "content": "What connectors do I have?"}]
)
print(response.choices[0].message.content)
TypeScript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://accomplishanything.today/api/v1/cadreen",
  apiKey: "sk_cadreen_...",
});

const response = await client.chat.completions.create({
  model: "cadreen",
  messages: [{ role: "user", content: "What connectors do I have?" }],
});
console.log(response.choices[0].message.content);
curl
curl -X POST https://accomplishanything.today/api/v1/cadreen/chat/completions \
  -H "Authorization: Bearer sk_cadreen_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "cadreen",
    "messages": [{"role": "user", "content": "What connectors do I have?"}],
    "stream": true
  }'

2. Request fields

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| model | string | No | Accepted for compatibility, ignored internally |
| messages | array | Yes | OpenAI message format |
| stream | bool | No | Enable streaming SSE |
| tools | array | No | Tool/function definitions |
| conversation_id | string | No | Track conversations across requests |
| context | object | No | Cadreen-specific context |
Note
Ignored fields: temperature, top_p, max_tokens, frequency_penalty, presence_penalty, stop, n, logprobs, response_format — Cadreen controls generation internally.
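
As an illustration of the field rules above, the following sketch splits request fields into the keyword arguments the OpenAI Python SDK accepts directly and an extra_body dict for the Cadreen-specific ones. The SDK's create() method does not accept unknown keyword arguments, but it does provide extra_body for additional JSON properties. The helper name build_create_kwargs and the ID value conv_123 are hypothetical. Whether Cadreen reads conversation_id and context from the top level of the request body is an assumption based on the table, not something this guide confirms.

```python
from typing import Any

# Field names taken from the request-fields table and the ignored-fields note.
STANDARD_FIELDS = {"model", "messages", "stream", "tools"}
CADREEN_FIELDS = {"conversation_id", "context"}
IGNORED_FIELDS = {
    "temperature", "top_p", "max_tokens", "frequency_penalty",
    "presence_penalty", "stop", "n", "logprobs", "response_format",
}


def build_create_kwargs(**fields: Any) -> dict[str, Any]:
    """Split request fields into SDK keyword arguments plus an extra_body dict."""
    if "messages" not in fields:
        raise ValueError("messages is required")
    kwargs: dict[str, Any] = {"model": "cadreen"}
    extra_body: dict[str, Any] = {}
    for name, value in fields.items():
        if name in IGNORED_FIELDS:
            continue  # dropped client-side: the guide says these have no effect
        if name in CADREEN_FIELDS:
            extra_body[name] = value  # assumed to be read from the top-level body
        elif name in STANDARD_FIELDS:
            kwargs[name] = value
        else:
            raise ValueError(f"unknown field: {name}")
    if extra_body:
        kwargs["extra_body"] = extra_body
    return kwargs


kwargs = build_create_kwargs(
    messages=[{"role": "user", "content": "What connectors do I have?"}],
    conversation_id="conv_123",  # hypothetical ID; the real format is not documented here
    temperature=0.2,             # silently dropped
)
```

With a client configured as in the quick start, the result could then be passed as client.chat.completions.create(**kwargs).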
3. Streaming

Set stream: true to get SSE chunks. Cadreen sends keepalive comments every 15 seconds during long operations.

Streaming example
stream = client.chat.completions.create(
    model="cadreen",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)
for chunk in stream:
    # Print each text delta as it arrives; skip chunks that carry no content.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
Note
Chain-of-thought (<think>...</think>) is stripped from every response — streaming and non-streaming. You never see the model's internal reasoning.
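
If you read the stream without an OpenAI SDK, the keepalive comments need to be skipped by hand. The sketch below parses raw SSE lines under a few assumptions: keepalives arrive as standard SSE comment lines (lines starting with a colon), data lines carry OpenAI-style chunk JSON, and the stream ends with a data: [DONE] line as OpenAI's does. The function name iter_content and the sample lines are illustrative, not captured Cadreen output.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from raw SSE lines, skipping keepalive comments."""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(":"):
            continue  # blank event separator or SSE comment (keepalive)
        if not line.startswith("data:"):
            continue  # other SSE fields (event:, id:) are not needed here
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed OpenAI-style end-of-stream marker
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Illustrative input only; real chunks carry more fields.
sample = [
    ": keepalive",
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    ": keepalive",
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
text = "".join(iter_content(sample))
```

Because keepalives only arrive every 15 seconds, a client read timeout comfortably above that interval is a reasonable starting point for long operations.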
4. Tool calling

Cadreen supports hybrid tool execution: tools you define in the tools array pass through to your client, while Cadreen's own tools execute server-side under governance.

Python
response = client.chat.completions.create(
    model="cadreen",
    messages=[{"role": "user", "content": "Read main.go and summarize it"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "read",
            "description": "Read a file",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "File path"}
                },
                "required": ["path"]
            }
        }
    }]
)
Note
See the Tool Calling guide for the full hybrid execution flow.
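
The client side of the pass-through half might look like the sketch below. It assumes pass-through calls come back in the standard OpenAI tool_calls shape on the assistant message, which this guide implies but does not spell out. The read implementation, the LOCAL_TOOLS registry, and the sample message are hypothetical stand-ins.

```python
import json
from typing import Any, Callable


def read(path: str) -> str:
    """Hypothetical local implementation of the `read` tool."""
    return f"<contents of {path}>"


# Registry mapping tool names from the request to local handlers.
LOCAL_TOOLS: dict[str, Callable[..., str]] = {"read": read}


def run_tool_calls(message: dict[str, Any]) -> list[dict[str, Any]]:
    """Execute pass-through tool calls and return `tool` role messages."""
    results = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        handler = LOCAL_TOOLS.get(fn["name"])
        if handler is None:
            output = f"error: unknown tool {fn['name']}"
        else:
            # Arguments arrive as a JSON-encoded string in the OpenAI format.
            args = json.loads(fn["arguments"] or "{}")
            output = handler(**args)
        results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": output}
        )
    return results


# Illustrative assistant message in OpenAI format, not real Cadreen output.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "read", "arguments": "{\"path\": \"main.go\"}"},
    }],
}
tool_messages = run_tool_calls(assistant_message)
```

To continue the exchange, append the assistant message and the returned tool messages to the messages list and send a follow-up request, as you would against OpenAI.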
5. Response metadata

Every response includes Cadreen-specific metadata alongside the OpenAI-compatible fields.

Response
{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "You have 3 active connectors.",
      "metadata": {
        "confidence": 0.92,
        "cadreen_type": "direct",
        "classification": { "intent": "question", "complexity": "simple" },
        "assessment": { "can_do": 0.85, "assessment_quality": "complete" },
        "governance": { "decision": "auto", "approved": true }
      }
    }
  }],
  "intelligence": {
    "reasoning": { "intent": "question", "approach": "direct_answer" },
    "capability": { "can_do": 0.85 },
    "governance": { "active": true, "decision": "auto" }
  }
}
| Field | Description |
| --- | --- |
| metadata.confidence | Classification confidence (0-1) |
| metadata.cadreen_type | Routing: direct, clarify, mission, blocked |
| metadata.classification | Intent, complexity, confidence |
| metadata.assessment | can_do, assessment_quality, gap_count |
| metadata.governance | Decision, approved |
| intelligence | Full reasoning trace (capability, governance, memory, humility) |
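
One way to act on this metadata is to branch on the routing type and confidence before showing an answer. The sketch below works on the parsed JSON body, using a trimmed copy of the sample response above. The helper names and the 0.8 threshold are assumptions for illustration, and how a given OpenAI SDK exposes these extra fields on its typed objects is not covered here.

```python
import json
from typing import Any

# Trimmed copy of the sample response shown above.
raw = """
{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "You have 3 active connectors.",
      "metadata": {
        "confidence": 0.92,
        "cadreen_type": "direct",
        "governance": { "decision": "auto", "approved": true }
      }
    }
  }]
}
"""


def summarize(response: dict[str, Any]) -> dict[str, Any]:
    """Pull the routing-related metadata out of a parsed response body."""
    message = response["choices"][0]["message"]
    meta = message.get("metadata") or {}
    governance = meta.get("governance") or {}
    return {
        "route": meta.get("cadreen_type"),
        "confidence": meta.get("confidence"),
        "approved": governance.get("approved"),
    }


def should_surface(summary: dict[str, Any], min_confidence: float = 0.8) -> bool:
    """Show the answer only for approved, direct, confident responses.

    The 0.8 default is an arbitrary illustrative threshold, not a Cadreen value.
    """
    return (
        summary["route"] == "direct"
        and summary["approved"] is True
        and (summary["confidence"] or 0.0) >= min_confidence
    )


summary = summarize(json.loads(raw))
surface = should_surface(summary)
```

Responses routed as clarify, mission, or blocked would fall through this check, leaving the application free to handle each of those routes separately.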
6. What's different from OpenAI

Tool governance: Every server-side tool call is evaluated against policies
Memory: Four types of persistent memory across conversations
Intelligence traces: Full reasoning trace in every response
7. Limits

Tool calls are buffered during streaming (no incremental deltas)
n is always 1 (one completion per request)
response_format is not supported (the field is ignored if sent)
Per-request token limits are managed internally
Note
Next: Governance Policies 101 — how to control what Cadreen is allowed to do.