Guide

Use Cadreen as a Model Provider

Cadreen exposes an OpenAI-compatible chat completions endpoint at /api/v1/cadreen/chat/completions. Point any OpenAI client at it and every request runs through Cadreen's governance, memory, and intelligence layers.

Quick start

POST /api/v1/cadreen/chat/completions

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://accomplishanything.today/api/v1/cadreen",
    api_key="sk_cadreen_..."
)

response = client.chat.completions.create(
    model="cadreen",
    messages=[{"role": "user", "content": "What connectors do I have?"}]
)
print(response.choices[0].message.content)

TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://accomplishanything.today/api/v1/cadreen",
  apiKey: "sk_cadreen_...",
});

const response = await client.chat.completions.create({
  model: "cadreen",
  messages: [{ role: "user", content: "What connectors do I have?" }],
});
console.log(response.choices[0].message.content);

curl

# -N turns off curl's output buffering so streamed chunks print as they arrive.
curl -N -X POST https://accomplishanything.today/api/v1/cadreen/chat/completions \
  -H "Authorization: Bearer sk_cadreen_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "cadreen",
    "messages": [{"role": "user", "content": "What connectors do I have?"}],
    "stream": true
  }'

Request fields

Field | Type | Required | Notes
model | string | No | Accepted for compatibility, ignored internally
messages | array | Yes | OpenAI message format
stream | bool | No | Enable streaming SSE
tools | array | No | Tool/function definitions
conversation_id | string | No | Track conversations across requests
context | object | No | Cadreen-specific context

Note

Ignored fields: temperature, top_p, max_tokens, frequency_penalty, presence_penalty, stop, n, logprobs, response_format — Cadreen controls generation internally.
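As a rough sketch of how the accepted and ignored fields fit together, the helper below builds a request body and reports which of the ignored parameters were dropped. The name build_request and its return shape are illustrative assumptions, not part of any Cadreen SDK.

```python
# Fields this guide lists as ignored by Cadreen.
IGNORED_FIELDS = {
    "temperature", "top_p", "max_tokens", "frequency_penalty",
    "presence_penalty", "stop", "n", "logprobs", "response_format",
}


def build_request(messages, *, stream=False, tools=None,
                  conversation_id=None, context=None, **extra):
    """Return (body, dropped) for a chat completions request.

    Hypothetical helper: `body` holds only fields the guide documents,
    `dropped` lists ignored parameters the caller passed anyway.
    """
    if not messages:
        raise ValueError("messages is required")
    unknown = sorted(k for k in extra if k not in IGNORED_FIELDS)
    if unknown:
        raise TypeError(f"unknown fields: {unknown}")

    body = {"model": "cadreen", "messages": list(messages)}
    if stream:
        body["stream"] = True
    if tools:
        body["tools"] = list(tools)
    if conversation_id is not None:
        body["conversation_id"] = conversation_id
    if context is not None:
        body["context"] = context

    dropped = sorted(extra)  # everything left in extra is an ignored field
    return body, dropped
```

With the OpenAI Python SDK, non-standard fields such as conversation_id and context can be sent through the extra_body argument of client.chat.completions.create(). The shape of the context object is not specified in this guide, so check the API reference before relying on particular keys.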

Streaming

Set stream: true to get SSE chunks. Cadreen sends keepalive comments every 15 seconds during long operations.

Streaming example

stream = client.chat.completions.create(
    model="cadreen",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)
for chunk in stream:
    # Some chunks may carry no choices or an empty delta; skip those.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
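If you read the stream without an OpenAI client, the keepalive comments need to be skipped explicitly. The sketch below parses decoded SSE lines with the standard library only. It assumes one data: line per event and an OpenAI-style data: [DONE] terminator; neither detail is confirmed by this guide, and iter_sse_content is a hypothetical name.

```python
import json


def iter_sse_content(lines):
    """Yield content deltas from an iterable of decoded SSE lines."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Blank lines separate events; lines starting with ":" are SSE
        # comments, which is how keepalives arrive.
        if not line or line.startswith(":"):
            continue
        if not line.startswith("data:"):
            continue  # ignore other SSE fields such as "event:" or "id:"
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed OpenAI-style end-of-stream marker
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices") or []:
            content = (choice.get("delta") or {}).get("content")
            if content:
                yield content
```

Because keepalives arrive every 15 seconds, a client read timeout comfortably above that interval avoids dropping the connection during long operations.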

Note

Chain-of-thought (<think>...</think>) is stripped from every response — streaming and non-streaming. You never see the model's internal reasoning.

Tool calling

Cadreen supports hybrid tool execution: tools you define in the request pass through, while Cadreen's own tools execute server-side under governance policies.

Python

response = client.chat.completions.create(
    model="cadreen",
    messages=[{"role": "user", "content": "Read main.go and summarize it"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "read",
            "description": "Read a file",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "File path"}
                },
                "required": ["path"]
            }
        }
    }]
)

Note

See the Tool Calling guide for the full hybrid execution flow.
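A minimal sketch of the client side of that flow, assuming a passed-through tool call comes back in OpenAI's standard tool_calls shape (an id plus a function name and JSON-encoded arguments). The guide does not show that response, so treat the shape as an assumption; run_tool_calls and LOCAL_TOOLS are hypothetical names.

```python
import json


def read(path):
    """Local implementation of the `read` tool declared in the request."""
    with open(path, encoding="utf-8") as f:
        return f.read()


# Maps tool names from the request to local callables.
LOCAL_TOOLS = {"read": read}


def run_tool_calls(message, tools=LOCAL_TOOLS):
    """Execute each tool call in an assistant message dict.

    Returns a list of tool-role messages to append to the conversation
    before sending the follow-up request.
    """
    results = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        handler = tools.get(fn["name"])
        if handler is None:
            output = f"error: unknown tool {fn['name']}"
        else:
            args = json.loads(fn.get("arguments") or "{}")
            output = handler(**args)
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": str(output),
        })
    return results
```

The returned messages would go back to Cadreen together with the original assistant message in the next messages array, as in the usual OpenAI tool loop.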

Response metadata

Every response includes Cadreen-specific metadata alongside the OpenAI-compatible fields.

Response

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "You have 3 active connectors.",
      "metadata": {
        "confidence": 0.92,
        "cadreen_type": "direct",
        "classification": { "intent": "question", "complexity": "simple" },
        "assessment": { "can_do": 0.85, "assessment_quality": "complete" },
        "governance": { "decision": "auto", "approved": true }
      }
    }
  }],
  "intelligence": {
    "reasoning": { "intent": "question", "approach": "direct_answer" },
    "capability": { "can_do": 0.85 },
    "governance": { "active": true, "decision": "auto" }
  }
}

Field | Description
metadata.confidence | Classification confidence (0-1)
metadata.cadreen_type | Routing: direct, clarify, mission, blocked
metadata.classification | Intent, complexity, confidence
metadata.assessment | can_do, assessment_quality, gap_count
metadata.governance | Decision, approved
intelligence | Full reasoning trace (capability, governance, memory, humility)
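One way to consume these fields is to flatten the metadata into a small summary and apply a client-side rule to it. The sketch below works on the response as a plain dict, so it applies to raw JSON from any HTTP client. The function names, the 0.5 threshold, and the display rule are illustrative assumptions, not documented Cadreen behavior.

```python
def summarize_metadata(response):
    """Pull the Cadreen-specific fields out of a response dict.

    Missing fields come back as None rather than raising.
    """
    message = response["choices"][0]["message"]
    meta = message.get("metadata") or {}
    governance = meta.get("governance") or {}
    assessment = meta.get("assessment") or {}
    return {
        "content": message.get("content"),
        "route": meta.get("cadreen_type"),
        "confidence": meta.get("confidence"),
        "approved": governance.get("approved"),
        "gap_count": assessment.get("gap_count"),
    }


def should_display(summary, min_confidence=0.5):
    """Hypothetical policy: show only approved, direct, confident answers."""
    return (
        summary["route"] == "direct"
        and summary["approved"] is True
        and (summary["confidence"] or 0) >= min_confidence
    )
```

Applied to the sample response above, summarize_metadata reports the route as direct with confidence 0.92, so should_display accepts it at the default threshold. A response without a metadata object is rejected.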

What's different from OpenAI

Tool governance — Every server-side tool call evaluated against policies

Memory — 4 types of persistent memory across conversations

Intelligence traces — Full reasoning trace in every response

Limits

Tool calls are buffered during streaming (no incremental deltas)

n is always 1 (one completion per request)

response_format is not supported

Per-request token limits are managed internally

Note

Next: Governance Policies 101 — how to control what Cadreen is allowed to do.