Point your ownify agent at your Ollama Cloud, OpenAI, Groq, or any OpenAI-compatible endpoint. ownify never sees prompts or completions for BYO calls. Spend caps and retention are governed by your provider.
ownify still bills its standard plans (Solo, Pro, etc.) — those cover the agent runtime, memory subsystem, skills, Matrix bot, audit, portal, and storage. BYO replaces only the LLM layer.
The portal dropdown lists four options. All endpoints must be OpenAI-compatible (i.e. accept POST /v1/chat/completions with the standard OpenAI request schema).
| Provider | Endpoint | Get a key |
|---|---|---|
| Ollama Cloud | https://ollama.com/v1 | ollama.com/settings/keys ↗ |
| OpenAI | https://api.openai.com/v1 | platform.openai.com/api-keys ↗ |
| OpenRouter | https://openrouter.ai/api/v1 | openrouter.ai/settings/keys ↗ |
| Groq | https://api.groq.com/openai/v1 | console.groq.com/keys ↗ |
| Custom | any OpenAI-compatible URL | Together, self-hosted Ollama, TGI… |
Anthropic’s native API isn’t directly supported in v1 (microclaw expects OpenAI-shaped requests). Use Anthropic via OpenRouter as a Custom endpoint for now.
In the portal: Dashboard → your agent → LLM. Pick a provider from the dropdown, paste your API key, set the default model id, save. The agent pod restarts in ~15 seconds and begins routing every chat call to your provider.
For Ollama, the default model id is whatever your subscription has access to — e.g. kimi-k2.6:cloud, qwen2.5:32b. For OpenAI it’s the model id you’d pass in any chat-completions request, e.g. gpt-4o-mini.
To revert to ownify-managed (Fireworks via ownify-router), pick that option from the dropdown and save. The four BYO secret keys are wiped and the pod restarts.
OpenRouter is a meta-provider — one API key, hundreds of models behind OpenAI-compatible request shapes. The auto-router classifies each request into one of six categories; here are reasonable defaults to assign in the model editor.
| Category | Recommended model id | Notes |
|---|---|---|
| Fast | openai/gpt-oss-20b | MoE, very low TTFT, used for the classifier itself |
| Balanced | qwen/qwen3-30b-a3b-instruct | 3B active params, broad capability |
| Reasoning | deepseek/deepseek-r1-distill-llama-70b | or anthropic/claude-haiku-4-5 if you want hosted Claude |
| Long context | google/gemini-2.5-flash | 1M token context, fast |
| Vision | qwen/qwen3-vl-235b-a22b-instruct | strong VL model, MoE |
| Code | qwen/qwen3-coder-30b | or anthropic/claude-sonnet-4-6 for top-tier review |
These are starting points — OpenRouter’s catalog evolves quickly. Consult openrouter.ai/models ↗ for current pricing, latency, and ZDR availability per route.
OpenRouter is a meta-provider: prompts may pass through different upstream model providers depending on the model id you select. Some routes support ZDR (zero-data-retention) — opt in at openrouter.ai/settings/privacy ↗.
With BYO active, every LLM call — including the auto-router classifier call that picks a category for each request — flows through your dedicated, per-tenant ownify-router pod directly to your provider. The shared platform LiteLLM is not in the path: no shared ownify component sees prompts or completions.
| ownify still records | ownify does NOT see |
|---|---|
| Audit metadata (which channel triggered a call, when, the agent slug) Skill execution + memory ACL events Pod health + restarts | Prompt content Completion content Token counts (your provider’s dashboard has these) Spend (ownify can’t enforce caps for BYO providers) |
Langfuse traces are skipped for BYO calls so prompts and completions never land in ownify’s observability stack.
With BYO, your prompts and completions are governed by your provider’s privacy policy — review ollama.com/privacy, openai.com/policies/privacy-policy, or your custom endpoint’s terms before sending sensitive data. ownify stores audit metadata only.