Bring your own LLM

Point your ownify agent at your Ollama Cloud, OpenAI, Groq, or any OpenAI-compatible endpoint. ownify never sees prompts or completions for BYO calls. Spend caps and retention are governed by your provider.

Why BYO?

Your tokens, your spend cap. Set limits in your provider’s dashboard, not on ownify’s bill.
Your retention story. Whatever your provider stores is between you and them — ownify isn’t in the data path.
Model choice. Use any model your subscription supports — Llama 3.3 70B on Ollama, GPT-4o on OpenAI, your fine-tunes on a self-hosted endpoint, etc.

ownify still bills its standard plans (Solo, Pro, etc.) — those cover the agent runtime, memory subsystem, skills, Matrix bot, audit, portal, and storage. BYO replaces only the LLM layer.

Supported providers

The portal dropdown lists four options. All endpoints must be OpenAI-compatible (i.e. accept POST /v1/chat/completions with the standard OpenAI request schema).

Provider	Endpoint	Get a key
Ollama Cloud	https://ollama.com/v1	ollama.com/settings/keys ↗
OpenAI	https://api.openai.com/v1	platform.openai.com/api-keys ↗
OpenRouter	https://openrouter.ai/api/v1	openrouter.ai/settings/keys ↗
Groq	https://api.groq.com/openai/v1	console.groq.com/keys ↗
Custom	any OpenAI-compatible URL	Together, self-hosted Ollama, TGI…

Anthropic’s native API isn’t directly supported in v1 (microclaw expects OpenAI-shaped requests). Use Anthropic via OpenRouter as a Custom endpoint for now.

Configure

In the portal: Dashboard → your agent → LLM. Pick a provider from the dropdown, paste your API key, set the default model id, save. The agent pod restarts in ~15 seconds and begins routing every chat call to your provider.

For Ollama, the default model id is whatever your subscription has access to — e.g. kimi-k2.6:cloud, qwen2.5:32b. For OpenAI it’s the model id you’d pass in any chat-completions request, e.g. gpt-4o-mini.

To revert to ownify-managed (Ollama Cloud via ownify-router), pick that option from the dropdown and save. The four BYO secret keys are wiped and the pod restarts.

OpenRouter — recommended models per category

OpenRouter is a meta-provider — one API key, hundreds of models behind OpenAI-compatible request shapes. The auto-router classifies each request into one of six categories; here are reasonable defaults to assign in the model editor.

Category	Recommended model id	Notes
Fast	openai/gpt-oss-20b	MoE, very low TTFT, used for the classifier itself
Balanced	qwen/qwen3-30b-a3b-instruct	3B active params, broad capability
Reasoning	deepseek/deepseek-r1-distill-llama-70b	or anthropic/claude-haiku-4-5 if you want hosted Claude
Long context	google/gemini-2.5-flash	1M token context, fast
Vision	qwen/qwen3-vl-235b-a22b-instruct	strong VL model, MoE
Code	qwen/qwen3-coder-30b	or anthropic/claude-sonnet-4-6 for top-tier review

These are starting points — OpenRouter’s catalog evolves quickly. Consult openrouter.ai/models ↗ for current pricing, latency, and ZDR availability per route.

OpenRouter is a meta-provider: prompts may pass through different upstream model providers depending on the model id you select. Some routes support ZDR (zero-data-retention) — opt in at openrouter.ai/settings/privacy ↗.

What ownify sees vs. doesn't

With BYO active, every LLM call — including the auto-router classifier call that picks a category for each request — flows through your dedicated, per-tenant ownify-router pod directly to your provider. The shared platform LiteLLM is not in the path: no shared ownify component sees prompts or completions.

ownify still records	ownify does NOT see
Audit metadata (which channel triggered a call, when, the agent slug) Skill execution + memory ACL events Pod health + restarts	Prompt content Completion content Spend (ownify can’t enforce caps for BYO providers)

Langfuse traces are skipped for BYO calls so prompts and completions never land in ownify’s observability stack.

Token counts (prompt_tokens, completion_tokens) are captured from your provider’s API responses for usage display. Prompts and completions are never seen or stored by ownify.

Failure modes

Invalid / expired key: the agent surfaces the provider’s error verbatim. There is no fallback to ownify-managed — by design, so spend never sneaks back onto your ownify bill.
Rate limited: same as above — the provider’s 429 propagates. Your provider dashboard is the source of truth for limits.
Provider down (5xx / unreachable): if your BYO provider returns a 5xx error or is unreachable, requests may transparently fall back to ownify-managed models. These fallback requests are tracked separately in your usage dashboard.
Model not available: if the default model id doesn’t exist on your subscription, the provider returns a model-not-found error. Edit the default model on the LLM page and save.

Privacy note

With BYO, your prompts and completions are governed by your provider’s privacy policy — review ollama.com/privacy, openai.com/policies/privacy-policy, or your custom endpoint’s terms before sending sensitive data. ownify stores audit metadata only.