ownify.docs

← Docs index

Bring your own LLM

Point your ownify agent at your Ollama Cloud, OpenAI, Groq, or any OpenAI-compatible endpoint. ownify never sees prompts or completions for BYO calls. Spend caps and retention are governed by your provider.

Why BYO?

  • Your tokens, your spend cap. Set limits in your provider’s dashboard, not on ownify’s bill.
  • Your retention story. Whatever your provider stores is between you and them — ownify isn’t in the data path.
  • Model choice. Use any model your subscription supports — Llama 3.3 70B on Ollama, GPT-4o on OpenAI, your fine-tunes on a self-hosted endpoint, etc.

ownify still bills its standard plans (Solo, Pro, etc.) — those cover the agent runtime, memory subsystem, skills, Matrix bot, audit, portal, and storage. BYO replaces only the LLM layer.

Supported providers

The portal dropdown lists four options. All endpoints must be OpenAI-compatible (i.e. accept POST /v1/chat/completions with the standard OpenAI request schema).

ProviderEndpointGet a key
Ollama Cloudhttps://ollama.com/v1ollama.com/settings/keys ↗
OpenAIhttps://api.openai.com/v1platform.openai.com/api-keys ↗
OpenRouterhttps://openrouter.ai/api/v1openrouter.ai/settings/keys ↗
Groqhttps://api.groq.com/openai/v1console.groq.com/keys ↗
Customany OpenAI-compatible URLTogether, self-hosted Ollama, TGI…

Anthropic’s native API isn’t directly supported in v1 (microclaw expects OpenAI-shaped requests). Use Anthropic via OpenRouter as a Custom endpoint for now.

Configure

In the portal: Dashboard → your agent → LLM. Pick a provider from the dropdown, paste your API key, set the default model id, save. The agent pod restarts in ~15 seconds and begins routing every chat call to your provider.

For Ollama, the default model id is whatever your subscription has access to — e.g. kimi-k2.6:cloud, qwen2.5:32b. For OpenAI it’s the model id you’d pass in any chat-completions request, e.g. gpt-4o-mini.

To revert to ownify-managed (Fireworks via ownify-router), pick that option from the dropdown and save. The four BYO secret keys are wiped and the pod restarts.

OpenRouter — recommended models per category

OpenRouter is a meta-provider — one API key, hundreds of models behind OpenAI-compatible request shapes. The auto-router classifies each request into one of six categories; here are reasonable defaults to assign in the model editor.

CategoryRecommended model idNotes
Fastopenai/gpt-oss-20bMoE, very low TTFT, used for the classifier itself
Balancedqwen/qwen3-30b-a3b-instruct3B active params, broad capability
Reasoningdeepseek/deepseek-r1-distill-llama-70bor anthropic/claude-haiku-4-5 if you want hosted Claude
Long contextgoogle/gemini-2.5-flash1M token context, fast
Visionqwen/qwen3-vl-235b-a22b-instructstrong VL model, MoE
Codeqwen/qwen3-coder-30bor anthropic/claude-sonnet-4-6 for top-tier review

These are starting points — OpenRouter’s catalog evolves quickly. Consult openrouter.ai/models ↗ for current pricing, latency, and ZDR availability per route.

OpenRouter is a meta-provider: prompts may pass through different upstream model providers depending on the model id you select. Some routes support ZDR (zero-data-retention) — opt in at openrouter.ai/settings/privacy ↗.

What ownify sees vs. doesn't

With BYO active, every LLM call — including the auto-router classifier call that picks a category for each request — flows through your dedicated, per-tenant ownify-router pod directly to your provider. The shared platform LiteLLM is not in the path: no shared ownify component sees prompts or completions.

ownify still recordsownify does NOT see
Audit metadata (which channel triggered a call, when, the agent slug)
Skill execution + memory ACL events
Pod health + restarts
Prompt content
Completion content
Token counts (your provider’s dashboard has these)
Spend (ownify can’t enforce caps for BYO providers)

Langfuse traces are skipped for BYO calls so prompts and completions never land in ownify’s observability stack.

Failure modes

  • Invalid / expired key: the agent surfaces the provider’s error verbatim. There is no fallback to ownify-managed — by design, so spend never sneaks back onto your ownify bill.
  • Rate limited: same as above — the provider’s 429 propagates. Your provider dashboard is the source of truth for limits.
  • Provider down: the agent stops responding to chat until the provider recovers or you switch back to ownify-managed.
  • Model not available: if the default model id doesn’t exist on your subscription, the provider returns a model-not-found error. Edit the default model on the LLM page and save.
Privacy note

With BYO, your prompts and completions are governed by your provider’s privacy policy — review ollama.com/privacy, openai.com/policies/privacy-policy, or your custom endpoint’s terms before sending sensitive data. ownify stores audit metadata only.