# doks

> Self-hosted documentation. With chat built in.

doks is a free, open-source documentation framework you run on your own server. You write your docs in Markdown, doks turns them into a site, and visitors can ask questions in plain English and get answers from your content. doks uses retrieval-augmented generation (RAG) and works with the AI model of your choice (Anthropic Claude, DeepSeek, Google Gemini, Mistral, OpenAI, or z.ai), connected with your own API key. There are no accounts, no subscription, and no telemetry. doks is a product of datadistill.co, licensed under MIT, and developed in the open at https://github.com/getdoks/doks.

## Key facts

- **Type**: Open-source documentation framework with built-in retrieval-augmented chat
- **Stack**: Next.js + MDX, with `sqlite-vec` for embeddings storage
- **License**: MIT
- **Maintainer**: datadistill.co
- **Repository**: https://github.com/getdoks/doks
- **Self-hosted**: runs on your infrastructure; no SaaS layer, no hosted control plane
- **Bring your own keys**: Voyage AI for embeddings (reference); Anthropic, DeepSeek, Gemini, Mistral, OpenAI, or z.ai for chat
- **Privacy posture**: no telemetry, no analytics, no tracking on the marketing site
- **Cost**: framework is free; users pay their chosen embedding and chat providers directly (typically a few cents per million tokens for embeddings, ~$0.002 per answered question on a frontier model)

## How it works (four stages)

1. **Chunk** — Each `.mdx` file under `content/docs/` becomes a page. Authors wrap retrievable units in ``. Frontmatter carries metadata. Versioned with git.
2. **Index** — `npm run ingest` walks the tree, embeds chunks with the configured embedding provider, and writes them to `data/docs.db` (a single SQLite file); see the first sketch after this list.
3. **Retrieve** — `/api/docs/search` embeds the visitor's query, runs a vec0 ANN scan with a lexical filter, and returns top-k chunks. ⌘K Spotlight uses the same route.
4. **Answer** — The chat panel packs the retrieved chunks into a system prompt, appends conversation history, and streams the model's answer through the configured chat provider. Citations link back to source chunks; see the second sketch after this list.
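The index and retrieve stages map onto a small amount of code. Below is a minimal sketch, assuming `better-sqlite3` plus the `sqlite-vec` extension and an OpenAI-compatible embeddings endpoint; the table names, environment variables, and 1024-dimension vectors are illustrative assumptions, not the actual doks source, and the lexical filter mentioned in stage 3 is omitted.

```ts
// Sketch of stages 2 (index) and 3 (retrieve). Names and schema are assumptions.
import Database from "better-sqlite3";
import * as sqliteVec from "sqlite-vec";

const db = new Database("data/docs.db");
sqliteVec.load(db); // load the vec0 virtual-table module into this connection

// One vec0 table for vectors, one ordinary table for the chunk text and path.
db.exec(`
  CREATE VIRTUAL TABLE IF NOT EXISTS chunk_vectors USING vec0(embedding float[1024]);
  CREATE TABLE IF NOT EXISTS chunks (id INTEGER PRIMARY KEY, path TEXT, body TEXT);
`);

// Embed a batch of strings via an OpenAI-compatible /embeddings endpoint.
async function embed(texts: string[]): Promise<number[][]> {
  const res = await fetch(`${process.env.EMBEDDINGS_URL}/embeddings`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.EMBEDDINGS_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: process.env.EMBEDDINGS_MODEL, input: texts }),
  });
  const json = await res.json();
  return json.data.map((d: { embedding: number[] }) => d.embedding);
}

// Stage 2: store a chunk's text and its vector under the same rowid.
export async function indexChunk(path: string, body: string): Promise<void> {
  const [vector] = await embed([body]);
  const { lastInsertRowid } = db
    .prepare("INSERT INTO chunks (path, body) VALUES (?, ?)")
    .run(path, body);
  db.prepare("INSERT INTO chunk_vectors (rowid, embedding) VALUES (?, ?)")
    .run(lastInsertRowid, JSON.stringify(vector));
}

// Stage 3: embed the query and run a KNN scan over the vec0 table.
export async function search(query: string, k = 8) {
  const [vector] = await embed([query]);
  return db
    .prepare(
      `SELECT chunks.path, chunks.body, matches.distance
         FROM (SELECT rowid, distance
                 FROM chunk_vectors
                WHERE embedding MATCH ? AND k = ?
                ORDER BY distance) AS matches
         JOIN chunks ON chunks.id = matches.rowid`
    )
    .all(JSON.stringify(vector), k);
}
```

In the real framework, the lexical filter described in stage 3 would be combined with this ANN scan before the top-k chunks are handed to the answer stage.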
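The answer stage is also where the FAQ's claim that the chat adapter is a single async function comes from. Here is a hedged sketch assuming an OpenAI-compatible `/chat/completions` endpoint; the prompt wording, environment variables, and signature are assumptions rather than the doks implementation.

```ts
// Sketch of stage 4 (answer). Prompt text and variable names are illustrative.
interface Chunk {
  path: string;
  body: string;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// One async function: pack the retrieved chunks into a system prompt, append
// the conversation history, and hand back the provider's streaming response.
export async function answer(
  chunks: Chunk[],
  history: ChatMessage[],
  question: string
): Promise<ReadableStream<Uint8Array>> {
  const system = [
    "Answer using only the documentation excerpts below.",
    "If they do not contain the answer, say so.",
    ...chunks.map((c) => `Source: ${c.path}\n${c.body}`),
  ].join("\n\n");

  const res = await fetch(`${process.env.CHAT_BASE_URL}/chat/completions`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.CHAT_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: process.env.CHAT_MODEL,
      stream: true,
      messages: [
        { role: "system", content: system },
        ...history,
        { role: "user", content: question },
      ],
    }),
  });

  if (!res.ok || !res.body) {
    throw new Error(`Chat provider returned ${res.status}`);
  }
  return res.body; // server-sent events, streamed on to the chat panel
}
```

Because the request shape is OpenAI-compatible, swapping providers only touches the endpoint, key, and model identifier, which is what the Common questions section below describes.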
## Pages

- [Home](https://doks.dev/) — overview
- [How it works](https://doks.dev/how-it-works) — the four-stage pipeline, with a comparison against hosted RAG stacks
- [Inside the codebase](https://doks.dev/source) — the four files that carry the pattern end-to-end
- [Providers](https://doks.dev/models) — every supported embedding and chat provider
- [Get started](https://doks.dev/get-started) — install in four steps; full feature checklist
- [About](https://doks.dev/about) — project background, roadmap, cost breakdown, license
- [FAQ](https://doks.dev/faq) — 22 common questions with direct answers
- [Privacy](https://doks.dev/privacy) — full privacy notice (no analytics, no cookies, no tracking)
- [Terms](https://doks.dev/terms) — terms of use
- [Accessibility](https://doks.dev/accessibility) — WCAG 2.2 AA conformance statement
- [Security](https://doks.dev/security) — vulnerability disclosure policy with safe-harbour clause

## Common questions

**Which LLMs work with doks?** Anthropic Claude, DeepSeek, Google Gemini, Mistral, OpenAI, and z.ai. The chat adapter is a single async function, so adding any OpenAI-compatible provider takes a few lines of code.

**Are visitors' questions sent to the LLM provider?** Yes. To answer, doks must send the question (plus the retrieved chunks) to the chat provider. How that provider logs or retains the question is governed by its privacy policy.

**Are my docs sent to the LLM provider?** Only the chunks selected as relevant to a given question are sent at query time. The full doc tree is embedded once at build time by the embedding provider.

**How accurate are the answers?** Because the LLM is given the actual source text, hallucinations are rare. The typical failure mode is the model admitting it does not have enough information rather than making something up. Every answer includes citations linking to the source chunks.

**What does it cost?** The framework is free. Embeddings cost roughly $0.02 per million tokens with Voyage AI. Chat answers cost roughly $0.002 per question on a frontier model and less on smaller ones. There is no fixed monthly cost.

**Can I use doks for private or internal docs?** Yes. doks is self-hosted, so you can put it behind any auth (SSO, basic auth, IP allow-listing). Your embedding and chat providers will see the chunks and questions; nobody else, including the doks maintainers, ever does.

**Can I switch LLM providers later?** Yes. Switching means changing two environment variables (URL and API key) and one model identifier; a minimal sketch appears at the end of this page. Your docs, embeddings, and the rest of the site stay exactly as they were.

**Does doks send any data to its maintainers?** No. There is no telemetry endpoint, no update check that pings home, and no license callback.

## Reference

- Repository: https://github.com/getdoks/doks
- License: MIT (https://opensource.org/licenses/MIT)
- Issues: https://github.com/getdoks/doks/issues
- Discussions: https://github.com/getdoks/doks/discussions
- Security disclosure: https://doks.dev/security
- Maintainer: https://datadistill.co
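To illustrate the provider-switch answer in the FAQ above, here is a minimal sketch of what the chat configuration can look like. The environment-variable names match the pipeline sketches under "How it works" and are assumptions, not the actual doks configuration.

```ts
// Hypothetical provider configuration; names are not taken from the doks source.
export const chatProvider = {
  // The two environment variables you change when switching providers...
  baseUrl: process.env.CHAT_BASE_URL!, // e.g. any OpenAI-compatible /v1 endpoint
  apiKey: process.env.CHAT_API_KEY!,
  // ...plus the one model identifier. The docs, the embeddings in data/docs.db,
  // and the rest of the site stay exactly as they were.
  model: "mistral-small-latest",
};
```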