Four pieces matter.
The source is the documentation. Here are the four files that carry the pattern end to end.
Your MDX, with chunk markers.
Every section becomes a chunk. The `<Chunk id>` becomes the chunk's key in the index; the heading and body become the embedded text.
```mdx
---
title: "Authentication"
category: "core-concepts"
tags: ["auth", "api-keys"]
vector_metadata:
  importance: 0.8
---

<Chunk id="auth-api-keys">
## API keys

Issue a key from the dashboard and pass it as `Authorization: Bearer <key>` on every request.
</Chunk>
```
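Extraction can be a single pass over each MDX file. A minimal sketch, assuming a regex-based `extractChunks` helper and a `DocChunk` shape (not the site's actual code, which may parse MDX properly):

```typescript
// Hypothetical sketch: pull every <Chunk id="…">…</Chunk> block out of an
// MDX string. A regex is enough to show the shape of the extraction step.
interface DocChunk {
  id: string;
  text: string;
}

function extractChunks(mdx: string): DocChunk[] {
  const re = /<Chunk id="([^"]+)">([\s\S]*?)<\/Chunk>/g;
  const chunks: DocChunk[] = [];
  for (const m of mdx.matchAll(re)) {
    chunks.push({ id: m[1], text: m[2].trim() });
  }
  return chunks;
}
```

Run against the MDX above, this yields one chunk with id `auth-api-keys` whose text starts at the `## API keys` heading.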
One script, run at build time.
Walks `content/docs/`, extracts the chunks, embeds them with Voyage AI (or a deterministic hash fallback for offline runs), and writes the results to `data/docs.db`.
```typescript
const chunks = extractAllChunks();
const db = getDb();
resetTables(db);

// Embed in batches to stay under the provider's request limits.
for (let i = 0; i < chunks.length; i += BATCH_SIZE) {
  const batch = chunks.slice(i, i + BATCH_SIZE);
  const texts = batch.map(chunkText);
  const embeddings = await embed(texts, 'document');
  insertChunkBatch(db, batch, embeddings);
}
```
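The deterministic hash fallback mentioned above can be as simple as hashing tokens into a fixed-size vector. A sketch, where `hashEmbed` and `DIM` are assumptions: the output is stable and unit-length, not semantically meaningful, but enough to exercise the pipeline without an API key:

```typescript
// Hypothetical offline fallback: hash each token into one of DIM buckets,
// then L2-normalize. The same text always yields the same vector.
const DIM = 256;

function hashEmbed(text: string): number[] {
  const vec = new Array<number>(DIM).fill(0);
  for (const token of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    let h = 0;
    for (let i = 0; i < token.length; i++) {
      h = (h * 31 + token.charCodeAt(i)) >>> 0;
    }
    vec[h % DIM] += 1;
  }
  const norm = Math.hypot(...vec) || 1;
  return vec.map((v) => v / norm);
}
```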
One route handler.
Embeds the query, runs a vec0 ANN lookup against `chunks_vec`, joins back to the chunk metadata, ranks, and returns. Used by both Spotlight (⌘K) and the chat panel.
```typescript
export async function POST(req: Request) {
  const { q, topK = 5 } = await req.json();
  const queryVec = await embedOne(q, 'query'); // query-side embedding
  const hits = searchSimilar(queryVec, topK);
  return Response.json({ hits });
}
```
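`searchSimilar` is where vec0 does the work inside SQLite. Its ranking semantics can be sketched as a brute-force dot-product pass over an in-memory index — a stand-in under stated assumptions (normalized vectors, a hypothetical `{ id, vec }` index shape), not the actual SQL:

```typescript
// Hypothetical stand-in for searchSimilar: vec0 performs this nearest-
// neighbour ranking inside SQLite; the semantics are the same.
interface Hit {
  id: string;
  score: number;
}

function searchSimilar(
  queryVec: number[],
  topK: number,
  index: { id: string; vec: number[] }[],
): Hit[] {
  const dot = (a: number[], b: number[]) =>
    a.reduce((s, v, i) => s + v * b[i], 0);
  return index
    .map(({ id, vec }) => ({ id, score: dot(queryVec, vec) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

The real version replaces the map/sort with a single ANN query against `chunks_vec` and a join back to the metadata table.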
Multi-turn. One streamed call.
Retrieved chunks become context. History is appended for follow-up turns. The model streams the answer; the panel renders tokens as they arrive. Retrieval stays local to the SQLite file, so the only external network round-trip is the model call.
```typescript
const history: { question: string; answer: string }[] = [];

async function send(question: string) {
  // Retrieve the most relevant chunks for this turn.
  const { hits } = await fetch('/api/docs/search', {
    method: 'POST',
    body: JSON.stringify({ q: question, topK: 4 }),
  }).then((r) => r.json());

  const system = buildPrompt(hits);
  const stream = await callModel({ system, question, history });

  // Render tokens as they arrive, then record the turn for follow-ups.
  let answer = '';
  for await (const tok of stream) {
    answer += tok;
    render(tok);
  }
  history.push({ question, answer });
}
```
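`buildPrompt` is just string assembly. A hedged sketch of one plausible shape — the `heading` and `body` fields on each hit are assumptions, not the site's actual schema:

```typescript
// Hypothetical buildPrompt: fold retrieved chunks into a system prompt.
// `heading` and `body` are assumed hit fields for illustration.
interface RetrievedHit {
  heading: string;
  body: string;
}

function buildPrompt(hits: RetrievedHit[]): string {
  const context = hits
    .map((h, i) => `[${i + 1}] ${h.heading}\n${h.body}`)
    .join("\n\n");
  return [
    "Answer using only the documentation excerpts below.",
    "If the excerpts do not cover the question, say so.",
    "",
    context,
  ].join("\n");
}
```

Numbering the excerpts lets the model cite `[1]`, `[2]` in its answer, which the panel can turn into links back to the source sections.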