Purpose

The “Ask Claude” panel lets the user ask follow-up questions about the article they are reading. It is implemented as an SSE streaming round-trip: useChat (client) → /api/chat (Bun server) → Anthropic SDK, with an optional search_web tool.

Client side — useChat

KB/src/hooks/useChat.ts:8–147. Returns { messages, isLoading, sendMessage, clearMessages }.

sendMessage(content, currentArticleContext?):

  1. Appends a user message to local state (KB/src/hooks/useChat.ts:21–28).
  2. Inserts a placeholder streaming assistant message.
  3. Builds full message history in Anthropic format (role, content).
  4. POSTs to /api/chat with headers X-Anthropic-Key, X-Serper-Key (optional, from localStorage), and X-Model (defaults to "claude-opus-4-6").
  5. Body: { messages, systemContext? }. systemContext is the wrapper "The user is currently reading the following article:\n\n…" constructed by <ChatPanel> from the selected article’s title + summary (KB/src/components/ChatPanel.tsx:49–51).
  6. Reads the response body as a ReadableStream; parses data: … SSE lines; concatenates them into accumulated; updates the assistant message content on each chunk.
  7. Strips [DONE] sentinel; clears isStreaming flag at end.
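
The parsing in steps 6–7 can be sketched as a small pure function. This is an illustration of the mechanism, not the hook's actual code: the function name parseSseLines is hypothetical, and the assumption is that each data: payload is a raw text slice (the description does not say the payloads are JSON).

```typescript
// Hypothetical sketch of the client-side SSE parsing in sendMessage.
// Assumes each "data: " payload is a raw text slice; "[DONE]" is the
// end-of-stream sentinel described above.
function parseSseLines(chunk: string): { text: string; done: boolean } {
  let text = "";
  let done = false;
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length);
    if (payload === "[DONE]") {
      done = true;
      continue;
    }
    text += payload;
  }
  return { text, done };
}
```

In the real hook this would be fed from response.body.getReader() via a TextDecoder, with the accumulated text written into the placeholder assistant message on every chunk.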

Server side — /api/chat

KB/server.ts:100–237. Reads headers, parses body, instantiates new Anthropic({ apiKey }).

Returns a Response whose body is a ReadableStream running an agentic loop:

while (true) {
  const response = await client.messages.create({
    model,
    max_tokens: 4096,
    system,
    tools: [SEARCH_WEB_TOOL],
    messages,
    stream: false, // collect the full response to inspect stop_reason
  });

  if (response.stop_reason === "tool_use") {
    // append assistant turn, run search_web for each tool_use block,
    // append tool_result turn, continue loop
  } else {
    // chunk text into 500-char SSE slices, send "[DONE]", break
  }
}

stream: false means Anthropic’s native streaming is not used. The “streaming feel” on the client is synthetic — slicing the final text and SSE-ing the slices (KB/server.ts:201–214).
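
The synthetic slicing can be expressed as a pure function. The 500-char slice size and the [DONE] sentinel come from the description above; the exact SSE frame format is an assumption, not copied from server.ts:

```typescript
// Sketch of the server's synthetic streaming: slice the buffered final
// text into 500-char pieces and wrap each as an SSE data frame.
// Frame format ("data: <slice>\n\n") is an assumption.
function toSseFrames(fullText: string, sliceSize = 500): string[] {
  const frames: string[] = [];
  for (let i = 0; i < fullText.length; i += sliceSize) {
    frames.push(`data: ${fullText.slice(i, i + sliceSize)}\n\n`);
  }
  frames.push("data: [DONE]\n\n");
  return frames;
}
```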

System prompt

KB/server.ts:13–27. Hardcoded persona: a GDPR + Swedish data law + B2B enrichment compliance specialist. Instructs Claude to cite specific articles/recitals, acknowledge uncertainty rather than guess, recommend consulting a Swedish attorney, and use search_web if asked.

If the request includes systemContext (always set when an article is selected), it is appended after the base prompt.
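
Assembling the final system string amounts to simple concatenation. A sketch, where BASE_SYSTEM_PROMPT stands in for the hardcoded persona at server.ts:13–27 and the two-newline separator is an assumption:

```typescript
// Hypothetical sketch of system-prompt assembly in /api/chat.
// BASE_SYSTEM_PROMPT is a stand-in for the real hardcoded persona.
const BASE_SYSTEM_PROMPT =
  "You are a compliance specialist in GDPR, Swedish data law, and B2B enrichment.";

function buildSystem(systemContext?: string): string {
  // systemContext carries ChatPanel's "The user is currently reading…" wrapper.
  return systemContext
    ? `${BASE_SYSTEM_PROMPT}\n\n${systemContext}`
    : BASE_SYSTEM_PROMPT;
}
```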

search_web tool

KB/server.ts:29–40. Single input query: string. When Claude returns a tool_use block named search_web, the server calls serperSearch() (KB/server.ts:53–94) which POSTs to https://google.serper.dev/search with gl=se, hl=sv, num=8. Each organic result is enriched with credibility tier from scoreUrl() — see KB Credibility Scoring. Results returned as a JSON-stringified tool_result.

If no Serper key is configured, the tool result is { error: "No Serper API key configured — cannot search web" } — the model can still respond, just without web grounding.
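
The enrichment step, attaching a credibility tier to each organic result, might look like the following. The organic-result shape and scoreUrl's signature are assumptions based on the description; only gl=se, hl=sv, num=8 come from the source, and the scoreUrl body here is a trivial stand-in for the real KB Credibility Scoring logic:

```typescript
// Hypothetical sketch of the result-mapping step inside serperSearch.
interface OrganicResult {
  title: string;
  link: string;
  snippet: string;
}

// Stand-in for the real scoreUrl() from KB Credibility Scoring.
function scoreUrl(url: string): string {
  return url.endsWith(".gov.se") ? "high" : "unrated";
}

// Attach a credibility tier to each organic result before the array is
// JSON-stringified into the tool_result block.
function enrichResults(organic: OrganicResult[]) {
  return organic.map((r) => ({ ...r, credibilityTier: scoreUrl(r.link) }));
}
```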

Default model mismatch

  • Server default: claude-opus-4-5 (KB/server.ts:109)
  • Client default: claude-opus-4-6 (KB/src/hooks/useChat.ts:59, KB/src/hooks/useSettings.ts:10)
  • Settings modal options: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5-20251001 (KB/src/components/SettingsModal.tsx:11–15)

The client always sends X-Model, so the server default is unreachable in practice.
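
The fallback logic reduces to a one-liner. The header name and both default values come from the bullets above; the function name is illustrative:

```typescript
// Sketch of server-side model selection. "claude-opus-4-5" is the server
// default noted above; since the client always sends X-Model, the
// fallback branch is effectively dead code.
function resolveModel(headers: Headers): string {
  return headers.get("X-Model") ?? "claude-opus-4-5";
}
```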

Gotchas

  • messages at useChat.ts:46 is read from the closure captured at render time, so it does not yet include the user message just queued via setState. sendMessage therefore pushes the new user message onto a copy with history.push(...) before sending; the history is correct only because the closure deliberately re-adds the message it just queued.
  • The synthetic 500-char chunking can introduce visible pauses on long responses. Native Anthropic streaming would be smoother, but it would require switching to stream: true and forwarding MessageStreamEvents; the agentic loop currently relies on stop_reason inspection, which requires the fully buffered response.
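
Were native streaming adopted, the forwarding step could be a small transform over the SDK's delta events. A sketch under the assumption that only content_block_delta text events need forwarding (tool-use handling and error handling omitted; the event shape mirrors the SDK's text_delta events):

```typescript
// Hypothetical sketch of forwarding native Anthropic stream events as
// SSE frames, if the server switched to stream: true.
interface StreamEvent {
  type: string;
  delta?: { type: string; text?: string };
}

async function* forwardAsSse(events: AsyncIterable<StreamEvent>) {
  for await (const event of events) {
    if (event.type === "content_block_delta" && event.delta?.text) {
      yield `data: ${event.delta.text}\n\n`;
    }
  }
  yield "data: [DONE]\n\n";
}
```

The trade-off noted above still applies: with stream: true the server cannot inspect stop_reason until the stream ends, so the agentic tool loop would need to buffer tool_use turns while streaming text turns through.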

See also

KB Architecture, KB Search, KB Credibility Scoring, KB Settings.
