Agent Tool Integration
If your product has an existing AI agent (built with OpenAI function-calling, Anthropic tool_use, LangChain, LlamaIndex, or a custom framework) that needs to edit documents, the right integration shape is to register SuperDocs as a tool the agent can call, not to build a separate UI around SuperDocs.
This guide covers the pattern + working snippets for the five most common agent frameworks, plus a generic JSON-schema definition you can adapt to any framework that supports tool-calling.
Why this shape is different from a UI integration
In a typical SuperDocs UI integration, a human types in a chat panel, the chat sends a message to SuperDocs, and SuperDocs returns proposed edits the human reviews. The customer-facing app is the consumer of SuperDocs.

In an agent-tool integration, your AI agent is the consumer. The agent decides — based on its own reasoning, the user’s query, and the conversation context — that it needs to edit a document. It invokes SuperDocs as a tool, receives proposed changes as structured data, and either approves them itself (auto-decide), surfaces them through your existing UI for the user to review, or sends them out-of-band (Slack, email). The integration is server-side, the SuperDocs `sk_` key lives next to your other model API keys, and there is no SuperDocs-specific UI to build — your agent’s existing UI is where the result appears.
The pattern, in 4 steps
1. **Define the SuperDocs tool** in your agent’s tool registry. The tool takes three parameters: `message` (the natural-language instruction for the AI), `session_id` (a string that ties multiple turns together), and `document_html` (the HTML the agent wants edited).
2. **Implement the tool function.** Call `POST /v1/chat` with the three parameters plus your chosen `approval_mode`. Return the response back to the agent: either the parsed proposed changes (for `ask_every_time`) or the final updated HTML (for `approve_all`).
3. **Decide the approval shape.** Choose between `approve_all` (your agent’s reasoning is the approval; fastest), `ask_every_time` (the agent reviews each proposed change individually before deciding), or surfacing changes to a human through your UI.
4. **Persist the result.** The agent receives the updated HTML and writes it back to wherever your documents live.
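The tool-function step can be sketched with the standard library alone. This is a minimal sketch, not a definitive implementation: the API host in `SUPERDOCS_API_URL` is an assumption (only the `/v1/chat` path comes from this guide), and it defaults to `approve_all`.

```python
import json
import urllib.request

SUPERDOCS_API_URL = "https://api.superdocs.app/v1/chat"  # assumed host; /v1/chat path per this guide
SUPERDOCS_API_KEY = "sk_..."  # your server-side SuperDocs key

def build_payload(message: str, session_id: str, document_html: str,
                  approval_mode: str = "approve_all") -> dict:
    """Request body for POST /v1/chat: the three tool parameters plus approval_mode."""
    return {
        "message": message,
        "session_id": session_id,
        "document_html": document_html,
        "approval_mode": approval_mode,
    }

def edit_document(message: str, session_id: str, document_html: str) -> dict:
    """Tool function the agent invokes. With approve_all, the parsed JSON
    response contains the final updated HTML."""
    req = urllib.request.Request(
        SUPERDOCS_API_URL,
        data=json.dumps(build_payload(message, session_id, document_html)).encode(),
        headers={
            "Authorization": f"Bearer {SUPERDOCS_API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Swap `urllib` for your usual HTTP client; the shape that matters is the three parameters plus `approval_mode` in one POST body.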
Generic JSON-schema tool definition
Every framework that supports tool-calling accepts a JSON-schema tool definition. Use this as the canonical version and adapt it for your framework:

OpenAI function-calling (Python)
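As a sketch in the OpenAI function-calling shape, built from the three parameters described above — the tool name `edit_superdocs_document` and the description strings are illustrative, not fixed by the API:

```python
# Generic tool definition: message, session_id, and document_html,
# exactly as described in the four-step pattern above.
superdocs_tool = {
    "type": "function",
    "function": {
        "name": "edit_superdocs_document",  # illustrative name
        "description": "Edit an HTML document via SuperDocs and return the result.",
        "parameters": {
            "type": "object",
            "properties": {
                "message": {
                    "type": "string",
                    "description": "Natural-language instruction for the AI.",
                },
                "session_id": {
                    "type": "string",
                    "description": "String that ties multiple turns together.",
                },
                "document_html": {
                    "type": "string",
                    "description": "The HTML the agent wants edited.",
                },
            },
            "required": ["message", "session_id", "document_html"],
        },
    },
}
```

Pass `superdocs_tool` in the `tools` array of your chat-completion call; the same schema body adapts directly to the other frameworks below.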
Anthropic tool_use (Python)
LangChain Tool (Python)
LlamaIndex FunctionTool (Python)
TypeScript / Vercel AI SDK
Approval modes — choose based on who decides
approve_all — your agent’s reasoning is the approval
The simplest pattern. Pass `approval_mode: "approve_all"` and SuperDocs returns the final updated HTML in one call. Use this when your agent is autonomous and you trust it to make the right call. Most appropriate for:
- Backend AI agents processing a queue without human oversight.
- Agents whose user already approved the high-level intent (e.g., “rewrite all sections in plain English”) and doesn’t need per-change review.
- Server-to-server workflows.
ask_every_time — your agent reviews each proposed change
Pass `approval_mode: "ask_every_time"` and SuperDocs returns proposed changes one at a time via SSE. Your agent reads each `proposed_change` event (parse `event.data`, then `JSON.parse` the `content` field again — see the SSE Streaming guide for the double-parse warning), decides whether to approve based on its own reasoning, and POSTs to `/v1/chat/{session_id}/approve` to accept or reject.
Use this when:
- Your agent has higher reasoning capability than SuperDocs’ edit suggestions and wants to filter them.
- The user is reviewing the agent’s overall plan but trusts the agent to filter individual edits.
- You want an audit log of every accepted vs rejected change for compliance.
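The review loop can be sketched as follows. The double parse of `content` is the one the SSE Streaming guide warns about; the API host and the approve-request body fields (`change_id`, `approved`) are assumptions for illustration.

```python
import json
import urllib.request

def parse_proposed_change(event_data: str) -> dict:
    """Parse one proposed_change SSE event. The event data is JSON, and its
    `content` field is itself a JSON string, so it must be parsed a second
    time (the double-parse warning)."""
    outer = json.loads(event_data)
    return json.loads(outer["content"])

def send_decision(session_id: str, change_id: str, approved: bool,
                  base_url: str = "https://api.superdocs.app",  # assumed host
                  api_key: str = "sk_...") -> None:
    """POST the agent's decision to /v1/chat/{session_id}/approve.
    The body field names here are illustrative."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/{session_id}/approve",
        data=json.dumps({"change_id": change_id, "approved": approved}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    urllib.request.urlopen(req).close()
```

Logging every `(change_id, approved)` pair from this loop gives you the compliance audit trail mentioned above for free.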
Surface to a human through your UI
If your agent is in a chat with a human and SuperDocs proposes an edit, your agent can stop, surface the proposed change to the human via your existing chat thread (as a card, a button group, an inline question — whatever your UI does), wait for the human’s decision, and then POST the decision back to SuperDocs. This pattern keeps your AI agent in control of what gets surfaced and when, but defers the binary approve/deny decision to the human. The agent’s UI is the only UI involved — there’s no SuperDocs-specific UI.

Document HTML round-trip — same rule as everywhere else
The `data-chunk-id` attributes that SuperDocs adds to block-level elements must round-trip cleanly when you send the HTML back on the next call. If your agent stores the HTML in a database, file, or in-memory variable between calls, those attributes must survive the storage cycle. Most string-based storage preserves them automatically; if you parse the HTML through a library that strips unknown attributes, you’ll break the round-trip silently.
If you’re piping the HTML through a rich-text editor at any point in your agent’s UI, see the Editor Integration guide for editor-specific preservation patterns.
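A cheap guard against the silent breakage: compare the set of `data-chunk-id` values before and after your storage cycle. This is a sketch; the regex assumes double-quoted attributes, so adapt it if your HTML serializer emits single quotes.

```python
import re

def chunk_ids(html: str) -> set:
    """Collect the data-chunk-id values SuperDocs added to block elements."""
    return set(re.findall(r'data-chunk-id="([^"]+)"', html))

def check_round_trip(before: str, after: str) -> None:
    """Raise if any chunk IDs were lost between calls, e.g. stripped by a
    sanitizer or parser that drops unknown attributes."""
    lost = chunk_ids(before) - chunk_ids(after)
    if lost:
        raise ValueError(f"data-chunk-id attributes lost in storage: {sorted(lost)}")
```

Run the check right before each follow-up `POST /v1/chat`, against the HTML you last received from SuperDocs.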
Choosing between sync /v1/chat and async /v1/chat/async
Most agent-tool integrations should use sync `/v1/chat`. It’s a simpler integration (one call, one response) and matches the standard tool-call shape (function in, JSON out, agent reasons over the result).
Use async `/v1/chat/async` plus the SSE stream when:
- Your agent has a UI that wants to show streaming progress events to the user as the AI works.
- The document is very large and you want to fail fast on auth errors before committing to a long edit.
- You want to use `ask_every_time` and react to each proposed change in real time.
Stuck?
If your agent framework isn’t covered here or the tool-calling pattern doesn’t map cleanly to your setup, email hello@superdocs.app or book a 15-minute integration call at cal.com/superdocs. We’ll talk through the pattern and add a snippet to this guide for the next person.

Related guides
- Server Integration — if your agent is one of several services in a backend, the same patterns apply more broadly.
- Human-in-the-Loop — if your agent surfaces proposed changes to a human via your UI, the rendering patterns there work for chat-thread cards too.
- SSE Streaming — if you choose `ask_every_time` and need to consume proposed changes one at a time.
- Async Jobs — if you want polling instead of SSE.
- Integration Starter Prompt — paste this into your coding agent to wire all the above up automatically.

