Features

SuperDocs is a universal AI document platform: one API for editing, drafting, searching, summarizing, and exporting styled documents (.docx, PDF, HTML, Markdown, RTF). Every capability below, apart from the items under “On the roadmap”, is live in production at https://api.superdocs.app and exposed identically over the REST API and the MCP server at https://api.superdocs.app/mcp (21 tools + 4 user-invocable workflow prompts on a single endpoint). This page is a reference, not a marketing summary, so it aims to be exhaustive. Skip to the group that matches what you’re trying to build.

Core editing intelligence

Section-precision editing

Documents load as structured HTML where every paragraph, heading, table, row, and cell carries a unique identifier internally. The AI can target a specific section without touching anything else — “remove row 3 of the pricing table”, “bold the second paragraph in section 4”, “replace the governing-law clause” all work as natural-language instructions. This survives across multi-turn editing, so a 100-page document edited 30 times still has the same coherent structure at the end as it did at the start. Plain-text rewrites lose this completely. Best for: Long documents, contracts, SOPs, formal reports — anywhere targeted edits matter more than wholesale rewrites.

Style preservation on edit and export

Tables (with borders, alternating row shading, merged cells), fonts, font sizes, colors, inline styling, lists, indentation, and headers/footers all survive both AI edits AND the round-trip to .docx / PDF / HTML. Most general-purpose AI tools strip or mangle these — SuperDocs is built around preserving them. Best for: Branded templates, legal documents, formal correspondence — anywhere format fidelity is part of the deliverable.

Document intelligence

Search within documents

Search isn’t just full-text matching — the AI understands semantic meaning. Ask “find all indemnity clauses” across a 100-page contract and get back the exact sections, even when they use different wording. Works against the active document or across all attachments at once. Results come back with surrounding context so you can verify the match without opening the file. Best for: Contract review, compliance audits, extracting specific terms from long documents, finding all references to a concept across multiple files.

Summarize sections on demand

Ask “summarize the force majeure section” or “give me a 2-line summary of the payment terms” and the AI extracts just that section, then summarizes it at the level of detail you want. Pair with semantic search — “find and summarize all limitation-of-liability clauses” works as a single instruction. No need to read line-by-line. Best for: Contract reviews, due diligence, quick understanding of specific parts of long documents, briefing materials.

Summarize entire documents at any length

Get a concise summary of an entire document at a length you specify. “Executive summary in three sentences”, “one-page overview”, “detailed bullet-point summary by section” — all work. The AI reads the full document and distills it efficiently. Summaries cover key sections, critical terms, and overall structure. Works on documents up to ~100 pages without context-window pressure. Best for: Quick understanding of newly-received documents, briefing executives, preparing for negotiations, extracting key terms from regulatory filings.

Cross-document reference and synthesis

Upload multiple documents into a session and the AI references any of them while editing the active document. “Port the indemnity clause from the attached template into this contract”, “compare these two NDAs and align the payment terms”, “find clauses in the attached precedent that handle this case better than what’s currently in the draft” — all work as natural-language instructions. The AI searches across all attachments, surfaces matches, and adapts content to the current context. Best for: Standardizing language across contracts, porting terms from templates, comparing versions, drafting new docs based on prior precedents, building from clause libraries.

Rich editing & formatting

Full rich-text formatting toolbar

Beyond basic editing — the AI applies bold, italic, underline, strikethrough, text color (10-color palette), text highlight, custom font selection, six preset font sizes (12px–30px), configurable line spacing (single, 1.15, 1.5, double), and text alignment (left, center, right, justify). All preserved on round-trip exports. The format painter lets you copy a style from one selection and apply it to another with a single click — useful for matching styling across a long document. Best for: Visually distinctive documents, applying brand style guides, adapting documents for different audiences, preserving template styling on edit.

Heading and list hierarchies

Apply heading styles (H1, H2, H3) with hierarchy preserved on export. Create bulleted and numbered lists with automatic numbering. Nest lists up to 8 levels deep. Convert between bullet and numbered formats by natural-language instruction. Add blockquotes for citations or emphasis. Horizontal rules to divide sections. The AI respects these structures — “convert the first three paragraphs to a bulleted list” works without reformatting surrounding content. Best for: SOPs, structured documentation, proposals, outlines, academic papers, anywhere hierarchy matters.

Table editing with cell-level control

Insert and delete tables and rows. Merge or split cells. Apply alternating row shading. Set border styles. All cell-level operations work via natural language: “merge the first three columns of row 2”, “add a row after row 4 with these values”, “split the merged cell in A1”. Tables survive round-trip to .docx and PDF with formatting intact. Useful for pricing tables, comparison matrices, and any data-heavy document. Best for: Documents with pricing or data — proposals, contracts with pricing schedules, technical specifications, comparison charts.

Headers, footers, and hyperlinks

Add headers and footers that appear on every page (or specific pages): page numbers, document titles, company names, dates. Edit existing headers/footers directly. All of it survives PDF and .docx export. Hyperlinks work the same way: “make this text link to our website” or “remove the broken link in section 4”. Both internal document references and external URLs are supported. Best for: Formal business documents, branded templates, compliance documents requiring page numbering, documents with cross-references or call-to-action links.

Knowledge & attachments

Multimodal vision on attachments

Attach images (PNG, JPG, GIF, WebP), screenshots, scanned forms, diagrams, and charts — the AI interprets them visually while editing the active document. Transcribe a screenshot into structured text, extract numbers from a chart, reference a diagram while drafting documentation, identify entities in a scanned form. Best for: Workflows that mix text documents with image references — design docs, compliance docs that reference scanned forms, technical writing that references architecture diagrams.

Build a knowledge base from attachments

Upload your organization’s template library, style guides, past contracts, or SOP documents as attachments. The AI references them automatically when editing the active document — “make this sound like our standard tone”, “follow the format of our usual NDAs”, “use the indemnity language from our gold-standard contract”. Builds an institutional memory that shapes AI behavior. Attachments are scoped per session, so a knowledge-base session can be reused as a starting point and copied for each new document. Best for: B2B teams with house styles or templates, legal teams with clause libraries, marketing teams with brand voice, compliance teams with corporate policies.

Semantic search across attachments

Every attachment is semantically indexed when uploaded — the AI knows what’s in it, not just the words it contains. Ask “find the data processing clause from the attached regulations” and the search returns the matching section even when the attached doc uses entirely different wording. Useful for large attachments (50+ pages) where keyword search would miss relevant sections, and for reference documents you query repeatedly across sessions. Best for: Reference docs (regulations, industry standards, policy manuals), large attachments queried multiple times, building searchable knowledge over time.

Conversation & robustness

Persistent conversation context

Every message in a session carries the full conversation history. Ask “make that more formal”, then “add three paragraphs of detail to what you just wrote”, then “change the tone back to friendly” — the AI remembers all prior context, edits, and stated preferences. History persists across server restarts and redeploys. Reload a session weeks later and the AI still has the full context (the document, the attachments, every turn of the conversation, every change made). Best for: Iterative editing workflows, multi-turn refinements, long-running document projects, anything where the user comes back to continue work later.

Automatic error recovery with fallbacks

When an edit fails on the first approach (e.g., the section the AI tried to find didn’t exist with those exact keywords), it automatically tries broader strategies — semantic search instead of keyword match, alternative phrasings of the user intent, looking in adjacent sections. Retries happen without user intervention. If all approaches genuinely fail, the AI explains what it tried and asks for clarification with specifics rather than a generic error. Dramatically reduces “I didn’t understand your request” loops. Best for: Complex documents with varied terminology, ambiguous instructions, long documents where sections may have been edited since last viewed.

Human control & approval

Human-in-the-loop approval

For sensitive edits (legal contracts, financial filings, anything user-facing), set approval_mode='ask_every_time' on chat_async and the agent surfaces each proposed change as a structured diff (chunk-level before/after HTML + an explanation) for the user to approve, deny, or send back with feedback. Approved changes apply atomically; denied changes leave the document untouched. State persists across server restarts so multi-step approvals survive autoscaling. Best for: Multi-stakeholder workflows, regulated industries, anywhere the cost of a bad edit landing without review exceeds the cost of a confirmation click.
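A minimal sketch of the approval loop described above. Only approval_mode='ask_every_time' and the before/after HTML plus explanation come from this page. The field names (session_id, message, change_id, decision, feedback), the ProposedChange shape, and the toy review policy are hypothetical illustrations, so check the API Reference for the real schema.

```python
from dataclasses import dataclass


@dataclass
class ProposedChange:
    # Hypothetical shape: this page only says a proposed change carries
    # chunk-level before/after HTML plus an explanation.
    change_id: str
    before_html: str
    after_html: str
    explanation: str


def build_chat_request(session_id: str, message: str) -> dict:
    """Body for a chat_async call that pauses on every proposed change."""
    return {
        "session_id": session_id,            # assumed field name
        "message": message,                  # assumed field name
        "approval_mode": "ask_every_time",   # value documented on this page
    }


def review(change: ProposedChange, banned_phrases: list[str]) -> dict:
    """Toy policy: deny a change that introduces a banned phrase, else approve."""
    before = change.before_html.lower()
    after = change.after_html.lower()
    for phrase in banned_phrases:
        p = phrase.lower()
        if p in after and p not in before:
            return {
                "change_id": change.change_id,
                "decision": "deny",
                "feedback": f"Do not introduce the phrase '{phrase}'.",
            }
    return {"change_id": change.change_id, "decision": "approve"}


change = ProposedChange(
    change_id="chg-1",
    before_html="<p>Liability is capped at fees paid.</p>",
    after_html="<p>Liability is unlimited.</p>",
    explanation="Rewrote the liability cap as requested.",
)
print(build_chat_request("contract-42", "Tighten the liability clause"))
print(review(change, banned_phrases=["unlimited"]))
```

In a real integration the review step would usually be a person clicking approve or deny; the automated policy here only stands in for that decision.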

Compact response mode for long editing sessions

The default chat response includes the entire updated HTML on every turn. On documents larger than ~20 pages this adds up quickly: ~130K tokens per turn for a 100-page styled doc. Set response_mode='compact' and the response includes only chunk_diffs (per-section before/after for sections that actually changed), typically 1-3 chunks and ~500-2,000 tokens. For a 5-turn editing session on a 100-page doc, compact mode reduces total response context from ~650K tokens to ~3K tokens. To read sections in compact mode, just ask in natural language: “show me the force majeure clause” returns the content in the chat reply text. Best for: AI agents editing documents larger than ~20 pages where context window pressure matters.
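The token arithmetic behind these figures can be checked with a few lines. This is only a back-of-the-envelope sketch using the approximate numbers quoted above; actual token counts depend on the document and the tokenizer.

```python
# Approximate figures quoted in the text above.
FULL_HTML_TOKENS_PER_TURN = 130_000      # ~100-page styled doc, default mode
COMPACT_TOKENS_PER_TURN = (500, 2_000)   # typical chunk_diffs payload range
TURNS = 5


def session_tokens(per_turn: int, turns: int = TURNS) -> int:
    """Total response tokens accumulated over an editing session."""
    return per_turn * turns


default_total = session_tokens(FULL_HTML_TOKENS_PER_TURN)
compact_low = session_tokens(COMPACT_TOKENS_PER_TURN[0])
compact_high = session_tokens(COMPACT_TOKENS_PER_TURN[1])

print(f"default: {default_total:,} tokens")                       # 650,000
print(f"compact: {compact_low:,} to {compact_high:,} tokens")     # 2,500 to 10,000
print(f"reduction: {default_total // compact_high}x to {default_total // compact_low}x")  # 65x to 260x
```

The ~3K total quoted above sits near the low end of the compact range, which corresponds to sessions where each turn changes only a chunk or two.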

Real-time progress via SSE

Subscribe to /v1/chat/{session_id}/stream to receive events while the agent works. Event types: intermediate (status updates), proposed_change (HITL diffs), document_sync (chunk-ID sync after upload), final (completed response), usage (operation count + tokens), and error. Auto-reconnect on drop. Best for: Frontends that show live progress, status bars, or partial results before the full answer is ready.
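A minimal sketch of consuming such a stream, assuming it follows the standard Server-Sent Events wire format (event: and data: lines, with a blank line ending each event) and that each data payload is JSON. The event names come from this page; the payload fields in the sample are hypothetical.

```python
import json
from typing import Iterable, Iterator

# Event names listed on this page.
KNOWN_EVENTS = {"intermediate", "proposed_change", "document_sync", "final", "usage", "error"}


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (event_name, data) pairs from Server-Sent Events lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                      # blank line ends one event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):          # comment / keep-alive line
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)


def collect_events(lines: Iterable[str]) -> list[tuple[str, dict]]:
    """Decode JSON payloads for the documented event types; skip anything else."""
    return [(name, json.loads(data)) for name, data in parse_sse(lines) if name in KNOWN_EVENTS]


# Hypothetical payloads, for illustration only.
sample = (
    "event: intermediate\n"
    'data: {"status": "searching document"}\n'
    "\n"
    ": keep-alive\n"
    "event: final\n"
    'data: {"reply": "Done"}\n'
    "\n"
)
events = collect_events(sample.splitlines())
print([name for name, _ in events])
```

Against the live endpoint, the same parser could be fed the decoded lines of a streaming HTTP response instead of the sample string.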

Multi-format I/O

Multi-format input and output

Input formats	Output formats
`.docx`, PDF, HTML, Markdown, RTF, plain text	`.docx`, PDF, HTML

Same parsing pipeline regardless of input — once a document is loaded into a session, every editing tool works identically. Export from any session in any of the three output formats. Best for: Format conversion workflows (HTML → styled .docx, Markdown → PDF), or accepting user uploads in whatever format they have.

Pre-signed URL upload and download

For files larger than ~100KB, the agent gets a 5-minute pre-signed PUT URL plus a ready-to-run curl example. The agent uploads the file directly to cloud storage from its shell, so the bytes never pass through the agent’s context window. The same pattern works in reverse for downloads (15-minute GET URL). For a 100-page styled .docx, this saves the agent ~70K tokens per upload (vs base64 inline). Combined with response_mode='compact' on chat, five turns of editing on the same document drop from ~700K tokens of context overhead to ~3K tokens. Max file size is 100 MB. Best for: Production AI agents working with real-world document sizes (multi-page contracts, manuals, regulatory filings).
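The size-based choice and the direct PUT can be sketched as follows. The thresholds come from the text above; interpreting them as binary units (KiB/MiB), the function names, and the placeholder URL are assumptions. The request is prepared but not sent.

```python
import urllib.request

INLINE_LIMIT_BYTES = 100 * 1024          # "~100KB" threshold (binary units assumed)
MAX_FILE_BYTES = 100 * 1024 * 1024       # "Max file size 100 MB" (binary units assumed)


def choose_upload_path(size_bytes: int) -> str:
    """Decide between inline upload and the pre-signed URL flow."""
    if size_bytes > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 100 MB limit")
    return "presigned" if size_bytes > INLINE_LIMIT_BYTES else "inline"


def build_put_request(presigned_url: str, payload: bytes) -> urllib.request.Request:
    """Prepare (but do not send) the PUT that ships bytes straight to storage."""
    return urllib.request.Request(presigned_url, data=payload, method="PUT")


print(choose_upload_path(40_000))        # inline
print(choose_upload_path(5_000_000))     # presigned

# Placeholder URL; a real one would come from the upload-URL request
# and expire after about five minutes.
req = build_put_request("https://storage.example.invalid/upload?sig=abc", b"%PDF-...")
print(req.get_method(), req.full_url)
```

Sending would be a single `urllib.request.urlopen(req)` call, made before the URL expires.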

Developer & integration

MCP server — 21 tools, every major client

The same capabilities are exposed as a Model Context Protocol server at https://api.superdocs.app/mcp (Streamable HTTP transport). Compatible clients render the 21 tools natively:

Client	Status
Claude Code	Tools + Prompts ✓
Claude Desktop	Tools + Prompts ✓
Cursor	Tools + Prompts ✓
VS Code (GitHub Copilot)	Tools + Prompts ✓
Zed, Continue, Amazon Q CLI, and others	Tools + Prompts ✓
Windsurf, Cline	Tools only

The same MCP server also exposes 4 user-invocable workflow templates (surfaced as /superdocs:edit_styled_docx, /superdocs:convert_format, etc. in clients that render MCP prompts as slash commands). One MCP server entry in your client config; both tools and prompts come together. Best for: AI-coding-tool users who want SuperDocs available in their editor without writing API integration code.
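The “one MCP server entry” might look like the fragment generated below. The mcpServers / url layout is the one some clients (Cursor, for example) read from their JSON config; other clients use a different file or a CLI command, and authentication setup is not shown. Treat the layout and the superdocs entry name as assumptions and check your client’s documentation.

```python
import json

MCP_URL = "https://api.superdocs.app/mcp"  # endpoint documented on this page


def mcp_client_config(server_name: str = "superdocs") -> str:
    """Return a JSON config fragment with a single remote MCP server entry."""
    config = {"mcpServers": {server_name: {"url": MCP_URL}}}
    return json.dumps(config, indent=2)


print(mcp_client_config())
```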

Three authentication paths

Method	Best for
Web app login	The web app at `use.superdocs.app` (auto-issued via Google Sign-In or email/password)
User API key (`sk_…`)	Individual developers, MCP integrations, scripts
Organization API key (`lce_…`)	B2B integrations with shared usage limits

All three reach the same 21 MCP tools and 22 REST endpoints with identical scoping. Users own and manage their own keys; orgs manage theirs separately.
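A sketch of telling the two key types apart and attaching one to a request. The sk_ and lce_ prefixes come from the table above; the Authorization: Bearer header scheme and the /v1/example path are assumptions for illustration, so confirm both against the API Reference. The request is built but not sent.

```python
import json
import urllib.request

API_BASE = "https://api.superdocs.app"


def key_kind(api_key: str) -> str:
    """Classify a key by its documented prefix."""
    if api_key.startswith("sk_"):
        return "user"
    if api_key.startswith("lce_"):
        return "organization"
    raise ValueError("expected a key starting with 'sk_' or 'lce_'")


def build_request(api_key: str, path: str, body: dict) -> urllib.request.Request:
    """Prepare a JSON POST carrying the key (header scheme is an assumption)."""
    key_kind(api_key)  # fail early on an unrecognized key format
    return urllib.request.Request(
        API_BASE + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("sk_example", "/v1/example", {"message": "hello"})
print(key_kind("sk_example"), key_kind("lce_example"))
print(req.get_method(), req.full_url)
```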

REST API works with any programming language

One REST API (22 endpoints) works with any language that can make HTTP requests. No npm packages to keep current, no language-specific SDKs to maintain, no framework dependencies to upgrade. Integrate with a few lines of code in whatever language your backend already speaks. The full OpenAPI specification is published, so you can generate your own typed client with openapi-generator, Stainless, or any other codegen tool if you want type hints in your editor. Best for: Polyglot engineering teams, backend services across multiple languages, avoiding SDK lock-in, simple integrations that don’t warrant a full SDK.

Real-time usage tracking and transparency

Every API response includes usage data — operation count consumed, tokens used. The SSE stream emits a usage event after each operation. Your dashboard shows remaining operations in your current tier and resets on your billing cycle. Promo allowances deplete before paid-tier operations so you always burn the cheapest credits first. Full transparency at every step — no hidden costs, no monthly surprises, no need to call sales for usage data. Best for: Cost-conscious integrations, monitoring spend, forecasting overage, teams on a budget, self-service deployments at scale.
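The promo-first depletion order can be modeled in a few lines. This is an illustrative model of the rule stated above, not the billing implementation; the Balance type and consume function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Balance:
    promo_ops: int   # operations granted by promo codes
    paid_ops: int    # operations remaining in the paid tier


def consume(balance: Balance, ops: int) -> Balance:
    """Spend operations, drawing down the promo allowance before paid operations."""
    if ops < 0:
        raise ValueError("ops must be non-negative")
    if ops > balance.promo_ops + balance.paid_ops:
        raise ValueError("not enough operations remaining")
    from_promo = min(ops, balance.promo_ops)
    from_paid = ops - from_promo
    return Balance(balance.promo_ops - from_promo, balance.paid_ops - from_paid)


start = Balance(promo_ops=50, paid_ops=1_000)
print(consume(start, 20))   # Balance(promo_ops=30, paid_ops=1000)
print(consume(start, 60))   # Balance(promo_ops=0, paid_ops=990)
```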

Scale & operations

Sessions and persistence

Every conversation is identified by a session_id, a string the caller chooses. The full document state, conversation history, attachments, pending HITL changes, and AI working memory persist across calls and across server restarts. Reload an old session days later and the AI still has the full context. Best for: Long-running document workflows, async editing where the user comes back hours later, multi-turn editing where state carries between turns.

Async jobs with HITL state

Long-running edits and HITL workflows return a job_id; the client polls or subscribes to SSE updates. State persists in a database, so any backend instance can pick up a job mid-flight (autoscaling, restarts, redeploys all safe). Approved changes resume automatically. Best for: Any workflow that takes more than 30 seconds or needs human approval mid-flight.
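A minimal polling sketch for the job_id flow. The status-fetching call is injected so the loop stays independent of any particular endpoint; the status names (running, completed, failed) and the response shape are hypothetical, so consult the API Reference for the real values.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}   # hypothetical status names


def wait_for_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout_s}s")
        sleep(interval_s)


# Stand-in for a real HTTP call, so the sketch runs offline.
fake_responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "completed", "result": "ok"},
])
job = wait_for_job(lambda _job_id: next(fake_responses), "job-123", sleep=lambda _s: None)
print(job)
```

Because job state is persisted server-side, a client that restarts mid-poll can resume with the same job_id. Subscribing to SSE avoids the polling interval entirely.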

Per-organization feature flags

B2B deployments can toggle specific features on or off per organization. Offer one platform integration to all customers but let Enterprise org A use a custom branding skin while Startup org B sticks with the default. Different orgs can have different rate limits, feature sets, or experimental rollouts, all from the same codebase and all controlled by API. Useful for staged feature rollouts, customer-specific customization, and enterprise tier differentiation. Best for: Multi-tenant B2B platforms serving different customer tiers, gradual feature rollouts to specific orgs, enterprise customization without per-customer deployments.

Promo codes and credit allowances

Issue promo codes that grant temporary operation allowances (“LAUNCH50” = 50 ops valid for 30 days, max 200 redemptions). Users redeem in Settings. Promo operations deplete before paid-tier operations so users get the most out of their allowance. Every redemption is tracked and auditable. Useful for go-to-market campaigns, partner enablement, customer pilots, and time-limited free trial extensions. Best for: Product launches, partner programs, customer pilots, trials, conferences, hackathons.

Multi-language editing

Natural-language instructions and document content both work across many languages — production users have edited documents in English, Spanish, French, Hebrew, Korean, Mandarin, and others (16+ languages confirmed in real usage so far). Write your prompt in one language, edit a document in another, get the AI’s reply in whichever language you wrote the request. Multilingual documents (e.g., bilingual contracts) handled correctly. Tone and formality conventions adapted per language. Best for: International teams, multilingual document workflows, organizations serving non-English markets, contract translation and adaptation, cross-border legal work.

Build vertical AI on SuperDocs

Combine attachments (your domain knowledge), sessions (long-running workflows), and chat instructions to build domain-specific applications on top of SuperDocs. Contract AI: attach your standard clause library + draft instructions, get an AI that writes contracts in your house style. Compliance AI: attach your regulations + policy templates, get an AI that audits documents for compliance gaps. Marketing AI: attach your brand voice guides + past collateral, get an AI that produces on-brand content. Same platform, different domain — all configurable per session or per organization. Best for: Vertical SaaS platforms, agencies serving specific industries, organizations with strong domain languages, anyone building specialized document workflows.

On the roadmap

Time Travel & Version History (Coming Soon)

Restore documents to any prior state and browse a timeline of every edit. Useful for “oops, undo that” moments, compliance audits, and exploring alternate editing paths without losing the current version. Audit trail of who edited what, when. The state-snapshot infrastructure already runs internally on every chat turn; the user-facing API and UI to browse and restore versions ships next. Best for: Multi-stakeholder workflows, regulatory compliance audits, exploring alternate edits without losing the current version, recovering from mistakes.

What ships when

A live timeline of major capabilities and when they shipped. Older capabilities don’t get less reliable over time — once shipped, they stay covered by the regression suite and the production monitoring stack.

Date	Capability	Why it matters
2026-04-25	Expanded features documentation (this page) — 30+ capabilities grouped by use case	Developers and AI agents form a complete mental model of what SuperDocs can do
2026-04-25	MCP server unified — 21 tools + 4 user-invocable workflow prompts on a single `/mcp` endpoint	Single MCP config entry covers both; discoverable slash commands for Cursor/Claude Code/Claude Desktop users
2026-04-25	Pre-signed URL upload/download flow (`request_upload_url`, `process_uploaded_document`, `request_download_url`)	Bytes no longer pass through agent context window; viable for real-world file sizes
2026-04-25	Compact response mode (`response_mode='compact'` + `chunk_diffs`)	~140× token reduction for editing sessions on large documents
2026-04-25	Capability-forward MCP tool descriptions across all 21 tools	Agents form correct mental models of when to use SuperDocs vs build from scratch
2026-04-23	MCP HTTP transport reliability fix	Eliminates the 5-minute hang that clients (Claude Code, Cursor, Bun) saw on first connect
2026-04-22	OAuth Protected Resource Metadata (RFC 9728) for MCP	Cursor 3.x / Claude Code 2.x / mcp-remote can now connect without a 60s metadata-probe timeout
2026-04-19	Editing latency improvements for large documents	4m55s → ~10s for typical edit operations on large documents
2026-04-18	Editing precision improvements for nuanced instructions	Eliminates over-broad edits when the user’s instruction was narrowly scoped
Earlier 2026	Async jobs + HITL durable state, SSE streaming, multimodal vision, multi-format export, MCP server, promo codes, billing	Foundation

For schemas, parameters, and code examples, see the API Reference and the MCP Tools Reference. For workflow guides, see the Guides section.
