
Model Selection

SuperDocs offers four model tiers and three thinking depths. All tiers are available on every plan.

Model tiers

Set model_tier in your request to choose a model:
| Tier | Best for | Speed | Thinking depth control |
| --- | --- | --- | --- |
| core | Everyday editing, quick tasks | Fast | Yes |
| turbo | Speed-critical workflows | Fastest | No — always optimized |
| pro | Complex analysis, multi-step edits | Moderate | No — always optimized |
| max | Challenging documents, nuanced tasks | Slower | Yes |

Default: core

Thinking depth

Set thinking_depth to control how much reasoning the AI applies:
| Depth | Behavior |
| --- | --- |
| fast | Quick responses, minimal reasoning |
| balanced | AI decides when to reason deeply (default) |
| deep | Extended reasoning for complex problems |

Default: balanced
Thinking depth control is available only for the core and max tiers. The turbo and pro tiers use their own optimized reasoning, so the thinking_depth parameter is ignored for those models; they always apply the reasoning depth best suited to the task.

Usage

# Core with custom thinking depth
curl -X POST https://api.superdocs.app/v1/chat \
  -H "Authorization: Bearer sk_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Analyze this contract for potential risks",
    "session_id": "my-session",
    "document_html": "...",
    "model_tier": "core",
    "thinking_depth": "deep"
  }'

# Pro — no thinking_depth needed, reasoning is always optimized
curl -X POST https://api.superdocs.app/v1/chat \
  -H "Authorization: Bearer sk_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Analyze this contract for potential risks",
    "session_id": "my-session",
    "document_html": "...",
    "model_tier": "pro"
  }'

Recommendations

| Task | Suggested tier | Suggested depth |
| --- | --- | --- |
| Fix a typo | core | fast |
| Rewrite a paragraph | core | balanced |
| Draft a multi-section document | core or max | deep |
| Analyze a complex contract | pro or max | deep (max only) |
| Batch processing many documents | turbo | — |
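If you route requests programmatically, the recommendations above can be encoded as a small lookup. A minimal sketch in Python; the task labels are illustrative, not an official taxonomy, and where the table lists two tiers this picks one:

```python
# Suggested tier/depth per task type, per the recommendations table.
# Task labels are illustrative; where the table says "core or max" /
# "pro or max" we pick one here.
RECOMMENDATIONS = {
    "fix_typo":          {"model_tier": "core", "thinking_depth": "fast"},
    "rewrite_paragraph": {"model_tier": "core", "thinking_depth": "balanced"},
    "draft_document":    {"model_tier": "max",  "thinking_depth": "deep"},   # or core
    "analyze_contract":  {"model_tier": "max",  "thinking_depth": "deep"},   # or pro
    "batch_processing":  {"model_tier": "turbo"},  # turbo ignores thinking_depth
}

def request_params(task: str) -> dict:
    """Return the model_tier/thinking_depth fields to merge into a chat request."""
    return dict(RECOMMENDATIONS[task])
```

Merge the returned fields into the JSON body shown in the Usage section above.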

Token cost expectations by operation type

Use this as a planning guide for how many output tokens — and roughly how much money — a given operation will spend.

Typical per-operation output token usage

| Operation | Sections involved | Expected output tokens | Approx. cost (default tier) |
| --- | --- | --- | --- |
| Single-section edit (typo, reword, format) | 1 | 5,000 – 20,000 | $0.001 – $0.004 |
| Multi-section edit (3–5 sections) | 3–5 | 20,000 – 80,000 | $0.004 – $0.016 |
| Full-document operation (rewrite, restructure) | 10–20 | 80,000 – 250,000 | $0.016 – $0.050 |
Anything materially above these ranges (e.g. a single-section edit using 100,000+ output tokens) suggests a runaway plan or an unusually large section. Inspect the operation in your usage dashboard if you see this — and feel free to flag it to us, since it’s also useful as a signal for our own quality monitoring.
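The runaway check described above is easy to automate against your own usage logs. A minimal sketch, assuming the thresholds from the table; the operation-type labels and the `slack` multiplier are our own, not API values:

```python
# Expected output-token ceilings per operation type, from the table above.
EXPECTED_MAX_OUTPUT_TOKENS = {
    "single_section": 20_000,
    "multi_section": 80_000,
    "full_document": 250_000,
}

def is_runaway(operation_type: str, output_tokens: int, slack: float = 1.5) -> bool:
    """Flag operations materially above the expected range, e.g. a
    single-section edit that used 100,000+ output tokens."""
    return output_tokens > EXPECTED_MAX_OUTPUT_TOKENS[operation_type] * slack

# is_runaway("single_section", 100_000) -> True (inspect it in the dashboard)
```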

Reasoning tokens are additive

Each thinking depth adds reasoning tokens on top of the operation’s output:
| Thinking depth | Approx. reasoning tokens added | Latency added |
| --- | --- | --- |
| fast | up to 2,000 | < 1 s |
| balanced (default) | dynamic — typically 4,000 – 8,000 | 1 – 3 s |
| deep | up to 16,000 | 3 – 8 s |
Reasoning tokens are billed at the model’s standard output rate, so on the default tier a deep reasoning addition contributes < $0.005 per request even at the upper end.
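As a back-of-envelope check of that "< $0.005" figure: the per-token output rate below is inferred from the cost table above ($0.004 per ~20,000 tokens on the default tier), not an official price:

```python
# ASSUMPTION: default-tier output rate inferred from the cost table above
# ($0.004 per ~20,000 output tokens); not a published price.
DEFAULT_TIER_RATE_PER_TOKEN = 0.004 / 20_000

def reasoning_cost(reasoning_tokens: int) -> float:
    """Reasoning tokens are billed at the model's standard output rate."""
    return reasoning_tokens * DEFAULT_TIER_RATE_PER_TOKEN

# Even at deep's 16,000-token ceiling: 16_000 * rate ≈ $0.0032, under $0.005.
```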

Picking a tier for batch jobs

If you’re processing 100+ documents in a batch:
  • Cost-first: model_tier: "turbo". Fastest, lowest cost, slight precision tradeoff. Good for analytics-style passes (extract, classify, summarize).
  • Balanced: model_tier: "core" with thinking_depth: "fast". Solid precision, moderate speed and cost. Good default for most batch flows.
  • High-stakes: model_tier: "pro" or "max". Use when the cost of a wrong edit (lawyer review hours, regulatory exposure, reputational damage) far exceeds the per-document token cost — i.e. always for legal, regulatory, medical, or financial documents.
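The three strategies above map directly to request fields. A minimal sketch; the strategy names are our labels, not API values, and "high-stakes" picks max where the text allows pro or max:

```python
# Map batch strategies to SuperDocs request fields.
# Strategy names are our own labels, not API values.
def batch_params(strategy: str) -> dict:
    if strategy == "cost_first":
        return {"model_tier": "turbo"}  # turbo ignores thinking_depth
    if strategy == "balanced":
        return {"model_tier": "core", "thinking_depth": "fast"}
    if strategy == "high_stakes":
        return {"model_tier": "max"}   # or "pro"; both use optimized reasoning
    raise ValueError(f"unknown strategy: {strategy}")
```

Merge the returned fields into each document's request body before POSTing to /v1/chat.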

Choosing for precision

If the AI’s output isn’t what you wanted, the right tier change is usually obvious once you name the symptom:
| Symptom | What to try |
| --- | --- |
| AI edited sections you didn’t ask about | Switch to model_tier: "pro" or "max". Also verify your prompt is single-section explicit (e.g. “edit Section 3” rather than “fix the document”). Vague prompts invite wide edits. |
| Edits miss nuance in legal / regulatory / medical language | model_tier: "max" with thinking_depth: "deep". Max is the most capable model and deep gives it room to reason carefully about wording. |
| Too slow | model_tier: "turbo" (loses some precision but is the fastest). Or stay on core and pass thinking_depth: "fast" (smaller cost, smaller precision drop). |
| Want a balance — don’t know which to pick | Stay on the default: core + balanced. The model picks reasoning effort dynamically per request. Most edits won’t need deep reasoning; complex ones will. |
| Want maximum precision, cost is no object | model_tier: "max" + thinking_depth: "deep". Most expensive, most accurate. |

What balanced actually does

thinking_depth: "balanced" is dynamic — the model decides how much reasoning to apply based on the prompt’s complexity. Most everyday edits trigger minimal reasoning (cheap and fast); complex multi-step edits trigger more (slower, more expensive). This is the recommended default for general editing, and it’s what core + balanced gives you out of the box.

fast (2,048-token reasoning ceiling) is faster but can misjudge scope: sometimes the AI picks a wider scope than the prompt warranted because it didn’t have the budget to reason carefully about scope. Use fast when you’ve verified your prompts are unambiguous and you want the speed.

deep (16,384-token reasoning ceiling) gives the model room to reason carefully. Use it for high-stakes edits where one bad output is more expensive than the extra tokens.

When to use Deep — and what it costs

Deep reasoning uses roughly 6× the output tokens of Balanced for a typical 1,000-token edit. On core that translates to roughly $0.04 vs $0.007 per edit; on max it’s higher, roughly $0.05 vs $0.009. The wall-clock latency increase is typically 3–8 seconds. That cost is well worth it when:
  • The document is high-stakes (a contract clause, a regulatory disclosure, a medical instruction)
  • Output errors are expensive to fix downstream (executive review, lawyer hours)
  • You’re running few-but-important edits, not high-volume batch work
It is NOT worth it when:
  • You’re doing fast iterative editing (a writer’s draft, a Slack-formatting cleanup)
  • You’re processing high volumes (use turbo instead)
  • The user is sitting in front of the screen waiting (latency dominates UX)
For legal, regulatory, medical, financial, or compliance documents — default to Pro or Max. The cost of one wrong edit on a contract clause or a HIPAA-relevant medical instruction massively exceeds the cost of running a more capable tier. Don’t optimise for token cost on documents where the human cost of a mistake is in lawyer hours, regulatory exposure, or patient harm.
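The decision rule above can be sketched as a simple guard. The function name and category labels are hypothetical, for illustration only:

```python
# Document categories where a wrong edit costs far more than extra tokens
# (lawyer hours, regulatory exposure, patient harm).
HIGH_STAKES_CATEGORIES = {"legal", "regulatory", "medical", "financial", "compliance"}

def pick_tier_and_depth(category: str, interactive: bool, batch: bool) -> dict:
    """Apply the guidance above. Category labels are illustrative."""
    if category in HIGH_STAKES_CATEGORIES:
        # Default to a capable tier regardless; use deep only when latency
        # and volume allow it.
        if not interactive and not batch:
            return {"model_tier": "max", "thinking_depth": "deep"}
        return {"model_tier": "pro"}  # optimized reasoning, no depth knob
    if batch:
        return {"model_tier": "turbo"}
    return {"model_tier": "core", "thinking_depth": "balanced"}
```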

Default model preference

You can set a default model tier in your account preferences (via the web app). Per-request model_tier overrides your default.