Router

Router is PilotDeck's intelligent model routing engine. It chooses the most appropriate and cost-effective model based on task complexity, while providing multi-provider fallback to keep the service available.

Core Capabilities

TokenSaver Tiering

Analyze request complexity and automatically choose the right model tier

Fallback Chain

Switch to backup providers when the primary provider is unavailable

Scenario Routing

Apply different routing policies for main agents and subagents

Custom Router

Plug in your own routing logic through custom router extensions

Decision Flow

User request
            │
            ▼
┌─────────────────────────┐
│ 1. Custom Router        │ ← if configured
└───────────┬─────────────┘
            │
            ▼
┌─────────────────────────┐
│ 2. Scenario Decision    │ ← explicit/default/subagent
└───────────┬─────────────┘
            │
            ▼
┌─────────────────────────┐
│ 3. TokenSaver Tiering   │ ← choose model tier by complexity
└───────────┬─────────────┘
            │
            ▼
┌─────────────────────────┐
│ 4. Session Sticky       │ ← keep model stable in a session
└───────────┬─────────────┘
            │
            ▼
RouterDecision
{ provider, model, tier }

TokenSaver

TokenSaver is Router's primary cost-saving capability. It classifies user requests into model tiers.

How It Works

  1. Judge: use a lightweight judge model to classify request complexity
  2. Tier match: choose the model tier based on the classification
  3. Sticky binding: keep later requests in the same session on the same tier unless explicitly changed
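
The three steps above can be sketched in TypeScript as follows. This is a minimal illustration, not PilotDeck's implementation: `createTokenSaver`, `Judge`, `TierTable`, and the keyword-based `toyJudge` are hypothetical names introduced here, and a real judge would be a call to a lightweight model rather than a regular expression. Only the tier names and the `{ provider, model, tier }` decision shape come from this page.

```typescript
type Tier = "high" | "medium" | "low";

interface RouterDecision {
  provider: string;
  model: string;
  tier: Tier;
}

// Hypothetical shapes: a tier table and a judge that maps a prompt to a tier.
type TierTable = Record<Tier, { provider: string; model: string }>;
type Judge = (prompt: string) => Tier;

function createTokenSaver(tiers: TierTable, judge: Judge) {
  // Sticky binding state: sessionId -> tier chosen for that session.
  const bound = new Map<string, Tier>();

  return function decide(sessionId: string, prompt: string, explicitTier?: Tier): RouterDecision {
    // An explicit tier wins, then the session's bound tier, then the judge.
    const tier = explicitTier ?? bound.get(sessionId) ?? judge(prompt);
    bound.set(sessionId, tier);
    // Tier match: look up the provider and model configured for the tier.
    return { provider: tiers[tier].provider, model: tiers[tier].model, tier };
  };
}

const tiers: TierTable = {
  high: { provider: "anthropic-main", model: "claude-sonnet-4-5" },
  medium: { provider: "anthropic-main", model: "claude-sonnet-4-20250514" },
  low: { provider: "anthropic-main", model: "claude-haiku" },
};

// Toy stand-in for the judge model (assumption for illustration only).
const toyJudge: Judge = (prompt) =>
  /refactor|architecture/i.test(prompt) ? "high" : prompt.length > 80 ? "medium" : "low";

const decide = createTokenSaver(tiers, toyJudge);
const first = decide("s1", "Refactor the auth module across files");
const second = decide("s1", "What does this file do?");
const third = decide("s1", "What does this file do?", "low");

console.log(first.tier);  // "high": classified by the judge
console.log(second.tier); // "high": the session stays on its bound tier
console.log(third.tier);  // "low": explicitly changed
```

In this sketch the judge runs only for the first request of a session; whether Router re-judges later requests is not stated on this page.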

Example Tiers

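| Tier   | Use case                                  | Example model     |
|--------|-------------------------------------------|-------------------|
| high   | Architecture design, multi-file refactors | Claude Sonnet 4.5 |
| medium | Normal coding and bug fixes               | Claude Sonnet     |
| low    | Simple Q&A and file reading               | Claude Haiku      |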

Configure TokenSaver

router:
  tokenSaver:
    enabled: true
    subagent:
      policy: judge # skip | judge | always-low
    tiers:
      high:
        provider: anthropic-main
        model: claude-sonnet-4-5
      medium:
        provider: anthropic-main
        model: claude-sonnet-4-20250514
      low:
        provider: anthropic-main
        model: claude-haiku

Fallback Chain

When the primary provider returns an error such as timeout, rate limit, server error, or network error, Router switches to the next provider in the fallback chain.

router:
  fallback:
    default:
      - provider: openai-backup
        model: gpt-4o
      - provider: deepseek-backup
        model: deepseek-chat
Retry Strategies

Retry transient errors on the same provider with exponential backoff:

router:
  transientRetry:
    enabled: true
    maxAttempts: 5
    baseDelayMs: 1000
    maxDelayMs: 30000
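
As a worked example of exponential backoff with these settings, the TypeScript sketch below computes the wait before each retry. Two things are assumptions rather than documented behavior: the doubling formula `baseDelayMs * 2^(retry - 1)` capped at `maxDelayMs`, and the reading of `maxAttempts` as including the first attempt. The function names `backoffDelayMs` and `retrySchedule` are hypothetical.

```typescript
interface TransientRetryConfig {
  enabled: boolean;
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
}

// Wait before retry number `retry` (1-based): base * 2^(retry - 1), capped at maxDelayMs.
function backoffDelayMs(cfg: TransientRetryConfig, retry: number): number {
  return Math.min(cfg.maxDelayMs, cfg.baseDelayMs * Math.pow(2, retry - 1));
}

// Full list of waits for one request. Assumes maxAttempts counts the first
// attempt, so there are at most maxAttempts - 1 waits.
function retrySchedule(cfg: TransientRetryConfig): number[] {
  if (!cfg.enabled) return [];
  const waits: number[] = [];
  for (let retry = 1; retry < cfg.maxAttempts; retry++) {
    waits.push(backoffDelayMs(cfg, retry));
  }
  return waits;
}

const retryConfig: TransientRetryConfig = {
  enabled: true,
  maxAttempts: 5,
  baseDelayMs: 1000,
  maxDelayMs: 30000,
};

console.log(retrySchedule(retryConfig)); // [1000, 2000, 4000, 8000]
console.log(retrySchedule({ ...retryConfig, maxAttempts: 8 }));
// [1000, 2000, 4000, 8000, 16000, 30000, 30000]
```

Under these assumptions the 30-second cap only comes into play from the sixth retry onward, so it matters only when `maxAttempts` is raised above the value shown. Backoff implementations often add random jitter to the delay; this page does not say whether Router does.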

Scenario Routing

| Scenario | Description                                            |
|----------|--------------------------------------------------------|
| default  | Default flow using agent.model                         |
| explicit | A user or system explicitly selects the model          |
| subagent | Subagent calls that may use a dedicated routing policy |

Subagent behavior is controlled by tokenSaver.subagent.policy:

  • skip: bypass TokenSaver and use the default model
  • judge: classify subagent requests with the judge model
  • always-low: always use the lowest tier for subagents
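
The three policies can be read as a small dispatch, sketched here in TypeScript. `subagentTier` is a hypothetical helper, not a PilotDeck function; it returns `undefined` to signal that TokenSaver is bypassed and the default model applies, and it takes the judge as a plain function for illustration.

```typescript
type Tier = "high" | "medium" | "low";
type SubagentPolicy = "skip" | "judge" | "always-low";

// Returns the tier for a subagent request, or undefined when TokenSaver
// is bypassed and the caller should fall back to the default model.
function subagentTier(
  policy: SubagentPolicy,
  prompt: string,
  judge: (prompt: string) => Tier,
): Tier | undefined {
  switch (policy) {
    case "skip":
      return undefined;
    case "judge":
      return judge(prompt);
    case "always-low":
      return "low";
  }
}

// Toy judge used only for this demo.
const demoJudge = (prompt: string): Tier => (prompt.length > 40 ? "medium" : "low");

console.log(subagentTier("skip", "List files in src/", demoJudge));       // undefined
console.log(subagentTier("judge", "List files in src/", demoJudge));      // "low"
console.log(subagentTier("always-low", "Redesign the schema", demoJudge)); // "low"
```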

Auto-Orchestrate

When TokenSaver is enabled, Router can optimize the system prompt and tool list for lower-tier models:

router:
  autoOrchestrate:
    enabled: true
    skillExtensionId: "my-skill-extension"
    subagentMaxTokens: 50000

Stats and Events

Router records model choices and token usage:

const stats = router.stats;
// Includes: sessionId, provider, model, tier, role, usage, timestamps
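
One use for these records is checking how much traffic each tier handles. The TypeScript sketch below totals tokens per tier over an array of records. The record shape is an assumption: this page lists the field names but not their types, so `RouterStatRecord`, the `inputTokens` and `outputTokens` fields inside `usage`, and the idea that records arrive as an array are all hypothetical. Timestamps are omitted because their format is not specified here.

```typescript
// Hypothetical record shape based on the field names listed above.
interface RouterStatRecord {
  sessionId: string;
  provider: string;
  model: string;
  tier: "high" | "medium" | "low";
  role: string;
  usage: { inputTokens: number; outputTokens: number }; // assumed field names
}

// Sum input and output tokens for each tier.
function tokensByTier(records: RouterStatRecord[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const record of records) {
    const used = record.usage.inputTokens + record.usage.outputTokens;
    totals[record.tier] = (totals[record.tier] ?? 0) + used;
  }
  return totals;
}

// Made-up sample data for the demo.
const sampleRecords: RouterStatRecord[] = [
  { sessionId: "s1", provider: "anthropic-main", model: "claude-sonnet-4-5", tier: "high", role: "main", usage: { inputTokens: 1200, outputTokens: 800 } },
  { sessionId: "s1", provider: "anthropic-main", model: "claude-haiku", tier: "low", role: "subagent", usage: { inputTokens: 300, outputTokens: 100 } },
  { sessionId: "s2", provider: "anthropic-main", model: "claude-haiku", tier: "low", role: "main", usage: { inputTokens: 150, outputTokens: 50 } },
];

console.log(tokensByTier(sampleRecords)); // { high: 2000, low: 600 }
```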

It also emits events such as pilotdeck_router_decision, pilotdeck_router_fallback, pilotdeck_router_zero_usage_retry, and pilotdeck_router_execute_failed for monitoring and debugging.