Runtime - CopilotKit

What is CopilotRuntime?

CopilotRuntime is the server-side orchestrator that receives HTTP requests from the frontend and delegates them to agents for execution. It’s the bridge between your frontend application and your AI agents.

CopilotRuntime can be deployed as a standalone microservice or embedded in your existing Node.js server.

Core Concepts

CopilotRuntime

Basic Setup

import { CopilotRuntime } from "@copilotkitnext/runtime";
import { BuiltInAgent } from "@copilotkitnext/agent";
import { openai } from "@ai-sdk/openai";

const runtime = new CopilotRuntime({
  agents: {
    "assistant": new BuiltInAgent({
      agentId: "assistant",
      model: openai("gpt-4")
    })
  }
});

Reference: packages/v2/runtime/src/runtime.ts:57

Configuration Options

interface CopilotRuntimeOptions {
  // Map of available agents
  agents: MaybePromise<Record<string, AbstractAgent>>;
  
  // Agent runner (defaults to InMemoryAgentRunner)
  runner?: AgentRunner;
  
  // Transcription service for audio
  transcriptionService?: TranscriptionService;
  
  // Before request middleware
  beforeRequestMiddleware?: BeforeRequestMiddleware;
  
  // After request middleware
  afterRequestMiddleware?: AfterRequestMiddleware;
  
  // A2UI middleware config
  a2ui?: A2UIMiddlewareConfig;
  
  // MCP Apps config
  mcpApps?: McpAppsConfig;
}

Reference: packages/v2/runtime/src/runtime.ts:41

Lazy Agent Loading

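Agents can be supplied as a promise, so the runtime can be constructed before agent initialization has finished:

// loadModel() is a placeholder for your own asynchronous setup
async function loadAgents() {
  const model = await loadModel();
  return {
    "assistant": new BuiltInAgent({
      agentId: "assistant",
      model
    })
  };
}

const runtime = new CopilotRuntime({
  // agents accepts a record or a promise of one (MaybePromise), not a function
  agents: loadAgents()
});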

Lazy loading is useful when agents have expensive initialization (loading models, connecting to databases, etc.).
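
When several code paths may trigger the same expensive load, one option is to memoize the loader so that initialization runs only once. The sketch below is illustrative: `memoizeAsync` and the stand-in agent map are hypothetical names, not part of CopilotKit.

```typescript
// Hypothetical helper: wraps an async loader so that it runs at most once.
function memoizeAsync<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = load();
    }
    return cached;
  };
}

// Stand-in for an expensive agent map; a real loader would construct agents here.
let loadCount = 0;
const loadAgents = memoizeAsync(async () => {
  loadCount += 1;
  return { assistant: { agentId: "assistant" } };
});

// Both calls share one in-flight promise, so the loader body runs once.
const firstCall = loadAgents();
const secondCall = loadAgents();
```

Note that this version also caches a rejected promise. A production loader would probably clear the cache on failure so that a later call can retry.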

Server Integration

Express Adapter

import express from "express";
import { copilotRuntimeExpressAdapter } from "@copilotkitnext/runtime";

const app = express();

app.use("/copilotkit", copilotRuntimeExpressAdapter({
  runtime
}));

app.listen(4000);

This creates the following endpoints:

GET /copilotkit/info - Runtime and agent information
POST /copilotkit/agent/:agentId/run - Execute an agent
POST /copilotkit/agent/:agentId/connect - Reconnect to thread
POST /copilotkit/agent/:agentId/stop/:threadId - Stop agent
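
To make the URL layout concrete, here is a small matcher for the four endpoints listed above. It is a sketch for illustration only: `matchRoute` and `RuntimeRoute` are hypothetical names, and the adapter's real routing code may differ.

```typescript
type RuntimeRoute =
  | { kind: "info" }
  | { kind: "run"; agentId: string }
  | { kind: "connect"; agentId: string }
  | { kind: "stop"; agentId: string; threadId: string };

// Maps an HTTP method and path to one of the four endpoints, or null if none matches.
function matchRoute(
  method: string,
  path: string,
  basePath: string = "/copilotkit",
): RuntimeRoute | null {
  if (!path.startsWith(basePath + "/")) return null;
  const [head, agentId, action, threadId, ...rest] = path
    .slice(basePath.length + 1)
    .split("/");

  if (method === "GET") {
    return head === "info" && agentId === undefined ? { kind: "info" } : null;
  }
  if (method !== "POST" || head !== "agent" || !agentId || rest.length > 0) {
    return null;
  }

  if (action === "run" && threadId === undefined) return { kind: "run", agentId };
  if (action === "connect" && threadId === undefined) return { kind: "connect", agentId };
  if (action === "stop" && threadId) return { kind: "stop", agentId, threadId };
  return null;
}

const runRoute = matchRoute("POST", "/copilotkit/agent/assistant/run");
const stopRoute = matchRoute("POST", "/copilotkit/agent/assistant/stop/thread_123");
```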

Hono Adapter

import { Hono } from "hono";
import { copilotRuntimeHonoAdapter } from "@copilotkitnext/runtime";

const app = new Hono();

app.route("/copilotkit", copilotRuntimeHonoAdapter({
  runtime
}));

export default app;

Next.js API Route

// app/api/copilotkit/[...copilotkit]/route.ts
import {
  CopilotRuntime,
  copilotRuntimeNextJSAppRouterAdapter
} from "@copilotkitnext/runtime";

const runtime = new CopilotRuntime({
  agents: { /* ... */ }
});

export const POST = copilotRuntimeNextJSAppRouterAdapter(runtime);
export const GET = copilotRuntimeNextJSAppRouterAdapter(runtime);

AgentRunner

AgentRunner is an abstract class responsible for managing thread state (conversation history, agent state) and executing agents. It’s the persistence layer for agent conversations.

abstract class AgentRunner {
  // Execute an agent run
  abstract run(request: AgentRunnerRunRequest): Observable<BaseEvent>;
  
  // Reconnect to an existing thread
  abstract connect(request: AgentRunnerConnectRequest): Observable<BaseEvent>;
  
  // Check if a thread is currently running
  abstract isRunning(request: AgentRunnerIsRunningRequest): Promise<boolean>;
  
  // Stop a running thread
  abstract stop(request: AgentRunnerStopRequest): Promise<boolean | undefined>;
}

Reference: packages/v2/runtime/src/runner/agent-runner.ts:23

InMemoryAgentRunner

The default runner, which stores thread state in process memory. It suits development and deployments that do not need conversation history to survive a restart:

import { InMemoryAgentRunner } from "@copilotkitnext/runtime";

const runner = new InMemoryAgentRunner();

const runtime = new CopilotRuntime({
  agents: { /* ... */ },
  runner
});

Characteristics:

Ephemeral - State is lost on server restart
Fast - No I/O overhead
Hot-reload friendly - Survives hot reloads in development (via global state)
Concurrent - Handles multiple threads simultaneously

Reference: packages/v2/runtime/src/runner/in-memory.ts:100

State Management

// Global store per thread
const store = {
  threadId: "thread_123",
  subject: ReplaySubject<BaseEvent>,
  isRunning: false,
  currentRunId: "run_456",
  historicRuns: [
    {
      threadId: "thread_123",
      runId: "run_456",
      parentRunId: null,
      events: [...],
      createdAt: 1234567890
    }
  ],
  agent: AbstractAgent,
  stopRequested: false
};

Reference: packages/v2/runtime/src/runner/in-memory.ts:19

Event Replay

When reconnecting, InMemoryAgentRunner replays all historic events:

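// Simplified excerpt
connect(request: AgentRunnerConnectRequest): Observable<BaseEvent> {
  const store = GLOBAL_STORE.get(request.threadId);
  const connectionSubject = new ReplaySubject<BaseEvent>(Infinity);

  // Unknown thread: nothing to replay
  if (!store) {
    connectionSubject.complete();
    return connectionSubject.asObservable();
  }

  // Collect all historic events
  const allHistoricEvents: BaseEvent[] = [];
  for (const run of store.historicRuns) {
    allHistoricEvents.push(...run.events);
  }

  // Compact and emit
  const compactedEvents = compactEvents(allHistoricEvents);
  for (const event of compactedEvents) {
    connectionSubject.next(event);
  }

  // Bridge to the active run if one exists; otherwise the replay is the whole stream
  if (store.subject && store.isRunning) {
    store.subject.subscribe(connectionSubject);
  } else {
    connectionSubject.complete();
  }

  return connectionSubject.asObservable();
}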

Reference: packages/v2/runtime/src/runner/in-memory.ts:294
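
The replay-then-live behavior can be illustrated without RxJS. The class below is a minimal stand-in written for this page, not the runner's actual implementation: it buffers every event, replays the buffer to a late subscriber, and then forwards live events.

```typescript
type Listener<T> = (event: T) => void;

// Minimal stand-in for a replaying subject.
class ReplayStream<T> {
  private readonly buffer: T[] = [];
  private listeners: Listener<T>[] = [];

  next(event: T): void {
    this.buffer.push(event);
    this.listeners.forEach((listener) => listener(event));
  }

  subscribe(listener: Listener<T>): () => void {
    this.buffer.forEach((event) => listener(event)); // replay history first
    this.listeners.push(listener); // then forward live events
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }
}

const thread = new ReplayStream<string>();
thread.next("RUN_STARTED");
thread.next("TEXT_MESSAGE_CONTENT");

// A client that connects late still sees the two earlier events.
const seenByLateClient: string[] = [];
const unsubscribe = thread.subscribe((event) => {
  seenByLateClient.push(event);
});
thread.next("RUN_FINISHED"); // delivered live, after the replay
unsubscribe();
thread.next("IGNORED"); // not delivered: the client has disconnected
```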

SQLiteAgentRunner

Persistent runner that stores thread state in SQLite. Use in production for conversation persistence:

import { SQLiteAgentRunner } from "@copilotkitnext/sqlite-runner";

const runner = new SQLiteAgentRunner({
  dbPath: "./copilot.db"
});

const runtime = new CopilotRuntime({
  agents: { /* ... */ },
  runner
});

Characteristics:

Persistent - Survives server restarts
Scalable - Can handle large conversation histories
Queryable - SQL access to conversation data
Transactional - ACID guarantees for state updates

Reference: packages/v2/sqlite-runner/src/sqlite-runner.ts

Schema

CREATE TABLE runs (
  id TEXT PRIMARY KEY,
  thread_id TEXT NOT NULL,
  parent_run_id TEXT,
  created_at INTEGER NOT NULL
);

CREATE TABLE events (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  run_id TEXT NOT NULL,
  event_type TEXT NOT NULL,
  event_data TEXT NOT NULL,
  created_at INTEGER NOT NULL,
  FOREIGN KEY (run_id) REFERENCES runs(id)
);

CREATE INDEX idx_events_run_id ON events(run_id);
CREATE INDEX idx_runs_thread_id ON runs(thread_id);
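
Given this schema, rebuilding a thread's event stream means selecting its runs in creation order and, within each run, its events in insertion order. The sketch below does this over in-memory rows so the ordering logic is easy to follow. `RunRow`, `EventRow`, and `loadThreadEvents` are hypothetical names, and a real runner would issue the equivalent SQL query instead.

```typescript
// Row shapes mirroring the two tables above.
interface RunRow {
  id: string;
  thread_id: string;
  parent_run_id: string | null;
  created_at: number;
}

interface EventRow {
  id: number;
  run_id: string;
  event_type: string;
  event_data: string;
  created_at: number;
}

// Returns the parsed events of one thread, ordered by run creation time, then event id.
function loadThreadEvents(runs: RunRow[], events: EventRow[], threadId: string): unknown[] {
  const threadRuns = runs
    .filter((run) => run.thread_id === threadId)
    .sort((a, b) => a.created_at - b.created_at);

  const result: unknown[] = [];
  for (const run of threadRuns) {
    const runEvents = events
      .filter((event) => event.run_id === run.id)
      .sort((a, b) => a.id - b.id);
    for (const event of runEvents) {
      result.push(JSON.parse(event.event_data));
    }
  }
  return result;
}

// Rows deliberately listed out of order to show that sorting restores the sequence.
const sampleRuns: RunRow[] = [
  { id: "run_2", thread_id: "thread_123", parent_run_id: "run_1", created_at: 200 },
  { id: "run_1", thread_id: "thread_123", parent_run_id: null, created_at: 100 },
];
const sampleEvents: EventRow[] = [
  { id: 3, run_id: "run_2", event_type: "RUN_STARTED", event_data: '{"type":"RUN_STARTED","runId":"run_2"}', created_at: 200 },
  { id: 2, run_id: "run_1", event_type: "RUN_FINISHED", event_data: '{"type":"RUN_FINISHED","runId":"run_1"}', created_at: 110 },
  { id: 1, run_id: "run_1", event_type: "RUN_STARTED", event_data: '{"type":"RUN_STARTED","runId":"run_1"}', created_at: 100 },
];
const threadEvents = loadThreadEvents(sampleRuns, sampleEvents, "thread_123");
```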

Event Persistence

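// Store events as they arrive (simplified; `db` stands in for the SQLite handle)
async function* persistAndForward(
  agentStream: AsyncIterable<BaseEvent>,
  runId: string
) {
  for await (const event of agentStream) {
    db.insert("events", {
      run_id: runId,
      event_type: event.type,
      event_data: JSON.stringify(event),
      created_at: Date.now()
    });

    yield event; // Stream to frontend
  }
}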

Reference: packages/v2/sqlite-runner/src/sqlite-runner.ts:248

Custom AgentRunner

Build a custom runner for your storage backend:

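import { AgentRunner } from "@copilotkitnext/runtime";
import { Observable } from "rxjs";

// RedisClient stands in for your Redis client type; this sketch assumes ioredis-style method names
class RedisAgentRunner extends AgentRunner {
  constructor(private redis: RedisClient) {
    super();
  }

  run(request: AgentRunnerRunRequest): Observable<BaseEvent> {
    return new Observable((observer) => {
      const { threadId, agent, input } = request;

      // The subscribe callback cannot be async, so the awaited work lives in an inner function
      const execute = async () => {
        // Mark the thread as running so isRunning() and stop() can see it
        await this.redis.set(`thread:${threadId}:running`, "1");

        // Load historic state from Redis
        const history = await this.redis.get(`thread:${threadId}`);

        // Execute agent
        await agent.runAgent(input, {
          onEvent: ({ event }) => {
            // Persist event
            this.redis.rpush(`thread:${threadId}:events`,
              JSON.stringify(event)
            );

            // Stream to frontend
            observer.next(event);
          }
        });
      };

      execute()
        .then(() => observer.complete())
        .catch((error) => observer.error(error))
        .finally(() => this.redis.del(`thread:${threadId}:running`));

      return () => {
        // Cleanup
      };
    });
  }

  // Not async: the method must return an Observable, not a Promise of one
  connect(request: AgentRunnerConnectRequest): Observable<BaseEvent> {
    return new Observable((observer) => {
      // Load and replay events from Redis
      this.redis
        .lrange(`thread:${request.threadId}:events`, 0, -1)
        .then((events) => {
          for (const eventStr of events) {
            observer.next(JSON.parse(eventStr));
          }
          observer.complete();
        })
        .catch((error) => observer.error(error));
    });
  }

  async isRunning(request: AgentRunnerIsRunningRequest): Promise<boolean> {
    // EXISTS returns the number of matching keys, not a boolean
    const count = await this.redis.exists(`thread:${request.threadId}:running`);
    return count > 0;
  }

  async stop(request: AgentRunnerStopRequest): Promise<boolean> {
    const running = await this.isRunning(request);
    if (!running) return false;

    await this.redis.del(`thread:${request.threadId}:running`);
    return true;
  }
}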

Custom runners must handle concurrent access safely. Use locks or transactions to prevent race conditions.
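
As a minimal illustration of such a guard, the sketch below allows at most one active run per thread within a single process. `ThreadLocks` is a hypothetical helper, not a CopilotKit API. The check and the insert happen with no `await` between them, so no other request can slip in on the same event loop.

```typescript
// In-process guard: at most one active run per thread.
class ThreadLocks {
  private readonly held = new Set<string>();

  // Returns true if the caller now owns the thread, false if it is already taken.
  tryAcquire(threadId: string): boolean {
    if (this.held.has(threadId)) return false;
    this.held.add(threadId);
    return true;
  }

  release(threadId: string): void {
    this.held.delete(threadId);
  }
}

const locks = new ThreadLocks();
const firstAttempt = locks.tryAcquire("thread_123"); // true: the thread was free
const secondAttempt = locks.tryAcquire("thread_123"); // false: already held
locks.release("thread_123");
const thirdAttempt = locks.tryAcquire("thread_123"); // true again after release
```

An in-process set does not protect against other server instances. For a shared backend, the equivalent step has to be atomic in the store itself, for example a Redis `SET` with the `NX` option and an expiry, or a database transaction.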

Middleware

Middleware provides hooks for cross-cutting concerns like authentication, logging, and request transformation.

Before Request Middleware

Runs before the request handler:

const runtime = new CopilotRuntime({
  agents: { /* ... */ },
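  // isValid() and getUserId() are placeholders for your own auth helpers
  beforeRequestMiddleware: async ({ runtime, request, path }) => {
    // Authentication
    const token = request.headers.get("Authorization");
    if (!token || !isValid(token)) {
      throw new Error("Unauthorized");
    }

    // Logging (log the user ID, never the raw token)
    const userId = getUserId(token);
    console.log(`Request to ${path} from user ${userId}`);

    // Transform request
    const newHeaders = new Headers(request.headers);
    newHeaders.set("X-User-ID", userId);

    // Use the original Request as the base. Spreading a Request copies nothing,
    // because its properties are getters on the prototype.
    return new Request(request, {
      headers: newHeaders
    });
  }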
});

Reference: packages/v2/runtime/src/middleware.ts:72

After Request Middleware

Runs after the response is generated:

const runtime = new CopilotRuntime({
  agents: { /* ... */ },
  afterRequestMiddleware: async ({ 
    runtime, 
    response, 
    path, 
    messages,
    threadId,
    runId 
  }) => {
    // Log completion
    console.log(`Completed ${path} for thread ${threadId}`);
    
    // Analytics
    await analytics.track({
      event: "agent_run_completed",
      threadId,
      runId,
      messageCount: messages?.length
    });
    
    // Audit trail
    await audit.log({
      action: "agent_run",
      threadId,
      timestamp: Date.now()
    });
  }
});

Reference: packages/v2/runtime/src/middleware.ts:89

Middleware Use Cases

Common middleware patterns:

Authentication - Verify JWT tokens, API keys, or session cookies
Authorization - Check user permissions for specific agents
Rate limiting - Throttle requests per user or IP
Logging - Record all agent interactions
Metrics - Track performance and usage
Request transformation - Modify headers or payloads
Response filtering - Remove sensitive data from responses
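
As one example of these patterns, rate limiting can be sketched as a fixed-window counter keyed by user ID or IP address. `FixedWindowRateLimiter` is a hypothetical helper, not a CopilotKit API. A before-request middleware could call `allow()` and reject the request when it returns false.

```typescript
interface RateWindow {
  startedAt: number;
  count: number;
}

// Allows at most `limit` requests per key in each window of `windowMs` milliseconds.
class FixedWindowRateLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, RateWindow>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed; `now` is injectable for testing.
  allow(key: string, now: number = Date.now()): boolean {
    const current = this.windows.get(key);
    if (current === undefined || now - current.startedAt >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      this.windows.set(key, { startedAt: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

const limiter = new FixedWindowRateLimiter(2, 1000);
const decisions = [
  limiter.allow("user_1", 0), // true
  limiter.allow("user_1", 10), // true
  limiter.allow("user_1", 20), // false: limit of 2 reached in this window
  limiter.allow("user_1", 1000), // true: a new window has started
];
```

This keeps counters in process memory, so it shares the limits of the in-memory runner: each server instance counts separately, and the map grows until entries are pruned.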

Thread Management

Thread IDs

Threads represent a conversation context. The frontend generates and manages thread IDs:

import { v4 as uuidv4 } from "uuid";

// Create new thread
const threadId = uuidv4();

// Run agent in this thread
await agent.runAgent({
  threadId,
  runId: uuidv4(),
  messages: [...]
});

Run IDs

Each agent execution within a thread gets a unique run ID:

// First run
await agent.runAgent({
  threadId: "thread_123",
  runId: "run_1",
  messages: [{ role: "user", content: "Hello" }]
});

// Follow-up run in same thread
await agent.runAgent({
  threadId: "thread_123",
  runId: "run_2",
  messages: [{ role: "user", content: "Tell me more" }]
});

Parent-Child Runs

Runners track parent-child relationships for run chains:

{
  threadId: "thread_123",
  runs: [
    {
      runId: "run_1",
      parentRunId: null,  // First run
      events: [...]
    },
    {
      runId: "run_2",
      parentRunId: "run_1",  // Child of run_1
      events: [...]
    }
  ]
}

Reference: packages/v2/runtime/src/runner/in-memory.ts:19
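
With `parentRunId` links in place, the chain leading to any run can be recovered by walking from that run back to the root. The function below is an illustrative sketch, and `RunRecord` and `runLineage` are hypothetical names.

```typescript
interface RunRecord {
  runId: string;
  parentRunId: string | null;
}

// Returns run IDs from the root run down to `runId`, following parentRunId links.
function runLineage(runs: RunRecord[], runId: string): string[] {
  const byId = new Map<string, RunRecord>();
  runs.forEach((run) => byId.set(run.runId, run));

  const lineage: string[] = [];
  const visited = new Set<string>(); // guards against accidental cycles
  let current: string | null = runId;
  while (current !== null && !visited.has(current)) {
    visited.add(current);
    const run = byId.get(current);
    if (run === undefined) break; // unknown run: stop walking
    lineage.unshift(run.runId);
    current = run.parentRunId;
  }
  return lineage;
}

const lineage = runLineage(
  [
    { runId: "run_1", parentRunId: null },
    { runId: "run_2", parentRunId: "run_1" },
    { runId: "run_3", parentRunId: "run_2" },
  ],
  "run_3",
);
```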

State Persistence

Event Compaction

To optimize storage, runners can compact event streams:

// Original events
[
  { type: "TEXT_MESSAGE_START", messageId: "msg_1" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg_1", delta: "Hello" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg_1", delta: " world" },
  { type: "TEXT_MESSAGE_END", messageId: "msg_1" }
]

// Compacted
[
  { type: "TEXT_MESSAGE_START", messageId: "msg_1" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg_1", delta: "Hello world" },  // Merged chunks
  { type: "TEXT_MESSAGE_END", messageId: "msg_1" }
]

Reference: packages/v2/runtime/src/runner/in-memory.ts:213
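
The core of text compaction is collapsing a message's content chunks into a single event. The function below is a minimal sketch of that idea, not the runtime's own compaction code, which may handle more event types and reorder events. `StreamEvent` and `compactTextEvents` are hypothetical names, and the sketch assumes the text chunk is carried in a `delta` field.

```typescript
interface StreamEvent {
  type: string;
  messageId?: string;
  delta?: string;
}

// Merges all TEXT_MESSAGE_CONTENT chunks of a message into the position of its first chunk.
function compactTextEvents(events: StreamEvent[]): StreamEvent[] {
  const compacted: StreamEvent[] = [];
  const mergedByMessage = new Map<string, StreamEvent>();

  for (const event of events) {
    if (event.type === "TEXT_MESSAGE_CONTENT" && event.messageId !== undefined) {
      const merged = mergedByMessage.get(event.messageId);
      if (merged !== undefined) {
        // Later chunk: append its text to the chunk already emitted.
        merged.delta = (merged.delta ?? "") + (event.delta ?? "");
        continue;
      }
      // First chunk: copy it so the input array is not mutated.
      const copy: StreamEvent = { ...event };
      mergedByMessage.set(event.messageId, copy);
      compacted.push(copy);
      continue;
    }
    compacted.push(event);
  }
  return compacted;
}

const compactedEvents = compactTextEvents([
  { type: "TEXT_MESSAGE_START", messageId: "msg_1" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg_1", delta: "Hello" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg_1", delta: " world" },
  { type: "TEXT_MESSAGE_END", messageId: "msg_1" },
]);
```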

Deduplication

Duplicate events are removed during compaction:

// Before
[
  { type: "RUN_STARTED", runId: "run_1" },
  { type: "RUN_STARTED", runId: "run_1" },  // Duplicate
  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg_1", delta: "Hello" }
]

// After
[
  { type: "RUN_STARTED", runId: "run_1" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg_1", delta: "Hello" }
]

Concurrent Execution

Thread Isolation

Each thread runs independently:

// Thread 1
runner.run({
  threadId: "thread_1",
  agent: agent.clone(),
  input: { ... }
});

// Thread 2 (runs concurrently)
runner.run({
  threadId: "thread_2",
  agent: agent.clone(),
  input: { ... }
});

Agent Cloning

The runtime clones agents for each request to ensure isolation:

// Original agent
const agent = new BuiltInAgent({ ... });

// Cloned for request 1
const clone1 = agent.clone();

// Cloned for request 2
const clone2 = agent.clone();

// No shared state between clones

This prevents race conditions and state leakage between concurrent requests.
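
The isolation depends on `clone()` copying mutable fields instead of sharing them. The toy class below illustrates the point. `ToyAgent` is a hypothetical stand-in, not the `BuiltInAgent` implementation.

```typescript
// Toy agent with mutable per-conversation state.
class ToyAgent {
  readonly agentId: string;
  messages: string[];

  constructor(agentId: string, messages: string[] = []) {
    this.agentId = agentId;
    this.messages = messages;
  }

  // Copy the array so the clone never aliases the original's state.
  clone(): ToyAgent {
    return new ToyAgent(this.agentId, this.messages.slice());
  }
}

const original = new ToyAgent("assistant");
const clone1 = original.clone();
const clone2 = original.clone();

clone1.messages.push("Hello from request 1");
// original.messages and clone2.messages are still empty
```

A shallow copy is enough here because the array holds strings. State that contains nested objects would need a deeper copy to get the same guarantee.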

Preventing Concurrent Runs in Same Thread

run(request: AgentRunnerRunRequest): Observable<BaseEvent> {
  const store = GLOBAL_STORE.get(request.threadId);
  
  if (store.isRunning) {
    throw new Error("Thread already running");
  }
  
  store.isRunning = true;
  // ... execute agent
}

Reference: packages/v2/runtime/src/runner/in-memory.ts:109

Error Handling

Run Finalization

Runners ensure runs are properly finalized even on errors:

try {
  await agent.runAgent(input, { onEvent });

  // Success - emit RUN_FINISHED
  const appendedEvents = finalizeRunEvents(currentRunEvents, {
    stopRequested: false
  });
} catch (error) {
  // Error - emit RUN_ERROR
  // A caught value is not guaranteed to be an Error, so narrow it before reading .message
  const appendedEvents = finalizeRunEvents(currentRunEvents, {
    stopRequested: false,
    interruptionMessage: error instanceof Error ? error.message : String(error)
  });
} finally {
  // Always clean up
  store.isRunning = false;
  store.currentRunId = null;
}

Reference: packages/v2/runtime/src/runner/in-memory.ts:202
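
The idea behind finalization is that every run's event list ends with exactly one terminal event. The function below sketches that invariant under simplified assumptions. `appendTerminalEvent` is a hypothetical stand-in written for this page and does not reproduce the behavior of `finalizeRunEvents`.

```typescript
interface RunEvent {
  type: string;
  message?: string;
}

interface FinalizeOptions {
  interruptionMessage?: string;
}

const TERMINAL_TYPES = ["RUN_FINISHED", "RUN_ERROR"];

// Appends a terminal event if the run has none yet and returns whatever it appended.
function appendTerminalEvent(events: RunEvent[], options: FinalizeOptions = {}): RunEvent[] {
  if (events.some((event) => TERMINAL_TYPES.indexOf(event.type) !== -1)) {
    return []; // already finalized, nothing to add
  }
  const terminal: RunEvent =
    options.interruptionMessage !== undefined
      ? { type: "RUN_ERROR", message: options.interruptionMessage }
      : { type: "RUN_FINISHED" };
  events.push(terminal);
  return [terminal];
}

const successfulRun: RunEvent[] = [{ type: "RUN_STARTED" }];
const appendedOnSuccess = appendTerminalEvent(successfulRun);

const failedRun: RunEvent[] = [{ type: "RUN_STARTED" }];
const appendedOnError = appendTerminalEvent(failedRun, { interruptionMessage: "model timeout" });

// Finalizing twice is a no-op, so the call is safe in both catch and finally paths.
const appendedAgain = appendTerminalEvent(failedRun, { interruptionMessage: "model timeout" });
```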

Stop Requests

Gracefully stop a running agent:

await runner.stop({
  threadId: "thread_123"
});

// Runner marks stop requested
// Agent receives abort signal
// RUN_FINISHED emitted with stopped flag

Reference: packages/v2/runtime/src/runner/in-memory.ts:352

Production Deployment

Scaling Considerations

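Horizontal Scaling with InMemoryAgentRunner:

State is per-process, so each instance only knows the threads it has served
Use sticky sessions to route each thread to the same instance
Or use SQLiteAgentRunner to share state between processes on one host

Horizontal Scaling with SQLiteAgentRunner:

SQLite allows only one writer at a time and expects the database file on a local disk, so it does not suit instances spread across multiple hosts
Use a PostgreSQL/MySQL runner instead (custom implementation)
Or use a Redis runner for distributed state (custom implementation)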
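
Sticky routing by thread can be as simple as hashing the thread ID onto the list of instances, so that every request for a thread reaches the same process. The sketch below assumes a hypothetical load-balancing layer that knows the instance list. `hashString` and `pickInstance` are illustrative names, and the hash is an FNV-1a style function over UTF-16 code units.

```typescript
// FNV-1a style 32-bit hash over the string's UTF-16 code units.
function hashString(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Deterministically maps a thread to one instance.
function pickInstance(threadId: string, instances: string[]): string {
  const chosen = instances[hashString(threadId) % instances.length];
  if (chosen === undefined) {
    // Only reachable when the list is empty (the index is NaN in that case).
    throw new Error("no instances available");
  }
  return chosen;
}

const instances = ["runtime-a:4000", "runtime-b:4000", "runtime-c:4000"];
const firstTarget = pickInstance("thread_123", instances);
const secondTarget = pickInstance("thread_123", instances); // same instance as firstTarget
```

Plain modulo hashing remaps most threads whenever the instance count changes. With in-memory state that means losing those conversations, which is one reason to prefer a persistent runner once instances scale up and down.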

Health Checks

app.get("/health", async (req, res) => {
  try {
    // Probe the runner; the result is ignored, only a thrown error marks it unhealthy
    await runner.isRunning({
      threadId: "health_check"
    });

    res.json({ status: "ok" });
  } catch (error) {
    res.status(500).json({
      status: "error",
      message: error instanceof Error ? error.message : String(error)
    });
  }
});

Monitoring

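// metrics and errorTracker stand in for your own telemetry clients.
// startTime and activeThreadCount must be tracked by your application
// (for example, record a start time in beforeRequestMiddleware).
const runtime = new CopilotRuntime({
  agents: { /* ... */ },
  afterRequestMiddleware: async ({ threadId, runId, response }) => {
    // Track metrics
    metrics.histogram("agent_run_duration", Date.now() - startTime);
    metrics.increment("agent_runs_total");
    metrics.gauge("active_threads", activeThreadCount);

    // Error tracking: this sketch treats a non-2xx response as a failed run
    if (!response.ok) {
      errorTracker.captureException(
        new Error(`Agent run failed with status ${response.status}`),
        { threadId, runId }
      );
    }
  }
});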

Best Practices

Runner Selection

Development - Use InMemoryAgentRunner for fast iteration
Production (stateless) - Use InMemoryAgentRunner with sticky sessions
Production (stateful) - Use SQLiteAgentRunner or custom persistent runner

State Management

Keep state minimal - Only persist what’s necessary
Compact regularly - Reduce storage overhead
Archive old threads - Move inactive threads to cold storage
Clean up on errors - Always finalize runs properly

Performance

Clone agents efficiently - Avoid expensive operations in clone()
Stream events promptly - Don’t buffer unnecessarily
Use connection pooling - For database-backed runners
Monitor memory - Track runner memory usage

Next Steps

Architecture

Understand where runtime fits in the architecture

Agents

Learn how agents are executed by the runtime

AG-UI Protocol

Understand the event streaming protocol

Frontend Integration

Connect your frontend to the runtime
