
Building a Production Claude Agent API with Job Queues and Streaming

The @anthropic-ai/claude-agent-sdk brings Claude Code's autonomous capabilities to your own applications. Unlike the standard Claude API, the Agent SDK gives Claude access to file systems, bash commands, and MCP tools—enabling complex multi-step workflows that can read, analyze, and modify codebases autonomously.

In this guide, we'll explore a production-ready implementation that handles long-running tasks, provides real-time streaming feedback, and integrates seamlessly with MCP servers.

Why Choose the Agent SDK?

The standard Claude API is perfect for conversational AI and text generation. But for complex coding tasks, the Agent SDK provides:

  • File system access - Read and modify files across your codebase
  • Bash execution - Run commands, tests, and build scripts
  • MCP integration - Connect to custom tools and data sources
  • Multi-step reasoning - Autonomous planning and execution
  • Project awareness - Full understanding of codebase structure

This makes it ideal for automated code review, refactoring, documentation generation, and complex data processing.

Architecture: Three Complementary Patterns

Our implementation provides three ways to interact with the agent:

  1. Async Job Queue - Submit long-running tasks and poll for results
  2. Streaming SSE - Get real-time feedback during execution
  3. MCP Tool - Expose the agent to other AI systems

graph TD
    A[Client] -->|POST /claude| B[Job Queue]
    A -->|POST /claude/stream| C[Streaming API]
    D[Other AI] -->|MCP Tool| E[MCP Server]
    
    B --> F[Database]
    B --> G[Background Worker]
    G --> H[Claude Agent SDK]
    C --> H
    E --> H
    
    H --> I[File System]
    H --> J[Bash]
    H --> K[MCP Servers]

Core Implementation

The heart of the system is a clean wrapper around the Claude Agent SDK:

import { query } from "@anthropic-ai/claude-agent-sdk";
import path from "path";
import fs from "fs";

// Load project configuration
const projectRoot = path.resolve(process.cwd(), "../..");
const mcpConfigPath = path.join(projectRoot, ".mcp.json");
let mcpServers = {};

try {
  const mcpConfig = JSON.parse(fs.readFileSync(mcpConfigPath, "utf-8"));
  mcpServers = mcpConfig.mcpServers || {};
} catch (error) {
  console.warn("Could not load .mcp.json:", error);
}

// Load project-specific instructions
let claudeMdContent = "";
try {
  const claudeMdPath = path.join(projectRoot, "CLAUDE.md");
  claudeMdContent = fs.readFileSync(claudeMdPath, "utf-8");
} catch (error) {
  console.warn("Could not load CLAUDE.md:", error);
}

export async function* executeClaudeAgent({ prompt, continue: continueOption = true }) {
  // Clean environment to avoid conflicts
  const cleanEnv = { ...process.env };
  delete cleanEnv.NODE_OPTIONS;
  delete cleanEnv.VSCODE_INSPECTOR_OPTIONS;

  const queryResult = query({
    prompt,
    options: {
      cwd: projectRoot,
      settingSources: ["user", "project"],
      mcpServers,
      systemPrompt: claudeMdContent
        ? {
            type: "preset",
            preset: "claude_code",
            append: claudeMdContent,
          }
        : undefined,
      env: cleanEnv,
      permissionMode: "bypassPermissions",
      continue: continueOption,
    },
  });

  for await (const message of queryResult) {
    yield message;
  }
}

Key Design Decisions

Project Context: The agent loads configuration from your monorepo root, including .mcp.json for MCP servers and CLAUDE.md for project-specific instructions.

Environment Cleanup: Debugger environment variables are removed to prevent conflicts in development.

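Permission Mode: bypassPermissions lets the agent run tools without asking for approval, which is what makes unattended operation possible. It also means any accepted prompt can modify files or run commands, so pair it with the sandboxing measures described under Production Considerations.

Continue Option: With continue set to true, the SDK continues the most recent conversation in the working directory, so follow-up prompts keep earlier context. Because the wrapper defaults to true, unrelated requests hitting the same server share that conversation unless callers pass continue: false. The SDK also documents a resume option that takes a session ID, which is likely a better fit when several clients share one server.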
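The later examples call executeClaudeAgentSync, which this article never defines. A minimal sketch, assuming it simply drains the executeClaudeAgent generator into an array; the helper name collectMessages is illustrative and not part of the SDK:

```typescript
// Drain any async iterable into an array, preserving arrival order.
async function collectMessages<T>(stream: AsyncIterable<T>): Promise<T[]> {
  const messages: T[] = [];
  for await (const message of stream) {
    messages.push(message);
  }
  return messages;
}

// Assumed wiring (executeClaudeAgent is the streaming wrapper shown above):
// const executeClaudeAgentSync = (args: { prompt: string; continue?: boolean }) =>
//   collectMessages(executeClaudeAgent(args));
```

Buffering every message keeps the job-queue and MCP code simple, at the cost of holding the full transcript in memory until the run ends.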

Pattern 1: Async Job Queue

For tasks that might exceed HTTP timeout limits, use the async pattern:

// Database schema
interface ClaudeWorkerTask {
  job_id: string;
  context: { prompt: string; continue?: boolean };
  status: 'pending' | 'in_progress' | 'completed' | 'failed';
  results?: any;
  created_at: Date;
  started_at?: Date;
  completed_at?: Date;
}

// Background processor
async function processClaudeJob(jobId: string, prompt: string, logger: any) {
  try {
    // Update to in_progress
    await updateJobStatus(jobId, "in_progress");
    
    logger.info({ jobId }, "Starting Claude agent job");

    // Execute agent (can take minutes)
    const messages = await executeClaudeAgentSync({ prompt });

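    // Extract results; only a successful run carries a "result" field
    const resultMessage = messages.find((msg) => msg.type === "result");
    if (!resultMessage || resultMessage.subtype !== "success") {
      throw new Error(
        `Agent run did not succeed: ${resultMessage?.subtype ?? "no result message"}`
      );
    }
    const result = {
      output: resultMessage.result,
      usage: resultMessage.usage,
      cost_usd: resultMessage.total_cost_usd,
    };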

    // Save results
    await updateJobResults(jobId, "completed", result);
    logger.info({ jobId }, "Job completed");
  } catch (error: any) {
    logger.error({ jobId, error: error.message }, "Job failed");
    await updateJobResults(jobId, "failed", { error: error.message });
  }
}
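The processor above relies on updateJobStatus and updateJobResults, which are not shown. As a rough sketch of what they need to do, here is an in-memory stand-in; the Map-based store, the JobRecord type, and createJob are hypothetical and would be replaced by real database queries:

```typescript
type JobStatus = "pending" | "in_progress" | "completed" | "failed";

interface JobRecord {
  job_id: string;
  status: JobStatus;
  results?: unknown;
  created_at: Date;
  started_at?: Date;
  completed_at?: Date;
}

// In-memory stand-in for the database table (hypothetical; not durable).
const jobs = new Map<string, JobRecord>();

function createJob(jobId: string): JobRecord {
  const job: JobRecord = { job_id: jobId, status: "pending", created_at: new Date() };
  jobs.set(jobId, job);
  return job;
}

async function updateJobStatus(jobId: string, status: JobStatus): Promise<void> {
  const job = jobs.get(jobId);
  if (!job) throw new Error(`Unknown job: ${jobId}`);
  job.status = status;
  // Record the start time when a worker picks the job up.
  if (status === "in_progress") job.started_at = new Date();
}

async function updateJobResults(
  jobId: string,
  status: "completed" | "failed",
  results: unknown,
): Promise<void> {
  const job = jobs.get(jobId);
  if (!job) throw new Error(`Unknown job: ${jobId}`);
  job.status = status;
  job.results = results;
  job.completed_at = new Date();
}
```

An in-memory map loses jobs on restart, which is why the article's architecture puts a database behind the queue.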

Submit and Poll

# Submit job
curl -X POST https://api.example.com/claude \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Analyze TypeScript files and generate dependency graph"}'

# Returns: {"job_id": "uuid-here", "status": "pending"}

# Poll for results
curl -X GET https://api.example.com/claude/jobs/uuid-here \
  -H "Authorization: Bearer YOUR_KEY"

# Returns: {"job_id": "...", "status": "completed", "result": {...}}
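The polling step can be automated on the client. A minimal sketch, assuming the status endpoint returns the shape shown in the curl output; pollJob, JobSnapshot, and the default delays are illustrative choices, not part of the API:

```typescript
interface JobSnapshot {
  status: "pending" | "in_progress" | "completed" | "failed";
  result?: unknown;
}

// Poll until the job reaches a terminal state, doubling the wait each time.
async function pollJob(
  fetchStatus: () => Promise<JobSnapshot>,
  { initialDelayMs = 1000, maxDelayMs = 15000, maxAttempts = 120 } = {},
): Promise<JobSnapshot> {
  let delay = initialDelayMs;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await fetchStatus();
    if (job.status === "completed" || job.status === "failed") return job;
    await new Promise((resolve) => setTimeout(resolve, delay));
    delay = Math.min(delay * 2, maxDelayMs);
  }
  throw new Error("Job did not finish within the polling budget");
}
```

Here fetchStatus would wrap a GET to /claude/jobs/<job_id> with the Authorization header; passing it in as a function keeps the loop independent of any particular HTTP client.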

Pattern 2: Streaming with SSE

For real-time feedback, use Server-Sent Events:

app.post("/claude/stream", async (request, reply) => {
  const { prompt } = request.body;

  try {
    const queryResult = executeClaudeAgent({ prompt });

    // Set SSE headers
    reply.raw.writeHead(200, {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      "Connection": "keep-alive",
    });

    // Stream messages as they arrive
    for await (const message of queryResult) {
      reply.raw.write(`data: ${JSON.stringify(message)}\n\n`);
    }

    reply.raw.write("data: [DONE]\n\n");
    reply.raw.end();
  } catch (error: any) {
    reply.raw.write(`data: ${JSON.stringify({ error: error.message })}\n\n`);
    reply.raw.end();
  }
});
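The handler above builds each frame by hand. A small helper can make the framing rules explicit; this is a sketch, and formatSseEvent and SSE_HEARTBEAT are hypothetical names rather than anything Fastify or the SDK provides:

```typescript
// Serialize one Server-Sent Events frame. A payload containing newlines needs
// one "data:" line per line of text; a blank line terminates the frame.
function formatSseEvent(data: string, event?: string): string {
  const lines: string[] = [];
  if (event) lines.push(`event: ${event}`);
  for (const line of data.split(/\r?\n/)) {
    lines.push(`data: ${line}`);
  }
  return lines.join("\n") + "\n\n";
}

// Lines starting with ":" are comments that clients ignore; sending one
// periodically is a common way to keep idle connections open through proxies.
const SSE_HEARTBEAT = ": keep-alive\n\n";
```

JSON.stringify output never contains raw newlines, so the handler's single data: line is already valid; the helper matters once plain-text payloads or named events are sent.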

Client-Side Consumption

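// EventSource only supports GET, so read the POST response with fetch instead
async function streamClaude(prompt) {
  const response = await fetch('/claude/stream', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  });
  if (!response.ok || !response.body) {
    throw new Error(`Stream request failed: ${response.status}`);
  }
  const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
  let buffer = '';

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += value;
    // SSE frames end with a blank line; keep any partial frame in the buffer
    const frames = buffer.split('\n\n');
    buffer = frames.pop() ?? '';

    for (const frame of frames) {
      if (!frame.startsWith('data: ')) continue;
      const data = frame.slice(6);
      if (data === '[DONE]') return;

      const message = JSON.parse(data);
      if (message.type === 'assistant') {
        console.log('Response:', message.message.content);
      } else if (message.type === 'result') {
        console.log('Final:', message.result);
        console.log('Cost:', message.total_cost_usd);
      }
    }
  }
}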

Pattern 3: MCP Server Tool

The most powerful pattern is exposing the agent as an MCP tool:

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

export function createAgentMcpServer() {
  const mcpServer = new McpServer({
    name: "claude-agent",
    version: "1.0.0",
  });

  mcpServer.tool(
    "run_claude_agent",
    "Execute autonomous coding task with full codebase access",
    {
      prompt: z.string().describe("Task for Claude to execute"),
    },
    async ({ prompt }) => {
      const messages = await executeClaudeAgentSync({ prompt });
      const resultMessage = messages.find((msg) => msg.type === "result");

      if (resultMessage?.result) {
        return {
          content: [{
            type: "text",
            text: JSON.stringify({
              result: resultMessage.result,
              usage: resultMessage.usage,
              cost_usd: resultMessage.total_cost_usd,
            }, null, 2),
          }],
        };
      }

      return {
        content: [{ type: "text", text: "No result returned" }],
        isError: true,
      };
    }
  );

  return mcpServer.server;
}

This enables a meta-agent pattern:

  1. Primary AI system receives user request
  2. Identifies complex coding subtasks
  3. Delegates to Claude Agent via MCP
  4. Claude executes with full codebase access
  5. Results flow back to primary AI

MCP Configuration

Define MCP servers in .mcp.json at your project root:

{
  "mcpServers": {
    "database-tools": {
      "type": "http",
      "url": "https://your-mcp-server.com/mcp/database"
    },
    "api-tools": {
      "type": "http",
      "url": "https://your-mcp-server.com/mcp/api"
    }
  }
}

Project instructions go in CLAUDE.md for consistent agent behavior across all tasks.

Production Considerations

Security

Authentication: Require API keys for all endpoints:

const authHeader = request.headers.authorization;
const expectedKey = process.env.API_KEY;

if (!authHeader?.startsWith("Bearer ") || 
    authHeader.substring(7) !== expectedKey) {
  return reply.status(401).send({ error: "Unauthorized" });
}
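A plain !== comparison can return sooner when the keys differ early, which leaks timing information. A hedged sketch of a constant-time variant using Node's crypto module; isAuthorized is a hypothetical helper name:

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Compare a presented bearer token with the expected key in constant time.
// Hashing both sides yields equal-length buffers, which timingSafeEqual requires.
function isAuthorized(
  authHeader: string | undefined,
  expectedKey: string | undefined,
): boolean {
  // Reject when the server has no key configured or the header is malformed.
  if (!expectedKey || !authHeader || !authHeader.startsWith("Bearer ")) {
    return false;
  }
  const presented = createHash("sha256").update(authHeader.slice(7)).digest();
  const expected = createHash("sha256").update(expectedKey).digest();
  return timingSafeEqual(presented, expected);
}
```

In the route, this would replace the inline check: if isAuthorized(request.headers.authorization, process.env.API_KEY) returns false, reply with 401.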

Sandboxing: Since bypassPermissions allows any operation, consider:

  • Running in isolated containers
  • Implementing additional authorization checks
  • Auditing all agent actions
  • Rate limiting requests
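Of the measures above, rate limiting is the simplest to illustrate. A minimal sketch of a per-key fixed-window limiter; the class name, the in-memory Map, and the injectable clock are assumptions for illustration, and a multi-instance deployment would need a shared store instead:

```typescript
// Fixed-window limiter: at most `limit` requests per `windowMs` for each key.
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(limit: number, windowMs: number, now: () => number = Date.now) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true and counts the request if the key is still under its limit.
  allow(key: string): boolean {
    const t = this.now();
    let current = this.windows.get(key);
    if (!current || t - current.start >= this.windowMs) {
      // Start a fresh window for this key.
      current = { start: t, count: 0 };
      this.windows.set(key, current);
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

Keying the limiter on the API key rather than the IP address ties the budget to the caller who will be billed for the agent run.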

Resource Management

Timeouts: Prevent runaway jobs:

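const TIMEOUT = 30 * 60 * 1000; // 30 minutes

let timer;
const timeoutPromise = new Promise((_, reject) => {
  timer = setTimeout(() => reject(new Error('Timeout')), TIMEOUT);
});

let result;
try {
  // Promise.race stops waiting after TIMEOUT; it does not cancel the agent run
  result = await Promise.race([
    executeClaudeAgentSync({ prompt }),
    timeoutPromise
  ]);
} finally {
  // Clear the timer so it does not linger for 30 minutes after the job ends
  clearTimeout(timer);
}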
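Racing against a timer only stops the caller from waiting; the agent keeps running unless it is told to stop. A sketch of a helper that also signals cancellation through an AbortController; withTimeout is a hypothetical name. The SDK's options are documented to accept an abortController, but passing one through executeClaudeAgent would be an extension of the wrapper shown earlier and should be checked against the SDK version in use:

```typescript
// Run `task` with a deadline. On timeout the controller is aborted so the task
// can stop its own work; the timer is cleared whichever side finishes first.
async function withTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort();
      reject(new Error(`Timed out after ${timeoutMs} ms`));
    }, timeoutMs);
  });
  try {
    return await Promise.race([task(controller.signal), timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```

The task receives the signal and decides how to honor it, so the helper works for any cancellable operation, not only agent runs.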

Cleanup: Remove old completed jobs periodically (MySQL syntax shown):

DELETE FROM claude_worker_tasks 
WHERE completed_at < DATE_SUB(NOW(), INTERVAL 30 DAY);

Monitoring

Cost Tracking: The agent reports costs automatically:

const resultMessage = messages.find(msg => msg.type === "result");
const usage = resultMessage?.usage;
console.log(`Cost: $${resultMessage?.total_cost_usd ?? 0}`);
// Default missing counts to 0 so the sum never becomes NaN
console.log(`Tokens: ${(usage?.input_tokens ?? 0) + (usage?.output_tokens ?? 0)}`);

Key Metrics:

  • Job completion rate
  • Average execution time
  • Token usage and costs
  • Error rates
  • Queue depth

Use Cases

Automated Code Analysis

const prompt = `
Analyze the authentication system:
1. List all auth-related files
2. Document the flow
3. Identify security issues
4. Generate a Mermaid diagram
`;

Codebase Refactoring

const prompt = `
Convert class components to hooks:
1. Find all class components
2. Convert to functional components
3. Replace lifecycle methods
4. Update tests
`;

Documentation Generation

const prompt = `
Generate API documentation:
1. Find all route definitions
2. Extract schemas
3. Document auth requirements
4. Create OpenAPI spec
`;

Agent SDK vs Standard API

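| Feature       | Standard API     | Agent SDK        |
|---------------|------------------|------------------|
| File Access   | None             | Full read/write  |
| Bash Commands | None             | Full execution   |
| MCP Tools     | Limited          | Full integration |
| Multi-step    | Manual           | Autonomous       |
| Best For      | Chat, generation | Code, automation |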

Best Practices

1. Structure Clear Prompts

const prompt = `
## Task: Add User Authentication

### Steps:
1. Implement JWT-based auth
2. Create login/logout endpoints
3. Add middleware
4. Write tests

### Criteria:
- All tests pass
- No TypeScript errors
- Follow security best practices
`;

2. Use Continue for Context

// Initial analysis
await executeClaudeAgentSync({
  prompt: "Analyze the auth system",
  continue: true
});

// Follow-up with context preserved
await executeClaudeAgentSync({
  prompt: "Based on your analysis, implement OAuth2",
  continue: true
});

3. Handle Errors Gracefully

try {
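  const messages = await executeClaudeAgentSync({ prompt });
  const result = messages.find(m => m.type === "result");

  // A run can end without a successful result, for example at a turn limit
  if (!result || result.subtype !== "success") {
    throw new Error(`Agent did not succeed: ${result?.subtype ?? "no result message"}`);
  }

  return result.result;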
} catch (error) {
  logger.error({ error, prompt }, "Agent failed");
  // Implement retry logic if appropriate
  throw error;
}
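The retry logic mentioned in the comment could look like the following. This is a sketch under assumptions: retryWithBackoff, its option names, and its defaults are illustrative, and which errors are worth retrying depends on the deployment:

```typescript
// Retry an async operation with exponential backoff. `shouldRetry` lets the
// caller skip retries for errors that will not succeed on a second attempt.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  {
    maxAttempts = 3,
    baseDelayMs = 1000,
    shouldRetry = (_error: unknown) => true,
  } = {},
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt === maxAttempts || !shouldRetry(error)) break;
      // Wait baseDelayMs, then 2x, 4x, ... before the next attempt.
      const delay = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

Because the agent runs with bypassPermissions, a failed run may already have changed files, so a retry is not guaranteed to start from the same state. Restricting shouldRetry to transient failures such as network errors is the safer default.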

4. Set Cost Alerts

const COST_THRESHOLD = 5.0;
const result = messages.find(m => m.type === "result");
const cost = result?.total_cost_usd ?? 0; // treat a missing cost as 0

if (cost > COST_THRESHOLD) {
  await sendAlert(`High cost execution: $${cost}`);
}

Conclusion

The Claude Agent SDK unlocks autonomous coding capabilities that go far beyond traditional AI APIs. By combining it with robust job queues, streaming responses, and MCP integration, you create a production-ready platform for complex automation tasks.

Key takeaways:

  • Choose async for long tasks, streaming for real-time feedback, MCP for delegation
  • Implement proper authentication and sandboxing
  • Monitor costs and performance closely
  • Provide clear prompts and project context
  • Handle failures with retries and proper error messages

The combination of autonomous agent capabilities with well-architected API patterns opens new possibilities for intelligent automation—from code analysis to refactoring to complex workflows.


Interested in AI-powered automation? Check out more patterns and implementations at blle.co/blog.
