Building a Production Claude Agent API with Job Queues and Streaming
The @anthropic-ai/claude-agent-sdk brings Claude Code's autonomous capabilities to your own applications. Unlike the standard Claude API, the Agent SDK gives Claude access to file systems, bash commands, and MCP tools—enabling complex multi-step workflows that can read, analyze, and modify codebases autonomously.
In this guide, we'll explore a production-ready implementation that handles long-running tasks, provides real-time streaming feedback, and integrates seamlessly with MCP servers.
Why Choose the Agent SDK?
The standard Claude API is well suited to conversational AI and text generation. For complex coding tasks, the Agent SDK adds:
- File system access - Read and modify files across your codebase
- Bash execution - Run commands, tests, and build scripts
- MCP integration - Connect to custom tools and data sources
- Multi-step reasoning - Autonomous planning and execution
- Project awareness - Explores the codebase and picks up project settings and CLAUDE.md instructions
This makes it ideal for automated code review, refactoring, documentation generation, and complex data processing.
Architecture: Three Complementary Patterns
Our implementation provides three ways to interact with the agent:
- Async Job Queue - Submit long-running tasks and poll for results
- Streaming SSE - Get real-time feedback during execution
- MCP Tool - Expose the agent to other AI systems
graph TD
A[Client] -->|POST /claude| B[Job Queue]
A -->|POST /claude/stream| C[Streaming API]
D[Other AI] -->|MCP Tool| E[MCP Server]
B --> F[Database]
B --> G[Background Worker]
G --> H[Claude Agent SDK]
C --> H
E --> H
H --> I[File System]
H --> J[Bash]
H --> K[MCP Servers]
Core Implementation
The heart of the system is a clean wrapper around the Claude Agent SDK:
import { query } from "@anthropic-ai/claude-agent-sdk";
import path from "path";
import fs from "fs";
// Load project configuration
const projectRoot = path.resolve(process.cwd(), "../..");
const mcpConfigPath = path.join(projectRoot, ".mcp.json");
let mcpServers = {};
try {
  const mcpConfig = JSON.parse(fs.readFileSync(mcpConfigPath, "utf-8"));
  mcpServers = mcpConfig.mcpServers || {};
} catch (error) {
  console.warn("Could not load .mcp.json:", error);
}
// Load project-specific instructions
let claudeMdContent = "";
try {
  const claudeMdPath = path.join(projectRoot, "CLAUDE.md");
  claudeMdContent = fs.readFileSync(claudeMdPath, "utf-8");
} catch (error) {
  console.warn("Could not load CLAUDE.md:", error);
}
export async function* executeClaudeAgent({ prompt, continue: continueOption = true }) {
  // Clean environment to avoid conflicts
  const cleanEnv = { ...process.env };
  delete cleanEnv.NODE_OPTIONS;
  delete cleanEnv.VSCODE_INSPECTOR_OPTIONS;
  const queryResult = query({
    prompt,
    options: {
      cwd: projectRoot,
      settingSources: ["user", "project"],
      mcpServers,
      systemPrompt: claudeMdContent
        ? {
            type: "preset",
            preset: "claude_code",
            append: claudeMdContent,
          }
        : undefined,
      env: cleanEnv,
      permissionMode: "bypassPermissions",
      continue: continueOption,
    },
  });
  for await (const message of queryResult) {
    yield message;
  }
}
Key Design Decisions
Project Context: The agent loads configuration from your monorepo root, including .mcp.json for MCP servers and CLAUDE.md for project-specific instructions.
Environment Cleanup: NODE_OPTIONS and VSCODE_INSPECTOR_OPTIONS are removed from the environment passed to the agent, so it does not inherit debugger flags from a development session.
Permission Mode: bypassPermissions enables autonomous operation—use carefully and consider sandboxing.
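Continue Option: With continue set to true, the agent picks up the most recent conversation instead of starting a new one, so context carries across requests. Because the wrapper defaults it to true, unrelated callers sharing one deployment may end up in the same conversation; pass continue: false when requests should be isolated.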
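The later examples call executeClaudeAgentSync, which this guide never shows. A minimal sketch, assuming it does nothing more than drain the generator into an array. The names collectMessages and fakeAgent are hypothetical, and fakeAgent stands in for executeClaudeAgent so the snippet runs without the SDK:

```typescript
// Minimal message shape for illustration; the SDK's real message types are richer.
type AgentMessage = { type: string; [key: string]: unknown };

// Drain any async iterable of messages into an array.
export async function collectMessages<T>(stream: AsyncIterable<T>): Promise<T[]> {
  const messages: T[] = [];
  for await (const message of stream) {
    messages.push(message);
  }
  return messages;
}

// Hypothetical stand-in for executeClaudeAgent so the sketch is runnable on its own.
async function* fakeAgent(prompt: string): AsyncGenerator<AgentMessage> {
  yield { type: "assistant", text: `Working on: ${prompt}` };
  yield { type: "result", result: "done", total_cost_usd: 0.01 };
}

// Assumed shape of the helper used in the rest of this guide:
//   const executeClaudeAgentSync = (opts) => collectMessages(executeClaudeAgent(opts));
```

Collecting everything before returning suits the job queue and MCP patterns, where only the final result message matters. The streaming pattern should keep consuming the generator directly.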
Pattern 1: Async Job Queue
For tasks that might exceed HTTP timeout limits, use the async pattern:
// Database schema
interface ClaudeWorkerTask {
  job_id: string;
  context: { prompt: string; continue?: boolean };
  status: 'pending' | 'in_progress' | 'completed' | 'failed';
  results?: any;
  created_at: Date;
  started_at?: Date;
  completed_at?: Date;
}
// Background processor
async function processClaudeJob(jobId: string, prompt: string, logger: any) {
  try {
    // Update to in_progress
    await updateJobStatus(jobId, "in_progress");
    logger.info({ jobId }, "Starting Claude agent job");
    // Execute agent (can take minutes)
    const messages = await executeClaudeAgentSync({ prompt });
    // Extract results
    const resultMessage = messages.find((msg) => msg.type === "result");
    const result = {
      output: resultMessage?.result,
      usage: resultMessage?.usage,
      cost_usd: resultMessage?.total_cost_usd,
    };
    // Save results
    await updateJobResults(jobId, "completed", result);
    logger.info({ jobId }, "Job completed");
  } catch (error: any) {
    logger.error({ jobId, error: error.message }, "Job failed");
    await updateJobResults(jobId, "failed", { error: error.message });
  }
}
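The processor above relies on updateJobStatus and updateJobResults, which are not shown. A rough sketch of what they might do, assuming an in-memory Map in place of the database table. The createJob helper and the Map are illustrative stand-ins, not the original persistence layer:

```typescript
type JobStatus = "pending" | "in_progress" | "completed" | "failed";

interface JobRecord {
  job_id: string;
  context: { prompt: string; continue?: boolean };
  status: JobStatus;
  results?: unknown;
  created_at: Date;
  started_at?: Date;
  completed_at?: Date;
}

// In-memory stand-in for the database (assumption: real code would issue SQL here).
const jobs = new Map<string, JobRecord>();

// Hypothetical helper: register a new job in the "pending" state.
function createJob(jobId: string, prompt: string): JobRecord {
  const job: JobRecord = {
    job_id: jobId,
    context: { prompt },
    status: "pending",
    created_at: new Date(),
  };
  jobs.set(jobId, job);
  return job;
}

async function updateJobStatus(jobId: string, status: JobStatus): Promise<void> {
  const job = jobs.get(jobId);
  if (!job) throw new Error(`Unknown job: ${jobId}`);
  job.status = status;
  if (status === "in_progress") job.started_at = new Date();
}

async function updateJobResults(
  jobId: string,
  status: "completed" | "failed",
  results: unknown,
): Promise<void> {
  const job = jobs.get(jobId);
  if (!job) throw new Error(`Unknown job: ${jobId}`);
  job.status = status;
  job.results = results;
  job.completed_at = new Date();
}
```

Recording started_at and completed_at at the moment of each transition keeps the timestamps usable for execution-time metrics later.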
Submit and Poll
# Submit job
curl -X POST https://api.example.com/claude \
-H "Authorization: Bearer YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "Analyze TypeScript files and generate dependency graph"}'
# Returns: {"job_id": "uuid-here", "status": "pending"}
# Poll for results
curl -X GET https://api.example.com/claude/jobs/uuid-here \
-H "Authorization: Bearer YOUR_KEY"
# Returns: {"job_id": "...", "status": "completed", "result": {...}}
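The same polling step can be scripted. The sketch below is one possible loop, assuming the job endpoint returns the status values shown above. The function pollJob and its fetchJob parameter are hypothetical; fetchJob is injected so the loop can be exercised without a network, and in practice it would wrap the GET request:

```typescript
interface JobView {
  job_id: string;
  status: string;
  result?: unknown;
}

// Poll until the job reaches a terminal state or the attempt budget runs out.
async function pollJob(
  fetchJob: (jobId: string) => Promise<JobView>,
  jobId: string,
  { intervalMs = 2000, maxAttempts = 900 } = {},
): Promise<JobView> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await fetchJob(jobId);
    if (job.status === "completed" || job.status === "failed") {
      return job;
    }
    // Wait before asking again so the API is not hammered.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} polls`);
}
```

The defaults (2 seconds, 900 attempts) are arbitrary and add up to about 30 minutes; tune them to the job durations you actually see.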
Pattern 2: Streaming with SSE
For real-time feedback, use Server-Sent Events:
app.post("/claude/stream", async (request, reply) => {
  const { prompt } = request.body;
  try {
    const queryResult = executeClaudeAgent({ prompt });
    // Set SSE headers
    reply.raw.writeHead(200, {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      "Connection": "keep-alive",
    });
    // Stream messages as they arrive
    for await (const message of queryResult) {
      reply.raw.write(`data: ${JSON.stringify(message)}\n\n`);
    }
    reply.raw.write("data: [DONE]\n\n");
    reply.raw.end();
  } catch (error: any) {
    reply.raw.write(`data: ${JSON.stringify({ error: error.message })}\n\n`);
    reply.raw.end();
  }
});
Client-Side Consumption
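// EventSource can only issue GET requests, so the POST endpoint is read with fetch instead
const response = await fetch('/claude/stream', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', Authorization: 'Bearer YOUR_KEY' },
  body: JSON.stringify({ prompt }),
});
const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
let buffer = '';
let finished = false;
while (!finished) {
  const { value, done } = await reader.read();
  if (done) break;
  buffer += value;
  const frames = buffer.split('\n\n');
  buffer = frames.pop(); // keep any partial frame for the next chunk
  for (const frame of frames) {
    const data = frame.replace(/^data: /, '');
    if (data === '[DONE]') { finished = true; break; }
    const message = JSON.parse(data);
    if (message.type === 'assistant') {
      console.log('Response:', message.message.content);
    } else if (message.type === 'result') {
      console.log('Final:', message.result);
      console.log('Cost:', message.total_cost_usd);
    }
  }
}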
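If the stream is read with fetch rather than EventSource (for example, because the endpoint is a POST route), the client has to split the byte stream into SSE frames itself, and a frame can arrive split across two chunks. A small incremental parser, sketched under the assumption that the server uses "\n" line endings and only "data:" fields, as the handler above does. The name createSseParser is hypothetical:

```typescript
// Incremental parser: feed it decoded text chunks, get back complete `data:` payloads.
function createSseParser(): (chunk: string) => string[] {
  let buffer = "";
  return function feed(chunk: string): string[] {
    buffer += chunk;
    const events: string[] = [];
    let boundary = buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const frame = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      // Join multiple data: lines of one frame with newlines, per the SSE format.
      const data = frame
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, ""))
        .join("\n");
      if (data) events.push(data);
      boundary = buffer.indexOf("\n\n");
    }
    return events;
  };
}

// Usage: a frame split across two chunks is only emitted once it is complete.
const feed = createSseParser();
const first = feed('data: {"type":"assistant"}\n\ndata: [DO');
const second = feed("NE]\n\n");
```

Here first holds the single JSON payload and second holds "[DONE]"; the partial frame stays buffered between calls.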
Pattern 3: MCP Server Tool
The third pattern exposes the agent as an MCP tool, so other AI systems can delegate coding tasks to it:
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";
export function createAgentMcpServer() {
  const mcpServer = new McpServer({
    name: "claude-agent",
    version: "1.0.0",
  });
  mcpServer.tool(
    "run_claude_agent",
    "Execute autonomous coding task with full codebase access",
    {
      prompt: z.string().describe("Task for Claude to execute"),
    },
    async ({ prompt }) => {
      const messages = await executeClaudeAgentSync({ prompt });
      const resultMessage = messages.find((msg) => msg.type === "result");
      if (resultMessage?.result) {
        return {
          content: [{
            type: "text",
            text: JSON.stringify({
              result: resultMessage.result,
              usage: resultMessage.usage,
              cost_usd: resultMessage.total_cost_usd,
            }, null, 2),
          }],
        };
      }
      return {
        content: [{ type: "text", text: "No result returned" }],
        isError: true,
      };
    }
  );
  return mcpServer.server;
}
This enables a meta-agent pattern:
- Primary AI system receives user request
- Identifies complex coding subtasks
- Delegates to Claude Agent via MCP
- Claude executes with full codebase access
- Results flow back to primary AI
MCP Configuration
Define MCP servers in .mcp.json at your project root:
{
"mcpServers": {
"database-tools": {
"type": "http",
"url": "https://your-mcp-server.com/mcp/database"
},
"api-tools": {
"type": "http",
"url": "https://your-mcp-server.com/mcp/api"
}
}
}
Project instructions go in CLAUDE.md for consistent agent behavior across all tasks.
Production Considerations
Security
Authentication: Require API keys for all endpoints:
const authHeader = request.headers.authorization;
const expectedKey = process.env.API_KEY;
if (!authHeader?.startsWith("Bearer ") ||
authHeader.substring(7) !== expectedKey) {
return reply.status(401).send({ error: "Unauthorized" });
}
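A plain !== comparison can, in principle, leak how many leading characters of a key matched through response timing. One common hardening, sketched here as an option rather than as part of the original implementation, is Node's crypto.timingSafeEqual. The helper names safeEqual and isAuthorized are hypothetical:

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hash both values first so the buffers have equal length, which timingSafeEqual requires.
function safeEqual(a: string, b: string): boolean {
  const hashA = createHash("sha256").update(a).digest();
  const hashB = createHash("sha256").update(b).digest();
  return timingSafeEqual(hashA, hashB);
}

// Returns true only for a well-formed Bearer header carrying the expected key.
function isAuthorized(
  authHeader: string | undefined,
  expectedKey: string | undefined,
): boolean {
  // Fail closed when no key is configured, rather than comparing against undefined.
  if (!expectedKey || !authHeader?.startsWith("Bearer ")) return false;
  return safeEqual(authHeader.slice(7), expectedKey);
}
```

In the route handler this would replace the inline check: respond with 401 whenever isAuthorized(request.headers.authorization, process.env.API_KEY) is false.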
Sandboxing: Since bypassPermissions allows any operation, consider:
- Running in isolated containers
- Implementing additional authorization checks
- Auditing all agent actions
- Rate limiting requests
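For the rate-limiting item, a token bucket per API key is one common approach. The sketch below is illustrative only: createRateLimiter is a hypothetical name, state lives in process memory (so it does not work across multiple instances), and the clock is injectable purely to make the behavior easy to verify:

```typescript
// Token bucket per key: `capacity` is the burst size, `refillPerSecond` the sustained rate.
function createRateLimiter(
  capacity: number,
  refillPerSecond: number,
  now: () => number = Date.now,
): (key: string) => boolean {
  const buckets = new Map<string, { tokens: number; lastRefill: number }>();
  return function tryConsume(key: string): boolean {
    const current = now();
    const bucket = buckets.get(key) ?? { tokens: capacity, lastRefill: current };
    // Refill in proportion to elapsed time, never exceeding capacity.
    const elapsedSeconds = (current - bucket.lastRefill) / 1000;
    bucket.tokens = Math.min(capacity, bucket.tokens + elapsedSeconds * refillPerSecond);
    bucket.lastRefill = current;
    buckets.set(key, bucket);
    if (bucket.tokens < 1) return false;
    bucket.tokens -= 1;
    return true;
  };
}

// Usage: allow bursts of 2 requests per key, refilling one token per second.
const allowRequest = createRateLimiter(2, 1);
```

A handler would call allowRequest(apiKey) before enqueueing a job and return HTTP 429 when it yields false. Since each agent run can be expensive, low capacities are a reasonable starting point.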
Resource Management
Timeouts: Prevent runaway jobs:
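const TIMEOUT = 30 * 60 * 1000; // 30 minutes
let timer;
const timeoutPromise = new Promise((_, reject) => {
  timer = setTimeout(() => reject(new Error('Timeout')), TIMEOUT);
});
let result;
try {
  // Losing the race only stops the wait; the agent itself keeps running unless it is cancelled
  result = await Promise.race([
    executeClaudeAgentSync({ prompt }),
    timeoutPromise
  ]);
} finally {
  clearTimeout(timer); // otherwise the pending timer keeps the process alive for 30 minutes
}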
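Promise.race on its own only stops waiting; the losing task keeps running. A reusable variant can also signal cancellation through an AbortController. This is a sketch: withTimeout is a hypothetical helper, and connecting the signal to the agent (the SDK's query options are understood to accept an abortController) is an assumption to confirm against the SDK reference:

```typescript
// Run `task` with a deadline; on timeout, abort the signal and reject.
async function withTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort(); // ask the task to stop its work
      reject(new Error(`Timed out after ${timeoutMs} ms`));
    }, timeoutMs);
  });
  try {
    return await Promise.race([task(controller.signal), timeout]);
  } finally {
    clearTimeout(timer); // do not leave the timer pending after an early finish
  }
}
```

The task decides what aborting means: a job worker might forward the signal to the agent and mark the job as failed once the rejection surfaces.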
Cleanup: Remove old completed jobs periodically:
-- MySQL/MariaDB syntax; adjust the date arithmetic for other databases
DELETE FROM claude_worker_tasks
WHERE completed_at < DATE_SUB(NOW(), INTERVAL 30 DAY);
Monitoring
Cost Tracking: The agent reports costs automatically:
const resultMessage = messages.find(msg => msg.type === "result");
const usage = resultMessage?.usage;
console.log(`Cost: $${resultMessage?.total_cost_usd}`);
// Default to 0 so a missing usage field does not print NaN
console.log(`Tokens: ${(usage?.input_tokens ?? 0) + (usage?.output_tokens ?? 0)}`);
Key Metrics:
- Job completion rate
- Average execution time
- Token usage and costs
- Error rates
- Queue depth
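Several of these metrics can be derived directly from finished job records. A minimal sketch, assuming each record carries the status, timestamps, and cost saved by the job processor. The FinishedJob shape and summarizeJobs are illustrative names, not part of the original code:

```typescript
interface FinishedJob {
  status: "completed" | "failed";
  started_at: Date;
  completed_at: Date;
  cost_usd?: number;
}

// Aggregate completion rate, error rate, mean duration, and total cost.
function summarizeJobs(jobs: FinishedJob[]) {
  const total = jobs.length;
  const completed = jobs.filter((job) => job.status === "completed").length;
  const totalMs = jobs.reduce(
    (sum, job) => sum + (job.completed_at.getTime() - job.started_at.getTime()),
    0,
  );
  const totalCostUsd = jobs.reduce((sum, job) => sum + (job.cost_usd ?? 0), 0);
  return {
    completionRate: total === 0 ? 0 : completed / total,
    errorRate: total === 0 ? 0 : (total - completed) / total,
    avgExecutionMs: total === 0 ? 0 : totalMs / total,
    totalCostUsd,
  };
}
```

Queue depth is not covered here; it would be a count of rows still in the pending state.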
Use Cases
Automated Code Analysis
const prompt = `
Analyze the authentication system:
1. List all auth-related files
2. Document the flow
3. Identify security issues
4. Generate a Mermaid diagram
`;
Codebase Refactoring
const prompt = `
Convert class components to hooks:
1. Find all class components
2. Convert to functional components
3. Replace lifecycle methods
4. Update tests
`;
Documentation Generation
const prompt = `
Generate API documentation:
1. Find all route definitions
2. Extract schemas
3. Document auth requirements
4. Create OpenAPI spec
`;
Agent SDK vs Standard API
| Feature | Standard API | Agent SDK |
|---|---|---|
| File Access | None | Full read/write |
| Bash Commands | None | Full execution |
| MCP Tools | Limited | Full integration |
| Multi-step | Manual | Autonomous |
| Best For | Chat, generation | Code, automation |
Best Practices
1. Structure Clear Prompts
const prompt = `
## Task: Add User Authentication
### Steps:
1. Implement JWT-based auth
2. Create login/logout endpoints
3. Add middleware
4. Write tests
### Criteria:
- All tests pass
- No TypeScript errors
- Follow security best practices
`;
2. Use Continue for Context
// Initial analysis
await executeClaudeAgentSync({
prompt: "Analyze the auth system",
continue: true
});
// Follow-up with context preserved
await executeClaudeAgentSync({
prompt: "Based on your analysis, implement OAuth2",
continue: true
});
3. Handle Errors Gracefully
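try {
  const messages = await executeClaudeAgentSync({ prompt });
  const result = messages.find(m => m.type === "result");
  // result.result is the output text, not an object with an error field;
  // failed runs are reported through a subtype other than "success"
  if (!result || result.subtype !== "success") {
    throw new Error(`Agent failed: ${result?.subtype ?? "no result message"}`);
  }
  return result.result;
} catch (error) {
  logger.error({ error, prompt }, "Agent failed");
  // Implement retry logic if appropriate
  throw error;
}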
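The retry logic mentioned in the comment could take the form of exponential backoff. The sketch below is one possible shape: withRetry and its option names are hypothetical, and because an agent run may already have modified files, retries are only appropriate for tasks that are safe to repeat:

```typescript
// Retry an async operation with exponential backoff: base, 2x base, 4x base, ...
async function withRetry<T>(
  operation: () => Promise<T>,
  {
    maxAttempts = 3,
    baseDelayMs = 1000,
    shouldRetry = (_error: unknown) => true,
  } = {},
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Stop on the final attempt or when the caller marks the error as permanent.
      if (attempt === maxAttempts || !shouldRetry(error)) break;
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

A caller might wrap a read-only analysis prompt in withRetry while leaving refactoring jobs to fail fast, using shouldRetry to skip errors that will not resolve on their own.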
4. Set Cost Alerts
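const COST_THRESHOLD = 5.0; // USD per execution
const result = messages.find(m => m.type === "result");
// Treat a missing cost as 0 so the comparison never runs against undefined
const cost = result?.total_cost_usd ?? 0;
if (cost > COST_THRESHOLD) {
  await sendAlert(`High cost execution: $${cost.toFixed(2)}`);
}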
Conclusion
The Claude Agent SDK unlocks autonomous coding capabilities that go far beyond traditional AI APIs. By combining it with robust job queues, streaming responses, and MCP integration, you create a production-ready platform for complex automation tasks.
Key takeaways:
- Choose async for long tasks, streaming for real-time feedback, MCP for delegation
- Implement proper authentication and sandboxing
- Monitor costs and performance closely
- Provide clear prompts and project context
- Handle failures with retries and proper error messages
The combination of autonomous agent capabilities with well-architected API patterns opens new possibilities for intelligent automation—from code analysis to refactoring to complex workflows.
Interested in AI-powered automation? Check out more patterns and implementations at blle.co/blog.