
Building AI Applications with Claude 4.5 Opus

Comprehensive guide to building production AI applications using Claude 4.5 Opus, covering tool use, streaming, caching, and advanced prompting techniques.


Dibyank Padhy

Engineering Manager & Full Stack Developer


Claude 4.5 Opus represents a significant leap in AI capabilities, offering enhanced reasoning, better instruction following, and improved tool use. In this guide, we'll build production-ready AI applications leveraging these capabilities.

Setting Up the Claude SDK

First, let's set up a robust Claude client with proper error handling and retry logic:

typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

interface Message {
  role: 'user' | 'assistant';
  content: string | ContentBlock[];
}

interface ContentBlock {
  type: 'text' | 'tool_use' | 'tool_result';
  text?: string;
  id?: string;
  name?: string;
  input?: any;
  tool_use_id?: string;
  content?: string;
  is_error?: boolean; // set on tool_result blocks when the tool threw
}

// Retry wrapper with exponential backoff
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelay = 1000
): Promise<T> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      const isRetryable =
        error.status === 429 || // Rate limit
        error.status === 529 || // Overloaded
        error.status >= 500;    // Server error

      if (!isRetryable || attempt === maxRetries) {
        throw error;
      }

      const delay = baseDelay * Math.pow(2, attempt - 1);
      console.log(`Attempt ${attempt} failed, retrying in ${delay}ms...`);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw new Error('Retry failed');
}
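As a quick sanity check on the schedule above: the delay before retry n is baseDelay * 2^(n-1), and sleeps only happen after attempts 1 through maxRetries - 1 (the final failure is rethrown immediately). A minimal sketch of the schedule:

```typescript
// Backoff schedule for withRetry's defaults: delay = baseDelay * 2^(attempt - 1).
// With maxRetries = 3, sleeps happen only after attempts 1 and 2; a third
// failure is rethrown without waiting.
function backoffSchedule(maxRetries: number, baseDelay: number): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < maxRetries; attempt++) {
    delays.push(baseDelay * Math.pow(2, attempt - 1));
  }
  return delays;
}

console.log(backoffSchedule(3, 1000)); // [1000, 2000]
```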

Tool Use (Function Calling)

Claude 4.5 Opus excels at tool use. Here's how to implement a multi-tool system:

typescript
// Define tools
const tools: Anthropic.Tool[] = [
  {
    name: 'get_weather',
    description: 'Get current weather for a location. Use this when the user asks about weather conditions.',
    input_schema: {
      type: 'object',
      properties: {
        location: {
          type: 'string',
          description: 'City and country, e.g., "London, UK"',
        },
        units: {
          type: 'string',
          enum: ['celsius', 'fahrenheit'],
          description: 'Temperature units',
        },
      },
      required: ['location'],
    },
  },
  {
    name: 'search_database',
    description: 'Search the product database for items matching criteria.',
    input_schema: {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'Search query' },
        category: { type: 'string', description: 'Product category' },
        max_price: { type: 'number', description: 'Maximum price filter' },
        limit: { type: 'number', description: 'Max results to return' },
      },
      required: ['query'],
    },
  },
  {
    name: 'create_order',
    description: 'Create a new order for a customer. Only use after confirming with user.',
    input_schema: {
      type: 'object',
      properties: {
        product_ids: {
          type: 'array',
          items: { type: 'string' },
          description: 'Product IDs to order',
        },
        customer_id: { type: 'string', description: 'Customer ID' },
        shipping_address: {
          type: 'object',
          properties: {
            street: { type: 'string' },
            city: { type: 'string' },
            postal_code: { type: 'string' },
            country: { type: 'string' },
          },
          required: ['street', 'city', 'postal_code', 'country'],
        },
      },
      required: ['product_ids', 'customer_id', 'shipping_address'],
    },
  },
];

// Tool implementations
const toolImplementations: Record<string, (input: any) => Promise<any>> = {
  get_weather: async ({ location, units = 'celsius' }) => {
    const response = await fetch(
      `https://api.weather.service/v1/current?location=${encodeURIComponent(location)}&units=${units}`
    );
    return response.json();
  },

  search_database: async ({ query, category, max_price, limit = 10 }) => {
    // Placeholder: substitute your own database client
    return db.products.search({ query, category, max_price, limit });
  },

  create_order: async ({ product_ids, customer_id, shipping_address }) => {
    // Placeholder: substitute your own order service
    return orderService.create({ product_ids, customer_id, shipping_address });
  },
};

// Agentic loop with tool execution
async function runAgent(userMessage: string, conversationHistory: Message[] = []) {
  const messages: Message[] = [
    ...conversationHistory,
    { role: 'user', content: userMessage },
  ];

  while (true) {
    const response = await withRetry(() =>
      anthropic.messages.create({
        model: 'claude-opus-4-5-20251101',
        max_tokens: 4096,
        system: `You are a helpful shopping assistant. Use the available tools to help customers find products and place orders. Always confirm order details before creating an order.`,
        tools,
        messages,
      })
    );

    // Check if we need to execute tools
    const toolUseBlocks = response.content.filter(
      (block): block is Anthropic.ToolUseBlock => block.type === 'tool_use'
    );

    if (toolUseBlocks.length === 0) {
      // No tool use - return the text response
      const textBlock = response.content.find(
        (block): block is Anthropic.TextBlock => block.type === 'text'
      );
      return {
        response: textBlock?.text || '',
        messages,
      };
    }

    // Execute tools
    const toolResults: ContentBlock[] = [];

    for (const toolUse of toolUseBlocks) {
      console.log(`Executing tool: ${toolUse.name}`, toolUse.input);

      try {
        const implementation = toolImplementations[toolUse.name];
        if (!implementation) {
          throw new Error(`Unknown tool: ${toolUse.name}`);
        }

        const result = await implementation(toolUse.input);
        toolResults.push({
          type: 'tool_result',
          tool_use_id: toolUse.id,
          content: JSON.stringify(result),
        });
      } catch (error: any) {
        toolResults.push({
          type: 'tool_result',
          tool_use_id: toolUse.id,
          content: JSON.stringify({ error: error.message }),
          is_error: true,
        });
      }
    }

    // Add assistant response and tool results to messages
    messages.push({ role: 'assistant', content: response.content });
    messages.push({ role: 'user', content: toolResults });
  }
}

Streaming Responses

For better UX, implement streaming to show responses as they generate:

typescript
async function* streamResponse(
  userMessage: string,
  systemPrompt: string
): AsyncGenerator<string> {
  const stream = anthropic.messages.stream({
    model: 'claude-opus-4-5-20251101',
    max_tokens: 4096,
    system: systemPrompt,
    messages: [{ role: 'user', content: userMessage }],
  });

  for await (const event of stream) {
    if (
      event.type === 'content_block_delta' &&
      event.delta.type === 'text_delta'
    ) {
      yield event.delta.text;
    }
  }

  // Get final message for metadata
  const finalMessage = await stream.finalMessage();
  console.log('Token usage:', finalMessage.usage);
}

// Express.js streaming endpoint (assumes an existing app: const app = express())
app.post('/api/chat/stream', async (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  try {
    const generator = streamResponse(req.body.message, req.body.systemPrompt);

    for await (const chunk of generator) {
      res.write(`data: ${JSON.stringify({ text: chunk })}\n\n`);
    }

    res.write(`data: [DONE]\n\n`);
  } catch (error) {
    res.write(`data: ${JSON.stringify({ error: 'Stream error' })}\n\n`);
  } finally {
    res.end();
  }
});

// Client-side consumption
async function consumeStream(message: string) {
  const response = await fetch('/api/chat/stream', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });

  const reader = response.body?.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (reader) {
    const { done, value } = await reader.read();
    if (done) break;

    // SSE events can split across network chunks, so buffer partial lines
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep any trailing partial line for the next chunk

    for (const line of lines) {
      if (line.startsWith('data: ')) {
        const data = line.slice(6);
        if (data === '[DONE]') return;

        const parsed = JSON.parse(data);
        if (parsed.text) {
          appendToUI(parsed.text);
        }
      }
    }
  }
}
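The wire format the endpooint and client agree on ("data: ..." lines with a literal [DONE] sentinel) can be factored into a small pure parser, which is easier to unit-test than the fetch loop. This is a sketch of the same framing used above, not an SDK API:

```typescript
// Parse one decoded chunk of the SSE stream produced by /api/chat/stream.
// Returns the extracted text fragments and whether [DONE] was seen.
// A real client should still buffer partial lines across chunks.
function parseSSEChunk(chunk: string): { texts: string[]; done: boolean } {
  const texts: string[] = [];
  for (const line of chunk.split('\n')) {
    if (!line.startsWith('data: ')) continue;
    const data = line.slice('data: '.length);
    if (data === '[DONE]') return { texts, done: true };
    const parsed = JSON.parse(data);
    if (typeof parsed.text === 'string') texts.push(parsed.text);
  }
  return { texts, done: false };
}

console.log(parseSSEChunk('data: {"text":"Hel"}\n\ndata: {"text":"lo"}\n\n'));
// { texts: [ 'Hel', 'lo' ], done: false }
```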

Prompt Caching

Use prompt caching to reduce latency and costs for repeated prompts:

typescript
// Enable caching for system prompts and large context
async function chatWithCaching(
  userMessage: string,
  documentContext: string
) {
  const response = await anthropic.messages.create({
    model: 'claude-opus-4-5-20251101',
    max_tokens: 4096,
    system: [
      {
        type: 'text',
        text: `You are an expert analyst. Use the following document to answer questions accurately.`,
      },
      {
        type: 'text',
        text: documentContext,
        cache_control: { type: 'ephemeral' }, // Cache this block
      },
    ],
    messages: [{ role: 'user', content: userMessage }],
  });

  console.log('Cache metrics:', {
    input_tokens: response.usage.input_tokens,
    cache_creation_input_tokens: response.usage.cache_creation_input_tokens,
    cache_read_input_tokens: response.usage.cache_read_input_tokens,
  });

  return response;
}

// Multi-turn conversation with cached context
class CachedConversation {
  private messages: Message[] = [];
  private cachedContext: string;

  constructor(context: string) {
    this.cachedContext = context;
  }

  async chat(userMessage: string): Promise<string> {
    this.messages.push({ role: 'user', content: userMessage });

    const response = await anthropic.messages.create({
      model: 'claude-opus-4-5-20251101',
      max_tokens: 4096,
      system: [
        {
          type: 'text',
          text: 'You are a helpful assistant with access to the user\'s documents.',
        },
        {
          type: 'text',
          text: this.cachedContext,
          cache_control: { type: 'ephemeral' },
        },
      ],
      messages: this.messages,
    });

    const assistantMessage = response.content[0].type === 'text'
      ? response.content[0].text
      : '';

    this.messages.push({ role: 'assistant', content: assistantMessage });

    return assistantMessage;
  }
}
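To see why caching pays off, you can estimate input cost from the usage fields logged earlier. The multipliers below (cache writes at roughly 1.25x and cache reads at roughly 0.1x the base input rate) match Anthropic's published pricing for ephemeral caching at the time of writing, but treat them as assumptions and check current pricing; baseRatePerToken here is a hypothetical per-token price:

```typescript
// Rough input-cost estimate from a response's usage block.
// Assumed multipliers: cache writes ~1.25x base input rate, cache reads ~0.1x.
// Verify against current Anthropic pricing before relying on these numbers.
interface CacheUsage {
  input_tokens: number;                 // uncached input tokens
  cache_creation_input_tokens: number;  // tokens written to the cache
  cache_read_input_tokens: number;      // tokens served from the cache
}

function estimateInputCost(usage: CacheUsage, baseRatePerToken: number): number {
  return (
    usage.input_tokens * baseRatePerToken +
    usage.cache_creation_input_tokens * baseRatePerToken * 1.25 +
    usage.cache_read_input_tokens * baseRatePerToken * 0.1
  );
}

// First call writes a 10k-token document to the cache; later calls read it back.
const firstCall = { input_tokens: 50, cache_creation_input_tokens: 10000, cache_read_input_tokens: 0 };
const laterCall = { input_tokens: 50, cache_creation_input_tokens: 0, cache_read_input_tokens: 10000 };
console.log(estimateInputCost(laterCall, 1) < estimateInputCost(firstCall, 1)); // true
```

Under these assumed rates, every turn after the first pays about a tenth of the base price for the cached document, which is where the savings in long multi-turn conversations come from.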

Structured Output with JSON Prompting

Rather than relying on a dedicated JSON-mode switch, extract structured data reliably by prompting Claude to return only JSON and validating the result:

typescript
interface ExtractedData {
  entities: Array<{
    name: string;
    type: 'person' | 'organization' | 'location' | 'date';
    confidence: number;
  }>;
  sentiment: 'positive' | 'negative' | 'neutral';
  summary: string;
  key_points: string[];
}

async function extractStructuredData(text: string): Promise<ExtractedData> {
  const response = await anthropic.messages.create({
    model: 'claude-opus-4-5-20251101',
    max_tokens: 2048,
    messages: [
      {
        role: 'user',
        content: `Analyze the following text and extract structured information.

Text:
${text}

Respond with a JSON object containing:
- entities: array of {name, type, confidence} where type is person/organization/location/date
- sentiment: positive/negative/neutral
- summary: brief summary
- key_points: array of main points

Return only valid JSON, no markdown formatting.`,
      },
    ],
  });

  const textContent = response.content.find(b => b.type === 'text');
  if (!textContent || textContent.type !== 'text') {
    throw new Error('No text response');
  }

  // Parse and validate
  const data = JSON.parse(textContent.text);

  // Type validation
  if (!Array.isArray(data.entities) || !data.sentiment || !data.summary) {
    throw new Error('Invalid response structure');
  }

  return data as ExtractedData;
}

// With Zod validation for type safety
import { z } from 'zod';

const ExtractedDataSchema = z.object({
  entities: z.array(z.object({
    name: z.string(),
    type: z.enum(['person', 'organization', 'location', 'date']),
    confidence: z.number().min(0).max(1),
  })),
  sentiment: z.enum(['positive', 'negative', 'neutral']),
  summary: z.string(),
  key_points: z.array(z.string()),
});

async function extractWithValidation(text: string) {
  const response = await extractStructuredData(text);
  return ExtractedDataSchema.parse(response);
}
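Even with a "return only valid JSON" instruction, models occasionally wrap their output in a markdown code fence, which makes the bare JSON.parse above throw. A small defensive helper (a plain string utility, not an SDK feature) handles both cases:

```typescript
// Strip an optional markdown code fence (``` or ```json) from a model reply
// before parsing. If no fence is present, the input is returned trimmed.
function stripCodeFence(raw: string): string {
  const trimmed = raw.trim();
  const match = trimmed.match(/^```(?:json)?\s*\n([\s\S]*?)\n```$/);
  return match ? match[1] : trimmed;
}

console.log(stripCodeFence('```json\n{"ok": true}\n```')); // {"ok": true}
console.log(stripCodeFence('{"ok": true}'));               // {"ok": true}
```

Calling JSON.parse(stripCodeFence(textContent.text)) instead of JSON.parse(textContent.text) makes the extraction path tolerant of fenced replies.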

Building a RAG Pipeline

Implement Retrieval-Augmented Generation for document Q&A:

typescript
import { OpenAI } from 'openai';

const openai = new OpenAI(); // For embeddings

interface Document {
  id: string;
  content: string;
  metadata: Record<string, any>;
  embedding?: number[];
}

// Generate embeddings
async function generateEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
  return response.data[0].embedding;
}

// Cosine similarity
function cosineSimilarity(a: number[], b: number[]): number {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dotProduct / (magnitudeA * magnitudeB);
}

// RAG system
class RAGSystem {
  private documents: Document[] = [];

  async addDocument(content: string, metadata: Record<string, any> = {}) {
    const embedding = await generateEmbedding(content);
    this.documents.push({
      id: crypto.randomUUID(),
      content,
      metadata,
      embedding,
    });
  }

  async query(question: string, topK = 3): Promise<string> {
    // Get question embedding
    const questionEmbedding = await generateEmbedding(question);

    // Find most relevant documents
    const scored = this.documents.map(doc => ({
      doc,
      score: cosineSimilarity(questionEmbedding, doc.embedding!),
    }));

    scored.sort((a, b) => b.score - a.score);
    const relevant = scored.slice(0, topK);

    // Build context
    const context = relevant
      .map((r, i) => `[Document ${i + 1}] (relevance: ${r.score.toFixed(2)})\n${r.doc.content}`)
      .join('\n\n---\n\n');

    // Query Claude with context
    const response = await anthropic.messages.create({
      model: 'claude-opus-4-5-20251101',
      max_tokens: 2048,
      system: [
        {
          type: 'text',
          text: `You are a helpful assistant that answers questions based on the provided documents.
If the answer cannot be found in the documents, say so clearly.
Always cite which document(s) you're drawing information from.`,
        },
        {
          type: 'text',
          text: `Reference Documents:\n\n${context}`,
          cache_control: { type: 'ephemeral' },
        },
      ],
      messages: [{ role: 'user', content: question }],
    });

    return response.content[0].type === 'text' ? response.content[0].text : '';
  }
}

// Usage
const rag = new RAGSystem();

await rag.addDocument('AWS Lambda pricing is based on requests and duration...');
await rag.addDocument('CloudWatch can monitor Lambda functions with custom metrics...');
await rag.addDocument('Lambda layers allow sharing code between functions...');

const answer = await rag.query('How is Lambda pricing calculated?');
console.log(answer);
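To sanity-check the cosineSimilarity helper the retrieval step relies on: identical vectors score 1, orthogonal vectors score 0, and vectors at 45 degrees score about 0.707. The function is repeated here so the snippet runs standalone:

```typescript
// Cosine similarity between two equal-length vectors (duplicated from the
// RAG pipeline above so this snippet is self-contained).
function cosineSimilarity(a: number[], b: number[]): number {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dotProduct / (magnitudeA * magnitudeB);
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
console.log(cosineSimilarity([1, 1], [1, 0]).toFixed(3)); // "0.707"
```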

Conclusion

Claude 4.5 Opus provides powerful capabilities for building sophisticated AI applications. By implementing proper tool use, streaming, caching, and RAG patterns, you can create production-ready systems that are both performant and cost-effective.

Key takeaways:

  • Use tools (function calling) to build complex agentic workflows
  • Implement streaming for better user experience
  • Leverage prompt caching for repeated contexts
  • Combine with RAG for knowledge-grounded responses


About the Author

Dibyank Padhy is an Engineering Manager & Full Stack Developer with 7+ years of experience building scalable software solutions. Passionate about cloud architecture, team leadership, and AI integration.