How I Built an AI-Powered Defense Intelligence Platform with ChatGPT and Claude

A deep dive into building SalesBridge.ai - an AI platform that processes 500+ defense opportunities daily using a dual-LLM architecture built on the ChatGPT and Claude APIs.

Dibyank Padhy

Engineering Manager & Full Stack Developer

The Problem: Information Overload in Defense Sales

The US defense industry generates thousands of contract opportunities across dozens of government websites every single day. For companies trying to break into or grow within this sector, the challenge is not a lack of opportunities - it is the sheer volume of data that needs to be processed, analyzed, and matched to specific capabilities.

Before building SalesBridge.ai, sales teams at defense companies were spending upwards of 20 hours per week manually browsing government procurement sites like SAM.gov, DIBBS, and various agency-specific portals. They would copy-paste opportunity descriptions into spreadsheets, try to parse complex requirement documents, and somehow match these against their company capabilities. It was tedious, error-prone, and incredibly inefficient.

I set out to build a platform that could automate this entire workflow - scrape opportunities from 11 different defense websites, use AI to analyze and categorize them, and intelligently match them to a company's specific capabilities. The result was SalesBridge.ai, which now processes over 500 opportunities daily and has reduced manual research time from 20 hours to just 30 minutes per week.

Architecture Overview: The Dual-LLM Approach

One of the most important architectural decisions I made early on was to use a dual-LLM approach, combining both OpenAI's ChatGPT API and Anthropic's Claude API. This was not just for redundancy - each model has distinct strengths that complement each other in a production pipeline.

Why Two LLMs?

ChatGPT excels at structured data extraction - parsing NAICS codes, contract values, and deadline dates from unstructured text

Claude is superior at nuanced analysis - understanding the intent behind requirements, identifying implicit qualifications, and generating match scores

Using both provides a consensus mechanism - when both models agree on a classification, confidence is significantly higher

Redundancy supports high availability - if one API has issues, the other can take over the workload
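
The consensus and fallback ideas in the list above can be sketched roughly as follows. This is a minimal illustration, not the SalesBridge.ai implementation: `classify_with_consensus`, the `Classifier` type, and the confidence labels are hypothetical names chosen for the example, and each classifier is assumed to be an async function wrapping one provider's API.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Hypothetical type: an async function that maps opportunity text to a label.
Classifier = Callable[[str], Awaitable[str]]


async def _safe_call(classifier: Classifier, text: str) -> Optional[str]:
    """Run one classifier, returning None if the provider call fails."""
    try:
        return await classifier(text)
    except Exception:
        return None


async def classify_with_consensus(
    text: str, primary: Classifier, secondary: Classifier
) -> dict:
    """Query both models concurrently and combine their answers."""
    a, b = await asyncio.gather(
        _safe_call(primary, text), _safe_call(secondary, text)
    )
    if a is None and b is None:
        raise RuntimeError("both providers failed")
    if a is None or b is None:
        # Fallback: one provider is down, so use the other's answer.
        label = a if a is not None else b
        return {"label": label, "confidence": "medium", "agreed": False}
    if a == b:
        # Consensus: both models produced the same label.
        return {"label": a, "confidence": "high", "agreed": True}
    # Disagreement: keep the primary's answer but mark it for human review.
    return {"label": a, "confidence": "low", "agreed": False}
```

Running both calls concurrently keeps latency close to that of the slower model, and the `agreed` flag gives a simple hook for routing disagreements to manual review.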

The Web Scraping Engine

The foundation of SalesBridge.ai is its web scraping engine, built entirely in Python. It processes 11 defense procurement websites on a scheduled basis, handling everything from simple HTML pages to JavaScript-rendered SPAs and PDF documents.

python
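import asyncio
from datetime import date

import aiohttp
from bs4 import BeautifulSoup
from playwright.async_api import async_playwright

class DefenseScraperEngine:
    def __init__(self, config):
        self.config = config
        # AsyncRateLimiter is a project helper defined elsewhere; it caps
        # outgoing requests per time window.
        self.rate_limiter = AsyncRateLimiter(
            max_requests=config.max_requests_per_minute,
            time_window=60
        )

    async def scrape_sam_gov(self):
        """Scrape opportunities from SAM.gov API"""
        async with aiohttp.ClientSession() as session:
            # SAM.gov requires both postedFrom and postedTo, formatted MM/dd/yyyy;
            # get_last_scrape_date() (defined elsewhere) is assumed to return that format.
            params = {
                'api_key': self.config.sam_api_key,
                'postedFrom': self.get_last_scrape_date(),
                'postedTo': date.today().strftime('%m/%d/%Y'),
                'limit': 100,
                'offset': 0,
            }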

            all_opportunities = []
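            while True:
                await self.rate_limiter.acquire()
                async with session.get(
                    'https://api.sam.gov/opportunities/v2/search',
                    params=params
                ) as resp:
                    # Fail loudly on HTTP errors instead of parsing an error body
                    resp.raise_for_status()
                    data = await resp.json()
                    opportunities = data.get('opportunitiesData', [])
                    if not opportunities:
                        break
                    all_opportunities.extend(opportunities)
                    # Advance by the page size rather than a hard-coded 100
                    params['offset'] += params['limit']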

            return all_opportunities

    async def scrape_dynamic_site(self, url, selectors):
        """Handle JS-rendered procurement portals"""
        async with async_playwright() as p:
            browser = await p.chromium.launch(headless=True)
            try:
                page = await browser.new_page()
                await page.goto(url, wait_until='networkidle')
                content = await page.content()
            finally:
                # Always release the browser, even if navigation fails
                await browser.close()

        soup = BeautifulSoup(content, 'html.parser')

        opportunities = []
        for item in soup.select(selectors['listing']):
            title = item.select_one(selectors['title'])
            agency = item.select_one(selectors['agency'])
            deadline = item.select_one(selectors['deadline'])
            link = item.select_one('a')
            # Skip listings that do not match the expected markup
            if any(el is None for el in (title, agency, deadline, link)):
                continue
            opportunities.append({
                'title': title.text.strip(),
                'agency': agency.text.strip(),
                'deadline': deadline.text.strip(),
                'url': link.get('href'),
            })

        return opportunities
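
The scraper above calls `AsyncRateLimiter(max_requests=..., time_window=60)` and `acquire()` but does not define the class. One way such a helper could look is a sliding-window limiter like the sketch below. Only the class name, constructor arguments, and `acquire()` come from the code above; the internals are an assumption for illustration.

```python
import asyncio
import time
from collections import deque


class AsyncRateLimiter:
    """Allow at most max_requests acquisitions per time_window seconds."""

    def __init__(self, max_requests: int, time_window: float):
        self.max_requests = max_requests
        self.time_window = time_window
        self._timestamps = deque()  # monotonic times of recent acquisitions
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        # Holding the lock while waiting serializes callers, so requests
        # are released one at a time as slots free up.
        async with self._lock:
            while True:
                now = time.monotonic()
                # Drop timestamps that have left the window.
                while (self._timestamps
                       and now - self._timestamps[0] >= self.time_window):
                    self._timestamps.popleft()
                if len(self._timestamps) < self.max_requests:
                    self._timestamps.append(now)
                    return
                # Sleep until the oldest request leaves the window.
                wait = self.time_window - (now - self._timestamps[0])
                await asyncio.sleep(wait)
```

A sliding window avoids the burst at window boundaries that a fixed per-minute counter allows, which matters when a government site throttles aggressively.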

AI-Powered Analysis Pipeline

Once raw opportunities are scraped, they enter the AI analysis pipeline. This is where the dual-LLM architecture really shines. Each opportunity goes through three stages: extraction, classification, and matching.

Stage 1: Structured Data Extraction

The first stage uses ChatGPT to extract structured data from the often messy, inconsistent text of procurement notices. Government websites have wildly different formats - some use tables, others use free-form text, and many include lengthy legal boilerplate that needs to be filtered out.

python
import json

async def extract_structured_data(self, raw_text: str) -> dict:
    """Use ChatGPT to extract structured fields from raw opportunity text"""

    response = await self.openai_client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {
                "role": "system",
                "content": """Extract the following fields from this defense
                procurement opportunity. Return valid JSON only.
                Fields: title, agency, naics_codes (list), set_aside_type,
                estimated_value, response_deadline, place_of_performance,
                key_requirements (list), security_clearance_required (bool),
                small_business_eligible (bool)"""
            },
            # Truncate by characters (not tokens) to bound the prompt size
            {"role": "user", "content": raw_text[:8000]}
        ],
        response_format={"type": "json_object"},
        temperature=0.1,
    )

    return json.loads(response.choices[0].message.content)
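
JSON mode constrains the response to syntactically valid JSON, but it does not guarantee that every requested field is present or correctly typed. A small normalization step before storage can catch that. The sketch below is an assumption about how such a check might look: the field names come from the prompt above, while `normalize_extraction` and the chosen defaults are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Field names come from the extraction prompt; the defaults are assumptions.
TEXT_FIELDS = ("title", "agency", "set_aside_type", "estimated_value",
               "response_deadline", "place_of_performance")
LIST_FIELDS = ("naics_codes", "key_requirements")
BOOL_FIELDS = ("security_clearance_required", "small_business_eligible")


def normalize_extraction(data: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Return a cleaned record plus the names of missing or mistyped fields."""
    cleaned: Dict[str, Any] = {}
    problems: List[str] = []
    for name in TEXT_FIELDS:
        value = data.get(name)
        if value is None or value == "":
            problems.append(name)
            cleaned[name] = None
        else:
            cleaned[name] = str(value).strip()
    for name in LIST_FIELDS:
        value = data.get(name)
        if isinstance(value, list):
            cleaned[name] = [str(v).strip() for v in value]
        else:
            problems.append(name)
            cleaned[name] = []
    for name in BOOL_FIELDS:
        value = data.get(name)
        if isinstance(value, bool):
            cleaned[name] = value
        else:
            problems.append(name)
            cleaned[name] = None  # unknown, rather than guessing False
    return cleaned, problems
```

Records with a non-empty `problems` list can be retried or routed to manual review instead of silently entering the matching stage.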

Stage 2: Intelligent Classification with Claude

Claude handles the more nuanced classification work. It reads the full opportunity description and categorizes it across multiple dimensions: technical domain, complexity level, competition intensity, and strategic value.

What makes Claude particularly effective here is its ability to understand context and subtext. A procurement notice might not explicitly say "this requires top-secret clearance" but will reference programs or facilities that imply it. Claude catches these nuances consistently.
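
A classification call along these lines might look like the sketch below. It is illustrative only: the four dimensions come from the description above, but the prompt wording, the `implied_requirements` key, and the `classify_opportunity` function are assumptions. The `client` argument is assumed to be an `anthropic.AsyncAnthropic` instance, and the model name is left to the caller.

```python
import json

# The four dimensions are named in the article; their allowed values are
# not, so the prompt leaves them open.
DIMENSIONS = ("technical_domain", "complexity_level",
              "competition_intensity", "strategic_value")

# Hypothetical prompt wording, including the implied_requirements key.
SYSTEM_PROMPT = (
    "You classify defense procurement opportunities. "
    "Respond with a single JSON object containing these keys: "
    + ", ".join(DIMENSIONS)
    + ", implied_requirements (list of strings). "
    "Under implied_requirements, list anything the notice implies but does "
    "not state outright, such as a likely security clearance."
)


async def classify_opportunity(client, description: str, model: str) -> dict:
    """Classify one opportunity using an Anthropic-style async client."""
    response = await client.messages.create(
        model=model,
        max_tokens=1024,
        system=SYSTEM_PROMPT,
        messages=[{"role": "user", "content": description}],
    )
    # The Messages API returns a list of content blocks; read the first one.
    result = json.loads(response.content[0].text)
    missing = [key for key in DIMENSIONS if key not in result]
    if missing:
        raise ValueError(f"classification missing keys: {missing}")
    return result
```

Asking for implied requirements as their own field keeps the model's inferences separate from what the notice states explicitly, which makes them easier to audit.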

Stage 3: Capability Matching

The final stage combines outputs from both models to generate a match score against a company's capability profile. This uses a weighted scoring algorithm that considers technical fit, past performance relevance, contract size, and timeline feasibility.
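
As a rough illustration of weighted scoring over those four factors, consider the sketch below. The weights are hypothetical (the actual values used in SalesBridge.ai are not given here), and each factor score is assumed to be pre-normalized to the range 0 to 1.

```python
from typing import Dict

# Hypothetical weights: the article names the four factors but not their weights.
WEIGHTS = {
    "technical_fit": 0.4,
    "past_performance": 0.3,
    "contract_size": 0.15,
    "timeline_feasibility": 0.15,
}


def match_score(factors: Dict[str, float],
                weights: Dict[str, float] = WEIGHTS) -> float:
    """Combine per-factor scores in [0, 1] into a weighted score from 0 to 100."""
    missing = set(weights) - set(factors)
    if missing:
        raise ValueError(f"missing factor scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    # Clamp each factor to [0, 1] so one bad input cannot dominate the score.
    weighted = sum(
        weights[name] * min(max(factors[name], 0.0), 1.0) for name in weights
    )
    return round(100 * weighted / total_weight, 1)
```

With these example weights, an opportunity scoring 0.9 on technical fit, 0.6 on past performance, 1.0 on contract size, and 0.5 on timeline feasibility gets a match score of 76.5.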

Results and Impact

After three months in production, SalesBridge.ai delivered measurable results that exceeded my initial projections:

75% improvement in lead quality - matched opportunities had a significantly higher win rate compared to manual identification

Reduced research time from 20 hours to 30 minutes per week - freeing sales teams to focus on proposal writing

500+ opportunities processed daily across 11 websites with 99.2% uptime

Cost per analysis dropped to under $0.03 per opportunity using optimized prompt engineering and batching

Lessons Learned

Building SalesBridge.ai taught me several valuable lessons about production AI systems:

Prompt engineering is an iterative science, not a one-time task. I went through 47 iterations of my extraction prompts before reaching production quality.

Always build with LLM-agnostic abstractions. When OpenAI changed its API, I only had to update one adapter class.

Cost management is critical. Without batching and caching, my API costs would have been 10x higher.

Human-in-the-loop validation is essential for the first few weeks. I manually reviewed every 10th classification to catch systematic errors.
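
The abstraction and caching lessons above can be combined in one small pattern, sketched below under stated assumptions: `LLMProvider` and `CachedProvider` are hypothetical names, the interface is synchronous for brevity (an async version would follow the same shape), and the cache is a plain in-memory dictionary rather than whatever store a production system would use.

```python
import hashlib
from typing import Dict, Protocol


class LLMProvider(Protocol):
    """Minimal provider-agnostic interface (hypothetical)."""

    def complete(self, system: str, user: str) -> str:
        ...


class CachedProvider:
    """Wrap any provider and reuse responses for identical prompts."""

    def __init__(self, inner: LLMProvider):
        self.inner = inner
        self._cache: Dict[str, str] = {}
        self.hits = 0

    def complete(self, system: str, user: str) -> str:
        # Key on a hash of both prompts so repeated notices cost nothing.
        raw = f"{system}\x00{user}".encode("utf-8")
        key = hashlib.sha256(raw).hexdigest()
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        result = self.inner.complete(system, user)
        self._cache[key] = result
        return result
```

Because the rest of the pipeline depends only on `complete()`, a provider-side API change is contained in the one adapter class that implements it, and caching applies to every provider without extra code.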

If you are considering building an AI-powered data pipeline, the dual-LLM approach is worth serious consideration. The complementary strengths of different models can dramatically improve both accuracy and reliability in production systems.

About the Author

Dibyank Padhy is an Engineering Manager & Full Stack Developer with 7+ years of experience building scalable software solutions. Passionate about cloud architecture, team leadership, and AI integration.