How I Built an AI-Powered Defense Intelligence Platform with ChatGPT and Claude
A deep dive into building SalesBridge.ai - an AI platform that processes 500+ defense opportunities daily using dual LLM architecture with ChatGPT and Claude APIs.
Dibyank Padhy
Engineering Manager & Full Stack Developer
The Problem: Information Overload in Defense Sales
The US defense industry generates thousands of contract opportunities across dozens of government websites every single day. For companies trying to break into or grow within this sector, the challenge is not a lack of opportunities - it is the sheer volume of data that needs to be processed, analyzed, and matched to specific capabilities.
Before building SalesBridge.ai, sales teams at defense companies were spending upwards of 20 hours per week manually browsing government procurement sites like SAM.gov, DIBBS, and various agency-specific portals. They would copy-paste opportunity descriptions into spreadsheets, try to parse complex requirement documents, and somehow match these against their company capabilities. It was tedious, error-prone, and incredibly inefficient.
I set out to build a platform that could automate this entire workflow - scrape opportunities from 11 different defense websites, use AI to analyze and categorize them, and intelligently match them to a company's specific capabilities. The result was SalesBridge.ai, which now processes over 500 opportunities daily and has reduced manual research time from 20 hours to just 30 minutes per week.
Architecture Overview: The Dual-LLM Approach
One of the most important architectural decisions I made early on was to use a dual-LLM approach, combining both OpenAI's ChatGPT API and Anthropic's Claude API. This was not just for redundancy - each model has distinct strengths that complement each other in a production pipeline.
Why Two LLMs?
ChatGPT excels at structured data extraction - parsing NAICS codes, contract values, and deadline dates from unstructured text
Claude is superior at nuanced analysis - understanding the intent behind requirements, identifying implicit qualifications, and generating match scores
Using both provides a consensus mechanism - when both models agree on a classification, confidence is significantly higher
Redundancy supports a 99.9% uptime target - if one API has issues, the other can handle the workload
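The consensus and fallback ideas in the list above can be sketched in a few lines. This is a minimal illustration rather than the platform's actual code: the `Classifier` type, the `classify_with_consensus` function, and the coarse confidence labels are all assumptions made for the example.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Hypothetical type: any async function that maps opportunity text to a label,
# for example a thin wrapper around a ChatGPT or Claude call.
Classifier = Callable[[str], Awaitable[str]]


async def _safe_call(classifier: Classifier, text: str) -> Optional[str]:
    """Run one classifier; return None instead of raising if its API fails."""
    try:
        return await classifier(text)
    except Exception:
        return None


async def classify_with_consensus(
    text: str, primary: Classifier, secondary: Classifier
) -> dict:
    """Query both models concurrently and report whether they agree.

    The confidence value is a coarse label, not a calibrated probability:
      "high"   -> both models answered and agree
      "review" -> both answered but disagree (flag for a human)
      "single" -> only one model answered (fallback path)
      "none"   -> both calls failed
    """
    a, b = await asyncio.gather(
        _safe_call(primary, text), _safe_call(secondary, text)
    )
    if a is not None and b is not None:
        if a == b:
            return {"label": a, "confidence": "high"}
        return {"label": a, "confidence": "review", "alternative": b}
    if a is not None or b is not None:
        return {"label": a if a is not None else b, "confidence": "single"}
    return {"label": None, "confidence": "none"}
```

Running both calls concurrently keeps latency close to that of the slower model, and the "single" path is what lets one provider cover for the other during an outage.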
The Web Scraping Engine
The foundation of SalesBridge.ai is its web scraping engine, built entirely in Python. It processes 11 defense procurement websites on a scheduled basis, handling everything from simple HTML pages to JavaScript-rendered SPAs and PDF documents.
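The scraper class in this section constructs an `AsyncRateLimiter` whose implementation is not listed in the article. One plausible sketch, assuming a sliding-window design with the same constructor arguments (`max_requests`, `time_window`) and an async `acquire()` method, might look like this:

```python
import asyncio
import time
from collections import deque


class AsyncRateLimiter:
    """Allow at most max_requests acquisitions per time_window seconds.

    Assumed design (sliding window); the real helper may differ.
    """

    def __init__(self, max_requests: int, time_window: float):
        self.max_requests = max_requests
        self.time_window = time_window
        self._timestamps = deque()  # monotonic times of recent acquisitions
        self._lock = asyncio.Lock()  # serializes waiters in arrival order

    async def acquire(self) -> None:
        async with self._lock:
            while True:
                now = time.monotonic()
                # Forget acquisitions that have left the window.
                while (self._timestamps
                       and now - self._timestamps[0] >= self.time_window):
                    self._timestamps.popleft()
                if len(self._timestamps) < self.max_requests:
                    self._timestamps.append(now)
                    return
                # Window is full: sleep until the oldest entry expires.
                wait = self.time_window - (now - self._timestamps[0])
                await asyncio.sleep(wait)
```

A sliding window avoids the burst at window boundaries that a fixed per-minute counter allows, which matters when a government site throttles aggressive clients.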
import asyncio
import aiohttp
from bs4 import BeautifulSoup
from playwright.async_api import async_playwright


class DefenseScraperEngine:
    def __init__(self, config):
        self.config = config
        # AsyncRateLimiter: project helper exposing an async acquire()
        self.rate_limiter = AsyncRateLimiter(
            max_requests=config.max_requests_per_minute,
            time_window=60
        )

    async def scrape_sam_gov(self):
        """Scrape opportunities from SAM.gov API"""
        async with aiohttp.ClientSession() as session:
            # Check the SAM.gov API docs for required date parameters and formats
            params = {
                'api_key': self.config.sam_api_key,
                'postedFrom': self.get_last_scrape_date(),
                'limit': 100,
                'offset': 0,
            }
            all_opportunities = []
            while True:
                # Respect the per-minute request budget before each page
                await self.rate_limiter.acquire()
                async with session.get(
                    'https://api.sam.gov/opportunities/v2/search',
                    params=params
                ) as resp:
                    resp.raise_for_status()  # fail loudly on HTTP errors
                    data = await resp.json()
                opportunities = data.get('opportunitiesData', [])
                if not opportunities:
                    break
                all_opportunities.extend(opportunities)
                params['offset'] += params['limit']
            return all_opportunities

    async def scrape_dynamic_site(self, url, selectors):
        """Handle JS-rendered procurement portals"""
        async with async_playwright() as p:
            browser = await p.chromium.launch(headless=True)
            try:
                page = await browser.new_page()
                await page.goto(url, wait_until='networkidle')
                content = await page.content()
            finally:
                # Close the browser even if navigation fails
                await browser.close()

        soup = BeautifulSoup(content, 'html.parser')
        opportunities = []
        for item in soup.select(selectors['listing']):
            title = item.select_one(selectors['title'])
            agency = item.select_one(selectors['agency'])
            deadline = item.select_one(selectors['deadline'])
            link = item.select_one('a')
            # Skip listings missing an expected element instead of crashing
            if any(el is None for el in (title, agency, deadline, link)):
                continue
            opportunities.append({
                'title': title.text.strip(),
                'agency': agency.text.strip(),
                'deadline': deadline.text.strip(),
                'url': link.get('href'),
            })
        return opportunities

AI-Powered Analysis Pipeline
Once raw opportunities are scraped, they enter the AI analysis pipeline. This is where the dual-LLM architecture really shines. Each opportunity goes through three stages: extraction, classification, and matching.
Stage 1: Structured Data Extraction
The first stage uses ChatGPT to extract structured data from the often messy, inconsistent text of procurement notices. Government websites have wildly different formats - some use tables, others use free-form text, and many include lengthy legal boilerplate that needs to be filtered out.
import json

# Method of the analysis pipeline class. Because the call is awaited,
# self.openai_client is assumed to be an openai.AsyncOpenAI instance.
async def extract_structured_data(self, raw_text: str) -> dict:
    """Use ChatGPT to extract structured fields from raw opportunity text"""
    response = await self.openai_client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {
                "role": "system",
                "content": """Extract the following fields from this defense
                procurement opportunity. Return valid JSON only.
                Fields: title, agency, naics_codes (list), set_aside_type,
                estimated_value, response_deadline, place_of_performance,
                key_requirements (list), security_clearance_required (bool),
                small_business_eligible (bool)"""
            },
            # Truncate long notices to keep the prompt within budget
            {"role": "user", "content": raw_text[:8000]}
        ],
        response_format={"type": "json_object"},  # JSON mode
        temperature=0.1,  # low temperature for consistent extraction
    )
    return json.loads(response.choices[0].message.content)

Stage 2: Intelligent Classification with Claude
Claude handles the more nuanced classification work. It reads the full opportunity description and categorizes it across multiple dimensions: technical domain, complexity level, competition intensity, and strategic value.
What makes Claude particularly effective here is its ability to understand context and subtext. A procurement notice might not explicitly say "this requires top-secret clearance" but will reference programs or facilities that imply it. Claude catches these nuances consistently.
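The article does not show the Claude call itself, so the following is only a sketch of how the classification step could be wired up. It assumes a client that exposes the Anthropic Messages API (for example `anthropic.AsyncAnthropic()`), and the prompt wording, the `classify_opportunity` function, and the `implied_clearance` and `rationale` fields are hypothetical.

```python
import asyncio
import json

# The four dimensions named in the text above.
DIMENSIONS = (
    "technical_domain",
    "complexity_level",
    "competition_intensity",
    "strategic_value",
)

# Hypothetical prompt; the production prompt is not published.
SYSTEM_PROMPT = (
    "You classify defense procurement opportunities. Respond with a single "
    "JSON object with the keys: " + ", ".join(DIMENSIONS) + ", "
    "implied_clearance (bool), rationale (string). Set implied_clearance to "
    "true when the text references programs or facilities that suggest a "
    "clearance requirement, even if none is stated explicitly."
)


async def classify_opportunity(client, model: str, description: str) -> dict:
    """Ask Claude for a multi-dimensional classification.

    client is assumed to expose the Anthropic Messages API, e.g.
    anthropic.AsyncAnthropic(); model is whichever Claude model id you use.
    """
    response = await client.messages.create(
        model=model,
        max_tokens=1024,
        system=SYSTEM_PROMPT,
        messages=[{"role": "user", "content": description}],
    )
    # Assumes the model returned bare JSON; json.loads raises otherwise.
    result = json.loads(response.content[0].text)
    missing = [key for key in DIMENSIONS if key not in result]
    if missing:
        raise ValueError(f"classification missing fields: {missing}")
    return result
```

Passing the client in as an argument keeps the function easy to test with a stub and makes it straightforward to swap providers later.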
Stage 3: Capability Matching
The final stage combines outputs from both models to generate a match score against a company's capability profile. This uses a weighted scoring algorithm that considers technical fit, past performance relevance, contract size, and timeline feasibility.
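A weighted score over the four factors named above could be computed roughly as follows. The weights and the 0-to-1 factor scale are illustrative assumptions; the article does not publish the real values.

```python
# Illustrative weights only; the production weights are not published.
WEIGHTS = {
    "technical_fit": 0.40,
    "past_performance": 0.25,
    "contract_size": 0.20,
    "timeline_feasibility": 0.15,
}


def match_score(factors: dict) -> float:
    """Combine per-factor scores (each in [0, 1]) into a 0-100 match score."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = factors[name]  # KeyError if a factor is missing
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
        total += weight * value
    # Normalize by the weight sum so the weights need not add up to 1.
    return round(100 * total / sum(WEIGHTS.values()), 1)


# Example: strong technical fit, weaker timeline.
score = match_score({
    "technical_fit": 0.9,
    "past_performance": 0.6,
    "contract_size": 0.8,
    "timeline_feasibility": 0.4,
})
```

In this sketch the per-factor values would come from the two models' outputs, for example NAICS overlap from the extraction stage and domain fit from the classification stage.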
Results and Impact
After three months in production, SalesBridge.ai delivered measurable results that exceeded my initial projections:
75% improvement in lead quality - matched opportunities had a significantly higher win rate compared to manual identification
Reduced research time from 20 hours to 30 minutes per week - freeing sales teams to focus on proposal writing
500+ opportunities processed daily across 11 websites with 99.2% uptime
Cost per analysis dropped to under $0.03 per opportunity using optimized prompt engineering and batching
Lessons Learned
Building SalesBridge.ai taught me several valuable lessons about production AI systems:
Prompt engineering is an iterative science, not a one-time task. I went through 47 iterations of my extraction prompts before reaching production quality.
Always build with LLM-agnostic abstractions. When OpenAI changed their API, I only had to update one adapter class.
Cost management is critical. Without batching and caching, my API costs would have been 10x higher.
Human-in-the-loop validation is essential for the first few weeks. I manually reviewed every 10th classification to catch systematic errors.
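Three of these lessons (provider-agnostic abstractions, caching, and sampled human review) translate fairly directly into code. The sketch below is one possible shape, not the platform's actual adapter: `LLMAdapter`, `CachedLLM`, and `needs_human_review` are hypothetical names.

```python
import asyncio
import hashlib
from typing import Protocol


class LLMAdapter(Protocol):
    """Minimal provider-agnostic interface (hypothetical).

    One concrete adapter per provider implements this, so an upstream
    API change touches only that adapter.
    """

    async def complete(self, system: str, user: str) -> str: ...


class CachedLLM:
    """Wrap any adapter and reuse responses for identical prompts."""

    def __init__(self, adapter: LLMAdapter):
        self._adapter = adapter
        self._cache = {}  # prompt hash -> response text

    async def complete(self, system: str, user: str) -> str:
        raw = f"{system}\x00{user}".encode("utf-8")
        key = hashlib.sha256(raw).hexdigest()
        if key not in self._cache:
            # Only pay for an API call the first time a prompt is seen.
            self._cache[key] = await self._adapter.complete(system, user)
        return self._cache[key]


def needs_human_review(index: int, every: int = 10) -> bool:
    """Select every Nth classification (10th, 20th, ...) for manual review.

    index is zero-based, so index 9 is the 10th item.
    """
    return (index + 1) % every == 0
```

An in-memory dictionary is enough to show the idea; a production cache would more likely live in a shared store so that repeated notices across scraper runs are also deduplicated.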
If you are considering building an AI-powered data pipeline, the dual-LLM approach is worth serious consideration. The complementary strengths of different models can dramatically improve both accuracy and reliability in production systems.