How We Turned 250+ Resumes Into a Searchable AI System in 2 Weeks
In Part 1, we explored how Copilot-powered search is changing the way employees find and work with information across Microsoft 365. From context-aware answers to natural language discovery, the promise of AI-driven search is clear.
Now, in Part 2, we move from possibility to precision and look at what it actually takes to make Copilot Search work effectively in real-world environments. We designed a four-layer architecture using Microsoft Azure services:
Layer 1: Data Ingestion
- Resumes uploaded from SharePoint to Azure Blob Storage
- Preserves folder structure and metadata
- Currently handled by a custom C# script; after the POC, this will be automated with Power Automate
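The folder-preserving upload can be pictured as a simple path mapping. The sketch below is illustrative only (the production script is written in C#), and the library root `/sites/talent/Shared Documents` and the metadata field names are assumptions, not values from the project.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote

# Assumed SharePoint library root; replace with the real server-relative path.
LIBRARY_ROOT = "/sites/talent/Shared Documents"


def blob_name_for(sharepoint_path: str, library_root: str = LIBRARY_ROOT) -> str:
    """Map a SharePoint server-relative path to a blob name that keeps the folder structure."""
    path = PurePosixPath(unquote(sharepoint_path))
    # relative_to raises ValueError if the file is outside the library root.
    return path.relative_to(library_root).as_posix()


def blob_metadata_for(item: dict) -> dict:
    """Carry selected SharePoint fields over as blob metadata (string values only)."""
    keep = ("Author", "Modified", "ContentType")  # hypothetical field selection
    return {key.lower(): str(item[key]) for key in keep if key in item}


print(blob_name_for("/sites/talent/Shared%20Documents/Producers/LA/resume.pdf"))
```

Keeping the relative path as the blob name means the folder hierarchy remains available later as a filterable value without a separate lookup.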
Layer 2: AI Enrichment (The Game Changer)
This is where the magic happens. We built a custom Azure Function that:
- Extracts text using Azure AI Search’s built-in OCR (handles scanned PDFs, images, complex layouts)
- Calls Azure OpenAI GPT-4 with a strict extraction schema
- Uses function calling to enforce structured output (no hallucinations!)
- Extracts 45+ fields from each resume:
  - Basic info (name, title, location, years of experience)
  - Industry details (networks, roles, genres, skills, guilds)
  - Career history (shows with roles/networks/years, awards, production companies)
  - Contact information and references
Example extraction:
{
  "name": "James Sanderson",
  "title": "Executive Producer/Showrunner",
  "networks": ["Discovery+", "Investigation Discovery", "Bravo"],
  "roles": ["Executive Producer", "Showrunner", "Director"],
  "genres": ["True Crime", "Documentary", "Reality TV"],
  "shows": [
    {
      "show_name": "Murder in the Heartland",
      "role": "Executive Producer",
      "network": "Investigation Discovery",
      "years": "2018-2022",
      "seasons": "4"
    }
  ],
  "yearsExperience": 15,
  "awards": [...]
}
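To make the idea of a strict extraction schema concrete, here is a minimal sketch of what a function-calling tool definition could look like, using the tool format accepted by the chat completions API. It covers only four of the 45+ fields, the function name `extract_resume` is hypothetical, and the small validator is our own illustration rather than the project's code. Note that a schema constrains the shape of the output; checking unexpected or missing fields on receipt is still worthwhile.

```python
import json


def nullable(json_type: str) -> dict:
    """JSON Schema fragment for a value that may be null when the resume lacks it."""
    return {"type": [json_type, "null"]}


# Hypothetical subset of the extraction schema (the real one has 45+ fields).
EXTRACT_RESUME_TOOL = {
    "type": "function",
    "function": {
        "name": "extract_resume",
        "description": "Extract structured fields from resume text. Use null when a field is absent.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": nullable("string"),
                "title": nullable("string"),
                "networks": {"type": "array", "items": {"type": "string"}},
                "yearsExperience": nullable("integer"),
            },
            "required": ["name", "title", "networks", "yearsExperience"],
            "additionalProperties": False,
        },
    },
}


def parse_tool_arguments(arguments_json: str) -> dict:
    """Parse the tool-call arguments string, reject unknown fields, default missing ones to None."""
    data = json.loads(arguments_json)
    properties = EXTRACT_RESUME_TOOL["function"]["parameters"]["properties"]
    unknown = set(data) - set(properties)
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    return {field: data.get(field) for field in properties}


print(parse_tool_arguments('{"name": "James Sanderson", "title": null, "networks": ["Bravo"]}'))
```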
Critical Innovation: We add contextual annotations to entertainment industry terms, for example “Murder in the Heartland (TV Show Title – Professional Work Credit)”, to prevent content filtering issues. This annotation happens in our Azure Function code, completely under our control.
Layer 3: Azure AI Search Index
The structured data goes into Azure AI Search, which provides:
- Semantic search – understands “true crime” = “murder investigation” = “cold case”
- Hybrid ranking – combines keyword and semantic relevance
- Complex filtering – network AND role AND genre AND years of experience, all at once
- Tunable relevance – we control scoring profiles and ranking
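For readers curious how hybrid ranking merges a keyword result list with a vector result list, Azure AI Search documents that it uses Reciprocal Rank Fusion (RRF) for this step. The sketch below is a simplified, standalone illustration of RRF, not the service's implementation; the constant `k = 60` is an assumed value commonly used with RRF, and the document ids are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one list using RRF scores."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it returns.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["resume-a", "resume-b", "resume-c"]
vector_hits = ["resume-b", "resume-c", "resume-d"]
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Documents that appear in both lists rise to the top, which is why a resume matching both the literal term and its meaning outranks one that matches only one of them.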
Layer 4: Search API & User Interface
We built a REST API (Azure Function) that:
- Receives natural language queries from Copilot Studio
- Translates them into optimized Azure AI Search queries
- Applies filters and ranking
- Returns structured JSON results
- Handles all content annotation
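As a rough illustration of the translation step, the sketch below turns a natural language query into a request body for the Azure AI Search "Search Documents" REST call. The regex-based experience parsing is deliberately naive, and the semantic configuration name `resume-semantic` and the `yearsExperience` field name are assumptions; the production API is an Azure Function and may work quite differently.

```python
import re

# Assumed name of the semantic configuration defined on the index.
SEMANTIC_CONFIG = "resume-semantic"


def build_search_body(query_text: str, top: int = 20) -> dict:
    """Build a search request body, adding an experience filter when the query mentions one."""
    body = {
        "search": query_text,
        "queryType": "semantic",
        "semanticConfiguration": SEMANTIC_CONFIG,
        "top": top,
        "count": True,
    }
    # Naive hint extraction: "10+ years" or "10 years" becomes a numeric filter.
    match = re.search(r"(\d+)\s*\+?\s*years", query_text, flags=re.IGNORECASE)
    if match:
        body["filter"] = f"yearsExperience ge {int(match.group(1))}"
    return body


print(build_search_body("True crime producers with 10+ years experience in Los Angeles"))
```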
Users interact through:
- Copilot Studio for conversational search
- React web app for advanced search with full filter controls
The Content Filtering Solution
Remember those “Content was filtered” errors? Here’s how we solved them.
The Problem
Microsoft’s Responsible AI system (rightfully) blocks harmful content. But it’s a black box that sometimes misclassifies legitimate business terms. Entertainment industry resumes containing show titles like “Murder Mystery,” “True Crime,” or “Deadly Sins” were triggering filters.
Our Solution
We control content processing before it reaches Microsoft’s filters. In our custom Azure Function, we add contextual annotations:
Before (Gets Filtered):
“Murder in the Heartland”
“True Crime Documentary”
After (Passes Through):
“Murder in the Heartland (TV Show Title – Professional Work Credit)”
“True Crime Documentary (TV Genre – Entertainment Industry)”
These annotations clarify context, preventing misclassification. Since this happens in our code (not Microsoft’s), we have complete control.
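A minimal sketch of the annotation step, assuming a record shaped like the example extraction shown earlier. The label texts come from the before/after examples above; the function names and the `kind` keys are hypothetical.

```python
# Label text per term type, taken from the before/after examples.
ANNOTATIONS = {
    "show_title": "TV Show Title – Professional Work Credit",
    "genre": "TV Genre – Entertainment Industry",
}


def annotate(term: str, kind: str) -> str:
    """Append a context label to a term; calling it twice does not double-annotate."""
    suffix = f" ({ANNOTATIONS[kind]})"
    return term if term.endswith(suffix) else f"{term}{suffix}"


def annotate_record(record: dict) -> dict:
    """Return a copy of an extracted resume record with genres and show titles annotated."""
    annotated = dict(record)
    annotated["genres"] = [annotate(g, "genre") for g in record.get("genres", [])]
    annotated["shows"] = [
        {**show, "show_name": annotate(show["show_name"], "show_title")}
        for show in record.get("shows", [])
    ]
    return annotated


print(annotate("Murder in the Heartland", "show_title"))
```

Making the function idempotent matters in practice, because the same record may pass through the pipeline more than once when a resume is re-indexed.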
Result: Zero content filtering errors in production. Every query works.
The Results: From 30 Minutes to a Few Seconds
After a 2-week POC (32-40 hours of implementation), here’s what we delivered:
Performance Metrics
| Metric | Before | After | Improvement |
|---|---|---|---|
| Search Time | 15-30 minutes | Few seconds | 99% faster |
| Content Filtering Errors | Frequent blocks | Zero | 100% eliminated |
| Search Accuracy | Inconsistent | 95%+ | Reliable |
| Complex Queries | Not possible | Fully supported | New capability |
| Result Completeness | Unknown | 100% of matches | Trustworthy |
Business Impact
Productivity Gains
- 20+ searches per day × 20 minutes saved per search = 400 minutes daily
- Nearly 7 hours of productivity gained per day
- ROI achieved in 6-8 weeks
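The productivity figure above is straightforward arithmetic, reproduced here so the inputs are easy to swap for your own team's numbers. The function name is ours; the inputs are the ones quoted in the list.

```python
def daily_time_saved(searches_per_day: int, minutes_saved_per_search: int) -> tuple[int, float]:
    """Return the time saved per day as (minutes, hours)."""
    minutes = searches_per_day * minutes_saved_per_search
    return minutes, minutes / 60


minutes, hours = daily_time_saved(searches_per_day=20, minutes_saved_per_search=20)
print(f"{minutes} minutes per day, about {hours:.1f} hours")  # 400 minutes per day, about 6.7 hours
```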
Better Outcomes
- Find talent that would have been missed with manual search
- Multi-criteria matching works flawlessly (network + role + genre + experience)
- Every team member gets the same consistent results
- Confidence that results are complete, not just examples
Scalability
- Handles 250+ resumes today, can scale to 1000+ with no performance degradation
- Auto-indexes new/updated resumes (no manual maintenance)
- Supports concurrent users without slowdown

Real-World Examples
Here’s how it works in practice:
Query 1: “Find documentary producers who worked with Discovery”
- Returns 15 profiles in 2 seconds
- Automatically includes Discovery, Discovery+, Investigation Discovery
- All are producers (not directors or other roles)
- All have documentary experience
- Sorted by relevance (most Discovery credits first)
Query 2: “True crime producers with 10+ years experience in Los Angeles”
- Returns 8 profiles in 2 seconds
- All match: true crime OR crime investigation OR murder mystery (semantic understanding)
- All have 10+ years experience
- All based in Los Angeles
- Zero content filtering errors
Query 3: Complex multi-criteria
- User selects: Networks (Discovery, Investigation Discovery), Roles (Producer, Executive Producer), Genres (Documentary, True Crime), Min Experience (10 years)
- Returns exact matches only
- Fast, accurate, complete
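A multi-criteria selection like the one in Query 3 maps naturally onto an OData `$filter` expression, which is the filter syntax Azure AI Search accepts. The sketch below assumes the index exposes filterable fields named `networks`, `roles`, `genres`, and `yearsExperience` (matching the example extraction); the helper names are hypothetical.

```python
from typing import Optional


def odata_quote(value: str) -> str:
    """Quote a string literal for OData, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def any_of(field: str, values: list[str]) -> str:
    """Match documents whose string-collection field contains at least one of the values."""
    clauses = " or ".join(f"x eq {odata_quote(v)}" for v in values)
    return f"{field}/any(x: {clauses})"


def build_filter(networks: list[str], roles: list[str], genres: list[str],
                 min_years: Optional[int]) -> str:
    """Combine the selected criteria with 'and'; empty selections are skipped."""
    parts = []
    for field, values in (("networks", networks), ("roles", roles), ("genres", genres)):
        if values:
            parts.append(any_of(field, values))
    if min_years is not None:
        parts.append(f"yearsExperience ge {min_years}")
    return " and ".join(parts)


print(build_filter(
    networks=["Discovery", "Investigation Discovery"],
    roles=["Producer", "Executive Producer"],
    genres=["Documentary", "True Crime"],
    min_years=10,
))
```

Values within one criterion are combined with `or`, while the criteria themselves are combined with `and`, which is what gives the "exact matches only" behavior.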
Key Architectural Decisions
Why Azure AI Search Instead of Vector Database?
Azure AI Search provides hybrid search (vector + keyword + semantic) out of the box, with built-in ranking, filtering, and faceting. Vector databases like Pinecone or Weaviate are excellent for pure similarity search but lack the rich query capabilities needed for enterprise search.
Why GPT-4 Function Calling?
GPT-4’s function-calling feature constrains output to a strict JSON schema, which sharply reduces hallucinated output. When GPT-4 can’t find a field, the schema lets it return null instead of inventing a value. This matters for business-critical applications where accuracy is essential.
Why a Custom Azure Function for Search API?
This gives us complete control over:
- Content annotation (preventing misclassification by RAI filters)
- Query translation (natural language → structured search)
- Scoring and ranking logic
- Response formatting
- Security and access control
We could have used Azure AI Search directly, but the API layer provides a cleaner interface for Copilot Studio and allows business logic centralization.
Why Both Copilot Studio AND a Web App?
Different users, different needs:
- Copilot Studio: Great for conversational, quick searches (“Find producers with true crime credits”)
- React Web App: Better for power users who want full control over filters, export to CSV, and advanced sorting
One backend, multiple interfaces.
When to Use Out-of-the-Box Copilot vs. Production Search
This isn’t an either/or. Both have their place.
Use Out-of-the-Box Copilot Studio When
- You need conversational Q&A about documents
- Approximate answers are fine (“Here are some examples…”)
- You have <100 documents
- Queries are simple (no complex filtering)
- You need something fast (days to deploy)
- You don’t need 100% consistency
Perfect for: Policy questions, document summaries, general knowledge base
Use Production Search Architecture When
- You need to find ALL matching results (not examples)
- Multi-criteria filtering is essential
- Results must be consistent and trustworthy
- You have hundreds to thousands of documents
- Sub-second response times matter
- You need complete control over ranking and relevance
- Content filtering is causing issues
Perfect for: Talent search, contract search, technical documentation, research repositories
Lessons Learned
1. Pre-Processing Beats On-Demand Processing
Extracting structured data upfront (even if it takes hours for initial indexing) is far better than asking an LLM to read PDFs on every query. The upfront cost pays dividends in speed, accuracy, and consistency.
2. Structured Data Eliminates Hallucinations
Using GPT-4 function calling with strict schemas means you get structured data or null rather than free-form text, which leaves far less room for hallucinated data. This makes the system trustworthy for business-critical applications.
3. Control Over Content Processing Is Essential
When dealing with industry-specific terminology that might trigger content filters, you need to control the processing pipeline. Our custom Azure Function gives us that control.
4. Semantic Search Changes Everything
Azure AI Search’s semantic understanding (“true crime” = “murder investigation” = “cold case”) finds results that a keyword search would miss. This is the power of modern AI search.
5. The Right Tool for the Right Job
Copilot Studio is brilliant for what it does. But when you need production-grade search, you need a proper search engine. Trying to force Copilot to be something it’s not leads to frustration.
The Technology Stack
For those interested in the technical details:
Search & AI
- Azure AI Search (Basic tier, ~$75/month)
- Azure OpenAI GPT-4 (pay-per-use)
Compute & Storage
- Azure Functions (.NET 8 Isolated Worker)
- Azure Blob Storage (Standard tier)
User Interface
- Microsoft Copilot Studio (conversational search)
- React 18 + TypeScript (web app)
- Ant Design (UI component library)
Total Cloud Cost: at most $300-400/month for production workloads.
Beyond Talent Search: Where This Architecture Applies
While we built this for talent search, the same architecture works for any industry with large collections of unstructured documents:
- Recruitment & HR: Resume search, candidate matching
- Legal: Contract search, clause extraction
- Healthcare: Clinical document search (HIPAA-compliant)
- Technical Documentation: Knowledge base, support systems
- Research: Academic paper search, citation analysis
- Real Estate: Property document search
- Financial Services: Policy and compliance document search
If you have PDFs or Word documents and need a reliable, filtered search – this pattern applies.
The Bottom Line
Out-of-the-Box Microsoft Copilot Studio is an excellent tool for conversational document Q&A. But when your business needs a complete, fast, reliable search with complex filtering, you need a different architecture.
Our client went from
❌ 15-30 minute manual searches
❌ Inconsistent, incomplete results
❌ Content filtering blocking legitimate queries
❌ No way to filter by multiple criteria
To
✅ Search results in a few seconds
✅ 100% consistent, complete results
✅ Zero content filtering issues
✅ Full multi-criteria filtering support
✅ Complete control and visibility
The difference? Pre-indexed structured data with a production-grade search engine.
Sometimes the out-of-the-box solution is perfect. Sometimes you need to build something better.
The key is knowing which is which.
What’s Next?
If you’re facing challenges such as unreliable search results, content filtering issues, or the need for production-grade AI search, we can help.
We offer
POC Assessment (1 week)
- Document analysis workshop
- Architecture design for your use case
- Cost estimate and timeline
- Risk assessment
Full POC (2-4 weeks)
- Complete implementation with your data
- Stakeholder demonstration
- Production deployment roadmap
This architecture is proven, scalable, and delivers measurable ROI in weeks.
Ready to go from blocked to brilliant? Contact us and we will be happy to help.
📧 Email us at info@netwoven.com
Appendix: Technical Architecture Diagram
┌─────────────────────────────────────────────────────────┐
│  USER INTERFACE LAYER                                   │
│  Microsoft Copilot Studio + React Web App               │
└────────────────────┬────────────────────────────────────┘
                     │
                     │ Natural Language Query
                     ↓
┌─────────────────────────────────────────────────────────┐
│  SEARCH API LAYER (Our Control)                         │
│  Azure Function - REST API                              │
│  • Query translation                                    │
│  • Content annotation (prevents filtering)              │
│  • Multi-filter support                                 │
│  • Response formatting                                  │
└────────────────────┬────────────────────────────────────┘
                     │
                     │ Structured Search Query
                     ↓
┌─────────────────────────────────────────────────────────┐
│  SEARCH ENGINE LAYER                                    │
│  Azure AI Search Index                                  │
│  • Semantic search (context understanding)              │
│  • Vector search (similarity matching)                  │
│  • Keyword search (exact match)                         │
│  • Hybrid ranking (best of all three)                   │
│  • Complex filtering                                    │
└────────────────────┬────────────────────────────────────┘
                     │
                     │ Pre-Indexed Structured Data
                     ↑
┌─────────────────────────────────────────────────────────┐
│  AI ENRICHMENT LAYER (Critical!)                        │
│                                                         │
│  ┌──────────────────────────────────────────────────┐   │
│  │  CUSTOM SKILL - Azure Function + GPT-4           │   │
│  │                                                  │   │
│  │  1. Extract text (OCR for scanned docs)          │   │
│  │  2. Call Azure OpenAI GPT-4                      │   │
│  │  3. Use function calling (strict schema)         │   │
│  │  4. Extract 45+ structured fields                │   │
│  │  5. Add content annotations                      │   │
│  │  6. Return JSON (no hallucinations)              │   │
│  │                                                  │   │
│  └──────────────────────────────────────────────────┘   │
└────────────────────┬────────────────────────────────────┘
                     │
                     │ Raw Document Files
                     ↓
┌─────────────────────────────────────────────────────────┐
│  DATA SOURCE LAYER                                      │
│  Azure Blob Storage                                     │
│  PDF & Word Documents (250+)                            │
│  Auto-sync from SharePoint (Power Automate)             │
└─────────────────────────────────────────────────────────┘
If your organization is struggling with unreliable AI search, content filtering issues, or unstructured content chaos, Netwoven can help.
Our proven architecture delivers measurable ROI in weeks – not months. Schedule a consultation with our experts to get started.