Priority: Focus on what matters most for your M.Tech presentation
```bash
# Edit .env.local
LLM_TIMEOUT_COLD=120000   # Increase from 90000 to 120000
LLM_TIMEOUT_WARM=90000    # Increase from 45000 to 90000

# Restart services
./start_local.sh restart
```
Why: Prevents the random timeout errors users are seeing
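On the backend side, these values have to be read from the environment somewhere. A minimal sketch of how that might look, assuming the backend is Python; `get_timeout_ms` is a hypothetical helper, not a function from the repo:

```python
import os

def get_timeout_ms(name, default_ms):
    """Read a timeout (in milliseconds) from the environment, falling back to a default."""
    raw = os.environ.get(name)
    try:
        return int(raw) if raw is not None else default_ms
    except ValueError:
        # Malformed value in .env.local: fail safe to the default
        return default_ms

# HTTP clients usually want seconds, so convert once at startup:
cold_timeout_s = get_timeout_ms("LLM_TIMEOUT_COLD", 120_000) / 1000
warm_timeout_s = get_timeout_ms("LLM_TIMEOUT_WARM", 90_000) / 1000
```

Centralizing the conversion keeps the ms-vs-seconds confusion (a classic source of "random" timeouts) in one place.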
```bash
# Install pytest
pip install pytest pytest-asyncio pytest-mock httpx

# Create test directory
mkdir -p backend/tests

# Write 3 critical tests:
# 1. Test the health endpoint works
# 2. Test the query endpoint returns a response
# 3. Test embedding generation works

# Run tests
pytest backend/tests/ -v
```
Why: Demonstrates code quality, catches bugs before the demo
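It helps if at least one test exercises a pure function, so it passes even with the services down. A sketch of the pattern, assuming a hypothetical `chunk_text` splitter (a stand-in, not necessarily the repo's real ingestion code):

```python
# backend/tests/test_chunking.py — illustrative only; chunk_text is a stand-in
# for whatever splitter the ingestion pipeline actually uses.

def chunk_text(text, size=100, overlap=20):
    """Split text into fixed-size chunks, with `overlap` shared characters between neighbours."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def test_chunks_cover_whole_text():
    text = "x" * 250
    chunks = chunk_text(text, size=100, overlap=20)
    # Dropping the trailing overlap from every chunk but the last reconstructs the input
    assert "".join(c[:80] for c in chunks[:-1]) + chunks[-1] == text

def test_neighbouring_chunks_overlap():
    text = "abcdefghij" * 30
    chunks = chunk_text(text, size=100, overlap=20)
    for a, b in zip(chunks, chunks[1:]):
        assert a[-20:] == b[:20]
```

Plain `assert`-style functions like these are exactly what `pytest backend/tests/ -v` collects and runs.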
```python
# backend/main.py
ALLOWED_ORIGINS = [
    "http://localhost:3000",
    "https://yourdomain.com",  # Add your production URL
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=ALLOWED_ORIGINS,  # Change from ["*"]
    allow_credentials=True,
    allow_methods=["GET", "POST", "DELETE"],
    allow_headers=["Content-Type", "Authorization"],
)
```
Why: Security best practice, shows production awareness
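If the production URL differs between environments, the origin list can come from an environment variable instead of being hard-coded. A sketch under that assumption; the `CORS_ALLOWED_ORIGINS` variable name is an invention for illustration:

```python
import os

def parse_allowed_origins(raw):
    """Parse a comma-separated origins string; default to the dev frontend if unset."""
    if not raw:
        return ["http://localhost:3000"]
    return [origin.strip() for origin in raw.split(",") if origin.strip()]

# e.g. CORS_ALLOWED_ORIGINS="http://localhost:3000,https://yourdomain.com"
ALLOWED_ORIGINS = parse_allowed_origins(os.environ.get("CORS_ALLOWED_ORIGINS"))
```

This keeps the deployed origin out of the source tree while preserving the explicit-allowlist behaviour above.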
```bash
# Install Prometheus client
pip install prometheus-client
```

```python
# Add to backend/main.py
from prometheus_client import Counter, Histogram, generate_latest
from fastapi.responses import PlainTextResponse

query_counter = Counter('rag_queries_total', 'Total queries')
query_duration = Histogram('rag_query_duration_seconds', 'Query duration')

@app.get("/metrics", response_class=PlainTextResponse)
async def metrics():
    return generate_latest()

# In the ask endpoint, add:
query_counter.inc()
with query_duration.time():
    ...  # existing query logic
```
Why: Shows observability knowledge, impresses evaluators
- Show list of uploaded documents
- Allow deleting documents
- Display document metadata (chunk count, upload date)
Why: Makes system more impressive and usable
```javascript
// Replace all console.log/error with proper logging,
// or just wrap them in a development check:
if (process.env.NODE_ENV === 'development') {
  console.log('Debug info');
}
```
Why: Cleaner demo, looks more professional
```javascript
// frontend/src/components/ErrorBoundary.js
// Wrap <App /> in ErrorBoundary
// Shows an "Oops!" message instead of a crash
```
Why: Graceful failure handling impresses judges
- Draw how components connect
- Show data flow: User → Frontend → Backend → LLM/Milvus → Response
- Include in presentation slides
Why: Demonstrates system understanding
```bash
# Measure and document:
# - Query response time (yours: ~8-10s vs Docker: 60-90s)
# - Memory usage (before/after)
# - Throughput (queries per minute)
# - Accuracy metrics (if possible)
```
Why: Data-driven results impress evaluators
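Raw timings reduce to presentable numbers with a few lines of stdlib Python. A sketch, assuming you record one wall-clock duration per query; `summarize_latencies` is a made-up helper name:

```python
import statistics

def summarize_latencies(durations_s):
    """Turn per-query wall-clock times (seconds) into demo-ready summary metrics."""
    ordered = sorted(durations_s)
    # Index of the 95th-percentile sample (nearest-rank method)
    p95_index = max(int(len(ordered) * 0.95) - 1, 0)
    return {
        "mean_s": round(statistics.mean(ordered), 2),
        "p95_s": round(ordered[p95_index], 2),
        "throughput_qpm": round(60 / statistics.mean(ordered), 1),
    }

# e.g. ten measured queries in the 8-10s range:
print(summarize_latencies([8.2, 9.1, 8.7, 9.8, 8.4, 9.0, 8.9, 9.5, 8.6, 9.3]))
```

Quoting a p95 alongside the mean preempts the obvious "is 8-10s typical or best-case?" question.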
- List problems faced (Docker timeouts, Node.js localStorage, etc.)
- Explain solutions implemented
- Show learning and problem-solving skills
Why: Demonstrates technical depth and resilience
- ❌ Full authentication system → Demo with API keys is fine
- ❌ Complete Confluence API integration → Local mode works
- ❌ Streaming responses → Cool but complex
- ❌ Analytics dashboard → Nice but not critical
- ❌ Multi-language support → Out of scope
- ❌ A/B testing → Over-engineering for demo
- Fix LLM timeouts
- Add metrics endpoint
- Remove console errors
- Write the 3 critical tests
- Fix CORS config
- Add error boundary
- Create architecture diagram
- Collect performance metrics
- Practice demo flow
Introduction (1 min)
- "Enterprise RAG system for knowledge management"
- "Runs locally on Mac M4 Pro with 8-10x performance improvement"
Architecture Overview (2 min)
- Show diagram
- Explain: Frontend → Backend → Ollama (LLM) + Milvus (Vector DB)
Live Demo (3-4 min)
- Show frontend UI
- Ask a question: "What is the agile workflow?"
- Highlight fast response time (~8-10s)
- Show source citations with relevance scores
- Open metrics endpoint (show Prometheus metrics)
- Check health endpoint (show all services green)
Technical Highlights (2 min)
- Metal GPU acceleration (explain ARM64 optimization)
- Vector similarity search (explain embeddings)
- Document ingestion (show 14 sample docs loaded)
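For the similarity-search explanation, it helps to be able to show the core operation in a few lines. A pure-Python sketch of cosine similarity and top-k ranking; in the real system this runs inside Milvus over indexed embeddings, not in application code:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def top_k(query_vec, doc_vecs, k=3):
    """Return indices of the k stored vectors most similar to the query vector."""
    scored = sorted(enumerate(doc_vecs),
                    key=lambda iv: cosine_similarity(query_vec, iv[1]),
                    reverse=True)
    return [idx for idx, _ in scored[:k]]
```

This brute-force scan is O(n) per query; the point of a vector DB like Milvus is an approximate index that answers the same top-k question at scale.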
Challenges & Solutions (2 min)
- Docker performance issues → Local deployment
- LLM timeout errors → Increased timeouts, retry logic
- Node.js localStorage → Environment variable fix
- Show test coverage (if implemented)
Q&A (2-3 min)
- Be prepared to explain:
- Why Milvus over other vector DBs
- Why Mistral model
- How RAG differs from plain LLM
- Scalability considerations
Demo success checklist:
- ✅ System starts without errors
- ✅ Query returns in <15 seconds
- ✅ Source citations displayed
- ✅ Health checks pass
- ✅ No visible console errors
- ✅ Can answer at least 3 different questions
- ✅ Metrics endpoint shows data
- ✅ Architecture diagram is clear
- Have backup demo recording in case live demo fails
- Test your demo flow 3+ times before presentation
- Prepare for "What would you improve?" question (reference IMPROVEMENT_AREAS.md)
- Know your metrics cold (response time, accuracy, document count)
- Practice explaining RAG to non-technical audience
Backend won't start:
```bash
./start_local.sh stop
./start_local.sh clean
./start_local.sh start
# Wait 2 minutes
```
Query times out:
- "This query is complex, let me try a simpler one"
- Have 2-3 pre-tested queries ready
Frontend crashes:
- Use curl to demo backend API instead
- Show Swagger UI at http://localhost:8000/docs
Nothing works:
- Show architecture diagram
- Walk through code structure
- Explain design decisions
Bottom Line: Focus on fixes 1-4 (LLM timeouts, tests, CORS, metrics) + demo preparation. Skip everything else for now. You can always mention "future improvements" in Q&A.
Time Budget: ~8-10 hours total for a solid demo-ready system.