
Hybrid AI Platform

A full-stack, multi-agent AI platform running 100% locally. No API keys, no cloud calls, no data leaving your machine. Two independent pipelines, one shared intelligent core.

Overview

This project is a from-scratch implementation of two real-world AI pipelines sharing a single infrastructure layer. The goal was to learn the most in-demand AI engineering patterns — RAG, multi-agent orchestration, local inference — while building something that genuinely solves problems: career development and missing persons case management.

Built in an intensive development sprint as both a learning exercise and a working tool. Every component was chosen deliberately: Ollama for private local inference, ChromaDB for semantic vector search, CrewAI for agent orchestration, and SQLite for persistent memory.

Architecture

Two fully independent pipelines share one core infrastructure layer:

```
SHARED CORE
├── ChromaDB — vector store (nomic-embed-text embeddings)
├── Ollama   — local LLM inference (llama3.2)
└── SQLite   — persistent memory across sessions

CAREER CO-PILOT                  MISSING PERSONS INTEL
Agent 1: Job Scraper             Agent 1: Case Extractor
Agent 2: Gap Analyzer (RAG)      Agent 2: Cross-Referencer (RAG)
Agent 3: Resume Rewriter         Agent 3: Report Generator
Agent 4: Interview Prepper       Agent 4: Family Updater

LATEX RESUME EDITOR
ZIP upload → parse .tex → AI edits → confidence scores → ZIP export
```

What Makes This Non-Trivial

Real RAG, not keyword search. Your resume experience is chunked and embedded into vectors using nomic-embed-text, then stored in ChromaDB. When you apply to a job, the system retrieves the most semantically relevant bullets — matching by meaning, not by keyword overlap. Retrieval-augmented generation is one of the most widely used architectures in production AI systems today.
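The retrieval step can be sketched in plain Python. This is a toy stand-in: `embed` here is a bag-of-characters vectorizer substituting for a real embedding model like nomic-embed-text, and the bullets and query are illustrative, not taken from the actual pipeline.

```python
import math

def embed(text: str) -> list[float]:
    # Toy bag-of-characters embedding; a real system would call an
    # embedding model such as nomic-embed-text here.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank stored chunks by similarity to the query and keep the top k —
    # the same shape of operation ChromaDB performs over real embeddings.
    qv = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(qv, embed(c)), reverse=True)
    return ranked[:k]

bullets = [
    "Built ETL pipelines in Python and SQL",
    "Led a team of five engineers",
    "Deployed ML models to production",
]
top = retrieve("machine learning deployment experience", bullets, k=1)
```

With real embeddings, "Deployed ML models to production" would rank highest for that query even though it shares almost no keywords with it — that semantic match is the whole point of the RAG step.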

Multi-agent orchestration. Each pipeline runs four specialized agents in sequence, with each agent's output becoming the next agent's context. Built from scratch with CrewAI, this mirrors the sequential hand-off patterns common in production multi-agent systems.

100% local and private. Everything runs via Ollama. No data is sent anywhere. This is critical for the Missing Persons side — sensitive case data never leaves the device. The same local inference that makes it private also makes it free to run.
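A local inference call needs nothing beyond the standard library, since Ollama exposes a REST endpoint on localhost (POST `/api/generate`). The sketch below separates payload construction from the network call so the request shape can be inspected without a running server; the prompt text is illustrative.

```python
import json
from urllib import request

OLLAMA_URL = "http://127.0.0.1:11434/api/generate"

def build_payload(prompt: str, model: str = "llama3.2") -> dict:
    # Ollama's generate endpoint takes a model name, a prompt, and a
    # stream flag; stream=False returns one complete JSON response.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str) -> str:
    # Requires a local `ollama serve`; the request never leaves the machine.
    data = json.dumps(build_payload(prompt)).encode()
    req = request.Request(
        OLLAMA_URL, data=data,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

payload = build_payload("Summarize this case file in two sentences.")
```

Because the endpoint is plain HTTP on 127.0.0.1, the privacy guarantee is structural: there is no API key to leak and no remote host in the request.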

LaTeX-aware editing. The resume editor understands LaTeX syntax — it edits human-readable content while leaving \href, \textbf, \begin{rSection}, and all formatting commands completely intact. It assigns confidence scores (🟢🟡🔴) to every edit and exports an Overleaf-ready ZIP. Most AI resume tools can't handle raw LaTeX at all.
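One way to keep formatting commands intact while editing prose is mask-edit-restore: replace each LaTeX command with an opaque placeholder, run the edit, then substitute the commands back. This is an illustrative sketch only; the regex and placeholder scheme are assumptions, and the project's actual parser is more involved.

```python
import re

# Matches a LaTeX command with optional simple brace arguments,
# e.g. \textbf{RAG} or \href{url}{text} (no nested braces).
CMD = re.compile(r"\\[a-zA-Z]+(\{[^{}]*\})*")

def edit_preserving_latex(line: str, edit) -> str:
    saved: list[str] = []

    def stash(m: re.Match) -> str:
        # Swap each command for a NUL-delimited index placeholder.
        saved.append(m.group(0))
        return f"\x00{len(saved) - 1}\x00"

    protected = CMD.sub(stash, line)
    edited = edit(protected)  # the AI edit only ever sees masked text
    # Restore the original commands verbatim.
    return re.sub(r"\x00(\d+)\x00", lambda m: saved[int(m.group(1))], edited)

line = r"Built \textbf{RAG} pipeline for search"
out = edit_preserving_latex(line, lambda s: s.replace("Built", "Architected"))
```

Since the editing function never sees the raw commands, it cannot mangle them — the worst it can do is move a placeholder, which still restores to valid LaTeX.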

Build Status

| Feature | Status |
| --- | --- |
| Shared Core (RAG + Memory) | ✓ Complete |
| Career Co-Pilot (4 agents) | ✓ Complete |
| Missing Persons Intelligence (4 agents) | ✓ Complete |
| LaTeX Resume Editor | ✓ Complete |
| Job URL Scraper (Indeed, Glassdoor, generic) | ✓ Complete |
| Gradio UI (4 tabs) | ✓ Complete |
| PDF Resume Upload → RAG | ⟶ Next |
| Job Application Tracker | ⟶ Next |
| Follow-up Chat Agent | ⟶ Next |
| Public Deployment (HuggingFace / VPS) | ⟶ Planned |

How to Run It

Prerequisites: Python 3.10+ and Ollama installed locally.

```bash
# Pull models (one-time)
ollama pull llama3.2
ollama pull nomic-embed-text

# Clone and set up
git clone https://github.com/leemacaravan/hybrid-ai-platform
cd hybrid-ai-platform
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
playwright install chromium

# Run
python -m ui.app
# Open http://127.0.0.1:7860
```

What I Learned

  • How to architect and implement a full RAG pipeline from chunking to retrieval
  • Multi-agent orchestration patterns with CrewAI and how context flows between agents
  • Local LLM deployment trade-offs — model size, speed, quality on consumer hardware
  • LaTeX AST parsing and safe editing of structured document formats
  • How to build for privacy-first AI use cases where data sensitivity is high