Afzan Khan | Portfolio

About the Project

RAG-powered AI code review platform that grounds every response in retrieved evidence across OWASP, codebase vectors, and review memory.

Overview

Codex 2.0 is a Retrieval-Augmented Generation (RAG) platform designed to provide evidence-grounded developer intelligence. Unlike traditional code review tools that tell you what is wrong, Codex shows you why — with citations retrieved from trusted knowledge sources before generating a single token.

Key Features

Playground — Instant RAG-grounded code reviews with evidence-backed responses
Codebase Chat — Ask questions about your codebase in natural language with streaming SSE responses and inline citations
Refactor Intelligence — Evidence-backed refactoring suggestions with before/after diffs
Index Manager — Live pipeline visualization showing the status of each knowledge corpus

Architecture

The platform follows a three-service pipeline architecture:

Ingestion Service — Processes and indexes code into three knowledge corpora (OWASP Top 10, codebase vectors, review memory)
Retrieval Service — Hybrid retrieval combining BM25 keyword search with ChromaDB semantic cosine similarity, fused via Reciprocal Rank Fusion (RRF)
Generation Service — Feeds retrieved context to Llama 3.3 70B via Groq API for grounded generation

Tech Stack

React 18 + TypeScript frontend, Node.js/Express backend, MySQL 8.0, ChromaDB vector database, all-MiniLM-L6-v2 embeddings (local ONNX inference — no API key required), Groq API (Llama 3.3 70B), JWT + GitHub OAuth authentication.