CodeContext
Local Retrieval Stack for Code and Docs
Overview
CodeContext is a local-first retrieval system designed to give AI agents and developers unified access to their codebase and documentation. It runs entirely on your machine, indexing files into a hybrid search layer that balances semantic understanding with exact lexical matching.
The stack includes a FastAPI service, an MCP server for agent integration, a VS Code extension for native editor support, and token compression utilities to keep context windows lean and relevant. Everything is built around the idea that your code context should stay local, fast, and interoperable.
Features
Hybrid Search
Combines semantic embeddings and lexical matching for precise code and document retrieval
MCP Server
Model Context Protocol server exposing local context to any compatible AI agent or IDE
VS Code Extension
Native editor integration for querying context without leaving your workflow
Token Compression
Context-window management that trims retrieved content to fit model token limits while keeping the most relevant passages
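As a rough illustration of the token compression idea, the sketch below greedily keeps the highest-scoring chunks that fit within a token budget. It is not CodeContext's actual implementation: the names `approx_tokens` and `pack_context` are hypothetical, the relevance scores are assumed to come from an earlier retrieval step, and token counts are approximated by whitespace-separated words rather than a real tokenizer.

```python
from typing import List, Tuple


def approx_tokens(text: str) -> int:
    # Assumption: a crude stand-in for a real tokenizer that simply
    # counts whitespace-separated words.
    return len(text.split())


def pack_context(scored_chunks: List[Tuple[float, str]], budget: int) -> List[str]:
    """Greedily keep the highest-scoring chunks that fit within `budget` tokens.

    `scored_chunks` holds (relevance_score, text) pairs; higher is better.
    Hypothetical helper for illustration only.
    """
    kept: List[str] = []
    used = 0
    # Visit chunks from most to least relevant.
    for _score, text in sorted(scored_chunks, key=lambda c: c[0], reverse=True):
        cost = approx_tokens(text)
        # Skip any chunk that would push the total over the budget,
        # but keep scanning: a smaller, lower-ranked chunk may still fit.
        if used + cost <= budget:
            kept.append(text)
            used += cost
    return kept


if __name__ == "__main__":
    chunks = [
        (0.9, "def load_index(path): ..."),
        (0.5, "Long changelog entry describing an unrelated refactor in detail"),
        (0.7, "README: how to start the indexer"),
    ]
    for text in pack_context(chunks, budget=10):
        print(text)
```

A greedy pass is simple and predictable, at the cost of sometimes leaving part of the budget unused; a real system might summarize or truncate chunks instead of dropping them.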
Architecture
- Indexer: Background file watcher that parses code and docs into chunked, embedding-ready segments
- Hybrid Engine: Dual-retrieval pipeline fusing dense vector search with sparse BM25 ranking for high-recall results
- Reranker: Cross-encoder scoring layer that reorders candidates by relevance before returning results
- MCP Server: Exposes indexed context through the Model Context Protocol to any compatible client or agent
- REST API & VS Code Extension: FastAPI surface for external tools and a native extension for in-editor queries
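The fusion step in the Hybrid Engine can be sketched as follows. The source does not say how dense and BM25 results are combined, so this example assumes reciprocal rank fusion (RRF), a common choice that merges ranked lists using only rank positions. The function name, the document IDs, and the constant `k = 60` are illustrative assumptions, not part of CodeContext's API.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one list.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. Assumption: RRF stands in here for whatever
    fusion method the Hybrid Engine really uses.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    # Hypothetical outputs of the two retrievers, best match first.
    dense_hits = ["auth.py", "session.py", "docs/login.md"]
    bm25_hits = ["session.py", "tokens.py", "auth.py"]
    print(reciprocal_rank_fusion([dense_hits, bm25_hits]))
```

Because RRF ignores raw scores, it sidesteps the problem that cosine similarities and BM25 scores live on different scales. The fused list would then be passed to the reranker for final ordering.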