CodeContext

Local Retrieval Stack for Code and Docs

Overview

CodeContext is a local-first retrieval system designed to give AI agents and developers unified access to their codebase and documentation. It runs entirely on your machine, indexing files into a hybrid search layer that balances semantic understanding with exact lexical matching.

The stack includes a FastAPI service, an MCP server for agent integration, a VS Code extension for native editor support, and token compression utilities to keep context windows lean and relevant. Everything is built around the idea that your code context should stay local, fast, and interoperable.

Features

Hybrid Search

Combines semantic embeddings and lexical matching for precise code and document retrieval

MCP Server

Model Context Protocol server exposing local context to any compatible AI agent or IDE

VS Code Extension

Native editor integration for querying context without leaving your workflow

Token Compression

Smart window management that preserves relevance while staying within model limits
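
One way to picture the token compression idea is a greedy packer that keeps the most relevant chunks until a token budget is used up. The sketch below is illustrative only and is not CodeContext's actual implementation: the Chunk type, count_tokens, and pack_context are hypothetical names, and whitespace word counts stand in for a real model tokenizer.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    score: float  # relevance score from retrieval (higher is better)


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate model tokens.
    # A real system would call the target model's tokenizer instead.
    return len(text.split())


def pack_context(chunks: list[Chunk], max_tokens: int) -> list[Chunk]:
    """Greedily keep the highest-scoring chunks that fit within max_tokens."""
    selected: list[Chunk] = []
    used = 0
    # Visit chunks from most to least relevant.
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        cost = count_tokens(chunk.text)
        # Skip any chunk that would push the total over the budget.
        if used + cost <= max_tokens:
            selected.append(chunk)
            used += cost
    return selected


if __name__ == "__main__":
    candidates = [
        Chunk("def load(path): return open(path).read()", 0.9),
        Chunk("A long changelog entry describing unrelated refactors in detail", 0.2),
        Chunk("load() reads a file from disk", 0.7),
    ]
    for kept in pack_context(candidates, max_tokens=12):
        print(f"{kept.score:.1f}  {kept.text}")
```

A greedy pass is simple and predictable, but it can leave part of the budget unused when a large chunk is skipped; a production system might summarize or truncate chunks rather than drop them.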

Architecture

Indexer: Background file watcher that parses code and docs into chunked, embedding-ready segments
Hybrid Engine: Dual-retrieval pipeline fusing dense vector search with sparse BM25 ranking for high-recall results
Reranker: Cross-encoder scoring layer that reorders candidates by relevance before returning results
MCP Server: Exposes indexed context through the Model Context Protocol to any compatible client or agent
REST API & VS Code Extension: FastAPI surface for external tools and a native extension for in-editor queries
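
The fusion step in the Hybrid Engine can be sketched with reciprocal rank fusion (RRF), a common way to merge a dense ranking and a BM25 ranking without having to normalize their scores. The source does not say which fusion method CodeContext uses, so RRF is an assumption here, as are the function name, the file names, and the constant k=60.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    # Hypothetical results from the two retrievers, best match first.
    dense_hits = ["auth.py", "session.py", "README.md"]
    sparse_hits = ["session.py", "tokens.py", "auth.py"]
    print(reciprocal_rank_fusion([dense_hits, sparse_hits]))
```

In a pipeline like the one described above, the fused list would then be passed to the reranker, which rescores only the top candidates.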

Tech Stack

Python, FastAPI, MCP, SQLite, SentenceTransformers, Rank-BM25, VS Code API, TypeScript, Docker, REST API

Status

Active Development | Core Stack Functional