This project provides a framework and a suite of tools (context_store.py, context_store_json.py) designed to facilitate sophisticated, AI-assisted software development. It enables multiple specialized AI agents to collaborate on a Python codebase by leveraging advanced context retrieval mechanisms, thereby mitigating the inherent context window limitations of Large Language Models (LLMs).
The core of the context management system allows for:
- Lightweight, Precise Context Retrieval: Using JSON-based indices (project_signatures.json, project_fullsource.json) for direct lookup of Python functions and classes by name, primarily queried via context_store_json.py.
- Dense, Semantic Context Retrieval: Using AST-based chunking and SentenceTransformer embeddings (project_ast_index.npz) for conceptual, natural language searches across the codebase, primarily queried via context_store.py.
- Prose Document Indexing: (If build_prose_index functionality is used) Semantic indexing of Markdown and Jupyter Notebook content for documentation and broader project understanding.
This README focuses on the tools and the conceptual multi-agent setup. Detailed agent priming documents (COMMON_PROTOCOL.md, MASTER_AGENT_PRIME.md, etc.) govern agent behavior and interaction.
context_store_json.py provides functionality for creating and querying a lightweight, exact-match index of Python code elements.
- Features:
  - AST-based extraction of function/class signatures, docstrings, and full source code.
  - Outputs to project_signatures.json and project_fullsource.json.
  - Relies only on the Python standard library.
- CLI Usage:
  - Build JSON Index:
    python context_store_json.py build-json --repo <path_to_python_codebase> --output-base-name project
    (This creates project_signatures.json and project_fullsource.json)
  - Query JSON Index:
    python context_store_json.py query-json --signatures-file project_signatures.json --source-file project_fullsource.json --query "function_name in file_name.py" --k 3
- Programmatic API: export_ast_chunks_to_json(), query_json_context().
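
To illustrate the idea behind an exact-match index, here is a minimal sketch that uses only the standard library. It is not the code in context_store_json.py: the helper names extract_elements and lookup, the record fields, and the file-matching rule are all assumptions made for this example. The query format ("name in file.py") follows the CLI example above.

```python
import ast
import json


def extract_elements(source, file_name):
    """Collect one record per function or class found in `source` (hypothetical helper)."""
    records = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment (Python 3.8+) returns the exact source text of the node.
            segment = ast.get_source_segment(source, node) or ""
            records.append({
                "name": node.name,
                "file": file_name,
                # First line only; a signature spanning several lines is truncated here.
                "signature": segment.splitlines()[0].strip() if segment else node.name,
                "docstring": ast.get_docstring(node) or "",
                "source": segment,
            })
    return records


def lookup(records, query, k=3):
    """Exact-match lookup for queries shaped like 'name' or 'name in file.py'."""
    name, _, file_name = query.partition(" in ")
    name, file_name = name.strip(), file_name.strip()
    hits = [
        r for r in records
        # Assumption: a record matches when its path ends with the queried file name.
        if r["name"] == name and (not file_name or r["file"].endswith(file_name))
    ]
    return hits[:k]


SAMPLE = '''
def add(a, b):
    """Add two numbers."""
    return a + b


class Greeter:
    """Says hello."""

    def greet(self, name):
        return f"hello {name}"
'''

records = extract_elements(SAMPLE, "pkg/math_utils.py")
print(json.dumps([r["signature"] for r in records], indent=2))
print(lookup(records, "add in math_utils.py")[0]["docstring"])
```

Because the lookup compares names exactly, it returns nothing for a misspelled or paraphrased query. That is the trade-off against the dense index: cheap and precise, but only for elements whose names are already known.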
context_store.py provides tools to build and query dense, AST-based semantic indices of Python code and (optionally) prose documents.
- Features:
  - Code Indexing:
    - Intelligently extracts Python functions and classes.
    - Uses SentenceTransformer models (e.g., intfloat/e5-base-v2) for creating vector embeddings.
    - Enables natural language querying for semantically similar code snippets.
  - Prose Indexing (Optional):
    - Indexes Markdown and Jupyter Notebook content, chunked by headings.
    - Allows semantic search over documentation and other prose.
  - Caching: In-memory caching for loaded index files and models.
- CLI Usage (Code Index):
  - Build Dense Code Index:
    python context_store.py build --repo <path_to_python_codebase> --index project_ast_index.npz --model intfloat/e5-base-v2
  - Query Dense Code Index:
    python context_store.py query --index project_ast_index.npz --query "natural language description of code needed" --k 3
- CLI Usage (Prose Index - if implemented):
  - Build Dense Prose Index:
    python context_store.py build-prose --repo <path_to_docs_or_project> --output project_prose_index.npz --model intfloat/e5-base-v2
  - Query Dense Prose Index:
    python context_store.py query-prose --index project_prose_index.npz --query "concept from documentation" --k 3
- Programmatic API: build_index(), get_code_context(), build_prose_index(), get_prose_context().
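
The retrieval step behind a dense index can be pictured as "embed every chunk once, embed the query, rank by cosine similarity, keep the top k". The sketch below shows only that ranking logic. So that it runs without downloading a model, a bag-of-words term-count vector stands in for the SentenceTransformer embedding; this stand-in matches exact tokens only and has none of the semantic behaviour of a real model. The names toy_embed, build_toy_index, and query_toy_index are hypothetical and are not part of context_store.py.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Sparse bag-of-words vector. A stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_toy_index(chunks):
    """Embed each chunk once so that queries only need to embed the query text."""
    return [(chunk, toy_embed(chunk)) for chunk in chunks]


def query_toy_index(index, query, k=3):
    """Return the k (score, chunk) pairs most similar to the query."""
    q = toy_embed(query)
    scored = [(cosine(q, vec), chunk) for chunk, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


CHUNKS = [
    "def load_config(path): read the yaml config file from disk",
    "def send_email(to, body): deliver an email message over smtp",
    "class RetryPolicy: retry failed network requests with backoff",
]

index = build_toy_index(CHUNKS)
for score, chunk in query_toy_index(index, "retry network requests", k=2):
    print(f"{score:.2f}  {chunk}")
```

With a real embedding model the structure stays the same, but a query such as "handle flaky connections" could still surface the RetryPolicy chunk even though it shares no words with it.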
This system is designed to be the backbone of a multi-agent development team, typically structured as follows:
- Human Overseer: Initializes the project, provides high-level goals, makes index files and agent primes available, and executes actions agents cannot (e.g., running full test suites, committing code, running dense index queries).
- Master Agent (Master):
  - Receives project goals and orchestrates the specialized agents.
  - Defines tasks, reviews deliverables, and ensures project coherence.
  - Uses the "Task Delegation Checklist" (defined in its prime) to determine context strategy.
  - Instructs specialized agents to use context_store_json.py for retrieving existing code elements.
  - Formulates semantic queries for the dense index (project_ast_index.npz) and requests the Human Overseer to execute them via context_store.py, then provides the results to specialized agents.
- Specialized Agents (module_dev, unit_tester, notebook_writer):
  - Operate based on signed tasks from Master and their role-specific primes.
  - Utilize a "Task Kick-off Checklist" to guide their actions, including context acquisition.
  - Crucially, if their environment allows, they directly execute context_store_json.py query-json ... to retrieve context for existing Python code elements they are tasked to work on, using the provided project_signatures.json and project_fullsource.json. This minimizes Master's context burden.
  - Refer to COMMON_PROTOCOL.md and their specific prime for detailed operational rules and interaction protocols.
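
As a rough illustration of how a specialized agent with shell access might assemble the query-json call, the sketch below builds the argument list from the flags shown in the CLI usage section. The helper name query_json_argv and its defaults are assumptions for this example; the authoritative command structure is the one in COMMON_PROTOCOL.md.

```python
import shlex
import sys


def query_json_argv(element, file_name, k=3,
                    signatures_file="project_signatures.json",
                    source_file="project_fullsource.json"):
    """Build the argv list for a query-json call (hypothetical helper)."""
    return [
        sys.executable, "context_store_json.py", "query-json",
        "--signatures-file", signatures_file,
        "--source-file", source_file,
        # Query format taken from the CLI example: "<name> in <file>".
        "--query", f"{element} in {file_name}",
        "--k", str(k),
    ]


argv = query_json_argv("function_A", "module_X.py")
print(shlex.join(argv))

# Running it requires context_store_json.py and both index files to be present:
# import subprocess
# result = subprocess.run(argv, capture_output=True, text=True, check=True)
# print(result.stdout)
```

Passing a list to subprocess.run instead of a shell string avoids quoting problems when the query contains spaces.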
- Human Overseer: Provides COMMON_PROTOCOL.md, role-specific primes, index files, and tool scripts to the agent environment. Sets a project goal.
- Master Agent: Decomposes goal. Identifies that module_dev needs to modify function_A in module_X.py.
- Master Agent to module_dev (Signed Message): "Task: Modify function_A in module_X.py to include new error handling for X. First, retrieve current source for function_A using context_store_json.py query-json ... (refer to COMMON_PROTOCOL.md for command structure if needed)."
- module_dev (Internal): Executes its "Task Kick-off Checklist." Runs the query-json command, gets function_A's source. Implements changes.
- module_dev to Master (Signed Message): Delivers modified function_A.
- Master Agent: Reviews. If a broader impact analysis is needed, Master formulates a semantic query (e.g., "find all functions affected by changes to function_A's return type") and asks Human to run it against project_ast_index.npz. Master uses these results to plan next steps or provide further context to agents.
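
The workflow above uses two retrieval paths: an exact lookup that the specialized agent runs itself, and a semantic query that Master hands to the Human Overseer. The sketch below models that routing decision in a few lines. It is an illustration only: the ContextRequest type, the route function, and the executor labels are hypothetical, and the actual decision rules live in the "Task Delegation Checklist" in Master's prime.

```python
from dataclasses import dataclass


@dataclass
class ContextRequest:
    description: str     # what the agent needs to see, in natural language
    element: str = ""    # known function or class name, if any
    file_name: str = ""  # file containing the element, if known


def route(request):
    """Return (executor, tool, query) for a context request (hypothetical rule)."""
    if request.element and request.file_name:
        # Known element: exact lookup that the specialized agent can run itself.
        return (
            "specialized_agent",
            "context_store_json.py query-json",
            f"{request.element} in {request.file_name}",
        )
    # Conceptual need: semantic query that Master passes to the Human Overseer.
    return ("human_overseer", "context_store.py query", request.description)


exact = ContextRequest("current source of function_A", "function_A", "module_X.py")
broad = ContextRequest("find all functions affected by changes to function_A's return type")

for request in (exact, broad):
    executor, tool, query = route(request)
    print(f"{executor}: {tool} --query {query!r}")
```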
This collaborative approach, underpinned by robust context retrieval tools, allows the AI team to tackle complex software development tasks more effectively.
- Python 3.8+
- numpy
- torch (CPU version is sufficient)
- sentence-transformers
- nbformat & nbconvert (for prose indexing of Jupyter Notebooks)
Install dependencies for dense indexing:
pip install numpy torch sentence-transformers nbformat nbconvert
(context_store_json.py has no external dependencies beyond the Python standard library).
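
Before building a dense index, it can help to check which of these packages are importable in the current environment. The following is a small sketch using only the standard library; the helper name missing_dense_dependencies is made up for this example and is not part of either tool. Note that the sentence-transformers package is imported as sentence_transformers.

```python
import importlib.util

# Import names for the packages listed above.
DENSE_DEPENDENCIES = ["numpy", "torch", "sentence_transformers", "nbformat", "nbconvert"]


def missing_dense_dependencies(modules=DENSE_DEPENDENCIES):
    """Return the module names that cannot be found, without importing them."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = missing_dense_dependencies()
if missing:
    print("Missing for dense indexing:", ", ".join(missing))
else:
    print("All dense-indexing dependencies are available.")
```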