Semantic file search for AI workstations using HNSW vector indexing
Find files by describing what you're looking for, not just by name
| Problem | Solution |
|---|---|
| "Where's that database connection file?" | file-compass search "database connection handling" |
| Keyword search misses semantic matches | Vector embeddings understand meaning |
| Slow search across large codebases | HNSW index: <100ms for 10K+ files |
| Need to integrate with AI assistants | MCP server for Claude Code |
# Install
git clone https://github.com/mcp-tool-shop-org/file-compass.git
cd file-compass && pip install -e .
# Pull embedding model
ollama pull nomic-embed-text
# Index your code
file-compass index -d "C:/Projects"
# Search semantically
file-compass search "authentication middleware"

- Semantic Search - Find files by describing what you're looking for
- Quick Search - Instant filename/symbol search (no embedding required)
- Multi-Language AST - Tree-sitter support for Python, JS, TS, Rust, Go
- Result Explanations - Understand why each result matched
- Local Embeddings - Uses Ollama (no API keys needed)
- Fast Search - HNSW indexing for sub-second queries
- Git-Aware - Optionally filter to git-tracked files only
- MCP Server - Integrates with Claude Code and other MCP clients
- Security Hardened - Input validation, path traversal protection
# Clone the repository
git clone https://github.com/mcp-tool-shop-org/file-compass.git
cd file-compass
# Create virtual environment
python -m venv venv
venv\Scripts\activate # Windows
# or: source venv/bin/activate # Linux/Mac
# Install dependencies
pip install -e .
# Pull the embedding model
ollama pull nomic-embed-text

- Python 3.10+
- Ollama with the nomic-embed-text model
# Index a directory
file-compass index -d "C:/Projects"
# Index multiple directories
file-compass index -d "C:/Projects" "D:/Code"

# Semantic search
file-compass search "database connection handling"
# Filter by file type
file-compass search "training loop" --types python
# Git-tracked files only
file-compass search "API endpoints" --git-only

# Search by filename or symbol name
file-compass scan -d "C:/Projects" # Build quick index

file-compass status

File Compass includes an MCP server for integration with Claude Code and other AI assistants.
| Tool | Description |
|---|---|
| file_search | Semantic search with explanations |
| file_preview | Code preview with syntax highlighting |
| file_quick_search | Fast filename/symbol search |
| file_quick_index_build | Build the quick search index |
| file_actions | Context, usages, related, history, symbols |
| file_index_status | Check index statistics |
| file_index_scan | Build or rebuild the full index |
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "file-compass": {
      "command": "python",
      "args": ["-m", "file_compass.gateway"],
      "cwd": "C:/path/to/file-compass"
    }
  }
}

| Variable | Default | Description |
|---|---|---|
| FILE_COMPASS_DIRECTORIES | F:/AI | Comma-separated directories |
| FILE_COMPASS_OLLAMA_URL | http://localhost:11434 | Ollama server URL |
| FILE_COMPASS_EMBEDDING_MODEL | nomic-embed-text | Embedding model |
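
The variables above could be read at startup along the following lines. This is a minimal sketch, not the project's actual config.py: the `Settings` dataclass and the `load_settings` function are hypothetical names, and only the variable names and defaults are taken from the table.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three documented settings."""
    directories: list[str]
    ollama_url: str
    embedding_model: str


def load_settings(env=None) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    raw_dirs = env.get("FILE_COMPASS_DIRECTORIES", "F:/AI")
    # Split the comma-separated value; drop empty entries and stray whitespace.
    directories = [d.strip() for d in raw_dirs.split(",") if d.strip()]
    return Settings(
        directories=directories,
        ollama_url=env.get("FILE_COMPASS_OLLAMA_URL", "http://localhost:11434"),
        embedding_model=env.get("FILE_COMPASS_EMBEDDING_MODEL", "nomic-embed-text"),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.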
- Scanning - Discovers files matching configured extensions, respects .gitignore
- Chunking - Splits files into semantic pieces:
  - Python/JS/TS/Rust/Go: AST-aware via tree-sitter (functions, classes)
  - Markdown: Heading-based sections
  - JSON/YAML: Top-level keys
  - Other: Sliding window with overlap
- Embedding - Generates 768-dim vectors via Ollama
- Indexing - Stores vectors in HNSW index, metadata in SQLite
- Search - Embeds query, finds nearest neighbors, returns ranked results
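
The sliding-window fallback and the search step above can be illustrated in plain Python. This is a sketch under stated assumptions, not the project's code: `sliding_window_chunks`, `cosine_similarity`, and `top_k` are hypothetical names, and the window and overlap sizes are arbitrary. The search is an exact brute-force scan standing in for the HNSW index, which finds approximately the same neighbors far faster on large collections. The vectors passed in would be the 768-dimensional embeddings from Ollama; any equal-length float lists work for experimenting.

```python
import math


def sliding_window_chunks(lines, window=40, overlap=10):
    """Split a list of lines into overlapping windows (the 'Other' fallback)."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than window")
    step = window - overlap
    chunks = []
    for start in range(0, len(lines), step):
        chunks.append(lines[start:start + window])
        if start + window >= len(lines):
            break  # this window already reaches the end of the file
    return chunks


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, indexed, k=3):
    """Exact nearest-neighbor search over a list of (chunk_id, vector) pairs."""
    scored = [(cosine_similarity(query_vec, vec), cid) for cid, vec in indexed]
    scored.sort(reverse=True)  # highest similarity first
    return [(cid, score) for score, cid in scored[:k]]
```

The overlap means a function or paragraph that straddles a window boundary still appears whole in at least one chunk, provided it is shorter than the overlap.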
| Metric | Value |
|---|---|
| Index Size | ~1KB per chunk |
| Search Latency | <100ms for 10K+ chunks |
| Quick Search | <10ms for filename/symbol |
| Embedding Speed | ~3-4s per chunk (local) |
file-compass/
├── file_compass/
│ ├── __init__.py # Package init
│ ├── config.py # Configuration
│ ├── embedder.py # Ollama client with retry
│ ├── scanner.py # File discovery
│ ├── chunker.py # Multi-language AST chunking
│ ├── indexer.py # HNSW + SQLite index
│ ├── quick_index.py # Fast filename/symbol search
│ ├── explainer.py # Result explanations
│ ├── merkle.py # Incremental updates
│ ├── gateway.py # MCP server
│ └── cli.py # CLI
├── tests/ # 298 tests, 91% coverage
├── pyproject.toml
└── LICENSE
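
The incremental-update idea behind merkle.py can be illustrated with content hashes. The sketch below is an assumption about the general technique, not a description of the module: it uses a flat hash list rather than a full Merkle tree, and `file_digest`, `snapshot`, `root_hash`, and `changed_paths` are hypothetical names.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def snapshot(root):
    """Map each file under root (as a relative POSIX path) to its content hash."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def root_hash(snap):
    """One hash summarizing the whole snapshot; equal hashes mean nothing changed."""
    h = hashlib.sha256()
    for rel, digest in sorted(snap.items()):
        h.update(f"{rel}:{digest}\n".encode())
    return h.hexdigest()


def changed_paths(old, new):
    """Paths added, modified, or removed between two snapshots."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

Comparing root hashes first gives a cheap "nothing to do" check; only when they differ does the indexer need to re-chunk and re-embed the paths returned by `changed_paths`.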
- Input Validation - All MCP inputs are validated
- Path Traversal Protection - Files outside allowed directories blocked
- SQL Injection Prevention - Parameterized queries only
- Error Sanitization - Internal errors not exposed
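
Two of these protections can be sketched with the standard library. The following is an illustration, not the project's implementation: `is_within_allowed` and `find_chunks_by_path` are hypothetical helpers, and the `chunks` table with `path` and `content` columns is an assumed schema.

```python
import sqlite3
from pathlib import Path


def is_within_allowed(candidate, allowed_dirs):
    """Return True only if candidate resolves to a location inside an allowed directory."""
    # resolve() collapses ".." segments and follows symlinks before comparing.
    resolved = Path(candidate).resolve()
    for base in allowed_dirs:
        base_resolved = Path(base).resolve()
        if resolved == base_resolved or base_resolved in resolved.parents:
            return True
    return False


def find_chunks_by_path(conn, path):
    """Look up rows with a bound parameter instead of string formatting."""
    cursor = conn.execute("SELECT content FROM chunks WHERE path = ?", (path,))
    return [row[0] for row in cursor.fetchall()]
```

Because the path is passed as a bound parameter, a value such as `a.py' OR '1'='1` is compared literally and matches nothing.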
# Run tests
pytest tests/ -v
# Run with coverage
pytest tests/ --cov=file_compass --cov-report=term-missing
# Type checking
mypy file_compass/

Part of MCP Tool Shop, the Compass Suite for AI-powered development:
- Tool Compass - Semantic MCP tool discovery
- Integradio - Vector-embedded Gradio components
- Backpropagate - Headless LLM fine-tuning
- Comfy Headless - ComfyUI without the complexity
- Questions / help: Discussions
- Bug reports: Issues
MIT License - see LICENSE for details.
- Ollama for local LLM inference
- hnswlib for fast vector search
- nomic-embed-text for embeddings
- tree-sitter for multi-language AST parsing
