All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
- importlinter added as an architecture dependency/import watcher; the check runs in CI and pre-commit.
- Removed legacy SQL shim module src/local_rag_backend/infrastructure/persistence/sql/alchemy_engine.py and its remaining compatibility surface.
- Golden integration coverage for invariants F1-F5 in tests/integration/test_invariants_f1_f5_golden.py:
  - F1/F2 durable mutate + ingest + list contracts.
  - F3 ask + ask_eval + history contracts.
  - F4/F5 rebuild + health/readiness contracts.
- Composition wiring is now centralized through AppContainer.runtime_wiring_defaults() with AppContainer.from_settings() as the single runtime construction path.
- Factory/container wiring overrides were unified to reduce duplication across API/CLI/runtime paths.
- Bootstrap ingestion (rag-bootstrap) now goes through the canonical durable mutation flow (MutationCoordinator) instead of the legacy ETL-style path.
- Bootstrap/integration naming and architecture docs were updated to reflect the canonical mutation topology and single composition root.
- composition.factory now resolves runtime wiring overrides with safe defaults and keeps monkeypatchable public symbols for test/runtime overrides.
- Lock/offload assertions were hardened in mutation tests to check behavior instead of fragile internal callable names.
- Screaming Architecture refactor: app/ package eliminated entirely.
  - app/application/ → core/use_cases/ (transport-agnostic use cases).
  - app/contracts/ports.py → core/ports/contracts.py.
  - app/contracts/results.py → core/use_cases/results.py.
  - app/errors.py → core/use_cases/errors.py.
  - app/{composition,container,factory,app_context}.py → composition/ (transport-neutral DI root).
  - app/wiring/ → composition/wiring/.
  - app/{diagnostics,metrics_backend,observability,telemetry}.py → infrastructure/observability/.
  - app/blocking.py → infrastructure/concurrency/blocking.py.
  - app/application/storage_profiles.py → core/domain/profiles.py.
  - All remaining HTTP-specific files → http/ (routers, schemas, middleware, security, main).
- FastAPI is now an optional dependency: moved fastapi, uvicorn, python-multipart to [project.optional-dependencies.server]. Install with pip install rag-prototype[server] or uv sync --extra server.
- ASGI entry point changed: local_rag_backend.app.main:app → local_rag_backend.http.main:app.
- all extra now uses PEP 621 self-referencing syntax to include server transitively.
- error_mapping.py moved from http/ into core/use_cases/errors.py::map_runtime_error.
- AppContainer public method signatures use the AskEvalConfigLike Protocol (from core/use_cases/rag_query) instead of the concrete AskEvalConfig Pydantic schema.
- Architecture guard tests enforce layer dependency rules:
  - core/{domain,ports,services} must not import infrastructure/, http/, or composition/.
  - core/use_cases/ must not import http/ or fastapi/starlette.
  - No local_rag_backend.app.* references remain in source.
- AskEvalConfigLike Protocol with 9 properties for transport-neutral RAG evaluation config.
- Makefile sync targets include --extra server for dev/test/lint environments.
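A layer rule of the kind listed above can be checked by parsing a module and inspecting its imports. The following is a minimal sketch, not the project's actual guard tests: the helper name `forbidden_imports` and the `FORBIDDEN` prefix tuple are assumptions made for illustration.

```python
import ast

# Hypothetical rule: modules under core/ must not import these packages.
FORBIDDEN = (
    "local_rag_backend.infrastructure",
    "local_rag_backend.http",
    "local_rag_backend.composition",
)


def forbidden_imports(source: str, forbidden: tuple = FORBIDDEN) -> list:
    """Return absolute imports in `source` that match a forbidden prefix."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are ignored in this sketch.
            names = [node.module]
        else:
            continue
        for name in names:
            # Match the package itself or any submodule, but not lookalike names.
            if any(name == p or name.startswith(p + ".") for p in forbidden):
                hits.append(name)
    return hits
```

A guard test would typically read every file under the protected package and assert that this helper returns an empty list for each one.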
- Dockerfile: ASGI entry point updated to local_rag_backend.http.main:app; --extra server added to dependency sync stages.
- Bumped starlette 0.37.2 → 0.50.0 (via fastapi ≥ 0.124): resolves three DoS CVEs (CVE-2024-47874, CVE-2025-54121, CVE-2025-62727).
- Bumped anyio 4.3 → 4.12: closes PVE-2024-71199.
- VectorRepoPort.ntotal read-only property for non-destructive preflight health checks (used in maintenance preflight before rebuild/delete operations).
- WriteLockPort Protocol in core/ports/contracts.py; DocsMutationPorts.write_lock is now explicitly typed against it.
- gitleaks v8.24.2 pre-commit hook for secret and credential detection.
- check-json and debug-statements pre-commit hooks.
- fastapi pin raised to >=0.124,<0.125 (minimum required to unlock starlette ≥ 0.50).
- python-multipart declared as an explicit dependency (was transitive in fastapi ≤ 0.111).
- DocsMutationPorts: four dead fields removed (precompute_vectors_fn, sync_dense_fn, delete_docs_fn, delete_external_ids_fn).
- DocsRepositoryPort: extended with get(), delete_documents(), delete_by_external_ids().
- UpsertDocBuilderPort: return type narrowed from Any to object.
- MultiStoreDeleteResult and MultiStoreExternalIdDeleteResult are now frozen dataclasses; maintenance.py functions no longer return bare tuples.
- core/services/types.py replaces core/services/schemas.py for data-only DTOs.
- VectorStorage.__init__ accepts an explicit settings_obj parameter instead of relying on the global settings singleton.
- except Exception narrowed to specific types (UnicodeDecodeError, json.JSONDecodeError) in factory.py and id_map_json.py; broad catches in alchemy_engine.py and vector/index.py are documented as intentionally wide (rollback/recovery guards).
- docs_ingest.py no longer imports SqlDocumentStorage directly; uses the ports.build_upsert_doc factory instead.
- sample_data_ingestion.py: print() replaced by logger.info(); magic constant 128 replaced by settings_obj.ingest_batch_size.
- All production assert x is not None statements replaced by explicit RuntimeError.
- Orphaned result DTOs removed: UpsertDocsSummary, DeleteDocsByExternalIdSummary, DeleteDocsSummary.
- runtime.__all__ no longer exposes private _reset_rag_service_best_effort.
- mutation_journal._record_from_dict: invalid-state entries now log a warning before resetting to PREPARED instead of silently resetting.
- except TypeError removed from write_lock.py and docs_mutation.py.
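The assert-to-RuntimeError change matters because Python drops `assert` statements when run with `-O`, so an assert-based None check can silently disappear in production. A minimal sketch of the replacement pattern follows; the helper name `require` and its message format are hypothetical, not taken from the codebase.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def require(value: Optional[T], what: str) -> T:
    """Return `value`, or raise RuntimeError when it is None.

    Unlike `assert value is not None`, this check still runs under `python -O`.
    """
    if value is None:
        raise RuntimeError(f"{what} is not initialized")
    return value
```

The return type also narrows `Optional[T]` to `T` for type checkers, which is the role the asserts usually played.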
- _resolve_embedder_or_raise() helper extracted in maintenance.py, removing four duplicated preflight patterns across two functions.
- FaissIndex._locked_write() context manager replaces five repeated lock patterns.
- Settings.ingest_batch_size field added; magic number 64 removed from callers.
- IngestPlan and BatchSyncResult converted from tuple aliases to frozen dataclasses.
- _to_domain_document() and _detect_document_changes() extracted in sql_.py (eliminates getattr usage; adds explicit change detection).
- DocsMutationPorts/IndexMutationPorts: Any replaced by concrete types throughout mutation port contracts.
- factory.py: get_app_context dead local variable for mypy narrowing removed.
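To illustrate how a single context manager can replace repeated lock-then-mutate blocks, here is a sketch under stated assumptions. `TinyIndex` is a hypothetical stand-in, and only the method name `_locked_write` is borrowed from the entry above; the real FaissIndex behavior may differ.

```python
import threading
from contextlib import contextmanager
from typing import Iterator, List


class TinyIndex:
    """Hypothetical stand-in for an index with lock-guarded writes."""

    def __init__(self) -> None:
        self._lock = threading.RLock()
        self._ids: List[int] = []
        self.dirty = False

    @contextmanager
    def _locked_write(self) -> Iterator[None]:
        # Hold the lock for the caller's mutation; mark the index dirty
        # only when the mutation finished without raising.
        with self._lock:
            yield
            self.dirty = True

    def add(self, doc_id: int) -> None:
        with self._locked_write():
            self._ids.append(doc_id)

    def remove(self, doc_id: int) -> None:
        with self._locked_write():
            self._ids.remove(doc_id)  # raises ValueError if absent
```

Each write method shrinks to its actual mutation, and the locking policy lives in one place.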
- README: FastAPI badge corrected (0.111+ → 0.124+); cli_commands/ added to project structure tree; scripts/ directory description corrected.
- USAGE.md: sessionmaker(bind=engine) → sessionmaker(engine) (SQLAlchemy 2.x API).
- New ingestion pipeline for files/directories with loader discovery for .txt, .md, and .csv, plus optional python-magic detection.
- New CLI capabilities: rag-ingest, rag-delete-external-ids, and rag-eval (dataset-based offline regression gate).
- New API mutation capabilities for document lifecycle and maintenance (/api/docs/upsert, /api/docs/delete_by_external_id, /api/index/rebuild).
- Manifest-backed dense index diagnostics with drift checks surfaced in readiness/status flows.
- Optional overlap reranker and expanded observability (structured logs + ingestion/query metrics).
- DB-backed system_state versioning to invalidate cached RAG services across processes.
- Task-type blocking pools with explicit pending limits for mutation/network/eval workloads.
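The DB-backed versioning idea can be sketched as a counter row that writers bump and readers compare before reusing a cached object. This is an illustration only: the table layout, the `'rag'` key, and the `VersionedCache` class are assumptions, not the project's schema or API.

```python
import sqlite3
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


def init_state(conn: sqlite3.Connection) -> None:
    # Assumed layout: one row per cached resource, holding a counter.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS system_state "
        "(key TEXT PRIMARY KEY, version INTEGER NOT NULL)"
    )
    conn.execute("INSERT OR IGNORE INTO system_state (key, version) VALUES ('rag', 0)")
    conn.commit()


def read_version(conn: sqlite3.Connection) -> int:
    row = conn.execute("SELECT version FROM system_state WHERE key = 'rag'").fetchone()
    return row[0]


def bump_version(conn: sqlite3.Connection) -> None:
    # Called after a mutation so every process sees a new version.
    conn.execute("UPDATE system_state SET version = version + 1 WHERE key = 'rag'")
    conn.commit()


class VersionedCache(Generic[T]):
    """Rebuild the cached value whenever the stored version changes."""

    def __init__(self, conn: sqlite3.Connection, build: Callable[[], T]) -> None:
        self._conn = conn
        self._build = build
        self._seen: Optional[int] = None
        self._value: Optional[T] = None

    def get(self) -> T:
        current = read_version(self._conn)
        if self._seen != current:
            self._value = self._build()
            self._seen = current
        return self._value  # type: ignore[return-value]
```

Compared with a reload token file, the version lives in the same database as the data, so any process that can read documents can also see that its cache is stale.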
- Release line normalized for release/02-2026 at v1.0.0.
- API transport split into bounded routers (health/rag/docs/index/openrouter) with composition-focused root wiring.
- Mutation orchestration moved to app-layer services and shared mutation ports reused by API and CLI.
- CLI reorganized by bounded contexts while preserving command surface.
- Runtime composition policy centralized for retriever/embedder/provider resolution across API, CLI, and scripts.
- Default Ollama model consolidated to lfm2.5-thinking and ingestion batching formalized via INGEST_BATCH_SIZE.
- Multi-store consistency hardening: dense embeddings are precomputed before SQL upserts to avoid SQL/vector drift on provider failures.
- Mutating operations now serialize under a shared cross-process write lock and fail closed on lock acquisition errors.
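The ordering described above (compute embeddings first, write afterwards) can be sketched as follows. The function name `upsert_consistently` and its callable parameters are hypothetical; the sketch only shows why a provider failure leaves both stores untouched, and it does not cover a failure between the two writes.

```python
from typing import Callable, List, Sequence


def upsert_consistently(
    texts: Sequence[str],
    embed: Callable[[Sequence[str]], List[List[float]]],
    write_sql: Callable[[Sequence[str]], None],
    write_vectors: Callable[[List[List[float]]], None],
) -> None:
    # Step 1: call the embedding provider before touching any store.
    # If this raises, nothing has been written yet.
    vectors = embed(texts)
    if len(vectors) != len(texts):
        raise ValueError("embedding count does not match document count")
    # Step 2: only now perform the writes.
    write_sql(texts)
    write_vectors(vectors)
```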
- FAISS/index consistency hardening: manifest preflight checks, stricter lock behavior, safer persistence and recovery paths.
- Ingestion correctness fixes: external_id prefix collision handling, symlink no-follow behavior, duplicate external_id validation, and robust unreadable-file handling.
- API hardening fixes: malformed OpenRouter responses mapped cleanly (502), stricter sampling parameter validation, and deterministic cache invalidation after mutation attempts.
- SQLite compatibility hardening: identity migration race tolerance and safer compatibility migrations for CLI/scripts.
- Runtime API-key enforcement for non-local exposure, including forwarded/proxied requests (X-Forwarded-*/Forwarded) with fail-closed behavior on ambiguity.
- Additional validation guards for generation/evaluation request parameters and non-local request handling.
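One way to read "fail closed on ambiguity" is that a request counts as local only when it comes from a loopback address and carries no proxy headers. The sketch below encodes that reading; the function name `api_key_required` and the exact header list are assumptions, and the real enforcement logic may apply different rules.

```python
import ipaddress
from typing import Mapping

# Headers that suggest the request passed through a proxy (assumed list).
FORWARD_HEADERS = ("forwarded", "x-forwarded-for", "x-forwarded-host", "x-forwarded-proto")


def api_key_required(client_host: str, headers: Mapping[str, str]) -> bool:
    """Return True unless the request is unambiguously local."""
    lowered = {name.lower() for name in headers}
    if any(name in lowered for name in FORWARD_HEADERS):
        # Proxied request: the original client is unknown, so fail closed.
        return True
    try:
        return not ipaddress.ip_address(client_host).is_loopback
    except ValueError:
        # Not a parseable IP (hostname, empty string): treat as non-local.
        return True
```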
- CI/security posture hardened (workflow gate tightening, pinned actions, and security-check workflow improvements).
- Ingestion batching optimizations to reduce write-lock and upsert churn.
- Sparse retrieval hot-path optimization via in-memory document caching and reduced duplicate SQL loads.
- Reduced ingestion overhead by reusing file-format detection results and precompiling whitespace cleanup regexes.
- Removal of root shims and stricter app/core module boundaries.
- Typed cross-layer provider errors and centralized HTTP error mapping.
- Consolidated dense upsert/delete consistency flow and shared locking/client helpers.
- RAG service invalidation refactored from file token strategy to DB-backed versioning (system_state).
- Updated architecture and usage guides for bounded routers/services, new CLI/API mutation flows, eval workflow, and observability.
- Documented manifest drift behavior, ingestion dedup/chunking strategy, delete-by-external-id semantics, and contributor verification gates.
- Added/updated operational notes for rebuild/delete maintenance and release stabilization roadmap.
- Upgrading from 0.1.x: run app/CLI once per SQLite database so compatibility migrations can add identity and consistency fields.
- Dense/hybrid mode now depends on manifest integrity (index_manifest.json); if readiness reports drift/corruption, run index rebuild.
- Delete semantics changed: deleting by external_id creates tombstones and blocks future re-ingest/upsert of those identities.
- Ingestion dedup now uses chunk_dedup_sha256 + INGEST_CHUNKER_VERSION; changing chunker version intentionally creates new chunk identities.
- Review production env before rollout: set API_KEY for non-local binds and tune ingestion concurrency with INGEST_BATCH_SIZE.
- Release tag for this cut: v1.0.0 on release/02-2026.
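To show why a chunker version change produces new chunk identities, here is a sketch that mixes the version into a SHA-256 digest of the chunk text. How the project actually combines chunk_dedup_sha256 with INGEST_CHUNKER_VERSION is not stated above, so the function `chunk_identity`, the whitespace normalization, and the separator byte are all assumptions.

```python
import hashlib


def chunk_identity(text: str, chunker_version: str) -> str:
    """Hypothetical chunk identity: SHA-256 over version plus normalized text."""
    normalized = " ".join(text.split())  # collapse runs of whitespace
    payload = f"{chunker_version}\x00{normalized}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

With this scheme, re-ingesting unchanged text under the same version is a no-op for dedup, while bumping the version forces every chunk to be treated as new.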
- Optional API key auth via API_KEY (clients must send X-API-Key) to protect /api/* and /metrics.
- Production-safe CORS allowlist via CORS_ALLOW_ORIGINS when DEBUG=false.
- Metrics: low-cardinality Prometheus path labels to prevent time-series explosion on dynamic/404 paths.
- Cross-worker cache invalidation for the cached RAG service via a reload token in the data directory.
- FAISS persistence: ID map is now JSON (id_map.json) with atomic writes and best-effort locks; unsafe pickle maps are refused.
- API: request size limits for key endpoints to reduce DoS/cost-amplification risk.
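The atomic JSON write mentioned for id_map.json is commonly done by writing to a temporary file in the same directory and renaming it over the target. The sketch below shows that general technique; the function names and the assumed map shape (string keys to integer ids) are illustrative, not the project's actual format.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, payload: dict) -> None:
    """Write JSON so readers see either the old file or the new one, never a partial file."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the rename
        os.replace(tmp_name, path)  # atomic when both paths share a filesystem
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise


def load_id_map(path: Path) -> dict:
    """Load the map and reject anything that is not a str -> int object (assumed shape)."""
    data = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(data, dict) or not all(
        isinstance(k, str) and isinstance(v, int) for k, v in data.items()
    ):
        raise ValueError("unexpected id map format")
    return data
```

JSON also avoids the code-execution risk of unpickling an untrusted file, which is consistent with refusing pickle maps.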
- LangChain loaders integration via LangChainLoader adapter implementing LoaderPort.
- Optional extras group loaders with langchain-community and trafilatura in pyproject.toml.
- Optional docs site scaffold (mkdocs.yml + docs/index.md).
- API: reset cached RAG service after ingestion; make readiness fail when dense index/id-map are missing.
- OpenAI: avoid embeddings calls for empty input; generator requires API key.
- FAISS: validate id_map.json format; guard ids/embeddings length mismatch.
- Settings: avoid side effects at import-time; create data dir at startup/scripts.
- Align package name and defaults (Ollama model, coverage instructions, config source of truth).
- Update architecture doc to match current ports/factory and list extra API endpoints.
- Prefer Ruff (ruff check + ruff format) as the primary formatter/linter.
- CI: add ruff format --check, set UV_CACHE_DIR, align Docker tag with compose.
- Dockerfile: production stage now reuses the installed project from the deps stage (avoids rebuilding in the final image).