Berlin • AI systems researcher focused on production-grade ML/GenAI: evaluation-first design, reliable serving, and real-world robustness.
[email protected] • linkedin.com/in/khaled777b • github.com/wired777b
I research and build AI systems that remain reliable in real environments: shifting data, messy inputs, latency constraints, and hard evaluation questions.
My north star is “measurable intelligence”: evaluation harnesses, regression gates, and observability that make model behavior auditable over time.
- 🧪 Evaluation-first ML/GenAI: offline metrics + regression suites, task-specific checks, and monitoring loops.
- 🧠 RAG systems: hybrid retrieval, indexing strategies, reranking, grounding, and quality/safety guardrails.
- 🧬 Fine-tuning when justified: 7B+ class models, dataset curation, and careful error analysis (not vibes).
- 🏗️ MLOps & serving: reproducible pipelines, versioning, rollout patterns, multi-model serving.
Modeling • RAG / Retrieval • Serving / MLOps • APIs / Streaming / Data • Observability
- 🤖 AI systems: evaluation harnesses, model/retrieval regression gates, and monitoring that catches drift early.
- 📚 RAG platforms: hybrid retrieval, chunking/index strategies, reranking, grounding checks, and cost/latency controls.
- ⚙️ Serving & pipelines: cloud-native deployments, safe rollouts, multi-tenant patterns when needed.
- 🛰️ Data reality: streaming ingestion, replayability, idempotency, schema evolution.
- 🧪 Evaluation over opinions; metrics + datasets are the contract.
- 🔭 Observability is non-negotiable; debugging should be fast and boring.
- 🧱 Maintainability is velocity; clean boundaries beat cleverness.
- ⚖️ Reliability is designed; failure modes should be predictable.
🧫 Current project: gut health (microbiome + nutrition)
I’m applying the same AI-systems discipline (evaluation, guardrails, and reliability) to gut-health use cases where data is noisy and “sounds plausible” is not enough.
🌍 Quick facts
- Berlin-based.
- Languages: English • French • German • Spanish
- Interests: AI systems, GenAI/RAG, MLOps, streaming platforms, computer vision, open source.
If you’re building AI systems that must hold up in production, let’s connect.

