JetXu-LLM/README.md

Hi, I'm Jet Xu.

Turning private work files and code into reliable, evidence-backed context for AI.

Systems Architect | 15+ years building mission-critical infrastructure | AI Harnessing & Context Engineering

DocMason | Blog | LinkedIn | Email

Reasoning is improving fast. Reliable context is still the bottleneck.

  • Now: building DocMason, a local-first, evidence-first knowledge base for AI-assisted deep research over private work files.
  • Before: built code-intelligence systems across llama-github, LlamaPReview, and repo-graph-rag.
  • Direction: the Mason ecosystem—moving from deep document analysis to generating native, consulting-grade presentations.

Why This Path

In more than 15 years of architecting mission-critical systems, I have seen the same failure mode recur: in high-stakes environments, being "almost right" is useless. The bottleneck to reliable output, whether from humans or AI, is rarely raw reasoning capacity. It is context precision. That constraint drives everything I build.

Current Focus

DocMason is my current open-source focus: a local-first, provenance-first knowledge base for AI-assisted deep research over private work files. It is not a document chatbot. It compiles unstructured artifacts into knowledge infrastructure that agents can actually use. Its native operating pattern is simple: the repo is the app, and Codex is the runtime.

Core architectural priorities:

  • Deterministic ingestion: parsing PDFs, decks, spreadsheets, emails, and repository-native text without silent failures.
  • Reliable outputs: provenance-first retrieval instead of vague, hallucination-prone document chat.
  • Actionable output: extending the Mason ecosystem beyond extraction. The next step is a deterministic pipeline that turns deep document analysis directly into native, consulting-grade presentations (PPTX) for serious white-collar work.
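To make the first two priorities concrete, here is a minimal Python sketch. It is not DocMason's actual code: every name in it (`Evidence`, `IngestError`, `ingest_text`, `search`) is hypothetical and chosen only for illustration. It assumes plain-text input and shows two ideas: ingestion that raises instead of silently skipping an unreadable input, and retrieval that returns each hit together with its source and location.

```python
from __future__ import annotations

from dataclasses import dataclass


class IngestError(Exception):
    """Raised when an input yields no text; nothing is skipped silently."""


@dataclass(frozen=True)
class Evidence:
    source: str  # file the text came from (hypothetical path)
    line: int    # 1-based line number inside that file
    text: str    # the extracted text itself


def ingest_text(source: str, content: str) -> list[Evidence]:
    """Split plain text into evidence units, failing loudly on empty input."""
    if not content.strip():
        raise IngestError(f"{source}: no extractable text")
    return [
        Evidence(source, number, line.strip())
        for number, line in enumerate(content.splitlines(), start=1)
        if line.strip()
    ]


def search(index: list[Evidence], term: str) -> list[Evidence]:
    """Naive substring match; every hit keeps its provenance."""
    needle = term.lower()
    return [ev for ev in index if needle in ev.text.lower()]


index = ingest_text("notes/q3.txt", "Revenue grew 12%.\n\nChurn fell to 3%.")
hits = search(index, "churn")
```

Because each `Evidence` record carries `source` and `line`, an answer built from `hits` can cite exactly where its claim came from, which is the property that separates provenance-first retrieval from free-form document chat.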

The Foundation

I came to document intelligence through code intelligence.

  • llama-github: the retrieval substrate, built to give LLMs GitHub-native context instead of raw repository dumps.
  • LlamaPReview: field validation for that thesis. It achieved a measured 61% signal-to-noise ratio in AI code review across 4,000+ active repositories (35K+ combined stars).
  • repo-graph-rag: the Code Mesh research artifact, exploring deterministic repository graphs and traversal-first retrieval.
  • llamapreview-context-research: formalizing the exact failure mode of Context Instability.

This path started with helping AI understand code diffs, but led to a broader conclusion: the real computing frontier is shifting toward understanding full knowledge environments and generating high-stakes output from them. Code Mesh was the logical end of one inquiry, but not the final product surface.

The Pivot

By late 2025, it was clear that code review would not remain the terminal surface of AI engineering. As vibe coding accelerated, the scarce problem was no longer commenting on diffs, but helping agents understand entire working environments to produce artifacts people could actually use—like generating top-tier consulting decks directly from raw knowledge bases. This is why my focus shifted from code intelligence to document intelligence, and ultimately toward visual output systems.

Systemizing "Vibe Coding"

Whether building traditional software or complex multi-agent systems, open-ended "vibe coding" hits a scaling wall. The bottleneck isn't generating code; it is preventing architectural collapse as AI-driven mutations accumulate.

To solve this, I formalized a universal paradigm for AI harnessing engineering—the Dao / Fa / Qi / Shu of agentic coding. It shifts AI from an open-ended conversational copilot to a strictly harnessed actor within a deterministic system:

  • Dao (Direction & Value): Defining the invariant product boundaries. Without Dao, AI optimizes for local illusions of progress, building features that demo well but corrupt the long-term architecture.
  • Fa (Runtime Law & State): Governing identity, state transitions, and truth surfaces. Without Fa, AI silently hallucinates state, conflating generated projections with canonical authored truth.
  • Qi (Machinery & Control): The actual subsystems (controllers, commit barriers, projection layers) that enforce the laws. Without Qi, the rules only exist on paper, and the system relies on human vigilance to prevent AI drift.
  • Shu (Execution & Phasing): The deterministic sequence of implementation and validation. Without Shu, AI coding devolves into an endless loop of patching symptoms instead of shipping structural phases.
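As one illustration of how Fa and Qi relate, the sketch below encodes a single law ("generated projections never overwrite canonical authored truth") and a small piece of machinery that enforces it. This is an assumption-laden example, not the methodology's reference implementation: `Origin`, `Change`, `BarrierViolation`, `commit_barrier`, and the `canonical/` path prefix are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    AUTHORED = "authored"    # canonical truth written by a human owner
    GENERATED = "generated"  # projection produced by an agent


@dataclass(frozen=True)
class Change:
    path: str
    origin: Origin


class BarrierViolation(Exception):
    """Raised when a change breaks the rule the barrier enforces."""


def commit_barrier(changes, canonical_prefix="canonical/"):
    """Accept the batch only if no generated change touches canonical paths."""
    for change in changes:
        if (
            change.origin is Origin.GENERATED
            and change.path.startswith(canonical_prefix)
        ):
            raise BarrierViolation(
                f"generated change may not write {change.path}"
            )
    return list(changes)


# A batch that respects the rule passes through unchanged.
accepted = commit_barrier([
    Change("canonical/spec.md", Origin.AUTHORED),
    Change("projections/summary.md", Origin.GENERATED),
])
```

The point of the design is that the rule lives in code that runs on every batch, so enforcement does not depend on a reviewer noticing that an agent edited the wrong file.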

(I am currently documenting this methodology. A deep dive into building sustainable agent operating surfaces is coming soon to my Blog.)

Selected Writing

Connect

I build systems that move agents from merely reading documents to executing serious knowledge work end-to-end.

Pinned

  1. DocMason (Public)

    DocMason is a repo-native agent app for deep research over private work files. It builds a local, evidence-first knowledge base with provenance. The repo is the app. Codex is the runtime.

    Python 1

  2. llama-github (Public)

    Llama-github is an open-source Python library that empowers LLM Chatbots, AI Agents, and Auto-dev Solutions to conduct Agentic RAG from actively selected GitHub public projects. It Augments through…

    Python 320 22

  3. llamapreview-context-research (Public)

    Research artifacts exploring Search-based vs. Agentic RAG strategies for AI Code Review. A deep dive into solving the "Context Instability" problem in LLM-based software engineering, analyzing trad…

    Python 3 1

  4. repo-graph-rag (Public)

    Code Mesh builds deterministic, evidence-backed repository knowledge graphs for Graph RAG, code understanding, and reproducible code intelligence.

    Python 1