LLMSystems

Learning LLM Systems by Building

This organization collects the projects I have built while learning about Large Language Models (LLMs), retrieval systems, and AI infrastructure.

Instead of only studying theory, I try to learn by building real systems, from inference servers to RAG pipelines and agent harness frameworks.

What I'm Exploring

How agent harness frameworks work (tools / memory / planning / orchestration)
How to serve models efficiently (GPU / TensorRT / batching)
How to route and manage multiple LLMs
How retrieval works (dense / sparse / hybrid / multi-vector)
How to evaluate LLM outputs and reduce hallucination
How to design practical LLM applications

Project Overview

These projects are not perfect or production-ready — they reflect my learning process and experiments.

Agent

picobot
A small, clear, and extensible multi-user web agent that can chat, call tools, operate a workspace, browse the web, and search for information, with each conversation running in an isolated sandbox.

Inference & Serving

TensorRT Inference Server
Exploring high-performance model serving and GPU optimization.

Routing

LLM Router Server
Learning how to route requests across multiple models with load balancing.
LLM Router Server Dashboard
A one-stop platform for managing and monitoring LLMs.
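
To make "routing with load balancing" concrete, here is a minimal sketch of one common strategy: send each request to the backend with the fewest in-flight requests. This is an illustration only, not code from LLM Router Server; `Backend`, `LeastLoadedRouter`, and the model names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """A hypothetical model endpoint with a count of requests in progress."""
    name: str
    in_flight: int = 0


class LeastLoadedRouter:
    """Pick the backend that currently has the fewest in-flight requests."""

    def __init__(self, backends):
        self.backends = list(backends)

    def acquire(self):
        # min() returns the first backend among ties, so order acts as a tiebreaker.
        backend = min(self.backends, key=lambda b: b.in_flight)
        backend.in_flight += 1
        return backend

    def release(self, backend):
        # Call this when the request finishes (successfully or not).
        backend.in_flight -= 1


router = LeastLoadedRouter([Backend("model-a"), Backend("model-b")])
first = router.acquire()    # both idle, so the first listed backend is chosen
second = router.acquire()   # model-a is busy, so model-b is chosen
router.release(first)       # model-a finishes its request
third = router.acquire()    # model-a is idle again and is chosen
```

A real router would also need health checks, timeouts, and thread safety; those are left out to keep the idea visible.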

Retrieval & RAG

Tiny-RAGFlow
A lightweight RAG framework to understand hybrid retrieval and reranking.
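
As a rough illustration of hybrid retrieval, the sketch below merges a dense ranking and a sparse ranking with reciprocal rank fusion (RRF). This is one common fusion method and an assumption on my part, not necessarily what Tiny-RAGFlow does; the function name, the document ids, and the constant `k=60` are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    so documents ranked highly by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a dense (embedding) and a sparse (keyword) retriever.
dense = ["d2", "d1", "d3"]
sparse = ["d1", "d4", "d2"]
fused = reciprocal_rank_fusion([dense, sparse])
print(fused)
```

The fused list would then typically be passed to a reranker, which rescores the top candidates against the query.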

Tools

LLM Tools
A unified interface for interacting with LLMs, embeddings, and rerankers.

Data Processing

SEC-10-K-Structured-Extraction
Parses SEC EDGAR Form 10-K annual reports into standardized JSON, automatically identifying the content and status of every Item.
SEC-10-K-Structured-Extraction-Web
Parses SEC EDGAR Form 10-K annual reports into standardized JSON, automatically identifying the content and status of every Item.
file2md
Converting different file formats into Markdown for downstream LLM usage.

Evaluation

llm-evals
Experimenting with LLM evaluation and LLM-as-a-judge approaches.

Research Exploration

Hallucination Mitigation (Behavior RL)
Trying to reduce hallucination in LLMs.

ML + Database

ML2SQL
Exploring how ML models can run directly inside databases using SQL.

Why This Exists

I believe the best way to understand LLM systems is to build them piece by piece.

Each repository focuses on a different part of the stack, and together they form a rough picture of how modern LLM systems work.

Still Learning

This is an ongoing journey.
Many things are incomplete, naive, or experimental — and that’s intentional.

If you’re also learning, feel free to explore, use, or build on top of these projects.

Contact

If you have any questions, ideas, or just want to chat about LLMs, feel free to:

Open an issue
Or reach out via email

[email protected]
