Learning LLM Systems by Building

Introduction

This organization collects the projects I have built while learning about Large Language Models (LLMs), retrieval systems, and AI infrastructure.

Instead of only studying theory, I try to learn by building real systems, from inference servers to RAG pipelines and agent harness frameworks.


What I'm Exploring

  • How agent harness frameworks work (tools / memory / planning / orchestration)
  • How to serve models efficiently (GPU / TensorRT / batching)
  • How to route and manage multiple LLMs
  • How retrieval works (dense / sparse / hybrid / multi-vector)
  • How to evaluate LLM outputs and reduce hallucination
  • How to design practical LLM applications

Project Overview

These projects are not perfect or production-ready — they reflect my learning process and experiments.

Agent

  • picobot
    A small, clear, and extensible multi-user web agent that can chat, call tools, operate a workspace, browse the web, and search for information, with each conversation running in an isolated sandbox.

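Inference & Serving

  • TensorrtServer
    A TensorRT-based inference server supporting fast inference for Embedding, Reranker, and NLI models.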

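Routing

  • LLM-Router-Server
    A routing service for multi-model deployments that manages and orchestrates multiple local LLM services.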

Retrieval & RAG

  • Tiny-RAGFlow
    A lightweight RAG framework to understand hybrid retrieval and reranking.
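
To make "hybrid retrieval" concrete, here is a minimal sketch of one common way to combine a dense and a sparse result list: reciprocal rank fusion (RRF). This is an illustration only, not Tiny-RAGFlow's actual code or API; the function name, the document ids, and the constant `k=60` are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    so documents ranked well by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id to stay deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense (embedding) and a sparse (keyword) retriever.
dense_hits = ["doc_b", "doc_a", "doc_c"]
sparse_hits = ["doc_a", "doc_d", "doc_b"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

A reranker would typically take the top few fused results and rescore them against the query with a heavier model.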

Tools

  • LLM Tools
    A unified interface for interacting with LLMs, embeddings, and rerankers.

Data Processing

  • SEC-10-K-Structured-Extraction
    Parses SEC EDGAR Form 10-K annual reports into standardized JSON, automatically identifying the content and status of every Item.
  • SEC-10-K-Structured-Extraction-Web
    Parses SEC EDGAR Form 10-K annual reports into standardized JSON, automatically identifying the content and status of every Item.
  • file2md
    Converts different file formats into Markdown for downstream LLM usage.

Evaluation

  • llm-evals
    Experimenting with LLM evaluation and LLM-as-a-judge approaches.

Research Exploration

ML + Database

  • ML2SQL
    Exploring how ML models can run directly inside databases using SQL.
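
As a rough illustration of the idea, a decision tree can be rendered as a nested SQL `CASE` expression, so the database evaluates the model row by row. The sketch below is not ML2SQL's implementation; the nested-dict tree format, the `tree_to_sql` helper, and the `customers` table with its `age` and `income` columns are all hypothetical.

```python
def tree_to_sql(node):
    """Recursively render a decision tree as a nested SQL CASE expression."""
    if "value" in node:  # leaf node: emit the predicted value
        return str(node["value"])
    left = tree_to_sql(node["left"])    # branch taken when feature <= threshold
    right = tree_to_sql(node["right"])  # branch taken otherwise
    return (
        f"CASE WHEN {node['feature']} <= {node['threshold']} "
        f"THEN {left} ELSE {right} END"
    )


# Hypothetical toy tree: predict 1 only for age > 30 and income > 50000.
toy_tree = {
    "feature": "age",
    "threshold": 30,
    "left": {"value": 0},
    "right": {
        "feature": "income",
        "threshold": 50000,
        "left": {"value": 0},
        "right": {"value": 1},
    },
}

query = f"SELECT id, {tree_to_sql(toy_tree)} AS prediction FROM customers"
print(query)
```

An ensemble such as a random forest or gradient-boosted model could, under the same assumptions, be expressed by combining one such expression per tree.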

Why This Exists

I believe the best way to understand LLM systems is to:

Build them piece by piece.

Each repository focuses on a different part of the stack, and together they form a rough picture of how modern LLM systems work.


Still Learning

This is an ongoing journey.
Many things are incomplete, naive, or experimental — and that’s intentional.

If you’re also learning, feel free to explore, use, or build on top of these projects.

Contact

If you have any questions, ideas, or just want to chat about LLMs, feel free to:

  • Open an issue
  • Or reach out via email

[email protected]

Pinned

  1. file2md (Public)

    file2md is a versatile tool for converting multiple file formats to Markdown.

    Python 5

  2. TensorrtServer (Public)

    A high-performance deep learning model inference server based on TensorRT, supporting fast inference for Embedding, Reranker, and NLI models.

    Python 4

  3. ML2SQL (Public)

    Compile tree-based machine learning models into SQL inference queries, enabling model predictions to run directly inside the database.

    Python 4

  4. LLM-Router-Server (Public)

    LLM Router Server is a high-performance routing service designed for multi-model deployment scenarios, used to uniformly manage and orchestrate multiple local Large Language Model (LLM) services, E…

    Python 5 1

  5. picobot (Public)

    A small, clear, and extensible multi-user Web Agent — chat, call tools, operate a workspace, browse the web, search for information, with each conversation running in an isolated sandbox.

    Python 2

