rmsnorm
Here are 10 public repositories matching this topic...
Efficient kernel for RMS normalization with fused operations; includes both forward and backward passes and is compatible with PyTorch.
Updated Jun 5, 2024 - Python
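As a rough reference for what these fused kernels compute, here is a minimal eager-mode PyTorch sketch of RMSNorm; the function name and `eps` value are illustrative and not taken from any repo above:

```python
import torch

def rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # RMSNorm: scale x by the reciprocal root-mean-square of its last
    # dimension, then apply a learned per-feature gain.
    # Unlike LayerNorm, there is no mean subtraction and no bias term.
    rms = torch.sqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return x / rms * weight

# Usage: normalize a (batch, seq, hidden) activation tensor.
x = torch.randn(2, 4, 8)
w = torch.ones(8)          # learned gain, initialized to 1
y = rms_norm(x, w)
```

A fused kernel computes the same result in one pass over memory instead of materializing the intermediate `rms` tensor, which is where the speedup over this naive version comes from.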
Simple, easy-to-understand PyTorch implementation of Large Language Models (GPT and LLaMA) from scratch with detailed steps. Implements: Byte-Pair Encoding tokenizer, Rotary Position Embedding (RoPE), SwiGLU, RMSNorm, Mixture of Experts (MoE). Tested on a Taylor Swift song-lyrics dataset.
Updated Nov 18, 2024 - Python
Simple character level Transformer
Updated May 27, 2024 - Jupyter Notebook
Optimized Fused RMSNorm implementation with CUDA. Features vectorized memory access (float4), warp-level reductions, and efficient backward pass for LLM training
Updated Dec 24, 2025 - Python
Nano versions of generative models, for fun. No SOTA here, nano first.
Updated Jul 27, 2025 - Jupyter Notebook
Build an LLM in PyTorch: BPE tokenizer, GPT-1/2 and LLaMA architectures, end-to-end training and inference.
Updated Feb 8, 2026 - Python
A from-scratch PyTorch LLM implementing Sparse Mixture-of-Experts (MoE) with Top-2 gating. Integrates modern Llama-3 components (RMSNorm, SwiGLU, RoPE, GQA) and a custom-coded Byte-Level BPE tokenizer. Pre-trained on a curated corpus of existential & dark philosophical literature.
Updated Jan 7, 2026 - Python
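For readers unfamiliar with the Llama-style components named above, here is a minimal PyTorch sketch of a SwiGLU feed-forward block; the class and weight names are illustrative assumptions, not code from this repo:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    """Gated feed-forward block used in Llama-style transformers (sketch)."""

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        # Three bias-free projections: gate, up, and down.
        self.w_gate = nn.Linear(dim, hidden, bias=False)
        self.w_up = nn.Linear(dim, hidden, bias=False)
        self.w_down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SiLU (Swish) activation on the gate branch, multiplied
        # elementwise with the up projection, then projected back down.
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))

# Usage: a (batch, dim) input passes through unchanged in shape.
ffn = SwiGLU(dim=8, hidden=16)
out = ffn(torch.randn(2, 8))
```

The gating is what distinguishes SwiGLU from a plain two-layer MLP: the SiLU-activated branch modulates the linear branch elementwise before the output projection.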
🚀 Build your own LLM easily with OpenLabLM, a lightweight, hackable codebase tailored for hobbyists using a single consumer GPU.
Updated Feb 15, 2026 - Python