Pull requests: openai/parameter-golf
Non-record: Emergent weight symmetry in QO projections + learnable SymMix
#1214 opened Apr 1, 2026 by gersh
5 tasks done
Record: Window Attention + Mixed Seq_Len Training, bpb 1.1108, eval at 6144 (5-seed mean)
#1212 opened Apr 1, 2026 by Gusanidas
Non-record: Custom tokenizer with web-content symbols + pre-tokenized dataset
#1210 opened Apr 1, 2026 by mikeapedia
4 tasks
Record: Full GPTQ + Score-First TTT + SLOT — val_bpb 1.1064 (3-seed mean)
#1209 opened Apr 1, 2026 by andrewbaggio1
5 tasks done
HYDRA-Ω: SLOT-Optimized Parameter-Efficient Language Model (WIP)
#1207 opened Apr 1, 2026 by RAVINDRA8008
Non-record: Universal Transformer (4h unlimited compute track)
#1206 opened Apr 1, 2026 by oneKn8
3 of 6 tasks
Non-record: Turbo-Muon + EngramLite(10240) + VE(8,9,10) — val_bpb 1.1431
#1205 opened Apr 1, 2026 by SergheiBrinza
Record: ParallelResiduals + MiniDepthRecurrence, 1.1063 BPB / 1.8679 nats, -0.0072 vs PR #1179, -0.0143 vs merged SOTA
#1204 opened Apr 1, 2026 by msisovic
Record: Unified Attention + FA3 + Legal TTT (val_bpb=1.1412, 3-seed)
#1202 opened Mar 31, 2026 by VirajDeshwal
Add non-record 16MB submission: Hybrid Sparse Diffusion 2H on 8xH100
#1198 opened Mar 31, 2026 by ymrohit
Non-record: Mamba-Inspired SSM Hybrid 3:1 (val_bpb 3.3168)
#1197 opened Mar 31, 2026 by dentity007
Non-record: LLM-JEPA — Joint Embedding Prediction (val_bpb 2.2020)
#1196 opened Mar 31, 2026 by dentity007
Non-record: Learning Adapters on Random Linear Maps (val_bpb 2.2017)
#1195 opened Mar 31, 2026 by dentity007
Non-record: Text Diffusion (MDLM) — Masked Discrete Diffusion (val_bpb 3.3801)
#1194 opened Mar 31, 2026 by dentity007
Non-record: Universal Transformer + Adaptive Density (val_bpb 1.4390)
#1193 opened Mar 31, 2026 by dentity007
Non-record: Fused Triton Megakernels — RMSNorm + LeakyReLU² (val_bpb 1.3560)
#1192 opened Mar 31, 2026 by dentity007
Non-record: H-Net Dynamic Chunking — Learned Tokenization Layer (val_bpb 1.3587)
#1191 opened Mar 31, 2026 by dentity007
3 tasks done
Non-record: 10L MLP3x + Muon | val_bpb=1.3365 | Single Colab GPU
#1190 opened Mar 31, 2026 by Durlabhkumarjha
Orchestrated 10L Int5 record stack, LeakyReLU² toggle, RunPod helpers
#1188 opened Mar 31, 2026 by rithvik-duddupudi
Non-record: Negative Results — Architecture, TTT Variants, Quantization, and N-gram Cache Illegality
#1186 opened Mar 31, 2026 by andrewbaggio1
4 tasks done
[10min_16mb] 0.9641 BPB: LeakyReLU² + Score-First TTT + N-gram Backoff Cache
#1185 opened Mar 31, 2026 by skoustav35
Record: Scylla + Full GPTQ + XSA-all + FA3 — val_bpb 0.9485 (3-seed mean)
#1184 opened Mar 31, 2026 by icryo
5 tasks done