FlashSinkhorn: IO-Aware Entropic Optimal Transport in PyTorch + Triton. Streaming Sinkhorn with O(nd) memory.
machine-learning gpu cuda pytorch triton optimal-transport sinkhorn flash-attention entropic-optimal-transport flashsinkhorn
Updated Mar 3, 2026 - Python
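
The "streaming Sinkhorn with O(nd) memory" idea in the description can be illustrated with a small dependency-free sketch. This is not the FlashSinkhorn code or its API: the function names (`streaming_softmin`, `sinkhorn_potentials`, `marginals`), the squared Euclidean cost, and the uniform weights are all assumptions made for illustration. It shows only the general technique of recomputing cost entries on the fly and folding them into a running log-sum-exp, so that the n x m cost matrix is never stored.

```python
import math


def sq_dist(a, b):
    # Squared Euclidean cost (an assumption; any pointwise cost would work).
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def streaming_softmin(points, others, other_pot, other_logw, eps):
    """For each p, return -eps * log sum_j exp(logw_j + (pot_j - C(p, q_j)) / eps).

    The sum is accumulated with an online log-sum-exp (running max plus
    rescaled running sum), so only O(1) extra state is kept per point.
    """
    out = []
    for p in points:
        running_max = -math.inf
        running_sum = 0.0
        for q, pot, logw in zip(others, other_pot, other_logw):
            z = logw + (pot - sq_dist(p, q)) / eps
            if z > running_max:
                # Rescale the accumulated sum to the new maximum.
                running_sum = running_sum * math.exp(running_max - z) + 1.0
                running_max = z
            else:
                running_sum += math.exp(z - running_max)
        out.append(-eps * (running_max + math.log(running_sum)))
    return out


def sinkhorn_potentials(x, y, eps=0.5, iters=200):
    # Log-domain Sinkhorn with uniform weights (hypothetical helper).
    n, m = len(x), len(y)
    log_a = [-math.log(n)] * n
    log_b = [-math.log(m)] * m
    f = [0.0] * n
    g = [0.0] * m
    for _ in range(iters):
        f = streaming_softmin(x, y, g, log_b, eps)
        g = streaming_softmin(y, x, f, log_a, eps)
    return f, g


def marginals(x, y, f, g, eps):
    # Row and column sums of the implied plan, accumulated entry by entry.
    n, m = len(x), len(y)
    rows = [0.0] * n
    cols = [0.0] * m
    for i in range(n):
        for j in range(m):
            p = math.exp(-math.log(n) - math.log(m)
                         + (f[i] + g[j] - sq_dist(x[i], y[j])) / eps)
            rows[i] += p
            cols[j] += p
    return rows, cols
```

Beyond the input points, the only arrays held are the two potential vectors, which is where the O(nd) footprint comes from. A GPU implementation would presumably process tiles of points per kernel rather than single entries, but the running-max rescaling step is the same one used in FlashAttention-style online softmax.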