
About us
This meetup is focused on AI Performance Engineering.
Upcoming events
11

OpenClaw/MoltBot/ClawdBot/MCP + NVFP4 Low Precision for AI System Optimizations
Online
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates [5 mins] by Chris Fregly and Antje Barth
Talk #1: OpenClaw/MoltBot/ClawdBot/MCP for GPU Kernel and AI System Optimizations [10 mins] by Chris Fregly
In this demo, Chris will show how to use both MCP and OpenClaw (formerly known by all the other names in the talk title!) to optimize GPU kernels and complete end-to-end AI systems.
Related Link: Github Repo for these tools: https://github.com/cfregly/ai-performance-engineering/
Talk #2: Unlocking NVFP4: Low Precision Numerics on NVIDIA Blackwell [30-45 mins] by Riccardo Mereu @ Verda.com
In this talk, Riccardo dives deep into low-precision numerics and model quantization algorithms on NVIDIA Blackwell, including NVFP4 (vs. MXFP4), per-block scaling, per-tensor scaling, mixed precision, and much more!
Related link: GPU Mode NVFP4 GEMM kernel competition blog post by Daniel Obolensky: https://obolensky.xyz/blog/nvfp4_gemm_kernel_explanation/
Zoom link: https://us02web.zoom.us/j/82308186562
Related Links
Github Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
149 attendees
NVIDIA GTC 2026 Conf Recap + Evolution of Flash Attention v1-v4 Optimizations
Online
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates by Chris Fregly and Antje Barth
Talk #1: NVIDIA GTC 2026 AI Conference Recap by Chris Fregly
In this talk, Chris will present the AI and systems highlights from the NVIDIA GTC 2026 conference (happening the prior week).
Conference registration link: https://www.nvidia.com/gtc/ (Use code GTC26-20 for 20% off!)
Talk #2: Evolution and Deep Dive into Flash Attention (v1-v4) for Transformers on NVIDIA GPUs by Seth Weidman @ Sentilink and Author of "Deep Learning from Scratch" @ O'Reilly
In this talk, Seth will break down the evolution of Flash Attention, an optimized and mechanically sympathetic implementation of the attention mechanism, which is fundamental to the Transformer architecture in modern LLMs.
Related links:
Blog: https://modal.com/blog/reverse-engineer-flash-attention-4
Github: https://github.com/Dao-AILab/flash-attention
Arxiv paper: https://arxiv.org/abs/2205.14135
Zoom link: https://us02web.zoom.us/j/82308186562
Related Links
Github Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
58 attendees
GPU, CUDA, and PyTorch Performance Optimizations
Online
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates by Chris Fregly and Antje Barth
Talk #1: GPU, PyTorch, and CUDA Performance Optimizations
Talk #2: GPU, PyTorch, and CUDA Performance Optimizations
Zoom link: https://us02web.zoom.us/j/82308186562
Related Links
Github Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
25 attendees
GPU, CUDA, and PyTorch Performance Optimizations
Online
Zoom link: https://us02web.zoom.us/j/82308186562
Talk #0: Introductions and Meetup Updates by Chris Fregly and Antje Barth
Talk #1: GPU, PyTorch, and CUDA Performance Optimizations
Talk #2: GPU, PyTorch, and CUDA Performance Optimizations
Zoom link: https://us02web.zoom.us/j/82308186562
Related Links
Github Repo: http://github.com/cfregly/ai-performance-engineering/
O'Reilly Book: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
YouTube: https://www.youtube.com/@AIPerformanceEngineering
Generative AI Free Course on DeepLearning.ai: https://bit.ly/gllm
17 attendees
Past events
369