Run any code on cloud GPUs with a single command. Just prefix your normal commands with gpu run.
```bash
python train.py          # local
gpu run python train.py  # remote GPU
```

- Simple - Prefix commands with gpu run, that's it
- Fast - Connection pooling, delta sync, real-time output streaming
- Cost-efficient - Auto-stops pods when idle, saving 60-98% on GPU costs (see the rough example below this list)
- Multi-cloud - RunPod, Vast.ai, local Docker
- Secure - Zero-trust encryption on supported providers
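To put the savings in rough numbers: at the RTX 4090 rate of ~$0.44/hr listed in the GPU table below, a pod left running 24/7 costs about $10.56/day, while a pod that auto-stops after roughly one hour of active use per day costs about $0.44/day, a saving of around 96%.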
```bash
# 1. Install GPU CLI
curl -fsSL https://gpu-cli.sh/install.sh | sh

# 2. Run your code on a remote GPU
gpu run python train.py
```

This repo includes a Claude Code plugin that supercharges GPU CLI with AI assistance. Describe what you want in plain English, and Claude generates complete, runnable GPU workflows.
| Skill | Description |
|---|---|
| gpu-workflow-creator | Transform natural language into complete GPU projects |
| gpu-ml-trainer | LLM fine-tuning, LoRA training, classifier training |
| gpu-inference-server | Set up vLLM, TGI, or custom inference APIs |
| gpu-media-processor | Whisper transcription, voice cloning, video generation |
| gpu-cost-optimizer | GPU selection advice and cost optimization |
| gpu-debugger | Debug failed runs, OOM errors, connectivity issues |
| Command | Description |
|---|---|
| /gpu-cli:gpu-create | Create a complete project from a description |
| /gpu-cli:gpu-optimize | Analyze and optimize your gpu.jsonc |
| /gpu-cli:gpu-debug | Debug a failed GPU run |
| /gpu-cli:gpu-quick | Quick-start common workflows |
Create a LoRA training project:
You: I want to train a LoRA on photos of my dog so I can generate images of it
Claude: [Generates complete project with gpu.jsonc, train.py, requirements.txt, README.md]
Set up a private LLM API:
You: Set up Llama 3.1 70B as a private ChatGPT-like API
Claude: [Generates vLLM server config with OpenAI-compatible endpoints]
Debug an error:
You: /gpu-cli:gpu-debug CUDA out of memory when running FLUX
Claude: [Analyzes error, suggests reducing batch size or upgrading to A100]
Optimize costs:
You: /gpu-cli:gpu-optimize
Claude: [Reviews gpu.jsonc, suggests RTX 4090 instead of A100 for your workload, saving 75%]
Ready-to-use templates for common AI/ML workflows:
| Template | GPU | Description |
|---|---|---|
| Ollama Models | RTX 4090 | Run LLMs with Ollama - includes Web UI and OpenAI-compatible API |
| vLLM Models | RTX 4090 | High-performance LLM inference with vLLM |
| Background Removal | RTX 4090 | Remove backgrounds from images using AI |
| CrewAI Stock Analysis | RTX 4090 | Multi-agent stock analysis with CrewAI + Ollama |
| HuggingFace Gradio | RTX 4090 | Run HuggingFace models with Gradio UI |
| Qwen Image Edit | RTX 4090 | Edit images using Qwen vision model |
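As a sketch of what running one of these templates looks like, here is an illustrative invocation for the vLLM Models case; the model name, port, and vLLM entrypoint below are assumptions for the example, not taken from the template itself:

```bash
# Launch an OpenAI-compatible vLLM server on the pod and forward port 8000 locally
gpu run -p 8000:8000 python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2.5-7B-Instruct --port 8000
```

While the run is active, the forwarded port makes the API reachable at http://localhost:8000 on your own machine.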
```bash
# Run a command on remote GPU
gpu run python script.py

# Run a server with port forwarding
gpu run -p 8188:8188 python server.py --listen 0.0.0.0

# Open a shell on the remote pod
gpu shell

# View running pods
gpu pods list

# Stop a pod
gpu stop

# Interactive dashboard
gpu dashboard
```
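As a quick sanity check for the port-forwarding example above (the URL assumes the 8188:8188 mapping shown there), the forwarded service should answer locally while the run is active:

```bash
# The remote server started with `gpu run -p 8188:8188 ...` is reachable
# on the local end of the forwarded port
curl http://localhost:8188
```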
Create a gpu.jsonc in your project:

```jsonc
{
  "$schema": "https://gpu-cli.sh/schema/v1/gpu.json",
  "project_id": "my-project",
  "provider": "runpod",

  // Sync outputs back to local machine
  "outputs": ["output/", "models/"],

  // GPU selection
  "gpu_type": "RTX 4090",
  "min_vram": 24,

  // Optional: Pre-download models
  "download": [
    {
      "strategy": "hf",
      "source": "black-forest-labs/FLUX.1-dev",
      "allow": "*.safetensors"
    }
  ],

  "environment": {
    "base_image": "runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04"
  }
}
```

For faster startup and persistent model storage, use RunPod Network Volumes. See the Network Volumes Guide for setup instructions.

| GPU | VRAM | Best For | Cost/hr |
|---|---|---|---|
| RTX 4090 | 24GB | Image generation, LoRA training | ~$0.44 |
| RTX 4080 | 16GB | SDXL, most workflows | ~$0.35 |
| A100 40GB | 40GB | 70B models, video generation | ~$1.29 |
| A100 80GB | 80GB | 70B+ models, large batch | ~$1.79 |
| H100 80GB | 80GB | Maximum performance | ~$3.99 |

- Network Volumes Guide - Persistent storage for models
- GPU CLI Docs - Full documentation

MIT