⚡ VibeHQ

🌐 Language: English | 繁體中文 | 日本語

Running 5 AI agents in parallel is easy.
Making them not break each other's code is the hard part.

VibeHQ adds contracts, task tracking, and idle-aware messaging to Claude Code, Codex & Gemini CLI — so they work like an actual engineering team, not 5 interns editing the same file.

The Problem Nobody Talks About

Every "multi-agent" tool lets you run multiple CLI agents in parallel. But parallel ≠ collaboration. Here's what actually happens when 5 agents build the same app:

| What Goes Wrong | Real Example from Our Logs |
| --- | --- |
| Schema conflicts — each agent invents its own JSON format | Frontend expects `{ data: [] }`, backend writes `{ results: [] }`, third agent creates its own copy |
| Orchestrator role drift — the PM starts writing code | PM applied 6 manual JS patches to fix integration bugs instead of coordinating |
| Ghost files — agents publish 43-byte stubs instead of real content | Agent writes full file via `share_file`, then puts `"See local file..."` in `publish_artifact`. Loop repeats for 68 minutes |
| Premature execution — agents start before dependencies are ready | Agent sees `QUEUED` task description, ignores the status, starts coding with hardcoded data |
| Silent failures — crashed agents produce no signal | Orchestrator waits 18 minutes for a response from a dead process |

These aren't edge cases. They're LLM-native behavioral patterns that reliably appear across model families. We documented 7 of them with full session logs.

📖 Read the full analysis: 7 LLM-Native Problems →

What VibeHQ Actually Does

VibeHQ is a teamwork protocol layer that sits on top of real CLI agents. Each agent stays a full Claude Code / Codex / Gemini process with all native features — VibeHQ adds the coordination they're missing:

| Problem | VibeHQ's Fix |
| --- | --- |
| Schema conflicts | Contract system — agents must sign API specs before coding begins |
| Role drift | Structured task lifecycle — `create → accept → in_progress → done` with required artifacts |
| Ghost files | Hub-side validation — rejects `publish_artifact` calls with stub content (<200 bytes) |
| Premature execution | Idle-aware queue — withholds task details until dependencies are ready |
| Silent failures | Heartbeat monitoring — auto-detects offline agents, notifies orchestrator |
| No quality check | Independent QA — separate agent validates data against source docs |
| No post-mortem | 13 automated detection rules — analyzes session logs for failure patterns |
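To make the task lifecycle row concrete, here is a minimal sketch of a strict `create → accept → in_progress → done` state machine that refuses to finish a task with no artifacts. Only the four state names come from this README; the `Task` shape, `NEXT_STATE`, and `advance` are hypothetical names for illustration, not VibeHQ's actual code.

```typescript
// Hypothetical sketch: the state names come from the README, everything else is assumed.
type TaskState = "create" | "accept" | "in_progress" | "done";

interface Task {
  id: string;
  state: TaskState;
  artifacts: string[]; // names of artifacts published for this task
}

// Each state may only advance to the single next state in the lifecycle.
const NEXT_STATE: Record<TaskState, TaskState | null> = {
  create: "accept",
  accept: "in_progress",
  in_progress: "done",
  done: null,
};

// Returns a copy of the task in the new state, or throws on an illegal move.
function advance(task: Task, to: TaskState): Task {
  if (NEXT_STATE[task.state] !== to) {
    throw new Error(`Illegal transition: ${task.state} -> ${to}`);
  }
  // "Required artifacts": a task cannot be marked done with nothing published.
  if (to === "done" && task.artifacts.length === 0) {
    throw new Error("Cannot complete a task without at least one artifact");
  }
  return { ...task, state: to };
}
```

Skipping a step (for example `create` straight to `in_progress`) throws, which is the property that keeps an orchestrator from silently doing a worker's job.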

Self-Improving Coordination: The Framework That Debugs Itself

VibeHQ doesn't just coordinate agents — it analyzes its own failures and writes code to fix them. Fully automated, zero human intervention.

We built a closed-loop system: run a benchmark → analyze the logs → /optimize-protocol reads the analysis and implements real code changes → rebuild → run again and measure:

┌─────────────┐     ┌──────────────────┐     ┌───────────────────┐
│  Benchmark   │────▶│  vibehq-analyze   │────▶│ /optimize-protocol│
│  (run team)  │     │  --with-llm       │     │   (Claude skill)  │
└─────────────┘     └──────────────────┘     └───────────────────┘
       ▲                                              │
       │              writes real code changes        │
       └──────────────────────────────────────────────┘

Benchmark Results: Todo App (V1 → V5, 4 agents)

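| Metric | V1 | V2 | V3 | V4 | V5 |
| --- | --- | --- | --- | --- | --- |
| Total Tokens | 7.2M | 3.9M | 14.6M | 15.0M | 5.7M |
| PM Tokens | 0.3M | 0.2M | 10.1M | 9.8M | 1.8M |
| PM % of Total | 4% | 5% | 69% | 65% | 32% |
| Turns | 233 | 164 | 326 | 308 | 216 |
| Duration | 47min | 13min | 10min | 9min | 14min |
| Flags (issues) | 4 | 3 | 5 | 3 | 0 |
| Context Bloat (PM) | 7.07x | 10.56x | 6.62x | 7.04x | 2.84x |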

Benchmark Results: Classroom Quiz (fully automated loop)

| Metric | V1 (Before) | V2 (After Loop) | Change |
| --- | --- | --- | --- |
| Total Tokens | 23.1M | 13.8M | -40% |
| PM Tokens | ~15.2M | ~1.3M | -91% |
| Turns | 460 | 353 | -23% |
| Flags | 14 | 3 | -79% |
| STUB_FILE | 8 | 0 | eliminated |
| Context Bloat (PM) | 7.87x | 2.84x | -64% |
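The Change column is the relative difference between the two runs, rounded to the nearest percent. A small sketch that reproduces it from the table values (`percentChange` and `classroomQuiz` are hypothetical names used here for illustration):

```typescript
// Relative change from `before` to `after`, as a whole-number percentage.
function percentChange(before: number, after: number): number {
  return Math.round(((after - before) / before) * 100);
}

// [before, after] pairs copied from the Classroom Quiz table above.
const classroomQuiz: Record<string, [number, number]> = {
  totalTokensM: [23.1, 13.8],
  pmTokensM: [15.2, 1.3],
  turns: [460, 353],
  flags: [14, 3],
  pmContextBloat: [7.87, 2.84],
};

// Maps each metric name to its percent change.
const changes: Record<string, number> = {};
for (const metric of Object.keys(classroomQuiz)) {
  const [before, after] = classroomQuiz[metric];
  changes[metric] = percentChange(before, after);
}
```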

What the system learned and built

| Iteration | Problem Found | What Was Built |
| --- | --- | --- |
| V1→V2 | Hub falsely kills agents during boot; PM writes code | Startup grace period (180s); role presets with tool bans |
| V2→V3 | Codex PM ignores prompt constraints (shell_command 4→42x) | `--disallowedTools` CLI enforcement; switched PM to Claude |
| V3→V4 | PM uses Glob to monitor workers; artifacts overwritten to 0 bytes | Expanded disallowed tools; 0-byte content rejection at MCP layer |
| V4→V5 | PM polling explodes (28x check_status); stubs pass validation | `McpRateLimiter` (5 calls/60s); `CODE_MIN` enforcement; post-completion quiesce |
| CQ V1→V2 | 8 stub files; PM 66% of tokens on polling | Same fixes applied automatically — stubs eliminated, tokens -40% |
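The 5 calls/60s limit in the V4→V5 row can be pictured as a sliding-window limiter keyed per agent and tool. The sketch below is an assumption about how such a limiter might work, not the real `McpRateLimiter`; the class name `SlidingWindowLimiter`, the key format, and the injected clock are all hypothetical.

```typescript
// Hypothetical sliding-window limiter; not VibeHQ's actual McpRateLimiter.
class SlidingWindowLimiter {
  private calls = new Map<string, number[]>(); // key -> timestamps of accepted calls
  private maxCalls: number;
  private windowMs: number;

  constructor(maxCalls: number, windowMs: number) {
    this.maxCalls = maxCalls;
    this.windowMs = windowMs;
  }

  // Records the call and returns true if `key` is under its limit at `nowMs`.
  tryAcquire(key: string, nowMs: number): boolean {
    // Keep only the calls that are still inside the window.
    const recent = (this.calls.get(key) ?? []).filter(
      (t) => nowMs - t < this.windowMs
    );
    const allowed = recent.length < this.maxCalls;
    if (allowed) recent.push(nowMs);
    this.calls.set(key, recent);
    return allowed;
  }
}

// 5 calls per 60 seconds, matching the numbers in the table above.
const limiter = new SlidingWindowLimiter(5, 60_000);
```

Passing the timestamp in (rather than reading the clock inside the class) keeps the limiter deterministic in tests. A rejected call is not recorded, so a polling agent cannot extend its own lockout.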

       23.1M ┤                         * CQ-V1
             │
       15.0M ┤               * V3  * V4
       13.8M ┤                            * CQ-V2
             │
        7.2M ┤  * V1
        5.7M ┤                                  * V5
        3.9M ┤      * V2
             │
           0 ┼──────────────────────────────────────
             V1   V2   V3   V4  CQ1  CQ2   V5

Key insight: Prompt constraints are suggestions. CLI-level enforcement is law. Agents adapt and route around soft limits — the fix must be architectural.

📖 Full blog post: Self-Improving Multi-Agent Coordination →

📱 Web Dashboard — Desktop & Mobile

Start agents on your PC, monitor from your phone.

Mobile

vibehq-app.mp4

Desktop

vibehq.mp4

🚀 Quick Start

git clone https://github.com/0x0funky/vibehq-hub.git
cd vibehq-hub && npm install
npm run build

Terminal (TUI)

vibehq

Interactive menu — select a team, configure agents, start. Everything runs in your terminal.

Web Dashboard

npm run build:web
vibehq-web

Open http://localhost:3100 — create a team, add agents, hit Start. Manage everything from a browser.

# With auth (recommended for LAN/mobile access)
VIBEHQ_AUTH=admin:secret vibehq-web

The server prints your LAN IP — open it on your phone and you're in.

🔧 20 MCP Tools

Every agent gets 20 collaboration tools auto-injected via Model Context Protocol:

Communication (6): ask_teammate, reply_to_team, post_update, get_team_updates, list_teammates, check_status

Tasks (5): create_task, accept_task, update_task, complete_task, list_tasks

Artifacts (5): publish_artifact, list_artifacts, share_file, read_shared_file, list_shared_files

Contracts (3): publish_contract, sign_contract, check_contract

System (1): get_hub_info
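As a rough mental model for the three contract tools, the sketch below gates coding on every required signer having signed. The function names mirror the tool names, but the `Contract` shape and the logic are assumptions for illustration, not the hub's real data model.

```typescript
// Hypothetical in-memory model of a contract; field names are assumed.
interface Contract {
  name: string;
  spec: string; // e.g. the agreed API response shape
  requiredSigners: string[];
  signedBy: Set<string>;
}

// Analogous to publish_contract: creates an unsigned contract.
function publishContract(name: string, spec: string, requiredSigners: string[]): Contract {
  return { name, spec, requiredSigners, signedBy: new Set<string>() };
}

// Analogous to sign_contract: only listed signers may sign.
function signContract(contract: Contract, agent: string): void {
  if (contract.requiredSigners.indexOf(agent) === -1) {
    throw new Error(`${agent} is not a required signer of ${contract.name}`);
  }
  contract.signedBy.add(agent);
}

// Analogous to check_contract: coding may begin only when everyone has signed.
function isFullySigned(contract: Contract): boolean {
  return contract.requiredSigners.every((agent) => contract.signedBy.has(agent));
}
```

With a contract such as `publishContract("todo-api", "{ data: [] }", ["frontend", "backend"])`, the frontend signing alone leaves `isFullySigned` false, which is what prevents one side from coding against a format the other never agreed to.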

🎬 Watch 7 agents collaborate in real-time →

MCP tools in action (videos)

List Teammates

list_teammate.mp4

Teammate Talk

Assign_task.mp4

Assign Task

Discuss_teammate.mp4

📊 Post-Run Analytics & Auto-Optimization

Analyze

vibehq-analyze ./data                        # Analyze session logs
vibehq-analyze --team my-team --with-llm     # Auto-resolve team logs + LLM insights
vibehq-analyze --team my-team --with-llm --save --run-id v1  # Save for optimization
vibehq-analyze compare v1 v2                 # Compare two runs side-by-side
vibehq-analyze history --last 10             # View past runs

13 automated detection rules: artifact regression, orchestrator role drift, stub files, task timeout, incomplete tasks, coordination overhead, unresponsive agents, zero artifacts, context bloat, duplicate artifacts, premature task accept, excessive MCP polling, task reassignment.

Skills: `/run-teamwork`, `/benchmark-loop` & `/optimize-protocol`

VibeHQ ships three skills. Skills work on both Claude Code and Codex CLI — same format, different directory.

Cross-Platform Skill Locations

| Platform | Project-level | User-level |
| --- | --- | --- |
| Claude Code | `.claude/skills/<name>/SKILL.md` | `~/.claude/skills/` |
| Codex CLI | `.agents/skills/<name>/SKILL.md` | `~/.codex/skills/` |

The SKILL.md format is an emerging cross-platform standard — same frontmatter (name, description), same markdown body. A skill created for one platform works on the other.

Setup

Claude Code — skills are already included in .claude/skills/. Just use them:

# In Claude Code, type:
/run-teamwork "Build an AI investment analysis platform"
/benchmark-loop "Build a todo app" --grade A
/optimize-protocol v1

Codex CLI — copy the skills to Codex's directory:

# Project-level (committed to repo)
mkdir -p .agents/skills
cp -r .claude/skills/run-teamwork .agents/skills/
cp -r .claude/skills/optimize-protocol .agents/skills/
cp -r .claude/skills/benchmark-loop .agents/skills/

# Or user-level (available in all projects)
mkdir -p ~/.codex/skills
cp -r .claude/skills/run-teamwork ~/.codex/skills/
cp -r .claude/skills/optimize-protocol ~/.codex/skills/
cp -r .claude/skills/benchmark-loop ~/.codex/skills/

Then in Codex CLI, invoke with /skills or type $ to mention a skill.

`/run-teamwork` — One-Shot Team Builder

Give it a project description — it designs the team, spawns agents, and builds it. No analysis, no loop.

/run-teamwork "Build an e-commerce site with payments and admin panel"

1. Analyzes the prompt to determine required domains and team size
2. Generates PM system prompt with research-first workflow (research before implementation)
3. Spawns agents in tmux (macOS/Linux) or Windows Terminal
4. Waits for all tasks to complete
5. Reports the output directory and file count

`/optimize-protocol` — Framework Engineer

Reads analysis data and writes real code fixes (not parameter tuning):

/optimize-protocol v1    # Read analysis for run v1, implement fixes

1. Loads current run + all previous optimization reports
2. Builds cross-run trend table (what's improving, what regressed, what's a side-effect)
3. Classifies each problem as NEW, RECURRING, or SIDE-EFFECT of a previous fix
4. Implements real TypeScript changes to the framework
5. Verifies build passes
6. Saves a detailed changelog to ~/.vibehq/analytics/optimizations/

`/benchmark-loop` — Autonomous Runner

Runs the full self-improving cycle automatically:

/benchmark-loop "Build a Todo app with REST API, React frontend, and WebSocket real-time updates"

1. Spawns a fresh team with a standardized project
2. Waits for the team to finish (heartbeat monitoring)
3. Analyzes session logs (13 rules + LLM grading)
4. Triggers /optimize-protocol to write code fixes
5. Rebuilds the framework (npx tsup)
6. Repeats with a new team — zero human intervention

Manual Step-by-Step (works with any CLI)

The underlying tools are regular CLI commands — no skills required:

# 1. Run a benchmark
vibehq start --team your-team

# 2. Analyze
vibehq-analyze --team your-team --with-llm --save --run-id v1

# 3. Auto-optimize (Claude Code / Codex skill)
/optimize-protocol v1

# 4. Run again, compare
vibehq start --team your-team
vibehq-analyze --team your-team --with-llm --save --run-id v2
vibehq-analyze compare v1 v2

All optimization reports are saved to ~/.vibehq/analytics/optimizations/ for tracking and auditing.

vibehq-analyze supports the native JSONL log formats of both Claude Code and Codex CLI.

📱 Remote Access

The web platform is accessible on your LAN by default. For external access:

⚠️ Always set VIBEHQ_AUTH before exposing remotely — the web UI gives full terminal access.

| Method | Best For |
| --- | --- |
| Tailscale | Personal use — private VPN, no config, free |
| Cloudflare Tunnel | Sharing — public URL behind Cloudflare, free |
| ngrok | Quick testing — `ngrok http 3100`, temporary URL |
| SSH Tunnel | VPS — `ssh -R 8080:localhost:3100 your-server` |

Tailscale (recommended): Install on PC + phone → sign in both → VIBEHQ_AUTH=admin:secret vibehq-web → open http://<tailscale-ip>:3100 on phone.

📝 Configuration

`vibehq.config.json`

{
  "teams": [{
    "name": "my-project",
    "hub": { "port": 3001 },
    "agents": [
      { "name": "Alex", "role": "Project Manager", "cli": "codex", "cwd": "D:\\project" },
      { "name": "Jordan", "role": "Frontend Engineer", "cli": "claude", "cwd": "D:\\project\\frontend",
        "dangerouslySkipPermissions": true, "additionalDirs": ["D:\\project\\shared"] }
    ]
  }]
}

| Field | Description |
| --- | --- |
| `name` | Agent display name (unique per team) |
| `role` | Role — auto-loads preset if no `systemPrompt` set |
| `cli` | `claude`, `codex`, or `gemini` |
| `cwd` | Working directory (isolated per agent) |
| `systemPrompt` | Custom prompt (overrides preset) |
| `dangerouslySkipPermissions` | Auto-approve Claude permissions |
| `additionalDirs` | Extra directories agent can access |

Built-in presets: Project Manager, Product Designer, Frontend Engineer, Backend Engineer, AI Engineer, QA Engineer
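For reference, the fields above can be written down as TypeScript types, with a small check for the two constraints the table states (unique agent names per team, and a supported `cli`). The interface names and `validateTeam` are hypothetical; which fields are optional is inferred from the example config, so treat this as a sketch rather than the framework's actual schema.

```typescript
// Hypothetical types inferred from the example config and field table.
interface AgentConfig {
  name: string;
  role: string;
  cli: string; // expected: "claude", "codex", or "gemini"
  cwd: string;
  systemPrompt?: string;
  dangerouslySkipPermissions?: boolean;
  additionalDirs?: string[];
}

interface TeamConfig {
  name: string;
  hub: { port: number };
  agents: AgentConfig[];
}

const SUPPORTED_CLIS: readonly string[] = ["claude", "codex", "gemini"];

// Returns a list of human-readable problems; an empty list means the team is valid.
function validateTeam(team: TeamConfig): string[] {
  const errors: string[] = [];
  const seen = new Set<string>();
  for (const agent of team.agents) {
    if (seen.has(agent.name)) {
      errors.push(`duplicate agent name: ${agent.name}`);
    }
    seen.add(agent.name);
    if (SUPPORTED_CLIS.indexOf(agent.cli) === -1) {
      errors.push(`unsupported cli for ${agent.name}: ${agent.cli}`);
    }
  }
  return errors;
}
```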

🛠 CLI Reference

vibehq              # Interactive TUI
vibehq-web          # Web platform (browser + mobile)
vibehq-hub          # Standalone hub server
vibehq-spawn        # Spawn single agent
vibehq-analyze      # Post-run analytics

Manual Spawn

vibehq-spawn --name "Jordan" --role "Frontend Engineer" \
  --team "my-team" --hub "ws://localhost:3001" \
  --skip-permissions --add-dir "/shared" -- claude

🏗 Architecture

┌──────────────────────────────────────────────────┐
│                   VibeHQ Hub                      │
│               (WebSocket Server)                  │
│  ┌────────┐ ┌──────────┐ ┌────────┐ ┌─────────┐ │
│  │ Tasks  │ │Artifacts │ │Contract│ │ Message │ │
│  │ Store  │ │ Registry │ │ Store  │ │  Queue  │ │
│  └────────┘ └──────────┘ └────────┘ └─────────┘ │
│  ┌──────────────────────────────────────────────┐│
│  │  Agent Registry — idle/working detection     ││
│  └──────────────────────────────────────────────┘│
└────────┬──────────┬──────────┬──────────┬────────┘
    ┌────▼───┐ ┌────▼───┐ ┌───▼────┐ ┌───▼────┐
    │ Claude │ │ Claude │ │ Codex  │ │ Claude │
    │  (FE)  │ │  (BE)  │ │  (PM)  │ │  (QA)  │
    │ 20 MCP │ │ 20 MCP │ │ 20 MCP │ │ 20 MCP │
    └────────┘ └────────┘ └────────┘ └────────┘
         ▲          ▲          ▲          ▲
         └──────────┴────┬─────┴──────────┘
                    ┌────▼─────────────┐
                    │  Web Dashboard   │
                    │ Desktop & Mobile │
                    └──────────────────┘
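The Message Queue and Agent Registry boxes in the diagram work together: messages for a busy agent are held and released once the agent is detected as idle. Below is a minimal in-memory sketch of that behavior. The class name, the `deliver` callback, and the two-state model are assumptions; the real hub detects idleness through a JSONL watcher and a PTY timeout, which this sketch does not attempt to model.

```typescript
type AgentState = "idle" | "working";

// Hypothetical per-agent queue: holds messages while the agent is working.
class IdleAwareQueue {
  private state: AgentState = "working";
  private pending: string[] = [];
  private deliver: (message: string) => void;

  constructor(deliver: (message: string) => void) {
    this.deliver = deliver;
  }

  // Delivers immediately to an idle agent, otherwise queues the message.
  send(message: string): void {
    if (this.state === "idle") {
      this.deliver(message);
    } else {
      this.pending.push(message);
    }
  }

  // Called when idle detection changes its verdict; flushes in FIFO order on idle.
  setState(state: AgentState): void {
    this.state = state;
    if (state === "idle") {
      const batch = this.pending;
      this.pending = [];
      for (const message of batch) this.deliver(message);
    }
  }
}
```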

Key design:

- Process isolation — each agent is a separate OS process. Crashes don't cascade.
- Contract-driven — specs must be signed before coding begins.
- Idle-aware queue — messages queue when busy, flush when idle (JSONL watcher + PTY timeout).
- State persistence — all data survives hub restarts (~/.vibehq/teams/<team>/hub-state.json).
- MCP-native — 20 purpose-built tools, type-safe, auto-configured per agent.
- Orchestrator enforcement — Claude PMs get --disallowedTools (CLI-level hard block on Bash/Write/Edit/Read/Glob); Codex PMs get --sandbox read-only.
- Content validation — MCP rejects 0-byte artifacts, stub patterns, and >80% size regressions at the tool level.
- Self-improving — analyze→optimize loop with cross-run trend tracking and automated changelogs.
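The content validation rules can be sketched as a single check that returns a rejection reason. The 200-character minimum, the "See local file" stub, and the 80% regression threshold come from this README; the function name, the exact pattern list, and the use of string length as a stand-in for byte size (equal for ASCII content) are assumptions.

```typescript
const MIN_SIZE = 200; // stub threshold mentioned in this README
const STUB_PATTERNS: RegExp[] = [/see local file/i]; // assumed pattern list

// Returns a rejection reason, or null if the artifact content is acceptable.
// `previousSize` is the size of the version being overwritten, if any.
function validateArtifact(content: string, previousSize?: number): string | null {
  const size = content.length; // approximation of byte size
  if (size === 0) return "empty content";
  if (STUB_PATTERNS.some((p) => p.test(content))) return "stub pattern";
  if (size < MIN_SIZE) return "content too short";
  // Reject when the new version shrinks by more than 80% (integer arithmetic).
  if (previousSize !== undefined && (previousSize - size) * 100 > previousSize * 80) {
    return "size regression";
  }
  return null;
}
```

Checking the stub pattern before the size floor gives the agent a more specific rejection reason for the pointer-style stubs described earlier.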

⚠️ Platform Support

| Feature | Windows | Mac | Linux |
| --- | --- | --- | --- |
| Web Platform | ✅ Tested | ✅ Should work | ✅ Should work |
| TUI | ✅ Tested | ✅ Tested | ⚠️ Untested |
| Hub + Spawn | ✅ Tested | ✅ Tested | ✅ Should work |
| JSONL Watcher | ✅ Tested | ✅ Tested | ⚠️ Path encoding |
| node-pty | ✅ Tested | ✅ Tested | ⚠️ Untested |

Mac: requires the Xcode command-line tools (xcode-select --install). If you see "posix_spawnp failed", make the node-pty spawn helper executable: chmod +x node_modules/node-pty/prebuilds/*/spawn-helper

Linux: requires build-essential and python3.

📁 Project Structure

agent-hub/
├── bin/                  # CLI entry points (start, spawn, hub, web, analyze)
├── src/
│   ├── hub/              # WebSocket hub, agent registry, message relay
│   ├── spawner/          # PTY manager, JSONL watcher, idle detection
│   ├── web/              # Express server, REST API, WebSocket handlers
│   ├── mcp/              # 20 MCP tools + hub-client bridge
│   ├── analyzer/         # Post-run analytics pipeline (13 rules)
│   ├── shared/           # TypeScript types
│   └── tui/              # Terminal UI screens + role presets
├── web/                  # React frontend (Vite + xterm.js)
├── blog/                 # Technical articles on LLM behavioral patterns
└── benchmarks/           # V1 vs V2 comparison reports

🤝 Contributing

PRs welcome. Modular architecture:

- New MCP tool? → src/mcp/tools/ + register in hub-client.ts
- New CLI? → detection in spawner.ts + MCP config in autoConfigureMcp()
- New widget? → web/src/components/ or src/tui/screens/

📄 License

MIT

𝕏 @0x0funky
