devverify is a Windows-focused, agent-native verification and investigation harness.
The goal is not just to run a test and print a report. The goal is to keep a live investigation session open so Codex, Claude Code, or a human developer can inspect the browser, services, logs, files, and future data stores step by step.
- Session directory under .devverify/sessions/<id>
- Timeline event log for every operation
- Service launcher with log capture and HTTP health checks
- Chrome CDP probes through a remote debugging port
- File grep probe
- Java/Spring/Vite project detection
- VS Code tasks.json and launch.json readers
- Thin scenario runner that starts services and leaves the session available for follow-up investigation
- HTTP probe steps, JSON assertions, run context interpolation, and report.json
- Data policy planning, preparation, evidence recording, and explicit cleanup
- Explicit cleanup commands for tmp files, browser storage/profile, and artifacts
Devverify is a CLI execution and evidence tool. The agent remains the developer and validation planner.
Expected agent flow:
- Read the project code, routes, API contracts, seed data, auth flow, and recent changes.
- Write down the validation path it intends to exercise.
- Encode that path as devverify commands or a temporary scenario config.
- Use devverify to start services, drive the browser, call HTTP endpoints, assert JSON, collect logs/network/console, and preserve evidence.
- Investigate failures by adding targeted probes, not by asking devverify to infer the business workflow.
Devverify should not contain project-specific business path planning. It provides composable primitives that make the agent's plan repeatable and inspectable.
Install the CLI first. Codex plugins and Claude Code skills only teach agents how to use the CLI; they do not replace the CLI executable.
From GitHub:
npm install -g git+https://github.com/fallindream/devverify.git
devverify doctor
For local development from a clone:
git clone https://github.com/fallindream/devverify.git
cd devverify
npm install
npm link
devverify doctor
Or run directly:
node ./bin/devverify.mjs doctor
The repository includes a Codex plugin at:
plugins/devverify/
  .codex-plugin/plugin.json
  skills/devverify/SKILL.md
.agents/plugins/marketplace.json
Install the marketplace:
codex plugin marketplace add https://github.com/fallindream/devverify
Then enable the devverify plugin in Codex if it is not already enabled. Start a new Codex thread after installing or updating the plugin, because skills are loaded when a thread starts.
For local plugin development from a clone:
codex plugin marketplace add D:\project\devverify
If you update the local plugin while developing, refresh the cachebuster and reinstall the marketplace:
python C:\Users\<you>\.codex\skills\.system\plugin-creator\scripts\update_plugin_cachebuster.py D:\project\devverify\plugins\devverify
codex plugin marketplace remove devverify-local
codex plugin marketplace add D:\project\devverify
The Codex plugin is intentionally CLI-only. It does not expose MCP tools; it tells Codex to inspect the target project, plan the validation path, and execute devverify commands.
The repository includes a Claude Code skill at:
agent/claude-skill/devverify/SKILL.md
Install it into a target project:
cd D:\project\YourApp
mkdir .claude\skills\devverify
Invoke-WebRequest `
  -Uri https://raw.githubusercontent.com/fallindream/devverify/main/agent/claude-skill/devverify/SKILL.md `
  -OutFile .claude\skills\devverify\SKILL.md
Or copy it from a local clone:
cd D:\project\YourApp
mkdir .claude\skills\devverify
Copy-Item D:\project\devverify\agent\claude-skill\devverify\SKILL.md .claude\skills\devverify\SKILL.md
Open a new Claude Code session in the target project. You can then ask Claude Code to use devverify, or invoke the skill by name if your Claude Code version supports slash skill invocation.
Update the Claude Code skill:
cd D:\project\YourApp
Invoke-WebRequest `
  -Uri https://raw.githubusercontent.com/fallindream/devverify/main/agent/claude-skill/devverify/SKILL.md `
  -OutFile .claude\skills\devverify\SKILL.md
The Claude Code skill is also CLI-only. The target machine still needs devverify installed and available on PATH.
When devverify runs inside a business project, its config and evidence live in that business project, not in the devverify tool repository:
<project>/devverify.config.mjs
<project>/.devverify/configs/<task>.config.mjs
<project>/.devverify/sessions/<session-id>/
Do not commit .devverify/; it may contain logs, screenshots, API responses, and database copies.
Early development is focused on Windows + VS Code + Java/Spring backend + Vite frontend projects.
devverify detect
devverify suggest config
devverify vscode list
devverify vscode tasks
devverify vscode launches
detect identifies the project shape. suggest config emits a draft devverify.config.mjs-style object with likely services, such as mvn spring-boot:run for Spring Boot or npm run dev / pnpm dev for Vite. VS Code commands read .vscode/tasks.json and .vscode/launch.json; a later extension can use VS Code APIs for real debug launches.
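As a rough illustration of what detection and config suggestion could look like, here is a minimal sketch. The marker files, the result shape, the lockfile heuristic, and the function names `detectProjectShape` and `suggestServices` are assumptions for illustration, not devverify's actual implementation; only the suggested commands come from the description above.

```javascript
// Hypothetical sketch: classify a project from the file names found at its root.
function detectProjectShape(fileNames) {
  const files = new Set(fileNames);
  const hasAny = (...names) => names.some((name) => files.has(name));
  const shape = { backend: null, frontend: null, vscode: files.has(".vscode") };

  // Assumption: a Maven or Gradle build file indicates a Java backend.
  if (hasAny("pom.xml")) shape.backend = "maven";
  else if (hasAny("build.gradle", "build.gradle.kts")) shape.backend = "gradle";

  // Assumption: a Vite config file indicates a Vite frontend.
  if (hasAny("vite.config.js", "vite.config.ts", "vite.config.mjs")) {
    shape.frontend = "vite";
  }
  return shape;
}

// Hypothetical sketch: turn a detected shape into draft service entries.
function suggestServices(shape, fileNames) {
  const services = {};
  // Assumption: a Maven project is treated as Spring Boot without further checks.
  if (shape.backend === "maven") {
    services.api = { command: "mvn spring-boot:run" };
  }
  if (shape.frontend === "vite") {
    // Assumption: a pnpm lockfile means the project prefers pnpm.
    const usesPnpm = fileNames.includes("pnpm-lock.yaml");
    services.web = { command: usesPnpm ? "pnpm dev" : "npm run dev" };
  }
  return services;
}
```

A draft produced this way is only a starting point; the agent is still expected to review it against the real project before running anything.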
Services may depend on other services and may be backed by a VS Code shell task:
services: {
  api: {
    command: "mvn spring-boot:run",
    health: { url: "http://127.0.0.1:8080/actuator/health" }
  },
  web: {
    vscodeTask: "vite-dev",
    dependsOn: ["api"],
    health: { url: "http://127.0.0.1:5173" }
  }
}
By default, devverify starts Chrome for you with a per-session isolated profile and --remote-debugging-port=9222:
devverify session start --name browser-debug --activate
devverify browser start
devverify browser tabs
The browser probes also start Chrome on demand if CDP is not reachable:
devverify browser tabs
devverify browser goto 0 "http://127.0.0.1:5173"
devverify browser fill 0 "[name=email]" "[email protected]"
devverify browser click 0 "text=Login"
devverify browser wait 0 "text=Dashboard"
devverify browser text 0 "#app"
devverify browser info 0
devverify browser console 0 --last 100
devverify browser network 0 --status ">=400"
devverify browser storage 0
devverify browser cookies 0
devverify browser eval 0 "document.title"
devverify browser html 0 "body"
devverify browser shot 0
devverify browser cdp 0 Runtime.evaluate "{\"expression\":\"location.href\",\"returnByValue\":true}"
Use an externally started Chrome only when you explicitly want to attach to an existing CDP port. The safer default is the isolated profile created under .devverify/sessions/<id>/browser/.
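The escaped JSON argument in the `browser cdp` example is easy to get wrong by hand. The sketch below builds it programmatically. `expression` and `returnByValue` are parameters of the CDP `Runtime.evaluate` method; the helper names are hypothetical, and the quoting assumes MSVC-style argument parsing (backslash-escaped double quotes), so other shells may need different quoting.

```javascript
// Build the JSON params string for a CDP Runtime.evaluate call.
function runtimeEvaluateParams(expression) {
  return JSON.stringify({ expression, returnByValue: true });
}

// Quote one command-line argument, assuming MSVC-style parsing rules.
function quoteArg(value) {
  // Double any backslashes that directly precede a double quote, then escape the quote.
  let escaped = value.replace(/(\\*)"/g, '$1$1\\"');
  // Double trailing backslashes so they do not escape the closing quote.
  escaped = escaped.replace(/(\\+)$/, "$1$1");
  return `"${escaped}"`;
}

const params = runtimeEvaluateParams("location.href");
console.log(`devverify browser cdp 0 Runtime.evaluate ${quoteArg(params)}`);
```

For the `location.href` expression this prints the same command line as the `browser cdp` example above.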
devverify session start --name login-debug --activate
devverify browser ensure
devverify service start web
devverify browser tabs
devverify timeline show --last 20
devverify service logs web --last 80
devverify service logs web --grep "ERROR|Exception"
devverify files grep . "Unhandled"
Every command writes to the active session where relevant, so an agent can keep investigating the same failure scene.
For one-off verification, prefer a task-specific config under .devverify/configs/ and pass it explicitly:
devverify run login --config .devverify/configs/login-debug.config.mjs
devverify report current
This keeps the repository's default devverify.config.mjs stable while still making the agent-authored validation path reproducible.
Devverify does not assume databases can be copied, rebuilt, or cleaned safely. The agent must inspect the project and declare a data policy before write-capable validation.
Commands:
devverify data inspect
devverify data plan --config .devverify/configs/task.config.mjs
devverify data prepare --config .devverify/configs/task.config.mjs
devverify data cleanup --session current
devverify cleanup --session current --data
Supported policies:
none
use-source-readonly
use-source-with-consent
copy-file-db
shadow-database
shadow-schema
tagged-shared-db
external-prepared-env
custom-hooks
copy-file-db is only one policy for file databases such as H2, SQLite, or DuckDB:
data: {
  policy: "copy-file-db",
  reason: "Validation writes intermediate results; use a session copy to avoid polluting the demo database",
  source: "schedule-server/data/schedule.mv.db",
  target: "${DEVVERIFY_TMP_DIR}/schedule.mv.db",
  cleanup: "manual"
}
For MySQL/PostgreSQL, prefer an explicit prepared environment, shared test DB with run markers, or project-owned hooks:
data: {
  policy: "tagged-shared-db",
  marker: "${DEVVERIFY_RUN_ID}",
  cleanup: "manual",
  reason: "Shared test database, but writes from this run must be traceable by runId"
}
data: {
  policy: "custom-hooks",
  prepare: [
    "npm run db:test:prepare -- --run ${DEVVERIFY_RUN_ID}"
  ],
  cleanupHooks: [
    "npm run db:test:cleanup -- --run ${DEVVERIFY_RUN_ID}"
  ],
  cleanup: "manual"
}
shadow-database and shadow-schema are recorded policies in the current MVP; use prepare and cleanupHooks for the actual project-specific commands. Devverify records the policy and hook output in report.json and timeline.jsonl. Cleanup is never automatic on success.
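Before running `data prepare`, an agent may want to sanity-check the data block it has written. The following is a minimal sketch of such a pre-flight check. The policy names are taken from the supported list above; the function `checkDataPolicy` and the specific rules it enforces are assumptions for illustration, not checks devverify is known to perform.

```javascript
// Policy names copied from the supported-policies list.
const SUPPORTED_POLICIES = [
  "none", "use-source-readonly", "use-source-with-consent", "copy-file-db",
  "shadow-database", "shadow-schema", "tagged-shared-db",
  "external-prepared-env", "custom-hooks",
];

// Hypothetical pre-flight check; returns a list of human-readable problems.
function checkDataPolicy(data) {
  const problems = [];
  if (!data || !SUPPORTED_POLICIES.includes(data.policy)) {
    problems.push(`unknown policy: ${data && data.policy}`);
    return problems;
  }
  // Assumed rule: a file copy needs to know where to copy from and to.
  if (data.policy === "copy-file-db" && !(data.source && data.target)) {
    problems.push("copy-file-db needs both source and target");
  }
  // Assumed rule: shared-database writes must carry a run marker.
  if (data.policy === "tagged-shared-db" && !data.marker) {
    problems.push("tagged-shared-db needs a marker");
  }
  // Assumed rule: hook-based policies need at least one prepare command.
  const hasPrepare = Array.isArray(data.prepare) && data.prepare.length > 0;
  if (data.policy === "custom-hooks" && !hasPrepare) {
    problems.push("custom-hooks needs at least one prepare command");
  }
  return problems;
}
```

An empty result means the block passed these illustrative checks; it says nothing about whether the chosen policy is actually safe for the target database.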
After a failed validation, keep the session alive and inspect layers in this order:
devverify timeline show --last 100
devverify service logs api --grep "ERROR|Exception"
devverify browser info 0
devverify browser text 0 "#app"
devverify browser console 0 --last 100
devverify browser network 0 --status ">=400"
devverify browser storage 0
devverify browser cookies 0
devverify browser shot 0
devverify files grep src "TODO|throw new|console.error"
browser console and browser network currently use a lightweight in-page probe installed by devverify. It captures console calls, page errors, unhandled rejections, fetch, and XMLHttpRequest events after the probe is installed. A later CDP event recorder can capture browser-level traffic continuously.
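To make the "after the probe is installed" limitation concrete, here is a minimal sketch of an in-page probe built by wrapping `console` methods and `fetch`. It is not devverify's actual probe: the function names, the `__probeEvents` property, and the event shape are hypothetical, and page errors, unhandled rejections, and XMLHttpRequest are left out for brevity.

```javascript
// Hypothetical in-page probe: wrap console methods and fetch on a window-like
// object and record events in an in-memory buffer.
function installProbe(target) {
  if (target.__probeEvents) return target.__probeEvents; // install only once
  const events = [];
  target.__probeEvents = events;

  for (const level of ["log", "warn", "error"]) {
    const original = target.console[level];
    target.console[level] = (...args) => {
      events.push({ type: "console", level, args: args.map(String) });
      return original.apply(target.console, args);
    };
  }

  if (typeof target.fetch === "function") {
    const originalFetch = target.fetch;
    target.fetch = async (...args) => {
      // The first argument may be a URL string or a Request-like object.
      const url = String(args[0] && args[0].url ? args[0].url : args[0]);
      try {
        const response = await originalFetch.apply(target, args);
        events.push({ type: "fetch", url, status: response.status });
        return response;
      } catch (error) {
        events.push({ type: "fetch", url, error: String(error) });
        throw error;
      }
    };
  }
  return events;
}

// Select recorded requests whose status is at or above a threshold,
// similar in spirit to filtering with --status ">=400".
function failedRequests(events, minStatus = 400) {
  return events.filter((e) => e.type === "fetch" && e.status >= minStatus);
}
```

Because the wrappers only exist once `installProbe` has run, anything logged or fetched earlier in the page's life is invisible to this approach, which is the gap a continuous CDP event recorder would close.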
devverify run <scenario> creates a run context and injects it into every service started by devverify:
DEVVERIFY_RUN_ID
DEVVERIFY_SESSION_ID
DEVVERIFY_SCENARIO
DEVVERIFY_RUN_DIR
DEVVERIFY_TMP_DIR
DEVVERIFY_ARTIFACTS_DIR
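As a rough illustration of how such a run context could be derived and handed to a child service, consider the sketch below. It assumes each run lives under runs/<run-id>/ inside the session directory with tmp/ and artifacts/ subdirectories; the function names and the use of forward slashes as separators are assumptions for illustration.

```javascript
// Hypothetical sketch: derive the run-context variables for one run.
function buildRunContext({ sessionDir, sessionId, runId, scenario }) {
  // Assumed layout: <sessionDir>/runs/<runId>/{tmp,artifacts}
  const runDir = `${sessionDir}/runs/${runId}`;
  return {
    DEVVERIFY_RUN_ID: runId,
    DEVVERIFY_SESSION_ID: sessionId,
    DEVVERIFY_SCENARIO: scenario,
    DEVVERIFY_RUN_DIR: runDir,
    DEVVERIFY_TMP_DIR: `${runDir}/tmp`,
    DEVVERIFY_ARTIFACTS_DIR: `${runDir}/artifacts`,
  };
}

// Merge the run context over the parent environment so a started service
// sees both its usual variables and the DEVVERIFY_* values.
function serviceEnv(parentEnv, runContext) {
  return { ...parentEnv, ...runContext };
}
```

A service started with the result of `serviceEnv` can read `DEVVERIFY_TMP_DIR` to place scratch files where a later cleanup can find them.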
Each run is preserved under the active session:
.devverify/sessions/<session-id>/
  report.json # latest run report for compatibility
  runs/
    <run-id>/
      report.json
      tmp/
      artifacts/
List recorded runs:
devverify run list
Scenario values support ${VAR} interpolation:
scenarios: {
  login: {
    env: {
      TEST_EMAIL: "devverify+${DEVVERIFY_RUN_ID}@example.test"
    },
    steps: [
      { http: { name: "health", url: "http://127.0.0.1:5680/health", saveAs: "health", expectStatus: 200 } },
      { assertJson: { from: "health", path: "json.ok", equals: true } },
      { fill: ["[name=email]", "${TEST_EMAIL}"] }
    ]
  }
}
Cleanup is manual by default. Devverify does not delete evidence or business data automatically on success or failure. Safe cleanup targets can be requested explicitly:
devverify cleanup --session current --tmp
devverify cleanup --session current --browser-storage
devverify cleanup --session current --browser-profile
devverify cleanup --session current --artifacts
devverify cleanup --session current --data
When a run context exists, --tmp, --artifacts, and --data target the latest run's scoped directories/data policy. Older run directories stay available for comparison unless explicitly removed later.
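The ${VAR} interpolation used in scenario values such as TEST_EMAIL can be sketched as a recursive string substitution. This is a minimal illustration, not devverify's implementation: the function name, leaving unknown variables untouched, and resolving the scenario env before the steps are all assumptions, and the run ID `run-001` is made up.

```javascript
// Hypothetical sketch: replace ${NAME} in strings, recursing through arrays and objects.
function interpolate(value, vars) {
  if (typeof value === "string") {
    return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (match, name) =>
      // Assumption: unknown variables are left as-is rather than blanked.
      Object.prototype.hasOwnProperty.call(vars, name) ? String(vars[name]) : match
    );
  }
  if (Array.isArray(value)) {
    return value.map((item) => interpolate(item, vars));
  }
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, interpolate(item, vars)])
    );
  }
  return value; // numbers, booleans, null, undefined pass through unchanged
}

// Resolve the scenario env first, then make it available to the steps.
const runVars = { DEVVERIFY_RUN_ID: "run-001" }; // made-up run ID
const env = interpolate({ TEST_EMAIL: "devverify+${DEVVERIFY_RUN_ID}@example.test" }, runVars);
const step = interpolate({ fill: ["[name=email]", "${TEST_EMAIL}"] }, { ...runVars, ...env });
console.log(step.fill[1]); // devverify+run-001@example.test
```

Tying test data to the run ID this way makes it possible to trace, and later clean up, exactly what a given run wrote.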
Current CLI commands use a global active session, so avoid running multiple devverify run workflows concurrently until per-command --session scoping is added throughout the CLI.
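The http and assertJson steps in the login scenario work as a pair: one saves a response under a name, the other checks a value inside it. The sketch below shows one way such a check could be evaluated. It assumes the saved response exposes the parsed body under a `json` key, as the path json.ok suggests; the function names and the result shape are hypothetical.

```javascript
// Resolve a dotted path such as "json.ok" against a value, returning
// undefined when any segment along the way is missing.
function getPath(root, path) {
  return path.split(".").reduce(
    (current, key) => (current == null ? undefined : current[key]),
    root
  );
}

// Hypothetical evaluation of an assertJson step against saved HTTP results.
function assertJson(saved, { from, path, equals }) {
  const actual = getPath(saved[from], path);
  return { ok: actual === equals, actual, expected: equals };
}

// Assumed shape of what an http step with saveAs: "health" might store.
const saved = { health: { status: 200, json: { ok: true } } };
const result = assertJson(saved, { from: "health", path: "json.ok", equals: true });
console.log(result.ok); // true
```

Returning the actual and expected values alongside the verdict keeps a failed assertion useful as evidence in a report.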
Edit devverify.config.mjs, or create a task-specific file under .devverify/configs/ and pass --config:
export default {
  chrome: { port: 9222 },
  services: {
    web: {
      command: "npm run dev",
      cwd: ".",
      health: { url: "http://127.0.0.1:3000", timeoutMs: 30000 }
    }
  },
  scenarios: {
    home: {
      services: ["web"],
      steps: [
        { http: { name: "health", url: "http://127.0.0.1:3000/health", saveAs: "health", expectStatus: 200 } },
        { assertJson: { from: "health", path: "status", equals: "UP" } },
        { goto: "http://127.0.0.1:3000" },
        { waitFor: "text=Welcome" },
        { browser: "shot", target: "0" }
      ]
    }
  }
};
- Modularize remaining CLI internals into src/
- VS Code extension bridge for debug.startDebugging
- Java/Spring Boot service adapter
- Vite service adapter
- Playwright-backed high-level browser actions
- Broader scenario DSL and assertion engine
- Report HTML
- DB inspectors: SQLite, PostgreSQL, MySQL
- Redis inspector
- Local daemon with JSON-RPC/HTTP API
- Deferred: other IDEs, Android, desktop automation, MCP adapter