Releases: fastxyz/skill-optimizer

v1.1.0

18 Apr 00:12

What's new in v1.1.0

Added

  • Prompt surface — benchmark and optimize prompt templates, Claude Code skills, and agent instructions. Discovers phases and capabilities from markdown, evaluates output quality with content-based criteria (required sections, format patterns, forbidden keywords, code blocks).
  • Codex auth — direct OpenAI model runs can use browser-login tokens or a static OPENAI_API_KEY stored by Codex (~/.codex/auth.json) instead of requiring an env var. Set benchmark.authMode: "codex" and "format": "openai" with openai/<model> IDs.
  • SKILL folder — bundled AI-agent guidance (SKILL/SKILL.md) so agents can use skill-optimizer reliably without extra setup.
  • Stable task IDs — IDs are now a SHA-1 hash of action names (SDK/CLI/MCP) or prompt text (prompt surface), so --task <id> filters work across regenerations (fixes #17).
  • Optimizer loop diagram — README includes a visual workflow diagram.
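
The stable task IDs described above can be illustrated with a minimal sketch. It assumes the ID is a SHA-1 digest over the task's action names (or its prompt text, passed as a single-element list). The function name `stable_task_id`, the sorting step, the newline separator, and the 12-character truncation are all hypothetical choices made for illustration; the release notes do not specify them.

```python
import hashlib


def stable_task_id(parts: list[str], length: int = 12) -> str:
    """Derive a deterministic ID from action names or prompt text.

    Hypothetical helper: the real canonicalization and ID length used by
    skill-optimizer are not stated in the release notes.
    """
    # Sorting makes the ID independent of action order (an assumption here).
    canonical = "\n".join(sorted(parts))
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    # Truncate the 40-character hex digest for readability (also an assumption).
    return digest[:length]


if __name__ == "__main__":
    # The same inputs always hash to the same ID, so a filter such as
    # --task <id> keeps matching after tasks are regenerated.
    print(stable_task_id(["auth.status", "auth.login"]))
    print(stable_task_id(["Summarize the changelog in three bullets."]))
```

Because the ID depends only on the content of the task, regenerating the task list does not change it as long as the action names or prompt text stay the same.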

Fixed

  • Anthropic tool names — dotted tool names (e.g. auth.status) are now sanitized to auth_status before sending to the Anthropic API and mapped back in responses. Fixes hard failures on tool-calling benchmarks against anthropic/ models.
  • Prompt eval on model error — prompt evaluator no longer runs when the model call itself failed; toolPrecision is now correctly set to 1.0 for prompt tasks (no tool calls = vacuously perfect precision).
  • Config path — running without --config now looks for .skill-optimizer/skill-optimizer.json, matching what init scaffolds.
  • Format/prefix validation — validate now errors when benchmark.format: "openai" is paired with non-openai/ model IDs, and vice versa for anthropic/.
  • Codex static key routing — a plain OPENAI_API_KEY in ~/.codex/auth.json now correctly routes to the direct OpenAI transport instead of the JWT-only Codex transport. A malformed access_token (non-JWT) no longer shadows a valid static key fallback.
  • Model IDs — OpenRouter slugs preserve dots (openrouter/anthropic/claude-sonnet-4.6); dot→hyphen rewrite applies only to anthropic/ direct-API IDs; openai/ slugs (e.g. gpt-5.4) are exempt.
  • Provider prefix is stripped before sending model IDs to anthropic/ and openai/ direct APIs.
  • Prompt-surface benchmarks no longer hard-fail on coverage violations; coverage is informational.
  • Prompt tasks are scored against their specific capabilityId, not always the first discovered capability.
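
The Anthropic tool-name fix above amounts to a reversible rename: dots become underscores on the way out, and the original names are restored on the way back. A minimal sketch of that round trip follows. The function names `sanitize_tool_names` and `restore_tool_name` are hypothetical, and the collision check is an added safeguard that the release notes do not mention.

```python
def sanitize_tool_names(names: list[str]) -> tuple[list[str], dict[str, str]]:
    """Replace dots with underscores and remember how to map names back."""
    reverse: dict[str, str] = {}
    sanitized: list[str] = []
    for original in names:
        safe = original.replace(".", "_")
        # Guard against two distinct tools collapsing to the same name
        # (an assumption of this sketch, not documented behavior).
        if safe in reverse and reverse[safe] != original:
            raise ValueError(
                f"{original!r} and {reverse[safe]!r} both sanitize to {safe!r}"
            )
        reverse[safe] = original
        sanitized.append(safe)
    return sanitized, reverse


def restore_tool_name(name: str, reverse: dict[str, str]) -> str:
    """Map a sanitized name from a model response back to the original."""
    return reverse.get(name, name)


if __name__ == "__main__":
    sent, reverse = sanitize_tool_names(["auth.status", "list_items"])
    print(sent)
    print(restore_tool_name("auth_status", reverse))
```

Keeping the reverse map per request means scoring can still compare the model's calls against the original dotted names.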

Breaking changes

  • CodeModeConfig → SdkSurfaceConfig
  • McpModeConfig → McpSurfaceConfig
  • ExpectedTool → ExpectedAction
  • ToolMatch → ActionMatch
  • LEGACY_PROJECT_CONFIG_NAME → hard-code ".skill-optimizer/skill-optimizer.json"
  • toLegacyOptimizeManifest → removed
  • SurfaceSnapshotArg → removed

TaskResult fields: toolMatches → actionMatches, hallucinatedCalls → hallucinatedActions, unnecessaryCalls → unnecessaryActions. Re-run benchmark to regenerate report files.

tasks.json files using expected_tools or method on action entries will error on load — rename to expected_actions and name.
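
For larger task files, the rename can be scripted. The sketch below is a hypothetical migration helper, not part of skill-optimizer. It assumes only what the note above states: an expected_tools key holding a list of action entries, each of which may carry a method key. The overall layout of tasks.json is not documented here, so the helper walks the whole JSON tree instead of relying on a fixed shape.

```python
import json
from typing import Any


def _rename_method(entry: Any) -> Any:
    """Rename `method` to `name` on a single action entry, keeping key order."""
    if isinstance(entry, dict) and "method" in entry and "name" not in entry:
        return {("name" if k == "method" else k): v for k, v in entry.items()}
    return entry


def migrate_tasks(node: Any) -> Any:
    """Recursively rename expected_tools -> expected_actions and method -> name."""
    if isinstance(node, list):
        return [migrate_tasks(item) for item in node]
    if not isinstance(node, dict):
        return node
    out: dict[str, Any] = {}
    for key, value in node.items():
        if key in ("expected_tools", "expected_actions") and isinstance(value, list):
            out["expected_actions"] = [_rename_method(migrate_tasks(e)) for e in value]
        else:
            out[key] = migrate_tasks(value)
    return out


if __name__ == "__main__":
    # Illustrative input only; the real file layout may differ.
    legacy = {"tasks": [{"id": "t1", "expected_tools": [{"method": "auth.status"}]}]}
    print(json.dumps(migrate_tasks(legacy), indent=2))
```

To apply it to a real file, load the file with `json.load`, pass the result through `migrate_tasks`, and write it back with `json.dump`, keeping a backup of the original until the migrated file loads cleanly.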

The config file skill-benchmark.json is no longer auto-detected — rename to skill-optimizer.json and move it into .skill-optimizer/.

Full Changelog: 1.0.0...1.1.0

v1.0.0

14 Apr 21:30
02c1e28

feat(optimizer): replace minimal skill-writing-guide with full skill-…