
feat(skills): add cost-aware-llm-pipeline skill #219

Open
shimo4228 wants to merge 1 commit into affaan-m:main from shimo4228:feat/skills/cost-aware-llm-pipeline

Conversation

@shimo4228

@shimo4228 shimo4228 commented Feb 14, 2026

Description

Adds a new skill for cost-optimized LLM API usage. This skill covers four composable patterns:

  • Model Routing: Automatically select cheaper models (Haiku) for simple tasks, reserving expensive models (Sonnet/Opus) for complex ones
  • Immutable Cost Tracking: Track cumulative API spend with frozen dataclasses
  • Narrow Retry Logic: Retry only on transient errors (network, rate limit), fail fast on permanent errors
  • Prompt Caching: Cache long system prompts to reduce token costs
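As a sketch of the first pattern, model routing can be a simple complexity gate in front of the API call. The model IDs and the heuristic below are illustrative assumptions, not the skill's actual code:

```python
# Minimal model-routing sketch. MODEL_HAIKU / MODEL_SONNET and the
# complexity heuristic are hypothetical, for illustration only.
MODEL_HAIKU = "claude-haiku-4-5"
MODEL_SONNET = "claude-sonnet-4-5"

def route_model(task: str, *, complex_threshold: int = 500) -> str:
    """Pick a cheap model for short/simple tasks, an expensive one otherwise."""
    looks_complex = len(task) > complex_threshold or "analyze" in task.lower()
    return MODEL_SONNET if looks_complex else MODEL_HAIKU
```

In practice the routing heuristic would be domain-specific (task length, tool use, reasoning depth); the point is that the decision happens before the request, so the cheap path is the default.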

Type of Change

  • feat: New feature

Motivation

There is currently no skill in ECC that addresses LLM API cost optimization. As LLM-powered applications scale, cost control becomes critical. This skill provides battle-tested patterns extracted from production use with the Anthropic API.

Checklist

  • Tests pass locally (node tests/run-all.js)
  • Validation scripts pass
  • Follows conventional commits format
  • Updated relevant documentation
  • Focused on one domain/technology
  • Includes practical code examples
  • Under 500 lines
  • Tested with Claude Code

Summary by CodeRabbit

  • Documentation
    • Added a new skill guide for cost-aware LLM pipeline patterns and best practices, covering model routing by task complexity, cost tracking mechanisms, retry strategies with exponential backoff, and prompt caching to minimize API costs. Includes pricing references and anti-patterns to avoid.

Cost optimization patterns for LLM API usage combining model routing,
budget tracking, retry logic, and prompt caching.
@coderabbitai

coderabbitai bot commented Feb 14, 2026

📝 Walkthrough

Introduces a cost-aware LLM pipeline skill document that provides guidance patterns for controlling API costs through model routing, immutable cost tracking, controlled retries with exponential backoff, and prompt caching strategies. Includes thresholds, function definitions, workflow composition, and best practices.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Cost-Aware LLM Pipeline Guidance**<br>`skills/cost-aware-llm-pipeline/SKILL.md` | New skill documentation defining patterns for cost optimization in LLM pipelines, including model routing by task complexity, immutable cost tracking mechanisms, retry logic with exponential backoff, and prompt caching. Covers activation criteria, control flow, error handling, pricing reference, best practices, and anti-patterns. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Suggested reviewers

  • affaan-m

Poem

🐰 A pipeline so wise, with budgets in mind,
Routes tasks with care, cost patterns refined,
Cache prompts once more, don't repeat the call,
Track every expense, immutably—all!
Our fluffy friend cheers for economies small! ✨

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat(skills): add cost-aware-llm-pipeline skill' directly and clearly summarizes the main change: adding a new cost-aware LLM pipeline skill to the codebase. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Merge Conflict Detection | ✅ Passed | No merge conflicts detected when merging into main. |



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@skills/cost-aware-llm-pipeline/SKILL.md`:
- Around line 153-160: Update the "Pricing Reference (2025-2026)" table rows for
Haiku 4.5, Sonnet 4.5, and Opus 4.5 in SKILL.md: change Haiku 4.5 rates to Input
$1.00 / Output $5.00; change Sonnet 4.5 to Input $3.00 / Output $15.00 for ≤200K
context (and note the alternate $6.00 / $22.50 for >200K context) and update its
Relative Cost to ≈3x vs Haiku; change Opus 4.5 rates to Input $5.00 / Output
$25.00 and set its Relative Cost to ≈5x vs Haiku; keep the table header and
formatting intact and ensure the Sonnet large-context footnote or parenthetical
is clearly indicated next to the Sonnet 4.5 row.
🧹 Nitpick comments (3)
skills/cost-aware-llm-pipeline/SKILL.md (3)

74-75: Consider clarifying budget boundary behavior.

The `over_budget` property uses `>`, which allows spending exactly the budget limit. If this is intentional, consider adding a docstring or comment to clarify that meeting the budget exactly is acceptable. If you want to prevent reaching the limit, use `>=` instead.
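To illustrate the boundary difference concretely (a hypothetical minimal tracker, not the skill's actual class):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Budget:
    """Illustrative tracker: with `>`, spending exactly the limit is allowed."""
    limit_usd: float
    spent_usd: float

    @property
    def over_budget(self) -> bool:
        # Swap to `>=` if reaching the limit should already count as over budget.
        return self.spent_usd > self.limit_usd

at_limit = Budget(limit_usd=10.0, spent_usd=10.0)   # over_budget is False with `>`
past_limit = Budget(limit_usd=10.0, spent_usd=10.01)  # over_budget is True
```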


147-147: Consider showing the cost calculation.

While the ellipsis placeholders are acceptable for documentation, it would be helpful to show or reference how to calculate `cost_usd` from `input_tokens` and `output_tokens` using the pricing table. This would make the example more complete and actionable.

Example cost calculation:

```python
# Example cost calculation based on pricing table
def calculate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Calculate cost in USD based on model and token counts."""
    pricing = {
        MODEL_HAIKU: (0.80 / 1_000_000, 4.00 / 1_000_000),   # (input, output) rate per token
        MODEL_SONNET: (3.00 / 1_000_000, 15.00 / 1_000_000),
    }
    input_rate, output_rate = pricing.get(model, (0, 0))
    return (input_tokens * input_rate) + (output_tokens * output_rate)

# Then in line 147:
record = CostRecord(
    model=model,
    input_tokens=response.usage.input_tokens,
    output_tokens=response.usage.output_tokens,
    cost_usd=calculate_cost(model, response.usage.input_tokens, response.usage.output_tokens),
)
```

92-102: Move the comment about non-retryable errors into the docstring.

The code after line 100 is unreachable because the function always returns, raises, or propagates an exception within the loop. The comment on line 101 documents important behavior but should be placed in the docstring instead to avoid unreachable code.

The Anthropic SDK error classes (`APIConnectionError`, `RateLimitError`, `InternalServerError`) are correct and properly imported.

♻️ Suggested improvement

```diff
 def call_with_retry(func, *, max_retries: int = _MAX_RETRIES):
-    """Retry only on transient errors, fail fast on others."""
+    """Retry only on transient errors, fail fast on others.
+
+    Non-retryable errors (AuthenticationError, BadRequestError, etc.)
+    will raise immediately.
+    """
     for attempt in range(max_retries):
         try:
             return func()
         except _RETRYABLE_ERRORS:
             if attempt == max_retries - 1:
                 raise
             time.sleep(2 ** attempt)  # Exponential backoff
```

Comment on lines +153 to +160
## Pricing Reference (2025-2026)

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Relative Cost |
|-------|---------------------|----------------------|---------------|
| Haiku 4.5 | $0.80 | $4.00 | 1x |
| Sonnet 4.5 | $3.00 | $15.00 | ~4x |
| Opus 4.5 | $15.00 | $75.00 | ~19x |



⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Anthropic Claude Haiku 4.5 Sonnet 4.5 Opus 4.5 pricing 2025 2026 per million tokens

💡 Result:

Anthropic Claude 4.5 API pricing (USD) — per 1M tokens (2025–2026)

| Model | Input ($/1M) | Output ($/1M) |
|---|---|---|
| Claude Haiku 4.5 | $1 | $5 [1][2] |
| Claude Sonnet 4.5 | $3 (≤200K prompt) / $6 (>200K prompt) | $15 (≤200K prompt) / $22.50 (>200K prompt) [1][3] |
| Claude Opus 4.5 | $5 | $25 [1] |

Sources: Anthropic/Claude pricing page and model pages. [1][2][3]


Update pricing table with 2025-2026 rates.

The pricing information is outdated. Current 2025-2026 rates for Anthropic Claude models are:

  • Haiku 4.5: $1.00 / $5.00 (not $0.80 / $4.00)
  • Sonnet 4.5: $3.00 / $15.00 for ≤200K context (or $6.00 / $22.50 for >200K context)
  • Opus 4.5: $5.00 / $25.00 (not $15.00 / $75.00)

Update the table and recalculate relative costs (Sonnet ≈3x, Opus ≈5x vs. Haiku, not 4x and 19x).
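The suggested relative-cost figures follow directly from the corrected per-million-token rates (assuming the ≤200K-context tier for Sonnet, and averaging the input and output ratios):

```python
# Verify the suggested relative costs from the corrected rates
# (input $/1M, output $/1M); Sonnet uses the ≤200K-context tier.
rates = {"haiku": (1.00, 5.00), "sonnet": (3.00, 15.00), "opus": (5.00, 25.00)}

def relative_cost(model: str, base: str = "haiku") -> float:
    """Average of the input and output price ratios vs. the base model."""
    (mi, mo), (bi, bo) = rates[model], rates[base]
    return ((mi / bi) + (mo / bo)) / 2

# relative_cost("sonnet") → 3.0, relative_cost("opus") → 5.0
```

Both the input and output ratios happen to agree here (3x and 5x respectively), so the averaging is just a sanity check.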


@affaan-m
Owner

[openclaw-bot:pr-review]

Automated Review - CI checks are passing! ✅

Hi @shimo4228, thanks for this contribution! All CI checks have passed successfully.

A maintainer will review this PR shortly. In the meantime, please ensure:

  • The PR description explains the changes
  • Tests cover the new/modified functionality
  • No breaking changes (or they're documented)

This is an automated review from OpenClaw.

Owner

@affaan-m affaan-m left a comment


Automated review: doc-only changes look good. Approving.
