
feat: improve benchmarking #81

Merged
twangodev merged 8 commits into main from feat/benchmarks
Dec 16, 2025

Conversation

@twangodev
Owner

@twangodev twangodev commented Dec 16, 2025

Summary by CodeRabbit

  • New Features

    • Render concurrency limiting with timeout-based overload responses
    • Added load-test scripts and a CI benchmark job to run and publish results
    • Added a lightweight comparison service for Vercel OG
  • Chores

    • Improved Docker image (metadata, non-root runtime, assets, healthcheck tuning)
    • New k6 helpers for scenarios, metrics, runners, and summarized reports
  • Removed

    • Deleted the previous benchmark suite, example app, and related docs/configs


Copilot AI review requested due to automatic review settings on December 16, 2025 at 00:51
@coderabbitai
Contributor

coderabbitai bot commented Dec 16, 2025

Important

Review skipped

Review was skipped as selected files did not have any reviewable changes.

💤 Files selected but had no reviewable changes (1)
  • k6/results/.gitkeep


Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Replaces the Python/Next.js benchmark suite with a k6-based benchmark and adds Docker and comparison artifacts. Introduces render concurrency gating via a tokio::sync::Semaphore in the server, reorganizes the Docker multi-stage build and runtime user/asset layout, and adds a GitHub "Benchmark" CI job.

Changes

Cohort / File(s) Summary
Dockerfile / image metadata
Dockerfile
Replaces previous builder stage with cargo-chef stages (chef, planner, builder), adds LABEL metadata, installs system deps, creates non-root user ogis, adjusts file layout and permissions, copies built binaries/assets, and modifies HEALTHCHECK timings.
Application concurrency / render gating
src/main.rs, src/routes/index.rs
Adds render_semaphore: Arc<Semaphore> to AppState (initialized from CPU parallelism) and updates request handler to acquire a semaphore slot with timeout, spawn_blocking SVG→PNG rendering, and return 503/500 on overload or failures.
Benchmarks removed
benchmarks/*, benchmarks/python/*, benchmarks/vercel-og/*
Removes the legacy benchmark suite: Python harness, pyproject, scenarios.json, docker-compose, READMEs, and the Vercel Next.js demo (routes, layout, pages, styles, Dockerfile, package/tsconfig).
k6 load-test additions
k6/*, k6/lib/*, k6/.gitignore
Adds a k6 test harness and helpers: ogis.js, vercel-og.js, lib/config.js, lib/metrics.js, lib/templates.js, lib/utils.js, lib/runner.js, lib/summary.js, and a k6 results .gitignore.
Benchmark comparisons / docker-compose
comparisons/vercel-og/*, docker-compose.benchmark.yml
Adds a minimal Vercel-OG comparison server (server.js, Dockerfile, package.json) and a docker-compose benchmark file that runs ogis and vercel-og side-by-side.
CI: Benchmark job
.github/workflows/rust.yml
Adds a new "Benchmark" job that launches benchmark services via docker-compose, runs k6 scenarios, saves summaries and artifacts, and is gated to main.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client
    participant AppHandler as App (async)
    participant Semaphore
    participant Renderer as BlockingRenderTask
    participant PNGEncoder as PNG Encoder

    Client->>AppHandler: GET /?template=...
    AppHandler->>Semaphore: try_acquire (with timeout)
    alt acquired
        AppHandler->>Renderer: spawn_blocking(render SVG -> PNG)
        Renderer->>PNGEncoder: rasterize & encode
        PNGEncoder-->>Renderer: PNG bytes
        Renderer-->>AppHandler: PNG bytes
        AppHandler-->>Client: 200 image/png
        AppHandler->>Semaphore: release
    else timeout / not acquired
        Semaphore-->>AppHandler: timeout
        AppHandler-->>Client: 503 Service Unavailable
    end
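
The gating shown in the diagram can be approximated with a std-only sketch. The actual server uses tokio::sync::Semaphore wrapped in tokio::time::timeout; the Semaphore type, timeouts, and permit counts below are illustrative stand-ins, not the PR's code:

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

// Minimal std-only stand-in for tokio::sync::Semaphore: a permit counter
// guarded by a Mutex, with Condvar::wait_timeout playing the role of
// tokio::time::timeout around acquire().
struct Semaphore {
    permits: Mutex<usize>,
    cv: Condvar,
}

impl Semaphore {
    fn new(n: usize) -> Self {
        Semaphore { permits: Mutex::new(n), cv: Condvar::new() }
    }

    // true  -> permit acquired before the deadline (proceed to render);
    // false -> timed out while saturated (the handler would return 503).
    fn try_acquire_timeout(&self, timeout: Duration) -> bool {
        let mut n = self.permits.lock().unwrap();
        while *n == 0 {
            let (guard, res) = self.cv.wait_timeout(n, timeout).unwrap();
            n = guard;
            if res.timed_out() && *n == 0 {
                return false;
            }
        }
        *n -= 1;
        true
    }

    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.cv.notify_one();
    }
}

fn main() {
    let sem = Semaphore::new(1);
    assert!(sem.try_acquire_timeout(Duration::from_millis(10))); // permit free
    // The only permit is held, so this acquire times out: the 503 branch.
    assert!(!sem.try_acquire_timeout(Duration::from_millis(10)));
    sem.release();
    assert!(sem.try_acquire_timeout(Duration::from_millis(10)));
    println!("semaphore gating ok");
}
```

In the real handler the permit is an RAII guard held across the blocking render, so release happens automatically when the guard drops, matching the "release" step in the diagram.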

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Review semaphore acquire/release paths and timeout handling in src/routes/index.rs (ensure no leaks on panic/join errors).
  • Validate CPU parallelism detection and fallback in src/main.rs.
  • Inspect spawn_blocking task error handling, logging, and mapping to HTTP status codes.
  • Verify Dockerfile cargo-chef stages, file ownership/chown, fonts/templates copying, and non-root ogis user permissions.
  • Audit k6 modules for correctness (imports, scenario building, threshold semantics) and CI integration in .github/workflows/rust.yml.


Poem

🐰
I hopped through code at break of dawn,
Semaphores guard the render lawn,
Benchmarks swapped for k6 delight,
Docker snugged assets through the night,
A tiny rabbit cheers the build just right.

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'feat: improve benchmarking' is partially related to the changeset—it correctly identifies benchmarking improvements, but is too broad and vague to clearly convey the specific changes made, which include comprehensive Dockerfile updates, removal of Python-based benchmarks, addition of k6 load-testing infrastructure, and restructured comparison services. Consider a more specific title such as 'feat: migrate benchmarks from Python to k6 and restructure comparison services' to better reflect the major architectural changes and the scope of modifications across the benchmark infrastructure.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 81.82% which is sufficient. The required threshold is 80.00%.


Contributor

Copilot AI left a comment


Pull request overview

This PR improves benchmarking infrastructure and adds concurrency control to the image generation service. The main focus is on introducing a semaphore-based rate limiting mechanism and removing the Vercel OG benchmark comparison setup.

  • Adds semaphore-based concurrency control for render operations
  • Offloads blocking render operations to dedicated threads with timeouts
  • Removes entire Vercel OG benchmarking infrastructure

Reviewed changes

Copilot reviewed 21 out of 29 changed files in this pull request and generated 3 comments.

File Description
src/routes/index.rs Implements render semaphore with timeout handling and moves SVG generation to blocking tasks
src/main.rs Adds render_semaphore to AppState, sized by CPU core count
Dockerfile Improves build caching with cargo-chef, adds container metadata, and creates non-root user
benchmarks/* Removes all Vercel OG benchmark comparison files and infrastructure
Files not reviewed (1)
  • benchmarks/vercel-og/package-lock.json: Language not supported


let text = TextContent {
    title: &title,
    description: &description,
    subtitle: &subtitle,

Copilot AI Dec 16, 2025


The new spawn_blocking error path (lines 246-255) lacks test coverage. Consider adding tests to verify behavior when the spawned task fails or panics.
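
A std-only analog shows the shape such a test could take. The real handler uses tokio::task::spawn_blocking and maps tokio::task::JoinError to 500; the render_status helper and closures below are hypothetical stand-ins. std's JoinHandle::join likewise returns Err when the spawned task panics:

```rust
use std::thread;

// Hypothetical harness: run a "render" closure on its own thread and map
// its outcome to an HTTP status code, mirroring how the handler maps a
// join failure (cancellation or panic) to INTERNAL_SERVER_ERROR.
fn render_status(task: impl FnOnce() -> Vec<u8> + Send + 'static) -> u16 {
    match thread::spawn(task).join() {
        Ok(_png_bytes) => 200, // rendered successfully
        Err(_) => 500,         // task panicked -> INTERNAL_SERVER_ERROR
    }
}

fn main() {
    assert_eq!(render_status(|| vec![0u8; 8]), 200);
    assert_eq!(render_status(|| panic!("render failed")), 500);
    println!("panic path mapped to 500");
}
```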

 apt-get install -y ca-certificates curl && \
-    rm -rf /var/lib/apt/lists/*
+    rm -rf /var/lib/apt/lists/* && \
+    useradd -r -s /bin/false ogis

Copilot AI Dec 16, 2025


The non-root user 'ogis' has /bin/false set as its shell, which blocks interactive login, but consider also passing the --no-create-home flag to prevent home directory creation for enhanced security.

Suggested change
useradd -r -s /bin/false ogis
useradd -r --no-create-home -s /bin/false ogis

@codecov

codecov bot commented Dec 16, 2025

Codecov Report

❌ Patch coverage is 0% with 45 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/routes/index.rs 0.00% 41 Missing ⚠️
src/main.rs 0.00% 4 Missing ⚠️
Flag Coverage Δ
rust 8.70% <0.00%> (-0.27%) ⬇️
typescript 100.00% <ø> (ø)

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
src/main.rs 0.00% <0.00%> (ø)
src/routes/index.rs 0.00% <0.00%> (ø)

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
Dockerfile (1)

24-25: Consider adding --no-create-home flag for the non-root user.

For enhanced security, prevent home directory creation for the service user.

-    useradd -r -s /bin/false ogis
+    useradd -r --no-create-home -s /bin/false ogis
🧹 Nitpick comments (2)
src/routes/index.rs (2)

170-170: Consider making RENDER_TIMEOUT configurable.

The 5-second timeout is reasonable for most scenarios, but making it configurable via the application config would allow tuning for different deployment environments (e.g., larger timeouts for complex templates).


242-251: Verify JoinError handling covers task panics.

The JoinError case handles both task cancellation and panics. Consider whether you want to log at a higher severity (e.g., tracing::error!) for panics, as they indicate unexpected failures that may warrant investigation.

         Err(join_err) => {
             log.status = StatusCode::INTERNAL_SERVER_ERROR;
             log.error = Some(format!("Task join: {}", join_err));
+            if join_err.is_panic() {
+                tracing::error!("Render task panicked: {}", join_err);
+            }
             log.log();
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81baf0b and 5d6fbfd.

⛔ Files ignored due to path filters (8)
  • benchmarks/python/uv.lock is excluded by !**/*.lock
  • benchmarks/vercel-og/app/favicon.ico is excluded by !**/*.ico
  • benchmarks/vercel-og/package-lock.json is excluded by !**/package-lock.json
  • benchmarks/vercel-og/public/file.svg is excluded by !**/*.svg
  • benchmarks/vercel-og/public/globe.svg is excluded by !**/*.svg
  • benchmarks/vercel-og/public/next.svg is excluded by !**/*.svg
  • benchmarks/vercel-og/public/vercel.svg is excluded by !**/*.svg
  • benchmarks/vercel-og/public/window.svg is excluded by !**/*.svg
📒 Files selected for processing (21)
  • Dockerfile (2 hunks)
  • benchmarks/.gitignore (0 hunks)
  • benchmarks/README.md (0 hunks)
  • benchmarks/docker-compose.yml (0 hunks)
  • benchmarks/python/.python-version (0 hunks)
  • benchmarks/python/benchmark.py (0 hunks)
  • benchmarks/python/pyproject.toml (0 hunks)
  • benchmarks/python/scenarios.json (0 hunks)
  • benchmarks/vercel-og/.gitignore (0 hunks)
  • benchmarks/vercel-og/Dockerfile (0 hunks)
  • benchmarks/vercel-og/README.md (0 hunks)
  • benchmarks/vercel-og/app/api/og/route.tsx (0 hunks)
  • benchmarks/vercel-og/app/globals.css (0 hunks)
  • benchmarks/vercel-og/app/layout.tsx (0 hunks)
  • benchmarks/vercel-og/app/page.module.css (0 hunks)
  • benchmarks/vercel-og/app/page.tsx (0 hunks)
  • benchmarks/vercel-og/next.config.ts (0 hunks)
  • benchmarks/vercel-og/package.json (0 hunks)
  • benchmarks/vercel-og/tsconfig.json (0 hunks)
  • src/main.rs (4 hunks)
  • src/routes/index.rs (3 hunks)
💤 Files with no reviewable changes (18)
  • benchmarks/vercel-og/README.md
  • benchmarks/python/.python-version
  • benchmarks/README.md
  • benchmarks/vercel-og/.gitignore
  • benchmarks/vercel-og/Dockerfile
  • benchmarks/vercel-og/tsconfig.json
  • benchmarks/python/pyproject.toml
  • benchmarks/vercel-og/app/api/og/route.tsx
  • benchmarks/vercel-og/app/globals.css
  • benchmarks/.gitignore
  • benchmarks/python/scenarios.json
  • benchmarks/vercel-og/next.config.ts
  • benchmarks/vercel-og/app/page.tsx
  • benchmarks/vercel-og/app/page.module.css
  • benchmarks/vercel-og/package.json
  • benchmarks/python/benchmark.py
  • benchmarks/vercel-og/app/layout.tsx
  • benchmarks/docker-compose.yml
🧰 Additional context used
🧬 Code graph analysis (1)
src/routes/index.rs (3)
src/templates.rs (1)
  • templates (54-57)
src/generator/svg.rs (1)
  • generate_svg (122-190)
src/generator/png.rs (1)
  • render_to_png (3-26)
🔇 Additional comments (6)
src/main.rs (1)

82-99: LGTM! Well-designed concurrency control.

The semaphore-based render gating is a solid approach for CPU-bound rendering tasks. Using available_parallelism() with a fallback to 4 is appropriate, especially in containerized environments where parallelism detection may be unreliable. The Arc<Semaphore> pattern correctly enables shared state across async handlers.

Dockerfile (3)

17-20: Good addition of OCI image metadata labels.

The labels follow the OCI annotation specification and provide useful metadata for container registries and tooling.


37-39: Good security practice with non-root user.

Running as a non-root user with proper ownership is a best practice for container security.


1-1: The Rust version rust:1.90 is valid and exists. As of December 2025, the latest stable Rust release is 1.92.0, but older versions like 1.90 remain available on Docker Hub and will not cause build failures. If updating to the latest stable version is desired for security or feature benefits, use rust:1.92 instead.

Likely an incorrect or invalid review comment.

src/routes/index.rs (2)

169-195: Well-implemented render slot gating with timeout.

The semaphore acquisition with timeout correctly handles:

  1. Successful permit acquisition
  2. Semaphore closure (defensive, shouldn't happen in practice)
  3. Timeout when overloaded (returns 503)

The _render_permit is correctly held in scope until the render completes, ensuring the permit is released after the blocking work finishes.


202-220: Good use of spawn_blocking for CPU-bound rendering.

Moving SVG generation and PNG rendering to a blocking task prevents stalling the async executor. The closure correctly owns all required data through Arc clones and owned strings, avoiding lifetime issues.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

🧹 Nitpick comments (1)
k6/ogis.js (1)

32-35: Add checks for response body or size.

The checks verify status code and content-type but don't validate that the response body contains actual image data. A 200 response with an empty body or error message would pass these checks.

Consider adding a check for response body size:

   check(res, {
     'status is 200': (r) => r.status === 200,
     'content-type is png': (r) => r.headers['Content-Type']?.includes('image/png'),
+    'body has content': (r) => r.body && r.body.length > 100,
   });
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d6fbfd and e6bfb5a.

📒 Files selected for processing (7)
  • .github/workflows/rust.yml (1 hunks)
  • k6/.gitignore (1 hunks)
  • k6/lib/config.js (1 hunks)
  • k6/lib/metrics.js (1 hunks)
  • k6/lib/templates.js (1 hunks)
  • k6/lib/utils.js (1 hunks)
  • k6/ogis.js (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • k6/.gitignore
🧰 Additional context used
🧬 Code graph analysis (3)
k6/lib/config.js (1)
k6/ogis.js (1)
  • templates (10-10)
k6/lib/metrics.js (2)
k6/ogis.js (1)
  • templates (10-10)
k6/lib/templates.js (1)
  • sanitize (14-16)
k6/ogis.js (4)
k6/lib/templates.js (1)
  • loadTemplates (6-9)
k6/lib/metrics.js (1)
  • createTemplateMetrics (7-13)
k6/lib/config.js (3)
  • buildScenarios (18-37)
  • config (7-13)
  • config (7-13)
k6/lib/utils.js (3)
  • randomTitle (22-35)
  • randomDescription (40-50)
  • buildHandleSummary (55-102)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Code Coverage
  • GitHub Check: Code Coverage
  • GitHub Check: build
🔇 Additional comments (4)
k6/ogis.js (1)

22-25: Sequential template selection is correct.

The formula Math.floor(__ITER / config.requestsPerTemplate) % templates.length correctly distributes iterations across templates in sequential mode, cycling through each template for requestsPerTemplate iterations before moving to the next.
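
The cycling arithmetic is language-agnostic; a quick Rust translation confirms the distribution (the function name and values are illustrative, not from the PR):

```rust
// Mirrors the k6 expression
//   Math.floor(__ITER / config.requestsPerTemplate) % templates.length
fn template_index(iter: usize, requests_per_template: usize, num_templates: usize) -> usize {
    (iter / requests_per_template) % num_templates
}

fn main() {
    // 2 requests per template, 3 templates: each template gets a run of 2
    // iterations before the selection cycles back to the first template.
    let picks: Vec<usize> = (0..8).map(|i| template_index(i, 2, 3)).collect();
    assert_eq!(picks, vec![0, 0, 1, 1, 2, 2, 0, 0]);
}
```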

k6/lib/utils.js (1)

22-35: Random data generation looks good.

The randomTitle() and randomDescription() functions generate reasonable synthetic data for testing, with appropriate variability in length and composition.

Also applies to: 40-50

k6/lib/metrics.js (1)

7-13: No issues found. The k6 Trend constructor correctly accepts a second boolean parameter isTime that treats added values as time values when set to true. The code properly applies this for tracking template response durations.

.github/workflows/rust.yml (1)

166-168: No action needed—k6 automatically creates the k6/results/ directory.

The handleSummary function in k6/lib/utils.js returns an object with 'k6/results/summary.md' (and other files) as keys. k6 automatically creates these directories and files when exporting the handleSummary callback, so the files will exist before the cat command runs.

Comment on lines +151 to +155
services:
  ogis:
    image: ${{ vars.DOCKERHUB_USERNAME }}/ogis:latest
    ports:
      - 3000:3000
Contributor


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n .github/workflows/rust.yml | sed -n '1,200p'

Repository: twangodev/ogis

Length of output: 6517


🌐 Web query:

docker/build-push-action@v6 outputs image digest

💡 Result:

Yes — docker/build-push-action@v6 provides a step output named digest (the pushed image digest). To get it, give the build step an id and read steps.<id>.outputs.digest. For multi-arch / push-by-digest workflows use the image exporter parameters (examples below).

Minimal example:

- name: Build and push
  id: build
  uses: docker/build-push-action@v6
  with:
    context: .
    push: true
    outputs: type=image,name=ghcr.io/myorg/myimage,push-by-digest=true,name-canonical=true

Then use the digest:

- run: echo "IMAGE_DIGEST=${{ steps.build.outputs.digest }}"

Docs / examples: README (outputs: digest) and Docker Buildx image/exporter docs (push-by-digest / type=image). [1][2][3]

Sources:
[1] docker/build-push-action README (outputs: imageid, digest, metadata).
[2] Docker docs — GitHub Actions multi-platform example using outputs + push-by-digest.
[3] Docker docs — image exporter parameters (push-by-digest, name, push).


🌐 Web query:

GitHub Actions services container digest image pull

💡 Result:

Yes — you can pin a GitHub Actions service container to a digest by setting the service image to the fully-qualified name with @sha256:. For private registries use the service credentials block to authenticate.

Example:
services:
  mydb:
    image: ghcr.io/owner/mydb@sha256:0123456789abcdef...
    credentials:
      username: ${{ github.actor }}
      password: ${{ secrets.GHCR_PAT }}
    ports:
      - 5432:5432

Notes:

  • The image@sha256:... form pulls the exact image digest (not a mutable tag). [1][2]
  • If the registry requires auth, provide credentials (username/password) under jobs.<job_id>.services.<service_id>.credentials. [1][2]

Sources:

  • Workflow syntax — jobs.<job_id>.services.<service_id>.image and credentials. [1]
  • Pulling container images by digest (docs for container registries / docker). [2]

🌐 Web query:

GitHub Actions needs dependency service container image registry propagation delay

💡 Result:

Short answer
GitHub Actions can encounter a registry propagation/replication delay when you push an image (especially to GHCR) and immediately try to pull that same tag as a service container in the same or a downstream workflow. Workarounds: publish and reference the image by digest or a unique tag (e.g., commit SHA), or add a deterministic retry/wait before pulling. (docs.github.com)

Why it happens (one line)
Registry backends and CDNs may not make a newly pushed tag/manifest visible instantaneously across all endpoints; using a mutable tag (like latest) can therefore race with image availability. (docs.github.com)

Practical options (choose one)

  • Push and consume by digest: after push capture the image digest (sha256) and use ghcr.io/your/image@sha256:... when starting the service — this avoids tag-replication races. (docs.github.com)
  • Use a unique, immutable tag per run (e.g., ${{ github.sha }}): push ghcr.io/...:${{ github.sha }} and reference that tag; avoids collisions and cache/replication ambiguity. (learn.microsoft.com)
  • Retry/wait: if you must use the same tag, add a short sleep + retry loop (docker pull with backoff) before the service container step or implement retries in the job that pulls the image. This is a pragmatic fallback. (stackoverflow.com)
  • Ensure proper credentials/visibility: for GHCR internal/private images make sure visibility and credentials (GITHUB_TOKEN or PAT with read:packages) are correct — auth issues can look like propagation failures. (vito.io)

Example minimal flow (digest approach)

  1. Build & push, capture digest (docker inspect or build-push action output).
  2. Start service with image: ghcr.io/ORG/IMAGE@sha256:... (or set image input to that digest). (docs.github.com)

If you want, tell me which registry (ghcr.io or Docker Hub), whether the image is private/internal, and the workflow fragment — I’ll give a concrete snippet (digest or retry) you can drop into your workflow.



🌐 Web query:

GitHub Actions workflow job needs dependency timing guarantee DockerHub image availability

💡 Result:

Short answer: don’t rely on “image appearing globally on Docker Hub” as the only signal — either (A) pass a concrete artifact/digest from the build job and have the consumer job verify/pull with retries, or (B) avoid Docker Hub replication by sharing the image as an artifact or using GHCR (or another registry) which is faster/tighter-integrated with Actions. Examples below.

Why: jobs are isolated runners so a push-to-registry can race with a downstream pull; using job outputs/artifacts or explicit pull verification eliminates the timing race. (docs.github.com)

Options (concise):

  1. Pass image digest + explicit pull-with-retries
  • Build & push in job A, set job output to the image digest/tag.
  • Job B needs: jobA, then loop until docker pull IMAGE:TAG succeeds (exponential backoff + timeout).
  • Benefit: you know exact digest to pull and you explicitly wait for availability.

Minimal snippet (conceptual)

  • jobA steps:
    • build & docker push
    • echo "image=repo/name:tag" >> $GITHUB_OUTPUT
  • jobB (needs: jobA) steps:
    • IMAGE=${{ needs.jobA.outputs.image }}
    • attempt: for i in 1..10; do docker pull "$IMAGE" && break || sleep $((i*i)); done
      (Use a reasonable max attempts/timeout.)
  1. Share the image file between jobs (no registry dependency)
  • Build and export image to a tar (docker save / buildx outputs type=docker,dest=...) and actions/upload-artifact in jobA; jobB downloads and docker load.
  • Guarantee: downstream gets the exact image without any registry delay. Docker docs show this pattern. (docs.docker.com)
  1. Use GitHub Container Registry (GHCR) or another registry with stronger integration
  • GHCR has tighter GitHub integration and often avoids cross-registry replication delays; it’s simpler to authenticate from Actions and you can still pass the image digest as a job output. Consider GHCR if you control the registry choice. (jfrog.com)

Extra: if you must use Docker Hub and want an API check, query the registry manifest endpoint for the pushed tag (requires token for private images) and wait until the manifest HEAD/GET succeeds — then pull. (This is an implementation detail of option 1.)

Which to pick:

  • If you need absolute timing guarantee and want to avoid network/regional replication issues → use artifacts (option 2).
  • If you want real registry-based CI/CD and minimal artifacts → push, export exact digest as job output, then poll/pull with retries (option 1).
  • If you can change registry → prefer GHCR for tighter GitHub Actions integration (option 3). (docs.github.com)

If you want, I can:

  • provide a ready-to-copy workflow YAML implementing option 1 (digest + pull-with-retries), or
  • provide a workflow implementing option 2 (upload/download docker image artifact). Which do you prefer?



Use the image digest instead of the :latest tag to avoid a registry propagation race.

The benchmark service pulls ogis:latest after the docker job completes, but DockerHub's registry may not have fully replicated the newly pushed image across all endpoints. This creates a race condition where the service could pull a stale image or fail to start.

Capture the image digest from docker/build-push-action@v6 and reference it in the service:

  docker:
    # ... existing steps ...
    steps:
      # ... existing steps ...
      - name: Build and push Docker image
        id: build
        uses: docker/build-push-action@v6
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          provenance: mode=max
          sbom: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  benchmark:
    name: Benchmark
    runs-on: ubuntu-latest
    needs: [docker]
    if: github.ref == 'refs/heads/main' && github.event_name == 'push'

    services:
      ogis:
        image: ${{ vars.DOCKERHUB_USERNAME }}/ogis@${{ needs.docker.outputs.digest }}
        ports:
          - 3000:3000

The docker job must export the digest output for the benchmark job to reference it.

Comment on lines +7 to +13
export const config = {
  mode: __ENV.MODE || 'sequential', // 'sequential' or 'concurrent'
  requestsPerTemplate: parseInt(__ENV.REQUESTS_PER_TEMPLATE || '50'),
  vus: parseInt(__ENV.VUS || '100'),
  duration: __ENV.DURATION || '60s',
  baseUrl: __ENV.BASE_URL || 'http://localhost:3000',
};
Contributor


⚠️ Potential issue | 🟠 Major

Validate parsed integer environment variables.

parseInt() returns NaN if the environment variable contains non-numeric values. This could cause silent failures or unexpected behavior in the benchmark scenarios.

Apply this diff to add validation:

 export const config = {
   mode: __ENV.MODE || 'sequential', // 'sequential' or 'concurrent'
-  requestsPerTemplate: parseInt(__ENV.REQUESTS_PER_TEMPLATE || '50'),
-  vus: parseInt(__ENV.VUS || '100'),
+  requestsPerTemplate: parseInt(__ENV.REQUESTS_PER_TEMPLATE || '50', 10),
+  vus: parseInt(__ENV.VUS || '100', 10),
   duration: __ENV.DURATION || '60s',
   baseUrl: __ENV.BASE_URL || 'http://localhost:3000',
 };
+
+// Validate numeric config
+if (isNaN(config.requestsPerTemplate) || config.requestsPerTemplate < 1) {
+  throw new Error(`Invalid REQUESTS_PER_TEMPLATE: ${__ENV.REQUESTS_PER_TEMPLATE}`);
+}
+if (isNaN(config.vus) || config.vus < 1) {
+  throw new Error(`Invalid VUS: ${__ENV.VUS}`);
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export const config = {
  mode: __ENV.MODE || 'sequential', // 'sequential' or 'concurrent'
  requestsPerTemplate: parseInt(__ENV.REQUESTS_PER_TEMPLATE || '50'),
  vus: parseInt(__ENV.VUS || '100'),
  duration: __ENV.DURATION || '60s',
  baseUrl: __ENV.BASE_URL || 'http://localhost:3000',
};
export const config = {
  mode: __ENV.MODE || 'sequential', // 'sequential' or 'concurrent'
  requestsPerTemplate: parseInt(__ENV.REQUESTS_PER_TEMPLATE || '50', 10),
  vus: parseInt(__ENV.VUS || '100', 10),
  duration: __ENV.DURATION || '60s',
  baseUrl: __ENV.BASE_URL || 'http://localhost:3000',
};

// Validate numeric config
if (isNaN(config.requestsPerTemplate) || config.requestsPerTemplate < 1) {
  throw new Error(`Invalid REQUESTS_PER_TEMPLATE: ${__ENV.REQUESTS_PER_TEMPLATE}`);
}
if (isNaN(config.vus) || config.vus < 1) {
  throw new Error(`Invalid VUS: ${__ENV.VUS}`);
}
🤖 Prompt for AI Agents
In k6/lib/config.js around lines 7 to 13, the integer env vars are parsed with
parseInt() which can return NaN for non-numeric input; update parsing to use
parseInt(value, 10) and then validate the result with Number.isFinite or isNaN,
falling back to the existing defaults when parsing fails (e.g., if parsed value
is NaN) so vus and requestsPerTemplate always become valid numbers; optionally
log or warn when an env value is invalid before using the default.

Comment on lines +18 to +37
export function buildScenarios(templates) {
  if (config.mode === 'sequential') {
    return {
      sequential: {
        executor: 'per-vu-iterations',
        vus: 1,
        iterations: templates.length * config.requestsPerTemplate,
        maxDuration: '30m',
      },
    };
  }

  return {
    concurrent: {
      executor: 'constant-vus',
      vus: config.vus,
      duration: config.duration,
    },
  };
}
\ No newline at end of file
Contributor


⚠️ Potential issue | 🔴 Critical

Add guard against empty templates array.

If templates is empty or undefined, the sequential mode will create a scenario with 0 iterations, and template selection in the test function will fail with division by zero or undefined behavior.

Apply this diff to add validation:

 export function buildScenarios(templates) {
+  if (!templates || templates.length === 0) {
+    throw new Error('No templates provided to buildScenarios');
+  }
+
   if (config.mode === 'sequential') {
     return {
       sequential: {
         executor: 'per-vu-iterations',
         vus: 1,
         iterations: templates.length * config.requestsPerTemplate,
         maxDuration: '30m',
       },
     };
   }

   return {
     concurrent: {
       executor: 'constant-vus',
       vus: config.vus,
       duration: config.duration,
     },
   };
 }
📝 Committable suggestion


🤖 Prompt for AI Agents
In k6/lib/config.js around lines 18 to 37, add a guard that validates the
templates parameter before building scenarios: if templates is undefined or has
length 0, throw a clear Error (e.g., "templates must be a non-empty array") so
sequential mode cannot compute zero iterations and cause divide-by-zero or
undefined behavior; ensure this validation occurs before any config.mode checks
so both sequential and concurrent flows are protected.

@@ -0,0 +1,16 @@
import YAML from 'https://cdnjs.cloudflare.com/ajax/libs/js-yaml/4.1.0/js-yaml.min.js';

⚠️ Potential issue | 🟠 Major

CDN dependency could fail in CI environments.

Loading the YAML library from cdnjs.cloudflare.com creates a runtime dependency on external infrastructure. If the CDN is unreachable or slow, the benchmark will fail.

Consider vendoring the js-yaml library locally or using k6's extension system for YAML parsing. If CDN usage is intentional, add error handling around the import.

#!/bin/bash
# Check if k6 supports local imports or has built-in YAML support
rg -n "import.*from.*http" k6/
🤖 Prompt for AI Agents
In k6/lib/templates.js around line 1, the file imports js-yaml from a CDN which
can fail in CI; replace the external CDN import with a vendored/local copy or
use k6's extension mechanism: add the js-yaml source into the repo (e.g.,
k6/lib/vendor/js-yaml.min.js) and change the import to that local path, or
remove the CDN import and call into a k6-supported YAML parser extension; if you
must keep the CDN, wrap the import/use in a runtime check/fallback that loads
the local vendored file when the CDN import fails and surface a clear error
message so CI doesn't silently fail.

Comment on lines +6 to +9
export function loadTemplates(path = '../templates.yaml') {
const yaml = YAML.load(open(path));
return yaml.templates.map((t) => t.name);
}

🛠️ Refactor suggestion | 🟠 Major

Add error handling for missing or malformed YAML.

The function assumes the YAML file exists and has a templates array with name properties. If the file is missing or malformed, this will cause a runtime error.

Apply this diff to add basic error handling:

 export function loadTemplates(path = '../templates.yaml') {
   const yaml = YAML.load(open(path));
+  if (!yaml || !Array.isArray(yaml.templates)) {
+    throw new Error(`Invalid templates file: ${path}`);
+  }
-  return yaml.templates.map((t) => t.name);
+  return yaml.templates.map((t) => {
+    if (!t.name) {
+      throw new Error(`Template missing name property: ${JSON.stringify(t)}`);
+    }
+    return t.name;
+  });
 }
🤖 Prompt for AI Agents
In k6/lib/templates.js around lines 6 to 9, the function assumes the YAML file
exists and contains yaml.templates with name fields, which will throw if the
file is missing or malformed; update the function to wrap YAML.load/open in a
try/catch, check that the loaded value is an object with a templates array, and
if not throw or return an empty array (choose consistent behavior) while logging
or rethrowing a clear error; specifically, catch file/parse errors from
YAML.load/open, validate Array.isArray(yaml.templates) before mapping, and
handle missing name properties defensively (e.g., filter out invalid entries) so
callers won't crash on bad input.
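Outside the k6 runtime (where `open()` and the CDN-loaded parser are unavailable), the defensive shape of this suggestion can be sketched as plain JavaScript operating on already-parsed YAML data; the function name `templateNames` is illustrative, not the PR's API:

```javascript
// Extract template names from parsed YAML data, failing fast with a clear
// error instead of crashing later on undefined access.
function templateNames(doc, source = 'templates.yaml') {
  if (!doc || !Array.isArray(doc.templates) || doc.templates.length === 0) {
    throw new Error(`Invalid or empty templates file: ${source}`);
  }
  return doc.templates.map((t, i) => {
    if (!t || typeof t.name !== 'string' || t.name.length === 0) {
      throw new Error(`Template at index ${i} is missing a name in ${source}`);
    }
    return t.name;
  });
}

// Valid input yields the name list; malformed input fails with a message
// naming the source file, so CI logs point directly at the bad YAML.
const names = templateNames({ templates: [{ name: 'minimal' }, { name: 'card' }] });
// names → ['minimal', 'card']
```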

Comment on lines +10 to +11
const templates = loadTemplates();
const templateMetrics = createTemplateMetrics(templates);

⚠️ Potential issue | 🔴 Critical

Validate templates array is not empty.

If loadTemplates() returns an empty array (e.g., due to an empty YAML file), the subsequent setup and test execution will fail with unclear error messages.

Apply this diff to add validation:

 // Setup
 const templates = loadTemplates();
+if (templates.length === 0) {
+  throw new Error('No templates loaded from templates.yaml');
+}
 const templateMetrics = createTemplateMetrics(templates);
🤖 Prompt for AI Agents
In k6/ogis.js around lines 10 to 11, ensure the templates array returned by
loadTemplates() is validated immediately after loading: check that it is an
Array and has length > 0, and if not raise an explicit error (throw new Error
with a clear message mentioning the YAML/source file and that templates are
empty) or log an error and abort execution; this will prevent later confusing
failures by failing fast with a descriptive message.

Copilot AI review requested due to automatic review settings December 16, 2025 03:55
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (4)
docker-compose.benchmark.yml (1)

2-5: Use image digest instead of the :latest tag to avoid registry propagation race.

The ogis:latest image may not be immediately available after the docker job pushes it, causing the benchmark job to pull a stale image or fail. This issue was previously identified in the workflow file.

The workflow should capture the image digest from the build step and reference it here, or use a unique tag like the commit SHA.

k6/lib/config.js (2)

9-15: Validate parsed integer environment variables.

parseInt() without a radix can misinterpret values (e.g., leading zeros as octal) and returns NaN for non-numeric input, potentially causing silent failures in benchmark scenarios.

Add radix parameter and validate the results as suggested in the previous review.


20-51: Add guard against empty templates array.

If templates is empty or undefined, the sequential mode creates a scenario with 0 iterations, and template selection in test functions will fail. The past review correctly identified this issue.

Add validation at the start of buildScenarios to ensure templates is a non-empty array.

k6/ogis.js (1)

7-9: Validate templates array is not empty.

If loadTemplates() returns an empty array, subsequent operations will fail with unclear errors:

  • selectTemplate() will divide by zero (line 19)
  • Metric access will fail (lines 31, 43)
  • The script will not execute properly

This issue was previously flagged and remains unresolved.

Apply this diff to add validation:

 const templates = loadTemplates();
+if (!templates || templates.length === 0) {
+  throw new Error('No templates loaded from templates.yaml');
+}
 const templateMetrics = createTemplateMetrics(templates);
🧹 Nitpick comments (3)
comparisons/vercel-og/package.json (1)

14-14: Consider pinning the exact version for benchmark reproducibility.

Using ^0.8.6 allows minor and patch updates that could introduce performance variations or behavioral changes, making benchmark results less reproducible over time. For a benchmark suite, exact version pinning ensures consistent results.

Apply this diff to pin the exact version:

-    "@vercel/og": "^0.8.6"
+    "@vercel/og": "0.8.6"
comparisons/vercel-og/server.js (1)

6-97: Add request timeout to prevent resource exhaustion.

The server has no request timeout, so slow or stalled image generation could cause requests to hang indefinitely, exhausting available resources during benchmarking.

Apply this diff to add a timeout:

 const server = createServer(async (req, res) => {
+  req.setTimeout(30000, () => {
+    res.writeHead(408, { 'Content-Type': 'text/plain' });
+    res.end('Request Timeout');
+  });
+
   const url = new URL(req.url, `http://localhost:${PORT}`);
k6/lib/runner.js (1)

21-35: Consider adding HTTP timeout configuration.

The http.get(url) call has no explicit timeout, relying on k6's default. For benchmarking consistency, explicitly setting a timeout would ensure deterministic behavior and prevent hung requests from skewing results.

You can set a timeout in the k6 options or per-request:

const res = http.get(url, { timeout: '30s' });
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e6bfb5a and c8dd7ca.

⛔ Files ignored due to path filters (1)
  • comparisons/vercel-og/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (12)
  • .github/workflows/rust.yml (1 hunks)
  • comparisons/vercel-og/Dockerfile (1 hunks)
  • comparisons/vercel-og/package.json (1 hunks)
  • comparisons/vercel-og/server.js (1 hunks)
  • docker-compose.benchmark.yml (1 hunks)
  • k6/lib/config.js (1 hunks)
  • k6/lib/metrics.js (1 hunks)
  • k6/lib/runner.js (1 hunks)
  • k6/lib/summary.js (1 hunks)
  • k6/lib/utils.js (1 hunks)
  • k6/ogis.js (1 hunks)
  • k6/vercel-og.js (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • comparisons/vercel-og/Dockerfile
🚧 Files skipped from review as they are similar to previous changes (1)
  • k6/lib/metrics.js
🧰 Additional context used
🧬 Code graph analysis (6)
k6/vercel-og.js (5)
k6/lib/metrics.js (1)
  • createDefaultMetrics (18-25)
k6/lib/config.js (1)
  • createOptions (67-76)
k6/lib/utils.js (2)
  • randomTitle (22-35)
  • randomDescription (40-50)
k6/lib/runner.js (2)
  • createTestRunner (21-36)
  • res (24-24)
k6/lib/summary.js (1)
  • createSummaryHandler (74-90)
k6/lib/config.js (1)
k6/ogis.js (1)
  • templates (7-7)
k6/ogis.js (5)
k6/lib/templates.js (1)
  • loadTemplates (6-9)
k6/lib/metrics.js (2)
  • createTemplateMetrics (7-13)
  • createDefaultMetrics (18-25)
k6/lib/config.js (3)
  • createOptions (67-76)
  • config (9-15)
  • config (9-15)
k6/lib/utils.js (3)
  • randomTitle (22-35)
  • randomDescription (40-50)
  • buildHandleSummary (55-90)
k6/lib/runner.js (2)
  • createTestRunner (21-36)
  • res (24-24)
comparisons/vercel-og/server.js (1)
k6/lib/runner.js (2)
  • res (24-24)
  • url (23-23)
k6/lib/utils.js (1)
k6/lib/templates.js (1)
  • sanitize (14-16)
k6/lib/summary.js (3)
k6/lib/utils.js (4)
  • m (57-57)
  • metrics (58-58)
  • markdown (73-86)
  • defaultTable (71-71)
k6/ogis.js (2)
  • handleSummary (50-50)
  • handleSummary (50-50)
k6/vercel-og.js (2)
  • handleSummary (25-25)
  • handleSummary (25-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Agent
  • GitHub Check: Code Coverage
  • GitHub Check: build
  • GitHub Check: Code Coverage
🔇 Additional comments (13)
comparisons/vercel-og/server.js (1)

9-14: LGTM!

The health check endpoint is properly implemented and will work well with container orchestration and load balancer health checks.

k6/lib/runner.js (1)

5-11: LGTM!

The default checks are well-suited for validating OG image responses, and the optional chaining safely handles missing headers.

k6/vercel-og.js (1)

1-25: LGTM!

The test script is well-structured, properly encodes query parameters, and correctly exports a single runner for the Vercel OG endpoint that has no template variants. The integration with the metrics and summary modules is clean.

k6/lib/utils.js (1)

55-89: LGTM!

The summary builder safely handles missing metrics with optional chaining and filtering, sorts results appropriately, and generates well-structured Markdown output. The integration with template sanitization and metrics extraction is clean.

k6/lib/summary.js (5)

1-8: LGTM!

The formatMs utility function is clean and handles undefined values appropriately.


10-20: LGTM!

Good use of optional chaining and defaults to handle missing metrics gracefully.


22-36: LGTM!

The overall metrics table generation is well-structured with appropriate fallbacks for missing data.


38-67: LGTM!

Both generateDefaultTable and generateOutputFiles are well-implemented. The conditional return of empty string in generateDefaultTable ensures clean output when default metrics are unavailable.


69-90: LGTM!

The createSummaryHandler factory function provides a clean, reusable pattern for simpler benchmarks. The intentional absence of per-template metrics differentiates it appropriately from buildHandleSummary in utils.js, which serves OGIS's more complex multi-template scenarios.

k6/ogis.js (4)

17-21: Template selection logic is sound.

The sequential and concurrent selection modes are implemented correctly. The function properly depends on templates validation added earlier to prevent division by zero.


23-25: LGTM!

Proper use of encodeURIComponent for URL parameters ensures safe handling of special characters in generated titles and descriptions.


27-34: Baseline runner implementation is correct.

The baseline scenario appropriately tests only the default template for fair comparison with Vercel OG. Recording metrics to both templateMetrics and defaultMetrics enables comprehensive analysis.


36-50: LGTM!

The default runner correctly implements dynamic template selection and appropriate metrics recording. The conditional tracking of defaultMetrics only for the default template (lines 44-46) ensures accurate comparison data.

Comment on lines +164 to +179
- name: Start services
uses: hoverkraft-tech/compose-action@v2
with:
compose-file: docker-compose.benchmark.yml

- name: Install k6
uses: grafana/setup-k6-action@v1

- name: Create results directory
run: mkdir -p k6/results

- name: Run benchmark
run: k6 run k6/${{ matrix.test }}.js
env:
MODE: ${{ matrix.mode }}
BASE_URL: http://localhost:${{ matrix.port }}

🛠️ Refactor suggestion | 🟠 Major

Add health check polling before running benchmarks.

The benchmark starts immediately after docker-compose up, but the services may not be ready to accept requests. This could cause initial benchmark requests to fail or produce inaccurate results.

Add a health check wait step between service startup and benchmark execution:

      - name: Start services
        uses: hoverkraft-tech/compose-action@v2
        with:
          compose-file: docker-compose.benchmark.yml

      - name: Wait for services
        run: |
          timeout 60 bash -c 'until curl -f http://localhost:${{ matrix.port }}/health; do sleep 2; done'

      - name: Install k6
        uses: grafana/setup-k6-action@v1
🤖 Prompt for AI Agents
.github/workflows/rust.yml lines 164-179: insert a new job step between "Start
services" and "Install k6" that waits for the service health endpoint to become
responsive before proceeding; implement a run step that repeatedly curls
http://localhost:${{ matrix.port }}/health (failing fast) in a loop with a total
timeout (e.g. 60s) and short sleep intervals (e.g. 2s) until it succeeds, so the
benchmark only runs once services are healthy; ensure the step uses bash and
honors the matrix.port variable and adjust the endpoint or timeout as needed.

Comment on lines +11 to +13
const DEFAULT_TEMPLATE = 'minimal';

export const options = createOptions(templates);

⚠️ Potential issue | 🔴 Critical

Validate DEFAULT_TEMPLATE exists in templates array.

The hardcoded 'minimal' template is not validated to exist in the loaded templates. If 'minimal' is missing from templates.yaml, the baseline runner will fail at line 31 when accessing templateMetrics[DEFAULT_TEMPLATE].

Apply this diff to add validation after the templates are loaded:

 const templates = loadTemplates();
+if (!templates || templates.length === 0) {
+  throw new Error('No templates loaded from templates.yaml');
+}
+
+const DEFAULT_TEMPLATE = 'minimal';
+if (!templates.includes(DEFAULT_TEMPLATE)) {
+  throw new Error(`Default template '${DEFAULT_TEMPLATE}' not found in templates.yaml`);
+}
+
 const templateMetrics = createTemplateMetrics(templates);
 const defaultMetrics = createDefaultMetrics();
-
-const DEFAULT_TEMPLATE = 'minimal';

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In k6/ogis.js around lines 11 to 13 (and guarding use at ~line 31), the
hardcoded DEFAULT_TEMPLATE = 'minimal' is not validated against the loaded
templates, which will cause a runtime failure when accessing
templateMetrics[DEFAULT_TEMPLATE]; after loading or constructing the templates
array/object, add a validation check that DEFAULT_TEMPLATE exists (e.g.,
templates.includes(DEFAULT_TEMPLATE) or DEFAULT_TEMPLATE in templates), and if
it does not exist, throw or log a clear error and either choose a safe fallback
template or abort early so the runner fails with an actionable message instead
of an undefined access.
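The prompt's two options — abort early or choose a safe fallback — can be sketched together in plain JavaScript; `resolveBaseline` and its option flag are hypothetical names, not code from the PR:

```javascript
// Resolve the baseline template name: either fail fast when the preferred
// template is missing, or (if the caller opts in) fall back to the first
// available template with a warning.
function resolveBaseline(templates, preferred, { fallback = false } = {}) {
  if (templates.includes(preferred)) return preferred;
  if (fallback && templates.length > 0) {
    console.warn(`Template '${preferred}' missing; falling back to '${templates[0]}'`);
    return templates[0];
  }
  throw new Error(`Baseline template '${preferred}' not found in: ${templates.join(', ')}`);
}

const strict = resolveBaseline(['minimal', 'card'], 'minimal');        // 'minimal'
const fell = resolveBaseline(['card'], 'minimal', { fallback: true }); // 'card'
```

Either behavior gives an actionable message at startup instead of an undefined access deep inside the metrics recording.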

@twangodev twangodev merged commit f9e97fa into main Dec 16, 2025
19 checks passed
@twangodev twangodev deleted the feat/benchmarks branch December 16, 2025 04:03
Copilot AI left a comment

Pull request overview

Copilot reviewed 35 out of 45 changed files in this pull request and generated 17 comments.

Files not reviewed (2)
  • benchmarks/vercel-og/package-lock.json: Language not supported
  • comparisons/vercel-og/package-lock.json: Language not supported

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +27 to +28
// Minimal template - exactly matches ogis minimal.svg layout
// Title at y=250, Description at y=300 (from SVG coordinates)
Copilot AI Dec 16, 2025

The comment claims to match "ogis minimal.svg layout" with specific y-coordinates, but the actual positioning in the code uses top: 250px and top: 350px, which may not exactly match the SVG coordinates mentioned in the comment (250 and 300). The description is positioned at 350px, not 300px as the comment suggests. Ensure the comment accurately reflects the implementation or update the positioning to match the stated intent.

"license": "ISC",
"type": "module",
"dependencies": {
"@vercel/og": "^0.8.6"
Copilot AI Dec 16, 2025

The @vercel/og package version is pinned to 0.8.6 using a caret (^), which allows minor and patch updates. For reproducible benchmarks, consider using an exact version (without the caret) to ensure consistent benchmark results across different runs and environments. Version drift could affect performance comparisons.

Comment on lines +1 to +8
/**
* Configuration from environment variables
*
* Modes:
* - sequential: one-by-one testing of all templates (for per-template stats)
* - baseline: spam only default template (for head-to-head comparison)
* - concurrent: full send all templates randomly
*/
Copilot AI Dec 16, 2025

The comment describes three modes including 'baseline', but the mode logic checks for 'sequential', 'baseline', and defaults to 'concurrent'. The comment on line 6 says "spam only default template" which is informal language. Consider using more professional terminology like "test only the default template" or "benchmark default template exclusively".

Comment on lines +152 to +159
matrix:
test: [ogis, vercel-og]
mode: [sequential, baseline, concurrent]
include:
- test: ogis
port: 3000
- test: vercel-og
port: 3001
Copilot AI Dec 16, 2025

The matrix includes both 'ogis' and 'vercel-og' tests, but the vercel-og comparison server needs to be built from the repository. If the vercel-og Docker image isn't already available (cached or pre-built), the benchmark might fail or take much longer than expected. Consider adding a comment or documentation about this dependency.

WORKDIR /app

COPY --chown=node:node package*.json ./
RUN npm ci --only=production
Copilot AI Dec 16, 2025

The npm ci command uses the --only=production flag, which is deprecated in newer npm versions. Consider using --omit=dev instead for forward compatibility with npm v7+.

Suggested change
RUN npm ci --only=production
RUN npm ci --omit=dev

const templateMetrics = createTemplateMetrics(templates);
const defaultMetrics = createDefaultMetrics();

const DEFAULT_TEMPLATE = 'minimal';
Copilot AI Dec 16, 2025

The constant is named DEFAULT_TEMPLATE with value 'minimal', but this hardcodes an assumption about which template is the "default" in the OGIS system. If the default template in the OGIS configuration changes, this hardcoded value would be out of sync. Consider either reading this from configuration or renaming to BASELINE_TEMPLATE to clarify its purpose for comparison testing.

};
// Wait for a render slot with timeout (defers response, only 503 if truly overloaded)
const RENDER_TIMEOUT: Duration = Duration::from_secs(5);
let _render_permit =
Copilot AI Dec 16, 2025

The render permit variable is prefixed with an underscore but is meaningfully used: it gates access to the render semaphore via RAII, releasing the permit when it is dropped. A bare underscore-prefixed name suggests the binding is intentionally unused, which could mislead future maintainers. Since the binding is never read directly (so the prefix is needed to suppress the unused-variable lint), consider a more descriptive name such as `_render_guard`, plus a comment noting that the binding must stay alive for the duration of the render.

}
};
// Wait for a render slot with timeout (defers response, only 503 if truly overloaded)
const RENDER_TIMEOUT: Duration = Duration::from_secs(5);
Copilot AI Dec 16, 2025

The render timeout is defined as a const within the function. Consider moving this to a configuration setting or at minimum to the top of the file as a module-level constant. This would make it easier to tune the timeout without modifying function logic, especially important for benchmarking scenarios where different timeout values might be tested.

apt-get install -y ca-certificates curl && \
rm -rf /var/lib/apt/lists/*
rm -rf /var/lib/apt/lists/* && \
useradd -r -s /bin/false ogis
Copilot AI Dec 16, 2025

The new user 'ogis' is created with /bin/false as the shell, which is good for security. However, the user is created without specifying a home directory. Consider adding --no-create-home flag explicitly to make the intent clear, or if a home directory is needed for any purpose, specify it explicitly.

Suggested change
useradd -r -s /bin/false ogis
useradd -r --no-create-home -s /bin/false ogis

@@ -0,0 +1,90 @@
import { sanitize } from './templates.js';
import { extractMetrics, formatMs, generateOverallTable, generateDefaultTable, generateOutputFiles } from './summary.js';
Copilot AI Dec 16, 2025

Unused import formatMs.
