Skip to content

fix: guard startup against mixed API module updates#168

Open
x86txt wants to merge 3 commits intoPegaProx:mainfrom
x86txt:fix/startup-integrity-check
Open

fix: guard startup against mixed API module updates#168
x86txt wants to merge 3 commits intoPegaProx:mainfrom
x86txt:fix/startup-integrity-check

Conversation

@x86txt
Copy link
Copy Markdown

@x86txt x86txt commented Mar 16, 2026

Problem

The update.sh fallback file downloader (used when no release archive is available) has two issues that can leave an installation in a mixed-version state, causing import tracebacks on next startup:

  1. Silent download failures — individual file downloads that fail are swallowed by || true, so the update continues and completes "successfully" even when critical files are missing or stale.
  2. Stale file leftovers — the rsync used to copy extracted archive contents into the install directory does not use
    --delete, so files removed in a newer release persist from the previous version and can cause mixed-version imports.

Once in this state, PegaProx crashes on startup with an unhelpful Python traceback (e.g. ModuleNotFoundError) with no guidance on how to recover.

There is also no way to validate installation integrity without fully launching the server.

Fix

1. Harden update.sh against partial updates

  • Track per-file download failures instead of silently continuing.
  • If any file fails to download, abort the update and restore from the pre-update backup.
  • Add --delete to the rsync command so removed upstream files do not linger as stale leftovers.

2. Add startup blueprint integrity check

  • Before registering blueprints, validate_blueprint_modules() verifies every required API module is importable via importlib.util.find_spec.
  • If any modules are missing, startup raises a RuntimeError with an actionable message naming the missing modules and suggesting ./update.sh --force.
  • app.py catches this specific error and prints a clean one-liner instead of a full traceback, then exits with code 1 for systemd/journal visibility.

3. Add --check-startup CLI preflight mode

  • New --check-startup flag runs the integrity check and exits without launching services.
  • Useful for CI, post-update validation, and manual troubleshooting.

Impact

  • Partial updates no longer silently corrupt the installupdate.sh now aborts and rolls back when any file download fails in fallback mode.
  • Stale files from previous versions are cleaned uprsync --delete ensures the install directory mirrors upstream exactly (user config, SSL, logs, and backups are still excluded).
  • Startup failures are actionable — users see a clear error message with remediation steps instead of a raw Python traceback.
  • Preflight validation without downtime--check-startup lets operators verify installation integrity without starting the server.
  • No new dependencies or config keys introduced.
  • Unit tests included for both the module validation logic and the CLI preflight mode.

Files Changed

File Change
pegaprox/api/__init__.py Add validate_blueprint_modules() and call it before blueprint registration
pegaprox/app.py Catch startup integrity RuntimeError, print clean error, exit 1
pegaprox_multi_cluster.py Add check_startup_integrity() and --check-startup CLI flag
update.sh Track download failures, abort + restore on partial failure, add rsync --delete
tests/test_startup_check.py Unit tests for validate_blueprint_modules()
tests/test_cli_startup_check.py Unit tests for --check-startup CLI mode

Matt added 3 commits March 16, 2026 13:16
Add a blueprint integrity check that detects missing required API modules and exits with an actionable remediation message instead of a traceback. Harden fallback updater behavior to avoid partial file updates that can cause mixed-version imports.

Made-with: Cursor
Add a --check-startup command that validates required API modules and exits with actionable remediation instructions without launching services, with unit coverage for pass/fail paths.

Made-with: Cursor
Detect missing runtime deps such as certifi/charset_normalizer in --check-startup and during app bootstrap, and provide actionable remediation commands instead of raw tracebacks.

Made-with: Cursor
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant