fix(graph): cap /banks/{bank_id}/graph response size to keep Control Plane parseable on dense banks#1605
Open
connorblack wants to merge 2 commits into
Open
Conversation
… Node-parseable The graph endpoint accepts ``limit`` and the Control Plane UI defaults to 1000. Edge count scales quadratically with node count for densely-linked banks (memory_units cross-linked via memory_links). Empirical measurement on a 14k-memory bank: limit=200 → 23 MiB, limit=500 → 147 MiB, limit=1000 → 596 MiB. Next.js deserializes the response as a single JS string via response.json(); V8 caps string length at 0x1fffffe8 (~512 MiB). Banks past that cap error out with "Cannot create a string longer than 0x1fffffe8 characters" and the UI shows "Failed to fetch graph data" instead of any graph. Server-side clamp at 200 keeps the API usable for high-volume banks while preserving response payload semantics for smaller ones (where 200 is well under typical graph sizes anyway). Graph viz tools are unusable past ~200 nodes regardless, so this isn't a UX regression. Real fix is paginated/streamed graph loading in the UI; this is a guard to prevent the existing UI from breaking on large banks until that lands.
There was a problem hiding this comment.
Pull request overview
Caps the server-side limit parameter for GET /v1/default/banks/{bank_id}/graph to prevent extremely large graph responses that can exceed V8/Next.js parsing limits on dense banks, keeping the Control Plane UI usable at scale.
Changes:
- Clamp
limitinapi_graphto the range[1, 200]to bound response size. - Add inline rationale/comments explaining why the cap is needed (dense-link quadratic edge growth + V8 string cap).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Address Copilot review on vectorize-io#1605: 1. Move limit signature default from 1000 to 200 via Query(...) so OpenAPI docs match runtime behavior. ge=1 added for explicit lower bound. The silent clamp at 200 is preserved (no 422) so existing UI/SDK callers passing limit > 200 keep working with a smaller response. Description on the Query field documents the cap and rationale. 2. Add tests/test_graph_endpoint_clamp.py: - test_graph_limit_above_cap_is_silently_clamped: assert request with limit=1000 returns 200 OK and response.limit == 200, nodes <= 200. - test_graph_limit_below_cap_passes_through: assert request with limit=50 returns 200 OK and response.limit == 50 (no modification).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Cap the
limitparameter onGET /v1/default/banks/{bank_id}/graphserver-side at 200. Edge count scales quadratically with node count for densely-linked banks, and the Control Plane (Next.js) deserializes the response as a single JS string which V8 caps at 0x1fffffe8 (~512 MiB).Repro
On a 14k-memory bank with ~835k memory_links and 17k entities:
limitThe Control Plane's
useState(1000)indata-view.tsxsets the initial fetchLimit, which falls within the API's oldlimit: int = 1000signature. Smaller banks fit; banks past ~10k highly-linked memories silently break the UI.What changes
A single
min(max(...))clamp inside theapi_graphhandler. Smaller queries pass through unchanged; large ones clamp to 200 with no error response. UI continues to function on dense banks instead of erroring out entirely.Why 200
Why server-side
The UI fix would be a one-line change to
data-view.tsx'suseState(1000)→useState(200). Either works; server-side caps protect SDK and direct API consumers too without depending on the UI staying in sync. Open to swapping for a UI-only fix if maintainers prefer.Real fix
This is a guard, not a solution. Long-term the graph endpoint should support pagination or streaming so high-density banks can be explored without truncation. Filing this as a stop-gap so the Control Plane stops 500'ing on production banks at scale.
Test plan
GraphDataResponseJSON.limit ≤ 200requests.pre-commithooks (ruff check --fix,ruff format,ty check) all green.