chore: add contributor account loading to bot detection workflow #15366
mnkiefer commented on Feb 13, 2026
- Enhances the bot detection workflow by expanding the allowlist of trusted accounts to include repository contributors, in addition to existing organization members.
Pull request overview
This PR expands the trusted-account allowlist used by the bot-detection workflow to also include repository contributors (in addition to existing collaborators and org members), reducing noise from established contributors during suspicious-activity triage.
Changes:
- Add a loadContributorAccounts() step that loads repos.listContributors results into the workflow’s allowlist set.
- Invoke contributor loading before org member loading during precompute.
- Regenerate the corresponding locked workflow YAML.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| .github/workflows/bot-detection.md | Adds contributor-based allowlisting logic to the bot detection precompute script. |
| .github/workflows/bot-detection.lock.yml | Updates the compiled/locked workflow to match the new contributor allowlisting logic. |
    for (const contributor of contributors) {
      if (contributor?.login) {
        MEMBER_ACCOUNTS.add(String(contributor.login).toLowerCase());
      }
listContributors will include bot users (and any account with a single merged commit). Adding all of them to the allowlist can create blind spots in this bot-detection workflow. Consider filtering out type === 'Bot' / login ending in [bot] (except for the explicitly allowed bots) and/or requiring a minimum contributions threshold before allowlisting.
    if (contributor?.login) {
      MEMBER_ACCOUNTS.add(String(contributor.login).toLowerCase());
    }
MEMBER_ACCOUNTS now contains collaborators, org members, and contributors, so the name is misleading and makes isAllowedAccount harder to reason about. Consider renaming it (e.g., TRUSTED_ACCOUNTS / ALLOWLIST_ACCOUNTS) to reflect its broader meaning.
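A minimal sketch of what the rename could look like, assuming a single shared set and a case-insensitive lookup. TRUSTED_ACCOUNTS is one of the names proposed in the comment; normalizeLogin and trustAccount are hypothetical helpers, and the body of isAllowedAccount shown here is an assumption, since the workflow's actual implementation is not part of this review.

```javascript
// Hypothetical rename: one set for every trusted source
// (collaborators, org members, contributors).
const TRUSTED_ACCOUNTS = new Set();

// Normalize once so that lookups are case-insensitive.
function normalizeLogin(login) {
  return String(login).toLowerCase();
}

// Hypothetical helper shared by all loaders.
function trustAccount(login) {
  if (login) {
    TRUSTED_ACCOUNTS.add(normalizeLogin(login));
  }
}

// Assumed shape of the existing check; the real isAllowedAccount may differ.
function isAllowedAccount(login) {
  return Boolean(login) && TRUSTED_ACCOUNTS.has(normalizeLogin(login));
}

trustAccount("Octocat");
console.log(isAllowedAccount("octocat")); // true
console.log(isAllowedAccount("someone-else")); // false
```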
    try {
      const contributors = await github.paginate(github.rest.repos.listContributors, {
        owner,
        repo,
        per_page: 100,
      });
github.paginate(repos.listContributors) will fetch all contributors across pages, which can be very large for older repos and may increase runtime / hit rate limits. Consider enforcing a hard cap (similar to the other paginated calls in this workflow) and stopping pagination early once the cap is reached.
    -  try {
    -    const contributors = await github.paginate(github.rest.repos.listContributors, {
    -      owner,
    -      repo,
    -      per_page: 100,
    -    });
    +  const MAX_CONTRIBUTORS = 1000;
    +  let collected = 0;
    +  try {
    +    const contributors = await github.paginate(
    +      github.rest.repos.listContributors,
    +      {
    +        owner,
    +        repo,
    +        per_page: 100,
    +      },
    +      (response, done) => {
    +        const pageContributors = response?.data || [];
    +        if (pageContributors.length === 0) {
    +          return [];
    +        }
    +        const remaining = MAX_CONTRIBUTORS - collected;
    +        if (remaining <= 0) {
    +          done();
    +          return [];
    +        }
    +        if (pageContributors.length > remaining) {
    +          const slice = pageContributors.slice(0, remaining);
    +          collected += slice.length;
    +          done();
    +          return slice;
    +        }
    +        collected += pageContributors.length;
    +        return pageContributors;
    +      }
    +    );
    if (contributor?.login) {
      MEMBER_ACCOUNTS.add(String(contributor.login).toLowerCase());
    }
listContributors will include bot users (and any account with a single merged commit). Adding all of them to the allowlist can create blind spots in this bot-detection workflow. Consider filtering out type === 'Bot' / login ending in [bot] (except for the explicitly allowed bots) and/or requiring a minimum contributions threshold before allowlisting.
    -  if (contributor?.login) {
    -    MEMBER_ACCOUNTS.add(String(contributor.login).toLowerCase());
    -  }
    +  const login = contributor?.login;
    +  if (!login) {
    +    continue;
    +  }
    +  const normalized = String(login).toLowerCase();
    +  const isExplicitlyAllowed = ALLOWED_ACCOUNTS.has(normalized);
    +  const isBotType = String(contributor?.type || "").toLowerCase() === "bot";
    +  const loginLooksLikeBot = normalized.endsWith("[bot]");
    +  const contributionsCount =
    +    typeof contributor?.contributions === "number" ? contributor.contributions : 0;
    +  const hasSufficientContributions = contributionsCount >= 2;
    +  if (
    +    !isExplicitlyAllowed &&
    +    (isBotType || loginLooksLikeBot || !hasSufficientContributions)
    +  ) {
    +    continue;
    +  }
    +  MEMBER_ACCOUNTS.add(normalized);
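The filter in this suggestion is easier to check when pulled out into a pure function. The sketch below is one way to do that, not the PR's code: shouldTrustContributor, its allowedAccounts parameter (standing in for ALLOWED_ACCOUNTS), and the sample contributor objects are all hypothetical. The threshold of 2 contributions is taken from the suggestion above.

```javascript
// Sketch of the suggested filter as a pure predicate (hypothetical name).
// allowedAccounts stands in for the workflow's ALLOWED_ACCOUNTS set.
function shouldTrustContributor(contributor, allowedAccounts, minContributions = 2) {
  const login = contributor?.login;
  if (!login) {
    return false;
  }
  const normalized = String(login).toLowerCase();
  // Explicitly allowed accounts bypass the bot and contribution checks.
  if (allowedAccounts.has(normalized)) {
    return true;
  }
  const isBotType = String(contributor.type || "").toLowerCase() === "bot";
  const loginLooksLikeBot = normalized.endsWith("[bot]");
  const contributions =
    typeof contributor.contributions === "number" ? contributor.contributions : 0;
  return !isBotType && !loginLooksLikeBot && contributions >= minContributions;
}

// Made-up sample data for illustration only.
const allowed = new Set(["dependabot[bot]"]);
const sample = [
  { login: "alice", type: "User", contributions: 12 },
  { login: "drive-by", type: "User", contributions: 1 },
  { login: "dependabot[bot]", type: "Bot", contributions: 40 },
  { login: "unknown-app[bot]", type: "Bot", contributions: 40 },
];
const trusted = sample
  .filter((c) => shouldTrustContributor(c, allowed))
  .map((c) => c.login);
console.log(trusted); // [ 'alice', 'dependabot[bot]' ]
```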
    if (contributor?.login) {
      MEMBER_ACCOUNTS.add(String(contributor.login).toLowerCase());
    }
MEMBER_ACCOUNTS now contains collaborators, org members, and contributors, so the name is misleading and makes isAllowedAccount harder to reason about. Consider renaming it (e.g., TRUSTED_ACCOUNTS / ALLOWLIST_ACCOUNTS) to reflect its broader meaning.
    async function loadContributorAccounts() {
      try {
        const contributors = await github.paginate(github.rest.repos.listContributors, {
          owner,
          repo,
          per_page: 100,
        });
        for (const contributor of contributors) {
          if (contributor?.login) {
            MEMBER_ACCOUNTS.add(String(contributor.login).toLowerCase());
          }
        }
github.paginate(repos.listContributors) will fetch all contributors across pages, which can be very large for older repos and may increase runtime / hit rate limits. Consider enforcing a hard cap (similar to the other paginated calls in this workflow) and stopping pagination early once the cap is reached.
    -  async function loadContributorAccounts() {
    -    try {
    -      const contributors = await github.paginate(github.rest.repos.listContributors, {
    -        owner,
    -        repo,
    -        per_page: 100,
    -      });
    -      for (const contributor of contributors) {
    -        if (contributor?.login) {
    -          MEMBER_ACCOUNTS.add(String(contributor.login).toLowerCase());
    -        }
    -      }
    +  const MAX_CONTRIBUTORS = 1000;
    +  async function loadContributorAccounts() {
    +    try {
    +      await github.paginate(
    +        github.rest.repos.listContributors,
    +        {
    +          owner,
    +          repo,
    +          per_page: 100,
    +        },
    +        (response, done) => {
    +          for (const contributor of response.data || []) {
    +            if (contributor?.login) {
    +              MEMBER_ACCOUNTS.add(String(contributor.login).toLowerCase());
    +            }
    +            if (MEMBER_ACCOUNTS.size >= MAX_CONTRIBUTORS) {
    +              done();
    +              break;
    +            }
    +          }
    +          // We are updating MEMBER_ACCOUNTS directly and do not need to accumulate results.
    +          return [];
    +        }
    +      );
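One thing to watch in this variant: the cap is compared against MEMBER_ACCOUNTS.size, so any accounts already in the set (for example collaborators, if they are loaded first) would count toward MAX_CONTRIBUTORS. A possible alternative, sketched below, keeps a separate counter for contributors processed. makeCappedCollector is a hypothetical helper, and runPages is a stand-in for the paginating client written only so that the sketch runs on its own. It is not part of Octokit.

```javascript
// Builds a map callback with the (response, done) shape used in the suggestion.
// The cap counts contributors processed here, not the size of the shared set.
function makeCappedCollector(targetSet, maxContributors) {
  let processed = 0;
  return (response, done) => {
    for (const contributor of response.data || []) {
      if (!contributor?.login) {
        continue;
      }
      targetSet.add(String(contributor.login).toLowerCase());
      processed += 1;
      if (processed >= maxContributors) {
        done(); // ask the paginator to stop requesting further pages
        break;
      }
    }
    // The set is updated directly, so nothing needs to be accumulated.
    return [];
  };
}

// Illustration-only stand-in for a paginating client: feeds pages to the
// callback until done() is called, and reports how many pages were read.
function runPages(pages, mapFn) {
  let stopped = false;
  let fetched = 0;
  for (const data of pages) {
    if (stopped) {
      break;
    }
    fetched += 1;
    mapFn({ data }, () => {
      stopped = true;
    });
  }
  return fetched;
}

const accounts = new Set(["existing-member"]); // populated by an earlier loader
const pages = [
  [{ login: "A" }, { login: "B" }],
  [{ login: "C" }, { login: "D" }],
  [{ login: "E" }],
];
const fetchedPages = runPages(pages, makeCappedCollector(accounts, 3));
console.log(fetchedPages); // 2
console.log([...accounts]); // [ 'existing-member', 'a', 'b', 'c' ]
```

With a cap of 3, the pre-existing entry does not consume any of the budget, and the third page is never requested.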