tag:www.githubstatus.com,2005:/historyGitHub Status - Incident History2025-10-18T02:50:52ZGitHubtag:www.githubstatus.com,2005:Incident/267883902025-10-17T14:12:45Z2025-10-17T14:12:45ZDisruption with push notifications<p><small>Oct <var data-var='date'>17</var>, <var data-var='time'>14:12</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.</p><p><small>Oct <var data-var='date'>17</var>, <var data-var='time'>14:01</var> UTC</small><br><strong>Update</strong> - We're investigating an issue with mobile push notifications. All notification types are affected, but notifications remain accessible in the app's inbox. For 2FA authentication, please open the GitHub mobile app directly to complete login.</p><p><small>Oct <var data-var='date'>17</var>, <var data-var='time'>13:11</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/267545102025-10-14T18:57:11Z2025-10-17T15:32:24ZDisruption with some GitHub services<p><small>Oct <var data-var='date'>14</var>, <var data-var='time'>18:57</var> UTC</small><br><strong>Resolved</strong> - On October 14th, 2025, between 18:26 UTC and 18:57 UTC a subset of unauthenticated requests to the commit endpoint for certain repositories received 503 errors. During the event, the average error rate was 3%, peaking at 3.5% of total requests.<br /><br />This event was triggered by a recent configuration change and some traffic pattern shifts on the service. We were alerted of the issue immediately and made changes to the configuration in order to mitigate the problem. We are working on automatic mitigation solutions and better traffic handling in order to prevent issues like this in the future.</p><p><small>Oct <var data-var='date'>14</var>, <var data-var='time'>18:26</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/267520632025-10-14T16:00:29Z2025-10-17T18:18:35ZDisruption with GPT-5-mini in Copilot<p><small>Oct <var data-var='date'>14</var>, <var data-var='time'>16:00</var> UTC</small><br><strong>Resolved</strong> - On Oct 14th, 2025, between 13:34 UTC and 16:00 UTC the Copilot service was degraded for GPT-5 mini model. On average, 18% of the requests to GPT-5 mini failed due to an issue with our upstream provider.<br /><br />We notified the upstream provider of the problem as soon as it was detected and mitigated the issue by failing over to other providers. The upstream provider has since resolved the issue.<br /><br />We are working to improve our failover logic to mitigate similar upstream failures more quickly in the future.</p><p><small>Oct <var data-var='date'>14</var>, <var data-var='time'>16:00</var> UTC</small><br><strong>Update</strong> - GPT-5-mini is once again available in Copilot Chat and across IDE integrations.<br /><br />We will continue monitoring to ensure stability, but mitigation is complete.</p><p><small>Oct <var data-var='date'>14</var>, <var data-var='time'>15:42</var> UTC</small><br><strong>Update</strong> - We are continuing to see degraded availability for the GPT-5-mini model in Copilot Chat, VS Code and other Copilot products. This is due to an issue with an upstream model provider. 
We continue to work with the model provider to resolve the issue.<br />Other models continue to be available and working as expected.</p><p><small>Oct <var data-var='date'>14</var>, <var data-var='time'>14:50</var> UTC</small><br><strong>Update</strong> - We continue to see degraded availability for the GPT-5-mini model in Copilot Chat, VS Code and other Copilot products. This is due to an issue with an upstream model provider. We continue to work with the model provider to resolve the issue.<br />Other models continue to be available and working as expected.</p><p><small>Oct <var data-var='date'>14</var>, <var data-var='time'>14:07</var> UTC</small><br><strong>Update</strong> - We are experiencing degraded availability for the GPT-5-mini model in Copilot Chat, VS Code and other Copilot products. This is due to an issue with an upstream model provider. We are working with them to resolve the issue.<br />Other models are available and working as expected.</p><p><small>Oct <var data-var='date'>14</var>, <var data-var='time'>14:05</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Copilot</p>tag:www.githubstatus.com,2005:Incident/267020242025-10-09T16:40:52Z2025-10-15T21:35:15ZIncident with Webhooks<p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>16:40</var> UTC</small><br><strong>Resolved</strong> - On October 9th, 2025, between 14:35 UTC and 15:21 UTC, a network device in maintenance mode that was undergoing repairs was brought back into production before repairs were completed. Network traffic traversing this device experienced significant packet loss.<br /><br />Authenticated users of the github.com UI experienced increased latency during the first 5 minutes of the incident. API users experienced up to 7.3% error rates, after which it stabilized to about 0.05% until mitigated. Actions service experienced 24% of runs being delayed for an average of 13 minutes. Large File Storage (LFS) requests experienced minimally increased error rate, with 0.038% of requests erroring.<br /><br />To prevent similar issues, we are enhancing the validation process for device repairs of this category.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>16:39</var> UTC</small><br><strong>Update</strong> - All services have fully recovered.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>16:27</var> UTC</small><br><strong>Update</strong> - Actions has fully recovered but Notifications is still experiencing delays. We will continue to update as the system is fully restored to normal operation.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>16:24</var> UTC</small><br><strong>Update</strong> - Actions is operating normally.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>16:08</var> UTC</small><br><strong>Update</strong> - Pages is operating normally.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>16:04</var> UTC</small><br><strong>Update</strong> - Git Operations is operating normally.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>16:02</var> UTC</small><br><strong>Update</strong> - Actions and Notifications are still experiencing delays as we process the backlog. 
We will continue to update as the system is fully restored to normal operation.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:51</var> UTC</small><br><strong>Update</strong> - Pull Requests is operating normally.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:48</var> UTC</small><br><strong>Update</strong> - Actions is experiencing degraded performance. We are continuing to investigate.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:44</var> UTC</small><br><strong>Update</strong> - We are seeing full recovery in many of our systems, but delays are still expected for actions. We will continue to update as the system is fully restored to normal operation.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:43</var> UTC</small><br><strong>Update</strong> - Webhooks is operating normally.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:40</var> UTC</small><br><strong>Update</strong> - Webhooks is experiencing degraded performance. We are continuing to investigate.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:39</var> UTC</small><br><strong>Update</strong> - Issues is operating normally.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:38</var> UTC</small><br><strong>Update</strong> - Pull Requests is experiencing degraded performance. We are continuing to investigate.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:26</var> UTC</small><br><strong>Update</strong> - API Requests is operating normally.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:25</var> UTC</small><br><strong>Update</strong> - We identified a faulty network component and have removed it from the infrastructure. Recovery has started and we expect full recovery shortly.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:20</var> UTC</small><br><strong>Update</strong> - Pull Requests is experiencing degraded availability. We are continuing to investigate.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:20</var> UTC</small><br><strong>Update</strong> - Git Operations is experiencing degraded performance. We are continuing to investigate.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:17</var> UTC</small><br><strong>Update</strong> - Actions is experiencing degraded availability. We are continuing to investigate.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:11</var> UTC</small><br><strong>Update</strong> - We are investigating widespread reports of delays and increased latency in various services. We will continue to keep users updated on progress toward mitigation.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:09</var> UTC</small><br><strong>Update</strong> - Issues is experiencing degraded availability. We are continuing to investigate.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:09</var> UTC</small><br><strong>Update</strong> - API Requests is experiencing degraded performance. We are continuing to investigate.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>15:09</var> UTC</small><br><strong>Update</strong> - Pages is experiencing degraded performance. 
We are continuing to investigate.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>14:50</var> UTC</small><br><strong>Update</strong> - Actions is experiencing degraded performance. We are continuing to investigate.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>14:45</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded availability for Webhooks</p>tag:www.githubstatus.com,2005:Incident/267013662025-10-09T13:56:06Z2025-10-15T12:45:49ZMultiple GitHub API endpoints are experiencing errors<p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>13:56</var> UTC</small><br><strong>Resolved</strong> - Between 13:39 UTC and 13:42 UTC on Oct 9, 2025, around 2.3% of REST API calls and 0.4% Web traffic were impacted due to the partial rollout of a new feature that had more impact on one of our primary databases than anticipated. When the feature was partially rolled out it performed an excessive number of writes per request which caused excessive latency for writes from other API and Web endpoints and resulted in 5xx errors to customers. <br /><br />The issue was identified by our automatic alerting and reverted by turning down the percentage of traffic to the new feature, which led to recovery of the data cluster and services. <br /><br />We are working to improve the way we roll out new features like this and move the specific writes from this incident to a storage solution more suited to this type of activity. We have also optimized this particular feature to avoid its rollout from having future impact on other areas of the site. We are also investigating how we can even more quickly identify issues like this.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>13:54</var> UTC</small><br><strong>Update</strong> - A feature was partially rolled out that had high impact on one of our primary databases but we were able to roll it back. All services are recovered but we will monitor for recovery before statusing green.</p><p><small>Oct <var data-var='date'> 9</var>, <var data-var='time'>13:52</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/266793342025-10-08T00:05:41Z2025-10-14T20:05:38ZDisruption with some GitHub services<p><small>Oct <var data-var='date'> 8</var>, <var data-var='time'>00:05</var> UTC</small><br><strong>Resolved</strong> - On October 7, 2025, between 7:48 PM UTC and October 8, 12:05 AM UTC (approximately 4 hours and 17 minutes), the audit log service was degraded, creating a backlog and delaying availability of new audit log events. The issue originated in a third-party dependency.<br /><br />We mitigated the incident by working with the vendor to identify and resolve the issue. 
Write operations recovered first, followed by the processing of the accumulated backlog of audit log events.<br /><br />We are working to improve our monitoring and alerting for audit log ingestion delays and strengthen our incident response procedures to reduce our time to detection and mitigation of issues like this one in the future.</p><p><small>Oct <var data-var='date'> 7</var>, <var data-var='time'>22:45</var> UTC</small><br><strong>Update</strong> - We are seeing recovery of audit log ingestion and continue to monitor recovery.</p><p><small>Oct <var data-var='date'> 7</var>, <var data-var='time'>21:51</var> UTC</small><br><strong>Update</strong> - We are seeing recovery of audit log ingestion and continue to monitor recovery.</p><p><small>Oct <var data-var='date'> 7</var>, <var data-var='time'>21:17</var> UTC</small><br><strong>Update</strong> - We continue to apply mitigations and monitor for recovery.</p><p><small>Oct <var data-var='date'> 7</var>, <var data-var='time'>20:33</var> UTC</small><br><strong>Update</strong> - We have identified an issue causing delayed audit log event ingestion and are working on a mitigation.</p><p><small>Oct <var data-var='date'> 7</var>, <var data-var='time'>19:48</var> UTC</small><br><strong>Update</strong> - Ingestion of new audit log events is delayed</p><p><small>Oct <var data-var='date'> 7</var>, <var data-var='time'>19:48</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/266333692025-10-03T03:47:27Z2025-10-10T15:45:07ZIncident with Copilot<p><small>Oct <var data-var='date'> 3</var>, <var data-var='time'>03:47</var> UTC</small><br><strong>Resolved</strong> - <p>On October 3rd, between approximately 10:00 PM and 11:30 Eastern, the Copilot service experienced degradation due to an issue with our upstream provider. Users encountered elevated error rates when using the following Claude models: Claude Sonnet 3.7, Claude Opus 4, Claude Opus 4.1, Claude Sonnet 4, and Claude Sonnet 4.5. No other models were impacted.</p><p>The issue was mitigated by temporarily disabling affected endpoints while our provider resolved the upstream issue. GitHub is working with our provider to further improve the resiliency of the service to prevent similar incidents in the future.</p></p><p><small>Oct <var data-var='date'> 3</var>, <var data-var='time'>03:47</var> UTC</small><br><strong>Update</strong> - This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.</p><p><small>Oct <var data-var='date'> 3</var>, <var data-var='time'>03:04</var> UTC</small><br><strong>Update</strong> - The upstream provider is implementing a fix. Services are recovering. We are monitoring the situation.</p><p><small>Oct <var data-var='date'> 3</var>, <var data-var='time'>02:42</var> UTC</small><br><strong>Update</strong> - We’re seeing degraded experience across Anthropic models. 
We’re working with our partners to restore service.</p><p><small>Oct <var data-var='date'> 3</var>, <var data-var='time'>02:41</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Copilot</p>tag:www.githubstatus.com,2005:Incident/266153162025-10-02T22:33:20Z2025-10-06T22:39:20ZDegraded Gemini 2.5 Pro experience in Copilot<p><small>Oct <var data-var='date'> 2</var>, <var data-var='time'>22:33</var> UTC</small><br><strong>Resolved</strong> - Between October 1st, 2025 at 1 AM UTC and October 2nd, 2025 at 10:33 PM UTC, the Copilot service experienced a degradation of the Gemini 2.5 Pro model due to an issue with our upstream provider. Before 15:53 UTC on October 1st, users experienced higher error rates with large context requests while using Gemini 2.5 Pro. After 15:53 UTC and until 10:33 PM UTC on October 2nd, requests were restricted to smaller context windows when using Gemini 2.5 Pro. No other models were impacted.<br /><br />The issue was resolved by a mitigation put in place by our provider. GitHub is collaborating with our provider to enhance communication and improve the ability to reproduce issues with the aim of reducing resolution time.</p><p><small>Oct <var data-var='date'> 2</var>, <var data-var='time'>22:26</var> UTC</small><br><strong>Update</strong> - We have confirmed that the fix for the lower token input limit for Gemini 2.5 Pro is in place and are currently testing our previous higher limit to verify that customers will experience no further impact.</p><p><small>Oct <var data-var='date'> 2</var>, <var data-var='time'>17:13</var> UTC</small><br><strong>Update</strong> - The underlying issue for the lower token limits for Gemini 2.5 Pro has been identified and a fix is in progress. We will update again once we have tested and confirmed that the fix is correct and globally deployed.</p><p><small>Oct <var data-var='date'> 2</var>, <var data-var='time'>02:52</var> UTC</small><br><strong>Update</strong> - We are continuing to work with our provider to resolve the issue where some Copilot requests using Gemini 2.5 Pro return an error indicating a bad request due to exceeding the input limit size.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>18:16</var> UTC</small><br><strong>Update</strong> - We are continuing to investigate and test solutions internally while working with our model provider on a deeper investigation into the cause. We will update again when we have identified a mitigation.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>17:37</var> UTC</small><br><strong>Update</strong> - We are testing other internal mitigations so that we can return to the higher maximum input length. We are still working with our upstream model provider to understand the contributing factors for this sudden decrease in input limits.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>16:49</var> UTC</small><br><strong>Update</strong> - We are experiencing a service regression for the Gemini 2.5 Pro model in Copilot Chat, VS Code and other Copilot products. The maximum input length of Gemini 2.5 Pro prompts has been decreased. Long prompts or large context windows may result in errors. This is due to an issue with an upstream model provider.
We are working with them to resolve the issue.<br /><br />Other models are available and working as expected.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>16:43</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Copilot</p>tag:www.githubstatus.com,2005:Incident/266102552025-10-01T16:55:59Z2025-10-07T21:49:26ZDegraded Performance for GitHub Actions MacOS Runners<p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>16:55</var> UTC</small><br><strong>Resolved</strong> - On October 1, 2025 between 07:00 UTC and 17:20 UTC, Mac hosted runner capacity for Actions was degraded, leading to timed out jobs and long queue times. On average, the error rate was 46% and peaked at 96% of requests to the service. XL and Intel runners recovered by 10:10 UTC, with the other types taking longer to recover.<br /><br />The degraded capacity was triggered by a scheduled event at 07:00 UTC that led to a permission failure on Mac runner hosts, blocking reimage operations. The permission issue was resolved by 9:41 UTC, but the recovery of available runners took longer than expected due to a combination of backoff logic slowing backend operations and some hosts needing state resets.<br /><br />We deployed changes immediately following the incident to address the scheduled event and ensure that similar failures will not block critical operations in the future. We are also working to reduce the end-to-end time for self-healing of offline hosts for quicker full recovery of future capacity or host events.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>16:27</var> UTC</small><br><strong>Update</strong> - We are seeing some recovery for image queueing and continuing to monitor.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>14:41</var> UTC</small><br><strong>Update</strong> - We are continuing work to restore capacity for our MacOS ARM runners.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>13:58</var> UTC</small><br><strong>Update</strong> - Our team continues to work hard on restoring capacity for the Mac runners.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>13:12</var> UTC</small><br><strong>Update</strong> - Work continues on restoring capacity on the Mac runners.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>12:32</var> UTC</small><br><strong>Update</strong> - MacOS ARM runners continue to be at reduced capacity, causing queuing of jobs. Investigation is ongoing.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>11:51</var> UTC</small><br><strong>Update</strong> - Work continues to bring the full runner capacity back online. 
Resources are focused on improving the recovery of certain runner types.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>11:11</var> UTC</small><br><strong>Update</strong> - We are continuing to see recovery of some runner capacity and investigating slow recovery of certain runner types.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>10:30</var> UTC</small><br><strong>Update</strong> - We are seeing recovery of some runner capacity, while also investigating slow recovery of certain runner types.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>09:44</var> UTC</small><br><strong>Update</strong> - MacOS runners are coming back online and starting to process queued work.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>08:59</var> UTC</small><br><strong>Update</strong> - We are continuing to deploy the necessary changes to restore MacOS runner capacity.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>08:27</var> UTC</small><br><strong>Update</strong> - We have identified the cause and are deploying a change to restore MacOS runner capacity.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>08:17</var> UTC</small><br><strong>Update</strong> - Customers using GitHub Actions Mac OS runners are experiencing job start delays and failures. We are aware of this issue and actively investigating.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>08:09</var> UTC</small><br><strong>Update</strong> - Actions is experiencing degraded performance. We are continuing to investigate.</p><p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>07:59</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/265923392025-09-29T19:12:41Z2025-10-06T23:21:59ZDisruption with Gemini 2.5 Pro and Gemini 2.0 Flash in Copilot<p><small>Sep <var data-var='date'>29</var>, <var data-var='time'>19:12</var> UTC</small><br><strong>Resolved</strong> - On September 29, 2025, between 17:53 and 18:42 UTC, the Copilot service experienced a degradation of the Gemini 2.5 model due to an issue with our upstream provider. Approximately 24% of requests failed, affecting 56% of users during this period. No other models were impacted.<br /><br />GitHub notified the upstream provider of the problem as soon as it was detected. The issue was resolved after the upstream provider rolled back a recent change that caused the disruption. GitHub will continue to enhance our monitoring and alerting systems to reduce the time it takes to detect and mitigate similar issues in the future.</p><p><small>Sep <var data-var='date'>29</var>, <var data-var='time'>19:12</var> UTC</small><br><strong>Update</strong> - The upstream model provider has resolved the issue and we are seeing full availability for Gemini 2.5 Pro and Gemini 2.0 Flash.</p><p><small>Sep <var data-var='date'>29</var>, <var data-var='time'>18:40</var> UTC</small><br><strong>Update</strong> - We are experiencing degraded availability for the Gemini 2.5 Pro & Gemini 2.0 Flash models in Copilot Chat, VS Code and other Copilot products. This is due to an issue with an upstream model provider.
We are working with them to resolve the issue.<br /><br />Other models are available and working as expected.</p><p><small>Sep <var data-var='date'>29</var>, <var data-var='time'>18:39</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/265912782025-09-29T17:33:51Z2025-10-06T19:58:45ZDisruption with some GitHub services<p><small>Sep <var data-var='date'>29</var>, <var data-var='time'>17:33</var> UTC</small><br><strong>Resolved</strong> - On September 29, 2025 between 16:26 UTC and 17:33 UTC the Copilot API experienced a partial degradation causing intermittent erroneous 404 responses for an average of 0.2% of GitHub MCP server requests, peaking at times around 2% of requests. The issue stemmed from an upgrade of an internal dependency which exposed a misconfiguration in the service.<br /><br />We resolved the incident by rolling back the upgrade to address the misconfiguration. We fixed the configuration issue and will improve documentation and rollout process to prevent similar issues.</p><p><small>Sep <var data-var='date'>29</var>, <var data-var='time'>17:28</var> UTC</small><br><strong>Update</strong> - Customers are getting 404 responses when connecting to the GitHub MCP server. We have reverted a change we believe is contributing to the impact, and are seeing resolution in deployed environments.</p><p><small>Sep <var data-var='date'>29</var>, <var data-var='time'>16:45</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/265545602025-09-25T17:36:18Z2025-09-29T22:28:29ZDisruption with some GitHub services<p><small>Sep <var data-var='date'>25</var>, <var data-var='time'>17:36</var> UTC</small><br><strong>Resolved</strong> - On September 26, 2025 between 16:22 UTC and 18:32 UTC raw file access was degraded for a small set of four repositories. On average, raw file access error rate was 0.01% and peaked at 0.16% of requests. This was due to a caching bug exposed by excessive traffic to a handful of repositories. <br /><br />We mitigated the incident by resetting the state of the cache for raw file access and are working to improve cache usage and testing to prevent issues like this in the future.<br /></p><p><small>Sep <var data-var='date'>25</var>, <var data-var='time'>17:06</var> UTC</small><br><strong>Update</strong> - We are seeing issues related to our ability to serve raw file access across a small percentage of our requests.</p><p><small>Sep <var data-var='date'>25</var>, <var data-var='time'>17:00</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/265420712025-09-24T15:36:09Z2025-09-29T17:34:12ZDisruption with some GitHub services<p><small>Sep <var data-var='date'>24</var>, <var data-var='time'>15:36</var> UTC</small><br><strong>Resolved</strong> - On September 23, 2025, between 15:29 UTC and 17:38 UTC and also on September 24, 2025 between 15:02 UTC and 15:12, email deliveries were delayed up to 50 minutes which resulted in significant delays for most types of email notifications. 
This occurred due to an unusually high volume of traffic which caused resource contention on some of our outbound email servers.<br /><br />We have updated the configuration we use to better allocate capacity when there is a high volume of traffic and are also updating our monitors so we can detect this type of issue before it becomes a customer impacting incident.</p><p><small>Sep <var data-var='date'>24</var>, <var data-var='time'>14:55</var> UTC</small><br><strong>Update</strong> - We are seeing delays in email delivery, which is impacting notifications and user signup email verification. We are investigating and working on mitigation.</p><p><small>Sep <var data-var='date'>24</var>, <var data-var='time'>14:46</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/265389902025-09-24T09:18:30Z2025-10-03T15:33:54ZClaude Opus 4 is experiencing degraded performance<p><small>Sep <var data-var='date'>24</var>, <var data-var='time'>09:18</var> UTC</small><br><strong>Resolved</strong> - On September 24th, 2025, between 08:02 UTC and 09:11 UTC the Copilot service was degraded for Claude Opus 4 and Claude Opus 4.1 requests. On average, 22% of requests failed for Claude Opus 4 and 80% of requests for Claude Opus 4.1. This was due to an upstream provider returning elevated errors on Claude Opus 4 and Opus 4.1.<br /><br />We mitigated the issue by directing users to select other models and by monitoring recovery. To resolve the issue, we are expanding failover capabilities by integrating with additional infrastructure providers.</p><p><small>Sep <var data-var='date'>24</var>, <var data-var='time'>09:16</var> UTC</small><br><strong>Update</strong> - Between around 8:16 UTC and 8:51 UTC we saw elevated errors on Claude Opus 4 and Opus 4.1, up to 49% of requests were failing. 
This has recovered to around 4% of requests failing, we are monitoring recovery.</p><p><small>Sep <var data-var='date'>24</var>, <var data-var='time'>09:08</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/265346072025-09-24T00:26:29Z2025-10-01T21:21:18ZIncident with Copilot<p><small>Sep <var data-var='date'>24</var>, <var data-var='time'>00:26</var> UTC</small><br><strong>Resolved</strong> - Between 20:06 UTC September 23 and 04:58 UTC September 24, 2025, the Copilot service experienced degraded availability for Claude Sonnet 4 and 3.7 model requests.<br /><br />During this period, 0.46% of Claude 4 requests and 7.83% of Claude 3.7 requests failed.<br /><br />The reduced availability resulted from Copilot disabling routing to an upstream provider that was experiencing issues and reallocating capacity to other providers to manage requests for Claude Sonnet 3.7 and 4.<br />We are continuing to investigate the source of the issues with this provider and will provide an update as more information becomes available.</p><p><small>Sep <var data-var='date'>24</var>, <var data-var='time'>00:26</var> UTC</small><br><strong>Update</strong> - The issues with our upstream model provider have been resolved, and Claude Sonnet 3.7 and Claude Sonnet 4 are once again available in Copilot Chat, VS Code and other Copilot products.<br /><br />We will continue monitoring to ensure stability, but mitigation is complete.<br /></p><p><small>Sep <var data-var='date'>23</var>, <var data-var='time'>22:22</var> UTC</small><br><strong>Update</strong> - We are experiencing degraded availability for the Claude Sonnet 3.7 and Claude Sonnet 4 model in Copilot Chat, VS Code and other Copilot products. This is due to an issue with an upstream model provider. We are working with them to resolve the issue.<br /><br />Other models are available and working as expected.</p><p><small>Sep <var data-var='date'>23</var>, <var data-var='time'>22:22</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Copilot</p>tag:www.githubstatus.com,2005:Incident/265321002025-09-23T17:41:57Z2025-09-24T17:37:48ZIncident with Pages and Actions<p><small>Sep <var data-var='date'>23</var>, <var data-var='time'>17:41</var> UTC</small><br><strong>Resolved</strong> - On September 23, between 17:11 and 17:40 UTC, customers experienced failures and delays when running workflows on GitHub Actions and building or deploying GitHub Pages. The issue was caused by a faulty configuration change that disrupted service to service communication in GitHub Actions. During this period, in-progress jobs were delayed and new jobs would not start due to a failure to acquire runners, and about 30% of all jobs failed. GitHub Pages users were unable to build or deploy their Pages during this period.<br /><br />The offending change was rolled back within 15 minutes of its deployment, after which Actions workflows and Pages deployments began to succeed. Actions customers continued to experience delays for about 15 minutes after the rollback was completed while services worked through the backlog of queued jobs. 
We are planning to implement additional rollout checks to help detect and prevent similar issues in the future.</p><p><small>Sep <var data-var='date'>23</var>, <var data-var='time'>17:33</var> UTC</small><br><strong>Update</strong> - We are investigating delays in Actions Workflows.</p><p><small>Sep <var data-var='date'>23</var>, <var data-var='time'>17:28</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Actions and Pages</p>tag:www.githubstatus.com,2005:Incident/265316142025-09-23T17:40:25Z2025-10-07T13:31:27ZDisruption with some GitHub services<p><small>Sep <var data-var='date'>23</var>, <var data-var='time'>17:40</var> UTC</small><br><strong>Resolved</strong> - On September 23, 2025, between 15:29 UTC and 17:38 UTC and also on September 24, 2025 between 14:02 UTC and 15:12 UTC, email deliveries were delayed up to 50 minutes which resulted in significant delays for most types of email notifications. This occurred due to an unusually high volume of traffic which caused resource contention on some of our outbound email servers.<br /><br />We have updated the configuration we use to better allocate capacity when there is a high volume of traffic and are also updating our monitors so we can detect this type of issue before it becomes a customer impacting incident.<br /></p><p><small>Sep <var data-var='date'>23</var>, <var data-var='time'>16:50</var> UTC</small><br><strong>Update</strong> - We're seeing delays related to outbound emails and are investigating.</p><p><small>Sep <var data-var='date'>23</var>, <var data-var='time'>16:46</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/264730362025-09-17T17:55:39Z2025-09-19T21:16:04ZIncident with Codespaces<p><small>Sep <var data-var='date'>17</var>, <var data-var='time'>17:55</var> UTC</small><br><strong>Resolved</strong> - On September 17, 2025 between 13:23 and 16:51 UTC some users in West Europe experienced issues with Codespaces that had shut down due to network disconnections and subsequently failed to restart. Codespace creations and resumes were failed over to another region at 15:01 UTC. While many of the impacted instances self-recovered after mitigation efforts, approximately 2,000 codespaces remained stuck in a "shutting down" state while the team evaluated possible methods to recover unpushed data from the latest active session of affected codespaces. Unfortunately, recovery of that data was not possible. We unblocked shutdown of those codespaces, with all instances either shut down or available by 8:26 UTC on September 19.<br /><br />The disconnects were triggered by an exhaustion of resources in the network relay infrastructure in that region, but the lack of self-recovery was caused by an unhandled error impacting the local agent, which led to an unclean shutdown.<br /><br />We are improving the resilience of the local agent to disconnect events to ensure shutdown of codespaces is always clean without data loss. We have also addressed the exhausted resources in the network relay and will be investing in improved detection and resilience to reduce the impact of similar events in the future.</p><p><small>Sep <var data-var='date'>17</var>, <var data-var='time'>17:55</var> UTC</small><br><strong>Update</strong> - We have confirmed the original mitigation to failover has resolved the issue causing Codespaces to become unavailable. 
We are evaluating if there is a path to recover unpushed data from the approximately 2000 Codespaces that are currently in the shutting down state. We will be resolving this incident and will detail the next steps in our public summary.</p><p><small>Sep <var data-var='date'>17</var>, <var data-var='time'>16:51</var> UTC</small><br><strong>Update</strong> - For Codespaces that were stuck in the shutting down state and have been resumed, we've identified an issue that is causing the contents of the Codespace to be irrecoverably lost, which has impacted approximately 250 Codespaces. We are actively working on a mitigation to prevent any more Codespaces currently in this state from being forced to shut down to prevent the potential data loss.</p><p><small>Sep <var data-var='date'>17</var>, <var data-var='time'>16:07</var> UTC</small><br><strong>Update</strong> - We're continuing to see improvement with Codespaces that were stuck in the shutting down state and we anticipate the remaining should self-resolve in about an hour.</p><p><small>Sep <var data-var='date'>17</var>, <var data-var='time'>15:31</var> UTC</small><br><strong>Update</strong> - Some users with Codespaces in West Europe were unable to connect to Codespaces. We have failed over that region, and users should be able to create new Codespaces. If a user has a Codespace in a shutting down state, we are still investigating potential fixes and mitigations.</p><p><small>Sep <var data-var='date'>17</var>, <var data-var='time'>15:04</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Codespaces</p>tag:www.githubstatus.com,2005:Incident/264625942025-09-16T18:30:08Z2025-10-06T13:38:31ZUnauthenticated LFS requests for public repos are returning unexpected 401 errors<p><small>Sep <var data-var='date'>16</var>, <var data-var='time'>18:30</var> UTC</small><br><strong>Resolved</strong> - Between 16:26 UTC on September 15th and 18:30 UTC on September 16th, anonymous REST API calls to approximately 20 endpoints were incorrectly rejected because they were not authenticated. While this caused unauthenticated requests to be rejected by these endpoints, all authenticated requests were unaffected, and no protected endpoints were exposed.<br /><br />This resulted in 100% of requests to these endpoints failing at peak, representing less than 0.1% of GitHub’s overall request volume. On average, the error rate for these endpoints was less than 50% and peaked at 100% for about 26 hours over September 16th. API requests to the impacted endpoints were rejected with a 401 error code. This was due to a mismatch in authentication policies, for specific endpoints, during a system migration.<br /><br />The failure to detect the errors was the result of the issue occurring for a low percentage of traffic.<br /><br />We mitigated the incident by reverting the policy in question, and correcting the logic associated with the degraded endpoints. We are working to improve our test suite to further validate mismatches, and refining our monitors for proactive detection.</p><p><small>Sep <var data-var='date'>16</var>, <var data-var='time'>18:29</var> UTC</small><br><strong>Update</strong> - We have mitigated the issue and are monitoring the results</p><p><small>Sep <var data-var='date'>16</var>, <var data-var='time'>18:02</var> UTC</small><br><strong>Update</strong> - Git Operations is experiencing degraded performance.
We are continuing to investigate.</p><p><small>Sep <var data-var='date'>16</var>, <var data-var='time'>17:55</var> UTC</small><br><strong>Update</strong> - A recent change to our API routing inadvertently added an authentication requirement to the anonymous route for LFS requests. We're in the process of fixing the change, but in the interim retrying should eventually succeed.</p><p><small>Sep <var data-var='date'>16</var>, <var data-var='time'>17:55</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/264621942025-09-16T17:45:22Z2025-09-19T18:21:23ZCreating GitHub apps using the REST API will fail with a 401 error<p><small>Sep <var data-var='date'>16</var>, <var data-var='time'>17:45</var> UTC</small><br><strong>Resolved</strong> - Between 16:26 UTC on September 15th and 18:30 UTC on September 16th, anonymous REST API calls to approximately 20 endpoints were incorrectly rejected because they were not authenticated. While this caused unauthenticated requests to be rejected by these endpoints, all authenticated requests were unaffected, and no protected endpoints were exposed.<br /><br />This resulted in 100% of requests to these endpoints failing at peak, representing less than 0.1% of GitHub’s overall request volume. On average, the error rate for these endpoints was less than 50% and peaked at 100% for about 26 hours over September 16th. API requests to the impacted endpoints were rejected with a 401 error code. This was due to a mismatch in authentication policies, for specific endpoints, during a system migration.<br /><br />The failure to detect the errors was the result of the issue occurring for a low percentage of traffic.<br /><br />We mitigated the incident by reverting the policy in question, and correcting the logic associated with the degraded endpoints. We are working to improve our test suite to further validate mismatches, and refining our monitors for proactive detection.</p><p><small>Sep <var data-var='date'>16</var>, <var data-var='time'>17:27</var> UTC</small><br><strong>Update</strong> - We have mitigated the issue and are monitoring the results</p><p><small>Sep <var data-var='date'>16</var>, <var data-var='time'>17:15</var> UTC</small><br><strong>Update</strong> - A recent change to our API routing inadvertently added an authentication requirement to the anonymous route for creating GitHub apps. We're in the process of fixing the change, but in the interim retrying should eventually succeed.</p><p><small>Sep <var data-var='date'>16</var>, <var data-var='time'>17:14</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/264300762025-09-15T21:01:03Z2025-09-18T18:29:33ZRepository search is degraded<p><small>Sep <var data-var='date'>15</var>, <var data-var='time'>21:01</var> UTC</small><br><strong>Resolved</strong> - At around 18:45 UTC on Friday, September 12, 2025, a change was deployed that unintentionally affected search index management. As a result, approximately 25% of repositories were temporarily missing from search results.<br /><br />By 12:45 UTC on Saturday, September 14, most missing repositories were restored from an earlier search index snapshot, and repositories updated between the snapshot and the restoration were reindexed. This backfill was completed at 21:25 UTC.<br /><br />After these repairs, about 98.5% of repositories were once again searchable. 
We are performing a full reconciliation of the search index and customers can expect to see records being updated and content becoming searchable for all repos again between now and Sept 25.<br /><br />NOTE: Users who notice missing or outdated repositories in search results can force reindexing by starring or un-starring the repository. Other repository actions, such as adding topics or updating the repository description, will also result in reindexing. In general, changes to searchable artifacts in GitHub will also update their respective search index in near-real time.<br /><br />User impact has been mitigated with the exception of the 1.5% of repos that are missing from the search index. The change responsible for the search issue has been reverted, and full reconciliation of the search index is underway, expected to complete by September 23. We have added additional checks to our indexing model to ensure this failure does not happen again. We are also investigating faster repair alternatives.<br /><br />To avoid resource contention and possible further issues, we are not repairing repositories or organizations individually at this time. No repository data was lost, and other search types were not affected.</p><p><small>Sep <var data-var='date'>13</var>, <var data-var='time'>22:39</var> UTC</small><br><strong>Update</strong> - Most searchable repositories should again be visible in search results. Up to 1.5% of repositories may still be missing from search results.<br /><br />Many different actions synchronize the repository state with the search index, so we expect natural recovery for repositories that see more frequent user and API-driven interactions. <br /><br />A complete index reconciliation is underway to restore stagnant repositories that were deleted from the index. We will update again once we have a clear timeline of when we expect full recovery for those missing search results.</p><p><small>Sep <var data-var='date'>13</var>, <var data-var='time'>12:49</var> UTC</small><br><strong>Update</strong> - Customers are not seeing repositories they expect to see in search results. We have restored a snapshot of this search index from Fri 12 Sep at 21:00 UTC. Changes made since then will be unavailable while we work to backfill the rest of the search index. Any new changes will be available in near-real time as expected.</p><p><small>Sep <var data-var='date'>13</var>, <var data-var='time'>12:44</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/264503062025-09-15T18:28:36Z2025-09-18T23:23:30ZDisruption with some GitHub services<p><small>Sep <var data-var='date'>15</var>, <var data-var='time'>18:28</var> UTC</small><br><strong>Resolved</strong> - On September 15th between 17:55 and 18:20 UTC, Copilot experienced degraded availability for all features. This was due to a partial deployment of a feature flag to a global rate limiter. The flag triggered behavior that unintentionally rate limited all requests, resulting in 100% of them returning 403 errors. The issue was resolved by reverting the feature flag, which resulted in immediate recovery.<br /><br />The root cause of the incident was an undetected edge case in our rate limiting logic. The flag was meant to scale down rate limiting for a subset of users, but unintentionally put our rate limiting configuration into an invalid state.<br /><br />To prevent this from happening again, we have addressed the bug in our rate limiting.
We are also adding additional monitors to detect anomalies in our traffic patterns, which will allow us to identify similar issues during future deployments. Furthermore, we are exploring ways to test our rate limit scaling in our internal environment to enhance our pre-production validation process.<br /></p><p><small>Sep <var data-var='date'>15</var>, <var data-var='time'>18:21</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/263963292025-09-10T14:02:41Z2025-09-16T15:40:14ZIncident with Actions<p><small>Sep <var data-var='date'>10</var>, <var data-var='time'>14:02</var> UTC</small><br><strong>Resolved</strong> - On September 10, 2025 between 13:00 and 14:15 UTC, Actions users experienced failed jobs and run start delays for Ubuntu 24 and Ubuntu 22 jobs on standard runners in private repositories. Additionally, larger runner customers experienced run start delays for runner groups with private networking configured in the eastus2 region. This was due to an outage in an underlying compute service provider in eastus2. 1.06% of Ubuntu 24 jobs and 0.16% of Ubuntu 22 jobs failed during this period. Jobs for larger runners using private networking in the eastus2 region were unable to start for the duration of the incident.<br /><br />We have identified and are working on improvements in our resilience to single partner region outages for standard runners so impact is reduced in similar scenarios in the future.</p><p><small>Sep <var data-var='date'>10</var>, <var data-var='time'>13:31</var> UTC</small><br><strong>Update</strong> - Actions hosted runners are taking longer to come online, leading to high wait times or job failures.</p><p><small>Sep <var data-var='date'>10</var>, <var data-var='time'>13:23</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Actions</p>tag:www.githubstatus.com,2005:Incident/263424092025-09-04T20:25:47Z2025-09-09T16:08:32ZDegraded REST API success rates for some customers<p><small>Sep <var data-var='date'> 4</var>, <var data-var='time'>20:25</var> UTC</small><br><strong>Resolved</strong> - On September 4, 2025 between 15:30 UTC and 20:00 UTC the REST API endpoints git/refs, git/refs/*, and git/matching-refs/* were degraded and returned elevated errors for repositories with reference counts over 22k. On average, the request error rate to these specific endpoints was 0.5%. Overall REST API availability remained 99.9999%. This was due to the introduction of a code change that added latency to reference evaluations and overly affected repositories with many branches, tags, or other references.<br /><br />We mitigated the incident by reverting the new code.<br /><br />We are working to improve performance testing and to reduce our time to detection and mitigation of issues like this one in the future.</p><p><small>Sep <var data-var='date'> 4</var>, <var data-var='time'>20:05</var> UTC</small><br><strong>Update</strong> - The deployment has completed and we expect customers who have been impacted to see recovery. We are continuing to monitor.</p><p><small>Sep <var data-var='date'> 4</var>, <var data-var='time'>19:28</var> UTC</small><br><strong>Update</strong> - We are in the process of deploying the PR to revert the change that was causing timeouts to this endpoint. 
We will update again once that deployment is complete.</p><p><small>Sep <var data-var='date'> 4</var>, <var data-var='time'>18:57</var> UTC</small><br><strong>Update</strong> - We have identified a deployed change that correlates with the increase in 5XX errors to the GitRefs REST API. This is particularly affecting requests for repos with very large numbers of commits. We are working on rolling back this change, which we expect will resolve the issue.</p><p><small>Sep <var data-var='date'> 4</var>, <var data-var='time'>18:52</var> UTC</small><br><strong>Update</strong> - API Requests is experiencing degraded performance. We are continuing to investigate.</p><p><small>Sep <var data-var='date'> 4</var>, <var data-var='time'>18:18</var> UTC</small><br><strong>Update</strong> - Customers are experiencing 504 responses for some API requests regarding repo refs/tags. We are investigating.</p><p><small>Sep <var data-var='date'> 4</var>, <var data-var='time'>18:16</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:www.githubstatus.com,2005:Incident/263167112025-09-02T15:44:41Z2025-09-10T13:16:15ZLoading avatars might fail for 0.5% of total users and 100% of users around the Arabian Peninsula. We are investigating.<p><small>Sep <var data-var='date'> 2</var>, <var data-var='time'>15:44</var> UTC</small><br><strong>Resolved</strong> - Between August 21, 2025 at 15:00 UTC and September 2, 2025 at 15:22 UTC, the avatars.githubusercontent.com image service was degraded and failed to display user avatars for users in the Middle East. During this time, avatar images appeared broken on github.com for affected users. On average, this impacted about 82% of users routed through one of our Middle East-based points-of-presence, which represents about 0.14% of global users.<br /><br />This was due to a configuration change within GitHub's edge infrastructure in the affected region, causing HTTP requests to fail. As a result, image requests returned HTTP 503 errors. The failure to detect the issues was the result of an alerting threshold set too low.<br /><br />We mitigated the incident by removing the affected site from service, which restored avatar serving for impacted users.<br /><br />To prevent this from recurring, we have tuned configuration defaults for graceful degradation. We also added new health checks to automatically shift traffic from impacted sites. We are updating our monitoring to prevent undetected errors like this in the future.</p><p><small>Sep <var data-var='date'> 2</var>, <var data-var='time'>15:17</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>
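<p>The avatar incident above notes that GitHub "added new health checks to automatically shift traffic from impacted sites." A minimal sketch of that general pattern is below, assuming a hypothetical probe URL, sample count, and failure threshold; it illustrates the technique only and is not GitHub's actual edge tooling.</p>
<pre><code>
# Hypothetical sketch: pull an edge site out of rotation when probes fail.
# The probe URL, sample size, and threshold are illustrative assumptions,
# not GitHub's actual health-check configuration.
import urllib.error
import urllib.request

PROBES = 20             # assumed number of samples per evaluation
ERROR_THRESHOLD = 0.05  # assumed: drain the site above 5% probe failures

def probe_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers without raising or returning an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # 4xx/5xx normally surface as HTTPError (caught below); this is a safety net.
            return resp.status not in range(400, 600)
    except (urllib.error.URLError, TimeoutError):
        return False

def should_drain(probe_url: str, probes: int = PROBES) -> bool:
    """Decide whether to shift traffic away from the site behind probe_url."""
    failures = sum(1 for _ in range(probes) if not probe_ok(probe_url))
    return failures / probes > ERROR_THRESHOLD

if __name__ == "__main__":
    # Illustrative target; a real check would hit a per-site health endpoint.
    url = "https://avatars.githubusercontent.com/u/583231?s=40"
    print("drain site" if should_drain(url) else "keep site in rotation")
</code></pre>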