[2025-06-04] Incident Thread #161603
-
❗ An incident has been declared: Incident with Actions
Subscribe to this Discussion for updates on this incident. Please upvote or emoji react instead of commenting +1 on the Discussion to avoid overwhelming the thread. Any account guidance specific to this incident will be shared in thread and on the Incident Status Page.
Replies: 4 comments
-
Update
We are currently investigating delays with Actions triggering for some users.
-
Update
We have applied mitigations and are monitoring for recovery.
-
Incident Resolved
This incident has been resolved.
-
Incident Summary
On June 4, 2025, between 14:35 UTC and 15:50 UTC, the Actions service experienced degradation, leading to run start delays. During the incident, about 15.4% of all workflow runs were delayed by an average of 16 minutes. An unexpected load pattern revealed a scaling issue in our backend infrastructure. We mitigated the incident by blocking the requests that triggered this pattern.
We are improving our rate limiting mechanisms to better handle unexpected load patterns while maintaining service availability. We are also strengthening our incident response procedures to reduce the time to mitigate for similar issues in the future.
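For readers unfamiliar with rate limiting, the general idea can be sketched with a token bucket, one common way to absorb short bursts while capping sustained load. This is a generic illustration only and says nothing about how the Actions backend is actually built; the class name `TokenBucket`, its parameters, and the example numbers are all hypothetical.

```python
class TokenBucket:
    """Generic token-bucket limiter (illustrative sketch, not GitHub's implementation)."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity              # maximum burst size, in requests (hypothetical)
        self.refill_per_sec = refill_per_sec  # sustained rate, in requests per second (hypothetical)
        self.tokens = capacity                # start full so an initial burst is allowed
        self.last = 0.0                       # timestamp of the previous call, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` may proceed."""
        # Refill in proportion to elapsed time, never exceeding capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        # Each admitted request spends one token; with none left, it is rejected.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def simulate(bucket: TokenBucket, arrivals: list[float]) -> list[bool]:
    """Apply the limiter to a list of arrival timestamps (seconds, non-decreasing)."""
    return [bucket.allow(t) for t in arrivals]


if __name__ == "__main__":
    # Burst of 3 at t=0, one request at t=1, burst of 3 at t=3.
    decisions = simulate(TokenBucket(capacity=2, refill_per_sec=1),
                         [0.0, 0.0, 0.0, 1.0, 3.0, 3.0, 3.0])
    print(decisions)
```

Passing the timestamp in explicitly, rather than reading a clock inside `allow`, keeps the sketch deterministic and easy to test. A production limiter would typically use a monotonic clock and per-caller buckets.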