Degraded performance
Incident Report for Stonly
Resolved
An update to our codebase that introduced new performance logs caused a slight performance degradation. During peak hours, this led to server overload and restarts, which in turn caused requests to fail and the site to be unavailable for short periods of time.

Between 12:07 and 14:47 UTC on Aug 27th, guides and Knowledge Bases intermittently failed to load, returning a 500 response code. Reloading the page should have fixed the issue.

After analyzing the issue, we're implementing a set of improvements to our code, infrastructure, and processes:
We’ll be using an alternative approach to performance monitoring
We’ll improve our performance testing to better mimic production traffic patterns
We’ll be rolling out similar changes that may impact performance gradually, one server at a time, so that any impact is limited to a single server rather than the whole platform
We’ll be redoing our capacity planning and considering adding more CPU power to better accommodate increases in traffic and load
Posted Aug 27, 2024 - 12:00 UTC