This blog is far, far away from Awkward Cake. It will give you updates on system incidents even if our main site is FUBAR.

False downtime reports due to a misconfigured script

Posted: December 6th, 2010 | Filed under: Incidents

Pingdom has erroneously reported a lot of downtime for our Unix Web Hosting Service today. These reports were caused by the script we use to return availability information to Pingdom: its processor overload alarm thresholds were far too low to be useful on the cluster of powerful multi-CPU servers we use.
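To illustrate the kind of problem involved (this is a minimal sketch, not the script we actually run), a load-based health check can avoid false alarms on multi-CPU machines by comparing the load average per CPU, rather than the raw load average, against the threshold. The function names and the per-CPU threshold of 1.5 below are assumptions made up for the example.

```python
import os


def is_overloaded(load_1min, cpu_count, per_cpu_threshold=1.5):
    """Return True when the 1-minute load per CPU exceeds the threshold.

    per_cpu_threshold is a hypothetical value chosen for illustration.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_1min / cpu_count > per_cpu_threshold


def availability_status(per_cpu_threshold=1.5):
    """Report "OK" or "DOWN" for the local machine (Unix only)."""
    load_1min, _, _ = os.getloadavg()   # 1-, 5- and 15-minute load averages
    cpus = os.cpu_count() or 1          # cpu_count() may return None
    if is_overloaded(load_1min, cpus, per_cpu_threshold):
        return "DOWN"
    return "OK"


if __name__ == "__main__":
    print(availability_status())
```

With this approach a load average of 6.0 is treated as healthy on a 16-CPU server (0.375 per CPU) but as overloaded on a 2-CPU machine (3.0 per CPU), whereas a fixed raw threshold tuned for a small machine would flag both.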

Some of the earlier short incidents reported in our downtime statistics for the Unix Web Hosting Service may be false positives as well. We’ve adjusted the alarm threshold configuration, but our Pingdom stats for this service won’t be a reliable indicator of our uptime for this month.

For the record, more than one hour of real downtime, whether inside or outside a pre-announced service window, is not something we expect to experience regularly on our Unix and Windows Web Hosting System.


