3/24/10

Wikipedia went down for 1 hour, blames overheating servers

Wikipedia, the world’s largest data farm, couldn’t stand the overwhelming heat of global warming. As Mark from Wikimedia stated on the Wikimedia Technical Blog, some servers in Europe seem to have melted down from excessive heat of unknown origin (Global warming? Who knows).
Their official announcement can be found at this link. If you are too lazy to visit that site, here’s the full transcription:

Due to an overheating problem in our European data center many of our servers turned off to protect themselves. As this impacted all Wikipedia and other projects access from European users, we were forced to move all user traffic to our Florida cluster, for which we have a standard quick failover procedure in place, that changes our DNS entries.

However, shortly after we did this failover switch, it turned out that this failover mechanism was now broken, causing the DNS resolution of Wikimedia sites to stop working globally. This problem was quickly resolved, but unfortunately it may take up to an hour before access is restored for everyone, due to caching effects.
We apologize for the inconvenience this has caused.

Update: Unfortunately, for many, this outage seems to have lasted longer than an hour. It appears that many ISPs’ DNS resolvers do not honor the so-called Negative Cache TTL that we send (1 hour), and instead use a longer value. We have circumvented this problem by renaming the affected DNS record to something else.
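
For the curious, here is a rough sketch of why the outage dragged on and why renaming the record helped. It is a toy model, not Wikimedia's setup or any real resolver's code: the `ToyResolver` class, its `min_negative_ttl` knob, and the `old-name.example` / `new-name.example` names are all made up for illustration. The only part taken from a standard is the rule in RFC 2308 that a failed lookup is cached for the smaller of the SOA record's TTL and its MINIMUM field.

```python
from dataclasses import dataclass, field


def negative_cache_ttl(soa_ttl: int, soa_minimum: int) -> int:
    """RFC 2308: a negative answer is cached for min(SOA TTL, SOA MINIMUM)."""
    return min(soa_ttl, soa_minimum)


@dataclass
class ToyResolver:
    # Hypothetical knob: a resolver that ignores the advertised value and
    # never caches a failure for less than this many seconds.
    min_negative_ttl: int = 0
    # Maps a name to the time (in seconds) at which its cached failure expires.
    negative: dict = field(default_factory=dict, init=False)

    def record_failure(self, name: str, now: int, advertised_ttl: int) -> None:
        ttl = max(advertised_ttl, self.min_negative_ttl)
        self.negative[name] = now + ttl

    def is_blocked(self, name: str, now: int) -> bool:
        expiry = self.negative.get(name)
        return expiry is not None and now < expiry


# Assumed SOA values that yield the 1 hour mentioned in the announcement.
ttl = negative_cache_ttl(soa_ttl=86400, soa_minimum=3600)

honest = ToyResolver()                             # honors the advertised 1 hour
stubborn = ToyResolver(min_negative_ttl=3 * 3600)  # assumed 3 hour floor

for resolver in (honest, stubborn):
    resolver.record_failure("old-name.example", now=0, advertised_ttl=ttl)

# Just over an hour later:
print(honest.is_blocked("old-name.example", now=3601))
print(stubborn.is_blocked("old-name.example", now=3601))
print(stubborn.is_blocked("new-name.example", now=3601))
```

In this model the well-behaved resolver forgets the failure after an hour, while the stubborn one keeps refusing the old name. A renamed record has no cached failure under its new name, so even the stubborn resolver looks it up fresh, which is presumably why the rename worked.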
