December 29, 2025

Urgent call from Cloudflare outage

When Cloudflare Went Down: A Friday Night Story

Everything started around 17:00 (UTC+7), when I received a DM from our support team. They had noticed something strange while using our internal tools: when they clicked a link, the page loaded very slowly, and sometimes they saw a Cloudflare 500 error.

At first, I tried to help, but we couldn't find the root cause. Since we had just released a new version of the company's API, I assumed the servers were just restarting or reconnecting.

 

The Problem Gets Worse

After 30 minutes, the situation became more serious. Our CTO started seeing the same errors. We began to feel a bit worried, especially because we had just released a new monitoring tool that day! Soon after, Sentry started sending us a flood of alert emails.

After 30 stressful minutes of searching through the system and the release history, I finally checked DownDetector.

There I found the real reason: Cloudflare was having service issues. We felt relieved because we didn't need to change our code or fix the new release. Since these problems usually get fixed quickly, I sent a report to the CTO and headed home around 19:00.

 

Part Two: A Global Issue

The real trouble began at 19:30. After dinner, I tried to open a manga website, as I usually do. It was extremely slow and eventually showed a 500 error. At the same time, my girlfriend and her friends couldn't play League of Legends.

That is when I knew something big was happening. I checked our company website; it was still reachable from Thailand at that moment, but not for long.

 

Five minutes later, the CTO contacted me again. He said our branch in Denmark (Europe) could not access the website at all. I checked DownDetector again and sent a report to the company group to explain what was happening.

 

What I Learned

By 21:30, the problem had disappeared. After writing a final incident report for my boss, I thought about what had happened. This situation taught me a few important lessons:

  • Good tools help you stay calm: Even when things go wrong, having good monitoring and testing helps you understand the problem faster.

  • The "Single Point of Failure": When a huge service like Cloudflare fails, every website that depends on it can fail at the same time, no matter how healthy its own servers are.

  • Plan B: Our team is now discussing a backup plan in case Cloudflare has problems again in the future.
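
Looking back, one small check could have shortened the first hour of guessing: request the same page once through Cloudflare and once directly from the origin server, then compare the results. Below is a minimal sketch of that idea in Python. It is not what we run today. It assumes the origin is reachable at some URL that bypasses Cloudflare, and the function names and labels are made up for illustration.

```python
from typing import Optional
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with a 4xx/5xx status.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def classify(edge_status: Optional[int], origin_status: Optional[int]) -> str:
    """Guess where a failure sits from two probes of the same page.

    edge_status:   result of probing the public URL (through the CDN)
    origin_status: result of probing the origin directly (bypassing the CDN)
    The returned labels are hypothetical names chosen for this sketch.
    """
    def healthy(status: Optional[int]) -> bool:
        # Treat "no response" and any 5xx as unhealthy.
        return status is not None and status < 500

    edge_ok = healthy(edge_status)
    origin_ok = healthy(origin_status)

    if edge_ok and origin_ok:
        return "healthy"
    if origin_ok:
        # Origin answers fine, public URL does not: suspect the CDN layer.
        return "cdn-layer"
    if edge_ok:
        # Public URL still works (perhaps from cache) but the origin is failing.
        return "origin-degraded"
    return "origin-or-both"
```

In practice you would call something like classify(probe(public_url), probe(origin_url)), where both URLs are placeholders for your own health-check endpoints. A result of "cdn-layer" is only a hint to check the provider's status page before digging into your own release; it does not prove where the fault is.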