Real-Time Status Monitor

Is it just me, or is something actually down?


GitHub

Operational

All Systems Operational.

90-Day Uptime: 99.50% Checked: just now

GitLab

Operational

All Systems Operational

90-Day Uptime: 99.94% Checked: just now

Bitbucket

Operational

All Systems Operational

90-Day Uptime: 99.83% Checked: just now

npm Registry

Operational

All Systems Operational.

90-Day Uptime: 99.94% Checked: just now

Vercel

Operational

All Systems Operational.

90-Day Uptime: 99.83% Checked: just now

Netlify

Operational

All Systems Operational.

90-Day Uptime: 100.00% Checked: just now

Cloudflare

Degraded

We are continuing to investigate this issue.

90-Day Uptime: 99.78% Checked: just now

AWS

Outage

Increased failure rates.

90-Day Uptime: 97.78% Checked: just now

Google Cloud

Outage

Incident Report Summary

On Friday, 18 July 2025 07:50 US/Pacific, several Google Cloud Platform (GCP) and Google Workspace (GWS) products experienced elevated latencies and failure rates in the us-east1 region for a duration of up to 1 hour and 57 minutes.

GCP Impact Duration: 18 July 2025 07:50 - 09:47 US/Pacific: 1 hour 57 minutes
GWS Impact Duration: 18 July 2025 07:50 - 08:40 US/Pacific: 50 minutes

We sincerely apologize for this incident, which does not reflect the level of quality and reliability we strive to offer. We are taking immediate steps to improve the platform’s performance and availability.

Root Cause

The service interruption was triggered by a procedural error during a planned hardware replacement in our datacenter. An incorrect physical disconnection was made to the active network switch serving our control plane, rather than the redundant unit scheduled for removal. The redundant unit had been properly de-configured as part of the procedure, and the combination of these two events led to partitioning of the network control plane.

Our network is designed to withstand this type of control plane failure by failing open and continuing operation. However, an operational topology change while the network control plane was in a failed-open state caused our network fabric's topology information to become stale. This led to lost data packets and service disruption until services were moved away from the fabric and control plane connectivity was restored.

Remediation and Prevention

Google engineers were alerted to the outage by our monitoring system on 18 July 2025 07:06 US/Pacific and immediately started a review. The following timeline details the remediation and restoration efforts:

07:39 US/Pacific: The underlying root cause (device disconnect) was identified and onsite technicians were engaged to reconnect the control plane device and restore control plane connectivity. At that moment, network fail-open mechanisms worked as expected and no impact was observed.

07:50 US/Pacific: A topology change led to traffic being routed suboptimally, due to the network being in a fail-open state. This caused traffic slowdown on a subset of links, lost data packets, and response delays to customer traffic. Engineers made a decision to move traffic away from the affected fabric, which temporarily patched the impact for the majority of the services.

08:40 US/Pacific: Engineers temporarily patched Workspace impact by shifting traffic away from the affected region.

09:47 US/Pacific: Onsite technicians reconnected the device, control plane connectivity was fully restored, and all services returned to a stable state.

Google is committed to preventing a repeat of the issue in the future, and is completing the following actions:

* Pause non-critical workflows until safety controls are implemented (complete).
* Strengthen safety controls for hardware upgrade workflows by end of Q3 2025.
* Design and implement a mechanism to prevent control plane partitioning in case of dual failure of upstream routers by end of Q4 2025.

Detailed Description of Impact

GCP Impact: Multiple products in us-east1 were affected by the loss of network connectivity, with the most significant impacts seen in us-east1-b. Other regions were not affected. The outage caused a range of issues for customers with zonal resources in the region, including lost data packets across VPC networks, increased failure rates and response delays, service unavailable (503) errors, and slow or stuck operations up to loss of networking connectivity. While regional products were briefly impacted, they recovered quickly by failing over to unaffected zones. A small number (0.1%) of Persistent Disks in us-east1-b were unavailable for the duration of the outage; these disks became available once the outage was temporarily patched, with no customer data loss.

GWS Impact: A small subset of Workspace users, primarily around the Southeast US, experienced varying degrees of unavailability and increased delays across multiple products, including Gmail, Google Meet, Google Drive, Google Chat, Google Calendar, Google Groups, Google Docs/Editors, and Google Voice.

90-Day Uptime: 97.72% Checked: just now

Azure

Operational

Everything is looking good

90-Day Uptime: 100.00% Checked: just now

OpenAI

Operational

Investigation in progress.

90-Day Uptime: 99.00% Checked: just now

Anthropic

Operational

All Systems Operational.

90-Day Uptime: 99.61% Checked: just now

Hugging Face

Operational

All Systems Operational

90-Day Uptime: 99.44% Checked: just now

Replicate

Operational

All Systems Operational.

90-Day Uptime: 99.56% Checked: just now

Stripe

Operational

All Systems Operational

90-Day Uptime: 99.94% Checked: just now

PayPal

Operational

All Systems Operational

90-Day Uptime: 100.00% Checked: just now

Shopify

Operational

All Systems Operational.

90-Day Uptime: 99.94% Checked: just now

Discord

Operational

All Systems Operational.

90-Day Uptime: 100.00% Checked: just now

Slack

Operational

All Systems Operational

90-Day Uptime: 99.50% Checked: just now

Twilio

Degraded

Engineers are looking into an issue: Twilio customers may be experiencing SMS delivery delays from a subset of Twilio Short Codes to Telcel network subscribers in Mexico. Our team is actively looking into this issue. We will provide another update in 2 hours or as soon as more information becomes available.

90-Day Uptime: 99.67% Checked: just now

SendGrid

Operational

All Systems Operational.

90-Day Uptime: 99.94% Checked: just now

Mailchimp

Operational

All Systems Operational

90-Day Uptime: 99.50% Checked: just now

Notion

Operational

All Systems Operational

90-Day Uptime: 99.50% Checked: just now

Atlassian (Jira/Confluence)

Operational

All Systems Operational.

90-Day Uptime: 99.44% Checked: just now

Figma

Operational

All Systems Operational.

90-Day Uptime: 99.56% Checked: just now

HubSpot

Operational

All Systems Operational.

90-Day Uptime: 100.00% Checked: just now

Zoom

Operational

GIFs are re-enabled for clients on versions below 7.0.0 and above 7.0.4, but users may need to re-login to send GIFs. Users on affected versions (7.0.0 – 7.0.4) can upgrade to 7.0.5 to send GIFs.

90-Day Uptime: 99.78% Checked: just now

Recent Incidents Feed

Anthropic Service Restoration

Resolved. Services returned to baseline response profiles.

SMS Delivery Delays from a Subset of Twilio Short Codes to Telcel Mexico

Investigation in progress.

OpenAI Service Restoration

Resolved. Services returned to baseline response profiles.

Elevated error rates on Codex, ChatGPT and Responses API

Investigation in progress.

OpenAI Service Restoration

Resolved. Services returned to baseline response profiles.

Network Performance Issues in TLV

Investigation in progress.

SMS Delivery Delays from Twilio to Vodacom Tanzania

Investigation in progress.

Voice Call Failures from Twilio Phone Numbers to Vivo Brazil

Investigation in progress.

Atlassian (Jira/Confluence) Service Restoration

Resolved. Services returned to baseline response profiles.

SMS Delivery Delays and Failures from Twilio to NumberBarn United States

Investigation in progress.

Why Choose IsItDown?

Official status pages often take 10–30 minutes to declare an outage. Our **Lag Detector** polls official status APIs and combines the results with user reports to flag disruptions before the official pages update.

We translate confusing JSON statuses like `degraded us-east-1 queue backlog` into plain English.


Frequently Asked Questions

Q: What is IsItDown?

IsItDown is a real-time service status aggregator and outage tracker. We monitor API portals, cloud hosting infrastructure, payment gateways, and developer tools to give you an immediate view of internet-wide health.

Q: How does IsItDown detect outages faster than official status pages?

Official status pages often take 10 to 30 minutes to declare an incident. We combine active API status polling (our Lag Detector) with real-time user report submissions (our Community Outage Pulse) to flag disruptions before they are officially declared.
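
As a rough illustration of the idea above, the sketch below flags a service when the official status still reads operational but user reports have spiked. It is not IsItDown's actual implementation: the `Snapshot` fields, the function name, and both thresholds (`spike_factor`, `min_reports`) are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One observation of a service (all field names are hypothetical)."""
    official_status: str      # value from the polled status API, e.g. "operational"
    recent_reports: int       # user reports received in the current window
    baseline_reports: float   # typical report count for a window of the same length


def likely_undeclared_outage(
    snap: Snapshot,
    spike_factor: float = 3.0,  # assumed threshold, not from the source
    min_reports: int = 20,      # assumed floor to ignore small-sample noise
) -> bool:
    """Return True when users report trouble but the vendor has not declared it."""
    if snap.official_status != "operational":
        # The vendor has already acknowledged a problem, so there is no lag to detect.
        return False
    spiked = snap.recent_reports >= spike_factor * snap.baseline_reports
    return spiked and snap.recent_reports >= min_reports


if __name__ == "__main__":
    quiet = Snapshot("operational", recent_reports=4, baseline_reports=5.0)
    noisy = Snapshot("operational", recent_reports=60, baseline_reports=5.0)
    print(likely_undeclared_outage(quiet))
    print(likely_undeclared_outage(noisy))
```

Requiring both a relative spike and a minimum absolute count is one way to avoid false alarms for services that normally receive very few reports.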

Q: What is the Plain-English Translation engine?

Technical status logs can be highly cryptic (e.g., 'us-east-1 connection backlog mitigated'). We translate these technical messages into simple, plain-English statements so you know exactly which services are impacted.

Q: How is the 90-day reliability score calculated?

We log one uptime record per day for each monitored platform. An operational day scores 100%, a day with a minor degradation scores 95%, and a day with a major outage scores 60%. The daily scores are weighted and averaged over the past 90 days.
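
The calculation can be sketched as follows. The per-day scores come from the answer above; the weighting scheme is not described, so this example assumes every day carries equal weight. The status labels and the function name are also illustrative.

```python
from typing import Sequence

# Per-day scores as stated in the FAQ answer.
DAY_SCORES = {
    "operational": 100.0,
    "degraded": 95.0,   # minor degradation
    "outage": 60.0,     # major outage
}


def reliability_score(daily_statuses: Sequence[str], window: int = 90) -> float:
    """Average the daily scores over the most recent `window` days.

    Assumption: all days are weighted equally, since the actual weights
    are not documented.
    """
    recent = list(daily_statuses)[-window:]
    if not recent:
        raise ValueError("at least one daily status is required")
    total = sum(DAY_SCORES[status] for status in recent)
    return round(total / len(recent), 2)


if __name__ == "__main__":
    # 89 operational days and one degraded day: (89 * 100 + 95) / 90
    history = ["operational"] * 89 + ["degraded"]
    print(reliability_score(history))  # 99.94
```

Under the equal-weight assumption, a single major outage day in an otherwise clean 90-day window gives (89 × 100 + 60) / 90 ≈ 99.56%.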

Q: Are status subscriptions free?

Yes, you can subscribe to receive email alerts for any service. Navigate to the service's status page, enter your email address, and we will automatically notify you when status changes occur.
