This guide explains UptimeIO’s intelligent incident detection system and how to interpret incident data.
What is an Incident?
An incident represents a period when your monitored service is unavailable or not meeting expectations. UptimeIO creates incidents only after confirming the issue across multiple regions to avoid false positives.
How Incidents Are Created
UptimeIO uses a multi-region consensus system to ensure incidents are real, not transient network issues.
1. Initial Check Fails
Your monitor performs a check from Region A (e.g., US East). No incident is created yet, since this could be a temporary issue.
2. First Retry (1 second later)
UptimeIO automatically retries from Region B (e.g., Europe). Still no incident; UptimeIO waits for final confirmation.
3. Second Retry (1 second later)
The final retry runs from Region C (e.g., Asia). The incident is created: 3 failures from 2+ different regions confirm the issue is real.
4. Notifications Sent
All integrations in your notification profiles receive alerts:
- 📧 Email notifications
- 📱 SMS messages
- 💬 Slack/Discord messages
- 🔗 Webhook calls
Why 3 failures from 2+ regions? This consensus mechanism virtually eliminates false positives caused by:
- Temporary network glitches
- Single region outages
- ISP routing issues
- Transient server hiccups
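The consensus rule described above can be sketched in a few lines of Python. This is only an illustration of the "3 failures from 2+ regions" rule as this guide states it, not UptimeIO's actual implementation; `CheckResult`, `should_open_incident`, and the region labels are hypothetical names.

```python
from dataclasses import dataclass

REQUIRED_FAILURES = 3   # consecutive failed checks needed
REQUIRED_REGIONS = 2    # distinct regions that must have seen a failure


@dataclass(frozen=True)
class CheckResult:
    region: str   # hypothetical region label, e.g. "us-east"
    ok: bool      # True if the check passed


def should_open_incident(results: list[CheckResult]) -> bool:
    """Return True if the most recent checks satisfy the consensus rule."""
    recent = results[-REQUIRED_FAILURES:]
    if len(recent) < REQUIRED_FAILURES:
        return False                      # not enough checks yet
    if any(r.ok for r in recent):
        return False                      # a success breaks the failure streak
    return len({r.region for r in recent}) >= REQUIRED_REGIONS


checks = [
    CheckResult("us-east", ok=False),   # initial check fails
    CheckResult("europe", ok=False),    # first retry
    CheckResult("asia", ok=False),      # second retry
]
print(should_open_incident(checks[:2]))  # False
print(should_open_incident(checks))      # True
```

Three failures from a single region return `False` in this sketch, which is how a single-region outage is kept from opening an incident.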
Incident Timeline
Each incident has a detailed timeline showing exactly what happened.
Incident Recovery
Recovery works the same way as incident creation: 3 successful checks from 2+ regions are required to resolve an incident.
1. First Success
After the incident is created, the next scheduled check succeeds.
2. Second Success
One second later, a check from another region succeeds.
3. Third Success
Final confirmation arrives from a third region. Recovery notifications are sent to all integrations.
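Because creation and recovery use the same rule, both can be modeled as one small state machine. The sketch below assumes the three checks must be consecutive and tracks only the last three results; `IncidentTracker` and its return values are hypothetical names, not part of any UptimeIO API.

```python
from collections import deque
from typing import Optional

CONFIRMATIONS = 3   # consecutive matching checks required
MIN_REGIONS = 2     # distinct regions among those checks


class IncidentTracker:
    """Tracks whether an incident is open, changing state only on consensus."""

    def __init__(self) -> None:
        self.incident_open = False
        self._recent = deque(maxlen=CONFIRMATIONS)  # (region, ok) pairs

    def record(self, region: str, ok: bool) -> Optional[str]:
        """Record one check; return "opened", "resolved", or None."""
        self._recent.append((region, ok))
        if len(self._recent) < CONFIRMATIONS:
            return None
        if len({r for r, _ in self._recent}) < MIN_REGIONS:
            return None
        outcomes = {o for _, o in self._recent}
        if outcomes == {False} and not self.incident_open:
            self.incident_open = True
            return "opened"
        if outcomes == {True} and self.incident_open:
            self.incident_open = False
            return "resolved"
        return None


tracker = IncidentTracker()
events = [tracker.record(region, ok) for region, ok in [
    ("us-east", False), ("europe", False), ("asia", False),
    ("us-east", True), ("europe", True), ("asia", True),
]]
print(events)  # [None, None, 'opened', None, None, 'resolved']
```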
Incident Types
Monitor Down
The primary incident type, created when health checks fail. Common causes:
- Server down or restarting
- Network connectivity issues
- DNS resolution failures
- Firewall blocking requests
- Application crashes
Slow Response
Created when response times exceed your configured threshold. Example:
- Threshold: 2000ms
- Incident created: Response time 2500ms
- Incident resolved: Response time drops below 1600ms (80% of 2000ms) for 3 consecutive checks
Slow response incidents are separate from downtime incidents. Your monitor can be “UP” but have an active slow response incident.
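The example above describes hysteresis: the incident opens above the threshold but resolves only after response times stay below 80% of it. A minimal sketch of that logic follows. It assumes a single slow check opens the incident (this guide does not say whether multi-region consensus also applies here), and `slow_response_states` is a hypothetical helper.

```python
THRESHOLD_MS = 2000      # configured slow-response threshold
RECOVERY_RATIO = 0.8     # resolve below 80% of the threshold (1600ms)
RECOVERY_CHECKS = 3      # consecutive fast checks needed to resolve


def slow_response_states(response_times_ms):
    """Yield whether a slow-response incident is active after each check."""
    active = False
    fast_streak = 0
    for ms in response_times_ms:
        if not active:
            if ms > THRESHOLD_MS:
                active = True
                fast_streak = 0
        elif ms < THRESHOLD_MS * RECOVERY_RATIO:
            fast_streak += 1
            if fast_streak >= RECOVERY_CHECKS:
                active = False
        else:
            fast_streak = 0   # a check at or above 1600ms restarts the count
        yield active


times = [800, 2500, 1700, 1500, 1400, 1900, 1500, 1400, 1300]
print(list(slow_response_states(times)))
# [False, True, True, True, True, True, True, True, False]
```

The gap between the 2000ms trigger and the 1600ms recovery level keeps an incident from flapping open and closed when response times hover near the threshold.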
SSL Certificate Expiry
Alerts are sent before SSL certificates expire:
- 30 days before expiry
- 7 days before expiry
- 1 day before expiry
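As a rough illustration, the alert windows above can be checked against a certificate's expiry time like this. `due_alerts` is a hypothetical helper, and how UptimeIO actually schedules or de-duplicates these alerts is not covered by this guide.

```python
from datetime import datetime, timedelta, timezone

ALERT_DAYS = (30, 7, 1)  # alert windows listed above, in days before expiry


def due_alerts(expires_at: datetime, now: datetime) -> list[int]:
    """Return the alert windows (in days) the certificate has already entered."""
    remaining = expires_at - now
    return [d for d in ALERT_DAYS if remaining <= timedelta(days=d)]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(due_alerts(datetime(2025, 3, 1, tzinfo=timezone.utc), now))   # []
print(due_alerts(datetime(2025, 1, 21, tzinfo=timezone.utc), now))  # [30]
print(due_alerts(datetime(2025, 1, 6, tzinfo=timezone.utc), now))   # [30, 7]
```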
DNS Error
Triggered when DNS resolution fails or returns unexpected values.
Reading Incident Details
Incident Status
| Status | Meaning |
|---|---|
| Open | Incident is active; the issue is ongoing |
| Resolved | Service recovered, incident closed |
Incident Metadata
Each incident includes:
- Start Time: When the incident was created (after 3 failures were confirmed).
- Duration: How long the incident lasted. For open incidents, this shows the elapsed time.
- Verification Details: Consensus information for the incident.
- Error Details: The specific error from the first failure.
- Affected Monitor: The monitor that triggered the incident.
Incident Notifications
When an incident is created or resolved, notifications are sent through all integrations in your assigned notification profiles.
Preventing False Positives
UptimeIO’s consensus system prevents false positives, but you can further reduce them:
Use multiple regions
Always monitor from at least 2 regions (required). For critical services, use 3-4 regions.
Set appropriate timeouts
- Fast APIs: 10-15 seconds
- Standard sites: 30 seconds
- Slow services: 45-60 seconds
Configure expected status codes correctly
Ensure your expected status codes match what your service actually returns.
Whitelist UptimeIO
If you use a firewall or rate limiting, whitelist UptimeIO’s user agent.
Incident History
View all past incidents for a monitor:
1. Go to Monitor Details
Click on any monitor from your monitors list.
2. Scroll to Recent Incidents
The “Recent Incidents” section shows the last 10 incidents.
3. View All Incidents
Click “View All Incidents” to see the complete history.
4. Filter and Search
Filter by:
- Date range
- Status (open/resolved)
- Incident type
- Duration
Incident Metrics
UptimeIO calculates key metrics from your incident history:

| Metric | Description | Calculation |
|---|---|---|
| Uptime % | Percentage of time service was available | (Total Time - Downtime) / Total Time × 100 |
| Downtime | Total time in incidents | Sum of all incident durations |
| MTBF | Mean Time Between Failures | Average time between incidents |
| MTTR | Mean Time To Recovery | Average incident duration |
| Incident Count | Total number of incidents | Count of all incidents |
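
The calculations in the table can be worked through with a short Python sketch. It assumes resolved, non-overlapping incidents that fall inside the reporting window, and it reads "time between incidents" as the gap from one incident's end to the next one's start; UptimeIO's exact definition may differ. `incident_metrics` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def incident_metrics(incidents, window_start, window_end):
    """Compute the table's metrics from (start, end) datetime pairs.

    `incidents` must be sorted by start time and lie inside the window.
    """
    total = window_end - window_start
    downtime = sum((end - start for start, end in incidents), timedelta())
    count = len(incidents)
    # Gap between the end of one incident and the start of the next.
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(incidents, incidents[1:])]
    return {
        "uptime_pct": 100 * (total - downtime) / total,
        "downtime": downtime,
        "mtbf": sum(gaps, timedelta()) / len(gaps) if gaps else None,
        "mttr": downtime / count if count else None,
        "incident_count": count,
    }


incidents = [
    (datetime(2025, 1, 2, 0), datetime(2025, 1, 2, 2)),    # 2 hours
    (datetime(2025, 1, 6, 0), datetime(2025, 1, 6, 1)),    # 1 hour
    (datetime(2025, 1, 9, 12), datetime(2025, 1, 9, 15)),  # 3 hours
]
m = incident_metrics(incidents, datetime(2025, 1, 1), datetime(2025, 1, 11))
print(m["uptime_pct"])  # 97.5
print(m["downtime"])    # 6:00:00
print(m["mttr"])        # 2:00:00
```

Here 6 hours of downtime in a 240-hour window gives (240 - 6) / 240 × 100 = 97.5% uptime, and the mean incident duration is 6 / 3 = 2 hours.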