This guide explains how to read and interpret the metrics UptimeIO provides for your monitors.

Dashboard Overview

When you log in, your dashboard shows key metrics at a glance:
┌─────────────────────────────────────┐
│ Organization Overview               │
├─────────────────────────────────────┤
│ Total Monitors: 15                  │
│ Active: 12 | Paused: 2 | Down: 1    │
│                                     │
│ Overall Uptime (24h): 99.8%         │
│ Average Response Time: 145ms        │
│ Active Incidents: 1                 │
└─────────────────────────────────────┘
The dashboard provides a quick health check of all your services.

Uptime Percentage

Uptime percentage is the most important metric: it tells you what percentage of the time your service was available.

How It’s Calculated

Uptime % = (Total Time - Downtime) / Total Time × 100
Example:
  • Period: 24 hours (1,440 minutes)
  • Downtime: 5 minutes
  • Uptime: (1,440 - 5) / 1,440 × 100 = 99.65%
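
The formula above can be sketched in a few lines of Python. The function name `uptime_percent` is an illustrative assumption, not part of UptimeIO; the numbers are the ones from the example.

```python
def uptime_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Return uptime as a percentage of the total period."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return (total_minutes - downtime_minutes) / total_minutes * 100


# 24-hour period (1,440 minutes) with 5 minutes of downtime.
print(round(uptime_percent(1440, 5), 2))  # 99.65
```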

Time Periods

UptimeIO shows uptime for multiple periods:
Period | Use Case
Last 24 hours | Recent performance
Last 7 days | Weekly trends
Last 30 days | Monthly SLA tracking
Last 90 days | Quarterly reporting

Uptime is calculated from actual check results within your plan’s data retention period.

Reading Uptime Values

Uptime % | Status | Meaning
100% | 🟢 Perfect | No downtime at all
99.9%+ | 🟢 Excellent | < 43 minutes downtime per month
99.5-99.9% | 🟡 Good | 43 minutes - 3.6 hours per month
99.0-99.5% | 🟡 Fair | 3.6 - 7.2 hours per month
< 99.0% | 🔴 Poor | > 7.2 hours downtime per month

Downtime Equivalents

Understanding what uptime percentages mean in real time:
Uptime % | Downtime per Day | Downtime per Week | Downtime per Month | Downtime per Year
99.9% (three nines) | 1.4 minutes | 10 minutes | 43 minutes | 8.7 hours
99.95% | 43 seconds | 5 minutes | 22 minutes | 4.4 hours
99.99% (four nines) | 8.6 seconds | 1 minute | 4.3 minutes | 52 minutes
99.999% (five nines) | 0.9 seconds | 6 seconds | 26 seconds | 5.3 minutes
Many SLAs target 99.9% (three nines) uptime, which allows about 43 minutes of downtime in a 30-day month.
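
The downtime equivalents can be derived from the uptime percentage directly. This is a minimal sketch: `allowed_downtime_seconds` and `PERIOD_SECONDS` are hypothetical names, and it assumes a 30-day month and a 365-day year.

```python
# Period lengths in seconds. The 30-day month and 365-day year are assumptions.
PERIOD_SECONDS = {
    "day": 24 * 3600,
    "week": 7 * 24 * 3600,
    "month": 30 * 24 * 3600,
    "year": 365 * 24 * 3600,
}


def allowed_downtime_seconds(uptime_percent: float, period: str) -> float:
    """Downtime budget (in seconds) implied by an uptime target."""
    return PERIOD_SECONDS[period] * (100 - uptime_percent) / 100


# Three nines over a 30-day month, in minutes.
print(round(allowed_downtime_seconds(99.9, "month") / 60, 1))  # 43.2
```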

Response Time

Response time measures how long it takes for your service to respond to requests.

What’s Measured

Total Response Time = DNS + TCP + TLS + Server Response
Breakdown:
  • DNS Resolution: Time to resolve domain to IP (10-50ms typical)
  • TCP Connection: Time to establish connection (20-100ms typical)
  • TLS Handshake: SSL/TLS negotiation (50-200ms typical)
  • Server Response: Time for server to process and respond (varies)

Example Response Time

Monitor: API Server
Total Response Time: 245ms

Breakdown:
├─ DNS Resolution: 12ms
├─ TCP Connection: 45ms
├─ TLS Handshake: 78ms
└─ Server Response: 110ms
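
As a quick check on the breakdown, the phases should sum to the total, and the largest phase shows where optimization would pay off most. The dictionary keys below are illustrative names, not UptimeIO field names.

```python
# Phase timings in milliseconds, copied from the example above.
phases = {
    "dns_resolution": 12,
    "tcp_connection": 45,
    "tls_handshake": 78,
    "server_response": 110,
}

total_ms = sum(phases.values())
slowest_phase = max(phases, key=phases.get)
share = phases[slowest_phase] / total_ms * 100

print(f"Total: {total_ms}ms")  # Total: 245ms
print(f"Slowest phase: {slowest_phase} ({share:.1f}% of total)")
```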

Response Time Metrics

UptimeIO shows multiple response time statistics:
Mean response time across all checks in the period.
Average: 145ms
  • Good for: General performance trends
  • Limitation: Can be skewed by outliers

Fastest response time recorded.
Minimum: 85ms
  • Good for: Best-case performance
  • Use: Baseline for optimization

Slowest response time recorded.
Maximum: 450ms
  • Good for: Identifying performance spikes
  • Use: Troubleshooting slow requests

95th percentile: 95% of requests were faster than this.
P95: 245ms
  • Good for: Real-world user experience
  • Use: SLA targets (a more representative basis than the average)
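
To see why P95 is less sensitive to outliers than the average, the statistics can be computed from raw samples. This sketch uses the nearest-rank percentile method, which is an assumption: the method UptimeIO actually uses is not stated here. The sample data is made up for illustration.

```python
import math
from statistics import mean


def percentile_nearest_rank(samples, pct):
    """Smallest sample such that at least pct% of samples are <= it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def summarize(samples_ms):
    """Average, minimum, maximum, and P95 for a list of response times."""
    return {
        "average": mean(samples_ms),
        "minimum": min(samples_ms),
        "maximum": max(samples_ms),
        "p95": percentile_nearest_rank(samples_ms, 95),
    }


# 19 ordinary checks (131..149 ms) plus one 450 ms outlier.
samples_ms = list(range(131, 150)) + [450]
stats = summarize(samples_ms)
print(stats)
```

With this data the single outlier pulls the average up to 155.5ms, above every ordinary check, while P95 stays at 149ms.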

Response Time Ranges

Response Time | Status | User Experience
< 100ms | 🟢 Excellent | Instant, imperceptible
100-300ms | 🟢 Good | Fast, slight delay
300-1000ms | 🟡 Acceptable | Noticeable delay
1000-3000ms | 🟠 Slow | Frustrating
> 3000ms | 🔴 Very Slow | Unacceptable
Response time expectations vary by service type. APIs should be < 500ms, while complex web pages can be 1-2 seconds.
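
The ranges can be expressed as a small classifier, for example when post-processing exported data. The function name is hypothetical, and the handling of exact boundary values (100, 300, 1000, 3000) is an assumption, since the ranges in the table share their endpoints.

```python
def classify_response_time(ms: float) -> str:
    """Map a response time in milliseconds to a status label."""
    if ms < 100:
        return "Excellent"
    if ms < 300:
        return "Good"
    if ms < 1000:
        return "Acceptable"
    if ms <= 3000:  # assumption: exactly 3000ms still counts as Slow
        return "Slow"
    return "Very Slow"


for sample in (85, 145, 450, 3500):
    print(sample, classify_response_time(sample))
```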

Response Time Graph

The response time graph shows performance over time:
Response Time (ms)
500 ┤                                    ╭─╮
400 ┤                          ╭─╮      │ │
300 ┤              ╭─╮    ╭───╯ ╰──╮   │ ╰─╮
200 ┤      ╭───────╯ ╰────╯         ╰───╯   ╰─╮
100 ┤──────╯                                  ╰──
    └────────────────────────────────────────────
    12:00  14:00  16:00  18:00  20:00  22:00

Reading the Graph

200ms ──────────────────────────
Meaning: Consistent performance
Status: ✅ Healthy

300ms              ╭───────
200ms      ╭───────╯
100ms ─────╯
Meaning: Performance degrading
Action: Investigate resource usage

500ms      ╭╮
300ms      ││    ╭╮
100ms ─────╯╰────╯╰─────
Meaning: Intermittent slowdowns
Action: Check for:
  • Traffic spikes
  • Background jobs
  • Database queries
  • External API calls

200ms ────    ────    ────
Meaning: Failed checks (no response)
Status: ⚠️ Downtime periods

Check Success Rate

The percentage of all checks that succeeded.
Success Rate = Successful Checks / Total Checks × 100
Example:
  • Total checks: 1,000
  • Successful: 998
  • Failed: 2
  • Success Rate: 998 / 1,000 × 100 = 99.8%
Success rate is similar to uptime but counts individual checks rather than time periods.
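
A minimal sketch of the success-rate calculation over individual check results. Representing each check as a boolean is an assumption made for illustration.

```python
def success_rate(results) -> float:
    """Percentage of checks that succeeded; each result is True or False."""
    results = list(results)
    if not results:
        raise ValueError("at least one check is required")
    return sum(results) / len(results) * 100


# 998 successful checks and 2 failures, as in the example above.
checks = [True] * 998 + [False] * 2
print(round(success_rate(checks), 1))  # 99.8
```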

Recent Checks

The Recent Checks section shows the last 10-20 checks:
Recent Checks
─────────────────────────────────────────────
✅ 10:45:00 | 145ms | 200 OK | US East
✅ 10:40:00 | 152ms | 200 OK | Europe
✅ 10:35:00 | 138ms | 200 OK | Asia
❌ 10:30:00 | N/A   | Timeout | US East
✅ 10:25:00 | 141ms | 200 OK | Europe

Check Details

Each check shows:
  • Status: ✅ Success or ❌ Failure
  • Time: When the check ran
  • Response Time: How long it took
  • Status Code: HTTP status code (for HTTP monitors)
  • Region: Where the check ran from
Recent checks help you spot patterns like specific regions having issues or time-of-day performance variations.
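
One way to spot a regional pattern is to count failures per region. The tuples below mirror the sample list above; the data structure itself is an illustrative assumption, not an UptimeIO format.

```python
from collections import Counter

# (time, response_ms, result, region); response_ms is None for failed checks.
recent_checks = [
    ("10:45:00", 145, "200 OK", "US East"),
    ("10:40:00", 152, "200 OK", "Europe"),
    ("10:35:00", 138, "200 OK", "Asia"),
    ("10:30:00", None, "Timeout", "US East"),
    ("10:25:00", 141, "200 OK", "Europe"),
]

totals = Counter(region for _, _, _, region in recent_checks)
failures = Counter(region for _, ms, _, region in recent_checks if ms is None)

for region, total in totals.items():
    print(f"{region}: {failures[region]}/{total} checks failed")
```

A single failure is not a trend, but a region that keeps appearing in `failures` across many checks is worth investigating.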

Incident Metrics

Mean Time Between Failures (MTBF)

Average time between incidents.
MTBF = Total Uptime / Number of Incidents
Example:
  • Period: 30 days (total uptime is approximately 30 days when downtime is brief)
  • Incidents: 3
  • MTBF: 30 days / 3 = 10 days
Higher is better - longer time between incidents means more reliability.

Mean Time To Recovery (MTTR)

Average time to resolve incidents.
MTTR = Total Downtime / Number of Incidents
Example:
  • Total downtime: 15 minutes
  • Incidents: 3
  • MTTR: 15 / 3 = 5 minutes
Lower is better - faster recovery means less impact.
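
Both incident metrics can be computed from the same three inputs. This sketch combines the two examples above (a 30-day period, 3 incidents, 15 minutes of total downtime), which is an assumption; the function names are hypothetical.

```python
def mtbf_minutes(period_minutes: float, downtime_minutes: float, incidents: int) -> float:
    """Mean Time Between Failures: total uptime divided by incident count."""
    if incidents <= 0:
        raise ValueError("MTBF is undefined without incidents")
    return (period_minutes - downtime_minutes) / incidents


def mttr_minutes(downtime_minutes: float, incidents: int) -> float:
    """Mean Time To Recovery: total downtime divided by incident count."""
    if incidents <= 0:
        raise ValueError("MTTR is undefined without incidents")
    return downtime_minutes / incidents


period = 30 * 24 * 60  # 30 days in minutes
print(mttr_minutes(15, 3))                                 # 5.0
print(round(mtbf_minutes(period, 15, 3) / (24 * 60), 2))   # 10.0 (days)
```

Subtracting the 15 minutes of downtime gives an MTBF of 14,395 minutes, which rounds to the 10 days shown in the example.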

Status Page Metrics

If you have a status page, additional metrics are available:
Shows current uptime percentage:
╔═══════════════╗
║  99.9% Uptime ║
║  Last 30 Days ║
╚═══════════════╝

Real-time status of each monitored service:
  • 🟢 Operational
  • 🟡 Degraded Performance
  • 🔴 Major Outage
  • 🔵 Under Maintenance

Public timeline of past incidents with:
  • Start and end times
  • Duration
  • Impact description
  • Resolution notes

Exporting Metrics

Export metrics for reporting or analysis:
  1. Go to Monitor Details: Click on any monitor from your monitors list.
  2. Select Time Period: Choose the date range you want to export.
  3. Export Data: Click “Export” and choose a format:
     • CSV: For spreadsheets
     • JSON: For programmatic access
     • PDF: For reports

CSV Export Example

timestamp,status,response_time_ms,status_code,region
2024-01-15T10:00:00Z,success,145,200,us-east
2024-01-15T10:05:00Z,success,152,200,europe
2024-01-15T10:10:00Z,failure,0,0,asia
2024-01-15T10:15:00Z,success,138,200,us-east
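
An exported CSV in this shape can be analyzed with Python's standard `csv` module. The sketch below embeds the sample rows as a string so it is self-contained; in practice you would open the exported file instead. Failed rows carry `0` as their response time, so they are excluded from the average.

```python
import csv
import io

EXPORT = """\
timestamp,status,response_time_ms,status_code,region
2024-01-15T10:00:00Z,success,145,200,us-east
2024-01-15T10:05:00Z,success,152,200,europe
2024-01-15T10:10:00Z,failure,0,0,asia
2024-01-15T10:15:00Z,success,138,200,us-east
"""

rows = list(csv.DictReader(io.StringIO(EXPORT)))
successes = [row for row in rows if row["status"] == "success"]

success_rate = len(successes) / len(rows) * 100
# Average over successful checks only; failures have no real response time.
avg_ms = sum(int(row["response_time_ms"]) for row in successes) / len(successes)

print(f"Success rate: {success_rate:.1f}%")      # Success rate: 75.0%
print(f"Average response time: {avg_ms:.0f}ms")  # Average response time: 145ms
```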

API Access to Metrics

Retrieve metrics programmatically via API:
# Get monitor status
curl https://api.uptimeio.com/v1/monitors/mon_abc123/status \
  -H "Authorization: Bearer YOUR_API_KEY"

# Response
{
  "success": true,
  "data": {
    "current_status": "up",
    "uptime": {
      "last_24h": 99.95,
      "last_7d": 99.87,
      "last_30d": 99.92
    },
    "response_time": {
      "last_24h": 145,
      "last_7d": 152,
      "last_30d": 148
    }
  }
}
See the API Reference for complete documentation.
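
The same request can be made from Python using only the standard library. This is a sketch under assumptions: the endpoint, header, and response fields are taken from the curl example above, while the function names, the timeout, and the handling of a `"success": false` response are illustrative choices, not documented behavior.

```python
import json
import urllib.request

API_BASE = "https://api.uptimeio.com/v1"


def build_status_request(monitor_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for a monitor's status."""
    return urllib.request.Request(
        f"{API_BASE}/monitors/{monitor_id}/status",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def parse_status(body: str) -> dict:
    """Extract the "data" object from a status response body."""
    payload = json.loads(body)
    if not payload.get("success"):
        # Assumption: the error response shape is not shown in this guide.
        raise RuntimeError("API reported an unsuccessful response")
    return payload["data"]


def fetch_status(monitor_id: str, api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and return the parsed status data."""
    request = build_status_request(monitor_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_status(response.read().decode("utf-8"))


# Usage (requires a real API key and network access):
# status = fetch_status("mon_abc123", "YOUR_API_KEY")
# print(status["uptime"]["last_24h"])
```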

Interpreting Trends

Comparing metrics across consecutive periods shows whether a service is improving or degrading.

Improving:
Uptime: 99.5% → 99.8% → 99.9%
Response Time: 200ms → 180ms → 150ms
Indicators:
  • ✅ Uptime increasing
  • ✅ Response time decreasing
  • ✅ Fewer incidents
Action: Keep monitoring, document what’s working

Degrading:
Uptime: 99.9% → 99.7% → 99.5%
Response Time: 150ms → 200ms → 250ms
Indicators:
  • ⚠️ Uptime decreasing
  • ⚠️ Response time increasing
  • ⚠️ More frequent incidents
Action: Investigate immediately:
  1. Check server resources (CPU, memory, disk)
  2. Review recent deployments
  3. Analyze error logs
  4. Check database performance
  5. Review third-party dependencies
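
The two patterns above can be told apart programmatically by comparing the first and last value of each series. This is a deliberately simple, hypothetical heuristic, not how UptimeIO evaluates trends. It is conservative: any sign of degradation wins over signs of improvement.

```python
def direction(values) -> str:
    """Compare the last value with the first: increasing, decreasing, or flat."""
    if values[-1] > values[0]:
        return "increasing"
    if values[-1] < values[0]:
        return "decreasing"
    return "flat"


def classify_trend(uptime_pct, response_ms) -> str:
    """Label a pair of metric series as degrading, improving, or stable."""
    uptime_dir = direction(uptime_pct)
    response_dir = direction(response_ms)
    if uptime_dir == "decreasing" or response_dir == "increasing":
        return "degrading"
    if uptime_dir == "increasing" or response_dir == "decreasing":
        return "improving"
    return "stable"


print(classify_trend([99.5, 99.8, 99.9], [200, 180, 150]))  # improving
print(classify_trend([99.9, 99.7, 99.5], [150, 200, 250]))  # degrading
```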

Best Practices

Don’t aim for 100% uptime, which is unrealistic. Set targets by how critical the service is:
  • Critical services: 99.9% (three nines)
  • Important services: 99.5%
  • Non-critical: 99.0%
Account for planned maintenance and deployments.

When you see unusual metrics:
  1. Check incident timeline
  2. Review recent changes
  3. Compare with other monitors
  4. Check external dependencies
  5. Review server logs

Track response time trends to predict when you’ll need to scale:
  • Gradual increases = growing load
  • Spikes at specific times = traffic patterns
  • Steady increases = resource exhaustion

Next Steps