Observability
Health Monitoring
Comprehensive health checks with readiness probes, liveness monitoring, and dependency health tracking for Kubernetes-native deployments.
Uptime: 99.97% (last 90 days)
Health Checks: 2.4M per day
Avg Response: 12ms (health endpoint)
Dependencies: 8/8 (all healthy)
Health Check Endpoints
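Standardized health endpoints for monitoring: /health for liveness, /health/ready for readiness, and /health/detailed for authenticated diagnostics.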
# Health Check Endpoints

# /health - Basic health check (liveness)
GET /health
{
  "status": "healthy",
  "timestamp": "2024-12-14T10:30:00Z",
  "version": "1.2.3",
  "uptime_seconds": 86400
}

# /health/ready - Readiness check
GET /health/ready
{
  "status": "ready",
  "checks": {
    "database": { "status": "up", "latency_ms": 2 },
    "redis": { "status": "up", "latency_ms": 1 },
    "ml_service": { "status": "up", "latency_ms": 15 }
  }
}

# /health/detailed - Full diagnostics
GET /health/detailed
Authorization: Bearer <internal-token>
{
  "status": "healthy",
  "checks": { ... },
  "metrics": {
    "memory_mb": 512,
    "cpu_percent": 25,
    "open_connections": 45,
    "queue_depth": 12
  }
}

Implementation
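The sample /health/ready response reports a latency_ms value for each dependency, which the handlers below do not measure. One way to produce it is a small timing wrapper around each probe. This is a minimal sketch using only the standard library; the names timed_check and run_checks and the 1-second default timeout are illustrative assumptions, not part of the original service.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Dict

Probe = Callable[[], Awaitable[Any]]


async def timed_check(probe: Probe, timeout_s: float = 1.0) -> Dict[str, Any]:
    """Run one dependency probe; report its status and latency in milliseconds."""
    started = time.perf_counter()
    try:
        # Bound the probe so a hung dependency cannot stall the readiness endpoint.
        await asyncio.wait_for(probe(), timeout=timeout_s)
        status = "up"
    except Exception:
        # Any failure, including a timeout, marks the dependency as down.
        status = "down"
    latency_ms = round((time.perf_counter() - started) * 1000, 1)
    return {"status": status, "latency_ms": latency_ms}


async def run_checks(probes: Dict[str, Probe]) -> Dict[str, Any]:
    """Run all probes concurrently and build a /health/ready style payload."""
    names = list(probes)
    results = await asyncio.gather(*(timed_check(probes[name]) for name in names))
    checks = dict(zip(names, results))
    all_up = all(check["status"] == "up" for check in checks.values())
    return {"status": "ready" if all_up else "not_ready", "checks": checks}
```

Inside the readiness handler, a probe could then be wrapped as `await timed_check(lambda: redis.ping())`. Running the probes concurrently keeps the endpoint's response time close to that of the slowest dependency instead of the sum of all of them.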
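# FastAPI Health Check Implementation
from datetime import datetime, timezone

from fastapi import APIRouter, Depends
from fastapi.responses import JSONResponse
from redis.asyncio import Redis  # assumes redis-py's asyncio client
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

# settings, get_db, and get_redis are project-specific and not shown here;
# the module paths below are placeholders, adjust them to your application.
from app.config import settings
from app.dependencies import get_db, get_redis

router = APIRouter(prefix="/health", tags=["Health"])
start_time = datetime.now(timezone.utc)  # timezone-aware; utcnow() is deprecated


@router.get("")
async def liveness():
    """Kubernetes liveness probe"""
    now = datetime.now(timezone.utc)
    return {
        "status": "healthy",
        "timestamp": now.isoformat(),
        "version": settings.VERSION,
        "uptime_seconds": (now - start_time).total_seconds(),
    }


@router.get("/ready")
async def readiness(
    db: AsyncSession = Depends(get_db),
    redis: Redis = Depends(get_redis),
):
    """Kubernetes readiness probe"""
    checks = {}

    # Check database
    try:
        await db.execute(text("SELECT 1"))
        checks["database"] = {"status": "up"}
    except Exception as e:
        checks["database"] = {"status": "down", "error": str(e)}

    # Check Redis
    try:
        await redis.ping()
        checks["redis"] = {"status": "up"}
    except Exception:
        checks["redis"] = {"status": "down"}

    # Return 503 if any dependency is down so Kubernetes stops routing traffic here
    all_up = all(c["status"] == "up" for c in checks.values())
    return JSONResponse(
        status_code=200 if all_up else 503,
        content={"status": "ready" if all_up else "not_ready", "checks": checks},
    )

Always-On Reliability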
Comprehensive health monitoring ensures maximum uptime.
99.97% Uptime | 8/8 Dependencies Healthy | 12ms Health Check
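The /health/detailed endpoint described earlier expects an Authorization: Bearer <internal-token> header, but the source does not show how that token is verified. The following is a hedged sketch of one possible check using only the standard library. The function name is_authorized and the way the expected token is supplied are assumptions for illustration.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Return True only for a header of the form 'Bearer <expected_token>'."""
    if not authorization_header:
        return False
    # Split "Bearer <token>" into its scheme and credential parts.
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest runs in constant time, so the comparison does not
    # reveal how many leading characters of the token matched.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())
```

A handler for /health/detailed could call this with the incoming header value and respond with 401 when it returns False, so that memory, CPU, and connection metrics are not exposed to unauthenticated callers.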