Resilience
Error Handling
Robust error handling with graceful degradation, intelligent retries, and circuit breakers for resilient systems.
0.12% Error Rate (last 24h)
94% Retry Success (after retry)
2 Circuit Opens (this week)
4.2m MTTR (avg recovery)
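The retry-success figure above comes from retrying transient failures with exponential backoff and jitter. A minimal sketch of that pattern (the `retry_async` helper and its parameters are ours for illustration, not JustKalm's API):

```python
import asyncio
import random

# Hypothetical retry helper (not from the JustKalm codebase): retries an
# async operation with exponential backoff plus full jitter, re-raising
# once the attempt budget is exhausted.
async def retry_async(op, *, attempts=3, base_delay=0.1, retryable=(TimeoutError,)):
    for attempt in range(attempts):
        try:
            return await op()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of budget: surface the last error
            # Full jitter: sleep a random amount in [0, base * 2^attempt)
            await asyncio.sleep(random.uniform(0, base_delay * 2 ** attempt))

# Usage: an operation that fails twice, then succeeds on the third try.
calls = {"n": 0}

async def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("upstream timeout")
    return "ok"

result = asyncio.run(retry_async(flaky))
```

Only transient error types (here `TimeoutError`) are retried; validation-style errors should fail immediately rather than burn the retry budget.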
Error Handling Patterns
Structured approach to handling errors at every layer.
# Error Handling Architecture
## Exception Hierarchy
JustKalmError (base)
├── ValidationError
│   ├── InvalidInputError
│   └── SchemaValidationError
├── AuthenticationError
│   ├── InvalidTokenError
│   └── ExpiredTokenError
├── AuthorizationError
│   ├── InsufficientScopeError
│   └── QuotaExceededError
├── ResourceError
│   ├── NotFoundError
│   └── ConflictError
└── ExternalServiceError
    ├── UpstreamTimeoutError
    └── ServiceUnavailableError
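The payoff of a single base class is that callers catch at whatever granularity they need: a handler for a mid-level class also sees all of its children. A minimal sketch (class bodies reduced to stubs; `classify` is our illustrative helper, not part of the source):

```python
# Minimal stand-ins for part of the hierarchy above.
class JustKalmError(Exception): pass
class ValidationError(JustKalmError): pass
class InvalidInputError(ValidationError): pass
class ExternalServiceError(JustKalmError): pass
class UpstreamTimeoutError(ExternalServiceError): pass

def classify(err: Exception) -> str:
    # A ValidationError handler also catches InvalidInputError, etc.;
    # ordering matters: most specific handlers come first.
    try:
        raise err
    except ValidationError:
        return "client_error"   # caller should fix the request
    except ExternalServiceError:
        return "retryable"      # safe to retry with backoff
    except JustKalmError:
        return "internal"       # anything else in the family

print(classify(InvalidInputError()))     # client_error
print(classify(UpstreamTimeoutError()))  # retryable
```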
## Error Context
class JustKalmError(Exception):
    def __init__(
        self,
        message: str,
        code: str,
        details: dict | None = None,
        retry_after: int | None = None,
        doc_url: str | None = None,
    ):
        super().__init__(message)
        self.message = message
        self.code = code
        self.details = details or {}
        self.retry_after = retry_after
        self.doc_url = doc_url

Graceful Degradation
# Fallback Patterns
async def get_valuation(product_id: str):
    try:
        # Primary: ML model
        result = await ml_service.predict(product_id)
        return result
    except MLServiceError:
        logger.warning("ML service unavailable, using cache")

        # Fallback 1: Cached prediction
        cached = await redis.get(f"val:{product_id}")
        if cached:
            return ValuationResult(
                value=cached.value,
                confidence=cached.confidence * 0.9,
                source="cache",
                stale=True,
            )

        # Fallback 2: Historical average
        avg = await db.get_category_average(product_id)
        if avg:
            return ValuationResult(
                value=avg,
                confidence=0.5,
                source="historical",
                degraded=True,
            )

        # Final fallback: Graceful error
        raise ServiceDegradedError(
            message="Unable to value product",
            retry_after=60,
            suggestions=["Try again later"],
        )

Resilient by Design
Graceful degradation and self-healing for maximum uptime.
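The self-healing behavior behind the "Circuit Opens" stat is a circuit breaker: after enough consecutive failures, calls fail fast instead of hammering a struggling upstream, and after a cool-down one trial call is allowed through. A minimal synchronous sketch (class and parameter names are ours, not JustKalm's API):

```python
import time

# Hypothetical minimal circuit breaker: opens after `max_failures`
# consecutive errors, fails fast while open, and half-opens (allows one
# trial call) once `reset_after` seconds have passed.
class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the circuit
            raise
        self.failures = 0  # success closes the circuit
        return result

# Usage: two failures trip the breaker; the next call never reaches upstream.
breaker = CircuitBreaker(max_failures=2, reset_after=60)

def boom():
    raise ValueError("upstream down")

for _ in range(2):
    try:
        breaker.call(boom)
    except ValueError:
        pass

try:
    breaker.call(lambda: "ok")
except RuntimeError as e:
    print(e)  # circuit open: failing fast
```

A production breaker would also track a rolling error rate rather than a simple consecutive-failure count, but the open / half-open / closed state machine is the core of the pattern.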