Performance Benchmarks
Real-time API latency and throughput metrics
JustKalm APIs are engineered for speed. We continuously monitor and optimize performance across all endpoints and global regions.
Endpoint Latency
P50, P95, and P99 response times by endpoint
- POST /v1/products/analyze
- GET /v1/products/{id}
- POST /v1/health/score
- POST /v1/valuations/estimate
- POST /v1/sustainability/analyze
- POST /v1/batch/process
- GET /v1/webhooks

Throughput & Reliability
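The P50/P95/P99 figures above can be derived from raw response-time samples with a nearest-rank percentile. A minimal sketch, using hypothetical latency samples rather than JustKalm telemetry:

```python
# Nearest-rank percentile over a set of response-time samples.
# The sample values below are illustrative, not measured data.
def percentile(samples, p):
    """Smallest value >= p% of samples (nearest-rank method)."""
    ordered = sorted(samples)
    # nearest-rank index: ceil(p/100 * n) - 1, clamped to a valid index
    k = max(0, min(len(ordered) - 1, -(-p * len(ordered) // 100) - 1))
    return ordered[k]

latencies_ms = [12, 15, 18, 22, 25, 30, 41, 55, 80, 120]  # hypothetical
p50 = percentile(latencies_ms, 50)  # -> 25
p95 = percentile(latencies_ms, 95)  # -> 120
p99 = percentile(latencies_ms, 99)  # -> 120
```

Nearest-rank is one of several percentile definitions; interpolating methods give slightly different tail values on small sample sets.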
Maximum sustained requests per second by endpoint
| Endpoint | Throughput | Error Rate | Status |
|---|---|---|---|
| POST /v1/products/analyze | 1,500 req/s | 0.100% | Healthy |
| GET /v1/products/{id} | 5,000 req/s | 0.050% | Healthy |
| POST /v1/health/score | 1,200 req/s | 0.100% | Healthy |
| POST /v1/valuations/estimate | 800 req/s | 0.200% | Healthy |
| POST /v1/sustainability/analyze | 1,000 req/s | 0.100% | Healthy |
| POST /v1/batch/process | 300 req/s | 0.300% | Healthy |
| GET /v1/webhooks | 8,000 req/s | 0.010% | Healthy |
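Throughput and error-rate figures like those in the table come from counting requests over a measurement window. A sketch with hypothetical counts:

```python
# Derive (requests/s, error rate %) from one measurement window.
# The counts below are hypothetical, not JustKalm telemetry.
def summarize(total: int, errors: int, window_s: float):
    """Return (requests per second, error rate as a percentage)."""
    return total / window_s, 100.0 * errors / total

# 90,000 requests with 90 failures over a 60 s window:
rps, err_pct = summarize(total=90_000, errors=90, window_s=60.0)
# rps = 1500.0, err_pct = 0.1 -- the same shape as the
# "1,500 req/s / 0.100%" row above
```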
Global Edge Performance
Average latency from our edge locations worldwide
Performance Trend
How our API performance has improved over time
Infrastructure
How we achieve these numbers
Edge Caching
Cloudflare Workers cache responses at 300+ edge locations for sub-20ms reads.
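At a high level, edge caching behaves like a TTL cache keyed by request path: a hit is served from the edge without a round trip to the origin. A simplified in-process sketch (this is a conceptual model, not Cloudflare Workers code):

```python
import time

# Minimal TTL cache modeling edge-cache behavior: entries expire after
# ttl_s seconds, and a hit avoids the trip back to the origin.
class TTLCache:
    def __init__(self, ttl_s: float):
        self.ttl_s = ttl_s
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]            # cache hit: served at the edge
        self._store.pop(key, None)     # expired or missing
        return None

    def put(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl_s, value)

cache = TTLCache(ttl_s=60.0)
cache.put("/v1/products/123", {"id": "123"})
hit = cache.get("/v1/products/123")    # -> {"id": "123"}
miss = cache.get("/v1/products/999")   # -> None
```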
Auto-Scaling
Kubernetes HPA scales from 10 to 1,000 pods in under 60 seconds.
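The Kubernetes HPA scaling rule is `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`, clamped to the configured bounds. A sketch using the 10-1,000 pod range stated above (the metric values are hypothetical):

```python
import math

# Kubernetes HPA scaling rule:
#   desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
# clamped to [min_r, max_r]. Bounds reflect the 10-1,000 pod range above.
def desired_replicas(current: int, current_metric: float,
                     target_metric: float,
                     min_r: int = 10, max_r: int = 1000) -> int:
    raw = math.ceil(current * current_metric / target_metric)
    return max(min_r, min(max_r, raw))

# CPU at 180% of target with 50 pods -> scale toward 90 pods:
n = desired_replicas(current=50, current_metric=180.0, target_metric=100.0)
# n = 90
```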
Connection Pooling
PgBouncer maintains warm connections to eliminate cold start latency.
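The idea behind pooling is that connections are opened once, then checked out and returned, so no request pays connection-setup latency. A minimal in-process sketch in the spirit of PgBouncer (`FakeConn` is a stand-in for a real database connection):

```python
import queue

class FakeConn:
    """Stand-in for a real database connection."""
    def __init__(self, n: int):
        self.n = n

class ConnectionPool:
    def __init__(self, size: int):
        self._pool = queue.Queue()
        for i in range(size):          # warm all connections up front
            self._pool.put(FakeConn(i))

    def acquire(self, timeout: float = 1.0):
        # Blocks until a warm connection is free; no setup cost here.
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)           # return the connection for reuse

pool = ConnectionPool(size=4)
conn = pool.acquire()   # no per-request connection handshake
pool.release(conn)
```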
ML Model Serving
ONNX Runtime with GPU acceleration for sub-100ms inference.
Read Replicas
Geographic read replicas reduce cross-region latency by 70%.
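Read routing reduces to picking the replica with the lowest measured round-trip latency for the caller. A sketch with purely illustrative region latencies:

```python
# Route reads to the geographically nearest replica. The latency
# figures below are hypothetical, for illustration only.
REPLICA_LATENCY_MS = {
    "us-east": 12,
    "eu-west": 85,
    "ap-southeast": 140,
}

def nearest_replica(latencies: dict) -> str:
    """Pick the replica with the lowest measured round-trip latency."""
    return min(latencies, key=latencies.get)

region = nearest_replica(REPLICA_LATENCY_MS)  # -> "us-east"
```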
HTTP/3
QUIC protocol eliminates head-of-line blocking for faster multiplexing.
Benchmarks measured using production traffic. Results may vary based on network conditions.
View real-time system status →