
Performance Benchmarks

Real-time API latency and throughput metrics

JustKalm APIs are engineered for speed. We continuously monitor and optimize performance across all endpoints and global regions.

Global P50 Latency: 35ms (↓ improving)
Global P99 Latency: 148ms (↓ improving)
Requests/Second: 25K (→ stable)
Error Rate: 0.001% (→ stable)

SLA Guarantees

Uptime SLA: 99.999%
P99 latency guarantee: <200ms
Service credits if missed: 100%
View full SLA details
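A 99.999% uptime SLA translates into a very small downtime budget. The sketch below converts an uptime percentage into allowed seconds of downtime; the 30-day billing period is an illustrative assumption, not a stated SLA term.

```python
# Convert an uptime SLA percentage into an allowed-downtime budget.
# The 99.999% figure comes from the SLA above; the 30-day period
# is an illustrative assumption.

def downtime_budget_seconds(uptime_pct: float, period_days: float = 30.0) -> float:
    """Seconds of downtime permitted per period at a given uptime percentage."""
    period_seconds = period_days * 24 * 3600
    return period_seconds * (1 - uptime_pct / 100)

print(f"{downtime_budget_seconds(99.999):.2f}")  # ≈ 25.92 seconds per 30-day month
```

At five nines, the whole monthly budget is under half a minute, which is why missed-SLA credits are all-or-nothing here.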

Endpoint Latency

P50, P95, and P99 response times by endpoint

Endpoint | p50 | p95 | p99
POST /v1/products/analyze | 45ms | 120ms | 180ms
GET /v1/products/{id} | 12ms | 28ms | 45ms
POST /v1/health/score | 67ms | 150ms | 220ms
POST /v1/valuations/estimate | 89ms | 180ms | 280ms
POST /v1/sustainability/analyze | 72ms | 160ms | 240ms
POST /v1/batch/process | 250ms | 450ms | 680ms
GET /v1/webhooks | 8ms | 18ms | 30ms
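The p50/p95/p99 figures above are order statistics over measured response times. As a minimal sketch of how such numbers can be derived from raw timings, here is the nearest-rank method (one of several percentile definitions; the sample values are illustrative, not JustKalm production data):

```python
# Nearest-rank percentile over a batch of measured latencies (ms).
# Sample values are illustrative only.
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest value covering at least p% of the sample."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]

latencies = [12, 14, 15, 18, 22, 25, 31, 40, 55, 120]
print(percentile(latencies, 50))  # 22 — half of the requests finish by here
print(percentile(latencies, 99))  # 120 — the slow tail
```

Note how a single slow request dominates p99 but barely moves p50, which is why both are tracked separately above.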

Throughput & Reliability

Maximum sustained requests per second by endpoint

Endpoint | Throughput | Error Rate | Status
POST /v1/products/analyze | 1,500 req/s | 0.100% | Healthy
GET /v1/products/{id} | 5,000 req/s | 0.050% | Healthy
POST /v1/health/score | 1,200 req/s | 0.100% | Healthy
POST /v1/valuations/estimate | 800 req/s | 0.200% | Healthy
POST /v1/sustainability/analyze | 1,000 req/s | 0.100% | Healthy
POST /v1/batch/process | 300 req/s | 0.300% | Healthy
GET /v1/webhooks | 8,000 req/s | 0.010% | Healthy
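Clients that want to stay under a per-endpoint throughput ceiling (for example, 800 req/s on POST /v1/valuations/estimate) commonly use a token bucket on their side. This is an illustrative sketch of that pattern, not a JustKalm SDK API:

```python
# Minimal token-bucket limiter a client could use to stay under a
# per-endpoint throughput ceiling. Illustrative sketch, not an official SDK.
import time

class TokenBucket:
    def __init__(self, rate: float, burst: float):
        self.rate = rate          # tokens refilled per second
        self.capacity = burst     # maximum burst size
        self.tokens = burst
        self.updated = time.monotonic()

    def try_acquire(self) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=800, burst=50)  # match the hypothetical 800 req/s ceiling
allowed = sum(bucket.try_acquire() for _ in range(100))
print(allowed)  # roughly the burst size, plus a few tokens refilled mid-loop
```

Requests that fail `try_acquire` would typically be queued or retried with backoff rather than sent anyway.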

Global Edge Performance

Average latency from our edge locations worldwide

Region | Avg Latency | Availability
N. Virginia | 23ms | 99.999%
Oregon | 45ms | 99.998%
Ireland | 67ms | 99.997%
Frankfurt | 72ms | 99.998%
Tokyo | 89ms | 99.996%
Singapore | 95ms | 99.995%
Mumbai | 110ms | 99.994%
São Paulo | 125ms | 99.993%

Performance Trend

How our API performance has improved over time

[Chart: p50 and p99 latency over the last 6 months, Jul–Dec]
33% faster than 6 months ago

Infrastructure

How we achieve these numbers

Edge Caching

Cloudflare Workers cache responses at 300+ edge locations for sub-20ms reads.
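The pattern behind those sub-20ms cached reads is cache-aside with a TTL: serve from the edge when a fresh entry exists, otherwise fetch from origin and store the result. A minimal in-process sketch of the idea (an ordinary dict stands in for the edge cache, and the origin fetch is a stub, not the real Workers Cache API):

```python
# Cache-aside read path with a TTL. An in-process dict stands in for the
# edge cache; fetch_from_origin is a stub, not a real HTTP call.
import time

CACHE: dict[str, tuple[float, str]] = {}  # key -> (expires_at, value)
TTL_SECONDS = 60.0

def fetch_from_origin(key: str) -> str:
    return f"origin-response-for-{key}"   # stand-in for the slow origin round-trip

def cached_get(key: str) -> str:
    entry = CACHE.get(key)
    now = time.monotonic()
    if entry is not None and entry[0] > now:
        return entry[1]                   # cache hit: no origin round-trip
    value = fetch_from_origin(key)
    CACHE[key] = (now + TTL_SECONDS, value)
    return value

first = cached_get("/v1/products/42")     # miss: goes to origin
second = cached_get("/v1/products/42")    # hit: served from cache
```

Replicating this lookup across 300+ edge locations is what turns a cross-continent origin round-trip into a local read.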

Auto-Scaling

Kubernetes HPA scales from 10 to 1,000 pods in under 60 seconds.

Connection Pooling

PgBouncer maintains warm connections to eliminate cold start latency.
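The idea behind pooling is simple: open connections once, up front, and hand them out per request instead of paying the TCP/TLS/auth setup cost each time. A stripped-down sketch of that structure (the `Connection` class is a stand-in, not a real database driver or PgBouncer itself):

```python
# Sketch of the connection-pooling idea behind PgBouncer: reuse warm
# connections rather than opening one per request. Connection is a stand-in.
import queue

class Connection:
    def __init__(self, conn_id: int):
        self.conn_id = conn_id            # real code would open a TCP/TLS session here

class Pool:
    def __init__(self, size: int):
        self._idle: queue.Queue[Connection] = queue.Queue()
        for i in range(size):
            self._idle.put(Connection(i))  # all connections opened once, up front

    def acquire(self) -> Connection:
        return self._idle.get()            # blocks until a warm connection is free

    def release(self, conn: Connection) -> None:
        self._idle.put(conn)               # return the connection for reuse

pool = Pool(size=4)
conn = pool.acquire()
# ... run a query on conn ...
pool.release(conn)
```

Because the pool size is fixed, it also acts as natural backpressure on the database when traffic spikes.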

ML Model Serving

ONNX Runtime with GPU acceleration for sub-100ms inference.

Read Replicas

Geographic read replicas reduce cross-region latency by 70%.

HTTP/3

QUIC protocol eliminates head-of-line blocking for faster multiplexing.

Benchmarks measured using production traffic. Results may vary based on network conditions.

View real-time system status →