
Performance Benchmarks

Real-time API latency and throughput metrics

JustKalm APIs are engineered for speed. We continuously monitor and optimize performance across all endpoints and global regions.

Global P50 Latency: 35ms (↓ improving)
Global P99 Latency: 148ms (↓ improving)
Requests/Second: 25K (→ stable)
Error Rate: 0.001% (→ stable)

SLA Guarantees

Uptime SLA: 99.999%
P99 latency guarantee: <200ms
Service credits if missed: 100%
View full SLA details
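A 99.999% uptime SLA translates into a very small downtime budget. The sketch below converts an uptime percentage into allowed seconds of downtime; the 30-day billing period is an illustrative assumption, not a stated SLA term.

```python
# Convert an uptime SLA percentage into an allowed-downtime budget.
# The 99.999% figure comes from the SLA above; the 30-day period
# is an illustrative assumption.

def downtime_budget_seconds(uptime_pct: float, period_days: float = 30.0) -> float:
    """Seconds of downtime permitted per period at a given uptime percentage."""
    period_seconds = period_days * 24 * 3600
    return period_seconds * (1 - uptime_pct / 100)

print(f"{downtime_budget_seconds(99.999):.2f}")  # ≈ 25.92 seconds per 30-day month
```

At five nines, the whole monthly budget is under half a minute, which is why missed-SLA credits are all-or-nothing here.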

Endpoint Latency

P50, P95, and P99 response times by endpoint

Endpoint | p50 | p95 | p99
POST /v1/products/analyze | 45ms | 120ms | 180ms
GET /v1/products/{id} | 12ms | 28ms | 45ms
POST /v1/health/score | 67ms | 150ms | 220ms
POST /v1/valuations/estimate | 89ms | 180ms | 280ms
POST /v1/sustainability/analyze | 72ms | 160ms | 240ms
POST /v1/batch/process | 250ms | 450ms | 680ms
GET /v1/webhooks | 8ms | 18ms | 30ms
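The p50/p95/p99 figures above are order statistics over measured response times. As a minimal sketch of how such numbers can be derived from raw timings, here is the nearest-rank method (one of several percentile definitions; the sample values are illustrative, not JustKalm production data):

```python
# Nearest-rank percentile over a batch of measured latencies (ms).
# Sample values are illustrative only.
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest value covering at least p% of the sample."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]

latencies = [12, 14, 15, 18, 22, 25, 31, 40, 55, 120]
print(percentile(latencies, 50))  # 22 — half of the requests finish by here
print(percentile(latencies, 99))  # 120 — the slow tail
```

Note how a single slow request dominates p99 but barely moves p50, which is why both are tracked separately above.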

Throughput & Reliability

Maximum sustained requests per second by endpoint

Endpoint | Throughput | Error Rate | Status
POST /v1/products/analyze | 1,500 req/s | 0.100% | Healthy
GET /v1/products/{id} | 5,000 req/s | 0.050% | Healthy
POST /v1/health/score | 1,200 req/s | 0.100% | Healthy
POST /v1/valuations/estimate | 800 req/s | 0.200% | Healthy
POST /v1/sustainability/analyze | 1,000 req/s | 0.100% | Healthy
POST /v1/batch/process | 300 req/s | 0.300% | Healthy
GET /v1/webhooks | 8,000 req/s | 0.010% | Healthy
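Clients that want to stay under a per-endpoint throughput ceiling (for example, 800 req/s on POST /v1/valuations/estimate) commonly use a token bucket on their side. This is an illustrative sketch of that pattern, not a JustKalm SDK API:

```python
# Minimal token-bucket limiter a client could use to stay under a
# per-endpoint throughput ceiling. Illustrative sketch, not an official SDK.
import time

class TokenBucket:
    def __init__(self, rate: float, burst: float):
        self.rate = rate          # tokens refilled per second
        self.capacity = burst     # maximum burst size
        self.tokens = burst
        self.updated = time.monotonic()

    def try_acquire(self) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=800, burst=50)  # match the hypothetical 800 req/s ceiling
allowed = sum(bucket.try_acquire() for _ in range(100))
print(allowed)  # roughly the burst size, plus a few tokens refilled mid-loop
```

Requests that fail `try_acquire` would typically be queued or retried with backoff rather than sent anyway.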

Global Edge Performance

Average latency from our edge locations worldwide

Region | Avg Latency | Availability
N. Virginia | 23ms | 99.999%
Oregon | 45ms | 99.998%
Ireland | 67ms | 99.997%
Frankfurt | 72ms | 99.998%
Tokyo | 89ms | 99.996%
Singapore | 95ms | 99.995%
Mumbai | 110ms | 99.994%
São Paulo | 125ms | 99.993%

Performance Trend

How our API performance has improved over time

[Chart: p50 and p99 latency over the last 6 months, Jul–Dec]
33% faster than 6 months ago

Infrastructure

How we achieve these numbers

Edge Caching

Cloudflare Workers cache responses at 300+ edge locations for sub-20ms reads.
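The pattern behind those sub-20ms cached reads is cache-aside with a TTL: serve from the edge when a fresh entry exists, otherwise fetch from origin and store the result. A minimal in-process sketch of the idea (an ordinary dict stands in for the edge cache, and the origin fetch is a stub, not the real Workers Cache API):

```python
# Cache-aside read path with a TTL. An in-process dict stands in for the
# edge cache; fetch_from_origin is a stub, not a real HTTP call.
import time

CACHE: dict[str, tuple[float, str]] = {}  # key -> (expires_at, value)
TTL_SECONDS = 60.0

def fetch_from_origin(key: str) -> str:
    return f"origin-response-for-{key}"   # stand-in for the slow origin round-trip

def cached_get(key: str) -> str:
    entry = CACHE.get(key)
    now = time.monotonic()
    if entry is not None and entry[0] > now:
        return entry[1]                   # cache hit: no origin round-trip
    value = fetch_from_origin(key)
    CACHE[key] = (now + TTL_SECONDS, value)
    return value

first = cached_get("/v1/products/42")     # miss: goes to origin
second = cached_get("/v1/products/42")    # hit: served from cache
```

Replicating this lookup across 300+ edge locations is what turns a cross-continent origin round-trip into a local read.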

Auto-Scaling

Kubernetes HPA scales from 10 to 1,000 pods in under 60 seconds.

Connection Pooling

PgBouncer maintains warm connections to eliminate cold start latency.
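The idea behind pooling is simple: open connections once, up front, and hand them out per request instead of paying the TCP/TLS/auth setup cost each time. A stripped-down sketch of that structure (the `Connection` class is a stand-in, not a real database driver or PgBouncer itself):

```python
# Sketch of the connection-pooling idea behind PgBouncer: reuse warm
# connections rather than opening one per request. Connection is a stand-in.
import queue

class Connection:
    def __init__(self, conn_id: int):
        self.conn_id = conn_id            # real code would open a TCP/TLS session here

class Pool:
    def __init__(self, size: int):
        self._idle: queue.Queue[Connection] = queue.Queue()
        for i in range(size):
            self._idle.put(Connection(i))  # all connections opened once, up front

    def acquire(self) -> Connection:
        return self._idle.get()            # blocks until a warm connection is free

    def release(self, conn: Connection) -> None:
        self._idle.put(conn)               # return the connection for reuse

pool = Pool(size=4)
conn = pool.acquire()
# ... run a query on conn ...
pool.release(conn)
```

Because the pool size is fixed, it also acts as natural backpressure on the database when traffic spikes.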

ML Model Serving

ONNX Runtime with GPU acceleration for sub-100ms inference.

Read Replicas

Geographic read replicas reduce cross-region latency by 70%.

HTTP/3

QUIC protocol eliminates head-of-line blocking for faster multiplexing.

Benchmarks measured using production traffic. Results may vary based on network conditions.

View real-time system status →