Scaling and Performance

Exekra scales execution capacity by adding runners. The Hub remains a single instance that distributes work through a Redis-backed queue. This section covers the infrastructure considerations for scaling a deployment.

Horizontal Scaling with Runners

Each runner provides one concurrent execution slot. To increase throughput, register additional runners on dedicated machines. The Hub assigns work to available runners automatically, preferring the runner with the most recent heartbeat.

| Runners | Concurrent Executions | Use Case |
|---|---|---|
| 1-2 | 1-2 | Development and testing |
| 3-10 | 3-10 | Department-level automation |
| 10-50 | 10-50 | Enterprise-wide deployment |
| 50+ | 50+ | High-volume batch processing |
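
Because each runner contributes exactly one slot, a rough runner count can be estimated from the expected load. The following is a minimal sketch, not an Exekra tool: the function name `runners_needed` and the 70% target utilization are assumptions chosen for illustration, and the estimate uses Little's law (average concurrency = arrival rate × average duration).

```python
import math


def runners_needed(executions_per_hour: float,
                   avg_duration_seconds: float,
                   target_utilization: float = 0.7) -> int:
    """Estimate how many runners a workload needs, at one slot per runner.

    target_utilization is an assumed headroom factor (hypothetical default),
    so that bursts do not immediately queue behind busy runners.
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Little's law: average executions in flight = rate * duration.
    avg_concurrency = executions_per_hour * avg_duration_seconds / 3600
    # Always plan for at least one runner.
    return max(1, math.ceil(avg_concurrency / target_utilization))


if __name__ == "__main__":
    # 120 executions per hour at 5 minutes each keeps 10 slots busy on average.
    print(runners_needed(120, 300))
```

The result is an average-load estimate only. Bursty schedules or long-tailed execution times may need more headroom than the assumed utilization provides.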

Execution Queue

The execution queue runs on Redis. It decouples execution creation from runner assignment so a workflow request never blocks on a busy runner. The queue has bounded retention for completed and failed jobs and bounded retries on transient assignment failures. If Redis becomes temporarily unavailable, the Hub falls back to direct runner assignment with reduced functionality.
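
The bounded-retry and bounded-retention behavior described above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the Hub's implementation: a `collections.deque` stands in for Redis, and `TransientAssignmentError`, `dispatch_with_retries`, `max_attempts`, and `retention` are hypothetical names and parameters.

```python
from collections import deque


class TransientAssignmentError(Exception):
    """Hypothetical error type for a temporary assignment failure."""


def dispatch_with_retries(queue, assign, max_attempts=3, retention=100):
    """Drain the queue, retrying transient failures a bounded number of times.

    queue holds (job_id, attempts_so_far) tuples. Completed and failed job ids
    are kept in bounded deques, so only the most recent `retention` entries
    of each are retained.
    """
    completed = deque(maxlen=retention)
    failed = deque(maxlen=retention)
    while queue:
        job_id, attempts = queue.popleft()
        try:
            assign(job_id)
            completed.append(job_id)
        except TransientAssignmentError:
            if attempts + 1 < max_attempts:
                # Put the job back at the end of the queue for another try.
                queue.append((job_id, attempts + 1))
            else:
                failed.append(job_id)
    return completed, failed
```

Creating an execution only appends to the queue, so the caller never waits on a runner. Dispatch happens separately, which is the decoupling the paragraph above describes.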

Runner Assignment

When a queued execution is ready to dispatch, the Hub picks an available ONLINE runner that satisfies any type or label constraints on the workflow, marks it busy to prevent double-assignment, and waits for the runner to pick up the work on its next poll. If no runner becomes available within the configured window, the stale execution check marks the execution FAILED.
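
The selection step can be sketched as follows. This is an illustrative model only: the `Runner` dataclass, its field names, and `assign_runner` are assumptions, and label matching is modeled as a simple subset check. The preference for the most recent heartbeat follows the description in the horizontal scaling section.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass
class Runner:
    # Hypothetical in-memory view of a registered runner.
    name: str
    status: str                      # "ONLINE" or "OFFLINE"
    last_heartbeat: datetime
    labels: frozenset = frozenset()
    busy: bool = False


def assign_runner(runners: Iterable[Runner],
                  required_labels: frozenset = frozenset()) -> Optional[Runner]:
    """Pick an idle ONLINE runner that carries every required label."""
    candidates = [
        r for r in runners
        if r.status == "ONLINE" and not r.busy and required_labels <= r.labels
    ]
    if not candidates:
        return None  # The execution stays queued until a runner frees up.
    # Prefer the runner that reported in most recently.
    chosen = max(candidates, key=lambda r: r.last_heartbeat)
    chosen.busy = True  # Mark busy so the same runner is not assigned twice.
    return chosen
```

In a real deployment the busy flag would need to be set atomically in shared storage. This sketch runs in a single process and does not model that.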

Rate Limiting

The API enforces a per-client rate limit on user-facing endpoints. Runner operational endpoints are exempt, so the rate limiter never throttles legitimate execution traffic.
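
A per-client limit with exempt paths can be sketched as below. The algorithm Exekra uses is not documented here, so this example substitutes a simple fixed-window counter. The class name, the `/runner/` path prefix, and the limit values are all hypothetical.

```python
import time


class PerClientRateLimiter:
    """Fixed-window request counter keyed by client, with exempt path prefixes."""

    def __init__(self, limit, window_seconds,
                 exempt_prefixes=("/runner/",), clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.exempt_prefixes = tuple(exempt_prefixes)  # hypothetical paths
        self.clock = clock
        self._windows = {}  # client_id -> (window_start, request_count)

    def allow(self, client_id: str, path: str) -> bool:
        # Runner operational endpoints bypass the limiter entirely.
        if path.startswith(self.exempt_prefixes):
            return True
        now = self.clock()
        window_start, count = self._windows.get(client_id, (now, 0))
        if now - window_start >= self.window_seconds:
            window_start, count = now, 0  # Start a fresh window.
        if count >= self.limit:
            self._windows[client_id] = (window_start, count)
            return False
        self._windows[client_id] = (window_start, count + 1)
        return True
```

Because counters are kept per client, one noisy client exhausts only its own budget. The exemption check runs before any counting, so runner traffic is never recorded against a limit.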

Database Performance

Recommendations for production deployments:

| Consideration | Recommendation |
|---|---|
| Connection Pool | Configure connection_limit in DATABASE_URL |
| Execution History | Archive or purge completed executions older than your retention period |
| Backups | Schedule pg_dump or configure WAL archiving for point-in-time recovery |
| Monitoring | Monitor query latency and connection pool utilization |
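
For the connection pool setting, the sketch below assumes that connection_limit is passed as a query parameter on the connection URL, a convention used by some ORMs such as Prisma. The helper name, host, credentials, and limit value are placeholders. Confirm the exact parameter format against your Exekra deployment before relying on it.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_connection_limit(database_url: str, limit: int) -> str:
    """Return database_url with connection_limit set (or replaced) in its query string."""
    parts = urlsplit(database_url)
    query = dict(parse_qsl(parts.query))
    query["connection_limit"] = str(limit)  # assumed parameter placement
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Placeholder URL; substitute your own host, credentials, and database name.
    url = "postgresql://user:pass@db.example.internal:5432/exekra?schema=public"
    print(with_connection_limit(url, 20))
```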

Periodic Background Tasks

The Hub runs maintenance tasks on a recurring schedule to mark stale runners OFFLINE, fail or re-queue stuck executions, and revalidate license status. The cadence and timeouts are tuned for typical deployments and can be reviewed with Exekra support for environments with unusual latency or scale characteristics.
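
One pass of such a maintenance sweep might look like the sketch below. It is a simplified model, not the Hub's code: the dictionary layout, the QUEUED and RUNNING status names, and the timeout values are assumptions, and license revalidation is left out.

```python
from datetime import datetime, timedelta, timezone


def run_maintenance(runners, executions, now, heartbeat_timeout, dispatch_window):
    """Apply one sweep in place: stale runners, lost work, overdue queued work."""
    # 1. Runners whose last heartbeat is too old are marked OFFLINE.
    for runner in runners.values():
        if now - runner["last_heartbeat"] > heartbeat_timeout:
            runner["status"] = "OFFLINE"

    for execution in executions:
        runner = runners.get(execution.get("runner"))
        if (execution["status"] == "RUNNING"
                and runner is not None and runner["status"] == "OFFLINE"):
            # 2. Work held by a lost runner goes back on the queue.
            execution.update(status="QUEUED", runner=None, queued_at=now)
        elif (execution["status"] == "QUEUED"
                and now - execution["queued_at"] > dispatch_window):
            # 3. Work that waited longer than the dispatch window is failed.
            execution["status"] = "FAILED"
```

A scheduler would call this function at a fixed interval. The actual cadence and timeouts are product settings, which the paragraph above suggests reviewing with Exekra support.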

High Availability Considerations

The current architecture uses a single Hub instance. For high availability, apply the following strategy to each component:

| Component | HA Strategy |
|---|---|
| PostgreSQL | Use managed PostgreSQL with automated failover or streaming replication |
| Redis | Use Redis Sentinel or managed Redis with automatic failover |
| Hub | Single instance; restore from backup on failure. Runners reconnect automatically |
| Runners | Deploy multiple runners for execution redundancy. Lost runners trigger re-queue |
| TLS Certificates | Use certificates from an internal CA with automated renewal |

Runners are stateless between executions and reconnect automatically when the Hub becomes available. Executions that were in flight during an outage are handled by the stale execution check when the Hub restarts.

Performance Metrics

Each execution records resource utilization on the runner: CPU average and peak (percentage), memory average and peak (megabytes). Runner heartbeats include real-time CPU, memory, and disk metrics. These metrics are available in the Hub dashboard for capacity planning.
