Scaling and Performance
Exekra scales execution capacity by adding runners. The Hub remains a single instance that distributes work through a Redis-backed queue. This section covers the infrastructure considerations for scaling a deployment.
Horizontal Scaling with Runners
Each runner provides one concurrent execution slot. To increase throughput, register additional runners on dedicated machines. The Hub assigns work to runners automatically based on availability, preferring runners with the most recent heartbeat.
| Runners | Concurrent Executions | Use Case |
|---|---|---|
| 1-2 | 1-2 | Development and testing |
| 3-10 | 3-10 | Department-level automation |
| 11-50 | 11-50 | Enterprise-wide deployment |
| 50+ | 50+ | High-volume batch processing |
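Because each runner provides exactly one execution slot, a rough runner count can be estimated from the expected load. The following is a back-of-the-envelope sketch, not an Exekra sizing formula; the function name, its parameters, and the 70% default utilization target are all assumptions made for illustration.

```python
import math


def estimate_runners(executions_per_hour: float,
                     avg_duration_minutes: float,
                     target_utilization: float = 0.7) -> int:
    """Rough runner count, assuming one execution slot per runner."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Average number of executions in flight (arrival rate x duration).
    busy_slots = executions_per_hour * avg_duration_minutes / 60.0
    # Leave headroom so bursts wait briefly in the queue instead of piling up.
    return max(1, math.ceil(busy_slots / target_utilization))
```

For example, 120 executions per hour averaging 5 minutes each keep about 10 slots busy, so a 50% utilization target suggests 20 runners.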
Execution Queue
The execution queue runs on Redis. It decouples execution creation from runner assignment so a workflow request never blocks on a busy runner. The queue has bounded retention for completed and failed jobs and bounded retries on transient assignment failures. If Redis becomes temporarily unavailable, the Hub falls back to direct runner assignment with reduced functionality.
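The fallback described above can be pictured with a small sketch. It is illustrative only: `enqueue` and `assign_directly` are hypothetical callables standing in for the Hub's internals, and the built-in `ConnectionError` stands in for whatever error the queue client raises when Redis is unreachable.

```python
from typing import Callable


def submit_execution(execution_id: str,
                     enqueue: Callable[[str], None],
                     assign_directly: Callable[[str], None]) -> str:
    """Prefer the queue; fall back to direct assignment if it is unreachable."""
    try:
        enqueue(execution_id)          # normal path: hand off to the queue
        return "queued"
    except ConnectionError:
        # Assumed failure mode: the queue backend is temporarily unavailable.
        assign_directly(execution_id)  # degraded path: skip the queue
        return "direct"
```

The caller gets the same result either way (the execution is accepted), and the return value only records which path was taken.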
Runner Assignment
When a queued execution is ready to dispatch, the Hub picks an available ONLINE runner that satisfies any type or label constraints on the workflow, marks it busy to prevent double-assignment, and waits for the runner to pick up the work on its next poll. If no runner becomes available within the configured window, the stale execution check marks the execution FAILED.
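As a minimal sketch of this selection logic, assuming an in-memory list of runners: the `Runner` fields and the `pick_runner` name are hypothetical and do not reflect Exekra's actual data model. A real Hub would also need to perform the mark-busy step atomically, for example inside a database transaction.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Runner:
    name: str
    status: str            # e.g. "ONLINE" or "OFFLINE"
    busy: bool
    last_heartbeat: float  # Unix timestamp of the latest heartbeat
    labels: set[str] = field(default_factory=set)


def pick_runner(runners: list[Runner],
                required_labels: set[str]) -> Optional[Runner]:
    """Pick an idle ONLINE runner that carries every required label."""
    candidates = [
        r for r in runners
        if r.status == "ONLINE" and not r.busy and required_labels <= r.labels
    ]
    if not candidates:
        return None  # nothing available: the execution stays queued
    # Prefer the runner with the most recent heartbeat.
    chosen = max(candidates, key=lambda r: r.last_heartbeat)
    chosen.busy = True  # mark busy to prevent double-assignment
    return chosen
```

Returning `None` rather than raising keeps "no runner available" an ordinary outcome that the caller can handle by leaving the execution in the queue.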
Rate Limiting
The API enforces a per-client rate limit on user-facing endpoints. Runner operational endpoints are exempt, so legitimate execution traffic is never throttled by the rate limiter.
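One way to picture a per-client limit with exempt runner endpoints is a sliding-window counter. This is a sketch under stated assumptions, not Exekra's implementation: the algorithm, the limit values, and the `/api/runner/` path prefix are all hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class RateLimiter:
    """Sliding-window limiter keyed by client, with exempt path prefixes."""

    def __init__(self, limit: int, window_seconds: float,
                 exempt_prefixes: tuple = ("/api/runner/",)):
        self.limit = limit
        self.window = window_seconds
        self.exempt_prefixes = exempt_prefixes  # hypothetical runner paths
        self._hits = defaultdict(deque)         # client_id -> request times

    def allow(self, client_id: str, path: str,
              now: Optional[float] = None) -> bool:
        if path.startswith(self.exempt_prefixes):
            return True  # runner traffic bypasses the limiter entirely
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Checking the exemption before touching any counter means runner requests neither consume a client's quota nor get rejected when that quota is exhausted.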
Database Performance
Recommendations for production deployments:
| Consideration | Recommendation |
|---|---|
| Connection Pool | Configure connection_limit in DATABASE_URL |
| Execution History | Archive or purge completed executions older than your retention period |
| Backups | Schedule pg_dump or configure WAL archiving for point-in-time recovery |
| Monitoring | Monitor query latency and connection pool utilization |
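The connection pool recommendation can be illustrated with a small helper. It assumes `connection_limit` is passed as a query parameter on the connection URL, which is how some ORMs read it; the host, database name, and helper name below are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_connection_limit(database_url: str, limit: int) -> str:
    """Return database_url with connection_limit set to the given value."""
    parts = urlsplit(database_url)
    # Keep existing parameters, dropping any previous connection_limit.
    query = [(k, v) for k, v in parse_qsl(parts.query)
             if k != "connection_limit"]
    query.append(("connection_limit", str(limit)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical example URL; substitute your own host and credentials.
print(with_connection_limit("postgresql://db:5432/exekra?schema=public", 20))
```

Size the limit against the PostgreSQL server's own connection ceiling, leaving room for backups and monitoring tools that also connect.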
Periodic Background Tasks
The Hub runs maintenance tasks on a recurring schedule to mark stale runners OFFLINE, fail or re-queue stuck executions, and revalidate license status. The cadence and timeouts are tuned for typical deployments and can be reviewed with Exekra support for environments with unusual latency or scale characteristics.
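The stale-runner task amounts to comparing each runner's last heartbeat with a timeout. The sketch below shows the idea only; the dictionary layout, function name, and timeout value are assumptions, and the actual cadence and thresholds are set by the Hub.

```python
def mark_stale_runners(runners: dict, now: float,
                       timeout_seconds: float) -> list:
    """Mark runners OFFLINE when their heartbeat is older than the timeout.

    `runners` maps a runner name to a dict with "status" and
    "last_heartbeat" (Unix timestamp) keys. Returns the names that
    were newly marked OFFLINE.
    """
    newly_offline = []
    for name, info in runners.items():
        is_stale = now - info["last_heartbeat"] > timeout_seconds
        if info["status"] != "OFFLINE" and is_stale:
            info["status"] = "OFFLINE"
            newly_offline.append(name)
    return newly_offline
```

Returning only the newly offline runners gives a follow-up step a precise list of runners whose in-flight executions need to be failed or re-queued.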
High Availability Considerations
The current architecture uses a single Hub instance. For high availability, apply the following strategy to each component:
| Component | HA Strategy |
|---|---|
| PostgreSQL | Use managed PostgreSQL with automated failover or streaming replication |
| Redis | Use Redis Sentinel or managed Redis with automatic failover |
| Hub | Single instance; restore from backup on failure. Runners reconnect automatically |
| Runners | Deploy multiple runners for execution redundancy. Executions on a lost runner are re-queued |
| TLS Certificates | Use certificates from an internal CA with automated renewal |
Runners are stateless between executions and reconnect automatically when the Hub becomes available. Executions that were in flight during an outage are handled by the stale execution check on restart.
Performance Metrics
Each execution records resource utilization on the runner: CPU average and peak (percentage) and memory average and peak (megabytes). Runner heartbeats include real-time CPU, memory, and disk metrics. These metrics are available in the Hub dashboard for capacity planning.
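To make the average and peak figures concrete, the sketch below reduces a series of samples taken during one execution to the two values. The sampling interval and the function name are assumptions; how the runner actually collects samples is not specified here.

```python
from statistics import fmean


def summarize(samples: list) -> dict:
    """Reduce per-execution samples to their average and peak."""
    if not samples:
        raise ValueError("at least one sample is required")
    return {"average": fmean(samples), "peak": max(samples)}


# Hypothetical CPU percentages sampled during one execution.
cpu = summarize([10.0, 30.0, 20.0])
```

For capacity planning, the peak shows whether a machine can host the workload at all, while the average indicates how much headroom it has over a full run.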