Scaling and Performance

Exekra scales execution capacity by adding runners. The Hub remains a single instance that distributes work through a Redis-backed queue. This section covers the infrastructure considerations for scaling a deployment.

Horizontal Scaling with Runners

Each runner provides one concurrent execution slot. To increase throughput, register additional runners on dedicated machines. The Hub assigns work to available runners automatically and prefers the runner with the most recent heartbeat.

Runners	Concurrent Executions	Use Case
1-2	1-2	Development and testing
3-10	3-10	Department-level automation
10-50	10-50	Enterprise-wide deployment
50+	50+	High-volume batch processing
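
Because each runner provides one execution slot, a rough runner count can be estimated from the expected arrival rate and the average execution duration. The sketch below applies Little's law with a headroom factor. The function name, the default target utilization, and the example figures are illustrative assumptions, not Exekra sizing guidance.

```python
import math

def estimate_runners(executions_per_hour: float,
                     avg_duration_seconds: float,
                     target_utilization: float = 0.7) -> int:
    """Estimate the runners needed, assuming one execution slot per runner.

    target_utilization is a hypothetical headroom factor in (0, 1].
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Little's law: average concurrency = arrival rate * average duration.
    busy_slots = executions_per_hour * avg_duration_seconds / 3600.0
    # Always keep at least one runner, and round up to whole runners.
    return max(1, math.ceil(busy_slots / target_utilization))

# 120 executions per hour at 5 minutes each keep about 10 slots busy.
print(estimate_runners(120, 300))
```

Treat the result as a starting point and adjust it against the utilization metrics the Hub reports.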

Execution Queue

The execution queue runs on Redis. It decouples execution creation from runner assignment so a workflow request never blocks on a busy runner. The queue retains completed and failed jobs for a bounded period and retries transient assignment failures a bounded number of times. If Redis becomes temporarily unavailable, the Hub falls back to direct runner assignment with reduced functionality.
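
The fallback and bounded-retry behavior can be pictured with a minimal sketch. It uses an in-memory stand-in instead of Redis so that it runs on its own. Every name here (QueueUnavailable, InMemoryQueue, submit_execution, assign_with_retries) and the retry count are hypothetical and do not reflect the Hub's internal API.

```python
from collections import deque

class QueueUnavailable(Exception):
    """Raised by the hypothetical queue client when its backend is unreachable."""

class InMemoryQueue:
    """Stand-in for the Redis-backed queue so the sketch runs without Redis."""
    def __init__(self, available=True):
        self.available = available
        self.items = deque()

    def push(self, execution_id):
        if not self.available:
            raise QueueUnavailable("queue backend unreachable")
        self.items.append(execution_id)

def submit_execution(execution_id, queue, assign_directly):
    """Enqueue the execution, or fall back to direct assignment.

    Returns "queued" or "direct" to show which path was taken.
    """
    try:
        queue.push(execution_id)
        return "queued"
    except QueueUnavailable:
        assign_directly(execution_id)
        return "direct"

def assign_with_retries(execution_id, try_assign, max_retries=3):
    """Retry a transient assignment failure a bounded number of times.

    Returns the attempt number that succeeded, or None when retries run out.
    """
    for attempt in range(1, max_retries + 1):
        if try_assign(execution_id):
            return attempt
    return None
```

The caller never waits on a runner in the queued path, which is the decoupling described above. The direct path trades that property for continued operation.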

Runner Assignment

When a queued execution is ready to dispatch, the Hub picks an available ONLINE runner that satisfies any type or label constraints on the workflow, marks it busy to prevent double-assignment, and waits for the runner to pick up the work on its next poll. If no runner becomes available within the configured window, the stale execution check marks the execution FAILED.
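
The selection step can be sketched as follows, combining the ONLINE filter, label constraints, and the heartbeat preference described earlier. The Runner fields and the BUSY status name are assumptions made for illustration. A real implementation would also need to make the status update atomic in its data store to prevent double-assignment.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Set

@dataclass
class Runner:
    name: str
    status: str            # "ONLINE", "BUSY" (assumed name), or "OFFLINE"
    last_heartbeat: float  # Unix timestamp of the latest heartbeat
    labels: Set[str] = field(default_factory=set)

def assign_runner(runners: List[Runner],
                  required_labels: Iterable[str] = ()) -> Optional[Runner]:
    """Pick an ONLINE runner that has every required label and mark it busy."""
    required = set(required_labels)
    candidates = [r for r in runners
                  if r.status == "ONLINE" and required <= r.labels]
    if not candidates:
        return None  # the execution stays queued until a runner frees up
    # Prefer the runner that reported a heartbeat most recently.
    chosen = max(candidates, key=lambda r: r.last_heartbeat)
    chosen.status = "BUSY"  # prevents a second assignment to the same runner
    return chosen
```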

Rate Limiting

The API enforces a per-client rate limit on user-facing endpoints. Runner operational endpoints are exempt, so the rate limiter never throttles legitimate execution traffic.
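
A minimal sketch of a per-client limiter with exempt endpoints is shown below. It uses a fixed-window counter, which is one common technique and not necessarily the one the Hub uses. The class name, the limit values, and the /api/runner/ path prefix are hypothetical.

```python
import time
from collections import defaultdict

class RateLimiter:
    """Fixed-window, per-client rate limiter with exempt path prefixes."""

    def __init__(self, limit, window_seconds, exempt_prefixes=()):
        self.limit = limit
        self.window = window_seconds
        self.exempt_prefixes = tuple(exempt_prefixes)
        # client_id -> [window_start, request_count]
        self._windows = defaultdict(lambda: [0.0, 0])

    def allow(self, client_id, path, now=None):
        # Exempt endpoints bypass the limiter and are never counted.
        if path.startswith(self.exempt_prefixes):
            return True
        now = time.monotonic() if now is None else now
        state = self._windows[client_id]
        if now - state[0] >= self.window:
            state[0], state[1] = now, 0  # start a new window
        if state[1] >= self.limit:
            return False
        state[1] += 1
        return True

limiter = RateLimiter(limit=2, window_seconds=60,
                      exempt_prefixes=("/api/runner/",))
```

Checking the exemption before any counting keeps runner traffic from consuming a client's quota as well as from being rejected.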

Database Performance

Recommendations for production deployments:

Consideration	Recommendation
Connection Pool	Configure connection_limit in DATABASE_URL
Execution History	Archive or purge completed executions older than your retention period
Backups	Schedule pg_dump or configure WAL archiving for point-in-time recovery
Monitoring	Monitor query latency and connection pool utilization
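
As an illustration of the connection pool recommendation, the sketch below sets connection_limit on a connection string, assuming the value is passed as a URL query parameter. The helper name, host, credentials, and the limit of 20 are placeholders, not recommended values.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def with_connection_limit(database_url: str, limit: int) -> str:
    """Return database_url with connection_limit set to the given value."""
    parts = urlsplit(database_url)
    # Keep existing parameters, dropping any previous connection_limit.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "connection_limit"]
    query.append(("connection_limit", str(limit)))
    return urlunsplit(parts._replace(query=urlencode(query)))

url = "postgresql://exekra:secret@db.example.internal:5432/exekra?schema=public"
print(with_connection_limit(url, 20))
```

Size the limit against the database's maximum connection count, and confirm it with the pool utilization monitoring listed above.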

Periodic Background Tasks

The Hub runs maintenance tasks on a recurring schedule to mark stale runners OFFLINE, fail or re-queue stuck executions, and revalidate license status. The cadence and timeouts are tuned for typical deployments and can be reviewed with Exekra support for environments with unusual latency or scale characteristics.
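
One maintenance pass could look like the sketch below. The timeout values, the dictionary layout, and the QUEUED and RUNNING status names are assumptions chosen for illustration. The actual cadence and thresholds are set by the Hub.

```python
def sweep_stale(runners, executions, now,
                heartbeat_timeout=90.0, assignment_window=300.0):
    """Run one maintenance pass over in-memory records.

    Returns (offline_runner_ids, changed_execution_ids).
    """
    offline = []
    for runner_id, runner in runners.items():
        silent_for = now - runner["last_heartbeat"]
        if runner["status"] != "OFFLINE" and silent_for > heartbeat_timeout:
            runner["status"] = "OFFLINE"
            offline.append(runner_id)

    changed = []
    for execution_id, execution in executions.items():
        runner = runners.get(execution.get("runner_id"), {})
        if execution["status"] == "RUNNING" and runner.get("status") == "OFFLINE":
            # The runner was lost mid-execution: put the work back in the queue.
            execution.update(status="QUEUED", runner_id=None, queued_at=now)
            changed.append(execution_id)
        elif (execution["status"] == "QUEUED"
              and now - execution["queued_at"] > assignment_window):
            # No runner became available within the configured window.
            execution["status"] = "FAILED"
            changed.append(execution_id)
    return offline, changed
```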

High Availability Considerations

The current architecture uses a single Hub instance. The following strategies improve availability for each component:

Component	HA Strategy
PostgreSQL	Use managed PostgreSQL with automated failover or streaming replication
Redis	Use Redis Sentinel or managed Redis with automatic failover
Hub	Single instance; restore from backup on failure. Runners reconnect automatically
Runners	Deploy multiple runners for execution redundancy. Lost runners trigger re-queue
TLS Certificates	Use certificates from an internal CA with automated renewal

Runners are stateless between executions and reconnect automatically when the Hub becomes available. Executions that were in flight during an outage are handled by the stale execution check when the Hub restarts.

Performance Metrics

Each execution records resource utilization on the runner: CPU average and peak (percentage), memory average and peak (megabytes). Runner heartbeats include real-time CPU, memory, and disk metrics. These metrics are available in the Hub dashboard for capacity planning.
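
The per-execution figures are simple aggregates over samples taken while the execution runs. The sketch below shows the arithmetic. The function name, the field names, and the sample values are hypothetical and do not reflect the Hub's schema.

```python
from statistics import mean

def summarize_samples(cpu_percent_samples, memory_mb_samples):
    """Reduce raw samples to the average and peak values kept per execution."""
    return {
        "cpu_avg_percent": mean(cpu_percent_samples),
        "cpu_peak_percent": max(cpu_percent_samples),
        "memory_avg_mb": mean(memory_mb_samples),
        "memory_peak_mb": max(memory_mb_samples),
    }

summary = summarize_samples([10, 50, 30], [200, 400])
```

For capacity planning, the peak values indicate how much headroom a runner machine needs, while the averages describe steady-state load.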
