# Task Processing System

The task processing system is built on Hatchet, an open-source durable workflow engine. Hatchet handles task dispatch, retry, scheduling, and execution state, while sfp server defines the workflows and provides the credentials and context that workers need.

## How Tasks Execute

{% @mermaid/diagram content="sequenceDiagram
participant Client
participant API as sfp Server
participant Hatchet as Hatchet Engine
participant Worker as Hatchet Worker
participant External as Salesforce / GitHub

Client->>API: Request operation
API->>Hatchet: Dispatch workflow
Hatchet-->>Client: Workflow run ID

Hatchet->>Worker: Assign step
Worker->>API: Fetch credentials
API-->>Worker: Decrypted credentials
Worker->>External: Execute operation
Worker-->>Hatchet: Progress updates
Hatchet-->>Client: Real-time updates (WebSocket)

Worker-->>Hatchet: Step complete
Worker->>Worker: Clear credentials" %}

The API server validates the request and dispatches a workflow to Hatchet. A worker picks it up, fetches credentials from the server, executes the operation using sfp CLI commands, and streams progress through Hatchet's event system and WebSocket channels. On completion, credentials are cleared and results are persisted.

## Worker Pools

Workers run as separate containers using the same image as the server. They connect to the Hatchet engine and register for task assignments.

**Standard workers** handle typical operations — builds, deployments, environment provisioning, source tracking, and configuration tasks. The number of standard workers is configurable via `WORKER_COUNT`.

**Long-running workers** handle operations that may take extended periods — full test suite execution, bulk metadata operations, and complex release pipelines. These run as a separate pool with longer timeout windows.

Both pools follow the same credential access pattern: credentials are fetched per-task, held in memory, and cleared on completion.
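
The per-task credential pattern can be sketched as a context manager. This is a minimal illustration, not the sfp server implementation: `fetch_credentials`, `task_credentials`, and `run_task` are hypothetical names, and the returned token is a placeholder for whatever the server actually provides.

```python
from contextlib import contextmanager


def fetch_credentials(task_id):
    """Hypothetical stand-in for the server call that returns decrypted credentials."""
    return {"access_token": f"token-for-{task_id}"}


@contextmanager
def task_credentials(task_id, fetch=fetch_credentials):
    """Hold credentials in memory only for the duration of one task."""
    creds = fetch(task_id)
    try:
        yield creds
    finally:
        # Clear on completion, whether the task succeeded or raised.
        creds.clear()


def run_task(task_id, operation):
    """Fetch credentials, run the operation with them, then clear them."""
    with task_credentials(task_id) as creds:
        return operation(creds)
```

Placing the cleanup in `finally` means a failed operation still clears the credentials. Note that clearing a dictionary only drops the references; it is not a secure memory wipe.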

## Workflow Types

**Immediate workflows** are triggered by API requests and execute as soon as a worker is available. Deployments, builds, and environment operations use this pattern.

**Recurring workflows** run on cron schedules managed by Hatchet. Code analysis, environment health checks, and cache refreshes use this pattern. Cron definitions are registered with Hatchet during server startup.
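
To make the idea of a cron schedule concrete, the sketch below checks whether a five-field cron expression is due at a given time. It is a simplified illustration, not Hatchet's scheduler: it handles only `*`, `*/n`, and comma-separated integers, and it ignores ranges, names, and the standard rule that day-of-month and day-of-week are combined with OR. The schedule names and expressions in `SCHEDULES` are hypothetical.

```python
from datetime import datetime

# Hypothetical schedules; the real cron definitions are not given in this page.
SCHEDULES = {
    "code-analysis": "0 2 * * *",      # daily at 02:00
    "health-check": "*/15 * * * *",    # every 15 minutes
}


def field_matches(expr, value):
    """Match one cron field: '*', '*/n', or comma-separated integers."""
    if expr == "*":
        return True
    if expr.startswith("*/"):
        return value % int(expr[2:]) == 0
    return value in {int(part) for part in expr.split(",")}


def is_due(cron, when):
    """Return True if the five-field cron expression matches `when`."""
    minute, hour, day, month, weekday = cron.split()
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(day, when.day)
        and field_matches(month, when.month)
        # cron counts Sunday as 0; isoweekday() counts it as 7
        and field_matches(weekday, when.isoweekday() % 7)
    )
```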

**Composed workflows** chain multiple steps into a single durable execution. A release pipeline might validate packages, deploy to staging, run tests, and deploy to production — each step with its own retry policy and timeout. If one step fails, Hatchet can retry that step without re-executing completed steps.
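
The property that completed steps are not re-executed can be illustrated with a small sketch. It assumes a plain `set` as a stand-in for Hatchet's durable state, and `run_pipeline` is a hypothetical helper rather than part of any SDK.

```python
def run_pipeline(steps, completed):
    """Run steps in order, skipping any already recorded in `completed`.

    `steps` is a list of (name, callable) pairs. `completed` stands in for
    durable workflow state and is updated as each step succeeds.
    """
    for name, action in steps:
        if name in completed:
            continue  # finished in an earlier attempt; do not re-execute
        action()  # an exception here leaves `completed` untouched
        completed.add(name)
    return completed
```

If a step raises, calling `run_pipeline` again with the same `completed` set resumes from the failed step, because every earlier step is already recorded.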

## Durability

Workflow state is persisted in Hatchet's dedicated PostgreSQL database. If a worker crashes, Hatchet detects the failure through heartbeat monitoring and can reassign the step to another worker according to the configured retry policy.
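
Heartbeat-based failure detection can be sketched as follows. This shows the general technique only and makes no claim about Hatchet's internals: the function names, the timestamp representation, and the round-robin reassignment are all assumptions made for illustration.

```python
def find_stale_workers(last_heartbeat, now, timeout_seconds):
    """Return the IDs of workers whose last heartbeat is older than the timeout."""
    return sorted(
        worker
        for worker, seen_at in last_heartbeat.items()
        if now - seen_at > timeout_seconds
    )


def reassign_steps(assignments, stale, healthy):
    """Move steps held by stale workers onto healthy workers, round-robin."""
    if not healthy:
        return dict(assignments)  # nowhere to move the steps; leave them as-is
    result = {}
    next_index = 0
    for step, worker in assignments.items():
        if worker in stale:
            result[step] = healthy[next_index % len(healthy)]
            next_index += 1
        else:
            result[step] = worker
    return result
```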

Each step defines its own failure behaviour — number of retries, backoff strategy, and whether to block the workflow or continue. When a workflow fails permanently, the failure is recorded with full context (error messages, input parameters, timing) and is available through both the Hatchet dashboard and the sfp API.
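
A per-step failure policy of this kind might be modelled as below. The field names, the defaults, and the choice of exponential backoff are assumptions for the sake of the sketch, not Hatchet's actual configuration schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepPolicy:
    retries: int = 3             # retries allowed after the first attempt
    backoff_base: float = 2.0    # seconds to wait before the first retry
    backoff_factor: float = 2.0  # multiplier applied to each later retry
    blocking: bool = True        # False: the workflow continues past a failure


def backoff_delays(policy):
    """Return the wait before each retry, growing exponentially."""
    return [
        policy.backoff_base * policy.backoff_factor ** i
        for i in range(policy.retries)
    ]


def run_step(action, policy, sleep=lambda seconds: None):
    """Return (succeeded, error); re-raise the last error if the step is blocking."""
    last_error = None
    # One initial attempt with no delay, then one attempt per retry.
    for delay in [0.0] + backoff_delays(policy):
        if delay:
            sleep(delay)
        try:
            action()
            return True, None
        except Exception as exc:
            last_error = exc
    if policy.blocking:
        raise last_error
    return False, last_error
```

The `sleep` parameter defaults to a no-op so the sketch runs instantly; passing `time.sleep` would make it actually wait between attempts.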

## Graceful Updates

When the server is updated via `sfp server update`, active workflows are drained before services are stopped. The update process queries Hatchet for running workflows, waits for them to reach a completion point (up to a configurable timeout), and then proceeds with the restart. This prevents updates from interrupting in-progress operations.
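
The drain step amounts to a poll-until-idle loop with a deadline. The sketch below assumes a caller-supplied `count_running` function in place of the real Hatchet query, and `drain_workflows` is a hypothetical name; how `sfp server update` behaves when the timeout expires is not described here, so the sketch only reports the outcome.

```python
import time


def drain_workflows(count_running, timeout_seconds, poll_interval=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Wait until no workflows are running or the timeout elapses.

    Returns True if the system drained, False if the timeout was reached.
    """
    deadline = clock() + timeout_seconds
    while True:
        if count_running() == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval)
```

Injecting `clock` and `sleep` keeps the loop testable without real waiting.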

## Monitoring

The Hatchet dashboard provides a real-time view of running workflows, queued work, completed and failed executions, retry history, and worker health. This is complemented by VictoriaMetrics for infrastructure metrics (queue depth, execution latency, worker utilisation) and VictoriaLogs for full-text log search across all services.

