# sfp-server: Architecture Overview (Beta)

## Introduction

sfp server is the operational backbone of the sfp platform. It provides an API-driven control plane that orchestrates Salesforce development workflows — builds, deployments, environment management, and release coordination — while the sfp CLI, Codev desktop app, and CI/CD systems act as clients.

Unlike CI/CD-native approaches that scatter state across job logs and Git artifacts, sfp server centralises operational state in a database. This enables real-time visibility into what is happening across environments and pipelines, consistent coordination between concurrent operations, and security controls that operate at the data level rather than the pipeline level.

Each organisation runs on its own dedicated instance. There is no shared compute or shared database between tenants. This means that a failure, performance issue, or security incident in one organisation's instance has no impact on any other.

## System Architecture

The platform is organised into four cooperating layers, each with focused responsibilities that keep the system modular and operationally manageable.

{% @mermaid/diagram content="flowchart TB
subgraph ClientLayer["Client Layer"]
    CLI["sfp CLI"]
    CD["Codev Desktop"]
    CI["CI/CD Pipelines"]
    UI["Web Application"]
end

subgraph EdgeLayer["Edge Layer"]
    Caddy["Caddy Reverse Proxy<br/>TLS · Auth Routing · Maintenance"]
end

subgraph AppLayer["Application Layer"]
    API["sfp Server (NestJS)<br/>REST API · WebSocket · Auth"]
end

subgraph ProcessLayer["Processing Layer"]
    HE["Hatchet Engine<br/>Workflow Orchestration"]
    HW["Hatchet Workers<br/>Standard · Long-running"]
end

subgraph DataLayer["Data & Observability Layer"]
    SB["Supabase<br/>PostgreSQL · GoTrue · PostgREST"]
    VM["VictoriaMetrics<br/>Metrics"]
    VL["VictoriaLogs<br/>Logs"]
    VD["Verdaccio<br/>NPM Registry"]
end

ClientLayer --> EdgeLayer
EdgeLayer --> AppLayer
AppLayer --> ProcessLayer
AppLayer --> DataLayer
ProcessLayer --> AppLayer
ProcessLayer --> DataLayer" %}

The **edge layer** is the single entry point for all external traffic. Caddy terminates TLS, routes requests based on path patterns, and applies baseline protections before traffic reaches internal services. In self-hosted deployments, Caddy also handles `/auth/v1/*` routing to GoTrue (the Supabase auth service), which means clients never need to know the internal Supabase URL. When the application server is unavailable during updates, Caddy automatically serves a maintenance page rather than returning connection errors.

The **application layer** hosts the NestJS API server, which serves as the single control plane for all operations. It handles authentication, user management, project configuration, and task dispatch. The API server does not execute long-running work itself — it delegates to the processing layer through Hatchet, a durable workflow engine. This separation keeps the API responsive even when expensive operations are running.

The **processing layer** is built on Hatchet, an open-source durable workflow engine. Hatchet provides automatic retries, cron scheduling, workflow composition, and execution visibility. Workers run sfp CLI commands in isolated contexts, each receiving only the credentials required for their assigned task. Workers are organised into separate pools — standard workers handle typical operations, while a dedicated long-running pool handles tasks like full test suite execution or bulk deployments that may take tens of minutes.
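
The pool split described above can be sketched as a simple dispatch rule. This is illustrative only: the task-kind names and pool labels here are assumptions, not the actual sfp server dispatch code.

```typescript
// Illustrative pool selection; task kinds and pool names are assumptions.
type WorkerPool = "standard" | "long-running";

// Task kinds that routinely run for tens of minutes get the dedicated pool.
const LONG_RUNNING_KINDS = new Set(["run-all-tests", "bulk-deploy"]);

function selectPool(taskKind: string): WorkerPool {
  return LONG_RUNNING_KINDS.has(taskKind) ? "long-running" : "standard";
}
```

Routing long tasks to a separate pool keeps standard workers free, so short operations are never queued behind a multi-hour test run.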

The **data and observability layer** uses Supabase (PostgreSQL with GoTrue authentication, PostgREST, and real-time subscriptions) for all persistent state. VictoriaMetrics and VictoriaLogs provide metrics collection and centralised log aggregation. Verdaccio runs a private npm registry within the instance, ensuring that builds are reproducible and do not depend on external registry availability.

## Service Topology

A running sfp server instance consists of the following services, all managed through a single Docker Compose stack.

**sfp server** is the NestJS application that exposes the REST API and coordinates all platform operations. It communicates with Supabase for data persistence and authentication, with Hatchet for task dispatch, and with VictoriaMetrics and VictoriaLogs for observability data. The server handles webhook ingestion from GitHub and Azure DevOps, manages integration credentials, and serves the web application as a single-page app.

**Caddy** serves as the reverse proxy and TLS termination point. It routes `/auth/v1/*` requests to Supabase Kong (and through to GoTrue), `/sfp/*` API requests to the server, and all other paths to the server for SPA serving. For self-hosted deployments, Caddy manages TLS certificates automatically via Let's Encrypt or uses Cloudflare origin certificates when deployed behind Cloudflare. Caddy also exposes administrative services (Hatchet dashboard, Supabase Studio, Verdaccio) on separate ports via a DNS-only admin subdomain.

**Hatchet** is the workflow engine that handles durable task execution. It consists of three components: a dedicated PostgreSQL database for workflow state, an engine service that manages scheduling and dispatch, and a web dashboard for monitoring workflow execution. Hatchet provides automatic retries with configurable backoff, cron scheduling for recurring operations, workflow composition for multi-step pipelines, and full execution history with log capture.

**Hatchet workers** execute the actual Salesforce operations by running sfp CLI commands. Each worker starts from the same container image as the server, loads a Hatchet client token, and connects to the engine to receive task assignments. When a task arrives, the worker fetches the required credentials from the server, executes the operation, streams progress updates, and completes cleanly. Standard workers and long-running workers operate as separate deployment units with independent scaling.

**Supabase** provides the data layer through several coordinated services. PostgreSQL stores all application data with row-level security policies enforcing access boundaries. GoTrue handles authentication including OAuth (GitHub, Azure AD) and SAML SSO, which is enabled by default with auto-generated signing keys. Kong acts as the internal API gateway for Supabase services. PostgREST exposes a REST interface over database tables. The real-time engine tracks table changes and pushes updates to subscribed clients through WebSocket channels. Studio provides a web-based database management interface for administrators.

**VictoriaMetrics** collects and stores time-series metrics, providing operational visibility into task execution times, queue depths, API latency, and system resource utilisation.

**VictoriaLogs** aggregates and indexes structured logs from all services, enabling centralised log search and analysis across the entire stack.

**Verdaccio** runs a private npm registry within the instance, storing sfp packages locally so that builds remain reproducible and do not depend on external registry availability during execution.

## Task Execution

When a client requests an operation — deploying a package, creating a sandbox, running tests, or executing a release pipeline — the request follows a consistent path through the system.

The API server receives the request, validates authentication and authorisation, and creates a task definition. It then dispatches the task to Hatchet as a workflow run. Hatchet places the workflow in its execution queue, and a worker picks it up when capacity is available. The worker requests the necessary credentials from the server — Salesforce org tokens, Git provider keys, or other secrets — and executes the operation by running the corresponding sfp CLI command.

Throughout execution, the worker reports progress back through Hatchet's event system and the server's WebSocket channels. Connected clients receive live updates as the operation proceeds. When the operation completes, the worker writes final results to the database, clears credentials from memory, and returns to the pool for the next task.
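
The worker-side lifecycle above — fetch credentials, execute, report, clear — can be sketched as follows. The function and field names are hypothetical, not the actual worker API.

```typescript
// Hypothetical sketch of the worker task lifecycle; names are illustrative.
interface TaskResult { status: "succeeded" | "failed"; detail?: string }

async function executeTask(
  fetchCredentials: () => Promise<Record<string, string>>,
  runOperation: (creds: Record<string, string>) => Promise<string>,
  reportProgress: (msg: string) => void,
): Promise<TaskResult> {
  let creds: Record<string, string> | null = null;
  try {
    creds = await fetchCredentials();         // just-in-time credential fetch
    reportProgress("credentials loaded");
    const detail = await runOperation(creds); // e.g. spawn an sfp CLI command
    reportProgress("operation complete");
    return { status: "succeeded", detail };
  } catch (err) {
    return { status: "failed", detail: String(err) };
  } finally {
    // Clear credentials from memory even if the operation failed.
    if (creds) for (const k of Object.keys(creds)) delete creds[k];
  }
}
```

The `finally` block is the key structural point: credential cleanup happens on every exit path, success or failure.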

This model provides several properties that matter in production use. Operations are **durable** — if a worker crashes mid-execution, Hatchet detects the failure and can retry the operation automatically according to configured retry policies. Credentials are **scoped** — each worker receives only what it needs for its specific task, and credentials never persist beyond the task lifetime. Progress is **visible** — clients can watch operations in real time through WebSocket subscriptions, and the full execution history is preserved in both the Hatchet dashboard and the application database.

For complex operations like release pipelines, Hatchet's workflow composition allows multiple steps to be orchestrated as a single durable workflow. Each step can have its own retry policy, timeout, and error handling behaviour. If one step fails, the workflow can retry that specific step, skip to a fallback path, or pause for manual intervention — all without losing the state of previously completed steps.
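
A per-step retry policy with a fallback path, in the spirit of what Hatchet provides, might look like the sketch below. This is not the Hatchet SDK; the policy shape and helper are assumptions for illustration.

```typescript
// Illustrative per-step retry with exponential backoff and optional fallback.
// Not the Hatchet SDK; the policy shape is an assumption.
interface RetryPolicy { maxAttempts: number; baseDelayMs: number }

async function runStep<T>(
  step: () => Promise<T>,
  policy: RetryPolicy,
  fallback?: () => Promise<T>,
): Promise<T> {
  for (let attempt = 1; attempt <= policy.maxAttempts; attempt++) {
    try {
      return await step();
    } catch (err) {
      if (attempt === policy.maxAttempts) {
        if (fallback) return fallback(); // exhausted: skip to a fallback path
        throw err;
      }
      // Exponential backoff: 1x, 2x, 4x … the base delay.
      const delay = policy.baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw new Error("unreachable");
}
```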

## Authentication and Access Control

The authentication system supports both interactive users and non-interactive application tokens, handling them through distinct paths that converge at the authorisation layer.

Interactive users authenticate through OAuth (GitHub, Azure AD) or SAML SSO. For self-hosted deployments, SAML is enabled by default with an auto-generated signing key. Administrators register their identity provider through the API and provision users — no manual editing of environment variables, Docker Compose files, or reverse proxy configuration is required. The server auto-detects registered SSO providers from the database, and the login page automatically shows the correct SSO domain.

Application tokens serve CI/CD pipelines and automated integrations. These tokens are explicitly provisioned through the management API with defined permission boundaries that cannot exceed those of the creating user. Unlike interactive sessions, application tokens do not support automatic renewal — when a token expires, it must be explicitly rotated. This is a deliberate design choice: it creates a clear audit trail and prevents long-lived credentials from silently accumulating in CI/CD configurations.

Both authentication paths converge at the AuthGuard, which validates tokens, retrieves account memberships, and enforces role-based access control. The role hierarchy distinguishes between members (who can perform standard operations) and owners (who can manage users, configure integrations, and access production environments). Every authenticated request is logged with full context — the requesting identity, the operation performed, and the timestamp — for compliance and security audit purposes.
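
The member/owner hierarchy reduces to a simple check like the sketch below. The operation names and guard shape are assumptions, not the actual AuthGuard implementation.

```typescript
// Minimal sketch of the role hierarchy; operation names are assumptions.
type Role = "member" | "owner";

const OWNER_ONLY = new Set([
  "manage-users",
  "configure-integrations",
  "access-production",
]);

function isAuthorised(role: Role, operation: string): boolean {
  if (role === "owner") return true;  // owners can do everything members can
  return !OWNER_ONLY.has(operation);  // members get standard operations only
}
```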

The platform supports two authentication deployment modes. In cloud-hosted deployments, a global authentication instance handles identity federation, routing users to their tenant instances after authentication. In self-hosted deployments, the instance runs its own complete authentication stack, providing full independence from external auth services. Both modes use identical authorisation logic once the identity is established.

## Credential Security

Sensitive credentials — Salesforce org tokens, GitHub App private keys, API secrets — are encrypted at rest using pgcrypto with a key stored in Supabase Vault. The encryption happens at the application layer, meaning the database stores only ciphertext. The encryption key itself is bootstrapped automatically during server initialisation and stored securely in the Vault.
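
The encrypt-before-store pattern can be illustrated with a Node AES-256-GCM sketch. sfp server itself uses pgcrypto with a vault-held key; this is only a stand-in showing that the database column ever holds ciphertext, never plaintext.

```typescript
// Illustrative application-layer encryption; not the pgcrypto/Vault code.
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

function encrypt(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // fresh nonce per value
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ct = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // Store iv + auth tag + ciphertext together as one opaque column value.
  return Buffer.concat([iv, cipher.getAuthTag(), ct]).toString("base64");
}

function decrypt(stored: string, key: Buffer): string {
  const buf = Buffer.from(stored, "base64");
  const iv = buf.subarray(0, 12);
  const tag = buf.subarray(12, 28);
  const ct = buf.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // authenticated decryption fails on tampering
  return Buffer.concat([decipher.update(ct), decipher.final()]).toString("utf8");
}
```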

When a worker needs credentials for a task, it requests them from the server through an authenticated internal API call. The server decrypts the credentials, returns them to the worker over the internal Docker network, and the worker holds them in memory only for the duration of the task. Upon completion, credentials are explicitly cleared before the worker returns to the pool.

This just-in-time access pattern means that credentials are never written to disk in plaintext within the application runtime, are never shared between tasks or workers, and are automatically cleaned up even if a worker fails unexpectedly. The blast radius of a compromised worker is limited to the specific credentials that were loaded for that particular task execution.

Integration credentials — GitHub App keys, Azure DevOps PATs, Salesforce connected app secrets — are registered through the integration API, encrypted with the same vault key, and stored in a dedicated credentials table. The server presents a provider abstraction over these credentials, allowing the same operational code to work with different source control and CI/CD providers without changes to the security model.

## Database Architecture

All operational state lives in PostgreSQL through Supabase. The schema covers deployment history, environment allocations, package versions, project configuration, user accounts, team memberships, audit logs, and integration credentials. Row-level security (RLS) policies enforce access boundaries at the database level, providing a defence-in-depth layer that operates independently of application-level access controls.

Each organisation has its own dedicated database instance. This architectural decision provides data separation at the infrastructure level rather than relying on application-level tenant filtering. It ensures that a query performance issue in one tenant cannot affect another, backup and retention policies can be managed independently, and organisations maintain full control over their data governance requirements.

The database also serves as the coordination layer for real-time updates. When a worker writes a state change — task progress, environment status, deployment result — Supabase's real-time engine detects the change and pushes a notification through WebSocket channels to subscribed clients. This keeps the API server, workers, and clients consistent without polling, enabling the UI and CLI to show live progress as operations execute.
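
The push-not-poll pattern can be sketched with an in-memory change feed. This stands in for Supabase's real-time engine and WebSocket channels; the class and its shape are assumptions for illustration.

```typescript
// In-memory stand-in for the real-time engine: a write triggers a push
// to subscribers instead of clients polling for state.
type Listener = (change: { table: string; row: unknown }) => void;

class ChangeFeed {
  private listeners: Listener[] = [];

  subscribe(fn: Listener): void {
    this.listeners.push(fn);
  }

  // Called after a worker writes a state change to the database.
  publish(table: string, row: unknown): void {
    for (const fn of this.listeners) fn({ table, row });
  }
}
```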

The platform exposes two higher-level storage abstractions built on PostgreSQL. A document store provides collection-based storage with versioning and JSONB querying, used for storing build artifact metadata, release definitions, and configuration documents. For high-volume collections (build logs, deployment records), dedicated tables are used with the same API interface but optimised indexing for query performance.

## Network Architecture

All external traffic enters through Caddy, which is the only service exposed to the public network. Internal services communicate through a Docker bridge network using internal hostnames. This provides service isolation while enabling controlled communication between components.

Caddy's routing table directs traffic based on path patterns. Authentication requests (`/auth/v1/*`) are forwarded to Supabase Kong, which routes them to GoTrue. API requests (`/sfp/*`) go to the application server. All other paths are served as static files from the web application, with client-side routing handling navigation. Administrative services — the Hatchet dashboard, Supabase Studio, and Verdaccio registry — are exposed on separate ports via a DNS-only admin subdomain with IP-based access controls.
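
The routing table above reduces to a small path-prefix match. The upstream names below mirror the internal Docker hostnames described in this document but are assumptions, not the actual Caddyfile.

```typescript
// Illustrative path-based routing; upstream names are assumptions.
function routeUpstream(path: string): string {
  if (path.startsWith("/auth/v1/")) return "supabase-kong"; // → GoTrue
  if (path.startsWith("/sfp/")) return "sfp-server";        // REST API
  return "sfp-server-spa";                                  // static SPA assets
}
```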

For self-hosted deployments behind Cloudflare, Caddy uses Cloudflare origin certificates for TLS termination in Full (Strict) mode. For deployments without Cloudflare, Caddy can obtain certificates automatically from Let's Encrypt or use custom certificates provided by the operator.

Webhook endpoints accept inbound events from GitHub, Azure DevOps, and Salesforce. Incoming webhooks are validated for signature authenticity by the application server, mapped to the appropriate Hatchet workflow, and executed through the standard task processing pipeline. This enables automated responses to pull request events, push notifications, and Salesforce platform events.

## Deployment and Lifecycle

sfp server runs as a Docker Compose stack on a dedicated virtual machine. The minimum requirements are 8 GB RAM, 4 vCPUs, and 80 GB SSD storage. The stack includes the edge proxy, API server, Hatchet engine and workers, Supabase services, observability stack, and npm registry — roughly 20 containers in a typical deployment.

Deployment and lifecycle management are handled through the sfp CLI, which provides commands for the full operational lifecycle:

`sfp server init` provisions a new instance. It generates secrets (database passwords, JWT keys, SAML signing keys, encryption keys), creates the Docker Compose configuration, generates TLS certificates, sets up the database schema through migrations, and creates the initial admin user. The entire process runs from a single command that can target local or remote servers over SSH.

`sfp server start` brings up all services with proper dependency ordering, ensuring databases are healthy before starting application services, and application services are healthy before starting workers. It handles Docker registry authentication, volume provisioning, and profile selection automatically.

`sfp server update` performs a controlled update of a running instance. It backs up the current configuration, drains active workflows (waiting for in-progress tasks to complete), applies database migrations, pulls new container images, and restarts services. If migrations fail, the update can continue or abort based on operator preference.
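
The drain step can be sketched as a poll-until-idle loop: wait for in-progress workflows to reach zero before migrations are applied. The polling interval, timeout, and count function are illustrative assumptions, not the actual CLI internals.

```typescript
// Illustrative drain loop for `sfp server update`; parameters are assumptions.
async function drainWorkflows(
  activeCount: () => Promise<number>, // e.g. query Hatchet for running workflows
  pollMs = 1000,
  timeoutMs = 10 * 60 * 1000,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if ((await activeCount()) === 0) return true; // fully drained
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  return false; // timed out with work still in flight
}
```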

The system supports both flxbl-managed deployments (where flxbl provisions and maintains the infrastructure) and self-hosted deployments (where the organisation operates its own instance). Both modes run identical software and container images. The differences are in who manages infrastructure operations and whether authentication uses a global auth instance or a self-contained local stack.

## System Requirements

Each organisation's dedicated instance requires a modern cloud virtual machine (AWS EC2, Azure VM, Hetzner Cloud, or equivalent) with a minimum of 8 GB RAM, 4 vCPUs, and 80 GB SSD storage for containers and data volumes, plus a static IP address or domain with direct internet connectivity.

The instance requires outbound access to Salesforce API endpoints, GitHub or Azure DevOps APIs (depending on the CI/CD provider), and standard HTTPS ports for client access. Docker and Docker Compose must be installed on the host. For self-hosted deployments, a domain name with DNS control is required for TLS certificate provisioning.

## Further Reading

| Topic                                                   | Document                                                                                                                                                                                |
| ------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Authentication, access control, and credential security | [Authentication and Security](https://docs.flxbl.io/flxbl/sfp-server/architecture-overview/sfp-server-architecture-overview-beta/authentication-and-security-architecture)              |
| Task execution, Hatchet workflows, and worker pools     | [Task Processing System](https://docs.flxbl.io/flxbl/sfp-server/architecture-overview/sfp-server-architecture-overview-beta/task-processing-system)                                     |
| Database schema, real-time updates, and vault           | [Database Architecture](https://docs.flxbl.io/flxbl/sfp-server/architecture-overview/sfp-server-architecture-overview-beta/database-architecture)                                       |
| Network topology, TLS, integrations, and webhooks       | [Network Architecture and Integrations](https://docs.flxbl.io/flxbl/sfp-server/architecture-overview/sfp-server-architecture-overview-beta/network-architecture-and-integration-system) |
