Architecture

Core Engine Architecture

How dataflows executes workflows reliably at scale.

Built on Experience

This architecture is not an academic exercise. It is the result of years of experience building mission-critical automation systems, where we ran into the recurring failure modes firsthand: API rate limits, database timeouts, worker crashes, and memory leaks.

We chose to build on proven open-source technologies rather than invent proprietary black boxes.

  • Runtime: Node.js & Nitro for high-performance async I/O.
  • Database: PostgreSQL for reliable, ACID-compliant state storage.
  • Engine: Durable Execution Engine for resumable workflows.

By leveraging these battle-tested components, we focus our innovation on the workflow abstraction itself, not the plumbing.

Durable Execution

dataflows is built on the concept of Durable Execution. Instead of managing queues, retries, and state manually, you write standard TypeScript code that is automatically made durable.

Reliability-as-Code

Move from hand-rolled queues and custom retries to durable, resumable code with simple directives.

  • No Queues: You don't define queues or workers. You just call functions.
  • No Timeouts: Workflows can run for seconds, days, or months.
  • Resumable: If the process crashes, it resumes exactly where it left off.
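
As a rough illustration of the points above, the sketch below shows a workflow that simply awaits two steps in sequence. The names Order, loadOrder, formatReceipt, and sendReceipt are hypothetical and not part of dataflows; the 'use workflow' and 'use step' directives are the ones described on this page. Outside the engine they are inert string literals, so the code also runs as plain TypeScript.

```typescript
// Hypothetical order type and steps; none of these names come from dataflows itself.
type Order = { id: string; totalCents: number };

async function loadOrder(id: string): Promise<Order> {
  'use step'
  // Stand-in for a real API or database call.
  return { id, totalCents: 4200 };
}

async function formatReceipt(order: Order): Promise<string> {
  'use step'
  return `Order ${order.id}: ${(order.totalCents / 100).toFixed(2)}`;
}

// The workflow reads like ordinary sequential code: no queue, no worker, no retry loop.
async function sendReceipt({ id }: { id: string }): Promise<string> {
  'use workflow'
  const order = await loadOrder(id);
  return formatReceipt(order);
}
```

Nothing in the workflow body mentions queues or workers; under this model, persistence and retries would be the engine's job rather than the caller's.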

The 'use workflow' Directive

Marks a function as a workflow entry point. This function is executed by the workflow engine and its state is persisted.

export async function copyToClickUp({ id }: { id: string }) {
  'use workflow'
  // ... workflow logic
}

The 'use step' Directive

Marks a function as an atomic step. The result of a step is memoized. If the workflow crashes and restarts, the engine skips the steps that have already completed and returns their memoized results instead of running them again.

export async function getAzureWorkItemById({ id }: { id: string }) {
  'use step'
  // Once this step has completed, the API call is not repeated.
  // On replay, the saved result is read from the DB instead.
  return getWorkItem({ ... })
}
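
To make the memoization idea concrete, here is a minimal sketch of step replay, assuming an in-memory Map in place of the database. StepJournal, runStep, workflow, and runToCompletion are hypothetical names for illustration; this is not the engine's actual implementation or API.

```typescript
// Minimal in-memory journal; a stand-in for the database-backed state the text describes.
class StepJournal {
  private results = new Map<string, unknown>();
  public executions = 0; // counts how many step bodies actually ran

  // Runs `fn` the first time a step key is seen; afterwards returns the stored result.
  async runStep<T>(key: string, fn: () => Promise<T>): Promise<T> {
    if (this.results.has(key)) {
      return this.results.get(key) as T; // replay: skip the side effect
    }
    this.executions += 1;
    const result = await fn();
    this.results.set(key, result); // record the result before moving on
    return result;
  }
}

// Hypothetical workflow that crashes after its first step, but only on the first attempt.
let crashOnce = true;
async function workflow(journal: StepJournal): Promise<string> {
  const item = await journal.runStep('fetch-item', async () => ({ title: 'Fix login bug' }));
  if (crashOnce) {
    crashOnce = false;
    throw new Error('simulated worker crash');
  }
  return journal.runStep('create-task', async () => `task for: ${item.title}`);
}

// Re-runs the workflow from the top against the same journal until it completes.
async function runToCompletion(journal: StepJournal, maxAttempts = 3): Promise<string> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await workflow(journal);
    } catch (err) {
      lastError = err; // a real engine would reschedule; this sketch replays immediately
    }
  }
  throw lastError;
}
```

In this sketch the second attempt re-enters the workflow from the top, but the 'fetch-item' body is not executed again because its result is already in the journal. A result is only recorded after the step body returns, so a crash in the middle of a step would cause that step to run again.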

Execution Flow

The engine manages the execution state in PostgreSQL. When a workflow sleeps or waits for an event, it offloads its state and releases compute resources.

sequenceDiagram
    participant Webhook
    participant API
    participant Engine
    participant DB

    Webhook->>API: POST /hooks/shopify
    API->>Engine: Start Workflow
    API-->>Webhook: 202 Accepted
    
    Engine->>DB: Create Execution
    Engine->>Engine: Run Step 1 ('use step')
    Engine->>DB: Save Result 1
    Engine->>Engine: Run Step 2 ('use step')
    Engine->>DB: Save Result 2
    
    Engine->>DB: Suspend (Sleep/Wait)
    Note right of Engine: Resources Released
    
    Engine->>DB: Resume (Timer/Event)
    Engine->>Engine: Replay Steps (Cached)
    Engine->>Engine: Run Step 3
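
The suspend and resume portion of the diagram can be approximated in a few lines. The sketch below is an assumption-laden model, not the engine's code: Suspend, Execution, makeContext, drive, and demo are hypothetical names, an in-memory object stands in for the PostgreSQL execution record, and time is passed in explicitly as `now` so the behavior is deterministic.

```typescript
// Signal thrown to unwind the workflow when it has to wait; not an error.
class Suspend {
  constructor(public readonly wakeAt: number) {}
}

// In-memory stand-in for the execution record the engine would keep in PostgreSQL.
type Execution = {
  steps: Map<string, unknown>;
  timers: Map<string, number>;
  log: string[];
};

function makeContext(exec: Execution, now: number) {
  return {
    // Memoized step: executes once; later replays return the saved result.
    async step<T>(key: string, fn: () => Promise<T>): Promise<T> {
      if (exec.steps.has(key)) {
        exec.log.push(`replay ${key}`);
        return exec.steps.get(key) as T;
      }
      const result = await fn();
      exec.steps.set(key, result);
      exec.log.push(`run ${key}`);
      return result;
    },
    // Durable sleep: records a wake-up time once, then suspends until it has passed.
    sleep(key: string, ms: number): void {
      if (!exec.timers.has(key)) exec.timers.set(key, now + ms);
      const wakeAt = exec.timers.get(key)!;
      if (now < wakeAt) throw new Suspend(wakeAt);
    },
  };
}

type Ctx = ReturnType<typeof makeContext>;

// Runs (or re-runs) the workflow from the top at logical time `now`.
async function drive<T>(exec: Execution, now: number, wf: (ctx: Ctx) => Promise<T>) {
  try {
    return { status: 'done' as const, value: await wf(makeContext(exec, now)) };
  } catch (e) {
    if (e instanceof Suspend) return { status: 'suspended' as const, wakeAt: e.wakeAt };
    throw e;
  }
}

// Hypothetical workflow mirroring the diagram: two steps, a wait, then a third step.
async function demo(ctx: Ctx): Promise<number> {
  const a = await ctx.step('step-1', async () => 1);
  const b = await ctx.step('step-2', async () => 2);
  ctx.sleep('wait', 60000);
  const c = await ctx.step('step-3', async () => 3);
  return a + b + c;
}
```

Calling drive(exec, 0, demo) runs the first two steps and returns a suspended status with the wake-up time; nothing stays in memory except the execution record. Calling drive(exec, 60000, demo) with the same record replays the first two steps from the saved results, passes the sleep, and runs the third step.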