Component System

Overview

The component system is flow8’s abstraction layer for runtime capabilities. Instead of hardcoding integrations, flow8 allows administrators to configure and swap implementations of storage, AI, database, request handling, and logging without any code changes. This enables multi-tenancy, cost optimization, and gradual migration between providers.

Architecture

Components are registered in the component_configs MongoDB collection. Each component has:

Kind: Category of capability (storage, ai, db, request, console)
Name: Unique identifier within a company
Config: Provider-specific configuration (API keys, endpoints, parameters)
Is Default: Flag indicating if this is the default for its kind
Company ID: Multi-tenant scope

At execution time, a flowlet can:

Use the default component for its kind
Reference a named component explicitly
Override via ComponentConfigIds map

Component Registry

Components are loaded at startup and cached in memory. The registry is accessed during flow execution via the component resolver:

type ComponentResolver interface {
    Resolve(kind, name string, companyID string) (Component, error)
    ResolveDefault(kind, companyID string) (Component, error)
}

If a component is updated, the cache is invalidated, and subsequent requests resolve the new configuration.
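As a rough illustration, an in-memory registry satisfying a resolver of this shape could look like the sketch below. The `memoryResolver` type, its `Put` method, the `registryKey` type, and the simplified `Component` struct are hypothetical names introduced for the example, and the default lookup is assumed to be keyed by kind and company ID only. Replacing an entry through `Put` stands in for cache invalidation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Component is a simplified stand-in for a resolved component (hypothetical shape).
type Component struct {
	Kind, Name, CompanyID string
	IsDefault             bool
}

// ErrNotFound is returned when no matching component is registered.
var ErrNotFound = errors.New("component not found")

// registryKey scopes every lookup to a company, a kind, and a name.
type registryKey struct {
	companyID, kind, name string
}

// memoryResolver is a hypothetical in-memory registry guarded by a RWMutex.
type memoryResolver struct {
	mu    sync.RWMutex
	items map[registryKey]Component
}

func newMemoryResolver() *memoryResolver {
	return &memoryResolver{items: make(map[registryKey]Component)}
}

// Put stores or replaces a component; replacing an entry means later
// lookups see the new configuration.
func (r *memoryResolver) Put(c Component) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.items[registryKey{c.CompanyID, c.Kind, c.Name}] = c
}

// Resolve looks up a named component within a company.
func (r *memoryResolver) Resolve(kind, name, companyID string) (Component, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	c, ok := r.items[registryKey{companyID, kind, name}]
	if !ok {
		return Component{}, fmt.Errorf("%w: %s/%s", ErrNotFound, kind, name)
	}
	return c, nil
}

// ResolveDefault returns the component flagged as default for a kind.
func (r *memoryResolver) ResolveDefault(kind, companyID string) (Component, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	for _, c := range r.items {
		if c.CompanyID == companyID && c.Kind == kind && c.IsDefault {
			return c, nil
		}
	}
	return Component{}, fmt.Errorf("%w: no default for kind %s", ErrNotFound, kind)
}

func main() {
	r := newMemoryResolver()
	r.Put(Component{Kind: "ai", Name: "default-gpt4", CompanyID: "company-123", IsDefault: true})
	c, err := r.ResolveDefault("ai", "company-123")
	fmt.Println(c.Name, err)
}
```

Keying every entry by company ID keeps tenants isolated even when two companies reuse the same component name.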

Supported Component Kinds

1. Storage (`storage`)

Handles file and artifact persistence. Used by:

DocumentProcessing module (stores OCR results, converted PDFs)
DataGenerator module (generates and stores output files)
HTTPRequest module (stores downloaded files)
EmailProcessor module (stores attachments)

Supported providers:

Provider	Config Keys	Use Case
`local`	`root_path` (directory path)	Development, single-instance deployments
`s3`	`bucket`, `region`, `endpoint` (optional, for S3-compatible services), `access_key_id` (encrypted), `secret_access_key` (encrypted)	Production AWS, DigitalOcean Spaces, MinIO
`gcs`	`bucket`, `project_id`, `credentials` (encrypted JSON)	Google Cloud Storage, managed GCS

Example configuration:

{
  "_id": "ObjectID",
  "name": "default-s3",
  "kind": "storage",
  "company_id": "company-123",
  "config": {
    "provider": "s3",
    "bucket": "flow8-artifacts",
    "region": "us-east-1",
    "access_key_id": "[encrypted]",
    "secret_access_key": "[encrypted]"
  },
  "is_default": true,
  "created_at": "2026-04-04T10:00:00Z"
}
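Before a storage configuration like this is saved, it can be checked against the required keys from the provider table. The following is a minimal sketch, not flow8's actual validation: `requiredStorageKeys` and `missingStorageKeys` are hypothetical names, and the config is simplified to a string map.

```go
package main

import (
	"fmt"
	"sort"
)

// requiredStorageKeys lists the mandatory config keys per provider, taken from
// the provider table above; "endpoint" is optional for s3 and therefore omitted.
var requiredStorageKeys = map[string][]string{
	"local": {"root_path"},
	"s3":    {"bucket", "region", "access_key_id", "secret_access_key"},
	"gcs":   {"bucket", "project_id", "credentials"},
}

// missingStorageKeys is a hypothetical helper: it returns the required keys
// that are absent or empty in config, sorted for stable output.
func missingStorageKeys(config map[string]string) ([]string, error) {
	provider := config["provider"]
	required, ok := requiredStorageKeys[provider]
	if !ok {
		return nil, fmt.Errorf("unknown storage provider %q", provider)
	}
	var missing []string
	for _, k := range required {
		if config[k] == "" {
			missing = append(missing, k)
		}
	}
	sort.Strings(missing)
	return missing, nil
}

func main() {
	// An incomplete s3 config: region and both credentials are missing.
	cfg := map[string]string{"provider": "s3", "bucket": "flow8-artifacts"}
	missing, err := missingStorageKeys(cfg)
	fmt.Println(missing, err)
}
```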

2. AI (`ai`)

Provides large language model access. Used by:

ChatCompletion module (chat and text generation)
TextExtraction module (structured data extraction from documents)
DocumentSummary module (summarization)
ImageAnalysis module (image understanding)

Supported providers:

Provider	Config Keys	Notes
`openai`	`api_key` (encrypted), `model`, `base_url` (optional), `temperature`, `max_tokens`, `timeout_seconds`	OpenAI API, Azure OpenAI via base_url
`anthropic`	`api_key` (encrypted), `model`, `base_url` (optional), `temperature`, `max_tokens`, `timeout_seconds`	Anthropic Claude API
`mistral`	`api_key` (encrypted), `model`, `base_url` (optional), `temperature`, `max_tokens`, `timeout_seconds`	Mistral AI API
`ollama`	`base_url`, `model`, `timeout_seconds`	Self-hosted Ollama instance (no auth by default)
`openai_compatible`	`base_url`, `api_key` (encrypted, optional), `model`, `temperature`, `max_tokens`, `timeout_seconds`	Any OpenAI-compatible endpoint (LM Studio, text-generation-webui, etc.)

Example OpenAI configuration:

{
  "name": "default-gpt4",
  "kind": "ai",
  "company_id": "company-123",
  "config": {
    "provider": "openai",
    "api_key": "[encrypted: sk-...]",
    "model": "gpt-4-turbo-preview",
    "temperature": 0.7,
    "max_tokens": 2048,
    "timeout_seconds": 30
  },
  "is_default": true
}

Example self-hosted Ollama configuration:

{
  "name": "ollama-local",
  "kind": "ai",
  "company_id": "company-123",
  "config": {
    "provider": "ollama",
    "base_url": "http://ollama:11434",
    "model": "mistral:7b",
    "timeout_seconds": 60
  },
  "is_default": false
}

3. Database (`db`)

Provides SQL database connectivity for flows that need to execute queries directly. Used by:

SQLQuery module (execute SELECT/INSERT/UPDATE/DELETE)
DatabaseSync module (bi-directional sync with external databases)

Supported providers:

Provider	Config Keys	Requirements
`postgres`	`host`, `port`, `database`, `user`, `password` (encrypted), `ssl_mode`, `connection_timeout`, `max_open_conns`	PostgreSQL 12+
`mysql`	`host`, `port`, `database`, `user`, `password` (encrypted), `ssl_mode`, `connection_timeout`, `max_open_conns`	MySQL 5.7+ or MariaDB 10.2+
`sqlite`	`file_path`	SQLite 3 (single-file, limited concurrency)

Example PostgreSQL configuration:

{
  "name": "prod-analytics-db",
  "kind": "db",
  "company_id": "company-123",
  "config": {
    "provider": "postgres",
    "host": "analytics.db.internal",
    "port": 5432,
    "database": "analytics",
    "user": "flow8_user",
    "password": "[encrypted]",
    "ssl_mode": "require",
    "connection_timeout": 10,
    "max_open_conns": 25
  },
  "is_default": true
}
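To show how these keys might reach a driver, the sketch below turns a PostgreSQL config into a connection URL of the form accepted by libpq-style drivers. `PostgresConfig` and `postgresURL` are hypothetical names, and mapping `ssl_mode` to `sslmode` and `connection_timeout` to `connect_timeout` is an assumption about how flow8 interprets the keys.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strconv"
)

// PostgresConfig mirrors the config keys of the example above (hypothetical type).
type PostgresConfig struct {
	Host              string
	Port              int
	Database          string
	User              string
	Password          string // already decrypted
	SSLMode           string
	ConnectionTimeout int // seconds
}

// postgresURL builds a postgres:// connection URL. Credentials and query
// parameters are escaped by net/url, so special characters are safe.
func postgresURL(c PostgresConfig) string {
	q := url.Values{}
	q.Set("sslmode", c.SSLMode)
	q.Set("connect_timeout", strconv.Itoa(c.ConnectionTimeout))
	u := url.URL{
		Scheme:   "postgres",
		User:     url.UserPassword(c.User, c.Password),
		Host:     net.JoinHostPort(c.Host, strconv.Itoa(c.Port)),
		Path:     "/" + c.Database,
		RawQuery: q.Encode(),
	}
	return u.String()
}

func main() {
	cfg := PostgresConfig{
		Host:              "analytics.db.internal",
		Port:              5432,
		Database:          "analytics",
		User:              "flow8_user",
		Password:          "s3cret",
		SSLMode:           "require",
		ConnectionTimeout: 10,
	}
	fmt.Println(postgresURL(cfg))
}
```

A pool limit such as `max_open_conns` is not part of the URL; with Go's `database/sql` it would typically be applied after opening the handle, through `(*sql.DB).SetMaxOpenConns`.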

4. Request (`request`)

Provides HTTP client behavior for external API calls. Used by:

HTTPRequest module (make HTTP requests to external APIs)
Webhook module (deliver flow results via HTTP)
OAuth2 token refresh and external integrations

Supported providers:

Provider	Config Keys	Use Case
`default`	`timeout_seconds`, `max_retries`, `retry_backoff_ms`, `verify_ssl`	Standard HTTP with configurable timeouts and retries
`custom_tls`	`timeout_seconds`, `max_retries`, `retry_backoff_ms`, `verify_ssl`, `client_cert` (encrypted), `client_key` (encrypted), `ca_cert`	mTLS for APIs requiring client certificates

Example configuration:

{
  "name": "secure-api-requests",
  "kind": "request",
  "company_id": "company-123",
  "config": {
    "provider": "default",
    "timeout_seconds": 30,
    "max_retries": 3,
    "retry_backoff_ms": 1000,
    "verify_ssl": true
  },
  "is_default": true
}
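The sketch below shows one way the `default` provider's keys could map onto Go's `net/http` client. It is an illustration under assumptions: `RequestConfig`, `newClient`, `backoff`, and `getWithRetry` are hypothetical names, the backoff is assumed to double on each retry (the source does not say whether it is fixed or exponential), and `verify_ssl` is assumed to control certificate verification.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"time"
)

// RequestConfig mirrors the config keys of the "default" provider (hypothetical type).
type RequestConfig struct {
	TimeoutSeconds int
	MaxRetries     int
	RetryBackoffMs int
	VerifySSL      bool
}

// newClient applies the timeout and TLS verification settings.
func newClient(c RequestConfig) *http.Client {
	return &http.Client{
		Timeout: time.Duration(c.TimeoutSeconds) * time.Second,
		Transport: &http.Transport{
			// Assumption: verify_ssl=false disables certificate checks.
			TLSClientConfig: &tls.Config{InsecureSkipVerify: !c.VerifySSL},
		},
	}
}

// backoff returns the wait before retry number attempt (1-based),
// doubling the base delay each time (assumed policy).
func backoff(c RequestConfig, attempt int) time.Duration {
	factor := time.Duration(1) << (attempt - 1)
	return time.Duration(c.RetryBackoffMs) * time.Millisecond * factor
}

// getWithRetry issues a GET and retries on transport errors and 5xx responses.
func getWithRetry(client *http.Client, c RequestConfig, url string) (*http.Response, error) {
	var lastErr error
	for attempt := 0; attempt <= c.MaxRetries; attempt++ {
		if attempt > 0 {
			time.Sleep(backoff(c, attempt))
		}
		resp, err := client.Get(url)
		if err == nil && resp.StatusCode < 500 {
			return resp, nil // caller is responsible for closing the body
		}
		if err == nil {
			resp.Body.Close()
			lastErr = fmt.Errorf("server returned %s", resp.Status)
		} else {
			lastErr = err
		}
	}
	return nil, fmt.Errorf("after %d attempts: %w", c.MaxRetries+1, lastErr)
}

func main() {
	cfg := RequestConfig{TimeoutSeconds: 30, MaxRetries: 3, RetryBackoffMs: 1000, VerifySSL: true}
	client := newClient(cfg)
	fmt.Println("timeout:", client.Timeout)
	for attempt := 1; attempt <= cfg.MaxRetries; attempt++ {
		fmt.Println("retry", attempt, "waits", backoff(cfg, attempt))
	}
}
```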

5. Console (`console`)

Handles structured logging and debugging output during flow execution. Used by:

Debug module (log execution state for troubleshooting)
Background job logging and error reporting

Supported providers:

Provider	Config Keys	Use Case
`stdout`	`min_level` (debug/info/warn/error)	Direct console output, useful for container logs
`file`	`log_file_path`, `min_level`, `max_size_mb`, `max_backups`	Persistent file logging with rotation
`structured`	`format` (json/logfmt), `min_level`	Structured logging for log aggregation systems (Datadog, Splunk, ELK)
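The `structured` provider's `format` and `min_level` keys map fairly naturally onto Go's standard `log/slog` package (Go 1.21 or later). The sketch below is an illustration, not flow8's logger: `parseLevel`, `newLogger`, and `emitAll` are hypothetical names, and slog's text handler is used as an approximation of logfmt output.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log/slog"
	"strings"
)

// parseLevel converts a min_level value into a slog level.
func parseLevel(s string) (slog.Level, error) {
	switch strings.ToLower(s) {
	case "debug":
		return slog.LevelDebug, nil
	case "info":
		return slog.LevelInfo, nil
	case "warn":
		return slog.LevelWarn, nil
	case "error":
		return slog.LevelError, nil
	}
	return 0, fmt.Errorf("unknown min_level %q", s)
}

// newLogger builds a logger for the given format ("json" or "logfmt")
// that drops records below minLevel.
func newLogger(w io.Writer, format, minLevel string) (*slog.Logger, error) {
	lvl, err := parseLevel(minLevel)
	if err != nil {
		return nil, err
	}
	opts := &slog.HandlerOptions{Level: lvl}
	switch format {
	case "json":
		return slog.New(slog.NewJSONHandler(w, opts)), nil
	case "logfmt":
		// slog's text handler writes key=value pairs, close to logfmt.
		return slog.New(slog.NewTextHandler(w, opts)), nil
	}
	return nil, fmt.Errorf("unknown format %q", format)
}

// emitAll logs one record at each level and reports how many lines were written.
func emitAll(format, minLevel string) (int, error) {
	var buf bytes.Buffer
	logger, err := newLogger(&buf, format, minLevel)
	if err != nil {
		return 0, err
	}
	logger.Debug("debug message")
	logger.Info("info message")
	logger.Warn("warn message")
	logger.Error("error message")
	return strings.Count(buf.String(), "\n"), nil
}

func main() {
	n, err := emitAll("json", "warn")
	fmt.Println("lines written:", n, "error:", err)
}
```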

Per-Layer Component Override

A flowlet can override its parent flow’s component selections via the ComponentConfigIds map:

type DBFlowlet struct {
    Name               string
    ModuleRef          string
    ComponentConfigIds map[string]string // kind → config name
    Timeout            time.Duration
    // ... other fields
}

Example: Use a specific AI provider for a particular step:

{
  "name": "extract-entities",
  "module_ref": "text-extraction",
  "component_config_ids": {
    "ai": "claude-for-extraction"
  },
  "timeout": "30s"
}

This is resolved at execution time in layer_service.go:

func (s *LayerService) Execute(ctx context.Context, play *model.DBPlay, flowlet *model.DBFlowlet) error {
    var (
        aiComponent Component
        err         error
    )

    // 1. Prefer the flowlet's override for the AI kind, if one is set.
    if name := flowlet.ComponentConfigIds["ai"]; name != "" {
        aiComponent, err = s.ComponentResolver.Resolve("ai", name, play.CompanyID)
    } else {
        // 2. Otherwise fall back to the company's default AI component.
        aiComponent, err = s.ComponentResolver.ResolveDefault("ai", play.CompanyID)
    }
    if err != nil {
        return fmt.Errorf("resolve ai component: %w", err)
    }

    // 3. Execute the module with the selected component.
    // ... execute module with aiComponent
    return nil
}

Configuration Examples

Multi-AI-Provider Setup

A company wants to:

Use GPT-4 for sensitive analysis (expensive but accurate)
Use Mistral for bulk summarization (cheaper)
Use self-hosted Ollama for development/testing

Configuration:

[
  {
    "name": "gpt4-premium",
    "kind": "ai",
    "company_id": "company-123",
    "config": {
      "provider": "openai",
      "api_key": "[encrypted: sk-...]",
      "model": "gpt-4-turbo-preview",
      "temperature": 0.3
    },
    "is_default": false
  },
  {
    "name": "mistral-budget",
    "kind": "ai",
    "company_id": "company-123",
    "config": {
      "provider": "mistral",
      "api_key": "[encrypted: ...]",
      "model": "mistral-large",
      "temperature": 0.7
    },
    "is_default": true
  },
  {
    "name": "ollama-local",
    "kind": "ai",
    "company_id": "company-123",
    "config": {
      "provider": "ollama",
      "base_url": "http://ollama:11434",
      "model": "neural-chat:7b"
    },
    "is_default": false
  }
]

Flow configuration:

{
  "name": "document-analysis-pipeline",
  "flowlets": [
    {
      "name": "sensitive-data-extraction",
      "module_ref": "text-extraction",
      "component_config_ids": {
        "ai": "gpt4-premium"
      }
    },
    {
      "name": "bulk-summarize",
      "module_ref": "document-summary",
      "component_config_ids": {
        "ai": "mistral-budget"
      }
    }
  ]
}
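To make the selection rule concrete, the sketch below parses a flow of this shape and reports which AI component each flowlet ends up with. The `Flow` and `Flowlet` types and the `effectiveComponent` helper are hypothetical; the JSON field names follow the example, and `mistral-budget` is passed in as the company default from the configuration above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Flowlet and Flow model only the fields used in the example (hypothetical types).
type Flowlet struct {
	Name               string            `json:"name"`
	ModuleRef          string            `json:"module_ref"`
	ComponentConfigIDs map[string]string `json:"component_config_ids"`
}

type Flow struct {
	Name     string    `json:"name"`
	Flowlets []Flowlet `json:"flowlets"`
}

// effectiveComponent returns the flowlet's override for kind if one is set,
// otherwise the company default. Reading from a nil map is safe in Go.
func effectiveComponent(f Flowlet, kind, defaultName string) string {
	if name := f.ComponentConfigIDs[kind]; name != "" {
		return name
	}
	return defaultName
}

// parseFlow decodes a flow definition from JSON.
func parseFlow(data string) (Flow, error) {
	var flow Flow
	err := json.Unmarshal([]byte(data), &flow)
	return flow, err
}

const flowJSON = `{
  "name": "document-analysis-pipeline",
  "flowlets": [
    {
      "name": "sensitive-data-extraction",
      "module_ref": "text-extraction",
      "component_config_ids": {"ai": "gpt4-premium"}
    },
    {
      "name": "plain-summary",
      "module_ref": "document-summary"
    }
  ]
}`

func main() {
	flow, err := parseFlow(flowJSON)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, f := range flow.Flowlets {
		fmt.Printf("%s -> %s\n", f.Name, effectiveComponent(f, "ai", "mistral-budget"))
	}
}
```

The second flowlet in this sketch deliberately omits `component_config_ids`, so it falls back to the default.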

Regional Storage Setup

A company operates in both the EU and the US and needs to meet data residency requirements:

[
  {
    "name": "s3-eu-central",
    "kind": "storage",
    "company_id": "company-123",
    "config": {
      "provider": "s3",
      "bucket": "flow8-eu-artifacts",
      "region": "eu-central-1",
      "access_key_id": "[encrypted]",
      "secret_access_key": "[encrypted]"
    },
    "is_default": true
  },
  {
    "name": "s3-us-east",
    "kind": "storage",
    "company_id": "company-123",
    "config": {
      "provider": "s3",
      "bucket": "flow8-us-artifacts",
      "region": "us-east-1",
      "access_key_id": "[encrypted]",
      "secret_access_key": "[encrypted]"
    },
    "is_default": false
  }
]

Flows processing EU data use the default, while flows with a component_config_ids: { storage: "s3-us-east" } override use the US bucket.

Adding New Component Implementations

To add support for a new provider (e.g., a proprietary AI service), implement the component interface in pkg/components/:

// In pkg/components/ai_custom.go
type CustomAIComponent struct {
    APIKey  string
    BaseURL string
    Model   string
}

// Chat sends the conversation to the provider and returns the reply text.
func (c *CustomAIComponent) Chat(ctx context.Context, messages []Message) (string, error) {
    // Implementation: call the provider's API here.
    return "", errors.New("not implemented")
}

// Register in provider
func ProvideAIComponent(config *ComponentConfig) (Component, error) {
    switch config.Provider {
    case "custom":
        return &CustomAIComponent{
            APIKey:  decrypt(config.APIKey),
            BaseURL: config.BaseURL,
            Model:   config.Model,
        }, nil
    // ... handle other providers
    default:
        return nil, fmt.Errorf("unsupported AI provider %q", config.Provider)
    }
}

Then register a configuration for it through the components API:

POST /api/v1/components
{
  "kind": "ai",
  "name": "custom-service",
  "config": {
    "provider": "custom",
    "api_key": "...",
    "base_url": "...",
    "model": "..."
  }
}

Component Lifecycle

Registration: Administrator creates component config via API or MongoDB
Caching: Component is loaded into memory and cached with TTL
Discovery: At layer execution, component resolver finds best match (layer override > default)
Instantiation: Component is instantiated with decrypted config
Usage: Module executes with the component
Invalidation: If config is updated, cache is cleared and next use loads new config
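The caching and invalidation steps of this lifecycle can be sketched as a small TTL cache. This is an illustration under assumptions, not the actual registry: `ttlCache`, `cacheEntry`, `fakeClock`, and the string-valued entries are hypothetical, and the `load` callback stands in for reading and decrypting a config from MongoDB.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type cacheEntry struct {
	value   string
	expires time.Time
}

// ttlCache keeps loaded values until they expire or are invalidated.
type ttlCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time // injectable clock, so expiry can be tested
	entries map[string]cacheEntry
}

func newTTLCache(ttlSeconds int, now func() time.Time) *ttlCache {
	return &ttlCache{
		ttl:     time.Duration(ttlSeconds) * time.Second,
		now:     now,
		entries: make(map[string]cacheEntry),
	}
}

// Get returns the cached value, calling load only on a miss or after expiry.
// The lock is held while loading, which keeps the sketch simple but
// serializes concurrent loads.
func (c *ttlCache) Get(key string, load func() (string, error)) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[key]; ok && c.now().Before(e.expires) {
		return e.value, nil
	}
	v, err := load()
	if err != nil {
		return "", err
	}
	c.entries[key] = cacheEntry{value: v, expires: c.now().Add(c.ttl)}
	return v, nil
}

// Invalidate drops an entry so the next Get reloads it.
func (c *ttlCache) Invalidate(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.entries, key)
}

// fakeClock is a manually advanced clock for demonstrations and tests.
type fakeClock struct{ t time.Time }

func (f *fakeClock) Now() time.Time      { return f.t }
func (f *fakeClock) Advance(seconds int) { f.t = f.t.Add(time.Duration(seconds) * time.Second) }

func main() {
	cache := newTTLCache(60, time.Now)
	load := func() (string, error) { return "decrypted-config", nil }
	v, err := cache.Get("company-123/ai/default-gpt4", load)
	fmt.Println(v, err)
}
```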

Best Practices

Use defaults wisely: Set a sensible default for each kind to reduce cognitive load
Document overrides: Use clear naming (e.g., claude-for-extraction) to explain why flows override defaults
Monitor costs: Track AI token usage and storage bytes per component
Rotate credentials: Use managed secrets where possible (AWS Secrets Manager, HashiCorp Vault)
Test provider failover: Verify that switching from one provider to another doesn’t break flows
Version components: Include version info in component names if you maintain multiple versions of a provider
