Component System

Overview

The component system is flow8’s abstraction layer for runtime capabilities. Instead of hardcoding integrations, flow8 allows administrators to configure and swap implementations of storage, AI, database, request handling, and logging without any code changes. This enables multi-tenancy, cost optimization, and gradual migration between providers.

Architecture

Components are registered in the component_configs MongoDB collection. Each component has:

  • Kind: Category of capability (storage, ai, db, request, console)
  • Name: Unique identifier within a company
  • Config: Provider-specific configuration (API keys, endpoints, parameters)
  • Is Default: Flag indicating if this is the default for its kind
  • Company ID: Multi-tenant scope
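The field list above could map onto a Go type roughly as follows. This is a minimal sketch, not flow8's actual model: the `ComponentRecord` name, its JSON tags, and the `Validate` and `ParseComponentRecord` helpers are assumptions made for illustration, with the tags chosen to match the JSON examples in this document.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ComponentRecord mirrors the documented fields of one component_configs
// entry. The type name and tags are assumptions, not flow8's real model.
type ComponentRecord struct {
	Kind      string         `json:"kind"`
	Name      string         `json:"name"`
	Config    map[string]any `json:"config"`
	IsDefault bool           `json:"is_default"`
	CompanyID string         `json:"company_id"`
}

// validKinds lists the five kinds named in this document.
var validKinds = map[string]bool{
	"storage": true, "ai": true, "db": true, "request": true, "console": true,
}

// Validate checks only what the field list implies: a known kind, a name,
// and a company scope.
func (c ComponentRecord) Validate() error {
	if !validKinds[c.Kind] {
		return fmt.Errorf("unknown kind %q", c.Kind)
	}
	if c.Name == "" {
		return errors.New("name is required")
	}
	if c.CompanyID == "" {
		return errors.New("company_id is required")
	}
	return nil
}

// ParseComponentRecord decodes one JSON document and validates it.
func ParseComponentRecord(raw []byte) (ComponentRecord, error) {
	var c ComponentRecord
	if err := json.Unmarshal(raw, &c); err != nil {
		return c, err
	}
	return c, c.Validate()
}

func main() {
	raw := []byte(`{"kind":"storage","name":"default-s3","company_id":"company-123","is_default":true,"config":{"provider":"s3"}}`)
	c, err := ParseComponentRecord(raw)
	fmt.Println(c.Name, c.IsDefault, err)
}
```

Keeping `Config` as an untyped map reflects that its keys differ per provider; a real implementation might decode it into a provider-specific struct after the kind and provider are known.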

At execution time, a flowlet can:

  1. Use the default component for its kind
  2. Reference a named component explicitly
  3. Override the selection for a given kind via its ComponentConfigIds map

Component Registry

Components are loaded at startup and cached in memory. The registry is accessed during flow execution via the component resolver:

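type ComponentResolver interface {
	// Resolve returns the named component of the given kind for a company.
	Resolve(kind, name string, companyID string) (Component, error)
	// ResolveDefault returns the company's default component for the kind.
	ResolveDefault(kind string, companyID string) (Component, error)
}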

If a component is updated, the cache is invalidated, and subsequent requests resolve the new configuration.
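The cache-then-invalidate behavior described here can be sketched as a small in-memory registry. Everything in this example is hypothetical: the `Registry` type, the `Loader` callback standing in for the MongoDB read, and the key layout are assumptions chosen to illustrate the idea, not flow8's internals.

```go
package main

import (
	"fmt"
	"sync"
)

// cacheKey identifies one component within a tenant.
type cacheKey struct{ companyID, kind, name string }

// Loader fetches a component's config from the backing store. It is a
// hypothetical stand-in for the database read.
type Loader func(companyID, kind, name string) (map[string]any, error)

// Registry caches loaded configs in memory and reloads them after
// invalidation.
type Registry struct {
	mu    sync.RWMutex
	items map[cacheKey]map[string]any
	load  Loader
}

func NewRegistry(load Loader) *Registry {
	return &Registry{items: make(map[cacheKey]map[string]any), load: load}
}

// Get returns the cached config, loading it on a cache miss.
func (r *Registry) Get(companyID, kind, name string) (map[string]any, error) {
	k := cacheKey{companyID, kind, name}
	r.mu.RLock()
	cfg, ok := r.items[k]
	r.mu.RUnlock()
	if ok {
		return cfg, nil
	}
	cfg, err := r.load(companyID, kind, name)
	if err != nil {
		return nil, fmt.Errorf("load %s/%s: %w", kind, name, err)
	}
	r.mu.Lock()
	r.items[k] = cfg
	r.mu.Unlock()
	return cfg, nil
}

// Invalidate drops one entry so that the next Get reloads it.
func (r *Registry) Invalidate(companyID, kind, name string) {
	r.mu.Lock()
	delete(r.items, cacheKey{companyID, kind, name})
	r.mu.Unlock()
}

func main() {
	model := "gpt-4-turbo-preview"
	reg := NewRegistry(func(_, _, _ string) (map[string]any, error) {
		return map[string]any{"model": model}, nil
	})
	first, _ := reg.Get("company-123", "ai", "default-gpt4")
	model = "mistral-large" // simulate an administrator updating the config
	stale, _ := reg.Get("company-123", "ai", "default-gpt4")
	reg.Invalidate("company-123", "ai", "default-gpt4")
	fresh, _ := reg.Get("company-123", "ai", "default-gpt4")
	fmt.Println(first["model"], stale["model"], fresh["model"])
}
```

Until `Invalidate` is called, readers keep seeing the previously loaded value, which is why the update path needs to trigger invalidation explicitly.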

Supported Component Kinds

1. Storage (storage)

Handles file and artifact persistence. Used by:

  • DocumentProcessing module (stores OCR results, converted PDFs)
  • DataGenerator module (generates and stores output files)
  • HTTPRequest module (stores downloaded files)
  • EmailProcessor module (stores attachments)

Supported providers:

| Provider | Config Keys | Use Case |
|----------|-------------|----------|
| local | root_path (directory path) | Development, single-instance deployments |
| s3 | bucket, region, endpoint (optional, for S3-compatible services), access_key_id (encrypted), secret_access_key (encrypted) | Production AWS, DigitalOcean Spaces, MinIO |
| gcs | bucket, project_id, credentials (encrypted JSON) | Google Cloud Storage, managed GCS |

Example configuration:

{
  "_id": "ObjectID",
  "name": "default-s3",
  "kind": "storage",
  "company_id": "company-123",
  "config": {
    "provider": "s3",
    "bucket": "flow8-artifacts",
    "region": "us-east-1",
    "access_key_id": "[encrypted]",
    "secret_access_key": "[encrypted]"
  },
  "is_default": true,
  "created_at": "2026-04-04T10:00:00Z"
}

2. AI (ai)

Provides large language model access. Used by:

  • ChatCompletion module (chat and text generation)
  • TextExtraction module (structured data extraction from documents)
  • DocumentSummary module (summarization)
  • ImageAnalysis module (image understanding)

Supported providers:

| Provider | Config Keys | Notes |
|----------|-------------|-------|
| openai | api_key (encrypted), model, base_url (optional), temperature, max_tokens, timeout_seconds | OpenAI API, Azure OpenAI via base_url |
| anthropic | api_key (encrypted), model, base_url (optional), temperature, max_tokens, timeout_seconds | Anthropic Claude API |
| mistral | api_key (encrypted), model, base_url (optional), temperature, max_tokens, timeout_seconds | Mistral AI API |
| ollama | base_url, model, timeout_seconds | Self-hosted Ollama instance (no auth by default) |
| openai_compatible | base_url, api_key (encrypted, optional), model, temperature, max_tokens, timeout_seconds | Any OpenAI-compatible endpoint (LM Studio, text-generation-webui, etc.) |

Example OpenAI configuration:

{
  "name": "default-gpt4",
  "kind": "ai",
  "company_id": "company-123",
  "config": {
    "provider": "openai",
    "api_key": "[encrypted: sk-...]",
    "model": "gpt-4-turbo-preview",
    "temperature": 0.7,
    "max_tokens": 2048,
    "timeout_seconds": 30
  },
  "is_default": true
}

Example self-hosted Ollama configuration:

{
  "name": "ollama-local",
  "kind": "ai",
  "company_id": "company-123",
  "config": {
    "provider": "ollama",
    "base_url": "http://ollama:11434",
    "model": "mistral:7b",
    "timeout_seconds": 60
  },
  "is_default": false
}

3. Database (db)

Provides SQL database connectivity for flows that need to execute queries directly. Used by:

  • SQLQuery module (execute SELECT/INSERT/UPDATE/DELETE)
  • DatabaseSync module (bi-directional sync with external databases)

Supported providers:

| Provider | Config Keys | Requirements |
|----------|-------------|--------------|
| postgres | host, port, database, user, password (encrypted), ssl_mode, connection_timeout, max_open_conns | PostgreSQL 12+ |
| mysql | host, port, database, user, password (encrypted), ssl_mode, connection_timeout, max_open_conns | MySQL 5.7+ or MariaDB 10.2+ |
| sqlite | file_path | SQLite 3 (single-file, limited concurrency) |

Example PostgreSQL configuration:

{
  "name": "prod-analytics-db",
  "kind": "db",
  "company_id": "company-123",
  "config": {
    "provider": "postgres",
    "host": "analytics.db.internal",
    "port": 5432,
    "database": "analytics",
    "user": "flow8_user",
    "password": "[encrypted]",
    "ssl_mode": "require",
    "connection_timeout": 10,
    "max_open_conns": 25
  },
  "is_default": true
}
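As an illustration of how such a config might be turned into a connection, the sketch below renders the PostgreSQL keys as a keyword/value connection string. The `PostgresConfig` type and `DSN` method are hypothetical, and treating `connection_timeout` as seconds (mapped to `connect_timeout`) is an assumption. The password is expected to be decrypted before this point.

```go
package main

import (
	"fmt"
	"strings"
)

// PostgresConfig holds the connection keys from the postgres row of the
// provider table. The type itself is an assumption for illustration.
type PostgresConfig struct {
	Host              string
	Port              int
	Database          string
	User              string
	Password          string // already decrypted
	SSLMode           string
	ConnectionTimeout int // assumed to be seconds
}

// quote wraps a value in single quotes and escapes backslashes and single
// quotes, as the keyword/value connection string format requires.
func quote(v string) string {
	return "'" + strings.NewReplacer(`\`, `\\`, `'`, `\'`).Replace(v) + "'"
}

// DSN renders the config as a keyword/value connection string.
func (c PostgresConfig) DSN() string {
	return fmt.Sprintf(
		"host=%s port=%d dbname=%s user=%s password=%s sslmode=%s connect_timeout=%d",
		quote(c.Host), c.Port, quote(c.Database), quote(c.User),
		quote(c.Password), quote(c.SSLMode), c.ConnectionTimeout,
	)
}

func main() {
	cfg := PostgresConfig{
		Host:              "analytics.db.internal",
		Port:              5432,
		Database:          "analytics",
		User:              "flow8_user",
		Password:          "example-password",
		SSLMode:           "require",
		ConnectionTimeout: 10,
	}
	fmt.Println(cfg.DSN())
}
```

The `max_open_conns` key is not part of the connection string; with Go's `database/sql` it would typically be applied after opening the pool via `(*sql.DB).SetMaxOpenConns`.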

4. Request (request)

Provides HTTP client behavior for external API calls. Used by:

  • HTTPRequest module (make HTTP requests to external APIs)
  • Webhook module (deliver flow results via HTTP)
  • OAuth2 token refresh and external integrations

Supported providers:

| Provider | Config Keys | Use Case |
|----------|-------------|----------|
| default | timeout_seconds, max_retries, retry_backoff_ms, verify_ssl | Standard HTTP with configurable timeouts and retries |
| custom_tls | timeout_seconds, max_retries, retry_backoff_ms, verify_ssl, client_cert (encrypted), client_key (encrypted), ca_cert | mTLS for APIs requiring client certificates |

Example configuration:

{
  "name": "secure-api-requests",
  "kind": "request",
  "company_id": "company-123",
  "config": {
    "provider": "default",
    "timeout_seconds": 30,
    "max_retries": 3,
    "retry_backoff_ms": 1000,
    "verify_ssl": true
  },
  "is_default": true
}

5. Console (console)

Handles structured logging and debugging output during flow execution. Used by:

  • Debug module (log execution state for troubleshooting)
  • Background job logging and error reporting

Supported providers:

| Provider | Config Keys | Use Case |
|----------|-------------|----------|
| stdout | min_level (debug/info/warn/error) | Direct console output, useful for container logs |
| file | log_file_path, min_level, max_size_mb, max_backups | Persistent file logging with rotation |
| structured | format (json/logfmt), min_level | Structured logging for log aggregation systems (Datadog, Splunk, ELK) |

Per-Layer Component Override

A flowlet can override its parent flow’s component selections via the ComponentConfigIds map:

type DBFlowlet struct {
	Name               string
	ModuleRef          string
	ComponentConfigIds map[string]string // kind → config name
	Timeout            time.Duration
	// ... other fields
}

Example: Use a specific AI provider for a particular step:

{
  "name": "extract-entities",
  "module_ref": "text-extraction",
  "component_config_ids": {
    "ai": "claude-for-extraction"
  },
  "timeout": "30s"
}

This is resolved at execution time in layer_service.go:

func (s *LayerService) Execute(ctx context.Context, play *model.DBPlay, flowlet *model.DBFlowlet) error {
	var (
		aiComponent Component
		err         error
	)
	// 1. Check whether the flowlet overrides the AI component.
	if name := flowlet.ComponentConfigIds["ai"]; name != "" {
		aiComponent, err = s.ComponentResolver.Resolve("ai", name, play.CompanyID)
	} else {
		// 2. Otherwise fall back to the company's default AI component.
		aiComponent, err = s.ComponentResolver.ResolveDefault("ai", play.CompanyID)
	}
	if err != nil {
		return fmt.Errorf("resolve ai component: %w", err)
	}
	// 3. Execute the module with the selected component.
	// ... execute module with aiComponent
	_ = aiComponent // placeholder until the module call is filled in
	return nil
}

Configuration Examples

Multi-AI-Provider Setup

A company wants to:

  1. Use GPT-4 for sensitive analysis (expensive but accurate)
  2. Use Mistral for bulk summarization (cheaper)
  3. Use self-hosted Ollama for development/testing

Configuration:

[
  {
    "name": "gpt4-premium",
    "kind": "ai",
    "company_id": "company-123",
    "config": {
      "provider": "openai",
      "api_key": "[encrypted: sk-...]",
      "model": "gpt-4-turbo-preview",
      "temperature": 0.3
    },
    "is_default": false
  },
  {
    "name": "mistral-budget",
    "kind": "ai",
    "company_id": "company-123",
    "config": {
      "provider": "mistral",
      "api_key": "[encrypted: ...]",
      "model": "mistral-large",
      "temperature": 0.7
    },
    "is_default": true
  },
  {
    "name": "ollama-local",
    "kind": "ai",
    "company_id": "company-123",
    "config": {
      "provider": "ollama",
      "base_url": "http://ollama:11434",
      "model": "neural-chat:7b"
    },
    "is_default": false
  }
]

Flow configuration:

{
  "name": "document-analysis-pipeline",
  "flowlets": [
    {
      "name": "sensitive-data-extraction",
      "module_ref": "text-extraction",
      "component_config_ids": {
        "ai": "gpt4-premium"
      }
    },
    {
      "name": "bulk-summarize",
      "module_ref": "document-summary",
      "component_config_ids": {
        "ai": "mistral-budget"
      }
    }
  ]
}

Regional Storage Setup

A company operates in both the EU and the US and must keep data in-region to meet residency requirements:

[
  {
    "name": "s3-eu-central",
    "kind": "storage",
    "company_id": "company-123",
    "config": {
      "provider": "s3",
      "bucket": "flow8-eu-artifacts",
      "region": "eu-central-1",
      "access_key_id": "[encrypted]",
      "secret_access_key": "[encrypted]"
    },
    "is_default": true
  },
  {
    "name": "s3-us-east",
    "kind": "storage",
    "company_id": "company-123",
    "config": {
      "provider": "s3",
      "bucket": "flow8-us-artifacts",
      "region": "us-east-1",
      "access_key_id": "[encrypted]",
      "secret_access_key": "[encrypted]"
    },
    "is_default": false
  }
]

Flows processing EU data use the default (s3-eu-central), while flows that set component_config_ids: { "storage": "s3-us-east" } use the US bucket.

Adding New Component Implementations

To add support for a new provider (e.g., a proprietary AI service), implement the component interface in pkg/components/:

// In pkg/components/ai_custom.go
type CustomAIComponent struct {
	APIKey  string
	BaseURL string
	Model   string
}

// Chat sends the messages to the custom service and returns the reply text.
func (c *CustomAIComponent) Chat(ctx context.Context, messages []Message) (string, error) {
	// Implementation
	return "", errors.New("not implemented")
}

// Register in provider
func ProvideAIComponent(config *ComponentConfig) (Component, error) {
	if config.Provider == "custom" {
		return &CustomAIComponent{
			APIKey:  decrypt(config.APIKey),
			BaseURL: config.BaseURL,
			Model:   config.Model,
		}, nil
	}
	// ... handle other providers
	return nil, fmt.Errorf("unsupported AI provider %q", config.Provider)
}

Then register a component config that uses it via the API:

POST /api/v1/components
{
  "kind": "ai",
  "name": "custom-service",
  "config": {
    "provider": "custom",
    "api_key": "...",
    "base_url": "...",
    "model": "..."
  }
}

Component Lifecycle

  1. Registration: Administrator creates component config via API or MongoDB
  2. Caching: Component is loaded into memory and cached with TTL
  3. Discovery: At layer execution, the component resolver picks the best match (a layer-level override takes precedence over the default)
  4. Instantiation: Component is instantiated with decrypted config
  5. Usage: Module executes with the component
  6. Invalidation: If config is updated, cache is cleared and next use loads new config

Best Practices

  1. Use defaults wisely: Set a sensible default for each kind to reduce cognitive load
  2. Document overrides: Use clear naming (e.g., claude-for-extraction) to explain why flows override defaults
  3. Monitor costs: Track AI token usage and storage bytes per component
  4. Rotate credentials: Use managed secrets where possible (AWS Secrets Manager, HashiCorp Vault)
  5. Test provider failover: Verify that switching from one provider to another doesn’t break flows
  6. Version components: Include version info in component names if you maintain multiple versions of a provider