Configuration Guide

InputLayer uses a hierarchical configuration system. Settings are read from the following sources, and later sources override earlier ones:

  1. config.toml - Default configuration file
  2. config.local.toml - Local overrides (git-ignored)
  3. Environment variables (INPUTLAYER_* prefix)

Configuration File Locations

InputLayer looks for config files in this order:

  1. ./config.toml (current directory)
  2. ./config.local.toml (local overrides, git-ignored)
  3. --config <path> (CLI flag for custom path)

Environment variables (INPUTLAYER_* prefix) are then merged on top of the file configuration, so they take precedence over values set in any file.
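The layering described above can be pictured as a recursive merge of nested tables. The following Python sketch is illustrative only and is not InputLayer's loader. The names deep_merge and merge_layers are hypothetical, and each layer is assumed to be a dict that has already been parsed from TOML (for example with Python's tomllib) or built from environment variables.

```python
from functools import reduce


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win.

    Nested tables are merged key by key instead of being replaced wholesale,
    so an override file only needs to list the settings it changes.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_layers(*layers: dict) -> dict:
    """Merge layers from lowest to highest precedence."""
    return reduce(deep_merge, layers, {})


# Hypothetical layers: config.toml, config.local.toml, environment variables.
defaults = {"storage": {"data_dir": "./data",
                        "persist": {"durability_mode": "immediate",
                                    "buffer_size": 10000}}}
local = {"storage": {"persist": {"durability_mode": "async"}}}
env = {"storage": {"data_dir": "/var/lib/inputlayer/data"}}

effective = merge_layers(defaults, local, env)
```

In this sketch, effective keeps buffer_size from the defaults, takes durability_mode from the local file, and takes data_dir from the environment layer.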

Complete Configuration Reference

# =============================================================================
# STORAGE CONFIGURATION
# =============================================================================
[storage]
# Base directory for all knowledge graph data
data_dir = "./data"

# Default knowledge graph name (created on startup)
default_knowledge_graph = "default"

# Automatically create knowledge graphs when accessed
auto_create_knowledge_graphs = false

# Maximum number of knowledge graphs (default: 1000)
max_knowledge_graphs = 1000

# -----------------------------------------------------------------------------
# Legacy Persistence Settings
# -----------------------------------------------------------------------------
[storage.persistence]
# Storage format: parquet, csv, bincode
# - parquet: Columnar format, good compression, recommended for production
# - csv: Human-readable, no compression, good for debugging
# - bincode: Binary format, Rust-specific, fast serialization
format = "parquet"

# Compression: snappy, gzip, none
# - snappy: Fast compression/decompression, good ratio
# - gzip: Better compression ratio, slower
# - none: No compression
compression = "snappy"

# Auto-save interval in seconds (0 = manual save only)
auto_save_interval = 0

# Enable write-ahead logging for crash recovery
enable_wal = true

# -----------------------------------------------------------------------------
# DD-Native Persist Layer (Recommended)
# -----------------------------------------------------------------------------
[storage.persist]
# Enable the DD-native persistence layer
enabled = true

# Buffer size before flushing to disk (number of updates)
buffer_size = 10000

# Durability mode: immediate, batched, async
# - immediate: Sync to disk on each write (safest, slowest)
# - batched: Periodic sync (balanced performance/safety)
# - async: Fire-and-forget (fastest, may lose recent data on crash)
durability_mode = "immediate"

# Compaction window (reserved, not yet implemented; 0 = keep all)
compaction_window = 0

# Maximum WAL size in bytes before forced flush (default: 64 MB)
max_wal_size_bytes = 67108864

# Auto-compact when a shard accumulates this many batch files (0 = disabled)
auto_compact_threshold = 10

# How often to check for auto-compaction, in seconds (0 = disabled)
auto_compact_interval_secs = 300

# -----------------------------------------------------------------------------
# Performance Tuning
# -----------------------------------------------------------------------------
[storage.performance]
# Initial capacity for in-memory collections
initial_capacity = 10000

# Batch size for bulk operations
batch_size = 1000

# Enable async I/O operations
async_io = true

# Number of worker threads (0 = use all CPU cores)
num_threads = 0

# Query timeout in milliseconds (0 = no timeout)
query_timeout_ms = 30000

# Maximum query text size in bytes
max_query_size_bytes = 1048576

# Maximum tuples per single insert
max_insert_tuples = 10000

# Maximum string value size in bytes
max_string_value_bytes = 65536

# Maximum rows returned per query result (0 = unlimited)
max_result_rows = 100000

# Log queries slower than this (ms, 0 = disabled)
slow_query_log_ms = 5000

# Maximum query cost budget (0 = unlimited)
max_query_cost = 0

# =============================================================================
# QUERY OPTIMIZATION
# =============================================================================
[optimization]
# Enable join order planning
enable_join_planning = true

# Enable SIP (Sideways Information Passing) rewriting
enable_sip_rewriting = true

# Enable subplan sharing across rules
enable_subplan_sharing = true

# Enable boolean specialization optimizations
enable_boolean_specialization = true

# Enable Magic Sets demand-driven rewriting for recursive queries
enable_magic_sets = true

# =============================================================================
# LOGGING
# =============================================================================
[logging]
# Log level: trace, debug, info, warn, error
level = "info"

# Log format: text, json
format = "text"

# =============================================================================
# HTTP SERVER (WebSocket API)
# =============================================================================
[http]
# Enable HTTP server for WebSocket API access
enabled = true

# Bind address
host = "127.0.0.1"

# Port number
port = 8080

# CORS allowed origins (empty = same-origin only, unless cors_allow_all is true)
cors_origins = []

# Allow all CORS origins (use only in development)
cors_allow_all = false

# WebSocket idle timeout in milliseconds (default: 5 minutes)
ws_idle_timeout_ms = 300000

# Graceful shutdown timeout in seconds (default: 30)
shutdown_timeout_secs = 30

# Stats endpoint timeout in seconds (default: 5)
stats_timeout_secs = 5

# -----------------------------------------------------------------------------
# Web GUI Dashboard
# -----------------------------------------------------------------------------
[http.gui]
# Enable web-based GUI dashboard
enabled = true

# Directory containing GUI static files
static_dir = "./gui/dist"

# -----------------------------------------------------------------------------
# Authentication (always enabled)
# -----------------------------------------------------------------------------
[http.auth]
# On first boot, an admin user and bootstrap API key are created.
# Set a known password, or omit to auto-generate one (printed to stderr).
# bootstrap_admin_password = "your-secure-password"

# Session timeout in seconds (default: 24 hours)
session_timeout_secs = 86400

# =============================================================================
# RATE LIMITING
# =============================================================================
[http.rate_limit]
# Maximum total connections (0 = unlimited)
max_connections = 10000

# Maximum WebSocket connections (0 = unlimited)
max_ws_connections = 5000

# Maximum WebSocket messages per second per connection (0 = unlimited)
ws_max_messages_per_sec = 1000

# Maximum WebSocket connection lifetime in seconds (0 = unlimited, default: 24h)
ws_max_lifetime_secs = 86400

# Maximum HTTP requests per second per IP address (0 = unlimited)
per_ip_max_rps = 0

# Notification ring buffer size for reconnect replay
notification_buffer_size = 4096

Environment Variables

All config options can be overridden with environment variables using the INPUTLAYER_ prefix:

# Storage settings
export INPUTLAYER_STORAGE__DATA_DIR=/var/lib/inputlayer/data
export INPUTLAYER_STORAGE__DEFAULT_KNOWLEDGE_GRAPH=mydb

# Persistence
export INPUTLAYER_STORAGE__PERSIST__DURABILITY_MODE=batched
export INPUTLAYER_STORAGE__PERSIST__BUFFER_SIZE=50000

# HTTP Server
export INPUTLAYER_HTTP__ENABLED=true
export INPUTLAYER_HTTP__PORT=9090

# Authentication
export INPUTLAYER_ADMIN_PASSWORD=your-secure-password   # Admin password (first boot only)
export INPUTLAYER_API_KEY=your-api-key                  # CLI client authentication

# Performance limits
export INPUTLAYER_STORAGE__PERFORMANCE__MAX_RESULT_ROWS=50000

# Rate limiting
export INPUTLAYER_HTTP__RATE_LIMIT__WS_MAX_MESSAGES_PER_SEC=500

# Logging
export INPUTLAYER_LOGGING__LEVEL=debug

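Note: Use double underscores (__) to separate nested config sections. For example, durability_mode under [storage.persist] becomes INPUTLAYER_STORAGE__PERSIST__DURABILITY_MODE.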
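To make the naming rule concrete, here is a minimal Python sketch of how a prefixed variable name could be mapped back to a nested config path. It is an assumption-laden illustration, not InputLayer's parser: env_overrides is a hypothetical helper, values are left as strings (a real loader would presumably convert them to the target type), and variables without a double underscore, such as INPUTLAYER_ADMIN_PASSWORD, simply land at the top level here, which may differ from how InputLayer treats them.

```python
def env_overrides(environ: dict, prefix: str = "INPUTLAYER_") -> dict:
    """Turn PREFIX_A__B__C=value into {"a": {"b": {"c": "value"}}}."""
    result: dict = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # unrelated variable such as PATH
        # Strip the prefix, lowercase, and split on the section separator.
        path = name[len(prefix):].lower().split("__")
        node = result
        for section in path[:-1]:
            node = node.setdefault(section, {})
        node[path[-1]] = value
    return result


sample = {
    "INPUTLAYER_STORAGE__PERSIST__DURABILITY_MODE": "batched",
    "INPUTLAYER_HTTP__PORT": "9090",
    "PATH": "/usr/bin",
}
overrides = env_overrides(sample)
```

Passing os.environ instead of the sample dict would apply the same mapping to the real process environment.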

Common Configurations

Development (Fast Iteration)

[storage]
data_dir = "./dev-data"

[storage.persist]
durability_mode = "async"  # Fast writes, less safe

[logging]
level = "debug"

Production (Safe & Durable)

[storage]
data_dir = "/var/lib/inputlayer/data"

[storage.persistence]
format = "parquet"
compression = "snappy"
enable_wal = true

[storage.persist]
durability_mode = "immediate"
buffer_size = 10000

[storage.performance]
num_threads = 0  # Use all cores

[logging]
level = "warn"
format = "json"

[http]
host = "0.0.0.0"
port = 8080

[http.auth]
# Set a known admin password for production deployments
bootstrap_admin_password = "your-secure-password-here"

[http.rate_limit]
ws_max_messages_per_sec = 500
per_ip_max_rps = 100

High-Throughput Ingestion

[storage.persist]
durability_mode = "batched"
buffer_size = 100000

[storage.performance]
batch_size = 10000
async_io = true
num_threads = 0

Memory-Constrained Environment

[storage.performance]
initial_capacity = 1000
batch_size = 100

[storage.persist]
buffer_size = 1000

Durability Modes Explained

Mode        Write Latency   Crash Safety   Use Case
immediate   High            Full           Financial data, critical records
batched     Medium          Partial        Most production workloads
async       Low             Minimal        Development, analytics pipelines

Immediate Mode

  • Every write syncs to disk before returning
  • No loss of acknowledged writes on crash
  • Highest latency

Batched Mode

  • Writes buffer in memory
  • Periodic sync to disk
  • May lose last batch on crash

Async Mode

  • Writes return immediately
  • Background persistence
  • May lose recent updates on crash
  • Best for high-throughput ingestion where some loss is acceptable
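The difference between the three modes comes down to when buffered writes are forced to disk. The toy append-only log below sketches that idea in Python. It is not InputLayer code: SketchLog and unsynced_after are hypothetical names, the batched trigger is a simple write count rather than the periodic timer described above, and the background sync for async mode is left out entirely.

```python
import os
import tempfile


class SketchLog:
    """Toy append-only log that tracks when each mode calls fsync."""

    def __init__(self, path: str, mode: str, batch_size: int = 3) -> None:
        if mode not in ("immediate", "batched", "async"):
            raise ValueError(f"unknown durability mode: {mode}")
        self.mode = mode
        self.batch_size = batch_size
        self.pending = 0     # writes that a crash right now could lose
        self.sync_count = 0  # number of fsync calls so far
        self._fh = open(path, "ab")

    def write(self, record: bytes) -> None:
        self._fh.write(record + b"\n")
        self.pending += 1
        if self.mode == "immediate":
            self.sync()  # sync before returning to the caller
        elif self.mode == "batched" and self.pending >= self.batch_size:
            self.sync()  # sync once enough writes have accumulated
        # "async": return at once; a background task would sync later.

    def sync(self) -> None:
        self._fh.flush()
        os.fsync(self._fh.fileno())
        self.sync_count += 1
        self.pending = 0

    def close(self) -> None:
        self._fh.close()


def unsynced_after(mode: str, writes: int) -> int:
    """Write `writes` records to a temp file and report the unsynced count."""
    with tempfile.TemporaryDirectory() as tmp:
        log = SketchLog(os.path.join(tmp, "wal.log"), mode)
        for i in range(writes):
            log.write(f"update-{i}".encode())
        pending = log.pending
        log.close()
    return pending
```

With the default batch size of 3, five writes leave 0 unsynced records in immediate mode, 2 in batched mode, and 5 in async mode, which mirrors the Full, Partial, and Minimal crash-safety levels in the table.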

Storage Formats

Format    Size       Speed        Use Case
parquet   Smallest   Fast reads   Production, analytics
csv       Largest    Slow         Debugging, interop
bincode   Small      Fastest      Rust-only deployments

Verifying Configuration

Check your effective configuration in the REPL:

.status
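As a complement to .status, settings that accept a fixed set of values can be checked offline before the server starts. The Python sketch below is a hypothetical helper, not part of InputLayer. The allowed values are copied from the reference above, check_enums is an invented name, and the config argument is assumed to be a dict already parsed from a TOML file.

```python
# Allowed values as listed in the configuration reference.
ALLOWED = {
    ("storage", "persistence", "format"): {"parquet", "csv", "bincode"},
    ("storage", "persistence", "compression"): {"snappy", "gzip", "none"},
    ("storage", "persist", "durability_mode"): {"immediate", "batched", "async"},
    ("logging", "level"): {"trace", "debug", "info", "warn", "error"},
    ("logging", "format"): {"text", "json"},
}


def lookup(config: dict, path: tuple):
    """Follow `path` through nested tables; return None if any part is absent."""
    node = config
    for part in path:
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def check_enums(config: dict) -> list:
    """Return one message per setting whose value is outside the documented set."""
    problems = []
    for path, allowed in ALLOWED.items():
        value = lookup(config, path)
        if value is None:
            continue  # setting not present, so the default applies
        if not isinstance(value, str) or value not in allowed:
            name = ".".join(path)
            problems.append(f"{name} = {value!r}; expected one of {sorted(allowed)}")
    return problems
```

An empty list means every listed setting that is present uses a documented value. Settings that are absent are skipped on the assumption that the defaults apply.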