Last updated: Sep 1, 2025, 01:10 PM UTC

Docker Multi-Service Setup Guide - Single Container Architecture

Purpose: Complete guide for deploying Sasha Studio as a single Docker container with all services integrated
Target: Production-ready deployment with simplified operations and maintenance
Architecture: All services managed by Supervisord within a single container


Architecture Overview

Single Container Philosophy

Sasha Studio uses a monolithic container approach to maximize simplicity while providing enterprise-grade functionality. All services run within a single container, managed by Supervisord, with internal communication via localhost.

graph TB
    subgraph "SASHA STUDIO CONTAINER"
        subgraph "Entry Points"
            A[Nginx :80]
            B[Health Check :8080]
        end
        subgraph "Application Layer"
            C[Sasha API :3000]
            D[WebSocket Server :3001]
            E[File Handler :3002]
        end
        subgraph "LLM Integration"
            F[LLxprt CLI Service]
            G[Model Router]
        end
        subgraph "Data Services"
            H[PostgreSQL :5432]
            I[Redis :6379]
        end
        subgraph "Publishing"
            J[Doc Builder :8081]
            K[Static Server :8082]
        end
        subgraph "Monitoring"
            L[Simple Metrics :9090]
            M[Log Collector]
        end
        subgraph "Process Management"
            N[Supervisord]
            O[Init Script]
        end
    end
    subgraph "External"
        P[Claude API]
        Q[OpenAI API]
        R[File Volumes]
    end
    A --> C
    A --> J
    A --> L
    C --> F
    F --> P
    F --> Q
    C --> H
    C --> I
    N --> C
    N --> D
    N --> H
    N --> I
    N --> J
    O --> N
    style A fill:#e3f2fd,stroke:#1976d2
    style C fill:#f3e5f5,stroke:#7b1fa2
    style F fill:#e8f5e9,stroke:#388e3c
    style N fill:#fff3e0,stroke:#f57c00

Multi-Stage Dockerfile Implementation

Complete Dockerfile

# Multi-stage build for Sasha Studio Single Container
FROM node:20-slim AS base

# Install system dependencies
RUN apt-get update && apt-get install -y \
    # Core system tools
    curl \
    wget \
    git \
    gnupg \
    ca-certificates \
    # Process management
    supervisor \
    # Web server
    nginx \
    # Database
    postgresql-14 \
    postgresql-client-14 \
    # Cache
    redis-server \
    # Build tools
    python3 \
    python3-pip \
    make \
    g++ \
    # Monitoring tools
    htop \
    lsof \
    netstat-nat \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get clean

# Create application user for security
# (the node base image already ships a "node" user at UID/GID 1000, so remove it first)
RUN userdel -r node 2>/dev/null || true \
    && groupadd --gid 1000 sasha \
    && useradd --uid 1000 --gid sasha --shell /bin/bash --create-home sasha

# Configure npm for non-root user
USER sasha
ENV NPM_CONFIG_PREFIX=/home/sasha/.npm-global
ENV PATH=$PATH:/home/sasha/.npm-global/bin

# Install LLxprt Code CLI globally
RUN npm install -g @vybestack/llxprt-code@latest

# Verify LLxprt installation
RUN llxprt --version

# Switch back to root for system configuration
USER root

FROM base AS application-builder

# Set working directory
WORKDIR /app

# Copy package files for dependency installation
COPY package*.json ./
COPY frontend/package*.json ./frontend/
COPY backend/package*.json ./backend/

# Install all dependencies
RUN npm ci --include=dev
RUN cd frontend && npm ci --include=dev
RUN cd backend && npm ci --include=dev

# Copy source code
COPY . .

# Build frontend
RUN cd frontend && npm run build

# Build backend (if TypeScript)
RUN cd backend && npm run build 2>/dev/null || echo "No build step for backend"

FROM base AS production

# Copy built application from builder stage
COPY --from=application-builder --chown=sasha:sasha /app/frontend/dist /app/frontend/dist
COPY --from=application-builder --chown=sasha:sasha /app/backend /app/backend
COPY --from=application-builder --chown=sasha:sasha /app/guides /app/guides
COPY --from=application-builder --chown=sasha:sasha /app/scripts /app/scripts
COPY --from=application-builder --chown=sasha:sasha /app/package*.json /app/

# Install only production dependencies
WORKDIR /app
RUN npm ci --omit=dev --no-audit --no-fund

# Create required directories with proper permissions
RUN mkdir -p \
    /app/data \
    /app/logs \
    /app/config \
    /app/tmp \
    /app/uploads \
    /var/lib/postgresql/14/main \
    /var/log/postgresql \
    /var/log/redis \
    /var/log/nginx \
    /var/log/supervisor \
    /run/postgresql \
    && chown -R sasha:sasha /app \
    && chown -R postgres:postgres /var/lib/postgresql \
    && chown -R postgres:postgres /var/log/postgresql \
    && chown -R redis:redis /var/log/redis \
    && chown -R www-data:www-data /var/log/nginx

# Copy configuration files
COPY docker/nginx.conf /etc/nginx/nginx.conf
COPY docker/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
COPY docker/postgresql.conf /etc/postgresql/14/main/postgresql.conf
COPY docker/redis.conf /etc/redis/redis.conf
COPY docker/init.sh /usr/local/bin/init.sh
COPY docker/health-check.sh /usr/local/bin/health-check.sh

# Make scripts executable
RUN chmod +x /usr/local/bin/init.sh /usr/local/bin/health-check.sh

# PostgreSQL is initialized on first boot by /usr/local/bin/init.sh, which runs
# initdb, adds localhost trust rules to pg_hba.conf, and creates the application
# database and user. Running initdb here at build time would leave PG_VERSION in
# place and cause init.sh to skip that first-boot setup.

# Set environment variables
ENV NODE_ENV=production
ENV SASHA_HOME=/app
ENV SASHA_USER=sasha

# Expose main HTTP port
EXPOSE 80

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD /usr/local/bin/health-check.sh

# Volume mounts for persistent data (including the PostgreSQL data directory)
VOLUME ["/app/data", "/app/logs", "/app/config", "/var/lib/postgresql"]

# Start all services via init script
CMD ["/usr/local/bin/init.sh"]

Service Configuration Files

Supervisord Configuration

; /etc/supervisor/conf.d/supervisord.conf
[supervisord]
nodaemon=true
user=root
logfile=/app/logs/supervisord.log
logfile_maxbytes=50MB
logfile_backups=3
loglevel=info
pidfile=/var/run/supervisord.pid

[unix_http_server]
file=/var/run/supervisor.sock
chmod=0700
username=admin
password=change_this_password

[supervisorctl]
serverurl=unix:///var/run/supervisor.sock
username=admin
password=change_this_password

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

; PostgreSQL Database
[program:postgresql]
command=/usr/lib/postgresql/14/bin/postgres -D /var/lib/postgresql/14/main -c config_file=/etc/postgresql/14/main/postgresql.conf
user=postgres
autostart=true
autorestart=true
stderr_logfile=/app/logs/postgresql-error.log
stdout_logfile=/app/logs/postgresql-access.log
stdout_logfile_maxbytes=10MB
stderr_logfile_maxbytes=10MB
priority=100

; Redis Cache
[program:redis]
command=/usr/bin/redis-server /etc/redis/redis.conf
user=redis
autostart=true
autorestart=true
stderr_logfile=/app/logs/redis-error.log
stdout_logfile=/app/logs/redis-access.log
stdout_logfile_maxbytes=10MB
stderr_logfile_maxbytes=10MB
priority=200

; Sasha API Server
[program:sasha-api]
command=/usr/bin/node /app/backend/server.js
directory=/app/backend
user=sasha
autostart=true
autorestart=true
environment=NODE_ENV=production,PORT=3000,DATABASE_URL="postgresql://sasha:sasha_db_password_change_this@localhost:5432/sasha",REDIS_URL="redis://:sasha_redis_password_change_this@localhost:6379"
stderr_logfile=/app/logs/sasha-api-error.log
stdout_logfile=/app/logs/sasha-api-access.log
stdout_logfile_maxbytes=10MB
stderr_logfile_maxbytes=10MB
priority=300

; WebSocket Server (if separate)
[program:websocket-server]
command=/usr/bin/node /app/backend/websocket-server.js
directory=/app/backend
user=sasha
autostart=true
autorestart=true
environment=NODE_ENV=production,PORT=3001
stderr_logfile=/app/logs/websocket-error.log
stdout_logfile=/app/logs/websocket-access.log
stdout_logfile_maxbytes=10MB
stderr_logfile_maxbytes=10MB
priority=400

; Doc Builder Service
[program:doc-builder]
command=/usr/bin/node /app/backend/doc-builder-server.js
directory=/app/backend
user=sasha
autostart=true
autorestart=true
environment=NODE_ENV=production,PORT=8081
stderr_logfile=/app/logs/doc-builder-error.log
stdout_logfile=/app/logs/doc-builder-access.log
stdout_logfile_maxbytes=10MB
stderr_logfile_maxbytes=10MB
priority=500

; Simple Metrics Server
[program:metrics-server]
command=/usr/bin/node /app/backend/metrics-server.js
directory=/app/backend
user=sasha
autostart=true
autorestart=true
environment=NODE_ENV=production,PORT=9090
stderr_logfile=/app/logs/metrics-error.log
stdout_logfile=/app/logs/metrics-access.log
stdout_logfile_maxbytes=10MB
stderr_logfile_maxbytes=10MB
priority=600

; Nginx Web Server (starts last)
[program:nginx]
command=/usr/sbin/nginx -g "daemon off;"
autostart=true
autorestart=true
stderr_logfile=/app/logs/nginx-error.log
stdout_logfile=/app/logs/nginx-access.log
stdout_logfile_maxbytes=10MB
stderr_logfile_maxbytes=10MB
priority=900

; Process monitoring and restart script
; (runs as root so it can call supervisorctl and su to other service users)
[program:service-monitor]
command=/app/scripts/service-monitor.sh
user=root
autostart=true
autorestart=true
stderr_logfile=/app/logs/monitor-error.log
stdout_logfile=/app/logs/monitor-access.log
priority=999
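
Because every service is a Supervisord program, day-to-day operations reduce to supervisorctl calls inside the container. A few illustrative commands, assuming the container name sasha-studio from the Compose file below:

# Show the state of every managed service
docker exec sasha-studio supervisorctl status

# Restart a single service without touching the others
docker exec sasha-studio supervisorctl restart sasha-api

# Tail a service's stderr log as captured by Supervisord
docker exec sasha-studio supervisorctl tail -f sasha-api stderr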

Nginx Configuration

# /etc/nginx/nginx.conf
user www-data;
worker_processes auto;
pid /run/nginx.pid;

events {
    worker_connections 1024;
    use epoll;
    multi_accept on;
}

http {
    # Basic settings
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    client_max_body_size 100M;
    
    # MIME types
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    
    # Logging
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                   '$status $body_bytes_sent "$http_referer" '
                   '"$http_user_agent" "$http_x_forwarded_for"';
    
    access_log /app/logs/nginx-access.log main;
    error_log /app/logs/nginx-error.log warn;
    
    # Gzip compression
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_comp_level 6;
    gzip_types
        text/plain
        text/css
        text/xml
        text/javascript
        application/json
        application/javascript
        application/xml+rss
        application/atom+xml
        image/svg+xml;
    
    # Rate limiting
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
    limit_req_zone $binary_remote_addr zone=upload:10m rate=2r/s;
    
    # SSL/Security headers (for production with SSL termination)
    add_header X-Frame-Options DENY always;
    add_header X-Content-Type-Options nosniff always;
    add_header X-XSS-Protection "1; mode=block" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
    
    # Upstream servers
    upstream sasha_api {
        server 127.0.0.1:3000 max_fails=3 fail_timeout=30s;
        keepalive 32;
    }
    
    upstream websocket_server {
        server 127.0.0.1:3001 max_fails=3 fail_timeout=30s;
    }
    
    upstream doc_builder {
        server 127.0.0.1:8081 max_fails=3 fail_timeout=30s;
    }
    
    upstream metrics_server {
        server 127.0.0.1:9090 max_fails=3 fail_timeout=30s;
    }
    
    # Main server configuration
    server {
        listen 80 default_server;
        server_name _;
        root /app/frontend/dist;
        index index.html;
        
        # Security
        server_tokens off;
        
        # Static files (frontend)
        location / {
            try_files $uri $uri/ @fallback;
            expires 1h;
            add_header Cache-Control "public, max-age=3600";
        }
        
        # Fallback to index.html for SPA routing
        location @fallback {
            rewrite ^.*$ /index.html last;
        }
        
        # Health check endpoint
        location /health {
            access_log off;
            return 200 "OK\n";
            add_header Content-Type text/plain;
        }
        
        # API endpoints
        location /api/ {
            limit_req zone=api burst=20 nodelay;
            
            proxy_pass http://sasha_api;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection 'upgrade';
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_cache_bypass $http_upgrade;
            proxy_read_timeout 300s;
            proxy_connect_timeout 10s;
        }
        
        # WebSocket connections
        location /ws {
            proxy_pass http://websocket_server;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_read_timeout 86400;
        }
        
        # File uploads
        location /api/files/upload {
            limit_req zone=upload burst=5 nodelay;
            client_max_body_size 100M;
            
            proxy_pass http://sasha_api;
            proxy_request_buffering off;
            proxy_http_version 1.1;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_read_timeout 300s;
        }
        
        # Doc builder service
        location /doc-builder/ {
            proxy_pass http://doc_builder/;
            proxy_http_version 1.1;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
        
        # Metrics endpoint (protected)
        location /metrics {
            allow 127.0.0.1;
            allow 10.0.0.0/8;
            allow 172.16.0.0/12;
            allow 192.168.0.0/16;
            deny all;
            
            proxy_pass http://metrics_server/metrics;
            proxy_http_version 1.1;
            proxy_set_header Host $host;
        }
        
        # Admin interface (if needed)
        location /admin {
            auth_basic "Sasha Studio Admin";
            auth_basic_user_file /app/config/.htpasswd;
            
            proxy_pass http://sasha_api/admin;
            proxy_http_version 1.1;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
        
        # Deny access to sensitive files
        location ~ /\. {
            deny all;
            access_log off;
            log_not_found off;
        }
        
        location ~ \.(conf|log)$ {
            deny all;
            access_log off;
        }
    }
}
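
After editing the Nginx configuration it can be validated and reloaded without restarting the whole container, and the routes defined above can be exercised from the host. A short sketch:

# Validate the configuration syntax inside the container
docker exec sasha-studio nginx -t

# Reload Nginx via Supervisord after a config change
docker exec sasha-studio supervisorctl restart nginx

# Exercise the main routes from the host
curl -i http://localhost/health
curl -i http://localhost/api/health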

PostgreSQL Configuration

# /etc/postgresql/14/main/postgresql.conf
# PostgreSQL configuration for single container deployment

# Connection settings
listen_addresses = 'localhost'
port = 5432
max_connections = 200
superuser_reserved_connections = 3

# Memory settings (adjust based on container resources)
shared_buffers = 256MB
effective_cache_size = 1GB
maintenance_work_mem = 64MB
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200

# Logging
log_destination = 'stderr'
logging_collector = on
log_directory = '/var/log/postgresql'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_file_mode = 0600
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 10MB
log_min_duration_statement = 1000
log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 0

# Performance
checkpoint_timeout = 10min
checkpoint_completion_target = 0.9
wal_level = replica
max_wal_senders = 3
archive_mode = on
archive_command = 'cp %p /app/data/wal_archive/%f'

# Locale
datestyle = 'iso, mdy'
timezone = 'UTC'
lc_messages = 'en_US.utf8'
lc_monetary = 'en_US.utf8'
lc_numeric = 'en_US.utf8'
lc_time = 'en_US.utf8'
default_text_search_config = 'pg_catalog.english'
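
To confirm these settings are active once the container is up, query them through psql (the localhost trust rules written by init.sh make this password-free inside the container):

# Check that the tuned values were loaded
docker exec sasha-studio su - sasha -c "psql -h localhost -U sasha -d sasha -c 'SHOW shared_buffers;'"
docker exec sasha-studio su - sasha -c "psql -h localhost -U sasha -d sasha -c 'SHOW max_connections;'"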

Redis Configuration

# /etc/redis/redis.conf
# Redis configuration for single container deployment

# Basic settings
bind 127.0.0.1
port 6379
timeout 300

# Memory management
maxmemory 512mb
maxmemory-policy allkeys-lru
maxmemory-samples 5

# Persistence
save 900 1
save 300 10
save 60 10000
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /app/data/redis

# Logging
loglevel notice
logfile /app/logs/redis.log
syslog-enabled no

# Security
requirepass sasha_redis_password_change_this

# Performance
tcp-keepalive 60
tcp-backlog 511
databases 16
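
The effective Redis settings can be verified the same way. Note that requirepass means every redis-cli call needs -a (or the REDISCLI_AUTH environment variable):

# Confirm Redis is up and the memory policy is applied
docker exec sasha-studio redis-cli -a sasha_redis_password_change_this ping
docker exec sasha-studio redis-cli -a sasha_redis_password_change_this CONFIG GET maxmemory-policy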

Initialization Scripts

Main Init Script

#!/bin/bash
# /usr/local/bin/init.sh
# Sasha Studio initialization script

set -e

echo "πŸš€ Starting Sasha Studio Single Container..."

# Create required directories
mkdir -p /app/data/{postgresql,redis,uploads,tmp,wal_archive}
mkdir -p /app/logs
mkdir -p /app/config
mkdir -p /var/run/postgresql

# Set proper permissions
chown -R postgres:postgres /var/lib/postgresql
chown -R postgres:postgres /var/log/postgresql
chown -R postgres:postgres /var/run/postgresql
chown -R redis:redis /app/data/redis
chown -R sasha:sasha /app/data/uploads
chown -R sasha:sasha /app/data/tmp
chown -R sasha:sasha /app/logs
chown -R sasha:sasha /app/config

# Initialize PostgreSQL if needed
if [ ! -f /var/lib/postgresql/14/main/PG_VERSION ]; then
    echo "πŸ—„οΈ  Initializing PostgreSQL database..."
    su - postgres -c "/usr/lib/postgresql/14/bin/initdb -D /var/lib/postgresql/14/main"
    
    # Allow password-free local connections (container-internal only)
    echo "host all all 127.0.0.1/32 trust" >> /var/lib/postgresql/14/main/pg_hba.conf
    echo "local all all trust" >> /var/lib/postgresql/14/main/pg_hba.conf
    
    # Start PostgreSQL temporarily to create database
    su - postgres -c "/usr/lib/postgresql/14/bin/pg_ctl -D /var/lib/postgresql/14/main -l /var/log/postgresql/postgresql.log start"
    sleep 5
    
    # Create application database and user
    su - postgres -c "createuser -s sasha"
    su - postgres -c "createdb -O sasha sasha"
    su - postgres -c "psql -c \"ALTER USER sasha PASSWORD 'sasha_db_password_change_this';\""
    
    # Stop PostgreSQL (Supervisord will manage it)
    su - postgres -c "/usr/lib/postgresql/14/bin/pg_ctl -D /var/lib/postgresql/14/main stop"
fi

# Generate configuration files if they don't exist
if [ ! -f /app/config/.env ]; then
    echo "βš™οΈ  Generating configuration files..."
    cat > /app/config/.env << EOF
NODE_ENV=production
DATABASE_URL=postgresql://sasha:sasha_db_password_change_this@localhost:5432/sasha
REDIS_URL=redis://:sasha_redis_password_change_this@localhost:6379
JWT_SECRET=$(openssl rand -hex 32)
SESSION_SECRET=$(openssl rand -hex 32)
ENCRYPTION_KEY=$(openssl rand -hex 32)
EOF
    chown sasha:sasha /app/config/.env
    chmod 600 /app/config/.env
fi

# Generate admin password for metrics access
if [ ! -f /app/config/.htpasswd ]; then
    echo "πŸ”’ Generating admin credentials..."
    ADMIN_PASSWORD=$(openssl rand -base64 12)
    echo "admin:$(openssl passwd -apr1 $ADMIN_PASSWORD)" > /app/config/.htpasswd
    echo "Admin password: $ADMIN_PASSWORD" > /app/config/admin-credentials.txt
    chmod 600 /app/config/.htpasswd /app/config/admin-credentials.txt
    chown www-data:www-data /app/config/.htpasswd
fi

# Wait for all critical services to be ready
echo "⏳ Waiting for services to initialize..."

# Set up environment variables for all services
export $(cat /app/config/.env | xargs)

# Run database migrations if needed
if [ -f /app/backend/migrations/init.sql ]; then
    echo "πŸ—„οΈ  Running database migrations..."
    su - postgres -c "/usr/lib/postgresql/14/bin/pg_ctl -D /var/lib/postgresql/14/main -l /var/log/postgresql/postgresql.log start"
    sleep 5
    
    su - sasha -c "psql -h localhost -U sasha -d sasha -f /app/backend/migrations/init.sql"
    
    su - postgres -c "/usr/lib/postgresql/14/bin/pg_ctl -D /var/lib/postgresql/14/main stop"
fi

# Initialize LLxprt configuration
echo "πŸ€– Configuring LLxprt CLI..."
su - sasha -c "mkdir -p /home/sasha/.config/llxprt"

# Resolve *_FILE secret references (as passed by docker-compose) into plain variables
if [ -n "$ANTHROPIC_API_KEY_FILE" ] && [ -f "$ANTHROPIC_API_KEY_FILE" ]; then
    ANTHROPIC_API_KEY=$(cat "$ANTHROPIC_API_KEY_FILE")
fi
if [ -n "$OPENAI_API_KEY_FILE" ] && [ -f "$OPENAI_API_KEY_FILE" ]; then
    OPENAI_API_KEY=$(cat "$OPENAI_API_KEY_FILE")
fi

if [ -n "$ANTHROPIC_API_KEY" ] || [ -n "$OPENAI_API_KEY" ]; then
    cat > /home/sasha/.config/llxprt/config.yaml << EOF
stream: true
save: false
clients:
EOF

    if [ -n "$ANTHROPIC_API_KEY" ]; then
        cat >> /home/sasha/.config/llxprt/config.yaml << EOF
  - type: claude
    api_key: ${ANTHROPIC_API_KEY}
EOF
    fi

    if [ -n "$OPENAI_API_KEY" ]; then
        cat >> /home/sasha/.config/llxprt/config.yaml << EOF
  - type: openai
    api_key: ${OPENAI_API_KEY}
EOF
    fi

    chown -R sasha:sasha /home/sasha/.config/llxprt
fi

echo "βœ… Initialization complete. Starting services with Supervisord..."

# Start all services with Supervisord
exec /usr/bin/supervisord -c /etc/supervisor/conf.d/supervisord.conf
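
On first boot the script generates credentials and configuration under /app/config. A quick way to verify those first-boot artifacts from the host, assuming the container name from the Compose file:

# Confirm the generated environment and admin credentials exist
docker exec sasha-studio ls -l /app/config
docker exec sasha-studio cat /app/config/admin-credentials.txt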

Health Check Script

#!/bin/bash
# /usr/local/bin/health-check.sh
# Comprehensive health check for all services

# Function to check if a service is responding
check_service() {
    local service_name=$1
    local url=$2
    local expected_status=${3:-200}
    
    if curl -f -s -o /dev/null -w "%{http_code}" "$url" | grep -q "$expected_status"; then
        echo "βœ… $service_name: OK"
        return 0
    else
        echo "❌ $service_name: FAILED"
        return 1
    fi
}

# Function to check database connectivity
check_database() {
    if su - sasha -c "psql -h localhost -U sasha -d sasha -c 'SELECT 1;'" >/dev/null 2>&1; then
        echo "βœ… PostgreSQL: OK"
        return 0
    else
        echo "❌ PostgreSQL: FAILED"
        return 1
    fi
}

# Function to check Redis connectivity
check_redis() {
    if redis-cli -a sasha_redis_password_change_this ping >/dev/null 2>&1; then
        echo "βœ… Redis: OK"
        return 0
    else
        echo "❌ Redis: FAILED"
        return 1
    fi
}

# Initialize health status
health_status=0

# Check all services
echo "πŸ” Performing health checks..."

# Core services
check_service "Nginx" "http://localhost:80/health" 200 || health_status=1
check_service "Sasha API" "http://localhost:3000/health" 200 || health_status=1
check_service "Doc Builder" "http://localhost:8081/health" 200 || health_status=1
check_service "Metrics" "http://localhost:9090/health" 200 || health_status=1

# Data services
check_database || health_status=1
check_redis || health_status=1

# Check disk space
DISK_USAGE=$(df /app | awk 'NR==2 {print $5}' | sed 's/%//')
if [ "$DISK_USAGE" -gt 85 ]; then
    echo "⚠️  Disk usage: ${DISK_USAGE}% (WARNING)"
    health_status=1
else
    echo "βœ… Disk usage: ${DISK_USAGE}%"
fi

# Check memory usage
MEMORY_USAGE=$(free | awk 'NR==2{printf "%.0f", $3*100/$2}')
if [ "$MEMORY_USAGE" -gt 90 ]; then
    echo "⚠️  Memory usage: ${MEMORY_USAGE}% (WARNING)"
    health_status=1
else
    echo "βœ… Memory usage: ${MEMORY_USAGE}%"
fi

# Check if all critical processes are running
CRITICAL_PROCESSES=("nginx" "postgres" "redis-server" "node")
for process in "${CRITICAL_PROCESSES[@]}"; do
    if pgrep "$process" >/dev/null; then
        echo "βœ… Process $process: Running"
    else
        echo "❌ Process $process: Not found"
        health_status=1
    fi
done

# Output final status
if [ $health_status -eq 0 ]; then
    echo "πŸŽ‰ All health checks passed"
    exit 0
else
    echo "πŸ’₯ Some health checks failed"
    exit 1
fi
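
The same script that backs the Dockerfile HEALTHCHECK can be run by hand, and Docker's own view of the health state is available through inspect:

# Run the full health check manually
docker exec sasha-studio /usr/local/bin/health-check.sh

# Ask Docker for the current health status and recent probe output
docker inspect --format '{{json .State.Health}}' sasha-studio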

Service Monitor Script

#!/bin/bash
# /app/scripts/service-monitor.sh
# Monitor critical services and restart if needed

LOG_FILE="/app/logs/service-monitor.log"

log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}

# Function to check a service and restart it if unresponsive
check_and_restart() {
    local service_name=$1
    local check_command=$2
    local restart_command=$3
    
    if ! eval "$check_command" >/dev/null 2>&1; then
        log "WARNING: $service_name is not responding"
        
        # Restart using the command supplied by the caller
        if eval "$restart_command" >/dev/null 2>&1; then
            log "INFO: Successfully restarted $service_name"
        else
            log "ERROR: Failed to restart $service_name"
        fi
    fi
}

# Monitor loop
while true; do
    # Check API server
    check_and_restart "sasha-api" \
        "curl -f -s http://localhost:3000/health" \
        "supervisorctl restart sasha-api"
    
    # Check WebSocket server
    check_and_restart "websocket-server" \
        "curl -f -s http://localhost:3001/health" \
        "supervisorctl restart websocket-server"
    
    # Check database connectivity
    check_and_restart "postgresql" \
        "su - sasha -c 'psql -h localhost -U sasha -d sasha -c \"SELECT 1;\"'" \
        "supervisorctl restart postgresql"
    
    # Check Redis
    check_and_restart "redis" \
        "redis-cli -a sasha_redis_password_change_this ping" \
        "supervisorctl restart redis"
    
    # Check disk space and log warning
    DISK_USAGE=$(df /app | awk 'NR==2 {print $5}' | sed 's/%//')
    if [ "$DISK_USAGE" -gt 85 ]; then
        log "WARNING: Disk usage is ${DISK_USAGE}%"
    fi
    
    # Sleep for 60 seconds before next check
    sleep 60
done

Environment Configuration Management

Environment Variables Template

# .env.production - Production environment template

# === APPLICATION SETTINGS ===
NODE_ENV=production
PORT=3000
APP_NAME="Sasha Studio"
APP_URL=https://your-domain.com
APP_VERSION=1.0.0

# === DATABASE CONFIGURATION ===
DATABASE_URL=postgresql://sasha:secure_password@localhost:5432/sasha
DATABASE_POOL_MIN=2
DATABASE_POOL_MAX=20
DATABASE_TIMEOUT=30000

# === REDIS CONFIGURATION ===
REDIS_URL=redis://:secure_redis_password@localhost:6379
REDIS_KEY_PREFIX=sasha:
REDIS_TTL=3600

# === SECURITY CONFIGURATION ===
JWT_SECRET=your-super-secure-jwt-secret-with-at-least-32-characters
JWT_EXPIRES_IN=15m
REFRESH_TOKEN_EXPIRES_IN=7d
SESSION_SECRET=your-session-secret-with-at-least-32-characters
ENCRYPTION_KEY=your-encryption-key-for-sensitive-data-32-chars

# === AI/LLM CONFIGURATION ===
ANTHROPIC_API_KEY=your-claude-api-key
OPENAI_API_KEY=your-openai-api-key
GOOGLE_API_KEY=your-google-api-key
OPENROUTER_API_KEY=your-openrouter-api-key

# LLxprt Configuration
LLXPRT_CONFIG_DIR=/home/sasha/.config/llxprt
ENABLE_LOCAL_MODELS=true
OLLAMA_HOST=http://localhost:11434

# Model Routing
DEFAULT_MODEL=claude-3-sonnet
FALLBACK_MODEL=gpt-4o-mini
MAX_TOKENS=4000
TEMPERATURE=0.7

# === FILE STORAGE ===
STORAGE_TYPE=local
UPLOAD_MAX_SIZE=100MB
UPLOAD_ALLOWED_TYPES=pdf,doc,docx,txt,md,csv,xlsx,png,jpg,jpeg
STORAGE_PATH=/app/data/uploads

# Google Cloud Storage (if using GCS)
# GCS_BUCKET=your-storage-bucket
# GOOGLE_APPLICATION_CREDENTIALS=/app/config/service-account.json

# === PUBLISHING CONFIGURATION ===
DOC_BUILDER_ENABLED=true
DOC_BUILDER_PORT=8081
ENABLE_PUBLIC_PUBLISHING=true
DEFAULT_SITE_THEME=default

# Vercel Integration (optional)
# VERCEL_TOKEN=your-vercel-token
# VERCEL_ORG_ID=your-org-id
# VERCEL_PROJECT_ID=your-project-id

# === MONITORING & LOGGING ===
LOG_LEVEL=info
LOG_FORMAT=json
ENABLE_ACCESS_LOGS=true
ENABLE_ERROR_TRACKING=true

# Metrics
METRICS_ENABLED=true
METRICS_PORT=9090
PROMETHEUS_ENABLED=false

# SIEM Integration
SIEM_ENABLED=false
# SIEM_ENDPOINT=https://your-siem-server.com/webhook
# SIEM_API_KEY=your-siem-api-key

# === RATE LIMITING ===
RATE_LIMIT_WINDOW=15m
RATE_LIMIT_MAX=100
RATE_LIMIT_SKIP_SUCCESSFUL=true

# API-specific rate limits
API_RATE_LIMIT_MAX=10
UPLOAD_RATE_LIMIT_MAX=2
WEBSOCKET_CONNECTION_LIMIT=50

# === EMAIL CONFIGURATION (if needed) ===
# SMTP_HOST=smtp.gmail.com
# SMTP_PORT=587
# SMTP_SECURE=true
# SMTP_USER=your-email@gmail.com
# SMTP_PASS=your-app-password
# FROM_EMAIL=noreply@your-domain.com

# === BACKUP CONFIGURATION ===
BACKUP_ENABLED=true
BACKUP_SCHEDULE="0 2 * * *"  # Daily at 2 AM
BACKUP_RETENTION_DAYS=30
BACKUP_PATH=/app/data/backups

# === FEATURE FLAGS ===
ENABLE_ONBOARDING=true
ENABLE_ANALYTICS=true
ENABLE_ADMIN_INTERFACE=true
ENABLE_WEBSOCKET_COMPRESSION=true
ENABLE_CONTENT_COMPRESSION=true

# === SECURITY HEADERS ===
ENABLE_HELMET=true
ENABLE_CORS=true
CORS_ORIGIN=https://your-domain.com
CORS_CREDENTIALS=true

# === DEVELOPMENT/DEBUG (set to false in production) ===
DEBUG_ENABLED=false
ENABLE_DEV_TOOLS=false
DETAILED_ERROR_MESSAGES=false

Docker Compose Configuration

# docker-compose.yml - Single container deployment
version: '3.8'

services:
  sasha-studio:
    build:
      context: .
      dockerfile: Dockerfile
      target: production
    image: sasha/studio:latest
    container_name: sasha-studio
    hostname: sasha-studio
    
    ports:
      - "80:80"          # Main web interface
      - "443:443"        # HTTPS (if SSL is configured)
      - "3001:3001"      # Direct access to metrics (optional)
    
    volumes:
      # Persistent data
      - sasha-data:/app/data
      - sasha-logs:/app/logs
      - sasha-config:/app/config
      - sasha-pgdata:/var/lib/postgresql
      
      # Optional: Mount local guides directory
      - ./guides:/app/guides:ro
      
      # Optional: Mount SSL certificates
      # - ./ssl:/app/ssl:ro
      
      # Optional: Mount backup directory
      - ./backups:/app/data/backups
    
    environment:
      # Load from .env file
      - NODE_ENV=production
      
      # AI API Keys (from secrets)
      - ANTHROPIC_API_KEY_FILE=/run/secrets/anthropic_api_key
      - OPENAI_API_KEY_FILE=/run/secrets/openai_api_key
      
      # Database configuration
      - DATABASE_URL=postgresql://sasha:${DB_PASSWORD}@localhost:5432/sasha
      - REDIS_URL=redis://:${REDIS_PASSWORD}@localhost:6379
      
      # Security
      - JWT_SECRET_FILE=/run/secrets/jwt_secret
      - SESSION_SECRET_FILE=/run/secrets/session_secret
      - ENCRYPTION_KEY_FILE=/run/secrets/encryption_key
      
      # Application configuration
      - APP_URL=${APP_URL:-http://localhost}
      - ENABLE_LOCAL_MODELS=${ENABLE_LOCAL_MODELS:-true}
      - LOG_LEVEL=${LOG_LEVEL:-info}
      
      # Monitoring
      - METRICS_ENABLED=${METRICS_ENABLED:-true}
      - SIEM_ENDPOINT=${SIEM_ENDPOINT}
    
    secrets:
      - anthropic_api_key
      - openai_api_key
      - jwt_secret
      - session_secret
      - encryption_key
      - db_password
      - redis_password
    
    healthcheck:
      test: ["/usr/local/bin/health-check.sh"]
      interval: 30s
      timeout: 10s
      start_period: 40s
      retries: 3
    
    restart: unless-stopped
    
    # Resource limits
    deploy:
      resources:
        limits:
          memory: 4G
          cpus: '2.0'
        reservations:
          memory: 2G
          cpus: '1.0'
    
    # Security settings
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    cap_add:
      - CHOWN
      - DAC_OVERRIDE
      - SETGID
      - SETUID
      - NET_BIND_SERVICE  # Nginx binds to port 80
      - KILL              # Supervisord signals children running as other users
    read_only: false
    
    # Logging configuration
    logging:
      driver: "json-file"
      options:
        max-size: "100m"
        max-file: "3"
        labels: "sasha.studio"

# Secrets configuration
secrets:
  anthropic_api_key:
    file: ./secrets/anthropic_api_key.txt
  openai_api_key:
    file: ./secrets/openai_api_key.txt
  jwt_secret:
    file: ./secrets/jwt_secret.txt
  session_secret:
    file: ./secrets/session_secret.txt
  encryption_key:
    file: ./secrets/encryption_key.txt
  db_password:
    file: ./secrets/db_password.txt
  redis_password:
    file: ./secrets/redis_password.txt

# Volumes for persistent data
volumes:
  sasha-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ./data
  sasha-logs:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ./logs
  sasha-config:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ./config
  sasha-pgdata:
    driver: local

# Network configuration
networks:
  default:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/16
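
Before the first start, the rendered configuration and the secret files can be checked locally. A short sketch using standard Compose commands:

# Render the final configuration with variables and secrets resolved
docker-compose config

# Make sure every referenced secret file exists
ls -l secrets/

# Start the stack and follow the logs
docker-compose up -d
docker-compose logs -f sasha-studio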

Monitoring and Metrics

Simple Metrics Server

// /app/backend/metrics-server.js
const express = require('express');
const prometheus = require('prom-client');
const { exec } = require('child_process');
const fs = require('fs').promises;

const app = express();
const register = prometheus.register;

// Create custom metrics
const httpRequestsTotal = new prometheus.Counter({
  name: 'sasha_http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['method', 'status_code', 'endpoint']
});

const httpRequestDuration = new prometheus.Histogram({
  name: 'sasha_http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'endpoint'],
  buckets: [0.1, 0.3, 0.5, 0.7, 1, 3, 5, 7, 10]
});

const activeSessions = new prometheus.Gauge({
  name: 'sasha_active_sessions',
  help: 'Number of active user sessions'
});

const llmRequests = new prometheus.Counter({
  name: 'sasha_llm_requests_total',
  help: 'Total number of LLM requests',
  labelNames: ['provider', 'model', 'status']
});

const fileUploads = new prometheus.Counter({
  name: 'sasha_file_uploads_total',
  help: 'Total number of file uploads',
  labelNames: ['file_type', 'status']
});

const diskUsage = new prometheus.Gauge({
  name: 'sasha_disk_usage_bytes',
  help: 'Disk usage in bytes',
  labelNames: ['mount_point']
});

const memoryUsage = new prometheus.Gauge({
  name: 'sasha_memory_usage_bytes',
  help: 'Memory usage in bytes',
  labelNames: ['type']
});

// Collect default metrics
prometheus.collectDefaultMetrics({
  prefix: 'sasha_',
  gcDurationBuckets: [0.001, 0.01, 0.1, 1, 2, 5],
});

// Custom metrics collection
async function collectCustomMetrics() {
  try {
    // Collect disk usage
    exec('df -B1', (error, stdout) => {
      if (!error) {
        const lines = stdout.split('\n').slice(1);
        lines.forEach(line => {
          const parts = line.split(/\s+/);
          if (parts.length >= 6) {
            const mountPoint = parts[5];
            const used = parseInt(parts[2]);
            if (!isNaN(used)) {
              diskUsage.set({ mount_point: mountPoint }, used);
            }
          }
        });
      }
    });

    // Collect memory usage
    const meminfo = await fs.readFile('/proc/meminfo', 'utf8');
    const memLines = meminfo.split('\n');
    
    memLines.forEach(line => {
      const match = line.match(/^(\w+):\s+(\d+)\s+kB/);
      if (match) {
        const [, name, value] = match;
        memoryUsage.set({ type: name }, parseInt(value) * 1024);
      }
    });

    // Collect active sessions (mock - replace with real logic)
    const sessionCount = Math.floor(Math.random() * 50) + 10;
    activeSessions.set(sessionCount);

  } catch (error) {
    console.error('Error collecting custom metrics:', error);
  }
}

// Collect metrics every 30 seconds
setInterval(collectCustomMetrics, 30000);

// Health check endpoint
app.get('/health', (req, res) => {
  res.status(200).send('OK');
});

// Metrics endpoint
app.get('/metrics', async (req, res) => {
  await collectCustomMetrics();
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});

// Simple dashboard endpoint
app.get('/dashboard', (req, res) => {
  res.send(`
    <!DOCTYPE html>
    <html>
    <head>
        <title>Sasha Studio Metrics</title>
        <meta http-equiv="refresh" content="30">
        <style>
            body { font-family: Arial, sans-serif; margin: 20px; }
            .metric { margin: 10px 0; padding: 10px; border: 1px solid #ddd; border-radius: 5px; }
            .metric h3 { margin: 0 0 5px 0; color: #333; }
            .metric p { margin: 0; color: #666; }
            .status-ok { border-left: 4px solid #4CAF50; }
            .status-warning { border-left: 4px solid #FF9800; }
            .status-error { border-left: 4px solid #F44336; }
        </style>
    </head>
    <body>
        <h1>πŸš€ Sasha Studio Metrics Dashboard</h1>
        <div class="metric status-ok">
            <h3>System Status</h3>
            <p>All services running normally</p>
        </div>
        <div class="metric status-ok">
            <h3>Active Sessions</h3>
            <p id="sessions">Loading...</p>
        </div>
        <div class="metric status-ok">
            <h3>Memory Usage</h3>
            <p id="memory">Loading...</p>
        </div>
        <div class="metric status-ok">
            <h3>Disk Usage</h3>
            <p id="disk">Loading...</p>
        </div>
        
        <script>
            // Auto-refresh data every 30 seconds
            setInterval(() => {
                location.reload();
            }, 30000);
        </script>
    </body>
    </html>
  `);
});

const PORT = process.env.METRICS_PORT || 9090;
app.listen(PORT, '127.0.0.1', () => {
  console.log(`πŸ“Š Metrics server running on port ${PORT}`);
});

// Export metrics functions for use by other services
module.exports = {
  httpRequestsTotal,
  httpRequestDuration,
  activeSessions,
  llmRequests,
  fileUploads
};
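
Since the metrics server binds to 127.0.0.1 and Nginx only exposes /metrics to private networks, scraping looks like this from inside the container or from an allowed subnet:

# From inside the container (direct)
docker exec sasha-studio curl -s http://localhost:9090/metrics | head -n 20

# Through Nginx, from an allowed internal network
curl -s http://localhost/metrics | grep '^sasha_'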

Deployment Commands

Build and Deploy Script

#!/bin/bash
# scripts/deploy.sh
# Complete deployment script for Sasha Studio

set -e

echo "πŸš€ Deploying Sasha Studio Single Container..."

# Configuration
IMAGE_NAME="sasha/studio"
IMAGE_TAG="${1:-latest}"
CONTAINER_NAME="sasha-studio"

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

log_info() {
    echo -e "${GREEN}[INFO]${NC} $1"
}

log_warn() {
    echo -e "${YELLOW}[WARN]${NC} $1"
}

log_error() {
    echo -e "${RED}[ERROR]${NC} $1"
}

# Check prerequisites
check_prerequisites() {
    log_info "Checking prerequisites..."
    
    if ! command -v docker &> /dev/null; then
        log_error "Docker is not installed"
        exit 1
    fi
    
    if ! command -v docker-compose &> /dev/null; then
        log_error "Docker Compose is not installed"
        exit 1
    fi
    
    # Check if secrets directory exists
    if [ ! -d "./secrets" ]; then
        log_warn "Secrets directory not found. Creating it..."
        mkdir -p secrets
        
        # Generate sample secrets
        echo "$(openssl rand -hex 32)" > secrets/jwt_secret.txt
        echo "$(openssl rand -hex 32)" > secrets/session_secret.txt
        echo "$(openssl rand -hex 32)" > secrets/encryption_key.txt
        echo "$(openssl rand -base64 32)" > secrets/db_password.txt
        echo "$(openssl rand -base64 16)" > secrets/redis_password.txt
        
        log_warn "Generated sample secrets. Please update them with real values."
    fi
    
    # Check for required API keys
    if [ ! -f "./secrets/anthropic_api_key.txt" ] && [ ! -f "./secrets/openai_api_key.txt" ]; then
        log_error "At least one AI API key is required (Anthropic or OpenAI)"
        echo "Create ./secrets/anthropic_api_key.txt or ./secrets/openai_api_key.txt"
        exit 1
    fi
}

# Create required directories
create_directories() {
    log_info "Creating required directories..."
    
    mkdir -p data/{uploads,backups,redis,postgresql,wal_archive}
    mkdir -p logs
    mkdir -p config
    
    # Set proper permissions
    chmod 755 data logs config
    chmod -R 755 data/uploads
}

# Build the Docker image
build_image() {
    log_info "Building Docker image: ${IMAGE_NAME}:${IMAGE_TAG}"
    
    docker build \
        --target production \
        --tag "${IMAGE_NAME}:${IMAGE_TAG}" \
        --build-arg BUILD_DATE=$(date -u +'%Y-%m-%dT%H:%M:%SZ') \
        --build-arg VCS_REF=$(git rev-parse --short HEAD) \
        .
    
    log_info "Docker image built successfully"
}

# Run pre-deployment tests
run_tests() {
    log_info "Running pre-deployment tests..."
    
    # Start a throwaway container (no --rm, so logs survive a failed start)
    docker run -d \
        --name "${CONTAINER_NAME}-test" \
        "${IMAGE_NAME}:${IMAGE_TAG}"
    
    sleep 30
    
    # Check if container is still running
    if docker ps | grep -q "${CONTAINER_NAME}-test"; then
        log_info "Container startup test passed"
        docker rm -f "${CONTAINER_NAME}-test" > /dev/null
    else
        log_error "Container failed to start properly"
        docker logs "${CONTAINER_NAME}-test"
        docker rm -f "${CONTAINER_NAME}-test" > /dev/null
        exit 1
    fi
}

# Deploy the container
deploy_container() {
    log_info "Deploying container..."
    
    # Stop existing container if running
    if docker ps -a | grep -q "$CONTAINER_NAME"; then
        log_info "Stopping existing container..."
        docker-compose down
    fi
    
    # Start the new container
    docker-compose up -d
    
    # Wait for container to be ready
    log_info "Waiting for container to be ready..."
    for i in {1..30}; do
        if docker exec "$CONTAINER_NAME" /usr/local/bin/health-check.sh &> /dev/null; then
            log_info "Container is ready!"
            break
        fi
        
        if [ $i -eq 30 ]; then
            log_error "Container failed to become ready"
            docker logs "$CONTAINER_NAME"
            exit 1
        fi
        
        sleep 10
    done
}

# Verify deployment
verify_deployment() {
    log_info "Verifying deployment..."
    
    # Check if all services are responding
    local services=(
        "http://localhost/health"
        "http://localhost/api/health"
    )
    
    for service in "${services[@]}"; do
        if curl -f -s "$service" > /dev/null; then
            log_info "βœ… $service is responding"
        else
            log_error "❌ $service is not responding"
        fi
    done
    
    # Show container status
    echo ""
    log_info "Container status:"
    docker-compose ps
    
    echo ""
    log_info "Recent logs:"
    docker-compose logs --tail=20
}

# Cleanup old images
cleanup() {
    log_info "Cleaning up old images..."
    
    # Remove dangling images
    if docker images -f "dangling=true" -q | grep -q .; then
        docker images -f "dangling=true" -q | xargs docker rmi
    fi
    
    # Remove old versions (keep last 3)
    OLD_IMAGES=$(docker images "${IMAGE_NAME}" --format "table {{.Repository}}:{{.Tag}}\t{{.CreatedAt}}" | tail -n +2 | sort -k2 -r | tail -n +4 | awk '{print $1}')
    
    if [ -n "$OLD_IMAGES" ]; then
        echo "$OLD_IMAGES" | xargs -r docker rmi
        log_info "Removed old images"
    fi
}

# Main deployment flow
main() {
    echo "🐳 Sasha Studio Single Container Deployment"
    echo "=========================================="
    
    check_prerequisites
    create_directories
    build_image
    run_tests
    deploy_container
    verify_deployment
    cleanup
    
    echo ""
    log_info "πŸŽ‰ Deployment completed successfully!"
    echo ""
    echo "Access your Sasha Studio at:"
    echo "  🌐 Web Interface: http://localhost"
    echo "  πŸ“Š Metrics: http://localhost:3001/dashboard"
    echo "  πŸ” Health Check: http://localhost/health"
    echo ""
    echo "Admin credentials are in: ./config/admin-credentials.txt"
    echo "View logs with: docker-compose logs -f"
}

# Run main function
main "$@"

Backup and Recovery Script

#!/bin/bash
# scripts/backup.sh
# Automated backup script for Sasha Studio

set -e

BACKUP_DIR="/app/data/backups"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_NAME="sasha_backup_${TIMESTAMP}"

echo "πŸ”„ Starting backup: $BACKUP_NAME"

# Create backup directory
mkdir -p "$BACKUP_DIR"

# Backup PostgreSQL database
echo "πŸ“¦ Backing up PostgreSQL database..."
docker exec sasha-studio su - postgres -c "pg_dump -U sasha -h localhost sasha" | gzip > "$BACKUP_DIR/${BACKUP_NAME}_database.sql.gz"

# Backup Redis data
echo "πŸ“¦ Backing up Redis data..."
docker exec sasha-studio redis-cli -a sasha_redis_password_change_this BGSAVE
sleep 5
docker cp sasha-studio:/app/data/redis/dump.rdb "$BACKUP_DIR/${BACKUP_NAME}_redis.rdb"

# Backup application data (uploads are bind-mounted to ./data on the host)
echo "πŸ“¦ Backing up application data..."
tar -czf "$BACKUP_DIR/${BACKUP_NAME}_data.tar.gz" -C ./data uploads

# Backup configuration
echo "πŸ“¦ Backing up configuration..."
docker exec sasha-studio tar -czf "/tmp/config_backup.tar.gz" -C /app config
docker cp sasha-studio:/tmp/config_backup.tar.gz "$BACKUP_DIR/${BACKUP_NAME}_config.tar.gz"

# Create backup manifest
cat > "$BACKUP_DIR/${BACKUP_NAME}_manifest.txt" << EOF
Sasha Studio Backup Manifest
============================
Backup Name: $BACKUP_NAME
Created: $(date)
Host: $(hostname)
Docker Image: $(docker inspect sasha-studio --format='{{.Config.Image}}')

Files:
- ${BACKUP_NAME}_database.sql.gz (PostgreSQL dump)
- ${BACKUP_NAME}_redis.rdb (Redis snapshot)
- ${BACKUP_NAME}_data.tar.gz (Application data)
- ${BACKUP_NAME}_config.tar.gz (Configuration files)
- ${BACKUP_NAME}_manifest.txt (This file)

Backup Size: $(du -ch "$BACKUP_DIR/${BACKUP_NAME}"* | tail -n1 | cut -f1)
EOF

# Cleanup old backups (keep last 30 days)
find "$BACKUP_DIR" -name "sasha_backup_*" -mtime +30 -delete

echo "βœ… Backup completed: $BACKUP_NAME"
echo "πŸ“ Location: $BACKUP_DIR"
echo "πŸ“Š Size: $(du -sh "$BACKUP_DIR/${BACKUP_NAME}"* | awk '{sum+=$1} END {print sum}')"

Production Deployment Checklist

Pre-Deployment Checklist

  • Docker Environment

    • Docker 24.0+ installed
    • Docker Compose v2+ installed
    • Sufficient disk space (20GB minimum)
    • Sufficient RAM (4GB minimum)
  • Security Configuration

    • All secrets generated and stored securely
    • Strong passwords for database and Redis
    • API keys configured properly
    • SSL certificates obtained (if using HTTPS)
    • Firewall rules configured
  • Application Configuration

    • Environment variables set correctly
    • Database schema initialized
    • LLxprt CLI configured with API keys
    • Backup strategy implemented
    • Monitoring endpoints accessible
  • Testing

    • Container builds successfully
    • All services start without errors
    • Health checks pass
    • Basic functionality tested
    • Performance benchmarks met

Post-Deployment Verification

#!/bin/bash
# Quick verification script

echo "πŸ” Post-deployment verification..."

# Check container status
echo "πŸ“¦ Container Status:"
docker-compose ps

# Check service health
echo "πŸ₯ Service Health:"
curl -f http://localhost/health && echo "βœ… Main site OK" || echo "❌ Main site FAILED"
curl -f http://localhost/api/health && echo "βœ… API OK" || echo "❌ API FAILED"

# Check disk space
echo "πŸ’Ύ Disk Usage:"
df -h | grep -E "(Filesystem|/dev/)"

# Check memory usage
echo "🧠 Memory Usage:"
free -h

# Check logs for errors
echo "πŸ“‹ Recent Errors:"
docker-compose logs --tail=50 | grep -i error || echo "No errors found"

echo "βœ… Verification complete"

Quick Start Commands

Development Setup

# 1. Clone and setup
git clone <repository>
cd sasha-studio

# 2. Create secrets directory
mkdir -p secrets
echo "your-anthropic-key" > secrets/anthropic_api_key.txt
echo "your-openai-key" > secrets/openai_api_key.txt

# 3. Generate secure secrets
./scripts/generate-secrets.sh

# 4. Create data directories
mkdir -p data logs config

# 5. Build and start
docker-compose up --build -d

# 6. Wait for services to be ready
sleep 60

# 7. Check status
./scripts/health-check.sh

# 8. Access the application
echo "🌐 Sasha Studio: http://localhost"
echo "πŸ“Š Metrics: http://localhost:3001/dashboard"

Production Deployment

# 1. Prepare production environment
export IMAGE_TAG=v1.0.0
export NODE_ENV=production

# 2. Configure secrets
./scripts/setup-production-secrets.sh

# 3. Deploy
./scripts/deploy.sh $IMAGE_TAG

# 4. Verify deployment
./scripts/verify-deployment.sh

# 5. Setup monitoring and backups
crontab -e
# Add: 0 2 * * * /app/scripts/backup.sh

Troubleshooting Guide

Common Issues and Solutions

Container Won't Start

# Check logs
docker-compose logs -f

# Check if ports are in use
netstat -tulpn | grep :80

# Reset everything
docker-compose down -v
docker system prune -f
docker-compose up --build -d

Database Connection Issues

# Check PostgreSQL status
docker exec sasha-studio supervisorctl status postgresql

# Test database connection
docker exec sasha-studio su - sasha -c "psql -h localhost -U sasha -d sasha -c 'SELECT 1;'"

# Reset database (WARNING: destroys all data)
docker exec sasha-studio supervisorctl stop postgresql
docker exec sasha-studio rm -rf /var/lib/postgresql/14/main/*
docker exec sasha-studio supervisorctl start postgresql

LLxprt CLI Issues

# Check LLxprt configuration
docker exec sasha-studio su - sasha -c "llxprt --version"
docker exec sasha-studio su - sasha -c "cat ~/.config/llxprt/config.yaml"

# Test API connectivity
docker exec sasha-studio su - sasha -c "llxprt 'test message'"

Performance Issues

# Check resource usage
docker stats sasha-studio

# Check disk space
docker exec sasha-studio df -h

# Check memory usage
docker exec sasha-studio free -h

# Check running processes
docker exec sasha-studio ps aux
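
Because every service writes to /app/logs, per-service logs can be inspected directly when docker-compose logs is too noisy:

# List all service log files
docker exec sasha-studio ls -lh /app/logs

# Follow a single service's error log
docker exec sasha-studio tail -f /app/logs/sasha-api-error.log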

This comprehensive Docker Multi-Service Setup Guide provides everything needed to deploy Sasha Studio as a single, manageable container while maintaining production-grade reliability and security. The single-container approach significantly simplifies deployment, maintenance, and scaling while providing all the functionality of a distributed system.

[{"content": "Create Docker Multi-Service Setup guide for single container deployment", "status": "completed", "priority": "high", "id": "1"}]