LLxprt Code Integration Implementation Guide
Generated: 2025-08-05 UTC
Purpose: Complete implementation guide for integrating LLxprt Code into Sasha Studio
Repository: https://github.com/acoliver/llxprt-code
Target: Docker-based Node.js application with real-time AI chat streaming
LLxprt Code Overview
LLxprt Code is an open-source multi-provider fork of Google's Gemini CLI, enhanced with:
- Multi-Provider Support: OpenAI, Anthropic, Google, OpenRouter, Fireworks, Local models
- Enhanced Theming: Beautiful, consistent UI themes
- Flexible Configuration: Switch providers, models, and API keys on the fly
- Local Model Support: LM Studio, llama.cpp, any OpenAI-compatible server
- Advanced Settings: Fine-tune model parameters, profiles, ephemeral settings
System Requirements
Prerequisites
- Node.js: Version 24 or later, as required by the project's package.json
- npm: A recent release, used for the global install
- Docker: For containerized deployment
- API Keys: For desired AI providers
Installation Methods
# NPM Global Installation (Recommended)
npm install -g @vybestack/llxprt-code
# Homebrew (macOS/Linux)
brew install llxprt-code
# Verify installation
llxprt --version
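Before wiring LLxprt into the application, it helps to confirm the binary is on the PATH and that at least one provider key is present. The following is a minimal preflight sketch; the file name check-llxprt.js and the exact set of environment variables checked are illustrative assumptions, not part of LLxprt itself.
// check-llxprt.js (illustrative preflight check)
const { execFileSync } = require('child_process');

function checkLLxprt() {
  // Verify the CLI is installed and reachable on the PATH
  const version = execFileSync('llxprt', ['--version'], { encoding: 'utf8' }).trim();
  console.log(`llxprt found: ${version}`);

  // Warn if no provider API key is configured in the environment
  const keys = ['OPENAI_API_KEY', 'ANTHROPIC_API_KEY', 'GOOGLE_API_KEY'];
  const present = keys.filter((name) => process.env[name]);
  if (present.length === 0) {
    console.warn('No provider API keys found in the environment');
  } else {
    console.log(`Provider keys detected: ${present.join(', ')}`);
  }
}

checkLLxprt();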
Docker Integration Strategy
Enhanced Dockerfile for Sasha Studio
FROM node:24-slim AS llxprt-base
# Install comprehensive system dependencies
RUN apt-get update && apt-get install -y \
ca-certificates \
curl \
gnupg \
python3 \
python3-pip \
make \
g++ \
git \
&& rm -rf /var/lib/apt/lists/*
# The node:24-slim base image already ships a non-root "node" user (uid/gid 1000),
# so no additional user needs to be created here
# Configure npm global directory for non-root user
ENV NPM_CONFIG_PREFIX=/home/node/.npm-global
ENV PATH=$PATH:/home/node/.npm-global/bin
# Switch to non-root user
USER node
WORKDIR /home/node
# Install LLxprt Code globally
RUN npm install -g @vybestack/llxprt-code
# Verify installation
RUN llxprt --version
FROM llxprt-base AS sasha-studio
# Copy application files
COPY --chown=node:node package*.json ./
RUN npm ci --omit=dev
# Copy source code
COPY --chown=node:node . .
# Create configuration directory
RUN mkdir -p /home/node/.config/llxprt
# Expose application port
EXPOSE 3000
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD curl -f http://localhost:3000/health || exit 1
CMD ["node", "server.js"]
Docker Compose Configuration
# docker-compose.yml
version: '3.8'
services:
  sasha-studio:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - GOOGLE_API_KEY=${GOOGLE_API_KEY}
    volumes:
      - llxprt-config:/home/node/.config/llxprt
      - user-files:/home/node/user-files:ro
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
volumes:
  llxprt-config:
  user-files:
LLxprt Configuration Management
Provider Configuration System
// config/llxprt-manager.js
const { spawn } = require('child_process');
const fs = require('fs').promises;
const path = require('path');
class LLxprtManager {
constructor() {
this.configDir = process.env.LLXPRT_CONFIG_DIR || '/home/node/.config/llxprt';
this.providers = {
openai: {
models: ['o1', 'gpt-4o', 'gpt-4o-mini'],
apiKeyEnv: 'OPENAI_API_KEY'
},
anthropic: {
models: ['claude-3-5-sonnet-20241022', 'claude-3-5-haiku-20241022'],
apiKeyEnv: 'ANTHROPIC_API_KEY'
},
google: {
models: ['gemini-pro', 'gemini-pro-vision'],
apiKeyEnv: 'GOOGLE_API_KEY'
}
};
this.currentProvider = null;
this.currentModel = null;
this.isInitialized = false;
}
async initialize() {
if (this.isInitialized) return;
// Ensure config directory exists
await fs.mkdir(this.configDir, { recursive: true });
// Set up initial provider configuration
await this.setupProviders();
this.isInitialized = true;
}
async setupProviders() {
for (const [provider, config] of Object.entries(this.providers)) {
const apiKey = process.env[config.apiKeyEnv];
if (apiKey) {
await this.configureProvider(provider, apiKey);
}
}
}
async configureProvider(provider, apiKey) {
return new Promise((resolve, reject) => {
// Name the child process distinctly so it does not shadow the global `process`
const child = spawn('llxprt', [], {
stdio: ['pipe', 'pipe', 'pipe'],
env: { ...process.env, LLXPRT_CONFIG_DIR: this.configDir }
});
let output = '';
child.stdout.on('data', (data) => {
output += data.toString();
});
child.stderr.on('data', (data) => {
console.error(`LLxprt stderr: ${data}`);
});
child.on('close', (code) => {
if (code === 0) {
console.log(`Provider ${provider} configured successfully`);
resolve(output);
} else {
reject(new Error(`Provider configuration failed with code ${code}`));
}
});
// Send configuration commands
child.stdin.write(`/provider ${provider}\n`);
child.stdin.write(`/key ${apiKey}\n`);
child.stdin.write(`/model ${this.providers[provider].models[0]}\n`);
child.stdin.write('/exit\n');
child.stdin.end();
});
}
async switchProvider(provider, model) {
if (!this.providers[provider]) {
throw new Error(`Unsupported provider: ${provider}`);
}
if (!this.providers[provider].models.includes(model)) {
throw new Error(`Unsupported model ${model} for provider ${provider}`);
}
this.currentProvider = provider;
this.currentModel = model;
return new Promise((resolve, reject) => {
const child = spawn('llxprt', [], {
stdio: ['pipe', 'pipe', 'pipe'],
env: { ...process.env, LLXPRT_CONFIG_DIR: this.configDir }
});
child.on('close', (code) => {
if (code === 0) {
resolve(`Switched to ${provider}:${model}`);
} else {
reject(new Error(`Provider switch failed with code ${code}`));
}
});
child.stdin.write(`/provider ${provider}\n`);
child.stdin.write(`/model ${model}\n`);
child.stdin.write('/exit\n');
child.stdin.end();
});
}
getAvailableProviders() {
return Object.entries(this.providers)
.filter(([_, config]) => process.env[config.apiKeyEnv])
.map(([provider, config]) => ({
provider,
models: config.models,
available: true
}));
}
}
module.exports = LLxprtManager;
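As a usage sketch, the manager can be initialized once at application startup and then asked to switch providers on demand. This assumes the provider API keys above are already present in the environment; the provider/model pair used here is illustrative.
// Illustrative usage of LLxprtManager
const LLxprtManager = require('./config/llxprt-manager');

async function main() {
  const manager = new LLxprtManager();
  await manager.initialize();

  // List the providers that have an API key available
  console.log(manager.getAvailableProviders());

  // Switch to a specific provider/model pair known to the manager
  await manager.switchProvider('openai', 'gpt-4o-mini');
  console.log(`Active: ${manager.currentProvider}:${manager.currentModel}`);
}

main().catch(console.error);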
Real-Time Streaming Implementation
WebSocket + LLxprt Integration
// services/streaming-service.js
const { spawn } = require('child_process');
const WebSocket = require('ws');
const LLxprtManager = require('../config/llxprt-manager');
class LLxprtStreamingService {
constructor() {
this.llxprtManager = new LLxprtManager();
this.activeStreams = new Map();
this.metrics = {
totalStreams: 0,
activeConnections: 0,
averageResponseTime: 0
};
}
async initialize() {
await this.llxprtManager.initialize();
}
setupWebSocketServer(server) {
const wss = new WebSocket.Server({
server,
path: '/chat/stream'
});
wss.on('connection', (ws, req) => {
const sessionId = this.generateSessionId();
const clientInfo = {
sessionId,
startTime: Date.now(),
messageCount: 0
};
this.activeStreams.set(sessionId, clientInfo);
this.metrics.activeConnections++;
console.log(`New WebSocket connection: ${sessionId}`);
ws.on('message', async (data) => {
try {
const request = JSON.parse(data);
await this.handleStreamRequest(ws, sessionId, request);
} catch (error) {
console.error('WebSocket message error:', error);
ws.send(JSON.stringify({
type: 'error',
error: error.message,
sessionId
}));
}
});
ws.on('close', () => {
this.cleanup(sessionId);
this.metrics.activeConnections--;
console.log(`WebSocket disconnected: ${sessionId}`);
});
ws.on('error', (error) => {
console.error(`WebSocket error for ${sessionId}:`, error);
this.cleanup(sessionId);
});
// Send connection confirmation
ws.send(JSON.stringify({
type: 'connected',
sessionId,
availableProviders: this.llxprtManager.getAvailableProviders()
}));
});
return wss;
}
async handleStreamRequest(ws, sessionId, request) {
const { message, provider, model, options = {} } = request;
// Validate request
if (!message || !provider || !model) {
throw new Error('Missing required fields: message, provider, model');
}
// Switch provider/model if needed
if (this.llxprtManager.currentProvider !== provider ||
this.llxprtManager.currentModel !== model) {
await this.llxprtManager.switchProvider(provider, model);
}
// Start streaming process
const streamData = await this.startLLxprtStream(ws, sessionId, message, options);
// Update session info
const clientInfo = this.activeStreams.get(sessionId);
if (clientInfo) {
clientInfo.messageCount++;
clientInfo.lastMessage = Date.now();
}
this.metrics.totalStreams++;
}
async startLLxprtStream(ws, sessionId, message, options) {
return new Promise((resolve, reject) => {
const startTime = Date.now();
let totalChunks = 0;
let totalChars = 0;
let firstTokenTime = null;
// Spawn LLxprt process
const llxprtProcess = spawn('llxprt', [], {
stdio: ['pipe', 'pipe', 'pipe'],
env: {
...process.env,
LLXPRT_CONFIG_DIR: this.llxprtManager.configDir
}
});
// Handle stdout (AI response)
llxprtProcess.stdout.on('data', (chunk) => {
const content = chunk.toString();
// Record first token time
if (!firstTokenTime && content.trim()) {
firstTokenTime = Date.now() - startTime;
}
totalChunks++;
totalChars += content.length;
// Send chunk to WebSocket
ws.send(JSON.stringify({
type: 'chunk',
content,
sessionId,
metadata: {
chunkIndex: totalChunks,
timestamp: Date.now()
}
}));
});
// Handle stderr (errors and system messages)
llxprtProcess.stderr.on('data', (data) => {
const error = data.toString();
console.error(`LLxprt stderr: ${error}`);
// Send error to client if it's not just debug info
if (error.toLowerCase().includes('error')) {
ws.send(JSON.stringify({
type: 'error',
error,
sessionId
}));
}
});
// Handle process completion
llxprtProcess.on('close', (code) => {
const duration = Date.now() - startTime;
// Update metrics
// Simple blend of the previous average with the latest duration (not a true mean)
this.metrics.averageResponseTime =
(this.metrics.averageResponseTime + duration) / 2;
const response = {
type: 'complete',
sessionId,
metadata: {
duration,
totalChunks,
totalChars,
firstTokenTime,
exitCode: code,
tokensPerSecond: totalChars / (duration / 1000) // rough character-based throughput, not true tokens
}
};
ws.send(JSON.stringify(response));
if (code === 0) {
resolve(response.metadata);
} else {
reject(new Error(`LLxprt process exited with code ${code}`));
}
});
// Handle process errors
llxprtProcess.on('error', (error) => {
console.error('LLxprt process error:', error);
ws.send(JSON.stringify({
type: 'error',
error: error.message,
sessionId
}));
reject(error);
});
// Send message to LLxprt
llxprtProcess.stdin.write(`${message}\n`);
llxprtProcess.stdin.end();
// Store process reference for cleanup
const clientInfo = this.activeStreams.get(sessionId);
if (clientInfo) {
clientInfo.process = llxprtProcess;
}
});
}
cleanup(sessionId) {
const clientInfo = this.activeStreams.get(sessionId);
if (clientInfo?.process) {
try {
clientInfo.process.kill('SIGTERM');
} catch (error) {
console.error('Error killing LLxprt process:', error);
}
}
this.activeStreams.delete(sessionId);
}
generateSessionId() {
return `llxprt_${Date.now()}_${Math.random().toString(36).substr(2, 9)}`;
}
getMetrics() {
return {
...this.metrics,
activeStreams: this.activeStreams.size,
uptime: process.uptime()
};
}
}
module.exports = LLxprtStreamingService;
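On the client side, a consumer only needs to handle the connected, chunk, complete, and error message types emitted above. Below is a minimal browser-side sketch, assuming the page is served from the same origin as the server; the #output element is an assumption for illustration.
// Illustrative browser-side consumer for the /chat/stream endpoint
const ws = new WebSocket(`ws://${window.location.host}/chat/stream`);

ws.onopen = () => {
  ws.send(JSON.stringify({
    message: 'Hello there',
    provider: 'openai',
    model: 'gpt-4o-mini'
  }));
};

ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  switch (msg.type) {
    case 'connected':
      console.log('Session started:', msg.sessionId, msg.availableProviders);
      break;
    case 'chunk':
      // Append streamed content as it arrives
      document.getElementById('output').textContent += msg.content;
      break;
    case 'complete':
      console.log('Done in', msg.metadata.duration, 'ms');
      break;
    case 'error':
      console.error('Stream error:', msg.error);
      break;
  }
};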
Provider Management System
Dynamic Provider Configuration
// services/provider-service.js
const { spawn } = require('child_process');
class ProviderService {
constructor(llxprtManager) {
this.llxprtManager = llxprtManager;
this.providerHealth = new Map();
this.costTracking = new Map();
this.usageStats = new Map();
}
async checkProviderHealth(provider) {
try {
const testMessage = "Hello, this is a health check.";
const startTime = Date.now();
await this.llxprtManager.switchProvider(provider,
this.llxprtManager.providers[provider].models[0]);
// Quick test with minimal token usage
const result = await this.sendTestMessage(testMessage);
const responseTime = Date.now() - startTime;
this.providerHealth.set(provider, {
status: 'healthy',
responseTime,
lastCheck: Date.now(),
error: null
});
return true;
} catch (error) {
this.providerHealth.set(provider, {
status: 'unhealthy',
responseTime: null,
lastCheck: Date.now(),
error: error.message
});
return false;
}
}
async sendTestMessage(message) {
return new Promise((resolve, reject) => {
const child = spawn('llxprt', [], {
stdio: ['pipe', 'pipe', 'pipe'],
env: {
...process.env,
LLXPRT_CONFIG_DIR: this.llxprtManager.configDir
}
});
let output = '';
const timeout = setTimeout(() => {
child.kill('SIGTERM');
reject(new Error('Health check timeout'));
}, 10000);
child.stdout.on('data', (data) => {
output += data.toString();
});
child.on('close', (code) => {
clearTimeout(timeout);
if (code === 0 && output.trim()) {
resolve(output);
} else {
reject(new Error(`Health check failed with code ${code}`));
}
});
child.stdin.write(`${message}\n`);
child.stdin.end();
});
}
async getOptimalProvider(preferredProvider = null) {
const availableProviders = this.llxprtManager.getAvailableProviders();
// If preferred provider is specified and healthy, use it
if (preferredProvider) {
const isHealthy = await this.checkProviderHealth(preferredProvider);
if (isHealthy) {
return preferredProvider;
}
}
// Find the best available provider based on health and performance
const healthyProviders = [];
for (const { provider } of availableProviders) {
const isHealthy = await this.checkProviderHealth(provider);
if (isHealthy) {
const health = this.providerHealth.get(provider);
healthyProviders.push({
provider,
responseTime: health.responseTime,
priority: this.getProviderPriority(provider)
});
}
}
if (healthyProviders.length === 0) {
throw new Error('No healthy providers available');
}
// Sort by priority, then by response time
healthyProviders.sort((a, b) => {
if (a.priority !== b.priority) {
return a.priority - b.priority; // Lower priority number = higher priority
}
return a.responseTime - b.responseTime;
});
return healthyProviders[0].provider;
}
getProviderPriority(provider) {
const priorities = {
'openai': 1,
'anthropic': 2,
'google': 3
};
return priorities[provider] || 99;
}
trackUsage(provider, model, inputTokens, outputTokens, cost) {
const key = `${provider}:${model}`;
const current = this.usageStats.get(key) || {
requests: 0,
inputTokens: 0,
outputTokens: 0,
totalCost: 0,
averageResponseTime: 0
};
current.requests++;
current.inputTokens += inputTokens;
current.outputTokens += outputTokens;
current.totalCost += cost;
this.usageStats.set(key, current);
}
getUsageStats() {
const stats = {};
for (const [key, data] of this.usageStats.entries()) {
stats[key] = {
...data,
averageCostPerRequest: data.totalCost / data.requests,
totalTokens: data.inputTokens + data.outputTokens
};
}
return stats;
}
getProviderHealth() {
const health = {};
for (const [provider, data] of this.providerHealth.entries()) {
health[provider] = data;
}
return health;
}
}
module.exports = ProviderService;
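A short sketch of how the service could be used to fall back to a healthy provider before starting a stream; the preferred-provider value passed in is illustrative, and the sketch simply reuses the first model the manager knows for the chosen provider.
// Illustrative provider fallback using ProviderService
const LLxprtManager = require('./config/llxprt-manager');
const ProviderService = require('./services/provider-service');

async function pickProvider(preferred) {
  const manager = new LLxprtManager();
  await manager.initialize();

  const providerService = new ProviderService(manager);
  // Falls back to the healthiest available provider when the preferred one fails its check
  const provider = await providerService.getOptimalProvider(preferred);
  const model = manager.providers[provider].models[0];

  await manager.switchProvider(provider, model);
  return { provider, model };
}

pickProvider('anthropic').then(console.log).catch(console.error);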
Express.js API Integration
Complete Server Implementation
// server.js
const express = require('express');
const http = require('http');
const cors = require('cors');
const helmet = require('helmet');
const rateLimit = require('express-rate-limit');
const LLxprtStreamingService = require('./services/streaming-service');
const ProviderService = require('./services/provider-service');
class SashaStudioServer {
constructor() {
this.app = express();
this.server = http.createServer(this.app);
this.streamingService = new LLxprtStreamingService();
this.providerService = new ProviderService(this.streamingService.llxprtManager);
this.setupMiddleware();
this.setupRoutes();
}
setupMiddleware() {
// Security middleware
this.app.use(helmet({
contentSecurityPolicy: {
directives: {
defaultSrc: ["'self'"],
styleSrc: ["'self'", "'unsafe-inline'"],
scriptSrc: ["'self'"],
imgSrc: ["'self'", "data:", "https:"],
connectSrc: ["'self'", "ws:", "wss:"]
}
}
}));
// CORS configuration
this.app.use(cors({
origin: process.env.CORS_ORIGIN || 'http://localhost:3000',
credentials: true
}));
// Rate limiting
const limiter = rateLimit({
windowMs: 15 * 60 * 1000, // 15 minutes
max: 100, // Limit each IP to 100 requests per windowMs
message: 'Too many requests from this IP, please try again later.'
});
this.app.use('/api/', limiter);
// Body parsing
this.app.use(express.json({ limit: '10mb' }));
this.app.use(express.urlencoded({ extended: true, limit: '10mb' }));
// Request logging
this.app.use((req, res, next) => {
console.log(`${req.method} ${req.path} - ${req.ip}`);
next();
});
}
setupRoutes() {
// Health check
this.app.get('/health', (req, res) => {
res.json({
status: 'healthy',
timestamp: new Date().toISOString(),
uptime: process.uptime(),
version: process.env.npm_package_version || '1.0.0'
});
});
// Provider information
this.app.get('/api/providers', async (req, res) => {
try {
const providers = this.streamingService.llxprtManager.getAvailableProviders();
const health = this.providerService.getProviderHealth();
res.json({
providers: providers.map(p => ({
...p,
health: health[p.provider] || { status: 'unknown' }
}))
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
// Provider health check
this.app.post('/api/providers/:provider/health', async (req, res) => {
try {
const { provider } = req.params;
const isHealthy = await this.providerService.checkProviderHealth(provider);
const health = this.providerService.getProviderHealth()[provider];
res.json({
provider,
healthy: isHealthy,
...health
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
// Usage statistics
this.app.get('/api/stats', (req, res) => {
try {
const streamingMetrics = this.streamingService.getMetrics();
const usageStats = this.providerService.getUsageStats();
const providerHealth = this.providerService.getProviderHealth();
res.json({
streaming: streamingMetrics,
usage: usageStats,
health: providerHealth,
timestamp: new Date().toISOString()
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
// Switch provider/model
this.app.post('/api/configure', async (req, res) => {
try {
const { provider, model } = req.body;
if (!provider || !model) {
return res.status(400).json({
error: 'Provider and model are required'
});
}
await this.streamingService.llxprtManager.switchProvider(provider, model);
res.json({
success: true,
currentProvider: provider,
currentModel: model,
timestamp: new Date().toISOString()
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
// Error handling middleware
this.app.use((error, req, res, next) => {
console.error('Unhandled error:', error);
res.status(500).json({
error: 'Internal server error',
timestamp: new Date().toISOString()
});
});
// 404 handler
this.app.use((req, res) => {
res.status(404).json({
error: 'Not found',
path: req.path,
timestamp: new Date().toISOString()
});
});
}
async start() {
try {
// Initialize services
await this.streamingService.initialize();
console.log('LLxprt streaming service initialized');
// Setup WebSocket server
this.streamingService.setupWebSocketServer(this.server);
console.log('WebSocket server configured');
// Start HTTP server
const port = process.env.PORT || 3000;
this.server.listen(port, () => {
console.log(`Sasha Studio server running on port ${port}`);
console.log(`WebSocket endpoint: ws://localhost:${port}/chat/stream`);
console.log(`Health check: http://localhost:${port}/health`);
});
} catch (error) {
console.error('Failed to start server:', error);
process.exit(1);
}
}
async shutdown() {
console.log('Shutting down server...');
// Cleanup active streams
for (const [sessionId] of this.streamingService.activeStreams) {
this.streamingService.cleanup(sessionId);
}
// Close server
this.server.close(() => {
console.log('Server shutdown complete');
process.exit(0);
});
}
}
// Handle graceful shutdown
const server = new SashaStudioServer();
process.on('SIGTERM', () => server.shutdown());
process.on('SIGINT', () => server.shutdown());
// Start server
server.start();
module.exports = SashaStudioServer;
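For quick manual verification once the server is up, the REST endpoints can be exercised from Node or any HTTP client. The snippet below is a smoke-test sketch that assumes the server is listening on localhost:3000 and that the global fetch available in Node 18+ is used.
// Illustrative REST checks against a locally running Sasha Studio instance
async function smokeTest() {
  const base = 'http://localhost:3000';

  // Health and provider listing
  console.log(await (await fetch(`${base}/health`)).json());
  console.log(await (await fetch(`${base}/api/providers`)).json());

  // Switch the active provider/model
  const res = await fetch(`${base}/api/configure`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ provider: 'openai', model: 'gpt-4o-mini' })
  });
  console.log(await res.json());
}

smokeTest().catch(console.error);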
Testing Strategy
Integration Tests
// tests/llxprt-integration.test.js
const WebSocket = require('ws');
const { spawn } = require('child_process');
const LLxprtStreamingService = require('../services/streaming-service');
describe('LLxprt Integration Tests', () => {
let streamingService;
let server;
beforeAll(async () => {
streamingService = new LLxprtStreamingService();
await streamingService.initialize();
// Start test server
server = require('http').createServer();
streamingService.setupWebSocketServer(server);
server.listen(0); // Random port
});
afterAll(async () => {
server.close();
});
test('should initialize LLxprt successfully', async () => {
expect(streamingService.llxprtManager.isInitialized).toBe(true);
});
test('should establish WebSocket connection', (done) => {
const ws = new WebSocket(`ws://localhost:${server.address().port}/chat/stream`);
ws.on('open', () => {
expect(ws.readyState).toBe(WebSocket.OPEN);
ws.close();
done();
});
});
test('should stream AI response', (done) => {
const ws = new WebSocket(`ws://localhost:${server.address().port}/chat/stream`);
let receivedChunks = 0;
ws.on('open', () => {
ws.send(JSON.stringify({
message: 'Say hello',
provider: 'openai',
model: 'gpt-4o-mini'
}));
});
ws.on('message', (data) => {
const response = JSON.parse(data);
if (response.type === 'chunk') {
receivedChunks++;
expect(response.content).toBeDefined();
expect(response.sessionId).toBeDefined();
}
if (response.type === 'complete') {
expect(receivedChunks).toBeGreaterThan(0);
expect(response.metadata.duration).toBeGreaterThan(0);
ws.close();
done();
}
});
}, 30000);
test('should handle provider switching', async () => {
const providers = streamingService.llxprtManager.getAvailableProviders();
expect(providers.length).toBeGreaterThan(0);
const provider = providers[0];
await streamingService.llxprtManager.switchProvider(
provider.provider,
provider.models[0]
);
expect(streamingService.llxprtManager.currentProvider).toBe(provider.provider);
});
test('should handle errors gracefully', (done) => {
const ws = new WebSocket(`ws://localhost:${server.address().port}/chat/stream`);
ws.on('open', () => {
ws.send(JSON.stringify({
message: 'Test message',
provider: 'invalid-provider',
model: 'invalid-model'
}));
});
ws.on('message', (data) => {
const response = JSON.parse(data);
if (response.type === 'error') {
expect(response.error).toBeDefined();
ws.close();
done();
}
});
});
});
Load Testing
// tests/load-test.js
const WebSocket = require('ws');
async function loadTest() {
const concurrent = 10;
const messagesPerConnection = 3;
const connections = [];
console.log(`Starting load test with ${concurrent} concurrent connections`);
for (let i = 0; i < concurrent; i++) {
const connectionPromise = new Promise((resolve) => {
const ws = new WebSocket('ws://localhost:3000/chat/stream');
const startTime = Date.now();
let messagesCompleted = 0;
let totalResponseTime = 0;
ws.on('open', () => {
console.log(`Connection ${i} opened`);
sendMessage();
});
ws.on('message', (data) => {
const response = JSON.parse(data);
if (response.type === 'complete') {
messagesCompleted++;
totalResponseTime += response.metadata.duration;
if (messagesCompleted < messagesPerConnection) {
setTimeout(sendMessage, 1000);
} else {
const totalTime = Date.now() - startTime;
resolve({
connection: i,
totalTime,
messagesCompleted,
averageResponseTime: totalResponseTime / messagesCompleted
});
ws.close();
}
}
});
function sendMessage() {
ws.send(JSON.stringify({
message: `Load test message ${messagesCompleted + 1}`,
provider: 'openai',
model: 'gpt-4o-mini'
}));
}
});
connections.push(connectionPromise);
}
const results = await Promise.all(connections);
console.log('\nLoad Test Results:');
console.log(`Total connections: ${concurrent}`);
console.log(`Messages per connection: ${messagesPerConnection}`);
console.log(`Total messages: ${concurrent * messagesPerConnection}`);
const avgResponseTime = results.reduce((sum, r) => sum + r.averageResponseTime, 0) / results.length;
console.log(`Average response time: ${avgResponseTime.toFixed(2)}ms`);
const maxResponseTime = Math.max(...results.map(r => r.averageResponseTime));
console.log(`Max response time: ${maxResponseTime.toFixed(2)}ms`);
const minResponseTime = Math.min(...results.map(r => r.averageResponseTime));
console.log(`Min response time: ${minResponseTime.toFixed(2)}ms`);
}
if (require.main === module) {
loadTest().catch(console.error);
}
module.exports = loadTest;
Performance Optimization
Connection Pooling and Resource Management
// utils/resource-manager.js
const { spawn } = require('child_process');
class ResourceManager {
constructor() {
this.processPool = [];
this.maxPoolSize = 5;
this.activeProcesses = new Set();
this.processTimeout = 30000; // 30 seconds
}
async getProcess() {
// Try to get from pool first
if (this.processPool.length > 0) {
const process = this.processPool.pop();
this.activeProcesses.add(process);
return process;
}
// Create new process if under limit
if (this.activeProcesses.size < this.maxPoolSize) {
const process = this.createProcess();
this.activeProcesses.add(process);
return process;
}
// Wait for available process
return new Promise((resolve) => {
const checkInterval = setInterval(() => {
if (this.processPool.length > 0) {
clearInterval(checkInterval);
const process = this.processPool.pop();
this.activeProcesses.add(process);
resolve(process);
}
}, 100);
});
}
createProcess() {
// Name the child distinctly so the spawn options can still read the global `process.env`
const child = spawn('llxprt', [], {
stdio: ['pipe', 'pipe', 'pipe'],
env: { ...process.env }
});
// Set timeout for idle processes
child.idleTimeout = setTimeout(() => {
this.destroyProcess(child);
}, this.processTimeout);
child.on('error', () => {
this.destroyProcess(child);
});
return child;
}
releaseProcess(process) {
this.activeProcesses.delete(process);
// Return to pool if healthy
if (process.killed === false && this.processPool.length < this.maxPoolSize) {
this.processPool.push(process);
// Reset idle timeout
clearTimeout(process.idleTimeout);
process.idleTimeout = setTimeout(() => {
this.destroyProcess(process);
}, this.processTimeout);
} else {
this.destroyProcess(process);
}
}
destroyProcess(process) {
try {
clearTimeout(process.idleTimeout);
this.activeProcesses.delete(process);
this.processPool = this.processPool.filter(p => p !== process);
if (!process.killed) {
process.kill('SIGTERM');
}
} catch (error) {
console.error('Error destroying process:', error);
}
}
cleanup() {
// Destroy all processes
for (const process of this.activeProcesses) {
this.destroyProcess(process);
}
for (const process of this.processPool) {
this.destroyProcess(process);
}
this.processPool = [];
this.activeProcesses.clear();
}
getStats() {
return {
poolSize: this.processPool.length,
activeProcesses: this.activeProcesses.size,
maxPoolSize: this.maxPoolSize
};
}
}
module.exports = ResourceManager;
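The streaming service above spawns a fresh LLxprt process per message. A sketch of the intended acquire/release cycle is shown below; note that because each exchange closes stdin, the process exits afterwards, so this sketch destroys it rather than returning it to the pool. Benefiting from reuse would require keeping the processes interactive, which is beyond this example.
// Illustrative acquire/use/destroy cycle around a single LLxprt exchange
const ResourceManager = require('./utils/resource-manager');

const pool = new ResourceManager();

async function runPrompt(prompt) {
  const child = await pool.getProcess();
  let output = '';

  return new Promise((resolve, reject) => {
    child.stdout.on('data', (chunk) => { output += chunk.toString(); });
    child.on('close', (code) => {
      // A one-shot exchange exits the process, so destroy rather than recycle it
      pool.destroyProcess(child);
      if (code === 0) {
        resolve(output);
      } else {
        reject(new Error(`llxprt exited with code ${code}`));
      }
    });
    child.stdin.write(`${prompt}\n`);
    child.stdin.end();
  });
}

runPrompt('Say hello').then(console.log).catch(console.error);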
Monitoring and Observability
Metrics Collection
// utils/metrics.js
class MetricsCollector {
constructor() {
this.metrics = {
requests: {
total: 0,
successful: 0,
failed: 0,
byProvider: new Map()
},
performance: {
averageResponseTime: 0,
responseTimeHistory: [],
firstTokenTimes: [],
tokensPerSecond: []
},
resources: {
memoryUsage: [],
activeConnections: 0,
processCount: 0
},
errors: {
total: 0,
byType: new Map(),
recent: []
}
};
this.startCollection();
}
startCollection() {
// Collect system metrics every 30 seconds
setInterval(() => {
const usage = process.memoryUsage();
this.metrics.resources.memoryUsage.push({
timestamp: Date.now(),
heapUsed: usage.heapUsed / 1024 / 1024,
heapTotal: usage.heapTotal / 1024 / 1024,
external: usage.external / 1024 / 1024,
rss: usage.rss / 1024 / 1024
});
// Keep only last 100 measurements (50 minutes)
if (this.metrics.resources.memoryUsage.length > 100) {
this.metrics.resources.memoryUsage.shift();
}
}, 30000);
}
recordRequest(provider, model, success, responseTime, firstTokenTime, tokensPerSecond) {
this.metrics.requests.total++;
if (success) {
this.metrics.requests.successful++;
} else {
this.metrics.requests.failed++;
}
// Track by provider
const providerKey = `${provider}:${model}`;
const providerStats = this.metrics.requests.byProvider.get(providerKey) || {
total: 0,
successful: 0,
failed: 0
};
providerStats.total++;
if (success) {
providerStats.successful++;
} else {
providerStats.failed++;
}
this.metrics.requests.byProvider.set(providerKey, providerStats);
// Record performance metrics
if (success && responseTime) {
this.metrics.performance.responseTimeHistory.push(responseTime);
if (firstTokenTime) {
this.metrics.performance.firstTokenTimes.push(firstTokenTime);
}
if (tokensPerSecond) {
this.metrics.performance.tokensPerSecond.push(tokensPerSecond);
}
// Keep only last 1000 measurements
if (this.metrics.performance.responseTimeHistory.length > 1000) {
this.metrics.performance.responseTimeHistory.shift();
}
// Update average
const sum = this.metrics.performance.responseTimeHistory.reduce((a, b) => a + b, 0);
this.metrics.performance.averageResponseTime = sum / this.metrics.performance.responseTimeHistory.length;
}
}
recordError(error, type = 'unknown', context = {}) {
this.metrics.errors.total++;
const errorCount = this.metrics.errors.byType.get(type) || 0;
this.metrics.errors.byType.set(type, errorCount + 1);
this.metrics.errors.recent.push({
timestamp: Date.now(),
error: error.message || error,
type,
context
});
// Keep only last 100 errors
if (this.metrics.errors.recent.length > 100) {
this.metrics.errors.recent.shift();
}
}
updateResourceMetrics(activeConnections, processCount) {
this.metrics.resources.activeConnections = activeConnections;
this.metrics.resources.processCount = processCount;
}
getMetrics() {
return {
...this.metrics,
timestamp: new Date().toISOString(),
uptime: process.uptime()
};
}
getHealthScore() {
const successRate = this.metrics.requests.total > 0
? this.metrics.requests.successful / this.metrics.requests.total
: 1;
const avgResponseTime = this.metrics.performance.averageResponseTime || 0;
const responseTimeScore = Math.max(0, 1 - (avgResponseTime / 10000)); // Penalty after 10s
const memoryUsage = this.metrics.resources.memoryUsage;
const latestMemory = memoryUsage[memoryUsage.length - 1];
const memoryScore = latestMemory
? Math.max(0, 1 - (latestMemory.heapUsed / 1000)) // Penalty after 1GB
: 1;
const healthScore = (successRate * 0.5) + (responseTimeScore * 0.3) + (memoryScore * 0.2);
return {
overall: Math.round(healthScore * 100),
components: {
successRate: Math.round(successRate * 100),
responseTime: Math.round(responseTimeScore * 100),
memory: Math.round(memoryScore * 100)
}
};
}
generateReport() {
const metrics = this.getMetrics();
const health = this.getHealthScore();
return {
summary: {
totalRequests: metrics.requests.total,
successRate: `${((metrics.requests.successful / metrics.requests.total) * 100).toFixed(2)}%`,
averageResponseTime: `${metrics.performance.averageResponseTime.toFixed(0)}ms`,
healthScore: `${health.overall}%`,
uptime: `${(metrics.uptime / 3600).toFixed(2)} hours`
},
performance: {
averageResponseTime: metrics.performance.averageResponseTime,
averageFirstTokenTime: metrics.performance.firstTokenTimes.length > 0
? metrics.performance.firstTokenTimes.reduce((a, b) => a + b, 0) / metrics.performance.firstTokenTimes.length
: 0,
averageTokensPerSecond: metrics.performance.tokensPerSecond.length > 0
? metrics.performance.tokensPerSecond.reduce((a, b) => a + b, 0) / metrics.performance.tokensPerSecond.length
: 0
},
resources: {
currentMemoryUsage: metrics.resources.memoryUsage[metrics.resources.memoryUsage.length - 1],
activeConnections: metrics.resources.activeConnections,
processCount: metrics.resources.processCount
},
providers: Object.fromEntries(metrics.requests.byProvider),
errors: {
total: metrics.errors.total,
byType: Object.fromEntries(metrics.errors.byType),
recent: metrics.errors.recent.slice(-10)
},
health
};
}
}
module.exports = MetricsCollector;
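A sketch of how the collector could be wired into the Express server above. This is a self-contained example rather than part of the SashaStudioServer class; the /api/metrics route and the onStreamComplete helper are illustrative names, and the metadata fields match what the streaming service emits on completion.
// Illustrative wiring of MetricsCollector into an Express app
const express = require('express');
const MetricsCollector = require('./utils/metrics');

const metrics = new MetricsCollector();
const app = express();

// Expose the aggregated report
app.get('/api/metrics', (req, res) => {
  res.json(metrics.generateReport());
});

// Example of recording a completed stream, using the metadata emitted by the streaming service
function onStreamComplete(provider, model, meta) {
  metrics.recordRequest(
    provider,
    model,
    meta.exitCode === 0,   // success flag
    meta.duration,         // total response time (ms)
    meta.firstTokenTime,   // time to first output (ms)
    meta.tokensPerSecond   // rough character-based throughput
  );
}

module.exports = { metrics, onStreamComplete };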
Deployment and Production Considerations
Production Docker Configuration
# Production Dockerfile
FROM node:24-slim AS base
# Install system dependencies including Python for potential native modules
RUN apt-get update && apt-get install -y \
ca-certificates \
curl \
gnupg \
python3 \
python3-pip \
make \
g++ \
git \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get clean
# Create a dedicated non-root user (uid/gid 1001, since 1000 is already taken by the base image's "node" user)
RUN groupadd --gid 1001 sasha \
&& useradd --uid 1001 --gid sasha --shell /bin/bash --create-home sasha
# Configure npm for non-root user
USER sasha
ENV NPM_CONFIG_PREFIX=/home/sasha/.npm-global
ENV PATH=$PATH:/home/sasha/.npm-global/bin
WORKDIR /home/sasha/app
# Install LLxprt Code
RUN npm install -g @vybestack/llxprt-code@latest
# Verify installation
RUN llxprt --version
FROM base AS production
# Copy package files
COPY --chown=sasha:sasha package*.json ./
# Install production dependencies
RUN npm ci --omit=dev --no-audit --no-fund
# Copy source code
COPY --chown=sasha:sasha . .
# Create necessary directories
RUN mkdir -p /home/sasha/.config/llxprt \
&& mkdir -p /home/sasha/logs \
&& mkdir -p /home/sasha/tmp
# Set environment variables
ENV NODE_ENV=production
ENV LOG_LEVEL=info
ENV MAX_CONCURRENT_STREAMS=10
ENV STREAM_TIMEOUT=300000
# Expose port
EXPOSE 3000
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD curl -f http://localhost:3000/health || exit 1
# Start application
CMD ["node", "server.js"]
Kubernetes Deployment
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sasha-studio
  labels:
    app: sasha-studio
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sasha-studio
  template:
    metadata:
      labels:
        app: sasha-studio
    spec:
      containers:
        - name: sasha-studio
          image: sasha-studio:latest
          ports:
            - containerPort: 3000
          env:
            - name: NODE_ENV
              value: "production"
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: ai-api-keys
                  key: openai
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: ai-api-keys
                  key: anthropic
            - name: GOOGLE_API_KEY
              valueFrom:
                secretKeyRef:
                  name: ai-api-keys
                  key: google
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          volumeMounts:
            - name: llxprt-config
              mountPath: /home/sasha/.config/llxprt
            - name: logs
              mountPath: /home/sasha/logs
      volumes:
        - name: llxprt-config
          configMap:
            name: llxprt-config
        - name: logs
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: sasha-studio-service
spec:
  selector:
    app: sasha-studio
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: LoadBalancer
Summary and Next Steps
Implementation Roadmap
Week 1: Core Setup
- Install and configure LLxprt Code in Docker
- Implement basic streaming service
- Test multi-provider configuration
Week 2: Integration
- Build WebSocket streaming system
- Implement provider management
- Add error handling and resilience
Week 3: Production Ready
- Add monitoring and metrics
- Implement resource management
- Security hardening and testing
Week 4: Optimization
- Performance tuning
- Load balancing
- Production deployment
Key Benefits of LLxprt Integration
- Multi-Provider Support: OpenAI, Anthropic, Google, OpenRouter, Fireworks, and local models behind a unified interface
- Streaming Native: Built-in real-time streaming capabilities
- Local Model Support: Ollama, LM Studio, llama.cpp integration
- Docker Ready: Existing containerization support
- Node.js Compatible: NPM package with CLI interface
- Active Development: Maintained by Vybestack team
- Configuration Management: Flexible provider and model switching
Success Metrics
- Performance: < 500ms first token, < 50ms subsequent tokens
- Reliability: 99.9% uptime, graceful fallback between providers
- Scalability: 100+ concurrent streams, < 1GB memory usage
- Security: Zero vulnerabilities, encrypted API key management
This integration guide covers the pieces needed to bring LLxprt Code into Sasha Studio with production-grade quality and performance.
With LLxprt Code as its foundation, Sasha Studio gains flexible, reliable, multi-provider AI integration.