# Deployment Infrastructure

## Overview

The J4F Assistant includes comprehensive deployment infrastructure supporting multiple environments, container orchestration, and enterprise-grade deployment patterns. The system supports Docker Compose for development and Kubernetes/Helm for production deployments.

## 🚀 Deployment Options

### 1. Docker Compose (Development/Testing)

**Quick Start:**

```bash
# Basic deployment
docker-compose up -d

# Production-like deployment
docker-compose -f docker-compose-prod.yml up -d
```

**Features:**

- Local development environment
- Integrated MongoDB and NATS services
- Environment-specific configurations
- Volume persistence for data
- Service discovery and networking

### 2. Kubernetes + Helm (Production)

**Quick Start:**

```bash
# Install with Helm
helm install j4f-assistant ./helm/j4f-assistant -n website --create-namespace

# Or use the deployment script
./helm/deploy.sh --environment prod --namespace production
```

**Features:**

- Production-ready Kubernetes deployment
- Scalable architecture with load balancing
- Persistent storage for data and models
- Integrated service mesh support
- Health checks and rolling updates

### 3. Manual Deployment

**For custom environments:**

```bash
# Install dependencies
npm install

# Configure environment
cp .env.example .env
# Edit .env with your configuration

# Start services
npm start
```

## 🏗️ Infrastructure Components

### Core Services

#### Application Stack

```yaml
services:
  j4f-assistant:
    image: j4f-assistant:latest
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=mongodb://mongo:27017/j4f_assistant
      - NATS_API_URL=nats://nats:4222
```

#### Data Services

```yaml
mongodb:
  image: mongo:7
  volumes:
    - mongodb_data:/data/db
  environment:
    - MONGO_INITDB_DATABASE=j4f_assistant

nats:
  image: nats:latest
  ports:
    - "4222:4222"
    - "8222:8222"
  command:
    - "--jetstream"
    - "--store_dir=/data"
```

#### Reverse Proxy (Production)

```yaml
traefik:
  image: traefik:v3.0
  ports:
    - "80:80"
    - "443:443"
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
```

## 🔧 Configuration Management

### Environment Variables

#### Core Application

```bash
# Application Settings
NODE_ENV=production
PORT=3000
LOG_LEVEL=info

# Database Configuration
DATABASE_URL=mongodb://mongo:27017/j4f_assistant
MONGODB_DB_NAME=j4f_assistant

# Message Queue
NATS_API_URL=nats://nats:4222
NATS_API_KEY=your-nats-token

# AI Provider Settings
AI_MODEL_BASE_URL=http://ollama:11434
AI_MODEL_NAME=llama2
ANTHROPIC_API_KEY=your-anthropic-key
OPENAI_API_KEY=your-openai-key

# Ollama Fallback Configuration (NEW)
AI_MODEL_FALLBACK_URLS=http://backup-ollama:11434,http://tertiary-ollama:11434
AI_MODEL_DISABLE_ON_FAILURE=true
AI_MODEL_CONNECTION_TIMEOUT=10000
```

#### Authentication (OIDC/Keycloak)

```bash
# OIDC Configuration
OIDC_ISSUER_BASE_URL=https://keycloak.example.com/realms/j4f
OIDC_BASE_URL=https://j4f-assistant.example.com
OIDC_CLIENT_ID=j4f-assistant
OIDC_CLIENT_SECRET=your-client-secret
OIDC_SECRET=your-session-secret
AUTH_ENABLED=true
```

#### Security & Monitoring

```bash
# Security Settings
SECURE_COMMAND_EXECUTION=true
AUDIT_LOG_ENABLED=true
AUDIT_LOG_PATH=/var/log/j4f/audit.log

# Monitoring
HEALTH_CHECK_ENABLED=true
METRICS_ENABLED=true
PROMETHEUS_PORT=9090
```
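These variables are read at process startup. The following is a minimal sketch of a startup configuration loader assuming a plain `process.env`-based approach; the `requireEnv` helper, the defaults, and the shape of the `config` object are illustrative, not the actual implementation:

```javascript
// config.js - illustrative configuration loader (not the actual module)
function requireEnv(name) {
  const value = process.env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

const config = {
  env: process.env.NODE_ENV || 'development',
  port: parseInt(process.env.PORT || '3000', 10),
  logLevel: process.env.LOG_LEVEL || 'info',
  databaseUrl: requireEnv('DATABASE_URL'),
  natsUrl: process.env.NATS_API_URL || 'nats://localhost:4222',
  ai: {
    baseUrl: process.env.AI_MODEL_BASE_URL,
    // AI_MODEL_FALLBACK_URLS is a comma-separated list (see the fallback section below)
    fallbackUrls: (process.env.AI_MODEL_FALLBACK_URLS || '')
      .split(',')
      .map((url) => url.trim())
      .filter(Boolean),
    disableOnFailure: process.env.AI_MODEL_DISABLE_ON_FAILURE === 'true',
    connectionTimeoutMs: parseInt(process.env.AI_MODEL_CONNECTION_TIMEOUT || '10000', 10),
  },
};

module.exports = config;
```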
### Helm Chart Configuration

#### Basic Values (`values.yaml`)

```yaml
replicaCount: 2

image:
  repository: j4f-assistant
  tag: latest
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 3000

ingress:
  enabled: true
  host: j4f-assistant.example.com
  tls:
    enabled: true
    secretName: j4f-tls

persistence:
  enabled: true
  size: 10Gi
  storageClass: fast-ssd

mongodb:
  enabled: true
  persistence:
    size: 50Gi

nats:
  enabled: true
  jetstream:
    enabled: true
```

#### Production Values (`values-prod.yaml`)

```yaml
replicaCount: 3

resources:
  limits:
    cpu: 2000m
    memory: 4Gi
  requests:
    cpu: 500m
    memory: 1Gi

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70

monitoring:
  enabled: true
  serviceMonitor:
    enabled: true

security:
  podSecurityPolicy:
    enabled: true
  networkPolicy:
    enabled: true
```

## 🌍 Environment-Specific Deployments

### Development Environment

```bash
# Local development with hot reloading
docker-compose -f docker-compose.yml up -d

# Features:
# - File watching and auto-restart
# - Debug logging enabled
# - Development databases
# - Exposed debugging ports
```

### Staging Environment

```bash
# Staging deployment
./helm/deploy.sh --environment staging --namespace staging

# Features:
# - Production-like configuration
# - Test data isolation
# - Performance monitoring
# - Security testing enabled
```

### Production Environment

```bash
# Production deployment
./helm/deploy.sh --environment prod --namespace production --create-namespace

# Features:
# - High availability (3+ replicas)
# - Load balancing and auto-scaling
# - Persistent storage
# - Security hardening
# - Full monitoring and alerting
```

## 🔍 Health Checks & Monitoring

### Application Health Checks

```javascript
// Built-in health check endpoint
app.get('/health', (req, res) => {
  res.json({
    status: 'healthy',
    timestamp: new Date().toISOString(),
    services: {
      database: 'connected',
      nats: 'connected',
      ai_providers: 'available'
    }
  });
});
```

### Kubernetes Probes

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 5
```

### Prometheus Metrics

```javascript
// Custom metrics for monitoring
const promClient = require('prom-client');

const httpDuration = new promClient.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status']
});

const pipelineLatency = new promClient.Histogram({
  name: 'pipeline_processing_duration_seconds',
  help: 'Duration of pipeline processing in seconds',
  labelNames: ['pipeline_type', 'step']
});
```
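Defining histograms is only half the job; Prometheus also needs an HTTP endpoint to scrape. A minimal sketch using `prom-client`'s default registry, assuming an Express app as in the health-check example above; serving metrics on a dedicated `PROMETHEUS_PORT` listener is an assumption, not necessarily how the application wires it:

```javascript
const express = require('express');
const promClient = require('prom-client');

// Also collect default process metrics (CPU, memory, event loop lag, ...)
promClient.collectDefaultMetrics();

const metricsApp = express();

// Expose the default registry so Prometheus can scrape it
metricsApp.get('/metrics', async (req, res) => {
  res.set('Content-Type', promClient.register.contentType);
  res.end(await promClient.register.metrics());
});

metricsApp.listen(process.env.PROMETHEUS_PORT || 9090);
```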
## 🔒 Security Configuration

### TLS/SSL Configuration

```yaml
# Helm chart TLS configuration
ingress:
  tls:
    enabled: true
    secretName: j4f-tls
  hosts:
    - j4f-assistant.example.com
  # Certificate management with cert-manager
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
```

### Network Policies

```yaml
# Kubernetes network policy for security isolation
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: j4f-assistant-netpol
spec:
  podSelector:
    matchLabels:
      app: j4f-assistant
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: traefik
      ports:
        - protocol: TCP
          port: 3000
```

### Security Context

```yaml
# Pod security context
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 2000
  seccompProfile:
    type: RuntimeDefault

# Container security context
containerSecurityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
  readOnlyRootFilesystem: true
```

## 📊 Scaling & Performance

### Horizontal Pod Autoscaling

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: j4f-assistant-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: j4f-assistant
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

### Resource Management

```yaml
resources:
  requests:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: 2000m
    memory: 4Gi

# Node.js heap tuning (keep the heap below the container memory limit)
env:
  - name: NODE_OPTIONS
    value: "--max-old-space-size=3072"
```

## 🛠️ Deployment Scripts

### Automated Deployment Script

```bash
#!/bin/bash
# helm/deploy.sh

set -euo pipefail

ENVIRONMENT=dev
NAMESPACE=website
CREATE_NAMESPACE=false

# Parse flags so the invocation matches the documented usage above
while [[ $# -gt 0 ]]; do
  case "$1" in
    --environment) ENVIRONMENT="$2"; shift 2 ;;
    --namespace) NAMESPACE="$2"; shift 2 ;;
    --create-namespace) CREATE_NAMESPACE=true; shift ;;
    *) echo "Unknown option: $1" >&2; exit 1 ;;
  esac
done

echo "Deploying J4F Assistant to $ENVIRONMENT environment..."

# Create namespace if needed
if [ "$CREATE_NAMESPACE" = "true" ]; then
  kubectl create namespace "$NAMESPACE" --dry-run=client -o yaml | kubectl apply -f -
fi

# Deploy with environment-specific values
helm upgrade --install j4f-assistant ./j4f-assistant \
  --namespace "$NAMESPACE" \
  --values "values-$ENVIRONMENT.yaml" \
  --timeout 10m \
  --wait

echo "Deployment complete!"
echo "Access the application at: https://j4f-assistant-$ENVIRONMENT.example.com"
```

### Health Check Script

```bash
#!/bin/bash
# scripts/health-check.sh

APP_URL=${1:-http://localhost:3000}

echo "Checking application health..."

# Check main application
curl -f $APP_URL/health || exit 1

# Check specific services
curl -f $APP_URL/api/models || exit 1
curl -f $APP_URL/api/conversations || exit 1

echo "All health checks passed!"
```

## 🔄 CI/CD Integration

### GitLab CI Pipeline

```yaml
stages:
  - build
  - test
  - deploy-staging
  - deploy-production

build:
  stage: build
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

deploy-staging:
  stage: deploy-staging
  script:
    - helm upgrade --install j4f-assistant-staging ./helm/j4f-assistant --namespace staging --set image.tag=$CI_COMMIT_SHA --values values-staging.yaml
  environment:
    name: staging
    url: https://j4f-assistant-staging.example.com

deploy-production:
  stage: deploy-production
  script:
    - helm upgrade --install j4f-assistant ./helm/j4f-assistant --namespace production --set image.tag=$CI_COMMIT_SHA --values values-prod.yaml
  environment:
    name: production
    url: https://j4f-assistant.example.com
  when: manual
  only:
    - main
```

## 📋 Troubleshooting

### Common Issues

#### Application Won't Start

```bash
# Check logs
kubectl logs -l app=j4f-assistant -n website

# Check configuration
kubectl describe deployment j4f-assistant -n website

# Verify environment variables
kubectl get configmap j4f-config -o yaml
```

#### Database Connection Issues

```bash
# Test MongoDB connectivity
kubectl exec -it deployment/j4f-assistant -- nc -zv mongodb 27017

# Check MongoDB logs
kubectl logs deployment/mongodb -n website
```

#### NATS Connection Problems

```bash
# Check NATS server status via the HTTP monitoring port
kubectl port-forward deployment/nats 8222:8222 &
curl http://localhost:8222/varz

# Test NATS connectivity
kubectl exec -it deployment/j4f-assistant -- nc -zv nats 4222
```

### Performance Issues

```bash
# Check resource usage
kubectl top pods -n website

# Monitor pipeline performance
kubectl logs -l app=j4f-assistant -n website | grep "Pipeline"

# Check database performance
kubectl exec -it deployment/mongodb -- mongosh --eval "db.stats()"
```

## Related Documentation

- [Architecture Overview](../structure/Architecture.md)
- [Pipeline System](../pipeline/Overview.md)
- [Authentication Setup](../auth/Overview.md)
- [Security Configuration](../security/CommandExecution.md)

## 🔄 AI Provider Fallback System

### Robust Ollama Connection Management

The J4F Assistant now includes a sophisticated fallback system for AI providers that ensures continuous operation even when primary services are unavailable.

#### Fallback Strategy

1. **Primary Ollama Connection** - Attempts connection to the main Ollama endpoint
2. **Fallback Ollama Endpoints** - Tries backup Ollama servers in sequence
3. **External Providers** - Falls back to OpenAI/Anthropic if configured
4. **Graceful Degradation** - Continues without AI if all providers fail (optional)
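In code, this strategy amounts to walking an ordered endpoint list before giving up. A minimal sketch; `connectToOllama` and `createExternalProvider` are hypothetical helpers standing in for the real connection logic, not the actual module:

```javascript
// Illustrative fallback walk - names and structure are assumptions, not the real module
async function connectWithFallback() {
  const timeoutMs = parseInt(process.env.AI_MODEL_CONNECTION_TIMEOUT || '10000', 10);
  const endpoints = [
    process.env.AI_MODEL_BASE_URL,
    ...(process.env.AI_MODEL_FALLBACK_URLS || '').split(',').map((s) => s.trim()),
  ].filter(Boolean);

  // Steps 1-2: try the primary endpoint first, then each fallback in sequence
  for (const url of endpoints) {
    try {
      return await connectToOllama(url, timeoutMs); // hypothetical helper
    } catch (err) {
      console.warn(`Failed to connect to Ollama at ${url}: ${err.message}`);
    }
  }

  // All Ollama endpoints failed: only continue when explicitly allowed
  if (process.env.AI_MODEL_DISABLE_ON_FAILURE !== 'true') {
    throw new Error('No Ollama endpoint reachable and AI_MODEL_DISABLE_ON_FAILURE is false');
  }

  // Step 3: external providers, if API keys are configured
  if (process.env.OPENAI_API_KEY || process.env.ANTHROPIC_API_KEY) {
    return createExternalProvider(); // hypothetical helper
  }

  // Step 4: graceful degradation - run without AI support
  console.warn('All AI providers unavailable; continuing without AI support.');
  return null;
}
```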
#### Configuration Options

```bash
# Primary Ollama endpoint
AI_MODEL_BASE_URL=http://localhost:11434

# Comma-separated list of fallback Ollama endpoints
AI_MODEL_FALLBACK_URLS=http://backup-ollama:11434,http://tertiary-ollama:11434

# Whether to disable AI entirely if all Ollama endpoints fail
# true = continue with external providers only
# false = crash application if no Ollama available
AI_MODEL_DISABLE_ON_FAILURE=true

# Connection timeout per endpoint (milliseconds)
AI_MODEL_CONNECTION_TIMEOUT=10000

# External provider fallback configuration
OPENAI_API_KEY=your-openai-key          # Optional fallback
ANTHROPIC_API_KEY=your-anthropic-key    # Optional fallback
```

#### Production Deployment Scenarios

##### High Availability Ollama Setup

```yaml
# docker-compose.yml - Multiple Ollama instances
services:
  ollama-primary:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama_primary_data:/root/.ollama

  ollama-backup:
    image: ollama/ollama:latest
    ports:
      - "11435:11434"
    volumes:
      - ollama_backup_data:/root/.ollama

  j4f-assistant:
    image: j4f-assistant:latest
    environment:
      - AI_MODEL_BASE_URL=http://ollama-primary:11434
      - AI_MODEL_FALLBACK_URLS=http://ollama-backup:11434
      - AI_MODEL_DISABLE_ON_FAILURE=true
      - OPENAI_API_KEY=${OPENAI_API_KEY}
```

##### Cloud-Hybrid Setup

```bash
# Use local Ollama with cloud fallback
AI_MODEL_BASE_URL=http://localhost:11434
AI_MODEL_FALLBACK_URLS=http://backup-server:11434
AI_MODEL_DISABLE_ON_FAILURE=true
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
```

##### Pure Cloud Setup

```bash
# External providers only (no Ollama)
AI_MODEL_DISABLE_ON_FAILURE=true
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
# Leave AI_MODEL_BASE_URL unset or set to an unreachable endpoint
```

#### Health Monitoring

The application provides several endpoints for monitoring AI provider health:

- **`GET /api/health`** - Overall system health
- **`GET /api/health/connections`** - Detailed connection status
- **`GET /api/health/providers`** - Provider-specific status
- **`POST /api/health/reconnect`** - Manual reconnection trigger
- **`GET /api/ping`** - Lightweight health check

#### Frontend Status Display

The web interface now includes a real-time connection status indicator showing:

- Current active AI provider (Ollama, OpenAI, Anthropic)
- Connection health status
- Fallback provider availability
- Manual reconnection controls
- Detailed error information
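A status indicator like this can be driven by polling the health endpoints listed above. A minimal browser-side sketch; the `healthy` and `provider` response fields and the `ai-status` element are assumptions about the payload and markup, not a documented contract:

```javascript
// Poll the connection-status endpoint and update a (hypothetical) status badge
async function refreshAiStatus() {
  const badge = document.getElementById('ai-status');
  try {
    const res = await fetch('/api/health/connections');
    const status = await res.json();
    // `healthy` and `provider` are assumed response fields
    badge.textContent = status.healthy
      ? `Connected (${status.provider})`
      : 'Disconnected';
    badge.className = status.healthy ? 'status-ok' : 'status-error';
  } catch {
    badge.textContent = 'Status unavailable';
    badge.className = 'status-error';
  }
}

refreshAiStatus();
setInterval(refreshAiStatus, 15000); // refresh every 15 seconds
```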
#### Kubernetes Health Checks

```yaml
# Deployment with health checks
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: j4f-assistant
          livenessProbe:
            httpGet:
              path: /api/ping
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
```

### Troubleshooting Connection Issues

#### Common Configuration Errors

1. **Invalid Fallback URLs**

   ```bash
   # ❌ Wrong - URLs not properly formatted
   AI_MODEL_FALLBACK_URLS=backup-ollama:11434,localhost:11435

   # ✅ Correct - Full HTTP URLs
   AI_MODEL_FALLBACK_URLS=http://backup-ollama:11434,http://localhost:11435
   ```

2. **Timeout Configuration**

   ```bash
   # ❌ Too short for model loading
   AI_MODEL_CONNECTION_TIMEOUT=1000

   # ✅ Reasonable timeout
   AI_MODEL_CONNECTION_TIMEOUT=10000
   ```

3. **Missing External Keys**

   ```bash
   # ❌ Fallback enabled but no keys provided
   AI_MODEL_DISABLE_ON_FAILURE=true
   # Missing OPENAI_API_KEY and ANTHROPIC_API_KEY

   # ✅ Proper fallback configuration
   AI_MODEL_DISABLE_ON_FAILURE=true
   OPENAI_API_KEY=sk-...
   ANTHROPIC_API_KEY=sk-ant-...
   ```

#### Debugging Connection Issues

```bash
# Check connection status
curl http://localhost:3000/api/health/connections

# Attempt manual reconnection
curl -X POST http://localhost:3000/api/health/reconnect

# Check overall system health
curl http://localhost:3000/api/health
```

#### Log Analysis

Look for these log patterns:

```bash
# Successful fallback
INFO OllamaConnectionManager: Failed to connect to Ollama at http://primary:11434: Connection refused
INFO OllamaConnectionManager: Successfully connected to Ollama at http://backup:11434

# External provider fallback
WARN ConnectionManager: All Ollama endpoints failed. Disabling Ollama support - continuing with external providers only.
INFO ExternalProviderManager: Generating response using OpenAI model: gpt-3.5-turbo

# Complete failure with graceful degradation
ERROR ConnectionManager: All AI providers failed and fallback disabled
```
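The curl checks above are easy to automate. A minimal watchdog sketch, assuming Node 18+ (for the global `fetch`); the `healthy` field is an assumed shape of the connection-status payload:

```javascript
// watchdog.js - poll connection status and trigger a reconnect when unhealthy
const BASE_URL = process.env.APP_URL || 'http://localhost:3000';

async function checkAndReconnect() {
  try {
    const res = await fetch(`${BASE_URL}/api/health/connections`);
    const status = await res.json();
    if (!status.healthy) { // assumed response field
      console.warn('AI connection unhealthy - triggering manual reconnect...');
      await fetch(`${BASE_URL}/api/health/reconnect`, { method: 'POST' });
    }
  } catch (err) {
    console.error(`Health check failed: ${err.message}`);
  }
}

checkAndReconnect();
setInterval(checkAndReconnect, 30000); // check every 30 seconds
```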