# Deployment Infrastructure

## Overview

The J4F Assistant includes comprehensive deployment infrastructure supporting multiple environments, container orchestration, and enterprise-grade deployment patterns. The system supports Docker Compose for development and Kubernetes/Helm for production deployments.

## 🚀 Deployment Options

### 1. Docker Compose (Development/Testing)

**Quick Start:**

```bash
# Basic deployment
docker-compose up -d

# Production-like deployment
docker-compose -f docker-compose-prod.yml up -d
```

**Features:**

- Local development environment
- Integrated MongoDB and NATS services
- Environment-specific configurations
- Volume persistence for data
- Service discovery and networking

### 2. Kubernetes + Helm (Production)

**Quick Start:**

```bash
# Install with Helm
helm install j4f-assistant ./helm/j4f-assistant -n website --create-namespace

# Or use the deployment script
./helm/deploy.sh --environment prod --namespace production
```

**Features:**

- Production-ready Kubernetes deployment
- Scalable architecture with load balancing
- Persistent storage for data and models
- Integrated service mesh support
- Health checks and rolling updates

### 3. Manual Deployment

**For custom environments:**

```bash
# Install dependencies
npm install

# Configure environment
cp .env.example .env
# Edit .env with your configuration

# Start services
npm start
```

## 🏗️ Infrastructure Components

### Core Services

#### Application Stack

```yaml
services:
  j4f-assistant:
    image: j4f-assistant:latest
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=mongodb://mongo:27017/j4f_assistant
      - NATS_API_URL=nats://nats:4222
```

#### Data Services

```yaml
mongodb:
  image: mongo:7
  volumes:
    - mongodb_data:/data/db
  environment:
    - MONGO_INITDB_DATABASE=j4f_assistant

nats:
  image: nats:latest
  ports:
    - "4222:4222"
    - "8222:8222"
  command:
    - "--jetstream"
    - "--store_dir=/data"
```

#### Reverse Proxy (Production)

```yaml
traefik:
  image: traefik:v3.0
  ports:
    - "80:80"
    - "443:443"
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
```

## 🔧 Configuration Management

### Environment Variables

#### Core Application

```bash
# Application Settings
NODE_ENV=production
PORT=3000
LOG_LEVEL=info

# Database Configuration
DATABASE_URL=mongodb://mongo:27017/j4f_assistant
MONGODB_DB_NAME=j4f_assistant

# Message Queue
NATS_API_URL=nats://nats:4222
NATS_API_KEY=your-nats-token

# AI Provider Settings
AI_MODEL_BASE_URL=http://ollama:11434
AI_MODEL_NAME=llama2
ANTHROPIC_API_KEY=your-anthropic-key
OPENAI_API_KEY=your-openai-key

# Ollama Fallback Configuration (NEW)
AI_MODEL_FALLBACK_URLS=http://backup-ollama:11434,http://tertiary-ollama:11434
AI_MODEL_DISABLE_ON_FAILURE=true
AI_MODEL_CONNECTION_TIMEOUT=10000
```

#### Authentication (OIDC/Keycloak)

```bash
# OIDC Configuration
OIDC_ISSUER_BASE_URL=https://keycloak.example.com/realms/j4f
OIDC_BASE_URL=https://j4f-assistant.example.com
OIDC_CLIENT_ID=j4f-assistant
OIDC_CLIENT_SECRET=your-client-secret
OIDC_SECRET=your-session-secret
AUTH_ENABLED=true
```

#### Security & Monitoring

```bash
# Security Settings
SECURE_COMMAND_EXECUTION=true
AUDIT_LOG_ENABLED=true
AUDIT_LOG_PATH=/var/log/j4f/audit.log

# Monitoring
HEALTH_CHECK_ENABLED=true
METRICS_ENABLED=true
PROMETHEUS_PORT=9090
```
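These variables are read at process startup. The following is a minimal sketch of a startup configuration loader assuming a plain `process.env`-based approach; the `requireEnv` helper, the defaults, and the shape of the `config` object are illustrative, not the actual implementation:

```javascript
// config.js - illustrative configuration loader (not the actual module)
function requireEnv(name) {
  const value = process.env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

const config = {
  env: process.env.NODE_ENV || 'development',
  port: parseInt(process.env.PORT || '3000', 10),
  logLevel: process.env.LOG_LEVEL || 'info',
  databaseUrl: requireEnv('DATABASE_URL'),
  natsUrl: process.env.NATS_API_URL || 'nats://localhost:4222',
  ai: {
    baseUrl: process.env.AI_MODEL_BASE_URL,
    // AI_MODEL_FALLBACK_URLS is a comma-separated list (see the fallback section below)
    fallbackUrls: (process.env.AI_MODEL_FALLBACK_URLS || '')
      .split(',')
      .map((url) => url.trim())
      .filter(Boolean),
    disableOnFailure: process.env.AI_MODEL_DISABLE_ON_FAILURE === 'true',
    connectionTimeoutMs: parseInt(process.env.AI_MODEL_CONNECTION_TIMEOUT || '10000', 10),
  },
};

module.exports = config;
```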
### Helm Chart Configuration

#### Basic Values (`values.yaml`)

```yaml
replicaCount: 2

image:
  repository: j4f-assistant
  tag: latest
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 3000

ingress:
  enabled: true
  host: j4f-assistant.example.com
  tls:
    enabled: true
    secretName: j4f-tls

persistence:
  enabled: true
  size: 10Gi
  storageClass: fast-ssd

mongodb:
  enabled: true
  persistence:
    size: 50Gi

nats:
  enabled: true
  jetstream:
    enabled: true
```

#### Production Values (`values-prod.yaml`)

```yaml
replicaCount: 3

resources:
  limits:
    cpu: 2000m
    memory: 4Gi
  requests:
    cpu: 500m
    memory: 1Gi

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70

monitoring:
  enabled: true
  serviceMonitor:
    enabled: true

security:
  podSecurityPolicy:
    enabled: true
  networkPolicy:
    enabled: true
```

## 🌍 Environment-Specific Deployments

### Development Environment

```bash
# Local development with hot reloading
docker-compose -f docker-compose.yml up -d

# Features:
# - File watching and auto-restart
# - Debug logging enabled
# - Development databases
# - Exposed debugging ports
```

### Staging Environment

```bash
# Staging deployment
./helm/deploy.sh --environment staging --namespace staging

# Features:
# - Production-like configuration
# - Test data isolation
# - Performance monitoring
# - Security testing enabled
```

### Production Environment

```bash
# Production deployment
./helm/deploy.sh --environment prod --namespace production --create-namespace

# Features:
# - High availability (3+ replicas)
# - Load balancing and auto-scaling
# - Persistent storage
# - Security hardening
# - Full monitoring and alerting
```

## 🔍 Health Checks & Monitoring

### Application Health Checks

```javascript
// Built-in health check endpoint
app.get('/health', (req, res) => {
  res.json({
    status: 'healthy',
    timestamp: new Date().toISOString(),
    services: {
      database: 'connected',
      nats: 'connected',
      ai_providers: 'available'
    }
  });
});
```

### Kubernetes Probes

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 5
```

### Prometheus Metrics

```javascript
// Custom metrics for monitoring
const promClient = require('prom-client');

const httpDuration = new promClient.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status']
});

const pipelineLatency = new promClient.Histogram({
  name: 'pipeline_processing_duration_seconds',
  help: 'Duration of pipeline processing in seconds',
  labelNames: ['pipeline_type', 'step']
});
```
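Defining histograms is only half the job; Prometheus also needs an HTTP endpoint to scrape. A minimal sketch using `prom-client`'s default registry, assuming an Express app as in the health-check example above; serving metrics on a dedicated `PROMETHEUS_PORT` listener is an assumption, not necessarily how the application wires it:

```javascript
const express = require('express');
const promClient = require('prom-client');

// Also collect default process metrics (CPU, memory, event loop lag, ...)
promClient.collectDefaultMetrics();

const metricsApp = express();

// Expose the default registry so Prometheus can scrape it
metricsApp.get('/metrics', async (req, res) => {
  res.set('Content-Type', promClient.register.contentType);
  res.end(await promClient.register.metrics());
});

metricsApp.listen(process.env.PROMETHEUS_PORT || 9090);
```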
## 🔒 Security Configuration

### TLS/SSL Configuration

```yaml
# Helm chart TLS configuration
ingress:
  tls:
    enabled: true
    secretName: j4f-tls
  hosts:
    - j4f-assistant.example.com
  # Certificate management with cert-manager
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
```

### Network Policies

```yaml
# Kubernetes network policy for security isolation
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: j4f-assistant-netpol
spec:
  podSelector:
    matchLabels:
      app: j4f-assistant
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: traefik
      ports:
        - protocol: TCP
          port: 3000
```

### Security Context

```yaml
# Pod security context
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 2000
  seccompProfile:
    type: RuntimeDefault

# Container security context
containerSecurityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
  readOnlyRootFilesystem: true
```

## 📊 Scaling & Performance

### Horizontal Pod Autoscaling

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: j4f-assistant-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: j4f-assistant
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

### Resource Management

```yaml
resources:
  requests:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: 2000m
    memory: 4Gi

# Node.js heap tuning (keep the heap below the container memory limit)
env:
  - name: NODE_OPTIONS
    value: "--max-old-space-size=3072"
```

## 🛠️ Deployment Scripts

### Automated Deployment Script

```bash
#!/bin/bash
# helm/deploy.sh

set -euo pipefail

ENVIRONMENT=dev
NAMESPACE=website
CREATE_NAMESPACE=false

# Parse flags so the invocation matches the documented usage above
while [[ $# -gt 0 ]]; do
  case "$1" in
    --environment) ENVIRONMENT="$2"; shift 2 ;;
    --namespace) NAMESPACE="$2"; shift 2 ;;
    --create-namespace) CREATE_NAMESPACE=true; shift ;;
    *) echo "Unknown option: $1" >&2; exit 1 ;;
  esac
done

echo "Deploying J4F Assistant to $ENVIRONMENT environment..."

# Create namespace if needed
if [ "$CREATE_NAMESPACE" = "true" ]; then
  kubectl create namespace "$NAMESPACE" --dry-run=client -o yaml | kubectl apply -f -
fi

# Deploy with environment-specific values
helm upgrade --install j4f-assistant ./j4f-assistant \
  --namespace "$NAMESPACE" \
  --values "values-$ENVIRONMENT.yaml" \
  --timeout 10m \
  --wait

echo "Deployment complete!"
echo "Access the application at: https://j4f-assistant-$ENVIRONMENT.example.com"
```

### Health Check Script

```bash
#!/bin/bash
# scripts/health-check.sh

APP_URL=${1:-http://localhost:3000}

echo "Checking application health..."

# Check main application
curl -f $APP_URL/health || exit 1

# Check specific services
curl -f $APP_URL/api/models || exit 1
curl -f $APP_URL/api/conversations || exit 1

echo "All health checks passed!"
```

## 🔄 CI/CD Integration

### GitLab CI Pipeline

```yaml
stages:
  - build
  - test
  - deploy-staging
  - deploy-production

build:
  stage: build
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

deploy-staging:
  stage: deploy-staging
  script:
    - helm upgrade --install j4f-assistant-staging ./helm/j4f-assistant --namespace staging --set image.tag=$CI_COMMIT_SHA --values values-staging.yaml
  environment:
    name: staging
    url: https://j4f-assistant-staging.example.com

deploy-production:
  stage: deploy-production
  script:
    - helm upgrade --install j4f-assistant ./helm/j4f-assistant --namespace production --set image.tag=$CI_COMMIT_SHA --values values-prod.yaml
  environment:
    name: production
    url: https://j4f-assistant.example.com
  when: manual
  only:
    - main
```

## 📋 Troubleshooting

### Common Issues

#### Application Won't Start

```bash
# Check logs
kubectl logs -l app=j4f-assistant -n website

# Check configuration
kubectl describe deployment j4f-assistant -n website

# Verify environment variables
kubectl get configmap j4f-config -o yaml
```

#### Database Connection Issues

```bash
# Test MongoDB connectivity
kubectl exec -it deployment/j4f-assistant -- nc -zv mongodb 27017

# Check MongoDB logs
kubectl logs deployment/mongodb -n website
```

#### NATS Connection Problems

```bash
# Check NATS server status via the HTTP monitoring port
kubectl port-forward deployment/nats 8222:8222 &
curl http://localhost:8222/varz

# Test NATS connectivity
kubectl exec -it deployment/j4f-assistant -- nc -zv nats 4222
```

### Performance Issues

```bash
# Check resource usage
kubectl top pods -n website

# Monitor pipeline performance
kubectl logs -l app=j4f-assistant -n website | grep "Pipeline"

# Check database performance
kubectl exec -it deployment/mongodb -- mongosh --eval "db.stats()"
```

## Related Documentation

- [Architecture Overview](../structure/Architecture.md)
- [Pipeline System](../pipeline/Overview.md)
- [Authentication Setup](../auth/Overview.md)
- [Security Configuration](../security/CommandExecution.md)

## 🔄 AI Provider Fallback System

### Robust Ollama Connection Management

The J4F Assistant now includes a sophisticated fallback system for AI providers that ensures continuous operation even when primary services are unavailable.

#### Fallback Strategy

1. **Primary Ollama Connection** - Attempts connection to the main Ollama endpoint
2. **Fallback Ollama Endpoints** - Tries backup Ollama servers in sequence
3. **External Providers** - Falls back to OpenAI/Anthropic if configured
4. **Graceful Degradation** - Continues without AI if all providers fail (optional)
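In code, this strategy amounts to walking an ordered endpoint list before giving up. A minimal sketch; `connectToOllama` and `createExternalProvider` are hypothetical helpers standing in for the real connection logic, not the actual module:

```javascript
// Illustrative fallback walk - names and structure are assumptions, not the real module
async function connectWithFallback() {
  const timeoutMs = parseInt(process.env.AI_MODEL_CONNECTION_TIMEOUT || '10000', 10);
  const endpoints = [
    process.env.AI_MODEL_BASE_URL,
    ...(process.env.AI_MODEL_FALLBACK_URLS || '').split(',').map((s) => s.trim()),
  ].filter(Boolean);

  // Steps 1-2: try the primary endpoint first, then each fallback in sequence
  for (const url of endpoints) {
    try {
      return await connectToOllama(url, timeoutMs); // hypothetical helper
    } catch (err) {
      console.warn(`Failed to connect to Ollama at ${url}: ${err.message}`);
    }
  }

  // All Ollama endpoints failed: only continue when explicitly allowed
  if (process.env.AI_MODEL_DISABLE_ON_FAILURE !== 'true') {
    throw new Error('No Ollama endpoint reachable and AI_MODEL_DISABLE_ON_FAILURE is false');
  }

  // Step 3: external providers, if API keys are configured
  if (process.env.OPENAI_API_KEY || process.env.ANTHROPIC_API_KEY) {
    return createExternalProvider(); // hypothetical helper
  }

  // Step 4: graceful degradation - run without AI support
  console.warn('All AI providers unavailable; continuing without AI support.');
  return null;
}
```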
#### Configuration Options

```bash
# Primary Ollama endpoint
AI_MODEL_BASE_URL=http://localhost:11434

# Comma-separated list of fallback Ollama endpoints
AI_MODEL_FALLBACK_URLS=http://backup-ollama:11434,http://tertiary-ollama:11434

# Whether to disable AI entirely if all Ollama endpoints fail
# true = continue with external providers only
# false = crash application if no Ollama available
AI_MODEL_DISABLE_ON_FAILURE=true

# Connection timeout per endpoint (milliseconds)
AI_MODEL_CONNECTION_TIMEOUT=10000

# External provider fallback configuration
OPENAI_API_KEY=your-openai-key          # Optional fallback
ANTHROPIC_API_KEY=your-anthropic-key    # Optional fallback
```

#### Production Deployment Scenarios

##### High Availability Ollama Setup

```yaml
# docker-compose.yml - Multiple Ollama instances
services:
  ollama-primary:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama_primary_data:/root/.ollama

  ollama-backup:
    image: ollama/ollama:latest
    ports:
      - "11435:11434"
    volumes:
      - ollama_backup_data:/root/.ollama

  j4f-assistant:
    image: j4f-assistant:latest
    environment:
      - AI_MODEL_BASE_URL=http://ollama-primary:11434
      - AI_MODEL_FALLBACK_URLS=http://ollama-backup:11434
      - AI_MODEL_DISABLE_ON_FAILURE=true
      - OPENAI_API_KEY=${OPENAI_API_KEY}
```

##### Cloud-Hybrid Setup

```bash
# Use local Ollama with cloud fallback
AI_MODEL_BASE_URL=http://localhost:11434
AI_MODEL_FALLBACK_URLS=http://backup-server:11434
AI_MODEL_DISABLE_ON_FAILURE=true
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
```

##### Pure Cloud Setup

```bash
# External providers only (no Ollama)
AI_MODEL_DISABLE_ON_FAILURE=true
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
# Leave AI_MODEL_BASE_URL unset or set to an unreachable endpoint
```

#### Health Monitoring

The application provides several endpoints for monitoring AI provider health:

- **`GET /api/health`** - Overall system health
- **`GET /api/health/connections`** - Detailed connection status
- **`GET /api/health/providers`** - Provider-specific status
- **`POST /api/health/reconnect`** - Manual reconnection trigger
- **`GET /api/ping`** - Lightweight health check

#### Frontend Status Display

The web interface now includes a real-time connection status indicator showing:

- Current active AI provider (Ollama, OpenAI, Anthropic)
- Connection health status
- Fallback provider availability
- Manual reconnection controls
- Detailed error information
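A status indicator like this can be driven by polling the health endpoints listed above. A minimal browser-side sketch; the `healthy` and `provider` response fields and the `ai-status` element are assumptions about the payload and markup, not a documented contract:

```javascript
// Poll the connection-status endpoint and update a (hypothetical) status badge
async function refreshAiStatus() {
  const badge = document.getElementById('ai-status');
  try {
    const res = await fetch('/api/health/connections');
    const status = await res.json();
    // `healthy` and `provider` are assumed response fields
    badge.textContent = status.healthy
      ? `Connected (${status.provider})`
      : 'Disconnected';
    badge.className = status.healthy ? 'status-ok' : 'status-error';
  } catch {
    badge.textContent = 'Status unavailable';
    badge.className = 'status-error';
  }
}

refreshAiStatus();
setInterval(refreshAiStatus, 15000); // refresh every 15 seconds
```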
#### Kubernetes Health Checks

```yaml
# Deployment with health checks
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: j4f-assistant
          livenessProbe:
            httpGet:
              path: /api/ping
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
```

### Troubleshooting Connection Issues

#### Common Configuration Errors

1. **Invalid Fallback URLs**

   ```bash
   # ❌ Wrong - URLs not properly formatted
   AI_MODEL_FALLBACK_URLS=backup-ollama:11434,localhost:11435

   # ✅ Correct - Full HTTP URLs
   AI_MODEL_FALLBACK_URLS=http://backup-ollama:11434,http://localhost:11435
   ```

2. **Timeout Configuration**

   ```bash
   # ❌ Too short for model loading
   AI_MODEL_CONNECTION_TIMEOUT=1000

   # ✅ Reasonable timeout
   AI_MODEL_CONNECTION_TIMEOUT=10000
   ```

3. **Missing External Keys**

   ```bash
   # ❌ Fallback enabled but no keys provided
   AI_MODEL_DISABLE_ON_FAILURE=true
   # Missing OPENAI_API_KEY and ANTHROPIC_API_KEY

   # ✅ Proper fallback configuration
   AI_MODEL_DISABLE_ON_FAILURE=true
   OPENAI_API_KEY=sk-...
   ANTHROPIC_API_KEY=sk-ant-...
   ```

#### Debugging Connection Issues

```bash
# Check connection status
curl http://localhost:3000/api/health/connections

# Attempt manual reconnection
curl -X POST http://localhost:3000/api/health/reconnect

# Check overall system health
curl http://localhost:3000/api/health
```

#### Log Analysis

Look for these log patterns:

```bash
# Successful fallback
INFO OllamaConnectionManager: Failed to connect to Ollama at http://primary:11434: Connection refused
INFO OllamaConnectionManager: Successfully connected to Ollama at http://backup:11434

# External provider fallback
WARN ConnectionManager: All Ollama endpoints failed. Disabling Ollama support - continuing with external providers only.
INFO ExternalProviderManager: Generating response using OpenAI model: gpt-3.5-turbo

# Complete failure with graceful degradation
ERROR ConnectionManager: All AI providers failed and fallback disabled
```
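The curl checks above are easy to automate. A minimal watchdog sketch, assuming Node 18+ (for the global `fetch`); the `healthy` field is an assumed shape of the connection-status payload:

```javascript
// watchdog.js - poll connection status and trigger a reconnect when unhealthy
const BASE_URL = process.env.APP_URL || 'http://localhost:3000';

async function checkAndReconnect() {
  try {
    const res = await fetch(`${BASE_URL}/api/health/connections`);
    const status = await res.json();
    if (!status.healthy) { // assumed response field
      console.warn('AI connection unhealthy - triggering manual reconnect...');
      await fetch(`${BASE_URL}/api/health/reconnect`, { method: 'POST' });
    }
  } catch (err) {
    console.error(`Health check failed: ${err.message}`);
  }
}

checkAndReconnect();
setInterval(checkAndReconnect, 30000); // check every 30 seconds
```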