
Deployment Guide

Deploy Watchlight in your production environment.

Enterprise Content

This section contains enterprise deployment information. Contact sales@watchlight.ai for enterprise licensing.

Deployment Options

Docker Compose (Development/Staging)

Quick setup for development and staging environments:

# docker-compose.yml
version: '3.8'

services:
  wl-apdp:
    image: watchlight/wl-apdp:latest
    ports:
      - "8081:8081"
    environment:
      - LOG_LEVEL=info
      - OTEL_ENABLED=false
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8081/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  mcp-registry:
    image: watchlight/mcp-registry:latest
    ports:
      - "8080:8080"
    environment:
      - DATABASE_URL=postgres://postgres:postgres@postgres:5432/mcp_registry
    depends_on:
      postgres:
        condition: service_healthy

  postgres:
    image: postgres:16
    environment:
      - POSTGRES_DB=mcp_registry
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  beacon-dashboard:
    image: watchlight/beacon-dashboard:latest
    ports:
      - "5173:80"
    environment:
      - VITE_APDP_URL=http://localhost:8081
      - VITE_REGISTRY_URL=http://localhost:8080

volumes:
  pgdata:

docker-compose up -d
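
Containers can take a few seconds to become healthy after docker-compose up -d returns. The following is a minimal sketch of a polling helper for scripts that need to wait before sending requests. The name wait_for_http and its attempt and delay defaults are assumptions for illustration, not part of Watchlight; only the /health endpoint on port 8081 comes from the configuration above.

```shell
# wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
# Hypothetical helper: polls URL until curl succeeds or the attempts run out.
wait_for_http() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f turns HTTP status codes >= 400 into a non-zero exit code.
    if curl -fsS -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    i=$(( i + 1 ))
    sleep "$delay"
  done
  echo "gave up on $url after $attempts attempts" >&2
  return 1
}
```

For example, wait_for_http http://localhost:8081/health blocks until WL-APDP answers, or fails after roughly a minute with the defaults.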

Kubernetes (Production)

Production deployment on Kubernetes:

Namespace and ConfigMap

# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: watchlight

---
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: watchlight-config
  namespace: watchlight
data:
  LOG_LEVEL: "info"
  OTEL_ENABLED: "true"
  OTEL_EXPORTER_OTLP_ENDPOINT: "http://otel-collector:4317"

WL-APDP Deployment

# wl-apdp-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wl-apdp
  namespace: watchlight
spec:
  replicas: 3
  selector:
    matchLabels:
      app: wl-apdp
  template:
    metadata:
      labels:
        app: wl-apdp
    spec:
      containers:
        - name: wl-apdp
          image: watchlight/wl-apdp:latest
          ports:
            - containerPort: 8081
          envFrom:
            - configMapRef:
                name: watchlight-config
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8081
            initialDelaySeconds: 10
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /ready
              port: 8081
            initialDelaySeconds: 5
            periodSeconds: 10

---
apiVersion: v1
kind: Service
metadata:
  name: wl-apdp
  namespace: watchlight
  labels:
    app: wl-apdp   # lets a ServiceMonitor select this Service
spec:
  selector:
    app: wl-apdp
  ports:
    - name: http   # named so monitoring can reference the port by name
      port: 8081
      targetPort: 8081

MCP Registry Deployment

# mcp-registry-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-registry
  namespace: watchlight
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mcp-registry
  template:
    metadata:
      labels:
        app: mcp-registry
    spec:
      containers:
        - name: mcp-registry
          image: watchlight/mcp-registry:latest
          ports:
            - containerPort: 8080
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: watchlight-secrets
                  key: database-url
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"

Ingress

# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: watchlight-ingress
  namespace: watchlight
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  # ingressClassName replaces the deprecated kubernetes.io/ingress.class annotation
  ingressClassName: nginx
  tls:
    - hosts:
        - api.watchlight.example.com
      secretName: watchlight-tls
  rules:
    - host: api.watchlight.example.com
      http:
        paths:
          - path: /authorize
            pathType: Prefix
            backend:
              service:
                name: wl-apdp
                port:
                  number: 8081
          - path: /policies
            pathType: Prefix
            backend:
              service:
                name: wl-apdp
                port:
                  number: 8081
          - path: /mcp-servers
            pathType: Prefix
            backend:
              service:
                name: mcp-registry
                port:
                  number: 8080

Helm Chart

# Add Watchlight Helm repository
helm repo add watchlight https://charts.watchlight.ai
helm repo update

# Install Watchlight
helm install watchlight watchlight/watchlight \
  --namespace watchlight \
  --create-namespace \
  --set wlapdp.replicas=3 \
  --set registry.replicas=2 \
  --set postgresql.enabled=true \
  --set ingress.enabled=true \
  --set ingress.host=api.watchlight.example.com

Helm Values

# values.yaml
global:
  imageTag: "latest"

wlapdp:
  replicas: 3
  resources:
    requests:
      memory: "256Mi"
      cpu: "250m"
    limits:
      memory: "512Mi"
      cpu: "500m"

registry:
  replicas: 2
  resources:
    requests:
      memory: "256Mi"
      cpu: "250m"

postgresql:
  enabled: true
  auth:
    database: mcp_registry
    postgresPassword: changeme

ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  host: api.watchlight.example.com
  tls: true

monitoring:
  enabled: true
  prometheus:
    enabled: true
  grafana:
    enabled: true

Database Configuration

PostgreSQL

Recommended Production Settings:

# postgresql.conf
# Connections and memory
max_connections = 200
shared_buffers = 256MB
effective_cache_size = 768MB
work_mem = 4MB

# Performance (these values assume SSD storage)
random_page_cost = 1.1
effective_io_concurrency = 200
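
The memory values above are consistent with a common rule of thumb applied to a host with about 1 GB of RAM dedicated to PostgreSQL: roughly 25% for shared_buffers and 75% for effective_cache_size. That reading is an assumption, not something this guide states, and the helper name pg_memory_settings is hypothetical. A minimal sketch for scaling the two values to a larger host:

```shell
# pg_memory_settings RAM_MB
# Hypothetical helper: prints starting values using the 25% / 75% rule of thumb.
pg_memory_settings() {
  local ram_mb="$1"
  echo "shared_buffers = $(( ram_mb / 4 ))MB"
  echo "effective_cache_size = $(( ram_mb * 3 / 4 ))MB"
}
```

Calling pg_memory_settings 1024 reproduces the 256MB and 768MB values shown above. Treat the output as a starting point and tune it against the real workload.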

High Availability:

Use PostgreSQL with streaming replication or a managed service:

  • AWS RDS for PostgreSQL
  • Google Cloud SQL
  • Azure Database for PostgreSQL
  • DigitalOcean Managed Databases

Connection Pooling

Use PgBouncer for connection pooling:

# pgbouncer-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: pgbouncer-config
  namespace: watchlight
data:
  pgbouncer.ini: |
    [databases]
    mcp_registry = host=postgres port=5432 dbname=mcp_registry

    [pgbouncer]
    listen_addr = 0.0.0.0
    listen_port = 6432
    ; md5 authentication also needs an auth_file or auth_query, not shown here
    auth_type = md5
    pool_mode = transaction
    max_client_conn = 1000
    default_pool_size = 20
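
PgBouncer opens up to default_pool_size server connections per database/user pair, so the total it can open against PostgreSQL should stay below max_connections. The sketch below is a rough upper-bound check under stated assumptions: it ignores reserve pools, and the default of 3 reserved connections mirrors PostgreSQL's default superuser_reserved_connections. Both function names are hypothetical.

```shell
# pgbouncer_backend_conns POOL_SIZE DB_USER_PAIRS PGBOUNCER_INSTANCES
# Upper bound on server connections opened by all PgBouncer instances.
pgbouncer_backend_conns() {
  echo $(( $1 * $2 * $3 ))
}

# fits_max_connections NEEDED MAX_CONNECTIONS [RESERVED]
# Succeeds when NEEDED leaves the reserved connections free.
fits_max_connections() {
  local needed="$1" max="$2" reserved="${3:-3}"
  [ "$needed" -le $(( max - reserved )) ]
}
```

With the settings in this guide, two PgBouncer instances serving one database/user pair need at most 40 server connections, which fits comfortably within max_connections = 200.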

High Availability

WL-APDP HA

WL-APDP is stateless and horizontally scalable:

spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
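
The rolling update settings above bound how many pods exist during a rollout. As a small worked example, assuming maxUnavailable and maxSurge are given as absolute numbers rather than percentages, the bounds can be computed as follows (rollout_bounds is a hypothetical helper):

```shell
# rollout_bounds REPLICAS MAX_UNAVAILABLE MAX_SURGE
# Prints the minimum available and maximum total pods during a rolling update.
rollout_bounds() {
  local replicas="$1" max_unavailable="$2" max_surge="$3"
  echo "min_available=$(( replicas - max_unavailable )) max_total=$(( replicas + max_surge ))"
}
```

For the configuration shown, rollout_bounds 3 1 1 reports at least 2 available pods and at most 4 pods in total, so the cluster needs spare capacity for one extra WL-APDP pod while a rollout is in progress.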

Load Balancing:

  • Use a Kubernetes Service (ClusterIP) behind an external load balancer
  • Or use an Ingress controller with session affinity disabled; because WL-APDP is stateless, any replica can serve any request

Registry HA

The Registry keeps its state in PostgreSQL, so its availability depends on the database:

  1. Run multiple replicas behind a load balancer
  2. Use PostgreSQL with replication
  3. Enable read replicas for query scaling

Security

TLS Configuration

All production deployments should use TLS:

# Use cert-manager for automatic certificate management
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: watchlight-cert
  namespace: watchlight   # the TLS secret must live in the same namespace as the Ingress
spec:
  secretName: watchlight-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - api.watchlight.example.com

Network Policies

Restrict network access:

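apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: wl-apdp-policy
  namespace: watchlight
spec:
  podSelector:
    matchLabels:
      app: wl-apdp
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # label that Kubernetes sets automatically on every namespace
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - port: 8081
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
      ports:
        - port: 4317 # OTLP
    # Allow DNS lookups; without this, service names cannot be resolved
    - ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP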

Secrets Management

Use Kubernetes Secrets or an external secrets manager:

apiVersion: v1
kind: Secret
metadata:
  name: watchlight-secrets
  namespace: watchlight
type: Opaque
stringData:
  database-url: postgres://user:password@host:5432/db
  api-key: your-secret-key

Or use External Secrets Operator with AWS Secrets Manager / HashiCorp Vault.
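
Passwords that contain characters such as @, : or / need percent-encoding before they are placed in a PostgreSQL connection URL. The following is a minimal sketch, assuming ASCII-only credentials and that the Registry parses DATABASE_URL as a standard connection URI. The helper names urlencode and build_database_url are hypothetical.

```shell
# urlencode STRING
# Percent-encodes everything except RFC 3986 unreserved characters (ASCII input only).
urlencode() {
  local s="$1" out="" c
  while [ -n "$s" ]; do
    c="${s%"${s#?}"}"   # first character of s
    s="${s#?}"          # remainder of s
    case "$c" in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s\n' "$out"
}

# build_database_url USER PASSWORD HOST PORT DBNAME
build_database_url() {
  printf 'postgres://%s:%s@%s:%s/%s\n' \
    "$(urlencode "$1")" "$(urlencode "$2")" "$3" "$4" "$5"
}

# Example (not executed here): create the secret without writing the value to a manifest.
#   kubectl create secret generic watchlight-secrets -n watchlight \
#     --from-literal=database-url="$(build_database_url user 'p@ss' host 5432 db)"
```

Creating the secret from a literal keeps the credential out of version control, which is the main reason to prefer it over a committed Secret manifest.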

Monitoring

Prometheus Metrics

WL-APDP and Registry expose Prometheus metrics at /metrics.

# servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: watchlight-monitor
  namespace: watchlight
spec:
  selector:
    matchLabels:
      app: wl-apdp   # must match a label on the wl-apdp Service itself
  endpoints:
    - port: http     # must match a named port on that Service
      path: /metrics
      interval: 30s

Grafana Dashboards

Import pre-built dashboards:

  • Authorization latency and throughput
  • Policy evaluation metrics
  • Registry health and usage
  • Error rates and alerts

Alerting

# alerts.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: watchlight-alerts
  namespace: watchlight
spec:
  groups:
    - name: watchlight
      rules:
        # p95 authorization latency above 100 ms for 5 minutes
        - alert: HighAuthorizationLatency
          expr: histogram_quantile(0.95, rate(authorization_duration_seconds_bucket[5m])) > 0.1
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "High authorization latency detected"

        # more than 0.1 errors per second, sustained for 2 minutes
        - alert: AuthorizationErrors
          expr: rate(authorization_errors_total[5m]) > 0.1
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: "Authorization errors increasing"

Backup and Recovery

Database Backup

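# Automated backup with pg_dump
BACKUP_FILE="backup_$(date +%Y%m%d).sql.gz"
pg_dump -h "$DB_HOST" -U "$DB_USER" -d mcp_registry | gzip > "$BACKUP_FILE"

# Upload to S3 (a single named file; aws s3 cp does not accept several sources from a glob)
aws s3 cp "$BACKUP_FILE" s3://your-backup-bucket/watchlight/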
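
A backup is only useful if it can be restored. The sketch below shows one way to restore a compressed dump produced by the commands above. It assumes the same DB_HOST and DB_USER variables and an existing, empty target database; restore_backup is a hypothetical helper name.

```shell
# restore_backup BACKUP_FILE DB_NAME
# Hypothetical helper: verifies the archive, then streams it into psql.
restore_backup() {
  local file="$1" db="$2"
  # gzip -t checks archive integrity before anything touches the database.
  gzip -t "$file" || { echo "corrupt or missing backup: $file" >&2; return 1; }
  # ON_ERROR_STOP makes psql exit non-zero on the first failed statement.
  gunzip -c "$file" | psql -v ON_ERROR_STOP=1 -h "$DB_HOST" -U "$DB_USER" -d "$db"
}
```

Running a restore into a scratch database on a schedule is a cheap way to confirm that the backups are actually usable.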

Policy Backup

# Export all policies
curl http://localhost:8081/policies > policies_backup.json

# Import policies
curl -X POST http://localhost:8081/policies/import \
  -H "Content-Type: application/json" \
  -d @policies_backup.json
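
Redirecting curl output straight into the backup file overwrites the previous backup even when the request fails. A more defensive sketch, assuming the same /policies endpoint, writes to a temporary file first and replaces the backup only on success (export_policies is a hypothetical helper name):

```shell
# export_policies BASE_URL OUT_FILE
# Hypothetical helper: replaces OUT_FILE only after a successful, non-empty download.
export_policies() {
  local base_url="$1" out="$2" tmp="$2.tmp"
  # -f fails on HTTP errors so an error page is never saved as a backup.
  if curl -fsS "$base_url/policies" -o "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$out"
  else
    rm -f "$tmp"
    echo "policy export failed; existing $out left untouched" >&2
    return 1
  fi
}
```

For example, export_policies http://localhost:8081 policies_backup.json keeps the last good backup in place if WL-APDP is unreachable.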

Scaling

Horizontal Scaling

# Scale WL-APDP
kubectl scale deployment wl-apdp --replicas=5 -n watchlight

# Or use HPA
kubectl autoscale deployment wl-apdp --min=3 --max=10 --cpu-percent=70 -n watchlight
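
The autoscaler picks a replica count from the ratio of observed to target CPU utilization, where utilization is measured relative to each pod's CPU request. The sketch below reproduces the core calculation, ceil(currentReplicas × currentUtilization / targetUtilization) clamped to the min and max, as a worked example. It deliberately ignores the autoscaler's tolerance band and stabilization windows, and hpa_desired_replicas is a hypothetical helper name.

```shell
# hpa_desired_replicas CURRENT_REPLICAS CURRENT_CPU_PERCENT TARGET_CPU_PERCENT MIN MAX
# Simplified model of the HPA replica calculation (no tolerance or stabilization).
hpa_desired_replicas() {
  local current="$1" usage="$2" target="$3" min="$4" max="$5" desired
  # Integer ceiling of current * usage / target.
  desired=$(( (current * usage + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}
```

With the settings above, three replicas averaging 90% CPU give hpa_desired_replicas 3 90 70 3 10, which scales to 4 replicas. The HPA needs a metrics source such as metrics-server and CPU requests on the pods to compute utilization.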

Vertical Scaling

Increase resources for high-throughput scenarios:

resources:
  requests:
    memory: "1Gi"
    cpu: "1"
  limits:
    memory: "2Gi"
    cpu: "2"

Troubleshooting

Common Issues

Connection refused:

  • Check service endpoints: kubectl get endpoints -n watchlight
  • Verify pods are running: kubectl get pods -n watchlight

Database connection errors:

  • Check DATABASE_URL secret
  • Verify network connectivity to database
  • Check connection pool limits

High latency:

  • Check resource limits and requests
  • Review policy complexity
  • Enable policy selection metrics

Debug Mode

# Enable debug logging
kubectl set env deployment/wl-apdp LOG_LEVEL=debug -n watchlight

# View logs
kubectl logs -f deployment/wl-apdp -n watchlight

Next Steps

  • SLA - Service level agreements
  • Compliance - Regulatory compliance