Docker multi-stage build · K8s manifests · macOS node Operator · HPA autoscaling with queue depth
Is your OpenClaw still running natively? Facing multi-team collaboration, elastic scaling, and cross-region deployment needs, the traditional installation falls short. This guide provides a complete containerization and K8s orchestration plan: how to transform OpenClaw into a standard Docker image, deploy it on macOS nodes using Kubernetes Operator, and implement autoscaling with HPA based on custom metrics. Includes multi-stage build optimization, macOS privileged configuration, and cost-elasticity trade-off matrix.
Moving OpenClaw from local Node to Docker, then to Kubernetes, is not about chasing trends—it solves real production pain points.
Inconsistent environments: Different developers/CI nodes run varying Node/npm versions, causing "works on my machine" issues. Containers guarantee an identical runtime.
Scaling inefficiency: Native OpenClaw cannot be replicated quickly. When task queues grow, manual instance creation is too slow.
Cross-region deployment difficulty: Deploying to multiple remote Mac nodes manually is error-prone. K8s declarative config enables consistent multi-region rollout.
Slow failure recovery: Node failure requires manual reinstall. Orchestrator auto-detects and reschedules, drastically reducing MTTR.
Low resource utilization: Each instance dedicates a whole machine. K8s enables multi-tenancy sharing and resource quotas, improving overall efficiency.
Containerization benefits: 2026 DevOps surveys report that containerized AI Agent services cut mean time to recovery by roughly 70% and improve resource utilization by more than 40%.
Before writing a Dockerfile, externalize OpenClaw’s config from filesystem to environment variables and external storage—this is the key first step.
| Original (file-based) | Containerized Approach | Notes |
|---|---|---|
| `openclaw.json` API keys | ConfigMap / Secret | Sensitive data must use K8s Secret or an external vault |
| `storage.type = "file"` | Redis or cloud storage | Container filesystem is ephemeral; state must be externalized |
| `gateway.address = "0.0.0.0:8080"` | Env var `GATEWAY_ADDR` | Port mapping via Docker `-p` or a K8s Service |
| Skills on a local path | ConfigMap or sidecar container | Keep skills independent from the image for easy updates |
| Logs to a local file | stdout/stderr + sidecar collector | Follow the 12-factor rule: log to the console |
Core change: make openclaw.json read from environment variables. Example:
```json
{
  "storage": {
    "type": "redis",
    "host": "${REDIS_HOST}",
    "port": 6379,
    "password": "${REDIS_PASSWORD}"
  },
  "gateway": {
    "address": "${GATEWAY_ADDR}",
    "adminPort": "${ADMIN_PORT}"
  },
  "llm": {
    "provider": "${LLM_PROVIDER}",
    "apiKey": "${LLM_API_KEY}"
  }
}
```
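Plain JSON has no native `${VAR}` expansion, so something must substitute the placeholders before OpenClaw reads the file. A minimal sketch of that substitution step in Python (the placeholder syntax matches the config above; the entrypoint itself is an assumption, not part of OpenClaw):

```python
import os
import re

def expand_env(text: str, env=os.environ) -> str:
    """Replace ${VAR} placeholders with values from the environment.

    Unset variables are left untouched so misconfiguration stays visible
    instead of silently becoming an empty string.
    """
    def repl(match: re.Match) -> str:
        name = match.group(1)
        return env.get(name, match.group(0))
    return re.sub(r"\$\{([A-Z0-9_]+)\}", repl, text)

template = '{"storage": {"host": "${REDIS_HOST}", "port": 6379}}'
print(expand_env(template, {"REDIS_HOST": "redis-cluster.vpsmesh.com"}))
```

Run this in the container entrypoint (or an init step) so the rendered config lands on an ephemeral path before the main process starts.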
An efficient, secure Docker image is the foundation for K8s orchestration. Use multi-stage builds to shrink size, enhance security, and add health probes.
Choose minimal base image: Use official Node.js (e.g., node:18-alpine) for runtime, not a full OS, to reduce attack surface.
Multi-stage build: Stage 1 installs deps & compiles; Stage 2 contains only runtime files, reducing image size by 60%+.
Non-root user: Create dedicated openclaw user in Dockerfile and switch with USER; avoid root inside container.
Add health check: Declare a HEALTHCHECK that calls OpenClaw’s /health endpoint for plain Docker runs. Note that Kubernetes ignores Docker HEALTHCHECK; point readiness/liveness probes at the same endpoint instead.
Expose port: Use EXPOSE 8080; K8s Service will route traffic accordingly.
Layer optimization: Separate dependency installation from source copy to leverage the build cache; use npm ci --omit=dev (the successor to the deprecated --only=production) so dev dependencies stay out of the production image.
```dockerfile
# Build stage: install all deps (including dev) and compile
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Drop dev dependencies so only production deps reach the runtime stage
RUN npm prune --omit=dev

# Runtime stage: minimal Alpine image, non-root user
FROM node:18-alpine
WORKDIR /app
RUN addgroup -g 1001 -S openclaw && adduser -u 1001 -S openclaw -G openclaw
COPY --from=builder --chown=openclaw:openclaw /app/dist ./dist
COPY --from=builder --chown=openclaw:openclaw /app/node_modules ./node_modules
COPY --from=builder --chown=openclaw:openclaw /app/package*.json ./
USER openclaw
EXPOSE 8080
# Exit non-zero on failure so the runtime marks the container unhealthy
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD node -e "require('http').get('http://localhost:8080/health', r => process.exit(r.statusCode === 200 ? 0 : 1)).on('error', () => process.exit(1))"
CMD ["node", "dist/index.js"]
```
Full Kubernetes deployment requires several resources working together. Below is the minimal set to run OpenClaw on remote Mac nodes.
| Resource | Purpose | Key Fields |
|---|---|---|
| Deployment | Manages pod replicas and rolling updates | replicas, strategy, template |
| Service | Internal load balancing and DNS | type (ClusterIP), ports, selector |
| Ingress | External HTTP/HTTPS routing | rules, host, path, backend service |
| ConfigMap | Non-sensitive config storage | data, immutable |
| Secret | Sensitive info (API keys, passwords) | type: Opaque, data (base64) |
The following YAML example demonstrates a complete K8s setup for OpenClaw. Secrets must be created separately.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: openclaw
  labels:
    app: openclaw
spec:
  replicas: 2
  selector:
    matchLabels:
      app: openclaw
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: openclaw
    spec:
      containers:
        - name: openclaw
          image: your-registry/openclaw:latest
          ports:
            - containerPort: 8080
          env:
            - name: GATEWAY_ADDR
              value: "0.0.0.0:8080"
            - name: REDIS_HOST
              valueFrom:
                configMapKeyRef:
                  name: openclaw-config
                  key: redis.host
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: openclaw-secret
                  key: redis.password
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 30
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
---
apiVersion: v1
kind: Service
metadata:
  name: openclaw
spec:
  selector:
    app: openclaw
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: openclaw
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
    - host: openclaw.vpsmesh.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: openclaw
                port:
                  number: 80
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: openclaw-config
data:
  redis.host: "redis-cluster.vpsmesh.com"
  llm.provider: "openai"
```
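The Secret is created separately, typically with `kubectl create secret generic openclaw-secret --from-literal=redis.password=...`, which base64-encodes values for you. If you generate manifests programmatically, remember that the `data` field holds base64, not plaintext. A stdlib-only sketch (the `make_secret` helper is illustrative; the names match the manifests above):

```python
import base64
import json

def make_secret(name: str, values: dict) -> dict:
    """Build a Kubernetes Secret manifest; `data` values must be base64-encoded."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {k: base64.b64encode(v.encode()).decode() for k, v in values.items()},
    }

secret = make_secret("openclaw-secret", {"redis.password": "change-me"})
print(json.dumps(secret, indent=2))
```

Base64 is encoding, not encryption: enable encryption at rest for etcd, or keep the real values in an external vault and sync them in.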
Kubernetes typically runs on Linux; deploying to macOS requires handling hardware binding, privileges, and device plugins. A custom Operator encapsulates this complexity.
- Node scheduling: Label Mac nodes (e.g. `node-role.kubernetes.io/macos=true`) and use affinity rules to schedule pods onto them.
- Privileges: Some hardware access requires `securityContext.privileged: true` or `allowPrivilegeEscalation: true`, but restrict the scope strictly.
- Device plugins: Expose Apple hardware through a device plugin (e.g. a `k8s-device-plugin` variant) and declare resources such as `limits: apple.com/metal: 1`.
- Storage: Use `hostPath` or Local PVs, but plan for data persistence and migration.
- Custom Operator: Translate an `OpenClawAgent` custom resource into native K8s resources for one-click deployment and lifecycle management.

Security warning: Container isolation on macOS is less strict than on Linux. Thoroughly test privileged containers in a sandbox and restrict which nodes they can land on.
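The node label and scheduling constraints can be expressed as a pod-spec fragment. A sketch that builds the fragment as a dict (the taint key `dedicated=macos` is an assumption; adjust to your cluster's conventions):

```python
import json

def macos_scheduling(label_key: str = "node-role.kubernetes.io/macos") -> dict:
    """Pod spec fragment pinning pods to labeled Mac nodes."""
    return {
        "nodeSelector": {label_key: "true"},
        # Tolerate a taint so only explicitly tolerant workloads land on Macs
        "tolerations": [{
            "key": "dedicated",
            "operator": "Equal",
            "value": "macos",
            "effect": "NoSchedule",
        }],
    }

print(json.dumps(macos_scheduling(), indent=2))
```

Merge this fragment into the pod template's `spec`; tainting the Mac nodes keeps ordinary Linux workloads from being scheduled there by accident.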
Operator skeleton example (using Kopf):
```python
import kopf

@kopf.on.create('openclawvpsmesh.io', 'v1', 'openclawagents')
def create_fn(spec, name, **kwargs):
    image = spec.get('image', 'your-registry/openclaw:latest')
    replicas = spec.get('replicas', 1)
    redis_host = spec['redis']['host']
    redis_password = spec['redis'].get('password', '')
    # Label per instance so multiple OpenClawAgent CRs don't share one selector
    labels = {"app": "openclaw", "instance": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"openclaw-{name}"},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "openclaw",
                        "image": image,
                        "env": [
                            {"name": "REDIS_HOST", "value": redis_host},
                            {"name": "REDIS_PASSWORD", "value": redis_password},
                        ],
                        "ports": [{"containerPort": 8080}],
                    }]
                }
            }
        }
    }
    # A real handler would submit `deployment` to the API server here
    return {"status": "created"}
```
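The skeleton only handles creation; a real operator also reconciles on spec updates. Keeping the decision logic pure makes it testable without a cluster. A minimal sketch of such a reconciliation core (function and parameter names are illustrative, not part of the Kopf API):

```python
def reconcile_replicas(desired: int, observed: int, max_step: int = 2) -> int:
    """Return the next replica count, moving toward `desired` by at most `max_step`.

    Stepping gradually mirrors HPA-style dampening and avoids thrashing
    when a CR's spec.replicas jumps between reconcile loops.
    """
    if observed < desired:
        return min(observed + max_step, desired)
    if observed > desired:
        return max(observed - max_step, desired)
    return observed
```

An `@kopf.on.update` handler would call this with the CR's desired replicas and the Deployment's observed replicas, then patch the Deployment with the result.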
OpenClaw load correlates directly with task queue depth. K8s HorizontalPodAutoscaler can use custom metrics (queue length, response time) to auto-scale replicas.
| Metric | Description | Scale thresholds | Cool-down |
|---|---|---|---|
| CPU utilization | Pod CPU % | scale-out >70%, scale-in <30% | 300s |
| Memory utilization | Pod memory % | scale-out >80%, scale-in <40% | 300s |
| Queue depth | OpenClaw pending tasks | scale-out >50 tasks, scale-in <10 tasks | 180s |
| Task response time | P95 latency | scale-out >2s, scale-in <500ms | 300s |
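For a Pods metric like queue depth, HPA's core algorithm is `desiredReplicas = ceil(currentReplicas * currentValue / targetValue)`, clamped to the replica bounds. A worked sketch using the thresholds from the table:

```python
import math

def hpa_desired_replicas(current_replicas: int, current_avg: float,
                         target_avg: float, min_replicas: int = 2,
                         max_replicas: int = 10) -> int:
    """Kubernetes HPA scaling formula, clamped to [min_replicas, max_replicas]."""
    desired = math.ceil(current_replicas * current_avg / target_avg)
    return max(min_replicas, min(max_replicas, desired))

# 3 pods each seeing ~100 queued tasks against a target of 50 per pod
print(hpa_desired_replicas(3, 100, 50))  # scales out to 6
```

Note the formula is ratio-based, not threshold-based: a pod average double the target roughly doubles the replica count, and the `behavior` section (below) then limits how fast that change is applied.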
To enable queue-depth based scaling, deploy Prometheus stack and an OpenClaw metrics exporter, then register the metric with K8s Custom Metrics API. HPA example:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: openclaw-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: openclaw
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Pods
      pods:
        metric:
          name: openclaw_queue_depth
        target:
          type: AverageValue
          averageValue: "50"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Pods
          value: 1
          periodSeconds: 300
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 2
          periodSeconds: 60
```
Cost vs elasticity trade-off: Too low a replica floor wastes resources; too aggressive thresholds cause flapping. Set baselines via load testing during off-peak hours before going live.
Containerization and K8s are not silver bullets; they bring their own operational complexity and overhead. In particular, tune `stabilizationWindowSeconds` and the HPA step sizes to avoid replica flapping.

Building your own containerized platform demands deep Kubernetes and macOS expertise, including image builds, resource scheduling, networking, and security hardening. For most teams, this technical burden is substantial.
For production environments seeking reliable, elastic OpenClaw deployments, VpsMesh's OpenClaw managed service is often the superior choice. We offer pre-installed Mac Mini M4 nodes with automatic scaling and cross-region DR—no need to write complex Operators or HPA configs. Discover how to get enterprise-grade OpenClaw with minimal startup cost on our pricing page or order online now.
The main change is externalizing configuration. Move skill definitions from local directories to ConfigMap volume mounts and replace hard-coded paths with environment variables. Ensure all file access goes through declared volumes. See our Help Center containerization page for details.
The macOS container ecosystem lags behind Linux: privileged containers, device plugins, and CNI networking all have limited support. Consider lightweight distributions like k3s or minikube and run only non-privileged or minimally privileged workloads. For high-performance needs, evaluate whether containerization is necessary at all, or use VpsMesh's managed OpenClaw nodes.
Deploy the Prometheus stack and an OpenClaw metrics exporter to expose queue depth through the K8s Custom Metrics API, then configure the HPA with `type: Pods` referencing the metric name. See the K8s documentation on "Using Prometheus as a custom metrics source" for adapter setup.
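The exporter itself only needs to serve Prometheus text format on `/metrics`. A stdlib-only sketch (the metric name matches the HPA config above; `get_queue_depth` is a stand-in for however OpenClaw actually exposes its pending-task count):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

def get_queue_depth() -> int:
    # Stand-in: replace with a real query against OpenClaw's task queue
    return 42

def render_metrics(depth: int) -> str:
    """Prometheus exposition format for a single gauge."""
    return (
        "# HELP openclaw_queue_depth Number of pending OpenClaw tasks\n"
        "# TYPE openclaw_queue_depth gauge\n"
        f"openclaw_queue_depth {depth}\n"
    )

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(get_queue_depth()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To run the exporter:
#   HTTPServer(("0.0.0.0", 9100), MetricsHandler).serve_forever()
```

Point a Prometheus scrape job at this endpoint, then let the Prometheus adapter translate `openclaw_queue_depth` into the Custom Metrics API for the HPA.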