The Gateway container is healthy but the CLI still times out. Which layer should I check first?

Start with HTTP reachability inside the same network namespace as the CLI, then validate WebSocket upgrade and reverse-proxy pathing, and only then revisit CLI hostnames and ports for accidental loopback or stale aliases.

When is network_mode: service:gateway appropriate?

Use it when a sidecar or helper must share the Gateway network stack and loopback semantics. Tradeoffs include coupled restart ordering and shared port ownership, so document upgrades explicitly.

What commonly breaks WebSocket behind a reverse proxy?

Short upstream timeouts, missing Upgrade-related header forwarding, or allowedOrigins mismatches for the real browser or CLI origin. Tie every symptom to a reproducible minimal command sequence.

OpenClaw Docker Compose Networking in 2026: When the CLI Cannot Reach the Gateway — Namespaces, network_mode, and Reverse-Proxy Reachability

Translate healthcheck green into network terms: five frequent misreads

A Compose healthcheck often probes loopback inside the container or process liveness. It does not automatically prove that DNS resolution, iptables, and user-space proxies all cooperate on the path from a CLI container to the Gateway service name. The five patterns below arrive together in tickets; separating them thins your incident log immediately.

01. Listening on 127.0.0.1 only: when the Gateway binds loopback, sibling services on the same bridge get connection refused when they use the service name. It can look like random timeouts, but the listener is simply not visible outside the Gateway's own network namespace.
02. CLI on the host with a container hostname: copying openclaw-gateway:18789 into a host shell profile breaks resolution and routing immediately, because Compose service names resolve only inside the Compose network and the host has to use a published port.
03. Reverse proxy forwards HTTP but not Upgrade: browsers or CLIs using WSS see 400 or silent drops while application logs still show Gateway ready.
04. allowedOrigins drift from real origins: mixing production domains, internal aliases, and MagicDNS-style names rejects handshakes at the app layer while packet captures look fine.
05. network_mode: service: restart races: when restart ordering changes, downstream clients keep hitting old container IPs or stale port mappings, so requests succeed only intermittently.
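Misread 01 is the quickest one to rule out mechanically. The sketch below classifies a listener's local address, in the address:port form that tools such as ss -ltn print, as loopback-only or reachable on other interfaces. The function name classify_bind is made up for illustration, and the usage line assumes that ss is installed in the Gateway image, which may not hold for minimal images.

```shell
# classify_bind ADDRESS:PORT
# Prints loopback-only, all-interfaces, or specific-address.
classify_bind() {
  addr=$1
  host=${addr%:*}   # drop the trailing :port, keep the address part
  case "$host" in
    127.*|localhost|"[::1]"|::1) echo "loopback-only" ;;
    0.0.0.0|"*"|"[::]"|::)       echo "all-interfaces" ;;
    *)                           echo "specific-address" ;;
  esac
}

# Example (service name is a placeholder; requires ss inside the image):
#   docker compose exec -T openclaw-gateway ss -ltnH | awk '{print $4}' |
#     while read -r a; do printf '%s %s\n' "$a" "$(classify_bind "$a")"; done
```

A loopback-only result for the Gateway port explains a refused connection from a sibling container without any further packet capture.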

Print the next section as a review handout: allow only one matrix cell to change per architecture change, and attach paired outputs for curl inside the same namespace versus curl from the outer namespace.

Add a time dimension: during rolling updates Compose briefly runs old and new containers together. If DNS caches and client connection pools diverge, the first request succeeds and then failures follow for minutes. Before raising timeouts, refresh resolution on the initiator and compare the address that reused connections target against the current endpoint from docker inspect. If you also chain user-space or corporate proxies, log CONNECT tunnel targets separately from direct targets so a 407 from the proxy is not misread as an application auth failure.
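One way to make that comparison repeatable is a tiny helper that reports whether the address a client resolved still belongs to the running container. This is only a sketch: endpoint_state is a hypothetical name, and the service names in the wiring comments are placeholders for whatever your compose file uses.

```shell
# endpoint_state RESOLVED_IP "CURRENT_IP [CURRENT_IP ...]"
# Prints current when RESOLVED_IP is among the container's present addresses,
# otherwise prints stale and returns a non-zero status.
endpoint_state() {
  resolved=$1
  for ip in $2; do
    if [ "$ip" = "$resolved" ]; then
      echo "current"
      return 0
    fi
  done
  echo "stale"
  return 1
}

# Example wiring (service names are placeholders):
#   resolved=$(docker compose exec -T cli getent hosts openclaw-gateway | awk '{print $1}')
#   current=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' \
#     "$(docker compose ps -q openclaw-gateway)")
#   endpoint_state "$resolved" "$current"
```

A stale result points at caches or connection pools on the initiator rather than at Gateway timeouts.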

Another easy miss is MTU and fragmentation on cross-cloud or cross-carrier paths, which show up as sporadic timeouts. When large payloads fail while tiny health checks stay green, narrow captures to WebSocket frame sizes and TLS record boundaries instead of rewriting application routes first.
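A quick way to test a path MTU hypothesis is a ping with the don't-fragment bit set and a payload sized to fill the packet exactly. The arithmetic is standard for IPv4: the payload is the MTU minus 20 bytes of IPv4 header (without options) and 8 bytes of ICMP header. The helper name and the example hostname below are assumptions, and the ping flags shown are the Linux iputils ones.

```shell
# icmp_payload_for_mtu MTU
# Largest ICMP echo payload that fits in a single IPv4 packet of size MTU:
# MTU - 20 (IPv4 header, no options) - 8 (ICMP header).
icmp_payload_for_mtu() {
  echo $(( $1 - 28 ))
}

# Example (Linux iputils ping; -M do forbids fragmentation):
#   ping -M do -c 3 -s "$(icmp_payload_for_mtu 1500)" gateway.example.internal
#   ping -M do -c 3 -s "$(icmp_payload_for_mtu 1400)" gateway.example.internal
```

If the larger size fails while the smaller one passes, the path MTU sits between the two values and is worth investigating before touching application routes.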

Once those signals live in the change ticket, align timestamps between openclaw logs and edge access logs. With both in hand, most teams can trace a seemingly mysterious networking failure to a single configuration field within thirty minutes; that is also the level of context a minimal repro package should carry when you ask for outside help.

Compose network models: default bridge, host, and shared network namespaces

When you pick a model, write down who initiates the connection, what name resolves to which address, and which NAT layers sit in between. Without that table the team ping-pongs between changing ports, extra_hosts, and reverse-proxy upstreams.

Model	Typical listen pattern	Other services in the same compose file	Host processes
Default bridge with published ports	`0.0.0.0` inside the container or explicit publishes	Use the Compose service name and internal port	Use `127.0.0.1:published` or a host NIC IP
Host networking	Shares the host stack; binds are host-visible	Containers that stay on the bridge can no longer reach it by service name	Check port collisions and INPUT firewall chains alongside containers
network_mode: service:gateway	Shares the Gateway netns; loopback semantics align	Sidecars may call `127.0.0.1:gateway-port`	The host still needs published ports or a proxy; nothing is inherited automatically

True reachability means repeating the same hostname, port, and TLS parameters inside the initiator network namespace and getting consistent responses, not a one-off curl from a laptop.
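The table above can be condensed into a small lookup that answers "which address should this initiator use?". Everything named here is illustrative: gateway_target, the model and initiator labels, and the SERVICE, INTERNAL_PORT, and PUBLISHED_PORT defaults are assumptions to replace with the values from your compose file.

```shell
# Placeholders; override them to match your compose file.
SERVICE=${SERVICE:-openclaw-gateway}
INTERNAL_PORT=${INTERNAL_PORT:-18789}
PUBLISHED_PORT=${PUBLISHED_PORT:-18789}

# gateway_target MODEL INITIATOR
# MODEL:     bridge | host | shared-netns
# INITIATOR: sibling (another service on the bridge) | sidecar | host
gateway_target() {
  case "$1/$2" in
    bridge/sibling)       echo "$SERVICE:$INTERNAL_PORT" ;;
    bridge/host)          echo "127.0.0.1:$PUBLISHED_PORT" ;;
    host/host)            echo "127.0.0.1:$INTERNAL_PORT" ;;
    shared-netns/sidecar) echo "127.0.0.1:$INTERNAL_PORT" ;;
    shared-netns/host)    echo "127.0.0.1:$PUBLISHED_PORT" ;;
    *)
      # For example host/sibling: the service-name path is gone.
      echo "unsupported: $1/$2 is not covered by the table" >&2
      return 1
      ;;
  esac
}
```

Calling gateway_target bridge host prints the published loopback address, while an unsupported combination fails loudly instead of guessing.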

Six-step runbook: from process liveness to WebSocket upgrades and origins

The sequence keeps the cheapest observations first; stop and save output whenever a step fails. If field names diverge from your install, cross-check the install and doctor troubleshooting checklist.

01. Label the initiator: note whether commands run on the host, in a Gateway sidecar, or in a standalone CLI container; capture hostname and a short ip route summary.
02. HTTP probe inside the netns: from the initiator namespace, GET or HEAD the target hostname and port, verify status codes and body prefixes, and rule out pure DNS failure.
03. WebSocket probe: exercise the upgrade path you actually use, record edge versus application response headers, and align timestamps with logs.
04. Listen matrix inside Gateway: if listeners bind 127.0.0.1 only and cross-service access is required, move to 0.0.0.0 or adopt a shared netns and update the runbook accordingly.
05. Reverse-proxy quadruple: confirm whether the upstream points at a container IP or a published port, verify Connection and Upgrade forwarding, and ensure idle timeouts are not clipping long-lived sessions.
06. Origins checklist: enumerate real Origin strings or equivalents for browsers, CLIs, and CI; every missing row is a release blocker.
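Step 06 lends itself to a mechanical check. The sketch below compares a list of observed origins against an allow list and prints the ones that are missing. It assumes exact string comparison, which may not match how the Gateway actually evaluates allowedOrigins, and both function names are hypothetical.

```shell
# origin_allowed ORIGIN ALLOWED...
# Exit status 0 when ORIGIN exactly equals one of the allowed entries.
origin_allowed() {
  origin=$1
  shift
  for allowed in "$@"; do
    [ "$allowed" = "$origin" ] && return 0
  done
  return 1
}

# audit_origins ALLOWED...
# Reads observed origins from stdin, one per line, and prints each one
# that is not in the allow list given as arguments.
audit_origins() {
  while IFS= read -r candidate; do
    [ -n "$candidate" ] || continue
    origin_allowed "$candidate" "$@" || echo "MISSING $candidate"
  done
}

# Example (origins are placeholders):
#   printf '%s\n' https://claw.example.com http://localhost:18789 |
#     audit_origins https://claw.example.com
```

Each MISSING line corresponds to a row the checklist would flag as a release blocker.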

Inside bridge stack versus host (examples)

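# Services, state, and published ports as Compose sees them
docker compose ps
# Inside the CLI container's namespace: resolve the service name, then probe it
docker compose exec cli sh -lc 'getent hosts openclaw-gateway; curl -fsS --max-time 5 -o /dev/null -w "%{http_code}\n" http://openclaw-gateway:18789/health || true'
# From the host namespace: probe the published port
curl -fsS --max-time 5 -o /dev/null -w "%{http_code}\n" http://127.0.0.1:18789/health || true
# Recent Gateway logs to line up with the probe timestamps
docker compose logs --no-color --tail=200 openclaw-gateway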

Note: replace service names, ports, and health paths with the values from your repository; keep the pattern of running the same URL from two namespaces.
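The HTTP examples above do not exercise the upgrade path, so a WebSocket handshake probe is a natural companion for step 03. The sketch below sends the handshake headers with curl and maps the returned status to a next check. The hints are generic HTTP readings rather than documented OpenClaw behavior, ws_probe and ws_status_hint are hypothetical names, and the key is the sample nonce from RFC 6455.

```shell
# ws_status_hint HTTP_STATUS
# Maps a handshake status code to the layer worth checking next.
ws_status_hint() {
  case "$1" in
    101)     echo "upgrade accepted" ;;
    400|426) echo "upgrade rejected: check Connection and Upgrade header forwarding" ;;
    403)     echo "forbidden: compare the Origin header with allowedOrigins" ;;
    407)     echo "proxy authentication required: not an application auth failure" ;;
    502|504) echo "edge could not reach the upstream: check upstream address and timeouts" ;;
    000)     echo "no HTTP response: check DNS, routing, and the listener" ;;
    *)       echo "unclassified status $1: read the response headers" ;;
  esac
}

# ws_probe URL ORIGIN
# Sends a WebSocket handshake over HTTP/1.1 and prints only the status code.
# --max-time matters: after a 101 the connection stays open and curl would wait.
ws_probe() {
  curl -s --http1.1 --max-time 5 -o /dev/null -w '%{http_code}' \
    -H 'Connection: Upgrade' -H 'Upgrade: websocket' \
    -H 'Sec-WebSocket-Version: 13' \
    -H 'Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==' \
    -H "Origin: $2" "$1"
}

# Example (URL and origin are placeholders):
#   ws_status_hint "$(ws_probe http://openclaw-gateway:18789/ https://claw.example.com)"
```

Run the same probe once directly against the Gateway and once through the reverse proxy; a 101 on the first and anything else on the second isolates the edge.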

Reverse proxies and Gateway parameters: three facts that belong in the change ticket

This section lists facts you can name in configuration files, not vibes about CDNs misbehaving. For memory limits and log rotation language, return to the Compose production baseline.

Published ports and iptables: `ports:` mappings create DNAT rules on the host. Misreading the ordering between local firewall policies and Docker's chains produces container-to-container success while the host path fails, or the opposite.
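127.0.0.1 bind semantics: a listener bound to loopback inside a container netns accepts only connections that originate in that netns. Sibling containers addressing you by service name are refused, and in bridge mode a published port is affected too, because forwarded traffic arrives on the container's bridge interface rather than on loopback. Publishing to 127.0.0.1 on the host side is a separate setting and does not conflict with this.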
WebSocket coexisting with HTTP/2: when the edge mishandles the upgrade or keep-alive timeouts, streams die quietly tens of seconds after the handshake. Align timestamps on both sides before rotating API keys.
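For the first fact, the DNAT rule behind a published port can be read straight from the host's NAT table. The sketch below filters iptables -S style output for a given destination port and prints where Docker forwards it. The helper name dnat_target is hypothetical, and the approach assumes a Docker host that programs iptables; hosts using a different firewall backend will show nothing here.

```shell
# dnat_target PORT
# Reads `iptables -t nat -S` style rules on stdin and prints the
# --to-destination value of every rule whose --dport equals PORT.
dnat_target() {
  awk -v port="$1" '
    {
      dport = ""; dest = ""
      for (i = 1; i < NF; i++) {
        if ($i == "--dport")          dport = $(i + 1)
        if ($i == "--to-destination") dest  = $(i + 1)
      }
      if (dport == port && dest != "") print dest
    }'
}

# Example (needs root on the Docker host):
#   sudo iptables -t nat -S DOCKER | dnat_target 18789
```

An empty result for a port you believe is published, or a destination that differs from the container's current address, belongs in the change ticket alongside the probe output.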

Warning: do not simultaneously change reverse-proxy upstreams, Gateway binds, and CLI configuration without a recorded baseline; triangular changes make bisection impossible.

Decision matrix: when a CLI sidecar or host networking is worth the complexity

Write a boolean for whether the CLI must share loopback semantics with the Gateway before choosing network_mode: service: or host. Use the matrix for design review, not slogans.

Constraint	Safer default	Key acceptance signal	Main risk
CLI and Gateway in one compose file	bridge plus explicit `0.0.0.0` binds	service-name resolution matches internal-port curl	Firewall and published-port documentation drifts from reality
Must share localhost semantics	sidecar with `network_mode: service:gateway`	sidecar restart does not resurrect stale connection pools	upgrade ordering couples with volume mount permissions
Mature host reverse proxy already exists	published loopback only plus TLS termination at the edge	packet or log proof shows consistent WS upgrade	`allowedOrigins` misses URL shapes the CLI actually uses

Relying on ad-hoc tunnel scripts or hand-edited hosts files ties mean time to recovery to what individual people remember. When upstream certificates or internal DNS change, triage regresses into all-hands meetings.

Common pitfall: seeing 502 and rotating model keys first; finish HTTP and WebSocket probes from section three before touching credentials.

Opening ports temporarily without a checklist rarely demonstrates the default-deny, explicit-allow posture that audits expect. When OpenClaw must ship alongside fixed egress, hostnames, and mutual TLS policy, ad-hoc VPS networking layers often lack signable change records. For teams that need iOS builds, desktop handoff, and persistent agents on dedicated machines with predictable regions and network tiers, and that want to spend less time guessing host versus container netns, VpsMesh Mac Mini cloud rental is usually the better fit: dedicated nodes let you describe listeners and ACLs in the same language as the team private network runbook. Pricing lives on the pricing page and connectivity guidance on the help center.

FAQ

Three questions readers ask first

The Gateway container is healthy but the CLI still times out. Which layer should I check first?

Run HTTP probes inside the same netns as the CLI, then validate WebSocket upgrade, and only then revisit CLI hostnames for accidental host loopback or stale aliases. For hardening, see the production hardening checklist.

When is network_mode: service:gateway appropriate?

Use it when helpers must share the Gateway stack and loopback view; document restart ordering for the shared service and cross-check official connectivity notes in the help center.

What commonly breaks WebSocket behind a reverse proxy?

Upstream timeouts, missing Upgrade header forwarding, or allowedOrigins gaps for the real client origin; align edge and application logs before rotating keys or model routes. For plans and egress needs, see the pricing page.
