Shared memory · memory peaks · capability allowlists · symptom tree and six-step runbook
OpenClaw on Docker often passes the first smoke test: channels answer, openclaw doctor looks fine. That only proves the control plane and model path are roughly healthy. As soon as you enable page-driving skills, screenshots, or headless Chromium, the workload shifts to a browser rendering stack with bursty memory and heavy use of shared memory. On a VPS, the failure mode is rarely a polite log line: you see intermittent blank pages, random tab crashes, or sudden OOM exits that are misread as slow models or anti-bot pages. This playbook gives you a symptom-to-parameter matrix for shm_size and mem_limit, then a six-step runbook that keeps changes bisectable. Pair it with the Exit 137 VPS primer and the Compose production baseline so networking, WASM warm-up, and browser peaks are not debugged as one tangled incident.
Many teams validate OpenClaw on a VPS by proving that messages flow and doctor is green. That is necessary but not sufficient for browser-class skills. Headless Chromium creates large anonymous mappings and shared-memory-backed buffers; when those collide with Docker's default 64MB /dev/shm or an aggressive cgroup memory cap, the symptom is often a blank UI, tab crashes, or screenshot timeouts rather than an immediate Exit 137. The incident is then misrouted to model latency, site anti-bot rules, or channel retries. Operations engineers waste hours tuning model timeouts when the real constraint is shared memory and the peak RSS of the renderer stack.
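The 64MB default is cheap to verify before any tuning: run df -h /dev/shm inside the container and compare the size column against the band this article recommends. The helper below is a minimal sketch for turning that check into a guard; the function name and the messages are our assumptions, and in practice you would feed it the tail of docker exec &lt;container&gt; df -h /dev/shm.

```shell
#!/bin/sh
# Classify the /dev/shm size column of a "df -h /dev/shm" line
# (e.g. "64M" or "1.0G") against the 512MB-1GB band suggested for
# browser workloads. In production you would pipe in:
#   docker exec openclaw df -h /dev/shm | tail -n 1
check_shm() {
  size=$(echo "$1" | awk '{print $2}')
  case "$size" in
    *G|*g)     echo "shm ${size}: within the recommended band" ;;
    512M|512m) echo "shm ${size}: at the 512MB floor; watch long captures" ;;
    *)         echo "shm ${size}: undersized; raise shm_size toward 512m-1g" ;;
  esac
}

check_shm "shm  64M  0  64M  0% /dev/shm"   # Docker's tmpfs default
```

Running it against the default line flags /dev/shm as undersized; wire the same check into your soak-test script so regressions surface before users see white screens.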
Five recurring taxes show up when teams stop at the smoke test:

- Single-page success mistaken for load testing: loading a marketing homepage is not the same stress as multi-step login, long scrolling captures, or concurrent tabs; production traffic will spike memory and shm pressure without warning.
- Ignoring the coupling between /dev/shm and host memory: Chromium prefers large shared-memory segments; RSS in docker stats can look modest while dmesg already shows cgroup throttling or oom-kill events.
- Copy-pasting wide capabilities: adding SYS_ADMIN to bypass sandbox friction expands the blast radius from browser bugs to host compromise; reviewers need a written threat trade-off.
- Mixing this with reverse-proxy and allowedOrigins incidents: non-loopback control UI errors and WebSocket drops belong to the Compose networking runbook; do not triangulate unrelated failure trees in one change window.
- Stacking heavy browser jobs on the same instance as chatty channels: overnight batch automations can break a profile that looked stable during daytime pings unless you plan peaks and isolation profiles.
Encode the five taxes as explicit forbidden patterns and mandatory soak tests. Print them on the first page of your change request so nobody silently widens privileges to make a demo pass. The next section indexes symptoms to parameters so on-call engineers can stop bleeding without rereading every upstream doc.
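One low-effort way to encode forbidden patterns is a pre-merge lint over the compose file. The sketch below is an assumption-laden starting point, not an official OpenClaw tool: the pattern list covers only the privilege and pinning taxes above, and the function name is ours.

```shell
#!/bin/sh
# Fail a change request if the compose file contains patterns that
# should never land without a written threat trade-off:
# SYS_ADMIN, privileged mode, or an unpinned :latest image.
lint_compose() {
  file="$1"
  bad=0
  for pattern in 'SYS_ADMIN' 'privileged:[[:space:]]*true' 'image:.*:latest'; do
    if grep -Eq "$pattern" "$file"; then
      echo "FORBIDDEN: pattern '$pattern' found in $file"
      bad=1
    fi
  done
  return $bad
}
```

Run it in CI as lint_compose docker-compose.yml; a nonzero exit blocks the merge until someone documents why the pattern is needed.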
The table below is indexed by what you observe first, not by parameter names, because incidents arrive as user-visible pain. After each change, capture docker stats peaks and a short Gateway log slice; change only one knob per experiment so rollbacks stay honest.
| Symptom you see | Check first | Typical root cause and move |
|---|---|---|
| Intermittent white screens, "Aw, Snap!" errors, tab crashes | shm_size, /dev/shm utilization | The default 64MB is often too small; try 512m, then 1g, and cap concurrent pages. |
| Process disappears, Exit 137 | mem_limit, host swap, oom_kill counters | Browser peak plus Node resident set exceeded cgroup; raise limits in steps or split instances; see Exit 137 primer. |
| Immediate permission or device errors | cap_add, devices, seccomp profile | Diff against official compose snippets; add the minimum surface, not a bag of privileged caps. |
| CPU pegged but navigation stalls | Software rendering flags, infinite navigation retries in skills | Bound retries and timeouts; verify the skill is not hot-reloading in a loop. |
| Only certain sites fail | TLS fingerprinting, HTTP/2, regional egress | If signals point to network rather than cgroup, pivot to egress tests instead of stacking shm tweaks. |
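For the Exit 137 row, the exit code itself narrows the cause before you touch any parameter: codes above 128 are 128 plus the fatal signal number, so 137 means SIGKILL, which is what the kernel OOM killer sends. The classifier below is a minimal sketch; the function name and message wording are ours.

```shell
#!/bin/sh
# Map a container exit code to a likely cause. Codes above 128 are
# 128 + the fatal signal: 137 = SIGKILL (OOM killer or docker kill),
# 139 = SIGSEGV (renderer crash), 143 = SIGTERM (orderly stop).
explain_exit() {
  case "$1" in
    137) echo "SIGKILL (9): suspect cgroup OOM kill; check dmesg and mem_limit" ;;
    139) echo "SIGSEGV (11): renderer crash; suspect /dev/shm pressure or a browser bug" ;;
    143) echo "SIGTERM (15): orderly stop or restart policy, not a resource kill" ;;
    0)   echo "clean exit" ;;
    *)   echo "exit $1: check application logs" ;;
  esac
}
```

Feed it the code from docker inspect --format '{{.State.ExitCode}}' &lt;container&gt; and paste the answer into the ticket alongside the docker stats peaks.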
Stability for browser-class skills is mostly three auditable facts: peak memory, shared memory, and a capability allowlist; everything else is secondary tuning.
Community writeups and official Docker guidance in 2026 still recommend an explicit shm_size for stacks that embed browser automation (commonly in the 512MB to 1GB band), paired with a clear memory ceiling. You do not need to memorize vendor magic numbers, but you do need the phrase "defaults are not enough" in your team vocabulary, plus a separate capacity line item for overnight batch windows when skills scrape dashboards or capture evidence packs.
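One way to make that overnight capacity line item concrete is a Compose override applied only for the batch window. The fragment below is a sketch under assumptions: the file name docker-compose.batch.yml and the ceilings are ours; validate them with your own soak runs.

```yaml
# docker-compose.batch.yml (hypothetical override for the overnight window)
# Apply with:
#   docker compose -f docker-compose.yml -f docker-compose.batch.yml up -d
services:
  openclaw:
    shm_size: "1g"      # top of the community band, for long captures
    mem_limit: "6g"     # assumed headroom above the daytime ceiling
```

Because the override is a separate file, the daytime profile stays untouched and the batch-window change remains bisectable.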
The sequence below matches the Compose production baseline: observe, change one variable, soak test, archive. Paste outputs into the ticket instead of narrating changes in chat.
1. Pin the image reference: note the digest or immutable tag before touching browser parameters; avoid drifting production on :latest while debugging peaks.
2. Capture a baseline: run the same skill three times; record docker stats peaks, df -h /dev/shm inside the container, and Gateway log windows.
3. Change shm only: raise shm_size to 512m or 1g, keep everything else fixed, and rerun the same skill three times.
4. Then adjust mem_limit: if Exit 137 or oom_kill persists, raise mem_limit in roughly 25 percent steps and verify whether swap is disabled on the host.
5. Minimize capabilities: if official snippets require specific cap_add or device nodes, document the exact error you fix; avoid SYS_ADMIN unless the threat model is explicit.
6. Archive rollback points: commit the passing compose fragment and digest; keep rollback as a copy-paste compose down && compose up -d block.
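The "roughly 25 percent steps" of the mem_limit adjustment are easy to make reproducible instead of eyeballed. A minimal sketch in POSIX shell; the function name is ours, and it works in MiB so the arithmetic stays integral.

```shell
#!/bin/sh
# Compute the next mem_limit step (~25% up). Accepts values like
# "4g" or "2048m" and prints the raised limit in MiB, ready to
# paste back into the compose fragment.
next_mem_limit() {
  v="$1"
  case "$v" in
    *g) mib=$(( ${v%g} * 1024 )) ;;
    *m) mib=${v%m} ;;
    *)  echo "unsupported unit: $v" >&2; return 1 ;;
  esac
  echo "$(( mib * 5 / 4 ))m"
}
```

For example, next_mem_limit 4g prints 5120m; record each step in the ticket so the rollback point is always the previous printed value.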
```yaml
services:
  openclaw:
    image: ghcr.io/openclaw/openclaw:<pin-a-digest-not-latest>
    shm_size: "1g"
    mem_limit: "4g"
    # keep the control plane on 127.0.0.1; terminate TLS at the reverse proxy
    # align json-file rotation and healthcheck start_period with the baseline article
```
Tip: if you need a second heavier browser stack on the same host, read multi-instance isolation for ports and volumes before cloning this runbook.
This section lists only facts you can point to in config or monitoring, not vibes like "the browser feels unstable". Treat the numbers as starting bands and validate them with your own skills.
- Treat shm_size as co-equal with mem_limit: start at 512MB, validate long captures, then consider a 1GB tier (a common band in 2026 community documentation).
- If healthcheck start_period is too short, Compose restarts the container during warm-up, which looks like random flakiness; align those fields with the baseline article.
- Warning: do not rotate reverse-proxy certificates, model keys, and browser resource caps in the same change window; triple moves make rollback non-bisectable. TLS paths live in the reverse-proxy guide.
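As a sketch of the start_period point, a hedged healthcheck fragment: the endpoint, port, command, and timings here are assumptions to be replaced with the values from the baseline article.

```yaml
services:
  openclaw:
    healthcheck:
      # endpoint and port are placeholders, not the official health path
      test: ["CMD", "curl", "-fsS", "http://127.0.0.1:8080/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 90s   # long enough to cover WASM and browser warm-up
```

The key field is start_period: failures inside that window do not count against retries, so a slow warm-up no longer reads as random restart flakiness.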
Once daytime traffic is stable, ask the organizational question: may this same instance run heavy browser batches overnight? Answering after an outage is expensive.
| Pattern | When it fits | Main risk |
|---|---|---|
| Single mixed instance | Personal pilots and light skills without long captures | Peak stacking is invisible; one OOM takes down channels and tools together. |
| Dedicated browser profile | Two compose stacks on one machine with split volumes | Requires strict isolation checklists; see multi-instance article. |
| Dedicated 24/7 node | Team production needing predictable SLA | Higher cost, but you get sign-off capacity and auditable change history. |
Ad-hoc VPS tuning is flexible early, yet production OpenClaw needs three artifacts that informal stacks often lack: reserved capacity, pinned images, and ticketed changes. When skills must coexist with iOS builds, desktop handoffs, and always-on agents, moving browser peaks to a predictable 24/7 footprint beats endless parameter whack-a-mole. For teams that need dedicated, region-stable Mac capacity with operational clarity, VpsMesh Mac Mini cloud rental is usually the better fit: easier headroom for browser bursts and disk, aligned with the Mac Mesh collaboration narrative. See pricing and help center.
Chromium-style renderers lean on shared memory for large buffers. When /dev/shm is tiny you can see intermittent white screens or tab crashes while CPU still looks fine. Raise shm_size to 512m or 1g first, then cross-check memory lines in the Exit 137 primer.
Not always. Start from official compose snippets with least privilege. If you must add privileged caps, document the threat trade-off. Channel hardening references live in the production hardening checklist.
Shared Gateways amplify resource contention when browser peaks spike. Add shm and memory rows to the team resource table and review jointly with the multi API key compartments runbook.