Measurement study · prototype (N=10) · pilot campaign

Can Matrix carry a tactical radio network?

A mobile ad-hoc mesh of vehicle-borne Matrix homeservers, connected by intermittent 1–9 kbit/s federation links. Matrix does no multi-hop routing — cross-node delivery happens only opportunistically, through room-DAG backfill. This study measures whether that DAG behaves as a usable delay-tolerant, store-carry-forward layer.

k3s + Calico + Chaos Mesh · 70-node design · 10-node prototype · Synapse ×10 federation · real shaped-link data

Live demonstration

Replay of a real run. Two convoys (blue / orange); the link between them flaps. Watch cross-convoy messages stall during a partition and burst through on the heal — store-carry-forward in action.
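The stall-then-burst pattern described above can be illustrated with a toy queue model. This is only a sketch under stated assumptions: it is not Matrix room-DAG backfill, it ignores bandwidth limits entirely, and the names `simulate_link`, `link_up` and the partition schedule are hypothetical, chosen for illustration.

```python
from collections import deque


def simulate_link(send_times, link_up, duration):
    """Toy store-carry-forward queue for one cross-convoy link.

    send_times: sorted times (s) at which a message is created in convoy A.
    link_up:    function t -> bool, True while the inter-convoy link is healed.
    Returns a list of (sent_at, delivered_at) pairs.
    """
    pending = deque()
    delivered = []
    i = 0
    for t in range(duration):
        # Store: queue every message created at or before this tick.
        while i < len(send_times) and send_times[i] <= t:
            pending.append(send_times[i])
            i += 1
        # Forward: while the link is up, drain the whole backlog at once
        # (an assumption: real links drain at the shaped bandwidth).
        if link_up(t):
            while pending:
                delivered.append((pending.popleft(), t))
    return delivered


def link_up(t):
    # Hypothetical schedule: partition for 10 <= t < 30, healed otherwise.
    return not (10 <= t < 30)


result = simulate_link([5, 12, 18, 25, 35], link_up, duration=40)
for sent_at, delivered_at in result:
    print(f"sent {sent_at:>2}s  delivered {delivered_at:>2}s  "
          f"waited {delivered_at - sent_at}s")
```

Messages created during the partition all share the delivery time of the first healed tick, which is the burst visible in the replay.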


First results

Every figure below is generated from real orchestrator logs of the pilot campaign (see methods). Headline numbers:

Figure: delivery ratio vs bandwidth. Delivery ratio collapses as the federation link tightens (full mesh, fixed run length). Points are replicate runs; the line is the mean.

Figure: latency CDF. Convergence-latency CDF per bandwidth: the low-bandwidth tail stretches out as the backlog drains slowly across the throttled link.

Figure: goodput vs payload. Goodput vs payload size. Federation framing/signatures dominate, so the useful-payload ratio collapses for short messages. Model: see paper (TODO).

Figure: partition gating. Two-convoy MANET: intra-convoy traffic flows; cross-convoy traffic is gated by the partition, and only the heal windows let it through.
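The goodput-vs-payload caption can be made concrete with a small calculation. This is a minimal sketch that assumes the useful-payload ratio is simply S / (S + O), with O fixed at the ~762 B per-PDU federation overhead quoted on this page; the paper's own model may differ, and the payload sizes below are hypothetical.

```python
OVERHEAD_BYTES = 762  # per-PDU federation overhead quoted on this page


def useful_payload_ratio(payload_bytes, overhead_bytes=OVERHEAD_BYTES):
    """Fraction of bytes on the wire that are useful payload.

    Assumes a fixed per-message overhead, independent of payload size.
    """
    return payload_bytes / (payload_bytes + overhead_bytes)


# Hypothetical payload sizes, from a short status message to a larger report.
ratios = {size: useful_payload_ratio(size) for size in (16, 128, 1024, 8192)}
for size, ratio in ratios.items():
    print(f"{size:>5} B payload -> {ratio:.1%} useful")
```

Under this assumption a 16 B message is only about 2% payload, while an 8 KiB message is above 90%, which is the collapse for short messages that the figure describes.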

Per-run summary

run | scenario | bw (kbit/s) | delivery | median lat (s) | goodput | non-conv. | status

Scaling to N=70: one requirement, one real barrier

⚠ Results updating. An earlier headline ("a fundamental convergence ceiling, ~37% at N=70") was traced to a setup artifact, not a protocol limit. The large-N scaling is being re-measured with the corrected setup (ETA a few hours); figures below refresh automatically.

A setup requirement — pre-establish the room (not a hard limit)

Joining 70 servers concurrently runs into rate limiting at the resident server and forks the room DAG, so membership never converges and delivery appears capped (~33%). This is a matter of deployment procedure, not a barrier: with sequential pre-establishment plus a convergence gate (form the operational room while connected, before dispersing), all 70 servers converge, and a converged N=70 room then delivers ~95% at ample bandwidth.
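The sequential pre-establishment procedure could be sketched as follows. This is an illustration, not the study's orchestrator: `join` and `members_seen_by` are hypothetical callables that a real harness would back with calls to each homeserver, and gating after every single join (rather than once before dispersal) is an assumption made here for simplicity.

```python
import time


def establish_room(servers, join, members_seen_by, timeout_s=60.0,
                   poll_s=1.0, sleep=time.sleep, clock=time.monotonic):
    """Join servers one at a time, gating each step on membership convergence.

    join(server):            hypothetical hook that makes `server` join the room.
    members_seen_by(server): hypothetical hook returning the member set that
                             `server` currently believes the room has.
    """
    joined = []
    for server in servers:
        join(server)
        joined.append(server)
        expected = set(joined)
        deadline = clock() + timeout_s
        # Convergence gate: every joined server must report the same member
        # set before the next server is allowed to join.
        while any(set(members_seen_by(s)) != expected for s in joined):
            if clock() > deadline:
                raise TimeoutError(
                    f"membership did not converge after {server}")
            sleep(poll_s)
    return joined


# In-memory stand-in so the sketch runs without any homeserver:
# all servers share one view of the room, so the gate passes immediately.
room = set()
order = establish_room(
    [f"hs{i}" for i in range(1, 4)],
    join=room.add,
    members_seen_by=lambda server: room,
    sleep=lambda _seconds: None,
)
print(order)
```

Joining one server at a time keeps the join load on the resident server low, and the gate ensures the fleet disperses only once every member agrees on who is in the room.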

The real barrier — bandwidth / fan-out

Each message fans out into N−1 point-to-point federation transactions, each carrying ~762 B of federation overhead, all through one shared radio. Required per-radio bandwidth therefore grows with fleet size (the fan-out tax): the fluid law r·(N−1)·(S+O) sets the shape, and the measured requirement sits well above it because federation is round-trip/chatter-bound. This is what makes a 70-vehicle operational room infeasible at tactical link rates.
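To get a feel for the fan-out tax, the fluid law can be evaluated directly. The sketch below assumes that r is the message rate originated by one server (messages/s), S is the payload size in bytes, and O is the ~762 B per-PDU overhead; these readings of the symbols are assumptions, as are the example rate and payload. It counts only the sender's outbound fan-out and none of the round-trip chatter, so it is a floor, not a prediction.

```python
def fluid_floor_bits_per_s(msgs_per_s, n_servers, payload_bytes,
                           overhead_bytes=762):
    """Fluid-law floor r*(N-1)*(S+O), converted from bytes/s to bit/s.

    msgs_per_s:     assumed meaning of r (messages/s sent by one server).
    payload_bytes:  assumed meaning of S.
    overhead_bytes: assumed meaning of O (762 B as quoted on this page).
    """
    bytes_per_s = msgs_per_s * (n_servers - 1) * (payload_bytes + overhead_bytes)
    return bytes_per_s * 8


# Hypothetical workload: one 100 B message every 10 s from each server.
floor_10 = fluid_floor_bits_per_s(0.1, 10, 100)
floor_70 = fluid_floor_bits_per_s(0.1, 70, 100)
print(f"N=10: {floor_10 / 1000:.1f} kbit/s floor")
print(f"N=70: {floor_70 / 1000:.1f} kbit/s floor")
```

For this hypothetical workload the floor is roughly 6.2 kbit/s at N=10 and roughly 47.6 kbit/s at N=70, already several times the 1–9 kbit/s links considered in this study.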

33% → 95%: N=70 delivery, naive join vs pre-established room
70 / 70: members converged with sequential setup
762 B: federation PDU overhead, constant in N
Figure: required bandwidth vs fleet size. Required per-radio bandwidth (B₅₀/B₁₀₀) vs fleet size, compared with the fluid floor (the fan-out tax). Large-N points are being re-measured.

Figure: overhead vs N. Federation PDU overhead is flat in N (origin-only signature, fixed auth_events), so the N-scaling is pure fan-out, not bigger messages.

How it works

Layer | Choice | Role
Cluster | k3s (single host) | every federation edge is synthetic
Link emulation | Chaos Mesh NetworkChaos | bandwidth shaping + scheduled partition/heal
Static graph | Calico NetworkPolicy | which servers may peer at all
Homeserver | Synapse ×10 (×70 design) | one per vehicle; federation backoff tuned
Database | PostgreSQL, one DB/node | local, unthrottled
Harness | async orchestrator, one clock | traffic + the §6 measurement log

Only the server-to-server (federation) links are shaped; the client↔own-server link inside each vehicle is never impaired. See the paper for the full method, the closed-mesh gotchas, and limitations.