This is a framework under active development — not a committed build. The schema below represents the agreed-on long-term architecture. Specific hardware choices within each node are still being evaluated. Competing options are documented. The Raspberry Pi testbed runs in parallel and feeds real data into this design process.
3 total racks · 100 Gbps fabric · petabyte storage target · 0 cloud dependencies · ∞ local sovereignty
01
Rack 1 — primary lab (core fabric)
The first rack is the primary compute and network fabric. All three racks share this network. 30–50% empty space reserved from day one — this is infrastructure, not a PC build.
| Slot | Role | Function | Options / notes |
|---|---|---|---|
| Node 0 — SWITCH | 100 GbE fabric | All nodes communicate through this. No direct point-to-point hacks. Unifying nervous system of the entire lab. Leave open ports for future expansion. | Option A: MikroTik CRS354 (48× GbE + 2× 100G QSFP28), ~$700. Option B: Arista 7060X / Juniper QFX series (enterprise, ~$3k+). DECIDE: managed vs unmanaged initially. |
| Node 1 — POWER | Surge · UPS · filter | Clean power for all nodes. UPS provides runtime during outages. Surge protection for all RF and SDR hardware — switching PSUs generate noise that contaminates SDR receive chains. | APC Smart-UPS 1500VA RT or equivalent, clean sine wave output. PDU with metered outlets per rack. RF note: SDR node on an isolated PDU branch if possible. |
| Node 2 — COMPUTE | Local compute · GPU/NPU farm | Heavy background processing: photogrammetry (ODM/COLMAP), Gaussian splatting (Nerfstudio/Postshot), LiDAR fusion (PDAL), digital twin rendering (UE5 + Cesium). Kubernetes breakaway to Rack 2 for burst workloads. | Option A — single workstation: Threadripper 7960X + RTX 4090 + 256 GB RAM (~$8–12k). Option B — mini-rack fabric: 3× Lenovo ThinkCentre Tiny (8–16c, 64 GB each) + separate 2U GPU server (~$5–8k total). NOT DECIDED — see Compute Paths tab. |
| Node 3 — NAS | Petabyte storage | All persistent data: raw drone imagery, point clouds, sensor time-series, processed outputs, long-term biome logs. ZFS initially, Ceph as the cluster grows. NVMe cache tier for active working sets. | Option A: TrueNAS Scale on a mini-PC + 4–8 HDDs in RAIDZ2. Option B: Synology DS1823xs+ / QNAP TES-1885U (enterprise NAS). Breakaway to Rack 3 as storage scales. 10GbE NFS/iSCSI to all compute nodes. PLAN: Ceph migration path from ZFS. |
| Node 4 — SDR HEAD | Full-spectrum analysis · recording | Dedicated RF node physically isolated from switching PSUs. RTL-SDR (Phase 0), HackRF (Phase 1), LimeSDR Mini 2 / USRP (Phase 2+). Spectrum monitoring, NOAA/APRS/ADS-B receive, Meshtastic gateway, Reticulum rnode interface. | Hardware: Raspberry Pi 5 or small x86 mini-PC (clean USB power). Separate thermal zone — RF hardware runs warm. Isolation note: SDR performance degrades near switching PSUs; physical separation matters. |
| Node 4-sub — RF AUX | TX/RX · amp · atten · antenna | All external RF hardware cabled to the SDR head: transceivers, amplifiers, attenuators, bandpass filters, antenna splitters. Modular — each frequency band or function is a discrete unit that can be swapped independently. | TX/RX: per active band (see Protocols page). LNA (low-noise amplifier) for receive sensitivity. Bandpass filters to reject out-of-band interference. SPEC: full RF chain per band on Protocols page. |
| Node 5 — WORKSTATION | Daily desktop · 6× 1440p · EUD sync | Primary operator interface. 6× 1440p displays across 6 virtual desktops. Synced with laptops and phone for mobile continuity. Drone GCS, live Grafana dashboards, QGIS, field planning, comms. | High-core desktop (AMD Ryzen 9 or Intel Core Ultra 9). GPU with multi-display output (RTX 4080 can drive 4+ displays). Syncs with: Lenovo Yoga (canvas/field), ThinkPad (rugged field), S25 Ultra (phone). EUD = End User Device — the phone and laptops are extensions of this desktop. |
| Node 6+ — SERVICES | Git · CI/CD · internal tools | Version control for configs, firmware, processing scripts. Self-hosted Gitea or GitLab. Internal documentation wiki. Monitoring stack (Prometheus/Grafana meta). Expandable — this slot covers future services as they are identified. | Gitea (lightweight) or GitLab CE (full-featured). Prometheus + Alertmanager for infrastructure monitoring. Containerised (Docker/Podman) on the compute node or dedicated. SPEC LATER — lower priority than data and RF nodes. |
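The RF AUX note that an LNA matters for receive sensitivity can be made concrete with the Friis cascade formula: the noise figure of the first stage dominates the whole chain, which is why the LNA belongs at the antenna, before any lossy coax run. A minimal sketch — the gains and noise figures below are illustrative placeholders, not measured values for any specific hardware:

```python
import math

def cascade_noise_figure(stages):
    """Friis formula: total noise figure of cascaded RF stages.

    stages: list of (gain_db, noise_figure_db) tuples, in signal order.
    Returns the total noise figure in dB.
    """
    total_f = 0.0
    cum_gain = 1.0
    for i, (gain_db, nf_db) in enumerate(stages):
        f = 10 ** (nf_db / 10)   # noise factor (linear)
        g = 10 ** (gain_db / 10)  # gain (linear)
        if i == 0:
            total_f = f
        else:
            # Later stages' noise is divided by all gain ahead of them.
            total_f += (f - 1) / cum_gain
        cum_gain *= g
    return 10 * math.log10(total_f)

# Illustrative chain: LNA (20 dB gain, 1 dB NF),
# 10 m coax (-3 dB loss, 3 dB NF), SDR front end (0 dB, 6 dB NF).
lna_first = cascade_noise_figure([(20, 1), (-3, 3), (0, 6)])
coax_first = cascade_noise_figure([(-3, 3), (20, 1), (0, 6)])
print(f"LNA at antenna:  {lna_first:.1f} dB total NF")
print(f"LNA after coax:  {coax_first:.1f} dB total NF")
```

With these placeholder numbers the chain with the LNA at the antenna lands near 1 dB total noise figure, while putting the coax first pushes it past 4 dB — the same hardware, several dB of receive sensitivity apart.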
02
Rack 2 — compute breakaway (Kubernetes burst)
Rack 2 is a Kubernetes worker cluster that Rack 1's orchestration node can schedule workloads onto. It does not need to be populated at Phase 0 — it is reserved space. When multi-drone simultaneous ODM processing exceeds what Rack 1's compute can handle, Rack 2 workers absorb the overflow without any architectural change.
Rack 2 — role
| Purpose | Kubernetes worker nodes for burst compute |
| Trigger | Multi-drone simultaneous processing saturates Rack 1 |
| Nodes | 2–6 mini PCs or 1U servers depending on workload |
| Network | 100GbE uplink to Rack 1 switch |
| Storage | Reads/writes via Rack 1 NAS over NVMe-oF or NFS |
| GPU option | Add GPU server here to offload Nerfstudio / COLMAP at scale |
| Phase 0 status | Empty — reserve rack space only |
Rack 2 — scaling trigger events
| Trigger 1 | 3+ drones operating simultaneously → ODM queue backs up |
| Trigger 2 | Nerfstudio / Gaussian splat generation takes >6 hrs on Rack 1 |
| Trigger 3 | Real-time NDVI classification required during flight (not post-processing) |
| Trigger 4 | LiDAR fusion + photogrammetry running simultaneously |
| Response | Add worker nodes to Rack 2. K8s schedules jobs automatically. No code changes. |
03
Rack 3 — storage breakaway (Ceph / distributed)
Rack 3 absorbs storage growth that exceeds Rack 1's NAS capacity. ZFS on Rack 1 migrates to Ceph distributed storage spanning Racks 1 and 3. This is the petabyte path. Not needed until Rack 1 NAS is >70% full.
Rack 3 — role
| Purpose | Distributed storage cluster (Ceph OSD nodes) |
| Trigger | Rack 1 NAS exceeds ~70% capacity |
| Technology | Ceph RADOS distributed object/block/file store |
| Drive density | High-density JBOD enclosures (24–60 drives per 4U) |
| Network | 100GbE uplink to Rack 1 switch (Ceph needs fat pipes) |
| Scale target | Petabyte-range raw capacity |
| Phase 0 status | Not deployed — reserve rack space |
Ceph vs ZFS — transition decision
| ZFS (Rack 1 NAS) | Simple · single node · fast setup · Phase 0–2 |
| Ceph (Rack 1+3) | Distributed · no SPOF · petabyte-capable · Phase 2+ |
| Migration path | ZFS data exported to Ceph OSD nodes via rsync/rclone migration. TrueNAS Scale supports Ceph natively (Bluefin+). |
| MinIO alternative | S3-compatible object store. Simpler than Ceph for unstructured data (imagery, point clouds). Less suited to block storage. |
| Decision point | Evaluate at Rack 1 NAS ~50% full. Don't over-engineer before data volume is real. |
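The "evaluate at ~50% full, migrate at ~70%" rule above can be automated: given current pool usage and an observed ingest rate, project when the trigger lands. A minimal sketch, assuming a simple linear fill rate — the numbers are illustrative placeholders, not measurements from the testbed:

```python
def days_until_threshold(capacity_tb, used_tb, ingest_tb_per_day, threshold=0.70):
    """Linear projection of when a storage pool crosses a fill threshold.

    Returns days remaining, or 0.0 if the pool is already past it.
    """
    headroom_tb = capacity_tb * threshold - used_tb
    if headroom_tb <= 0:
        return 0.0
    return headroom_tb / ingest_tb_per_day

# Illustrative: 200 TB pool, 90 TB used, ~0.4 TB/day of compressed mission data.
days = days_until_threshold(capacity_tb=200, used_tb=90, ingest_tb_per_day=0.4)
print(f"~{days:.0f} days until the 70% Ceph-migration trigger")
```

Wiring this into the Prometheus/Grafana stack as an alert (rather than a manual check) fits the monitoring role already assigned to Node 6+.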
04
Auxiliary systems
Laptop 1 — Lenovo Yoga (canvas / field)
| Form factor | 2-in-1 · touchscreen · digital canvas |
| Primary use | On-site drone control · field planning · UE5 editing · sketching / annotation |
| Sync | Continuous sync with Rack 1 workstation via LAN/VPN |
| RF role | Drone GCS (Dronelink / DJI Fly) · Meshtastic app · Grafana mobile dashboard |
| Note | Touchscreen canvas function for site annotation on drone orthomosaics |
Laptop 2 — Lenovo ThinkPad (rugged / portable)
| Form factor | Durable · keyboard-first · field utility |
| Primary use | Scripting · config · SSH to rack · light processing · field documentation |
| Sync | Synced with Rack 1 via LAN/VPN |
| RF role | Fallback GCS · terminal access to SDR node · Reticulum client |
Phone — Samsung S25 Ultra
| Primary use | Mobile EUD · drone GCS · Meshtastic / MeshCore app · Infiray thermal camera |
| MANET / SDR | GoTenna MANET or equivalent for mobile mesh node capability — evaluate options |
| Digital sketchpad | S Pen annotation · site sketching · field photography |
| Sync | Continuous with Rack 1 workstation and laptops |
| Reticulum | Sideband app (Android) — full Reticulum client over BT rnode |
Mesh radio stack
| Meshtastic | Phase 0 edge nodes · SenseCAP P1-Pro · Heltec V3 drone relay — active now |
| MeshCore | Regional backbone evaluation · 64-hop repeater model — parallel experiment |
| Reticulum | Hub routing daemon (rnsd on SDR head) · long-term transport layer — active now |
| Other | AREDN (licensed ham mesh · Phase 3) · LoRaWAN (sensor density option · Phase 2) — evaluate later |
05
Pi testbed — parallel track note
The Raspberry Pi 4 testbed is not part of the mini-rack schema — it is a parallel validation track. Pi #1 (MQTT/InfluxDB/Grafana/Reticulum) and Pi #2 (ODM/Potree/API ingest) are running the Phase 0 weather node pipeline right now and generating real data. That data and the lessons from that pipeline design feed directly into the mini-rack schema. When the mini-rack Rack 1 is deployed, the Pi testbed does not get replaced — the SDR head node (Node 4) absorbs its radio functions, and the Pi #2 ODM pipeline migrates to Rack 1 compute (Node 2). The Pis themselves become edge-only — firmware flashing, field-site sensor nodes, remote relay nodes where mains power isn't available.
Pi vs x86 for specific tasks: Pis remain the correct hardware for battery/solar-powered outdoor edge nodes (SenseCAP P1-Pro uses a similar ARM SoC). For anything involving parallel processing, GPU compute, simultaneous multi-stream ingest, or petabyte storage, x86 with proper network fabric is the only viable path. These are complementary, not competing.
Framework, not commitment. All four AI sources consulted independently converged on the three-plane separation. What is not settled is the specific implementation within each plane. Competing options documented side by side without picking winners.
01
Three-plane architecture
PLANE 1
CONTROL
Nervous system — always on, low bandwidth, safety-critical
Drone swarm kinematics, heartbeat, GPS sync, collision avoidance, metadata sidecars, sensor telemetry, MAVLink commands. Does NOT carry payload data. Must function independently of all other planes.
LoRa 915 MHz · Meshtastic → MeshCore → Reticulum
~1–21 kbps · Solar autonomous
PLANE 2
INGEST
Bloodstream — burst data transfer, proximity-dependent
Raw sensor payload: imagery, point clouds, multispectral, thermal. Two competing models — not yet resolved:
Option A Physical store-and-forward
| Method | Drone records to NVMe. Lands at dock. USB 3.2 / Thunderbolt transfer at 10–40 Gbps. |
| Bandwidth | 10–40 Gbps at dock (USB 3.2 Gen 2 through Thunderbolt 4) |
| Latency | Data at hub only after landing |
| Infrastructure | Docking stations across site with power + NVMe |
| Status | Phase 0 equivalent — current approach |
Option B Aerial burst (60 GHz / FSO)
| Method | Drone hovers near relay tower. Dumps data via 60 GHz mmWave or FSO laser at 20–40 Gbps. |
| Bandwidth | 802.11ay: up to 100 Gbps theoretical |
| Latency | Near-real-time while drone still flying |
| Range limit | 60 GHz: ~50–200 m LOS only. Rain-sensitive. |
| Status | Phase 3 — requires relay tower infrastructure first |
Hybrid is the likely long-term answer: physical offload for bulk archival, mmWave/FSO for priority live streams. But Option B requires relay towers. Build Option A first and validate the data model before committing to tower infrastructure.
PLANE 3
REFINERY
Brain — processing, storage, output generation
Receives all data from Planes 1 and 2. Fuses sensor streams. Produces digital twins, NDVI maps, point clouds, climate profiles, carbon baselines, ecological models. Fully local — no cloud.
This is Rack 1 (Node 2 compute + Node 3 NAS) expanding into Racks 2 and 3 as scale demands. See Mini-rack schema tab for node detail.
02
Physical backbone — options
| Option | Bandwidth | Span | Phase | Notes |
|---|---|---|---|---|
| Cat6A / 10GbE copper | 10 Gbps | 100 m / run | Now | Simplest. Rack 1 internal fabric. Trenching needed for cross-site outdoor runs. |
| Single-mode fibre (SMF) | 100 Gbps+ | km-scale | Phase 2+ | Fibre ring connecting relay towers across site. Immune to EM. Gemini/Grok recommendation for field scale. |
| 60 GHz mmWave P2P | 1–10 Gbps | 200–500 m LOS | Phase 2 | Wireless fibre substitute for tower spans. Rain-sensitive. Evaluate before fibre is laid. |
| Wi-Fi HaLow 802.11ah | ~8 Mbps | 1+ km | Phase 1–2 | Sub-GHz Wi-Fi. Better range than standard Wi-Fi. Useful for sensor-dense field areas. Not for drone data ingest. |
| FSO / laser | 1–10 Gbps | 500 m–2 km | Phase 3 | Licence-free optical wireless. Wind sway and fog are real problems. Tower-to-tower where fibre is impractical. |
03
Edge triage — reduce before moving
Option A Full raw archival · process at hub
| Model | Record everything raw. Transfer to Rack 1/3. Process there. |
| Strength | No data loss · re-processable with future algorithms |
| Weakness | Petabyte storage required faster |
| Phase fit | Phase 0–2. Correct while algorithms are being validated. |
Option B Edge triage · Jetson Orin on drone
| Model | Onboard NVIDIA Jetson Orin performs first-pass classification. Discards sky, static areas, below-threshold data. Transfers only changed/relevant data. |
| Hardware | Jetson Orin NX · 100 TOPS · ~150–400 g · ~10–25 W |
| Strength | Orders of magnitude smaller transfer volume |
| Weakness | Irreversible data loss if triage model is wrong |
| Phase fit | Phase 3+. Do not discard raw data until algorithms are validated on archival data first. |
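Option B's first-pass triage can be sketched as a simple vegetation filter: keep a frame only if enough pixels clear an NDVI threshold. A minimal numpy sketch — the thresholds are illustrative assumptions, and a real Jetson pipeline would run a trained classifier rather than this heuristic:

```python
import numpy as np

def keep_frame(nir, red, ndvi_min=0.2, coverage_min=0.05):
    """First-pass edge triage: keep a frame only if at least
    `coverage_min` of its pixels look like vegetation
    (NDVI = (NIR - red) / (NIR + red) above `ndvi_min`)."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    ndvi = (nir - red) / np.clip(nir + red, 1e-9, None)  # avoid divide-by-zero
    vegetation_fraction = np.mean(ndvi > ndvi_min)
    return bool(vegetation_fraction >= coverage_min)

# Illustrative frames: vegetation reflects NIR strongly; sky does not.
veg_frame = keep_frame(nir=np.full((64, 64), 0.6), red=np.full((64, 64), 0.1))
sky_frame = keep_frame(nir=np.full((64, 64), 0.1), red=np.full((64, 64), 0.1))
print(veg_frame, sky_frame)  # vegetation kept, sky discarded
```

The Phase-fit warning in the table applies directly here: a wrong `ndvi_min` silently discards data forever, which is exactly why this path waits until the models are validated against archival data.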
Metadata sidecars — bridge both options: Regardless of which path is chosen, the LoRa control plane should transmit a lightweight metadata sidecar per frame: timestamp, GPS, altitude, orientation quaternion, sensor mode, exposure. Small enough for LoRa. Enables georeferencing before bulk transfer. Design this into firmware from Phase 0.
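To check that the sidecar actually fits a LoRa payload, here is a sketch of a fixed-width binary packing of the fields listed above. The field widths and scaling factors are assumptions chosen for illustration, not a committed wire format; for reference, Meshtastic payloads top out around 237 bytes:

```python
import struct

# Fixed-width sidecar: epoch seconds, lat/lon as 1e-7 degrees,
# altitude in decimetres, orientation quaternion scaled to int16,
# sensor mode id, exposure in microseconds.
SIDECAR_FMT = "<IiiH4hBH"  # little-endian, no padding

def pack_sidecar(ts, lat, lon, alt_m, quat, mode, exposure_us):
    q = [int(round(c * 32767)) for c in quat]  # unit quaternion -> int16
    return struct.pack(SIDECAR_FMT, ts,
                       int(round(lat * 1e7)), int(round(lon * 1e7)),
                       int(round(alt_m * 10)), *q, mode, exposure_us)

frame_meta = pack_sidecar(ts=1735689600, lat=-33.8688, lon=151.2093,
                          alt_m=120.5, quat=(1.0, 0.0, 0.0, 0.0),
                          mode=3, exposure_us=1200)
print(len(frame_meta), "bytes")  # 25 bytes — comfortably inside one LoRa packet
```

At 25 bytes per frame, even LoRa SF12 at ~1 kbps carries several sidecars per second — the control plane handles this without strain while the bulk imagery waits for the data plane.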
01
Sensor payload — data volume reality check
LoRa is the wrong unit of analysis for data volume. LoRa handles the control plane. The data plane is several orders of magnitude larger and requires completely different infrastructure.
| Sensor | Raw rate (est.) | Per 30-min mission | Notes |
|---|---|---|---|
| 16K visible (200MP single sensor) | ~660 MB/frame | ~100–500 GB | Even 1 fps = ~5 Gbps raw. Compressed AV1/H.266 ~10–50:1. |
| Full-sphere visible array | ~2–5 Gbps | ~450 GB–1.1 TB | Multiple 16K tiles stitched. Sensor count TBD — exact array spec open. |
| Dense LiDAR (128-line, 20 Hz) | ~1–10 Gbps raw | ~200 GB–2 TB | Draco compression ~5–10:1. Point density and scan rate variable. |
| Multispectral (5–10 bands) | ~500 Mbps–2 Gbps | ~100–450 GB | Reduced per-band resolution. NDVI derivable in post. |
| Hyperspectral (200+ bands) | ~2–8 Gbps | ~450 GB–1.8 TB | Phase 3+. Heavy classification compute required. |
| Thermal IR | ~100–500 Mbps | ~22–110 GB | Lower resolution than visible. Radiometric calibration required. |
| IMU / GPS / telemetry | <1 Mbps | <500 MB | Negligible. Carried on LoRa control plane. |
| Single drone — full payload | ~2.5–20 Gbps | ~1–5 TB raw | Compressed practical: ~100–500 GB per mission. |
| 12-drone swarm | ~30–240 Gbps | ~12–60 TB | Simultaneous ingest. Distributed storage required. |
| Annual (12 drones, daily ops) | — | ~4–22 PB / year | Petabyte accumulation timescale. Physical storage fabric, not radio. |
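The annual figure in the last row follows from straightforward arithmetic, worth making explicit since it drives the entire storage plan. A sketch using the table's own per-mission raw range (1–5 TB) and assuming one mission per drone per day:

```python
def annual_petabytes(drones, missions_per_day, tb_per_mission, days=365):
    """Raw accumulation: drones x missions/day x TB/mission x days, in PB."""
    return drones * missions_per_day * tb_per_mission * days / 1000

low = annual_petabytes(drones=12, missions_per_day=1, tb_per_mission=1)
high = annual_petabytes(drones=12, missions_per_day=1, tb_per_mission=5)
print(f"~{low:.1f}-{high:.1f} PB/year raw")  # ~4.4-21.9 PB/year
```

This is the raw ceiling; the compressed practical figure (~100–500 GB per mission) lands one order of magnitude lower, but still reaches petabyte scale within a few years of daily swarm operations.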
LoRa implication: LoRa SF7 (fastest) = ~21 kbps. A single compressed 16K frame at 50 MB would take ~5 hours to transmit over LoRa. LoRa carries metadata, telemetry, and control signals only. This is the correct separation of concerns — not a limitation to fix.
02
Bandwidth ladder — control vs data planes
| Link | Bandwidth |
|---|---|
| LoRa SF12 (Phase 0) | ~1 kbps |
| LoRa SF7 (fastest) | ~21 kbps |
| Wi-Fi HaLow 802.11ah | ~8 Mbps |
| SDR OFDM (LimeSDR) | ~30 Mbps |
| AREDN 802.11ac | ~150 Mbps |
| Wi-Fi 6E (drone link) | ~3 Gbps |
| USB 3.2 / Thunderbolt (dock) | ~20–40 Gbps |
| 60 GHz mmWave (802.11ay) | ~40 Gbps |
| 100 GbE fibre (rack fabric) | 100 Gbps |
| NVMe-oF / InfiniBand | 200+ Gbps |
The control plane (LoRa) and data plane (everything from Wi-Fi 6E up) are not on the same scale. The bandwidth ladder above is logarithmic in practice — roughly eight orders of magnitude separate LoRa SF12 (~1 kbps) from NVMe-oF (200+ Gbps).
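To see the ladder's spread concretely, here is the time to move one compressed mission (~300 GB, mid-range of the payload table) over several rungs. Link rates are taken from the ladder; real-world throughput will sit below line rate:

```python
def transfer_time_s(payload_bytes, bits_per_second):
    """Idealised transfer time: payload converted to bits over line rate."""
    return payload_bytes * 8 / bits_per_second

MISSION_BYTES = 300e9  # ~300 GB compressed mission

links = {
    "LoRa SF7":         21e3,
    "Wi-Fi HaLow":      8e6,
    "Wi-Fi 6E":         3e9,
    "Thunderbolt dock": 40e9,
    "100 GbE fabric":   100e9,
}
for name, rate in links.items():
    t = transfer_time_s(MISSION_BYTES, rate)
    print(f"{name:>16}: {t:>14.1f} s  ({t / 3600:.2f} h)")
```

The spread runs from roughly 24 seconds on the rack fabric to several years over LoRa SF7 — the numerical version of "LoRa is the wrong unit of analysis for data volume."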
The compute architecture is not decided. Two structurally different paths are documented below. The critical decision is the next hardware purchase — a single powerful machine commits to Path A; a mini PC as the first fabric node opens Path B. Both are valid. Neither forecloses the other permanently.
01
Path A — single powerful workstation
Workstation spec (representative)
| CPU | AMD Threadripper 7960X (24c) or Intel Core Ultra 9 285K |
| RAM | 128–256 GB DDR5 ECC |
| GPU | NVIDIA RTX 4090 · 24 GB VRAM · ODM + Nerfstudio + UE5 + COLMAP |
| NVMe | 2–4 TB (active working set) |
| HDD array | 48–200 TB ZFS (long-term archival) |
| Network | 10GbE NIC → managed switch |
| Est. cost | ~$5,000–$12,000 depending on storage |
Path A — assessment
| Strengths | Simple · low admin overhead · single maintenance point · works today |
| Ceiling | Hits wall at multi-drone simultaneous ingest + real-time processing |
| Breaks at | Swarm scale — 5+ drones simultaneous ODM saturates single CPU |
| Upgrade path | Second machine creates ad-hoc cluster — not designed, just bolted on |
| Pi integration | Pis demoted to edge-only. Workstation absorbs all Pi data roles. |
02
Path B — distributed mini-rack fabric
Fabric node roles (Rack 1)
| Node A — orchestration | Lenovo Tiny M90q · 16c · 64 GB · Proxmox / K3s · Mosquitto · InfluxDB |
| Node B — storage | Mini PC + SATA bays · TrueNAS · ZFS RAIDZ2 · NVMe cache |
| Node C — compute | Mini PC · 16c · 64 GB · ODM · COLMAP · PDAL · no GPU initially |
| Node D — RF/SDR | Pi 5 or mini PC · RTL-SDR · rtl_433 · Direwolf · Reticulum rnode |
| GPU server (add later) | 2U server or eGPU enclosure · RTX 4090 · triggers when Nerfstudio / UE5 needed |
Path B — assessment
| Strengths | Horizontal scaling · role isolation · no rebuild on upgrade · maps to HPC |
| Ceiling | Designed not to have one — add nodes to Rack 2 via Kubernetes |
| Admin overhead | Higher than Path A — Kubernetes, Ceph, and fabric require more ops knowledge |
| Initial cost | Higher upfront for equivalent single-node performance |
| Est. total cost | ~$5,500–$8,000 for full Rack 1 (Grok estimate) |
| Pi integration | Pis remain as edge nodes — they are Node D class devices in the fabric model |
GPT: "This is a micro data center, not a PC build." The fabric model is what allows the testbed to evolve into swarm-scale operations without architectural resets. Path A is faster to deploy. Path B is what you build if you know the destination.
03
GPU — when and what
| Trigger event | GPU requirement |
|---|---|
| UE5 + Cesium digital twin | RTX 4080/4090 · 16–24 GB VRAM · cannot run on CPU alone |
| Gaussian splatting (Nerfstudio / Postshot) | RTX 3090 minimum · 4090 recommended · VRAM-limited |
| COLMAP dense reconstruction | Any CUDA GPU accelerates significantly · 8+ GB VRAM |
| Real-time NDVI / hyperspectral classification | Jetson Orin (edge drone) or A100/H100 (hub) for inference |
| Multi-drone simultaneous ODM | Multi-GPU or cluster — single RTX 4090 saturates at ~5 simultaneous missions |
| On-drone edge triage | NVIDIA Jetson Orin NX (100 TOPS) · separate from hub GPU |
The existing Pi 4s are not replaced — they are repurposed. Every new purchase should either be a direct upgrade to the data plane or the first node of the distributed fabric.
0
Current — two Pi 4s, home network, Phase 0 testbed
Pi #1: MQTT + InfluxDB + Grafana + Reticulum + RTL-SDR. Pi #2: ODM + Potree + API ingest. Home Wi-Fi. microSD storage. One drone. Data ceiling ~200 GB. This is the validation track — generating real data that informs every subsequent hardware decision.
1
First upgrade — managed 10GbE switch (both paths require this)
MikroTik CRS312 or equivalent managed 10GbE switch (~$300–$500). This is the single purchase that benefits both Path A and Path B equally. It separates drone burst transfer traffic from sensor MQTT traffic with QoS policies. All future nodes connect here. Upgrade the switch before buying any compute.
2
First storage node — NAS (both paths require this)
TrueNAS Scale on a mini PC with 4–8 HDDs in ZFS RAIDZ2, or a dedicated NAS appliance. Drone imagery moves here from microSD. Pi #2's ODM scratch space moves here. This prevents "mission lost when card fills" incidents and creates the shared storage that all future compute nodes will read from.
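Sizing that first RAIDZ2 pool is a one-liner worth writing down: two drives' worth of capacity go to parity, and ZFS pools are commonly kept below ~80% full for performance. A rough sizing sketch — the 80% figure is a widely used rule of thumb, not a hard limit, and ZFS metadata/padding overhead is ignored:

```python
def raidz2_usable_tb(drives, drive_tb, fill_limit=0.80):
    """Approximate practical capacity of a RAIDZ2 vdev:
    two drives of parity, then a fill-limit headroom margin."""
    if drives < 4:
        raise ValueError("RAIDZ2 needs at least 4 drives")
    return (drives - 2) * drive_tb * fill_limit

# Illustrative: 8x 20 TB drives -> ~96 TB practical.
print(f"~{raidz2_usable_tb(drives=8, drive_tb=20):.0f} TB practical")
```

Against the mission volumes on the Data Plane tab (~100–500 GB compressed per mission), that pool covers a few hundred missions — enough for Phase 0–1, and a concrete way to anticipate the Rack 3 breakaway trigger.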
3A
Path A fork — single workstation purchase
Buy Threadripper or Core Ultra 9 workstation with RTX 4090. ODM, Nerfstudio, UE5, COLMAP all move here. Pi #2 becomes API ingest + Potree serving only. This is fast, powerful, and adequate for single-drone operations and small swarms.
3B
Path B fork — first fabric compute node
Buy first Lenovo Tiny or equivalent mini PC as the orchestration + processing node. Install Proxmox or K3s. ODM moves here from Pi #2. Leave GPU slot open — add eGPU or GPU server as a separate fabric node when Nerfstudio is needed. This is slower per-task than Path A but every subsequent purchase adds capacity rather than complexity.
4
GPU node — triggered by Gaussian splats or UE5
When Nerfstudio or UE5+Cesium enters the pipeline: Path A already has it in the workstation. Path B adds a dedicated GPU server (2U or eGPU enclosure) as a new fabric node. RTX 4090 at current prices handles both Nerfstudio and UE5 adequately for single-site operations.
5
Multi-drone scale — Rack 2 activation
When 3+ drones operate simultaneously and Rack 1 compute saturates: Path B adds worker nodes to Rack 2, Kubernetes schedules across them automatically, Ceph replaces ZFS for distributed storage. Path A at this point requires a second machine and an ad-hoc cluster arrangement — this is where Path A's ceiling becomes concrete and Path B's design advantage becomes tangible.