Post-Quantum Mesh Cryptography for Drone Swarms: the Swarm SDK Design
The adversarial environment for drone communications is unlike most networked systems. Nodes move at 30 m/s. Connectivity is intermittent. A captured drone is a captured key. An adversary can record every packet today and decrypt it after a cryptographically relevant quantum computer exists. The Swarm SDK is a response to these constraints. This is how it's designed.
Why post-quantum now
Harvest-now-decrypt-later (HNDL) attacks are the primary threat driver. An adversary that records drone communications today — encrypted with classical X25519 or RSA — can archive the ciphertext and decrypt it when a cryptographically relevant quantum computer becomes available. Intelligence with a shelf life of 10+ years is particularly vulnerable; autonomous systems operating in contested environments routinely generate it.
The timeline pressure is real. NIST finalized FIPS 203 (ML-KEM, the algorithm formerly known as CRYSTALS-Kyber) in August 2024. The NSA's CNSA 2.0 suite requires ML-KEM-1024 for National Security Systems by January 2030. We ship ML-KEM-768 (NIST security category 3, comparable to AES-192) today, with a planned migration path to ML-KEM-1024. The DoD's quantum-ready roadmap mandates post-quantum in all new NSS acquisitions from 2025 onward.
The counter-argument — wait for standardization — is no longer available. Standards are final. The gap is implementation.
Key exchange: ML-KEM-768 + X25519 hybrid
The SDK uses a hybrid key exchange that concatenates the shared secrets from both algorithms and hashes them into a single session key. The design principle: as long as at least one of the two algorithms remains unbroken, the session key stays secure. This is a well-known conservative approach, and NIST guidance explicitly permits it for the transition period.

    # Hybrid key derivation (simplified)
    kem_ciphertext, kem_shared = ml_kem_768.encapsulate(peer_pk)
    dh_shared = x25519(my_sk, peer_pk_x25519)
    session_key = HKDF-SHA256(
        ikm = kem_shared || dh_shared,
        info = b"swarm-v1-session",
        len = 32
    )

ML-KEM-768 targets NIST security category 3 against both classical and quantum attackers. X25519 provides classical Diffie-Hellman security. An attacker must break both to recover the session key, and no known attack comes close. The total key exchange overhead is approximately 1,184 bytes on the wire (KEM ciphertext 1,088 bytes + X25519 public key 32 bytes + framing).
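The derivation can be sketched with only the Python standard library. This is a minimal illustration, not the SDK's code: `hkdf_sha256` is a hand-written RFC 5869 HKDF, the empty salt is an assumption (the article does not state one), and the two shared secrets are random placeholders standing in for real ML-KEM-768 and X25519 outputs.

```python
import hashlib
import hmac
import os


def hkdf_sha256(ikm: bytes, info: bytes, length: int, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output too long for HKDF-SHA256")
    # Extract: an empty salt is replaced by hash_len zero bytes, per RFC 5869.
    prk = hmac.new(salt or bytes(hash_len), ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(prk, T(i-1) || info || i)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_session_key(kem_shared: bytes, dh_shared: bytes) -> bytes:
    """Combine both shared secrets into one 32-byte session key."""
    return hkdf_sha256(ikm=kem_shared + dh_shared, info=b"swarm-v1-session", length=32)


# Placeholder secrets (assumption): real values come from ML-KEM-768 and X25519.
kem_shared = os.urandom(32)
dh_shared = os.urandom(32)
session_key = derive_session_key(kem_shared, dh_shared)
```

Because both secrets feed a single HKDF call, an attacker who learns only one of them still faces a key that depends on the other.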
Keys are generated on-device and never transmitted in cleartext. The SDK has no PKI dependency — peers exchange public keys out-of-band (via QR code, USB, or secure side-channel at mission planning time). Self-certifying DIDs anchor peer identity without a trusted third party.
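One way to picture a self-certifying identifier is as a hash of the public key itself, so the identifier can be checked against the key without asking a third party. The sketch below is illustrative only: the `did:swarm:` prefix, the SHA-256 plus base32 encoding, and both function names are hypothetical and are not the SDK's actual DID method.

```python
import base64
import hashlib
import hmac

DID_PREFIX = "did:swarm:"  # hypothetical prefix, for illustration only


def peer_identifier(public_key: bytes) -> str:
    """Derive an identifier from the key itself (hash, then base32)."""
    digest = hashlib.sha256(public_key).digest()
    fingerprint = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return DID_PREFIX + fingerprint


def key_matches_identifier(public_key: bytes, identifier: str) -> bool:
    """Check a key received out-of-band against the identifier we expected."""
    return hmac.compare_digest(peer_identifier(public_key), identifier)


# Placeholder 32-byte key (assumption): a real key comes from on-device keygen.
example_key = bytes(range(32))
example_id = peer_identifier(example_key)
```

A peer that presents a different key under the same identifier fails the check, which is what removes the need for a certificate authority in this model.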
Session security: Double Ratchet forward secrecy
The initial hybrid key exchange establishes a root key. From there the SDK runs the Double Ratchet algorithm (Signal protocol spec) to derive per-message keys. The ratchet has two components:
- Symmetric-key ratchet (KDF chain): Each message consumes a key derived from the chain and advances the chain state. Old keys are deleted immediately after use. Compromise of a single message key exposes only that message; prior messages are mathematically inaccessible.
- Diffie-Hellman ratchet: At each ratchet step, peers exchange new DH public keys and combine the result with the KDF chain. This provides post-compromise recovery: after a node is temporarily compromised and key material is extracted, future messages are secure once a new DH ratchet step completes.
In the drone context, post-compromise recovery is essential. A damaged drone may be recovered by an adversary. The ratchet ensures that once the node is neutralized (or re-keyed), the adversary's extracted keys cannot decrypt future mission traffic.
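The symmetric-key ratchet can be sketched in a few lines. This follows the HMAC-based construction suggested in the Signal Double Ratchet specification (message key from input byte 0x01, next chain key from input byte 0x02); whether the SDK uses exactly these constants is an assumption, and `kdf_ck` is an illustrative name.

```python
import hashlib
import hmac


def kdf_ck(chain_key: bytes) -> tuple[bytes, bytes]:
    """One symmetric ratchet step: returns (next_chain_key, message_key)."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


# Placeholder chain key (assumption): a real one is derived from the root key.
chain_key = bytes(32)
message_keys = []
for _ in range(3):
    chain_key, message_key = kdf_ck(chain_key)  # the old chain key is overwritten
    message_keys.append(message_key)
```

Because HMAC is one-way, holding a later chain key does not reveal earlier message keys, which is the property the first bullet relies on.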

    # Per-message key structure (Double Ratchet)
    sending_chain_key, message_key = KDF_CK(sending_chain_key)
    ciphertext = XSalsa20-Poly1305.encrypt(
        key = message_key,
        nonce = random_24_bytes(),
        msg = plaintext,
    )
    # message_key is zeroed from memory immediately after use

Message format and overhead
Every encrypted message shares the same on-wire format regardless of encryption mode. The overhead is fixed and predictable — critical for fitting inside MAVLink v2's 253-byte ENCAPSULATED_DATA payload.

    ┌──────────┬───────────────────────────────┐
    │  3 bytes │ Header (version, mode, flags) │
    │ 24 bytes │ XSalsa20 nonce                │
    │  N bytes │ Ciphertext                    │
    │ 16 bytes │ Poly1305 MAC                  │
    │ 64 bytes │ Ed25519 signature             │
    └──────────┴───────────────────────────────┘
    Total overhead: 107 bytes
    Max plaintext at 253-byte limit: 146 bytes
    22-byte telemetry → 129-byte encrypted message ✓

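The arithmetic in this layout can be checked with a short sketch. The field sizes come from the layout; the function names and the assumption that fields appear on the wire in the listed order are illustrative.

```python
HEADER_LEN = 3        # version, mode, flags
NONCE_LEN = 24        # XSalsa20 nonce
MAC_LEN = 16          # Poly1305 MAC
SIGNATURE_LEN = 64    # Ed25519 signature
MAVLINK_PAYLOAD_LIMIT = 253

OVERHEAD = HEADER_LEN + NONCE_LEN + MAC_LEN + SIGNATURE_LEN


def encrypted_size(plaintext_len: int) -> int:
    """On-wire size of a message carrying plaintext_len bytes."""
    return plaintext_len + OVERHEAD


def max_plaintext(limit: int = MAVLINK_PAYLOAD_LIMIT) -> int:
    """Largest plaintext that still fits in one payload of `limit` bytes."""
    return limit - OVERHEAD


def split_frame(frame: bytes) -> dict:
    """Slice a frame into its fields, assuming the listed field order."""
    if len(frame) < OVERHEAD:
        raise ValueError("frame shorter than the fixed overhead")
    body_end = len(frame) - MAC_LEN - SIGNATURE_LEN
    return {
        "header": frame[:HEADER_LEN],
        "nonce": frame[HEADER_LEN:HEADER_LEN + NONCE_LEN],
        "ciphertext": frame[HEADER_LEN + NONCE_LEN:body_end],
        "mac": frame[body_end:body_end + MAC_LEN],
        "signature": frame[body_end + MAC_LEN:],
    }


print(OVERHEAD, max_plaintext(), encrypted_size(22))  # 107 146 129
```

A 22-byte telemetry payload therefore occupies 22 + 107 = 129 bytes, leaving 124 bytes of the 253-byte payload unused.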
The Ed25519 signature authenticates the sender. For deniable operations, the signature can be replaced with HMAC-SHA256 — both parties can produce the same MAC, which provides plausible deniability without sacrificing integrity. The mode bit in the header signals which authentication is in use.
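The deniability property follows from the MAC key being shared: anything one side can authenticate, the other side could have produced too. A minimal sketch with the standard library, where `auth_key` and the function names are illustrative and the way the SDK derives the MAC key is not specified in this article:

```python
import hashlib
import hmac
import os


def mac_tag(auth_key: bytes, frame_without_tag: bytes) -> bytes:
    """HMAC-SHA256 over the frame contents; returns a 32-byte tag."""
    return hmac.new(auth_key, frame_without_tag, hashlib.sha256).digest()


def verify_tag(auth_key: bytes, frame_without_tag: bytes, tag: bytes) -> bool:
    """Constant-time comparison against a freshly computed tag."""
    return hmac.compare_digest(mac_tag(auth_key, frame_without_tag), tag)


auth_key = os.urandom(32)  # placeholder for a key both peers already share
frame = b"\x01\x02\x00" + b"example ciphertext"

tag_from_sender = mac_tag(auth_key, frame)
tag_receiver_could_forge = mac_tag(auth_key, frame)  # identical by construction
```

Integrity still holds against outsiders, since a tag cannot be produced or verified without `auth_key`. Note that an HMAC-SHA256 tag is 32 bytes; how the SDK fills the 64-byte signature field in this mode is not stated here.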
Mesh routing: gossip with bounded fanout
A mesh of N drones cannot use a hub-and-spoke model — there is no persistent ground station in contested airspace. The SDK implements gossip-based routing where each node forwards messages to a random subset of peers (fanout = 3, TTL = 5).
The trade-offs are deliberate: fanout 3 and TTL 5 give up to 3⁵ = 243 potential delivery paths before a message is dropped, which covers swarms up to ~50 nodes with high reliability. In our testing across simulated contested environments (20% packet loss, 2-node failures), message delivery remains above 99.1% for swarms of 12 nodes.
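As a rough check on those numbers, the sketch below counts the worst-case fan-out when no duplicates are suppressed. It assumes every node has at least three peers and that each of the five hops forwards to three of them; these are simplifying assumptions, not measurements.

```python
FANOUT = 3
TTL = 5

# Distinct forwarding paths of full length (one choice of peer per hop).
paths = FANOUT ** TTL

# Transmissions at each hop if every copy were forwarded again.
transmissions_per_hop = [FANOUT ** hop for hop in range(1, TTL + 1)]
max_transmissions = sum(transmissions_per_hop)

print(paths)                  # 243
print(transmissions_per_hop)  # [3, 9, 27, 81, 243]
print(max_transmissions)      # 363
```

In practice duplicate suppression keeps traffic far below this bound, because a node that has already seen a message drops it instead of forwarding it again.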

    # Message routing decision per received packet
    if msg_id in seen_window:  # 50,000-msg dedup window
        return  # drop duplicate
    seen_window.add(msg_id)
    deliver_to_application(msg)
    if msg.ttl > 0:
        peers = random.sample(peer_table, min(3, len(peer_table)))
        for peer in peers:
            forward(msg.with_ttl(msg.ttl - 1), peer)

Quarantined peers (flagged by the Situational Awareness module as exhibiting Sybil behavior, RSSI anomalies, or authentication failures) are excluded from the gossip peer table automatically. The mesh re-routes around them without manual intervention. Quarantine release uses exponential backoff with jitter to prevent simultaneous reintegration of multiple flagged nodes.
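A runnable version of that routing decision might look like the sketch below. The class and function names, the `OrderedDict`-based dedup window, and the backoff constants are assumptions chosen for illustration; the article only fixes fanout 3, TTL 5, and a 50,000-message window.

```python
import random
from collections import OrderedDict
from dataclasses import dataclass, replace

FANOUT = 3
DEDUP_WINDOW = 50_000


@dataclass(frozen=True)
class Message:
    msg_id: bytes
    ttl: int
    payload: bytes


class GossipRouter:
    def __init__(self, peers, rng=None):
        self.peers = set(peers)
        self.quarantined = set()
        self.seen = OrderedDict()   # insertion-ordered dedup window
        self.rng = rng or random.Random()
        self.delivered = []         # stands in for the application callback
        self.outbox = []            # (peer, message) pairs handed to the radio

    def handle(self, msg: Message) -> None:
        if msg.msg_id in self.seen:
            return                               # drop duplicate
        self.seen[msg.msg_id] = None
        if len(self.seen) > DEDUP_WINDOW:
            self.seen.popitem(last=False)        # evict the oldest entry
        self.delivered.append(msg)
        if msg.ttl <= 0:
            return
        # Quarantined peers never enter the candidate list.
        eligible = sorted(self.peers - self.quarantined)
        for peer in self.rng.sample(eligible, min(FANOUT, len(eligible))):
            self.outbox.append((peer, replace(msg, ttl=msg.ttl - 1)))


def quarantine_release_delay(attempt, rng, base_s=30.0, cap_s=3600.0):
    """Exponential backoff with full jitter (constants are illustrative)."""
    return rng.uniform(0.0, min(cap_s, base_s * 2 ** attempt))


router = GossipRouter(["a", "b", "c", "d", "e"], rng=random.Random(7))
router.quarantined.add("e")
router.handle(Message(b"id-1", ttl=5, payload=b"telemetry"))
router.handle(Message(b"id-1", ttl=5, payload=b"telemetry"))  # duplicate, ignored
```

Drawing the release delay from a random interval that doubles with each attempt spreads reintegration attempts out, so several flagged nodes are unlikely to rejoin at the same moment.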
Group encryption: Sender Keys
Broadcasting to N peers with Double Ratchet requires N separate encryption operations — O(N) computation and O(N) ciphertext copies. For large swarms broadcasting telemetry at high frequency, that's prohibitive. The Sender Keys scheme solves it:
- Each sender has a symmetric chain key distributed to authorized peers at session setup.
- Broadcast messages are encrypted once with a key derived from the chain.
- Each recipient independently advances their copy of the sender's chain.
- The chain is ratcheted forward with each message — forward secrecy is preserved.
The result is O(1) encryption for broadcast regardless of swarm size, with the same forward secrecy guarantees as the Double Ratchet's symmetric-key ratchet. The cost is on the revocation side: the chain has no Diffie-Hellman step, so removing a compromised node from the group requires distributing new Sender Keys to all remaining members. This is an intentional design choice that makes revocation explicit and auditable rather than implicit.
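The four steps above can be sketched as a single chain object that the sender and every authorized recipient each hold a copy of. The HMAC constants mirror the Signal-style chain KDF and are an assumption here, as are the class name and the 11-recipient example; the actual encryption step is omitted.

```python
import hashlib
import hmac
import os


class SenderChain:
    """One sender's symmetric chain; every authorized peer holds a copy."""

    def __init__(self, chain_key: bytes):
        self._chain_key = chain_key
        self.index = 0

    def next_message_key(self) -> bytes:
        """Derive the key for the next broadcast and ratchet the chain forward."""
        message_key = hmac.new(self._chain_key, b"\x01", hashlib.sha256).digest()
        self._chain_key = hmac.new(self._chain_key, b"\x02", hashlib.sha256).digest()
        self.index += 1
        return message_key


# Placeholder chain key (assumption): distributed to peers at session setup.
initial_chain_key = os.urandom(32)
sender = SenderChain(initial_chain_key)
recipients = [SenderChain(initial_chain_key) for _ in range(11)]

# One derivation on the sender side, however many recipients there are.
broadcast_key = sender.next_message_key()
# Each recipient advances its own copy and arrives at the same key.
recipient_keys = [r.next_message_key() for r in recipients]
```

A real implementation would also need to carry the chain index in each message so that a recipient who missed a broadcast can advance its copy to the right position.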
CNSA 2.0 compliance mapping
CNSA 2.0 specifies algorithm requirements for National Security Systems. The Swarm SDK maps to these requirements as follows:
- Key establishment: ML-KEM-768 + X25519 hybrid ✓
- Digital signatures: Ed25519 (transition) / migration path to ML-DSA
- Symmetric encryption: XSalsa20-Poly1305 (256-bit key) ✓
- Key derivation: HKDF-SHA256 ✓
- Message integrity: Poly1305 MAC (per message) ✓
- Forward secrecy: Double Ratchet (per message) ✓
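The main gap is signatures: CNSA 2.0 specifies ML-DSA (FIPS 204) rather than Ed25519. ML-DSA-65 signatures are 3,309 bytes, larger than the entire MAVLink payload. The current design uses Ed25519 with a documented migration path to in-protocol signature algorithm negotiation in a future version, allowing upgrades without breaking the wire format. Note also that the ✓ marks above denote the SDK's current selection in each category. The CNSA 2.0 suite itself names ML-KEM-1024, AES-256, and SHA-384 or SHA-512, so a strict CNSA 2.0 profile would also need the ML-KEM-1024 migration described earlier and a review of the symmetric and hash choices.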
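To make the size problem concrete, the sketch below counts how many 253-byte payloads a 22-byte telemetry message would need with each signature scheme. It reuses the fixed field sizes from the message format and ignores any per-fragment headers, which is a simplifying assumption; `frames_needed` is an illustrative name.

```python
import math

MAVLINK_PAYLOAD_LIMIT = 253
FIXED_FIELDS = 3 + 24 + 16   # header + nonce + Poly1305 MAC
ED25519_SIG = 64
ML_DSA_65_SIG = 3309


def frames_needed(plaintext_len: int, signature_len: int) -> int:
    """Payloads required for one message, ignoring fragmentation headers."""
    total = plaintext_len + FIXED_FIELDS + signature_len
    return math.ceil(total / MAVLINK_PAYLOAD_LIMIT)


print(frames_needed(22, ED25519_SIG))     # 1
print(frames_needed(22, ML_DSA_65_SIG))   # 14
```

Under these assumptions a per-message ML-DSA-65 signature turns every single-frame telemetry message into a 14-frame transfer.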
Performance: encryption at altitude
On a Raspberry Pi 4 (representative of a mid-tier flight controller companion computer), measured encryption throughput:
- Standard encrypt: ~210,000 msg/sec
- Double Ratchet: ~95,000 msg/sec
- ML-KEM-768 keygen: ~1,400 ops/sec
- ML-KEM-768 encap/decap: ~1,200 ops/sec
- X25519 DH: ~22,000 ops/sec
- Ed25519 sign/verify: ~18,000 / ~7,000 ops/sec
At 20 Hz telemetry (typical MAVLink rate), 95,000 Double Ratchet operations per second is approximately 4,750× headroom on a single core. Cryptographic overhead is not the bottleneck; radio bandwidth and MAVLink scheduling are. Each ML-KEM operation in the initial key exchange takes under a millisecond (about 0.7 ms for key generation at ~1,400 ops/sec, about 0.8 ms for encapsulation or decapsulation at ~1,200 ops/sec), which is negligible for session establishment that happens at most once per mission.
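The headroom and latency figures follow directly from the measured throughput. A short worked calculation, using only the numbers quoted above (the dictionary keys are illustrative labels):

```python
TELEMETRY_HZ = 20

# Measured operations per second on a Raspberry Pi 4, from the list above.
throughput = {
    "double_ratchet": 95_000,
    "ml_kem_keygen": 1_400,
    "ml_kem_encap_decap": 1_200,
}

# How many times over a single core could serve one 20 Hz telemetry stream.
headroom = throughput["double_ratchet"] / TELEMETRY_HZ

# Per-operation latency in milliseconds is the reciprocal of throughput.
latency_ms = {name: 1000 / ops for name, ops in throughput.items()}

print(headroom)                                     # 4750.0
print(round(latency_ms["ml_kem_keygen"], 2))        # 0.71
print(round(latency_ms["ml_kem_encap_decap"], 2))   # 0.83
```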
Open problems
Three problems we haven't fully solved:
- Group key agreement at scale. Sender Keys solve broadcast encryption but don't provide authenticated group key agreement. Adding a new drone to an active mission requires distributing Sender Keys from each existing member — O(N) messages. At 100+ nodes this is manageable; at 1,000+ nodes it needs a tree-based CGKA approach (MLS, ART). We're watching the MLS RFC progress and prototyping.
- Offline pre-key distribution. The current design requires a brief connectivity window at mission start for key exchange. True air-gap operation (no ground contact once airborne) requires pre-distributing one-time prekeys at mission planning time, with strict bounds on how many sessions a prekey bundle can support.
- ML-DSA signature size. The CNSA 2.0 path to post-quantum signatures (ML-DSA-65, FIPS 204) produces 3,309-byte signatures — 13× the MAVLink payload limit. Until compressed or batch-signature schemes are standardized, the signature layer will remain a hybrid or deferred problem.
For how encrypted SDK messages are fragmented into 253-byte MAVLink v2 frames, reassembled, and integrated with PX4 and ArduPilot flight stacks: Swarm SDK MAVLink v2 integration: encrypting mesh messages inside 253-byte drone protocol frames →
For a deep dive into the Double Ratchet implementation — the encapsulation ratchet, header encryption, out-of-order key cache, and MAVLink v2 framing: The Swarm SDK double ratchet: forward secrecy and post-compromise security →
For how device cryptographic identity is provisioned, rotated, and revoked across a drone fleet: Swarm SDK key management: device provisioning, certificate rotation, and revocation →
For what shipped in v0.4 — Situational Awareness API, EW Coordination, Adversarial Resilience, and RF Fingerprinting: Swarm SDK v0.4: situational awareness, EW coordination, and adversarial resilience →
For how a new drone goes from factory state to trusted mesh participant — factory-provisioned keypairs, Fleet CA certificate signing, USB and RF enrollment paths, and gossip mesh announcement: Swarm SDK device enrollment: how a new drone joins an authenticated fleet mesh →
For how the SDK resists traffic analysis from the ground up — message size normalization across six fixed bins, STM32H7 TRNG-driven jitter scheduling, store-and-forward under packet loss, and OperationalMode degraded-channel behavior: Swarm SDK operational security: traffic analysis resistance, message size normalization, and timing jitter →
See also: Swarm SDK overview, FAQ, and release notes →
Questions or access requests: info@ai-analytics.org