Swarm SDK prekey bundle management: generating, distributing, and consuming OneTimePreKeys across a drone fleet
X3DH's security guarantee rests on a deceptively simple invariant: each OneTimePreKey is consumed by exactly one session and then destroyed. Enforcing that invariant in a Signal server deployment is straightforward — the server hands out keys and marks them used. In a drone swarm, there is no server. Keys must be generated on-device, distributed across a contested radio mesh before takeoff, and consumed with consistency guarantees that survive link outages, simultaneous initiations, and prolonged disconnection. This article traces every step of that lifecycle.
Why prekey management is hard at the edge
The Signal Protocol's original X3DH design assumes a key server: a phone fetches a peer's prekey bundle over HTTPS, the server atomically marks the consumed OneTimePreKey (OTP) as used, and the phone establishes a session. The server absorbs all the hard consistency problems — concurrent consumption, exhaustion detection, replenishment — because it sits on a reliable, always-connected infrastructure.
Drones operate in contested RF environments where connectivity to any central server is intermittent at best and absent at worst. Jamming, terrain masking, and the geometry of multi-rotor formation flight routinely partition the mesh into islands for tens of seconds at a time. The Swarm SDK's prekey management layer must therefore solve problems that the Signal Protocol delegates to infrastructure:
- Pre-distribution before mission start. All prekey bundles must be in every peer's cache before takeoff. There is no opportunity to fetch a missing bundle mid-mission over a reliable channel.
- Concurrent consumption without a coordinator. Two drones that simultaneously observe the same OTP in a peer's cached bundle and both try to use it will create a collision. The protocol must handle this without a serialising lock.
- Graceful exhaustion handling. When a drone runs out of OTPs — because it has been popular, or because the mesh was partitioned during replenishment — sessions must still be establishable, with a clear security downgrade signal to the operator.
- Bounded memory on microcontrollers. The STM32H7's battery-backed SRAM is 4 KB. Key storage layout must be designed to the byte.
Key types and the X3DH prekey bundle
The prekey bundle published by each drone contains three categories of key material, plus a device certificate anchoring the bundle to the fleet CA:
```rust
pub struct PreKeyBundle {
    /// Long-term identity public key (ML-KEM-768 + X25519 hybrid)
    pub identity_key: IdentityPublicKey,
    /// Medium-term signed prekey (7-day rotation)
    pub signed_prekey: SignedPreKey,
    /// One-time prekeys: consumed one per session initiation
    pub one_time_prekeys: Vec<OneTimePreKey>,
    /// Fleet CA certificate verifying this bundle
    pub device_cert: DeviceCertificate,
}

pub struct SignedPreKey {
    pub prekey_id: u32,
    pub public_key: X25519PublicKey,
    pub signature: Ed25519Signature, // signed by identity key
    pub created_at: u64,             // Unix timestamp
}

pub struct OneTimePreKey {
    pub prekey_id: u32,
    pub public_key: X25519PublicKey,
    // No per-OTP signature: integrity is covered by the bundle-level signature
}
```

The SignedPreKey (SPK) is signed by the identity key so that any peer can verify the SPK was not substituted in transit. OneTimePreKeys carry no individual signatures; the integrity of the entire OTP list is covered by the bundle-level Ed25519Signature in the gossip announcement (described below). This keeps per-key wire size minimal: a 32-byte X25519 public key plus a 4-byte ID, 36 bytes per OTP before framing.
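The 36-byte figure can be made concrete with a minimal encoding sketch. The function names and the big-endian ID layout are assumptions made for illustration; the article does not specify the SDK's actual wire format.

```rust
/// Size of one OTP on the wire: 4-byte ID + 32-byte X25519 public key.
pub const OTP_WIRE_LEN: usize = 4 + 32;

/// Encode one OTP as `prekey_id || public_key`.
/// Big-endian ID ordering is an assumption, not the SDK's documented format.
pub fn encode_otp(prekey_id: u32, public_key: &[u8; 32]) -> [u8; OTP_WIRE_LEN] {
    let mut out = [0u8; OTP_WIRE_LEN];
    out[..4].copy_from_slice(&prekey_id.to_be_bytes());
    out[4..].copy_from_slice(public_key);
    out
}

/// Decode the (ID, public key) pair back out of a 36-byte record.
pub fn decode_otp(bytes: &[u8; OTP_WIRE_LEN]) -> (u32, [u8; 32]) {
    let mut id = [0u8; 4];
    id.copy_from_slice(&bytes[..4]);
    let mut key = [0u8; 32];
    key.copy_from_slice(&bytes[4..]);
    (u32::from_be_bytes(id), key)
}
```

Under this layout a batch of 100 OTPs contributes 3,600 bytes to a bundle before framing.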
The identity key is a hybrid ML-KEM-768 + X25519 construction, consistent with the post-quantum design documented in the post-quantum drone mesh article. For prekey management purposes, what matters is that the identity key is long-lived (device lifetime), the SPK is medium-lived (7 days), and OTPs are single-use.
OTP generation on the STM32H7 TRNG peripheral
X25519 scalar generation requires 32 bytes of cryptographically secure randomness. On the STM32H7, this comes from the on-die TRNG (True Random Number Generator) peripheral, which delivers entropy conditioned through a CSPRNG. The TRNG provides 32 bits every 40 clock cycles; at 240 MHz, that is one 32-bit word roughly every 167 ns. Collecting the 8 words needed for a 32-byte X25519 scalar therefore takes approximately 1.33 μs (8 × 167 ns), ignoring peripheral read latency.
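The timing estimate is a single ratio of cycles to clock rate. The helper below is a hypothetical sketch for reproducing it with the article's figures (40 cycles per word, 240 MHz); it models only the TRNG output rate, not register reads or key derivation.

```rust
/// Nanoseconds needed to collect `words` 32-bit words from a TRNG that
/// yields one word every `cycles_per_word` cycles at `clock_hz`.
/// Peripheral read latency is ignored, as in the text.
pub fn trng_collect_ns(words: u64, cycles_per_word: u64, clock_hz: u64) -> f64 {
    let total_cycles = (words * cycles_per_word) as f64;
    total_cycles * 1e9 / clock_hz as f64
}
```

Calling `trng_collect_ns(8, 40, 240_000_000)` gives the per-scalar entropy cost; multiplying `words` by the batch size gives the cost for a whole batch.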
OTP keypairs are generated in batches of 100 at provisioning time. Entropy collection for the full batch takes roughly 0.13 ms (100 scalars × 8 words × 167 ns), a negligible one-time cost during the ground station phase. The private scalars are written immediately into STM32H7 backup SRAM (BKPSRAM), the 4 KB battery-backed region that survives power cycles as long as VBAT is present. Each slot occupies exactly 64 bytes:

```rust
/// Layout of a single OTP private key record in BKPSRAM.
/// Total: 4 + 32 + 28 = 64 bytes, 100 records = 6.4 KB.
/// BKPSRAM is 4 KB so only the first 62 records fit;
/// the remaining 38 overflow into SRAM1 with a persistence note in OtpKeyStore.
#[repr(C, packed)]
struct OtpPrivateRecord {
    prekey_id: u32,           // 4 bytes
    private_scalar: [u8; 32], // X25519 private key
    _pad: [u8; 28],           // reserved, zeroed
}
```

The 6.4 KB required for 100 OTPs exceeds the 4 KB BKPSRAM, so the first 62 records fit in BKPSRAM and the remainder spill into SRAM1. SRAM1 contents are lost on power cycle; the OTP store tracks which records are in BKPSRAM versus SRAM1 and re-announces lost OTPs as consumed after reboot to prevent reuse of keys whose private scalars can no longer be accessed.
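As a rough sketch of that split, the snippet below checks the record size at compile time and maps a record index to its storage region. `RecordLocation` and `location_of` are illustrative names, and the 62-record cut-off is taken directly from the text rather than derived.

```rust
use core::mem::size_of;

/// Mirrors the record layout described in the text.
#[repr(C, packed)]
#[allow(dead_code)]
pub struct OtpPrivateRecord {
    prekey_id: u32,
    private_scalar: [u8; 32],
    _pad: [u8; 28],
}

/// Size of one record; the constant assertion fails the build if the
/// layout ever drifts away from 64 bytes.
pub const RECORD_SIZE: usize = size_of::<OtpPrivateRecord>();
const _: () = assert!(RECORD_SIZE == 64);

/// Number of OTP records the article places in BKPSRAM.
pub const BKPSRAM_RECORDS: usize = 62;

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum RecordLocation {
    /// Battery-backed; survives a power cycle while VBAT is present.
    Bkpsram,
    /// Volatile; lost on reboot.
    Sram1,
}

/// Map a record index (0-based, in generation order) to its storage region.
pub fn location_of(index: usize) -> RecordLocation {
    if index < BKPSRAM_RECORDS {
        RecordLocation::Bkpsram
    } else {
        RecordLocation::Sram1
    }
}
```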
OTP distribution via the gossip mesh
At fleet assembly (before takeoff, while the ground station has high-bandwidth connectivity to all drones), each drone broadcasts a PreKeyBundleAnnouncement gossip message containing its full prekey bundle. Every peer that receives the announcement stores the bundle in a peer_prekey_store:

```rust
pub struct PreKeyBundleAnnouncement {
    pub device_id: DeviceId,
    pub bundle: PreKeyBundle,
    pub bundle_signature: Ed25519Signature, // over canonical bundle bytes
    pub announced_at: u64,
}

// Per-drone state
struct SwarmNode {
    peer_prekey_store: HashMap<DeviceId, PreKeyBundle>,
    // ...
}
```

The announcement is gossip-flooded with TTL = 7 hops, which is sufficient to reach every node in a 128-drone swarm at k = 3 fanout. Deduplication uses the tuple (device_id, signed_prekey.prekey_id): a bundle for the same device with the same SPK ID is not re-flooded, preventing the O(N²) amplification that would otherwise result from 128 drones each announcing simultaneously. When the SPK rotates (every 7 days), the new prekey_id breaks the deduplication key and the updated bundle propagates normally.
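The deduplication rule can be sketched with a set keyed on the same tuple. `AnnouncementDedup`, the `u64` device ID, and the exact TTL-decrement convention are assumptions for illustration, not the SDK's actual types.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the SDK's DeviceId type.
pub type DeviceId = u64;

/// Tracks which (device, SPK id) announcements this node has already seen.
#[derive(Default)]
pub struct AnnouncementDedup {
    seen: HashSet<(DeviceId, u32)>,
}

impl AnnouncementDedup {
    /// Returns the TTL to forward with, or `None` if the announcement is
    /// a duplicate or its hop budget is exhausted.
    pub fn on_announcement(&mut self, device: DeviceId, spk_id: u32, ttl: u8) -> Option<u8> {
        // `insert` returns false when the key was already present.
        if !self.seen.insert((device, spk_id)) {
            return None;
        }
        // Forward with one fewer hop; a remaining TTL of 0 stops the flood.
        ttl.checked_sub(1).filter(|&remaining| remaining > 0)
    }
}
```

A rotated SPK arrives with a new prekey_id, so its tuple is absent from the set and the announcement is forwarded again.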
Before storing a received bundle, the node verifies the certificate chain: the device certificate must be signed by the fleet CA, the SPK signature must be valid under the identity key in the certificate, and the bundle-level signature in bundle_signature must cover the canonical serialisation of the entire bundle. A bundle that fails any of these checks is silently discarded and the gossip flood is not forwarded — malformed announcements are contained at the first hop that catches them.
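The fail-closed ordering of those checks might look like the sketch below. The cryptographic verification itself is replaced by booleans, and every name here is hypothetical; the point is only that the first failed check ends processing and nothing is stored or forwarded.

```rust
/// Reasons a received bundle is discarded (illustrative names).
#[derive(Debug, PartialEq, Eq)]
pub enum BundleRejection {
    CertNotSignedByFleetCa,
    InvalidSpkSignature,
    InvalidBundleSignature,
}

/// Outcome of the three cryptographic checks. A real node would compute
/// these with certificate and signature verification; booleans stand in here.
pub struct BundleChecks {
    pub cert_signed_by_fleet_ca: bool,
    pub spk_signature_valid: bool,
    pub bundle_signature_valid: bool,
}

/// Apply the checks in the order the text lists them, failing closed.
/// `Ok(())` means store the bundle and keep flooding; `Err` means
/// discard it and do not forward.
pub fn accept_bundle(checks: &BundleChecks) -> Result<(), BundleRejection> {
    if !checks.cert_signed_by_fleet_ca {
        return Err(BundleRejection::CertNotSignedByFleetCa);
    }
    if !checks.spk_signature_valid {
        return Err(BundleRejection::InvalidSpkSignature);
    }
    if !checks.bundle_signature_valid {
        return Err(BundleRejection::InvalidBundleSignature);
    }
    Ok(())
}
```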
OTP consumption tracking
When a drone consumes one of its own OTPs (because a peer used it to initiate an X3DH session), it learns about the consumption through an OtpConsumed gossip message sent by the initiator. The OTP key store tracks three states for each key:

```rust
pub struct OtpKeyStore {
    /// All generated OTPs indexed by prekey_id
    available: BTreeMap<u32, OneTimePreKeyPrivate>,
    /// Consumed OTP IDs (received via OtpConsumed gossip)
    consumed: HashSet<u32>,
    /// Pending consumption acknowledgments
    pending_ack: HashMap<u32, Instant>, // prekey_id → sent_at
}
```

The consumed set serves as a replay guard: if an attacker or a duplicate gossip message attempts to re-announce a bundle with a prekey_id that already appears in consumed, the OTP is not reinstated into available. The private scalar in BKPSRAM is zeroized immediately when an OTP moves from available to consumed, so the key cannot be recovered even if the device is physically seized.
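A simplified model of that state transition is sketched below. It is not the SDK's implementation: scalars are held inline instead of in BKPSRAM, pending_ack is omitted, and `OtpKeyStoreSketch` and its methods are hypothetical names.

```rust
use std::collections::{BTreeMap, HashSet};

/// Simplified stand-in for the key store: private scalars are stored
/// inline and the pending-acknowledgment map is left out.
#[derive(Default)]
pub struct OtpKeyStoreSketch {
    available: BTreeMap<u32, [u8; 32]>,
    consumed: HashSet<u32>,
}

impl OtpKeyStoreSketch {
    /// Add an OTP. Returns false (and stores nothing) if the ID was
    /// already consumed; this is the replay guard.
    pub fn add(&mut self, prekey_id: u32, scalar: [u8; 32]) -> bool {
        if self.consumed.contains(&prekey_id) {
            return false;
        }
        self.available.insert(prekey_id, scalar);
        true
    }

    /// Handle an OtpConsumed message. Returns true only for the first
    /// consumption of this ID.
    pub fn mark_consumed(&mut self, prekey_id: u32) -> bool {
        let first_use = if let Some(scalar) = self.available.get_mut(&prekey_id) {
            // Overwrite in place before removal. Production code would use
            // volatile writes so the compiler cannot elide the wipe.
            scalar.fill(0);
            self.available.remove(&prekey_id);
            true
        } else {
            false
        };
        self.consumed.insert(prekey_id);
        first_use
    }

    pub fn available_count(&self) -> usize {
        self.available.len()
    }
}
```

A second `mark_consumed` call for the same ID returns false, which is the signal a caller could use to account for the corresponding session as a no-OTP session.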
OTP consumption is not instantaneous: there is a window between an initiator choosing prekey_id 42 from a cached bundle and the OtpConsumed gossip message arriving at the key owner. A second initiator that observes the same cached bundle before the OtpConsumed propagates could also choose prekey_id 42, creating a double-consumption. The Swarm SDK handles this with optimistic consumption: the initiator immediately marks the OTP as consumed on the owner's behalf via gossip, and the owner removes it from available when the message arrives. Both X3DH sessions that used prekey_id 42 are technically valid, since both initiators derived a shared secret using the same OTP public key, but the second session loses the forward-secrecy property of OTP use because the owner's private scalar was already consumed and is now zeroized. The second initiator's session is treated as a no-OTP session at the security accounting layer.
In practice, double-consumption is rare. Each initiator picks the lowest-numbered available OTP from the last-seen bundle, and bundles are refreshed frequently enough that by the time two drones are simultaneously initiating sessions to the same peer, they are likely looking at different OTP ranges. The pending_ack map lets the owner detect OtpConsumed messages that never arrived: after a configurable timeout, the affected OTP is proactively marked consumed and the private scalar is zeroized, trading a potential double-use for safety.
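The initiator side of optimistic consumption can be sketched as follows. `CachedOtpIds` is an illustrative type that tracks only the unconsumed IDs in one peer's cached bundle; public keys and the gossip transport are left out.

```rust
use std::collections::BTreeSet;

/// An initiator's cached view of one peer's unconsumed OTP IDs.
#[derive(Default)]
pub struct CachedOtpIds {
    unconsumed: BTreeSet<u32>,
}

impl CachedOtpIds {
    pub fn from_ids(ids: Vec<u32>) -> Self {
        Self { unconsumed: ids.into_iter().collect() }
    }

    /// Pick the lowest-numbered OTP and retire it locally right away.
    /// The caller would then gossip an OtpConsumed message for the ID.
    /// `None` means the cache is exhausted and the no-OTP path applies.
    pub fn take_lowest(&mut self) -> Option<u32> {
        let id = self.unconsumed.iter().next().copied()?;
        self.unconsumed.remove(&id);
        Some(id)
    }

    /// Apply an OtpConsumed message heard from another initiator.
    pub fn on_consumed_gossip(&mut self, prekey_id: u32) {
        self.unconsumed.remove(&prekey_id);
    }
}
```

If a second initiator hears the OtpConsumed gossip for ID 42 before it initiates, its next `take_lowest` skips to 43; if it does not, both initiators pick 42 and the double-consumption case applies.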
OTP exhaustion handling
When the count of available OTPs drops below the refresh_threshold (default: 10), the drone generates a new batch and announces a PartialBundleUpdate (only the new OTPs, not the full bundle) to avoid re-flooding 3 KB of data for a minor replenishment. Peers merge the new OTPs into their cached bundle for that device.
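A minimal sketch of the threshold check and the peer-side merge, assuming cached OTPs are held as a map from prekey_id to public key. The function names and the rule that existing IDs are left untouched are illustrative choices, not documented SDK behaviour.

```rust
use std::collections::BTreeMap;

/// Default from the text: replenish when fewer than 10 OTPs remain.
pub const DEFAULT_REFRESH_THRESHOLD: usize = 10;

/// True when the owner should generate and announce a new batch.
pub fn needs_refresh(available: usize, threshold: usize) -> bool {
    available < threshold
}

/// Merge the OTPs from a PartialBundleUpdate into a peer's cached OTP map.
/// IDs already present are left untouched. Returns the number of keys added.
pub fn merge_partial_update(
    cached: &mut BTreeMap<u32, [u8; 32]>,
    update: &[(u32, [u8; 32])],
) -> usize {
    let mut added = 0;
    for (id, public_key) in update {
        if !cached.contains_key(id) {
            cached.insert(*id, *public_key);
            added += 1;
        }
    }
    added
}
```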
Complete exhaustion is an edge case that occurs in prolonged RF disconnection or when a drone becomes extremely popular as a session target during a mission. When a session initiator finds that its cached bundle for the target device has no available OTPs, the X3DH computation falls back to a 3-DH variant: DH1, DH2, and DH3 are computed normally, but DH4 — the OTP contribution — is omitted:
```rust
pub enum SessionInitResult {
    Success { session: Session, used_otp: bool },
    NoOtpAvailable { session: Session }, // degraded; log this
    BundleNotFound,
    CertificateExpired,
}
```

The security consequence of NoOtpAvailable is specific and bounded: an attacker who later compromises the target drone's SignedPreKey private key can decrypt that particular session's initial key material. The compromise window is the SPK's 7-day rotation period. Subsequent sessions, established after OTPs have been replenished and the mesh has recovered, resume full X3DH security. The degraded session is flagged to the fleet management layer so the operator can investigate the RF disruption that caused exhaustion, and so post-mission forensics can identify which sessions lack OTP-level forward secrecy.
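The difference between the full and degraded computations shows up in the key-derivation input. The sketch below only concatenates DH outputs in the order the X3DH specification uses (DH1, DH2, DH3, then DH4 when present); the DH operations, the KDF, any domain-separation prefix, and the hybrid ML-KEM contribution are out of scope, and `kdf_input` is a hypothetical helper.

```rust
/// Concatenate the DH outputs that feed the session KDF.
/// With an OTP the input is DH1 || DH2 || DH3 || DH4 (128 bytes);
/// without one, DH4 is omitted (96 bytes).
pub fn kdf_input(
    dh1: &[u8; 32],
    dh2: &[u8; 32],
    dh3: &[u8; 32],
    dh4: Option<&[u8; 32]>,
) -> Vec<u8> {
    let mut ikm = Vec::with_capacity(128);
    ikm.extend_from_slice(dh1);
    ikm.extend_from_slice(dh2);
    ikm.extend_from_slice(dh3);
    // The OTP contribution is appended only when an OTP was available.
    if let Some(otp_dh) = dh4 {
        ikm.extend_from_slice(otp_dh);
    }
    ikm
}
```

Passing `None` for `dh4` corresponds to the NoOtpAvailable path.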
SignedPreKey rotation
The SPK is rotated every 7 days to limit the window of exposure if a medium-term key is compromised. Rotation is coordinated via the gossip mesh without any central scheduler: each drone tracks its own created_at timestamp and self-triggers rotation when the current time exceeds created_at + 7 * 24 * 3600 seconds.
When rotation fires, the drone generates a new X25519 keypair, signs the new SPK with its identity key (incrementing prekey_id), and gossip-floods a SignedPreKeyRotation message. Peers that receive it update their cached bundle's signed_prekey field in place. The old SPK private key is retained in a separate previous_spk slot for 24 hours post-rotation to handle in-flight session-init messages that referenced the previous prekey_id — a drone that was mid-flight and isolated when rotation occurred might relay an X3DH init that uses the old SPK. After the 24-hour grace window, the previous SPK private key is zeroized:
```rust
fn zeroize_spk_bkpsram(slot: BkpsramSlot) {
    // Write 0xFF to the BKPSRAM region before marking as freed.
    // The STM32H7 reference manual (RM0433) specifies that BKPSRAM
    // cells retain last-written value through power cycles, so
    // explicitly overwriting is the only way to prevent cold-boot reads.
    let region = bkpsram_ptr(slot);
    unsafe {
        core::ptr::write_bytes(region, 0xFF, SPK_RECORD_SIZE);
        core::sync::atomic::fence(core::sync::atomic::Ordering::SeqCst);
    }
    mark_slot_free(slot);
}
```

Writing 0xFF rather than 0x00 follows the STM32H7 BKPSRAM erased-cell convention and avoids leaving a forensically interesting all-zeros pattern that a differential power analysis tool might distinguish from random plaintext. The memory fence ensures the write is not reordered past the slot-free marker by the compiler or the Cortex-M7 store buffer.
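The rotation trigger and the 24-hour grace window described in this section reduce to two timestamp comparisons. The helper names below are hypothetical, and timestamps are assumed to be Unix seconds as in created_at.

```rust
/// Rotation period and grace window from the text, in seconds.
pub const SPK_ROTATION_SECS: u64 = 7 * 24 * 3600;
pub const PREVIOUS_SPK_GRACE_SECS: u64 = 24 * 3600;

/// True once `now` exceeds `created_at + 7 days`.
pub fn spk_rotation_due(created_at: u64, now: u64) -> bool {
    now > created_at.saturating_add(SPK_ROTATION_SECS)
}

/// True once the previous SPK's grace window has elapsed and its private
/// key should be wiped. `rotated_at` is when the new SPK replaced it.
pub fn previous_spk_expired(rotated_at: u64, now: u64) -> bool {
    now >= rotated_at.saturating_add(PREVIOUS_SPK_GRACE_SECS)
}
```

Because each drone evaluates these against its own clock, rotation accuracy depends on clock agreement across the fleet; the sketch makes no attempt to model drift.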
Key bundle requests for late joiners
A drone that joins a fleet mid-mission (a replacement unit, or one that rebooted after a crash and lost its SRAM1 peer bundle cache) cannot wait for the next natural bundle announcement cycle. It sends a BundleRequest gossip message for each peer it needs:

```rust
pub struct BundleRequest {
    pub requester_id: DeviceId,
    pub target_device_id: DeviceId,
    pub request_id: u32, // nonce for dedup
}

pub struct BundleResponse {
    pub request_id: u32,
    pub bundle: PreKeyBundle,
    pub bundle_signature: Ed25519Signature,
}
```

The request is gossip-relayed with TTL = 5. Any node that has the requested bundle in its peer_prekey_store emits a BundleResponse addressed back to the requester. Because the response is also gossip-relayed (not a direct RF unicast), the late joiner receives the bundle even if it has no direct RF link to either the target device or any node that holds the bundle. The response carries the same bundle_signature as the original announcement, so the requester can verify authenticity without trusting the relay node. This is a deliberate design choice: relay nodes are treated as untrusted message forwarders, not trusted key distribution authorities.
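A relay's decision for an incoming BundleRequest might be sketched as below. `RelayNode`, `RelayAction`, the `u64` device ID, and deduplication on (requester_id, request_id) are assumptions for illustration; bundles are treated as opaque bytes because the relay never needs to parse or trust them.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the SDK's DeviceId type.
pub type DeviceId = u64;

#[derive(Debug, PartialEq, Eq)]
pub enum RelayAction {
    /// This node holds the bundle and answers on the target's behalf.
    Respond { request_id: u32 },
    /// Pass the request on with a decremented TTL.
    Forward { ttl: u8 },
    /// Duplicate request or exhausted hop budget.
    Drop,
}

#[derive(Default)]
pub struct RelayNode {
    /// Cached bundles, kept as opaque signed bytes.
    pub peer_prekey_store: HashMap<DeviceId, Vec<u8>>,
    seen_requests: HashSet<(DeviceId, u32)>,
}

impl RelayNode {
    pub fn on_bundle_request(
        &mut self,
        requester: DeviceId,
        target: DeviceId,
        request_id: u32,
        ttl: u8,
    ) -> RelayAction {
        // The request_id nonce lets a node ignore copies it has already handled.
        if !self.seen_requests.insert((requester, request_id)) {
            return RelayAction::Drop;
        }
        if self.peer_prekey_store.contains_key(&target) {
            return RelayAction::Respond { request_id };
        }
        match ttl.checked_sub(1) {
            Some(remaining) if remaining > 0 => RelayAction::Forward { ttl: remaining },
            _ => RelayAction::Drop,
        }
    }
}
```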
Storage budget on STM32H7
The Swarm SDK's prekey storage is split between BKPSRAM (battery-backed, survives power cycle) and SRAM1 (volatile, lost on reboot):
| Data | Size | Location |
|---|---|---|
| Own OTP private keys (100 records × 64 bytes) | 6.4 KB | BKPSRAM (first 62) + SRAM1 (last 38) |
| Current SPK private key (1 × 100 bytes) | 100 bytes | BKPSRAM |
| Previous SPK private key (24-hour grace) | 100 bytes | BKPSRAM |
| Peer bundle cache (64 peers × ~3 KB per bundle) | ~192 KB | SRAM1 (volatile) |
The BKPSRAM budget of 4 KB is tight: 6.4 KB for OTPs plus 200 bytes for two SPK records exceeds the available space by roughly 2.6 KB. The overflow strategy (first 62 OTPs in BKPSRAM, remainder in SRAM1) is tracked per-record in a 100-bit persistence bitmap stored at the start of BKPSRAM. After a power cycle, the SDK reads the bitmap and immediately gossip-floods OtpConsumed messages for any OTP IDs whose private scalars were in SRAM1. Those scalars are gone, so the corresponding public keys must be retired from the bundle immediately to prevent peers from attempting to use them.
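The post-reboot recovery step can be sketched with the bitmap held in a `u128`, which has room for the 100 bits. The encoding (bit i set means record i is in BKPSRAM) and the assumption that a record's index equals its prekey_id are illustrative choices, not the SDK's documented layout.

```rust
/// Build a persistence bitmap in which the lowest `bkpsram_records` bits
/// are set. Bit `i` set means record `i` lives in BKPSRAM.
/// `bkpsram_records` must be less than 128.
pub fn bitmap_for(bkpsram_records: u32) -> u128 {
    assert!(bkpsram_records < 128);
    (1u128 << bkpsram_records) - 1
}

/// After a power cycle, list the record indices whose scalars were in
/// SRAM1 and are therefore gone. These are the IDs to announce as consumed.
pub fn lost_after_reboot(bitmap: u128, total_records: u32) -> Vec<u32> {
    (0..total_records)
        .filter(|&i| (bitmap & (1u128 << i)) == 0)
        .collect()
}
```

With 62 of 100 records in BKPSRAM, `lost_after_reboot(bitmap_for(62), 100)` yields the 38 indices from 62 through 99.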
Peer bundles are kept in SRAM1 intentionally. The ~192 KB footprint for a 64-drone fleet would consume all of BKPSRAM 48 times over. Since peer bundles can be re-fetched via the BundleRequest gossip mechanism after a reboot, the cost of losing them on power cycle is a brief re-join latency, not a security failure.
Security properties
The prekey management layer provides four security properties that compose with the X3DH and Double Ratchet layers above it:
- OTP use-once guarantee. The consumed HashSet provides a monotonic record of all OTP IDs that have been used. Replay of a consumed OTP ID in a new bundle announcement is detected and rejected: the owner cannot reissue a key whose private scalar has been zeroized, and a man-in-the-middle who replays an old bundle containing a still-valid public key will find the matching private scalar already gone on the owner's side, preventing session establishment rather than enabling a replay attack.
- Bundle forgery prevention. SPK signatures prevent an attacker from substituting a malicious public key into a bundle without the identity private key. The bundle-level signature in PreKeyBundleAnnouncement covers the entire bundle, so partial substitution is also detected. An attacker who controls a relay node can drop, delay, or replay bundles, but cannot modify them.
- SPK compromise window bound. The 7-day SPK rotation limits how long a compromised SPK is useful. Sessions established after a rotation use a new SPK whose private key the attacker does not have. Combined with OTP use, the window in which a compromised SPK enables decryption is bounded to sessions established without an OTP during the rotation period, an uncommon event that is explicitly logged.
- Degraded session transparency. The NoOtpAvailable result is surfaced to the fleet management layer, not hidden inside the SDK. Operators can query which session pairs lack OTP-level forward secrecy, and post-mission forensics can determine whether any of those sessions were established during a period of suspected RF compromise. This transparency is a deliberate trade-off: availability is preferred over blocking session establishment, but the security downgrade is never silent.
Related technical articles:
- Swarm SDK session establishment: X3DH prekey bundles and the initial drone-to-drone handshake
- Swarm SDK key management: device provisioning, certificate rotation, and revocation
- Swarm SDK gossip mesh: bounded fanout routing, message deduplication, and network partition handling
- Swarm SDK device enrollment: fleet CA provisioning and trust chain establishment
- Post-quantum mesh cryptography for drone swarms: the Swarm SDK design