
Voidly probe operator safety: anonymity design, data minimization, and operational security for censorship measurement

May 7, 2025 · 9 min read · AI Analytics

Censorship · Voidly · Security · Infrastructure

Every Voidly probe is operated by a person. In many of the countries where Voidly most needs measurement coverage, that person is taking a real risk. Russia, China, Iran, and Belarus have all prosecuted individuals for network activity that could be characterized as unauthorized measurement or assistance to foreign surveillance operations. Voidly's safety design starts from one non-negotiable principle: the system must protect operators even if Voidly's own servers are compromised.

This post covers the full operator safety stack: the threat model, what data Voidly stores and what it deliberately does not store, how IP addresses are handled, Tor routing for high-risk operators, measurement scrubbing before publication, in-app operational security guidance, per-country legal risk tiers, the tension between anonymity and measurement attribution, and the emergency stop mechanism.

The threat model

Probe operators face three categories of risk that interact in practice but require different mitigations.

Legal risk. In some jurisdictions, operating a censorship measurement probe could be characterized as unauthorized network activity, unauthorized access to foreign systems, or assistance to foreign intelligence services. Russia's laws on unauthorized computer activity and Iran's cybercrime framework are written broadly enough to reach passive measurement. The fact that the probe only makes outbound HTTP and DNS requests does not immunize the operator against a prosecutor with discretion.

Technical risk. An ISP or state-level network monitor that detects the WireGuard tunnel between the probe and Voidly's ingest server could expose the operator's participation. WireGuard traffic has a distinctive handshake pattern. Deep packet inspection at the ISP level can identify it even without decrypting the payload. If that traffic is logged and correlated with the operator's account, the technical link exists.
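
To make the technical risk concrete, the sketch below shows how little an observer needs to flag a WireGuard handshake. It relies only on the publicly documented message layout (a one-byte type, three reserved zero bytes, and fixed lengths of 148 and 92 bytes for the initiation and response). The function name and return type are illustrative assumptions, not part of any Voidly or WireGuard tooling.

```typescript
// WireGuard handshake layouts, per the public protocol description:
//   type 1 = handshake initiation, always 148 bytes
//   type 2 = handshake response,   always 92 bytes
// Byte 0 is the message type; bytes 1-3 are reserved and zero.
type HandshakeKind = 'initiation' | 'response' | null;

// Hypothetical classifier: inspects a UDP payload without decrypting anything.
function classifyWireGuardHandshake(udpPayload: Uint8Array): HandshakeKind {
  if (udpPayload.length < 4) return null;
  const reservedZero =
    udpPayload[1] === 0 && udpPayload[2] === 0 && udpPayload[3] === 0;
  if (!reservedZero) return null;
  if (udpPayload[0] === 1 && udpPayload.length === 148) return 'initiation';
  if (udpPayload[0] === 2 && udpPayload.length === 92) return 'response';
  return null;
}
```

A check this cheap can run at line rate, which is why the rest of the design assumes the tunnel itself may be visible to the ISP.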

Social risk. Even in jurisdictions without a legal basis for prosecution, disclosure of an operator's participation could have professional consequences (employer discovery of politically sensitive research), personal consequences (family pressure), or community consequences in environments where association with foreign-funded internet research is stigmatized.

The design goal is to ensure that a full compromise of Voidly's servers — database, ingest infrastructure, key material — does not give an adversary a usable link from stored data to a real-world operator identity.

What Voidly stores about operators

The registration record for a probe contains exactly these fields:

Probe public key (X25519, generated on-device). Voidly never sees the private key. The public key is the probe's cryptographic identity and is used to derive the probe_id.
ASN of the operator's ISP. Required for coverage claims: Voidly's geographic coverage assertions depend on knowing which autonomous systems have probes.
City-level location, operator-provided and not verified. Used for geographic coverage statistics. Voidly does not collect street address or postal code.
Email address for critical alerts — optional but recommended, because a probe that goes offline without explanation should notify its operator. If provided, the email is stored encrypted at rest and is never published or shared.
Application timestamp and approval status.

Voidly explicitly does not store: operator name, home address, ISP account details, or the IP address used at commissioning time. The design treats all of these as unnecessary for operating the network and treats the risk of storing them as outweighing any operational benefit.
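
One way to enforce this kind of data minimization structurally is to build the stored record from an allow-list, so that extra fields in a submission are dropped rather than persisted. The sketch below is illustrative only: the field names, the status values, and the `toRegistrationRecord` helper are assumptions, not Voidly's actual schema.

```typescript
// Hypothetical shape of a probe registration record (field names are assumptions).
interface RegistrationRecord {
  probePublicKey: string;   // X25519 public key, generated on-device
  asn: number;              // autonomous system of the operator's ISP
  city: string;             // operator-provided, unverified
  encryptedEmail?: string;  // optional; ciphertext only
  appliedAt: string;        // ISO 8601 application timestamp
  status: 'pending' | 'approved' | 'rejected';
}

// Build a record from untrusted input by copying only allow-listed fields.
// Anything else in the submission (name, address, IP, ...) is never persisted.
function toRegistrationRecord(
  input: Record<string, unknown>,
  now: Date = new Date(),
): RegistrationRecord {
  const { probePublicKey, asn, city, encryptedEmail } = input;
  if (typeof probePublicKey !== 'string' || probePublicKey === '') {
    throw new Error('probePublicKey is required');
  }
  if (typeof asn !== 'number' || !Number.isInteger(asn) || asn <= 0) {
    throw new Error('asn must be a positive integer');
  }
  if (typeof city !== 'string' || city === '') {
    throw new Error('city is required');
  }
  const record: RegistrationRecord = {
    probePublicKey,
    asn,
    city,
    appliedAt: now.toISOString(),
    status: 'pending',
  };
  if (typeof encryptedEmail === 'string') {
    record.encryptedEmail = encryptedEmail;
  }
  return record;
}
```

Because the record is constructed field by field, adding a new stored attribute requires a deliberate code change rather than happening by accident when a client sends more data.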

IP address handling

The probe connects to Voidly's ingest infrastructure over WireGuard. The ingest server sees the NAT-translated IP of the operator's connection — this is unavoidable at the network layer. That IP is handled under a strict policy:

Connection-layer IP addresses are retained in a rotating 24-hour buffer used exclusively for DoS detection and rate limiting. They are not written to any persistent database.
The IP address is never linked to the probe's identity in any stored record. The ingest server authenticates incoming connections cryptographically using the probe's WireGuard public key — it does not need to know the source IP to authenticate the probe.
IP addresses in the 24-hour buffer are not shared with third parties, not used for geolocation beyond what is already captured in the declared ASN, and not retained beyond the 24-hour window under any circumstances.

WireGuard's design makes this clean: the peer identifier in a WireGuard session is the peer's public key, not its IP address. The ingest server can verify "this connection is from probe X25519:a3f8..." without ever storing what IP address probe X25519:a3f8... connected from.
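
The retention policy above can be pictured as a small in-memory structure. The following is a minimal sketch under stated assumptions: the class name, the per-window limit, and the in-memory `Map` are all hypothetical, and the real ingest server may work differently. The point it illustrates is that the buffer's interface never accepts a probe identifier, so there is nothing to correlate.

```typescript
// Hypothetical in-memory buffer: IPs are kept for rate limiting only and
// expire after 24 hours. No probe identifier is ever passed in.
const RETENTION_MS = 24 * 60 * 60 * 1000;

class ConnectionIpBuffer {
  private readonly seen = new Map<string, number[]>(); // ip -> connection times (ms)
  private readonly maxPerWindow: number;

  constructor(maxPerWindow: number) {
    this.maxPerWindow = maxPerWindow;
  }

  // Record a connection attempt; returns false once the IP exceeds the limit.
  record(ip: string, nowMs: number): boolean {
    this.evict(nowMs);
    const times = this.seen.get(ip) ?? [];
    times.push(nowMs);
    this.seen.set(ip, times);
    return times.length <= this.maxPerWindow;
  }

  // Drop every timestamp older than the retention window.
  evict(nowMs: number): void {
    this.seen.forEach((times, ip) => {
      const fresh = times.filter((t) => nowMs - t < RETENTION_MS);
      if (fresh.length === 0) this.seen.delete(ip);
      else this.seen.set(ip, fresh);
    });
  }

  // Number of distinct IPs currently held.
  size(): number {
    return this.seen.size;
  }
}
```

Keeping the structure in memory, rather than in a table, also means a restart of the ingest process clears it, which errs on the side of retaining less.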

Tor hidden service option

For operators in high-risk environments who want to prevent the ingest server from observing any network-layer identifier, Voidly operates a Tor hidden service (onion service) for probe data upload. Connections to the .onion address arrive over a Tor rendezvous circuit and never pass through an exit node, so the ingest server sees no client IP address at all. There is no direct network-level connection between the operator and Voidly's infrastructure.

The trade-off is latency and throughput: Tor routing typically makes the round trip for each upload batch 3–5× longer, and upload throughput is reduced by the onion routing overhead. For most probes, this is acceptable because measurement batches are small (a few hundred KB per upload cycle). The Tauri app supports a custom SOCKS5 proxy for the upload path, enabling Tor Browser or a local Tor daemon as the transport. Tor Browser's SOCKS listener defaults to port 9150, as in the example; a standalone Tor daemon defaults to 9050. Configure it in the probe's settings file:

// ~/.voidly/probe_config.json
{
  "upload": {
    "endpoint": "http://voidlyxxx...onion/v1/ingest",
    "transport": "socks5",
    "socks5_host": "127.0.0.1",
    "socks5_port": 9150,
    "tor_mode": true
  }
}

When tor_mode is set to true, the Tauri app also suppresses any telemetry or version-check requests that would otherwise go out over the direct connection. All network activity from the app is routed through the SOCKS5 proxy.
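
A misconfigured Tor setup can fail silently, for example by pointing tor_mode at a clearnet endpoint. The sketch below shows a pre-flight check an app could run before the first upload. It is an assumption about how such a check might look, not a description of the Tauri app's behavior; `validateUploadConfig` and its messages are hypothetical, while the field names mirror the settings file above.

```typescript
// Mirrors the "upload" object from the settings file shown above.
interface UploadConfig {
  endpoint: string;
  transport: string;
  socks5_host?: string;
  socks5_port?: number;
  tor_mode: boolean;
}

// Hypothetical pre-flight check: returns a list of problems (empty if OK).
function validateUploadConfig(cfg: UploadConfig): string[] {
  const problems: string[] = [];
  if (!cfg.tor_mode) return problems; // nothing Tor-specific to verify

  let host = '';
  try {
    host = new URL(cfg.endpoint).hostname;
  } catch {
    problems.push('endpoint is not a valid URL');
  }
  if (host !== '' && !host.endsWith('.onion')) {
    problems.push('tor_mode requires a .onion endpoint');
  }
  if (cfg.transport !== 'socks5') {
    problems.push('tor_mode requires transport "socks5"');
  }
  // Accept only a loopback proxy so uploads cannot reach a remote proxy in the clear.
  if (cfg.socks5_host !== '127.0.0.1' && cfg.socks5_host !== 'localhost') {
    problems.push('socks5_host should be a loopback address');
  }
  const port = cfg.socks5_port;
  if (port === undefined || !Number.isInteger(port) || port < 1 || port > 65535) {
    problems.push('socks5_port must be an integer between 1 and 65535');
  }
  return problems;
}
```

Refusing to start when the list is non-empty is the conservative choice: a probe that does not upload is safer than one that uploads over the wrong path.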

Measurement data scrubbing

Published measurements are scrubbed before they appear in the public dataset. The scrubbing pipeline runs on every batch before it is written to the TimescaleDB publication table:

Probe ID pseudonymization. The stable probe_id is replaced with a daily pseudonym: SHA-256 of the probe's public key concatenated with a daily salt. The daily salt is different for each country, which prevents researchers from linking the same probe's pseudonym across measurements from different countries even if the probe travels.
Timestamp rounding. The probe-local timestamp of each measurement is rounded to the nearest 15 minutes before publication. This prevents timing correlation attacks where an adversary observes when a specific household's traffic stops to infer which probe operator lives there.
Uniqueness suppression. Any measurement that would uniquely identify a specific household — a single probe in a small city reporting a block page response that no other probe in that ASN has seen — is withheld from publication for 30 days. After 30 days, the measurement is published with the delayed flag set and contributes to historical analyses but not to real-time incident detection.

// Measurement scrubbing pipeline
import { createHash } from 'crypto';

// Placeholder: the real result type is defined elsewhere in the pipeline.
type MeasurementResult = unknown;

interface RawMeasurement {
  probe_id: string;
  probe_public_key: string;
  country_code: string;
  timestamp: Date;
  domain: string;
  result: MeasurementResult;
}

interface ScrubbedMeasurement {
  daily_probe_pseudonym: string;
  country_code: string;
  rounded_timestamp: Date;
  domain: string;
  result: MeasurementResult;
  suppressed_until?: Date;
}

// Stand-in for the KV store that holds the salts (an assumption for this listing).
const dailySaltStore = new Map<string, string>();

const FIFTEEN_MIN_MS = 15 * 60 * 1000;
const SUPPRESSION_MS = 30 * 24 * 60 * 60 * 1000;

function getDailySalt(country_code: string, date: string): string {
  // Salt is country-specific and rotates daily; it is stored in KV, not derived
  const salt = dailySaltStore.get(country_code + ':' + date);
  if (salt === undefined) {
    // Fail closed: never hash with a missing salt
    throw new Error('No daily salt for ' + country_code + ':' + date);
  }
  return salt;
}

// peerCount: how many probes in the same ASN corroborate this result
// (computed by the caller).
function scrubMeasurement(
  raw: RawMeasurement,
  peerCount: number,
): ScrubbedMeasurement {
  // UTC calendar date (YYYY-MM-DD) selects the day's salt
  const date = raw.timestamp.toISOString().slice(0, 10);
  const salt = getDailySalt(raw.country_code, date);

  const pseudonym = createHash('sha256')
    .update(raw.probe_public_key + salt)
    .digest('hex');

  // Round to the nearest 15 minutes
  const roundedTs = new Date(
    Math.round(raw.timestamp.getTime() / FIFTEEN_MIN_MS) * FIFTEEN_MIN_MS
  );

  const scrubbed: ScrubbedMeasurement = {
    daily_probe_pseudonym: pseudonym,
    country_code: raw.country_code,
    rounded_timestamp: roundedTs,
    domain: raw.domain,
    result: raw.result,
  };

  // Suppress unique household-identifying measurements for 30 days.
  // Computed from the rounded timestamp so the published field does not
  // leak the exact measurement time.
  if (peerCount < 2) {
    scrubbed.suppressed_until = new Date(roundedTs.getTime() + SUPPRESSION_MS);
  }

  return scrubbed;
}
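
The two properties the pipeline depends on, pseudonyms that are stable within a day but different across days, and timestamps snapped to quarter-hour boundaries, can be checked in isolation. The snippet below is a small self-contained illustration. The helper names, the example key, and the randomly generated salts are assumptions made for the demo; only the SHA-256 concatenation and the 15-minute rounding come from the description above.

```typescript
import { createHash, randomBytes } from 'node:crypto';

const QUARTER_HOUR_MS = 15 * 60 * 1000;

// Round a timestamp to the nearest 15-minute boundary (operates on epoch ms).
function roundToQuarterHour(ts: Date): Date {
  return new Date(Math.round(ts.getTime() / QUARTER_HOUR_MS) * QUARTER_HOUR_MS);
}

// SHA-256 of the public key concatenated with the salt, as described above.
function dailyPseudonym(publicKey: string, salt: string): string {
  return createHash('sha256').update(publicKey + salt).digest('hex');
}

// Hypothetical salts: random values, one per (country, day).
const saltDay1 = randomBytes(32).toString('hex');
const saltDay2 = randomBytes(32).toString('hex');

const key = 'example-public-key';
const p1 = dailyPseudonym(key, saltDay1);
const p2 = dailyPseudonym(key, saltDay2);

console.log(p1 === dailyPseudonym(key, saltDay1)); // true: stable within a day
console.log(p1 === p2);                            // false: differs across days

console.log(roundToQuarterHour(new Date('2025-05-07T10:07:29Z')).toISOString());
// 2025-05-07T10:00:00.000Z
console.log(roundToQuarterHour(new Date('2025-05-07T10:07:31Z')).toISOString());
// 2025-05-07T10:15:00.000Z
```

Without access to the salts, an outside reader of the dataset has no way to test whether two pseudonyms from different days belong to the same probe.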

Operational security guidance for operators

The Tauri app includes an "Operator Safety" screen that is shown on first launch and accessible at any time from the settings menu. It explains the threat model in plain language and provides country-specific guidance maintained by Voidly's legal team. The in-product guidance covers:

Dedicated device. In high-risk countries, operators should use a device that is not shared with personal browsing. A low-cost single-board computer running headless Linux is preferable to running the probe on the family laptop — if the device is seized, the worst case is loss of the device, not exposure of personal browsing history.
Network isolation. Connecting the probe to a separate SSID or VLAN from household devices limits the ability of an ISP-level observer to correlate probe traffic with other household activity. Most consumer routers support multiple SSIDs; the guidance includes setup instructions for common router firmware.
Network identity. If possible, connect the probe via a data SIM or prepaid mobile connection that does not appear in the operator's name. This removes the direct subscriber-identity link that an ISP would provide to authorities under a legal demand.
Key material. The probe generates its own keypair on-device. The private key never leaves the OS keychain and should never be copied or backed up to cloud storage. The guidance explains that copying the key file to another device or cloud backup negates the security properties of on-device generation.

Legal review by country

Voidly maintains a per-country legal risk assessment — HIGH, MEDIUM, or LOW — based on analysis of applicable laws, known prosecutions of similar network measurement or circumvention research, and ongoing monitoring of legislative changes. The assessment is reviewed quarterly by outside counsel in partnership with digital rights organizations.

Operators in HIGH-risk countries (currently: China, Russia, Iran, North Korea, Belarus, Venezuela) see an extended disclosure during the commissioning flow. Before completing registration, they must confirm that they have read and understood the country-specific risk summary, that they are acting voluntarily with awareness of the legal environment, and that they have considered the operational security guidance. This confirmation is recorded in the commissioning log with a timestamp.

Voidly does not accept applications from certain jurisdictions at all — territories under comprehensive international sanctions or where even encrypted outbound traffic to foreign servers has been shown to trigger immediate investigation. Restricting access from these jurisdictions is itself a protective measure: a probe that cannot safely upload data adds no measurement coverage and puts its operator at risk.
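
The tiered commissioning flow described in this section could be expressed as a small gate function. This is a sketch under assumptions: the `NOT_ACCEPTED` value, the field names, and the `commissioningGate` function are hypothetical and only model the rules stated above (three confirmations for HIGH-risk countries, a recorded timestamp, and outright refusal for excluded jurisdictions).

```typescript
// HIGH/MEDIUM/LOW come from the text; NOT_ACCEPTED is a hypothetical marker
// for jurisdictions from which applications are refused.
type RiskTier = 'HIGH' | 'MEDIUM' | 'LOW' | 'NOT_ACCEPTED';

interface Acknowledgements {
  readRiskSummary: boolean;
  actingVoluntarily: boolean;
  consideredOpsecGuidance: boolean;
}

interface GateResult {
  allowed: boolean;
  reason?: string;
  acknowledgedAt?: string; // ISO 8601 timestamp for the commissioning log
}

function commissioningGate(
  tier: RiskTier,
  acks: Acknowledgements | null,
  now: Date = new Date(),
): GateResult {
  if (tier === 'NOT_ACCEPTED') {
    return { allowed: false, reason: 'applications are not accepted from this jurisdiction' };
  }
  if (tier !== 'HIGH') {
    return { allowed: true }; // no extended disclosure required
  }
  const confirmed =
    acks !== null &&
    acks.readRiskSummary &&
    acks.actingVoluntarily &&
    acks.consideredOpsecGuidance;
  if (!confirmed) {
    return { allowed: false, reason: 'all three confirmations are required' };
  }
  return { allowed: true, acknowledgedAt: now.toISOString() };
}
```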

Operator anonymity vs. measurement attribution

Anonymized measurements create a genuine research tradeoff. A researcher who wants to ask "what does AS12345 in Tehran specifically see for bbc.com?" can get that answer from Voidly's dataset. A researcher who wants to ask "has the specific probe operated by person X changed its behavior?" cannot — and that is by design.

Voidly resolves the tension by treating the ASN as the stable, publishable identifier, while anonymizing the probe-within-ASN identity. AS12345 in Tehran is a public fact; the specific subscriber account at that ISP running a Voidly probe is not. Researchers can answer "what do Rostelecom customers see?" because all probes in Rostelecom's ASN share the same ASN label, and that label is stable across time. They cannot answer "is the probe at this apartment in Moscow still active?" because the daily pseudonym rotation ensures that probe-level continuity is not reconstructable from the published dataset.

This design accepts a limitation: longitudinal analysis of a specific probe's behavior — useful for detecting probe compromise or selective censorship targeting a specific subscriber — is not possible using the published dataset alone. Voidly retains the mapping internally for operational health monitoring, but does not expose it via the public API or research dataset.
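
As an illustration of the kind of question the published data can answer, the sketch below computes a per-ASN block rate for one domain. The row shape is an assumption: the `asn` and `blocked` fields and the `blockRateByAsn` function are hypothetical simplifications of the published record, and the sample rows are made up for the example. The pseudonym column is present but deliberately unused.

```typescript
// Hypothetical, simplified view of a published row.
interface PublishedMeasurement {
  asn: number;
  daily_probe_pseudonym: string;
  domain: string;
  blocked: boolean;
}

// Fraction of measurements for `domain` that were blocked, keyed by ASN.
function blockRateByAsn(
  rows: PublishedMeasurement[],
  domain: string,
): Map<number, number> {
  const totals = new Map<number, { blocked: number; total: number }>();
  for (const row of rows) {
    if (row.domain !== domain) continue;
    const t = totals.get(row.asn) ?? { blocked: 0, total: 0 };
    t.total += 1;
    if (row.blocked) t.blocked += 1;
    totals.set(row.asn, t);
  }
  const rates = new Map<number, number>();
  totals.forEach((t, asn) => rates.set(asn, t.blocked / t.total));
  return rates;
}

// Made-up sample rows for illustration only.
const sampleRows: PublishedMeasurement[] = [
  { asn: 12345, daily_probe_pseudonym: 'aaa', domain: 'bbc.com', blocked: true },
  { asn: 12345, daily_probe_pseudonym: 'bbb', domain: 'bbc.com', blocked: false },
  { asn: 64500, daily_probe_pseudonym: 'ccc', domain: 'bbc.com', blocked: false },
  { asn: 12345, daily_probe_pseudonym: 'aaa', domain: 'example.org', blocked: false },
];

const rates = blockRateByAsn(sampleRows, 'bbc.com');
console.log(rates.get(12345)); // 0.5
console.log(rates.get(64500)); // 0
```

The aggregation never needs probe-level continuity, which is why rotating the pseudonym daily costs this analysis nothing.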

Emergency stop mechanism

Any operator can immediately terminate their probe's participation and erase all local data using the Emergency Stop button in the Tauri app. The button is prominently placed in the main navigation and requires a single confirmation tap — it is explicitly not hidden behind multiple menus, because in an emergency, it needs to be reachable in under five seconds.

Emergency Stop executes these steps in sequence:

Terminates all active measurement runs and flushes the pending upload queue without uploading.
Drops the WireGuard tunnel to Voidly's ingest server.
Deletes the SQLite measurement store from the local filesystem, including all WAL files.
Deletes the probe keypair from the OS keychain (Keychain, Secret Service, or Credential Manager).
Sends an authenticated DELETE /v1/probes/{probe_id} request to Voidly's API to remove the operator's commissioning record from the server-side database. The request is signed while the private key still exists, before the keychain deletion in the previous step, so signing is the last action taken with that key.

After Emergency Stop, the app returns to its first-launch state. The operator can recommission with a fresh keypair, which generates a new probe_id with no link to the previous probe's history. From Voidly's perspective, the old probe ceased to exist and a new, unrelated probe registered.
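
The ordering constraint in the Emergency Stop sequence, sign first, then delete the key, then send, is easy to get wrong, so it is worth sketching. The code below is a minimal illustration under assumptions: the `EmergencyStopDeps` interface and every method on it are hypothetical, the steps are modeled as synchronous callbacks to keep the example short, and the choice to continue past a failed step is a design assumption rather than documented behavior.

```typescript
// Hypothetical hooks into the app; a real client would likely make these async.
interface EmergencyStopDeps {
  stopMeasurements(): void;
  discardUploadQueue(): void;
  dropTunnel(): void;
  deleteLocalStore(): void;
  signDeleteRequest(): string; // uses the private key
  deleteKeypair(): void;
  sendDeleteRequest(signature: string): void;
}

// Runs every step in order and returns the names of any steps that failed.
// A failure never aborts the sequence: local wiping must not depend on the network.
function emergencyStop(deps: EmergencyStopDeps): string[] {
  const failures: string[] = [];
  const attempt = (name: string, fn: () => void): void => {
    try {
      fn();
    } catch {
      failures.push(name);
    }
  };

  attempt('stopMeasurements', () => deps.stopMeasurements());
  attempt('discardUploadQueue', () => deps.discardUploadQueue());
  attempt('dropTunnel', () => deps.dropTunnel());
  attempt('deleteLocalStore', () => deps.deleteLocalStore());

  // Sign while the key still exists; this is the last use of the key.
  let signature: string | undefined;
  try {
    signature = deps.signDeleteRequest();
  } catch {
    failures.push('signDeleteRequest');
  }
  attempt('deleteKeypair', () => deps.deleteKeypair());

  // The server-side deletion goes out only after the key is gone locally.
  if (signature !== undefined) {
    const sig = signature;
    attempt('sendDeleteRequest', () => deps.sendDeleteRequest(sig));
  }
  return failures;
}
```

Producing the signature before the keychain deletion lets the local wipe complete even when the network is unreachable, at the cost that the server-side record may outlive the local data until the request can be delivered.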

Transparency report

Voidly publishes a quarterly transparency report that covers three areas. First, the number of active operators broken down by country risk tier (HIGH, MEDIUM, or LOW), without per-country operator counts inside the HIGH tier, to avoid creating a list that could assist an adversary in identifying which operators to investigate. Second, any legal demands received from government or law enforcement authorities: court orders, national security letters, administrative demands for operator data. As of Q2 2025, Voidly has received zero such demands. Third, any security incidents affecting operator data, including unauthorized access to the registration database, ingest infrastructure compromise, or key material exposure.

The transparency report is Voidly's public commitment that the safety design described here is actually implemented and being upheld. If any of these policies change — in either direction — that change will appear in the next transparency report with an explanation.

For how new operators register and receive a probe_id — the commissioning process this safety design protects: Voidly probe commissioning: how a new operator joins the censorship measurement network →

For what domains operators' probes are testing — the measurement burden on their network: Voidly's URL test list: how we curate the domains that reveal internet censorship →

For how probe vantage selection balances ASN diversity against operator safety constraints: Voidly probe vantage selection: ASN diversity, operator safety, and reaching hard-to-measure countries →

For the local measurement buffer that stores data if the probe can't immediately upload: Voidly probe local measurement buffer: SQLite ring buffer, batch compression, and resilient upload →