
The Voidly REST API: querying the global censorship index in real time

January 6, 2025 · 8 min read · AI Analytics

Censorship · Voidly · API design · Infrastructure

Voidly exposes its full censorship index — incidents, raw measurements, country summaries, BGP events, domain histories, and 7-day shutdown forecasts — through a versioned REST API at https://api.voidly.ai/v1/. The public tier requires no authentication: you can start querying incidents the moment you read this. This post walks through every major endpoint, the request and response shapes, pagination, rate limits, streaming exports, and working code in curl, Python, and JavaScript.

Versioning

The API is versioned by URL path prefix (/v1/). Within a major version, the API is additive-only: new fields may appear in responses without a version bump, but no existing field will be removed or renamed. If you need to pin to a specific minor revision of the response schema — useful when parsing is strict — send the X-Voidly-API-Version request header:

GET /v1/incidents HTTP/1.1
Host: api.voidly.ai
X-Voidly-API-Version: 2025-01-06
Accept: application/json

The response includes an X-Voidly-API-Version header confirming which schema revision was used. If you omit the header you receive the latest revision of v1. Breaking changes — field removals, renamed enums — will only ever appear in a new major version path (/v2/), announced with a six-month deprecation window.
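A minimal sketch of pinning the schema revision from Python, using only the standard library. The helper name build_request is hypothetical; the header name and the 2025-01-06 revision value are taken from the example above. The request is only constructed here, not sent.

```python
import urllib.request
from typing import Optional

BASE_URL = "https://api.voidly.ai/v1"


def build_request(path: str, schema_revision: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request, optionally pinned to a schema revision (hypothetical helper)."""
    headers = {"Accept": "application/json"}
    if schema_revision:
        # Omit this header to receive the latest revision of v1.
        headers["X-Voidly-API-Version"] = schema_revision
    return urllib.request.Request(f"{BASE_URL}{path}", headers=headers, method="GET")


req = build_request("/incidents", schema_revision="2025-01-06")
print(req.full_url)
# urllib stores header names via str.capitalize(), so the key prints as
# "X-voidly-api-version"; HTTP header names are case-insensitive on the wire.
print(dict(req.header_items()))
```

After sending the request, comparing the X-Voidly-API-Version response header against the pinned value is a cheap way to detect a revision mismatch before parsing.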

Authentication and rate limits

The public tier is unauthenticated. Requests are rate-limited by source IP: 120 requests per minute for point-lookup endpoints (incident by ID, country summary, domain history, forecast) and 20 requests per minute for paginated aggregate endpoints (/v1/incidents, /v1/measurements, /v1/bgp/events). These limits are per-IP, not per-session, and they reset on a sliding 60-second window.

Authenticated requests carry a bearer token in the Authorization header and receive 5× higher limits (600 req/min lookups, 100 req/min aggregates) plus access to the streaming export endpoint. To obtain a token, email info@ai-analytics.org with a brief description of your use case. There is no per-request cost; the token is free for academic, journalism, and non-commercial research use.

Every response — successful or error — includes three rate-limit headers:

X-RateLimit-Limit: 120
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1736121600

X-RateLimit-Reset is a Unix timestamp (UTC) for when the current window resets. When you exceed a limit the API returns HTTP 429 with a Retry-After header giving the number of seconds until you may retry:

HTTP/1.1 429 Too Many Requests
Retry-After: 14
Content-Type: application/problem+json

{
  "type": "https://api.voidly.ai/problems/rate-limit-exceeded",
  "title": "Rate limit exceeded",
  "status": 429,
  "detail": "Aggregate endpoint limit (20 req/min) exceeded for this IP.",
  "retry_after": 14
}
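The headers above are enough to compute a polite wait time on the client. The following is a sketch under the assumption that header values arrive as strings in a plain mapping with the exact casing shown; the function name seconds_until_retry is illustrative and not part of any Voidly SDK.

```python
import time
from typing import Mapping, Optional


def seconds_until_retry(
    status: int, headers: Mapping[str, str], now: Optional[float] = None
) -> float:
    """Return how many seconds to wait before the next request (0.0 if none)."""
    now = time.time() if now is None else now
    if status == 429 and "Retry-After" in headers:
        # Retry-After is described in the post as a number of seconds.
        return float(headers["Retry-After"])
    if headers.get("X-RateLimit-Remaining") == "0" and "X-RateLimit-Reset" in headers:
        # X-RateLimit-Reset is a Unix timestamp (UTC) for the window reset.
        return max(0.0, float(headers["X-RateLimit-Reset"]) - now)
    return 0.0


print(seconds_until_retry(429, {"Retry-After": "14"}))  # 14.0
```

HTTP clients such as httpx expose case-insensitive header mappings, so the same lookups should work there without normalizing the keys first.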

Core endpoints

GET /v1/incidents

The primary entry point. Returns a paginated list of censorship incidents — each incident represents a distinct blocking event on a specific domain in a specific country, potentially spanning multiple measurement windows. The full list of supported query parameters:

country_code — ISO 3166-1 alpha-2 code (e.g. IR, RU, ET)
confidence_tier — anomaly, corroborated, or verified; may be repeated for OR logic
interference_type — dns_tampering, tls_interference, http_blocking, bgp_withdrawal, or throttling; may be repeated
domain — exact hostname or wildcard (e.g. *.twitter.com); wildcard prefix only
date_from / date_to — ISO 8601 datetime, applied to started_at; defaults to the last 7 days
status — open (last seen within 2 hours) or resolved
cursor / limit — pagination (see below)
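Repeated parameters are easy to get wrong when building query strings by hand. This sketch uses urllib.parse.urlencode with doseq=True so that a list value becomes a repeated parameter; the helper name incidents_url is hypothetical, and the filter names are the ones listed above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.voidly.ai/v1"


def incidents_url(**filters) -> str:
    """Build a /v1/incidents URL; list values expand to repeated parameters."""
    # Drop unset filters so they are not sent as the literal string "None".
    params = {key: value for key, value in filters.items() if value is not None}
    return f"{BASE_URL}/incidents?{urlencode(params, doseq=True)}"


url = incidents_url(
    country_code="IR",
    confidence_tier=["corroborated", "verified"],  # repeated for OR logic
    status="open",
    domain=None,  # unset filters are omitted
)
print(url)
```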

A typical response for a single incident in the list:

{
  "data": [
    {
      "incident_id": "inc_01hq7r2k4vj8bx3np5yzw6dc9f",
      "country_code": "IR",
      "domain": "twitter.com",
      "interference_type": "dns_tampering",
      "confidence_tier": "verified",
      "started_at": "2025-01-04T08:12:00Z",
      "last_seen_at": "2025-01-06T14:47:00Z",
      "status": "open",
      "measurement_count": 1847,
      "probe_asns": [44244, 16322, 48159, 197207, 58224],
      "ooni_corroborated": true,
      "cp_corroborated": false,
      "ioda_corroborated": true,
      "corroboration_score": 0.997
    }
  ],
  "meta": {
    "total": 4821,
    "count": 100,
    "cursor": "eyJpZCI6ImluY18wMWhxN3Iya...",
    "has_more": true
  }
}

The corroboration_score is a float in [0, 1] derived from the independence-weighted agreement of all sources that observed this incident. Scores above 0.95 indicate agreement from at least two methodologically distinct source pairs (e.g. Voidly + OONI + IODA). Scores below 0.80 indicate single-source Anomaly-tier events.
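The two thresholds can be turned into a small helper for dashboards or log lines. This is only a sketch: the post does not characterize scores between 0.80 and 0.95, so that band is labelled as unspecified here, and the label strings themselves are made up for illustration.

```python
def describe_corroboration(score: float) -> str:
    """Map a corroboration_score to the bands described in the text."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"corroboration_score must be in [0, 1], got {score}")
    if score > 0.95:
        return "multi-source agreement"
    if score < 0.80:
        return "single-source anomaly"
    # The post gives no interpretation for this middle band.
    return "unspecified band"


for score in (0.997, 0.90, 0.42):
    print(score, describe_corroboration(score))
```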

GET /v1/incidents/{incident_id}

Returns the full detail record for a single incident. In addition to all fields from the list endpoint, the detail response includes the array of underlying measurement IDs and a structured breakdown of corroboration by source:

{
  "incident_id": "inc_01hq7r2k4vj8bx3np5yzw6dc9f",
  "country_code": "IR",
  "domain": "twitter.com",
  "interference_type": "dns_tampering",
  "confidence_tier": "verified",
  "started_at": "2025-01-04T08:12:00Z",
  "last_seen_at": "2025-01-06T14:47:00Z",
  "status": "open",
  "measurement_count": 1847,
  "probe_asns": [44244, 16322, 48159, 197207, 58224],
  "measurement_ids": [
    "meas_01hq7r2k4vj8bx3np5yzw6dc9f",
    "meas_01hq7r2k5aj9by4oq6zax7ed0g",
    "..."
  ],
  "corroboration": {
    "ooni_corroborated": true,
    "ooni_first_seen": "2025-01-04T08:19:00Z",
    "cp_corroborated": false,
    "cp_first_seen": null,
    "ioda_corroborated": true,
    "ioda_first_seen": "2025-01-04T08:15:00Z",
    "corroboration_score": 0.997,
    "sources": ["voidly", "ooni", "ioda"]
  },
  "ooni_corroborated": true,
  "cp_corroborated": false,
  "ioda_corroborated": true,
  "corroboration_score": 0.997
}
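One use of the per-source first_seen timestamps is measuring how quickly each external source picked up an incident. The sketch below assumes the field naming shown in the sample (a *_first_seen key per source, null when the source has not corroborated); corroboration_lags is a hypothetical helper name.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    # datetime.fromisoformat() accepts a trailing "Z" only on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def corroboration_lags(incident: dict) -> dict[str, float]:
    """Minutes between started_at and each source's first sighting."""
    started = parse_ts(incident["started_at"])
    lags = {}
    for key, value in incident["corroboration"].items():
        if key.endswith("_first_seen") and value is not None:
            source = key[: -len("_first_seen")]
            lags[source] = (parse_ts(value) - started).total_seconds() / 60
    return lags


# Trimmed copy of the sample detail record above.
incident = {
    "started_at": "2025-01-04T08:12:00Z",
    "corroboration": {
        "ooni_corroborated": True,
        "ooni_first_seen": "2025-01-04T08:19:00Z",
        "cp_corroborated": False,
        "cp_first_seen": None,
        "ioda_corroborated": True,
        "ioda_first_seen": "2025-01-04T08:15:00Z",
    },
}
print(corroboration_lags(incident))  # {'ooni': 7.0, 'ioda': 3.0}
```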

GET /v1/measurements

Raw probe measurements. This endpoint returns the individual measurement records that underlie incident aggregations — one row per probe test. Because cardinality is high (tens of millions of measurements per day globally), date filtering is required: requests without date_from and date_to will return a 400. The maximum date window is 24 hours for unauthenticated requests, 7 days for authenticated. Additional filter parameters mirror those on /v1/incidents, plus:

probe_asn — filter to a specific ASN integer
probe_type — residential, datacenter, or mobile
incident_id — return only measurements belonging to a specific incident
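Because unauthenticated requests are capped at a 24-hour window, a longer range has to be split into consecutive requests. A minimal sketch of that splitting follows; date_windows and iso_z are hypothetical helpers, and since the post does not say whether date_to is inclusive, consecutive windows share a boundary and callers may want to de-duplicate on measurement_id.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator


def date_windows(
    start: datetime, end: datetime, max_hours: int = 24
) -> Iterator[tuple[datetime, datetime]]:
    """Yield consecutive (date_from, date_to) pairs no longer than max_hours."""
    step = timedelta(hours=max_hours)
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


def iso_z(moment: datetime) -> str:
    """Format a datetime as the ISO 8601 UTC form used in the examples."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


start = datetime(2025, 1, 4, tzinfo=timezone.utc)
end = datetime(2025, 1, 6, 12, tzinfo=timezone.utc)
windows = [(iso_z(a), iso_z(b)) for a, b in date_windows(start, end)]
for date_from, date_to in windows:
    print(date_from, date_to)
```

For authenticated requests, passing max_hours=168 would match the 7-day maximum.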

A single measurement record:

{
  "measurement_id": "meas_01hq7r2k4vj8bx3np5yzw6dc9f",
  "probe_id": "prb_ir_044244_0017",
  "probe_cc": "IR",
  "probe_asn": 44244,
  "probe_type": "residential",
  "domain": "twitter.com",
  "test_start_time": "2025-01-04T08:12:03Z",
  "dns_failure": "dns_nxdomain",
  "dns_consistency": "inconsistent",
  "dns_resolved_ips": ["10.10.34.35"],
  "dns_ip_blockpage_asn": 44244,
  "tls_interception_detected": false,
  "http_blockpage_match": false,
  "interference_type": "dns_tampering",
  "prob_dns_tampering": 0.961,
  "prob_tls_interference": 0.023,
  "prob_http_blocking": 0.011,
  "prob_bgp_withdrawal": 0.003,
  "prob_throttling": 0.002,
  "confidence_tier": "verified",
  "incident_id": "inc_01hq7r2k4vj8bx3np5yzw6dc9f",
  "corroboration_score": 0.997
}

The prob_* fields are per-class confidence scores from the anomaly classifier summing to approximately 1.0. The interference_type field is the argmax class at the measurement level; the incident-level type reflects the dominant class across all contributing measurements.
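The relationship between the prob_* fields and interference_type can be checked client-side. This sketch recomputes the argmax from the sample record above; the class list is copied from the interference_type values documented for /v1/incidents, and top_class is a hypothetical helper.

```python
import math

CLASSES = (
    "dns_tampering",
    "tls_interference",
    "http_blocking",
    "bgp_withdrawal",
    "throttling",
)


def top_class(record: dict) -> tuple[str, float]:
    """Return the highest-probability class and the sum of all prob_* fields."""
    probs = {name: record[f"prob_{name}"] for name in CLASSES}
    return max(probs, key=probs.get), sum(probs.values())


record = {
    "interference_type": "dns_tampering",
    "prob_dns_tampering": 0.961,
    "prob_tls_interference": 0.023,
    "prob_http_blocking": 0.011,
    "prob_bgp_withdrawal": 0.003,
    "prob_throttling": 0.002,
}
label, total = top_class(record)
print(label == record["interference_type"])      # True
print(math.isclose(total, 1.0, abs_tol=1e-6))    # True
```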

GET /v1/countries/{cc}/summary

A country-level aggregate. Useful for building dashboards and for seeding the 7-day forecast context. The cc path parameter is a two-letter ISO 3166-1 alpha-2 code. No additional query parameters are accepted; the summary always reflects the trailing 90 days plus a live 7-day forecast window:

{
  "country_code": "IR",
  "country_name": "Iran",
  "censorship_score": 0.91,
  "incident_counts_90d": {
    "total": 312,
    "dns_tampering": 198,
    "tls_interference": 44,
    "http_blocking": 61,
    "bgp_withdrawal": 5,
    "throttling": 4
  },
  "verified_incident_counts_90d": {
    "total": 147,
    "dns_tampering": 94,
    "tls_interference": 22,
    "http_blocking": 28,
    "bgp_withdrawal": 2,
    "throttling": 1
  },
  "probe_asns_active": [44244, 16322, 48159, 197207, 58224, 31549, 43754],
  "measurements_90d": 4182950,
  "shutdown_forecast_7d": {
    "forecast_generated_at": "2025-01-06T00:00:00Z",
    "days": [
      { "date": "2025-01-06", "prob_shutdown": 0.08, "confidence_low": 0.04, "confidence_high": 0.14 },
      { "date": "2025-01-07", "prob_shutdown": 0.09, "confidence_low": 0.05, "confidence_high": 0.16 },
      { "date": "2025-01-08", "prob_shutdown": 0.11, "confidence_low": 0.06, "confidence_high": 0.19 },
      { "date": "2025-01-09", "prob_shutdown": 0.07, "confidence_low": 0.03, "confidence_high": 0.13 },
      { "date": "2025-01-10", "prob_shutdown": 0.07, "confidence_low": 0.03, "confidence_high": 0.13 },
      { "date": "2025-01-11", "prob_shutdown": 0.06, "confidence_low": 0.02, "confidence_high": 0.12 },
      { "date": "2025-01-12", "prob_shutdown": 0.06, "confidence_low": 0.02, "confidence_high": 0.11 }
    ]
  },
  "last_verified_incident": {
    "incident_id": "inc_01hq7r2k4vj8bx3np5yzw6dc9f",
    "domain": "twitter.com",
    "interference_type": "dns_tampering",
    "started_at": "2025-01-04T08:12:00Z"
  }
}

The censorship_score is a composite index in [0, 1] derived from verified incident frequency, domain breadth, interference type diversity, and recency decay. It is not a raw count and should not be treated as a probability — it is an ordinal ranking instrument.
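For dashboard use, two derived values are often handy: the share of incidents that reached the verified tier and the day with the highest forecast probability. The helpers below are an illustrative sketch over a trimmed copy of the sample response; neither derived value is something the API returns itself.

```python
def verified_share(summary: dict) -> float:
    """Fraction of 90-day incidents that reached the verified tier."""
    total = summary["incident_counts_90d"]["total"]
    verified = summary["verified_incident_counts_90d"]["total"]
    return verified / total if total else 0.0


def peak_forecast_day(summary: dict) -> dict:
    """Forecast entry with the highest prob_shutdown (first one wins ties)."""
    return max(summary["shutdown_forecast_7d"]["days"], key=lambda d: d["prob_shutdown"])


# Trimmed copy of the sample summary for Iran.
summary = {
    "incident_counts_90d": {"total": 312},
    "verified_incident_counts_90d": {"total": 147},
    "shutdown_forecast_7d": {
        "days": [
            {"date": "2025-01-07", "prob_shutdown": 0.09},
            {"date": "2025-01-08", "prob_shutdown": 0.11},
            {"date": "2025-01-09", "prob_shutdown": 0.07},
        ]
    },
}
print(round(verified_share(summary), 3))   # 0.471
print(peak_forecast_day(summary)["date"])  # 2025-01-08
```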

GET /v1/domains/{domain}/history

Returns the cross-country blocking history for a single domain. The domain path parameter must be an exact hostname (no wildcards). Query parameters: date_from, date_to, confidence_tier.

{
  "domain": "twitter.com",
  "countries_with_incidents": 14,
  "per_country": [
    {
      "country_code": "IR",
      "country_name": "Iran",
      "incident_count": 312,
      "verified_incident_count": 147,
      "first_incident": "2010-03-01T00:00:00Z",
      "last_incident": "2025-01-06T14:47:00Z",
      "interference_types": ["dns_tampering", "http_blocking"]
    },
    {
      "country_code": "RU",
      "country_name": "Russia",
      "incident_count": 98,
      "verified_incident_count": 41,
      "first_incident": "2022-03-04T00:00:00Z",
      "last_incident": "2025-01-05T22:03:00Z",
      "interference_types": ["http_blocking", "tls_interference"]
    }
  ]
}

GET /v1/bgp/events

BGP withdrawal events ingested from IODA's RouteViews and RIPE RIS feeds. These are country-level or ASN-level prefix withdrawals that correlate with internet shutdowns. Query parameters: country_code, asn, date_from, date_to, event_type (withdrawal or announcement). Date filtering is required; maximum window is 30 days.

{
  "data": [
    {
      "event_id": "bgp_01hq9m4k2nw7cy5rp8vab3fe6h",
      "country_code": "ET",
      "asn": 24757,
      "asn_name": "EthioTelecom",
      "event_type": "withdrawal",
      "prefix": "196.188.0.0/16",
      "peers_observing": 142,
      "started_at": "2025-01-05T02:18:00Z",
      "ended_at": null,
      "duration_minutes": null,
      "ioda_signal_strength": 0.88,
      "correlated_incident_ids": [
        "inc_01hq9m5r3ox8dz6sq9wcb4gf7i"
      ]
    }
  ],
  "meta": {
    "total": 7,
    "count": 7,
    "cursor": null,
    "has_more": false
  }
}
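In the sample above the event is still ongoing, so ended_at and duration_minutes are both null. A client that wants a running duration has to compute it locally. The sketch below assumes that null in both fields always means "ongoing", which the post implies but does not state; event_duration_minutes is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_ts(value: str) -> datetime:
    # Normalize the trailing "Z" for Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def event_duration_minutes(event: dict, now: Optional[datetime] = None) -> float:
    """Use the server-side duration when present, else measure up to `now`."""
    if event["duration_minutes"] is not None:
        return float(event["duration_minutes"])
    now = now or datetime.now(timezone.utc)
    ended = parse_ts(event["ended_at"]) if event["ended_at"] else now
    return (ended - parse_ts(event["started_at"])).total_seconds() / 60


event = {
    "event_id": "bgp_01hq9m4k2nw7cy5rp8vab3fe6h",
    "started_at": "2025-01-05T02:18:00Z",
    "ended_at": None,
    "duration_minutes": None,
}
as_of = datetime(2025, 1, 5, 4, 18, tzinfo=timezone.utc)
print(event_duration_minutes(event, now=as_of))  # 120.0
```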

GET /v1/forecast/{cc}

A dedicated 7-day shutdown forecast for a single country, with richer detail than the condensed version inside /v1/countries/{cc}/summary. The forecast model runs daily at 00:00 UTC and the response reflects the most recently computed forecast. In addition to per-day probabilities with confidence intervals, the response exposes the model's top driving features for explainability:

{
  "country_code": "MM",
  "country_name": "Myanmar",
  "forecast_generated_at": "2025-01-06T00:00:00Z",
  "model_version": "shutdown-forecast-v3.2.1",
  "days": [
    {
      "date": "2025-01-06",
      "prob_shutdown": 0.34,
      "confidence_low": 0.22,
      "confidence_high": 0.47,
      "driving_features": {
        "political_calendar_flag": true,
        "bgp_volatility_7d": 0.71,
        "historical_shutdown_rate": 0.29,
        "verified_incidents_14d": 18,
        "active_probe_asns": 3
      }
    },
    {
      "date": "2025-01-07",
      "prob_shutdown": 0.38,
      "confidence_low": 0.25,
      "confidence_high": 0.52,
      "driving_features": {
        "political_calendar_flag": true,
        "bgp_volatility_7d": 0.71,
        "historical_shutdown_rate": 0.29,
        "verified_incidents_14d": 18,
        "active_probe_asns": 3
      }
    }
  ]
}

political_calendar_flag is true when the target date falls within ±3 days of a scheduled election, national referendum, major protest anniversary, or state-flagged “sensitive period” in the Voidly political calendar. bgp_volatility_7d is a normalized measure of prefix announcement churn over the trailing 7 days. For more detail on the model architecture see Seven-day internet shutdown forecasting →.
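A common consumer of this endpoint is an alerting rule. The sketch below flags days whose probability clears a threshold, optionally using the lower confidence bound for a more conservative rule. The thresholds are arbitrary illustration values, not recommendations from Voidly, and days_at_risk is a hypothetical helper.

```python
def days_at_risk(forecast: dict, threshold: float, conservative: bool = True) -> list[str]:
    """Dates whose shutdown probability is at or above the threshold.

    With conservative=True the lower confidence bound must clear the
    threshold; otherwise the point estimate is used.
    """
    key = "confidence_low" if conservative else "prob_shutdown"
    return [day["date"] for day in forecast["days"] if day[key] >= threshold]


# Trimmed copy of the sample forecast for Myanmar.
forecast = {
    "country_code": "MM",
    "days": [
        {"date": "2025-01-06", "prob_shutdown": 0.34, "confidence_low": 0.22, "confidence_high": 0.47},
        {"date": "2025-01-07", "prob_shutdown": 0.38, "confidence_low": 0.25, "confidence_high": 0.52},
    ],
}
print(days_at_risk(forecast, 0.25))                      # ['2025-01-07']
print(days_at_risk(forecast, 0.30, conservative=False))  # ['2025-01-06', '2025-01-07']
```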

Cursor-based pagination

All list endpoints use cursor-based pagination. The cursor is an opaque base64-encoded token returned in meta.cursor. Pass it as the cursor query parameter on the next request to receive the next page. When meta.has_more is false, you have consumed the full result set and the cursor value is null.

The default page size is 100 rows. Set limit to any value up to 500 for most endpoints; the /v1/measurements endpoint accepts up to 1000 rows per page. Setting limit above the endpoint maximum returns HTTP 400 in the problem-details format described under Error responses.

An offset-based alternative is available via the offset parameter, but it becomes noticeably slow past page 500 because the query planner cannot efficiently skip 50,000 or more rows. Use cursor pagination for any programmatic data consumption. Offset pagination is provided only for interactive UIs that need arbitrary page jumping.

Streaming large exports

For bulk data pulls — more than a few thousand rows — use the streaming export endpoint rather than paginating through the standard list endpoints. Authenticated requests to GET /v1/export stream the response body as newline-delimited JSON (NDJSON), one record per line:

GET /v1/export?resource=incidents&country_code=RU&date_from=2024-01-01T00:00:00Z&date_to=2024-12-31T23:59:59Z HTTP/1.1
Host: api.voidly.ai
Authorization: Bearer <token>
Accept: application/x-ndjson

The connection stays open and the server flushes records as they are read from the index. Exports are bounded by a 5-minute timeout and a 10-million-row hard limit. If either bound is reached before all records have been flushed, the server sends an X-Export-Truncated: true trailer header and closes the connection.

For datasets larger than 10 million rows — full-country measurement histories, global incident dumps — use the static HuggingFace dataset snapshots at huggingface.co/datasets/voidly/censorship-index. Snapshots are published weekly and cover the full historical corpus from 2012 onward. The API is the right tool for real-time queries and targeted pulls; HuggingFace is the right tool for training set construction and longitudinal analysis.
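On the client side, NDJSON can be consumed one line at a time without buffering the whole export. A minimal sketch of the parsing step is shown here, fed from an in-memory list so it runs offline; iter_ndjson is a hypothetical helper and the sample records are abbreviated from the incident example earlier in the post.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of an NDJSON stream."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank keep-alive or trailing lines
        yield json.loads(line)


sample_stream = [
    '{"incident_id": "inc_01hq7r2k4vj8bx3np5yzw6dc9f", "country_code": "IR"}\n',
    "\n",
    '{"incident_id": "inc_01hq9m5r3ox8dz6sq9wcb4gf7i", "country_code": "ET"}\n',
]
records = list(iter_ndjson(sample_stream))
print(len(records), records[0]["country_code"])  # 2 IR
```

With httpx, the lines yielded by Response.iter_lines() inside a client.stream("GET", ...) context can be passed to the same function. If an export is cut off mid-record, the final line may fail to parse, so wrapping json.loads in a json.JSONDecodeError handler is worth considering.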

Code samples

curl

# Fetch the first page (up to 100 rows) of open verified incidents in Russia from the last 7 days
curl -s "https://api.voidly.ai/v1/incidents?country_code=RU&confidence_tier=verified&status=open" \
  -H "Accept: application/json" \
  | jq '.data[] | {id: .incident_id, domain: .domain, type: .interference_type}'

Python — async paginator

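This generator follows cursors until the result set is exhausted, using httpx with async/await so the caller can process incidents as each page arrives. It does not retry on HTTP 429: raise_for_status() raises httpx.HTTPStatusError, so long crawls on the public tier (20 requests per minute on /v1/incidents) need their own backoff: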

import asyncio
import httpx
from typing import AsyncIterator

BASE_URL = "https://api.voidly.ai/v1"


async def paginate_incidents(
    *,
    country_code: str | None = None,
    confidence_tier: str | None = None,
    interference_type: str | None = None,
    date_from: str | None = None,
    date_to: str | None = None,
    status: str | None = None,
    token: str | None = None,
    limit: int = 100,
) -> AsyncIterator[dict]:
    """Yield every incident matching the given filters, following cursors automatically."""
    headers = {"Accept": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"

    params: dict[str, str] = {"limit": str(limit)}
    if country_code:
        params["country_code"] = country_code
    if confidence_tier:
        params["confidence_tier"] = confidence_tier
    if interference_type:
        params["interference_type"] = interference_type
    if date_from:
        params["date_from"] = date_from
    if date_to:
        params["date_to"] = date_to
    if status:
        params["status"] = status

    async with httpx.AsyncClient(timeout=30) as client:
        while True:
            resp = await client.get(f"{BASE_URL}/incidents", params=params, headers=headers)
            resp.raise_for_status()
            body = resp.json()

            for incident in body["data"]:
                yield incident

            meta = body["meta"]
            if not meta["has_more"] or not meta["cursor"]:
                break
            params["cursor"] = meta["cursor"]
            # Remove offset if it was set — cursor takes precedence
            params.pop("offset", None)


async def main() -> None:
    async for incident in paginate_incidents(
        country_code="IR",
        confidence_tier="verified",
        interference_type="dns_tampering",
        date_from="2025-01-01T00:00:00Z",
        date_to="2025-01-06T23:59:59Z",
    ):
        print(incident["incident_id"], incident["domain"], incident["started_at"])


if __name__ == "__main__":
    asyncio.run(main())

JavaScript / TypeScript

const BASE_URL = "https://api.voidly.ai/v1";

interface Incident {
  incident_id: string;
  country_code: string;
  domain: string;
  interference_type: string;
  confidence_tier: string;
  started_at: string;
  last_seen_at: string;
  status: string;
  measurement_count: number;
  probe_asns: number[];
  ooni_corroborated: boolean;
  cp_corroborated: boolean;
  ioda_corroborated: boolean;
  corroboration_score: number;
}

interface IncidentListParams {
  country_code?: string;
  confidence_tier?: string;
  interference_type?: string;
  date_from?: string;
  date_to?: string;
  status?: string;
  limit?: number;
  token?: string;
}

async function* paginateIncidents(
  params: IncidentListParams
): AsyncGenerator<Incident> {
  const { token, limit = 100, ...filters } = params;

  const headers: Record<string, string> = { Accept: "application/json" };
  if (token) headers["Authorization"] = `Bearer ${token}`;

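  // Drop undefined filters. The type guard narrows values to string so the
  // result satisfies the Record<string, string> that URLSearchParams expects.
  const searchParams = new URLSearchParams(
    Object.fromEntries(
      Object.entries({ ...filters, limit: String(limit) }).filter(
        (entry): entry is [string, string] => entry[1] !== undefined
      )
    )
  );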

  while (true) {
    const url = `${BASE_URL}/incidents?${searchParams.toString()}`;
    const resp = await fetch(url, { headers });
    if (!resp.ok) throw new Error(`HTTP ${resp.status}: ${await resp.text()}`);

    const body = await resp.json();
    for (const incident of body.data as Incident[]) yield incident;

    const { has_more, cursor } = body.meta;
    if (!has_more || !cursor) break;

    searchParams.set("cursor", cursor);
    searchParams.delete("offset");
  }
}

// Usage
for await (const incident of paginateIncidents({
  country_code: "ET",
  confidence_tier: "corroborated",
  date_from: "2025-01-01T00:00:00Z",
})) {
  console.log(incident.incident_id, incident.domain, incident.interference_type);
}

Error responses

All error responses follow RFC 7807, Problem Details for HTTP APIs (since obsoleted by RFC 9457, which keeps the same core fields). Error responses are always served with Content-Type: application/problem+json, and the body contains at minimum:

{
  "type": "https://api.voidly.ai/problems/invalid-parameter",
  "title": "Invalid query parameter",
  "status": 400,
  "detail": "Parameter 'confidence_tier' must be one of: anomaly, corroborated, verified. Got: 'trusted'.",
  "instance": "/v1/incidents?confidence_tier=trusted"
}

The type URI is a stable identifier for the error class — you can use it as a discriminant in error handling code rather than pattern-matching on detail strings, which may change. The full set of defined problem types:

…/problems/invalid-parameter — 400, malformed or out-of-range query parameter
…/problems/missing-required-parameter — 400, required filter absent (e.g., date range on /measurements)
…/problems/not-found — 404, incident ID or country code does not exist
…/problems/rate-limit-exceeded — 429, includes retry_after integer (seconds)
…/problems/authentication-required — 401, endpoint requires a bearer token
…/problems/export-window-exceeded — 400, requested export window exceeds the 7-day authenticated maximum
…/problems/internal-error — 500, transient; safe to retry with exponential backoff
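Dispatching on the type URI might look like the following sketch. It keys on the final path segment of the URI and encodes the retry guidance from the list above: honor retry_after for rate limits, back off exponentially for internal errors, and do not retry anything else. The function names and the one-second backoff base are assumptions for illustration.

```python
from typing import Optional


def problem_slug(problem: dict) -> str:
    """Last path segment of the problem type URI, e.g. 'not-found'."""
    return problem["type"].rstrip("/").rsplit("/", 1)[-1]


def retry_delay(problem: dict, attempt: int) -> Optional[float]:
    """Seconds to wait before retrying, or None if a retry will not help.

    `attempt` is zero-based: 0 for the first retry, 1 for the second, and so on.
    """
    slug = problem_slug(problem)
    if slug == "rate-limit-exceeded":
        # The problem body carries retry_after as an integer number of seconds.
        return float(problem.get("retry_after", 1))
    if slug == "internal-error":
        # Exponential backoff: 1s, 2s, 4s, ... (base value is an assumption).
        return float(2 ** attempt)
    return None  # client errors such as invalid-parameter need a code fix


rate_limited = {
    "type": "https://api.voidly.ai/problems/rate-limit-exceeded",
    "status": 429,
    "retry_after": 14,
}
bad_param = {"type": "https://api.voidly.ai/problems/invalid-parameter", "status": 400}
print(retry_delay(rate_limited, attempt=0))  # 14.0
print(retry_delay(bad_param, attempt=0))     # None
```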

The MCP alternative

If you are building LLM-powered applications rather than direct HTTP integrations, the Voidly MCP server exposes the same data through 83 structured tool definitions — one per query pattern — that Claude and GPT can invoke directly without you writing any HTTP plumbing. The MCP server handles pagination, parameter validation, and response shaping internally; your model just calls get_verified_incidents_by_country or get_shutdown_forecast as a tool call. See The Voidly MCP server: 83 censorship query tools for Claude and GPT → for the full tool catalog and integration guide.

For a complete reference of every field in the measurement and incident response objects: The Voidly measurement dataset: field-by-field schema reference →

For how measurements are promoted from Anomaly to Verified Incident and what confidence_tier means at the API level: From anomaly to verified incident: the Voidly confidence tier system →

For the 83-tool MCP server that wraps this API for LLM agent use: The Voidly MCP server: 83 censorship query tools for Claude and GPT →

For the model behind the /v1/forecast/{cc} endpoint: Seven-day internet shutdown forecasting →

For how incident alerts reach journalists, researchers, and monitoring tools after an incident crosses the confidence threshold: Voidly's alert delivery system: PGP-encrypted email, webhooks, and RSS for censorship incidents →

For Server-Sent Events streaming — GET /v1/stream, four event types, Last-Event-ID reconnect, and a comparison to HMAC webhooks: The Voidly SSE streaming API: real-time censorship event delivery →

For the authentication layer that gates access to this API — key format, PBKDF2 storage, D1+KV auth flow, and four plan tiers: Voidly API authentication: key format, PBKDF2 storage, D1+KV auth flow, and plan tiers →