
Seven-day internet shutdown forecasting: how Voidly predicts connectivity outages

June 15, 2025 · 12 min read · AI Analytics

Censorship · ML · Forecasting · Voidly

The most useful warning about an internet shutdown is the one that arrives before it happens. A journalist preparing for an election, an activist coordinating across borders, a researcher trying to capture baseline data: they all need lead time. This article describes the architecture of the 7-day forecast Voidly publishes alongside every country dashboard.

What we're forecasting

The forecast targets three distinct event classes, each with a separate model:

Full connectivity blackout: BGP withdrawal covering ≥50% of a country's routable prefixes. Detectable within minutes via global routing table snapshots; the hardest to hide and the hardest to reverse.
Partial or regional shutdown: BGP withdrawal or null-routing confined to one or more ASNs or provinces. Voidly's sub-national ASN mapping is the key data layer here — we track which prefixes belong to which ISP and region.
Service-specific throttling: Bandwidth collapse on a named service (WhatsApp, Signal, Instagram) without broader connectivity loss. The softest intervention; the hardest to detect because it requires content-aware inspection, not just routing-table analysis.
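
To make the three class definitions concrete, here is a minimal labeling sketch. Only the 50% full-blackout threshold comes from the definitions above; the `Snapshot` type, the `PARTIAL_MIN_FRACTION` noise floor, and the `throttled_services` field are hypothetical names introduced for illustration, not part of the Voidly pipeline.

```python
from dataclasses import dataclass

FULL_BLACKOUT_THRESHOLD = 0.50  # from the definition above: >=50% of routable prefixes
PARTIAL_MIN_FRACTION = 0.05     # hypothetical noise floor so routine churn is not labeled


@dataclass
class Snapshot:
    total_prefixes: int             # routable prefixes known for the country
    withdrawn_prefixes: int         # prefixes withdrawn in the current snapshot
    throttled_services: tuple = ()  # hypothetical: services showing bandwidth collapse


def classify(snapshot: Snapshot) -> str:
    """Map one routing/throughput snapshot to a shutdown class label."""
    if snapshot.total_prefixes <= 0:
        raise ValueError("total_prefixes must be positive")
    withdrawn_fraction = snapshot.withdrawn_prefixes / snapshot.total_prefixes
    if withdrawn_fraction >= FULL_BLACKOUT_THRESHOLD:
        return "full"
    if withdrawn_fraction >= PARTIAL_MIN_FRACTION:
        return "partial"
    if snapshot.throttled_services:
        return "service-specific"
    return "none"
```

The ordering matters: a snapshot is checked against the most severe class first, so a blackout that also degrades a named service is still labeled "full".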

The per-country dashboards publish a 7-day probability for each class, not a binary prediction. A 72% probability on "service-specific throttling in Iran" is meaningful; "yes/no" would be misleading.

Training data: five years, 200 countries

The training corpus is the Voidly measurement archive going back to June 2020: 2.2B+ probe measurements, 1,574+ verified shutdown events, and the OONI historical corpus (1.66M+ downloads on Hugging Face) for pre-2020 context. Every labeled event has:

Country, ASN list, and affected prefix set
Start and end timestamps (some events lasted minutes; some, weeks)
Shutdown class: full / partial / service-specific
Corroboration source(s): OONI, CensoredPlanet, IODA, and/or Voidly-only
Political-event annotation: election, protest, coup attempt, legislative session, etc.

The label imbalance is severe. Full blackouts are rare — roughly 1 country-week in every 5,000 in the training corpus. Service-specific throttling is more common but still sparse relative to "nothing happened" weeks. We use a combination of oversampling rare events and asymmetric loss functions to keep precision usable on the minority class.
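
As an illustration of the two techniques named above, the sketch below shows naive duplication oversampling and a weighted log loss in which a missed shutdown costs more than a false alarm. The function names, the duplication strategy, and the `pos_weight` parameter are assumptions made for this example; the article does not specify which oversampling scheme or loss Voidly actually uses.

```python
import math


def oversample(rows, labels, factor):
    """Repeat every positive (shutdown) row `factor` times; negatives stay as-is."""
    out_rows, out_labels = [], []
    for row, label in zip(rows, labels):
        copies = factor if label == 1 else 1
        out_rows.extend([row] * copies)
        out_labels.extend([label] * copies)
    return out_rows, out_labels


def weighted_log_loss(y_true, p_pred, pos_weight):
    """Binary cross-entropy where errors on positives are scaled by `pos_weight`."""
    eps = 1e-12  # keeps log() finite for probabilities of exactly 0 or 1
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        if y == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -math.log(1 - p)
    return total / len(y_true)
```

With `pos_weight=10`, predicting 0.1 for a week that did contain a shutdown is penalized ten times as heavily as predicting 0.9 for a quiet week.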

Feature engineering: three signal families

1 · Network telemetry features

These come directly from the Voidly probe network and the BGP route-collector feeds:

# Rolling windows over probe measurements
dns_anomaly_rate_7d        # fraction of DNS checks returning wrong IP
tls_interference_rate_7d   # fraction of TLS handshakes interrupted
bgp_prefix_churn_24h       # % of prefixes with routing changes in 24h
throttling_zscore_7d       # z-score of bandwidth vs 30-day baseline
probe_loss_rate_24h        # probe response failures (≠ interference)
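
For example, `throttling_zscore_7d` could be computed along these lines. This is a sketch under assumptions: the inputs are taken to be lists of bandwidth samples in Mbps, the recent window is summarized by its mean, and the baseline uses a population standard deviation. The article states only that the feature is a z-score against a 30-day baseline.

```python
import statistics


def throttling_zscore(baseline_mbps, recent_mbps):
    """Z-score of the recent mean bandwidth against the baseline distribution."""
    baseline_mean = statistics.fmean(baseline_mbps)
    baseline_std = statistics.pstdev(baseline_mbps)
    if baseline_std == 0:
        return 0.0  # flat baseline: no meaningful deviation scale
    return (statistics.fmean(recent_mbps) - baseline_mean) / baseline_std
```

A strongly negative value means recent bandwidth sits well below what the baseline period would predict.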

The probe-loss feature is subtle: a spike in probe failures can mean the network is blocking our probes (a precursor to a harder shutdown), or it can mean a probe node went down. Distinguishing those requires cross-referencing with other probes in the same country.
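
The cross-referencing step can be sketched as a simple vote across probes in the same country. The thresholds (`min_probes`, `blocking_fraction`) and the label strings are hypothetical values chosen for illustration, not Voidly's production settings.

```python
def classify_probe_loss(failed_by_probe, min_probes=3, blocking_fraction=0.6):
    """Interpret probe failures in one country over one window.

    failed_by_probe maps a probe id to True if that probe failed in the window.
    """
    total = len(failed_by_probe)
    failed = sum(1 for did_fail in failed_by_probe.values() if did_fail)
    if failed == 0:
        return "healthy"
    if total < min_probes:
        return "inconclusive"  # too few probes to tell a node outage from blocking
    if failed / total >= blocking_fraction:
        return "possible_blocking"  # many probes failing together
    return "likely_node_outage"  # isolated failure while peers still respond
```

The "inconclusive" branch reflects the coverage caveat: with one or two probes in a country, a failure cannot be attributed either way.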

2 · Political calendar features

The single strongest predictor in the model is proximity to a contested political event. Elections, constitutional referenda, and large protests account for 61% of verified full shutdowns in the training corpus, concentrated in the 0–5 day window before and during the event.

days_to_next_election       # -∞ if no election in next 30 days
election_competitiveness    # Freedom House / V-Dem contested-election score
protest_activity_index      # ACLED event count, 7-day rolling
coup_risk_percentile        # CoupCast model percentile, monthly
legislative_session_active  # binary: parliament in session
press_freedom_rank          # RSF index, annual

We source election calendars from the International IDEA dataset plus manual curation for sub-national elections in high-risk countries. ACLED event data is ingested daily for protest activity.
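
A minimal sketch of how `days_to_next_election` might be derived from a calendar of dates. The function signature and the `election_dates` argument are assumptions; the 30-day horizon and the negative-infinity sentinel follow the feature comment above.

```python
from datetime import date

HORIZON_DAYS = 30  # from the feature comment: only elections in the next 30 days count


def days_to_next_election(today, election_dates, horizon=HORIZON_DAYS):
    """Days until the nearest upcoming election, or -inf if none is within the horizon."""
    upcoming = [
        (election - today).days
        for election in election_dates
        if 0 <= (election - today).days <= horizon
    ]
    return float(min(upcoming)) if upcoming else float("-inf")
```

Past elections are ignored here; the fingerprint features below are where historical overlap with elections is captured.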

3 · Historical shutdown fingerprint

Countries that have shut down the internet before are more likely to do it again, and they often follow recognizable patterns. The fingerprint features capture this:

shutdowns_last_3y           # count of verified events in 3-year window
last_shutdown_days_ago      # recency signal
typical_shutdown_duration   # median hours for this country's events
shutdown_election_overlap   # fraction of past shutdowns with elections ±7d
service_throttle_precursor  # throttling preceded full shutdown in past
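
Four of these features can be derived from a country's event history as sketched below. The `Event` type and the `fingerprint` function are hypothetical, the 3-year window is approximated as 3 × 365 days, and `service_throttle_precursor` is omitted because it would need class labels and event ordering that this toy record does not carry.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from statistics import median


@dataclass
class Event:
    start: date
    duration_hours: float


def fingerprint(events, election_dates, today):
    """Summarize a country's verified shutdown history as of `today`."""
    past = [e for e in events if e.start <= today]
    window_start = today - timedelta(days=3 * 365)
    recent = [e for e in past if e.start >= window_start]
    near_election = [
        e for e in past
        if any(abs((e.start - d).days) <= 7 for d in election_dates)
    ]
    return {
        "shutdowns_last_3y": len(recent),
        "last_shutdown_days_ago": min(((today - e.start).days for e in past), default=None),
        "typical_shutdown_duration": median([e.duration_hours for e in past]) if past else None,
        "shutdown_election_overlap": len(near_election) / len(past) if past else 0.0,
    }
```

Countries with no verified events get `None` for the recency and duration features, which leaves the choice of imputation to the downstream model.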

Model architecture

The forecast is an ensemble of three components:

ARIMA(p,d,q) per country per metric. The time-series component, capturing trend and recurring cycles in the network telemetry features. We fit separate ARIMA models per country because the autocorrelation structure varies: Iran has tight weekly cycles; Ethiopia has multi-month election cycles; Russia has sporadic spikes without strong seasonality.
Gradient-boosted classifier (XGBoost). Takes the full feature vector (ARIMA residuals + political calendar + fingerprint) and produces a probability for each shutdown class on each forecast day. Trained on the global corpus with country-cluster embeddings to share information across structurally similar countries.
Country-specific calibration layer. Isotonic regression calibration applied per country using held-out validation sets. Countries with sparse historical data (≤3 verified events) fall back to the regional cluster calibration. Calibration is updated monthly as new events are confirmed.
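
To show what the calibration layer does, here is a small pure-Python sketch of isotonic regression via the pool-adjacent-violators algorithm, plus the sparse-country fallback rule. A production system would more likely use a library implementation; the function names here are hypothetical, and only the "≤3 verified events" fallback threshold comes from the description above.

```python
from bisect import bisect_right

SPARSE_EVENT_LIMIT = 3  # from the text: <=3 verified events falls back to the cluster


def fit_isotonic(scores, outcomes):
    """Pool-adjacent-violators fit: returns (sorted scores, non-decreasing fitted rates)."""
    pairs = sorted(zip(scores, outcomes))
    blocks = []  # each block is [sum of outcomes, count]
    for _, outcome in pairs:
        blocks.append([float(outcome), 1])
        # Merge backwards while the previous block's mean exceeds the current one.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            block_sum, block_count = blocks.pop()
            blocks[-1][0] += block_sum
            blocks[-1][1] += block_count
    fitted = []
    for block_sum, block_count in blocks:
        fitted.extend([block_sum / block_count] * block_count)
    return [score for score, _ in pairs], fitted


def calibrate(raw_score, sorted_scores, fitted):
    """Step-function lookup: use the fitted rate of the nearest score at or below raw_score."""
    index = bisect_right(sorted_scores, raw_score) - 1
    return fitted[max(index, 0)]


def choose_calibrator(n_verified_events, country_fit, cluster_fit):
    """Fall back to the regional cluster fit when the country has too few events."""
    return cluster_fit if n_verified_events <= SPARSE_EVENT_LIMIT else country_fit
```

The fitted values are observed event rates, so a raw score of 0.25 that historically corresponded to a shutdown half the time is mapped to 0.5.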

The ensemble probability is the weighted average of the three components, where weights are learned per country-class pair. Countries with rich historical data weight the calibrated XGBoost heavily; data-sparse countries weight the ARIMA signal more.
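
The weighted average can be sketched as follows. The component keys and the example weights are hypothetical; how the per-country, per-class weights are actually learned is not described here.

```python
def ensemble_probability(component_probs, weights):
    """Weighted average of component probabilities, normalized by the weight total."""
    total_weight = sum(weights[name] for name in component_probs)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    blended = sum(component_probs[name] * weights[name] for name in component_probs) / total_weight
    return min(max(blended, 0.0), 1.0)  # guard against floating-point drift outside [0, 1]


# Hypothetical weights for a data-rich country: the calibrated output dominates.
rich_weights = {"arima": 0.2, "xgboost": 0.3, "calibrated": 0.5}
example = ensemble_probability({"arima": 0.2, "xgboost": 0.6, "calibrated": 0.5}, rich_weights)
```

Normalizing by the weight total means learned weights do not have to sum to exactly one.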

Validation: how we score forecast quality

Calibration matters more than raw accuracy here — a perfectly calibrated 60% forecast for a rare event is more useful than an overconfident 95% that fires on false positives. We score on:

Brier score: Mean squared error on the probability outputs, across all country-days in the validation set. Lower is better. Our 7-day full-blackout Brier score is 0.023 globally; it rises to 0.11 for countries with ≤2 historical events.
Reliability diagram: Calibration plot of predicted vs. observed rates in 10 probability bins. Well-calibrated forecasts fall on the diagonal. We publish reliability diagrams per country-class for transparency.
Precision at 50%: When the model exceeds 50% probability, how often does a shutdown actually occur within the 7-day window? Global: 41% for full blackouts. Regionally, this ranges from 28% (data-sparse) to 67% (well-calibrated high-risk countries).
Recall at 50%: What fraction of actual shutdowns did the model flag at >50% probability in the 7 days prior? Global recall: 58%. The model misses ≈42% of events — mostly spontaneous ones with no political-calendar signal.
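
The four scores above can be computed from paired lists of predicted probabilities and 0/1 outcomes, as in this sketch. It simplifies the recall definition to a per-row comparison rather than matching each event to the 7 days before it, and the function names are illustrative.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def precision_recall_at(probs, outcomes, threshold=0.5):
    """Precision and recall when rows with probability above `threshold` are flagged."""
    flagged = [y for p, y in zip(probs, outcomes) if p > threshold]
    true_positives = sum(flagged)
    positives = sum(outcomes)
    precision = true_positives / len(flagged) if flagged else None
    recall = true_positives / positives if positives else None
    return precision, recall


def reliability_bins(probs, outcomes, n_bins=10):
    """Per-bin (mean predicted probability, observed rate, count); empty bins are skipped."""
    bins = [[0.0, 0, 0] for _ in range(n_bins)]  # [sum of probs, sum of outcomes, count]
    for p, y in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index][0] += p
        bins[index][1] += y
        bins[index][2] += 1
    return [(sp / n, sy / n, n) for sp, sy, n in bins if n]
```

Plotting the first two elements of each tuple against each other gives the reliability diagram; points on the diagonal indicate good calibration.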

We do not publish a single global accuracy number for the model. Aggregated across 200 countries, most country-days are true negatives (no shutdown), making accuracy misleadingly high. The per-country reliability scores published alongside each forecast are the honest summary.
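
A quick arithmetic check shows why. Borrowing the full-blackout base rate quoted earlier (roughly 1 country-week in 5,000) as an illustrative figure, a model that never predicts a shutdown still scores near-perfect accuracy while catching nothing. The helper name is hypothetical.

```python
def constant_negative_scores(n_units, n_events):
    """Accuracy and recall of a model that always predicts 'no shutdown'."""
    accuracy = (n_units - n_events) / n_units
    recall = 0.0 if n_events else None  # it never flags an event, so it catches none
    return accuracy, recall


accuracy, recall = constant_negative_scores(n_units=5000, n_events=1)
```

Here `accuracy` comes out at 99.98% and `recall` at zero, which is why calibration and per-country reliability are reported instead.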

Using the forecast

The 7-day forecast is available through three surfaces:

Live dashboard. voidly.ai shows the per-country forecast on the country drilldown pages, with the reliability score visible before the probability so users calibrate their interpretation.
REST API. GET /api/v1/forecast?country=IR&days=7 returns per-day probabilities for all three shutdown classes plus the per-country reliability score and the top-3 contributing features.
Alert subscriptions. Researchers and journalists can subscribe to email or webhook alerts when a country's forecast crosses a configurable threshold (e.g., >40% probability of full blackout in the next 72 hours). Alert fatigue is managed by requiring two consecutive crossing-days before triggering.
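
The sketch below builds the request URL for the endpoint above and shows one reading of the two-consecutive-days alert rule. The host in `BASE_URL` is an assumption (the article names voidly.ai only as the dashboard), and `should_alert` is a hypothetical helper, not the production alerting logic.

```python
from urllib.parse import urlencode

BASE_URL = "https://voidly.ai"  # assumed API host; confirm against the API documentation


def forecast_url(country, days=7):
    """Build the forecast request URL for a two-letter country code."""
    query = urlencode({"country": country, "days": days})
    return f"{BASE_URL}/api/v1/forecast?{query}"


def should_alert(daily_probs, threshold, consecutive=2):
    """True once the probability exceeds the threshold on `consecutive` days in a row."""
    run_length = 0
    for prob in daily_probs:
        run_length = run_length + 1 if prob > threshold else 0
        if run_length >= consecutive:
            return True
    return False
```

A single-day spike above the threshold resets on the next quiet day and does not fire, which is the point of the debounce.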

The top-3 contributing features in the API response are the most important part for journalists. A forecast driven primarily by days_to_next_election = 2 and shutdowns_last_3y = 4 tells a different story than one driven by bgp_prefix_churn_24h — the latter suggests something is already happening, not just predicted.
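
One way a client could act on that distinction is to map the leading feature back to its signal family. The feature names are the ones listed earlier in this article; the shape of the API response (a ranked list of feature names) and the helper functions are assumptions made for illustration.

```python
FEATURE_FAMILIES = {
    "network_telemetry": {
        "dns_anomaly_rate_7d", "tls_interference_rate_7d", "bgp_prefix_churn_24h",
        "throttling_zscore_7d", "probe_loss_rate_24h",
    },
    "political_calendar": {
        "days_to_next_election", "election_competitiveness", "protest_activity_index",
        "coup_risk_percentile", "legislative_session_active", "press_freedom_rank",
    },
    "historical_fingerprint": {
        "shutdowns_last_3y", "last_shutdown_days_ago", "typical_shutdown_duration",
        "shutdown_election_overlap", "service_throttle_precursor",
    },
}


def feature_family(name):
    """Return the signal family a feature belongs to, or 'unknown'."""
    for family, members in FEATURE_FAMILIES.items():
        if name in members:
            return family
    return "unknown"


def leading_signal(ranked_features):
    """Describe the forecast by the family of its top-ranked contributing feature."""
    if not ranked_features:
        return "unknown"
    family = feature_family(ranked_features[0])
    if family == "network_telemetry":
        return "already_observable"  # something is showing up in measurements now
    if family in ("political_calendar", "historical_fingerprint"):
        return "anticipated"  # driven by context and history rather than live signal
    return "unknown"
```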

Limitations and what we don't claim

We can't predict all shutdowns. Events triggered by genuine emergencies (infrastructure failures, unplanned security incidents) have no political-calendar signal, and the model misses most of these. The network telemetry can catch them in real time; the forecast cannot catch them ahead of time.
The model is not neutral about probe coverage. Countries with sparse Voidly probe presence have weaker telemetry features. We surface this as a data-quality warning on sparse-coverage forecasts; we don't pretend the probability is equally meaningful for a country with 1 probe as for one with 12.
Political-event lead time is uneven. Election calendars are known weeks in advance. Protest emergence is not. When a protest appears suddenly and escalates within 48 hours, the forecast has no advance signal, and the real-time measurement is the product to watch, not the 7-day outlook.
Forecast quality varies by country. Countries where we have 10+ verified events and consistent political cycles (Iran, Russia, Ethiopia, Myanmar) have substantially better-calibrated forecasts than countries where we have 1–2 events. Per-country reliability scores are visible on every forecast page for this reason.

Access the forecast

All forecast outputs are part of the Voidly measurement dataset, licensed CC BY 4.0. The underlying training data is downloadable from huggingface.co/emperor-mew (global-censorship-index dataset). See the Voidly page for the full access surface list, and methodology for the verification standards that label the training data.

For how BGP prefix withdrawal patterns and IODA data feed into the model: BGP routing signals and internet shutdown detection: how Voidly uses IODA data →

Previous in this series: Cross-source censorship verification: reconciling OONI, CensoredPlanet, and IODA →