The Voidly Anomaly Classifier: Five Interference Classes, Gradient Boosted Trees, and Why We Optimize for Recall
Every Voidly probe run produces four raw measurements per domain: a DNS lookup, a TLS handshake attempt, an HTTP fetch, and a BGP reachability check. Across 37+ nodes and an 80-domain test list, that's roughly 11,500 raw measurements every 5 minutes. Most of them are clean. Some are noisy for entirely legitimate reasons — CDN timeouts, transient routing issues, misconfigured servers. A small fraction represent actual interference.
The anomaly classifier's job is to separate interference from noise, label it with the right type, and produce a confidence score that the cross-source reconciler can use to decide whether to promote an observation to the verified-incident tier. This post covers how that classifier works.
Why five classes instead of binary blocked/not-blocked
The naive approach, binary classification of “this domain is blocked” vs. “this domain is accessible,” loses too much information. The type of interference matters for two reasons:
- Different actors use different techniques. DNS-level blocks are typically ISP-implemented (cheap to deploy at scale). TLS interference (SNI-based reset) requires DPI hardware and is more expensive — a signal that the block is deliberately targeted. BGP withdrawal is a national-scale event. Knowing the type narrows down who is responsible.
- Some interference types coexist. A target might be blocked at the DNS layer and also have its TLS traffic reset (belt and suspenders). A binary model would label this as one event; the multi-class model surfaces both, which matters for cross-source corroboration (OONI might catch the DNS block while CensoredPlanet catches the TLS reset).
The five classes the classifier distinguishes:
- DNS tampering. The resolver returns an IP that doesn't belong to the domain's known ASN, returns NXDOMAIN for a domain that exists, times out selectively (other queries to the same resolver succeed), or returns a redirect to an ISP landing page.
- TLS interference. The TCP handshake completes but the TLS handshake doesn't: reset after ClientHello, alert on SNI extension, or substituted certificate with a mismatched CN or unexpected issuer chain.
- HTTP blocking. The connection succeeds but the response is a block page (fingerprinted against a corpus of 800+ known block-page signatures), a 451 Unavailable For Legal Reasons, or a transparent redirect to a government notice.
- BGP withdrawal. The origin AS for the domain's IP prefix is no longer reachable from the probe's vantage point — routing to the destination has been severed at the infrastructure level.
- Throttling. All protocol-level checks pass but the measured bandwidth to the target is more than 3 standard deviations below the probe's per-ISP bandwidth baseline for that time window.
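The throttling rule above reduces to a z-score test against the probe's per-ISP baseline. The following is a minimal sketch of that test, assuming the baseline is available as a list of recent bandwidth samples. The function name, the sample values, and the choice of sample standard deviation are illustrative assumptions, not Voidly's implementation.

```python
from statistics import mean, stdev

def is_throttled(measured_bw: float, baseline_samples: list[float],
                 z_threshold: float = -3.0) -> bool:
    """True when measured bandwidth is more than 3 standard deviations below baseline."""
    if len(baseline_samples) < 2:
        return False  # not enough history to estimate spread
    mu = mean(baseline_samples)
    sigma = stdev(baseline_samples)
    if sigma == 0:
        return False  # assumption: a flat baseline gives no usable z-score
    z = (measured_bw - mu) / sigma
    return z < z_threshold

# Hypothetical baseline in Mbit/s (mean 50, sample standard deviation 2)
baseline = [48.0, 52.0, 50.0, 49.0, 51.0, 50.0, 47.0, 53.0]
print(is_throttled(12.0, baseline))  # z = -19
print(is_throttled(46.0, baseline))  # z = -2
```

With this baseline the first measurement is flagged and the second is not, since only the first falls below the 3-sigma cutoff.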
Feature engineering
Each class has its own feature set. Features are extracted from the raw probe measurement and joined with per-ISP and per-country baseline windows computed over the trailing 7 days.
# DNS features
dns_features = {
    'ip_in_expected_asn': check_ip_asn(returned_ip, domain_known_asns),
    'is_nxdomain': response_code == 'NXDOMAIN',
    'is_refused': response_code == 'REFUSED',
    'response_time_z': (response_ms - baseline_dns_ms) / baseline_dns_std,
    'ttl_anomaly': abs(returned_ttl - expected_ttl) > 3600,
    'ip_matches_sinkhole': returned_ip in KNOWN_SINKHOLES,
    'redirect_to_block_page': is_known_block_ip(returned_ip),
}
# TLS features
tls_features = {
    'handshake_completed': tls_ok,
    'cert_hash_expected': cert_hash in known_cert_hashes(domain),
    'cert_issuer_trusted': issuer in TRUSTED_CA_ROOTS,
    'sni_alert_type': tls_alert_code,  # e.g., 112 = unrecognized_name
    'reset_after_client_hello': tcp_reset_at_stage == 'CLIENT_HELLO',
    'handshake_time_z': (handshake_ms - baseline_tls_ms) / baseline_tls_std,
}
# HTTP features
http_features = {
    'status_code': response_code,
    'is_451': response_code == 451,
    'body_fingerprint_match': max_similarity(body, BLOCK_PAGE_CORPUS),
    'redirect_count': len(redirect_chain),
    'final_url_domain': extract_domain(final_url),
    'content_type_mismatch': expected_content_type != actual_content_type,
}
# BGP features
bgp_features = {
    'origin_as_reachable': as_path_exists(target_prefix),
    'path_length_delta': current_path_len - baseline_path_len,
    'unique_collectors_visible': sum(c.sees_prefix for c in ROUTE_COLLECTORS),
}
# Throttling features
throttle_features = {
    'bandwidth_z': (measured_bw - baseline_bw) / baseline_bw_std,
    'latency_z': (measured_latency - baseline_latency) / baseline_latency_std,
    'other_domains_ok': check_neighboring_domains(probe_id, timestamp),
}
Five per-class binary classifiers
Rather than a single multi-class model, we train five independent binary classifiers — one per interference type. This allows interference types to coexist (a domain can be DNS-tampered and TLS-intercepted) and lets us tune the threshold for each class independently.
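Because the five classifiers are independent, one measurement can carry several labels at once. The sketch below illustrates that with plain dictionaries. The threshold values are hypothetical placeholders, not Voidly's tuned operating points, and `active_classes` is an illustrative name.

```python
# Hypothetical per-class decision thresholds (illustrative values only)
THRESHOLDS = {
    "dns_tamper": 0.35,
    "tls_interfere": 0.40,
    "http_block": 0.50,
    "bgp_withdraw": 0.45,
    "throttle": 0.30,
}

def active_classes(probabilities: dict[str, float]) -> list[str]:
    """Return every interference class whose probability meets its own threshold."""
    return sorted(
        name for name, p in probabilities.items()
        if p >= THRESHOLDS[name]
    )

probs = {
    "dns_tamper": 0.91,
    "tls_interfere": 0.77,
    "http_block": 0.10,
    "bgp_withdraw": 0.02,
    "throttle": 0.05,
}
print(active_classes(probs))  # ['dns_tamper', 'tls_interfere']
```

A single multi-class model with a softmax output would have to pick one winner here; independent binary outputs keep both labels.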
Each classifier is a gradient boosted tree ensemble (XGBoost, 100 estimators, max depth 5). Training data comes from labeled historical measurements: confirmed interference events from the OONI corpus and CensoredPlanet, augmented with probe measurements from countries where a block was subsequently confirmed by official government statements.
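import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split

def train_classifier(class_name: str, features: pd.DataFrame, labels: pd.Series):
    """Train one binary classifier for a single interference class."""
    # Assumption: hold out a stratified 20% validation split for the eval metric
    X_train, X_val, y_train, y_val = train_test_split(
        features, labels, test_size=0.2, stratify=labels, random_state=42
    )
    # Class counts drive the imbalance weight below
    pos_count = int((y_train == 1).sum())
    neg_count = int((y_train == 0).sum())
    model = xgb.XGBClassifier(
        n_estimators=100,
        max_depth=5,
        learning_rate=0.1,
        scale_pos_weight=neg_count / pos_count,  # handle class imbalance
        eval_metric='aucpr',  # area under precision-recall curve
        random_state=42,
    )
    model.fit(
        X_train, y_train,
        eval_set=[(X_val, y_val)],
        verbose=False,
    )
    return model

# One model per class. Each *_features argument here is a DataFrame with one
# row per measurement, built from the per-measurement feature dicts above.
classifiers = {
    'dns_tamper': train_classifier('dns_tamper', dns_features, dns_labels),
    'tls_interfere': train_classifier('tls_interfere', tls_features, tls_labels),
    'http_block': train_classifier('http_block', http_features, http_labels),
    'bgp_withdraw': train_classifier('bgp_withdraw', bgp_features, bgp_labels),
    'throttle': train_classifier('throttle', throttle_features, throttle_labels),
}
Why we optimize for recall over precision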
The classifiers are tuned to maximize recall (true positive rate) rather than precision. Current operating points:
- DNS tampering: 96% recall, 74% precision
- TLS interference: 94% recall, 81% precision
- HTTP blocking: 91% recall, 88% precision (most precise class — block pages are distinctive)
- BGP withdrawal: 98% recall, 85% precision
- Throttling: 89% recall, 51% precision (hardest class — most false positives)
The reason for this tradeoff: false negatives (missed censorship events) are worse than false positives in our pipeline. A missed event never gets investigated. A false positive gets surfaced as “Observed” and is filtered by the cross-source reconciler — it only reaches “Verified incident” status if OONI, CensoredPlanet, or IODA independently flags the same target in the same time window. A spurious classifier output that no other source corroborates stays at “Observed” indefinitely.
This means the cross-source verification layer is not just a quality gate — it's a design dependency. The classifier is deliberately imprecise, trusting the reconciler to wash out the noise.
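One common way to arrive at recall-first operating points like those listed above is to pick, per class, the highest threshold that still meets a recall target on validation data, then read off the precision that results. The sketch below shows that general technique on toy data. It is not a description of how Voidly tunes its thresholds, and the function name and data are assumptions.

```python
def threshold_for_recall(scores, labels, target_recall):
    """Highest score threshold whose recall is >= target_recall.

    A measurement is predicted positive when score >= threshold.
    Returns (threshold, recall, precision).
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Try each distinct score as a threshold, highest first
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        recall = tp / total_pos
        if recall >= target_recall:
            return t, recall, tp / (tp + fp)
    raise ValueError("target recall unreachable")

# Toy validation set: classifier scores and ground-truth labels
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

print(threshold_for_recall(scores, labels, 0.75))  # (0.7, 0.75, 0.75)
print(threshold_for_recall(scores, labels, 1.0))   # threshold 0.4, precision 4/6
```

Raising the recall target from 0.75 to 1.0 pushes the threshold down and the precision from 0.75 to about 0.67, which is the same tradeoff the operating points above reflect.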
Confidence scoring
Each classifier outputs a probability between 0 and 1. These are combined into a single measurement-level confidence score:
def compute_confidence(probabilities: dict[str, float]) -> float:
    """
    Combine per-class probabilities into a single confidence score.
    Returns 0.0 (no interference detected) to 1.0 (high-confidence verified block).
    """
    CLASS_WEIGHTS = {
        'dns_tamper': 0.25,
        'tls_interfere': 0.25,
        'http_block': 0.30,    # slightly higher: block pages are definitive
        'bgp_withdraw': 0.15,  # lower: BGP events are coarse-grained
        'throttle': 0.05,      # lowest: high false-positive rate
    }
    raw = sum(probabilities[c] * w for c, w in CLASS_WEIGHTS.items())
    # Boost if multiple classes fire simultaneously
    classes_above_threshold = sum(p > 0.5 for p in probabilities.values())
    if classes_above_threshold >= 2:
        raw = min(1.0, raw * 1.2)
    return round(raw, 3)
Measurements with confidence ≥ 0.40 are surfaced as “Observed” and enter the cross-source reconciliation queue. Measurements with confidence ≥ 0.75 that are corroborated by at least one external source are promoted to “Corroborated.” Reaching “Verified incident” additionally requires a sustained pattern across multiple measurement windows.
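The tier rules above can be sketched as a small decision function. The two thresholds come from the text. The tier identifiers, the function name, and the boolean `sustained` flag (standing in for "a sustained pattern across multiple measurement windows", which the text does not quantify) are assumptions for illustration.

```python
OBSERVED_MIN = 0.40      # from the text
CORROBORATED_MIN = 0.75  # from the text

def assign_tier(confidence: float, external_sources: int, sustained: bool) -> str:
    """Map a measurement's confidence and corroboration state to a tier name."""
    if confidence < OBSERVED_MIN:
        return "none"
    if confidence >= CORROBORATED_MIN and external_sources >= 1:
        # A verified incident additionally needs a sustained pattern
        return "verified_incident" if sustained else "corroborated"
    return "observed"

print(assign_tier(0.55, external_sources=1, sustained=True))   # observed
print(assign_tier(0.80, external_sources=1, sustained=False))  # corroborated
print(assign_tier(0.80, external_sources=1, sustained=True))   # verified_incident
```

Note that a high-confidence measurement with no external corroboration stays at "observed" no matter how long the pattern persists.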
Country-specific calibration
The raw probability from each classifier is well-calibrated globally but poorly calibrated for specific countries. DNS timeout rates in Pakistan (high baseline) look like interference in a model trained on German probes. To correct for this, each classifier has a per-country probability adjustment layer trained on country-specific labeled data:
# Per-country, per-class calibration using isotonic regression
from sklearn.isotonic import IsotonicRegression

# Assumes raw_probs is a DataFrame with columns: country, class_name, raw_prob, label
calibrators = {}
for (country, class_name), group in raw_probs.groupby(['country', 'class_name']):
    if country not in COUNTRIES_WITH_LABELED_DATA:
        continue
    ir = IsotonicRegression(out_of_bounds='clip')
    ir.fit(group['raw_prob'], group['label'])
    # Key by (country, class_name) so the lookup below finds the calibrator
    calibrators[(country, class_name)] = ir

def calibrated_probability(raw_prob: float, country: str, class_name: str) -> float:
    key = (country, class_name)
    if key in calibrators:
        return float(calibrators[key].predict([raw_prob])[0])
    return raw_prob  # fall back to global calibration for thin countries
Countries with fewer than 500 labeled measurements use global calibration with a conservative discount (confidence × 0.85), which effectively raises the surfacing threshold and reduces the false-positive rate in data-sparse regions.
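To make the effect of the sparse-country adjustment concrete, here is a small sketch. The 500-measurement cutoff and the 0.85 factor come from the text. Applying the factor to the composite confidence score (rather than to each per-class probability) is an assumption, as is the function name.

```python
MIN_LABELED = 500       # from the text
SPARSE_FACTOR = 0.85    # from the text

def adjust_for_sparse_country(confidence: float, labeled_count: int) -> float:
    """Scale down confidence for countries that fall back to global calibration."""
    if labeled_count < MIN_LABELED:
        return round(confidence * SPARSE_FACTOR, 3)
    return confidence

print(adjust_for_sparse_country(0.46, labeled_count=120))   # 0.391
print(adjust_for_sparse_country(0.46, labeled_count=2400))  # 0.46
```

In this example a borderline score of 0.46 drops to 0.391 in a data-sparse country, below the 0.40 cutoff for “Observed”, while the same score in a well-labeled country is left unchanged.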
Throttling: the difficult class
Throttling detection has a structurally higher false-positive rate and lower precision than the other four classes. Several factors make it harder:
- No hard signal. DNS, TLS, HTTP, and BGP blocks produce a binary failure. Throttling is a gradient — the question is “how slow is too slow?” The answer varies by ISP, time of day, and the content delivery network the target domain uses.
- Legitimate congestion looks identical. A probe on a heavily used shared connection (common in residential deployments) may see low bandwidth for reasons unrelated to censorship. We partially mitigate this by requiring that other domains on the same probe don't show similar degradation: if five domains are slow, it's congestion; if one domain is slow while others are fine, it's more likely targeted throttling.
- Baseline drift. ISPs legitimately change their capacity provisioning. A 7-day bandwidth baseline can be stale if the ISP upgraded or downgraded their link. We detect baseline drift with a Kolmogorov-Smirnov test against the 30-day trailing distribution and reset the baseline when drift is detected.
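The baseline-drift check described above can be sketched with a two-sample Kolmogorov-Smirnov test. The version below is a self-contained, pure-Python approximation that uses the standard large-sample critical value. In practice `scipy.stats.ks_2samp` computes the same statistic along with a p-value. The function names, the 0.05 significance level, and the sample data are assumptions.

```python
import math
from bisect import bisect_right

def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    for x in sa + sb:
        cdf_a = bisect_right(sa, x) / len(sa)
        cdf_b = bisect_right(sb, x) / len(sb)
        d = max(d, abs(cdf_a - cdf_b))
    return d

def baseline_drifted(recent, trailing, alpha=0.05):
    """True when the samples differ at level alpha (large-sample approximation)."""
    n, m = len(recent), len(trailing)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))  # about 1.358 for alpha = 0.05
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(recent, trailing) > critical

# Hypothetical bandwidth samples in Mbit/s
trailing_30d = [float(v) for v in range(50, 110)]  # 60 samples
recent_shifted = [float(v) for v in range(20, 40)]  # link was downgraded
recent_similar = trailing_30d[::3]                  # same distribution, thinned

print(baseline_drifted(recent_shifted, trailing_30d))  # True
print(baseline_drifted(recent_similar, trailing_30d))  # False
```

When the check returns True, the 7-day baseline would be reset rather than used to flag throttling against a stale distribution.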
As a result, throttling events surface frequently as “Observed” but reach “Verified incident” at a much lower rate than the other four classes. We don't penalize the throttling classifier for this — it's a structural property of the signal, not a model deficiency.
What the classifier output feeds
The classifier output — per-class probabilities, composite confidence score, and the dominant interference type — is attached to every measurement record sent to the collector. The cross-source reconciler then:
- Groups measurements by target domain, country, and 4-hour time window
- Checks whether OONI, CensoredPlanet, or IODA flag the same target in the same window
- Applies independence weights (two probes on the same ISP are not independent)
- Computes a composite corroboration score that, if it crosses the verified-incident threshold, publishes the event to the public dataset
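The grouping and independence steps above can be sketched as follows. The 4-hour window comes from the text. The record fields, the function names, and the simple independence rule (probes sharing an ISP count as one observation) are assumptions for illustration; the actual weighting scheme is not specified here.

```python
from collections import defaultdict
from datetime import datetime, timezone

WINDOW_HOURS = 4  # window size stated in the text

def window_start(ts: datetime) -> datetime:
    """Floor a timezone-aware timestamp to the start of its 4-hour UTC window."""
    ts = ts.astimezone(timezone.utc)
    floored_hour = ts.hour - ts.hour % WINDOW_HOURS
    return ts.replace(hour=floored_hour, minute=0, second=0, microsecond=0)

def group_measurements(measurements):
    """Group measurement dicts by (domain, country, window start)."""
    groups = defaultdict(list)
    for m in measurements:
        key = (m["domain"], m["country"], window_start(m["timestamp"]))
        groups[key].append(m)
    return dict(groups)

def independent_observations(group):
    """Assumed rule: probes on the same ISP count as a single observation."""
    return len({m["isp"] for m in group})

utc = timezone.utc
measurements = [
    {"domain": "example.org", "country": "XX", "isp": "isp-a",
     "timestamp": datetime(2024, 1, 1, 5, 30, tzinfo=utc)},
    {"domain": "example.org", "country": "XX", "isp": "isp-a",
     "timestamp": datetime(2024, 1, 1, 7, 59, tzinfo=utc)},
    {"domain": "example.org", "country": "XX", "isp": "isp-b",
     "timestamp": datetime(2024, 1, 1, 6, 0, tzinfo=utc)},
    {"domain": "example.org", "country": "XX", "isp": "isp-a",
     "timestamp": datetime(2024, 1, 1, 8, 0, tzinfo=utc)},
]

groups = group_measurements(measurements)
for key, group in groups.items():
    print(key[2].isoformat(), len(group), independent_observations(group))
```

The 04:00 window holds three measurements but only two independent observations, because two of the probes share an ISP.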
The classifier's recall bias means the reconciler's input is “noisy but complete.” The reconciler's independence weighting turns that noisy input into a high-precision output. Neither component works well without the other.
For how the probe application collects the raw measurements this classifier processes: The Voidly Probe: Tauri + boringtun network measurement at the operator's edge →
For how the OONI archive that provides labeled training data for this classifier was processed: Building the OONI historical corpus: 1.66M downloads, schema normalization, and the decisions behind the dataset →
For how the classifier output is reconciled across OONI, CensoredPlanet, and IODA: Cross-source censorship verification: reconciling OONI, CensoredPlanet, and IODA →