Voidly probe scheduling constraints: battery budgets, cellular data limits, and adaptive domain selection
A Voidly probe running on a volunteer's mobile device operates in a fundamentally different resource environment than a server-side agent. Battery is not infinite. Cellular data plans have monthly caps. Thermal throttling can suspend a measurement mid-flight. Measurement software that ignores these constraints gets uninstalled; measurement software that respects them can run continuously for years without operator intervention.
This article covers the four resource constraint systems that govern when and how much Voidly probes measure: battery-floor enforcement, thermal detection, cellular data budget accounting, and the adaptive priority queue that determines domain selection order when available budget forces a shorter-than-full measurement cycle.
Resource constraint model
The probe's scheduler evaluates four constraint checks before starting each measurement cycle. All four must pass for the cycle to proceed:
// src/scheduler/constraints.rs
use crate::config::ResolvedConfig;

#[derive(Debug, Clone)]
pub struct ResourceSnapshot {
    pub battery_pct: u8,           // 0–100
    pub is_charging: bool,
    pub is_thermal_limited: bool,  // from thermal state API
    pub network_type: NetworkType,
    pub cellular_bytes_today: u64, // bytes sent+received over the trailing 24 hours
    pub wifi_bytes_today: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub enum NetworkType { WiFi, Cellular, Ethernet, Unknown }

#[derive(Debug)]
pub enum ConstraintViolation {
    BatteryTooLow { current: u8, floor: u8 },
    ThermalThrottled,
    CellularDailyCapExceeded { used: u64, cap: u64 },
    UnknownNetwork,
}

pub fn check_constraints(
    snap: &ResourceSnapshot,
    cfg: &ResolvedConfig,
) -> Result<(), ConstraintViolation> {
    // 1. Battery floor (skip if charging: charging probes never skip on battery grounds)
    if !snap.is_charging && snap.battery_pct < cfg.battery_floor_pct {
        return Err(ConstraintViolation::BatteryTooLow {
            current: snap.battery_pct,
            floor: cfg.battery_floor_pct,
        });
    }
    // 2. Thermal throttle: always skip if the OS signals thermal pressure
    if snap.is_thermal_limited {
        return Err(ConstraintViolation::ThermalThrottled);
    }
    // 3. Cellular data cap
    if snap.network_type == NetworkType::Cellular {
        if snap.cellular_bytes_today >= cfg.daily_upload_cap_bytes {
            return Err(ConstraintViolation::CellularDailyCapExceeded {
                used: snap.cellular_bytes_today,
                cap: cfg.daily_upload_cap_bytes,
            });
        }
    }
    // 4. Unknown network: skip rather than measure on an uncharacterized link
    if snap.network_type == NetworkType::Unknown {
        return Err(ConstraintViolation::UnknownNetwork);
    }
    Ok(())
}

The thermal check is binary: if the OS reports thermal pressure (iOS's ProcessInfo.thermalState >= .serious or Android's PowerManager.getThermalHeadroom() < 0.3), the cycle is skipped entirely. Attempting to run measurements under thermal pressure would trigger the OS scheduler to throttle the probe process, producing unreliable timing measurements that would corrupt the anomaly detector's baseline.
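Because each check returns early, the order of the checks decides which reason is reported when several constraints fail at once. The sketch below restates that precedence with simplified stand-in types; `Snap`, `Skip`, and `first_violation` are hypothetical names, and the floor and cap are passed as parameters because their configured values are assumptions here. It illustrates the ordering only and is not the probe's actual module.

```rust
// Simplified stand-ins for the types in the listing above (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Network {
    WiFi,
    Cellular,
    Unknown,
}

#[derive(Debug, PartialEq)]
pub enum Skip {
    BatteryTooLow,
    ThermalThrottled,
    CellularCapExceeded,
    UnknownNetwork,
}

pub struct Snap {
    pub battery_pct: u8,
    pub is_charging: bool,
    pub is_thermal_limited: bool,
    pub network: Network,
    pub cellular_bytes_24h: u64,
}

/// Returns the first failing constraint, checked in the same order as the
/// listing above, or `None` when the cycle may proceed.
pub fn first_violation(s: &Snap, battery_floor_pct: u8, cellular_cap_bytes: u64) -> Option<Skip> {
    // Charging devices are exempt from the battery floor.
    if !s.is_charging && s.battery_pct < battery_floor_pct {
        return Some(Skip::BatteryTooLow);
    }
    // Thermal pressure blocks the cycle even while charging.
    if s.is_thermal_limited {
        return Some(Skip::ThermalThrottled);
    }
    // The byte cap applies only on cellular links.
    if s.network == Network::Cellular && s.cellular_bytes_24h >= cellular_cap_bytes {
        return Some(Skip::CellularCapExceeded);
    }
    if s.network == Network::Unknown {
        return Some(Skip::UnknownNetwork);
    }
    None
}
```

With an assumed floor of 20% and a 5 MB cap, a hot, over-cap device at 10% battery is reported as `BatteryTooLow`; the same device on a charger is reported as `ThermalThrottled`. The telemetry therefore records only the first failing constraint, not every constraint that would have failed.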
Cellular data budget accounting
The probe tracks cellular data usage in a local SQLite database (separate from the measurement database) using a per-minute bucket pattern analogous to the regulatory API's daily quota sliding window. Usage is summed over the trailing 24 hours to implement a sliding-window cap rather than a midnight-reset cap:
-- src/scheduler/budget.sql
CREATE TABLE cellular_usage (
    bucket_ts INTEGER NOT NULL,           -- unix timestamp, floored to minute
    bytes_tx  INTEGER NOT NULL DEFAULT 0,
    bytes_rx  INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (bucket_ts)
);

-- Sum usage for the trailing 24 hours.
-- Called before each cycle to determine remaining budget.
CREATE VIEW cellular_usage_24h AS
SELECT
    COALESCE(SUM(bytes_tx + bytes_rx), 0) AS total_bytes,
    MIN(bucket_ts) AS window_start
FROM cellular_usage
WHERE bucket_ts > (strftime('%s', 'now') - 86400);

// src/scheduler/budget.rs
use rusqlite::Connection;
use std::time::{SystemTime, UNIX_EPOCH};

pub fn record_cycle_usage(db: &Connection, bytes_tx: u64, bytes_rx: u64) -> rusqlite::Result<()> {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is set before the Unix epoch")
        .as_secs();
    let bucket = now / 60 * 60; // floor to minute

    // Upsert: add this cycle's bytes to the bucket for the current minute.
    db.execute(
        "INSERT INTO cellular_usage (bucket_ts, bytes_tx, bytes_rx)
         VALUES (?1, ?2, ?3)
         ON CONFLICT (bucket_ts) DO UPDATE SET
             bytes_tx = bytes_tx + excluded.bytes_tx,
             bytes_rx = bytes_rx + excluded.bytes_rx",
        rusqlite::params![bucket as i64, bytes_tx as i64, bytes_rx as i64],
    )?;

    // Prune entries older than 25 hours (1 hour grace beyond the 24h window)
    let cutoff = (now - 90_000) as i64;
    db.execute("DELETE FROM cellular_usage WHERE bucket_ts < ?1", [cutoff])?;
    Ok(())
}

pub fn cellular_used_bytes_24h(db: &Connection) -> rusqlite::Result<u64> {
    db.query_row(
        "SELECT total_bytes FROM cellular_usage_24h",
        [],
        |row| row.get::<_, i64>(0),
    ).map(|v| v as u64)
}

Measurement bytes are recorded after each cycle completes (not before), so the budget check reflects actual consumption up to the end of the previous cycle. This introduces a one-cycle lag, meaning a cycle that starts just under the cap can overshoot it by at most that cycle's own usage, but it avoids the need to predict cycle size before running it. The daily cap in the config bundle defaults to 5 MB on cellular (sufficient for ~200 web-connectivity measurements at ~25 KB each) and is disabled (set to u64::MAX) on WiFi.
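To make the sliding-window behavior concrete, the following sketch mirrors the table and view with an in-memory map instead of SQLite. `UsageWindow` and its methods are hypothetical names introduced for illustration, and timestamps are passed in explicitly so the behavior can be checked without waiting on a real clock. It is a model of the accounting rule under those assumptions, not a replacement for the database code.

```rust
use std::collections::BTreeMap;

const WINDOW_SECS: u64 = 86_400; // trailing 24 hours, as in the SQL view
const RETAIN_SECS: u64 = 90_000; // 25 hours, matching the prune cutoff

/// In-memory stand-in for the `cellular_usage` table: minute bucket -> bytes.
#[derive(Default)]
pub struct UsageWindow {
    buckets: BTreeMap<u64, u64>,
}

impl UsageWindow {
    /// Adds `bytes` (tx + rx) to the minute bucket containing `now_secs`,
    /// then drops buckets older than the retention horizon.
    pub fn record(&mut self, now_secs: u64, bytes: u64) {
        let bucket = now_secs / 60 * 60; // floor to minute
        *self.buckets.entry(bucket).or_insert(0) += bytes;
        let cutoff = now_secs.saturating_sub(RETAIN_SECS);
        self.buckets.retain(|ts, _| *ts >= cutoff);
    }

    /// Sums buckets strictly newer than `now_secs - 24h`, like the view's
    /// `bucket_ts > now - 86400` predicate.
    pub fn used_24h(&self, now_secs: u64) -> u64 {
        let start = now_secs.saturating_sub(WINDOW_SECS);
        self.buckets
            .range(start.saturating_add(1)..)
            .map(|(_, bytes)| *bytes)
            .sum()
    }
}
```

If 1 MB is recorded at time T and 2 MB at T + 23 h, a query at T + 23.5 h sees 3 MB, while a query at T + 24 h sees only the 2 MB recorded later. Usage ages out bucket by bucket instead of resetting all at once at midnight, so a probe cannot spend a full day's budget just before a reset and another full budget just after it.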
Adaptive cycle length
When the remaining cellular budget is less than a default-size cycle would consume, the scheduler shrinks the cycle to the number of domains the remaining budget can cover (subject to a minimum of five) rather than skipping the cycle entirely:
// src/scheduler/cycle_builder.rs
const BYTES_PER_MEASUREMENT_ESTIMATE: u64 = 28_000; // empirical median over 30d

pub struct CycleParams {
    pub domain_count: u16,
    pub test_types: Vec<TestType>,
}

pub fn compute_cycle_params(
    snap: &ResourceSnapshot,
    cfg: &ResolvedConfig,
    cellular_used: u64,
) -> CycleParams {
    let configured_count = cfg.domains_per_cycle;
    let domain_count = if snap.network_type == NetworkType::Cellular {
        let remaining_bytes = cfg.daily_upload_cap_bytes.saturating_sub(cellular_used);
        let budget_domains = (remaining_bytes / BYTES_PER_MEASUREMENT_ESTIMATE)
            .min(u64::from(configured_count)) as u16;
        // Never go below 5 domains (floor for statistical validity)
        budget_domains.max(5).min(configured_count)
    } else {
        configured_count
    };

    // On cellular with a reduced cycle, skip TLS measurement (2x byte cost)
    // to preserve more domains within the remaining budget.
    let test_types = if snap.network_type == NetworkType::Cellular
        && domain_count < configured_count / 2
    {
        cfg.enabled_test_types.iter()
            .filter(|t| t.as_str() != "tls_handshake")
            .cloned()
            .collect()
    } else {
        cfg.enabled_test_types.clone()
    };

    CycleParams { domain_count, test_types }
}

The 5-domain floor ensures that even a heavily budget-constrained probe continues to contribute to the measurement corpus, particularly for priority domains (news sites, election infrastructure, health resources) that appear at the top of the priority queue regardless of budget.
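The arithmetic in the cycle builder is easier to follow with concrete numbers. The sketch below isolates the domain-count calculation and the TLS-drop condition into two standalone functions; `cellular_domain_count` and `drops_tls` are hypothetical names, and the configured cycle size of 50 domains used in the note that follows is an assumed value, not a documented default.

```rust
const BYTES_PER_MEASUREMENT_ESTIMATE: u64 = 28_000; // same estimate as the listing above
const MIN_DOMAINS: u16 = 5; // the 5-domain floor

/// Number of domains a cellular cycle would measure, given the cap and the
/// bytes already used in the trailing window.
pub fn cellular_domain_count(cap_bytes: u64, used_bytes: u64, configured: u16) -> u16 {
    let remaining = cap_bytes.saturating_sub(used_bytes);
    // How many measurements the remaining bytes can pay for, by the estimate.
    let affordable =
        (remaining / BYTES_PER_MEASUREMENT_ESTIMATE).min(u64::from(configured)) as u16;
    // Apply the floor, but never exceed the configured cycle size.
    affordable.max(MIN_DOMAINS).min(configured)
}

/// True when the cycle has shrunk below half its configured size, which is
/// the condition under which the TLS test is dropped on cellular.
pub fn drops_tls(domain_count: u16, configured: u16) -> bool {
    domain_count < configured / 2
}
```

With a 5 MB cap and 50 configured domains: at 0 bytes used the full 50 domains run; at 4.4 MB used the remaining 600 KB covers 21 domains, which is below 25 and so drops the TLS test; at 4.95 MB used the remaining 50 KB covers only one measurement, and the floor raises the count to 5. In that last case the cycle is estimated at 140 KB against 50 KB remaining, so the floor deliberately trades a small overshoot for continued coverage.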
Adaptive domain priority queue
Given a cycle length shorter than the full test list, the probe must select which domains to measure. The priority queue scores each domain on three axes:
| Axis | Weight | Description |
|---|---|---|
| Staleness | 0.50 | Hours since last measurement by this probe (capped at 168h = 1 week) |
| Priority flag | 0.35 | 1.0 for config-bundle priority domains, 0.0 otherwise |
| Anomaly recency | 0.15 | 1.0 at the time this probe last detected an anomaly, decaying linearly to 0.0 at 30 days |
// src/scheduler/priority_queue.rs
use std::collections::{HashMap, HashSet};
use std::time::SystemTime;

pub fn score_domain(
    _domain: &str, // currently unused by the scoring formula
    hours_since_last_measurement: f32,
    is_priority: bool,
    days_since_anomaly: Option<f32>,
) -> f32 {
    // Each axis is normalized to 0.0..=1.0 before weighting.
    let staleness = (hours_since_last_measurement / 168.0).min(1.0);
    let priority_flag = if is_priority { 1.0_f32 } else { 0.0 };
    let anomaly_recency = days_since_anomaly
        .map(|d| (1.0 - d / 30.0).max(0.0))
        .unwrap_or(0.0);
    0.50 * staleness + 0.35 * priority_flag + 0.15 * anomaly_recency
}

pub fn select_domains(
    test_list: &[DomainEntry],
    cycle_length: u16,
    last_measured_at: &HashMap<String, SystemTime>,
    priority_domains: &HashSet<String>,
    recent_anomalies: &HashMap<String, SystemTime>, // domain -> last anomaly time
    now: SystemTime,
) -> Vec<String> {
    let mut scored: Vec<(f32, &str)> = test_list
        .iter()
        .map(|entry| {
            let hours = last_measured_at.get(&entry.domain)
                .map(|t| now.duration_since(*t).unwrap_or_default().as_secs_f32() / 3600.0)
                .unwrap_or(168.0); // never measured = max staleness
            let days_anomaly = recent_anomalies.get(&entry.domain)
                .map(|t| now.duration_since(*t).unwrap_or_default().as_secs_f32() / 86_400.0);
            let score = score_domain(
                &entry.domain,
                hours,
                priority_domains.contains(&entry.domain),
                days_anomaly,
            );
            (score, entry.domain.as_str())
        })
        .collect();

    // Sort descending by score
    scored.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap_or(std::cmp::Ordering::Equal));

    scored.into_iter()
        .take(cycle_length as usize)
        .map(|(_, d)| d.to_string())
        .collect()
}

The anomaly recency weight ensures that domains where censorship was recently detected continue to be measured frequently even after the anomaly resolves, enabling the system to capture intermittent blocking patterns. Priority domains from the config bundle always score at least 0.35, a level that a non-priority domain with no recent anomaly reaches only after about 118 hours without measurement (0.35 / 0.50 × 168 h), so priority domains appear in every cycle at default cycle lengths.
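A few worked scores show how the three weights interact. The function below repeats the weighting from the `score_domain` listing, minus the domain argument, under the hypothetical name `score`, so it can be evaluated on its own; the input values in the note that follows are illustrative rather than taken from fleet data.

```rust
/// Same weighting as the `score_domain` listing: staleness capped at one week,
/// a binary priority flag, and anomaly recency decaying linearly over 30 days.
pub fn score(hours_since_last: f32, is_priority: bool, days_since_anomaly: Option<f32>) -> f32 {
    let staleness = (hours_since_last / 168.0).min(1.0);
    let priority_flag = if is_priority { 1.0_f32 } else { 0.0 };
    let anomaly_recency = days_since_anomaly
        .map(|d| (1.0 - d / 30.0).max(0.0))
        .unwrap_or(0.0);
    0.50 * staleness + 0.35 * priority_flag + 0.15 * anomaly_recency
}
```

A priority domain measured moments ago scores 0.35. A non-priority domain last measured 84 hours ago scores 0.25. A non-priority domain that has never been measured and had an anomaly 15 days ago scores 0.50 + 0.15 × 0.5 = 0.575. A sufficiently stale non-priority domain can therefore outrank a freshly measured priority domain, and whether both make the cut depends on how many slots the cycle has.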
Scheduler telemetry
Every skipped cycle is recorded in the probe's measurement database with the constraint violation reason. This telemetry is uploaded with the regular measurement batch so that Voidly's backend can distinguish between “no anomaly detected” (a measurement was made and found nothing) and “no measurement attempted” (the probe was constrained). Conflating these two states would incorrectly inflate the denominator in the coverage statistics used to weight probe contributions in the vantage selection algorithm.
| Constraint reason | Fleet incidence (30d median) | Time-of-day pattern |
|---|---|---|
| Battery below floor | 8.3% of cycles | Peak 22:00–06:00 local |
| Thermal throttled | 2.1% of cycles | Peak 13:00–15:00 local (summer months) |
| Cellular cap exceeded | 1.4% of cycles | Uniform (day-end heavy for heavy-use probes) |
| Unknown network | 0.7% of cycles | No pattern (VPN/tethering transitions) |
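The effect of mixing up the two states can be shown with a small calculation. The sketch below assumes a hypothetical per-cycle record (`CycleRecord`) and summary type (`CoverageSummary`); the actual telemetry schema and the backend's coverage statistic are not specified in this article, so this only illustrates why skipped cycles should stay out of the denominator.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SkipReason {
    BatteryTooLow,
    ThermalThrottled,
    CellularCapExceeded,
    UnknownNetwork,
}

/// Hypothetical per-cycle telemetry record.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CycleRecord {
    Measured { anomaly_detected: bool },
    Skipped(SkipReason),
}

#[derive(Debug, Default, PartialEq)]
pub struct CoverageSummary {
    pub measured: u32,
    pub skipped: u32,
    pub anomalous: u32,
}

impl CoverageSummary {
    pub fn from_records(records: &[CycleRecord]) -> Self {
        let mut s = CoverageSummary::default();
        for r in records {
            match *r {
                CycleRecord::Measured { anomaly_detected } => {
                    s.measured += 1;
                    if anomaly_detected {
                        s.anomalous += 1;
                    }
                }
                CycleRecord::Skipped(_) => s.skipped += 1,
            }
        }
        s
    }

    /// Anomaly rate over cycles that actually measured something.
    pub fn anomaly_rate(&self) -> Option<f64> {
        if self.measured == 0 {
            None
        } else {
            Some(f64::from(self.anomalous) / f64::from(self.measured))
        }
    }

    /// The misleading variant: skipped cycles counted as "nothing found".
    pub fn conflated_rate(&self) -> Option<f64> {
        let total = self.measured + self.skipped;
        if total == 0 {
            None
        } else {
            Some(f64::from(self.anomalous) / f64::from(total))
        }
    }
}
```

For a probe with two measured cycles (one anomalous) and two skipped cycles, the rate over measured cycles is 0.5, while the conflated rate is 0.25. Recording the skip reason is what makes the first calculation possible.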
Related writing
Voidly test list management describes how the domain test list that feeds the priority queue is maintained, categorized, and updated.
Voidly measurement feature extraction covers what happens after a domain is selected for measurement: the feature extraction pipeline that converts raw probe observations into classifier input vectors.