Real Signal Benchmark
can your AI beat Real Signal's silence correctness?
What this is
A public scoring harness. Submit sealed predictions for one or more pocket × horizon combinations before the prediction window closes. Once the window closes, the harness scores the submission against the same predictions_ledger reveals that are used to score Real Signal's own forecasts. The resulting leaderboard is sorted by silence correctness first and accuracy second.
Submissions are append-only: once sealed, a prediction cannot be edited. The seal is a cryptographic guarantee that the submitting system made its claim before the reveal.
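The page does not describe how the seal is computed, so the following is only a minimal sketch of one common way to build such a commitment: hash a canonical serialization of the submission together with its seal time. The function names `seal_digest` and `verify_seal`, the `sealed_at` field, and the choice of SHA-256 over canonical JSON are all assumptions for illustration, not Real Signal's actual scheme.

```python
import hashlib
import json


def seal_digest(submission: dict, sealed_at: str) -> str:
    """Return a SHA-256 hex digest committing to a submission and its seal time.

    Hypothetical scheme: the real harness may seal submissions differently.
    """
    # Canonical JSON (sorted keys, fixed separators) so that the same content
    # always serializes to the same bytes, regardless of key order.
    canonical = json.dumps(
        {"sealed_at": sealed_at, "submission": submission},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_seal(submission: dict, sealed_at: str, digest: str) -> bool:
    """Check that a submission still matches the digest published at seal time."""
    return seal_digest(submission, sealed_at) == digest
```

Under this sketch, publishing the digest at submission time and the full submission at reveal time lets anyone recompute the hash and confirm that nothing was edited in between.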
Methodology
- Accuracy: % of submitter predictions whose predicted_state matches the actual primary_state AND whose predicted_calm_probability is within ±0.2 of the observed calm.
- Silence correctness: for predictions calling silence/quiet, % where the reveal confirmed the pocket actually stayed quiet. Same definition as the silence-correctness-loop that scores Real Signal's own silence decisions.
- Calibration error: Brier-style mean squared error between submitter confidence and the binary hit outcome. Lower is better.
- Error taxonomy: miss categories using the vocabulary from error-taxonomy.js (weather_miss, footfall_miss, timing_miss, freshness_miss, threshold_too_loose, saturation_misread, wrong_pocket_inheritance).
- Gap vs Real Signal: submitter accuracy minus Real Signal accuracy on the same matched-window subset of predictions_ledger rows.
- Gap vs naive: submitter accuracy minus the naive baseline (time-of-day + weather rule, defined in baseline-predictor.js).
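The definitions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the harness's own scoring code: the `ScoredPrediction` record, the `observed_calm` field name, and the `SILENT_STATES` set (which states count as "silence/quiet") are hypothetical, as is the "busy" state used in the usage example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumption: which predicted/actual states count as "silence/quiet".
SILENT_STATES = {"calm", "quiet"}
CALM_TOLERANCE = 0.2  # the ±0.2 band from the accuracy definition


@dataclass
class ScoredPrediction:
    """One prediction joined with its reveal (hypothetical record shape)."""
    predicted_state: str
    predicted_calm_probability: float
    confidence: float
    primary_state: str    # actual state from the reveal
    observed_calm: float  # observed calm value from the reveal


def is_hit(p: ScoredPrediction) -> bool:
    """A hit needs the right state AND a calm probability within tolerance."""
    return (
        p.predicted_state == p.primary_state
        and abs(p.predicted_calm_probability - p.observed_calm) <= CALM_TOLERANCE
    )


def accuracy(rows: Sequence[ScoredPrediction]) -> Optional[float]:
    """Percentage of predictions that are hits; None if there are no rows."""
    if not rows:
        return None
    return 100.0 * sum(is_hit(r) for r in rows) / len(rows)


def silence_correctness(rows: Sequence[ScoredPrediction]) -> Optional[float]:
    """Of the predictions calling silence, the percentage the reveal confirmed."""
    silent = [r for r in rows if r.predicted_state in SILENT_STATES]
    if not silent:
        return None
    confirmed = sum(r.primary_state in SILENT_STATES for r in silent)
    return 100.0 * confirmed / len(silent)


def calibration_error(rows: Sequence[ScoredPrediction]) -> Optional[float]:
    """Brier-style mean squared error between confidence and the 0/1 hit outcome."""
    if not rows:
        return None
    return sum((r.confidence - float(is_hit(r))) ** 2 for r in rows) / len(rows)


def accuracy_gap(
    submitter: Sequence[ScoredPrediction],
    reference: Sequence[ScoredPrediction],
) -> Optional[float]:
    """Submitter accuracy minus reference accuracy (Real Signal or naive baseline).

    Both sequences are assumed to cover the same matched-window subset.
    """
    a, b = accuracy(submitter), accuracy(reference)
    if a is None or b is None:
        return None
    return a - b
```

A short usage example with made-up rows:

```python
rows = [
    ScoredPrediction("calm", 0.7, 0.8, "calm", 0.8),  # hit, silence confirmed
    ScoredPrediction("calm", 0.9, 0.6, "busy", 0.1),  # miss, silence not confirmed
    ScoredPrediction("busy", 0.2, 0.5, "busy", 0.3),  # hit, not a silence call
]
print(accuracy(rows), silence_correctness(rows), calibration_error(rows))
```

Note that a prediction can confirm silence yet still count as an accuracy miss when its calm probability falls outside the ±0.2 band, which is why the two metrics are reported separately.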
Submit predictions
curl -X POST https://real-signal.ai/api/benchmark/submit \
  -H "Content-Type: application/json" \
  -d '{
    "submitter_name": "your-system-name",
    "contact_email": "(optional)",
    "system_description": "(optional, ≤2000 chars)",
    "prediction_window_start": "2026-06-07T00:00:00Z",
    "prediction_window_end": "2026-06-08T00:00:00Z",
    "predictions": [
      {
        "pocket_id": "cluny",
        "horizon_minutes": 60,
        "predicted_state": "calm",
        "predicted_calm_probability": 0.7,
        "confidence": 0.8
      }
    ]
  }'
Predictions are sealed at submission time. Scoring runs hourly on submissions whose window has closed. The leaderboard is at https://real-signal.ai/benchmark; the JSON feed is at https://real-signal.ai/api/benchmark/scores.
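For systems that submit programmatically, the same request can be sketched in Python with only the standard library. The endpoint URLs and field names come from this page; the helper names, the local sanity checks (probabilities in [0, 1], window start before window end), and the handling of the response body as plain text are assumptions, since the page does not document server-side validation or the response format.

```python
import json
import urllib.request
from datetime import datetime
from typing import Optional

SUBMIT_URL = "https://real-signal.ai/api/benchmark/submit"
SCORES_URL = "https://real-signal.ai/api/benchmark/scores"


def parse_utc(ts: str) -> datetime:
    """Parse timestamps like 2026-06-07T00:00:00Z into timezone-aware datetimes."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def build_submission(name: str, window_start: str, window_end: str,
                     predictions: list, description: Optional[str] = None) -> dict:
    """Assemble the JSON body, with local sanity checks (assumed, not official)."""
    if parse_utc(window_start) >= parse_utc(window_end):
        raise ValueError("prediction_window_start must precede prediction_window_end")
    if description is not None and len(description) > 2000:
        raise ValueError("system_description must be at most 2000 characters")
    for p in predictions:
        for key in ("predicted_calm_probability", "confidence"):
            if not 0.0 <= p[key] <= 1.0:
                raise ValueError(f"{key} must be between 0 and 1")
    payload = {
        "submitter_name": name,
        "prediction_window_start": window_start,
        "prediction_window_end": window_end,
        "predictions": predictions,
    }
    if description is not None:
        payload["system_description"] = description
    return payload


def build_request(payload: dict) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(payload: dict, timeout: float = 10.0) -> str:
    """Send the submission and return the raw response body as text."""
    with urllib.request.urlopen(build_request(payload), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def fetch_scores(timeout: float = 10.0):
    """Fetch and decode the leaderboard JSON feed."""
    with urllib.request.urlopen(SCORES_URL, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Because predictions are sealed at submission time and cannot be edited afterwards, validating the payload locally with `build_submission` before calling `submit` avoids sealing a malformed prediction.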
Leaderboard
no scored submissions yet. submit one to start the leaderboard.