This page lists significant changes to the scoring methodology, data pipeline, and corpus coverage. Minor bug fixes and infrastructure updates are not listed. Earlier entries are approximate.
NT 10-K and NT 10-Q filings (late-filing notifications) are now detected and flagged separately from scored pairs. Companies with active NT flags display a warning badge on the company page and trigger a dedicated alert email. 1,444 companies in the current corpus carry at least one historical NT flag. NT signals are not included in the drift score — they are a separate binary indicator.
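As a rough illustration of keeping NT notices apart from scored filings, the sketch below splits a company's filings by EDGAR form type and derives a binary flag. The `Filing` record and both function names are hypothetical, and amended notices (for example `NT 10-K/A`) are deliberately left out because the entry above does not say how they are handled.

```python
from dataclasses import dataclass

# EDGAR form types for late-filing notifications named in the entry above.
NT_FORMS = {"NT 10-K", "NT 10-Q"}


@dataclass
class Filing:
    # Hypothetical record shape, used only for this illustration.
    cik: str
    form_type: str
    filed: str  # ISO date string


def split_nt_filings(filings):
    """Return (nt_filings, scorable_filings) so NT notices never reach scoring."""
    nt, scorable = [], []
    for filing in filings:
        if filing.form_type.strip().upper() in NT_FORMS:
            nt.append(filing)
        else:
            scorable.append(filing)
    return nt, scorable


def has_nt_flag(filings):
    """Binary indicator: True if at least one NT filing is present."""
    nt, _ = split_nt_filings(filings)
    return bool(nt)
```

Keeping the flag as a separate boolean, rather than folding it into a numeric score, mirrors the statement that NT signals are a separate binary indicator.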
Quarterly 10-Q filings are now ingested and scored using year-over-year pairing (Q1 this year vs. Q1 last year, etc.) rather than sequential quarter-to-quarter comparison. Sequential pairing produced near-zero signal during sustained distress periods because adjacent quarters look similar; YoY pairing captures the cumulative drift. A separate 10-Q ceiling is computed from the quarterly corpus using the same 95th-percentile method as the 10-K ceiling. 10-Q scores are shown as orange triangle overlays on the company chart and are included in API responses and alert emails.
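The year-over-year pairing described above can be sketched as follows. This is a minimal illustration, assuming each quarterly filing is already labeled with a fiscal year and quarter; the tuple layout and the function name are hypothetical.

```python
def pair_year_over_year(filings):
    """Pair each quarterly filing with the same quarter of the prior year.

    filings: iterable of (fiscal_year, quarter, text) for one company.
    Returns a list of ((prev_year, quarter), (year, quarter)) key pairs.
    Quarters with no filing one year earlier are skipped, not paired
    with the nearest available quarter.
    """
    by_key = {(year, quarter): text for year, quarter, text in filings}
    pairs = []
    for year, quarter in sorted(by_key):
        prior = (year - 1, quarter)
        if prior in by_key:
            pairs.append((prior, (year, quarter)))
    return pairs
```

For example, with filings for Q1 2022, Q2 2022, Q1 2023, and Q3 2023, only Q1 2022 and Q1 2023 form a pair; Q3 2023 has no Q3 2022 counterpart and is left unscored in this sketch.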
Coverage expanded from the original labeled corpus (~40 companies) through two phases of EDGAR backfill to the current 4,900+ company universe. The control ceiling — the 95th-percentile score across all non-crisis companies in the same year — is now computed from this larger corpus and is more stable than the early labeled-corpus ceiling. All backtest statistics are computed on the full corpus.
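A per-year control ceiling of this kind can be computed roughly as shown below. The row layout and function names are hypothetical, and the linear-interpolation percentile is an assumption; the entry above does not state which percentile convention is used.

```python
from collections import defaultdict


def percentile(values, q):
    """Percentile with linear interpolation between ranks; q is in [0, 100]."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def control_ceilings(rows, q=95.0):
    """Compute one ceiling per filing year from non-crisis companies only.

    rows: iterable of (year, score, is_crisis).
    Returns {year: q-th percentile of control scores in that year}.
    """
    by_year = defaultdict(list)
    for year, score, is_crisis in rows:
        if not is_crisis:
            by_year[year].append(score)
    return {year: percentile(scores, q) for year, scores in by_year.items()}
```

With more control companies per year, the 95th percentile depends less on any single outlier, which is consistent with the larger corpus giving a more stable ceiling.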
The final score is now a weighted composite of two components: phrase-frequency escalation (IDF-weighted counts of specific risk vocabulary) and semantic drift (cosine distance between sentence embeddings of consecutive risk factor sections). The semantic component detects meaning-level changes that phrase counting misses (paraphrasing, restructured disclosures, buried hedging), and phrase counting in turn catches changes that the semantic component misses. The weighting between components is fixed and was calibrated on the labeled crisis corpus.
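The sketch below shows one way such a composite could be assembled. Several details are assumptions made for illustration: sentence embeddings are mean-pooled into a single vector per section, both components are taken to be on comparable scales, and the `phrase_weight` default of 0.5 is a placeholder, since the calibrated weighting is not given here. All function names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Average a non-empty list of equal-length embedding vectors."""
    if not vectors:
        raise ValueError("mean_vector() needs at least one vector")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(u, v):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def semantic_drift(prev_sentence_embeddings, curr_sentence_embeddings):
    """Distance between mean-pooled embeddings of two consecutive sections."""
    return cosine_distance(
        mean_vector(prev_sentence_embeddings),
        mean_vector(curr_sentence_embeddings),
    )


def composite_score(phrase_score, semantic_score, phrase_weight=0.5):
    """Fixed-weight blend; phrase_weight is a placeholder, not the real value."""
    return phrase_weight * phrase_score + (1.0 - phrase_weight) * semantic_score
```

A fixed weight keeps scores comparable across companies and years, at the cost of not adapting to sectors where one component is more informative than the other.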
A company's phrase escalation is weighted by how rare each phrase is across the whole corpus (IDF) and downweighted if the phrase escalated across the whole corpus in that filing year. This removes the confound where risk language rises corpus-wide during macro stress periods (COVID, 2008, regional banking crisis) and would otherwise inflate individual company scores. The normalization is corpus-wide, not by SIC sector (per-sector normalization was tested and reduced both the portfolio alpha and the distress recall). The control ceiling is the 95th percentile of control-company scores.
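To make the two weighting steps concrete, here is a minimal sketch. The plain `log(N / df)` form of IDF, the choice to count only increases, and the `1 / (1 + corpus_delta)` damping factor are all assumptions made for illustration; the entry above says only that phrases are IDF-weighted and downweighted when they escalated corpus-wide in that filing year. The function and parameter names are hypothetical.

```python
import math


def idf(doc_freq, n_docs):
    """Inverse document frequency: rarer phrases get a larger weight."""
    if doc_freq <= 0 or n_docs <= 0:
        raise ValueError("doc_freq and n_docs must be positive")
    return math.log(n_docs / doc_freq)


def phrase_escalation_score(prev_counts, curr_counts, doc_freq, n_docs, corpus_delta):
    """Sum IDF-weighted phrase increases, damped by corpus-wide escalation.

    prev_counts, curr_counts: {phrase: count} for consecutive filings.
    doc_freq: {phrase: number of corpus filings containing the phrase}.
    corpus_delta: {phrase: mean increase across the corpus in that filing year}.
    """
    score = 0.0
    for phrase, curr in curr_counts.items():
        delta = curr - prev_counts.get(phrase, 0)
        if delta <= 0:
            continue  # assumption: only increases count as escalation
        weight = idf(doc_freq[phrase], n_docs)
        # Assumed damping form: shrink the weight when the whole corpus moved.
        damping = 1.0 / (1.0 + max(0.0, corpus_delta.get(phrase, 0.0)))
        score += delta * weight * damping
    return score
```

In this sketch a phrase that rose by one mention per filing on average across the corpus contributes half as much as it would in a quiet year, so a macro-driven rise in risk language is less likely to be read as company-specific distress.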
FilingDrift launched with a phrase-escalation model applied to annual 10-K filings from EDGAR. Risk factor sections are tokenized, and a curated vocabulary of distress-associated phrases (going concern, impairment, liquidity risk, etc.) is tracked across consecutive filing pairs. Scores are IDF-weighted to reduce the influence of phrases that appear across the entire corpus. Initial labeled corpus: 11 crisis companies, 29 controls.
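Phrase tracking across a consecutive filing pair can be illustrated with the short sketch below. The vocabulary holds only the three phrases named above as examples (the full curated list is not reproduced here), the tokenizer is a simple lowercase alphanumeric split chosen for illustration, and the function names are hypothetical.

```python
import re

# Only the example phrases named in the entry; the real vocabulary is larger.
DISTRESS_PHRASES = ("going concern", "impairment", "liquidity risk")


def count_phrases(text, phrases=DISTRESS_PHRASES):
    """Count whole-phrase occurrences after lowercasing and stripping punctuation."""
    normalized = " ".join(re.findall(r"[a-z0-9]+", text.lower()))
    return {
        phrase: len(re.findall(r"\b" + re.escape(phrase) + r"\b", normalized))
        for phrase in phrases
    }


def phrase_deltas(prev_text, curr_text, phrases=DISTRESS_PHRASES):
    """Change in each phrase count from the earlier filing to the later one."""
    prev = count_phrases(prev_text, phrases)
    curr = count_phrases(curr_text, phrases)
    return {phrase: curr[phrase] - prev[phrase] for phrase in phrases}
```

Because matching is on whole tokens, a hyphenated form such as "going-concern" is counted while a plural such as "impairments" is not; a production vocabulary would need to decide how to treat such variants.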
For the full technical methodology, see the methodology page. For API documentation, see API docs.