Sort every company into quintiles by how much its distress vocabulary rises year over year; rebalance monthly with a 1-month signal lag, over the full 2000–2026 period (including the 2007–2011 crisis). The lowest-change quintile — companies whose filing language stays most stable — earns a large, persistent alpha that survives a full five-factor + quality adjustment, and even survives removing all the corpus normalization.
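The sort-and-lag mechanics can be sketched in a few lines. This is a minimal illustration, not FilingDrift's backtest code: the function names, the dictionary layout, and equal weighting within each quintile are all assumptions made for the example.

```python
from statistics import mean

def quintile_labels(signal):
    """Map each ticker to a quintile 1..5 (1 = smallest rise in distress vocabulary)."""
    # Ties are broken by ticker so the assignment is reproducible.
    ranked = sorted(signal, key=lambda t: (signal[t], t))
    n = len(ranked)
    return {t: (i * 5) // n + 1 for i, t in enumerate(ranked)}

def lagged_quintile_returns(signals_by_month, returns_by_month, months):
    """Sort on last month's signal, then average this month's returns per quintile.

    Equal weighting is an assumption of this sketch.
    """
    out = {}
    for prev, cur in zip(months, months[1:]):
        labels = quintile_labels(signals_by_month[prev])  # 1-month signal lag
        buckets = {q: [] for q in range(1, 6)}
        for ticker, q in labels.items():
            if ticker in returns_by_month[cur]:
                buckets[q].append(returns_by_month[cur][ticker])
        out[cur] = {q: mean(r) for q, r in buckets.items() if r}
    return out
```

Using the prior month's signal for the current month's sort is what keeps the ranking from seeing information that was not yet available when the position was taken.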
Q1 alpha rises as factors are added (80 → 83 → 93 bps) — so it isn't a quality proxy. This is the directed cousin of "Lazy Prices" (Cohen, Malloy & Nguyen 2020): the long/short spread (39–60 bps depending on the factor model) is comparable to their 18–45 bps range — the long side (Q1) is where the signal lives. Full validation →
Survivorship caveat. These quintile figures are computed on FilingDrift's filer universe — companies that still report — so names that have since delisted are absent, which inflates the long side. When we rebuild the universe with a corpus of 4,716 delisted companies and model each delisting as a total loss (a deliberate upper bound), the long-side alpha shrinks and the surviving edge concentrates in small-caps (micro-cap Q1–Q5 ≈ 164 bps/mo); above ~$300M market cap the ranking inverts. Read this as an in-sample research signal, not a tradeable, all-cap return. How we measured it →
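To make the total-loss assumption concrete, here is a minimal sketch of a delisting-adjusted return series. The function names and the choice to book the loss in the month after the last observed price are assumptions of this example, not a description of the rebuilt universe.

```python
def monthly_returns_with_delisting(prices, delisted):
    """Simple month-over-month returns from a price series.

    If the company delisted, append -1.0 (a total loss) for the month
    after its last price. This is the deliberate upper bound on the damage.
    """
    rets = [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]
    if delisted:
        rets.append(-1.0)
    return rets

def compound(rets):
    """Cumulative return from a list of periodic simple returns."""
    total = 1.0
    for r in rets:
        total *= 1.0 + r
    return total - 1.0
```

Under this treatment any position held through a delisting compounds to -100%, however well it performed beforehand. Real delistings often return something to holders, which is why the adjustment is an upper bound.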
Read from the high end, the same score flags deteriorating companies early — on a labeled crisis set it recalls 75% of pre-event filings, typically 1–3 years out. The table shows peak pre-event scores vs the 95th-percentile control ceiling (51.5).
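The ceiling and the recall figure are simple to compute once scores exist. The sketch below assumes the ceiling is the 95th percentile of control-company scores and that a filing counts as flagged when its score is strictly above that ceiling. The function names, the strict comparison, and the interpolation method are assumptions.

```python
from statistics import quantiles

def control_ceiling(control_scores):
    """95th percentile of control scores (the last of the 19 cut points for n=20)."""
    return quantiles(control_scores, n=20, method="inclusive")[-1]

def recall(peak_pre_event_scores, ceiling):
    """Share of labeled crisis cases whose peak pre-event score exceeds the ceiling."""
    flagged = [s for s in peak_pre_event_scores if s > ceiling]
    return len(flagged) / len(peak_pre_event_scores)
```

For example, with the published ceiling of 51.5 and four peak scores of which three exceed it, `recall` returns 0.75.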
On February 24, 2023, SVB filed its annual 10-K. The stock was at $267.83. 23 of 24 analysts had it Buy or Hold.
In that filing, SVB described in detail how forced asset sales to meet withdrawals would crystallize losses that it had previously been able to carry as unrealized. Plenty of companies mentioned unrealized losses in 2022 — rising rates hit everyone, so that language was discounted as corpus-wide. But almost no company was describing the liquidity mechanism in those terms. That corpus-wide rarity is what the score catches.
FilingDrift scored that filing 57.5 against a ceiling of 51.5. The FDIC arrived 14 days later.
SVB moved fast. Most don't. Bed Bath & Beyond was flagged ~2 years before its bankruptcy. Nikola, ~3.7 years. The median warning time: 251 days.
See the full analysis →
The distress read is evidenced by lift and recall: moderate-flagged companies reach a distress outcome about 1.2× the corpus base rate, and 75% of labeled crisis companies are flagged before the event. Flagged companies are typically near recent highs when flagged, not already collapsed: the signal leads the equity market. Methodology →
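Lift here is the outcome rate among flagged companies divided by the outcome rate across the whole corpus. A minimal sketch, with a hypothetical function name and 0/1 outcome labels assumed as the input format:

```python
def lift(flagged_outcomes, all_outcomes):
    """Ratio of the distress rate among flagged companies to the corpus base rate.

    Both arguments are lists of 0/1 labels (1 = reached a distress outcome).
    """
    flagged_rate = sum(flagged_outcomes) / len(flagged_outcomes)
    base_rate = sum(all_outcomes) / len(all_outcomes)
    return flagged_rate / base_rate
```

A lift of 1.2 means a flagged company is about 20% more likely than an arbitrary company in the corpus to reach a distress outcome. It says nothing on its own about how many distressed companies were missed, which is what recall measures.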
Every annual and quarterly SEC filing from EDGAR: risk factors, MD&A, liquidity disclosures. 10-Ks run 100–200 pages, and we ingest the full text rather than a summary. Most tools search for keywords or hand the analysis to an LLM; we score the entire document, so a language change anywhere in the filing can register.
How unusual is this company's language change this year — relative to every other company that filed? Each phrase is weighted by how rare it is across the whole corpus and downweighted if it rose corpus-wide that year. Corpus-wide language shifts don't move the needle; company-specific escalation does. (Corpus-wide, not by SIC sector — we tested per-sector and it scored worse.)
A phrase that rises across the whole corpus in a year (every company writing about interest-rate risk in 2022) gets little weight. A phrase that's rare across the corpus but escalates in one company — covenant headroom, forced asset sales — drives the score. The signal is in the unusual change, not the raw count.
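One way to picture the weighting is an IDF-style rarity weight applied to each phrase's company-specific excess change. The sketch below is an illustration of that idea only. It is not FilingDrift's formula: the function name, the log weight, the subtraction of the corpus-wide change, and the decision to count only increases are all assumptions.

```python
import math

def drift_score(company_change, corpus_change, doc_freq, n_companies):
    """Toy score: rarity-weighted sum of company-specific phrase escalation.

    company_change / corpus_change: phrase -> year-over-year change in usage rate.
    doc_freq: phrase -> number of companies whose filings use the phrase.
    """
    score = 0.0
    for phrase, delta in company_change.items():
        rarity = math.log(n_companies / doc_freq[phrase])  # rare phrases weigh more
        excess = delta - corpus_change.get(phrase, 0.0)     # strip the corpus-wide move
        score += rarity * max(excess, 0.0)                  # only escalation counts
    return score
```

In this toy version, a phrase such as "interest rate risk" that rose as much across the corpus as it did in the company contributes nothing, while a rare phrase such as "forced asset sales" that rose only in the company carries the whole score.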
Run the same filing twice, get the same score. No language model, no prompt, no randomness. SVB's filing always scores 57.5. It scored 57.5 when the FDIC arrived 14 days later and it scores 57.5 today.
Case studies and methodology notes — what the signal caught, what it missed, and how it works.
The filing was public for 14 days before the FDIC arrived. Score: 57.5 vs. ceiling 51.5. What the language said that 23 of 24 analysts didn't flag.
Read →
Score: 71.5 vs. ceiling 51.5. The Hess arbitration resolved in Chevron's favor. Venezuela operations ended. Stock hit ATH. Here's what the filing showed and what happened.
Read →
The same filing always produces the same score. No prompting, no hallucination, no context-window truncation. How deterministic scoring works and why it matters for research.
Read →
Amenity Analytics (now part of Symphony) is the closest product in this space: it tracks linguistic change in filings and transcripts. FilingDrift's corpus-wide normalization is purpose-built for the specific question "was this change unusual relative to what every other company wrote that year." Both Amenity and AlphaSense are embedded in enterprise platforms requiring sales demos; FilingDrift is self-serve from day one.
Why not just ask an LLM? LLMs read one document at a time. They have no idea what every other company filed that year, so they can't tell you whether SVB's language change was unusual across the corpus. They also give different answers on the same text every run. FilingDrift's score is computed once from the full corpus and stays the same.
Paid subscribers get an alert when a watched company files a 10-K or 10-Q above the distress threshold — usually within hours of the filing hitting EDGAR. Useful for credit monitoring, pre-deal diligence, and counterparty risk.
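The alert rule amounts to a filter over newly scored filings. This is a hypothetical sketch, not the FilingDrift API: the record fields (`ticker`, `form`, `score`), the function name, and the strict comparison against the threshold are assumptions.

```python
ALERT_FORMS = {"10-K", "10-Q"}

def filings_to_alert(filings, watchlist, threshold):
    """Return the filings that should trigger an alert.

    A filing qualifies if the company is on the watchlist, the form is a
    10-K or 10-Q, and the score is above the distress threshold.
    """
    return [
        f for f in filings
        if f["ticker"] in watchlist
        and f["form"] in ALERT_FORMS
        and f["score"] > threshold
    ]
```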
Free accounts get dashboard access and 40 labeled case studies. Alerts, watchlist expansion, and API access are on paid plans — see pricing.
Earnings calls, non-US markets, and more — see the roadmap.