SVB's 2022 annual report contained a sentence that no other bank in our corpus was writing at the time. Our system flagged it in February 2023, 14 days before the collapse. The FDIC arrived in March.
That's a striking data point. But it's one case. The question worth asking is: does this pattern hold at scale? We ran the numbers across 4923 public companies and 7069 flag events. Here's what we found.
Each year a company files a 10-K. We compare that filing's language against two things simultaneously: the company's own earlier filings and the filings of its peers.
The score is high when a company is simultaneously writing things unusual for itself and unusual for its peers. That double divergence is the signal. The score is computed from sentence embeddings and is fully deterministic: the same input always produces the same score.
This is different from keyword search (which can't distinguish industry-wide language shifts from company-specific ones) and different from LLM summarization (which can't do cross-sectional comparison across thousands of filings).
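To make the double-divergence idea concrete, here is a minimal sketch. It is not the production scoring code: the function names, the use of centroids, and the geometric-mean combination are all assumptions chosen for illustration, and the vectors are tiny stand-ins for real sentence embeddings.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def centroid(vectors):
    """Element-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def double_divergence(current, own_history, peers):
    """Hypothetical score: high only when `current` is far from BOTH
    the company's own past filings and its peers' filings."""
    self_div = cosine_distance(current, centroid(own_history))
    peer_div = cosine_distance(current, centroid(peers))
    # Assumption: a geometric mean, so one near-zero divergence
    # pulls the whole score toward zero.
    return math.sqrt(max(self_div, 0.0) * max(peer_div, 0.0))

# Toy 3-dimensional "embeddings" (made up for illustration).
own_history = [[1.0, 0.0, 0.0], [1.0, 0.1, 0.0]]
peers = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]

unusual_for_both = double_divergence([0.0, 1.0, 0.0], own_history, peers)
business_as_usual = double_divergence([1.0, 0.05, 0.0], own_history, peers)

# A shift the whole industry makes: new for the company, normal for peers.
industry_wide_shift = double_divergence(
    [0.0, 1.0, 0.0], [[1.0, 0.0, 0.0]], [[0.0, 1.0, 0.0]]
)
```

In this toy setup the industry-wide shift scores zero even though the language is new for the company, which is the distinction keyword search cannot make. No randomness is involved, so repeated runs on the same vectors return the same score.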
We ran a forward-return backtest across every flag event in the full corpus: 7069 events from 4923 companies (excluding the 2007–2011 macro crisis era, which would inflate the numbers for any distress signal).
| Horizon | Median alpha vs. S&P 500 | % events with negative alpha |
|---|---|---|
| 1 year | -8.6% | 58% |
| 2 years | -14.8% | 61% |
| 3 years | -22.4% | 63% |
n=7069 flag events, 4923 companies. Excluding 2007–2011 macro crisis era. Alpha = company return minus SPY return over the same period.
To be clear about what this means: across 7069 flag events, companies that crossed the distress ceiling underperformed SPY by a median of 8.6 percentage points at 1 year. 58% of those events had negative alpha, versus the roughly 50% you'd expect from random flagging. That's a real directional signal in a noisy market, not a perfect predictor.
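For readers who want to reproduce the two summary statistics on their own data, here is a small sketch of the arithmetic. The alpha definition comes from the table note above; the function names and the return figures are made up for illustration and are not drawn from the backtest.

```python
from statistics import median

def alpha(company_return, spy_return):
    """Alpha as defined above: company return minus SPY return
    over the same period."""
    return company_return - spy_return

def summarize(events):
    """Return (median alpha, share of events with negative alpha).

    `events` is a list of (company_return, spy_return) pairs,
    one per flag event, expressed as fractions (0.10 == 10%).
    """
    alphas = [alpha(c, s) for c, s in events]
    share_negative = sum(a < 0 for a in alphas) / len(alphas)
    return median(alphas), share_negative

# Five invented flag events, NOT real backtest data.
toy_events = [
    (-0.20, 0.10),
    (0.05, 0.12),
    (0.30, 0.08),
    (-0.02, -0.05),
    (0.00, 0.15),
]
median_alpha, share_negative = summarize(toy_events)
```

Note that a company can post a positive return and still have negative alpha, as the second toy event does: what counts is the gap to the benchmark.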
Companies we flagged before widely-known distress events:
| Company | Event | Lead time |
|---|---|---|
| SVB | Bank collapse Mar 2023 | 14 days (final filing) |
| Party City | Bankruptcy Apr 2023 | 1,137 days |
| Nikola | Bankruptcy Nov 2023 | 1,336 days |
| Bed Bath & Beyond | Bankruptcy Apr 2023 | 731 days |
| Rite Aid | Bankruptcy Oct 2023 | 167 days |
Three notable misses worth documenting:
These aren't buried in a footnote. The signal requires multi-year filing history to work. Companies with few historical pairs have lower signal reliability, and we flag this on the company page.
Two problems we know about and haven't solved:
The binomial false-positive problem. The ceiling is set at the 95th percentile of pair scores from labeled stable companies. But if a company has 10 years of filing history, the probability of at least one pair randomly exceeding the 95th percentile is 1-(0.95^10) ≈ 40%. Companies with long histories have a structurally higher false-positive rate. We're working on adaptive per-company thresholds.
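The arithmetic behind that 40% figure is easy to check. The sketch below also shows one possible way to set a per-company threshold, a Šidák-style correction. That correction is an illustration only, not the adaptive method under development, and both functions assume the pair scores are independent, which real year-over-year filing pairs may not be.

```python
def prob_at_least_one_exceedance(n_pairs, percentile=0.95):
    """Chance that at least one of n_pairs independent scores from a
    stable company lands above the given percentile ceiling."""
    return 1.0 - percentile ** n_pairs

def sidak_percentile(n_pairs, family_rate=0.05):
    """Hypothetical per-company ceiling (Sidak-style correction):
    the per-pair percentile that keeps the chance of ANY false flag
    across n_pairs at family_rate."""
    return (1.0 - family_rate) ** (1.0 / n_pairs)

# Ten pairs at a fixed 95th-percentile ceiling.
fixed_rate = prob_at_least_one_exceedance(10)  # about 0.401

# The same ten pairs with a stricter, history-aware ceiling.
adjusted_ceiling = sidak_percentile(10)  # about 0.9949
adjusted_rate = prob_at_least_one_exceedance(10, adjusted_ceiling)  # about 0.05
```

The trade-off is that a stricter ceiling for long-history companies also lowers sensitivity, so a correction like this would need to be validated against the known distress cases.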
Coarse peer groups. We use EDGAR SIC codes for peer comparison. These are imprecise — a healthcare device company and a biotech might share a code despite very different filing vocabularies. Tighter industry classifications would improve the cross-sectional comparison.
We built FilingDrift to make this signal accessible. The free tier covers our labeled company set (the cases above and more). The Individual, Professional, and Institutional plans add watchlist alerts, API access, and the full 4923-company corpus.
The live demo shows SVB's full score history with annotations. The methodology page has the technical detail and the full validation analysis.
Questions about the methodology or specific tickers? Email hello@filingdrift.com