Deterministic. Reproducible. No LLMs.

It's in the drift
before it's in the price

You read the 10-K. Nothing jumps out.

The signal is in how much the language changes year over year. Companies whose filings stay most stable have historically outperformed, and that alpha survives a five-factor + quality adjustment. The same score, read from the high end, flags deteriorating companies early.

FilingDrift scores that change. No LLMs, no hallucinations.
Just a deterministic, corpus-normalized, auditable score.

When SVB filed its last 10-K, 23 of 24 analysts had it Buy or Hold.
Our score was already above the distress ceiling.

Built for forensic, distressed-debt, and quant research.

See the SVB story → Start free → Browse the examples
251 days · Median warning lead time before the collapse (detected cases only, ≤3yr)
93.2 bps · Q1 monthly alpha: most stable-language quintile, FF5+QMJ (t=9.16); survivors-only backtest, see caveat
34 / 43 · Crisis events detected, out of labeled crisis companies with ≥2 pre-event filings (see what we miss)
4,930 · Companies tracked, continuously expanding as new filings appear on EDGAR
↑ growing · 10-Qs now live · 8-Ks & global markets coming
How to read the score — it measures how much a 10-K's language changed year over year
Low score — language barely changed — the stable, market-beating quintile (the factor)
High score (above the 51.5 ceiling) — language spiked — flagged (distress early-warning)

The factor: stable filing language beats the market

Sort every company into quintiles by how much its distress vocabulary rises year over year; rebalance monthly with a 1-month signal lag, over the full 2000–2026 period (including the 2007–2011 crisis). The lowest-change quintile — companies whose filing language stays most stable — earns a large, persistent alpha that survives a full five-factor + quality adjustment, and even survives removing all the corpus normalization.
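The sort-and-lag step described above can be sketched in a few lines. This is a minimal illustration, not FilingDrift's backtest code: the function names, the tie-breaking, and the made-up tickers are all assumptions. It only shows what "quintiles with a 1-month signal lag" means mechanically.

```python
from typing import Dict, List

def assign_quintiles(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank tickers by score, ascending, and split them into five buckets.

    Quintile 1 holds the lowest scores (the most stable language).
    """
    ranked = sorted(scores, key=lambda t: scores[t])
    n = len(ranked)
    # Rank position i of n maps to bucket (5 * i) // n + 1, giving near-equal buckets.
    return {t: (5 * i) // n + 1 for i, t in enumerate(ranked)}

def lagged_portfolios(monthly_scores: List[Dict[str, float]], lag: int = 1) -> List[Dict[str, int]]:
    """Holdings for month m are formed from the scores known at month m - lag."""
    return [assign_quintiles(monthly_scores[m - lag]) for m in range(lag, len(monthly_scores))]

# Hypothetical scores for ten made-up tickers in two consecutive months.
month0 = {f"T{i}": float(i) for i in range(10)}
month1 = {f"T{i}": float(9 - i) for i in range(10)}

portfolios = lagged_portfolios([month0, month1])
q1_names = sorted(t for t, q in portfolios[0].items() if q == 1)
print(q1_names)  # ['T0', 'T1']
```

The first portfolio is held in the second month but built from the first month's scores, so the signal never sees data from the month it trades in.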

Factor model | Q1 (stable) monthly alpha | Q1–Q5 long/short
Fama-French 3-factor | 80.0 bps (t=7.77) | 59.5 bps (t=4.90)
FF5 + momentum + quality (QMJ) | 93.2 bps (t=9.16) | 38.9 bps (t=3.47)

Q1 alpha rises as factors are added (80 → 83 → 93 bps) — so it isn't a quality proxy. This is the directed cousin of "Lazy Prices" (Cohen, Malloy & Nguyen 2020): the long/short spread (39–60 bps depending on the factor model) is comparable to their 18–45 bps range — the long side (Q1) is where the signal lives. Full validation →

Survivorship caveat. These quintile figures are computed on FilingDrift's filer universe — companies that still report — so names that have since delisted are absent, which inflates the long side. When we rebuild the universe with a corpus of 4,716 delisted companies and model each delisting as a total loss (a deliberate upper bound), the long-side alpha shrinks and the surviving edge concentrates in small-caps (micro-cap Q1–Q5 ≈ 164 bps/mo); above ~$300M market cap the ranking inverts. Read this as an in-sample research signal, not a tradeable, all-cap return. How we measured it →
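To see why the total-loss assumption pulls the long side down, here is a small sketch of an equal-weight monthly return with and without delisted names. The tickers and returns are invented, and the function is an assumption about how such an adjustment could be coded, not the actual rebuild.

```python
from typing import Dict, List

def equal_weight_return(holdings: List[str],
                        returns: Dict[str, float],
                        delisting_is_total_loss: bool) -> float:
    """Equal-weight one-month return for a list of holdings.

    A holding with no reported return is treated as delisted. It is either
    dropped (survivors-only view) or booked at -100% (upper-bound view).
    """
    values = []
    for ticker in holdings:
        if ticker in returns:
            values.append(returns[ticker])
        elif delisting_is_total_loss:
            values.append(-1.0)  # delisting modeled as a total loss
    if not values:
        raise ValueError("no holdings with a usable return")
    return sum(values) / len(values)

# Hypothetical month: four holdings, one of which ("D") delisted.
holdings = ["A", "B", "C", "D"]
returns = {"A": 0.02, "B": 0.01, "C": 0.03}

survivors_only = equal_weight_return(holdings, returns, delisting_is_total_loss=False)
with_delistings = equal_weight_return(holdings, returns, delisting_is_total_loss=True)
print(round(survivors_only, 4), round(with_delistings, 4))  # 0.02 -0.235
```

One delisted name out of four turns a +2% month into a -23.5% month. That is why the survivors-only figures overstate the long side, and why the total-loss version is a deliberately harsh bound.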

The other end: distress early-warning

Read from the high end, the same score flags deteriorating companies early: on a labeled crisis set it flags 75% of labeled crisis companies before the event, typically 1–3 years out. The table below shows peak pre-event scores vs the 95th-percentile control ceiling (51.5).
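A percentile ceiling of this kind can be sketched as follows. The control scores here are placeholders, and the interpolation method is simply the Python default; how FilingDrift selects its control set and computes the percentile is not stated, so treat both as assumptions.

```python
from statistics import quantiles

def control_ceiling(control_scores):
    """95th percentile of the control scores.

    quantiles(..., n=20) returns 19 cut points; the last one is the 95th percentile.
    """
    return quantiles(control_scores, n=20)[-1]

def is_flagged(score, ceiling):
    """A filing is flagged only when its score is strictly above the ceiling."""
    return score > ceiling

# Placeholder control scores (1.0, 2.0, ..., 100.0), not real data.
hypothetical_controls = [float(x) for x in range(1, 101)]
ceiling = control_ceiling(hypothetical_controls)

# With the published ceiling of 51.5, SVB's 57.5 is above the line.
print(is_flagged(57.5, 51.5))  # True
```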

Case Study: SVB Financial Group

Language almost no one else was writing — 14 days early.

On February 24, 2023, SVB filed its annual 10-K. The stock was at $267.83. 23 of 24 analysts had it Buy or Hold.

In that filing, SVB described in detail how forced asset sales to meet withdrawals would crystallize losses that it had previously been able to carry as unrealized. Plenty of companies mentioned unrealized losses in 2022 — rising rates hit everyone, so that language was discounted as corpus-wide. But almost no company was describing the liquidity mechanism in those terms. That corpus-wide rarity is what the score catches.

FilingDrift scored that filing 57.5 against a ceiling of 51.5. The FDIC arrived 14 days later.

SVB moved fast. Most don't. Bed Bath & Beyond was flagged ~2 years before its bankruptcy. Nikola, ~3.7 years. The median warning time: 251 days.
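Lead time is just the number of days between the flagged filing and the event. The sketch below reproduces SVB's 14 days from the dates above (February 24, 2023 plus 14 days is March 10, 2023) and shows a median over detected cases capped at three years. The other lead times are hypothetical, and the cap logic is an assumption based on the "≤3yr" note.

```python
from datetime import date
from statistics import median

def lead_time_days(flagged_on: date, event_on: date) -> int:
    """Days between the flagged filing and the event."""
    return (event_on - flagged_on).days

def median_lead_time(lead_times, max_days=3 * 365):
    """Median over cases whose lead time is within the cap (default: 3 years)."""
    kept = [d for d in lead_times if d <= max_days]
    return median(kept)

svb_lead = lead_time_days(date(2023, 2, 24), date(2023, 3, 10))
print(svb_lead)  # 14

# Hypothetical lead times in days; the last one exceeds the 3-year cap and is dropped.
hypothetical_leads = [14, 90, 400, 1350]
print(median_lead_time(hypothetical_leads))  # 90
```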

See the full analysis →
SVB FilingDrift Score — Year over Year
Red line = control ceiling (51.5)
Ticker | Company | Event | Peak Score | vs. Ceiling
UNTC | UNIT CORP | Bankruptcy | 253.9 | 4.9×
FLYYQ | Spirit Airlines, Inc. | Bankruptcy | 207.8 | 4.0×
KODK | EASTMAN KODAK CO | Bankruptcy | 187.6 | 3.6×
GPUS | BitNile Holdings, Inc. | Bankruptcy | 164.8 | 3.2×
SD | SANDRIDGE ENERGY INC | Bankruptcy | 151.8 | 2.9×
BBDC | Barings BDC, Inc. | Bankruptcy | 151.1 | 2.9×
GPOR | GULFPORT ENERGY CORP | Bankruptcy | 140.3 | 2.7×
NRDE | Lordstown Motors Corp. | Bankruptcy | 139.2 | 2.7×
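The "vs. Ceiling" multiple is the peak score divided by the 51.5 control ceiling, rounded to one decimal. A quick check using the peak scores from the table (the helper name is ours, for illustration only):

```python
CEILING = 51.5  # control ceiling quoted on this page

def vs_ceiling(peak_score: float, ceiling: float = CEILING) -> str:
    """Format a peak score as a multiple of the ceiling, e.g. '4.9×'."""
    return f"{peak_score / ceiling:.1f}×"

peak_scores = {
    "UNTC": 253.9, "FLYYQ": 207.8, "KODK": 187.6, "GPUS": 164.8,
    "SD": 151.8, "BBDC": 151.1, "GPOR": 140.3, "NRDE": 139.2,
}

multiples = {ticker: vs_ceiling(score) for ticker, score in peak_scores.items()}
print(multiples["UNTC"], multiples["KODK"], multiples["NRDE"])  # 4.9× 3.6× 2.7×

# For comparison, SVB's 57.5 is about 1.1× the ceiling.
print(vs_ceiling(57.5))  # 1.1×
```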

The distress read is evidenced by lift and recall: moderate-flagged companies reach a distress outcome about 1.2× the corpus base rate, and 75% of labeled crisis companies are flagged before the event. Flagged companies are typically near recent highs when flagged, not already collapsed — the signal leads the equity market. Methodology →
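For readers who want the two metrics pinned down, here is a minimal sketch of recall and lift. The counts are hypothetical and chosen only so the arithmetic is easy to follow; they are not FilingDrift's underlying numbers.

```python
def recall(flagged_before_event: int, labeled_events: int) -> float:
    """Share of labeled crisis companies that were flagged before the event."""
    return flagged_before_event / labeled_events

def lift(flagged_with_outcome: int, flagged_total: int,
         corpus_with_outcome: int, corpus_total: int) -> float:
    """Distress rate among flagged companies divided by the corpus base rate."""
    flagged_rate = flagged_with_outcome / flagged_total
    base_rate = corpus_with_outcome / corpus_total
    return flagged_rate / base_rate

# Hypothetical counts: 3 of 4 crisis companies flagged in advance.
print(recall(3, 4))  # 0.75

# Hypothetical counts: 12% of flagged names reach distress vs a 10% base rate.
print(round(lift(12, 100, 100, 1000), 2))  # 1.2
```

A lift of 1.2× means a flagged company is only modestly more likely than average to reach a distress outcome, so a flag is best read as a prompt for closer review rather than a prediction.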

How the score is built

📄

Read every 10-K and 10-Q. The full document.

Every annual and quarterly SEC filing from EDGAR: risk factors, MD&A, liquidity disclosures. 10-Ks run 100–200 pages. We ingest all of it, not a summary. Most tools match keywords or run an LLM over the text. We score the entire document, so a change buried deep in the risk factors or liquidity disclosures still counts.

📊

Normalize against the whole corpus

How unusual is this company's language change this year — relative to every other company that filed? Each phrase is weighted by how rare it is across the whole corpus and downweighted if it rose corpus-wide that year. Corpus-wide language shifts don't move the needle; company-specific escalation does. (Corpus-wide, not by SIC sector — we tested per-sector and it scored worse.)

🧠

Find what keywords miss

A phrase that rises across the whole corpus in a year (every company writing about interest-rate risk in 2022) gets little weight. A phrase that's rare across the corpus but escalates in one company — covenant headroom, forced asset sales — drives the score. The signal is in the unusual change, not the raw count.
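The weighting idea can be illustrated with a toy calculation. Everything below is an assumption made for illustration: the smoothed inverse-document-frequency rarity term, the damping formula, the phrase rates, and the function names. FilingDrift's actual formula is not published on this page and may differ substantially.

```python
import math

def phrase_weight(doc_freq, n_docs, corpus_rate_prev, corpus_rate_now):
    """Weight for one phrase: high if rare corpus-wide, damped if it rose corpus-wide.

    Assumed form: smoothed IDF times 1 / (1 + relative corpus-wide rise).
    """
    rarity = math.log((1 + n_docs) / (1 + doc_freq))
    corpus_rise = max(0.0, corpus_rate_now - corpus_rate_prev)
    damping = 1.0 / (1.0 + corpus_rise / max(corpus_rate_prev, 1e-9))
    return rarity * damping

def drift_score(company_prev, company_now, corpus_stats, n_docs):
    """Sum of each phrase's year-over-year rise at the company, times its weight."""
    total = 0.0
    for phrase, rate_now in company_now.items():
        rise = rate_now - company_prev.get(phrase, 0.0)
        if rise <= 0 or phrase not in corpus_stats:
            continue  # only escalating, known phrases contribute
        doc_freq, corpus_prev, corpus_now = corpus_stats[phrase]
        total += rise * phrase_weight(doc_freq, n_docs, corpus_prev, corpus_now)
    return total

# Hypothetical corpus of 1,000 filings: (doc_freq, corpus rate last year, this year).
corpus_stats = {
    "interest rate risk": (900, 0.50, 1.00),   # common, and rose everywhere
    "forced asset sales": (10, 0.01, 0.01),    # rare, flat corpus-wide
}
common_w = phrase_weight(*corpus_stats["interest rate risk"][:1], 1000, 0.50, 1.00)
rare_w = phrase_weight(*corpus_stats["forced asset sales"][:1], 1000, 0.01, 0.01)
print(rare_w > common_w)  # True

# Both phrases rise by the same amount at one hypothetical company.
prev = {"interest rate risk": 0.5, "forced asset sales": 0.0}
now = {"interest rate risk": 1.0, "forced asset sales": 0.5}
score = drift_score(prev, now, corpus_stats, n_docs=1000)
```

Both phrases rise by the same amount at this company, yet nearly all of the score comes from the rare one. That is the behavior the section describes: unusual change counts, raw count does not.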

🚨

A number, not a narrative

Run the same filing twice, get the same score. No language model, no prompt, no randomness. SVB's filing always scores 57.5. It scored 57.5 when the FDIC arrived 14 days later and it scores 57.5 today.

From the research

Case studies and methodology notes — what the signal caught, what it missed, and how it works.

Case Study
SVB's final 10-K: what the language showed before the bank run started

The filing was public for 14 days before the FDIC arrived. Score: 57.5 vs. ceiling 51.5. What the language said that 23 of 24 analysts didn't flag.

Read →
Case Study
What's going on with Chevron? — 15 months on

Score: 71.5 vs. ceiling 51.5. The Hess arbitration resolved in Chevron's favor. Venezuela operations ended. The stock hit an all-time high. Here's what the filing showed and what happened.

Read →
Research
Why we use embeddings instead of an LLM

The same filing always produces the same score. No prompting, no hallucination, no context-window truncation. How deterministic scoring works and why it matters for research.

Read →

How it compares

FilingDrift LLM tools AlphaSense Amenity / Symphony
Corpus-wide normalization Partial
Deterministic, reproducible score ✗ stochastic ✗ LLM-based
Historical context ✗ context limit Manual search
Self-serve, no sales call ✗ enterprise ✗ enterprise
Published quantitative backtest ✗ not disclosed Partial
Starting price Free · from $79/mo Free · from $20/mo $15,000+/yr $10,000+/yr

Amenity Analytics (now part of Symphony) is the closest product in this space — it tracks linguistic change in filings and transcripts. FilingDrift's corpus-wide normalization is purpose-built for the specific question "was this change unusual relative to what every other company wrote that year." Both Amenity and AlphaSense are embedded in enterprise platforms requiring sales demos; FilingDrift is self-serve from day one.

Why not just ask an LLM? LLMs read one document at a time. They have no idea what every other company filed that year, so they can't tell you whether SVB's language change was unusual across the corpus. They can also give different answers on the same text from run to run. FilingDrift's score is computed once from the full corpus and stays the same.

Disclaimer: FilingDrift provides automated linguistic analysis of public SEC filings for research purposes only. This is not investment advice. Past detection of distress events does not guarantee future accuracy. See Terms of Service.

Never miss what's hiding in plain sight again

Paid subscribers get an alert when a watched company files a 10-K or 10-Q above the distress threshold — usually within hours of the filing hitting EDGAR. Useful for credit monitoring, pre-deal diligence, and counterparty risk.

Free accounts get dashboard access and 40 labeled case studies. Alerts, watchlist expansion, and API access are on paid plans — see pricing.

Create free account → See pricing

Earnings calls, non-US markets, and more — see the roadmap.
