Survivorship bias in hedge funds
Every hedge fund performance number you've read is probably an overestimate. The funds that blew up don't show up in the average. Here's why that matters for how you use 13F data.
What is survivorship bias?
Survivorship bias is what happens when you study only the entities that made it through a filter — and forget about the ones that didn't.
The classic example comes from World War II. The statistician Abraham Wald was asked to help the US military decide where to add armor to its bombers. They showed him data on bullet holes from planes that returned from missions. The obvious answer seemed to be: reinforce the parts that got hit most often. Wald said the opposite. The planes that took hits in those spots came back. The planes that got hit in the other spots never returned to be counted. Reinforce the blank spots.
The same logic applies to hedge funds, stock screens, sports analytics, startup advice, and practically anywhere a selection process has been running long enough to produce visible winners and invisible losers.
The hedge fund graveyard
The hedge fund industry is enormous — roughly 30,000 funds managing over $4 trillion worldwide as of 2024. It is also brutally Darwinian. Estimates vary, but credible research (Ackermann, McEnally & Ravenscraft; Liang; more recently Eurekahedge) consistently finds approximately 6–10% of hedge funds close every year. Some blow up. Some return capital to investors voluntarily. Some merge into larger platforms. But they disappear from the data.
A fund launched in 2010 that blew up in 2015 after three losing years doesn't show up in a 2026 database query for "hedge fund average returns." Its bad years are simply missing. The result: databases composed exclusively of living funds overstate historical performance by roughly 2–4 percentage points per year according to most peer-reviewed estimates. That's not noise — that's the difference between a mediocre strategy and a great one.
The HFRI Composite Index — the most cited hedge fund benchmark — has historically suffered from this problem. Its index construction methodologies have improved over time, but any database that relies solely on self-reported returns from living funds will inherit some version of it.
Why it inflates track records
Imagine 100 funds launch in January 2015. By January 2025, 50 of them have closed. Now a researcher asks: what was the average annualized return of hedge funds that launched in January 2015?
If they can only access data from the 50 living funds, they get a number. If those 50 survived — even partially because of performance — the number is biased upward. The 50 dead funds had, by definition, worse trajectories. Missing their data makes the whole cohort look better than it was.
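A quick simulation makes the size of the effect concrete. The sketch below is illustrative only: the return dispersion, volatility, and closure-rate parameters are assumptions chosen to roughly match the cohort above, not estimates from real fund data.

```python
import random

random.seed(42)

N_FUNDS = 100
N_YEARS = 10
SKILL_SPREAD = 0.04    # assumed dispersion in true annual edge across funds
NOISE = 0.10           # assumed year-to-year return volatility
CLOSE_FRACTION = 0.07  # ~7% of living funds close each year, worst performers first

# Each fund gets a fixed "true" mean return plus yearly noise.
funds = [{"mu": random.gauss(0.06, SKILL_SPREAD), "returns": [], "alive": True}
         for _ in range(N_FUNDS)]

for year in range(N_YEARS):
    for f in funds:
        if f["alive"]:
            f["returns"].append(random.gauss(f["mu"], NOISE))
    # Performance-linked exit: the worst cumulative performers close.
    alive = [f for f in funds if f["alive"]]
    alive.sort(key=lambda f: sum(f["returns"]))
    for f in alive[: int(len(alive) * CLOSE_FRACTION)]:
        f["alive"] = False

def avg_annual(fund_list):
    rets = [r for f in fund_list for r in f["returns"]]
    return sum(rets) / len(rets)

survivors = [f for f in funds if f["alive"]]
print(f"Survivor-only average: {avg_annual(survivors):.2%}")
print(f"Full-cohort average:   {avg_annual(funds):.2%}")
```

The exact gap depends on the parameters, but as long as exits are linked to performance it is consistently positive: the survivor-only query overstates the cohort before backfill bias adds anything on top.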
This compounds over time in a particularly nasty way. In principle, a fund that loses 40% in Year 1 and then closes should still have that Year 1 counted. In practice, most databases depend on voluntary reporting, and funds with strong early returns are far more likely to submit their history than funds that got off to a bad start. The selection bias is present at entry, not just at exit. Three related mechanisms are worth separating (a simulation of the first follows the list):
- Backfill bias: Funds that start reporting to a database often submit historical returns going back several years — but only if those historical returns were good enough to attract investors. Bad early records go unsubmitted.
- Exit bias: Funds that close often stop reporting to databases in their final months, when performance is worst. Their worst quarters disappear.
- Liquidation bias: A fund in wind-down mode may have illiquid positions that take months to mark to market. The final reported NAV may be more flattering than the actual recovery value.
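Backfill bias in particular is easy to isolate. In this sketch (same caveats: toy parameters, synthetic returns), every fund lives the full three years, so exit bias is switched off entirely; the only distortion is which funds choose to report.

```python
import random

random.seed(7)

# Every fund exists for 3 years; only funds whose first-2-year record looks
# marketable (assumed hurdle: cumulative +10%) backfill their history.
HURDLE = 0.10

population, database = [], []
for _ in range(1000):
    history = [random.gauss(0.05, 0.12) for _ in range(3)]
    population.append(history)
    if history[0] + history[1] >= HURDLE:  # only good starters report
        database.append(history)

def mean_year(histories, year):
    return sum(h[year] for h in histories) / len(histories)

for y in range(3):
    print(f"Year {y + 1}: true {mean_year(population, y):+.2%} "
          f"vs database {mean_year(database, y):+.2%}")
# Years 1-2 in the database are inflated by construction. Year 3 is clean
# here only because the reporting decision looked at the first two years --
# which is why backfilled history deserves more suspicion than returns
# reported in real time.
```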
The 13F-specific problem
13F filings are required for any institutional investment manager exercising investment discretion over $100 million or more in "Section 13(f) securities" (US-listed equities and certain options), measured on the last trading day of any month of the calendar year. A fund that drops below the threshold, or liquidates, stops filing.
This means the 13F universe is, by construction, a live index of surviving funds. A manager who ran $2B in 2019, blew up during March 2020, and returned capital by December 2020 does not appear in any 2021 or later 13F analysis. Their pre-2020 filings exist in EDGAR, but they're not part of any "what are the best managers doing now" analysis because they're gone.
The practical consequence: if you run a screen of "what tickers do the top 50 institutional investors hold?" using current 13F data, every one of those 50 managers survived whatever the market threw at them over the past 5–10 years. That's not random — it correlates, imperfectly, with skill. But it also introduces selection effects that are easy to miss.
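To see how unavoidable this is, note that the survivorship filter is literally the first step of any such screen. A minimal sketch, with hypothetical records and field names rather than EDGAR's actual schema:

```python
from collections import Counter

# Hypothetical records: one entry per manager, with their latest 13F quarter
# and holdings. Managers who liquidated or fell below $100M simply have no
# recent filing -- they never reach the screen.
filings = [
    {"manager": "Alpha Capital", "quarter": "2025Q4", "tickers": ["AAPL", "OXY"]},
    {"manager": "Beta Partners", "quarter": "2025Q4", "tickers": ["AAPL", "META"]},
    {"manager": "Gone Fund LP",  "quarter": "2020Q1", "tickers": ["BA", "CCL"]},  # blew up; last filing is stale
]

CURRENT_QUARTER = "2025Q4"

# Step 1 is the survivorship filter, whether you notice it or not.
live = [f for f in filings if f["quarter"] == CURRENT_QUARTER]

consensus = Counter(t for f in live for t in f["tickers"])
print(consensus.most_common())  # [('AAPL', 2), ('OXY', 1), ('META', 1)]
```

Gone Fund LP never enters the consensus count, not because anyone decided to exclude it, but because the screen can only see current filers.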
The honest position HoldLens takes
HoldLens tracks 30 managers — Buffett, Ackman, Druckenmiller, Klarman, Burry, Marks, Loeb, Tepper, Hohn, Coleman, and 20 others. Every one of them is still active. Every one of them has filed a 13F in the last two quarters.
That means we have, definitionally, selected for survivors.
We're not going to pretend otherwise. The managers we track were chosen because of their track records — long-term alpha generation, concentrated positions, and the kind of documented discipline that makes their signals worth following. Those selection criteria correlate with survival. We didn't pick them arbitrarily; we picked them because they're demonstrably good at what they do over multi-decade windows. That's a different kind of selection bias than random database survivorship — it's intentional curation of quality. But it's still a filter.
What this means in practice: when you see a ConvictionScore weighted by a manager's historical alpha, that alpha was computed on a manager who has existed long enough to develop a track record. Managers who ran for two years, produced mediocre results, and closed are not in our dataset. That's a feature — we don't want their signals. But it's worth naming.
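For intuition only, here is the simplest possible shape of an alpha-weighted conviction signal. This is an illustrative sketch with made-up numbers, not HoldLens's actual ConvictionScore formula:

```python
# Illustrative only: a conviction signal for one ticker, weighted by each
# holder's historical alpha. Not the real ConvictionScore implementation.
holders = [
    # (manager, annualized historical alpha, position change this quarter, % of portfolio)
    ("Manager A", 0.08, +2.5),
    ("Manager B", 0.03, +0.5),
    ("Manager C", 0.12, +4.0),
]

def conviction(holders):
    # Alpha acts as the weight: the same position change counts for more
    # when it comes from a manager with a stronger documented track record.
    weighted = sum(alpha * change for _, alpha, change in holders)
    total_weight = sum(alpha for _, alpha, _ in holders)
    return weighted / total_weight

print(f"Alpha-weighted conviction: {conviction(holders):+.2f}")
```

Every manager in that list is, by construction, a survivor with a computable alpha. That is exactly the filter described above.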
What survivorship bias is NOT saying
None of this means smart-money signals are useless. It's a calibration problem, not a disqualification.
The managers we track aren't just lucky survivors. The academic literature on hedge fund persistence is mixed — luck explains much short-run outperformance — but at the extremes, over decades, skill is separable from luck. A 30+ year track record of compounding at 20%+ annually is not a coin-flip outcome. Buffett, Druckenmiller, Klarman, and a handful of others have demonstrated something real.
The lesson of survivorship bias is not "ignore the data." It's "apply the right discount."
- When a data vendor claims "hedge funds averaged 12% per year over the last decade," discount by 2–4 percentage points for survivorship and backfill effects; the defensible figure is closer to 8–10%.
- When you read that "the managers in our database beat the S&P in 7 of the last 10 years," ask who is excluded from that database — particularly any fund that closed in those same 10 years.
- When you use HoldLens signals, treat them as research starting points, not buy signals. A consensus buy from five long-tenured superinvestors is strong evidence that a thesis is worth your research time. It is not proof the stock will go up.
The interaction with the 45-day lag
Survivorship bias interacts with the 45-day lag in 13F filings in a way that's easy to miss.
When a fund starts closing during a quarter, whether in an orderly wind-down or after a crisis, it is selling positions. If it sells enough to drop below the $100M threshold, it may not file for that quarter at all. The last 13F you see from a fund that closes is typically from before the drawdown that caused the closure. The portfolio is frozen in a pre-crisis snapshot that looks nothing like what actually happened to those positions.
This means that when a manager disappears from the 13F landscape, their last known portfolio — often a large, concentrated, pre-blow-up book — can persist in aggregated data as if it were still a current consensus signal. Systems that don't explicitly expire or down-weight exited managers can spread stale signals.
HoldLens addresses this by only including managers who have filed in the last two consecutive quarters. An inactive manager — whether they're in the process of closing or simply went below the $100M threshold — is automatically excluded from conviction calculations. Their historical data remains accessible on their profile page, but it's not weighted in current signals.
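Mechanically, the expiry rule reduces to a small predicate. A sketch of the shape of the rule (quarter bookkeeping simplified; not production code):

```python
def is_active(filed_quarters, current, previous):
    # A manager counts as active only if they filed in both of the two most
    # recent quarters; one missed filing drops them from conviction math.
    return current in filed_quarters and previous in filed_quarters

managers = {
    "Alpha Capital": {"2025Q3", "2025Q4"},
    "Gone Fund LP":  {"2019Q4", "2020Q1"},  # last filed pre-drawdown
    "Shrunk LP":     {"2025Q3"},            # fell below $100M, missed 2025Q4
}

active = [m for m, quarters in managers.items()
          if is_active(quarters, current="2025Q4", previous="2025Q3")]
print(active)  # ['Alpha Capital'] -- stale books never enter current signals
```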
How to use smart-money signals knowing all of this
Three practical calibrations:
- Weight by tenure, not just recent alpha. A manager with a 5-year track record is more likely to be a lucky survivor than one with 25 years. HoldLens's manager quality scores give higher weight to track records that span multiple full market cycles, including at least one major bear market. Anything less than 10 years deserves a discount (see the sketch after this list).
- Focus on conviction, not just holdings. Survivorship bias inflates the absolute performance numbers of any group. But relative conviction signals (which positions are growing vs. shrinking, which are consensus buys across multiple managers vs. lone wolves) are less affected by who's missing from the dataset, because they compare survivors against each other rather than against an inflated average. A stock that five tenured superinvestors are all adding to is a more robust signal than any absolute performance comparison.
- Never use aggregate hedge fund statistics without asking about the sample. "Hedge funds returned X%" is almost never the right number. Ask: which hedge funds? Over what period? Does the sample include funds that closed? Self-reported or audited? Any number that passes without those qualifiers is doing survivorship bias work on you.
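The tenure calibration from the first bullet can be made mechanical. The breakpoints below are assumptions for illustration, not HoldLens's published weights:

```python
def tenure_weight(years, bear_markets_navigated):
    # Illustrative discount schedule: short track records get haircut hard,
    # because survivorship makes a lucky 5-year record common and a lucky
    # 25-year record rare. Breakpoints are assumptions, not published weights.
    if years < 10:
        base = 0.5   # under one full cycle: heavy discount
    elif years < 20:
        base = 0.8
    else:
        base = 1.0
    # Require at least one major drawdown actually navigated.
    return base if bear_markets_navigated >= 1 else base * 0.5

print(tenure_weight(5, 0))   # 0.25 -- short record, untested in a bear market
print(tenure_weight(25, 3))  # 1.0  -- multi-cycle, battle-tested
```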
The relationship to alpha
Survivorship bias and alpha are linked in a counterintuitive way. Because dead funds are excluded from most databases, the measured alpha of the "average hedge fund" looks higher than it really is. But the existence of survivorship bias also means that detecting true alpha is harder, not easier.
If you're trying to determine whether a manager has skill or luck, you need to account for the fact that you're studying a manager who passed a multi-year survival test. That test is informative (surviving is not random), but it also means the unluckiest outcomes were filtered out before the data ever reached you. The survivors' returns are a truncated sample, not a random one, so naive significance tests overstate your confidence, and you need longer windows than you'd think to draw reliable conclusions.
In practical terms: don't conclude a manager has skill until you've seen them navigate at least one complete market cycle including a meaningful drawdown. How they behave when their thesis is wrong, when redemptions are coming in, and when the market is offering them prices far below fair value — that's where skill separates from luck. Survivorship bias filters out the worst outcomes before you even see the data; you need to look hard at how your managers behaved during the hard periods they did survive.
Further reading
- What is alpha? — why most professional managers don't produce it, and what the ones who do have in common.
- The 45-day lag in 13F filings — why all 13F data is six weeks stale by design, and how that interacts with fund closures.
- The copy-trading myth — why mechanically copying 13F positions underperforms the underlying portfolio — including survivorship dynamics.
- What is a ConvictionScore? — how HoldLens weights signals by manager quality and tenure to reduce survivorship noise.
- The 30 tracked superinvestors — profiles with documented track records, including which market cycles they navigated.
HoldLens parses 13F filings from 30 of the world's best long-term investors. Data updates quarterly within 45 days of each SEC filing deadline. Not financial advice — treat signals as research starting points.