Did the recommender actually work?
For each historical 13F filing date, we compute what HoldLens would have recommended using only the data available at that point in time. Then we measure the realized return from that day to today using live prices.
No survivorship bias, no curation, no cherry-picking. If the model picked stocks that lost money, this page shows it. Trust comes from being right when nobody is looking.
Every backtested quarter has ≥3 prior quarters of trend data
ConvictionScore v3 weights multi-quarter trend streaks heavily — a manager building a position for 3 consecutive quarters is a much stronger signal than a single-quarter move. In v0.23 we extended the dataset to 8 historical quarters (Q1 2024 → Q4 2025) so the backtest can test the model under its actual operating conditions:
- Q1-Q3 2024: used only as context for the trend engine (not backtested directly)
- Q4 2024: model has 3 prior quarters — first quarter fairly backtestable
- Q1 2025: model has 4 prior quarters — fully operational
- Q2 2025: model has 5 prior quarters — strong trend signal
- Q3 2025: model has 6 prior quarters — peak trend signal
The rule: a quarter is only included in the backtest if the model had at least 3 quarters of prior data available to score with. That's the same condition today's /best-now ranking operates under. No handicap, no excuses.
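The inclusion rule can be sketched in a few lines. This is an illustration only, not HoldLens's code: `QUARTERS`, `MIN_PRIOR_QUARTERS`, and `backtestable_quarters` are hypothetical names, and the input covers only the quarters listed above (Q1 2024 through Q3 2025).

```python
# Hypothetical sketch of the ">= 3 prior quarters" inclusion rule.
# Names are illustrative assumptions; quarter labels come from the list above.
QUARTERS = [
    "Q1 2024", "Q2 2024", "Q3 2024", "Q4 2024",
    "Q1 2025", "Q2 2025", "Q3 2025",
]
MIN_PRIOR_QUARTERS = 3


def backtestable_quarters(quarters, min_prior=MIN_PRIOR_QUARTERS):
    """Return (quarter, prior_count) pairs for quarters with enough history.

    `quarters` must be in chronological order, so the index of a quarter
    equals the number of quarters that precede it in the dataset.
    """
    return [(q, i) for i, q in enumerate(quarters) if i >= min_prior]


for quarter, prior in backtestable_quarters(QUARTERS):
    print(f"{quarter}: {prior} prior quarters")
```

With this input, the first three quarters are dropped as warmup and Q4 2024 is the first entry point, with 3 prior quarters.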
Earlier 2024 quarters exist in the dataset but aren't backtested as entry points — they would themselves lack enough prior quarters to score fairly. v0.24 will extend coverage to 2022-2023 so we can test the model across a full bull-bear cycle.
How the backtest works
- As-of conviction: For each historical quarter (Q4 2024, Q1/Q2/Q3 2025), we compute the ConvictionScore using ONLY moves filed up to that quarter. Time decay is re-anchored so the historical "latest" quarter has weight 1.0. Every backtested quarter has ≥3 prior quarters of trend context (Q1-Q3 2024 serve as the warmup dataset).
- Top 5 BUY signals: We take the top 5 stocks ranked BUY at that historical point in time. No curation — whatever the model said.
- Entry price: The closing price closest to the 13F filing date (when an investor could have actually acted on the signal).
- Exit price: Today's live price from Yahoo Finance via our Cloudflare Worker proxy.
- Realized return: Simple return, (exit − entry) / entry. Per-pick rows are not annualized; the aggregate figure is annualized using days held.
- Benchmark: SPY total return over the same period. Hit rate = % of picks that beat SPY.
- Equal weight: Each pick weighted 1/N. No position sizing, no rebalancing. The simplest possible strategy.
- Small sample size: 4 quarters × 5 picks = 20 data points. Statistically meaningful inference would need 20+ quarters of data. v0.24 extends coverage further back.
- Insider data is not time-locked: The model uses current Form 4 data even for historical computations, which introduces a minor look-ahead bias on the insider component.
- Owner count is not time-locked: The crowding penalty uses today's ownership count, not the historical count. This is also a minor look-ahead bias.
- No transaction costs: Real returns would be slightly lower due to spreads + commissions (though most brokers are commission-free in 2026).
- Past ≠ future: A model that worked historically can stop working tomorrow. Past performance is not indicative of future results.
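The return arithmetic described in the bullets above (simple return, equal weighting, hit rate against SPY, and annualization by days held) can be sketched as follows. This is a minimal illustration under stated assumptions, not the site's implementation: the `Pick` class and function names are hypothetical, the prices are made-up sample numbers rather than real backtest results, and the compounding formula in `annualize` is an assumption, since the page only says the aggregate "annualizes using days held".

```python
from dataclasses import dataclass


@dataclass
class Pick:
    """One backtested pick (hypothetical structure)."""
    ticker: str
    entry: float  # close nearest the 13F filing date
    exit: float   # today's price


def simple_return(pick: Pick) -> float:
    # (exit - entry) / entry, not annualized
    return (pick.exit - pick.entry) / pick.entry


def equal_weight_return(picks: list[Pick]) -> float:
    # Each pick weighted 1/N, no rebalancing
    return sum(simple_return(p) for p in picks) / len(picks)


def hit_rate(picks: list[Pick], spy_return: float) -> float:
    # Fraction of picks whose return beat SPY over the same period
    return sum(simple_return(p) > spy_return for p in picks) / len(picks)


def annualize(total_return: float, days_held: int) -> float:
    # ASSUMPTION: geometric annualization on a 365-day year
    return (1 + total_return) ** (365 / days_held) - 1


# Made-up sample data for illustration only
picks = [
    Pick("AAA", 100.0, 120.0),
    Pick("BBB", 50.0, 45.0),
    Pick("CCC", 200.0, 230.0),
    Pick("DDD", 80.0, 80.0),
    Pick("EEE", 10.0, 11.0),
]
spy_return = 0.08  # made-up benchmark return over the same window

print(f"Equal-weight return: {equal_weight_return(picks):.2%}")
print(f"Hit rate vs SPY:     {hit_rate(picks, spy_return):.0%}")
print(f"Annualized (400 d):  {annualize(equal_weight_return(picks), 400):.2%}")
```

Note that a pick counts as a hit only if it beats the benchmark, so a pick with a positive return can still be a miss when SPY did better over the same window.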
Backtest data is recomputed live on every page load. Returns shift with the market each day. Not investment advice.