Quick Take
In this sample, the 200-day moving average rule on SPY reduced volatility and maximum drawdown, but lagged buy and hold on CAGR and ending wealth. I read it as a drawdown-control rule, not a return-enhancement rule.
Why This Strategy
The 200-day moving average is one of the most common market timing rules in public equity research, newsletters, and trading forums. It is also a useful first study because the rule is simple, the benchmark is obvious, and the main implementation choices can be stated explicitly.
This study pins down one testable version of the 200-day rule: the data, signal timing, cost assumption, and benchmark.
Method
The backtest uses daily SPY data from Yahoo Finance through yfinance. The script requests adjusted OHLCV with auto_adjust=True, caches the result as data/SPY.csv, and uses adjusted close for both strategy and benchmark returns.
Yahoo Finance and yfinance are good enough for an educational reproduction, but they are not institutional-grade data.
Data
| Field | Value |
|---|---|
| Source | Yahoo Finance via yfinance |
| Ticker | SPY |
| Price series | Adjusted close |
| Start date | 1993-01-29 |
| End date | 2026-07-02 |
| First valid SMA200 date | 1993-11-11 |
| Metric window | Includes the SMA warmup period |
| Initial capital | $10,000 |
| Base transaction cost | 5 bps per position change |
| Cash return | 0% |
The data/SPY.csv cache makes later runs independent of a new Yahoo request. Running python3 backtest.py --refresh-data replaces the cache with the latest available yfinance data.
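A minimal sketch of the cache-then-download pattern described above. This is not the code in backtest.py: the function name load_prices, its parameters, and the choice of yfinance's Ticker.history call are assumptions made for illustration.

```python
from pathlib import Path

import pandas as pd


def load_prices(ticker="SPY", cache_path="data/SPY.csv", refresh=False):
    """Return daily adjusted OHLCV, preferring the CSV cache when it exists.

    Hypothetical helper: the name and signature are illustrative only.
    """
    cache = Path(cache_path)
    if cache.exists() and not refresh:
        # Cached runs never touch the network.
        return pd.read_csv(cache, index_col=0, parse_dates=True)

    # Imported lazily so cached runs do not need yfinance installed.
    import yfinance as yf

    # With auto_adjust=True the "Close" column is already adjusted
    # for splits and distributions.
    prices = yf.Ticker(ticker).history(period="max", auto_adjust=True)

    # Drop the exchange time zone so the CSV round-trips to plain dates.
    if prices.index.tz is not None:
        prices.index = prices.index.tz_localize(None)

    cache.parent.mkdir(parents=True, exist_ok=True)
    prices.to_csv(cache)
    return prices
```

Calling load_prices(refresh=True) plays the role of the --refresh-data flag in this sketch: it ignores the existing file and overwrites it with a fresh download.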
Signal Definition
For trading day t, the script computes:
SMA200[t] = mean(adjusted_close[t-199] ... adjusted_close[t])
The strategy does not use day t close data to trade day t. The position for day t is based on the prior trading day’s information:
signal[t-1] = 1 if adjusted_close[t-1] > SMA200[t-1], else 0
position[t] = signal[t-1]
The first 199 trading days have no valid 200-day average and therefore produce no risk-on signal.
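The signal and the one-day shift can be sketched in pandas as follows. The function name sma_positions and the column names are assumptions for illustration, not the names used in the study's script.

```python
import pandas as pd


def sma_positions(adj_close: pd.Series, window: int = 200) -> pd.DataFrame:
    """Build the SMA, the same-day signal, and the shifted position.

    Hypothetical helper: names are illustrative only.
    """
    # rolling(window) needs `window` observations, so the first
    # window - 1 rows are NaN.
    sma = adj_close.rolling(window).mean()

    # Comparing against NaN is False, so warmup rows get signal 0.
    signal = (adj_close > sma).astype(int)

    # position[t] = signal[t-1]; the first row has no prior signal,
    # so it starts in cash.
    position = signal.shift(1).fillna(0).astype(int)

    return pd.DataFrame(
        {"adj_close": adj_close, "sma": sma, "signal": signal, "position": position}
    )
```

The shift is the only thing separating this from a look-ahead rule: without it, the position for day t would depend on the close of day t.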
Execution and Cost Assumptions
The execution model is a close-to-close approximation. A signal computed from the adjusted close of day t-1 is modeled as the target position for the return interval from close t-1 to close t.
This means the model assumes the portfolio can be aligned at or near the same adjusted close that generated the signal. It is not a strict after-close order model, and it is not a next-open fill model. A stricter next-open or one-day-delayed next-close model would be a useful follow-up test.
Daily strategy return is:
strategy_return[t] = position[t] * SPY_return[t] - trading_cost[t]
The base case deducts 5 bps of portfolio equity whenever the position changes. A move from cash to SPY costs 5 bps, and a move from SPY to cash costs 5 bps. This fixed bps assumption is a conservative transaction-cost approximation for a highly liquid ETF. It is not a claim about actual historical fills, commissions, bid/ask spreads, or market impact.
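A minimal sketch of the return and cost formula above, assuming the position series has already been shifted and that the portfolio starts in cash. The function name strategy_returns and the output column names are hypothetical.

```python
import pandas as pd


def strategy_returns(adj_close, position, cost_bps=5.0, initial_capital=10_000.0):
    """Apply position[t] to the close-to-close return and deduct costs.

    Hypothetical helper: names are illustrative only.
    """
    # Close-to-close return; the first row has no prior close.
    asset_return = adj_close.pct_change().fillna(0.0)

    # A position change is any move between 0 and 1. The first row is
    # compared with a starting position of 0 (cash), which is an assumption.
    change = (position - position.shift(1, fill_value=0)).abs()

    # Cost is a fixed fraction of equity on each change day.
    trading_cost = change * cost_bps / 10_000

    strategy_return = position * asset_return - trading_cost
    equity = initial_capital * (1 + strategy_return).cumprod()

    return pd.DataFrame(
        {
            "asset_return": asset_return,
            "trading_cost": trading_cost,
            "strategy_return": strategy_return,
            "equity": equity,
        }
    )
```

Passing cost_bps=0 or cost_bps=10 to the same function gives the two sensitivity cases without touching the signal.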
Results
The 5 bps base case tells a fairly clean story: the moving average rule lowered volatility and drawdown, but gave up a large amount of long-run compounding. Buy and hold had a deeper drawdown, but it also finished far ahead on wealth and CAGR.
| Metric | Strategy | Benchmark | Note |
|---|---|---|---|
| CAGR | 8.08% | 10.81% | 1993-01-29 to 2026-07-02 |
| Annualized volatility | 11.99% | 18.57% | Daily returns annualized with 252 trading days |
| Sharpe ratio | 0.71 | 0.65 | 0% risk-free rate |
| Max drawdown | -29.42% | -55.19% | Peak-to-trough equity drawdown |
| Calmar ratio | 0.27 | 0.20 | CAGR divided by absolute max drawdown |
| Time in market | 75.35% | 100.00% | Daily position average |
| Position changes | 215 | Initial buy only | Cash-to-SPY and SPY-to-cash changes |
| Final equity | $134,111 | $308,867 | $10,000 initial capital |
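The metrics in the table can be computed from a daily return series roughly as follows. The notes column fixes the 252-day annualization, the 0% risk-free rate, and the Calmar definition; the calendar-day CAGR convention, the Sharpe formula (annualized mean over annualized volatility), and the function name summarize are assumptions in this sketch.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252


def summarize(daily_returns: pd.Series, initial_capital: float = 10_000.0) -> dict:
    """Summary metrics for a daily return series indexed by date.

    Hypothetical helper: names and conventions are illustrative only.
    """
    equity = initial_capital * (1 + daily_returns).cumprod()

    # Assumption: CAGR uses elapsed calendar time between first and last date.
    years = (daily_returns.index[-1] - daily_returns.index[0]).days / 365.25
    cagr = (equity.iloc[-1] / initial_capital) ** (1 / years) - 1

    volatility = daily_returns.std() * np.sqrt(TRADING_DAYS)

    # Assumption: Sharpe is annualized mean over annualized volatility, 0% risk-free.
    sharpe = daily_returns.mean() * TRADING_DAYS / volatility

    # Peak-to-trough drawdown of the equity curve (a negative number).
    max_drawdown = (equity / equity.cummax() - 1).min()

    return {
        "cagr": cagr,
        "volatility": volatility,
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
        "calmar": cagr / abs(max_drawdown),
        "final_equity": equity.iloc[-1],
    }
```

As a consistency check on the table itself, the Calmar row follows from the two rows it depends on: 8.08 / 29.42 is about 0.27 and 10.81 / 55.19 is about 0.20.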
Interpretation
My interpretation is that the rule did what a simple trend filter is supposed to do: it reduced risk exposure and cut the worst drawdown. That is useful information. It is also not free.
The cost showed up as opportunity cost during long bull markets. When the strategy is in cash, it is protected from the next down day, but it is also absent from the next up day. Over a sample where SPY compounded strongly, that missing exposure mattered more than the drawdown reduction if the objective was final wealth.
The 215 position changes are also important. That count is a clue about whipsaw risk, taxable events, execution friction, and the behavioral pressure of repeatedly switching between risk-on and risk-off. A spreadsheet cost model can deduct 5 bps cleanly; a human or a taxable account may experience the same turnover less cleanly.
I would not read this as a verdict on market timing. Under these assumptions, I see a drawdown-control tradeoff, not a return-enhancement result.
Robustness Checks
The main result uses 5 bps per position change. The table below reruns the same signal with 0 bps and 10 bps costs.
| Cost per position change | CAGR | Sharpe | Max drawdown | Final equity |
|---|---|---|---|---|
| 0 bps | 8.43% | 0.74 | -28.00% | $149,340 |
| 5 bps | 8.08% | 0.71 | -29.42% | $134,111 |
| 10 bps | 7.73% | 0.68 | -30.82% | $120,429 |
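Each position change costs only a few basis points, but the drag compounds across 215 changes: going from 0 bps to 10 bps lowers CAGR from 8.43% to 7.73% and final equity from $149,340 to $120,429. Even a slow ETF rule is not immune to friction.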
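A back-of-the-envelope check of how costs compound, using only the figures in the tables above. It treats each position change as a multiplicative haircut on equity, which is an approximation: the backtest subtracts the cost from the daily return, so the two can differ by a few dollars. The function name is hypothetical.

```python
ZERO_COST_EQUITY = 149_340.0  # final equity in the 0 bps row
POSITION_CHANGES = 215        # position changes reported in the results table


def approx_equity_after_costs(zero_cost_equity, changes, cost_bps):
    """Approximate final equity by haircutting equity once per position change."""
    haircut = 1 - cost_bps / 10_000
    return zero_cost_equity * haircut ** changes


for bps in (5, 10):
    estimate = approx_equity_after_costs(ZERO_COST_EQUITY, POSITION_CHANGES, bps)
    print(f"{bps} bps: approx final equity ${estimate:,.0f}")
```

Both estimates land close to the reported $134,111 and $120,429, which suggests the gap between the cost rows is explained almost entirely by the per-change deduction compounding over the sample.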
Limitations
Two simplifications matter most. First, cash earns 0%, which understates cash-period returns when short-term rates are high. Second, the execution model assumes fills at the same close that generates the signal, rather than a strict next-open fill. I used that choice to keep the adjusted price series internally consistent, not because it is a perfect fill model.
Taxes, account constraints, and real-world implementation behavior are outside this version. A natural follow-up would add a Treasury bill cash proxy and compare this execution model with a next-open or one-day-delayed next-close version.
Reproducibility
Run the study from the research repository:
cd studies/spy-200-day-moving-average
pip install -r requirements.txt
python3 backtest.py
python3 plot.py
python3 -m unittest discover -s . -p "test_*.py"
The generated files are:
| File | Purpose |
|---|---|
| data/SPY.csv | Cached adjusted OHLCV from yfinance |
| outputs/spy-200dma-summary.csv | Summary metrics for 0, 5, and 10 bps costs |
| outputs/spy-200dma-equity.csv | Daily signal, position, returns, costs, equity, and drawdowns |
| outputs/spy-200dma-trades.csv | Position-change log |
| charts/spy-200dma-equity-curve.svg | Equity curve chart |
| charts/spy-200dma-drawdowns.svg | Drawdown chart |
| charts/spy-200dma-price-sma.svg | Adjusted close and SMA200 chart |
FAQ
Does the SPY 200-day moving average strategy beat buy and hold?
Not by ending wealth or CAGR in this version. Under the base assumptions, the strategy ended at $134,111 versus $308,867 for buy and hold. The tradeoff was lower volatility and a smaller maximum drawdown.
How is look-ahead bias avoided in this backtest?
The signal is shifted by one trading day before returns are applied. The position held over the return ending on day t comes from the signal computed at the close of day t-1, not from any information from day t.
Why use adjusted close data for SPY?
SPY pays distributions. Adjusted close keeps the strategy and benchmark aligned on the same dividend- and split-adjusted return series.
How are transaction costs modeled?
Costs are deducted as a fixed percentage of portfolio equity on position-change days: 5 bps in the base case, with 0 bps and 10 bps sensitivity runs.
Why does the base case use 0% cash return?
To keep the first version to one market data source. It is simple and auditable, but it can understate returns while the strategy is in cash.