Monday, September 22, 2025

Changing the Ensemble Model to a Stacked Meta Ensemble

 
Earlier we had a weighted ensemble model that essentially used the R² values of the Annual and Quarterly models as weighting factors to blend their predictions.

It was here that I realized we were not calculating or saving the predicted forward return; we were only calculating scores, writing them to a scoring summary, and saving the R².

So I changed things around: I added a stacked meta ensemble, and I describe how both approaches work below. We now run BOTH of these.

Weighted Ensemble

  • A simple blend of the two base models.
  • Annual and quarterly predictions are combined with weights proportional to their out-of-sample R² performance.

Result: ensemble_pred_fwdreturn and ensemble_pred_fwdreturn_pct.

This improves stability but is still fairly “rigid.”
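
Here is a minimal sketch of what that blend looks like, assuming we already have the two base-model predictions and their out-of-sample R² scores in hand. The column and variable names (annual_pred, quarterly_pred, r2_annual, r2_quarterly) are placeholders, not the actual pipeline names.

```python
# A minimal sketch of the weighted blend, assuming per-ticker predictions from
# the annual and quarterly models plus their out-of-sample R² are available.
import pandas as pd

def weighted_ensemble(df: pd.DataFrame, r2_annual: float, r2_quarterly: float) -> pd.DataFrame:
    # Weights proportional to each base model's out-of-sample R²
    w_a = r2_annual / (r2_annual + r2_quarterly)
    w_q = 1.0 - w_a

    out = df.copy()
    out["ensemble_pred_fwdreturn"] = w_a * df["annual_pred"] + w_q * df["quarterly_pred"]
    # Percentile rank (0-100) of the blended prediction across the universe;
    # this is an assumption about how the *_pct column is defined.
    out["ensemble_pred_fwdreturn_pct"] = out["ensemble_pred_fwdreturn"].rank(pct=True) * 100
    return out
```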


Meta-Model Ensemble (Stacked Ensemble)

A second-level model (XGBoost) is trained on:

  1. Predictions from the annual model
  2. Predictions from the quarterly model
  3. Additional features (sector, industry, etc.)

This meta-model learns the optimal way to combine signals dynamically rather than relying on fixed weights.

Result: ensemble_pred_fwdreturn_meta and ensemble_pred_fwdreturn_meta_pct.
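
For reference, here is a rough sketch of the stacking step, assuming the base-model predictions and the extra categorical features already sit in one DataFrame. The feature names, the fwd_return target, and the XGBoost hyperparameters are illustrative, not the real configuration.

```python
# A rough sketch of the stacked meta-ensemble: a second-level XGBoost model
# trained on the base-model predictions plus sector/industry features.
import pandas as pd
from xgboost import XGBRegressor

def train_meta_ensemble(df: pd.DataFrame) -> XGBRegressor:
    # Meta-features: the two base-model predictions plus one-hot encoded
    # sector/industry. In practice the base predictions should be out-of-fold
    # to avoid leaking the target into the meta-model.
    X = pd.concat(
        [
            df[["annual_pred", "quarterly_pred"]],
            pd.get_dummies(df[["sector", "industry"]]),
        ],
        axis=1,
    )
    y = df["fwd_return"]  # realized forward return, the training target

    meta = XGBRegressor(n_estimators=300, max_depth=4,
                        learning_rate=0.05, subsample=0.8)
    meta.fit(X, y)
    return meta

# Scoring then looks like: df["ensemble_pred_fwdreturn_meta"] = meta.predict(X)
```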

How well did it work?

Results:

  1. Weighted Ensemble: R² ~0.19, Spearman ~0.50
  2. Meta-Model Ensemble: R² ~0.75, Spearman ~0.65

Quintile backtests confirm a strong monotonic relationship between predicted quintiles and realized forward returns.
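
A minimal sketch of that quintile check, assuming a DataFrame that holds the meta-ensemble prediction and the realized forward return (column names are placeholders for whatever the pipeline actually writes):

```python
# Bucket names by predicted return quintile and compare realized forward returns.
import pandas as pd

def quintile_backtest(df: pd.DataFrame) -> pd.Series:
    # Q1 = lowest predicted return, Q5 = highest
    quintile = pd.qcut(df["ensemble_pred_fwdreturn_meta"], 5,
                       labels=["Q1", "Q2", "Q3", "Q4", "Q5"])
    # Mean realized forward return per predicted quintile; a monotonic
    # increase from Q1 to Q5 is what the backtest is checking for.
    return df.groupby(quintile, observed=True)["fwd_return"].mean()
```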


Target Encoding & One-Hot Encoding

I had completely overlooked these concepts earlier on, and somehow not until just now - this late in the game - did they come up as I was re...