Quick post to discuss some things...
First, what I am discovering after training this model daily, and running predictions, is that the news sentiment is NOT what is influencing this model.
The news sentiment is based on a Transformer model for financial news (FinBERT). The news IS fresh: we pull these articles and then predict over a 1-3 day period, before they are expired out to a training database where predicted returns are compared with actual returns.
News is NOT what this model is learning and training on. Instead, momentum and macro environment features are what influence the predicted returns. Part of this could be the news being stale - the market already knows the news before the model runs. Also, we are only scoring headlines - not actual articles - due to processing constraints. So there's that.
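As a rough illustration of the "predicted vs. actual" comparison, here is a minimal sketch that computes a directional hit rate per horizon. The record layout `(horizon_days, predicted_return, actual_return)` and the function name are assumptions made for this example, not the actual schema of the training database, and the sample rows are made up.

```python
from collections import defaultdict

def hit_rate_by_horizon(records):
    """Fraction of predictions whose sign matches the realized return, per horizon.

    `records` is an iterable of (horizon_days, predicted_return, actual_return).
    This layout is hypothetical, chosen only for the sketch.
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for horizon, predicted, actual in records:
        counts[horizon] += 1
        # A "hit" means predicted and actual returns share the same sign.
        if predicted * actual > 0:
            hits[horizon] += 1
    return {h: hits[h] / counts[h] for h in counts}

# Made-up rows for illustration only, not real model output.
sample = [
    (1, 0.010, 0.004),
    (1, -0.005, 0.002),
    (3, 0.020, 0.015),
    (3, 0.010, -0.010),
]
rates = hit_rate_by_horizon(sample)
```

A hit rate close to 0.5 on a given horizon would suggest the predictions carry little directional information there.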
From the training output, here's the feature importance ranked across all three horizons:
Macro features (dominating ~50% of importance):
vix_0d - market fear index
treasury_spread_0d - yield curve spread
business_confidence_0d - FRED macro indicator
consumer_sentiment_0d - FRED macro indicator
Momentum features (~40%):
momentum_strength - short-term trend deceleration
risk_adjusted_momentum - momentum relative to volatility
trend_consistent - direction consistency
Sentiment features (~10%):
tone_signed_resid - residualized tone score
finbert_signed_resid - residualized FinBERT score
tone_signed - raw tone
finbert_signed - raw FinBERT
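The group percentages above can be reproduced from per-feature importances by summing within each group and normalizing. The sketch below uses the feature names from this post, but the importance values are placeholders chosen for the example, not the numbers from the actual training output, and `group_importance_shares` is a hypothetical helper.

```python
from collections import defaultdict

# Feature-to-group mapping, using the feature names listed in the post.
FEATURE_GROUPS = {
    "vix_0d": "macro",
    "treasury_spread_0d": "macro",
    "business_confidence_0d": "macro",
    "consumer_sentiment_0d": "macro",
    "momentum_strength": "momentum",
    "risk_adjusted_momentum": "momentum",
    "trend_consistent": "momentum",
    "tone_signed_resid": "sentiment",
    "finbert_signed_resid": "sentiment",
    "tone_signed": "sentiment",
    "finbert_signed": "sentiment",
}

def group_importance_shares(importances):
    """Sum raw importances by group and normalize so the shares add up to 1."""
    totals = defaultdict(float)
    for name, value in importances.items():
        totals[FEATURE_GROUPS.get(name, "other")] += value
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {group: 0.0 for group in totals}
    return {group: value / grand_total for group, value in totals.items()}

# Placeholder importances, NOT the model's real numbers.
example = {name: 2.0 for name, group in FEATURE_GROUPS.items() if group != "sentiment"}
example.update({name: 0.5 for name, group in FEATURE_GROUPS.items() if group == "sentiment"})

shares = group_importance_shares(example)
```

Running this once per horizon and averaging the shares would give a cross-horizon ranking, assuming the importances are on a comparable scale across the three models.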
So the model is essentially saying: buy beaten-down stocks when the macro environment is calm. The news sentiment is contributing about 10% of the predictive signal, which is barely above noise given the R² of 0.002 on the residual models.
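For context on how small an R² of 0.002 is, here is a minimal sketch of the standard coefficient of determination, assuming that is the metric being reported for the residual models. The function name `r_squared` is just for this example.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Squared error of the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of always predicting the mean of the actual values.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values have zero variance")
    return 1.0 - ss_res / ss_tot
```

An R² of 0 means the model does no better than always predicting the mean return, so 0.002 corresponds to explaining about 0.2% of the variance in the target.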
It's not really a news sentiment model at this point. It's a macro-regime mean-reversion model that happens to have sentiment features along for the ride.
Which raises the real question — is that actually a bad thing? Mean reversion in calm macro regimes is a legitimate strategy. The problem isn't the model's logic, it's that you only have 3 months of data so it hasn't seen enough regime diversity to be robust. And right now you're in a fear regime — elevated VIX, geopolitical uncertainty — which is exactly when this model historically underperforms.