Wednesday, February 18, 2026

Residuals - How to Separate Two Inter-related Variables in a Model

Someone on an algotrading site suggested that I try to work out how much of momentum is really sentiment, and vice versa.

I had to look into this, and it took me down a road that, as you shall see, introduced me to the concept of residuals. That is what this post is about.

Chicken or egg?
Is sentiment causing returns, or just reacting to them? Raw scores conflate these.

Put another way: if news sentiment influences momentum, how much of the momentum is due to the sentiment?

Well, there is a way to ferret this out, and it rests on residuals. Regress one variable on the other, and the residual is the part that the other variable cannot explain.

Step 1: Regress sentiment on momentum features:

finbert_raw ~ lag_ret_20d + lag_ret_5d + volatility_5d

Step 2: The residual is what's left after stripping out the part that price explains:

finbert_signed_resid = finbert_raw - predicted(finbert | momentum)
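
The two steps above can be sketched in Python. This is a minimal sketch, not the actual pipeline: the data is synthetic, `residualize` is a hypothetical helper name, and plain ordinary least squares with an intercept (via NumPy) is an assumption, since the formula above does not say which regression is used.

```python
import numpy as np

def residualize(target, features):
    """Return target minus its OLS prediction from features (hypothetical helper)."""
    # Add an intercept column so the residual has zero mean in-sample.
    X = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ coef

# Synthetic stand-in data; real inputs would come from your own feature pipeline.
rng = np.random.default_rng(0)
n = 500
lag_ret_20d = rng.normal(0.0, 0.05, n)
lag_ret_5d = rng.normal(0.0, 0.02, n)
volatility_5d = np.abs(rng.normal(0.01, 0.005, n))
# Assumed toy relationship: sentiment partly echoes past returns, plus its own noise.
finbert_raw = 2.0 * lag_ret_20d + 1.0 * lag_ret_5d + rng.normal(0.0, 0.1, n)

# Step 1: regress sentiment on the momentum features.
momentum = np.column_stack([lag_ret_20d, lag_ret_5d, volatility_5d])
# Step 2: keep what the regression could not explain.
finbert_signed_resid = residualize(finbert_raw, momentum)
```

By construction, an in-sample OLS residual is uncorrelated with every feature it was regressed on, which is exactly the "stripped of price explanation" property.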

 

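Now you can use the residual as a feature in the model, instead of just tossing in the raw values. The residual is the part of sentiment that past price action does not already explain, which makes it the cleanest sentiment alpha you can extract. Raw scores, loosely speaking, are 90% noise.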
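
One practical wrinkle when the residual becomes a model feature: if the regression is fit on all the data, the coefficients have seen the future. A hedged sketch of one way around that is to fit on a training window and apply the same coefficients to later rows. The helper names, the 300/100 split, and the synthetic data below are all assumptions for illustration.

```python
import numpy as np

def fit_ols(target, features):
    """Fit OLS with an intercept and return the coefficients (hypothetical helper)."""
    X = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef

def apply_residual(target, features, coef):
    """Residual of target against a previously fitted regression."""
    X = np.column_stack([np.ones(len(target)), features])
    return target - X @ coef

# Synthetic stand-ins for the three momentum features and the raw sentiment score.
rng = np.random.default_rng(1)
n = 400
momentum = rng.normal(0.0, 0.03, (n, 3))
finbert_raw = momentum @ np.array([1.5, 0.5, 0.0]) + rng.normal(0.0, 0.1, n)

split = 300  # assumed boundary between training rows and later rows
coef = fit_ols(finbert_raw[:split], momentum[:split])

# Training residuals feed model fitting; later residuals reuse the same coefficients.
resid_train = apply_residual(finbert_raw[:split], momentum[:split], coef)
resid_test = apply_residual(finbert_raw[split:], momentum[split:], coef)
```

The later residuals are no longer guaranteed to be exactly uncorrelated with the momentum features, but they only use information that was available at fit time.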

What does a residual capture?

✅ News sentiment BEFORE price moves (true alpha)
✅ Analyst upgrades/downgrades not reflected in price yet  
✅ Management guidance changes
✅ Product launch sentiment
❌ Earnings reactions (already priced in, so lag_ret_20d absorbs them)

