
Tuesday, July 29, 2025

AI / ML - Feature Engineering - Earnings Surprises

I really felt that news would be a great thing to add to the model. But the problem with news is that it is recent, while the data I am using with XGBoost is historical time-series data.

If you added news, what would you do - cram the values only into the most recent year?

I think if you go with data that is changing and close to real-time, you need to rethink the whole model, including the type of model. Maybe news works better with a Transformer or an LSTM neural network than with a predictive regression model.

So - I am running out of new things to add to my model to try to boost its predictability (increase the R-squared).

Then I came up with the idea of adding earnings beats, misses, and meets. A quick consult with an LLM suggested using an earnings_surprise score, so that we get not only the miss/meet/beat counts but also the magnitude. A great idea.

I implemented this, and lo and behold, the earnings_surprise score moves the needle. Substantially and consistently.

The best thing about this is that the earnings_surprise score is symbol-specific, so it is not some macro feature I have to figure out how to interact with the symbol data.
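For what it's worth, the calculation itself is straightforward. Here is a minimal sketch of how such a score could be computed - this is not my exact code, and the column names (eps_actual, eps_estimate) are assumptions:

```python
import numpy as np
import pandas as pd

def add_earnings_surprise(earnings: pd.DataFrame) -> pd.DataFrame:
    """Add a signed, scaled earnings surprise column per reporting row."""
    out = earnings.copy()
    # > 0 = beat, < 0 = miss, 0 = meet. Dividing by |estimate| captures
    # the magnitude of the surprise, not just its direction.
    out["earnings_surprise"] = (
        (out["eps_actual"] - out["eps_estimate"])
        / out["eps_estimate"].abs().replace(0, np.nan)
    )
    return out
```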

Saturday, July 5, 2025

AI / ML - Here is Why You Backtest

My model was working nicely. 

It scored stocks on a number of fronts (pillars).

It used Profitability. It used Solvency. It used Liquidity. It used Efficiency.

These are the "four horsemen" of stock evaluation.

I added some of my own twists to the "grading formula", in the form of macro variables (consumer sentiment, business confidence, et al.). I also had some trend analysis, rewarding trends up and penalizing trends down. I rewarded (and penalized) profitability, cash flow, etc. I had scaling done correctly, too, to ensure a "fair playing field", and some sector normalization as well.

When I ran the model, using XGBoost to predict 1-year forward return, the stocks at the top of the report looked great when I spot-checked them against various sites that also grade stocks. I felt good. The R-squared I was getting from XGBoost on a SHAP-pruned feature run was at academic levels (as high as .46 at one point).
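For context, the shape of that run is roughly the following - a hedged sketch, assuming a feature matrix X and a 1-year forward return target y already exist; the hyperparameters are illustrative, not the ones I actually tuned:

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from xgboost import XGBRegressor

# Hold out 20% of rows to measure R-squared out of sample.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = XGBRegressor(n_estimators=500, max_depth=4, learning_rate=0.05)
model.fit(X_train, y_train)

print("R-squared:", r2_score(y_test, model.predict(X_test)))
```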

As part of some final QA, I ran the resultant code through AI engines, which praised its thoroughness and slapped me on the back, reassuring me that my model was on a par with, if not superior to, many academic models.

Then - someone asked me if this model had been back-tested.
And the answer was no. I had not back-tested it up to that point. I didn't think I was ready for back-testing.

Maybe back-testing is an iterative "continual improvement" process that should be done much earlier in the process, to ensure you don't go down the wrong road.  But I didn't do that.

So, I ran a back-test. And to my horror, the model was completely "upside down" in terms of which stocks predicted forward return. The AI engines suggested I simply "flip the sign" on my score and invert the rankings. But that didn't feel right. It felt like I was trying to force a score.

So - the first thing we did was evaluate the scoring. We looked at the correlation between individual scoring pillars and forward return. Negative.

We then looked at correlation in more detail.

First, we calculated Pearson (row-level) and Spearman (rank-level) correlations.

They were negative.

Then, we calculated Average Fwd Return by Score Decile. Sure enough, there was a trend, but completely backwards from what one would expect. 

Quality stocks in the top deciles (9, 8, 7, 6, 5) had negative values that improved as the decile dropped, while the shaky stocks (deciles 0 through 4) had graduated positive values.
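In pandas terms, those two checks look something like this - a sketch, assuming a DataFrame df with a score column and a fwd_return column:

```python
import pandas as pd
from scipy.stats import pearsonr, spearmanr

# Row-level (linear) and rank-level (monotonic) correlation.
pearson_r, _ = pearsonr(df["score"], df["fwd_return"])
spearman_r, _ = spearmanr(df["score"], df["fwd_return"])
print(f"Pearson: {pearson_r:.3f}   Spearman: {spearman_r:.3f}")

# Average forward return by score decile (0 = lowest scores, 9 = highest).
# A healthy model slopes upward; mine sloped the other way.
df["decile"] = pd.qcut(df["score"], 10, labels=False, duplicates="drop")
print(df.groupby("decile")["fwd_return"].mean())
```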

The interesting analysis was a dump of the correlations of each individual pillar to forward return. The strongest were Profitability and Valuation, followed by MacroBehavior (macroeconomic features), but none of these were strong. Most of the correlations were slightly negative, with a couple slightly above zero.

But one was VERY interesting: a log1p correlation between the "final composite score" and forward return that was noticeable if not sizable - but negative.
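That check is a near one-liner - a sketch, assuming the composite score is non-negative (log1p requires values above -1) and ignoring NaN handling:

```python
import numpy as np

# Correlate the log-compressed composite score with forward return.
log_corr = np.corrcoef(np.log1p(df["composite_score"]), df["fwd_return"])[0, 1]
print("log1p(composite_score) vs fwd_return:", round(log_corr, 3))
```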

We experimented with commenting out the penalties so we could focus on "true metrics" (a flag was engineered in to turn these off, which made testing easy). We re-ran the model; STILL the correlations with forward return were negative.

Then - we decided to remove individual pillars. Didn't change a thing. STILL the correlations with forward return were negative.

Finally, after the AI assured me - after reviewing the code - that there were no scoring errors, the only thing left to try, aside from shelving the model for lack of success in predicting forward return, was in fact to put a negative sign on the score to invert it and "flip the score".

I did this. And while the companies that bubbled to the top were shaky on their fundamentals, I did see cases where analyst price targets on these stocks were above (and in some cases way above) the current stock price.

So here is the evidence that we have a model that IS predicting forward return, in a real way.

So - in conclusion. Quality does NOT necessarily equate to forward return.

What does??? Well, nothing in those pillars individually. But when you combine all of these metrics/features into a big pot and send them to a sophisticated regression modeler, it does find a combination that has a roughly linear relationship with forward return - and depending on which way you flip that line, you can theoretically gain, or lose, a return on your money.

Now, if we had put money into those "great stocks" at the top of that prior list and then had to watch as we lost money, it would have been puzzling and frustrating. But do we have the courage to put money into these less-than-stellar-fundamentals stocks to see if this model is right, and that we WILL get a positive forward return?

I guess it takes some experimentation. Either a simulator, OR put $X into the top ten and another $X into the bottom ten and see how they perform. Which is what I might be doing shortly.
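The second option is cheap to paper-trade first - a sketch of the comparison, assuming one row per symbol with the model score and the realized forward return:

```python
# Equal weight into the ten highest-scored and ten lowest-scored names,
# then compare realized forward returns.
ranked = df.sort_values("score", ascending=False)
top10 = ranked.head(10)["fwd_return"].mean()
bottom10 = ranked.tail(10)["fwd_return"].mean()

print(f"Top 10 avg forward return:    {top10:.2%}")
print(f"Bottom 10 avg forward return: {bottom10:.2%}")
```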


Friday, June 13, 2025

AI/ML Feature Engineering - Adding Feature-Based Features

I added some new features (metrics) to my model. The Quarterly model.

To recap, I have downloaded quarterly statements for stock symbols, and I use these to calculate an absolute slew of metrics and ratios. Then I feed them into the XGBoost regression model, to figure out whether they can predict a forward return of stock price.

I added some macroeconomic indicators, because I felt that those might impact the quarterly price of a stock (short term) more than the pure fundamentals of the stock.

The fundamentals are used in a separate annual model, which is not distracted or interrupted by "events" or macroeconomics that get in the way of understanding the true health of a company, based on fundamentals, over a years-long period of time.

So - what did I add to the quarterly model?

  • Consumer Sentiment
  • Business Confidence
  • Inflation Expectations
  • Treasury Data (1-, 3-, and 10-year)
  • Unemployment 
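
Mechanically, the tricky part is attaching these economy-wide series to symbol-level quarterly rows without leaking future data. Here is a hedged sketch of one way to do it with an as-of merge - the DataFrame and column names are assumptions, not my actual code:

```python
import pandas as pd

# Both sides must be sorted on the merge keys for merge_asof.
fundamentals = fundamentals.sort_values("quarter_end")
macro = macro.sort_values("date")  # consumer_sentiment, business_confidence, ...

# For each quarter end, take the most recent macro reading at or before
# that date ("backward"), so no future information leaks into the row.
merged = pd.merge_asof(
    fundamentals,
    macro,
    left_on="quarter_end",
    right_on="date",
    direction="backward",
)
```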

And wow - did these variables kick in. At one point, I had the model's R-squared up to .16.

Unemployment did nothing, actually, and I wound up removing it as a noise factor. I also realized I had the fiscal quarter included, and removed that too, since it, like sector and other descriptive variables, should not be in the model.

But as I was about to put a wrap on it, I decided to do one more "push" to improve the R-squared value and started fiddling around. I got cute, adding derived features. One thing I did was add lag features for business confidence, consumer sentiment, and inflation expectations. Interestingly, two of these shot to the top of the influential metrics.

Feature importance list, sorted by importance (return price influence):

feature                     weight
business_confidence_lag1    0.059845
inflation_lag1              0.054764

But others were a bust, with 0.00000 values.

I tried removing the original metrics and keeping JUST the lags - it didn't really help.
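For reference, the lag construction itself is trivial - a sketch, assuming a date-sorted macro table; since these are economy-wide series, the shift runs over dates, not per symbol:

```python
# Shift each macro series by one period so a row sees last quarter's reading.
macro = macro.sort_values("date")
for col in ["business_confidence", "consumer_sentiment", "inflation_expectations"]:
    macro[f"{col}_lag1"] = macro[col].shift(1)
```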

Another thing worth noting is that I added SHAP values - a topic I will get into in more depth shortly, perhaps in a subsequent post. SHAP (SHapley Additive exPlanations) is a method used to explain the output of machine learning models by assigning each feature an importance value for a specific prediction, so that models - like so many - are not completely "black box".

But one thing I noticed when I added the SHAP feature list is that it does NOT match / line up with the feature importances that the XGBoost model reports.
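Here is a sketch of how the two rankings can be laid side by side, assuming the fitted model and a feature DataFrame X from the training step. Part of the mismatch may simply be that they measure different things: XGBoost's built-in importances count splits or gain inside the trees, while mean |SHAP| measures average impact on the prediction:

```python
import numpy as np
import pandas as pd
import shap

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # one row per sample, one column per feature

comparison = pd.DataFrame({
    "xgb_importance": model.feature_importances_,       # depends on importance_type
    "mean_abs_shap": np.abs(shap_values).mean(axis=0),  # avg |impact| on output
}, index=X.columns).sort_values("mean_abs_shap", ascending=False)

print(comparison.head(10))
```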

So I definitely need to look into this.

Monday, May 19, 2025

AI / ML - It's All About the Data. Imputation and Clustering Algorithms

In my spare time, I have been working on a fintech project, done in conjunction with a thick book I have been reading called Machine Learning for Algorithmic Trading by Stefan Jansen.

I am mostly finished with this book, and have coded - from scratch - my own implementations of the concepts introduced in this book. 

What have I learned thus far?

It is ALL ABOUT THE DATA. Most of my time has been spent scrutinizing the data: disqualifying data, throwing away or imputing data that has no values, and Winsorizing/capping data values so that they don't skew into outliers.

Dates. Dates have always been a problem: dropping timestamps off of dates properly so that date comparisons and date math work correctly.
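A minimal pandas sketch of those cleaning steps - the DataFrame and column names are assumptions:

```python
import pandas as pd

df = df.dropna(subset=["fwd_return"])  # disqualify rows missing the target

# Impute remaining numeric gaps with the column median (one simple choice).
num_cols = df.select_dtypes("number").columns
df[num_cols] = df[num_cols].fillna(df[num_cols].median())

# Winsorize/cap at the 1st and 99th percentiles to tame outliers.
df[num_cols] = df[num_cols].clip(
    lower=df[num_cols].quantile(0.01),
    upper=df[num_cols].quantile(0.99),
    axis=1,
)

# Strip timestamps so date comparisons and date math behave.
df["report_date"] = pd.to_datetime(df["report_date"]).dt.normalize()
```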

So far, a lot of what I have done is data clustering, using algorithms like DBSCAN, K-Means, and Agglomerative to find useful cluster patterns, plus regression techniques to find correlations. The models and scoring so far are my own "secret sauce" deterministic models. But I do plan to snap in some AI to do automatic weight adjustment soon.
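The clustering pass looks roughly like this - a sketch, assuming a numeric feature matrix X (scaling first matters, since K-Means and DBSCAN are distance-based):

```python
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import DBSCAN, KMeans

X_scaled = StandardScaler().fit_transform(X)

kmeans_labels = KMeans(n_clusters=5, n_init=10, random_state=42).fit_predict(X_scaled)
dbscan_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X_scaled)  # -1 marks noise
```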

Right now, I am using my own deterministic scoring model so that it can serve as a comparative baseline. But eventually I will enhance it to be more dynamic through self-learning.
