by Jonathan Widarsa
on the theory and practice of unveiling structure behind data.
-

What K-means Says about Stocks
One natural question we should ask is whether stocks can be grouped by how they behave, rather than just by the sector labels someone assigned them decades ago. For example, here’s one of many things sector labels don’t capture: a tech company and a utility might sit in completely different industries, yet move in near-perfect lockstep […]
-

Penalized Regression for Stock Returns
Predicting asset returns is a difficult game where the usual rules of regression are stress-tested by noisy signals, correlated features, and the ever-present risk of overfitting to market microstructure. While ordinary least squares (OLS) gives us a clean starting point, penalized regression methods like Ridge, LASSO, and Elastic Net offer a principled way to impose […]
-

Continuous Latent States with Kalman Filters
In the previous article, we introduced Hidden Markov Models (HMMs) as a way to capture volatility regime-switching in SPY returns. By decomposing returns into distinct states, i.e., low, medium, and high volatility, we were able to uncover meaningful structure that a single continuous model like GARCH could not explicitly represent. However, HMMs assume that the market […]
-

HMMs for Volatility Regime-Switching
Previously, we saw that modeling variance, rather than the mean, provides a much more effective way of capturing financial time series dynamics. GARCH models, in particular, are able to reproduce volatility clustering and persistence, making them a strong baseline for volatility modeling. However, GARCH comes with an important structural assumption, which is that volatility evolves […]
-

GARCH Sees What ARIMA Cannot
In the previous article, we fit AR, MA, ARMA, and ARIMA models to SPY log returns and watched them systematically fail in a very specific way. The residuals showed volatility clustering, the QQ plots showed fat tails, and the ACF plot of the squared residuals confirmed that the variance itself was autocorrelated. ARIMA models the conditional mean. […]
-

Can ARIMA Predict SPY Data?
This is (hopefully) a beginner-friendly tutorial on modeling SPY data with linear time series models. Specifically, we take a look at basic properties of the data, the fit of AR, MA, ARMA, and ARIMA models, and, spoiler alert, why they suck at the job. A good refresher from my article on linear […]