I examine the sensitivity of scoring rules for distribution forecasts in two dimensions: sensitivity to linear rescaling of the data and the influence of measurement error on the forecast evaluation outcome. First, I show that all commonly used scoring rules for distribution forecasts are robust to rescaling the data. Second, I show that the forecast ranking based on the continuous ranked probability score is less sensitive to gross measurement error than the ranking based on the log score. The theoretical results are complemented by a simulation study aligned with frequently revised quarterly US GDP growth data and an empirical application forecasting realized variances of S&P 100 constituents.
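The second result can be illustrated with a minimal sketch. It is not the paper's setup: it assumes two hypothetical Gaussian forecasts, a sharp forecast A = N(0, 1) and a wide forecast B = N(0, 3^2), and uses the standard closed-form CRPS of a normal distribution. The observation values (twenty clean readings of 0.5 and one grossly mismeasured reading of 10) are made-up toy numbers. Both scores are negatively oriented, so a lower value is better.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_score(y, mu, sigma):
    """Negative log density of N(mu, sigma^2) at y; grows quadratically in |y - mu|."""
    z = (y - mu) / sigma
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * z * z

def crps_normal(y, mu, sigma):
    """Closed-form CRPS of N(mu, sigma^2) at y; grows roughly linearly in |y - mu|."""
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * norm_cdf(z) - 1.0)
                    + 2.0 * norm_pdf(z) - 1.0 / math.sqrt(math.pi))

def mean_score_gap(score, observations):
    """Mean score of forecast A minus forecast B; negative means A is ranked better."""
    a = [score(y, 0.0, 1.0) for y in observations]  # hypothetical sharp forecast A
    b = [score(y, 0.0, 3.0) for y in observations]  # hypothetical wide forecast B
    return sum(a) / len(a) - sum(b) / len(b)

clean = [0.5] * 20                 # toy observations without measurement error
contaminated = clean + [10.0]      # same sample plus one gross measurement error

log_gap_clean = mean_score_gap(log_score, clean)
log_gap_contaminated = mean_score_gap(log_score, contaminated)
crps_gap_clean = mean_score_gap(crps_normal, clean)
crps_gap_contaminated = mean_score_gap(crps_normal, contaminated)

print("log score gap (A - B):", log_gap_clean, "->", log_gap_contaminated)
print("CRPS gap      (A - B):", crps_gap_clean, "->", crps_gap_contaminated)
```

In this toy sample both scores prefer A on the clean data. After the single contaminated observation is added, the log score gap changes sign while the CRPS gap does not, because the log score penalizes the outlier quadratically and the CRPS only about linearly.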
We study the relation between a comprehensive set of firm characteristics and the entire universe of individual equity option prices. We find that 42 out of 86 characteristics are priced in the option market, in the sense that they significantly explain differences in the implied volatility surface (IVS) across stocks. Motivated by this finding, we model the IVS of a given stock as a function of its characteristics with a local linear random forest. This approach addresses the illiquidity of the equity option market by effectively grouping similar stocks during estimation. Our method outperforms a stock-specific benchmark model out-of-sample and allows us to uncover the nonlinear interactions between characteristics and option prices.
The highfrequency package for the R programming language provides functionality for pre-processing financial high-frequency data, analyzing intraday stock returns, and forecasting stock market volatility. For academics and practitioners alike, it provides the toolchain required to work with such datasets and to conduct statistical analyses dedicated to spot volatility, jumps, realized measures, and more. We showcase our implemented routines and models on raw high-frequency data from large stock exchanges.
We propose a heterogeneous autoregressive (HAR) model with time-varying parameters in the form of a local linear random forest. In contrast to conventional random forests that approximate the volatility nonparametrically using local averaging, the building blocks of our forest are HAR panel models. The local HAR panel models cover the established linear relationship in realized variances while the trees model nonlinearities and interaction effects. Our approach allows the model coefficients to depend on idiosyncratic stock information and overall changing market conditions. We observe superior risk forecasting performance of the HAR forest across multiple forecast horizons and across 186 S&P 500 constituents. This leads to significantly higher utility for volatility managed portfolios. Superior forecast performance is especially pronounced for firms with high leverage.
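For orientation, the linear building block can be sketched as a plain HAR regression fitted by ordinary least squares. This is only the conventional single-series HAR, not the HAR forest or the panel version described above. The lag windows of 1, 5, and 22 days are the usual convention and are an assumption here, as is the simulated log-AR(1) series that stands in for realized variances.

```python
import numpy as np

def har_design(rv):
    """Build HAR regressors (constant, daily, 5-day mean, 22-day mean) and next-day targets."""
    rv = np.asarray(rv, dtype=float)
    T = len(rv)
    rows = range(21, T - 1)  # days t with 22 lags available and a target rv[t + 1]
    X = np.array([[1.0,
                   rv[t],                       # daily component
                   rv[t - 4:t + 1].mean(),      # weekly component (assumed 5 days)
                   rv[t - 21:t + 1].mean()]     # monthly component (assumed 22 days)
                  for t in rows])
    y = rv[22:]                                 # next-day realized variance
    return X, y

# Simulated stand-in for a realized variance series (hypothetical data).
rng = np.random.default_rng(0)
log_rv = np.zeros(500)
for t in range(1, len(log_rv)):
    log_rv[t] = 0.95 * log_rv[t - 1] + 0.3 * rng.standard_normal()
rv = np.exp(log_rv)

X, y = har_design(rv)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS coefficients
resid = y - X @ beta

print("HAR coefficients (const, daily, weekly, monthly):", beta)
```

A model with time-varying parameters would replace the single coefficient vector `beta` with coefficients that depend on stock characteristics and market conditions; the sketch keeps them constant.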
Low-volatility investing is typically implemented by sorting stocks based on simple risk measures; for example, the empirical standard deviation of last year’s daily returns. In contrast, we treat identifying next month’s ranking of volatilities as a forecasting problem aimed at the ex-post optimal sorting. We show that time series models based on intraday data outperform simple risk measures in anticipating the cross-sectional ranking in real time. The corresponding portfolios are more similar to the ex-ante infeasible optimal portfolio in multiple dimensions. Moreover, the increased signal in our improved volatility sorts survives portfolio weight smoothing for mitigating transaction costs.
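One way to make "anticipating the cross-sectional ranking" concrete is to compare a forecast-based sort with the ex-post sort. The sketch below uses two measures chosen for illustration, which are not necessarily the ones used in the paper: the Spearman rank correlation and the overlap of the low-volatility portfolios. All numbers are made-up toy values for six hypothetical stocks.

```python
import numpy as np

def ranks(values):
    """Ranks 0..n-1 from lowest to highest value (assumes no ties)."""
    return np.argsort(np.argsort(values))

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(a), ranks(b))[0, 1])

def low_vol_overlap(forecast, realized, n_low):
    """Share of the ex-post lowest-volatility stocks that the forecast sort also selects."""
    picked = set(np.argsort(forecast)[:n_low])
    optimal = set(np.argsort(realized)[:n_low])
    return len(picked & optimal) / n_low

# Toy next-month volatilities and two hypothetical forecasts.
realized   = np.array([0.10, 0.20, 0.15, 0.40, 0.30, 0.25])
forecast_x = np.array([0.12, 0.18, 0.16, 0.35, 0.33, 0.22])  # reproduces the ex-post ranking
forecast_y = np.array([0.20, 0.15, 0.25, 0.30, 0.22, 0.35])  # misorders several stocks

rho_x = spearman(forecast_x, realized)
rho_y = spearman(forecast_y, realized)
overlap_x = low_vol_overlap(forecast_x, realized, n_low=2)
overlap_y = low_vol_overlap(forecast_y, realized, n_low=2)

print("rank correlation:", rho_x, rho_y)
print("low-vol overlap: ", overlap_x, overlap_y)
```

Only the ordering matters for the sort: `forecast_x` misses the volatility levels but still recovers the ex-post ranking exactly.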
We examine the properties and forecast performance of multiplicative volatility specifications that belong to the class of GARCH-MIDAS models suggested in Engle et al. (2013). In those models, volatility is decomposed into a short-term GARCH component and a long-term component that is driven by an explanatory variable. We derive the kurtosis of returns, the autocorrelation function of squared returns, and the R^2 of a Mincer-Zarnowitz regression and evaluate these models in a Monte Carlo simulation. For S&P 500 data, we compare the forecast performance of GARCH-MIDAS models with a wide range of competitor models such as HAR, Realized GARCH, HEAVY and Markov-Switching GARCH. Our results show that the GARCH-MIDAS based on housing starts as an explanatory variable significantly outperforms all competitor models at forecast horizons of two and three months ahead.
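The multiplicative decomposition can be sketched with a short simulation. This is one common parametrization, not necessarily the exact specification studied in the paper. It assumes a log-linear long-term component with restricted beta lag weights, a unit-variance GARCH(1,1) short-term component, standard normal innovations, and a zero mean. All parameter values and the AR(1) explanatory variable are hypothetical.

```python
import numpy as np

def beta_weights(K, w2):
    """Restricted beta lag weights (first shape parameter fixed at 1); they sum to one."""
    k = np.arange(1, K + 1)
    raw = (1.0 - k / (K + 1.0)) ** (w2 - 1.0)
    return raw / raw.sum()

rng = np.random.default_rng(1)
K, n_months, n_days = 12, 60, 22          # MIDAS lags, months, trading days per month
alpha, beta = 0.06, 0.90                  # short-term GARCH parameters (hypothetical)
m, theta, w2 = 0.0, 0.3, 3.0              # long-term component parameters (hypothetical)

# Hypothetical monthly explanatory variable: a persistent AR(1) series.
x = np.zeros(n_months + K)
for t in range(1, len(x)):
    x[t] = 0.9 * x[t - 1] + rng.standard_normal()

# Long-term component: log tau_t = m + theta * sum_k phi_k * x_{t-k}.
phi = beta_weights(K, w2)
tau = np.array([np.exp(m + theta * phi @ x[t:t + K][::-1]) for t in range(n_months)])

# Short-term component g follows a unit-variance GARCH(1,1) on returns deflated by tau.
g = 1.0
returns, sigma2 = [], []
for t in range(n_months):
    for _ in range(n_days):
        var = tau[t] * g                  # conditional variance = long-term * short-term
        z = rng.standard_normal()
        returns.append(np.sqrt(var) * z)
        sigma2.append(var)
        g = (1.0 - alpha - beta) + alpha * g * z ** 2 + beta * g

returns = np.array(returns)
sigma2 = np.array(sigma2)
print("sample variance of simulated daily returns:", returns.var())
```

The long-term component `tau` changes once per month with the explanatory variable, while `g` fluctuates daily around one, so the product separates slow-moving from transitory volatility.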