How to measure the quality of a trading signal

The quality of a trading signal depends on its ability to predict future target returns and to generate material economic value when applied to positioning. Statistical metrics of these two properties are related but not identical. Empirical evidence must support both. Moreover, there are alternative criteria for both predictive power and economic trading value, which are summarized in this post. The right choice depends on the characteristics of the trading signal and the objective of the strategy: each strategy calls for its own appropriate criterion function. This is particularly important for statistical learning that seeks to optimize the hyperparameters of trading models and to derive meaningful backtests.
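For illustration only, the minimal Python sketch below computes one statistical and one economic quality metric for a hypothetical signal: the correlation and directional accuracy of the signal versus subsequent returns, and the Sharpe ratio of a naive signal-proportional strategy. All names and the daily-frequency annualization are assumptions, not criteria prescribed by the post.

import numpy as np
import pandas as pd

def signal_quality(signal: pd.Series, target_return: pd.Series) -> dict:
    """Simple predictive and economic quality metrics for a trading signal.

    `signal` is assumed to be aligned so that each value is available
    before the corresponding `target_return` is realized."""
    df = pd.concat([signal, target_return], axis=1, keys=["sig", "ret"]).dropna()
    # Predictive power: linear correlation and directional accuracy
    corr = df["sig"].corr(df["ret"])
    accuracy = (np.sign(df["sig"]) == np.sign(df["ret"])).mean()
    # Economic value: Sharpe ratio of a naive signal-proportional strategy
    pnl = df["sig"] * df["ret"]
    sharpe = np.sqrt(252) * pnl.mean() / pnl.std()  # assumes daily frequency
    return {"correlation": corr, "accuracy": accuracy, "naive_sharpe": sharpe}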


Nowcasting macro trends with machine learning

Nowcasting economic trends can make use of a broad range of machine learning methods. This not only serves the purpose of optimization but also allows replication of past information states of the market and thus supports realistic backtesting. A practical framework for modern nowcasting is the three-step approach of (1) variable pre-selection, (2) orthogonalized factor formation, and (3) regression-based prediction. Various methods can be applied at each step, in accordance with the nature of the task. For example, pre-selection can be based on sure independence screening, t-stat-based selection, least-angle regression, or Bayesian model averaging. Predictive models range from Markov switching models and quantile regressions to random forests, gradient boosting, macroeconomic random forests, and linear gradient boosting. There is some evidence that linear regression-based methods outperform random forests in the field of macroeconomics.
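The following is a stylized sketch of the three-step pipeline, using t-stat-based pre-selection, principal components for factor formation, and ordinary least squares for prediction as stand-ins for the richer menu of methods above. Function and variable names are illustrative, and a production nowcaster would handle information states and data vintages more carefully.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

def nowcast_last_period(X: np.ndarray, y: np.ndarray,
                        n_factors: int = 3, t_thresh: float = 2.0) -> float:
    """Three-step nowcast: (1) t-stat-based pre-selection, (2) orthogonalized
    factors via principal components, (3) regression-based prediction for the
    latest period. X holds candidate indicators for all periods including the
    latest; only y[:-1] (past targets) is used for fitting."""
    # Step 1: keep variables whose univariate regression t-statistic clears the threshold
    keep = []
    yc = y[:-1] - y[:-1].mean()
    for j in range(X.shape[1]):
        xc = X[:-1, j] - X[:-1, j].mean()
        beta = (xc @ yc) / (xc @ xc)
        resid = yc - beta * xc
        se = np.sqrt((resid @ resid) / (len(xc) - 2) / (xc @ xc))
        if abs(beta / se) > t_thresh:
            keep.append(j)
    X_sel = X[:, keep] if keep else X
    # Step 2: condense the selected variables into orthogonal factors
    factors = PCA(n_components=min(n_factors, X_sel.shape[1])).fit_transform(X_sel)
    # Step 3: fit a linear prediction on past periods and nowcast the latest one
    model = LinearRegression().fit(factors[:-1], y[:-1])
    return float(model.predict(factors[[-1]])[0])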


Finding (latent) trading factors

Financial market participants are tracking a growing and broadening range of correlated time series for the operation of trading strategies. This increases the importance of latent factor models, i.e., methods that condense high-dimensional datasets into a low-dimensional set of factors that retain most of the underlying relevant information. There are two principal approaches to finding such factors. The first uses domain knowledge to pick factor proxies up front. The second treats all factors as latent and applies statistical methods, such as principal components analysis, to a comprehensive set of correlated variables. A new paper proposes to combine domain knowledge and statistical methods using penalized reduced-rank regression. The approach promises improved accuracy and robustness.
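As a rough illustration of the statistical route, the sketch below implements a basic, unpenalized reduced-rank regression; the paper's penalized version and its use of domain-based factor proxies are not reproduced here, and all names are hypothetical.

import numpy as np

def reduced_rank_regression(X: np.ndarray, Y: np.ndarray, rank: int):
    """Basic (unpenalized) reduced-rank regression: fit full OLS coefficients,
    then constrain them to the top-`rank` directions of the fitted values.
    X: predictors (n_obs, n_predictors); Y: targets (n_obs, n_targets)."""
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)   # full-rank OLS solution
    Y_hat = X @ B_ols
    # Principal directions of the fitted values define the latent factors
    _, _, Vt = np.linalg.svd(Y_hat, full_matrices=False)
    V_r = Vt[:rank].T                               # (n_targets, rank)
    B_rr = B_ols @ V_r @ V_r.T                      # rank-constrained coefficients
    factors = X @ B_ols @ V_r                       # latent factor estimates
    return B_rr, factors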


Testing macro trading factors

The recorded history of modern financial markets and macroeconomic developments is limited. Hence, statistical analysis of macro trading factors often relies on panels, i.e., sets of time series across different currency areas. However, country experiences are not independent and are subject to common factors. Simply stacking data can lead to “pseudo-replication” and overestimated significance of correlation. A better method is to check significance through panel regression models with period-specific random effects. This technique adjusts targets and features of the predictive regression for common (global) influences. The stronger these global effects, the greater the weight of deviations from the period mean in the regression. In the presence of dominant global effects, the test for the significance of a macro factor would rely mainly upon its ability to explain cross-sectional target differences. Conveniently, the method automatically accounts for the similarity of experiences across markets when assessing significance and can hence be applied to a wide variety of target returns and features. Examples show that the random effects method can deliver a quite different and more plausible assessment of macro factor significance than simplistic statistics based on pooled data.
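A minimal sketch of such a test, assuming a stacked panel DataFrame and using the random-intercept estimator in statsmodels as a stand-in for the post's exact specification (column names are illustrative):

import pandas as pd
import statsmodels.formula.api as smf

def test_factor_significance(df: pd.DataFrame):
    """Panel test of a macro factor with period-specific random effects.

    df is a stacked panel with one row per (currency area, period) and
    columns "ret" (target return), "factor" (macro factor), "period".
    The random intercept per period absorbs common (global) influences,
    so the factor is not credited for pseudo-replicated global co-movement."""
    model = smf.mixedlm("ret ~ factor", data=df, groups=df["period"])
    result = model.fit(reml=True)
    return result.params["factor"], result.pvalues["factor"]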


Identifying the drivers of the commodity market

Commodity futures returns are correlated across many different raw materials and products. Research has identified various types of factors behind this commonality: [i] macroeconomic changes, [ii] financial market trends, and [iii] shifts in general uncertainty. A new paper proposes to estimate the strength and time horizon of these influences through mixed-frequency vector autoregression. Mixed-frequency Granger causality tests can assess the interaction of monthly, weekly, and daily data without aggregating to the lowest common frequency and losing information. An empirical analysis of 37 commodity futures from all major sectors suggests that macroeconomic changes are the dominant common driver of monthly commodity returns, while financial market variables exercise commanding influence at a daily frequency.
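As a simplified, single-frequency illustration of the underlying idea, the sketch below runs a standard Granger causality test from statsmodels; unlike the paper's mixed-frequency tests, it requires both series to be aggregated to a common frequency, and all names are illustrative.

import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

def granger_pvalues(returns: pd.Series, driver: pd.Series, maxlag: int = 3) -> dict:
    """Test whether `driver` (e.g., a macro series) Granger-causes `returns`
    (e.g., commodity futures returns) at a single common frequency."""
    data = pd.concat([returns, driver], axis=1).dropna()  # col 0: effect, col 1: cause
    results = grangercausalitytests(data, maxlag=maxlag)
    # p-value of the F-test at each lag: small values reject "no causality"
    return {lag: res[0]["ssr_ftest"][1] for lag, res in results.items()}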


Identifying market regimes via asset class correlations

A recent paper suggests identifying financial market regimes through the correlations of asset class returns. The basic idea is to calculate correlation matrices for sliding time windows and then estimate pairwise similarities between them. This gives a matrix of similarity across time. One can then perform principal component analysis on this similarity matrix and extract the “axes” of greatest relevance. Subsequently, one can cluster the dates in the new reduced space, for example with a k-means method, and choose an optimal number of clusters. These clusters represent market regimes. Empirical analyses of financial markets over the last 20-100 years identify 6-7 market regimes.
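A minimal sketch of this pipeline, with illustrative window lengths, component counts, and regime numbers (the paper's exact similarity measure and tuning may differ):

import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def correlation_regimes(returns: pd.DataFrame, window: int = 60,
                        n_components: int = 3, n_regimes: int = 6) -> pd.Series:
    """Assign each date to a regime based on the similarity of rolling
    asset-class correlation matrices."""
    # 1. Correlation matrices over sliding windows, flattened to vectors
    dates, mats = [], []
    for end in range(window, len(returns) + 1):
        corr = returns.iloc[end - window:end].corr().values
        mats.append(corr[np.triu_indices_from(corr, k=1)])  # upper triangle only
        dates.append(returns.index[end - 1])
    mats = np.array(mats)
    # 2. Pairwise similarity of windows (here: correlation between flattened matrices)
    sim = np.corrcoef(mats)
    # 3. PCA on the similarity matrix, keeping the most relevant "axes"
    reduced = PCA(n_components=n_components).fit_transform(sim)
    # 4. K-means clustering in the reduced space: clusters = market regimes
    labels = KMeans(n_clusters=n_regimes, n_init=10, random_state=0).fit_predict(reduced)
    return pd.Series(labels, index=pd.Index(dates), name="regime")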


Copulas and trading strategies

Reliance on linear correlation coefficients and the joint normal distribution of returns in multi-asset trading strategies can be badly misleading. Such conventions often overestimate diversification benefits and underestimate drawdowns in times of market stress. Copulas can describe the joint distribution of multiple return or price series more realistically. They separate the modelling of the dependence structure from the marginal distributions of the individual returns. Copulas are particularly suitable for assessing joint tail distributions, such as the behaviour of portfolios in extreme market states, which is when risk management matters most. A critical choice is that of appropriate marginal distributions and copula functions, based on the stylized features of contract return data. Multivariate distributions based on these assumptions can be simulated in Python.
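As a minimal example of such a simulation, the sketch below draws joint returns from a Student-t copula, which allows for joint tail dependence, with Student-t marginals. The correlation matrix and degrees of freedom are purely illustrative choices, not recommendations from the post.

import numpy as np
from scipy import stats

def simulate_t_copula(corr: np.ndarray, cop_df: float, marg_df: float,
                      n: int, seed: int = 42) -> np.ndarray:
    """Simulate joint returns with a Student-t copula and Student-t marginals.
    The dependence structure (copula) and the marginals are specified separately."""
    rng = np.random.default_rng(seed)
    dim = corr.shape[0]
    # Multivariate Student-t draws: correlated normals scaled by a chi-square factor
    z = rng.multivariate_normal(np.zeros(dim), corr, size=n)
    chi2 = rng.chisquare(cop_df, size=n)
    t_draws = z / np.sqrt(chi2 / cop_df)[:, None]
    # t copula: map to uniforms with the t CDF, then to the chosen marginals
    u = stats.t.cdf(t_draws, df=cop_df)
    return stats.t.ppf(u, df=marg_df)

# Example: two assets, 0.6 correlation, fat-tailed copula and marginals
sims = simulate_t_copula(np.array([[1.0, 0.6], [0.6, 1.0]]), cop_df=4, marg_df=4, n=10000)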


Predicting volatility with neural networks

Predicting realized volatility is critical for trading signals and position calibration. Econometric models, such as GARCH and HAR, forecast future volatility based on past returns in a fairly intuitive and transparent way. However, recurrent neural networks have become a serious competitor. Neural networks are adaptive machine learning methods that use interconnected layers of neurons. Activations in one layer determine the activations in the next layer. Neural networks learn by adjusting the weights and biases that govern these activations based on training data. Recurrent neural networks are a class of neural networks designed for modeling sequences of data, such as time series. Specialized recurrent networks, notably LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit), have been developed to retain longer memory. The advantage of neural networks is their flexibility to include complex interactions of features, non-linear effects, and various types of non-price information.
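A minimal sketch of a recurrent volatility forecaster, written here in PyTorch with illustrative shapes and features (the post does not prescribe a specific framework or architecture):

import torch
import torch.nn as nn

class VolLSTM(nn.Module):
    """Minimal LSTM mapping a sequence of past features (e.g., daily returns
    and realized variances) to a one-step-ahead volatility forecast."""
    def __init__(self, n_features: int = 2, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)
        self.softplus = nn.Softplus()  # keeps the volatility forecast positive

    def forward(self, x):              # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        return self.softplus(self.head(out[:, -1, :]))  # forecast from the last state

# Illustrative training step on dummy data (shapes only; real inputs would be lagged
# returns/realized measures, targets the next period's realized volatility)
model = VolLSTM()
x = torch.randn(64, 20, 2)             # 64 samples, 20-day lookback, 2 features
y = torch.rand(64, 1)                  # realized volatility targets
loss = nn.MSELoss()(model(x), y)
loss.backward()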


Statistical learning and macro trading: the basics

The rise of data science and statistical programming has made statistical learning a key force in macro trading. Beyond standard price-based trading algorithms, statistical learning also supports the construction of quantamental systems, which make the vast array of fundamental and economic time series “tradable” through cleaning, reformatting, and logical adjustments. Fundamental economic developments are poised to play a growing role in the statistical trading and support systems of market participants. Machine learning methods automate the process and are a basis for reliable backtesting and efficient implementation.


Measuring the value-added of algorithmic trading strategies

Standard performance statistics are insufficient and potentially misleading for evaluating algorithmic trading strategies. Metrics based on prediction errors mistakenly assume that all errors matter equally. Metrics based on classification accuracy disregard the magnitudes of errors. And traditional performance ratios, such as Sharpe, Sortino, and Calmar, are affected by factors outside the algorithm, such as asset class performance, and rely on the assumption of normally distributed returns. Therefore, a new paper proposes a discriminant ratio (‘D-ratio’) that measures an algorithm’s success in improving risk-adjusted returns versus a related buy-and-hold portfolio. Roughly speaking, the metric divides annual return by a value-at-risk measure that does not rely on normality and then divides the result by the same ratio calculated for the buy-and-hold portfolio. The metric can be decomposed into the contributions of return enhancement and risk reduction.
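Read literally, that description suggests something like the following sketch. It is a stylized reading, not the paper's exact formula, and the quantile and annualization choices are assumptions.

import numpy as np

def d_ratio(strategy_returns: np.ndarray, buyhold_returns: np.ndarray,
            var_quantile: float = 0.05, periods_per_year: int = 252) -> float:
    """Rough sketch of a discriminant-ratio-style metric: risk-adjusted return
    of the algorithm relative to the related buy-and-hold portfolio, using an
    empirical (non-normal) value-at-risk as the risk measure."""
    def risk_adjusted(r):
        annual_return = np.mean(r) * periods_per_year
        var = -np.quantile(r, var_quantile)   # empirical VaR, no normality assumed
        return annual_return / var
    return risk_adjusted(strategy_returns) / risk_adjusted(buyhold_returns)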
