Copulas and trading strategies

Reliance on linear correlation coefficients and the assumption of jointly normal returns in multi-asset trading strategies can be badly misleading. Such conventions often overestimate diversification benefits and underestimate drawdowns in times of market stress. Copulas can describe the joint distribution of multiple return or price series more realistically. They separate the modelling of dependence structures from the modelling of the marginal distributions of the individual returns. Copulas are particularly suitable for assessing joint tail distributions, such as the behaviour of portfolios in extreme market states, which is when risk management matters most. A critical choice is that of appropriate marginal distributions and copula functions, based on the stylized features of the contract return data. Multivariate distributions based on these assumptions can be simulated in Python.
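
As a minimal sketch of this separation of dependence and marginals, the Python snippet below simulates two return series from a Gaussian copula with fat-tailed Student-t marginals. All parameters (correlation, degrees of freedom, quantile) are illustrative assumptions; a t-copula would capture tail dependence more faithfully.

```python
import numpy as np
from scipy import stats

rho = 0.6                                  # assumed dependence parameter
corr = np.array([[1.0, rho], [rho, 1.0]])  # copula correlation matrix
n = 10_000

rng = np.random.default_rng(42)
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=corr, size=n)

# Map correlated normals to uniforms (the copula), then to t-marginals
u = stats.norm.cdf(z)
returns = stats.t.ppf(u, df=4)             # fat-tailed marginals, df=4 assumed

# Joint tail behaviour: how often both assets fall below their 5th percentile
q = np.quantile(returns, 0.05, axis=0)
joint_tail = np.mean((returns[:, 0] < q[0]) & (returns[:, 1] < q[1]))
print(f"joint lower-tail probability: {joint_tail:.3f}")  # > 0.05**2 = independence
```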

(more…)

Predicting volatility with neural networks

Predicting realized volatility is critical for trading signals and position calibration. Econometric models, such as GARCH and HAR, forecast future volatility based on past returns in a fairly intuitive and transparent way. However, recurrent neural networks have become a serious competitor. Neural networks are adaptive machine learning methods that use interconnected layers of neurons, where the activations in one layer determine the activations in the next. Neural networks learn by adjusting weights and biases based on training data. Recurrent neural networks are a class of neural networks designed for modelling sequences of data, such as time series. Specialized recurrent architectures have been developed to retain longer memory, particularly LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit). The advantage of neural networks is their flexibility to include complex interactions of features, non-linear effects, and various types of non-price information.
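
For illustration, the sketch below trains a small LSTM on a simulated volatility-like series. The data-generating process, window length, and architecture are all toy assumptions rather than a recommended specification.

```python
import numpy as np
import tensorflow as tf

# Fabricate a persistent volatility-like series: AR(1) in log-volatility
rng = np.random.default_rng(0)
log_rv = np.zeros(1_000)
for t in range(1, 1_000):
    log_rv[t] = 0.95 * log_rv[t - 1] + 0.1 * rng.standard_normal()
rv = np.exp(log_rv)

# Predict the next value from a rolling window of past realized volatility
window = 22                                    # roughly one trading month
X = np.array([rv[t - window:t] for t in range(window, len(rv))])[..., None]
y = rv[window:]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(window, 1)),
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

print("one-step volatility forecast:", model.predict(X[-1:], verbose=0).ravel())
```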

(more…)

Statistical learning and macro trading: the basics

The rise of data science and statistical programming has made statistical learning a key force in macro trading. Beyond standard price-based trading algorithms, statistical learning also supports the construction of quantamental systems, which make the vast array of fundamental and economic time series “tradable” through cleaning, reformatting, and logical adjustments. Fundamental economic developments are thus poised to play a growing role in the statistical trading and support models of market participants. Machine learning methods automate much of this process and are a basis for reliable backtesting and efficient implementation.
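
As a hypothetical example of making an economic series “tradable”, the snippet below shifts a monthly release by an assumed publication lag and forward-fills it onto a business-day calendar, so that only information actually available at each date would enter a backtest. The series, values, and lag are illustrative.

```python
import pandas as pd

# A monthly economic release (values and dates are made up)
cpi = pd.Series(
    [2.1, 2.3, 2.2, 2.5],
    index=pd.to_datetime(["2023-01-31", "2023-02-28", "2023-03-31", "2023-04-30"]),
    name="cpi_yoy",
)

release_lag = pd.Timedelta(days=14)          # assumed publication delay
available = cpi.copy()
available.index = available.index + release_lag

# Forward-fill onto business days: what a trader could have known each day
tradable = available.resample("B").ffill()
print(tradable.tail())
```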

(more…)

How to estimate factor exposure, risk premia, and discount factors

The basic idea behind factor models is that the returns of a large range of assets can be explained by exposure to a small range of factors. Returns reflect factor risk premia and price responses to unexpected changes in the factors. The theoretical basis is arbitrage pricing theory, which suggests that securities are susceptible to multiple systematic risks. The statistical toolkit for estimating factor models has grown in recent years. Factors and exposures can be estimated through various types of regression, principal components analysis, and deep learning, particularly in the form of autoencoders. Factor risk premia can be estimated through two-pass regressions and factor-mimicking portfolios. Stochastic discount factors and their loadings can be estimated with the generalized method of moments, principal components analysis, double machine learning, and deep learning. Discount factor loadings are particularly useful for checking whether a newly proposed factor adds any investment value.
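
A stylized sketch of two-pass estimation of factor risk premia, using simulated returns and factors: the first pass regresses each asset’s returns on the factors to obtain betas, and the second pass regresses average returns on those betas. The true premia and loadings below are assumptions used only to check that the procedure recovers them.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, K = 600, 25, 2
lam_true = np.array([0.4, 0.2])                 # assumed monthly risk premia (%)
beta_true = rng.uniform(0.5, 1.5, size=(N, K))  # assumed factor loadings

f = lam_true + rng.standard_normal((T, K))      # factor return realizations
r = f @ beta_true.T + rng.standard_normal((T, N))

# Pass 1: time-series regressions of each asset on the factors (with intercept)
X = np.column_stack([np.ones(T), f])
beta_hat = np.linalg.lstsq(X, r, rcond=None)[0][1:].T   # N x K betas

# Pass 2: cross-sectional regression of mean returns on the estimated betas
Xb = np.column_stack([np.ones(N), beta_hat])
lam_hat = np.linalg.lstsq(Xb, r.mean(axis=0), rcond=None)[0][1:]
print("estimated risk premia:", lam_hat.round(2))       # close to lam_true
```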

(more…)

Classifying market regimes

Market regimes are clusters of persistent market conditions. They affect the relevance of investment factors and the success of trading strategies. The practical challenge is to detect market regime changes quickly and to backtest methods that may do the job. Machine learning offers a range of approaches to that end. Recent proposals include [1] supervised ensemble learning with random forests, which relate the market state to values of regime-relevant time series, [2] unsupervised learning with Gaussian mixture models, which fit various distinct Gaussian distributions to capture states of the data, [3] unsupervised learning with hidden Markov models, which relate observable market data, such as volatility, to latent state vectors, and [4] unsupervised learning with Wasserstein k-means clustering, which classifies market regimes based on the distance of observed points in a metric space.
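
As a minimal sketch of approach [2], the snippet below fits a two-component Gaussian mixture to simulated daily returns, so that calm and turbulent regimes emerge as separate components. The regime parameters are assumed for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(7)
calm = rng.normal(0.0005, 0.006, size=800)      # assumed calm-regime returns
stress = rng.normal(-0.002, 0.025, size=200)    # assumed stress-regime returns
returns = np.concatenate([calm, stress]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(returns)
regimes = gmm.predict(returns)                  # hard regime labels per day
probs = gmm.predict_proba(returns)              # soft regime probabilities

# The two components should recover low- and high-volatility states
print("component volatilities:", np.sqrt(gmm.covariances_).ravel().round(4))
```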

(more…)

Measuring the value-added of algorithmic trading strategies

Standard performance statistics are insufficient and potentially misleading for evaluating algorithmic trading strategies. Metrics based on prediction errors mistakenly assume that all errors matter equally. Metrics based on classification accuracy disregard the magnitudes of errors. And traditional performance ratios, such as the Sharpe, Sortino, and Calmar ratios, are affected by factors outside the algorithm, such as asset class performance, and assume normally distributed returns. Therefore, a new paper proposes a discriminant ratio (‘D-ratio’) that measures an algorithm’s success in improving risk-adjusted returns versus a related buy-and-hold portfolio. Roughly speaking, the metric divides annualized return by a value-at-risk measure that does not rely on normality, and then divides the result by the analogous ratio for the buy-and-hold portfolio. The metric can be decomposed into the contributions of return enhancement and risk reduction.
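
The snippet below sketches a D-ratio-style statistic based only on the description above: annualized return over an empirical (non-parametric) value-at-risk, for the strategy relative to its buy-and-hold benchmark. The exact formula in the paper may differ, and the overlay rule and data are toy assumptions.

```python
import numpy as np

def return_over_var(returns, alpha=0.05, periods=252):
    """Annualized mean return over empirical value-at-risk (no normality assumed)."""
    ann_ret = np.mean(returns) * periods
    var = -np.quantile(returns, alpha)      # positive loss quantile
    return ann_ret / var

rng = np.random.default_rng(3)
buy_hold = 0.0004 + 0.01 * rng.standard_t(df=4, size=1_000)  # fat-tailed returns
strategy = np.where(buy_hold < -0.02, -0.02, buy_hold)       # toy loss-capping overlay

d_ratio = return_over_var(strategy) / return_over_var(buy_hold)
print(f"illustrative D-ratio: {d_ratio:.2f}")   # > 1 means value added vs buy-and-hold
```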

(more…)

Ten things investors should know about nowcasting

Nowcasting in financial markets is mainly about forecasting forthcoming data reports, particularly GDP releases. However, nowcasting models are more versatile and can be used for a range of market-relevant information, including inflation, sentiment, weather, and harvest conditions. Nowcasting is about information efficiency and is particularly suitable for dealing with big, messy data. The underlying models typically condense large datasets into a few underlying factors. They also tackle mixed frequencies of time series and missing data. The most popular model class for nowcasting is factor models; these come in different categories that produce different results, and one size does not fit all purposes. Factor models also have competitors in the nowcasting space, such as Bayesian vector autoregressions, MIDAS models, and bridge regressions. The reason why investors should understand their nowcasting models is that these models can do a lot more than just nowcasting: most allow tracking latent trends, spotting significant changes in market conditions, and quantifying the relevance of different data releases.
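
As a minimal illustration of the factor idea behind nowcasting, the snippet below condenses a simulated panel of indicators into one latent factor via principal components. Real nowcasting models add mixed-frequency handling and missing-data treatment; the data-generating process here is an assumption.

```python
import numpy as np

rng = np.random.default_rng(5)
T, N = 120, 10
trend = np.cumsum(rng.standard_normal(T)) * 0.1          # latent "activity" factor
panel = trend[:, None] * rng.uniform(0.5, 1.5, N) + rng.standard_normal((T, N))

# Standardize the indicators, then extract the first principal component
z = (panel - panel.mean(0)) / panel.std(0)
u, s, _ = np.linalg.svd(z, full_matrices=False)
factor_estimate = u[:, 0] * s[0]

# The sign of a principal component is arbitrary, hence the absolute correlation
corr = abs(np.corrcoef(factor_estimate, trend)[0, 1])
print(f"|correlation| with latent trend: {corr:.2f}")
```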

(more…)

Macro trends for trading models

Unlike market price trends, macroeconomic trends are hard to track in real time. Conventional econometric models are immutable and not backtestable for algorithmic trading, because they are built with hindsight and do not aim to replicate economic trends as they were perceived in the past (even if their parameters are sequentially updated). Fortunately, the rise of machine learning breathes new life into econometrics for trading. A practical approach is “two-stage supervised learning”. The first stage is scouting features: an elastic net algorithm is applied to the available data sets during the regular release cycle and identifies competitive features based on timeliness and predictive power. Sequential scouting gives feature vintages. The second stage evaluates various candidate models based on the concurrent feature vintages and selects, at any point in time, the one with the best historical predictive power. Sequential evaluation gives data vintages. Trends calculated from these data vintages are valid, backtestable contributors to trading signals.
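
A rough sketch of the first stage with simulated data: an elastic net is fit to a set of candidate features, and those with non-zero coefficients are kept as “competitive”. The data-generating process and the selection rule are illustrative assumptions, not the exact procedure described above.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(11)
T, K = 200, 30
X = rng.standard_normal((T, K))            # candidate economic features
beta = np.zeros(K)
beta[:4] = [0.8, -0.5, 0.4, 0.3]           # only a few features truly matter
y = X @ beta + rng.standard_normal(T)      # target macro trend

# Cross-validated elastic net: L1 shrinkage drives weak features to zero
enet = ElasticNetCV(l1_ratio=[0.3, 0.5, 0.9], cv=5).fit(X, y)
selected = np.flatnonzero(enet.coef_)      # the scouted "competitive" features
print("selected feature indices:", selected)
```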

(more…)

Machine learning for portfolio diversification

Dimension reduction methods from machine learning are well suited to detecting latent factors behind a broad set of asset prices. These factors can then be used to improve estimates of the covariance structure of price changes and, by extension, the construction of a well-diversified minimum variance portfolio. Methods for dimension reduction include sparse principal components analysis, sparse partial least squares, and autoencoders. Both static and dynamic factor models can be built. Hyperparameter tuning can proceed in rolling training and validation samples. Empirical analysis suggests that machine learning adds value to factor-based asset allocation in the equity market. Investors with moderate or conservative risk preferences would realize significant utility gains.
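
The sketch below illustrates the overall pipeline under simplifying assumptions: a factor-based covariance matrix estimated with plain principal components feeds a minimum variance portfolio. Sparse methods or autoencoders could replace the PCA step; the simulated factor structure is an assumption.

```python
import numpy as np

rng = np.random.default_rng(9)
T, N, K = 500, 20, 3
B = rng.standard_normal((N, K))                 # assumed factor loadings
f = rng.standard_normal((T, K))
r = f @ B.T + 0.5 * rng.standard_normal((T, N))

# Factor covariance: top-K principal components plus diagonal residual variance
z = r - r.mean(0)
u, s, vt = np.linalg.svd(z, full_matrices=False)
loadings = vt[:K].T * (s[:K] / np.sqrt(T))
resid_var = z.var(0) - (loadings ** 2).sum(1)
cov = loadings @ loadings.T + np.diag(np.clip(resid_var, 1e-6, None))

# Minimum variance weights: w proportional to inverse-covariance times ones
ones = np.ones(N)
w = np.linalg.solve(cov, ones)
w /= w.sum()
print("largest portfolio weight:", w.max().round(3))
```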

(more…)

Statistical arbitrage risk premium

For any asset, a portfolio of similar assets can be used to hedge its factor exposure. The residual factor risk of the hedged position is called statistical arbitrage risk, and the statistical arbitrage risk premium is the expected return on such a hedged position. A recent paper shows, both theoretically and empirically, that this premium rises with a stock’s statistical arbitrage risk: ‘unique’ stocks earn higher excess returns than ‘ubiquitous’ stocks. The estimated premium is therefore a valid basis for investment strategies. Statistical arbitrage risk can be estimated by using ‘elastic net’ estimation and related machine learning methods, which select a relatively small hedge portfolio from a large array of candidate stocks.
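
A stylized sketch of the estimation idea with simulated data: an elastic net selects a sparse hedge portfolio from many candidate stocks, and the residual volatility of the hedged position proxies statistical arbitrage risk. The number of candidates and the true hedge weights are assumptions.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(21)
T, N = 500, 100
candidates = rng.standard_normal((T, N))        # candidate hedge stocks
weights = np.zeros(N)
weights[:5] = 0.2                               # the truly "similar" stocks
target = candidates @ weights + 0.3 * rng.standard_normal(T)

# Elastic net picks a small hedge portfolio out of the large candidate set
enet = ElasticNetCV(l1_ratio=0.5, cv=5).fit(candidates, target)
residual = target - enet.predict(candidates)
sar = np.std(residual)                          # statistical arbitrage risk proxy

print(f"hedge portfolio size: {np.count_nonzero(enet.coef_)}, SAR: {sar:.3f}")
```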

(more…)