Nowcasting macro trends with machine learning

Nowcasting economic trends can draw on a broad range of machine learning methods. This serves not only to optimize predictions but also to replicate past information states of the market, which supports realistic backtesting. A practical framework for modern nowcasting is the three-step approach of (1) variable pre-selection, (2) orthogonalized factor formation, and (3) regression-based prediction. Various methods can be applied at each step, in accordance with the nature of the task. For example, pre-selection can be based on sure independence screening, t-stat-based selection, least-angle regression, or Bayesian model averaging. Predictive models include many non-linear models, such as Markov switching models, quantile regression, random forests, gradient boosting, macroeconomic random forests, and linear gradient boosting. There is some evidence that linear regression-based methods outperform random forests in the field of macroeconomics.
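
To make the three-step approach concrete, the snippet below sketches a minimal Python version: sure independence screening is approximated by marginal correlation ranking, principal components serve as orthogonalized factors, and ordinary least squares does the prediction. The data, parameters, and library choices are illustrative assumptions, not the specification of any particular nowcasting study.

```python
# Minimal sketch of the three-step nowcasting pipeline (illustrative assumptions only).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

def nowcast(X, y, n_keep=30, n_factors=3):
    """X: (T, N) panel of candidate indicators; y: (T,) target, e.g. GDP growth."""
    # Step 1: pre-selection via sure independence screening
    # (rank indicators by absolute marginal correlation with the target).
    corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    keep = np.argsort(corr)[::-1][:n_keep]
    X_sel = X[:, keep]

    # Step 2: orthogonalized factor formation
    # (principal components of the pre-selected indicators are uncorrelated by construction).
    factors = PCA(n_components=n_factors).fit_transform(X_sel)

    # Step 3: regression-based prediction of the target on the factors.
    model = LinearRegression().fit(factors, y)
    return model.predict(factors[-1:])  # nowcast for the latest period

# Example with random placeholder data: 120 months, 200 indicators.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 200))
y = rng.normal(size=120)
print(nowcast(X, y))
```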

Statistical learning and macro trading: the basics

The rise of data science and statistical programming has made statistical learning a key force in macro trading. Beyond standard price-based trading algorithms, statistical learning also supports the construction of quantamental systems, which make the vast array of fundamental and economic time series “tradable” through cleaning, reformatting, and logical adjustments. Fundamental economic developments are poised to play a growing role in the statistical trading and support systems of market participants. Machine learning methods automate this data transformation and are a basis for reliable backtesting and efficient implementation.
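
As a toy illustration of what making an economic time series “tradable” can involve, the sketch below maps a hypothetical monthly release onto a daily, point-in-time series in Python. The data, the 15-day publication lag, and all names are assumptions made for the example only.

```python
# Illustrative sketch: turning a monthly release into a daily, point-in-time series.
import pandas as pd

# Hypothetical monthly CPI inflation prints, indexed by reference month.
cpi = pd.Series([2.1, 2.3, 2.6],
                index=pd.period_range("2023-01", periods=3, freq="M"))

# Shift each observation to its assumed publication date (month end + 15 days),
# then map onto a business-day calendar, carrying the last released value forward
# so every day reflects only information that was actually available at the time.
pub_dates = cpi.index.to_timestamp(how="end").normalize() + pd.Timedelta(days=15)
daily = pd.Series(cpi.values, index=pub_dates)
daily = daily.reindex(pd.bdate_range("2023-02-01", "2023-06-30"), method="ffill")
print(daily.dropna().head())
```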

Classifying market regimes

Market regimes are clusters of persistent market conditions. They affect the relevance of investment factors and the success of trading strategies. The practical challenge is to detect market regime changes quickly and to backtest methods that may do the job. Machine learning offers a range of approaches to that end. Recent proposals include [1] supervised ensemble learning with random forests, which relate the market state to values of regime-relevant time series, [2] unsupervised learning with Gaussian mixture models, which fit various distinct Gaussian distributions to capture states of the data, [3] unsupervised learning with hidden Markov models, which relate observable market data, such as volatility, to latent state vectors, and [4] unsupervised learning with Wasserstein k-means clustering, which classifies market regimes based on the distance of observed points in a metric space.
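
A minimal sketch of approach [2] is given below: a Gaussian mixture model fitted to simple return and volatility features in Python. The features, the simulated sample, and the library choice are illustrative assumptions rather than the setup of the cited proposal.

```python
# Minimal sketch of regime classification with an unsupervised Gaussian mixture model.
import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)
# Placeholder daily returns with a calm segment and a turbulent segment.
returns = pd.Series(np.r_[rng.normal(0.0005, 0.008, 600),
                          rng.normal(-0.001, 0.025, 400)])

# Two simple regime-relevant features: recent average return and recent realised volatility.
features = pd.DataFrame({
    "ret_21d": returns.rolling(21).mean(),
    "vol_21d": returns.rolling(21).std(),
}).dropna()

# Fit a two-state mixture and label each day with its most likely regime.
gmm = GaussianMixture(n_components=2, random_state=0).fit(features)
regimes = gmm.predict(features)
print(pd.Series(regimes, index=features.index).value_counts())
```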

Measuring the value-added of algorithmic trading strategies

Standard performance statistics are insufficient and potentially misleading for evaluating algorithmic trading strategies. Metrics based on prediction errors mistakenly assume that all errors matter equally. Metrics based on classification accuracy disregard the magnitudes of errors. And traditional performance ratios, such as Sharpe, Sortino, and Calmar, are affected by factors outside the algorithm, such as asset class performance, and rely on the assumption of normally distributed returns. Therefore, a new paper proposes a discriminant ratio (‘D-ratio’) that measures an algorithm’s success in improving risk-adjusted returns versus a related buy-and-hold portfolio. Roughly speaking, the metric divides annual return by a value-at-risk measure that does not rely on normality and then compares this ratio with the equivalent ratio for the buy-and-hold portfolio. The metric can be decomposed into the contributions of return enhancement and risk reduction.
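
As a rough numerical illustration of that logic, and not the paper’s exact formula, the snippet below computes an annualised-return-to-historical-VaR ratio for a toy strategy and for its buy-and-hold benchmark, and then takes their quotient.

```python
# Stylised reading of the D-ratio idea: return per unit of non-parametric tail risk,
# relative to the same measure for buy-and-hold. Data and the toy strategy are invented.
import numpy as np

def return_per_var(daily_returns, var_quantile=0.05):
    ann_return = np.mean(daily_returns) * 252
    # Historical VaR: the (negated) 5th percentile of daily returns, no normality assumed.
    var = -np.quantile(daily_returns, var_quantile)
    return ann_return / var

rng = np.random.default_rng(1)
buy_and_hold = rng.standard_t(df=4, size=2520) * 0.01 + 0.0004   # fat-tailed benchmark returns
strategy = np.where(buy_and_hold < -0.02, 0.0, buy_and_hold)     # toy algorithm that sidesteps
                                                                  # bad days (with hindsight, purely
                                                                  # for illustration)

d_ratio = return_per_var(strategy) / return_per_var(buy_and_hold)
print(round(d_ratio, 2))   # > 1 indicates value added over buy-and-hold
```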

Macro trends for trading models

Unlike market price trends, macroeconomic trends are hard to track in real-time. Conventional econometric models are immutable and not backtestable for algorithmic trading. That is because they are built with hindsight and do not aim to replicate perceived economic trends of the past (even if their parameters are sequentially updated). Fortunately, the rise of machine learning breathes new life into econometrics for trading. A practical approach is “two-stage supervised learning”. The first stage is scouting features, by applying an elastic net algorithm to available data sets during the regular release cycle, which identifies competitive features based on timeliness and predictive power. Sequential scouting gives feature vintages. The second stage evaluates various candidate models based on the concurrent feature vintages and selects at any point in time the one with the best historical predictive power. Sequential evaluation gives data vintages. Trends calculated based on these data vintages are valid backtestable contributors to trading signals.
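
The snippet below sketches the first stage, scouting features with an elastic net; re-running it on each successive data release would produce the feature vintages mentioned above. Data, names, and hyperparameters are assumptions for illustration, not the paper’s settings.

```python
# Minimal sketch of elastic-net feature scouting at one point in time.
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.preprocessing import StandardScaler

def scout_features(X, y, feature_names):
    """Return the features the elastic net keeps (non-zero coefficients)."""
    Xs = StandardScaler().fit_transform(X)
    enet = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=5).fit(Xs, y)
    return [name for name, coef in zip(feature_names, enet.coef_) if abs(coef) > 1e-8]

# Placeholder data: 200 observations of 40 candidate indicators, two of them informative.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 40))
y = X[:, 0] - 0.5 * X[:, 7] + rng.normal(scale=0.5, size=200)
names = [f"indicator_{i}" for i in range(40)]
print(scout_features(X, y, names))
```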

A statistical learning workflow for macro trading strategies

Statistical learning for macro trading involves model training, model validation and learning method testing. A simple workflow [1] determines the form and parameters of trading models, [2] chooses the best of these models based on past out-of-sample performance, and [3] assesses the value of the deployed learning method based on further out-of-sample results. A convenient technology is the ‘list-column workflow’ based on the tidyverse packages in R. It stores all related objects in a single data table, including models and nested data sets, and implements statistical learning through functional programming on that table. Key steps are [1] the creation of point-in-time data sets that represent information available at a particular date in the past, [2] the estimation of different model types based on initial training sets prior to each point in time, [3] the evaluation of these different model types based on subsequent validation data just before each point in time, and [4] the testing of the overall learning method based on testing data at each point in time.
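
The workflow described in the article is implemented with tidyverse tools in R. As a rough analogue of the list-column idea, the Python sketch below stores nested point-in-time data sets and fitted models as object columns of a single pandas table and walks through the four steps. All data and model choices are placeholders.

```python
# Rough pandas analogue of the list-column workflow (illustrative only).
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(7)
data = pd.DataFrame({"x": rng.normal(size=300)})
data["y"] = 0.5 * data["x"] + rng.normal(scale=0.5, size=300)

# [1] One row per point in time, each holding nested training, validation, and test sets.
rows = []
for t in range(200, 290, 30):
    rows.append({
        "date": t,
        "train": data.iloc[:t - 50],
        "validate": data.iloc[t - 50:t],
        "test": data.iloc[t:t + 30],
    })
table = pd.DataFrame(rows)

# [2] Fit candidate model types on each nested training set (functional style: column -> column).
table["ols"] = table["train"].apply(lambda d: LinearRegression().fit(d[["x"]], d["y"]))
table["ridge"] = table["train"].apply(lambda d: Ridge(alpha=1.0).fit(d[["x"]], d["y"]))

# [3] Validate each candidate and pick the better one per point in time.
def mse(model, d):
    return float(np.mean((model.predict(d[["x"]]) - d["y"]) ** 2))

table["best"] = table.apply(
    lambda r: "ols" if mse(r["ols"], r["validate"]) < mse(r["ridge"], r["validate"]) else "ridge",
    axis=1)

# [4] Test the overall learning method: out-of-sample error of the chosen model at each date.
table["test_mse"] = table.apply(lambda r: mse(r[r["best"]], r["test"]), axis=1)
print(table[["date", "best", "test_mse"]])
```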

Reward-risk timing

Reward-risk timing refers to methods for allocating between a risky market index and a risk-free asset. It is a combination of reward timing, based on expected future risk asset returns, and volatility timing, based on recent price volatility. A new paper proposes to use machine learning with random forests for estimating both risk premia (return expectations) and optimal lookback windows for volatility estimates. This method allows for non-linear relations and interactions among predictors and averages forecasts across a range of simplistic but valid prediction functions. In an empirical analysis with data going back to 1952, the random forest method for reward-risk timing has outperformed other methods and earned significantly higher risk-adjusted returns than a buy-and-hold strategy.
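
A heavily simplified sketch of the idea follows: a random forest provides the return expectation, a rolling volatility the risk estimate, and the two are combined into a weight for the risky index. The predictors, lookback windows, and weight rule are assumptions and not the paper’s specification.

```python
# Simplified reward-risk timing sketch with a random forest (illustrative assumptions only).
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(11)
returns = pd.Series(rng.normal(0.0003, 0.01, 1000))   # placeholder daily index returns

predictors = pd.DataFrame({
    "mom_63d": returns.rolling(63).mean(),
    "vol_21d": returns.rolling(21).std(),
    "drawdown": returns.cumsum() - returns.cumsum().cummax(),
}).shift(1)   # use only information available before the return being predicted

data = pd.concat([predictors, returns.rename("ret")], axis=1).dropna()
train, test = data.iloc[:800], data.iloc[800:]

# Return expectation ("reward timing") from a random forest...
rf = RandomForestRegressor(n_estimators=300, random_state=0)
rf.fit(train.drop(columns="ret"), train["ret"])
expected_ret = rf.predict(test.drop(columns="ret"))

# ...combined with a volatility estimate ("volatility timing") into a risky-asset weight.
risk_aversion = 5
weight = np.clip(expected_ret / (risk_aversion * test["vol_21d"] ** 2), 0, 1)
strategy_ret = weight * test["ret"]
print(strategy_ret.mean() * 252, strategy_ret.std() * np.sqrt(252))
```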

Detecting market price distortions with neural networks

Detecting price deviations from fundamental value is challenging because the fundamental value itself is uncertain. A shortcut for doing so is to look at return time series alone and to detect “strict local martingales”, i.e. episodes when the risk-neutral return temporarily follows a random walk while medium-term return expectations decline with the forward horizon length. A test based on instantaneous volatility can identify such strict local martingales. The difficulty is to model the functional form of volatility, which may vary over time. A new approach is to use a recurrent neural network for this purpose, specifically a long short-term memory network. Based on simulated data, the neural network approach achieves much higher detection rates for strict local martingales than methods based on conventional volatility estimates.
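
The sketch below illustrates the flavour of the approach with a small LSTM in PyTorch, trained to separate toy simulated paths whose volatility grows with the price level from ordinary constant-volatility paths. The simulation and the network are crude stand-ins for the paper’s data-generating processes and architecture.

```python
# Minimal LSTM classifier on toy simulated paths (illustrative assumptions only).
import torch
import torch.nn as nn

def simulate_paths(n_paths, n_steps=100, strict=False):
    """Toy paths: 'strict' paths get volatility that rises with the price level."""
    x = torch.ones(n_paths, n_steps)
    for t in range(1, n_steps):
        vol = 0.02 * (x[:, t - 1] if strict else torch.ones(n_paths))
        x[:, t] = x[:, t - 1] * (1 + vol * torch.randn(n_paths))
    return x.unsqueeze(-1)   # shape (paths, steps, 1)

class MartingaleLSTM(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        _, (h, _) = self.lstm(x)               # h: (1, batch, hidden) final hidden state
        return self.head(h[-1]).squeeze(-1)    # one logit per path

# Train on labelled simulated paths.
X = torch.cat([simulate_paths(256, strict=False), simulate_paths(256, strict=True)])
y = torch.cat([torch.zeros(256), torch.ones(256)])
model = MartingaleLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for epoch in range(20):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
print(((model(X) > 0).float() == y).float().mean())   # in-sample detection rate
```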
