FX trading signals: Common sense and machine learning

Two valid methods to combine macro trading factors into a single signal are “conceptual parity” and machine learning. Conceptual parity takes a set of conceptually separate normalized factors and gives them equal weights. Machine learning optimizes models and derives weights sequentially, potentially with theoretical restrictions. Both methods support realistic backtests. Conceptual parity works best in the presence of strong theoretical priors. Machine learning works best with large homogeneous data sets.
We apply conceptual parity and two machine learning methods to combine 11 macro-quantamental trading factors for developed and emerging market FX forwards in 16 currencies since 2000. The signals derived by all methods have been highly significant predictors and produced material and uncorrelated risk-adjusted trading returns. Machine learning methods have failed to outperform conceptual parity, probably reflecting that theoretical priors in the FX space are abundant while data are limited and heterogeneous.

Please quote as “Gholkar, Rushil, and Sueppel, Ralph, ‘FX trading signals: Common sense and machine learning,’ Macrosynergy research post, December 2024.”

A Jupyter notebook for audit and replication of the research results can be downloaded here. The notebook operation requires access to J.P. Morgan DataQuery to download data from JPMaQS. Everyone with DataQuery access can download data, except for the latest months. Moreover, J.P. Morgan offers free trials on the complete dataset for institutional clients. An academic support program sponsors data sets for research projects.

Legitimate signal calculation methods

For this post, it is essential to distinguish between trading factors and trading signals. A trading factor is an indicator that plausibly predicts the direction and magnitude of target returns. For example, under the assumption of rational inattention, relative economic growth trends are plausible positive predictors of subsequent FX forward returns. A trading signal combines trading factors into a single metric that guides positioning in a trading strategy.

The formation of signals from valid factors is a hazardous step in the development of macro trading strategies. A common pitfall is the casual “optimization” of signals based on “trialling” a range of factors for the whole sample and then keeping those with good performance or even weighting all “successful” factors in accordance with past predictive power. This approach does not produce meaningful backtests and is unlikely to generalize well to future market environments. For example, the specific relations of the past business cycle or economic crisis may not hold in the next.

For backtesting and good chances of generalization, there are two diametrically opposite but equally valid methods of combining factors into signals:

  • Conceptual parity: Conceptual parity selects a set of principally different and plausible factors based on theoretical priors alone. It normalizes the factors around their presumed neutral levels, and then takes an average. There is no optimization or statistical evaluation. In statistical learning lingo, this approach has high bias but low variance. This means the signal does not usually fit the available target return sample well. However, if the factors are chosen with strong logic and built on high-quality data, the simple average signal is often a robust predictor of future unseen data. The approach depends on sound judgment and understanding of macro and markets. It works well for contracts with obvious macro links, such as FX and rates.
  • Sequential machine learning: Statistical learning methods allow sequential optimization of both models and parameters that govern the combination of factors, strictly based on information available up to a point in time. This approach has principally low bias but high variance. Pre-selected factors that have historically failed as predictors will be discarded. The main drawback of machine learning for macro strategies is overfitting to macro conditions specific to a certain era.

In this post, we compare these two approaches for combining a set of 11 pure macro-quantamental factors into a single signal for a range of 16 developed and emerging market currency forwards that trade against either the U.S. dollar or the euro (see details below). The combination of factors will be equal for all currencies and is based on panel analysis. A panel is a dataset of a feature or return category with two dimensions: time and cross-section, i.e., trading days and currency areas. Both the normalization for conceptual parity and the statistical learning operations are based on such panels. Whatever statistic we estimate, we base it on all countries and all periods up to each point in time. This means that we forgo currency area-specifics for the sake of greater statistical power. The overall process operates in three steps:

  1. Specify monthly data panels for all end-of-period factors and subsequent monthly target returns.
  2. Sequentially calculate signals according to a single rule for all currency areas, based either on conceptual parity or statistical learning.
  3. Evaluate predictive power and accuracy and backtest the PnL value of a full panel of signals.
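The first step can be illustrated with a minimal sketch that pairs each end-of-month factor value with the return of the following month. The function name `build_monthly_panel` and the toy numbers are hypothetical, and the wide layout (months in rows, currency areas in columns) is an assumption made for brevity rather than the JPMaQS data format.

```python
import pandas as pd


def build_monthly_panel(factor: pd.DataFrame, ret: pd.DataFrame) -> pd.DataFrame:
    """Pair each end-of-month factor value with the following month's return.

    Both inputs are wide frames: rows are months, columns are currency areas.
    """
    next_ret = ret.shift(-1)  # row t now holds the return earned in month t+1
    panel = pd.DataFrame(
        {
            "factor": factor.unstack(),  # index becomes (currency, month)
            "next_return": next_ret.unstack(),
        }
    )
    return panel.dropna()  # the last month has no subsequent return yet


# Toy data (hypothetical values, for illustration only)
months = ["2024-01", "2024-02", "2024-03"]
factor = pd.DataFrame({"AUD": [0.5, -0.2, 0.1], "CAD": [1.0, 0.3, -0.4]}, index=months)
ret = pd.DataFrame({"AUD": [0.1, 0.2, 0.3], "CAD": [-0.1, 0.0, 0.4]}, index=months)
panel = build_monthly_panel(factor, ret)
```

Each row of `panel` is one (currency, month) observation, so any statistic estimated on it pools all currency areas and all periods available at that point.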

The specific macro-quantamental factors used here are explained below, but generally, they are all point-in-time information states from the J.P. Morgan Macrosynergy Quantamental System (“JPMaQS”). Hence, they are suitable for testing relations with subsequent returns and backtesting related trading strategies. JPMaQS is also the source of all generic returns used in this analysis.

A brief explanation of the statistical learning methods

In this post, we produce separate signals for statistical learning with linear regression models and with random forest regression. Linear regression is simpler and more intuitive but also more restrictive, allowing only linear relations without interactions. Random forest regression is based on regression trees, which adapt flexibly to the data but are also less transparent and less controlled by common sense. For both methods, the signals are generated sequentially based on expanding time windows. The expanding datasets that are used for this point-in-time signal generation are called “development data sets”. The basic process of signal generation follows the principles explained in the post “Optimizing macro trading signals – A practical introduction”.

  • First, we specify model and hyperparameter grids containing all model versions for predicting target returns based on factors. The considered models are specific to the method and are explained below.
  • Then, we set cross-validation parameters, i.e., the rules for training and testing the different model versions within the development data sets. The main parameters are (1) the criterion according to which models are rated and (2) the cross-validation splitter that governs the partitions of the development data set into training and test sets. Here, the model evaluation criterion is a stylized Sharpe ratio of applying the sign of the signals to positioning, as implemented by the sharpe_ratio function of the Macrosynergy package. The cross-validation splitter must respect the panel structure of the data to create temporally cohesive training and test splits. For this purpose, we use the RollingKFoldPanelSplit class of the Macrosynergy package. It allows each test set to be predicted by models trained on periods both before and after it. The splitter starts out with five partitions for the shortest eligible period (3 years) and then adds one more for each additional three years of data in the development dataset.
  • The actual sequential model selection and optimized signal calculation can be executed by the Macrosynergy package’s SignalOptimizer class. It simulates a pipeline through time where scikit-learn model selection and cross-validation classes are used at each rebalancing date to produce return forecasts for a particular forward window, respecting the panel structure of the underlying data. Its calculate_predictions method governs the sequence of model selections, the optimal models’ parameter estimations, and the calculation of signals.
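The cross-validation settings described above can be sketched in plain NumPy. This is not the Macrosynergy implementation: `sign_sharpe`, `n_partitions` and `kfold_over_time` are hypothetical helper names, the Sharpe ratio is left unannualized, and the partition rule is simply our reading of "five partitions at three years, plus one for each additional three years".

```python
import numpy as np


def sign_sharpe(y_true, y_pred):
    """Stylized Sharpe ratio of a PnL that trades the sign of the prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    pnl = np.sign(y_pred) * y_true
    sd = pnl.std(ddof=1)
    return 0.0 if sd == 0 else float(pnl.mean() / sd)


def n_partitions(n_months, min_months=36, base=5, step_months=36):
    """Five partitions at 3 years, one more for each further 3 years (assumed rule)."""
    if n_months < min_months:
        raise ValueError("development data set is shorter than the minimum period")
    return base + (n_months - min_months) // step_months


def kfold_over_time(dates, k):
    """Yield (train_dates, test_dates): each contiguous block of dates is the
    test set once, and all other dates, earlier and later, form the training set."""
    dates = np.array(sorted(set(dates)))
    for block in np.array_split(dates, k):
        test = set(block.tolist())
        train = [d for d in dates.tolist() if d not in test]
        yield train, sorted(test)


folds = list(kfold_over_time(range(10), 5))
```

Splitting on dates rather than on rows keeps all currency areas of a given period in the same fold, which is what makes the splits temporally cohesive for a panel.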

Method 1: Modified linear regression

Statistical learning based on linear regression assigns weights to the signal constituent candidate scores based on past linear relations and, for some regression types, removes factors whose coefficients have little significance or the wrong sign. The basics are explained in the post “Regression-based macro trading signals”. A regression-based trading signal is a modified point-in-time regression forecast of returns.

It is typically advisable to adjust regression coefficients for their statistical precision, i.e., to account for their standard errors of parameter estimates. Larger samples mean more precise coefficient estimates. The post “How to adjust regression-based trading signals for reliability” explains this adjustment by using the ModifiedLinearRegression class of the Macrosynergy package.
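To make the idea concrete, here is a minimal sketch of one possible adjustment: dividing each OLS coefficient by its analytical standard error. Whether this matches the options of the ModifiedLinearRegression class is an assumption on our part, and the helper `ols_with_se` and the simulated data are purely illustrative.

```python
import numpy as np


def ols_with_se(X, y):
    """OLS coefficients (intercept first) and their analytical standard errors."""
    X = np.column_stack([np.ones(len(X)), X])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])  # residual variance
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, se


# Simulated data (hypothetical): one factor with a true coefficient of 0.5
rng = np.random.default_rng(0)
x = rng.normal(size=(6000, 1))
y = 0.5 * x[:, 0] + rng.normal(size=6000)

beta_s, se_s = ols_with_se(x[:60], y[:60])  # short sample
beta_l, se_l = ols_with_se(x, y)            # long sample

# Assumed adjustment: coefficient divided by its standard error
adj_small = beta_s / se_s
adj_large = beta_l / se_l
```

With the same underlying relation, the long sample yields a much smaller standard error, so the adjusted coefficient, and hence the signal, carries more weight as evidence accumulates.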

We use a minimalist hyperparameter grid for simplicity, allowing for OLS and non-negative least squares models, each with or without intercepts. As shown below, all factor candidates for the signal have presumed positive impact and a neutral level at zero. Hence, a non-negative least-squares regression without intercept is the model most restricted by theoretical priors, as it only allows coefficients with theoretically warranted signs and assumes that neutral values of factors have all been set correctly.
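A grid of this kind can be sketched with scikit-learn's plain `LinearRegression`, whose `positive=True` option gives non-negative least squares. The post itself uses the modified regression class of the Macrosynergy package, so the grid below is only an illustration of the four model versions; the dictionary keys and simulated data are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Four model versions: OLS or NNLS, each with or without intercept
model_grid = {
    "ols_intercept": LinearRegression(fit_intercept=True, positive=False),
    "ols_no_intercept": LinearRegression(fit_intercept=False, positive=False),
    "nnls_intercept": LinearRegression(fit_intercept=True, positive=True),
    "nnls_no_intercept": LinearRegression(fit_intercept=False, positive=True),
}

# Simulated data (hypothetical): the third factor has the "wrong" sign in-sample
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = 0.4 * X[:, 0] + 0.2 * X[:, 1] - 0.3 * X[:, 2] + rng.normal(size=500)

fitted = {name: model.fit(X, y) for name, model in model_grid.items()}
```

In this toy setting, OLS assigns a negative weight to the third factor, while the non-negative versions cannot give it a negative weight, which is how the sign prior enters the model.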

Method 2: Random forest regression

Random forest regression is a popular ensemble machine-learning method that combines the predictions of multiple regression trees into a single signal. A regression tree is a decision tree model that predicts continuous outcomes by recursively splitting the data into subsets. Unlike linear regression, it captures non-linear and non-monotonic relationships. The benefits and applications of random forests for macro signal generation have been explained in the post “How random forests can improve macro trading signals”.

Random forest regression is applied by using scikit-learn’s RandomForestRegressor class. Here, we also consider monotonic constraints to impose theoretical priors, which means that factors can only influence predictions with the “right sign” in accordance with theory or common sense, similar to non-negative least squares. In scikit-learn, one can assign a monotonic constraint to each factor using the monotonic_cst parameter. Loosely speaking, a positive monotonic constraint means that a higher value of a factor cannot have a negative effect on target prediction. As for linear regression, we use a minimalist grid, allowing choices of random forest regressions with or without positive monotonicity constraints.

Applying parity and learning to macro factors for FX forwards

The objective is to build and evaluate macro-quantamental trading signals for a strategy that trades 16 global developed and emerging market currencies. The basic idea of the strategy is similar to a previous post, “FX trading signals with regression-based learning”. However, here, we (1) trade a much broader set of currencies, including emerging markets, and (2) use a larger set of pure macro-quantamental factors, excluding all market factors. The idea is to make the comparison of signal generation methods based on pure macro factors more representative of the asset class. We look at trading two types of FX forward positions:

  • Vol-targeted positions: 1-month FX forward positions are scaled to a 10% volatility target of risk capital based on the historical standard deviation for an exponential moving average with a half-life of 11 days. Positions are rebalanced at the end of each month, respecting a maximum leverage ratio of 5 (of implied notional to cash position). Their PnLs are approximated by vol-targeted FX returns on JPMaQS.
  • Directionally hedged positions: These are 1-month FX forward positions that are hedged against directional risk through a position in a global directional risk basket. The risk basket contains risk parity positions in equity index futures, CDS indices and FX forwards. Hedge ratios are estimated sequentially based on past sensitivity. The PnLs of these positions are approximated by the hedged FX forward returns on JPMaQS.
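The two position types can be sketched with pandas. Returns are assumed to be in decimal form, the annualization with 252 trading days is our assumption, and the risk basket return is taken as given rather than constructed. The function names are hypothetical and the calculations only approximate what the JPMaQS return categories do.

```python
import numpy as np
import pandas as pd


def vol_target_leverage(daily_returns, target=0.10, halflife=11,
                        max_leverage=5.0, days_per_year=252):
    """Leverage that scales a position to the volatility target, capped at 5."""
    ewm_sd = daily_returns.ewm(halflife=halflife).std().iloc[-1]
    annual_vol = ewm_sd * np.sqrt(days_per_year)
    return float(min(max_leverage, target / annual_vol))


def sequential_hedged_returns(fx, basket, min_obs=24):
    """FX return minus beta times basket return, where beta uses past data only."""
    beta = fx.expanding(min_obs).cov(basket) / basket.expanding(min_obs).var()
    return fx - beta.shift(1) * basket  # shift: hedge ratio known before the period


# Toy daily returns (hypothetical): a very calm and a volatile currency
calm = pd.Series([0.0001, -0.0001] * 50)
wild = pd.Series([0.02, -0.02] * 50)
lev_calm = vol_target_leverage(calm)
lev_wild = vol_target_leverage(wild)

# Toy case where the currency is exactly twice the risk basket
basket = pd.Series(np.sin(np.arange(60)))
hedged = sequential_hedged_returns(2.0 * basket, basket)
```

The calm currency hits the leverage cap, the volatile one is scaled down, and the perfectly basket-driven currency has hedged returns of zero once enough history is available to estimate the hedge ratio.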

The strategy takes positions in forwards in eight developed and eight emerging market currencies, each against its natural benchmark currency:

  • DM currencies: the Australian dollar (AUD), the Canadian dollar (CAD), the Japanese yen (JPY), and the New Zealand dollar (NZD), against the U.S. dollar; the Swiss franc (CHF), the Norwegian krone (NOK), and the Swedish krona (SEK), against the euro; and the British pound (GBP) against an equally weighted basket of dollar and euro.
  • EM currencies: the Mexican peso (MXN), Israeli shekel (ILS), the Korean won (KRW), the Thai baht (THB), the Taiwanese dollar (TWD) and the South African rand (ZAR), all against the U.S. dollar; and the Czech koruna (CZK), and the Polish zloty (PLN), against the euro.

As trading factors, we consider 11 composite macro-quantamental factors, each representing a plausible predictor of currency returns and each calculated based on macro-quantamental categories in JPMaQS in accordance with basic theory. Factors are calculated such that their presumed impact on returns on long positions in the local currency is positive, and their neutral level is zero:

  • International liability trends: International liabilities record external financial liabilities of residents to non-residents. Their accumulation should negatively predict subsequent local currency returns due to growing debt service and setback risks. The macro-quantamental factor is the negative of the average annualized change in liability ratios, based on the latest published month versus a 2-year or 5-year moving average (documentation here).
  • External balance ratios: Loosely speaking, external balances measure differences between exports and imports of goods, services, and other flows. Surpluses indicate long-term buying pressure, and deficits indicate long-term selling pressure on the local currency. The macro-quantamental factor is the average of the current account and basic external balances, 1-year moving averages, as a share of GDP (documentation here and here).
  • External balance trends: Rising external surpluses or declining deficits often indicate an improvement in the competitiveness of a currency area. The macro factor averages several conventional metrics of short-term trends: the 3 months over 3 months and 6 months over 6 months changes in the seasonally adjusted merchandise trade balance ratio (to GDP), as well as the change of the 12-month current account balance ratio over the past three months (documentation here).
  • Relative excess CPI inflation pressure: Consumer price inflation above the central bank’s target supports monetary tightening and tolerance for currency strength. The factor looks at the difference between various inflation pressure metrics and the estimated effective inflation target of the local central bank (documentation here) relative to the benchmark currency area. The inflation metrics include headline CPI inflation as % over a year ago (documentation here) and % 6 months over 6 months, seasonally adjusted annualized (documentation here), core CPI inflation as % over a year ago (documentation here) and % 6 months over 6 months, seasonally adjusted annualized (documentation here), as well as 1-year, 2-year and 5-year ahead inflation expectations (documentation here).
  • Relative excess PPI inflation: Relative increases in prices for local output indicate improving competitiveness and support tolerance for currency strength. The factor averages the differences between two conceptually different annual producer price inflation rates in three-month moving averages relative to the effective inflation target and relative to the benchmark currency area. The underlying inflation rates are economy-wide estimated output price growth (documentation here) and industrial producer price growth (documentation here).
  • Relative unemployment declines: Relative tightening of the local labour market bodes for relatively tighter monetary policy ahead, all other things being equal. The factor averages the negative annualized changes of seasonally adjusted unemployment rates over various lookback horizons in the local economy and subtracts the same metrics in the base currency area. The changes are 3 months over the previous three months, 6 months over the previous 6 months, and over a year ago (documentation here).
  • Relative real private credit growth: Countries with stronger credit growth are likely to see a relative tightening in monetary conditions, all other things being equal. Real credit growth here is the difference between annual private bank credit growth in the local currency area (documentation here) and the effective inflation target. The factor is the real credit growth rate in the local currency area versus the benchmark currency area.
  • Relative real GDP growth estimates: Stronger GDP growth in the local currency area bodes for tighter monetary policy and increased foreign investment. The factor takes the average of two quantamental “nowcasts” of annual GDP growth in 3-month moving averages relative to equivalent metrics in the base currency area. The nowcast methods are intuitive GDP growth estimates (view documentation here) and technical GDP growth estimates (view documentation here).
  • Relative industrial production growth: Stronger output growth in tradable goods often indicates better competitiveness and scope for currency appreciation. The factor averages two industrial production growth rates and subtracts those of the base currency area. The metrics are industrial production growth, % over a year ago, 3-month moving average (documentation here) and % 6 months over the previous 6 months, seasonally adjusted (documentation here).
  • Manufacturing confidence improvement: Improving manufacturing sentiment bodes for improving demand and competitiveness of local industry. The factor averages annualized changes of seasonally adjusted manufacturing business confidence scores (documentation here). The changes are 3 months over the previous three months, 6 months over the previous 6 months, or quarterly equivalents.
  • Terms-of-trade improvement: These are changes in export prices relative to changes in import prices of the local economy. Improving terms of trade often precede economic outperformance, capital inflows, and positive economic news, all of which support currency strength. The factor is the average of 3 commodity-based terms-of-trade changes, namely, % over a year ago, % of the latest month versus the previous 1-year average, and % of the latest week versus the previous 4-week average (documentation here).
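Many of the factors above share two building blocks: a change over a lookback horizon and a difference to the benchmark currency area. The sketch below illustrates both for the unemployment factor, using only the "3 months over the previous 3 months" horizon. The annualization by a factor of four and the function names are our assumptions, not the exact JPMaQS definitions.

```python
import pandas as pd


def change_3m_over_3m(x: pd.Series, annualize: float = 4.0) -> pd.Series:
    """Latest 3-month average minus the previous 3-month average, annualized."""
    avg3 = x.rolling(3).mean()
    return (avg3 - avg3.shift(3)) * annualize


def relative_unemployment_decline(local: pd.Series, base: pd.Series) -> pd.Series:
    """Negative change in the local rate minus the same metric for the base area."""
    return -(change_3m_over_3m(local) - change_3m_over_3m(base))


# Toy monthly unemployment rates in % (hypothetical)
local = pd.Series([5.0 - 0.1 * t for t in range(12)])  # falls 0.1 points a month
base = pd.Series([4.0] * 12)                           # flat in the base area
factor = relative_unemployment_decline(local, base)
```

A local rate that falls by 0.1 percentage points per month against a flat benchmark gives a factor value of about 1.2, i.e., a positive score with a natural neutral level of zero.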

The above factors are all conceptually different and theoretically positive FX forward return predictors. Their cross-correlation has been mostly modest over the past 25 years. There has been some positive correlation among information states that are related to relative economic growth and between relative CPI and PPI inflation rates.

A quick summary of signal calculation by various methods

We calculate signals for vol-targeted and hedged FX forward positions based on the 11 macro-quantamental factors by conceptual parity and the two machine learning methods explained above.

For conceptual parity, we first normalize each factor around its zero value based on the panel standard deviations up to the respective point in time. Then, we take an average of all available factor scores and re-normalize them. Conceptual parity signals are the same for vol-targeted and hedged FX forward positions.
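A minimal sketch of this calculation is shown below. It assumes that the "standard deviation around zero" is the root mean square of all panel values up to each date. The function names, the wide data layout and the toy numbers are hypothetical, and the Macrosynergy package may handle details such as minimum observations differently.

```python
import numpy as np
import pandas as pd


def zero_based_scores(x: pd.DataFrame) -> pd.DataFrame:
    """Divide by the panel standard deviation around zero, using only data
    of all cross-sections up to and including each date."""
    cum_sq = x.pow(2).sum(axis=1).cumsum()   # running sum of squares across the panel
    cum_n = x.notna().sum(axis=1).cumsum()   # running count of observations
    sd = np.sqrt(cum_sq / cum_n)
    return x.div(sd, axis=0)


def conceptual_parity(factors: dict) -> pd.DataFrame:
    """Equal-weighted average of available factor scores, re-normalized."""
    scores = pd.concat({name: zero_based_scores(f) for name, f in factors.items()})
    average = scores.groupby(level=1).mean()  # mean across factors, skipping gaps
    return zero_based_scores(average)


# Toy factors (hypothetical values): rows are months, columns are currency areas
dates = ["2003-01", "2003-02", "2003-03"]
a = pd.DataFrame({"AUD": [1.0, 2.0, -1.0], "CAD": [-1.0, -2.0, 1.0]}, index=dates)
b = pd.DataFrame({"AUD": [2.0, 4.0, -2.0], "CAD": [-2.0, -4.0, 2.0]}, index=dates)
signal = conceptual_parity({"a": a, "b": b})
```

Because every scaling factor is a running statistic, changing the latest observation leaves all earlier signal values untouched, which is what makes the signal suitable for backtesting.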

For modified linear regression-based learning, we pass the 11 normalized factor scores as features to two sequential processes: one for vol-targeted positions and one for hedged positions. The targets are vol-targeted FX forward returns and hedged FX forward returns, respectively. The chart below shows that since the process’s inception in 2003, the dominant model has been a non-negative least squares panel regression without intercept. This means that regression-based learning has preferred the most restrictive model option, highlighting the steep bias-variance trade-off of machine learning with macro factors: as the macro-environment comes in seasons (high inflation, crisis, etc.), model variance increases disproportionately if more flexibility is granted to a model.

In descending order, the factors with the greatest impact on regression-based signals for vol-targeted FX positions have been relative unemployment rate declines, manufacturing confidence changes, relative CPI inflation pressure, relative industry growth, international liability declines, and relative real GDP growth. The weights are similar for hedged FX positions, albeit in this case, external balance ratios also play a role.

For the random forest-based learning process, we pass the 11 normalized factor scores as features to the learning process for vol-targeted positions and for hedged positions. As in the case of linear regression-based learning, the historically preferred random forest regression model is the more restrictive one, i.e., the one with effective sign constraints for the impact of factors.

The below timeline facet shows the normalized and winsorized (at three standard deviations) composite signals for vol-targeted FX forward positions according to the three methods. The broad medium-term trends are similar. Learning signals appear to be a little more volatile, reflecting the added variances that arise from model and parameter changes. Also, linear regression-based learning seems to produce fatter tails, i.e., more frequent extreme signal values.

The greater variability of learning signals in general and the regression-based signals, in particular, is also visible in the case of hedged FX forward positions. It reflects a natural drawback of learning versus parity: model changes are a source of signal variation that has no plausible relation with subsequent market conditions and returns.

Comparing predictive power and backtested PnLs for vol-targeted positions

As usual, we assess signal quality by the significance of predictive power, the accuracy of directional predictions and naïve stylized PnLs. For the full sample period 2003-2024, all types of composite signals displayed highly significant predictive power for the 16-currency panel at a monthly or quarterly frequency. Note that the probability of significance has been estimated by using the Macrosynergy panel test and considering both intertemporal and cross-sectional predictive relations.

Importantly, highly significant positive relations have also been recorded for the first and second halves of the sample period. Judging from Pearson correlation coefficients, the predictive relation was strongest for the conceptual parity signal, with the random forest taking second place.

Accuracy and balanced accuracy of signals with respect to the subsequent direction of monthly returns have been well above 50% across all methods. Again, the conceptual parity signal posted the highest rates at above 53%, followed by the random forest signal at around 52.5%, and the linear regression signal at below 51.5%.
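For reference, the two statistics can be computed as in the following sketch. Accuracy is the share of correctly predicted return signs, and balanced accuracy is the average of the hit rates for positive and for negative returns. The function names and toy numbers are illustrative.

```python
import numpy as np


def accuracy(signal, ret):
    """Share of periods in which the signal sign matches the return sign."""
    return float(np.mean(np.sign(signal) == np.sign(ret)))


def balanced_accuracy(signal, ret):
    """Average of the hit rate on positive returns and on negative returns."""
    s = np.asarray(signal) > 0
    r = np.asarray(ret) > 0
    hit_rate_up = np.mean(s[r])       # positive returns predicted as positive
    hit_rate_down = np.mean(~s[~r])   # negative returns predicted as negative
    return float((hit_rate_up + hit_rate_down) / 2)


# Toy example (hypothetical): an always-long signal in a mostly rising market
signal = [1.0, 1.0, 1.0, 1.0]
ret = [0.5, 0.2, 0.1, -0.3]
acc = accuracy(signal, ret)            # 0.75
bal = balanced_accuracy(signal, ret)   # 0.5
```

The always-long signal reaches 75% accuracy only because returns were mostly positive. Its balanced accuracy of 50% shows that it has no directional skill, which is why both ratios are reported.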

Naïve PnLs have been generated for all signals and a “long-only” risk parity book (vol-targeted positions in all 16 “small-country” currencies). These are based on monthly position rebalancing in accordance with the signals for all 16 currencies. The end-of-month score is the basis for the positions of the next month under the assumption of a 1-day slippage for trading. The naïve PnL does not consider transaction costs, risk management, or compounding. For charting, PnL has been scaled to an annualized volatility of 10%.
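A simplified monthly version of this naïve PnL can be sketched as follows. It ignores the 1-day slippage, and the function name, data layout and random toy data are assumptions for illustration only.

```python
import numpy as np
import pandas as pd


def naive_pnl(signals: pd.DataFrame, returns: pd.DataFrame,
              vol_target: float = 10.0, periods_per_year: int = 12) -> pd.Series:
    """Sum of signal-weighted returns across currencies, scaled to a vol target."""
    positions = signals.shift(1)  # end-of-month score sets next month's position
    raw = (positions * returns).sum(axis=1, min_count=1).dropna()
    annual_vol = raw.std() * np.sqrt(periods_per_year)
    return raw * vol_target / annual_vol  # in-sample scaling, for charting only


# Toy panel (hypothetical random data): 60 months, three currencies
rng = np.random.default_rng(3)
cols = ["AUD", "CAD", "MXN"]
sig = pd.DataFrame(rng.normal(size=(60, 3)), columns=cols)
ret = pd.DataFrame(rng.normal(size=(60, 3)), columns=cols)
pnl = naive_pnl(sig, ret)
```

Note that the scaling to 10% volatility uses the full-sample standard deviation. This is harmless for comparing charts and ratios but would not be available to a trader in real time.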

All types of macro signals have delivered material risk-adjusted returns with modest, if any, correlation with market benchmarks. The conceptual parity strategy has produced a 2003-2024 Sharpe ratio of 1.2 and a Sortino ratio of 1.8, with about 13% correlation to the S&P500 and almost no correlation to U.S. treasury and EURUSD returns. Seasonality of the strategy has been modest, with the 5% best-performing months accounting for less than 50% of the overall PnL. The random forest-based signal has produced a long-term Sharpe ratio of 1.0 and a Sortino ratio of 1.4, with a near-zero correlation to all benchmarks and modest seasonality. The linear-regression-based signal delivered Sharpe and Sortino ratios of 0.7 and 1.0, respectively, and greater seasonality, with most of the PnL accruing in the 2007-10 and 2020-23 crisis periods.
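The two performance ratios can be computed from monthly PnL as in the sketch below. Conventions differ across libraries, particularly for the downside deviation in the Sortino ratio, so the definitions here (zero target return, annualization with the square root of 12) are assumptions rather than the exact ones behind the reported numbers.

```python
import numpy as np


def annualized_sharpe(r, periods_per_year=12):
    """Mean return divided by standard deviation, annualized."""
    r = np.asarray(r, dtype=float)
    return float(r.mean() / r.std(ddof=1) * np.sqrt(periods_per_year))


def annualized_sortino(r, periods_per_year=12):
    """Mean return divided by downside deviation (zero target), annualized."""
    r = np.asarray(r, dtype=float)
    downside = np.sqrt(np.mean(np.minimum(r, 0.0) ** 2))
    return float(r.mean() / downside * np.sqrt(periods_per_year))


# Toy monthly PnL (hypothetical values)
monthly = [0.02, -0.01, 0.03, -0.01]
sharpe = annualized_sharpe(monthly)
sortino = annualized_sortino(monthly)
```

The Sortino ratio penalizes only negative months, so it exceeds the Sharpe ratio whenever a strategy's volatility comes mainly from its gains.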

The outperformance of the conceptual parity signal is not unusual. A process that forgoes optimization and relies entirely on common sense tends to generalize well into the future. However, the success of conceptual parity depends critically on the selection and quality of the factors. Unlike in learning processes, there is no empirical filter. In the case of FX, generating factors based on theory is relatively easy as there is plenty of economic theory to go on. In other cases, such as for sectoral equity allocation, the economic theory of macro influences is hard to come by. In such cases, macro factors are more speculative, and machine learning methods are more likely to outperform simple conceptual parity.

Meanwhile, a drawback of using statistical machine learning on FX panels is the heterogeneity of economies and FX markets. Panel models work best when relationships between underlying (economic) factors and target returns are similar. Alas, the set of developed and emerging market countries naturally comes with great differences in monetary policy, economic structures, data quality, and exchange rate regimes. In separate tests on developed market FX panels alone, the statistical learning methods would have performed better relative to conceptual parity signals.

Comparing predictive power and backtested PnLs for hedged positions

Hedged FX positions remove the systematic medium-term dependence of currencies’ FX forward returns on the performance of global risk markets and, thereby, focus more on idiosyncratic currency movements. However, hedging is never perfect, and we typically replace unwanted directional risk with “basis risk”, i.e., sporadic positive or negative correlation of positions with global risk that arise from inaccuracies in the estimation of the hedge ratio or “beta”.

The predictive relation of the three signal types with subsequent hedged FX forward returns is positive and highly significant at a quarterly or monthly frequency, albeit less strong than for the vol-targeted positions. Also, a significant predictive relation can be found for both halves of the sample period, except for the 2003-2013 period of the linear regression signal, which only reaches a 70% probability of significance.

Accuracy and balanced accuracy ratios are also all above 50%. The highest ratios of just below 53.5% are reached by the random forest signal, followed by the conceptual parity signal at 53% and the linear regression signal at nearly 51%.

Value generation has been materially less than for the case of vol-targeted positions, reflecting the drawbacks of basis risk and the dominance of volatile EM currencies in a portfolio that does not apply vol targeting. Low PnLs in the 2000s reflect largely the influence of a massive drawdown in ZAR in 2004-07. The random forest-based strategy recorded the highest risk-adjusted returns, with long-term Sharpe and Sortino ratios of 0.5 and 0.6 and near-zero correlation with all risk benchmarks. The strategy has been very seasonal. For example, there has been no positive PnL for the first five years of the sample period. The results for the conceptual parity signals are very similar.
