Regression-based statistical learning is convenient for combining candidate trading factors into single signals (view post here). Models and signals are updated sequentially using expanding time windows of empirical evidence, offering a realistic basis for backtesting. However, simple regression-based predictions disregard statistical reliability, which tends to increase as time passes or decrease after structural breaks. This short methodological post proposes signals based on regression coefficients adjusted for statistical precision. The adjustment correctly aligns intertemporal risk-taking with the predictive power of signals. PnLs become less seasonal and outperform as sample size and statistical quality grow.
The post below is based on Macrosynergy’s proprietary research.
Please quote as “Gholkar, Rushil, and Sueppel, Ralph, ‘How to adjust regression-based trading signals for reliability,’ Macrosynergy research post, July 2024.”
The attached Jupyter notebook allows for the audit and replication of the research results. The notebook’s operation requires access to J.P. Morgan DataQuery to download data from JPMaQS, a premium service of quantamental indicators. J.P. Morgan offers free trials for institutional clients.
Also, an academic research support program sponsors data sets for relevant projects.
A reminder of regression-based learning for macro trading signals
In macro trading strategies, we often combine a range of conceptually distinct candidate factors to create a single signal. The challenge is to assign appropriate weights and weed out those factors that have empirically failed to predict target returns. We can address these challenges with statistical learning based on multivariate regression. This method selects predictive regression models and hyperparameters sequentially over expanding time windows, distilling a signal for each point in time that corresponds to the optimal model’s prediction. The signal is a regression-based return prediction, i.e., a cross-product of the estimated coefficients of the optimal model and the factor values. Signals generated in this way make good use of the changing information set and are a valid basis for realistic backtests.
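As a stylized illustration with made-up numbers, the prediction for one currency and date is simply the dot product of the optimal model’s coefficients and the current factor values, plus an intercept if one was selected:

```python
import numpy as np

# Hypothetical coefficients of the sequentially optimal model and one
# cross-section's normalized factor values on a given date.
coefs = np.array([0.10, 0.25, -0.05])   # estimated factor coefficients
factors = np.array([1.2, -0.4, 0.8])    # normalized factor scores
intercept = 0.0                         # optional constant

# Regression-based signal: predicted return for the next period.
signal = intercept + coefs @ factors    # 0.12 - 0.10 - 0.04 = -0.02
```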
The basics of the method have been introduced in a previous post (Regression-based macro trading signals) and applied to the case of developed markets directional FX forward trading (FX trading signals with regression-based learning). To put regression-based learning into practice requires two basic ingredients:
- Point-in-time information states of macro factors: These “macro-quantamental indicators” can be downloaded from the J.P. Morgan Macrosynergy Quantamental System (“JPMaQS”). They are real-time information states of the market and the public with respect to various economic concepts and, hence, are suitable for testing relations with subsequent returns. JPMaQS also provides a broad range of daily generic returns across asset classes for these tests.
- Methods to apply regression-based learning to panels: The statistical evaluation of macro factors in a learning process is often based on panels, i.e., data sets with similar time series across different markets. For this purpose, the Macrosynergy package offers convenience classes and functions for Python in its learning module. These were explained in the previous post (Optimizing macro trading signals—A practical introduction), and their application to the present example can be seen in the attached Jupyter notebook.
An extended FX trading example
To illustrate the reliability adjustment of signals, we have extended the analysis of the previous post on FX trading signals with regression-based learning to a broader set of currencies, adding to the seven developed market currencies (excluding the G3) a set of seven emerging markets with convertible currencies and liquidity in the FX forward markets. Regression-based learning identifies models and derives coefficients for predicting returns on 1-month FX forward positions in the following currencies:
- the Australian dollar (AUD), the Canadian dollar (CAD), the New Zealand dollar (NZD), the Mexican peso (MXN), the Israeli shekel (ILS), the South African rand (ZAR), the Korean won (KRW), and the Taiwanese dollar (TWD), all versus the U.S. dollar,
- the Swiss franc (CHF), the Norwegian krone (NOK), the Swedish krona (SEK), the Polish zloty (PLN), and the Hungarian forint (HUF), all versus the euro and
- the British pound (GBP) versus an equally weighted basket of the U.S. dollar and euro.
The objective is to predict and subsequently trade these 14 global currencies based on the information states of nine conceptual factors: relative excess GDP growth trends, manufacturing confidence changes, relative unemployment trends, relative excess core CPI inflation, relative excess GDP deflators, relative inflation expectations, relative real interest rates, international liability trends, and terms-of-trade dynamics. For a detailed description of these factors, see the post “Optimizing macro trading signals – A practical introduction.” In the regression-based learning analyses below, the factors are all sequentially normalized to make them comparable in magnitude.
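A minimal sketch of the idea behind sequential normalization, assuming an expanding-window z-score applied to a single hypothetical series (the actual procedure in the notebook may differ in detail):

```python
import numpy as np
import pandas as pd

# Hypothetical daily factor series for one currency; in practice the
# normalization is applied across the whole panel.
rng = np.random.default_rng(0)
factor = pd.Series(rng.normal(size=500),
                   index=pd.bdate_range("2000-01-03", periods=500))

# Expanding-window z-score: each date uses only information available up to
# that date, so the normalized factor remains a valid point-in-time quantity.
expanding_mean = factor.expanding(min_periods=60).mean()
expanding_std = factor.expanding(min_periods=60).std()
factor_zn = ((factor - expanding_mean) / expanding_std).clip(-3, 3)  # winsorized
```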
We apply statistical learning using a 14-country panel that starts in 2000 and ends in June 2024. The panel is unbalanced, meaning that not all countries have been tradable or have produced sufficient data for all dates. The learning process only considers OLS and non-negative least squares models, with or without intercept, as choices. It yields optimal models, coefficients, and signals sequentially from late 2003.
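In scikit-learn terms, this model space can be spanned by the LinearRegression estimator, whose positive=True option implements non-negative least squares. Below is a sketch of such a grid, with illustrative labels:

```python
from sklearn.linear_model import LinearRegression

# The four model choices described above: (non-negative) least squares,
# each with and without an intercept. The learning process picks among
# these sequentially based on past predictive performance.
model_grid = {
    "OLS": LinearRegression(fit_intercept=True),
    "OLS_no_intercept": LinearRegression(fit_intercept=False),
    "NNLS": LinearRegression(fit_intercept=True, positive=True),
    "NNLS_no_intercept": LinearRegression(fit_intercept=False, positive=True),
}
```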
The summary graph below shows the size of the coefficients of sequentially optimal models as annual averages over time. Since these are coefficients of normalized factors, they approximate the relative importance of the various factors. While there was some instability in the early years, the coefficients converged over time. The most important factors have been relative unemployment trends, manufacturing sentiment changes, and, to a lesser degree, terms-of-trade changes, relative core inflation, relative inflation expectations, and relative real interest rates.
The need for reliability adjustment
A problematic feature of the regression coefficient values in the above learning process is their decline over time. Since regression-based predictions are sums of products of coefficients and factor values, this decline implies a tendency towards smaller signals. Such an intertemporal decline conflicts with the statistical precision of estimates: empirical evidence is scant and tentative in the early years but becomes broader and more reliable as samples grow. Moreover, the chosen model versions and hyperparameters also stabilize over time, reducing a source of signal variation unrelated to target returns. Simply put, in the absence of structural breaks, there should be a tendency for signal reliability to increase over time.
The neglect of growing experience leads to an unsatisfactory property of the trading signal: we take more risk when empirical evidence is scant and less when it is ample. This goes against common sense and empirical evidence of signal quality. The scatter graph below shows the predictive power of learning-based signals for subsequent quarter FX returns for the sample’s first and second 10 years. The predictive correlation in the second decade was considerably larger and – unlike in the first decade – highly significant.
Simulated PnLs that use standard regression-based predictions will be unrealistic insofar as they ignore the statistical quality of the estimated regression coefficients and the confidence in the strategy’s principles. Simply put, they disregard an important piece of information, potentially at a high cost.
A method for reliability adjustment
A remedy for the neglect of statistical significance in regressions is the consideration of standard errors of parameter estimates. These estimate the variation in coefficient estimates across different samples of the same size. The greater the standard error of a coefficient estimate, the smaller its precision and the lesser the reliability of a related prediction. All other things equal, standard errors of small samples are greater than those of large samples. Conventionally, standard errors are used to perform hypothesis tests on the parameter estimates. For example, the standard t-test of an OLS regression divides a coefficient by its standard error and assigns a probability of the coefficient being different from zero, depending on the value of this ratio and its hypothesized distribution.
Here, we suggest a similar approach to align signal strength with reliability. After having identified an optimal regression model for each point in time, we estimate the coefficients of the regression, as usual. However, before calculating the trading signal, we divide the coefficients of the macro factors by their standard errors. There are two principal ways to estimate standard errors, both illustrated in the sketch after this list:
- Analytical methods rely on formulas from statistical theory. For linear regression, these depend on the assumptions of the model, particularly with respect to the error distribution.
- Bootstrapping repeatedly collects samples with replacements from the observed data. The regression model is then fit to each bootstrap sample to obtain a distribution of coefficient estimates.
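The details differ by package and model. The following is a self-contained sketch, assuming a simple no-intercept OLS regression on simulated data, of how both types of standard errors can be obtained; it is not the Macrosynergy implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 120, 2
X = rng.normal(size=(n, k))                        # factor values
y = X @ np.array([0.3, 0.1]) + rng.normal(size=n)  # target returns

# Analytical standard errors under classical OLS assumptions:
# Var(beta_hat) = sigma^2 * (X'X)^-1
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
sigma2 = resid @ resid / (n - k)
se_analytical = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

# Bootstrap standard errors: refit on samples drawn with replacement and
# take the standard deviation of the resulting coefficient estimates.
boot_betas = []
for _ in range(1000):
    idx = rng.integers(0, n, size=n)
    boot_betas.append(np.linalg.lstsq(X[idx], y[idx], rcond=None)[0])
se_bootstrap = np.array(boot_betas).std(axis=0)
```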
Having estimated standard errors, we divide the coefficients by these errors to obtain metrics akin to t-values. These ratios are then multiplied by the respective macro factors and summed up to give the signal. The signal can be interpreted as a reliability-weighted sum of factors, where reliability reflects the significance of the estimated relation.
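Given coefficient estimates, their standard errors, and factor values on a rebalancing date, the adjusted signal is simply the sum of t-value-like ratios times factors. A minimal sketch with hypothetical numbers:

```python
import numpy as np

# Hypothetical estimates for three factors on one rebalancing date.
coefs = np.array([0.30, 0.12, -0.08])     # estimated regression coefficients
std_errs = np.array([0.10, 0.08, 0.09])   # their estimated standard errors
factors = np.array([1.1, -0.5, 0.7])      # normalized factor values

# Reliability-adjusted signal: t-value-like ratios times factor values.
ratios = coefs / std_errs
signal = ratios @ factors
```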
Practically, this is implemented as follows. The Python class managing the sequential statistical learning for the purpose of signal generation is the SignalOptimizer class of the Macrosynergy package’s Learning module. Its basic operation and the structure of the learning process have been explained in the post “Regression-based macro trading signals.”
For standard regression-based signals, we would pass the LinearRegression class of scikit-learn as an argument to SignalOptimizer. The output of its `predict` method would then be the basis of the learning process, while its `fit` method would deliver the information for signal generation. For reliability-adjusted regression-based learning, we instead use a customized ModifiedLinearRegression class of the Macrosynergy package that replicates the functionality of LinearRegression but modifies its fit method so that coefficients are adjusted for estimated standard errors, either analytically or based on bootstrapping.
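The exact implementation of ModifiedLinearRegression is not reproduced here. As a rough, hypothetical illustration of the idea, a scikit-learn-compatible estimator could adjust its coefficients after fitting, for example based on bootstrapped standard errors:

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.linear_model import LinearRegression


class ReliabilityAdjustedOLS(BaseEstimator, RegressorMixin):
    """Hypothetical stand-in for a reliability-adjusted linear regression:
    fits OLS, then divides coefficients by bootstrapped standard errors,
    so that predict() returns a reliability-weighted combination of factors.
    This is a simplified sketch, not the Macrosynergy class."""

    def __init__(self, n_bootstrap=500, fit_intercept=True, random_state=0):
        self.n_bootstrap = n_bootstrap
        self.fit_intercept = fit_intercept
        self.random_state = random_state

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
        base = LinearRegression(fit_intercept=self.fit_intercept).fit(X, y)
        rng = np.random.default_rng(self.random_state)
        n = X.shape[0]
        boot_coefs = []
        for _ in range(self.n_bootstrap):
            idx = rng.integers(0, n, size=n)            # resample with replacement
            boot = LinearRegression(fit_intercept=self.fit_intercept)
            boot_coefs.append(boot.fit(X[idx], y[idx]).coef_)
        se = np.asarray(boot_coefs).std(axis=0)
        se = np.where(se > 0, se, np.finfo(float).eps)  # guard against zero errors
        self.coef_ = base.coef_ / se                    # t-value-like factor weights
        self.intercept_ = base.intercept_
        return self

    def predict(self, X):
        return np.asarray(X, dtype=float) @ self.coef_ + self.intercept_
```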
This adjustment changes the magnitudes and intertemporal evolution of the coefficients applied to factors. For the global FX strategy example, this evolution is plotted as annual coefficient averages in the graph below. Business surveys and labor market trends remain the strongest factors, as the influence of our adjustment on relative factor weights is marginal. However, unlike standard regression coefficients, modified coefficient values tend to increase over time rather than decrease.
This means that the coefficients applied to macro factors over time reflect the evolution of their statistical significance in light of growing evidence. Unless the regression model makes provisions for structural breaks, such as maximum lookback windows or breakpoint tests, this upward trend could continue for a long time.
Consequences for value generation
Modifying regression coefficients for signal generation does not alter the core of the strategy’s value generation, which is the choice of factor candidates and the learning method. Indeed, neither the relative factor weights nor the chosen models have changed. Nevertheless, the alignment of the intertemporal variation of signal strength with statistical precision makes an important difference.
We can assess the pattern of value generation through naïve PnLs. These are based on monthly position rebalancing in accordance with the optimized signals for all currencies. Here, the optimized signals are sums of products of regression coefficients and macro factors, potentially enhanced by a constant. The end-of-month signal is the basis for the positions of the next month under the assumption of a 1-day slippage for trading. The naïve PnL does not consider transaction costs, risk management, or compounding. For the chart below, we calculated naïve PnLs based on unmodified regression coefficients and reliability-enhanced modified coefficients using analytical standard errors and those derived by bootstrapped samples. PnLs have all been scaled to an annualized volatility of 10% to represent them jointly in one graph.
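The notebook contains the full calculation; as a rough sketch of such a naïve PnL for a single hypothetical signal and return series (data and parameters are illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical daily signal and return series for a single currency; the
# actual strategy aggregates positions across all 14 currencies.
rng = np.random.default_rng(2)
dates = pd.bdate_range("2004-01-01", "2024-06-28")
signal = pd.Series(rng.normal(size=len(dates)), index=dates).rolling(60).mean()
fx_return = pd.Series(rng.normal(scale=0.5, size=len(dates)), index=dates)

# End-of-month signal determines the position of the following month,
# applied with a one-business-day lag to allow time for trading.
positions = signal.resample("M").last().reindex(dates, method="ffill").shift(1)

# Naive PnL: position times return, with no transaction costs, risk
# management, or compounding; rescaled ex post to 10% annualized volatility.
pnl = (positions * fx_return).dropna()
pnl = pnl * (10.0 / (pnl.std() * np.sqrt(252)))
cumulative_pnl = pnl.cumsum()
```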
The reliability adjustment had several effects. First and most importantly, coefficient modification has made PnL generation more balanced over the sample period. Without modification, the strategy would have produced roughly 70% of its PnL in the first decade. With modification, that share would have been around 45%. The share of the top 5% of monthly PnLs in overall long-term PnL generation declined from 91% for standard regression predictions to 70-74% for reliability-adjusted signals.
The overall Sharpe ratio based on modified coefficients would have been 0.52-0.54 versus 0.42 without reliability adjustment. This reflects the greater emphasis on “high-quality” signals in the second half of the sample. Also, the benchmark correlation of PnLs was slightly reduced through the adjustment. The correlation of the PnL with EURUSD forward returns fell to 13-16% from 23%. Correlation with the S&P 500 fell to 9-12% from 14%.
In practice, the benefits of reliability adjustment may go beyond this plausible “evening out” of PnL generation. This is because growing signal values correctly represent growing predictive power over time and allow scaling up risk-taking as confidence in predictions increases. Simply put, more reliable signals do not only justify larger positions per currency and period, but also larger risk limits or assets under management for the overall strategy.
We have simulated this “upscaling” effect in the version of the naïve PnL below. This PnL does not equalize volatility across signals but instead normalizes signal values by their standard deviations up to each point in time. For standard regression-based signals, this typically means that risk-taking remains stable over time or even decreases. For reliability-adjusted signals, this means that risk-taking automatically increases as long as the significance of regression coefficients improves with expanding time windows. This mimics simple intuition: growing confidence in a statistical relation affords greater risk-taking or allocation. This “reliability dividend” means accepting greater volatility over time, but for a good reason, and thus leads to larger absolute returns. In the present example, the PnL based on modified coefficients nearly doubled the value generated by the standard regression-based signals.
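A minimal sketch of this point-in-time signal normalization for a hypothetical series (names and parameters are illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical daily signal series: instead of rescaling the final PnL to a
# target volatility, the signal is divided by its own expanding standard
# deviation, so positions grow if signal values trend upward over time.
rng = np.random.default_rng(3)
dates = pd.bdate_range("2004-01-01", "2024-06-28")
raw_signal = pd.Series(np.linspace(0.5, 2.0, len(dates)) * rng.normal(size=len(dates)),
                       index=dates)

# Expanding standard deviation uses only information up to each date.
expanding_std = raw_signal.expanding(min_periods=252).std()
scaled_signal = raw_signal / expanding_std
```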