Statistics Packages With Quantamental Indicators
A number of powerful and popular libraries, such as scikit-learn, PyTorch and PyMC, are available in Python to make statistical modelling efforts easy and accessible. The following notebooks describe techniques to create machine learning solutions with macro quantamental indicators, using these libraries together with the macrosynergy package.
A key task of macro strategy development is condensing candidate factors into a single positioning signal. Statistical learning offers methods for selecting factors, combining them into a return prediction, and classifying the market state. These methods efficiently incorporate diverse information sets and allow running realistic backtests.
This post applies sequential statistical learning to optimal signal generation for interest rate swap positions. Sequential methods update, estimate, and select models over time, adapting to growing development data sets, and apply signals based on the latest optimal model each month. These methods require intelligent choices on model versions, hyperparameters, cross-validation splitters, and model quality criteria. Sequential statistical learning has generally done a good job in discarding irrelevant information and has produced greater accuracy and higher risk-adjusted returns than simple factor averages.
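The sequential logic described above can be sketched in a few lines. This is a minimal illustration on synthetic data using plain scikit-learn, not the post's actual pipeline: the candidate models, the `min_history` threshold, the fold count, and the mean-squared-error criterion are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(0)
n_periods, n_factors = 120, 3
X = rng.normal(size=(n_periods, n_factors))  # factor values known at each period end
# Synthetic next-period returns: only the first factor carries information.
y = 0.5 * X[:, 0] + rng.normal(size=n_periods)

# Hypothetical candidate models; a real application would use a richer grid.
candidates = {"ols": LinearRegression(), "ridge": Ridge(alpha=10.0)}

min_history = 60  # assumed minimum development sample before signals start
signals = np.full(n_periods, np.nan)
chosen = []

for t in range(min_history, n_periods):
    # Development set: only feature/return pairs realized before period t.
    X_dev, y_dev = X[:t], y[:t]
    splitter = TimeSeriesSplit(n_splits=5)
    scores = {
        name: cross_val_score(
            model, X_dev, y_dev, cv=splitter, scoring="neg_mean_squared_error"
        ).mean()
        for name, model in candidates.items()
    }
    best = max(scores, key=scores.get)  # model with the best validation score so far
    model = candidates[best].fit(X_dev, y_dev)
    signals[t] = model.predict(X[t : t + 1])[0]  # signal from the latest optimal model
    chosen.append(best)
```

Each signal uses only information available at its date, which is what makes the resulting backtest point-in-time.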
Regression is one method for combining macro indicators into a single trading signal. Specifically, statistical learning based on regression can optimize both model parameters and hyperparameters sequentially and produce signals based on whichever model has predicted returns best up to a point in time. This method learns from growing datasets and produces valid point-in-time signals for backtesting. However, whether regression delivers good signals depends on managing the bias-variance trade-off.
This post and its associated Jupyter Notebook provide guidance on pre-selecting the right regression models and hyperparameter grids based on theory and empirical evidence. It considers the advantages and disadvantages of various regression methods, including non-negative least squares, elastic net, weighted least squares, least absolute deviations, and nearest neighbors.
Regression-based statistical learning helps build trading signals from multiple candidate constituents. The method optimizes models and hyperparameters sequentially and produces point-in-time signals for backtesting and live trading.
This post applies regression-based learning to macro trading factors for developed market FX trading, using an improved cross-validation method for expanding panel data. Sequentially optimized models consider nine theoretically valid macro trend indicators to predict FX forward returns. The learning process has delivered significant predictors of returns and consistent positive PnL generation for over 20 years. The most important macro-FX signals, in the long run, have been relative labor market trends, manufacturing business sentiment changes, relative inflation expectations, and terms of trade dynamics.
Macro trading factors are information states of economic developments that help predict asset returns. A single factor is typically represented by multiple indicators, just as a trading signal often combines several factors. Like signal generation, factor construction can be supported by regression-based statistical learning. Dimension reduction is particularly useful for factor discovery. It is the transformation of high-dimensional data into a lower-dimensional representation that retains most of the information content. Dimension reduction methods, such as principal components and partial least squares, reduce bias, increase objectivity, and strengthen the reliability of backtests.
This post applies statistical learning with dimension-reduction techniques to macro factor generation for developed fixed-income markets. The method adapts to the degree of theoretical guidance and the complexity of the data. Several dimension-reduction approaches have successfully produced factors for interest-rate swap trading, delivering positive predictive power, strong accuracy, and robust long-term PnL.
Macro beta is the sensitivity of a financial contract’s return to a broad economic or market factor. Macro betas broaden the traditional concept of equity market betas and can often be estimated using financial contract baskets. Macro sensitivities are endemic in trading strategies, diluting alpha, undermining portfolio diversification, and distorting backtests.
However, it is possible to immunize strategies through “beta learning,” a statistical learning method that supports identifying appropriate models and hyperparameters and allows backtesting of hedged strategies without look-ahead bias. The process can be easily implemented with existing Python classes and methods. This post illustrates the strong beneficial impact of macro beta estimation and its application to an emerging market FX carry strategy.
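The core idea of hedging without look-ahead bias can be shown with a deliberately simplified sketch: a beta is estimated on an expanding window of past data only and then applied to the next period. This uses a fixed OLS beta on synthetic returns and omits the model and hyperparameter selection that the post's beta learning performs; `min_obs` is an assumed burn-in length.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
market = rng.normal(size=n)  # synthetic benchmark returns
# Synthetic strategy returns with a true macro beta of 0.8.
strategy = 0.8 * market + rng.normal(scale=0.5, size=n)

min_obs = 100  # assumed minimum sample before hedging starts
betas = np.full(n, np.nan)
hedged = np.full(n, np.nan)

for t in range(min_obs, n):
    m_hist, s_hist = market[:t], strategy[:t]  # data strictly before period t
    beta = np.cov(s_hist, m_hist)[0, 1] / np.var(m_hist, ddof=1)
    betas[t] = beta
    hedged[t] = strategy[t] - beta * market[t]  # hedge with the point-in-time beta

corr_raw = np.corrcoef(strategy[min_obs:], market[min_obs:])[0, 1]
corr_hedged = np.corrcoef(hedged[min_obs:], market[min_obs:])[0, 1]
print(f"benchmark correlation: raw {corr_raw:.2f}, hedged {corr_hedged:.2f}")
```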
Regression-based statistical learning is convenient for combining candidate trading factors into single signals. Models and signals are updated sequentially using expanding time windows of empirical evidence, offering a realistic basis for backtesting.
However, simple regression-based predictions disregard statistical reliability, which tends to increase as time passes or decrease after structural breaks. This short methodological post proposes signals based on regression coefficients adjusted for statistical precision. The adjustment correctly aligns intertemporal risk-taking with the predictive power of signals. PnLs become less seasonal and outperform as sample size and statistical quality grow.
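One plausible way to adjust coefficients for statistical precision is to divide each by its estimated standard error, so that signals scale up as estimates become more reliable. The sketch below assumes this form of adjustment for illustration only; the post's exact formula is not reproduced here, and the helper `ols_with_se` is hypothetical.

```python
import numpy as np

def ols_with_se(X, y):
    """OLS with intercept; returns coefficients and their standard errors."""
    Xc = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    resid = y - Xc @ beta
    dof = Xc.shape[0] - Xc.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(Xc.T @ Xc)
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(4)
X = rng.normal(size=(400, 2))
y = 0.2 * X[:, 0] + rng.normal(size=400)

x_new = np.array([1.0, 1.0, 1.0])  # intercept term plus two factor values
results = {}
for n_obs in (50, 400):
    beta, se = ols_with_se(X[:n_obs], y[:n_obs])
    results[n_obs] = {
        "se": se,
        "raw_signal": float(x_new @ beta),
        # Assumed adjustment: coefficients divided by their standard errors.
        "adjusted_signal": float(x_new @ (beta / se)),
    }
    print(n_obs, results[n_obs])
```

Because standard errors shrink as the sample grows, the adjusted signal gains magnitude over time even when the raw coefficients are stable.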
There is sound reason and evidence for the predictive power of macro indicators for relative sectoral equity returns. However, the relations between economic information and equity sector performance can be complex. Considering the broad range of point-in-time macro-categories that are now available, statistical learning has become a compelling method for discovering macro predictors and supporting prudent and realistic backtests of related strategies.
This post shows a simple five-step method to use statistical learning to select and combine macro predictors from a broad set of categories for the 11 major equity sectors in 12 developed countries. The learning process produces signals based on changing models and factors per the statistical evidence. These signals have been positive predictors for relative returns of all sectors versus a broad basket. Combined into a single strategy, these signals create material and uncorrelated investor value through sectoral allocation alone.
Random forest regression combines the discovery of complex predictive relations with efficient management of the “bias-variance trade-off” of machine learning. The method is suitable for constructing macro trading signals with statistical learning, particularly when relations between macro factors and market returns are multi-faceted or non-monotonic and do not have clear theoretical priors to go on.
This post shows how random forest regression can be used in a statistical learning pipeline for macro trading signals that chooses and optimizes models sequentially over time. For cross-sector equity allocation using a set of over 50 conceptual macro factors, regression trees have delivered signals with significant predictive power and economic value. Value generation has been higher and less seasonal than for statistical learning with linear regression models.
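To illustrate why random forests suit non-monotonic relations, the sketch below compares a forest with a linear regression on synthetic data in which returns are highest when a factor is near zero. The data-generating process and hyperparameters are assumptions for the example and unrelated to the post's equity data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
X = rng.uniform(-2, 2, size=(600, 3))
# Non-monotonic relation: the target peaks when factor 0 is close to zero.
y = 1.0 - X[:, 0] ** 2 + 0.3 * rng.normal(size=600)

# Chronological split: fit on the first 400 rows, evaluate on the rest.
X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

forest = RandomForestRegressor(
    n_estimators=200, min_samples_leaf=5, random_state=0
).fit(X_train, y_train)
linear = LinearRegression().fit(X_train, y_train)

r2_forest = forest.score(X_test, y_test)
r2_linear = linear.score(X_test, y_test)
print(f"out-of-sample R2: forest {r2_forest:.2f}, linear {r2_linear:.2f}")
```

Averaging many trees limits the variance that a single deep tree would have, while `min_samples_leaf` acts as a simple complexity control.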
Boosting is a machine learning ensemble method that combines the predictions of a chain of basic models, whereby each model seeks to address the shortcomings of the previous one. This post applies adaptive boosting (AdaBoost) to trading signal optimization. Signals are constructed with macro factors to guide positioning in a broad range of global FX forwards.
Boosting is beneficial for learning from a wide and heterogeneous set of markets over time, because it is well-suited for exploiting the diversity of experiences across countries and global economic states. Empirically, we generate machine learning-based signals that use regularized regression and random forest regression, and compare processes with and without adaptive boosting methods. For both regression types, machine learning prefers boosting as datasets get larger and, by doing so, creates more profitable signals.
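The comparison of a regularized regression with and without adaptive boosting can be sketched as follows. This is a minimal example on synthetic data; the ridge penalty, the number of boosting rounds, and the validation criterion are assumptions, and the outcome on this toy data says nothing about the post's empirical results.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(6)
X = rng.normal(size=(400, 4))
y = 0.3 * X[:, 0] - 0.2 * X[:, 1] + rng.normal(size=400)

candidates = {
    "ridge": Ridge(alpha=1.0),
    # The base learner is passed positionally; AdaBoost reweights observations
    # so that later models focus on those predicted poorly by earlier ones.
    "adaboost_ridge": AdaBoostRegressor(Ridge(alpha=1.0), n_estimators=25, random_state=0),
}

splitter = TimeSeriesSplit(n_splits=4)  # training folds always precede test folds
scores = {
    name: cross_val_score(
        model, X, y, cv=splitter, scoring="neg_mean_squared_error"
    ).mean()
    for name, model in candidates.items()
}
preferred = max(scores, key=scores.get)
print(scores, "preferred:", preferred)
```

In a sequential setting this comparison would be rerun at each rebalancing date, so the preferred version can change as the dataset grows.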
Financial markets’ broadening access to point-in-time economic indicators across countries offers a robust foundation for diversified international trading strategies. The central challenge lies in combining multiple macro factors into a single positioning signal for each country, drawing on statistical patterns from both global and country-specific (local) experiences. To address this, we propose a novel “global-local” method of machine learning for generating international macro trading signals. This method manages the bias-variance trade-off by regularizing country-level coefficients toward their global counterparts.
Crucially, the strength of this regularization diminishes when historical evidence supports the value of emphasizing local relationships over global ones. We demonstrate the approach by applying it to international equity index futures strategies. The “global-local” method has generated stronger predictive power and higher risk-adjusted returns than either fully country-specific models or globally pooled alternatives.
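One simple way to regularize country coefficients toward global ones is a ridge-type penalty on the distance between the two, which has a closed-form solution. The sketch below assumes this penalty form for illustration; the post's actual estimator may differ, and the helper `shrink_to_global`, the country labels, and the penalty value are hypothetical.

```python
import numpy as np

def shrink_to_global(X_c, y_c, beta_global, lam):
    """Minimize ||y_c - X_c b||^2 + lam * ||b - beta_global||^2 (no intercept)."""
    k = X_c.shape[1]
    lhs = X_c.T @ X_c + lam * np.eye(k)
    rhs = X_c.T @ y_c + lam * beta_global
    return np.linalg.solve(lhs, rhs)

rng = np.random.default_rng(7)
# Synthetic countries whose true coefficients differ modestly.
true_betas = {
    "US": np.array([0.5, 0.1]),
    "JP": np.array([0.3, 0.1]),
    "DE": np.array([0.4, 0.1]),
}
data = {}
for cid, b in true_betas.items():
    X_c = rng.normal(size=(80, 2))
    data[cid] = (X_c, X_c @ b + rng.normal(size=80))

# Global coefficients from a pooled regression across all countries.
X_all = np.vstack([X_c for X_c, _ in data.values()])
y_all = np.concatenate([y_c for _, y_c in data.values()])
beta_global = np.linalg.lstsq(X_all, y_all, rcond=None)[0]

# lam = 0 gives country-specific OLS; a very large lam gives the global model.
local = {
    cid: shrink_to_global(X_c, y_c, beta_global, lam=50.0)
    for cid, (X_c, y_c) in data.items()
}
for cid, b in local.items():
    print(cid, np.round(b, 3))
```

In practice the penalty would be chosen by cross-validation, so that it falls when local relations have historically predicted better than global ones.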
Data analysis with macro quantamental indicators can be performed in both Python and R using standard data science libraries. The following notebooks contain entry-level analysis examples focusing on standard time series and panel analysis.
The notebook illustrates how to use the popular data visualization library Seaborn with quantamental data. In particular, it shows how to use the package to display historical distributions, panels of timelines, bivariate relations, and various types of heatmaps.
The notebook illustrates various types of quantamental panel analysis in Python. In particular, it shows the application of pooled regression, fixed-effects regression, random-effects regression, linear mixed-effects models, and seemingly unrelated regressions.
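The difference between pooled and fixed-effects regression can be shown with a small sketch that uses only NumPy and pandas. The panel is synthetic, with country effects deliberately correlated with the regressor; the column names and cross-section labels are illustrative and not taken from the notebook.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(8)
countries = ["AUD", "CAD", "EUR", "GBP", "USD"]
effects = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # unobserved country effects
n_periods = 200

frames = []
for cid, a in zip(countries, effects):
    x = a + rng.normal(size=n_periods)  # regressor correlated with the effect
    y = 0.5 * x + 3.0 * a + rng.normal(size=n_periods)  # true slope is 0.5
    frames.append(pd.DataFrame({"cid": cid, "x": x, "y": y}))
panel = pd.concat(frames, ignore_index=True)

def slope(x, y):
    """Univariate OLS slope after removing the overall means."""
    x = x - x.mean()
    y = y - y.mean()
    return float((x @ y) / (x @ x))

# Pooled regression: ignores country effects and is biased here.
pooled_slope = slope(panel["x"].to_numpy(), panel["y"].to_numpy())

# Fixed effects via the within transformation: subtract country means.
within = panel[["x", "y"]] - panel.groupby("cid")[["x", "y"]].transform("mean")
fe_slope = slope(within["x"].to_numpy(), within["y"].to_numpy())

print(f"pooled slope {pooled_slope:.2f}, fixed-effects slope {fe_slope:.2f}")
```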
The notebook illustrates various types of quantamental panel analysis in R. In particular, it shows the application of pooled models, fixed effects models, and linear mixed-effects models.
This notebook gives a step-by-step strategy research example using quantamental data and the Macrosynergy package. It shows how to check data, how to construct panels with plausible trading factors, and how to evaluate the predictive power and economic value of such factors.