Introduction to the Macrosynergy package: The “Learning” module #
The macrosynergy.learning subpackage provides functions and classes to help create statistical machine learning signals from panels of JPMaQS data. It is built to integrate the macrosynergy package with the widely used scikit-learn library.

Most standard scikit-learn classes do not work directly with panel data. macrosynergy.learning provides wrappers that respect the cross-section and time indexing of quantamental dataframes and enable the use of scikit-learn models, feature selection, cross-validation, and metrics in a panel-friendly way.

See also the introductory notebooks where macrosynergy.learning is applied.
Features (x) #
For this notebook, we build a monthly dataset with features lagged by one month. We take the last recorded value of each month for daily z-scores:
- XGDP_NEG: negative of growth trend.
- XCPI_NEG: negative of excess inflation measure.
- XPCG_NEG: negative of excess private credit growth.
- RYLDIRS05Y_NSA: real IRS yield, 5-year maturity (expectations-based).
Target (y) #
The target is a monthly aggregated return, created by summing daily returns for each month.
Here, we focus on the return of a fixed receiver position in 5Y IRS (DU05YXR_VT10), scaled to a 10% annualized volatility target.
The first step is converting a quantamental dataframe into a wide format:
- Columns = indicators or factors.
- Rows = identified by cid (cross-section) and real_date.
- Implemented via categories_df from macrosynergy.management.

This function also supports:

- Downsampling
- Feature lagging
- Dropping rows with nulls

Both the SignalOptimizer and BetaEstimator classes use this conversion internally.
Here, we prepare the macroeconomic dataset. We set up the currency universe, select the required JPMaQS categories (including the target series and inputs for an FX blacklist), and construct the FXBLACK series to filter out untradeable currencies. We then build several derived macro factors (growth, inflation, credit, and real rate measures), merge them back into the panel, and standardize them across countries with cross-sectional z-scores. These normalized macro signals form the inputs for the later learning and signaling analysis.

Further details on how the raw JPMaQS data is accessed and structured are provided in this notebook.
import os
import warnings
warnings.filterwarnings('ignore')
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import macrosynergy.management as msm
import macrosynergy.panel as msp
import macrosynergy.signal as mss
import macrosynergy.pnl as msn
import macrosynergy.visuals as msv
import macrosynergy.learning as msl
from macrosynergy.download import JPMaQSDownload
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import make_scorer, r2_score
# Cross-sections (cids) used throughout
cids_dm = ["AUD", "CAD", "CHF", "EUR", "GBP", "JPY", "NOK", "NZD", "SEK", "USD"]
cids_em = ["CLP", "COP", "CZK", "HUF", "IDR", "ILS", "INR", "KRW", "MXN", "PLN", "THB", "TRY", "TWD", "ZAR"]
cids = cids_dm + cids_em
cids_dux = list(set(cids) - set(["IDR", "NZD"]))
# Minimal set of JPMaQS categories required to recreate dfx, macro factors, and fxblack
raw_xcats_for_calcs = [
"INTRGDPv5Y_NSA_P1M1ML12_3MMA",
"CPIC_SJA_P6M6ML6AR",
"CPIH_SA_P1M1ML12",
"INFTEFF_NSA",
"PCREDITBN_SJA_P1M1ML12",
"RGDP_SA_P1Q1QL4_20QMA",
"RYLDIRS05Y_NSA",
"INTRGDP_NSA_P1M1ML12_3MMA",
]
# The target category used in the learning_to_before_signaling notebook
targets_needed = [
"DU05YXR_VT10"
]
# Categories needed to build the FX blacklist
fx_blacklist_inputs = [
"FXTARGETED_NSA",
"FXUNTRADABLE_NSA"
]
xcats_to_download = sorted(set(raw_xcats_for_calcs + targets_needed + fx_blacklist_inputs)) + ["FXXR_NSA", "EQXR_NSA"]
dwn = JPMaQSDownload(
client_id=os.environ.get("JPM_CLIENT_ID", ""),
client_secret=os.environ.get("JPM_CLIENT_SECRET", ""),
oauth=True,
)
df = dwn.download(xcats=xcats_to_download, cids=cids)
Downloading data from JPMaQS.
Timestamp UTC: 2025-09-04 12:08:03
Connection successful!
Some expressions are missing from the downloaded data. Check logger output for complete list.
11 out of 312 expressions are missing. To download the catalogue of all available expressions and filter the unavailable expressions, set `get_catalogue=True` in the call to `JPMaQSDownload.download()`.
# Build fxblack (FX blacklist)
dfb = df[df["xcat"].isin(["FXTARGETED_NSA", "FXUNTRADABLE_NSA"])][["cid", "xcat", "real_date", "value"]]
dfba = (
dfb.groupby(["cid", "real_date"])
.aggregate(value=pd.NamedAgg(column="value", aggfunc="max"))
.reset_index()
)
dfba["xcat"] = "FXBLACK"
fxblack = msp.make_blacklist(dfba, "FXBLACK")
# Recreate dfx and macro factors from the intro notebook
dfx = df.copy()
calcs = [
# intuitive growth trend
"XGDP_NEG = - INTRGDPv5Y_NSA_P1M1ML12_3MMA",
# excess inflation measure
"XCPI_NEG = - ( CPIC_SJA_P6M6ML6AR + CPIH_SA_P1M1ML12 ) / 2 + INFTEFF_NSA",
# excess private credit growth
"XPCG_NEG = - PCREDITBN_SJA_P1M1ML12 + INFTEFF_NSA + RGDP_SA_P1Q1QL4_20QMA",
# excess real interest rate
"XRYLD = RYLDIRS05Y_NSA - INTRGDP_NSA_P1M1ML12_3MMA",
# combined real rate + inflation gap
"XXRYLD = XRYLD + XCPI_NEG",
]
dfa = msp.panel_calculator(dfx, calcs=calcs, cids=cids)
dfx = msm.update_df(df=dfx, df_add=dfa)
# Create cross-sectional z-scores for the macro panels (ZN4), as used later
macros = ["XGDP_NEG", "XCPI_NEG", "XPCG_NEG", "RYLDIRS05Y_NSA"]
for xc in macros:
dzn = msp.make_zn_scores(
dfx,
xcat=xc,
cids=cids,
neutral="zero",
thresh=3,
est_freq="M",
pan_weight=1,
postfix="_ZN4",
)
dfx = msm.update_df(dfx, dzn)
# the list of normalized macro factors referenced downstream
macroz = [m + "_ZN4" for m in macros]
xcatx = ["XGDP_NEG", "XCPI_NEG", "XPCG_NEG", "DU05YXR_VT10"]  # model features plus the target return; the target category must come last
# Downsample from daily to monthly frequency (features as last and target as sum)
dfw = msm.categories_df(
df=dfx,
xcats=xcatx,
cids=cids_dux,
freq="M",
lag=1,
blacklist=fxblack,
xcat_aggs=["last", "sum"],
)
# Drop rows with missing values and assign features and target
dfw.dropna(inplace=True)
X = dfw.iloc[:, :-1]
y = dfw.iloc[:, -1]
X
| cid | real_date | XGDP_NEG | XCPI_NEG | XPCG_NEG |
| --- | --- | --- | --- | --- |
| AUD | 2000-02-29 | -0.127516 | -0.162771 | -2.316805 |
| AUD | 2000-03-31 | 0.188010 | -0.162771 | -2.316805 |
| AUD | 2000-04-28 | 0.033589 | -0.162771 | -3.137645 |
| AUD | 2000-05-31 | 0.175323 | -0.676674 | -2.763879 |
| AUD | 2000-06-30 | 0.205179 | -0.676674 | -2.422330 |
| ... | ... | ... | ... | ... |
| ZAR | 2025-05-30 | -0.426351 | 1.882825 | 1.799903 |
| ZAR | 2025-06-30 | -0.030835 | 1.777107 | 0.718399 |
| ZAR | 2025-07-31 | 0.399231 | 1.732005 | 0.136673 |
| ZAR | 2025-08-29 | 0.213512 | 1.568641 | 0.322765 |
| ZAR | 2025-09-30 | -0.369677 | 1.279459 | -0.600531 |
5444 rows × 3 columns
Cross-validation splitters #
Cross-validation is a resampling technique used to evaluate how well a machine learning model generalizes to unseen data. Instead of training and testing on one fixed train-test split, cross-validation systematically splits the dataset into multiple parts, known as “folds”, and rotates through them.

The macrosynergy package specializes in cross-validation splits for panel data and supports the splitting of panel data into folds through five classes:

- ExpandingIncrementPanelSplit()
- ExpandingFrequencyPanelSplit()
- ExpandingKFoldPanelSplit()
- RollingKFoldPanelSplit()
- RecencyKFoldPanelSplit()
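All of these splitters follow the scikit-learn cross-validator interface: their split method yields integer index arrays for the training and test sets. Below is a minimal sketch of iterating over the folds of the panels built later in this notebook, assuming the standard split(X, y) signature:

splitter = msl.RollingKFoldPanelSplit(n_splits=5)
for i, (train_idx, test_idx) in enumerate(splitter.split(X, y)):
    # Each fold yields positional indices into the multi-indexed panel
    X_train, y_train = X.iloc[train_idx], y.iloc[train_idx]
    X_test, y_test = X.iloc[test_idx], y.iloc[test_idx]
    print(f"Fold {i}: {len(X_train)} training rows, {len(X_test)} test rows")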
ExpandingIncrementPanelSplit() #

The ExpandingIncrementPanelSplit() class facilitates the generation of expanding windows for cross-validation, essential for modeling scenarios where data becomes available incrementally over time. The class divides the dataset into training and test sets, systematically expanding the training set by a fixed number of time periods with each iteration. This approach effectively simulates environments where new information is gradually incorporated at set intervals.

Important parameters are:

- train_intervals: the length of the training interval in time periods. This parameter controls how much the training set expands with each new split.
- min_cids: the minimum number of cross-sections required for the initial training set, with the default being four. This is crucial in scenarios where panel data is unbalanced, ensuring there are enough cross-sections to begin the training process.
- min_periods: the smallest number of time periods required for the initial training set, with the default being 500 native frequency units. This is particularly important in an unbalanced panel context and should be used in conjunction with min_cids.
- test_size: the length of the test set for each training interval, with the default being 21 periods following the training phase.
- max_periods: the maximum duration that any training set can reach during the expanding process. If this cap is reached, the earliest data periods are excluded to maintain the constraint. Setting this value effectively performs rolling training.
split_xi = msl.ExpandingIncrementPanelSplit(train_intervals=12, min_periods=12, test_size=24, min_cids=2)
visualise_splits() #

The visualise_splits method can be applied to a splitter and is a convenient way of visualizing the splits produced by each splitter, based on the full panels of features and targets.

split_xi.visualise_splits(X, y)

ExpandingFrequencyPanelSplit() #

As with ExpandingIncrementPanelSplit(), the ExpandingFrequencyPanelSplit() class generates expanding windows for cross-validation. However, the user specifies the frequency at which the training sets expand and the frequency that each validation set spans.

The important parameters are:

- expansion_freq: the frequency at which training sets expand.
- test_freq: the frequency forward of each training set that each validation set spans.
- min_cids: the minimum number of cross-sections required for the initial training set, with the default being four. This is crucial in scenarios where panel data is unbalanced, ensuring there are enough cross-sections to begin the training process.
- min_periods: the smallest number of time periods required for the initial training set, with the default being 500 native frequency units. This is particularly important in an unbalanced panel context and should be used in conjunction with min_cids.
- max_periods: the maximum span that any training set can cover during the expanding process. If this cap is reached, the earliest data periods are excluded to maintain the constraint. Setting this value effectively performs rolling training.
split_xf = msl.ExpandingFrequencyPanelSplit(
expansion_freq="M",
test_freq="Y",
min_cids=2,
min_periods=12,
)
split_xf.visualise_splits(X, y)

ExpandingKFoldPanelSplit() #

The ExpandingKFoldPanelSplit() class produces sequential learning scenarios, where information sets grow at fixed intervals. The key parameter here is n_splits, which determines the number of desired splits (minimum 2). As above, the visualise_splits() method is used to check that the split has been performed as intended.
split_xkf = msl.ExpandingKFoldPanelSplit(n_splits=5)
split_xkf.visualise_splits(X, y)

RollingKFoldPanelSplit() #

The RollingKFoldPanelSplit class divides the panel into n_splits adjacent folds, pairing each test fold with the remaining data as its training set, so that every split uses the full data panel. Training and test sets always sit right next to each other in time, but a test set can lie either after or before its training data. This means the method can use the past to validate a model built on future data.
split_rkf = msl.RollingKFoldPanelSplit(n_splits=5)
split_rkf.visualise_splits(X, y)

RecencyKFoldPanelSplit() #

The RecencyKFoldPanelSplit class produces expanding training sets and constant-length test sets that focus on the most recent periods of the panel. It is similar to ExpandingKFoldPanelSplit, being an expanding splitter where the number of folds is specified. However, the size of each test set, in terms of the number of periods at native dataset frequency, is also specified.

Given parameters n_splits and n_periods, the last n_splits \(\times\) n_periods time periods in the panel are divided into n_splits test sets, each containing n_periods time periods. The respective training set for each test set comprises all dates in the panel prior to the test set.
split_rekf = msl.RecencyKFoldPanelSplit(n_splits=3, n_periods=3)
split_rekf.visualise_splits(X, y)

Model evaluation #
To check whether a model is actually useful, we need performance metrics. In scikit-learn terms:

- A metric is simply a function that takes the model’s predictions and the true labels and returns a number (e.g. accuracy, mean squared error).
- A scorer is a wrapper that takes three inputs: a fitted scikit-learn model, the input features and the true labels; it then calculates the score.

You can use a metric or a scorer by itself to judge how well a model fits. Or you can plug it into cross-validation, which will tell you how the model is expected to perform on new, unseen data.
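To make the distinction concrete, here is a minimal sketch using standard scikit-learn objects and the panels built above:

# A metric maps (y_true, y_pred) to a number; a scorer maps (model, X, y) to a number
model = LinearRegression().fit(X, y)
r2_metric = r2_score(y, model.predict(X))  # metric: requires predictions
r2_scorer = make_scorer(r2_score)          # scorer: wraps the metric
print(r2_metric, r2_scorer(model, X, y))   # both return the same value here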
Metrics #
In scikit-learn a metric is a function that measures the quality of predictions. The macrosynergy.learning subpackage provides a set of custom evaluation metrics designed to work with scikit-learn. Each metric is implemented as a function that takes two inputs:

- y_true: observed values
- y_pred: values predicted by the model

The available metrics include:

- regression_accuracy(): accuracy between the signs of predictions and targets
- regression_balanced_accuracy(): balanced accuracy between the signs of predictions and targets
- panel_significance_probability(): significance probability of correlation after fitting a linear mixed effects model between predictions and true targets, accounting for cross-sectional correlations present in the panel. See the research piece “Testing macro trading factors” for more information
- sharpe_ratio(): naive Sharpe ratio based on the model predictions
- sortino_ratio(): naive Sortino ratio based on the model predictions
- correlation_coefficient(): specified correlation coefficient between model predictions and “ground truth” labels. Available correlation coefficients are “pearson”, “spearman” and “kendall”
With the exception of panel_significance_probability, all the above metrics can be computed along different panel dimensions: either across cross-sections or across time periods. For example, accuracy can be measured over all samples at once, or it can be calculated separately for each cross-section and then averaged, or separately for each time period and then averaged. These approaches give estimates of the “expected” accuracy for a typical cross-section or a typical time period. This is controlled by the type argument in the metric definition, where either:

- type='cross_section'
- type='time_periods'
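As a minimal sketch, assuming the type keyword behaves as described above, in-sample accuracy could be computed in the following ways:

preds = LinearRegression().fit(X, y).predict(X)
acc_pooled = msl.regression_accuracy(y, preds)                    # over all samples at once
acc_xs = msl.regression_accuracy(y, preds, type="cross_section")  # averaged over cross-sections
acc_tp = msl.regression_accuracy(y, preds, type="time_periods")   # averaged over time periods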
Sometimes, you may want to take an existing scikit-learn metric and adapt it for panel data evaluation along cross-sections or time periods. For this, the subpackage provides the create_panel_metric() function. It takes:

- y_true: observed values
- y_pred: values predicted by the model
- sklearn_metric: a standard scikit-learn metric
- type: the panel dimension along which to evaluate

and then evaluates the metric along the specified panel axis.
An example is shown in the code cell below:
# Fit a linear regression model and make predictions
lr = LinearRegression().fit(X, y)
y_pred = lr.predict(X)
# Calculate expected in-sample R2 metric for a given cross-section
msl.create_panel_metric(
y_true = y,
y_pred = y_pred,
sklearn_metric = r2_score,
type = "cross_section"
)
np.float64(-0.779903424092632)
Scorers #
While metrics are general-purpose evaluation functions in sklearn.metrics, scores are what estimators return from their .score() method. Currently, macrosynergy.learning provides a single scorer: neg_mean_abs_corr().
Important parameters:

- estimator: a fitted custom linear regression model (subclassing BaseRegressionSystem in macrosynergy.learning) that fits a separate linear model per cross-section, storing one beta for each.
- X_test: a multi-indexed panel of benchmark returns. It is named as such because it is generally expected to be out-of-sample, although in-sample statistics can also be computed.
- y_test: a multi-indexed panel of returns, paired with X_test.
Given the collection of estimated betas stored in the estimator, hedged returns can be computed for each cross-section in X_test. To assess hedge quality, the absolute correlation between the hedged returns and the corresponding benchmark returns is calculated for each cross-section. As an overall panel measure, these absolute correlations are then averaged across all cross-sections. Finally, the result is multiplied by -1, since scorers in scikit-learn are defined to be maximized, and lower correlations indicate better hedge performance.
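Based on the parameter descriptions above, a direct call would look roughly as follows. This is a sketch only: X_bench and y_rets are hypothetical multi-indexed panels of benchmark returns and paired contract returns, and the regression system is assumed to accept default constructor arguments.

# Hypothetical panels: X_bench (benchmark returns), y_rets (contract returns)
system = msl.LinearRegressionSystem().fit(X_bench, y_rets)  # one beta per cross-section
score = msl.neg_mean_abs_corr(system, X_bench, y_rets)      # negative mean absolute correlation of hedged returns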
Preprocessing #
The macrosynergy.learning.preprocessing folder comprises various methods to manipulate the input panel of indicators in a statistical machine learning pipeline, preprocessing them in a number of ways before the transformed indicators are passed into a predictive model. We categorize the possible preprocessing methods into:

- selectors
- scalers
- transformers
Feature selectors #
A scikit-learn pipeline can incorporate a layer of feature selection. We provide some custom selectors in the macrosynergy.learning subpackage for use over a panel.

- LarsSelector(): selects features through the LARS algorithm.
  - n_factors: number of factors to be selected
  - fit_intercept: if True, includes an intercept in the LARS model
- LassoSelector(): selects features through a LASSO regression.
  - n_factors: number of factors to be selected
  - positive: when True, enforces a positive restriction
- MapSelector(): selects features based on significance from the Macrosynergy panel test.
  - n_factors: number of factors to be selected
  - significance_level: p-value significance threshold
  - positive: when True, enforces a positive restriction

For more information on the panel test, see the research piece “Testing macro trading factors”.
# Keep only factors that are significant at the 5% level, based on the MAP test.
map_test = msl.MapSelector(significance_level=0.05).fit(X, y)
map_test.transform(X)
| cid | real_date | XGDP_NEG | XCPI_NEG | XPCG_NEG |
| --- | --- | --- | --- | --- |
| AUD | 2000-02-29 | -0.127516 | -0.162771 | -2.316805 |
| AUD | 2000-03-31 | 0.188010 | -0.162771 | -2.316805 |
| AUD | 2000-04-28 | 0.033589 | -0.162771 | -3.137645 |
| AUD | 2000-05-31 | 0.175323 | -0.676674 | -2.763879 |
| AUD | 2000-06-30 | 0.205179 | -0.676674 | -2.422330 |
| ... | ... | ... | ... | ... |
| ZAR | 2025-05-30 | -0.426351 | 1.882825 | 1.799903 |
| ZAR | 2025-06-30 | -0.030835 | 1.777107 | 0.718399 |
| ZAR | 2025-07-31 | 0.399231 | 1.732005 | 0.136673 |
| ZAR | 2025-08-29 | 0.213512 | 1.568641 | 0.322765 |
| ZAR | 2025-09-30 | -0.369677 | 1.279459 | -0.600531 |
5444 rows × 3 columns
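The other selectors follow the same fit/transform pattern. A minimal sketch with LarsSelector, assuming we wish to retain two factors:

lars = msl.LarsSelector(n_factors=2).fit(X, y)
X_lars = lars.transform(X)  # keeps only the two factors chosen by the LARS algorithm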
Feature scalers #
Some learning algorithms work best when the input data is scaled before training. Without scaling, models may converge more slowly or even produce misleading results. To address this, the package provides the following scaling transformers:

- PanelStandardScaler(): transforms features by subtracting the historical mean and dividing by the historical standard deviation
- PanelMinMaxScaler(): transforms features by normalizing them between zero and one

Both classes admit a type parameter:

- type='panel' (default) to calculate the mean/std, or min/max, over the whole panel for scaling
- type='cross_section' to scale within each cross-section and concatenate the scaled cross-sectional features to reconstruct the panel
# Scale by training mean and standard deviation
msl.PanelStandardScaler().fit_transform(X)
| cid | real_date | XGDP_NEG | XCPI_NEG | XPCG_NEG |
| --- | --- | --- | --- | --- |
| AUD | 2000-02-29 | -0.217460 | 0.047128 | -0.058952 |
| AUD | 2000-03-31 | -0.105518 | 0.047128 | -0.058952 |
| AUD | 2000-04-28 | -0.160303 | 0.047128 | -0.198993 |
| AUD | 2000-05-31 | -0.110019 | -0.201634 | -0.135226 |
| AUD | 2000-06-30 | -0.099427 | -0.201634 | -0.076956 |
| ... | ... | ... | ... | ... |
| ZAR | 2025-05-30 | -0.323480 | 1.037329 | 0.643387 |
| ZAR | 2025-06-30 | -0.183160 | 0.986155 | 0.458875 |
| ZAR | 2025-07-31 | -0.030581 | 0.964323 | 0.359628 |
| ZAR | 2025-08-29 | -0.096470 | 0.885244 | 0.391377 |
| ZAR | 2025-09-30 | -0.303374 | 0.745261 | 0.233856 |
5444 rows × 3 columns
Feature transformers #
All other preprocessing classes are placed under the general tag of “transformers”. We provide two such classes:

- PanelPCA: transforms features through principal component analysis and returns a multi-indexed dataframe
  - n_components: if an integer, that many principal components are kept. If a float between 0 and 1, enough components are kept to explain up to that proportion of total variance
  - kaiser_criterion: if True, this parameter overrides n_components and keeps only the components with associated eigenvalues greater than one
  - adjust_signs: if True, each eigenvector is multiplied by either one or minus one to ensure its projected component is positively correlated with a target vector, if provided
- ZnScoreAverager (deprecated and to be replaced in a future release): performs point-in-time zn-scoring for each feature (see the section on make_zn_scores in the Introduction to the Macrosynergy package) and averages the results to form a composite signal
# Scale dataframe before applying PCA
pipe = Pipeline([
("scaler", msl.PanelStandardScaler()),
("pca", msl.PanelPCA(n_components=2)),
]).fit(X,y)
pipe.transform(X)
| cid | real_date | PCA 1 | PCA 2 |
| --- | --- | --- | --- |
| AUD | 2000-02-29 | 0.070187 | -0.219199 |
| AUD | 2000-03-31 | 0.039984 | -0.117288 |
| AUD | 2000-04-28 | 0.154591 | -0.171389 |
| AUD | 2000-05-31 | 0.256607 | -0.021035 |
| AUD | 2000-06-30 | 0.212212 | -0.009634 |
| ... | ... | ... | ... |
| ZAR | 2025-05-30 | -1.042873 | -0.703141 |
| ZAR | 2025-06-30 | -0.916078 | -0.559843 |
| ZAR | 2025-07-31 | -0.872366 | -0.414922 |
| ZAR | 2025-08-29 | -0.826028 | -0.441317 |
| ZAR | 2025-09-30 | -0.567299 | -0.576667 |
5444 rows × 2 columns
In a scikit-learn pipeline, it is often helpful to transform features into new forms, e.g. by scaling or averaging them. In addition to the scalers described above, the macrosynergy.learning subpackage provides:

- FeatureAverager(): condenses features into a single feature through averaging
Forecasting #
The macrosynergy.learning.forecasting submodule comprises a collection of scikit-learn-compatible predictor classes that convert a collection of preprocessed features into predictions. The following conventional predictor classes are provided in the package:

- NaiveRegressor(): a naive predictor class that simply returns the average of the input features, for each cross-section and timestamp
- LADRegressor(): a linear model that estimates parameters by minimising the mean absolute deviations between predictions and provided targets
- Weighted LAD regression models:
  - SignWeightedLADRegressor(): equalizes the importance of negative-return and positive-return historical samples, removing a possible sign bias learnt by the model
  - TimeWeightedLADRegressor(): increases the importance of more recent samples, by specifying a half-life of exponentially decaying weights with time for each historical sample
- Weighted least squares linear regression models:
  - SignWeightedLinearRegression(): equalizes the importance of negative-return and positive-return historical samples, removing a possible sign bias learnt by the model
  - TimeWeightedLinearRegression(): increases the importance of more recent samples, by specifying a half-life of exponentially decaying weights with time for each historical sample
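These predictors follow the usual scikit-learn fit/predict pattern. A minimal sketch, assuming the half-life is passed through a half_life argument in native frequency units:

swls = msl.SignWeightedLinearRegression().fit(X, y)              # equal weight on positive- and negative-return samples
twls = msl.TimeWeightedLinearRegression(half_life=36).fit(X, y)  # half_life assumed to be in months here
preds = twls.predict(X)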
Modified regressors #
Linear model coefficients tend to be more volatile when only limited data is available. To address this, it can be useful to adjust the coefficients based on their statistical precision, effectively creating an auxiliary factor model. This adjustment is made by estimating the coefficients’ standard errors and dividing the coefficients by these values (with a small offset added to avoid issues with very small errors).
The effect is that imprecise coefficients are shrunk, while precise coefficients are amplified, leading to more reliable estimates overall.
A key point is that the output of the auxiliary factor model is not an appropriate prediction, but it is a valid signal. To distinguish between these two concepts in the classes, we leave the predict() function to make predictions using the unadjusted factor model, whilst we introduce a create_signal() function to output signals based on the adjusted factor model.
All such regressors have a method parameter:

- method='analytic'
- method='bootstrap'

Below is a list of modified regressors in macrosynergy.learning.forecasting:

- ModifiedLinearRegression(): coefficient-adjusted OLS linear regression model
- ModifiedSignWeightedLinearRegression(): coefficient-adjusted SWLS linear regression model
- ModifiedTimeWeightedLinearRegression(): coefficient-adjusted TWLS linear regression model
Regressor systems #
The regressor systems in macrosynergy.learning.forecasting fit a regression model on each cross-section of a panel, inheriting from msl.BaseRegressionSystem. The following systems of regressions are currently implemented:

- LinearRegressionSystem(): fits a linear regression model on each cross-section of a panel. Stores coefficients and intercepts for each cross-section when only a single feature is in the model
- LADRegressionSystem(): fits a LAD regression model on each cross-section of a panel. Stores coefficients and intercepts for each cross-section when only a single feature is in the model
- RidgeRegressionSystem(): fits a Ridge regression model on each cross-section of a panel. Stores coefficients and intercepts for each cross-section when only a single feature is in the model
- CorrelationVolatilitySystem(): estimates betas through fitting moving average correlation and volatility estimators. This is used solely for the purpose of beta estimation
Sequential signal generation #
The macrosynergy.learning.sequential submodule contains classes that simulate the experience of a trader using statistical machine learning to create trading signals over time on a point-in-time basis, thus producing data for a valid backtest.

Signal optimizer #

The SignalOptimizer class supports sequential model selection, fitting, optimization and forecasting based on quantamental panel data. Three use cases are discussed in detail in the notebook Signal optimization basics:

- Feature selection chooses from candidate features to combine them into an equally weighted score
- Return prediction estimates the predictive relation of features and combines them, in accordance with their coefficients, into a single prediction
- Classification estimates the relation between features and the sign of subsequent returns and combines their effect into a binary variable of positive or negative returns
Below, we showcase the second case, focusing on the principles of generating an optimized regression-based signal.
The SignalOptimizer constructor builds a wide-format DataFrame that makes panel data suitable for supervised learning. Internally, it relies on the same categories_df function introduced earlier to create the required DataFrames. As a result, all key arguments of categories_df can also be passed directly to SignalOptimizer when initializing an object.

The only additional argument is generate_labels, a function applied to the target vector created by categories_df. If provided, the transformed target vector is used as the supervised learning labels. For example, in directional return classification, you might label positive returns as 1 and negative returns as -1 using:

generate_labels = lambda x: 1 if x >= 0 else -1
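For a classification variant, the same constructor call would simply add this labeling function (an illustrative sketch; the regression example below omits it):

so_clf = msl.SignalOptimizer(
    df=dfx,
    xcats=xcatx,
    cids=cids_dux,
    freq="M",
    lag=1,
    blacklist=fxblack,
    xcat_aggs=["last", "sum"],
    generate_labels=lambda x: 1 if x >= 0 else -1,  # sign labels for directional classification
)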
so_reg = msl.SignalOptimizer(
df = dfx,
xcats = xcatx,
cids = cids_dux,
freq = "M",
lag = 1,
blacklist = fxblack,
xcat_aggs=["last", "sum"],
)
so_reg.X
| cid | real_date | XGDP_NEG | XCPI_NEG | XPCG_NEG |
| --- | --- | --- | --- | --- |
| AUD | 2000-02-29 | -0.127516 | -0.162771 | -2.316805 |
| AUD | 2000-03-31 | 0.188010 | -0.162771 | -2.316805 |
| AUD | 2000-04-28 | 0.033589 | -0.162771 | -3.137645 |
| AUD | 2000-05-31 | 0.175323 | -0.676674 | -2.763879 |
| AUD | 2000-06-30 | 0.205179 | -0.676674 | -2.422330 |
| ... | ... | ... | ... | ... |
| ZAR | 2025-05-30 | -0.426351 | 1.882825 | 1.799903 |
| ZAR | 2025-06-30 | -0.030835 | 1.777107 | 0.718399 |
| ZAR | 2025-07-31 | 0.399231 | 1.732005 | 0.136673 |
| ZAR | 2025-08-29 | 0.213512 | 1.568641 | 0.322765 |
| ZAR | 2025-09-30 | -0.369677 | 1.279459 | -0.600531 |
5444 rows × 3 columns
calculate_predictions() #

The calculate_predictions() function generates and stores predictions for sequentially optimized models, along with their chosen hyperparameters and parameters. Model and hyperparameter selection follow standard cross-validation principles. For explainability, the class instance also retains detailed information on model and hyperparameter choices, feature importances, model coefficients, feature selection, and correlations between transformed features.
Important parameters:

- models: a dictionary of scikit-learn predictors or pipelines that contains choices for the type of model to be deployed
- hyperparameters: a nested dictionary defining the hyperparameters to consider for each model type
- scorers: a dictionary of scikit-learn-compatible scorer functions used to evaluate a model in the model selection stage
- inner_splitters: a dictionary of cross-validation splitters provided to the cross-validation module. When multiple inner splitters are provided, all splits produced by the splitters are concatenated
- search_type: type of hyperparameter search to undertake. Choices are:
  - 'grid' to perform a grid search
  - 'prior' to perform a randomized search where priors can be placed
- normalize_fold_results: if True, standardizes cross-validation fold scores for a given metric and CV fold
- cv_summary: how to aggregate cross-validation scores across folds, for different models. Options are:
  - 'mean' (default)
  - 'median'
  - 'mean-std'
  - 'mean/std'
  Alternatively, a callable can be passed into cv_summary, directly specifying the type of aggregation (see the sketch below).
To showcase the different options that SignalOptimizer provides, we construct a pipeline that involves feature scaling, feature selection and predictor training. In the example below, we train a Ridge regression model sequentially over the realized trading history. At each retraining date (every three months):

- the data is scaled,
- features are screened with a LARS-based selector (n_factors = 3),
- a Ridge regression is selected from 25 candidate alpha values.

Model selection uses cross-validation that balances \(R^{2}\) and balanced accuracy, combining splits from both RollingKFoldPanelSplit (5 folds) and ExpandingKFoldPanelSplit (3 folds initially). During the first three years of the backtest, the expanding splitter provides 3 folds, resulting in a total of 8 folds. After this period, it switches to 5 folds, giving a total of 10 folds for the remainder of the training history.
mods_reg = {
"ridge": Pipeline([
('scaler', msl.PanelStandardScaler()),
('selector', msl.LarsSelector(n_factors = 3)),
('model', Ridge()),
]),
}
grids_reg = {
"ridge": {
"model__alpha": list(np.logspace(-4, 4, 25))
},
}
scorers_reg = {
"R2": make_scorer(r2_score, greater_is_better=True),
"BAC": make_scorer(msl.regression_balanced_accuracy, greater_is_better=True),
}
splitters_reg = {
"Expanding": msl.ExpandingKFoldPanelSplit(n_splits = 3),
"Rolling": msl.RollingKFoldPanelSplit(n_splits = 5),
}
so_reg.calculate_predictions(
name = "MACRO_OPTREG",
models = mods_reg,
hyperparameters = grids_reg,
scorers = scorers_reg,
inner_splitters = splitters_reg,
search_type = "grid",
normalize_fold_results = True,
cv_summary = "median",
test_size = 3,
min_cids=4,
min_periods=36,
split_functions = {"Expanding": lambda n: 0 if n < 12 * 3 else 2, "Rolling": None}
)
models_heatmap() #

The models_heatmap method of the SignalOptimizer class visualizes the optimal models used for signal calculation over time. If many models have been considered, their number can be limited by the cap argument.
so_reg.models_heatmap("MACRO_OPTREG")

feature_importance_timeplot() #

The feature_importance_timeplot function creates a time plot of linear model regression coefficients for each feature. For these statistics to be recorded, the underlying scikit-learn predictor class (in this case, Ridge) must contain coef_ and intercept_ attributes. Gaps in the lines appear either when a model without the required attributes (e.g. a KNN or random forest) is selected, or when the feature selector (in this case, LarsSelector) does not select those features.
so_reg.feature_importance_timeplot(name="MACRO_OPTREG", figsize=(16, 6))

coefs_stackedbarplot() #

The coefs_stackedbarplot() method is an alternative to coefs_timeplot() and displays a stacked bar plot of average annual model coefficients over time.
so_reg.coefs_stackedbarplot(name="MACRO_OPTREG", figsize=(16, 6))

intercepts_timeplot() #

Similarly to model coefficients, changing model intercepts can be visualised over time through a timeplot using the intercepts_timeplot() function.
so_reg.intercepts_timeplot(name="MACRO_OPTREG", figsize=(16, 6))

nsplits_timeplot() #

The nsplits_timeplot() method displays the number of cross-validation splits applied over time. This is useful if split_functions was specified when running a pipeline.
so_reg.nsplits_timeplot(name="MACRO_OPTREG")

dfx = msm.update_df(dfx, so_reg.get_optimized_signals("MACRO_OPTREG"))
Beta estimator #
The BetaEstimator class is used to calculate sequential betas for each cross-section of a financial market return panel, with respect to a common benchmark return. Out-of-sample hedged returns are calculated between model refreshing dates. The same model selection process as in SignalOptimizer is used to choose between candidate models, from which coefficients are extracted.

The constructor of BetaEstimator builds a wide-format dataframe where the benchmark ticker, the sole predictor of concurrent market returns, is replicated across all return cross-sections in the panel. In the example below, we compute betas for each cross-section of our FX forward returns with respect to S&P 500 futures.
be = msl.BetaEstimator(
df = dfx,
xcats = ["FXXR_NSA"],
benchmark_return = "USD_EQXR_NSA",
cids = cids,
)
be.X
| cid | real_date | EQXR_NSA |
| --- | --- | --- |
| AUDvUSD | 2000-01-03 | -1.172349 |
| AUDvUSD | 2000-01-04 | -3.749659 |
| AUDvUSD | 2000-01-05 | 0.120414 |
| AUDvUSD | 2000-01-06 | -0.672091 |
| AUDvUSD | 2000-01-07 | 4.024217 |
| ... | ... | ... |
| ZARvUSD | 2025-08-28 | 0.330973 |
| ZARvUSD | 2025-08-29 | -0.686613 |
| ZARvUSD | 2025-09-01 | 0.000000 |
| ZARvUSD | 2025-09-02 | -0.729983 |
| ZARvUSD | 2025-09-03 | 0.494125 |
153417 rows × 1 columns
estimate_beta() #

The estimate_beta() method in the Macrosynergy package is used to estimate how much an asset (or return series) moves in response to a chosen benchmark, while also producing out-of-sample hedged returns at a set retraining frequency. Instead of just fitting a simple regression, it uses the same learning pipeline as SignalOptimizer, meaning it applies cross-validation to pick the best model type and hyperparameters, and then updates them as specified. The method does not just output betas; it also gives you a way to test how those betas would have worked in practice by producing hedged return series.

Many parameters of estimate_beta() are shared with the calculate_predictions() method of SignalOptimizer. The only parameters that differ are:

- beta_xcat: category name for the stored estimated betas
- hedged_return_xcat: category name for the stored out-of-sample hedged returns

Lastly, the models dictionary provided to estimate_beta() expects models that inherit from the BaseRegressionSystem class, ensuring that separate models are fit on different cross-sections, resulting in diverse betas amongst cross-sections.
mods_be = {
"LR": msl.RidgeRegressionSystem(
fit_intercept = True,
alpha = 1,
positive = False,
roll = "full",
data_freq = "M",
)
}
grids_be = {
"LR": {
"alpha": [1, 10, 100, 1000, 10000]
}
}
scorers_be = {
"neg_mean_abs_corr": msl.neg_mean_abs_corr
}
splitters_be = {
"Expanding": msl.ExpandingKFoldPanelSplit(n_splits = 10),
}
be.estimate_beta(
beta_xcat="BETA_NSA",
hedged_return_xcat="HEDGED_RETURN_NSA",
models = mods_be,
hyperparameters = grids_be,
scorers = scorers_be,
inner_splitters = splitters_be,
search_type = "grid",
normalize_fold_results=True,
min_cids = 4,
min_periods = 60,
est_freq = "Y",
)
models_heatmap() #

The models_heatmap method of the BetaEstimator class visualizes the optimal models used for beta estimation over time. If many models have been considered, their number can be limited by the cap argument.
be.models_heatmap("BETA_NSA", figsize=(12, 2))

get_betas() #

The get_betas() function can be used to extract the calculated panel of betas.
be.get_betas("BETA_NSA")
|  | real_date | cid | xcat | value |
| --- | --- | --- | --- | --- |
| 0 | 2000-03-24 | AUD | BETA_NSA | 0.013376 |
| 1 | 2000-03-27 | AUD | BETA_NSA | 0.013376 |
| 2 | 2000-03-28 | AUD | BETA_NSA | 0.013376 |
| 3 | 2000-03-29 | AUD | BETA_NSA | 0.013376 |
| 4 | 2000-03-30 | AUD | BETA_NSA | 0.013376 |
| ... | ... | ... | ... | ... |
| 151909 | 2025-08-28 | ZAR | BETA_NSA | 0.151301 |
| 151910 | 2025-08-29 | ZAR | BETA_NSA | 0.151301 |
| 151911 | 2025-09-01 | ZAR | BETA_NSA | 0.151301 |
| 151912 | 2025-09-02 | ZAR | BETA_NSA | 0.151301 |
| 151913 | 2025-09-03 | ZAR | BETA_NSA | 0.151301 |
151914 rows × 4 columns
get_hedged_returns() #

The get_hedged_returns() function can be used to extract the calculated panel of hedged returns.
be.get_hedged_returns("HEDGED_RETURN_NSA")
|  | real_date | cid | xcat | value |
| --- | --- | --- | --- | --- |
| 0 | 2000-03-27 | AUD | HEDGED_RETURN_NSA | 0.980193 |
| 1 | 2000-03-28 | AUD | HEDGED_RETURN_NSA | 0.119258 |
| 2 | 2000-03-29 | AUD | HEDGED_RETURN_NSA | -0.506133 |
| 3 | 2000-03-30 | AUD | HEDGED_RETURN_NSA | 0.233666 |
| 4 | 2000-03-31 | AUD | HEDGED_RETURN_NSA | -0.827272 |
| ... | ... | ... | ... | ... |
| 155405 | 2025-08-28 | ZAR | HEDGED_RETURN_NSA | 0.309650 |
| 155406 | 2025-08-29 | ZAR | HEDGED_RETURN_NSA | 0.104120 |
| 155407 | 2025-09-01 | ZAR | HEDGED_RETURN_NSA | 0.762983 |
| 155408 | 2025-09-02 | ZAR | HEDGED_RETURN_NSA | -0.629873 |
| 155409 | 2025-09-03 | ZAR | HEDGED_RETURN_NSA | 0.097560 |
155410 rows × 4 columns
evaluate_hedged_returns() #

The evaluate_hedged_returns() function can be used to determine the average correlation between the calculated hedged returns and the input panel of contract returns. This gives a measure of the quality of the hedge.

Important parameters:

- hedged_return_xcat: name of the hedged return category calculated in the BetaEstimator instance
- correlation_types: type of correlation to calculate, e.g. 'pearson' (default), 'kendall', or 'spearman'
- blacklist: blacklisted periods that should be excluded from the correlation calculation
- freqs: string or list of strings of frequencies at which correlations are to be calculated
In the example below, we can see that hedging has produced lower absolute correlations with the benchmark on average than the unadjusted returns, meaning that the hedge was successful.
be.evaluate_hedged_returns(
hedged_return_xcat="HEDGED_RETURN_NSA",
correlation_types=["pearson", "kendall", "spearman"],
blacklist = fxblack,
freqs = ["M", "Q"],
)
| benchmark return | return category | frequency | pearson | kendall | spearman |
| --- | --- | --- | --- | --- | --- |
| USD_EQXR_NSA | HEDGED_RETURN_NSA | M | 0.188726 | 0.102106 | 0.149935 |
| USD_EQXR_NSA | HEDGED_RETURN_NSA | Q | 0.200774 | 0.112712 | 0.165224 |
| USD_EQXR_NSA | FXXR_NSA | M | 0.354168 | 0.222569 | 0.321713 |
| USD_EQXR_NSA | FXXR_NSA | Q | 0.375871 | 0.232173 | 0.332494 |