Global FX trading signals and regression-based learning #
This notebook illustrates the points discussed in the post “FX trading signals and regression-based learning” on the Macrosynergy website. It demonstrates how regression-based statistical learning helps build trading signals from multiple candidate constituents. The method optimizes models and hyperparameters sequentially and produces point-in-time signals for backtesting and live trading. This post applies regression-based learning to a selection of macro trading factors for developed market FX trading, using a novel cross-validation method for expanding panel data. Sequentially optimized models consider nine theoretically valid macro trend indicators to predict FX forward returns. The learning process has delivered significant predictors of returns and consistent positive PnL generation for over 20 years. The most important macro-FX signals, in the long run, have been relative labor market trends, manufacturing business sentiment changes, inflation expectations, and terms-of-trade dynamics.
The notebook is organized into the following main sections:

Get Packages and JPMaQS Data: This section is dedicated to installing and importing the necessary Python packages for the analysis. It includes standard Python libraries like pandas and seaborn, as well as the scikit-learn package and the specialized macrosynergy package.

Transformations and Checks: In this part, the notebook conducts data calculations and transformations to derive relevant signals and targets for the analysis. This involves constructing simple linear composite indicators, means, relative values, etc.

Regression-based learning: This section employs the learning module of the macrosynergy package together with LinearRegression() from the scikit-learn package, defining the hyperparameter grid, the employed models, cross-validation splits, optimization criteria, etc.

Value checks and comparison of the conceptual parity signal with OLS-based learning signals: The comparison is performed using standard performance metrics, as well as naive PnLs (profit and loss) of simple relative trading strategies based on both signal types.

Alternative models of regression-based learning: This section explores additional models - a regularized linear model (elastic net), sign-weighted linear regression and time-weighted linear regression - using the same standard performance metrics and naive PnL calculations.
Get packages and JPMaQS data #
import os
import numpy as np
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LinearRegression, ElasticNet
from sklearn.metrics import (
make_scorer,
r2_score,
)
import macrosynergy.management as msm
import macrosynergy.panel as msp
import macrosynergy.pnl as msn
import macrosynergy.signal as mss
import macrosynergy.learning as msl
import macrosynergy.visuals as msv
from macrosynergy.download import JPMaQSDownload
import warnings
warnings.simplefilter("ignore")
# Cross-sections of interest - FX
cids_g3 = ["EUR", "USD", "JPY"]  # DM large currency areas
cids_dmfx = ["AUD", "CAD", "CHF", "GBP", "NOK", "NZD", "SEK"]  # DM small currency areas
cids_emfx = ["MXN", "ILS", "ZAR", "PLN", "HUF", "KRW", "TWD"]  # 7 liquid EM currencies
cids_fx = cids_dmfx + cids_emfx
cids_dm = cids_g3 + cids_dmfx
cids = cids_dm + cids_emfx
cids_eur = ["CHF", "NOK", "SEK", "PLN", "HUF"]  # trading against EUR
cids_eud = ["GBP"]  # trading against EUR and USD
cids_usd = list(set(cids_fx) - set(cids_eur + cids_eud))  # trading against USD
# Quantamental features of interest
ir = [
"RIR_NSA",
]
ctots = [
"CTOT_NSA_P1M1ML12",
"CTOT_NSA_P1M12ML1",
"CTOT_NSA_P1W4WL1",
]
surv = [
"MBCSCORE_SA_D3M3ML3",
"MBCSCORE_SA_D1Q1QL1",
"MBCSCORE_SA_D6M6ML6",
"MBCSCORE_SA_D2Q2QL2",
]
ppi = [
"PGDPTECH_SA_P1M1ML12_3MMA",
]
gdps = [
"INTRGDP_NSA_P1M1ML12_3MMA",
]
emp = [
"UNEMPLRATE_SA_D3M3ML3",
"UNEMPLRATE_SA_D6M6ML6",
"UNEMPLRATE_SA_D1Q1QL1",
"UNEMPLRATE_SA_D2Q2QL2",
"UNEMPLRATE_NSA_3MMA_D1M1ML12",
"UNEMPLRATE_NSA_D1Q1QL4",
]
iip = [
"IIPLIABGDP_NSA",
]
cpi = [
"CPIXFE_SA_P1M1ML12",
"CPIXFE_SJA_P3M3ML3AR",
"CPIXFE_SJA_P6M6ML6AR",
"INFE1Y_JA",
"INFE2Y_JA",
"INFE5Y_JA",
]
tots = ctots
main = tots + gdps + surv + ir + emp + iip + ppi + cpi
adds = [
"INFTEFF_NSA",
]
mkts = [
"FXTARGETED_NSA",
"FXUNTRADABLE_NSA",
]
rets = [
"FXXR_NSA",
"FXXR_VT10",
"FXXRHvGDRB_NSA",
]
xcats = main + mkts + rets + adds
# Resultant tickers for download
single_tix = ["USD_GB10YXR_NSA", "EUR_FXXR_NSA", "USD_EQXR_NSA"]
tickers = [cid + "_" + xcat for cid in cids for xcat in xcats] + single_tix
JPMaQS indicators are conveniently grouped into six main categories: Economic Trends, Macroeconomic balance sheets, Financial conditions, Shocks and risk measures, Stylized trading factors, and Generic returns. Each indicator has a separate page with notes, description, availability, statistical measures, and timelines for main currencies. The description of each JPMaQS category is available under the Macro quantamental academy . For tickers used in this notebook see Real short-term interest rates , Terms-of-trade , Manufacturing confidence scores , Producer price inflation , Intuitive GDP growth estimates , Labor market dynamics , International investment position , Consistent core CPI trends , Inflation expectations , Inflation targets , FX forward returns , and FX tradeability and flexibility .
In this notebook, we introduce several lists of currencies for convenience in the subsequent analysis:

cids_g3 represents the three largest developed market currency areas.

cids_dmfx represents DM small currency areas.

cids_eur contains currencies trading against EUR.

cids_eud contains currencies trading against both EUR and USD.

cids_usd contains currencies trading against USD.
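The set arithmetic behind cids_usd can be sanity-checked with plain Python; the following is a small self-contained sketch (list contents copied from the cells above, with sorting added for a deterministic order):

```python
# Verify that the EUR-, EUR/USD- and USD-benchmarked groups partition
# the tradable FX cross-sections without overlap.
cids_dmfx = ["AUD", "CAD", "CHF", "GBP", "NOK", "NZD", "SEK"]
cids_emfx = ["MXN", "ILS", "ZAR", "PLN", "HUF", "KRW", "TWD"]
cids_fx = cids_dmfx + cids_emfx

cids_eur = ["CHF", "NOK", "SEK", "PLN", "HUF"]  # trading against EUR
cids_eud = ["GBP"]  # trading against EUR and USD
cids_usd = sorted(set(cids_fx) - set(cids_eur + cids_eud))  # remainder trades against USD

assert set(cids_eur) | set(cids_eud) | set(cids_usd) == set(cids_fx)
print(cids_usd)  # ['AUD', 'CAD', 'ILS', 'KRW', 'MXN', 'NZD', 'TWD', 'ZAR']
```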
The JPMaQS indicators we consider are downloaded using the J.P. Morgan DataQuery API interface within the macrosynergy package. This is done by specifying ticker strings, formed by appending an indicator category code to a currency area code. These are mapped to full DataQuery expressions of the form DB(JPMAQS,<cross_section>_<category>,<info>), where <info> denotes the metric of interest:

value giving the latest available values for the indicator,

eop_lag referring to days elapsed since the end of the observation period,

mop_lag referring to the number of days elapsed since the mean observation period,

grade denoting a grade of the observation, giving a metric of real-time information quality.

After instantiating the JPMaQSDownload class within the macrosynergy.download module, one can use the download(tickers, start_date, metrics) method to easily download the necessary data, where tickers is an array of ticker strings, start_date is the first collection date to be considered and metrics is an array comprising the time series information to be downloaded. For more information see here
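As an illustration of how such expressions are assembled from tickers and metrics, here is a hypothetical sketch (separate from the notebook's actual download cell, which delegates this step to JPMaQSDownload):

```python
# Illustrative only: build DataQuery expressions of the form
# DB(JPMAQS,<cross_section>_<category>,<info>) from tickers and metrics.
sample_tickers = ["EUR_FXXR_NSA", "USD_EQXR_NSA"]
metrics = ["value", "eop_lag", "mop_lag", "grade"]

expressions = [f"DB(JPMAQS,{t},{m})" for t in sample_tickers for m in metrics]
print(expressions[0])  # DB(JPMAQS,EUR_FXXR_NSA,value)
```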
# Download series from J.P. Morgan DataQuery by tickers
start_date = "1990-01-01"
end_date = None
# Retrieve credentials
oauth_id = os.getenv("DQ_CLIENT_ID")  # Replace with own client ID
oauth_secret = os.getenv("DQ_CLIENT_SECRET")  # Replace with own secret
# Download from DataQuery
downloader = JPMaQSDownload(client_id=oauth_id, client_secret=oauth_secret)
df = downloader.download(
    tickers=tickers,
    start_date=start_date,
    end_date=end_date,
    metrics=["value"],
    suppress_warning=True,
    show_progress=True,
)
dfx = df.copy()
dfx.info()
Downloading data from JPMaQS.
Timestamp UTC: 2024-11-04 11:12:39
Connection successful!
Requesting data: 100%█████████████████████████████████████████████████████████████████ 25/25 [00:05<00:00, 4.73it/s]
Downloading data: 100%████████████████████████████████████████████████████████████████ 25/25 [00:26<00:00, 1.04s/it]
Some expressions are missing from the downloaded data. Check logger output for complete list.
90 out of 495 expressions are missing. To download the catalogue of all available expressions and filter the unavailable expressions, set `get_catalogue=True` in the call to `JPMaQSDownload.download()`.
Some dates are missing from the downloaded data.
2 out of 9093 dates are missing.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2951399 entries, 0 to 2951398
Data columns (total 4 columns):
# Column Dtype
  
0 real_date datetime64[ns]
1 cid object
2 xcat object
3 value float64
dtypes: datetime64[ns](1), float64(1), object(2)
memory usage: 90.1+ MB
Availability and blacklisting #
Renaming #
Rename quarterly tickers to roughly equivalent monthly tickers to simplify subsequent operations.
dict_repl = {
"UNEMPLRATE_NSA_D1Q1QL4": "UNEMPLRATE_NSA_3MMA_D1M1ML12",
"UNEMPLRATE_SA_D1Q1QL1": "UNEMPLRATE_SA_D3M3ML3",
"UNEMPLRATE_SA_D2Q2QL2": "UNEMPLRATE_SA_D6M6ML6",
"MBCSCORE_SA_D1Q1QL1": "MBCSCORE_SA_D3M3ML3",
"MBCSCORE_SA_D2Q2QL2": "MBCSCORE_SA_D6M6ML6",
}
for key, value in dict_repl.items():
    dfx["xcat"] = dfx["xcat"].str.replace(key, value)
Check availability #
Prior to commencing any analysis, it is crucial to evaluate the accessibility of data. This evaluation serves several purposes, including identifying potential data gaps or constraints within the dataset. Such gaps can significantly influence the trustworthiness and accuracy of the analysis. Moreover, it aids in verifying that ample observations are available for each chosen category and cross-section. Additionally, it assists in establishing suitable timeframes for conducting the analysis.
xcatx = cpi + ppi + ir
msm.check_availability(df=dfx, xcats=xcatx, cids=cids, missing_recent=False)
xcatx = surv + emp
msm.check_availability(df=dfx, xcats=xcatx, cids=cids, missing_recent=False)
xcatx = gdps
msm.check_availability(df=dfx, xcats=xcatx, cids=cids, missing_recent=False)
xcatx = ctots
msm.check_availability(df=dfx, xcats=xcatx, cids=cids, missing_recent=False)
xcatx = iip
msm.check_availability(df=dfx, xcats=xcatx, cids=cids, missing_recent=False)
xcatx = rets
msm.check_availability(df=dfx, xcats=xcatx, cids=cids, missing_recent=False)
Blacklisting #
Identifying and isolating periods of official exchange rate targets, illiquidity, or convertibility-related distortions in FX markets is the first step in creating an FX trading strategy. These periods can significantly impact the behavior and dynamics of currency markets, and failing to account for them can lead to inaccurate or misleading findings. A standard blacklist dictionary (
fxblack
in the cell below) can be passed to several
macrosynergy
package functions that exclude the blacklisted periods from related analyses.
# Create blacklisting dictionary
dfb = df[df["xcat"].isin(["FXTARGETED_NSA", "FXUNTRADABLE_NSA"])].loc[
:, ["cid", "xcat", "real_date", "value"]
]
dfba = (
dfb.groupby(["cid", "real_date"])
.aggregate(value=pd.NamedAgg(column="value", aggfunc="max"))
.reset_index()
)
dfba["xcat"] = "FXBLACK"
fxblack = msp.make_blacklist(dfba, "FXBLACK")
fxblack
{'CHF': (Timestamp('2011-10-03 00:00:00'), Timestamp('2015-01-30 00:00:00')),
 'ILS': (Timestamp('1999-01-01 00:00:00'), Timestamp('2005-12-30 00:00:00'))}
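To make the blacklist logic concrete, here is a toy sketch of the underlying idea - finding the first and last flagged business day for a cross-section - in plain pandas rather than the package's make_blacklist (dates chosen to mirror the CHF example above):

```python
import pandas as pd

# Toy flag series: 1.0 during an FX-target or untradability period, else 0.0.
dates = pd.bdate_range("2011-09-01", "2015-03-31")
flag = pd.Series(0.0, index=dates)
flag.loc["2011-10-03":"2015-01-30"] = 1.0  # hypothetical flagged period

# The blacklist entry is the (first, last) flagged business day.
active = flag[flag > 0]
period = (active.index.min(), active.index.max())
print(period)
```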
Transformations and checks #
Singleconcept calculations #
Unemployment rate changes #
The
linear_composite
method from the
macrosynergy
package is used to combine individual category scores into a single composite indicator. This technique allows for the assignment of specific weights to each category, which can be adjusted over time according to analysis needs or data insights. In this example, the weights [4, 2, 1] are allocated to three different measures of unemployment trends: “UNEMPLRATE_SA_D3M3ML3”, “UNEMPLRATE_SA_D6M6ML6”, and “UNEMPLRATE_NSA_3MMA_D1M1ML12”. By applying these weights, a unified composite indicator, named
UNEMPLRATE_SA_D
, is created, offering a weighted perspective on unemployment trends.
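The weighted-average arithmetic behind linear_composite with normalize_weights=True can be sketched in plain numpy (hypothetical score values, not actual JPMaQS data):

```python
import numpy as np

# Weights [4, 2, 1] are rescaled to sum to one before being applied
# to the three unemployment-trend scores.
w = np.array([4.0, 2.0, 1.0])
w = w / w.sum()  # [4/7, 2/7, 1/7]
scores = np.array([0.7, -0.35, 0.0])  # hypothetical category values for one date
composite = float(w @ scores)
print(round(composite, 3))  # 0.3
```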
# Combine to annualized change average
xcatx = [
"UNEMPLRATE_SA_D3M3ML3",
"UNEMPLRATE_SA_D6M6ML6",
"UNEMPLRATE_NSA_3MMA_D1M1ML12",
]
cidx = cids
dfa = msp.linear_composite(
dfx,
xcats=xcatx,
cids=cidx,
weights=[4, 2, 1],
normalize_weights=True,
complete_xcats=False,
new_xcat="UNEMPLRATE_SA_D",
)
dfx = msm.update_df(dfx, dfa)
International liabilities trends #
To analyze trends in the information states of liabilities compared to the past 2 and 5 years, the process begins by computing 2-year and 5-year rolling averages. These averages are then subtracted from the current trend to highlight deviations or changes over the specified periods. Utilizing the
linear_composite
method from the
macrosynergy
package, these calculated deviations for each time frame are combined into a single composite indicator named
IIPLIABGDP_NSA_D
. In this aggregation, greater importance is assigned to the 2Year trend compared to the 5Year trend, as reflected by the allocation of higher weights to the former in the weighting scheme used within the function.
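The deviation-from-rolling-average transformation used in the calculations below can be illustrated with a toy pandas series (a short 4-period window stands in for the 21*24 and 21*60 business-day windows):

```python
import numpy as np
import pandas as pd

# Current value minus its trailing rolling average highlights deviations
# from the recent norm.
s = pd.Series(np.arange(10, dtype=float))  # toy "information state" series
dev = s - s.rolling(4).mean()              # value minus 4-period average
print(dev.iloc[-1])  # 1.5, i.e. 9 - mean(6, 7, 8, 9)
```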
# Two plausible information state trends
cidx = cids
calcs = [
    " IIPLIABGDP_NSA_2YAVG = IIPLIABGDP_NSA.rolling(21*24).mean() ",
    " IIPLIABGDP_NSA_5YAVG = IIPLIABGDP_NSA.rolling(21*60).mean() ",
    " IIPLIABGDP_NSAv2YAVG = IIPLIABGDP_NSA - IIPLIABGDP_NSA_2YAVG ",
    " IIPLIABGDP_NSAv5YAVG = IIPLIABGDP_NSA - IIPLIABGDP_NSA_5YAVG ",
]
dfa = msp.panel_calculator(dfx, calcs=calcs, cids=cidx)
dfx = msm.update_df(dfx, dfa)
# Combine to single trend measure
cidx = cids
xcatx = [
"IIPLIABGDP_NSAv2YAVG",
"IIPLIABGDP_NSAv5YAVG",
]
dfa = msp.linear_composite(
dfx,
xcats=xcatx,
cids=cidx,
weights=[1/2, 1/5],
normalize_weights=True,
complete_xcats=False,
new_xcat="IIPLIABGDP_NSA_D",
)
dfx = msm.update_df(dfx, dfa)
Excess producer and core consumer price inflation #
The linear_composite method from the macrosynergy package is utilized to synthesize individual inflation trend scores into a consolidated composite indicator, applying equal weights to each score. This approach ensures that all identified inflation trends contribute equally to the formation of the composite indicator, facilitating a balanced and comprehensive overview of inflationary movements.
cidx = cids
xcatx = ['CPIXFE_SA_P1M1ML12', 'CPIXFE_SJA_P3M3ML3AR', 'CPIXFE_SJA_P6M6ML6AR']
dfa = msp.linear_composite(
dfx,
xcats=xcatx,
cids=cidx,
complete_xcats=False,
new_xcat="CPIXFE_SA_PAR",
)
dfx = msm.update_df(dfx, dfa)
In this analysis, the difference is calculated between the composite core CPI trend for the local economy, represented by the average labeled CPIXFE_SA_PAR, and the effective inflation target, denoted by INFTEFF_NSA. This calculation produces a new indicator named XCPIXFE_SA_PAR, which reflects the excess of local core inflation over the target.
Similarly, the same method is applied to the producer price inflation trend: its difference from the effective inflation target results in a new indicator labeled XPGDPTECH_SA_P1M1ML12_3MMA. This indicator captures excess producer price inflation, providing insights into inflationary pressures from the perspective of producers.
cidx = cids
xcatx = ["CPIXFE_SA_PAR", "PGDPTECH_SA_P1M1ML12_3MMA"]
calcs = []
for xc in xcatx:
    calcs.append(f" X{xc} = {xc} - INFTEFF_NSA ")
dfa = msp.panel_calculator(dfx, calcs=calcs, cids=cidx)
dfx = msm.update_df(dfx, dfa)
Inflation expectations #
The
linear_composite
method from the
macrosynergy
package is used to compute the average of 1year, 2year, and 5year inflation expectations, culminating in the creation of a composite indicator named
INFE_JA
. This method effectively consolidates the various inflation expectation horizons into a single, comprehensive measure, providing a more holistic view of expected inflation trends over multiple time frames.
cidx = cids
xcatx = ['INFE1Y_JA', 'INFE2Y_JA', 'INFE5Y_JA']
dfa = msp.linear_composite(
dfx,
xcats=xcatx,
cids=cidx,
complete_xcats=False,
new_xcat="INFE_JA",
)
dfx = msm.update_df(dfx, dfa)
Commoditybased terms of trade improvement #
The
linear_composite
method from the
macrosynergy
package is utilized to develop a composite indicator of commoditybased terms of trade trends, named
CTOT_NSA_PAR
. In this process, weights are assigned to reflect the relative importance of short-term changes, specifically comparing the most recent week against the preceding four weeks. This weighting strategy emphasizes the significance of the latest movements in commodity terms of trade, allowing the composite indicator to provide insights into short-term trends and their potential impact on trade conditions.
cidx = cids
xcatx = ["CTOT_NSA_P1M1ML12", "CTOT_NSA_P1M12ML1", "CTOT_NSA_P1W4WL1"]
dfa = msp.linear_composite(
dfx,
xcats=xcatx,
cids=cidx,
weights=[1/12, 1/6, 2],
normalize_weights=True,
complete_xcats=False,
new_xcat="CTOT_NSA_PAR",
)
dfx = msm.update_df(dfx, dfa)
Manufacturing confidence improvement #
Using the same weighting approach and function, the
linear_composite
method from the
macrosynergy
package is employed to aggregate individual category scores of manufacturing confidence into a consolidated composite indicator named
MBCSCORE_SA_D
. This process involves combining various measures of manufacturing confidence into a single indicator, with weights applied to emphasize specific aspects of the data according to predefined criteria. The creation of the
MBCSCORE_SA_D
indicator enables a comprehensive view of manufacturing confidence, streamlining the analysis of this sector’s sentiment.
cidx = cids
xcatx = ["MBCSCORE_SA_D3M3ML3", "MBCSCORE_SA_D6M6ML6"]
dfa = msp.linear_composite(
dfx,
xcats=xcatx,
cids=cidx,
weights=[2, 1],
normalize_weights=True,
complete_xcats=False,
new_xcat="MBCSCORE_SA_D",
)
dfx = msm.update_df(dfx, dfa)
Relative values, sign adjustments and normalization #
The
make_relative_value()
function is designed to create a dataframe that showcases relative values for a specified list of categories. In this context, “relative” refers to comparing an original value against an average from a basket of choices. By default, this basket is composed of all available crosssections, with the relative value being derived by subtracting the average of the basket from each individual crosssection value.
For this specific application, relative values are computed against either the USD or EUR as benchmarks, contingent upon the currency in question. Currencies that are typically traded against the EUR, namely CHF, NOK, SEK, PLN, and HUF (grouped in the list cids_eur), are calculated in relation to the EUR. The GBP is unique in that it is considered in trading against both USD and EUR, while all other currencies in this analysis are benchmarked against the USD.
As a result of this process, the generated relative time series are appended with the postfix
_vBM
, indicating their comparison against a benchmark currency.
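The core arithmetic of make_relative_value - subtracting the benchmark's value on each date from every cross-section's value - can be sketched with plain pandas (hypothetical values):

```python
import pandas as pd

# Toy wide frame: two cross-sections plus the EUR benchmark.
toy = pd.DataFrame(
    {"CHF": [1.0, 2.0], "NOK": [0.5, 1.5], "EUR": [1.0, 1.0]},
    index=pd.to_datetime(["2020-01-31", "2020-02-29"]),
)
# Relative value versus the EUR benchmark, date by date.
rel = toy[["CHF", "NOK"]].sub(toy["EUR"], axis=0)
print(rel.loc["2020-02-29", "CHF"])  # 1.0
```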
# Calculate relative values to benchmark currency areas
xcatx = [
"UNEMPLRATE_SA_D",
"RIR_NSA",
"XCPIXFE_SA_PAR",
"XPGDPTECH_SA_P1M1ML12_3MMA",
"INTRGDP_NSA_P1M1ML12_3MMA",
"INFE_JA",
]
dfa_usd = msp.make_relative_value(dfx, xcatx, cids_usd, basket=["USD"], postfix="vBM")
dfa_eur = msp.make_relative_value(dfx, xcatx, cids_eur, basket=["EUR"], postfix="vBM")
dfa_eud = msp.make_relative_value(
dfx, xcatx, cids_eud, basket=["EUR", "USD"], postfix="vBM"
)
dfa = pd.concat([dfa_eur, dfa_usd, dfa_eud])
dfx = msm.update_df(dfx, dfa)
To display and compare the values of one of the resulting indicators, INFE_JAvBM, we use the view_timelines() function from the macrosynergy package:
# Check time series
xc = "INFE_JA"
cidx = cids_fx
xcatx = [xc, xc + "vBM"]
msp.view_timelines(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    start="2000-01-01",
    aspect=1.7,
    ncol=3,
)
The negatives of the relative unemployment rate and international-liabilities-to-GDP series are defined for convenience: all constituent candidates are given the “right sign”, so that their theoretically expected predictive direction is positive.
# Negative values
xcatx = ["UNEMPLRATE_SA_DvBM", "IIPLIABGDP_NSA_D"]
calcs = []
for xc in xcatx:
    calcs += [f"{xc}_NEG = - {xc}"]
dfa = msp.panel_calculator(dfx, calcs=calcs, cids=cids_fx)
dfx = msm.update_df(dfx, dfa)
# Conceptual features with theoretical positive FX impact
cpos = [
"INFE_JAvBM",
"CTOT_NSA_PAR",
"UNEMPLRATE_SA_DvBM_NEG",
"IIPLIABGDP_NSA_D_NEG",
"RIR_NSAvBM",
"XCPIXFE_SA_PARvBM",
"MBCSCORE_SA_D",
"XPGDPTECH_SA_P1M1ML12_3MMAvBM",
"INTRGDP_NSA_P1M1ML12_3MMAvBM",
]
cpos.sort()
We normalize the plausible indicators collected in the list cpos above using the make_zn_scores() function. This is a standard procedure aimed at making various categories comparable, in particular when summing or averaging categories with different units and time series properties.
# Zn-scores
xcatx = cpos
cidx = cids_fx
dfa = pd.DataFrame(columns=list(dfx.columns))
for xc in xcatx:
    dfaa = msp.make_zn_scores(
        dfx,
        xcat=xc,
        cids=cidx,
        sequential=True,
        min_obs=261 * 3,
        neutral="zero",
        pan_weight=1,
        thresh=4,
        postfix="_ZN",
        est_freq="m",
    )
    dfa = msm.update_df(dfa, dfaa)
dfx = msm.update_df(dfx, dfa)
cpoz = [x + "_ZN" for x in cpos]
Renaming indicators to more intuitive names rather than technical tickers is a beneficial practice for enhancing the readability and interpretability of data analyses.
The correl_matrix() function below visualizes correlation coefficients between the signal constituent candidates since 2000.
cidx = cids_fx
xcatx = cpoz
sdate = "2000-01-01"
renaming_dict = {
"CTOT_NSA_PAR_ZN": "Terms-of-trade dynamics",
"IIPLIABGDP_NSA_D_NEG_ZN": "International liability trends (negative)",
"INFE_JAvBM_ZN": "Relative inflation expectations",
"INTRGDP_NSA_P1M1ML12_3MMAvBM_ZN": "Relative excess GDP growth trends",
"MBCSCORE_SA_D_ZN": "Manufacturing confidence changes",
"RIR_NSAvBM_ZN": "Relative real interest rates",
"UNEMPLRATE_SA_DvBM_NEG_ZN": "Relative unemployment trends (negative)",
"XCPIXFE_SA_PARvBM_ZN": "Relative excess core CPI inflation",
"XPGDPTECH_SA_P1M1ML12_3MMAvBM_ZN": "Relative excess GDP deflators",
}
dfx_corr = dfx.copy()
for key, value in renaming_dict.items():
    dfx_corr["xcat"] = dfx_corr["xcat"].str.replace(key, value)
msp.correl_matrix(
dfx_corr,
xcats=list(renaming_dict.values()),
cids=cidx,
start=sdate,
freq="M",
cluster=False,
title="Monthly cross correlation of signal constituent candidates for a panel of seven DM countries since 2000",
size=(14, 8),
)
Signal preparations #
Conceptual parity as benchmark #
The linear_composite method from the macrosynergy package aggregates the individual category scores into a unified (equally weighted) composite indicator, ALL_AVGZ. This composite serves as a benchmark for subsequent analyses, particularly for comparing its performance against signals optimized through machine learning. It thereby becomes a standard reference point, enabling a systematic evaluation of how effectively machine-learning-derived signals capture the underlying dynamics represented by the individual category scores.
# Linear equallyweighted combination of all available candidates
xcatx = cpoz
cidx = cids_fx
dfa = msp.linear_composite(
df=dfx,
xcats=cpoz,
cids=cidx,
new_xcat="ALL_AVGZ",
)
dfx = msm.update_df(dfx, dfa)
Convert data to scikitlearn format #
Downsampling daily information states to a monthly frequency is a common preparation step for machine learning models, especially when dealing with financial and economic time series data. The categories_df() function applies a lag of one month and uses the last value in the month for the explanatory variables and the sum for the aggregated target (return). As explanatory variables, we use the plausible z-scores of economic variables derived earlier and collected in the list cpoz. As a target, we use FXXR_VT10, the FX forward return for a 10% vol target on the dominant cross.
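The downsampling and lagging logic can be illustrated with plain pandas on a toy frame (categories_df additionally handles the long JPMaQS format, blacklisting and multi-indexing):

```python
import numpy as np
import pandas as pd

# Toy daily data: one feature and one return series on business days.
idx = pd.bdate_range("2024-01-01", "2024-03-29")
daily = pd.DataFrame({"feat": np.linspace(0, 1, len(idx)), "ret": 0.01}, index=idx)

# Features as last value of the month (lagged one month), target as monthly sum.
grp = daily.groupby(daily.index.to_period("M"))
monthly = pd.DataFrame({
    "feat": grp["feat"].last().shift(1),  # end-of-month feature, lagged
    "ret": grp["ret"].sum(),              # aggregated next-month target
}).dropna()
print(str(monthly.index[0]))  # 2024-02
```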
cidx = cids_fx
targ = "FXXR_VT10"
xcatx = cpoz + [targ]
# Downsample from daily to monthly frequency (features as last and target as sum)
dfw = msm.categories_df(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    freq="M",
    lag=1,
    blacklist=fxblack,
    xcat_aggs=["last", "sum"],
)
# Drop rows with missing values and assign features and target
dfw.dropna(inplace=True)
X_fx = dfw.iloc[:, :-1]
y_fx = dfw.iloc[:, -1]
Define splitter and scorer #
The RollingKFoldPanelSplit class instantiates splitters where temporally adjacent panel training sets of fixed joint maximum time spans can border the test set from both the past and the future. Thus, most folds do not respect chronological order but allow training with past and future information. While this does not simulate the evolution of information, it makes better use of the available data and is often acceptable for macro data, as economic regimes come in cycles. It is equivalent to scikit-learn's KFold class but adapted for panels.
The standard make_scorer() function from the scikit-learn library is used to create a scorer object that evaluates performance on the test set. The standard r2_score function is used as the underlying metric.
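The cross-validation and scoring mechanics can be sketched on ordinary (non-panel) data with scikit-learn alone; KFold without shuffling plays the role that RollingKFoldPanelSplit plays for panels:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, r2_score
from sklearn.model_selection import KFold, cross_val_score

# Synthetic regression data standing in for the panel features and returns.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=0.1, size=120)

# Unshuffled KFold: temporally adjacent folds border the test set.
r2_scorer = make_scorer(r2_score, greater_is_better=True)
scores = cross_val_score(LinearRegression(), X, y, cv=KFold(n_splits=5), scoring=r2_scorer)
print(scores.shape)  # (5,)
```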
splitter = msl.RollingKFoldPanelSplit(n_splits=5)
scorer = make_scorer(r2_score, greater_is_better=True)
The visualise_splits() method is a convenient way of visualizing the splits produced by the splitter, giving the user confidence that they are appropriate for the use case.
splitter.visualise_splits(X_fx, y_fx)
OLS/NNLS regressionbased learning #
Sequential optimization and analysis #
For a straightforward Ordinary Least Squares (OLS) learning process, employing LinearRegression() from scikit-learn offers a streamlined and efficient approach to regression analysis. When setting up this model for machine learning, two hyperparameter decisions can significantly influence the model's behavior and its interpretation of the data:

Non-negativity constraint: This offers the option of non-negative least squares (NNLS), rather than simple OLS. NNLS imposes the constraint that the coefficients must be non-negative. The benefit of this restriction is that it allows consideration of theoretical priors on the direction of impact, reducing dependence on scarce data.

Inclusion of a regression intercept: Conceptually, the neutral level of all (mostly relative) signal constituent candidates is zero. Hence, the regression intercept is presumed to be zero, albeit that may not always be exact, and some theoretical assumptions may have been wrong.
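The effect of the two hyperparameters can be demonstrated on synthetic data (a sketch, not part of the notebook's pipeline):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data where the second feature has a negative true effect.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 1.0 * X[:, 0] - 1.0 * X[:, 1]

# positive=True yields NNLS: coefficients with the "wrong" sign are
# forced to zero; fit_intercept=False pins the fit through the origin.
nnls = LinearRegression(positive=True, fit_intercept=False).fit(X, y)
print(nnls.coef_.round(2))  # first coefficient near 1, second clipped to exactly 0
```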
# Specify model options and grids
mods_ols = {
"ls": LinearRegression(),
}
grid_ols = {
"ls": {"positive": [True, False], "fit_intercept": [True, False]},
}
The SignalOptimizer class from the macrosynergy package is a sophisticated tool designed to facilitate the optimization and evaluation of different signals (predictors) for forecasting models in a time series context. Leveraging the earlier-defined cross-validation splitter (RollingKFoldPanelSplit), the blacklist periods, the data frames of dependent and independent variables, and a few other optional parameters, we instantiate the SignalOptimizer class.
The calculate_predictions() method is then used to calculate, store and return sequentially optimized signals for a given process. This method implements the nested cross-validation and subsequent signal generation. The name of the process, together with the models to fit, the hyperparameters to search over and a metric to optimize, are provided as compulsory arguments.
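The inner model selection can be sketched with scikit-learn's GridSearchCV on synthetic data; SignalOptimizer performs an analogous search sequentially on the expanding panel:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in for the monthly panel of features and returns.
rng = np.random.default_rng(2)
X = rng.normal(size=(150, 3))
y = X @ np.array([0.4, 0.3, 0.2]) + rng.normal(scale=0.5, size=150)

# Grid over the four OLS/NNLS variants, selected by cross-validated R2.
grid = {"positive": [True, False], "fit_intercept": [True, False]}
gs = GridSearchCV(LinearRegression(), grid, cv=KFold(n_splits=5), scoring="r2")
gs.fit(X, y)
print(sorted(gs.best_params_))  # ['fit_intercept', 'positive']
```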
xcatx = cpoz + ["FXXR_VT10"]
cidx = cids_fx
so_ls = msl.SignalOptimizer(
df = dfx,
xcats = xcatx,
cids = cidx,
blacklist = fxblack,
freq = "M",
lag = 1,
xcat_aggs = ["last", "sum"]
)
so_ls.calculate_predictions(
name = "LS",
models = mods_ols,
hyperparameters = grid_ols,
scorers = {"r2": scorer},
inner_splitters = {"Rolling": splitter},
search_type = "grid",
normalize_fold_results = False,
cv_summary = "mean",
min_cids = 2,
min_periods = 36,
test_size = 1,
n_jobs_outer = 1,
split_functions={"Rolling": lambda n: n // 36},
)
dfa = so_ls.get_optimized_signals()
dfx = msm.update_df(dfx, dfa)
The “heatmap” serves as a powerful tool to illustrate the evolution and selection of predictive models over time. When the visualization indicates that the preferred model becomes the most restrictive one over time, it suggests a shift towards models with tighter constraints or fewer degrees of freedom.
# Illustrate model choice
so_ls.models_heatmap(
"LS",
title="Selected leastsquares models over time, based on crossvalidation and R2",
figsize=(12, 4),
)
# Number of splits used for crossvalidation over time
so_ls.nsplits_timeplot("LS", title="Number of splits used for rolling kfold crossvalidation")
The coefs_timeplot() method is designed for plotting the time series of feature coefficients obtained from the regression models over time. This kind of visualization can be particularly valuable for understanding the dynamics and stability of model coefficients, which is crucial for interpretation and forecasting.
ftrs_dict = {
"CTOT_NSA_PAR_ZN": "Terms-of-trade trend",
"IIPLIABGDP_NSA_D_NEG_ZN": "International liability trends (negative)",
"INFE_JAvBM_ZN": "Relative inflation expectations",
"INTRGDP_NSA_P1M1ML12_3MMAvBM_ZN": "Relative GDP growth trends",
"MBCSCORE_SA_D_ZN": "Manufacturing confidence changes",
"RIR_NSAvBM_ZN": "Relative real interest rates",
"UNEMPLRATE_SA_DvBM_NEG_ZN": "Relative unemployment trends (negative)",
"XCPIXFE_SA_PARvBM_ZN": "Relative excess core inflation",
"XPGDPTECH_SA_P1M1ML12_3MMAvBM_ZN": "Relative excess GDP deflator growth",
}
so_ls.coefs_timeplot("LS", title="Model coefficients of normalized features",ftrs_renamed=ftrs_dict)
The coefs_stackedbarplot() method is another convenient function, generating stacked bar plots to visualize the coefficients of features from regression models. This visualization method effectively displays the magnitude and direction of each feature's influence on the outcome, allowing for easy comparison and interpretation of how different predictors contribute to the model.
so_ls.coefs_stackedbarplot("LS",
title="Standard optimal model coefficients of normalized features",
ftrs_renamed=ftrs_dict)
The
intercepts_timeplot()
function is designed to plot the time series of intercepts from regression models, enabling an analysis of how the model intercepts change over time.
so_ls.intercepts_timeplot("LS",
title="Intercepts of optimal models over time")
To display the values of the OLS-based learning signal LS, we use the view_timelines() function from the macrosynergy package:
sigs_ls = [
"LS",
]
xcatx = sigs_ls
cidx = cids_fx
msp.view_timelines(
dfx,
xcats=xcatx,
cids=cidx,
ncol=4,
start="2003-08-29",
title="FX forward trading signals: signals derived through regressionbased learning",
title_fontsize=30,
same_y=False,
cs_mean=False,
legend_fontsize=16,
)
Signal value checks #
Specs and panel test #
sigs = ["ALL_AVGZ", "LS"]
targs = [targ]
cidx = cids_fx
sdate = "2003-08-29"
dict_ls = {
"sigs": sigs,
"targs": targs,
"cidx": cidx,
"start": sdate,
"black": fxblack,
"srr": None,
"pnls": None,
}
The CategoryRelations class facilitates the visualization and analysis of relationships between two specific categories, namely, two panels of time series data. Here it is used to examine the interaction between the conceptual parity signal ALL_AVGZ and the optimized signal LS on one side, and FXXR_VT10, the FX forward return for a 10% volatility target (dominant cross), on the other. The multiple_reg_scatter method displays correlation scatters for the two pairs.
The plot below shows the significant positive predictive power of the OLS learning-based signals with respect to subsequent quarterly FX returns. The correlation coefficient for the OLS learning-based signal is comparable to that of the conceptual parity version.
dix = dict_ls
sigx = dix["sigs"]
tarx = dix["targs"]
cidx = dix["cidx"]
blax = dix["black"]
start = dix["start"]
def crmaker(sig, targ):
    crx = msp.CategoryRelations(
        dfx,
        xcats=[sig, targ],
        cids=cidx,
        freq="Q",
        lag=1,
        xcat_aggs=["last", "sum"],
        start=start,
        blacklist=blax,
    )
    return crx
lcrs = [crmaker(sig, targ) for sig in sigx for targ in tarx]
msv.multiple_reg_scatter(
lcrs,
ncol=2,
nrow=1,
figsize=(20, 10),
title="Macro signals and subsequent quarterly cumulative FX returns, 2003-2024",
xlab="Month-end macro signal value",
ylab="Cumulative FX forward return, 10% voltargeted position, next quarter",
coef_box="lower right",
prob_est="map",
subplot_titles=["Conceptual parity signals", "OLS-based learning signals"],
)
dix = dict_ls
sig = dix["sigs"][1]
targ = dix["targs"][0]
cidx = dix["cidx"]
blax = dix["black"]
start = dix["start"]
crx = msp.CategoryRelations(
dfx,
xcats=[sig, targ],
cids=cidx,
freq="Q",
lag=1,
xcat_aggs=["last", "sum"],
start=start,
blacklist=blax,
)
crx.reg_scatter(
title="Monthly composite macro signals and subsequent FX returns, 2004-2024 (Jul), 14 currencies",
reg_order=1,
labels=False,
xlab="Month-end macro signal value",
ylab="FX forward return, 10% voltargeted position, next quarter",
coef_box="lower right",
prob_est="map",
separator=2014,
size=(10, 7),
)
Accuracy and correlation check #
The
SignalReturnRelations
class of the
macrosynergy
package is specifically designed to assess the predictive power of signal categories in determining the direction of subsequent returns, particularly for data structured in the JPMaQS format. This class provides a streamlined approach for evaluating how well a given signal can forecast future market movements, offering insights for investment strategies, risk management, and economic analysis. It helps to identify which indicators have more predictive power.
## Compare optimized signals with simple average z-scores
dix = dict_ls
sigx = dix["sigs"]
tarx = dix["targs"]
cidx = dix["cidx"]
blax = dix["black"]
startx = dix["start"]
srr = mss.SignalReturnRelations(
df=dfx,
rets=tarx,
sigs=sigx,
cids=cidx,
cosp=True,
freqs=["M"],
agg_sigs=["last"],
start=startx,
blacklist=blax,
slip=1,
ms_panel_test=True,
)
dix["srr"] = srr
srr = dict_ls["srr"]
selcols = [
"accuracy",
"bal_accuracy",
"pos_sigr",
"pos_retr",
"pearson",
"map_pval",
"kendall",
"kendall_pval",
]
srr.multiple_relations_table().round(3)[selcols]
| Return | Signal | Frequency | Aggregation | accuracy | bal_accuracy | pos_sigr | pos_retr | pearson | map_pval | kendall | kendall_pval |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FXXR_VT10 | ALL_AVGZ | M | last | 0.538 | 0.535 | 0.595 | 0.52 | 0.085 | 0.000 | 0.059 | 0.0 |
| FXXR_VT10 | LS | M | last | 0.540 | 0.537 | 0.628 | 0.52 | 0.077 | 0.001 | 0.060 | 0.0 |
Naive PnL #
In this analysis, the notebook calculates the naive profit and loss (PnL) of foreign exchange (FX) strategies based on both the conceptual parity and the OLS-based learning indicators. These PnLs are derived from simple trading rules that interpret the indicators as directional signals: a positive signal suggests buying (going long) the FX forward, and a negative signal suggests selling (going short). This allows a direct evaluation of how effectively each indicator predicts profitable trading opportunities in the FX market.
Please refer to the NaivePnL() class for details.
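As a rough illustration of what "naive" means here, the following sketch uses hypothetical signal and return series. It is not the library's exact calculation, which additionally normalizes, winsorizes, and volatility-targets the signal:

```python
import pandas as pd

# Hypothetical monthly signal values and concurrent % returns
signal = pd.Series([0.5, -1.0, 0.8, -0.3, 1.2, 0.4])
ret = pd.Series([1.0, -0.6, 0.9, 0.5, -0.4, 1.1])

# Position is the signal observed at the previous rebalancing date
# (a one-period lag, analogous to rebal_slip=1)
position = signal.shift(1)

# Naive PnL: position times the next period's return, no transaction costs
pnl = (position * ret).dropna()
cum_pnl = pnl.cumsum()  # cumulative naive PnL
```

The signal's sign sets direction and its magnitude sets position size, which is why signal normalization choices (sig_op) matter for the results below.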
dix = dict_ls
sigx = dix["sigs"]
tarx = dix["targs"]
cidx = dix["cidx"]
blax = dix["black"]
startx = dix["start"]
pnls = msn.NaivePnL(
df=dfx,
ret=tarx[0],
sigs=sigx,
cids=cidx,
start=startx,
blacklist=blax,
bms=["USD_GB10YXR_NSA", "EUR_FXXR_NSA", "USD_EQXR_NSA"],
)
for sig in sigx:
    pnls.make_pnl(
        sig=sig,
        sig_op="zn_score_pan",
        rebal_freq="monthly",
        neutral="zero",
        rebal_slip=1,
        vol_scale=10,
        thresh=4,
    )
pnls.make_long_pnl(vol_scale=10, label="Long only")
dix["pnls"] = pnls
The PnLs are plotted using the .plot_pnls() method of the NaivePnL class. The performance characteristics of the learning-based signal are encouraging. One should consider that the strategy has only seven markets to trade, some of which are highly correlated. Also, the strategy uses only macro signals, without any consideration of market prices and volumes. Position changes are infrequent and gentle.
pnls = dix["pnls"]
sigx = dix["sigs"]
pnls.plot_pnls(
title="Developed market FX forward trading signals: conceptual parity versus OLSbased learning",
title_fontsize=14,
xcat_labels=["Conceptual parity", "OLS-based learning", "Long only (small currencies)"],
)
pnls.evaluate_pnls(pnl_cats=["PNL_" + sig for sig in sigx] + ["Long only"])
| xcat | PNL_ALL_AVGZ | PNL_LS | Long only |
| --- | --- | --- | --- |
| Return % | 7.911054 | 5.656404 | 0.439705 |
| St. Dev. % | 10.0 | 10.0 | 10.0 |
| Sharpe Ratio | 0.791105 | 0.56564 | 0.04397 |
| Sortino Ratio | 1.128184 | 0.788986 | 0.059573 |
| Max 21-Day Draw % | 14.493673 | 20.516188 | 24.440545 |
| Max 6-Month Draw % | 22.999752 | 14.374759 | 25.769963 |
| Peak to Trough Draw % | 27.131372 | 22.585145 | 65.565108 |
| Top 5% Monthly PnL Share | 0.470927 | 0.640821 | 8.641919 |
| USD_GB10YXR_NSA correl | 0.074335 | 0.076482 | 0.061182 |
| EUR_FXXR_NSA correl | 0.235003 | 0.219415 | 0.475226 |
| USD_EQXR_NSA correl | 0.228419 | 0.130705 | 0.326562 |
| Traded Months | 256 | 256 | 256 |
Correcting linear models for statistical precision #
A notable issue with using regression-based learning as the sole component of a macro trading signal is that macroeconomic trends take time to form, meaning that model coefficients (and predictions) exhibit greater variability in early years, before the correlations have stabilised. As a consequence, absolute model coefficients tend to decrease over time, producing smaller signals. This effect is exaggerated by the NaivePnL calculation, because the signal considered by the PnL is a winsorized z-score whose standard deviation is updated over time.
One way of mitigating this is to adjust the linear model coefficients by their estimated standard errors. The resulting factor model has coefficients that account for the statistical precision of the parameter estimates, which improves with sample size.
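The idea can be sketched as follows on simulated data. This is a simplified illustration of precision adjustment, not ModifiedLinearRegression's exact formula:

```python
import numpy as np

# Simulated panel: 120 observations, 3 features, noisy linear target
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
y = X @ np.array([0.5, 0.0, -0.3]) + rng.normal(scale=2.0, size=120)

# Ordinary least squares fit
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
n, k = X.shape

# Analytical OLS standard errors under homoskedasticity
sigma2 = resid @ resid / (n - k)
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

# Precision adjustment: divide each coefficient by its standard error
# (plus a small offset for numerical stability), yielding t-statistic-like
# coefficients; imprecise early-sample estimates are shrunk toward zero
error_offset = 1e-5
beta_adj = beta / (se + error_offset)
```

Since standard errors shrink as the sample grows, the adjusted coefficients naturally scale up with accumulating evidence, counteracting the early-sample variability discussed above.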
OLS/NNLS: analytical standard error adjustment #
# Specify model options and grids
mods_mls = {
"mls": msl.ModifiedLinearRegression(method="analytic", error_offset=1e-5),
}
grid_mls = {
"mls": {"positive": [True, False], "fit_intercept": [True, False]},
}
xcatx = cpoz + ["FXXR_VT10"]
cidx = cids_fx
so_mls = msl.SignalOptimizer(
df = dfx,
xcats = xcatx,
cids = cidx,
blacklist = fxblack,
freq = "M",
lag = 1,
xcat_aggs = ["last", "sum"]
)
so_mls.calculate_predictions(
name = "MLS_analytic",
models = mods_mls,
hyperparameters = grid_mls,
scorers = {"r2": scorer},
inner_splitters = {"Rolling": splitter},
search_type = "grid",
normalize_fold_results = False,
cv_summary = "mean",
min_cids = 2,
min_periods = 36,
test_size = 1,
n_jobs_outer = 1,
split_functions={"Rolling": lambda n: n // 36},
)
dfa = so_mls.get_optimized_signals()
dfx = msm.update_df(dfx, dfa)
# Illustrate model choice
so_mls.models_heatmap(
"MLS_analytic",
title="Modified least squares, analytical standard error adjustment, based on cross-validation and R2",
figsize=(12, 3),
)
so_mls.coefs_stackedbarplot(
"MLS_analytic",
title="Modified optimal model coefficients of normalized features",
ftrs_renamed=ftrs_dict,
)
OLS/NNLS: White’s estimator for standard errors #
The usual standard error expressions rest on assumptions that are often violated in practice, particularly the assumption of constant error variance. White's estimator is a standard error estimator that accounts for heteroskedasticity. We implement the HC3 version of this estimator.
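A minimal sketch of the HC3 computation on simulated data (illustrative only; the macrosynergy implementation may differ in details):

```python
import numpy as np

# Simulated regression with heteroskedastic noise (variance grows with |x1|)
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = X @ np.array([1.0, -0.5]) + rng.normal(size=100) * (1 + np.abs(X[:, 0]))

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)

# Leverage: diagonal of the hat matrix X (X'X)^{-1} X'
h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

# HC3: squared residuals inflated by (1 - leverage)^2, penalizing
# high-leverage observations more than the HC0/HC1 variants
omega = (resid / (1 - h)) ** 2

# Sandwich estimator: (X'X)^{-1} X' diag(omega) X (X'X)^{-1}
cov_hc3 = XtX_inv @ (X.T * omega) @ X @ XtX_inv
se_hc3 = np.sqrt(np.diag(cov_hc3))
```

These robust standard errors then replace the homoskedastic ones in the coefficient adjustment.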
# Specify model options and grids
mods_mls = {
"mls": msl.ModifiedLinearRegression(method="analytic", analytic_method="White", error_offset=1e-5),
}
grid_mls = {
"mls": {"positive": [True, False], "fit_intercept": [True, False]},
}
so_mls.calculate_predictions(
name = "MLS_white",
models = mods_mls,
hyperparameters = grid_mls,
scorers = {"r2": scorer},
inner_splitters = {"Rolling": splitter},
search_type = "grid",
normalize_fold_results = False,
cv_summary = "mean",
min_cids = 2,
min_periods = 36,
test_size = 1,
n_jobs_outer = 1,
split_functions={"Rolling": lambda n: n // 36},
)
dfa = so_mls.get_optimized_signals(name="MLS_white")
dfx = msm.update_df(dfx, dfa)
OLS/NNLS: panel bootstrap standard error adjustment #
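Instead of an analytical formula, standard errors can be estimated by refitting the regression on resampled data and taking the dispersion of coefficients across resamples. The sketch below uses plain row resampling on simulated data; the library's panel bootstrap resamples cross-sections and time periods:

```python
import numpy as np

# Simulated data: 150 observations, 2 features
rng = np.random.default_rng(2)
X = rng.normal(size=(150, 2))
y = X @ np.array([0.8, -0.2]) + rng.normal(size=150)

def ols(Xb, yb):
    beta, *_ = np.linalg.lstsq(Xb, yb, rcond=None)
    return beta

# Refit on 100 bootstrap resamples (rows drawn with replacement)
n_iters = 100
boot = np.array([
    ols(X[idx], y[idx])
    for idx in (rng.integers(0, len(y), len(y)) for _ in range(n_iters))
])

# Bootstrap standard errors: std of coefficients across resamples
se_boot = boot.std(axis=0, ddof=1)
```

The bootstrap avoids distributional assumptions at the cost of extra computation, which is why bootstrap_iters is kept moderate below.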
# Specify model options and grids
mods_mls = {
"mls": msl.ModifiedLinearRegression(method="bootstrap", bootstrap_iters=100, error_offset=1e-5),
}
grid_mls = {
"mls": {"positive": [True, False], "fit_intercept": [True, False]},
}
so_mls.calculate_predictions(
name = "MLS_bootstrap",
models = mods_mls,
hyperparameters = grid_mls,
scorers = {"r2": scorer},
inner_splitters = {"Rolling": splitter},
search_type = "grid",
normalize_fold_results = False,
cv_summary = "mean",
min_cids = 2,
min_periods = 36,
test_size = 1,
n_jobs_outer = 1,
split_functions={"Rolling": lambda n: n // 36},
)
dfa = so_mls.get_optimized_signals(name="MLS_bootstrap")
dfx = msm.update_df(dfx, dfa)
# Illustrate model choice
so_mls.models_heatmap(
"MLS_bootstrap",
title="Modified least squares, bootstrap standard error adjustment, based on cross-validation and R2",
figsize=(12, 3),
)
so_mls.coefs_stackedbarplot("MLS_bootstrap", title="Model coefficients of modified least squares features, panel bootstrap", ftrs_renamed=ftrs_dict)
Comparison #
cidx = cids_fx
sigx = [
"LS",
"MLS_analytic",
"MLS_white",
"MLS_bootstrap",
]
pnl = msn.NaivePnL(
df=dfx,
ret=targ,
cids=cidx,
sigs=sigx,
blacklist=fxblack,
start="2004-01-01",
bms=["USD_GB10YXR_NSA", "EUR_FXXR_NSA", "USD_EQXR_NSA"],
)
pnl.make_long_pnl(vol_scale=10, label="Long")
for sig in sigx:
    pnl.make_pnl(
        sig=sig,
        sig_op="raw",
        rebal_freq="monthly",
        rebal_slip=1,
        vol_scale=10,
        thresh=5,
    )
pnames = ["PNL_" + sig for sig in sigx] # + ["Long"]
pnl.plot_pnls(
pnl_cats=pnames,
title="Global FX forward PnLs: simple signals with volatility scaling",
xcat_labels=[
"Regular OLS coefficients",
"Modified coefficients (analytic)",
"Modified coefficients (White)",
"Modified coefficients (bootstrap)",
],
title_fontsize = 16,
figsize=(14, 8),
)
tbl = pnl.evaluate_pnls(pnl_cats=pnames)
display(tbl)
| xcat | PNL_LS | PNL_MLS_analytic | PNL_MLS_white | PNL_MLS_bootstrap |
| --- | --- | --- | --- | --- |
| Return % | 4.622696 | 5.706916 | 5.629832 | 5.521259 |
| St. Dev. % | 10.0 | 10.0 | 10.0 | 10.0 |
| Sharpe Ratio | 0.46227 | 0.570692 | 0.562983 | 0.552126 |
| Sortino Ratio | 0.637643 | 0.798898 | 0.78809 | 0.769405 |
| Max 21-Day Draw % | 22.173093 | 19.451132 | 19.416882 | 18.503075 |
| Max 6-Month Draw % | 13.87266 | 16.255422 | 16.354009 | 17.28836 |
| Peak to Trough Draw % | 22.247391 | 25.561659 | 25.121975 | 26.166283 |
| Top 5% Monthly PnL Share | 0.821211 | 0.666593 | 0.676744 | 0.690975 |
| USD_GB10YXR_NSA correl | 0.072766 | 0.060082 | 0.06376 | 0.066039 |
| EUR_FXXR_NSA correl | 0.232128 | 0.133802 | 0.137822 | 0.163447 |
| USD_EQXR_NSA correl | 0.141826 | 0.087774 | 0.090009 | 0.111772 |
| Traded Months | 251 | 251 | 251 | 251 |
cidx = cids_fx
sigx = [
"LS",
"MLS_analytic",
"MLS_bootstrap",
]
pnl = msn.NaivePnL(
df=dfx,
ret=targ,
cids=cidx,
sigs=sigx,
blacklist=fxblack,
start="2003-01-01",
bms=["USD_GB10YXR_NSA", "EUR_FXXR_NSA", "USD_EQXR_NSA"],
)
pnl.make_long_pnl(vol_scale=10, label="Long")
for sig in sigx:
    pnl.make_pnl(
        sig=sig,
        sig_op="zn_score_pan",
        min_obs=22 * 6,  # minimum required data for normalization
        iis=False,  # no in-sample bias
        rebal_freq="monthly",
        rebal_slip=1,
        vol_scale=10,
        thresh=5,
    )
pnames = ["PNL_" + sig for sig in sigx] # + ["Long"]
pnl.plot_pnls(
pnl_cats=pnames,
title="Global FX forward PnLs: normalized signals with volatility scaling",
xcat_labels=[
"Regular OLS coefficients",
"Modified coefficients (analytic)",
"Modified coefficients (bootstrap)",
],
title_fontsize = 16,
figsize=(14, 8),
)
pnl.evaluate_pnls(pnl_cats=pnames)
| xcat | PNL_LS | PNL_MLS_analytic | PNL_MLS_bootstrap |
| --- | --- | --- | --- |
| Return % | 4.885191 | 5.655945 | 5.697481 |
| St. Dev. % | 10.0 | 10.0 | 10.0 |
| Sharpe Ratio | 0.488519 | 0.565595 | 0.569748 |
| Sortino Ratio | 0.678266 | 0.79481 | 0.798485 |
| Max 21-Day Draw % | 20.791026 | 18.830835 | 18.608962 |
| Max 6-Month Draw % | 14.567325 | 22.942042 | 23.071133 |
| Peak to Trough Draw % | 22.887699 | 27.083571 | 26.894492 |
| Top 5% Monthly PnL Share | 0.758648 | 0.696608 | 0.699196 |
| USD_GB10YXR_NSA correl | 0.082912 | 0.046174 | 0.055041 |
| EUR_FXXR_NSA correl | 0.198598 | 0.082414 | 0.109588 |
| USD_EQXR_NSA correl | 0.135383 | 0.04544 | 0.070796 |
| Traded Months | 249 | 249 | 249 |
cidx = cids_fx
sigx = [
"LS",
"MLS_analytic",
"MLS_bootstrap",
]
pnl = msn.NaivePnL(
df=dfx,
ret=targ,
cids=cidx,
sigs=sigx,
blacklist=fxblack,
start="2003-01-01",
bms=["USD_GB10YXR_NSA", "EUR_FXXR_NSA", "USD_EQXR_NSA"],
)
pnl.make_long_pnl(vol_scale=10, label="Long")
for sig in sigx:
    pnl.make_pnl(
        sig=sig,
        sig_op="zn_score_pan",
        min_obs=22 * 6,  # minimum required data for normalization
        iis=False,  # no in-sample bias
        rebal_freq="monthly",
        rebal_slip=1,
        vol_scale=None,
        thresh=5,
    )
pnames = ["PNL_" + sig for sig in sigx] # + ["Long"]
pnl.plot_pnls(
pnl_cats=pnames,
title="Global FX forward PnLs: normalized signals without volatility scaling",
xcat_labels=[
"Regular OLS coefficients",
"Modified coefficients (analytic)",
"Modified coefficients (bootstrap)",
],
title_fontsize = 16,
figsize=(14, 8),
)
tbl = pnl.evaluate_pnls(pnl_cats=pnames)
display(tbl)
| xcat | PNL_LS | PNL_MLS_analytic | PNL_MLS_bootstrap |
| --- | --- | --- | --- |
| Return % | 27.518974 | 48.187792 | 46.271209 |
| St. Dev. % | 56.331421 | 85.198476 | 81.213456 |
| Sharpe Ratio | 0.488519 | 0.565595 | 0.569748 |
| Sortino Ratio | 0.678266 | 0.79481 | 0.798485 |
| Max 21-Day Draw % | 117.118806 | 160.435849 | 151.129815 |
| Max 6-Month Draw % | 82.059814 | 195.462699 | 187.368647 |
| Peak to Trough Draw % | 128.929664 | 230.747895 | 218.419464 |
| Top 5% Monthly PnL Share | 0.758648 | 0.696608 | 0.699196 |
| USD_GB10YXR_NSA correl | 0.082912 | 0.046174 | 0.055041 |
| EUR_FXXR_NSA correl | 0.198598 | 0.082414 | 0.109588 |
| USD_EQXR_NSA correl | 0.135383 | 0.04544 | 0.070796 |
| Traded Months | 249 | 249 | 249 |
The coefficients of the unadjusted OLS-based signal diminished noticeably after 2010. Comparing the adjusted and unadjusted OLS signals over this period reveals substantial outperformance of the adjusted signal.
cidx = cids_fx
sigx = [
"LS",
"MLS_analytic",
]
pnl = msn.NaivePnL(
df=dfx,
ret=targ,
cids=cidx,
sigs=sigx,
blacklist=fxblack,
start="2010-01-01",
bms=["USD_GB10YXR_NSA", "EUR_FXXR_NSA", "USD_EQXR_NSA"],
)
pnl.make_long_pnl(vol_scale=10, label="Long")
for sig in sigx:
    pnl.make_pnl(
        sig=sig,
        sig_op="raw",
        min_obs=22 * 6,  # minimum required data for normalization
        iis=False,  # no in-sample bias
        rebal_freq="monthly",
        rebal_slip=1,
        vol_scale=10,
        thresh=5,
    )
pnames = ["PNL_" + sig for sig in sigx]
pnl.plot_pnls(
pnl_cats=pnames,
title="Global FX forward PnLs: raw signals with volatility scaling, post-2010",
xcat_labels=[
"Regular OLS coefficients",
"Modified coefficients (analytic)",
],
title_fontsize = 16,
figsize=(14, 8),
)
pnl.evaluate_pnls(pnl_cats=pnames)
| xcat | PNL_LS | PNL_MLS_analytic |
| --- | --- | --- |
| Return % | 3.734507 | 5.51176 |
| St. Dev. % | 10.0 | 10.0 |
| Sharpe Ratio | 0.373451 | 0.551176 |
| Sortino Ratio | 0.506366 | 0.772839 |
| Max 21-Day Draw % | 27.868957 | 19.225611 |
| Max 6-Month Draw % | 17.198059 | 15.906944 |
| Peak to Trough Draw % | 27.868957 | 25.265292 |
| Top 5% Monthly PnL Share | 1.102725 | 0.699537 |
| USD_GB10YXR_NSA correl | 0.104801 | 0.07624 |
| EUR_FXXR_NSA correl | 0.231305 | 0.124606 |
| USD_EQXR_NSA correl | 0.181072 | 0.095518 |
| Traded Months | 179 | 179 |
sigx = [
"LS",
"MLS_analytic",
]
cidx = cids_fx
pnl = msn.NaivePnL(
df=dfx,
ret=targ,
cids=cidx,
sigs=sigx,
blacklist=fxblack,
start="2010-01-01",
bms=["USD_GB10YXR_NSA", "EUR_FXXR_NSA", "USD_EQXR_NSA"],
)
for sig in sigx:
    pnl.make_pnl(
        sig=sig,
        sig_op="zn_score_pan",
        min_obs=22 * 6,  # minimum required data for normalization
        iis=False,  # no in-sample bias
        rebal_freq="monthly",
        rebal_slip=1,
        vol_scale=10,
        thresh=5,
    )
pnames = ["PNL_" + sig for sig in sigx]
pnl.plot_pnls(
pnl_cats=pnames,
title="Global FX forward PnLs: normalized signals with volatility scaling, post-2010",
xcat_labels=[
"Regular OLS coefficients",
"Modified coefficients (analytic)",
],
title_fontsize = 16,
figsize=(14, 8),
)
pnl.evaluate_pnls(pnl_cats=pnames)
| xcat | PNL_LS | PNL_MLS_analytic |
| --- | --- | --- |
| Return % | 4.732374 | 6.244401 |
| St. Dev. % | 10.0 | 10.0 |
| Sharpe Ratio | 0.473237 | 0.62444 |
| Sortino Ratio | 0.656896 | 0.888454 |
| Max 21-Day Draw % | 15.334317 | 14.142899 |
| Max 6-Month Draw % | 18.908195 | 13.05028 |
| Peak to Trough Draw % | 28.396048 | 19.702176 |
| Top 5% Monthly PnL Share | 0.85449 | 0.715819 |
| USD_GB10YXR_NSA correl | 0.063187 | 0.043461 |
| EUR_FXXR_NSA correl | 0.163851 | 0.044889 |
| USD_EQXR_NSA correl | 0.10534 | 0.015374 |
| Traded Months | 173 | 173 |