{ "cells": [ { "cell_type": "markdown", "id": "981c29fe", "metadata": {}, "source": [ "# Introduction to the Macrosynergy package: The \"Learning\" module" ] }, { "cell_type": "markdown", "id": "717daee7", "metadata": {}, "source": [ "The `macrosynergy.learning` subpackage provides functions and classes to help create statistical machine learning signals from panels of JPMaQS data. \n", "It is built to integrate the `macrosynergy` package with the widely used `scikit-learn` library." ] }, { "cell_type": "markdown", "id": "137e71c1", "metadata": {}, "source": [ "Most standard `scikit-learn` classes do not work directly with panel data. \n", "`macrosynergy.learning` provides wrappers that respect the cross-section and time indexing of quantamental dataframes and enable the use of `scikit-learn` models, feature selection, cross-validation, and metrics in a panel-friendly way.\n", "\n", "See also the introductory notebooks where `macrosynergy.learning` is applied:\n", "- [Optimizing macro trading signals - A practical introduction](https://research.macrosynergy.com/optimal-signals/) \n", "- [Regression-based macro trading signals](https://research.macrosynergy.com/regression-signals/)\n" ] }, { "cell_type": "markdown", "id": "3ca39520", "metadata": {}, "source": [ "#### Features (x)\n", "For this notebook, we build a monthly dataset with features lagged by one month. For the daily z-scores, we take the last recorded value of each month:\n", "\n", "- `XGDP_NEG`: negative of growth trend. \n", "- `XCPI_NEG`: negative of excess inflation measure. \n", "- `XPCG_NEG`: negative of excess private credit growth. \n", "- `RYLDIRS05Y_NSA`: real IRS yield, 5-year maturity (expectations-based).\n", "\n", "\n", "#### Target (y)\n", "The target is a monthly aggregated return, created by summing daily returns for each month. 
\n", "Here, we focus on the return of a fixed receiver position in 5Y IRS (`DU05YXR_VT10`), scaled to a 10% annualized volatility target.\n", "\n" ] }, { "cell_type": "markdown", "id": "25ce8088", "metadata": {}, "source": [ "The first step is converting a quantamental dataframe into a wide format: \n", "- Columns = indicators or factors. \n", "- Rows = identified by `cid` (cross-section) and `real_date`. \n", "- Implemented via `categories_df` from `macrosynergy.management`.\n", "\n", "This function also supports:\n", "- Downsampling \n", "- Feature lagging\n", "- Dropping rows with nulls \n", "\n", "Both `SignalOptimizer` and `BetaEstimator` classes use this conversion internally." ] }, { "cell_type": "markdown", "id": "8b28826d", "metadata": {}, "source": [ "Here, we prepare the macroeconomic dataset. We set up the currency universe, select the required JPMaQS categories (including the target series and inputs for an FX blacklist), and construct the `FXBLACK` series to filter out untradeable currencies. We then build several derived macro factors (growth, inflation, credit, and real rate measures), merge them back into the panel, and standardize them across countries with cross-sectional z-scores. These normalized macro signals form the inputs for the later learning and signaling analysis.\n", "\n", "Further details on how the raw JPMaQS data is accessed and structured are provided in [this notebook](https://macrosynergy.com/academy/notebooks/check-out-jpmaqs/)." 
] }, { "cell_type": "code", "execution_count": 32, "id": "180091b6", "metadata": {}, "outputs": [], "source": [ "import os\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "\n", "import macrosynergy.management as msm \n", "import macrosynergy.panel as msp\n", "import macrosynergy.signal as mss\n", "import macrosynergy.pnl as msn\n", "import macrosynergy.visuals as msv\n", "import macrosynergy.learning as msl\n", "\n", "from macrosynergy.download import JPMaQSDownload\n", "\n", "from sklearn.pipeline import Pipeline\n", "from sklearn.linear_model import LinearRegression, Ridge\n", "from sklearn.metrics import make_scorer, r2_score" ] }, { "cell_type": "code", "execution_count": 33, "id": "6b059850", "metadata": {}, "outputs": [], "source": [ "# Cross-sections (cids) used throughout\n", "cids_dm = [\"AUD\", \"CAD\", \"CHF\", \"EUR\", \"GBP\", \"JPY\", \"NOK\", \"NZD\", \"SEK\", \"USD\"]\n", "cids_em = [\"CLP\", \"COP\", \"CZK\", \"HUF\", \"IDR\", \"ILS\", \"INR\", \"KRW\", \"MXN\", \"PLN\", \"THB\", \"TRY\", \"TWD\", \"ZAR\"]\n", "cids = cids_dm + cids_em\n", "\n", "cids_dux = list(set(cids) - set([\"IDR\", \"NZD\"]))" ] }, { "cell_type": "code", "execution_count": 34, "id": "5efe358d", "metadata": {}, "outputs": [], "source": [ "# Minimal set of JPMaQS categories required to recreate dfx, macro factors, and fxblack\n", "\n", "raw_xcats_for_calcs = [\n", " \"INTRGDPv5Y_NSA_P1M1ML12_3MMA\",\n", " \"CPIC_SJA_P6M6ML6AR\",\n", " \"CPIH_SA_P1M1ML12\",\n", " \"INFTEFF_NSA\",\n", " \"PCREDITBN_SJA_P1M1ML12\",\n", " \"RGDP_SA_P1Q1QL4_20QMA\",\n", " \"RYLDIRS05Y_NSA\",\n", " \"INTRGDP_NSA_P1M1ML12_3MMA\",\n", "]\n", "\n", "# The target category used in the learning_to_before_signaling notebook\n", "targets_needed = [\n", " \"DU05YXR_VT10\"\n", "]\n", "\n", "# Categories needed to build the FX blacklist\n", "fx_blacklist_inputs = [\n", " 
\"FXTARGETED_NSA\", \n", " \"FXUNTRADABLE_NSA\"\n", "]" ] }, { "cell_type": "code", "execution_count": 35, "id": "e2fe8a1e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Downloading data from JPMaQS.\n", "Timestamp UTC: 2025-09-04 12:08:03\n", "Connection successful!\n", "Some expressions are missing from the downloaded data. Check logger output for complete list.\n", "11 out of 312 expressions are missing. To download the catalogue of all available expressions and filter the unavailable expressions, set `get_catalogue=True` in the call to `JPMaQSDownload.download()`.\n" ] } ], "source": [ "xcats_to_download = sorted(set(raw_xcats_for_calcs + targets_needed + fx_blacklist_inputs)) + [\"FXXR_NSA\", \"EQXR_NSA\"]\n", "\n", "\n", "dwn = JPMaQSDownload(\n", " client_id=os.environ.get(\"JPM_CLIENT_ID\", \"\"),\n", " client_secret=os.environ.get(\"JPM_CLIENT_SECRET\", \"\"),\n", " oauth=True,\n", ")\n", "\n", "df = dwn.download(xcats=xcats_to_download, cids=cids)" ] }, { "cell_type": "code", "execution_count": 36, "id": "b4ad9bb9", "metadata": {}, "outputs": [], "source": [ "\n", "# Build fxblack (FX blacklist) \n", "\n", "dfb = df[df[\"xcat\"].isin([\"FXTARGETED_NSA\", \"FXUNTRADABLE_NSA\"])][[\"cid\", \"xcat\", \"real_date\", \"value\"]]\n", "dfba = (\n", " dfb.groupby([\"cid\", \"real_date\"])\n", " .aggregate(value=pd.NamedAgg(column=\"value\", aggfunc=\"max\"))\n", " .reset_index()\n", ")\n", "dfba[\"xcat\"] = \"FXBLACK\"\n", "fxblack = msp.make_blacklist(dfba, \"FXBLACK\")\n", "\n", "\n", "# Recreate dfx and macro factors from the intro notebook\n", "dfx = df.copy()\n", "\n", "\n", "calcs = [\n", " # intuitive growth trend\n", " \"XGDP_NEG = - INTRGDPv5Y_NSA_P1M1ML12_3MMA\",\n", " # excess inflation measure\n", " \"XCPI_NEG = - ( CPIC_SJA_P6M6ML6AR + CPIH_SA_P1M1ML12 ) / 2 + INFTEFF_NSA\",\n", " # excess private credit growth\n", " \"XPCG_NEG = - PCREDITBN_SJA_P1M1ML12 + INFTEFF_NSA + RGDP_SA_P1Q1QL4_20QMA\",\n", " # excess 
real interest rate\n", " \"XRYLD = RYLDIRS05Y_NSA - INTRGDP_NSA_P1M1ML12_3MMA\",\n", " # combined real rate + inflation gap\n", " \"XXRYLD = XRYLD + XCPI_NEG\",\n", "]\n", "\n", "dfa = msp.panel_calculator(dfx, calcs=calcs, cids=cids)\n", "dfx = msm.update_df(df=dfx, df_add=dfa)\n", "\n", "# Create cross-sectional z-scores for the macro panels (ZN4), as used later\n", "macros = [\"XGDP_NEG\", \"XCPI_NEG\", \"XPCG_NEG\", \"RYLDIRS05Y_NSA\"]\n", "for xc in macros:\n", " dzn = msp.make_zn_scores(\n", " dfx,\n", " xcat=xc,\n", " cids=cids,\n", " neutral=\"zero\",\n", " thresh=3,\n", " est_freq=\"M\",\n", " pan_weight=1,\n", " postfix=\"_ZN4\",\n", " )\n", " dfx = msm.update_df(dfx, dzn)\n", "\n", "# the list of normalized macro factors referenced downstream\n", "macroz = [m + \"_ZN4\" for m in macros]\n", "xcatx=macros" ] }, { "cell_type": "code", "execution_count": 37, "id": "ba1a6812", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| cid | real_date | XGDP_NEG | XCPI_NEG | XPCG_NEG |
|---|---|---|---|---|
| AUD | 2000-02-29 | -0.127516 | -0.162771 | -2.316805 |
|  | 2000-03-31 | 0.188010 | -0.162771 | -2.316805 |
|  | 2000-04-28 | 0.033589 | -0.162771 | -3.137645 |
|  | 2000-05-31 | 0.175323 | -0.676674 | -2.763879 |
|  | 2000-06-30 | 0.205179 | -0.676674 | -2.422330 |
| ... | ... | ... | ... | ... |
| ZAR | 2025-05-30 | -0.426351 | 1.882825 | 1.799903 |
|  | 2025-06-30 | -0.030835 | 1.777107 | 0.718399 |
|  | 2025-07-31 | 0.399231 | 1.732005 | 0.136673 |
|  | 2025-08-29 | 0.213512 | 1.568641 | 0.322765 |
|  | 2025-09-30 | -0.369677 | 1.279459 | -0.600531 |

5444 rows × 3 columns

| cid | real_date | XGDP_NEG | XCPI_NEG | XPCG_NEG |
|---|---|---|---|---|
| AUD | 2000-02-29 | -0.127516 | -0.162771 | -2.316805 |
|  | 2000-03-31 | 0.188010 | -0.162771 | -2.316805 |
|  | 2000-04-28 | 0.033589 | -0.162771 | -3.137645 |
|  | 2000-05-31 | 0.175323 | -0.676674 | -2.763879 |
|  | 2000-06-30 | 0.205179 | -0.676674 | -2.422330 |
| ... | ... | ... | ... | ... |
| ZAR | 2025-05-30 | -0.426351 | 1.882825 | 1.799903 |
|  | 2025-06-30 | -0.030835 | 1.777107 | 0.718399 |
|  | 2025-07-31 | 0.399231 | 1.732005 | 0.136673 |
|  | 2025-08-29 | 0.213512 | 1.568641 | 0.322765 |
|  | 2025-09-30 | -0.369677 | 1.279459 | -0.600531 |

5444 rows × 3 columns

| cid | real_date | XGDP_NEG | XCPI_NEG | XPCG_NEG |
|---|---|---|---|---|
| AUD | 2000-02-29 | -0.217460 | 0.047128 | -0.058952 |
|  | 2000-03-31 | -0.105518 | 0.047128 | -0.058952 |
|  | 2000-04-28 | -0.160303 | 0.047128 | -0.198993 |
|  | 2000-05-31 | -0.110019 | -0.201634 | -0.135226 |
|  | 2000-06-30 | -0.099427 | -0.201634 | -0.076956 |
| ... | ... | ... | ... | ... |
| ZAR | 2025-05-30 | -0.323480 | 1.037329 | 0.643387 |
|  | 2025-06-30 | -0.183160 | 0.986155 | 0.458875 |
|  | 2025-07-31 | -0.030581 | 0.964323 | 0.359628 |
|  | 2025-08-29 | -0.096470 | 0.885244 | 0.391377 |
|  | 2025-09-30 | -0.303374 | 0.745261 | 0.233856 |

5444 rows × 3 columns

| cid | real_date | PCA 1 | PCA 2 |
|---|---|---|---|
| AUD | 2000-02-29 | 0.070187 | -0.219199 |
|  | 2000-03-31 | 0.039984 | -0.117288 |
|  | 2000-04-28 | 0.154591 | -0.171389 |
|  | 2000-05-31 | 0.256607 | -0.021035 |
|  | 2000-06-30 | 0.212212 | -0.009634 |
| ... | ... | ... | ... |
| ZAR | 2025-05-30 | -1.042873 | -0.703141 |
|  | 2025-06-30 | -0.916078 | -0.559843 |
|  | 2025-07-31 | -0.872366 | -0.414922 |
|  | 2025-08-29 | -0.826028 | -0.441317 |
|  | 2025-09-30 | -0.567299 | -0.576667 |

5444 rows × 2 columns

| cid | real_date | XGDP_NEG | XCPI_NEG | XPCG_NEG |
|---|---|---|---|---|
| AUD | 2000-02-29 | -0.127516 | -0.162771 | -2.316805 |
|  | 2000-03-31 | 0.188010 | -0.162771 | -2.316805 |
|  | 2000-04-28 | 0.033589 | -0.162771 | -3.137645 |
|  | 2000-05-31 | 0.175323 | -0.676674 | -2.763879 |
|  | 2000-06-30 | 0.205179 | -0.676674 | -2.422330 |
| ... | ... | ... | ... | ... |
| ZAR | 2025-05-30 | -0.426351 | 1.882825 | 1.799903 |
|  | 2025-06-30 | -0.030835 | 1.777107 | 0.718399 |
|  | 2025-07-31 | 0.399231 | 1.732005 | 0.136673 |
|  | 2025-08-29 | 0.213512 | 1.568641 | 0.322765 |
|  | 2025-09-30 | -0.369677 | 1.279459 | -0.600531 |

5444 rows × 3 columns

| cid | real_date | EQXR_NSA |
|---|---|---|
| AUDvUSD | 2000-01-03 | -1.172349 |
|  | 2000-01-04 | -3.749659 |
|  | 2000-01-05 | 0.120414 |
|  | 2000-01-06 | -0.672091 |
|  | 2000-01-07 | 4.024217 |
| ... | ... | ... |
| ZARvUSD | 2025-08-28 | 0.330973 |
|  | 2025-08-29 | -0.686613 |
|  | 2025-09-01 | 0.000000 |
|  | 2025-09-02 | -0.729983 |
|  | 2025-09-03 | 0.494125 |

153417 rows × 1 columns

|  | real_date | cid | xcat | value |
|---|---|---|---|---|
| 0 | 2000-03-24 | AUD | BETA_NSA | 0.013376 |
| 1 | 2000-03-27 | AUD | BETA_NSA | 0.013376 |
| 2 | 2000-03-28 | AUD | BETA_NSA | 0.013376 |
| 3 | 2000-03-29 | AUD | BETA_NSA | 0.013376 |
| 4 | 2000-03-30 | AUD | BETA_NSA | 0.013376 |
| ... | ... | ... | ... | ... |
| 151909 | 2025-08-28 | ZAR | BETA_NSA | 0.151301 |
| 151910 | 2025-08-29 | ZAR | BETA_NSA | 0.151301 |
| 151911 | 2025-09-01 | ZAR | BETA_NSA | 0.151301 |
| 151912 | 2025-09-02 | ZAR | BETA_NSA | 0.151301 |
| 151913 | 2025-09-03 | ZAR | BETA_NSA | 0.151301 |

151914 rows × 4 columns

|  | real_date | cid | xcat | value |
|---|---|---|---|---|
| 0 | 2000-03-27 | AUD | HEDGED_RETURN_NSA | 0.980193 |
| 1 | 2000-03-28 | AUD | HEDGED_RETURN_NSA | 0.119258 |
| 2 | 2000-03-29 | AUD | HEDGED_RETURN_NSA | -0.506133 |
| 3 | 2000-03-30 | AUD | HEDGED_RETURN_NSA | 0.233666 |
| 4 | 2000-03-31 | AUD | HEDGED_RETURN_NSA | -0.827272 |
| ... | ... | ... | ... | ... |
| 155405 | 2025-08-28 | ZAR | HEDGED_RETURN_NSA | 0.309650 |
| 155406 | 2025-08-29 | ZAR | HEDGED_RETURN_NSA | 0.104120 |
| 155407 | 2025-09-01 | ZAR | HEDGED_RETURN_NSA | 0.762983 |
| 155408 | 2025-09-02 | ZAR | HEDGED_RETURN_NSA | -0.629873 |
| 155409 | 2025-09-03 | ZAR | HEDGED_RETURN_NSA | 0.097560 |

155410 rows × 4 columns

| benchmark return | return category | frequency | pearson | kendall | spearman |
|---|---|---|---|---|---|
| USD_EQXR_NSA | HEDGED_RETURN_NSA | M | 0.188726 | 0.102106 | 0.149935 |
|  |  | Q | 0.200774 | 0.112712 | 0.165224 |
|  | FXXR_NSA | M | 0.354168 | 0.222569 | 0.321713 |
|  |  | Q | 0.375871 | 0.232173 | 0.332494 |