Statistical learning for sectoral equity allocation #

This notebook offers the necessary code to replicate the research findings discussed in the Macrosynergy research post “Statistical learning for sectoral equity allocation”. Its primary objective is to inspire readers to explore and conduct additional investigations whilst also providing a foundation for testing their own unique ideas.

Building on our initial exploration of the relationship between macroeconomic trends and sectoral equity indices, we continue the analysis of sectoral rotation signals using macro trends as well as macro and financial conditions. We take advantage of the previous data ingestion and transformations of JPMaQS macro-quantamental indicators, as shown in Sectoral equity indicators, and structure the notebook as follows:

Get packages and JPMaQS data: This section installs and imports the Python packages used throughout the analysis and loads all the required data.
Feature filtering and imputation: We apply common sense in determining which categories should be part of the panel in the learning exercise. We look for consistent feature availability across cross sections, as well as maximum coverage. Therefore, we discard some categories and impute others for specific cross sections in the early period.
Sectoral signals and naive PnLs: We proceed with statistical learning on vol-targeted relative returns, using a common pipeline across sectors. The pipeline combines a LARS-based feature selection mechanism with linear models, as we opt for simplicity.

It is important to note that while the notebook covers a selection of indicators and strategies used for the post’s main findings, users can explore countless other possible indicators and approaches. Users can modify the code to test different hypotheses and strategies based on their research and ideas. Best of luck with your research!

Get packages and JPMaQS data #

Packages #

import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from macrosynergy.download import JPMaQSDownload
import macrosynergy.management as msm
import macrosynergy.panel as msp
import macrosynergy.pnl as msn
import macrosynergy.signal as mss
import macrosynergy.learning as msl
import macrosynergy.visuals as msv

from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.metrics import make_scorer

from timeit import default_timer as timer
from datetime import timedelta, date, datetime

import warnings

from IPython.display import HTML

warnings.filterwarnings("ignore")

              

Previously prepared quantamental categories #

# Import data from the csv file created by the data preparation notebook
# https://macrosynergy.com/academy/notebooks/sectoral-equity-indicators/

INPUT_PATH = os.path.join(os.getcwd(), r"../../../equity_sectoral_notebook_data.csv")

df_csv = pd.read_csv(INPUT_PATH, index_col=0)
df_csv["real_date"] = pd.to_datetime(df_csv["real_date"]).dt.date

df_csv = msm.utils.standardise_dataframe(df_csv)
df_csv = df_csv.sort_values(["cid", "xcat", "real_date"])

# Equity sector labels and cross sections

sector_labels = {
    "ALL": "All sectors",
    "COD": "Cons. discretionary",
    "COS": "Cons. staples",
    "CSR": "Communication services",
    "ENR": "Energy",
    "FIN": "Financials",
    "HLC": "Healthcare",
    "IND": "Industrials",
    "ITE": "Information tech",
    "MAT": "Materials",
    "REL": "Real estate",
    "UTL": "Utilities",
}
cids_secs = list(sector_labels.keys())

# Equity countries cross sections

cids_eq = [
    "AUD",
    "CAD",
    "CHF",
    "EUR",
    "GBP",
    "ILS",
    "JPY",
    "NOK",
    "NZD",
    "SEK",
    "SGD",
    "USD",
]

              

# Base category tickers of quantamental categories created by the data preparation notebook:
# https://macrosynergy.com/academy/notebooks/sectoral-equity-indicators/

output_growth = [
    # industrial prod
    "XIP_SA_P1M1ML12_3MMA",
    "XIP_SA_P1M1ML12_3MMA_WG",
    # construction
    "XCSTR_SA_P1M1ML12_3MMA",
    "XCSTR_SA_P1M1ML12_3MMA_WG",
    # Excess GDP growth
    "XRGDPTECH_SA_P1M1ML12_3MMA",
    "XRGDPTECH_SA_P1M1ML12_3MMA_WG",
]
private_consumption = [
    # Consumer surveys
    "CCSCORE_SA",
    "CCSCORE_SA_D3M3ML3",
    "CCSCORE_SA_WG",
    "CCSCORE_SA_D3M3ML3_WG",
    
    "XNRSALES_SA_P1M1ML12_3MMA",
    "XRRSALES_SA_P1M1ML12_3MMA",
    "XNRSALES_SA_P1M1ML12_3MMA_WG",
    "XRRSALES_SA_P1M1ML12_3MMA_WG",
    "XRPCONS_SA_P1M1ML12_3MMA",
    "XRPCONS_SA_P1M1ML12_3MMA_WG",    
]
export = [
    "XEXPORTS_SA_P1M1ML12_3MMA",
]
labour_market = [
    "UNEMPLRATE_NSA_3MMA_D1M1ML12",
    "UNEMPLRATE_SA_3MMAv5YMA",
    "UNEMPLRATE_NSA_3MMA_D1M1ML12_WG",
    "UNEMPLRATE_SA_3MMAv5YMA_WG",
    "XEMPL_NSA_P1M1ML12_3MMA",
    "XEMPL_NSA_P1M1ML12_3MMA_WG",
    "XRWAGES_NSA_P1M1ML12",
]
business_surveys = [
    # Manufacturing
    "MBCSCORE_SA",
    "MBCSCORE_SA_D3M3ML3",
    "MBCSCORE_SA_WG",
    "MBCSCORE_SA_D3M3ML3_WG",
    # Services
    "SBCSCORE_SA",
    "SBCSCORE_SA_D3M3ML3",
    "SBCSCORE_SA_WG",
    "SBCSCORE_SA_D3M3ML3_WG",
    # Construction
    "CBCSCORE_SA",
    "CBCSCORE_SA_D3M3ML3",
    "CBCSCORE_SA_WG",
    "CBCSCORE_SA_D3M3ML3_WG",
]
private_credit = [
    "XPCREDITBN_SJA_P1M1ML12",
    "XPCREDITBN_SJA_P1M1ML12_WG",
    # liquidity conditions
    "INTLIQGDP_NSA_D1M1ML1",
    "INTLIQGDP_NSA_D1M1ML6",
]
broad_inflation = [
    # Inflation
    "XCPIC_SA_P1M1ML12",
    "XCPIH_SA_P1M1ML12",
    "XPPIH_NSA_P1M1ML12",    
]
specific_inflation = [
    "XCPIE_SA_P1M1ML12",
    "XCPIF_SA_P1M1ML12",
    "XCPIE_SA_P1M1ML12_WG",
    "XCPIF_SA_P1M1ML12_WG",
]
private_and_public_debt = [
    "HHINTNETGDP_SA_D1M1ML12",
    "HHINTNETGDP_SA_D1M1ML12_WG",
    "CORPINTNETGDP_SA_D1Q1QL4",
    "CORPINTNETGDP_SA_D1Q1QL4_WG",
    "XGGDGDPRATIOX10_NSA",
]
commodity_inventories = [
    "BMLXINVCSCORE_SA",
    "REFIXINVCSCORE_SA",
    "BASEXINVCSCORE_SA",
]
commodity_markets = [
    "BMLCOCRY_SAVT10_21DMA",
    "COXR_VT10vWTI_21DMA"    
]
real_appreciation_tot = [
    "CXPI_NSA_P1M12ML1",
    "CMPI_NSA_P1M12ML1",
    "CTOT_NSA_P1M12ML1",
    "REEROADJ_NSA_P1M12ML1",
]
interest_rates = [
    "RIR_NSA",
    "RYLDIRS02Y_NSA",
    "RYLDIRS05Y_NSA",
    "RSLOPEMIDDLE_NSA",
]

# All economic categories
ecos = output_growth + private_consumption + export + labour_market + business_surveys + private_credit + broad_inflation + specific_inflation + private_and_public_debt + commodity_inventories + commodity_markets + real_appreciation_tot + interest_rates

# Equity categories
eqrets = [
    "EQC" + sec + ret for sec in cids_secs for ret in ["XR_NSA", "R_NSAvALL", "R_VT10vALL"]
]

# All categories (each economic category enters once, in normalized and negated form)
all_xcats = [x + suff for x in ecos for suff in ["_ZN", "_ZN_NEG"]] + eqrets

# Resultant tickers

tickers = [cid + "_" + xcat for cid in cids_eq for xcat in all_xcats]
print(f"Maximum number of tickers is {len(tickers)}")

              

Maximum number of tickers is 1992
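
The ticker count above is simple combinatorics: every ticker is a cross-section identifier joined to a category code by an underscore. A minimal sketch with toy lists makes the arithmetic explicit (the names `toy_cids`, `toy_base` and `toy_suffixes` are illustrative and not part of the notebook):

```python
from itertools import product

toy_cids = ["AUD", "USD"]                       # hypothetical subset of cross sections
toy_base = ["XIP_SA_P1M1ML12_3MMA", "RIR_NSA"]  # two base categories from the lists above
toy_suffixes = ["_ZN", "_ZN_NEG"]               # normalized and negated versions

# Every base category appears once per suffix
toy_xcats = [base + suff for base, suff in product(toy_base, toy_suffixes)]

# Every cross section is combined with every category
toy_tickers = [f"{cid}_{xcat}" for cid, xcat in product(toy_cids, toy_xcats)]

# Splitting on the first underscore recovers cross section and category
cid, xcat = toy_tickers[0].split("_", 1)
print(len(toy_tickers), cid, xcat)
```

Comparing `len(toy_tickers)` with `len(set(toy_tickers))` is a quick way to check that no ticker has been generated twice.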

Download additional data from JPMaQS #

# Additional tickers for download from JPMaQS

untradeable = [
    "EQCCODUNTRADABLE_NSA",
    "EQCCOSUNTRADABLE_NSA",
    "EQCCSRUNTRADABLE_NSA",
    "EQCENRUNTRADABLE_NSA",
    "EQCFINUNTRADABLE_NSA",
    "EQCHLCUNTRADABLE_NSA",
    "EQCINDUNTRADABLE_NSA",
    "EQCITEUNTRADABLE_NSA",
    "EQCMATUNTRADABLE_NSA",
    "EQCRELUNTRADABLE_NSA",
    "EQCUTLUNTRADABLE_NSA",   
]  # dummy variables for dates where certain sectors were untradeable

bmrs = [
    "USD_EQXR_NSA",
    "USD_EQXR_VT10"
]  # U.S. equity returns for correlation analysis

xtickers = [cid + "_" + xcat for cid in cids_eq for xcat in untradeable] + bmrs
print(f"Maximum number of tickers is {len(xtickers)}")

              

Maximum number of tickers is 134

# Download series from J.P. Morgan DataQuery by tickers
start_date = "2000-01-01"

# Retrieve credentials

client_id: str = os.getenv("DQ_CLIENT_ID")
client_secret: str = os.getenv("DQ_CLIENT_SECRET")

# Download from DataQuery
with JPMaQSDownload(client_id=client_id, client_secret=client_secret) as downloader:
    start = timer()
    assert downloader.check_connection()
    df_jpmaqs = downloader.download(
        tickers=xtickers,
        start_date=start_date,
        metrics=["value"],
        suppress_warning=True,
        show_progress=True,
    )
    end = timer()

print("Download time from DQ: " + str(timedelta(seconds=end - start)))

              

               Downloading data from JPMaQS.
Timestamp UTC:  2024-12-03 17:35:50
Connection successful!

              

               Requesting data: 100%|███████████████████████████████████████████████████████████████████| 7/7 [00:01<00:00,  4.61it/s]
Downloading data: 100%|██████████████████████████████████████████████████████████████████| 7/7 [00:07<00:00,  1.05s/it]

               Some expressions are missing from the downloaded data. Check logger output for complete list.
1 out of 134 expressions are missing. To download the catalogue of all available expressions and filter the unavailable expressions, set `get_catalogue=True` in the call to `JPMaQSDownload.download()`.
Some dates are missing from the downloaded data. 
3 out of 6504 dates are missing.
Download time from DQ: 0:00:10.911803

              

df = msm.update_df(df_csv, df_jpmaqs)
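
`msm.update_df` merges the downloaded series into the previously prepared panel. As a rough mental model (an assumption about its behaviour, not the library's implementation), series present in both frames are taken from the new frame and all other series are kept. The toy sketch below mimics this with plain dictionaries keyed by ticker; `update_panel` is a hypothetical helper.

```python
def update_panel(base: dict, new: dict) -> dict:
    """Return a merged panel in which series in `new` replace same-named series in `base`."""
    merged = dict(base)   # copy so that the inputs stay untouched
    merged.update(new)    # overlapping tickers are overwritten, new ones are appended
    return merged


base = {"USD_RIR_NSA_ZN": [0.1, 0.2], "USD_EQXR_NSA": [1.0, -0.5]}
new = {"USD_EQXR_NSA": [1.0, -0.5, 0.3], "USD_EQCENRUNTRADABLE_NSA": [0, 0, 0]}

panel = update_panel(base, new)
print(sorted(panel))
```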

              

# Dictionary of feature category labels

cat_labels = {
    "BASEXINVCSCORE_SA_ZN": {
        "Group": "Commodity inventories",
        "Label": "Excess crude inventory score",
        "Description": "Crude oil excess inventory z-score, seasonally adjusted",
        "Geography": "global",
    },
    "BMLCOCRY_SAVT10_21DMA_ZN": {
        "Group": "Market metrics",
        "Label": "Base metals carry",
        "Description": "Nominal carry for base metals basket, seasonally and vol-adjusted, 21 days moving average",
        "Geography": "global",
    },
    "BMLXINVCSCORE_SA_ZN": {
        "Group": "Commodity inventories",
        "Label": "Excess metal inventory score",
        "Description": "Base metal excess inventory z-score, seasonally adjusted",
        "Geography": "global",
    },
    "CBCSCORE_SA_D3M3ML3_WG_ZN": {
        "Group": "Business surveys",
        "Label": "Construction confidence, q/q",
        "Description": "Construction business confidence score, seas. adjusted, change q/q",
        "Geography": "weighted",
    },
    "CBCSCORE_SA_D3M3ML3_ZN": {
        "Group": "Business surveys",
        "Label": "Construction confidence, q/q",
        "Description": "Construction business confidence score, seas. adjusted, change q/q",
        "Geography": "local",
    },
    "CBCSCORE_SA_WG_ZN": {
        "Group": "Business surveys",
        "Label": "Construction confidence",
        "Description": "Construction business confidence score, seas. adjusted",
        "Geography": "weighted",
    },
    "CBCSCORE_SA_ZN": {
        "Group": "Business surveys",
        "Label": "Construction confidence",
        "Description": "Construction business confidence score, seas. adjusted",
        "Geography": "local",
    },
    "CCSCORE_SA_D3M3ML3_WG_ZN": {
        "Group": "Private consumption",
        "Label": "Consumer confidence, q/q",
        "Description": "Consumer confidence score, seasonally adjusted, change q/q",
        "Geography": "weighted",
    },
    "CCSCORE_SA_D3M3ML3_ZN": {
        "Group": "Private consumption",
        "Label": "Consumer confidence, q/q",
        "Description": "Consumer confidence score, seasonally adjusted, change q/q",
        "Geography": "local",
    },
    "CCSCORE_SA_WG_ZN": {
        "Group": "Private consumption",
        "Label": "Consumer confidence",
        "Description": "Consumer confidence score, seasonally adjusted",
        "Geography": "weighted",
    },
    "CCSCORE_SA_ZN": {
        "Group": "Private consumption",
        "Label": "Consumer confidence",
        "Description": "Consumer confidence score, seasonally adjusted",
        "Geography": "local",
    },
    "CMPI_NSA_P1M12ML1_ZN": {
        "Group": "Real appreciation",
        "Label": "Import prices, %oya",
        "Description": "Commodity-based import price index, %oya",
        "Geography": "local",
    },
    "CTOT_NSA_P1M12ML1_ZN": {
        "Group": "Real appreciation",
        "Label": "Terms-of-trade, %oya",
        "Description": "Commodity-based terms-of-trade, %oya",
        "Geography": "local",
    },
    "CXPI_NSA_P1M12ML1_ZN": {
        "Group": "Real appreciation",
        "Label": "Export prices, %oya",
        "Description": "Commodity-based export price index, %oya",
        "Geography": "local",
    },
    "COXR_VT10vWTI_21DMA_ZN": {
        "Group": "Market metrics",
        "Label": "Refined vs crude oil returns",
        "Description": "Refined oil products vs crude oil vol-targeted return differential, 21 days moving average",
        "Geography": "global",
    },
    "INTLIQGDP_NSA_D1M1ML1_ZN": {
        "Group": "Private credit",
        "Label": "Intervention liquidity, diff m/m",
        "Description": "Intervention liquidity to GDP ratio, change over the last month",
        "Geography": "local",
    },
    "INTLIQGDP_NSA_D1M1ML6_ZN": {
        "Group": "Private credit",
        "Label": "Intervention liquidity, diff 6m",
        "Description": "Intervention liquidity to GDP ratio, change over the last 6 months",
        "Geography": "local",
    },
    "MBCSCORE_SA_D3M3ML3_WG_ZN": {
        "Group": "Business surveys",
        "Label": "Manufacturing confidence, q/q",
        "Description": "Manufacturing business confidence score, seas. adj., change q/q",
        "Geography": "weighted",
    },
    "MBCSCORE_SA_D3M3ML3_ZN": {
        "Group": "Business surveys",
        "Label": "Manufacturing confidence, q/q",
        "Description": "Manufacturing business confidence score, seas. adj., change q/q",
        "Geography": "local",
    },
    "MBCSCORE_SA_WG_ZN": {
        "Group": "Business surveys",
        "Label": "Manufacturing confidence",
        "Description": "Manufacturing business confidence score, seasonally adjusted",
        "Geography": "weighted",
    },
    "MBCSCORE_SA_ZN": {
        "Group": "Business surveys",
        "Label": "Manufacturing confidence",
        "Description": "Manufacturing business confidence score, seasonally adjusted",
        "Geography": "local",
    },
    "REEROADJ_NSA_P1M12ML1_ZN": {
        "Group": "Real appreciation",
        "Label": "Open-adj REER, %oya",
        "Description": "Openness-adjusted real effective exchange rate, %oya",
        "Geography": "local",
    },
    "REFIXINVCSCORE_SA_ZN": {
        "Group": "Commodity inventories",
        "Label": "Excess refined oil inventory score",
        "Description": "Refined oil product excess inventory z-score, seas. adjusted",
        "Geography": "global",
    },
    "RIR_NSA_ZN": {
        "Group": "Market metrics",
        "Label": "Real 1-month rate",
        "Description": "Real 1-month interest rate",
        "Geography": "local",
    },
    "RSLOPEMIDDLE_NSA_ZN": {
        "Group": "Market metrics",
        "Label": "Real 5y-2y yield",
        "Description": "Real IRS yield differentials, 5-years versus 2-years",
        "Geography": "local",
    },
    "RYLDIRS02Y_NSA_ZN": {
        "Group": "Market metrics",
        "Label": "Real 2-year yield",
        "Description": "Real 2-year IRS yield",
        "Geography": "local",
    },
    "RYLDIRS05Y_NSA_ZN": {
        "Group": "Market metrics",
        "Label": "Real 5-year yield",
        "Description": "Real 5-year IRS yield",
        "Geography": "local",
    },
    "SBCSCORE_SA_D3M3ML3_WG_ZN": {
        "Group": "Business surveys",
        "Label": "Service confidence, q/q",
        "Description": "Services business confidence score, seas. adjusted, change q/q",
        "Geography": "weighted",
    },
    "SBCSCORE_SA_D3M3ML3_ZN": {
        "Group": "Business surveys",
        "Label": "Service confidence, q/q",
        "Description": "Services business confidence score, seas. adjusted, change q/q",
        "Geography": "local",
    },
    "SBCSCORE_SA_WG_ZN": {
        "Group": "Business surveys",
        "Label": "Service confidence",
        "Description": "Services business confidence score, seasonally adjusted",
        "Geography": "weighted",
    },
    "SBCSCORE_SA_ZN": {
        "Group": "Business surveys",
        "Label": "Service confidence",
        "Description": "Services business confidence score, seasonally adjusted",
        "Geography": "local",
    },
    "UNEMPLRATE_NSA_3MMA_D1M1ML12_WG_ZN": {
        "Group": "Labour market",
        "Label": "Unemployment rate, diff oya",
        "Description": "Unemployment rate, change oya",
        "Geography": "weighted",
    },
    "UNEMPLRATE_NSA_3MMA_D1M1ML12_ZN": {
        "Group": "Labour market",
        "Label": "Unemployment rate, diff oya",
        "Description": "Unemployment rate, change oya",
        "Geography": "local",
    },
    "UNEMPLRATE_SA_3MMAv5YMA_WG_ZN": {
        "Group": "Labour market",
        "Label": "Unemployment rate, diff vs 5yma",
        "Description": "Unemployment rate, difference vs 5-year moving average",
        "Geography": "weighted",
    },
    "UNEMPLRATE_SA_3MMAv5YMA_ZN": {
        "Group": "Labour market",
        "Label": "Unemployment rate, diff vs 5yma",
        "Description": "Unemployment rate, difference vs 5-year moving average",
        "Geography": "local",
    },
    "XCPIC_SA_P1M1ML12_ZN": {
        "Group": "Inflation - broad",
        "Label": "Excess core CPI, %oya",
        "Description": "Core CPI, %oya, in excess of effective inflation target",
        "Geography": "local",
    },
    "XCPIE_SA_P1M1ML12_WG_ZN": {
        "Group": "Inflation - specific",
        "Label": "Excess energy CPI, %oya",
        "Description": "Energy CPI, %oya, in excess of effective inflation target",
        "Geography": "weighted",
    },
    "XCPIE_SA_P1M1ML12_ZN": {
        "Group": "Inflation - specific",
        "Label": "Excess energy CPI, %oya",
        "Description": "Energy CPI, %oya, in excess of effective inflation target",
        "Geography": "local",
    },
    "XCPIF_SA_P1M1ML12_WG_ZN": {
        "Group": "Inflation - specific",
        "Label": "Excess food CPI, %oya",
        "Description": "Food CPI, %oya, in excess of effective inflation target",
        "Geography": "weighted",
    },
    "XCPIF_SA_P1M1ML12_ZN": {
        "Group": "Inflation - specific",
        "Label": "Excess food CPI, %oya",
        "Description": "Food CPI, %oya, in excess of effective inflation target",
        "Geography": "local",
    },
    "XCPIH_SA_P1M1ML12_ZN": {
        "Group": "Inflation - broad",
        "Label": "Excess headline CPI, %oya",
        "Description": "Headline CPI, %oya, in excess of effective inflation target",
        "Geography": "local",
    },
    "XCSTR_SA_P1M1ML12_3MMA_WG_ZN": {
        "Group": "Output growth",
        "Label": "Excess construction growth",
        "Description": "Construction output, %oya, 3mma, in excess of 5-y median GDP growth",
        "Geography": "weighted",
    },
    "XCSTR_SA_P1M1ML12_3MMA_ZN": {
        "Group": "Output growth",
        "Label": "Excess construction growth",
        "Description": "Construction output, %oya, 3mma, in excess of 5-y median GDP growth",
        "Geography": "local",
    },
    "XEMPL_NSA_P1M1ML12_3MMA_WG_ZN": {
        "Group": "Labour market",
        "Label": "Excess employment growth",
        "Description": "Employment growth, %oya, 3mma, in excess of population growth",
        "Geography": "weighted",
    },
    "XEMPL_NSA_P1M1ML12_3MMA_ZN": {
        "Group": "Labour market",
        "Label": "Excess employment growth",
        "Description": "Employment growth, %oya, 3mma, in excess of population growth",
        "Geography": "local",
    },
    "XEXPORTS_SA_P1M1ML12_3MMA_ZN": {
        "Group": "Exports",
        "Label": "Excess export growth",
        "Description": "Exports growth, %oya, 3mma, in excess of 5-year median GDP growth",
        "Geography": "local",
    },
    "XGGDGDPRATIOX10_NSA_ZN": {
        "Group": "Debt",
        "Label": "Excess projected gov. debt",
        "Description": "Government debt-to-GDP ratio proj. in 10 years, in excess of 100%",
        "Geography": "local",
    },
    "CORPINTNETGDP_SA_D1Q1QL4_WG_ZN": {
        "Group": "Debt",
        "Label": "Corporate debt servicing, %oya",
        "Description": "Corporate net debt servicing-to-GDP ratio, seasonally-adjusted, %oya",
        "Geography": "weighted",
    },
    "CORPINTNETGDP_SA_D1Q1QL4_ZN": {
        "Group": "Debt",
        "Label": "Corporate debt servicing, %oya",
        "Description": "Corporate net debt servicing-to-GDP ratio, seasonally-adjusted, %oya",
        "Geography": "local",
    },
    "HHINTNETGDP_SA_D1M1ML12_WG_ZN": {
        "Group": "Debt",
        "Label": "Households debt servicing, %oya",
        "Description": "Households net debt servicing-to-GDP ratio, seasonally-adjusted, %oya",
        "Geography": "weighted",
    },
    "HHINTNETGDP_SA_D1M1ML12_ZN": {
        "Group": "Debt",
        "Label": "Households debt servicing, %oya",
        "Description": "Households net debt servicing-to-GDP ratio, seasonally-adjusted, %oya",
        "Geography": "local",
    },
    "XIP_SA_P1M1ML12_3MMA_WG_ZN": {
        "Group": "Output growth",
        "Label": "Excess industry growth",
        "Description": "Industrial output, %oya, 3mma, in excess of 5-y median GDP growth",
        "Geography": "weighted",
    },
    "XIP_SA_P1M1ML12_3MMA_ZN": {
        "Group": "Output growth",
        "Label": "Excess industry growth",
        "Description": "Industrial output, %oya, 3mma, in excess of 5-y median GDP growth",
        "Geography": "local",
    },
    "XNRSALES_SA_P1M1ML12_3MMA_WG_ZN": {
        "Group": "Private consumption",
        "Label": "Excess retail sales growth",
        "Description": "Nominal retail sales, %oya, 3mma, in excess of 5-y median GDP growth",
        "Geography": "weighted",
    },
    "XNRSALES_SA_P1M1ML12_3MMA_ZN": {
        "Group": "Private consumption",
        "Label": "Excess retail sales growth",
        "Description": "Nominal retail sales, %oya, 3mma, in excess of 5-y median GDP growth",
        "Geography": "local",
    },
    "XRRSALES_SA_P1M1ML12_3MMA_WG_ZN": {
        "Group": "Private consumption",
        "Label": "Excess real retail growth",
        "Description": "Real retail sales, %oya, 3mma, in excess of 5-y median GDP growth",
        "Geography": "weighted",
    },
    "XRRSALES_SA_P1M1ML12_3MMA_ZN": {
        "Group": "Private consumption",
        "Label": "Excess real retail growth",
        "Description": "Real retail sales, %oya, 3mma, in excess of 5-y median GDP growth",
        "Geography": "local",
    },
    "XPCREDITBN_SJA_P1M1ML12_WG_ZN": {
        "Group": "Private credit",
        "Label": "Excess credit growth",
        "Description": "Private credit, %oya, 3mma, in excess of 5-y median GDP growth",
        "Geography": "weighted",
    },
    "XPCREDITBN_SJA_P1M1ML12_ZN": {
        "Group": "Private credit",
        "Label": "Excess credit growth",
        "Description": "Private credit, %oya, 3mma, in excess of 5-y median GDP growth",
        "Geography": "local",
    },
    "XPPIH_NSA_P1M1ML12_ZN": {
        "Group": "Inflation - broad",
        "Label": "Excess PPI, %oya",
        "Description": "Producer price inflation, %oya, in excess of eff. inflation target",
        "Geography": "local",
    },
    "XRGDPTECH_SA_P1M1ML12_3MMA_WG_ZN": {
        "Group": "Output growth",
        "Label": "Excess GDP growth",
        "Description": "Real GDP, %oya, 3mma, using HF data, in excess of 5-y med. GDP growth",
        "Geography": "weighted",
    },
    "XRGDPTECH_SA_P1M1ML12_3MMA_ZN": {
        "Group": "Output growth",
        "Label": "Excess GDP growth",
        "Description": "Real GDP, %oya, 3mma, using HF data, in excess of 5-y med. GDP growth",
        "Geography": "local",
    },
    "XRPCONS_SA_P1M1ML12_3MMA_WG_ZN": {
        "Group": "Private consumption",
        "Label": "Excess consumption growth",
        "Description": "Real private consumption, %oya, 3mma, in excess of 5-y median GDP growth",
        "Geography": "weighted",
    },
    "XRPCONS_SA_P1M1ML12_3MMA_ZN": {
        "Group": "Private consumption",
        "Label": "Excess real consum growth",
        "Description": "Real private consumption, %oya, 3mma, in excess of 5-y median GDP growth",
        "Geography": "local",
    },
    "XRWAGES_NSA_P1M1ML12_ZN": {
        "Group": "Labour market",
        "Label": "Excess real wage growth",
        "Description": "Real wage growth, %oya, in excess of medium-term productivity growth",
        "Geography": "local",
    },
}

cat_labels = pd.DataFrame(cat_labels).T
cat_alllabel_dict = cat_labels[["Label", "Geography"]].agg(", ".join, axis=1).to_dict()

cat_labels = (
    cat_labels
    .reset_index(drop=False)
    .rename(columns={"index": "Category"})
    .set_index(["Group", "Category"])
    .sort_index()
)

              

cat_groups_count = (
    cat_labels.index.to_frame()
    .reset_index(drop=True)
    .groupby("Group")["Category"].count()
    .sort_values(ascending=True)
)

fig = cat_groups_count.plot.barh(
    ylabel="",
    fontsize=11
)
fig.set_title(label="Number of categories by aggregate macro group", pad=20)
fig.title.set_size(16)

plt.show()

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/910df0f5a269d4624c464f1dea17ed1c89404f5e7cb09453dc15a778b2fc34fb.png

Feature filtering and imputation #

Cross-section availability requirement #

# All normalized macroeconomic categories
all_macroz = [x + "_ZN" for x in ecos]

# Identify categories with fewer than 10 cross sections
df_macro = df[df["xcat"].isin(all_macroz)]
cid_counts = df_macro.groupby("xcat")["cid"].nunique()
xcatx_low_cid = cid_counts[cid_counts < 10].index.tolist()
print("Categories with less than 10 cross sections:\n")
for xcat in xcatx_low_cid:
    print(xcat)

# Remove categories with fewer than 10 cross sections
macroz = [x for x in all_macroz if x not in xcatx_low_cid]

# Identify categories with a short history (first observation in 2003 or later)

df_macro = df[df["xcat"].isin(macroz)]
cutoff_date = pd.Timestamp("2003-01-01")
min_dates = df_macro.groupby("xcat")["real_date"].min()
xcatx_late_start = min_dates[min_dates >= cutoff_date].index.tolist()
print("\nCategories that start after 2002:\n")
for xcat in xcatx_late_start:
    print(xcat)

# Remove categories that start late
macroz = [x for x in macroz if x not in xcatx_late_start]

              

               Categories with less than 10 cross sections:

CBCSCORE_SA_D3M3ML3_WG_ZN
CBCSCORE_SA_D3M3ML3_ZN
CBCSCORE_SA_WG_ZN
CBCSCORE_SA_ZN
CORPINTNETGDP_SA_D1Q1QL4_WG_ZN
CORPINTNETGDP_SA_D1Q1QL4_ZN
HHINTNETGDP_SA_D1M1ML12_WG_ZN
HHINTNETGDP_SA_D1M1ML12_ZN

Categories that start after 2002:

COXR_VT10vWTI_21DMA_ZN

              

# Reduce label dictionary

cat_label_dict = {k: v for k, v in cat_alllabel_dict.items() if k in macroz}

# Visualize remaining macroeconomic categories
msm.check_availability(df, xcats=macroz, cids=cids_eq, missing_recent=False)

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/d0ba015e2a579a9c46411eb7365d0222456114b1952fdd2976db95f06fe59903.png

Conditional imputation of missing cross-sections #

# Impute cross-sectional values if a sufficient share of cross sections (here at least 40%) is available

# Set parameters
impute_missing_cids = True
min_ratio_cids = 0.4

# Exclude categories that cannot logically be imputed
non_imputables = [
    "CXPI_NSA_P1M12ML1_ZN",
    "CMPI_NSA_P1M12ML1_ZN",
    "CTOT_NSA_P1M12ML1_ZN",
    "REEROADJ_NSA_P1M12ML1_ZN",
]
imputables = list(set(macroz) - set(non_imputables))

if impute_missing_cids:
    df_impute = msp.impute_panel(
        df=df, xcats=imputables, cids=cids_eq, threshold=min_ratio_cids
    )
    dfx = msm.update_df(df, df_impute)
else:
    dfx = df.copy()
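
To make the idea of conditional imputation concrete, here is a minimal sketch for a single category on a single date. It assumes that a missing cross section is filled with the mean of the available cross sections, and only if the share of available cross sections reaches the threshold. This is a reading of what `msp.impute_panel` is used for here, not its implementation; `impute_cross_section` is a hypothetical helper.

```python
from statistics import mean


def impute_cross_section(values: dict, cids: list, threshold: float) -> dict:
    """Fill missing cross sections on one date with the mean of the available ones,
    provided the share of available cross sections is at least `threshold`."""
    available = {c: v for c, v in values.items() if c in cids and v is not None}
    if len(available) / len(cids) < threshold:
        return dict(available)  # too few observations: leave the gaps untouched
    fill = mean(list(available.values()))
    return {c: available.get(c, fill) for c in cids}


cids = ["AUD", "CAD", "EUR", "GBP", "USD"]

# 3 of 5 cross sections available (60% >= 40%): gaps are filled
dense = impute_cross_section({"AUD": 1.0, "EUR": 3.0, "USD": 2.0}, cids, threshold=0.4)

# 1 of 5 cross sections available (20% < 40%): nothing is imputed
sparse = impute_cross_section({"USD": 2.0}, cids, threshold=0.4)
print(dense, sparse)
```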

              

# Visualize imputed macroeconomic categories
msm.check_availability(dfx, xcats=macroz, cids=cids_eq, missing_recent=False)

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/b563c6a74748be4434cd621ea39221f7a7166b208c9ddf3eac28d08c81284a69.png

Equity sectoral return blacklisting #

sector_blacklist = {}

for sec in list(set(cids_secs) - {"ALL"}):
    
    dfb = df[df["xcat"] == f"EQC{sec}UNTRADABLE_NSA"].loc[:, ["cid", "xcat", "real_date", "value"]]
    dfba = (
        dfb.groupby(["cid", "real_date"])
        .aggregate(value=pd.NamedAgg(column="value", aggfunc="max"))
        .reset_index()
    )
    dfba["xcat"] = f"EQC{sec}BLACK"
    
    sector_blacklist[sec] = msp.make_blacklist(dfba, f"EQC{sec}BLACK")
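
A blacklist is essentially a set of date ranges per cross section during which the binary "untradable" flag equals 1. The sketch below shows one way to derive such ranges from a flag series in plain Python. The helper `to_blacklist` and the key naming (`cid` for a single spell, `cid_1`, `cid_2`, ... for several) are assumptions for illustration and are not taken from the `msp.make_blacklist` source.

```python
from datetime import date


def to_blacklist(flags: dict) -> dict:
    """Convert {cid: [(date, 0/1), ...]} into {cid or cid_k: (start, end)} ranges of 1s."""
    out = {}
    for cid, series in flags.items():
        ranges, start, prev = [], None, None
        for day, flag in sorted(series):
            if flag and start is None:
                start = day                    # a blacklisted spell begins
            elif not flag and start is not None:
                ranges.append((start, prev))   # spell ended on the previous date
                start = None
            prev = day
        if start is not None:
            ranges.append((start, prev))       # spell still open at the sample end
        for k, rng in enumerate(ranges, start=1):
            key = cid if len(ranges) == 1 else f"{cid}_{k}"
            out[key] = rng
    return out


flags = {
    "NOK": [(date(2020, 1, d), f) for d, f in [(1, 0), (2, 1), (3, 1), (6, 0), (7, 1)]],
    "USD": [(date(2020, 1, d), 0) for d in (1, 2, 3)],  # never blacklisted
}
blacklist = to_blacklist(flags)
print(blacklist)
```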

              

Visualize target availability #

targets = [
    x for x in eqrets if x.endswith(("R_NSAvALL", "R_VT10vALL"))
]
msm.check_availability(dfx, targets, missing_recent=False)

             

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/b35f09e2cd0f98068d481d6af95dc48feab066b27c22e227a85ee1ebcd8eaf5a.png

Sectoral signals and naive PnLs #

Common pipeline for all sectors #

default_target_type = "R_VT10vALL"

default_splitter = {"Expanding": msl.ExpandingKFoldPanelSplit(n_splits=3)}
default_metric = {"Sharpe": make_scorer(msl.sharpe_ratio, greater_is_better=True)}

# Model dictionary
default_models = {
    "ols": Pipeline(
        [
            ("selector", msl.LarsSelector()),
            ("predictor", msl.ModifiedLinearRegression(method="analytic")),
        ]
    ),
    "swls": Pipeline(
        [
            ("selector", msl.LarsSelector()),
            ("predictor", msl.ModifiedSignWeightedLinearRegression(method="analytic")),
        ]
    ),
    "twls": Pipeline(
        [
            ("selector", msl.LarsSelector()),
            ("predictor", msl.ModifiedTimeWeightedLinearRegression(method="analytic")),
        ]
    ),
}

# Hyperparameter grid
default_hparam_grid = {
    "ols": {
        "selector__n_factors": [3, 5, 10],
        "predictor__fit_intercept": [True, False],
    },
    "swls": {
        "selector__n_factors": [3, 5, 10],
        "predictor__fit_intercept": [True, False],
    },
    "twls": {
        "selector__n_factors": [3, 5, 10],
        "predictor__fit_intercept": [True, False],
        "predictor__half_life": [12, 24, 36, 60],
    },
}


# Default parameters
default_test_size = 3  # retraining interval in months
default_min_cids = 2  # minimum number of cids to start predicting
default_min_periods = 24  # minimum number of periods to start predicting
default_split_functions = {"Expanding": lambda n: n // 24}
#default_threshold_ndates = 24 # number of rebalancing dates to pass before incrementing the number of CV splits
#default_initial_nsplits = 3 # initial number of cross-validation splits
default_start_date = "2003-01-31"  # start date for the analysis
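
The three pipelines differ only in how observations are weighted in the least-squares fit. The sketch below shows one plausible way to construct such weights in plain Python. Both helpers are hypothetical, and the exact weighting used by the `msl` regressors may differ. It assumes that time weights halve every `half_life` periods going back in time and that sign weights equalise the total weight of positive and negative targets.

```python
def time_weights(n_obs: int, half_life: float) -> list:
    """Exponentially decaying weights, oldest observation first, normalised to sum to 1."""
    raw = [0.5 ** ((n_obs - 1 - i) / half_life) for i in range(n_obs)]
    total = sum(raw)
    return [w / total for w in raw]


def sign_weights(y: list) -> list:
    """Weights that give positive and negative targets the same total weight."""
    n_pos = sum(1 for v in y if v >= 0)
    n_neg = len(y) - n_pos
    if n_pos == 0 or n_neg == 0:
        return [1.0] * len(y)  # only one sign present: nothing to balance
    return [len(y) / (2 * n_pos) if v >= 0 else len(y) / (2 * n_neg) for v in y]


tw = time_weights(n_obs=25, half_life=12)
sw = sign_weights([0.5, 1.0, 0.2, -0.4])

# The latest observation weighs twice as much as the one 12 periods earlier
print(round(tw[24] / tw[12], 6))
# Three positive targets together weigh as much as the single negative one
print(round(sum(sw[:3]), 6), sw[3])
```

Under this reading, the `half_life` values in the hyperparameter grid control how quickly older monthly observations lose influence on the fitted coefficients.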

              

Energy #

Factor selection and signal generation #

sector = "ENR"

enr_dict = {
    "sector_name": sector_labels[sector],
    "signal_name": f"{sector}SOL",
    "pnl_name": f"{sector_labels[sector]} learning-based signal",
    "xcatx": macroz,
    "cidx": list(set(cids_eq)-set(["CHF"])), # CHF has no energy companies
    "ret": f"EQC{sector}{default_target_type}",
    "freq": "M",
    "black": sector_blacklist[sector],
    "srr": None,
    "pnls": None,
}

               

xcatx = enr_dict["xcatx"] + [enr_dict["ret"]]
cidx = enr_dict["cidx"]

so_enr = msl.SignalOptimizer(
    df = dfx,
    xcats = xcatx,
    cids = cidx,
    blacklist = enr_dict["black"],
    freq = enr_dict["freq"],
    lag = 1,
    xcat_aggs=["last", "sum"],
)

               

secname = enr_dict["sector_name"]
signal_name = enr_dict["signal_name"]


so_enr.calculate_predictions(
    name=signal_name,
    models=default_models,
    scorers=default_metric,
    hyperparameters=default_hparam_grid,
    inner_splitters = default_splitter,
    test_size=default_test_size,
    min_cids=default_min_cids, 
    min_periods=default_min_periods,
    n_jobs_outer=-1,
    split_functions=default_split_functions,
)
so_enr.models_heatmap(
    signal_name,
    cap=10,
    title=f"{secname} sector: model selection heatmap",
)

# Store signals
dfa = so_enr.get_optimized_signals()
dfx = msm.update_df(dfx, dfa)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/0445a4ccdef8acadce3dc4a6d264297a6237658fb9478941286f3afbf8d49e7e.png
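
The `split_functions` argument lets the number of cross-validation splits grow as the training sample expands. The following sketch shows the arithmetic of the notebook's rule `lambda n: n // 24`, assuming that `n` counts the retraining iterations elapsed and that the result is added to the splitter's initial number of splits. The initial value of 3 and the helper `n_splits_at` are hypothetical and serve only for illustration.

```python
# The notebook's rule: one additional split for every 24 units of n.
split_function = lambda n: n // 24

# Hypothetical starting point; the actual value is set by the inner splitter.
initial_nsplits = 3


def n_splits_at(iteration, base=initial_nsplits, fn=split_function):
    """Number of CV splits in use after `iteration` retraining iterations."""
    return base + fn(iteration)


# Splits in use at a few points of the sequential learning process.
schedule = {n: n_splits_at(n) for n in (0, 23, 24, 47, 48, 72)}
print(schedule)
```

With these assumptions the split count stays at its initial value for the first 23 iterations and then rises by one at iterations 24, 48, 72 and so on.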

xcatx = enr_dict["signal_name"]
secname = enr_dict["sector_name"]

so_enr.feature_selection_heatmap(
    xcatx,
    remove_blanks=True,
    title=f"{secname} sector: feature selection heatmap",
    ftrs_renamed=cat_label_dict,
    figsize=(12, 10),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/ba0f5d06a46cf0697205bb2fffd87f8e8df483afcdcf6bcbb473f3abd8e92cef.png

xcatx = enr_dict["signal_name"]
secname = enr_dict["sector_name"]

so_enr.coefs_stackedbarplot(
    name=xcatx,
    ftrs_renamed=cat_label_dict,
    title=f"{secname} sector: annual averages of most important feature coefficients",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/b9a4e0127fe2155731c7ae7864711c89d7ecd06f2356ff3aba01763a6f5d66cc.png

Signal quality check #

xcatx = [enr_dict["signal_name"], enr_dict["ret"]]
cidx = enr_dict["cidx"]
secname = enr_dict["sector_name"]

cr_enr = msp.CategoryRelations(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    freq=enr_dict["freq"],
    lag=1,
    blacklist = enr_dict["black"],
    xcat_aggs=["last", "sum"],
    slip=1,
    xcat_trims=[30, 30],  # trim outliers: maximum absolute value allowed for signal and return
)

cr_enr.reg_scatter(
    title=f"{secname} sector: learning-based signal and subsequent returns",
    labels=False,
    prob_est="map",
    xlab=f"{secname} signal, end-of-month, based on concurrent best model",
    ylab=f"Relative return of {secname.lower()} sector (vol-targeted), next month, %",
    coef_box="upper left",
    size=(12, 8),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/59c44ba1e179d1f5832df665cf238adf3d91dfb5953b4e6f2fb0cf89a23c453b.png

xcatx = [enr_dict["signal_name"]]
cidx = enr_dict["cidx"]
secname = enr_dict["sector_name"]

pnl_enr = msn.NaivePnL(
    df=dfx,
    ret=enr_dict["ret"],
    sigs=xcatx,
    cids=cidx,
    start=default_start_date,
    blacklist=enr_dict["black"],
    bms=["USD_EQXR_NSA"],
)

for xcat in xcatx:
    pnl_enr.make_pnl(
        sig=xcat,
        sig_op="zn_score_pan",
        rebal_freq="monthly",
        neutral="zero",
        rebal_slip=1,
        vol_scale = None,
        thresh=3,
        pnl_name=enr_dict["pnl_name"],
    )
pnl_enr.make_long_pnl(
    vol_scale=None, label=f"{secname} always long versus all-sector basket"
)

pnl_enr.plot_pnls(
    pnl_cats=pnl_enr.pnl_names,
    title=f"{secname} sector: naive PnLs of positions versus all-sector basket",
    title_fontsize=14
)

enr_dict["pnls"] = pnl_enr
pnl_enr.evaluate_pnls(pnl_cats=pnl_enr.pnl_names)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/3cf49ca717a9d5ef4e8d5443582a7b2496354f35b8d1ebc7d646fc4966237c87.png

xcat	Energy learning-based signal	Energy always long versus all-sector basket
Return %	18.193195	-18.406846
St. Dev. %	67.730761	52.984726
Sharpe Ratio	0.268611	-0.347399
Sortino Ratio	0.385311	-0.47721
Max 21-Day Draw %	-123.365164	-78.404808
Max 6-Month Draw %	-152.180947	-186.69396
Peak to Trough Draw %	-255.149725	-837.214078
Top 5% Monthly PnL Share	1.77158	-1.418463
USD_EQXR_NSA correl	-0.080132	-0.046463
Traded Months	263	263
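
As a quick consistency check, the Sharpe ratios in the table can be recomputed as the ratio of return to standard deviation. The snippet copies the Energy figures from the table; the helper name `sharpe_ratio` is ours, and the absence of any further adjustment is an assumption that the numbers appear to support.

```python
# Figures copied from the Energy table: (return %, st. dev. %, reported Sharpe ratio).
energy_pnls = {
    "Energy learning-based signal": (18.193195, 67.730761, 0.268611),
    "Energy always long versus all-sector basket": (-18.406846, 52.984726, -0.347399),
}


def sharpe_ratio(ret, sd):
    """Return per unit of standard deviation, both in the same units."""
    return ret / sd


recomputed = {name: sharpe_ratio(r, s) for name, (r, s, _) in energy_pnls.items()}

# Largest gap between the recomputed and the reported ratios.
max_abs_diff = max(abs(recomputed[name] - energy_pnls[name][2]) for name in energy_pnls)

for name, value in recomputed.items():
    print(f"{name}: {value:.6f}")
```

The same check can be applied to the tables of the other sectors by replacing the copied figures.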

secname = enr_dict["sector_name"]

pnl_enr.signal_heatmap(
    pnl_name=enr_dict["pnl_name"],
    figsize=(12, 3),
    title=f"{secname} sector: signal heatmap",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/2ac931365b552ba53dfc961346ca1454c50a4a60b6d0099bb036f76ff2239167.png
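
The PnL above uses `sig_op="zn_score_pan"` with `neutral="zero"` and `thresh=3`, which means that positions follow normalized signals capped at three units in either direction. The sketch below is a simplified, full-sample illustration of that idea. The function name `zn_scores`, the sample values and the use of the mean absolute value as the scale are assumptions; the package computes its scores sequentially over the whole panel, so actual values will differ.

```python
def zn_scores(values, neutral=0.0, thresh=3.0):
    """Scale deviations from `neutral` by their mean absolute size, then cap at +/- thresh."""
    deviations = [v - neutral for v in values]
    # Assumed scale: mean absolute deviation from the neutral level.
    scale = sum(abs(d) for d in deviations) / len(deviations)
    return [max(-thresh, min(thresh, d / scale)) for d in deviations]


# Hypothetical raw signal values with one outlier.
raw_signal = [0.5, -1.0, 2.0, -0.5, 8.0]
scores = zn_scores(raw_signal)
print([round(s, 3) for s in scores])
```

The sign of each raw signal is preserved, while the cap prevents the outlier from dominating position sizes.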

Materials #

Factor selection and signal generation #

sector = "MAT"

mat_dict = {
    "sector_name": sector_labels[sector],
    "signal_name": f"{sector}SOL",
    "pnl_name": f"{sector_labels[sector]} learning-based signal",
    "xcatx": macroz,
    "cidx": cids_eq, 
    "ret": f"EQC{sector}{default_target_type}",
    "freq": "M",
    "black": sector_blacklist[sector],
    "srr": None,
    "pnls": None,
}

               

xcatx = mat_dict["xcatx"] + [mat_dict["ret"]]
cidx = mat_dict["cidx"]

so_mat = msl.SignalOptimizer(
    df = dfx,
    xcats = xcatx,
    cids = cidx,
    blacklist = mat_dict["black"],
    freq = mat_dict["freq"],
    lag = 1,
    xcat_aggs=["last", "sum"],
)

               

secname = mat_dict["sector_name"]
signal_name = mat_dict["signal_name"]


so_mat.calculate_predictions(
    name=signal_name,
    models=default_models,
    scorers=default_metric,
    hyperparameters=default_hparam_grid,
    inner_splitters = default_splitter,
    test_size=default_test_size,
    min_cids=default_min_cids, 
    min_periods=default_min_periods,
    n_jobs_outer=-1,
    split_functions=default_split_functions,
)
so_mat.models_heatmap(
    signal_name,
    cap=10,
    title=f"{secname} sector: model selection heatmap",
)

# Store signals
dfa = so_mat.get_optimized_signals()
dfx = msm.update_df(dfx, dfa)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/bd596f1598362af0b5c6a4c889371b9f40a55e7778efffc3b6bf12bd16eec9ca.png
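
The `twls` pipeline in the model selection is tuned over half-lives of 12, 24, 36 and 60 months. As a rough guide to what this hyperparameter controls, the sketch below builds exponentially decaying sample weights under the common convention that an observation one half-life older receives half the weight. The helper `time_weights` is hypothetical, and the exact weighting inside `msl.ModifiedTimeWeightedLinearRegression` may differ.

```python
def time_weights(n_periods, half_life):
    """Exponential-decay weights, newest observation last, normalized to sum to one."""
    raw = [0.5 ** ((n_periods - 1 - t) / half_life) for t in range(n_periods)]
    total = sum(raw)
    return [w / total for w in raw]


# Five years of monthly observations with a one-year half-life.
weights = time_weights(n_periods=60, half_life=12)

# The latest observation versus the one recorded 12 months earlier.
ratio = weights[-1] / weights[-13]
print(round(ratio, 6), round(sum(weights), 6))
```

Shorter half-lives concentrate the fit on recent history, while longer ones approach the equal weighting of ordinary least squares.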

xcatx = mat_dict["signal_name"]
secname = mat_dict["sector_name"]

so_mat.feature_selection_heatmap(
    xcatx,
    remove_blanks=True,
    title=f"{secname} sector: feature selection heatmap",
    ftrs_renamed=cat_label_dict,
    figsize=(12, 10),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/2303cc1bfd7688647e0f836e8e07e7ad2531738d60f2dd38bfe2a23b9ec2abc3.png

xcatx = mat_dict["signal_name"]
secname = mat_dict["sector_name"]

so_mat.coefs_stackedbarplot(
    name=xcatx,
    ftrs_renamed=cat_label_dict,
    title=f"{secname} sector: annual averages of most important feature coefficients",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/b2146ce9844066bd11c0f189231a7b9190ab22980ab7dc8bc2b8701be1afd8db.png

Signal quality check #

xcatx = [mat_dict["signal_name"], mat_dict["ret"]]
cidx = mat_dict["cidx"]
secname = mat_dict["sector_name"]

cr_mat = msp.CategoryRelations(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    freq=mat_dict["freq"],
    blacklist=mat_dict["black"],
    lag=1,
    xcat_aggs=["last", "sum"],
    slip=1,
)

cr_mat.reg_scatter(
    title=f"{secname} sector: learning-based signal and subsequent returns",
    labels=False,
    prob_est="map",
    xlab=f"{secname} signal, end-of-month, based on concurrent best model",
    ylab=f"Relative return of {secname.lower()} sector (vol-targeted), next month, %",
    coef_box="upper left",
    size=(12, 8),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/615a25ddae110fdce199d039456d9930dae49de1b84a25545397e23c50ae0d05.png

xcatx = [mat_dict["signal_name"]]
cidx = mat_dict["cidx"]
secname = mat_dict["sector_name"]
pnl_name = mat_dict["pnl_name"]

pnl_mat = msn.NaivePnL(
    df=dfx,
    ret=mat_dict["ret"],
    sigs=xcatx,
    cids=cidx,
    start=default_start_date,
    blacklist=mat_dict["black"],
    bms=["USD_EQXR_NSA"],
)

for xcat in xcatx:
    pnl_mat.make_pnl(
        sig=xcat,
        sig_op="zn_score_pan",
        rebal_freq="monthly",
        neutral="zero",
        rebal_slip=1,
        vol_scale=None,
        thresh=3,
        pnl_name=pnl_name,
    )
pnl_mat.make_long_pnl(
    vol_scale=None, label=f"{secname} always long versus all-sector basket"
)

pnl_mat.plot_pnls(
    pnl_cats=pnl_mat.pnl_names,
    title=f"{secname} sector: naive PnLs of positions versus all-sector basket",
    title_fontsize=14
)

mat_dict["pnls"] = pnl_mat
pnl_mat.evaluate_pnls(pnl_cats=pnl_mat.pnl_names)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/d17c9e21985f28d31e7b14d128f11a6b4b1f03bacef18ce3d155f87a1ee28d23.png

xcat	Materials learning-based signal	Materials always long versus all-sector basket
Return %	25.973199	-22.888695
St. Dev. %	47.2663	40.064396
Sharpe Ratio	0.549508	-0.571298
Sortino Ratio	0.809036	-0.783012
Max 21-Day Draw %	-73.752503	-61.657388
Max 6-Month Draw %	-117.635772	-152.838949
Peak to Trough Draw %	-137.181085	-723.482521
Top 5% Monthly PnL Share	0.961768	-0.710606
USD_EQXR_NSA correl	-0.018983	0.025908
Traded Months	263	263

secname = mat_dict["sector_name"]

pnl_mat.signal_heatmap(
    pnl_name=mat_dict["pnl_name"],
    figsize=(12, 3),
    title=f"{secname} sector: signal heatmap",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/1dbd91b58f436ce4694fcbdc3de0f0ee55659690fc0ddf796beabf5faef5a7e3.png

Industrials #

Factor selection and signal generation #

sector = "IND"

ind_dict = {
    "sector_name": sector_labels[sector],
    "signal_name": f"{sector}SOL",
    "pnl_name": f"{sector_labels[sector]} learning-based signal",
    "xcatx": macroz,
    "cidx": cids_eq,
    "ret": f"EQC{sector}{default_target_type}",
    "freq": "M",
    "black": sector_blacklist[sector],
    "srr": None,
    "pnls": None,
}

               

xcatx = ind_dict["xcatx"] + [ind_dict["ret"]]
cidx = ind_dict["cidx"]

so_ind = msl.SignalOptimizer(
    df = dfx,
    xcats = xcatx,
    cids = cidx,
    blacklist = ind_dict["black"],
    freq = ind_dict["freq"],
    lag = 1,
    xcat_aggs=["last", "sum"],
)

               

secname = ind_dict["sector_name"]
signal_name = ind_dict["signal_name"]


so_ind.calculate_predictions(
    name=signal_name,
    models=default_models,
    scorers=default_metric,
    hyperparameters=default_hparam_grid,
    inner_splitters = default_splitter,
    test_size=default_test_size,
    min_cids=default_min_cids, 
    min_periods=default_min_periods,
    n_jobs_outer=-1,
    split_functions=default_split_functions,
)
so_ind.models_heatmap(
    signal_name,
    cap=10,
    title=f"{secname} sector: model selection heatmap",
)

# Store signals
dfa = so_ind.get_optimized_signals()
dfx = msm.update_df(dfx, dfa)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/3e9473143f4cca8f38600faf24921e53532709ba5694d7d9fe6d8c626e7c142f.png

secname = ind_dict["sector_name"]
xcatx = ind_dict["signal_name"]

so_ind.feature_selection_heatmap(
    xcatx,
    remove_blanks=True,
    title=f"{secname} sector: feature selection heatmap",
    ftrs_renamed=cat_label_dict,
    figsize=(12, 10),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/ae0cb15bb817ec755a4e05d126071441f612ae5c9783b415c38046cd0ad21309.png

secname = ind_dict["sector_name"]
xcatx = ind_dict["signal_name"]

so_ind.coefs_stackedbarplot(
    name=xcatx,
    ftrs_renamed=cat_label_dict,
    title=f"{secname} sector: annual averages of most important feature coefficients",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/ead6d24f40ab0f9962c60d15d81680a61cf7df650f5b584c7bc7a90146147deb.png

Signal quality check #

xcatx = [ind_dict["signal_name"], ind_dict["ret"]]
cidx = ind_dict["cidx"]
secname = ind_dict["sector_name"]

cr_ind = msp.CategoryRelations(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    freq=ind_dict["freq"],
    blacklist=ind_dict["black"],
    lag=1,
    xcat_aggs=["last", "sum"],
    slip=1,
)

cr_ind.reg_scatter(
    title=f"{secname} sector: learning-based signal and subsequent returns",
    labels=False,
    prob_est="map",
    xlab=f"{secname} signal, end-of-month, based on concurrent best model",
    ylab=f"Relative return of {secname.lower()} sector (vol-targeted), next month, %",
    coef_box="upper left",
    size=(12, 8),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/5b6a11c457f9362d59cdc0d07c3d02937c8b30b0600cf54335f0cad39244331c.png

xcatx = [ind_dict["signal_name"]]
cidx = ind_dict["cidx"]
secname = ind_dict["sector_name"]
pnl_name = ind_dict["pnl_name"]

pnl_ind = msn.NaivePnL(
    df=dfx,
    ret=ind_dict["ret"],
    sigs=xcatx,
    cids=cidx,
    start=default_start_date,
    bms=["USD_EQXR_NSA"],
    blacklist = ind_dict["black"],
)

for xcat in xcatx:
    pnl_ind.make_pnl(
        sig=xcat,
        sig_op="zn_score_pan",
        rebal_freq="monthly",
        neutral="zero",
        rebal_slip=1,
        vol_scale = None,
        thresh=3,
        pnl_name=pnl_name,
    )

pnl_ind.make_long_pnl(
    vol_scale=None, label=f"{secname} always long versus all-sector basket"
)

pnl_ind.plot_pnls(
    pnl_cats=pnl_ind.pnl_names,
    title=f"{secname} sector: naive PnLs of positions versus all-sector basket",
    title_fontsize=14
)

ind_dict["pnls"] = pnl_ind
pnl_ind.evaluate_pnls(pnl_cats=pnl_ind.pnl_names)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/798364e04f37c73de748a1de672bd8cdbb987a63a9f3979d4248af1d775196f1.png

xcat	Industrials learning-based signal	Industrials always long versus all-sector basket
Return %	-2.619157	17.250604
St. Dev. %	28.243689	31.670945
Sharpe Ratio	-0.092734	0.544682
Sortino Ratio	-0.13157	0.775381
Max 21-Day Draw %	-39.076913	-54.172009
Max 6-Month Draw %	-55.204975	-63.203195
Peak to Trough Draw %	-164.488694	-91.886853
Top 5% Monthly PnL Share	-5.457844	0.854147
USD_EQXR_NSA correl	0.014874	0.266474
Traded Months	263	263

secname = ind_dict["sector_name"]

pnl_ind.signal_heatmap(
    pnl_name=ind_dict["pnl_name"],
    figsize=(12, 3),
    title=f"{secname} sector: signal heatmap",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/ada008b42a343eeef8740f9d3e1a16003464f8f40613297bc6d2a2e224d7e84f.png

Consumer discretionary #

Factor selection and signal generation #

sector = "COD"

cod_dict = {
    "sector_name": sector_labels[sector],
    "signal_name": f"{sector}SOL",
    "pnl_name": f"{sector_labels[sector]} learning-based signal",
    "xcatx": macroz,
    "cidx": cids_eq,
    "ret": f"EQC{sector}{default_target_type}",
    "freq": "M",
    "black": sector_blacklist[sector],
    "srr": None,
    "pnls": None,
}

               

xcatx = cod_dict["xcatx"] + [cod_dict["ret"]]
cidx = cod_dict["cidx"]

so_cod = msl.SignalOptimizer(
    df = dfx,
    xcats = xcatx,
    cids = cidx,
    blacklist = cod_dict["black"],
    freq = cod_dict["freq"],
    lag = 1,
    xcat_aggs=["last", "sum"],
)

               

secname = cod_dict["sector_name"]
signal_name = cod_dict["signal_name"]


so_cod.calculate_predictions(
    name=signal_name,
    models=default_models,
    scorers=default_metric,
    hyperparameters=default_hparam_grid,
    inner_splitters = default_splitter,
    test_size=default_test_size,
    min_cids=default_min_cids, 
    min_periods=default_min_periods,
    n_jobs_outer=-1,
    split_functions=default_split_functions,
)
so_cod.models_heatmap(
    signal_name,
    cap=10,
    title=f"{secname} sector: model selection heatmap",
)

# Store signals
dfa = so_cod.get_optimized_signals()
dfx = msm.update_df(dfx, dfa)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/ac9f9fdb0d8cacb225bf5a2c12bb25d8459d3c033fc5ebddd9f994acec8a78cc.png

xcatx = cod_dict["signal_name"]
secname = cod_dict["sector_name"]

so_cod.feature_selection_heatmap(
    xcatx,
    remove_blanks=True,
    title=f"{secname} sector: feature selection heatmap",
    ftrs_renamed=cat_label_dict,
    figsize=(12, 10),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/8b43ac6015bd3e8000d045b489565864267b7bffc0ccb199551319a5236029e5.png

xcatx = cod_dict["signal_name"]
secname = cod_dict["sector_name"]

so_cod.coefs_stackedbarplot(
    name=xcatx,
    ftrs_renamed=cat_label_dict,
    title=f"{secname} sector: annual averages of most important feature coefficients",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/8740ded2aacc927db0272ac8b46a5b2427d08f1e42deb16b42b835ac868aee7a.png

Signal quality check #

xcatx = [cod_dict["signal_name"], cod_dict["ret"]]
cidx = cod_dict["cidx"]
secname = cod_dict["sector_name"]

cr_cod = msp.CategoryRelations(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    freq=cod_dict["freq"],
    blacklist = cod_dict["black"],
    lag=1,
    xcat_aggs=["last", "sum"],
    slip=1,
)

cr_cod.reg_scatter(
    title=f"{secname} sector: learning-based signal and subsequent returns",
    labels=False,
    prob_est="map",
    xlab=f"{secname} signal, end-of-month, based on concurrent best model",
    ylab=f"Relative return of {secname.lower()} sector (vol-targeted), next month, %",
    coef_box="upper left",
    size=(12, 8),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/4f5e1c715fc04898f77f537f4c508c5897e74cae746e1836457155dcaccf1c40.png

xcatx = [cod_dict["signal_name"]]
cidx = cod_dict["cidx"]
secname = cod_dict["sector_name"]
signal_name = cod_dict["signal_name"]
pnl_name = cod_dict["pnl_name"]

pnl_cod = msn.NaivePnL(
    df=dfx,
    ret=cod_dict["ret"],
    sigs=xcatx,
    cids=cidx,
    start=default_start_date,
    blacklist = cod_dict["black"],
    bms=["USD_EQXR_NSA"],
)

for xcat in xcatx:
    pnl_cod.make_pnl(
        sig=xcat,
        sig_op="zn_score_pan",
        rebal_freq="monthly",
        neutral="zero",
        rebal_slip=1,
        vol_scale = None,
        thresh=3,
        pnl_name=pnl_name,
    )
pnl_cod.make_long_pnl(
    vol_scale=None, label=f"{secname} always long versus all-sector basket"
)

pnl_cod.plot_pnls(
    pnl_cats=pnl_cod.pnl_names,
    title=f"{secname} sector: naive PnLs of positions versus all-sector basket",
    title_fontsize=14
)

cod_dict["pnls"] = pnl_cod
pnl_cod.evaluate_pnls(pnl_cats=pnl_cod.pnl_names)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/a5e55cd881f65c44701a5995eb99eb64f9f6190cc1224c21e6c275c7bfc81d6c.png

xcat	Cons. discretionary learning-based signal	Cons. discretionary always long versus all-sector basket
Return %	15.116939	-15.540605
St. Dev. %	29.846915	31.260424
Sharpe Ratio	0.506482	-0.497134
Sortino Ratio	0.715244	-0.687177
Max 21-Day Draw %	-50.727988	-68.480798
Max 6-Month Draw %	-49.162342	-96.360886
Peak to Trough Draw %	-75.936448	-358.531109
Top 5% Monthly PnL Share	0.90696	-0.755639
USD_EQXR_NSA correl	-0.020859	0.095807
Traded Months	263	263

pnl_name = cod_dict["pnl_name"]
secname = cod_dict["sector_name"]

pnl_cod.signal_heatmap(
    pnl_name=pnl_name,
    figsize=(12, 3),
    title=f"{secname} sector: signal heatmap",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/dad56750c8802e14431963e0f3b3fd08e509900a76c214cde3dde6f193f9948e.png

Consumer staples #

Factor selection and signal generation #

sector = "COS"

cos_dict = {
    "sector_name": sector_labels[sector],
    "signal_name": f"{sector}SOL",
    "pnl_name": f"{sector_labels[sector]} learning-based signal",
    "xcatx": macroz,
    "cidx": cids_eq,
    "ret": f"EQC{sector}{default_target_type}",
    "freq": "M",
    "black": sector_blacklist[sector],
    "srr": None,
    "pnls": None,
}

               

xcatx = cos_dict["xcatx"] + [cos_dict["ret"]]
cidx = cos_dict["cidx"]

so_cos = msl.SignalOptimizer(
    df = dfx,
    xcats = xcatx,
    cids = cidx,
    blacklist = cos_dict["black"],
    freq = cos_dict["freq"],
    lag = 1,
    xcat_aggs=["last", "sum"],
)

               

secname = cos_dict["sector_name"]
signal_name = cos_dict["signal_name"]


so_cos.calculate_predictions(
    name=signal_name,
    models=default_models,
    scorers=default_metric,
    hyperparameters=default_hparam_grid,
    inner_splitters = default_splitter,
    test_size=default_test_size,
    min_cids=default_min_cids, 
    min_periods=default_min_periods,
    n_jobs_outer=-1,
    split_functions=default_split_functions,
)
so_cos.models_heatmap(
    signal_name,
    cap=10,
    title=f"{secname} sector: model selection heatmap",
)

# Store signals
dfa = so_cos.get_optimized_signals()
dfx = msm.update_df(dfx, dfa)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/42c50008eda16acd78261a68a245ed307c32629a986b90e153e60c165f4dd08a.png

xcatx = cos_dict["signal_name"]
secname = cos_dict["sector_name"]

so_cos.feature_selection_heatmap(
    xcatx,
    remove_blanks=True,
    title=f"{secname} sector: feature selection heatmap",
    ftrs_renamed=cat_label_dict,
    figsize=(12, 10),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/69e337f0150a1a6e9055f6e7e36b5cb60ca1fb5adb2929197a17b28f57f87954.png

xcatx = cos_dict["signal_name"]
secname = cos_dict["sector_name"]

so_cos.coefs_stackedbarplot(
    name=xcatx,
    ftrs_renamed=cat_label_dict,
    title=f"{secname} sector: annual averages of most important feature coefficients",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/38918da219a3f1670ab59ed35694354af918390e98795128e613dd9541fded00.png

Signal quality check #

xcatx = [cos_dict["signal_name"], cos_dict["ret"]]
cidx = cos_dict["cidx"]
secname = cos_dict["sector_name"]

cr_cos = msp.CategoryRelations(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    freq=cos_dict["freq"],
    blacklist = cos_dict["black"],
    lag=1,
    xcat_aggs=["last", "sum"],
    slip=1,
)

cr_cos.reg_scatter(
    title=f"{secname} sector: learning-based signal and subsequent returns",
    labels=False,
    prob_est="map",
    xlab=f"{secname} signal, end-of-month, based on concurrent best model",
    ylab=f"Relative return of {secname.lower()} sector (vol-targeted), next month, %",
    coef_box="upper left",
    size=(12, 8),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/4abf12052d119103513a5bccad20c305feb85edf4d847862ec62738c4774de4c.png

xcatx = [cos_dict["signal_name"]]
cidx = cos_dict["cidx"]
secname = cos_dict["sector_name"]
signal_name = cos_dict["signal_name"]
pnl_name = cos_dict["pnl_name"]

pnl_cos = msn.NaivePnL(
    df=dfx,
    ret=cos_dict["ret"],
    sigs=xcatx,
    cids=cidx,
    start=default_start_date,
    blacklist = cos_dict["black"],
    bms=["USD_EQXR_NSA"],
)

for xcat in xcatx:
    pnl_cos.make_pnl(
        sig=xcat,
        sig_op="zn_score_pan",
        rebal_freq="monthly",
        neutral="zero",
        rebal_slip=1,
        vol_scale = None,
        thresh=3,
        pnl_name=pnl_name,
    )
pnl_cos.make_long_pnl(
    vol_scale=None, label=f"{secname} always long versus all-sector basket"
)

pnl_cos.plot_pnls(
    pnl_cats=pnl_cos.pnl_names, 
    title=f"{secname} sector: naive PnLs of positions versus all-sector basket",
    title_fontsize=14
)

cos_dict["pnls"] = pnl_cos
pnl_cos.evaluate_pnls(pnl_cats=pnl_cos.pnl_names)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/d76a3fcd2a277ae81212f8ac36166b9d11edb959c27400270f4ff8b0dd21af8f.png

xcat	Cons. staples learning-based signal	Cons. staples always long versus all-sector basket
Return %	14.219914	7.023877
St. Dev. %	33.631262	37.543193
Sharpe Ratio	0.422818	0.187088
Sortino Ratio	0.620258	0.268016
Max 21-Day Draw %	-48.463479	-33.681565
Max 6-Month Draw %	-53.161616	-93.828486
Peak to Trough Draw %	-98.285987	-188.904199
Top 5% Monthly PnL Share	1.25132	2.446341
USD_EQXR_NSA correl	-0.012331	-0.13535
Traded Months	263	263

pnl_name = cos_dict["pnl_name"]
secname = cos_dict["sector_name"]

pnl_cos.signal_heatmap(
    pnl_name=pnl_name,
    figsize=(12, 3),
    title=f"{secname} sector: signal heatmap",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/8e76f06c2dc733b6ee8f89638d24a61434a77bbfd1075761cf4eb6ff8f3fd52f.png

Healthcare #

Factor selection and signal generation #

sector = "HLC"

hlc_dict = {
    "sector_name": sector_labels[sector],
    "signal_name": f"{sector}SOL",
    "pnl_name": f"{sector_labels[sector]} learning-based signal",
    "xcatx": macroz,
    "cidx": cids_eq,
    "ret": f"EQC{sector}{default_target_type}",
    "freq": "M",
    "black": sector_blacklist[sector],
    "srr": None,
    "pnls": None,
}

               

xcatx = hlc_dict["xcatx"] + [hlc_dict["ret"]]
cidx = hlc_dict["cidx"]

so_hlc = msl.SignalOptimizer(
    df = dfx,
    xcats = xcatx,
    cids = cidx,
    blacklist = hlc_dict["black"],
    freq = hlc_dict["freq"],
    lag = 1,
    xcat_aggs=["last", "sum"],
)

               

secname = hlc_dict["sector_name"]
signal_name = hlc_dict["signal_name"]


so_hlc.calculate_predictions(
    name=signal_name,
    models=default_models,
    scorers=default_metric,
    hyperparameters=default_hparam_grid,
    inner_splitters = default_splitter,
    test_size=default_test_size,
    min_cids=default_min_cids, 
    min_periods=default_min_periods,
    n_jobs_outer=-1,
    split_functions=default_split_functions,
)
so_hlc.models_heatmap(
    signal_name,
    cap=10,
    title=f"{secname} sector: model selection heatmap",
)

# Store signals
dfa = so_hlc.get_optimized_signals()
dfx = msm.update_df(dfx, dfa)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/95d554816e22d5a4373eaa71512f8b81619ca6d7a8b521c90eb5c8ac67352143.png

xcatx = hlc_dict["signal_name"]

so_hlc.feature_selection_heatmap(
    xcatx,
    remove_blanks=True,
    title="Feature selection heatmap, healthcare sector",
    ftrs_renamed=cat_label_dict,
    figsize=(12, 10),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/11c33e8af52a94643132629379780a6654aba8ec9daba31f90d7867edc7db4e1.png

xcatx = hlc_dict["signal_name"]
secname = hlc_dict["sector_name"]

so_hlc.coefs_stackedbarplot(
    name=xcatx,
    ftrs_renamed=cat_label_dict,
    title=f"{secname} sector: annual averages of most important feature coefficients",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/b10348d3b77e4ebc636043f5702be63379780f73003213a7378ec25b422270ac.png

Signal quality check #

xcatx = [hlc_dict["signal_name"], hlc_dict["ret"]]
cidx = hlc_dict["cidx"]
secname = hlc_dict["sector_name"]

cr_hlc = msp.CategoryRelations(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    freq=hlc_dict["freq"],
    blacklist = hlc_dict["black"],
    lag=1,
    xcat_aggs=["last", "sum"],
    slip=1,
)

cr_hlc.reg_scatter(
    title=f"{secname} sector: learning-based signal and subsequent returns",
    labels=False,
    prob_est="map",
    xlab=f"{secname} signal, end-of-month, based on concurrent best model",
    ylab=f"Relative return of {secname.lower()} sector (vol-targeted), next month, %",
    coef_box="upper left",
    size=(12, 8),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/e7fb5b3a2b812da9ba9c131859cd0c7cf0a364cb661c7ac961c45befb9377f68.png

xcatx = [hlc_dict["signal_name"]]
cidx = hlc_dict["cidx"]
secname = hlc_dict["sector_name"]
signal_name = hlc_dict["signal_name"]
pnl_name = hlc_dict["pnl_name"]

pnl_hlc = msn.NaivePnL(
    df=dfx,
    ret=hlc_dict["ret"],
    sigs=xcatx,
    cids=cidx,
    start=default_start_date,
    blacklist = hlc_dict["black"],
    bms=["USD_EQXR_NSA"],
)

for xcat in xcatx:
    pnl_hlc.make_pnl(
        sig=xcat,
        sig_op="zn_score_pan",
        rebal_freq="monthly",
        neutral="zero",
        rebal_slip=1,
        vol_scale = None,
        thresh=3,
        pnl_name=pnl_name,
    )
pnl_hlc.make_long_pnl(
    vol_scale=None, label=f"{secname} always long versus all-sector basket"
)

pnl_hlc.plot_pnls(
    pnl_cats=pnl_hlc.pnl_names,
    title=f"{secname} sector: naive PnLs of positions versus all-sector basket",
    title_fontsize=14
)

hlc_dict["pnls"] = pnl_hlc
pnl_hlc.evaluate_pnls(pnl_cats=pnl_hlc.pnl_names)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/52b24079135c1e94eec02dd1ee380de714012c6ed0c383370765463c4dd8b9bf.png

xcat	Healthcare learning-based signal	Healthcare always long versus all-sector basket
Return %	8.702954	-3.726952
St. Dev. %	35.605841	38.49197
Sharpe Ratio	0.244425	-0.096824
Sortino Ratio	0.35035	-0.13858
Max 21-Day Draw %	-54.683253	-47.409795
Max 6-Month Draw %	-74.21209	-93.207315
Peak to Trough Draw %	-199.7919	-262.724098
Top 5% Monthly PnL Share	2.050835	-4.570977
USD_EQXR_NSA correl	0.002254	-0.158846
Traded Months	263	263

pnl_name = hlc_dict["pnl_name"]
secname = hlc_dict["sector_name"]

pnl_hlc.signal_heatmap(
    pnl_name=pnl_name,
    figsize=(12, 3),
    title=f"{secname} sector: signal heatmap",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/216879d09b11d550caf95a4eacaead214502724cc443ed3d1653ec20342c823e.png

Financials #

Factor selection and signal generation #

sector = "FIN"

fin_dict = {
    "sector_name": sector_labels[sector],
    "signal_name": f"{sector}SOL",
    "pnl_name": f"{sector_labels[sector]} learning-based signal",
    "xcatx": macroz,
    "cidx": cids_eq,
    "ret": f"EQC{sector}{default_target_type}",
    "freq": "M",
    "black": sector_blacklist[sector],
    "srr": None,
    "pnls": None,
}

               

                xcatx = fin_dict["xcatx"] + [fin_dict["ret"]]
cidx = fin_dict["cidx"]

so_fin = msl.SignalOptimizer(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    blacklist=fin_dict["black"],
    freq=fin_dict["freq"],
    lag=1,
    xcat_aggs=["last", "sum"],
)

               

                secname = fin_dict["sector_name"]
signal_name = fin_dict["signal_name"]


so_fin.calculate_predictions(
    name=signal_name,
    models=default_models,
    scorers=default_metric,
    hyperparameters=default_hparam_grid,
    inner_splitters = default_splitter,
    test_size=default_test_size,
    min_cids=default_min_cids, 
    min_periods=default_min_periods,
    n_jobs_outer=-1,
    split_functions=default_split_functions,
)
so_fin.models_heatmap(
    signal_name,
    cap=10,
    title=f"{secname} sector: model selection heatmap",
)

# Store signals
dfa = so_fin.get_optimized_signals()
dfx = msm.update_df(dfx, dfa)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/2a3505783c6200e37b10df9628344e6e7d815fc871fb420c3df4dbc720a9a896.png

                xcatx = fin_dict["signal_name"]
secname = fin_dict["sector_name"]

so_fin.feature_selection_heatmap(
    xcatx,
    remove_blanks=True,
    title=f"{secname} sector: feature selection heatmap",
    ftrs_renamed=cat_label_dict,
    figsize=(12, 10),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/1ba53c55a50e18bee63ed2d16f6d190e9f8b17d52e43c52c456d69fc20d34a60.png

                xcatx = fin_dict["signal_name"]
secname = fin_dict["sector_name"]

so_fin.coefs_stackedbarplot(
    name=xcatx,
    ftrs_renamed=cat_label_dict,
    title=f"{secname} sector: annual averages of most important feature coefficients",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/dc182e48c3cdaddb27e24af579099a630f140b1302d710a617653423d60a069a.png

Signal quality check #

                xcatx = [fin_dict["signal_name"], fin_dict["ret"]]
cidx = fin_dict["cidx"]
secname = fin_dict["sector_name"]

cr_fin = msp.CategoryRelations(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    freq=fin_dict["freq"],
    blacklist = fin_dict["black"],
    lag=1,
    xcat_aggs=["last", "sum"],
    slip=1,
)

cr_fin.reg_scatter(
    title=f"{secname} sector: learning-based signal and subsequent returns",
    labels=False,
    prob_est="map",
    xlab=f"{secname} signal, end-of-month, based on concurrent best model",
    ylab=f"Relative return of {secname.lower()} sector (vol-targeted), next month, %",
    coef_box="upper left",
    size=(12, 8),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/8aa6baab39f3a3a7d984bae201964a8f70706b625c0c100375d7aa83ec4edbba.png

                xcatx = [fin_dict["signal_name"]]
cidx = fin_dict["cidx"]
secname = fin_dict["sector_name"]
signal_name = fin_dict["signal_name"]
pnl_name = fin_dict["pnl_name"]

pnl_fin = msn.NaivePnL(
    df=dfx,
    ret=fin_dict["ret"],
    sigs=xcatx,
    cids=cidx,
    start=default_start_date,
    blacklist = fin_dict["black"],
    bms=["USD_EQXR_NSA"],
)

for xcat in xcatx:
    pnl_fin.make_pnl(
        sig=xcat,
        sig_op="zn_score_pan",
        rebal_freq="monthly",
        neutral="zero",
        rebal_slip=1,
        vol_scale = None,
        thresh=3,
        pnl_name=pnl_name,
    )
pnl_fin.make_long_pnl(
    vol_scale=None, label=f"{secname} always long versus all-sector basket"
)

pnl_fin.plot_pnls(
    pnl_cats=pnl_fin.pnl_names,
    title=f"{secname} sector: naive PnLs of positions versus all-sector basket",
    title_fontsize=14
)

fin_dict["pnls"] = pnl_fin
pnl_fin.evaluate_pnls(pnl_cats=pnl_fin.pnl_names)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/a423ac12202190ac6b34f5c3fef36d3a88f82cb8156ce9e477492fc6795fd91a.png

xcat	Financials learning-based signal	Financials always long versus all-sector basket
Return %	-1.090167	3.305202
St. Dev. %	43.639779	38.175851
Sharpe Ratio	-0.024981	0.086578
Sortino Ratio	-0.037477	0.126331
Max 21-Day Draw %	-86.769769	-76.126375
Max 6-Month Draw %	-152.897607	-96.122679
Peak to Trough Draw %	-312.039862	-332.744644
Top 5% Monthly PnL Share	-23.924655	5.982793
USD_EQXR_NSA correl	-0.042891	0.219152
Traded Months	263	263

pnl_name = fin_dict["pnl_name"]
secname = fin_dict["sector_name"]

pnl_fin.signal_heatmap(
    pnl_name=pnl_name,
    figsize=(12, 3),
    title=f"{secname} sector: signal heatmap",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/b981dbc12bb6688fe6f195e83837c4df650633f829dfd5cb1e995c6e843b9dbf.png
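
The `make_pnl` calls in each sector section pass `sig_op="zn_score_pan"`, `neutral="zero"` and `thresh=3`. As a rough illustration of what such a transformation does, the sketch below scales a signal panel by its dispersion around zero and winsorizes the result at three. It is a simplified stand-in under stated assumptions, not the package's implementation: `zn_score_panel` is a hypothetical helper, the data are random, and the dispersion is estimated on the full sample, whereas a backtest should only use information available at each point in time.

```python
import numpy as np


def zn_score_panel(signal, thresh=3.0):
    """Scale a (periods x cross-sections) panel by its dispersion around zero, then winsorize."""
    signal = np.asarray(signal, dtype=float)
    # Root-mean-square deviation from the zero neutral level, pooled over the whole panel
    # (assumption: full-sample estimate, used here only to keep the sketch short)
    sd = np.sqrt(np.nanmean(signal**2))
    # Winsorize: scores beyond +/- thresh are set to the threshold
    return np.clip(signal / sd, -thresh, thresh)


rng = np.random.default_rng(0)
raw_signal = rng.normal(0.0, 2.0, size=(24, 3))  # hypothetical: 24 months, 3 cross-sections
raw_signal[5, 0] = 25.0  # one extreme observation
returns = rng.normal(0.0, 1.0, size=(24, 3))  # hypothetical monthly returns

scores = zn_score_panel(raw_signal, thresh=3.0)
# Positions set at the end of period t earn the return of period t+1
pnl = (scores[:-1] * returns[1:]).sum(axis=1)
print(scores[5, 0], pnl.shape)
```

The winsorization caps the influence of the extreme observation on position size, which is the practical purpose of the `thresh` argument.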

Technology #

Factor selection and signal generation #

                sector = "ITE"

ite_dict = {
    "sector_name": sector_labels[sector],
    "signal_name": f"{sector}SOL",
    "pnl_name": f"{sector_labels[sector]} learning-based signal",
    "xcatx": macroz,
    "cidx": cids_eq,
    "ret": f"EQC{sector}{default_target_type}",
    "freq": "M",
    "black": sector_blacklist[sector],
    "srr": None,
    "pnls": None,
}

               

                xcatx = ite_dict["xcatx"] + [ite_dict["ret"]]
cidx = ite_dict["cidx"]

so_ite = msl.SignalOptimizer(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    blacklist=ite_dict["black"],
    freq=ite_dict["freq"],
    lag=1,
    xcat_aggs=["last", "sum"],
)

               

                secname = ite_dict["sector_name"]
signal_name = ite_dict["signal_name"]


so_ite.calculate_predictions(
    name=signal_name,
    models=default_models,
    scorers=default_metric,
    hyperparameters=default_hparam_grid,
    inner_splitters = default_splitter,
    test_size=default_test_size,
    min_cids=default_min_cids, 
    min_periods=default_min_periods,
    n_jobs_outer=-1,
    split_functions=default_split_functions,
)
so_ite.models_heatmap(
    signal_name,
    cap=10,
    title=f"{secname} sector: model selection heatmap",
)

# Store signals
dfa = so_ite.get_optimized_signals()
dfx = msm.update_df(dfx, dfa)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/0644466c983f4dcb507765cd659e516081b7b892a9c5c86894026854407a7c32.png

                xcatx = ite_dict["signal_name"]
secname = ite_dict["sector_name"]

so_ite.feature_selection_heatmap(
    xcatx,
    remove_blanks=True,
    title=f"{secname} sector: feature selection heatmap",
    ftrs_renamed=cat_label_dict,
    figsize=(12, 10),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/ae52669e1923fe688f6e986f355813aa879f2da4bfd85e607ad376ac92c00e7e.png

                xcatx = ite_dict["signal_name"]
secname = ite_dict["sector_name"]

so_ite.coefs_stackedbarplot(
    name=xcatx,
    ftrs_renamed=cat_label_dict,
    title=f"{secname} sector: annual averages of most important feature coefficients",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/fe0e8d2fcab7b2ce1804653c5e519a31556d56f65b8251b5cb948d35da24729e.png

Signal quality check #

                xcatx = [ite_dict["signal_name"], ite_dict["ret"]]
cidx = ite_dict["cidx"]
secname = ite_dict["sector_name"]

cr_ite = msp.CategoryRelations(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    freq=ite_dict["freq"],
    blacklist = ite_dict["black"],
    lag=1,
    xcat_aggs=["last", "sum"],
    slip=1,
)

cr_ite.reg_scatter(
    title=f"{secname} sector: learning-based signal and subsequent returns",
    labels=False,
    prob_est="map",
    xlab=f"{secname} signal, end-of-month, based on concurrent best model",
    ylab=f"Relative return of {secname.lower()} sector (vol-targeted), next month, %",
    coef_box="upper left",
    size=(12, 8),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/30b3e35c72668e66fa18270150e2b40f2ce8f65bd56d42859d7f859ec4ca2119.png

                xcatx = [ite_dict["signal_name"]]
cidx = ite_dict["cidx"]
secname = ite_dict["sector_name"]
signal_name = ite_dict["signal_name"]
pnl_name = ite_dict["pnl_name"]

pnl_ite = msn.NaivePnL(
    df=dfx,
    ret=ite_dict["ret"],
    sigs=xcatx,
    cids=cidx,
    start=default_start_date,
    blacklist=ite_dict["black"],
    bms=["USD_EQXR_NSA"],
)

for xcat in xcatx:
    pnl_ite.make_pnl(
        sig=xcat,
        sig_op="zn_score_pan",
        rebal_freq="monthly",
        neutral="zero",
        rebal_slip=1,
        vol_scale = None,
        thresh=3,
        pnl_name=pnl_name,
    )
pnl_ite.make_long_pnl(
    vol_scale=None, label=f"{secname} always long versus all-sector basket"
)

pnl_ite.plot_pnls(
    pnl_cats=pnl_ite.pnl_names,
    title=f"{secname} sector: naive PnLs of positions versus all-sector basket",
    title_fontsize=14,
)

ite_dict["pnls"] = pnl_ite
pnl_ite.evaluate_pnls(pnl_cats=pnl_ite.pnl_names)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/5f63df064135324b3ccaa7373302a8f71b2822f74b7b52947f54503d1cc6e67a.png

xcat	Information tech learning-based signal	Information tech always long versus all-sector basket
Return %	6.809801	-7.301641
St. Dev. %	33.917082	37.508259
Sharpe Ratio	0.200778	-0.194668
Sortino Ratio	0.269447	-0.269019
Max 21-Day Draw %	-84.295933	-43.579705
Max 6-Month Draw %	-88.795048	-116.166222
Peak to Trough Draw %	-175.534762	-404.663283
Top 5% Monthly PnL Share	2.774087	-2.273547
USD_EQXR_NSA correl	-0.003084	0.031326
Traded Months	263	263

                pnl_name = ite_dict["pnl_name"]
secname = ite_dict["sector_name"]

pnl_ite.signal_heatmap(
    pnl_name=pnl_name,
    figsize=(12, 3),
    title=f"{secname} sector: signal heatmap",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/3e2958b1cf21b694fe817a5e1cc3a31718181f7e80247c83f1059ff72a9eb906.png

Communication #

Factor selection and signal generation #

                sector = "CSR"

csr_dict = {
    "sector_name": sector_labels[sector],
    "signal_name": f"{sector}SOL",
    "pnl_name": f"{sector_labels[sector]} learning-based signal",
    "xcatx": macroz,
    "cidx": cids_eq,
    "ret": f"EQC{sector}{default_target_type}",
    "freq": "M",
    "black": sector_blacklist[sector],
    "srr": None,
    "pnls": None,
}

               

                xcatx = csr_dict["xcatx"] + [csr_dict["ret"]]
cidx = csr_dict["cidx"]

so_csr = msl.SignalOptimizer(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    blacklist=csr_dict["black"],
    freq=csr_dict["freq"],
    lag=1,
    xcat_aggs=["last", "sum"],
)

               

                secname = csr_dict["sector_name"]
signal_name = csr_dict["signal_name"]


so_csr.calculate_predictions(
    name=signal_name,
    models=default_models,
    scorers=default_metric,
    hyperparameters=default_hparam_grid,
    inner_splitters = default_splitter,
    test_size=default_test_size,
    min_cids=default_min_cids, 
    min_periods=default_min_periods,
    n_jobs_outer=-1,
    split_functions=default_split_functions,
)
so_csr.models_heatmap(
    signal_name,
    cap=10,
    title=f"{secname} sector: model selection heatmap",
)

# Store signals
dfa = so_csr.get_optimized_signals()
dfx = msm.update_df(dfx, dfa)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/2d39515cc5a51f26e13ee1dc770ea89e48231d0145b892a79b4d3d68140ea1f1.png

                xcatx = csr_dict["signal_name"]
secname = csr_dict["sector_name"]

so_csr.feature_selection_heatmap(
    xcatx,
    remove_blanks=True,
    title=f"{secname} sector: feature selection heatmap",
    ftrs_renamed=cat_label_dict,
    figsize=(12, 10),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/0d0bb52e294f875015068b44c0f3f4319e18c3d99abbfdb1b88d16dd8cefdb40.png

                xcatx = csr_dict["signal_name"]
secname = csr_dict["sector_name"]

so_csr.coefs_stackedbarplot(
    name=xcatx,
    ftrs_renamed=cat_label_dict,
    title=f"{secname} sector: annual averages of most important feature coefficients",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/4ca43642e3af5f5b90daab4712521e876337b3a9ecea5525d7b01cbc7274f019.png

Signal quality check #

                xcatx = [csr_dict["signal_name"], csr_dict["ret"]]
cidx = csr_dict["cidx"]
secname = csr_dict["sector_name"]

cr_csr = msp.CategoryRelations(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    freq=csr_dict["freq"],
    blacklist = csr_dict["black"],
    lag=1,
    xcat_aggs=["last", "sum"],
    slip=1,
)

cr_csr.reg_scatter(
    title=f"{secname} sector: learning-based signal and subsequent returns",
    labels=False,
    prob_est="map",
    xlab=f"{secname} signal, end-of-month, based on concurrent best model",
    ylab=f"Relative return of {secname.lower()} sector (vol-targeted), next month, %",
    coef_box="upper left",
    size=(12, 8),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/02c9824ce18bea71805e447edb9a2d3cc35bf24a54252a529cab431b9625c22e.png

                xcatx = [csr_dict["signal_name"]]
cidx = csr_dict["cidx"]
secname = csr_dict["sector_name"]
signal_name = csr_dict["signal_name"]
pnl_name = csr_dict["pnl_name"]

pnl_csr = msn.NaivePnL(
    df=dfx,
    ret=csr_dict["ret"],
    sigs=xcatx,
    cids=cidx,
    start=default_start_date,
    blacklist = csr_dict["black"],
    bms=["USD_EQXR_NSA"],
)

for xcat in xcatx:
    pnl_csr.make_pnl(
        sig=xcat,
        sig_op="zn_score_pan",
        rebal_freq="monthly",
        neutral="zero",
        rebal_slip=1,
        vol_scale = None,
        thresh=3,
        pnl_name=pnl_name,
    )
pnl_csr.make_long_pnl(
    vol_scale=None, label=f"{secname} always long versus all-sector basket"
)

pnl_csr.plot_pnls(
    pnl_cats=pnl_csr.pnl_names,
    title=f"{secname} sector: naive PnLs of positions versus all-sector basket",
    title_fontsize=14
)

csr_dict["pnls"] = pnl_csr
pnl_csr.evaluate_pnls(pnl_cats=pnl_csr.pnl_names)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/bc78f6078fee0dce789f32c09a6339060fcc159c7263fe051521ae15e77ed9a9.png

xcat	Communication services learning-based signal	Communication services always long versus all-sector basket
Return %	18.496164	-11.402601
St. Dev. %	29.464506	35.693288
Sharpe Ratio	0.627744	-0.319461
Sortino Ratio	0.913374	-0.452367
Max 21-Day Draw %	-58.825569	-41.34344
Max 6-Month Draw %	-86.419369	-110.518234
Peak to Trough Draw %	-99.632582	-322.629621
Top 5% Monthly PnL Share	0.918294	-1.31652
USD_EQXR_NSA correl	0.008472	-0.039535
Traded Months	263	263

                pnl_name = csr_dict["pnl_name"]
secname = csr_dict["sector_name"]

pnl_csr.signal_heatmap(
    pnl_name=pnl_name,
    figsize=(12, 3),
    title=f"{secname} sector: signal heatmap",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/56e26210abdd056d8ac423ebe4741ac729ed28a8aa4f6039496245b1b97e8bdc.png

Utilities #

Factor selection and signal generation #

                sector = "UTL"

utl_dict = {
    "sector_name": sector_labels[sector],
    "signal_name": f"{sector}SOL",
    "pnl_name": f"{sector_labels[sector]} learning-based signal",
    "xcatx": macroz,
    "cidx": cids_eq,
    "ret": f"EQC{sector}{default_target_type}",
    "freq": "M",
    "black": sector_blacklist[sector],
    "srr": None,
    "pnls": None,
}

               

                xcatx = utl_dict["xcatx"] + [utl_dict["ret"]]
cidx = utl_dict["cidx"]

so_utl = msl.SignalOptimizer(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    blacklist=utl_dict["black"],
    freq=utl_dict["freq"],
    lag=1,
    xcat_aggs=["last", "sum"],
)

               

                secname = utl_dict["sector_name"]
signal_name = utl_dict["signal_name"]

so_utl.calculate_predictions(
    name=signal_name,
    models=default_models,
    scorers=default_metric,
    hyperparameters=default_hparam_grid,
    inner_splitters = default_splitter,
    test_size=default_test_size,
    min_cids=default_min_cids, 
    min_periods=default_min_periods,
    n_jobs_outer=-1,
    split_functions=default_split_functions,
)
so_utl.models_heatmap(
    signal_name,
    cap=10,
    title=f"{secname} sector: model selection heatmap",
)

# Store signals
dfa = so_utl.get_optimized_signals()
dfx = msm.update_df(dfx, dfa)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/f81b82638ab927d35e58cce33343c978e4db050b2fbe5b6ece23366f2bec2b9d.png

                xcatx = utl_dict["signal_name"]
secname = utl_dict["sector_name"]

so_utl.feature_selection_heatmap(
    xcatx,
    remove_blanks=True,
    title=f"{secname} sector: feature selection heatmap",
    ftrs_renamed=cat_label_dict,
    figsize=(12, 10),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/6834b7fac4388c2791d5c138c64c6e173e62e6ecd11b2c3d2e691648f0c842b5.png

                xcatx = utl_dict["signal_name"]
secname = utl_dict["sector_name"]

so_utl.coefs_stackedbarplot(
    name=xcatx,
    ftrs_renamed=cat_label_dict,
    title=f"{secname} sector: annual averages of most important feature coefficients",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/820b61532c697d7d5b5f74d7a8d41036ef49a7aada2a7662fe34d402034a8ba0.png

Signal quality check #

                xcatx = [utl_dict["signal_name"], utl_dict["ret"]]
cidx = utl_dict["cidx"]
secname = utl_dict["sector_name"]

cr_utl = msp.CategoryRelations(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    freq=utl_dict["freq"],
    blacklist=utl_dict["black"],
    lag=1,
    xcat_aggs=["last", "sum"],
    slip=1,
)

cr_utl.reg_scatter(
    title=f"{secname} sector: learning-based signal and subsequent returns",
    labels=False,
    prob_est="map",
    xlab=f"{secname} signal, end-of-month, based on concurrent best model",
    ylab=f"Relative return of {secname.lower()} sector (vol-targeted), next month, %",
    coef_box="upper left",
    size=(12, 8),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/b76276d12cb005de3f46c9a1e21fa74c0851a4facdc6a0ad2c36dc96d5713c31.png

                xcatx = [utl_dict["signal_name"]]
cidx = utl_dict["cidx"]
secname = utl_dict["sector_name"]
signal_name = utl_dict["signal_name"]
pnl_name = utl_dict["pnl_name"]

pnl_utl = msn.NaivePnL(
    df=dfx,
    ret=utl_dict["ret"],
    sigs=xcatx,
    cids=cidx,
    start=default_start_date,
    blacklist = utl_dict["black"],
    bms=["USD_EQXR_NSA"],
)

for xcat in xcatx:
    pnl_utl.make_pnl(
        sig=xcat,
        sig_op="zn_score_pan",
        rebal_freq="monthly",
        neutral="zero",
        rebal_slip=1,
        vol_scale = None,
        thresh=3,
        pnl_name=pnl_name,
    )
pnl_utl.make_long_pnl(
    vol_scale=None, label=f"{secname} always long versus all-sector basket"
)

pnl_utl.plot_pnls(
    pnl_cats=pnl_utl.pnl_names,
    title=f"{secname} sector: naive PnLs of positions versus all-sector basket",
    title_fontsize=14
)

utl_dict["pnls"] = pnl_utl
pnl_utl.evaluate_pnls(pnl_cats=pnl_utl.pnl_names)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/b7fb3fc9fcf0e9f1d4c890881e66b778f996eed257f72e7db1599f4c4f3660d2.png

xcat	Utilities learning-based signal	Utilities always long versus all-sector basket
Return %	11.080234	10.700873
St. Dev. %	39.046812	39.466097
Sharpe Ratio	0.283768	0.271141
Sortino Ratio	0.410529	0.395969
Max 21-Day Draw %	-109.347992	-45.235513
Max 6-Month Draw %	-118.250706	-93.980448
Peak to Trough Draw %	-192.878245	-272.563373
Top 5% Monthly PnL Share	1.638451	1.788333
USD_EQXR_NSA correl	-0.051172	-0.197999
Traded Months	263	263

                pnl_name = utl_dict["pnl_name"]
secname = utl_dict["sector_name"]

pnl_utl.signal_heatmap(
    pnl_name=pnl_name,
    figsize=(12, 3),
    title=f"{secname} sector: signal heatmap",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/27d13fcc45077c4b425568fa004b1eeda7cf4539a1dacc2bcb7a7306579b2d99.png

Real estate #

Factor selection and signal generation #

                sector = "REL"

rel_dict = {
    "sector_name": sector_labels[sector],
    "signal_name": f"{sector}SOL",
    "pnl_name": f"{sector_labels[sector]} learning-based signal",
    "xcatx": macroz,
    "cidx": cids_eq,
    "ret": f"EQC{sector}{default_target_type}",
    "freq": "M",
    "black": sector_blacklist[sector],
    "srr": None,
    "pnls": None,
}

               

                xcatx = rel_dict["xcatx"] + [rel_dict["ret"]]
cidx = rel_dict["cidx"]

so_rel = msl.SignalOptimizer(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    blacklist=rel_dict["black"],
    freq=rel_dict["freq"],
    lag=1,
    xcat_aggs=["last", "sum"],
)

               

                secname = rel_dict["sector_name"]
signal_name = rel_dict["signal_name"]


so_rel.calculate_predictions(
    name=signal_name,
    models=default_models,
    scorers=default_metric,
    hyperparameters=default_hparam_grid,
    inner_splitters = default_splitter,
    test_size=default_test_size,
    min_cids=default_min_cids, 
    min_periods=default_min_periods,
    n_jobs_outer=-1,
    split_functions=default_split_functions,
)
so_rel.models_heatmap(
    signal_name,
    cap=10,
    title=f"{secname} sector: model selection heatmap",
)

# Store signals
dfa = so_rel.get_optimized_signals()
dfx = msm.update_df(dfx, dfa)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/3a9c7a500d59dbc266051c6a43382174e8a4b65c1ca1f606d96d4a86cb0d64b7.png

                xcatx = rel_dict["signal_name"]
secname = rel_dict["sector_name"]

so_rel.feature_selection_heatmap(
    xcatx,
    remove_blanks=True,
    title=f"{secname} sector: feature selection heatmap",
    ftrs_renamed=cat_label_dict,
    figsize=(12, 10),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/c84c18a22ef98c8ffb85e9464c98e0e99080e22b6593af06c9ba1461fe85953e.png

                xcatx = rel_dict["signal_name"]
secname = rel_dict["sector_name"]

so_rel.coefs_stackedbarplot(
    name=xcatx,
    ftrs_renamed=cat_label_dict,
    title=f"{secname} sector: annual averages of most important feature coefficients",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/ed33d95d904d21d0d67422805df4b7550a578d5f07d67226374c2d10e5ed5365.png

Signal quality check #

                xcatx = [rel_dict["signal_name"], rel_dict["ret"]]
cidx = rel_dict["cidx"]
secname = rel_dict["sector_name"]

cr_rel = msp.CategoryRelations(
    df=dfx,
    xcats=xcatx,
    cids=cidx,
    freq=rel_dict["freq"],
    blacklist=rel_dict["black"],
    lag=1,
    xcat_aggs=["last", "sum"],
    slip=1,
)

cr_rel.reg_scatter(
    title=f"{secname} sector: learning-based signal and subsequent returns",
    labels=False,
    prob_est="map",
    xlab=f"{secname} signal, end-of-month, based on concurrent best model",
    ylab=f"Relative return of {secname.lower()} sector (vol-targeted), next month, %",
    coef_box="upper left",
    size=(12, 8),
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/3f7d213515805e25c51cfbe539cb93dedc1c24dd6d085442eb9a176e0ce35f4f.png

                xcatx = [rel_dict["signal_name"]]
cidx = rel_dict["cidx"]
secname = rel_dict["sector_name"]
signal_name = rel_dict["signal_name"]
pnl_name = rel_dict["pnl_name"]

pnl_rel = msn.NaivePnL(
    df=dfx,
    ret=rel_dict["ret"],
    sigs=xcatx,
    cids=cidx,
    start=default_start_date,
    blacklist = rel_dict["black"],
    bms=["USD_EQXR_NSA"],
)

for xcat in xcatx:
    pnl_rel.make_pnl(
        sig=xcat,
        sig_op="zn_score_pan",
        rebal_freq="monthly",
        neutral="zero",
        rebal_slip=1,
        vol_scale = None,
        thresh=3,
        pnl_name=pnl_name,
    )
pnl_rel.make_long_pnl(
    vol_scale=None, label=f"{secname} always long versus all-sector basket"
)

pnl_rel.plot_pnls(
    pnl_cats=pnl_rel.pnl_names, 
    title=f"{secname} sector: naive PnLs of positions versus all-sector basket",
    title_fontsize=14
)

rel_dict["pnls"] = pnl_rel
pnl_rel.evaluate_pnls(pnl_cats=pnl_rel.pnl_names)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/9974311af6743c70198bab8624111bce6e1217e22c0ec9672976bb20654ae9ad.png

xcat	Real estate learning-based signal	Real estate always long versus all-sector basket
Return %	35.146641	26.023118
St. Dev. %	40.79298	42.63276
Sharpe Ratio	0.861586	0.610402
Sortino Ratio	1.298488	0.871942
Max 21-Day Draw %	-78.103799	-107.121602
Max 6-Month Draw %	-138.261811	-123.928869
Peak to Trough Draw %	-210.994384	-228.650391
Top 5% Monthly PnL Share	0.83304	0.896213
USD_EQXR_NSA correl	-0.044871	-0.057001
Traded Months	263	263

                pnl_name = rel_dict["pnl_name"]
secname = rel_dict["sector_name"]

pnl_rel.signal_heatmap(
    pnl_name=pnl_name,
    figsize=(12, 3),
    title=f"{secname} sector: signal heatmap",
)

               

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/a92540f012b1aa2a849faf8bc6f8730423cce606f9c01a360f81683eb3b917cc.png

Summary #

Sector-specific signals and returns #

# Collect the sector-level CategoryRelations objects for a joint scatter panel
sec_catregs = {
    "enr": cr_enr,
    "mat": cr_mat,
    "ind": cr_ind,
    "cod": cr_cod, 
    "cos": cr_cos,
    "hlc": cr_hlc,
    "fin": cr_fin,
    "ite": cr_ite, 
    "csr": cr_csr,
    "utl": cr_utl, 
    "rel": cr_rel,
}


msv.multiple_reg_scatter(
    cat_rels=list(sec_catregs.values()),
    ncol=3,
    nrow=4,
    figsize=(15, 15),
    title="Statistical macro signals and subsequent sectoral equity returns, 11 currency areas, since 2003",
    title_xadj=0.5,
    title_yadj=0.99,
    title_fontsize=20,
    xlab="Sector-specific statistical signal",
    ylab="Sector return versus equal weighted local index (all vol-targeted), next month %",
    coef_box="lower right",
    prob_est="map",
    single_chart=True,
    subplot_titles=[sector_labels[sector.upper()] for sector in sec_catregs.keys()],
)

              

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/a05a73a9ac912f39b23bab1a39c3b827a5c65fe36ca3997140bfc07ddc3b0976.png

Combined cross-sector trading PnL #

# Collect the sector-level NaivePnL objects
sec_pnls = {
    "enr": pnl_enr,
    "mat": pnl_mat,
    "ind": pnl_ind,
    "cod": pnl_cod, 
    "cos": pnl_cos,
    "hlc": pnl_hlc,
    "fin": pnl_fin,
    "ite": pnl_ite, 
    "csr": pnl_csr,
    "utl": pnl_utl, 
    "rel": pnl_rel,
}

# Pool the learning-based signal PnL of every sector in a single MultiPnL object
ma_pnl = msn.MultiPnL()
for sec, pnl in sec_pnls.items():
    ma_pnl.add_pnl(
        pnl, pnl_xcats=[f"{sector_labels[sec.upper()]} learning-based signal"]
    )

              

               ma_pnl.plot_pnls(
    pnl_xcats=[
        f"{sector_labels[sec.upper()]} learning-based signal" for sec in sec_pnls.keys()
    ],
    title="Naive PnLs for relative sector strategies",
    xcat_labels=[sector_labels[sec.upper()] for sec in sec_pnls.keys()],
)

              

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/458dc4787a3bfc27784008ac3b442ed34557bc616601b289dab89eddf2571db4.png

               cpname = "Simple average PnL of relative sector strategies based on statistical learning and macro signals"

macro_sector_pnl = ma_pnl.combine_pnls(
    pnl_xcats=[f"{sector_labels[sec.upper()]} learning-based signal" for sec in sec_pnls.keys()],
    composite_pnl_xcat=cpname,
    weights=None,
)
ma_pnl.plot_pnls(
    [cpname],
    title="Cumulative naive PnL value of cross-sectoral equity allocation",
)

              

https://macrosynergy.com/notebooks.build/data-science/statistical-learning-for-sectoral-equity-allocation/_images/ce8c959d938e6e82eff7fe2af7be2e33bb73720db8f02e4fc8047551c9e2b3f6.png
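
The composite above is built with `weights=None` and labelled a simple average. A minimal sketch of the idea, assuming equal weights across strategies and using random placeholder data, is given below. The column names, the 261 trading days per year and the helper `ann_stats` are illustrative assumptions. The sketch does not replicate how the package aligns start dates or treats missing values, so it is not expected to reproduce the notebook's figures.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.bdate_range("2024-01-01", periods=250)
# Hypothetical daily PnLs (% of risk capital) for three sector strategies
sector_pnls = pd.DataFrame(
    rng.normal(0.03, 1.0, size=(len(dates), 3)),
    index=dates,
    columns=["Energy", "Materials", "Financials"],
)

# Equal weights: the composite daily PnL is the cross-strategy mean
composite = sector_pnls.mean(axis=1)


def ann_stats(pnl: pd.Series, periods_per_year: int = 261) -> pd.Series:
    """Annualized return, standard deviation and their ratio for a daily PnL series."""
    ret = pnl.mean() * periods_per_year
    vol = pnl.std() * np.sqrt(periods_per_year)
    return pd.Series({"Return %": ret, "St. Dev. %": vol, "Sharpe Ratio": ret / vol})


all_pnls = sector_pnls.assign(**{"Simple average": composite})
stats = all_pnls.apply(ann_stats).T
print(stats.round(2))
```

With imperfectly correlated strategies, the composite's standard deviation falls below that of its components, which is the diversification effect visible in the summary statistics.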

# Tabulate performance statistics and shorten the column names to sector labels
tbr = ma_pnl.evaluate_pnls()
tbr = tbr.rename(columns={
    **{
        f"{sector_labels[sec.upper()]} learning-based signal/EQC{sec.upper()}R_VT10vALL": f"{sector_labels[sec.upper()]}" 
        for sec in sec_pnls.keys()
    }, 
    **{
        "Simple average PnL of relative sector strategies based on statistical learning and macro signals": "Simple average"
    }
})

selected_rows = ["Return %", "St. Dev. %", "Sharpe Ratio", "Sortino Ratio", "USD_EQXR_NSA correl"]
selected_columns = ["Simple average"] + [sector_labels[sec.upper()] for sec in sec_pnls.keys()] 
selected_pnl_stats = tbr.loc[selected_rows, selected_columns].T

              

               display(selected_pnl_stats.style.format("{:.2f}"))

              

	Return %	St. Dev. %	Sharpe Ratio	Sortino Ratio	USD_EQXR_NSA correl
Simple average	15.12	15.10	1.00	1.49	nan
Energy	18.19	67.73	0.27	0.39	-0.08
Materials	25.97	47.27	0.55	0.81	-0.02
Industrials	-2.62	28.24	-0.09	-0.13	0.01
Cons. discretionary	15.12	29.85	0.51	0.72	-0.02
Cons. staples	14.22	33.63	0.42	0.62	-0.01
Healthcare	8.70	35.61	0.24	0.35	0.00
Financials	-1.09	43.64	-0.02	-0.04	-0.04
Information tech	6.81	33.92	0.20	0.27	-0.00
Communication services	18.50	29.46	0.63	0.91	0.01
Utilities	11.08	39.05	0.28	0.41	-0.05
Real estate	35.15	40.79	0.86	1.30	-0.04
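
As a plausibility check on the statistics, the reported Sharpe ratios appear to be the plain ratio of annualized return to annualized standard deviation, without any risk-free adjustment. The sketch below recomputes that ratio from the full-precision figures printed in the sector tables above. The dictionary `reported` is simply a transcription of those outputs, and the interpretation of the ratio is an inference from the numbers rather than a statement about the package's internals.

```python
# (annualized return %, annualized st. dev. %, reported Sharpe ratio),
# transcribed from the learning-based signal column of each sector table
reported = {
    "Healthcare": (8.702954, 35.605841, 0.244425),
    "Financials": (-1.090167, 43.639779, -0.024981),
    "Information tech": (6.809801, 33.917082, 0.200778),
    "Communication services": (18.496164, 29.464506, 0.627744),
    "Utilities": (11.080234, 39.046812, 0.283768),
    "Real estate": (35.146641, 40.79298, 0.861586),
}

# Implied ratio: return divided by standard deviation
implied = {name: ret / sd for name, (ret, sd, _) in reported.items()}

for name, (ret, sd, sharpe) in reported.items():
    print(f"{name:<24} implied {implied[name]:.6f}  reported {sharpe:.6f}")
```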

Appendix #

Appendix 1 - Macro quantamental indicators description #

# Convert the category label table to an HTML table with custom inline CSS
html_table = cat_labels.to_html(index=True, table_id="custom_table")

# Inject CSS to align text to the left and reduce font size
css = """
<style>
#custom_table th, #custom_table td {
    text-align: left;
    font-size: 12px; /* Adjust the font size as needed */
}
</style>
"""

              

               # Display the styled HTML table
HTML(css + html_table)

Group	Category	Label	Description	Geography
Business surveys	CBCSCORE_SA_D3M3ML3_WG_ZN	Construction confidence, q/q	Construction business confidence score, seas. adjusted, change q/q	weighted
	CBCSCORE_SA_D3M3ML3_ZN	Construction confidence, q/q	Construction business confidence score, seas. adjusted, change q/q	local
	CBCSCORE_SA_WG_ZN	Construction confidence	Construction business confidence score, seas. adjusted	weighted
	CBCSCORE_SA_ZN	Construction confidence	Construction business confidence score, seas. adjusted	local
	MBCSCORE_SA_D3M3ML3_WG_ZN	Manufacturing confidence, q/q	Manufacturing business confidence score, seas. adj., change q/q	weighted
	MBCSCORE_SA_D3M3ML3_ZN	Manufacturing confidence, q/q	Manufacturing business confidence score, seas. adj., change q/q	local
	MBCSCORE_SA_WG_ZN	Manufacturing confidence	Manufacturing business confidence score, seasonally adjusted	weighted
	MBCSCORE_SA_ZN	Manufacturing confidence	Manufacturing business confidence score, seasonally adjusted	local
	SBCSCORE_SA_D3M3ML3_WG_ZN	Service confidence, q/q	Services business confidence score, seas. adjusted, change q/q	weighted
	SBCSCORE_SA_D3M3ML3_ZN	Service confidence, q/q	Services business confidence score, seas. adjusted, change q/q	local
	SBCSCORE_SA_WG_ZN	Service confidence	Services business confidence score, seasonally adjusted	weighted
	SBCSCORE_SA_ZN	Service confidence	Services business confidence score, seasonally adjusted	local
Commodity inventories	BASEXINVCSCORE_SA_ZN	Excess crude inventory score	Crude oil excess inventory z-score, seasonally adjusted	global
	BMLXINVCSCORE_SA_ZN	Excess metal inventory score	Base metal excess inventory z-score, seasonally adjusted	global
	REFIXINVCSCORE_SA_ZN	Excess refined oil inventory score	Refined oil product excess inventory z-score, seas. adjusted	global
Debt	CORPINTNETGDP_SA_D1Q1QL4_WG_ZN	Corporate debt servicing, %oya	Corporate net debt servicing-to-GDP ratio, seasonally-adjusted, %oya	weighted
	CORPINTNETGDP_SA_D1Q1QL4_ZN	Corporate debt servicing, %oya	Corporate net debt servicing-to-GDP ratio, seasonally-adjusted, %oya	local
	HHINTNETGDP_SA_D1M1ML12_WG_ZN	Households debt servicing, %oya	Households net debt servicing-to-GDP ratio, seasonally-adjusted, %oya	weighted
	HHINTNETGDP_SA_D1M1ML12_ZN	Households debt servicing, %oya	Households net debt servicing-to-GDP ratio, seasonally-adjusted, %oya	local
	XGGDGDPRATIOX10_NSA_ZN	Excess projected gov. debt	Government debt-to-GDP ratio proj. in 10 years, in excess of 100%	local
Exports	XEXPORTS_SA_P1M1ML12_3MMA_ZN	Excess export growth	Exports growth, %oya, 3mma, in excess of 5-year median GDP growth	local
Inflation - broad	XCPIC_SA_P1M1ML12_ZN	Excess core CPI, %oya	Core CPI, %oya, in excess of effective inflation target	local
	XCPIH_SA_P1M1ML12_ZN	Excess headline CPI, %oya	Headline CPI, %oya, in excess of effective inflation target	local
	XPPIH_NSA_P1M1ML12_ZN	Excess PPI, %oya	Producer price inflation, %oya, in excess of eff. inflation target	local
Inflation - specific	XCPIE_SA_P1M1ML12_WG_ZN	Excess energy CPI, %oya	Energy CPI, %oya, in excess of effective inflation target	weighted
	XCPIE_SA_P1M1ML12_ZN	Excess energy CPI, %oya	Energy CPI, %oya, in excess of effective inflation target	local
	XCPIF_SA_P1M1ML12_WG_ZN	Excess food CPI, %oya	Food CPI, %oya, in excess of effective inflation target	weighted
	XCPIF_SA_P1M1ML12_ZN	Excess food CPI, %oya	Food CPI, %oya, in excess of effective inflation target	local
Labour market	UNEMPLRATE_NSA_3MMA_D1M1ML12_WG_ZN	Unemployment rate, diff oya	Unemployment rate, change oya	weighted
	UNEMPLRATE_NSA_3MMA_D1M1ML12_ZN	Unemployment rate, diff oya	Unemployment rate, change oya	local
	UNEMPLRATE_SA_3MMAv5YMA_WG_ZN	Unemployment rate, diff vs 5yma	Unemployment rate, difference vs 5-year moving average	weighted
	UNEMPLRATE_SA_3MMAv5YMA_ZN	Unemployment rate, diff vs 5yma	Unemployment rate, difference vs 5-year moving average	local
	XEMPL_NSA_P1M1ML12_3MMA_WG_ZN	Excess employment growth	Employment growth, %oya, 3mma, in excess of population growth	weighted
	XEMPL_NSA_P1M1ML12_3MMA_ZN	Excess employment growth	Employment growth, %oya, 3mma, in excess of population growth	local
	XRWAGES_NSA_P1M1ML12_ZN	Excess real wage growth	Real wage growth, %oya, in excess of medium-term productivity growth	local
Market metrics	BMLCOCRY_SAVT10_21DMA_ZN	Base metals carry	Nominal carry for base metals basket, seasonally and vol-adjusted, 21 days moving average	global
	COXR_VT10vWTI_21DMA_ZN	Refined vs crude oil returns	Refined oil products vs crude oil vol-targeted return differential, 21 days moving average	global
	RIR_NSA_ZN	Real 1-month rate	Real 1-month interest rate	local
	RSLOPEMIDDLE_NSA_ZN	Real 5y-2y yield	Real IRS yield differentials, 5-years versus 2-years	local
	RYLDIRS02Y_NSA_ZN	Real 2-year yield	Real 2-year IRS yield	local
	RYLDIRS05Y_NSA_ZN	Real 5-year yield	Real 5-year IRS yield	local
Output growth	XCSTR_SA_P1M1ML12_3MMA_WG_ZN	Excess construction growth	Construction output, %oya, 3mma, in excess of 5-y median GDP growth	weighted
	XCSTR_SA_P1M1ML12_3MMA_ZN	Excess construction growth	Construction output, %oya, 3mma, in excess of 5-y median GDP growth	local
	XIP_SA_P1M1ML12_3MMA_WG_ZN	Excess industry growth	Industrial output, %oya, 3mma, in excess of 5-y median GDP growth	weighted
	XIP_SA_P1M1ML12_3MMA_ZN	Excess industry growth	Industrial output, %oya, 3mma, in excess of 5-y median GDP growth	local
	XRGDPTECH_SA_P1M1ML12_3MMA_WG_ZN	Excess GDP growth	Real GDP, %oya, 3mma, using HF data, in excess of 5-y med. GDP growth	weighted
	XRGDPTECH_SA_P1M1ML12_3MMA_ZN	Excess GDP growth	Real GDP, %oya, 3mma, using HF data, in excess of 5-y med. GDP growth	local
Private consumption	CCSCORE_SA_D3M3ML3_WG_ZN	Consumer confidence, q/q	Consumer confidence score, seasonally adjusted, change q/q	weighted
	CCSCORE_SA_D3M3ML3_ZN	Consumer confidence, q/q	Consumer confidence score, seasonally adjusted, change q/q	local
	CCSCORE_SA_WG_ZN	Consumer confidence	Consumer confidence score, seasonally adjusted	weighted
	CCSCORE_SA_ZN	Consumer confidence	Consumer confidence score, seasonally adjusted	local
	XNRSALES_SA_P1M1ML12_3MMA_WG_ZN	Excess retail sales growth	Nominal retail sales, %oya, 3mma, in excess of 5-y median GDP growth	weighted
	XNRSALES_SA_P1M1ML12_3MMA_ZN	Excess retail sales growth	Nominal retail sales, %oya, 3mma, in excess of 5-y median GDP growth	local
	XRPCONS_SA_P1M1ML12_3MMA_WG_ZN	Excess consumption growth	Real private consumption, %oya, 3mma, in excess of 5-y median GDP growth	weighted
	XRPCONS_SA_P1M1ML12_3MMA_ZN	Excess real consum growth	Real private consumption, %oya, 3mma, in excess of 5-y median GDP growth	local
	XRRSALES_SA_P1M1ML12_3MMA_WG_ZN	Excess real retail growth	Real retail sales, %oya, 3mma, in excess of 5-y median GDP growth	weighted
	XRRSALES_SA_P1M1ML12_3MMA_ZN	Excess real retail growth	Real retail sales, %oya, 3mma, in excess of 5-y median GDP growth	local
Private credit	INTLIQGDP_NSA_D1M1ML1_ZN	Intervention liquidity, diff m/m	Intervention liquidity to GDP ratio, change over the last month	local
	INTLIQGDP_NSA_D1M1ML6_ZN	Intervention liquidity, diff 6m	Intervention liquidity to GDP ratio, change over the last 6 months	local
	XPCREDITBN_SJA_P1M1ML12_WG_ZN	Excess credit growth	Private credit, %oya, 3mma, in excess of 5-y median GDP growth	weighted
	XPCREDITBN_SJA_P1M1ML12_ZN	Excess credit growth	Private credit, %oya, 3mma, in excess of 5-y median GDP growth	local
Real appreciation	CMPI_NSA_P1M12ML1_ZN	Import prices, %oya	Commodity-based import price index, %oya	local
	CTOT_NSA_P1M12ML1_ZN	Terms-of-trade, %oya	Commodity-based terms-of-trade, %oya	local
	CXPI_NSA_P1M12ML1_ZN	Export prices, %oya	Commodity-based export price index, %oya	local
	REEROADJ_NSA_P1M12ML1_ZN	Open-adj REER, %oya	Openness-adjusted real effective exchange rate, %oya	local
