Check out JPMaQS #
This introductory notebook serves as a guide to using macro-quantamental datasets from the J.P. Morgan Macrosynergy Quantamental System (JPMaQS). It provides an overview of the dataset's conventions and structure and offers practical examples of data analysis, including slicing, visualization, frequency conversion, and the lagging and differencing of indicators.
The contents of this notebook are organized into the following main sections:
- Package Setup and JPMaQS Data Retrieval: In this section, we handle the installation and importing of the essential Python packages required for our analysis, including the `macrosynergy` package used to download the required dataset from JPMaQS.
- Exploring JPMaQS Time Series Data: This part provides examples of how to create a data panel, extract and analyze specific categories or cross-sections of data, and select specific time intervals for analysis.
- Applying Mathematical Operations: Here, we demonstrate a straightforward example of creating an additional data category through mathematical operations or transformations.
- Time Series Transformation Made Easy: This section focuses on converting time series data from its standard daily frequency to other intervals, such as weekly, monthly, quarterly, or annual. We also explore how to lag and difference variables to prepare them for further analysis.
- Window Operations: The final part of the notebook introduces rolling and expanding window functions, which are essential for calculating moving averages, cumulative sums, and other time-dependent metrics.
Get packages and JPMaQS data #
# Uncomment below if running on Kaggle
"""
%%capture
! pip install macrosynergy --upgrade"""
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
from macrosynergy.download import JPMaQSDownload
import os
import warnings
warnings.simplefilter("ignore")
The JPMaQS indicators we consider are downloaded using the J.P. Morgan DataQuery API interface within the `macrosynergy` package. This is done by specifying ticker strings, formed by appending an indicator category code `<category>` to a currency area code `<cross_section>`. The data expressions queried from DataQuery take the form `DB(JPMAQS,<cross_section>_<category>,<info>)`, where `<info>` is one of the following metrics:
- `value`, giving the latest available values for the indicator,
- `eop_lag`, referring to days elapsed since the end of the observation period,
- `mop_lag`, referring to the number of days elapsed since the mean observation period,
- `grade`, denoting a grade of the observation, giving a metric of real-time information quality.
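For illustration, such expression strings can be built from ticker and metric lists with a simple list comprehension; the two tickers below are just examples from this notebook's universe:

# Illustration: construct DataQuery expression strings from tickers and metrics
example_tickers = ["USD_CPIC_SJA_P6M6ML6AR", "EUR_DU02YXR_NSA"]
example_metrics = ["value", "eop_lag", "mop_lag", "grade"]
expressions = [f"DB(JPMAQS,{t},{m})" for t in example_tickers for m in example_metrics]
print(expressions[0])  # DB(JPMAQS,USD_CPIC_SJA_P6M6ML6AR,value)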
After instantiating the `JPMaQSDownload` class within the `macrosynergy.download` module, one can use the `download(tickers, start_date, metrics)` method to easily download the necessary data, where `tickers` is an array of ticker strings, `start_date` is the first collection date to be considered, and `metrics` is an array comprising the time series information to be downloaded. For more information see here or use the free dataset on Kaggle.
To ensure reproducibility, only samples between January 2000 (inclusive) and May 2023 (exclusive) are considered.
# Lists of cross-sections (countries with IRS markets and appropriate data)
cids_dm = ["AUD", "CAD", "CHF", "EUR", "GBP", "JPY", "NOK", "NZD", "SEK", "USD"]
cids_em = [
"CLP",
"COP",
"CZK",
"HUF",
"IDR",
"ILS",
"INR",
"KRW",
"MXN",
"PLN",
"THB",
"TRY",
"TWD",
"ZAR",
]
cids = cids_dm + cids_em
# Lists of main quantamental and return categories
main = [
"CPIC_SA_P1M1ML12",
"CPIC_SJA_P3M3ML3AR",
"CPIC_SJA_P6M6ML6AR",
"CPIH_SA_P1M1ML12",
"CPIH_SJA_P3M3ML3AR",
"CPIH_SJA_P6M6ML6AR",
"INFTEFF_NSA",
"INTRGDP_NSA_P1M1ML12_3MMA",
"INTRGDPv5Y_NSA_P1M1ML12_3MMA",
"PCREDITGDP_SJA_D1M1ML12",
"RGDP_SA_P1Q1QL4_20QMA",
"RYLDIRS02Y_NSA",
"RYLDIRS05Y_NSA",
"PCREDITBN_SJA_P1M1ML12",
]
rets = [
"DU02YXR_NSA",
"DU05YXR_NSA",
"DU02YXR_VT10",
"DU05YXR_VT10",
"EQXR_NSA",
"EQXR_VT10",
"FXXR_NSA",
"FXXR_VT10",
]
xcats = main + rets
The description of each JPMaQS category is available under the Macro Quantamental Academy, on JPMorgan Markets (password protected), or on Kaggle (just for the tickers used in this notebook). In particular, this notebook draws on Consumer price inflation trends, Inflation targets, Intuitive growth estimates, Domestic credit ratios, Long-term GDP growth, Real interest rates, Private credit expansion, Duration returns, Equity index future returns, and FX forward returns.
# Download series from J.P. Morgan DataQuery by tickers
start_date = "2000-01-01"
end_date = "2023-05-01"
tickers = [cid + "_" + xcat for cid in cids for xcat in xcats]
print(f"Maximum number of tickers is {len(tickers)}")
# Retrieve credentials
client_id: str = os.getenv("DQ_CLIENT_ID")
client_secret: str = os.getenv("DQ_CLIENT_SECRET")
with JPMaQSDownload(client_id=client_id, client_secret=client_secret) as dq:
    df = dq.download(
        tickers=tickers,
        start_date=start_date,
        end_date=end_date,
        suppress_warning=True,
        metrics=["value"],
        show_progress=True,
    )
Maximum number of tickers is 528
Downloading data from JPMaQS.
Timestamp UTC: 2023-09-19 11:11:59
Connection successful!
Number of expressions requested: 528
Requesting data: 100%|██████████| 27/27 [00:08<00:00, 3.30it/s]
Downloading data: 100%|██████████| 27/27 [00:14<00:00, 1.89it/s]
# Uncomment if running on Kaggle
"""for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
df = pd.read_csv('../input/fixed-income-returns-and-macro-trends/JPMaQS_Quantamental_Indicators.csv', index_col=0, parse_dates=['real_date'])"""
# to explore the data, we can use the following simple code:
display(df["xcat"].unique())
display(df["cid"].unique())
df["ticker"] = df["cid"] + "_" + df["xcat"]
array(['CPIC_SA_P1M1ML12', 'CPIC_SJA_P3M3ML3AR', 'CPIC_SJA_P6M6ML6AR',
'CPIH_SA_P1M1ML12', 'CPIH_SJA_P3M3ML3AR', 'CPIH_SJA_P6M6ML6AR',
'FXXR_NSA', 'FXXR_VT10', 'INFTEFF_NSA',
'INTRGDP_NSA_P1M1ML12_3MMA', 'INTRGDPv5Y_NSA_P1M1ML12_3MMA',
'PCREDITBN_SJA_P1M1ML12', 'PCREDITGDP_SJA_D1M1ML12',
'RGDP_SA_P1Q1QL4_20QMA', 'RYLDIRS02Y_NSA', 'RYLDIRS05Y_NSA',
'DU02YXR_NSA', 'DU02YXR_VT10', 'DU05YXR_NSA', 'DU05YXR_VT10',
'EQXR_NSA', 'EQXR_VT10'], dtype=object)
array(['AUD', 'CAD', 'CHF', 'CLP', 'COP', 'CZK', 'EUR', 'GBP', 'HUF',
'IDR', 'ILS', 'INR', 'JPY', 'KRW', 'MXN', 'NOK', 'NZD', 'PLN',
'SEK', 'THB', 'TRY', 'TWD', 'USD', 'ZAR'], dtype=object)
df.head(3)
|   | real_date | cid | xcat | value | ticker |
|---|---|---|---|---|---|
| 0 | 2000-01-03 | AUD | CPIC_SA_P1M1ML12 | 1.244168 | AUD_CPIC_SA_P1M1ML12 |
| 1 | 2000-01-03 | AUD | CPIC_SJA_P3M3ML3AR | 3.006383 | AUD_CPIC_SJA_P3M3ML3AR |
| 2 | 2000-01-03 | AUD | CPIC_SJA_P6M6ML6AR | 1.428580 | AUD_CPIC_SJA_P6M6ML6AR |
JPMaQS time series dataframes #
Make a panel #
Many data analyses, as well as charting with `seaborn` and `matplotlib`, can be done directly on the long dataframe. However, it is also easy to extract one or more categories from the standard long JPMaQS dataframe into a standard time series panel dataframe by using the `.pivot_table()` method, as shown below. This format is helpful for certain mathematical operations on categories and for passing them on to other packages, such as `scikit-learn` (a small sketch follows the chart below).
# Extract the core inflation trend and the inflation target for a single currency (USD)
dfx = df[
    (df["xcat"].isin(["CPIC_SJA_P6M6ML6AR", "INFTEFF_NSA"])) & (df["cid"] == "USD")
]
# Pivot the selected dataframe into a "wide" time series format giving every ticker a column
dfw = dfx.pivot_table(index="real_date", columns="ticker", values="value").replace(
0, np.nan
)
# Check out the panel
sns.set(rc={"figure.figsize": (15, 5)})
dfw.plot(
title="Adjusted latest core consumer price trend: % 6m/6m and Inflation Target"
)
plt.show()

Slice a panel #
Time series panels are easy to slice by column (with strings and regular expressions) and by row (using pandas' intuitive datetime conventions).
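Both can even be combined in a single expression; a small sketch using the USD panel from the previous section:

# Sketch: select columns by regular expression and rows by date string
display(dfw.filter(regex="INFTEFF").loc["2020":].head())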
# Extract two categories and pivot to a time series dataframe (core CPI trend and estimated GDP trend)
xcats_sel = ["CPIC_SJA_P6M6ML6AR", "INTRGDPv5Y_NSA_P1M1ML12_3MMA"]
cids_sel = cids_dm[2:6]
dfx = df.loc[df["xcat"].isin(xcats_sel) & df["cid"].isin(cids_sel), :]
dfw = dfx.pivot_table(index="real_date", columns="ticker", values="value").replace(
0, np.nan
) # pivot by ticker
# Replace technical names with something you understand
nicknames = {
"CPIC_SJA_P6M6ML6AR": "CORE_CPI_TREND",
"INTRGDPv5Y_NSA_P1M1ML12_3MMA": "GDP_TREND",
}
colnames = list(dfw.columns)
for key, value in nicknames.items():
    colnames = [col.replace(key, value) for col in colnames]
dfw.columns = colnames
# Check out double category panel
dfw.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 6086 entries, 2000-01-03 to 2023-05-01
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CHF_CORE_CPI_TREND 5221 non-null float64
1 CHF_GDP_TREND 6086 non-null float64
2 EUR_CORE_CPI_TREND 6069 non-null float64
3 EUR_GDP_TREND 5804 non-null float64
4 GBP_CORE_CPI_TREND 6086 non-null float64
5 GBP_GDP_TREND 6086 non-null float64
6 JPY_CORE_CPI_TREND 4678 non-null float64
7 JPY_GDP_TREND 6086 non-null float64
dtypes: float64(8)
memory usage: 427.9 KB
Extract and analyze all category series for one cross-section (currency) #
display(dfw.aggregate(["mean", "std"]))
|   | CHF_CORE_CPI_TREND | CHF_GDP_TREND | EUR_CORE_CPI_TREND | EUR_GDP_TREND | GBP_CORE_CPI_TREND | GBP_GDP_TREND | JPY_CORE_CPI_TREND | JPY_GDP_TREND |
|---|---|---|---|---|---|---|---|---|
| mean | 0.360978 | -0.238452 | 1.463912 | -0.461945 | 2.003847 | -0.623765 | 0.363984 | -0.004743 |
| std | 0.762042 | 1.662564 | 0.863351 | 2.605391 | 1.273862 | 3.343750 | 1.151326 | 2.283116 |
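As the section title suggests, the statistics can also be restricted to a single cross-section by first filtering the columns on their currency prefix; a brief sketch for CHF:

# Sketch: summary statistics for the CHF columns only
display(dfw.filter(regex="^CHF_").aggregate(["mean", "std"]))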
Extract and analyze all cross-sectional series for one category #
cols = dfw.columns.str.contains(
    "_GDP_TREND$"
)  # "$" anchors the pattern: the preceding string must occur at the end of the column name
dfwx = dfw.loc[:, cols].dropna()
display(dfwx.aggregate(["median", "max", "min"]))
|   | CHF_GDP_TREND | EUR_GDP_TREND | GBP_GDP_TREND | JPY_GDP_TREND |
|---|---|---|---|---|
| median | -0.255295 | 0.089108 | -0.253841 | 0.357414 |
| max | 5.680445 | 10.562771 | 17.154058 | 6.302844 |
| min | -8.404751 | -12.695249 | -22.994717 | -10.268773 |
Select an interval #
display(dfw["2020-3":"2021-3"].agg(["mean", "max", "min"]))
|   | CHF_CORE_CPI_TREND | CHF_GDP_TREND | EUR_CORE_CPI_TREND | EUR_GDP_TREND | GBP_CORE_CPI_TREND | GBP_GDP_TREND | JPY_CORE_CPI_TREND | JPY_GDP_TREND |
|---|---|---|---|---|---|---|---|---|
| mean | -0.052870 | -2.791175 | 0.897769 | -6.480288 | 2.000190 | -10.043104 | -0.104146 | -4.085845 |
| max | 0.451175 | -0.188698 | 1.410560 | -0.897233 | 3.158075 | -0.655196 | 0.834353 | -0.529565 |
| min | -0.824449 | -8.404751 | -0.201780 | -12.695249 | 1.299452 | -22.994717 | -1.140706 | -8.702678 |
Apply mathematical operations #
Time series dataframes with equal column names and overlapping time indexes can be arguments of standard mathematical operations: such panels can simply be added, subtracted, multiplied, or divided element by element.
Important:
Since JPMaQS series are all dated as information states, they can be added, subtracted, multiplied, and so forth, not just among each other but also with market data, to give more elaborate trading signals.
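As a stylized illustration with made-up numbers (not JPMaQS data), aligned wide panels combine element-wise on both the index and the columns:

# Two toy panels sharing index and columns; the values are invented for illustration
idx = pd.bdate_range("2023-01-02", periods=4)
quant = pd.DataFrame({"EUR": [1.0, 1.1, 1.2, 1.3], "USD": [2.0, 2.1, 2.2, 2.3]}, index=idx)
market = pd.DataFrame({"EUR": [0.5, 0.5, 0.6, 0.6], "USD": [1.5, 1.4, 1.4, 1.3]}, index=idx)
signal = quant - market  # element-wise difference, e.g. a quantamental trend versus a market-implied rate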
# Create an Inflation trend dataframe for selected currencies and pivot it to the wide format by currency identifier
dfx_inf = df.loc[
(df["xcat"] == "CPIC_SJA_P6M6ML6AR") & df["cid"].isin(["EUR", "USD", "GBP", "JPY"]),
:,
]
dfx_inf = dfx_inf[
dfx_inf["real_date"] >= pd.to_datetime("2020-01-01")
] # set start date
dfw_inf = dfx_inf.pivot_table(index="real_date", columns="cid", values="value").replace(
0, np.nan
)
# Create GDP trend dataframe for selected currencies and pivot it to a wide format by currency identifier
dfx_gdp = df.loc[
(df["xcat"] == "INTRGDPv5Y_NSA_P1M1ML12_3MMA")
& df["cid"].isin(["EUR", "USD", "GBP", "JPY"]),
:,
]
dfx_gdp = dfx_gdp[
dfx_gdp["real_date"] >= pd.to_datetime("2020-01-01")
] # set start date
dfw_gdp = dfx_gdp.pivot_table(index="real_date", columns="cid", values="value").replace(
0, np.nan
)
# Add Inflation and GDP trend values by currency
dfw_ngdp = dfw_gdp + dfw_inf
# Plot the combined growth and inflation trend panel
sns.set(rc={"figure.figsize": (15, 5)})
dfw_ngdp.plot(title="Sum of estimated growth and core inflation trends")
plt.show()

Convert panels back to long format #
Time series panels can easily be converted back to the long format by using the `.unstack()` method and appended to the original dataframe for subsequent processing.
dfl = dfw_ngdp.unstack().reset_index().rename(columns={0: "value"})
dfl["xcat"] = "NGDP_TREND"
dfl["ticker"] = dfl["cid"] + "_" + dfl["xcat"]
dfd_new = pd.concat([df, dfl], axis=0, ignore_index=True)
dfd_new.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2889584 entries, 0 to 2889583
Data columns (total 5 columns):
# Column Dtype
--- ------ -----
0 real_date datetime64[ns]
1 cid object
2 xcat object
3 value float64
4 ticker object
dtypes: datetime64[ns](1), float64(1), object(3)
memory usage: 110.2+ MB
Easy time series transformations #
Frequency conversion #
The most common frequency conversion of quantamental panels is downsampling, i.e. reducing the frequency of the daily data to weekly, monthly, or quarterly by some aggregation method, such as `mean` or `last`. The most flexible method for this purpose is `.resample()`, which can be combined with `.groupby()` to operate on specific tickers.
cids_sel = ["EUR", "GBP"]
xcats_sel = [
"INTRGDPv5Y_NSA_P1M1ML12_3MMA",
"INTRGDPv10Y_NSA_P1M1ML12_3MMA",
"CPIC_SJA_P6M6ML6AR",
]
dfx = df.loc[df["xcat"].isin(xcats_sel) & df["cid"].isin(cids_sel), :]
dfxm = (
dfx.groupby(["cid", "xcat"])
.resample("M", on="real_date")
.last()["value"]
.reset_index()
)
dfxm.head(3)
|   | cid | xcat | real_date | value |
|---|---|---|---|---|
| 0 | EUR | CPIC_SJA_P6M6ML6AR | 2000-01-31 | 1.093003 |
| 1 | EUR | CPIC_SJA_P6M6ML6AR | 2000-02-29 | 1.042372 |
| 2 | EUR | CPIC_SJA_P6M6ML6AR | 2000-03-31 | 0.975460 |
Lagging #
Like all other time series, quantamental panels can be transformed with standard time series methods, such as lags, differences, percent changes, and so forth. Only the interpretation is different: lagging a quantamental series delays the availability of its information, which allows aligning one date's information state with a subsequent period's return.
# Slice out a relevant dataframe
cids_sel = ["EUR", "USD"]
xcats_sel = ["CPIC_SJA_P6M6ML6AR", "DU02YXR_NSA"]
filt1 = df["cid"].isin(cids_sel) # filter for cross-sections
filt2 = df["xcat"].isin(xcats_sel) # filter for category
filt3 = df["real_date"] >= pd.to_datetime("2010-06-01") # filter for start date
dfx = df[filt1 & filt2 & filt3] # filter out relevant data frame
filt4 = (
dfx["xcat"] == xcats_sel[0]
) # filter for selecting just CPI for subsequent frequency conversion and lagging
filt5 = (
dfx["xcat"] == xcats_sel[1]
) # filter for selecting just IRS returns for subsequent frequency conversion and lagging
# Apply appropriate frequency conversion: last value for CPI inflation (the predictor) and the sum of subsequent returns
dff1 = (
    dfx[filt4]
    .groupby(["cid", "xcat"])
    .resample("Q", on="real_date")
    .last()["value"]
    .reset_index()
)  # quarterly features based on the last value of the quarter
dff2 = (
    dfx[filt5]
    .groupby(["cid", "xcat"])
    .resample("Q", on="real_date")
    .sum(numeric_only=True)["value"]
    .reset_index()
)  # quarterly returns based on the sum of daily returns
# Shift returns forward by one period to prepare for regression analysis
dff2["value"] = dff2.groupby(["xcat", "cid"])[
    "value"
].shift()  # each quarter-end row now carries the previous quarter's return
dff2.head()
|   | cid | xcat | real_date | value |
|---|---|---|---|---|
| 0 | EUR | DU02YXR_NSA | 2010-06-30 | NaN |
| 1 | EUR | DU02YXR_NSA | 2010-09-30 | -0.024612 |
| 2 | EUR | DU02YXR_NSA | 2010-12-31 | -0.146008 |
| 3 | EUR | DU02YXR_NSA | 2011-03-31 | -0.172114 |
| 4 | EUR | DU02YXR_NSA | 2011-06-30 | -1.316241 |
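One possible next step is to merge the quarterly features and shifted returns into a single frame, one row per currency and quarter; a minimal sketch:

# Sketch: align features and returns on currency and quarter-end date
dfreg = pd.merge(
    dff1, dff2, on=["cid", "real_date"], suffixes=("_feat", "_ret")
).dropna()  # columns value_feat and value_ret are ready for regression analysis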
Differencing #
A difference here means the difference in the information state of the market, not in the actual underlying economic concept. It shows how the market has updated its assessment of fundamentals.
# Slice out relevant dataframe
xcats_sel = ["CPIC_SJA_P6M6ML6AR", "INTRGDPv10Y_NSA_P1M1ML12_3MMA"]
cids_sel = ["EUR", "USD"]
dfx = df.loc[df["xcat"].isin(xcats_sel) & df["cid"].isin(cids_sel), :]
dfw = dfx.pivot_table(index="real_date", columns="ticker", values="value").replace(
0, np.nan
)
# Apply name replacement
nicknames = {
"CPIC_SJA_P6M6ML6AR": "CORE_CPI_TREND",
"INTRGDPv10Y_NSA_P1M1ML12_3MMA": "GDP_TREND",
}
colnames = list(dfw.columns)
for key, value in nicknames.items():
    colnames = [col.replace(key, value) for col in colnames]
dfw.columns = colnames
# Apply a simple monthly conversion that takes each month's last business-day value; JPM's convention uses "BM" ("business month end")
dfw = dfw.asfreq("BM")
# Take the first difference
dfw.diff().tail(3)
| real_date | EUR_CORE_CPI_TREND | USD_CORE_CPI_TREND |
|---|---|---|
| 2023-02-28 | 0.404137 | 0.235954 |
| 2023-03-31 | 0.008200 | -0.039917 |
| 2023-04-28 | 0.001464 | -0.062272 |
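The `periods` argument of `.diff()` generalizes this to multi-period changes of the information state, for example over three business month-ends:

# Three-month change in the information state on the business-monthly panel
dfw.diff(3).tail(3)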
Window operations #
Rolling windows #
The most common window operations are based on rolling windows, such as rolling ("moving") averages or standard deviations. Applying rolling averages to quantamental data, which are information states, is equivalent to assuming sticky or sluggish expectations.
# Slice out the panel
dfx_inf = df.loc[(df["xcat"] == "CPIC_SJA_P6M6ML6AR") & df["cid"].isin(cids_em[:5]), :]
dfx_inf = dfx_inf[
dfx_inf["real_date"] >= pd.to_datetime("2015-01-01")
] # set start date
dfw_inf = dfx_inf.pivot_table(index="real_date", columns="cid", values="value").replace(
0, np.nan
)
# Rolling 90 working day averages
sns.set(rc={"figure.figsize": (15, 5)})
dfw_inf90 = dfw_inf.rolling(90).mean()
dfw_inf90.plot(
title="Consistent core inflation trends, %6m/6m, saar, 90-day moving averages of information states"
)
plt.show()

Expanding windows #
Expanding window operations are likewise useful. Rather than applying a fixed lookback window, these operations use a lookback that increases with available history.
# Slice out single category time series
dfx = df.loc[
(
df["xcat"].isin(
["INTRGDPv5Y_NSA_P1M1ML12_3MMA", "INTRGDPv10Y_NSA_P1M1ML12_3MMA"]
)
& df["cid"].isin(["EUR", "USD"])
)
]
dfx = dfx[dfx["real_date"] >= pd.to_datetime("2010-01-01")] # set start date
dfw = dfx.pivot_table(index="real_date", columns="cid", values="value").replace(
0, np.nan
)
dfwm = dfw.resample(
"M"
).mean() # monthly average information states on estimated growth
# Z-score growth values estimated without using future information
dfwm_mean = dfwm.expanding(
    min_periods=24
).mean()  # expanding-window mean, requiring a minimum of 24 months (2 years) of history
dfwm_std = dfwm.expanding(
    min_periods=24
).std()  # expanding-window standard deviation with the same minimum history
sns.set(rc={"figure.figsize": (15, 5)})
dfwm_z = (dfwm - dfwm_mean) / dfwm_std
dfwm_z.plot(title="Time-consistent z-scores of estimated GDP growth")
plt.show()
