Technical GDP growth estimates #
The category group contains real-time GDP growth trends based on vintages of standard econometric (“technical”) estimates. Historic and current models are based on the simplest conventions and are recurrently reconstructed based on learning (view basic principles here ). This process reduces the contamination of model hyperparameters by hindsight, makes backtests more meaningful, and sets out clear rules for the generation of estimates in the future.
This category group is an instance of “technical macro trends”, real-time econometric trend estimates of fundamental developments that complement and qualify market trends for the purpose of algorithmic trading.
Technical real GDP growth trends #
Ticker : RGDPTECH_SA_P1M1ML12_3MMA / RGDPTECH_SA_P1M1ML12
Label : Technical real GDP growth trend: % oya, 3mma / % oya
Definition : Technical real GDP growth trend based on supervised learning: % over a year ago, 3-month moving average / % over a year ago.
Notes :
-
Annual growth rates are estimated for every new release of a set of “credible macroeconomic predictors” using their actual or estimated available data vintages at the time. “Credible macro indicators” are those that were plausibly watched by the market and that had explanatory power for concurrent (unreleased) GDP growth at least at some point during the data release cycle. The presence of predictive power is ascertained by using the “elastic net” learning method.
-
When aligning various predictors for recent observation periods, some values are usually missing. This reflects differences in release schedules and leads to a data structure that is called a “jagged edge”. It is a common inconvenience of economic data watching. The technical macro trend model estimates the missing data points (based on the available data) and then uses the estimated full recent data set to predict GDP trends. See Appendix 1 for further details.
-
The credible macroeconomic predictors are chosen periodically by the learning algorithm from the full set of real activity indicators (“feature candidates”). The feature candidates are those indicators that have been watched by the market according to the data release calendars of major market data services and that could plausibly have provided information prior to the release of the official national accounts. A full list of these indicators for each market is provided in Appendix 2 .
-
Important note: The quantamental series of a 3-month moving average, as presented here, is not the same as the 3-month moving average of a quantamental series. Instead it is a 3-month moving average of the concurrent available vintage. Since the latest month and the previous months may be estimated based on different monthly-frequency data, depending on publication lag, the quantamental 3-month moving averages contain independent information and may look very different from the smoothed monthly quantamental series.
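The distinction can be illustrated with a small sketch. All numbers below are made up, and the names `vintages`, `monthly`, `quantamental_3mma` and `smoothed_monthly` are hypothetical; this is not how JPMaQS stores vintages, only an illustration of why the two averages can differ.

```python
import pandas as pd

# Hypothetical vintages (made-up numbers): for each release date, the last three
# monthly growth estimates (% oya) as known on that date. Later vintages may
# revise the estimates for earlier months.
vintages = {
    "2023-03-31": [2.0, 2.2, 2.4],  # estimates for Jan, Feb, Mar
    "2023-04-30": [2.5, 2.7, 3.0],  # estimates for Feb, Mar, Apr
    "2023-05-31": [3.0, 3.3, 3.6],  # estimates for Mar, Apr, May
}

# Monthly quantamental series: the latest estimate of each vintage
monthly = pd.Series({d: v[-1] for d, v in vintages.items()})

# Quantamental 3mma: average of the last three months WITHIN each vintage
quantamental_3mma = pd.Series({d: sum(v) / len(v) for d, v in vintages.items()})

# Smoothed monthly quantamental series: average ACROSS vintages
smoothed_monthly = monthly.rolling(3).mean()

print(pd.DataFrame({"quantamental_3mma": quantamental_3mma,
                    "smoothed_monthly": smoothed_monthly}))
```

In this toy case the two series differ on the last date because the latest vintage revised the estimates for the earlier months.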
Excess technical real GDP growth trends #
Ticker : RGDPTECHv5Y_SA_P1M1ML12_3MMA / RGDPTECHv10Y_SA_P1M1ML12_3MMA
Label : Excess technical real GDP growth trend, % oya, 3mma: based on 5-year lookback / based on 10-year lookback.
Definition : Latest estimated technical real GDP growth trend based on supervised learning, % over a year ago, 3-month moving average minus a long-term median of that country’s actual GDP growth rate at that time: based on 5 year lookback of the latter / based on 10 year lookback of the latter.
Notes :
-
For a description of the estimation of technical GDP growth, see the notes for RGDPTECH_SA_P1M1ML12_3MMA and RGDPTECH_SA_P1M1ML12.
-
The excess technical GDP growth trend subtracts a real-time 5-year or 10-year GDP growth median. This serves as a simplistic but fairly objective estimate for potential GDP growth at the time. See further the notes for RGDP_SA_P1Q1QL4_40QMM in the category Long-term GDP growth.
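As a rough sketch of the subtraction described in these notes, the snippet below computes a trailing median of quarterly growth and deducts it from a monthly trend. The function name `excess_trend` and the toy data are hypothetical. The sketch also assumes a single data vintage and ignores publication lags, which the actual real-time series does not.

```python
import pandas as pd

def excess_trend(trend: pd.Series, gdp_growth_q: pd.Series, years: int = 5) -> pd.Series:
    """Monthly growth trend minus a trailing median of quarterly GDP growth (% oya)."""
    window = 4 * years  # number of quarters in the lookback
    benchmark = gdp_growth_q.rolling(window, min_periods=window).median()
    # Carry the latest available benchmark forward to each monthly date
    benchmark = benchmark.reindex(trend.index, method="ffill")
    return trend - benchmark

# Toy data (assumption): constant 2% reported growth and a constant 3% trend
months = pd.date_range("2015-01-01", periods=96, freq="MS")
quarters = months[2::3]  # one observation per quarter
gdp_growth_q = pd.Series(2.0, index=quarters)
trend = pd.Series(3.0, index=months)

excess = excess_trend(trend, gdp_growth_q, years=5)
print(excess.dropna().head())
```

The excess series only starts once a full lookback window of quarterly growth rates is available, which is why longer lookbacks shorten the usable history.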
Technical real GDP growth trend changes #
Ticker : RGDPTECH_SA_P1M1ML12_D3M3ML3
Label : Change in technical real GDP trend over the past 3 months
Definition : Real GDP growth, % over a year ago, 3-month average, change over the last three estimable months.
Notes :
-
This metric is measuring estimations of growth changes, not changes in estimations. The changes are always calculated based on the latest available vintage of growth estimations.
-
Technical RGDP growth trend means the latest estimated GDP growth trend based on pre-selected monthly economic indicators related to RGDP growth based on a two-stage supervised learning method. See Appendix 1 below.
-
For further details, see the tickers RGDPTECH_SA_P1M1ML12_3MMA / RGDPTECH_SA_P1M1ML12.
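One reading of this definition is that the change is the average of the latest three estimated months minus the average of the three months before, both taken from the same (latest) vintage. The sketch below illustrates that reading; the function name and the numbers are hypothetical.

```python
import pandas as pd

def change_3m_over_3m(latest_vintage: pd.Series) -> float:
    """Latest 3-month average minus the previous 3-month average, from ONE vintage."""
    last3 = latest_vintage.iloc[-3:].mean()
    prev3 = latest_vintage.iloc[-6:-3].mean()
    return last3 - prev3

# Made-up latest vintage of monthly growth estimates (% oya), oldest first
vintage = pd.Series([1.0, 1.2, 1.4, 1.9, 2.1, 2.3])
change = change_3m_over_3m(vintage)
print(round(change, 2))
```

Because both averages come from the same vintage, the result is an estimate of a growth change rather than a change between two estimates made at different times.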
Imports #
Only the standard Python data science packages and the specialized macrosynergy package are needed.
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import math
import json
import yaml
import macrosynergy.management as msm
import macrosynergy.panel as msp
import macrosynergy.signal as mss
import macrosynergy.pnl as msn
from macrosynergy.download import JPMaQSDownload
from timeit import default_timer as timer
from datetime import timedelta, date, datetime
import warnings
warnings.simplefilter("ignore")
The JPMaQS indicators we consider are downloaded using the J.P. Morgan Dataquery API interface within the macrosynergy package. This is done by specifying ticker strings, formed by appending an indicator category code <category> to a currency area code <cross_section>. These constitute the main part of a full quantamental indicator ticker, taking the form DB(JPMAQS,<cross_section>_<category>,<info>), where <info> denotes the time series of information for the given cross-section and category. The following types of information are available:
-
value giving the latest available values for the indicator
-
eop_lag referring to days elapsed since the end of the observation period
-
mop_lag referring to the number of days elapsed since the mean observation period
-
grading denoting a grade of the observation, giving a metric of real-time information quality.
After instantiating the JPMaQSDownload class within the macrosynergy.download module, one can use the download(tickers,start_date,metrics) method to easily download the necessary data, where tickers is an array of ticker strings, start_date is the first collection date to be considered and metrics is an array comprising the time series information to be downloaded.
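To make the naming convention concrete, the following sketch builds full expressions of the form described above from cross-sections, categories and metrics. The helper `jpmaqs_expressions` is hypothetical and only mirrors the stated string pattern; the macrosynergy package handles this step internally.

```python
def jpmaqs_expressions(cids, xcats, metrics):
    """Build DB(JPMAQS,<cross_section>_<category>,<info>) strings (illustrative helper)."""
    tickers = [f"{cid}_{xcat}" for cid in cids for xcat in xcats]
    return [f"DB(JPMAQS,{ticker},{metric})" for ticker in tickers for metric in metrics]

exprs = jpmaqs_expressions(
    cids=["USD", "EUR"],
    xcats=["RGDPTECH_SA_P1M1ML12"],
    metrics=["value", "grading"],
)
for expr in exprs:
    print(expr)
```

The number of expressions is the number of tickers times the number of metrics, which is why a request for 512 tickers with four metrics results in 2048 expressions.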
cids_dm = ["USD", "GBP", "EUR", "JPY", "AUD", "CAD", "CHF", "NZD", "SEK", "NOK"]
cids_em = [
"CNY",
"KRW",
"SGD",
"MXN",
"INR",
"RUB",
"ZAR",
"TRY",
"BRL",
"TWD",
"PLN",
"THB",
"IDR",
"HUF",
"CZK",
"ILS",
"CLP",
"PHP",
"COP",
"MYR",
"RON",
"PEN",
]
cids = cids_dm + cids_em
main = [
"RGDPTECH_SA_P1M1ML12",
"RGDPTECH_SA_P1M1ML12_3MMA",
"RGDPTECH_SA_P1M1ML12_D3M3ML3",
"RGDPTECHv10Y_SA_P1M1ML12_3MMA",
"RGDPTECHv5Y_SA_P1M1ML12_3MMA",
]
econ = [
"RIR_NSA",
"FXCRR_NSA",
"INTRGDP_NSA_P1M1ML12_3MMA",
"FXTARGETED_NSA",
"FXUNTRADABLE_NSA",
] # economic context
mark = [
"EQXR_NSA",
"EQXR_VT10",
"DU05YXR_NSA",
"DU05YXR_VT10",
"FXXR_NSA",
"FXXR_VT10",
] # market links
xcats = main + econ + mark
# Download series from J.P. Morgan DataQuery by tickers
start_date = "1990-01-01"
tickers = [cid + "_" + xcat for cid in cids for xcat in xcats]
print(f"Maximum number of tickers is {len(tickers)}")
# Retrieve credentials
client_id: str = os.getenv("DQ_CLIENT_ID")
client_secret: str = os.getenv("DQ_CLIENT_SECRET")
# Download from DataQuery
with JPMaQSDownload(client_id=client_id, client_secret=client_secret) as downloader:
    start = timer()
    assert downloader.check_connection()
    df = downloader.download(
        tickers=tickers,
        start_date=start_date,
        show_progress=True,
        metrics=["value", "eop_lag", "mop_lag", "grading"],
        suppress_warning=True,
    )
    end = timer()
dfd = df
print("Download time from DQ: " + str(timedelta(seconds=end - start)))
Maximum number of tickers is 512
Downloading data from JPMaQS.
Timestamp UTC: 2023-07-14 16:17:12
Connection successful!
Number of expressions requested: 2048
Requesting data: 100%|███████████████████████████████████████████████████████████████| 103/103 [00:32<00:00, 3.13it/s]
Downloading data: 100%|██████████████████████████████████████████████████████████████| 103/103 [00:59<00:00, 1.72it/s]
Download time from DQ: 0:02:09.746329
Availability #
cids_exp = cids_dm + cids_em # cids expected in category panels
msm.missing_in_df(dfd, xcats=main, cids=cids_exp)
Missing xcats across df: set()
Missing cids for RGDPTECH_SA_P1M1ML12: set()
Missing cids for RGDPTECH_SA_P1M1ML12_3MMA: set()
Missing cids for RGDPTECH_SA_P1M1ML12_D3M3ML3: set()
Missing cids for RGDPTECHv10Y_SA_P1M1ML12_3MMA: set()
Missing cids for RGDPTECHv5Y_SA_P1M1ML12_3MMA: set()
In developed markets and a portion of the emerging world, real-time measures of technical real GDP trends are available from the early 2000s. However, for some emerging countries these trends could only be estimated from the late 2000s due to a lack of meaningful high-frequency growth indicators. Excess growth versus 10-year medians has a significantly shorter history than outright growth trend estimates for some EM countries.
xcatx = main
cidx = cids_exp
dfx = msm.reduce_df(dfd, xcats=xcatx, cids=cidx)
dfs = msm.check_startyears(
dfx,
)
msm.visual_paneldates(dfs, size=(20, 5))
print("Last updated:", date.today())
Last updated: 2023-07-14
plot = msm.check_availability(
dfd, xcats=main, cids=cids_exp, start_size=(20, 2), start_years=False
)
Average grading is high across all indicators and cross-sections. Brazil, the Czech Republic and India are the only currency areas to attain perfect grades.
plot = msp.heatmap_grades(
dfd,
xcats=main,
cids=cids_exp,
size=(19, 2),
title=f"Average vintage grades from {start_date} onwards",
)
xcatx = main
cidx = cids_exp
msp.view_ranges(
dfd,
xcats=xcatx,
cids=cidx,
val="eop_lag",
title="End of observation period lags (ranges of time elapsed since end of observation period in days)",
start="2000-01-01",
kind="box",
size=(16, 4),
)
msp.view_ranges(
dfd,
xcats=xcatx,
cids=cidx,
val="mop_lag",
title="Median of observation period lags (ranges of time elapsed since middle of observation period in days)",
start="2000-01-01",
kind="box",
size=(16, 4),
)
History #
Technical real GDP growth trends #
There have been sizeable differences in the variations in technical real GDP trends. These are not only related to different economic structures, but also, and probably more so, to the availability of high-frequency predictors of real activity. More credible predictors also mean more sources of variation in estimates. Conversely, economies with few meaningful higher-frequency growth indicators experience fewer fluctuations in estimated growth. A prime example of the latter is China.
xcatx = ["RGDPTECH_SA_P1M1ML12"]
cidx = cids_exp
msp.view_ranges(
dfd,
xcats=xcatx,
cids=cidx,
sort_cids_by="mean",
start="2000-01-01",
kind="bar",
title="Means and standard deviations of annual technical real GDP trends, since 2000",
size=(16, 8),
)
xcatx = ["RGDPTECH_SA_P1M1ML12", "RGDPTECH_SA_P1M1ML12_3MMA"]
cidx = cids_exp
msp.view_timelines(
dfd,
xcats=xcatx,
cids=cidx,
start="2000-01-01",
title="Technical real GDP growth trends",
title_fontsize=27,
legend_fontsize=17,
title_adj=1.02,
title_xadj=0.45,
label_adj=0.075,
xcat_labels=["%oya, monthly", "3-month averages"],
ncol=4,
same_y=False,
aspect=1.7,
all_xticks=False,
)
Correlations of estimated growth trends have been almost uniformly positive across developed economies. Australia is the notable exception.
xcatx = "RGDPTECH_SA_P1M1ML12_3MMA"
cidx = cids_exp
msp.correl_matrix(
dfd,
xcats=xcatx,
cids=cidx,
title="Cross-sectional correlations of technical GDP growth trends, 3-month moving averages, across countries",
size=(20, 14),
)
Technical real growth trends have broadly tracked similar cyclical patterns as intuitive real growth trends (a separate quantamental growth trend indicator that mimics the more practical chart-based analysis of growth developments that is often employed by market economists). Neither of these two categories has had a clear and obvious information advantage over the other.
Yet there have been notable differences between the two concepts on various occasions. In Australia, the technical indicator was much less variable in the 2010s. In Switzerland and the Eurozone, the technical indicator suggested a less pronounced COVID recession. In New Zealand, the technical indicator showed more pronounced cyclical dynamics in the 2000s.
xcatx = ["RGDPTECH_SA_P1M1ML12_3MMA", "INTRGDP_NSA_P1M1ML12_3MMA"]
cidx = cids_exp
msp.view_timelines(
dfd,
xcats=xcatx,
cids=cidx,
start="2000-01-01",
title="Different types of real-time GDP growth trend estimations, %oya, 3mma",
title_fontsize=27,
legend_fontsize=17,
title_adj=1.03,
title_xadj=0.46,
label_adj=0.075,
xcat_labels=["Technical trend", "Intuitive trend"],
ncol=4,
same_y=False,
aspect=1.7,
all_xticks=False,
)
Technical excess GDP growth trends #
Technical excess growth trends have mostly yielded negative values since the beginning of the millennium, as trend growth has slowed in most countries. Compared to the regular technical growth estimates, the excess metrics mostly represent a level shift that reduces the values mainly for countries with high past growth.
xcatx = ["RGDPTECH_SA_P1M1ML12_3MMA", "RGDPTECHv5Y_SA_P1M1ML12_3MMA"]
cidx = cids_exp
msp.view_ranges(
dfd,
xcats=xcatx,
cids=cidx,
sort_cids_by="mean",
start="2000-01-01",
kind="bar",
title="Means and standard deviations of annual technical real and excess GDP growth trends, since 2000",
xcat_labels=["Real, %oya, 3mma", "Vs 5-year lookback, %oya, 3mma"],
size=(16, 8),
)
xcatx = ["RGDPTECH_SA_P1M1ML12_3MMA", "RGDPTECHv5Y_SA_P1M1ML12_3MMA"]
cidx = cids_exp
msp.view_timelines(
dfd,
xcats=xcatx,
cids=cidx,
start="2000-01-01",
title="Technical GDP growth trends and excess trends (based on 5-year trailing median)",
title_adj=1.03,
title_xadj=0.43,
title_fontsize=27,
legend_fontsize=17,
label_adj=0.075,
xcat_labels=["Real, %oya, 3mma", "Excess, %oya, 3mma"],
ncol=4,
same_y=False,
aspect=1.7,
all_xticks=False,
)
Using 10-year rather than 5-year historical benchmarks typically reduces excess growth rates further. The 10-year adjustment comes at the expense of a loss in available meaningful history.
xcatx = ["RGDPTECHv5Y_SA_P1M1ML12_3MMA", "RGDPTECHv10Y_SA_P1M1ML12_3MMA"]
cidx = cids_exp
msp.view_ranges(
dfd,
xcats=xcatx,
cids=cidx,
sort_cids_by="mean",
start="2000-01-01",
title="Means and standard deviations of excess GDP growth trends",
xcat_labels=["5-year lookback", "10-year lookback"],
kind="bar",
size=(16, 8),
)
xcatx = ["RGDPTECHv5Y_SA_P1M1ML12_3MMA", "RGDPTECHv10Y_SA_P1M1ML12_3MMA"]
cidx = cids_exp
msp.view_timelines(
dfd,
xcats=xcatx,
cids=cidx,
start="2000-01-01",
title="Technical GDP excess growth trends, %oya, 3mma, (based on 5-year and 10-year trailing medians)",
title_adj=1.02,
title_xadj=0.43,
title_fontsize=27,
legend_fontsize=17,
xcat_labels=["5-year lookback", "10-year lookback"],
ncol=4,
same_y=False,
label_adj=0.075,
aspect=1.7,
all_xticks=True,
)
Technical real GDP growth trend changes #
Technical real growth changes have displayed vastly different ranges across countries, accentuated by the COVID pandemic downturn and recovery. As with trend variation, these differences partly reflect the availability of timely and relevant higher-frequency activity indicators. The more good-quality indicators are available, the more often econometric models would change their output.
xcatx = ["RGDPTECH_SA_P1M1ML12_D3M3ML3"]
cidx = cids_exp
msp.view_ranges(
dfd,
xcats=xcatx,
cids=cidx,
sort_cids_by="std",
start="2000-01-01",
title="Means and standard deviations of 3-month changes in annual technical real GDP growth trends",
kind="box",
size=(16, 8),
)
xcatx = ["RGDPTECH_SA_P1M1ML12_D3M3ML3"]
cidx = cids_exp
msp.view_timelines(
dfd,
xcats=xcatx,
cids=cidx,
start="2000-01-01",
title="Technical real GDP growth trend changes",
title_fontsize=27,
title_adj=1.02,
title_xadj=0.515,
legend_fontsize=17,
ncol=4,
same_y=False,
aspect=1.7,
all_xticks=False,
)
Like trends, technical real growth trend changes have been positively correlated across most available cross-sections.
xcatx = "RGDPTECH_SA_P1M1ML12_D3M3ML3"
cidx = cids_exp
msp.correl_matrix(
dfd,
xcats=xcatx,
cids=cidx,
title="Cross-sectional correlations of technical real GDP growth trend changes",
size=(20, 14),
)
Importance #
Research links #
“Global economic growth at the end of the year strongly predicts returns from a wide spectrum of international assets, such as global, regional, and individual-country stocks, FX, and commodities…Low growth in the global economy at the end of the year predicts higher returns over the following year. It also predicts the global business cycle. When global economic growth at the end of the year is low, investors expect a worsening of the global business cycle and increase their required returns.” Møller & Rangvid
“The relationship between GDP growth and financial markets needs to be seen in the context of the broader economic cycle…While there may be no consistent long term positive correlation between equity returns and actual GDP growth, a significant relationship has been found between equity returns and expected GDP growth.” May & Wade
“Economic growth differentials are plausible predictors of foreign exchange return trends because they are related to differences in monetary policy and return on investment. Suitable metrics for testing growth differentials as trading signals must replicate historic information states. For simple growth differentials, the statistical probability of positive correlation with subsequent returns has been near 100% with a quite stable relationship across time. Excess growth trends, relative to potential growth proxies, would have been more appropriate predictors for non-directional (hedged) FX forward returns.” Macrosynergy
Empirical clues #
As proposed by most theories, there has been a positive long-term relationship between observed technical GDP trends and real short-term interest rates in developed markets. While real rates are affected by many other factors, information efficiency in tracking growth almost certainly helps predictions. Real growth trends can also serve as a “gravity centre” for real interest rates in the longer term.
cr = msp.CategoryRelations(
dfd,
xcats=["RGDPTECH_SA_P1M1ML12", "RIR_NSA"],
cids=cids_dm,
freq="A",
lag=0,
xcat_aggs=["mean", "mean"],
start="2000-01-01",
)
cr.reg_scatter(
title="Technical GDP trends and real interest rates in developed markets, annual averages since 2000",
labels=True,
coef_box="upper left",
xlab="Technical GDP growth trend, annual",
ylab="Main real short-term interest rate",
)
By contrast, the relation between GDP trends and real interest rates in the emerging world has been rather negative. This is plausibly due to “reverse causality”: countries or periods that impose high real policy rates or high implicit risk premia for local bonds and deposits provide less favorable growth conditions.
cr = msp.CategoryRelations(
dfd,
xcats=["RGDPTECH_SA_P1M1ML12", "RIR_NSA"],
cids=cids_em,
years=3,
lag=0,
xcat_aggs=["mean", "mean"],
start="2000-01-01",
)
cr.reg_scatter(
title="Technical GDP trends and real interest rates in EM countries, 3-year averages since 2000",
labels=True,
coef_box="lower left",
xlab="Technical GDP growth trend 3MMA",
ylab="Main real short-term interest rate",
)
In the developed world, technical GDP trends have been significant predictors of subsequent monthly 10% vol-targeted equity returns.
cr = msp.CategoryRelations(
dfd,
xcats=["RGDPTECH_SA_P1M1ML12", "EQXR_VT10"],
cids=cids_dm,
freq="M",
lag=1,
xcat_aggs=["last", "sum"],
fwin=1,
start="2002-01-01",
)
cr.reg_scatter(
title="Technical real GDP trends and subsequent equity returns, 10% vol-target, since 2002",
labels=False,
coef_box="upper right",
xlab="Observed technical GDP trend, %oya",
ylab="Equity index future return, next month",
)
EQXR_VT10 misses: ['NOK', 'NZD'].
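As a conceptual illustration of such a lagged monthly relation for a single cross-section, the sketch below pairs the last signal value of each month with the summed return of the following month. This is an assumption about what the settings `freq="M"`, `lag=1` and `xcat_aggs=["last", "sum"]` amount to, not the package's internal code; the function name and the data are hypothetical.

```python
import numpy as np
import pandas as pd

def lagged_monthly_pairs(signal: pd.Series, returns: pd.Series) -> pd.DataFrame:
    """Pair each month's summed return with the previous month's last signal value."""
    sig_m = signal.groupby(signal.index.to_period("M")).last()
    ret_m = returns.groupby(returns.index.to_period("M")).sum()
    pairs = pd.DataFrame({"signal_lagged": sig_m.shift(1), "return": ret_m})
    return pairs.dropna()  # the first month has no lagged signal

# Toy daily data (assumption): the signal equals the month number, returns are 0.1 per day
dates = pd.bdate_range("2023-01-02", "2023-03-31")
signal = pd.Series(np.asarray(dates.month, dtype=float), index=dates)
returns = pd.Series(0.1, index=dates)

pairs = lagged_monthly_pairs(signal, returns)
print(pairs)
```

Lagging the signal by one period is what makes the relation predictive rather than concurrent: each return is only matched with information that was available before the return period started.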
Technical excess growth in the developed world has been negatively correlated with subsequent duration returns (measured by 5-year IRS fixed receiver positions), consistent with the intuition of a Taylor rule for monetary policy and some market inefficiency in pricing changes in excess growth trends consistently.
cr = msp.CategoryRelations(
dfd,
xcats=["RGDPTECHv5Y_SA_P1M1ML12_3MMA", "DU05YXR_NSA"],
cids=cids_dm,
freq="M",
lag=1,
xcat_aggs=["last", "sum"],
fwin=1,
start="2002-01-01",
xcat_trims=[20, 200],
)
cr.reg_scatter(
title="Technical GDP excess growth trend changes and subsequent IRS returns in developed countries since 2002",
labels=False,
coef_box="upper right",
xlab="Excess technical real GDP growth trend, % oya, 3mma, based on 5 year lookback",
ylab="Next month 5-year IRS returns, % mr",
)
Appendices #
Appendix 1: How technical macro trends are estimated #
The basics in a nutshell #
Unlike market trends, trends in the macro quantamental space refer to fundamental economic changes as observed by financial markets. Since broad trends in many areas are reported with long lags, their timely estimation requires some econometric modeling. And since these models plausibly change over time, their construction for the purpose of backtesting and future trading requires some automation, by combining econometrics with machine learning. A practical approach to such technical macro trends is “two-stage supervised learning”. Its first stage is scouting features (predictors), by applying a regularization algorithm, such as “elastic net”, to available data sets during the regular release cycle, which identifies competitive features based on timeliness and predictive power. Sequential scouting gives feature vintages. The second stage evaluates various candidate models based on the concurrent feature vintages and selects at any point in time the one with the best historic predictive power. Sequential evaluation gives data vintages. For a more thorough explanation of this principle, view the summary here .
Conventional econometric models, such as those used by central banks, are immutable and not a valid basis for backtesting trading strategies. That is because they are being built and redesigned with hindsight and do not aim at replicating perceived economic trends in real time. Even if estimated parameters are sequentially updated, hyperparameters (structural model decisions) are not. By contrast, technical macro trends evolve in accordance with three types of changes arising from the information flow: (1) new information on fundamental developments (as captured by static prediction models), (2) revisions of prediction model parameters (implemented by some nowcasting frameworks), and (3) changes in prediction model structure (implemented by supervised statistical learning). In practice, this requires dealing with three major challenges: the feature selection problem, the publication lag and frequency problem, and the prediction problem.
The feature selection problem #
Many broad macro indicators, such as national accounts and corporate earnings, are published at low frequency and with considerable lags after the observed period. That is why markets predict their forthcoming releases based on more timely higher-frequency indicators. In the context of machine learning, unpublished key indicators become targets and the higher-frequency indicators become features (aka predictors) or at least feature candidates.
The choice of features is a key structural model decision. The challenge is to allow the algorithm to support that choice in a consistent fashion for the past, present and future. A feature should only be added at any given real-time date if it had helped predict the target prior to that date. JPMaQS makes this choice by applying the elastic net regularization method to all plausible features used by the market at a given point in the data cycle. All plausible features refers to all indicators that are rated as important by market news services (release calendars) and that are logically and directly related to the target. For example, the plausible features for the annual GDP growth rate are mostly annual percent changes or differences in activity indices. Asset returns or analyst predictions do not qualify. Point in the data cycle means a point in time between two first releases of the low-frequency target, which is related to the available subset of predictors.
During the length of the cycle, JPMaQS considers the growing set of features as forecasting candidates, and if a feature has value as a predictor on even just one day in the cycle, according to the elastic net, it is pre-selected as a feature of the model going forward. This pre-selection is revised annually.
The lag and frequency problem #
Economic data that informs on activity of an observed period (typically a month) are released with varying delays. On any given day, the market sees only a smaller or larger subset of all interesting indicators for a recent period. This is equivalent to what econometricians call the “jagged edge” of recent data sets, the appearance of various “non-available” entries in the data matrix used for low-frequency indicator prediction.
These non-available entries will be filled by predictions based on all available high-frequency values. For example, if a feature is available for January but not February and March, its values for February and March will be equal to the January value plus a prediction of change based on its correlation with all other features that have already been released for February and March respectively. For further information, see Appendix 2 below.
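A minimal sketch of this fill-in idea follows. It assumes that monthly changes in the features are jointly normal with zero mean and a covariance matrix `sigma`, so that the expected change of an unreleased feature, given the observed changes of released features, follows from the standard conditional expectation of a multivariate normal. The function name `fill_jagged_row` and the numbers are hypothetical, and the sketch is not the production JPMaQS code.

```python
import numpy as np

def fill_jagged_row(prev_row, last_row, sigma):
    """Fill NaNs in the latest row using the previous row and observed changes.

    Assumes changes (last_row - prev_row) are jointly normal with mean zero and
    covariance sigma, so E[d_miss | d_obs] = S_mo @ inv(S_oo) @ d_obs.
    """
    prev_row = np.asarray(prev_row, dtype=float)
    last_row = np.asarray(last_row, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    miss = np.isnan(last_row)
    obs = ~miss
    filled = last_row.copy()
    if miss.any() and obs.any():
        d_obs = last_row[obs] - prev_row[obs]  # observed "surprises"
        s_mo = sigma[np.ix_(miss, obs)]
        s_oo = sigma[np.ix_(obs, obs)]
        filled[miss] = prev_row[miss] + s_mo @ np.linalg.solve(s_oo, d_obs)
    elif miss.any():
        # Nothing released yet for this period: carry the previous values forward
        filled[miss] = prev_row[miss]
    return filled

# Toy example: feature 1 is released for the latest month, feature 2 is not
sigma = np.array([[1.0, 0.5], [0.5, 1.0]])  # assumed covariance of monthly changes
january = [2.0, 3.0]
february = [3.0, np.nan]
filled = fill_jagged_row(january, february, sigma)
print(filled)
```

In practice `sigma` would have to be estimated, for instance from historical first differences of the feature matrix, and rows with several consecutive missing months would be filled one period at a time.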
The prediction problem #
If one has chosen the right features and “filled in” their missing values, one needs to formalize an appropriate and robust relation between the features and their target. Estimating such a relation may not be very stable if there are many features and short time series. Hence, JPMaQS predicts the low-frequency target value by using the values of the first principal component of the feature set for all high-frequency periods in question. The prediction itself is then done with ordinary least squares regression. Linear regression on principal components is a very simple prediction method that has been popular in market economics for decades. For more information, see Appendix 3 .
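The combination of a first principal component and an ordinary least squares regression can be sketched with plain NumPy as below. The function name `pc1_ols_forecast`, the standardization choice and the simulated data are assumptions for illustration and may differ from the actual JPMaQS implementation.

```python
import numpy as np

def pc1_ols_forecast(X_hist, y_hist, x_new):
    """Regress the target on the first principal component of standardized features."""
    mu = X_hist.mean(axis=0)
    sd = X_hist.std(axis=0)
    Z = (X_hist - mu) / sd
    # Loadings of the first principal component: leading right singular vector
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    w = vt[0]
    pc1 = Z @ w
    # OLS of the target on an intercept and the first principal component
    A = np.column_stack([np.ones(len(pc1)), pc1])
    beta, *_ = np.linalg.lstsq(A, y_hist, rcond=None)
    z_new = (np.asarray(x_new, dtype=float) - mu) / sd
    return beta[0] + beta[1] * (z_new @ w)

# Simulated data (assumption): three features driven by one common factor
rng = np.random.default_rng(0)
factor = np.linspace(-1.0, 1.0, 40)
loadings = np.array([1.0, 0.5, -2.0])
X_hist = factor[:, None] * loadings + 0.01 * rng.normal(size=(40, 3))
y_hist = 1.0 + 2.0 * factor

prediction = pc1_ols_forecast(X_hist, y_hist, x_new=0.5 * loadings)
print(round(float(prediction), 2))
```

Compressing the features into a single component keeps the regression to two parameters, which is what makes the approach usable with many features and short samples.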
Important technical details #
Some of the most financially significant economic indicators suffer from the low frequency with which they are published. Fortunately, higher-frequency indicators often exist that indirectly inform on forthcoming low-frequency tickers. For example, the mammoth task of calculating the Real Gross Domestic Product (RGDP) inevitably renders it a quarterly observed macro-variable. Incremental changes in RGDP can nevertheless be estimated between releases by looking at innovations in monthly macro variables such as the unemployment rate, industrial production, and consumer confidence surveys.
Technical indicators were thus born out of a need for a universal machine which maps high-frequency macro-variables into an effective present estimate of lower-frequency variables of interest, without losing economic plausibility and intuition in the process. In designing such a system, we have addressed three key methodological issues:
(1) Pre-selection of features : How can we make an economically plausible preliminary assessment of which high-frequency variables to include, void of forward looking bias?
(2) Dealing with release schedule inequality : Upon lining up high-frequency series for prediction purposes, one may find some of the recent values to be missing. These NaNs are spawned by series which have yet to be released in the latest available observation period. How does one go about handling this so-called jagged edge? (See figure below).
(3) Estimating the low-frequency series : Finally, a map between input features (high-frequency variables) and response (low-frequency variable) needs to be learned. How do we formalize such a relationship?
The technical details of these three points are presented below.
We will assume that across our features and response variables, we are dealing with growth rates (or difference in levels) over the same window of time (typically, annually). For example, for RGDP (an Index) we look at the percentage change over a year ago, whereas for the unemployment rate we’d be looking at the difference over a year ago. Applying these transformations will in the majority of cases guarantee that our series are stationary. To curb the effect of outliers in the data, it may be desirable to further introduce some level of winsorization , e.g. by applying a log transformation beyond a certain threshold.
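The transformations mentioned above can be sketched as follows. The two growth transformations follow directly from the text, while the particular form of the log-based winsorization (`soft_winsorize`) is an assumption, since the exact formula is not given here. All function names and data are hypothetical.

```python
import numpy as np
import pandas as pd

def pct_over_year_ago(index_level: pd.Series) -> pd.Series:
    """% change over a year ago for a monthly index (e.g. an activity index)."""
    return 100 * (index_level / index_level.shift(12) - 1)

def diff_over_year_ago(rate: pd.Series) -> pd.Series:
    """Difference over a year ago for a monthly rate (e.g. the unemployment rate)."""
    return rate - rate.shift(12)

def soft_winsorize(x: pd.Series, threshold: float) -> pd.Series:
    """Leave |x| <= threshold unchanged and log-compress the part beyond it."""
    excess = (x.abs() - threshold).clip(lower=0)
    return np.sign(x) * (np.minimum(x.abs(), threshold) + np.log1p(excess))

months = pd.date_range("2020-01-01", periods=24, freq="MS")
level = pd.Series(100 * 1.02 ** (np.arange(24) / 12), index=months)  # grows 2% a year
unemployment = pd.Series(5.0 + 0.05 * np.arange(24), index=months)   # rises 0.6pp a year

growth = pct_over_year_ago(level)
unemp_change = diff_over_year_ago(unemployment)
trimmed = soft_winsorize(pd.Series([1.0, 10.0, -10.0]), threshold=5.0)
print(growth.iloc[-1], unemp_change.iloc[-1], trimmed.tolist())
```

The soft form keeps the ordering of extreme observations while reducing their leverage in subsequent regressions.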
Pre-selection of Features #
Deciding which features to use in our model is a delicate matter. Should we invoke considerable domain expertise by having an economist pre-select our feature space, or should we throw everything but the kitchen sink at our problem? Neither approach is strictly speaking desirable, being respectively prone to hindsight bias and numerical noise / short fat data issues. Rather, we desire a systematic feature reduction technique (i.e. elastic net ) which can help prune our feature space. To mimic the data release schedule, we will make a running assessment of which features economists would plausibly consider adding to their model between quarterly releases. As information is released between quarters, we shall consider this growing set of macro-indicators as forecasting candidates. If a feature at any point during this quarterly life-cycle exhibits explanatory power, we consider it as pre-selected for the purpose of steps (2) and (3).
More formally, let \(q_0, q_1, q_2, ..., q_n\) be a series of quarterly releases of the response variable, and let \(\{q_i, q_{i+1}\}\) be any two contiguous releases. Let \(t[i], t[i+1]\) be the times at which they are released and let \(t \in [t[i], t[i+1]) \equiv \mathbb{T}_{i+1}\) be some time at which we desire an estimate of the quarterly release. Furthermore, let \(\mathbb{F}_{i+1}=\{f_1,f_2,...,f_p\}\) be the set of macro-predictors (features) released in the interval \(\mathbb{T}_{i+1}\) . Typically these predictors have monthly observations, but importantly they are not all released at the same time.
Now suppose we are in the business of forecasting \(q_{i+1}\) . First we will look at the previous release interval \(\mathbb{T}_i \equiv[t[i-1],t[i])\) and rank all macro-economic indicators according to their release date. For this exercise we are only interested in those releases which have observation periods inside of \(\mathbb{T}_i\) . Now \(\forall t \in \mathbb{T}_{i}\) let \(\mathbb{F}_{t,i}\) be the space of all such features that have been released in \([t[i-1],t]\) (e.g. if today is Feb 27 we’d consider all features with releases between this date and the previous quarterly release). We align all features \(f \in \mathbb{F}_{t,i}\) based on their observation periods in the matrix \(X_t\) , forward-fill the jagged edge, and finally shift all observation periods \(\Delta\) months forward, where \(\Delta\) is the number of months between the last observation period in \(X_t\) and the last observation period in \(q_t\) . Next we extract the important features at time \(t\) by performing a constrained regression of the quarterly release against this time-shifted feature matrix, specifically via a cross-validated Elastic Net. By normalizing features inside the cross-validation pipeline, we allow ourselves to read off the magnitude of the regression coefficients directly as importance weights, without leaking in future information. Importantly, features with zero coefficients are deemed irrelevant for the forecast at time \(t\) .
Let \(\mathbb{I}_{t,i} \subseteq \mathbb{F}_{t,i}\) be the subset of features thus selected, then for the purpose of our forecast in step 2 we will maximally consider using all of those features we at some point picked out during \(\mathbb{T}_i\) . Specifically, for our \(q_{i+1}\) forecast we will use the features:
\[\bigcup_{t \in \mathbb{T}_i} \mathbb{I}_{t,i}\]
NOTE : In practice, we only perform pre-selection once per annum to reduce the computational burden.
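The pre-selection step described above can be illustrated with a minimal sketch. It assumes scikit-learn, synthetic data, an illustrative hyperparameter grid and expanding-window splits (`TimeSeriesSplit`); none of these choices is stated in the text, and the function name `select_features` is hypothetical. The scaler sits inside the pipeline so that it is refit on each training fold.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def select_features(X, y, names, n_splits=3):
    """Return {name: coefficient} for features with non-zero coefficients."""
    pipe = Pipeline([
        ("scale", StandardScaler()),            # refit on every training fold
        ("enet", ElasticNet(max_iter=10_000)),
    ])
    grid = {
        "enet__alpha": [0.01, 0.1, 1.0],        # assumed penalty strengths
        "enet__l1_ratio": [0.2, 0.5, 0.8],      # assumed L1/L2 mixes
    }
    search = GridSearchCV(
        pipe,
        grid,
        cv=TimeSeriesSplit(n_splits=n_splits),  # assumed: expanding windows
        scoring="neg_mean_squared_error",
    )
    search.fit(X, y)
    coefs = search.best_estimator_.named_steps["enet"].coef_
    return {n: c for n, c in zip(names, coefs) if c != 0.0}


# Synthetic stand-in for the time-shifted feature matrix and the response:
# only the first two features carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=60)
names = ["f1", "f2", "f3", "f4", "f5"]

selected = select_features(X, y, names)
print(sorted(selected))
```

Because the features are standardized, the absolute coefficient values in `selected` are comparable across features and can be read as importance weights.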
Dealing with Release Schedule Inequality (the “Jagged Edge”) #
Having established which features we want to consider at time \(t\) , we can construct the associated feature matrix \(X_t\) which will be used in learning the map \(f: X \mapsto q\) . Again, we are faced with the jagged edge: certain features might have missing recent observations in \(X_t\) as they are yet to be published.
To overcome this issue, we make two empirical observations. First, individual features typically exhibit strong correlation with their first lag. Second, the first difference of \(X_t\) shows non-zero cross-correlations between features. Using the notational convention that \(x_{i,:}\) is the \(i^{th}\) row of \(X_t\), this motivates us to postulate that \(x_{i,:}\) obeys the dynamical relationship
\[ x_{i,:} \mid x_{i-1,:} \sim \mathcal{N}\left(x_{i-1,:}, \Sigma\right) \]
where \(\Sigma\) is a covariance matrix whose off-diagonal entries need not be zero. Prosaically, feature values at observation point \(i\) are assumed to follow a multivariate normal distribution, centered around their most recent value, but not independently so. The benefit of this model is that it is easy to estimate. Crucially, it also enables us to fill in NaNs in \(X_t\) in a “non-trivial” way by computing conditional expectations based on partially observed information (the publication of a subset of features). Effectively, shocks in observed values vis-a-vis their expectation will propagate to unobserved values.
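To see how this works, consider the simple case of a bivariate feature matrix \(X_t \in \mathbb{R}^{N \times 2}\) for which \(x_{:,1}\) is completely specified but \(x_{:,2}\) has a missing value at the end (\(x_{N;2} \in \emptyset\)). We are interested in computing \(\mathbb{E}[x_{N;2} | x_{N;1}=x]\), the conditional expectation for the missing value in feature 2, given our knowledge of the most recent realisation in feature 1. Assuming the joint distribution \(\forall i\):
\[ \begin{pmatrix} x_{i;1} \\ x_{i;2} \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} x_{i-1;1} \\ x_{i-1;2} \end{pmatrix}, \begin{pmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix} \right) \]
where the \(\sigma_i\) are the standard deviations and \(\rho\) the correlation, it can be shown that
\[ \mathbb{E}[x_{N;2} | x_{N;1}=x] = x_{N-1;2} + \rho \frac{\sigma_2}{\sigma_1} \left(x - x_{N-1;1}\right) \]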
Intuitively, this tells us that if the (differenced) series are independent (\(\rho=0\)), the expected value for \(x_{N;2}\) is just the most recently available observation, \(x_{N-1;2}\). If \(\rho \neq 0\), the surprise from the observation in feature 1 (i.e. the observed feature 1 value less its marginal expectation) shifts our expectation for \(x_{N;2}\) away from \(x_{N-1;2}\), scaled by the factor \(\rho \sigma_2 /\sigma_1\). If the surprise is positive (negative) and \(\rho>0\) (\(\rho<0\)), our expectation for \(x_{N;2}\) is revised upwards. Conversely, if the surprise is positive (negative) and \(\rho<0\) (\(\rho > 0\)), we lower our expectation for \(x_{N;2}\).
The same general idea extends to feature matrices of higher dimensionality. For an overview of computing conditional expectations with the multivariate normal distribution the reader is referred to the Wikipedia entry.
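As a rough illustration of the conditional-expectation fill for an arbitrary number of features, the following NumPy sketch fills the missing entries of the last row only. It assumes that all earlier rows are complete and that \(\Sigma\) is estimated as the zero-mean second moment of their first differences; the function name `fill_last_row` and the toy data are hypothetical, not part of the described system.

```python
import numpy as np


def fill_last_row(X):
    """Fill NaNs in the last row of X by conditional expectation.

    Assumes x_i | x_{i-1} ~ N(x_{i-1}, Sigma), with Sigma estimated
    from first differences of the earlier, fully observed rows.
    """
    X = np.asarray(X, dtype=float)
    diffs = np.diff(X[:-1], axis=0)          # earlier rows assumed NaN-free
    sigma = diffs.T @ diffs / len(diffs)     # zero-mean covariance estimate
    last, prev = X[-1].copy(), X[-2]
    miss = np.isnan(last)
    obs = ~miss
    if miss.any() and obs.any():
        surprise = last[obs] - prev[obs]     # observed value less expectation
        s_oo = sigma[np.ix_(obs, obs)]
        s_mo = sigma[np.ix_(miss, obs)]
        last[miss] = prev[miss] + s_mo @ np.linalg.solve(s_oo, surprise)
    elif miss.any():
        last[miss] = prev[miss]              # nothing observed: carry forward
    filled = X.copy()
    filled[-1] = last
    return filled


# Bivariate toy case: feature 1 is published, feature 2 is still missing.
x1 = np.array([0.0, 1.0, -1.0, 2.0, 1.0, 3.0])
x2 = np.array([0.0, 1.0, 0.0, 1.0, 0.0, np.nan])
X = np.column_stack([x1, x2])

filled = fill_last_row(X)
print(filled[-1])
```

In this toy case the estimated ratio \(\rho \sigma_2 / \sigma_1\) is \(7/15\) and the surprise in feature 1 is \(2\), so the filled value is the previous observation \(0\) plus \(14/15\).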
Estimating the Low-frequency Series #
In formalizing a relationship between input features and response, we follow a familiar path in the nowcasting literature, which is to deploy a factor model. Specifically, we extract the first principal component of the feature matrix alongside its first and second lags and run the regression:
\[ q = \beta_0 + \beta_1 PC_1 + \beta_2 \, \text{lag}(PC_1,1) + \beta_3 \, \text{lag}(PC_1,2) + \epsilon \]
(Here \(PC_1\) is the first principal component of \(X_t\) and \(\text{lag}(PC_1,n)\) is the same vector shifted \(n\) units forward, while \(\epsilon\) is a noise term). The coefficients are immediately obtained through ordinary least squares (OLS) using the available (quarterly) data. The model is thus not only easy to estimate, but also easy to interpret. The inclusion of the lagged variables is motivated by the idea that running monthly observations inside a quarter all “add up” to the quarterly release.
Finally, by plugging in the final entry of \([PC_1, \text{lag}(PC_1,1), \text{lag}(PC_1,2)]\) into this relationship we obtain the present purported value of the quarterly variable.
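The estimation step can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the production implementation: features are standardized before the principal component is extracted, the regression includes an intercept, and the toy data are noise-free with one common monthly factor whose three-month average defines the quarterly value. The function name `nowcast` and all variable names are hypothetical.

```python
import numpy as np


def nowcast(X, q, q_rows):
    """Regress q on PC1 and its first two lags; evaluate at the last row.

    X      : (T, p) monthly feature matrix with the jagged edge filled
    q      : (k,) quarterly values of the response
    q_rows : (k,) row indices of X aligned with each quarterly value
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)     # assumed: standardize first
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    pc1 = Z @ vt[0]                              # first principal component
    # Columns: intercept, PC1, lag 1, lag 2 (first two rows lack full lags)
    D = np.column_stack([np.ones(len(pc1) - 2), pc1[2:], pc1[1:-1], pc1[:-2]])
    rows = np.asarray(q_rows) - 2                # re-index after dropping rows
    keep = rows >= 0
    beta, *_ = np.linalg.lstsq(D[rows[keep]], np.asarray(q)[keep], rcond=None)
    return float(D[-1] @ beta), beta


# Toy data: 38 months, four features loading on one common factor f.
rng = np.random.default_rng(1)
f = rng.normal(size=38).cumsum()
X = np.outer(f, np.array([1.0, -0.5, 2.0, 0.8]))
q_rows = np.arange(2, 36, 3)                     # quarter-end months
q = np.array([f[r - 2:r + 1].mean() for r in q_rows])

estimate, beta = nowcast(X, q, q_rows)
print(estimate, f[-3:].mean())
```

Because the toy quarterly value is exactly the average of three monthly factor readings, the fitted relationship recovers the average of the latest three months, which mirrors the "adding up" motivation for including the two lags.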
Methodological Pseudo-code #
To summarise, the technical indicators operate along the following lines:
For \(t\) in all possible ordered feature releases :{
If \(t\) is a new year: Pre-select features \(\mathbb{I}\) for consideration using a running elastic net regression (step (1)).
Construct the feature matrix \(X_t\) using features \(\mathbb{I}\) . Populate the jagged edge using a conditional multivariate normal assumption (step (2)).
Regress the first PC of \(X_t\) against the quarterly variable (overlapping quarterly observations only). Use the last entry in \(X_t\) to make an estimate as to where the quarterly variable is “now” (step (3)).
}
Appendix 2: Feature candidates for technical growth trends #
Developed markets #
AUD
-
Business confidence, National Australia Bank
-
Consumer sentiment index, current conditions, Melbourne Institute, sa
-
Industrial production index, ABS, sa, %oya
-
Manufacturing PMI, Australian Industry Group, sa
-
New vehicles sales, Chamber of Automotive Industries, diff oya
-
Residential construction starts, value, ABS, %oya
-
Retail sales value, ABS, sa, %oya
-
Service business survey index, Australian Industry Group, sa
-
Unemployment rate, ABS, sa, diff oya
CAD
-
Construction starts, CMHC, %oya
-
Employment, Statistics Canada, sa, %oya
-
Industrial production index, Statistics Canada, sa, %oya
-
Monthly real GDP, Statistics Canada, sa, %oya
-
Retail sales value, Statistics Canada, sa, %oya
-
Unemployment rate, Statistics Canada, sa, diff oya
CHF
-
Industrial production index (monthly), Bundesamt fuer Statistik, calendar-adjusted, %oya
-
Industrial production index (quarterly), Bundesamt fuer Statistik, calendar-adjusted, %oya
-
KOF business survey, overall economic index, KOF Institute
-
Manufacturing survey, business situation, KOF Institute, sa
-
New passenger cars registrations, Bundesamt fuer Statistik, diff oya
-
Unemployment rate, SECO, sa, diff oya
EUR
-
Business sentiment indicator, Eurostat, sa
-
Business confidence index (Italy), ISTAT, sa
-
Construction production index, Eurostat, sa, %oya
-
Consumer confidence, EU Commission, sa
-
Gross Domestic Product (Germany), Federal Statistics Office, calendar-adjusted, sa, % oya
-
Ifo business survey (Germany), climate, Ifo, sa
-
Industrial production, Eurostat, calendar-adjusted, sa, % oya
-
Industry sentiment indicator, EU Commission, sa
-
Manufacturing business survey (France), INSEE, sa
-
Retail sales volume, Eurostat, calendar-adjusted, sa, % oya
-
Unemployment rate, Eurostat, sa, diff oya
-
Unemployment rate (France), sa, diff oya
-
Unemployment rate (Germany), sa, diff oya
-
Unemployment rate (Italy), sa, diff oya
GBP
-
Employment, ONS, sa, diff oya
-
Gross value added volume, ONS, sa, %oya
-
Industrial production index, ONS, sa, %oya
-
Industry total order book assessment, CBI, balance
-
Motor vehicles registrations, new cars, SMMT, diff oya
-
Retail sales volume, ONS, sa, %oya
-
Services trade volume, ONS, sa, %oya
-
Unemployment rate, ONS, sa, diff oya
JPY
-
Consumer confidence index, Cabinet Office
-
Consumer survey, consumption trend, Statistics Bureau, %oya
-
Export volume index, Ministry of Finance, %oya
-
Housing starts, MLIT, diff oya
-
Import volume index, Ministry of Finance, %oya
-
Industrial production index, METI, sa, %oya
-
Larger manufacturing enterprises’ business conditions, TANKAN, diffusion index
-
Machinery orders (value in JPY), Cabinet Office, %oya
-
Retail sales volume, METI, sa, %oya
-
Unemployment rate, sa, diff oya
NOK
-
Household consumption of goods, Statistics Norway, sa, %oya
-
Housing starts, Statistics Norway, sa, diff oya
-
Industrial production (incl. mining and extraction), Statistics Norway, calendar-adj., %oya
-
Retail sales volumes, Statistics Norway, sa, %oya
-
Unemployment rate, Statistics Norway, 3mma, sa, diff oya
-
Manufacturing confidence, Statistics Norway, sa
NZD
-
Business activity outlook, ANZ, index
-
Capacity utilization, ANZ survey, %
-
Manufacturing sales volumes, Statistics NZ, %oya
-
Residential construction permits, Stats New Zealand, diff oya
-
Unemployment rate, Statistics NZ, sa, diff oya
SEK
-
Business survey, manufacturing confidence, Konjunkturinstitutet, sa
-
Consumer confidence, Konjunkturinstitutet, sa
-
Economic sentiment indicator, EU Commission, sa
-
Industrial production index, SCB, calendar-adjusted, sa, %oya
-
PMI composite index, sa
-
Private-sector production index, SCB, calendar-adjusted, sa, %oya
-
Retail sales volume, SCB, sa, %oya
-
Unemployment rate, SCB, sa, diff oya
USD
-
Business applications, Census, diff oya
-
Business leaders survey, current conditions, diffusion index, New York Fed
-
Construction spending, Census, saar, diff oya
-
Consumer confidence index, Conference Board, sa
-
Consumer sentiment, expectation index, University of Michigan
-
Continuing jobless claims, Department of Labor, sa, diff oya
-
Hours worked per week by all private sector employees, BLS, sa, diff oya
-
Housing starts, private and residential, Census, saar, diff oya
-
Industrial production index, Federal Reserve, sa, %oya
-
Initial jobless claims, Department of Labor, sa, diff oya
-
Manufacturing sales (nominal), Census, sa, % oya
-
Manufacturing purchasing managers’ index, ISM, sa
-
Manufacturing survey, current general activity index, Philadelphia Fed, sa
-
Manufacturing survey, future general activity index, New York Fed, sa
-
Mortgage applications index, MBA, calendar-adjusted, saar, % oya
-
National activity index, Chicago Fed
-
National financial conditions index, Chicago Fed
-
Nonfarm payroll, BLS, sa, diff oya
-
Non-manufacturing business outlook, diffusion index, Philadelphia Fed, sa
-
Pending home sales index, NAR, saar, % oya
-
Real personal consumption expenditures, BEA, saar, %oya
-
Redbook retail sales index, same stores, Redbook Research, % oya
-
Retail sales (nominal), Census, calendar-adjusted, saar, % oya
Emerging markets #
BRL
-
Business confidence, ICEI, diffusion index
-
Economic activity index, CBR, sa, %oya
-
Retail sales volumes, IBGE, sa, %oya
-
Industry capacity utilization, CNI, %
-
Consumer confidence survey, Fecomercio
-
Industrial production, IBGE, %oya
-
Services volume, IBGE, sa, %oya
CLP
-
Business confidence, CIRAE, diffusion index
-
Industrial production, INE, sa, %oya
-
Supermarket sales volume, INE, %oya
-
Economic perception index, IPEC
-
Monthly economic activity indicator, Central Bank of Chile, sa, %oya
CNY
-
Industrial production, NBS, index oya
-
Consumer confidence index, CEMAC
-
Electricity consumption, NEA, %oya
-
Rail freight volume, NBS, %oya
-
Ports freight volume, NBS, %oya
-
Roads, freight volume, NBS, %oya
COP
-
Manufacturing output, DANE, %oya
-
Retail sales value, ex fuels and vehicles, DANE, sa, %oya
-
Unemployment rate, DANE, diff oya
-
Industry capacity utilization, ANDI, %
-
Consumer confidence index, Fedesarrollo
CZK
-
Business confidence, CSU, sa
-
Industrial production, CZSO, calendar-adj., sa, %oya
-
Retail sales volume, CZSO, %oya
-
Unemployment rate, MoLSA, diff oya
-
Consumer confidence, CZSO, balance
-
Economic sentiment, EU Commission, sa
-
Construction output, CZSO, %oya
HUF
-
Business confidence indicator, GKI, sa
-
Industrial production, HCSO, calendar-adj., sa, %oya
-
Retail sales volume, HCSO, calendar-adj., %oya
-
Unemployment, 3mma, HCSO, sa, diff oya
-
Economic sentiment indicator (all sectors), EU Commission, sa
-
Consumer confidence index, GKI, sa
-
Construction output, HCSO, calendar-adj., sa, %oya
IDR
-
Retail sales volume, Bank Indonesia, %oya
-
Car sales, number, GAIKINDO, diff oya
-
Industrial production, Bank Indonesia, %oya
-
Business activity survey, Bank Indonesia
ILS
-
Manufacturing output, CBS, %oya
-
Retail sales volume, CBS, sa, %oya
-
Unemployment rate, CBS, sa, diff oya
-
State of economy indicator, Bank of Israel
INR
-
Industrial production, MoS&PI, %oya
-
Consumer confidence index, RBI
-
Industry business assessment index, RBI
KRW
-
All industries business confidence, KERI
-
Retail sales volume, Statistics Korea, %oya
-
Unemployment rate, Statistics Korea, sa, diff oya
-
Economic sentiment indicator, Bank of Korea
-
Industrial production, Statistics Korea, %oya
-
Manufacturing business conditions, Bank of Korea, sa
-
Non-manufacturing business conditions, Bank of Korea, sa
-
Services output, Bank of Korea, %oya
-
Employed persons, Statistics Korea, %oya
MXN
-
Manufacturing business confidence, IMEF, sa
-
Monthly GDP, IGAE, sa, %oya
-
Industrial production, IGAE, %oya
-
Unemployment, IGAE, diff oya
-
Non-manufacturing business confidence, IMEF, sa
MYR
-
Employment, DOSM, diff oya
-
Industrial production, DOSM, sa, %oya
-
Unemployment rate, DOSM, diff oya
PEN
-
Manufacturing output, INEI, %oya
-
Unemployment rate, 3mma, BCRP, diff oya
-
Business expectations in industry, BCRP, index
-
Monthly GDP, INEI/ BCRP, sa, %oya
PHP
-
Manufacturing output, Statistics Office, %oya
-
Manufacturing capacity utilization, Statistics Office, %
-
Car sales, CAMPI, diff oya
-
Workers remittances, Central Bank, USD, %oya
PLN
-
Retail sales value, GUS, %oya
-
Industrial production, GUS, sa, %oya
-
Unemployment rate, MoFLSP, sa, diff oya
-
Consumer confidence, EU Commission, sa
-
Economic sentiment indicator (all sectors), EU Commission, sa
-
Manufacturing business climate indicator, GUS
RON
-
Employment (registered), NIS, diff oya
-
Economic sentiment indicator, EU Commission, sa
-
Industrial production, NIS, %oya
-
Retail sales volume (ex motor vehicles), NIS, sa, calendar-adj., %oya
-
Unemployment rate, RNAE, diff oya
-
Manufacturing new orders volume, NIS, %oya
RUB
-
Unemployment rate (ILO concept), Rosstat, diff oya
-
Industrial production, Rosstat, %oya
-
Monthly GDP, Rosstat, %oya
SGD
-
Retail sales volume, SingStat, sa, %oya
-
PMI survey, SIPMM
-
Manufacturing output, SingStat, %oya
-
Export volume, SingStat, %oya
-
Import volume, SingStat, %oya
THB
-
Business sentiment, Bank of Thailand, diffusion index
-
Industrial production, TOIE, %oya
-
Unemployment rate, Bank of Thailand, diff oya
-
Consumer confidence index, Chamber of Commerce
TRY
-
Business confidence index, CBRT
-
Industrial production, TurkStat, sa and calendar-adj., %oya
-
Economic confidence index (consumers and producers), TurkStat
-
Retail sales volumes, TurkStat, %oya
-
Consumer confidence, TurkStat
-
Manufacturing capacity utilization, CBRT, sa
TWD
-
Unemployment rate, DGBAS, sa, diff oya
-
Industrial production, Ministry of Economics, %oya
-
Retail sales volume, Ministry of Economics, %oya
ZAR
-
Business confidence index, BER (quarterly)
-
Business confidence index, SACCI SASCCONDR
-
Mining and quarrying production, StatSA, sa, %oya
-
Wholesale trade volume, StatSA, sa, %oya
-
Retail sales volume, StatSA, sa, %oya
-
Industrial production, StatSA, sa, %oya
Appendix 3: Currency symbols #
The word ‘cross-section’ refers to currencies, currency areas or economic areas. In alphabetical order, these are AUD (Australian dollar), BRL (Brazilian real), CAD (Canadian dollar), CHF (Swiss franc), CLP (Chilean peso), CNY (Chinese yuan renminbi), COP (Colombian peso), CZK (Czech Republic koruna), DEM (German mark), ESP (Spanish peseta), EUR (Euro), FRF (French franc), GBP (British pound), HKD (Hong Kong dollar), HUF (Hungarian forint), IDR (Indonesian rupiah), ILS (Israeli shekel), INR (Indian rupee), ITL (Italian lira), JPY (Japanese yen), KRW (Korean won), MXN (Mexican peso), MYR (Malaysian ringgit), NLG (Dutch guilder), NOK (Norwegian krone), NZD (New Zealand dollar), PEN (Peruvian sol), PHP (Philippine peso), PLN (Polish zloty), RON (Romanian leu), RUB (Russian ruble), SEK (Swedish krona), SGD (Singaporean dollar), THB (Thai baht), TRY (Turkish lira), TWD (Taiwanese dollar), USD (U.S. dollar), ZAR (South African rand).