China special data #
This group contains special categories related to the economy of the People’s Republic of China. China’s statistics are quite specific in terms of content and methodology and do not provide the same degree of transparency and consistency as those of other countries. As a result, vintages often require special estimates for missing pieces of information.
Note that other internationally comparable Chinese data are grouped outside this data set under the respective international category, such as industrial production, producer price index, or foreign trade.
Manufacturing #
Ticker : PMIMANU_SA_3MMA / PMIMANU_SA_D3M3ML3 / PMIMANU_SA_D3M3ML12
Label : PMI manufacturing survey, sa: 3mma / diff 3m/3m / diff oya, 3mma
Definition : Purchasing managers manufacturing survey headline index, seasonally adjusted: 3 month moving average / difference of latest 3 months over previous 3 months / difference over a year ago, 3-month moving average
Notes :
-
Data are released by the China Federation of Logistics & Purchasing.
-
The survey uses responses from executives across more than 700 manufacturing enterprises across mainland China.
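The three transformations named in the definition above (3mma, diff 3m/3m, diff oya of the 3mma) can be illustrated with pandas. This is only a minimal sketch on made-up monthly index levels; the series values and variable names are assumptions, and actual JPMaQS indicators are computed on point-in-time vintages rather than on a single final series.

```python
import pandas as pd

# Hypothetical monthly, seasonally adjusted index levels (illustrative numbers only)
idx = pd.date_range("2021-01-01", periods=18, freq="MS")
pmi = pd.Series([50.0 + 0.1 * i for i in range(18)], index=idx)

# _3MMA: 3-month moving average of the index level
mma3 = pmi.rolling(3).mean()

# _D3M3ML3: average of the latest 3 months minus average of the previous 3 months
d3m3ml3 = mma3 - mma3.shift(3)

# _D3M3ML12: average of the latest 3 months minus the same 3 months a year ago
d3m3ml12 = mma3 - mma3.shift(12)

print(round(d3m3ml3.iloc[-1], 2), round(d3m3ml12.iloc[-1], 2))
```

Because the toy series rises by 0.1 index points per month, the two differences come out at roughly 0.3 and 1.2 points.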
Ticker : PMIMANUORD_SA_3MMA / PMIMANUORD_SA_D3M3ML3 / PMIMANUORD_SA_D3M3ML12
Label : PMI manufacturing survey new orders, sa: 3mma / diff 3m/3m / diff oya, 3mma
Definition : Purchasing managers manufacturing survey new orders index, seasonally adjusted: 3 month moving average / difference of latest 3 months over previous 3 months / difference over a year ago, 3-month moving average
Notes :
-
Data are released by the China Federation of Logistics & Purchasing.
-
New orders refer to how many new orders have been received from customers compared to the previous month.
Ticker : INDPROFIT_NSA_P1M1ML12 / INDPROFIT_NSA_P1M1ML12_3MMA
Label : Industrial profits: %oya / %oya, 3mma
Definition : Industrial profits: percent over year ago / percent over a year ago, 3-month moving average
Notes :
-
Data are released by the National Bureau of Statistics of China.
-
Industrial profits refer to all state-owned industries and non-state-owned industrial enterprises with annual sales revenue above 5 million yuan.
-
Data are not released for January but are combined with the release for February. JPMaQS vintages spread this value evenly over both months.
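The even spreading of a combined January-February release can be sketched as follows. The exact JPMaQS procedure is not described here, so this is one plausible reading under stated assumptions: the combined figure is a level stored under February, January is missing, and each month receives half. The function name `spread_jan_feb` and the numbers are hypothetical.

```python
import pandas as pd


def spread_jan_feb(levels: pd.Series) -> pd.Series:
    """Split a combined January-February figure (stored under February)
    evenly across both months when January is missing."""
    out = levels.copy()
    for ts in out.index:
        if ts.month == 2:
            jan = ts - pd.DateOffset(months=1)
            if jan in out.index and pd.isna(out.loc[jan]):
                half = out.loc[ts] / 2.0
                out.loc[jan] = half
                out.loc[ts] = half
    return out


# Hypothetical monthly levels: January is not released, February holds Jan+Feb
idx = pd.date_range("2022-01-01", periods=4, freq="MS")
raw = pd.Series([float("nan"), 200.0, 110.0, 120.0], index=idx)
filled = spread_jan_feb(raw)
print(filled.tolist())  # [100.0, 100.0, 110.0, 120.0]
```

Percent-over-a-year-ago rates can then be computed on the filled series in the usual way, for example with `filled.pct_change(12) * 100` once at least 13 months of data are available.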
Ticker : INDCAPUT_NSA_D1Q1QL1 / INDCAPUT_NSA_D1Q1QL4
Label : Industrial capacity utilization: diff q/q / diff oya
Definition : Industrial capacity utilization: difference of latest quarter over previous quarter / difference over a year ago
Notes :
-
Data are released by the China National Bureau of Statistics.
-
The series represents actual utilization as a percentage of potential capacity.
-
Information is gathered by surveys from about 90,000 enterprises.
Ticker : CARPROD_NSA_P1M1ML12 / CARPROD_NSA_P1M1ML12_3MMA
Label : Car production: %oya / %oya, 3mma
Definition : Car production: percent over year ago / percent over a year ago, 3-month moving average
Notes :
-
Data are from the China Passenger Car Association.
Other activity indicators #
Ticker : FDI_NSA_P1M1ML12 / FDI_NSA_P1M1ML12_3MMA
Label : Foreign direct investment: %oya / %oya, 3mma
Definition : Foreign direct investment: % over year ago / % over a year ago, 3-month moving average
Notes :
-
Data are from China’s National Bureau of Statistics.
-
The series is in nominal terms and covers investments for projects with a total investment of over 500,000 yuan.
Ticker : PMISERV_SA_3MMA / PMISERV_SA_D3M3ML3 / PMISERV_SA_D3M3ML12
Label : Non-manufacturing PMI, sa: 3mma / diff 3m/3m / diff oya, 3mma
Definition : Non-manufacturing PMI, seasonally adjusted: 3 month moving average / difference of latest 3 months over previous 3 months / difference over a year ago, 3-month moving average
Notes :
-
Data are from China’s National Bureau of Statistics.
-
The survey uses responses from executives across more than 1200 non-manufacturing enterprises across mainland China. They are operating in services, construction, and agriculture.
Ticker : URBINV_NSA_P1M1ML12 / URBINV_NSA_P1M1ML12_3MMA
Label : Urban infrastructure investment: %oya / %oya, 3mma
Definition : Urban infrastructure investment: % over year ago / % over year ago, 3-month average
Notes :
-
Data are from the China National Bureau of Statistics.
-
The series covers the nominal value of investment into urban infrastructure. This includes a wide range of projects and initiatives, such as building and maintaining roads, bridges, public transportation systems, water and sewerage systems, power grids, and telecommunications networks.
-
January data are not released separately but included in the February value. We spread the February values evenly over January and February.
Ticker : URBINVP_NSA_P1M1ML12 / URBINVP_NSA_P1M1ML12_3MMA
Label : Private urban infrastructure investment: %oya / %oya, 3mma
Definition : Private urban infrastructure investment: % over year ago / % over year ago, 3-month average
Notes :
-
Data are from China’s National Bureau of Statistics.
-
This typically refers to the nominal amount of investment coming from the private sector towards the development and upgrade of urban infrastructure.
-
January data are not released separately but included in the February value. We spread the February values evenly over January and February.
Ticker : FIXEDINV_NSA_P1M1ML12 / FIXEDINV_NSA_P1M1ML12_3MMA
Label : Fixed asset investments into urban assets: %oya / %oya, 3mma
Definition : Fixed asset investments: % over year ago / % over year ago, 3-month average
Notes :
-
Data are from China’s National Bureau of Statistics.
-
This series counts the total value of investments into fixed assets.
-
Fixed assets refer to long-term tangible assets that are used in the production of income, such as buildings, machinery, and equipment. In the context of urban assets, these could include things like residential and commercial real estate, urban infrastructure (like roads, bridges, utilities), and public facilities (like schools, hospitals, parks).
-
January data are not released separately but included in the February value. We spread the February values evenly over January and February.
Ticker : ELECPROD_NSA_P1M1ML12 / ELECPROD_NSA_P1M1ML12_3MMA
Label : Electricity production: %oya / %oya, 3mma
Definition : Electricity production: % over year ago / % over year ago, 3-month average
Notes :
-
Data are from the China National Bureau of Statistics.
-
Production is measured in kilowatt hours.
Freight transport #
Ticker : FREIGHT_NSA_P1M1ML12 / FREIGHT_NSA_P1M1ML12_3MMA
Label : Total freight transport: %oya / %oya, 3mma
Definition : Total freight transport : % over year ago / % over a year ago, 3-month moving average
Notes :
-
Data are from the China National Bureau of Statistics.
-
This is a measure of the total volume of goods transported through various modes of transport within a specific period. This measure is typically reported in tonne-kilometers, which is a unit that represents the transport of one tonne of goods over a distance of one kilometer.
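As a small worked example of the tonne-kilometer unit, the volume for a set of shipments is the sum of tonnes carried times kilometers travelled. The shipment figures below are hypothetical and only illustrate the arithmetic.

```python
# Hypothetical shipments: (tonnes carried, kilometers travelled)
shipments = [(10.0, 100.0), (5.0, 40.0)]

# Tonne-kilometers: weight times distance, summed across shipments
tonne_km = sum(tonnes * km for tonnes, km in shipments)
print(tonne_km)  # 1200.0
```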
Ticker : FREIGHTRAIL_NSA_P1M1ML12 / FREIGHTRAIL_NSA_P1M1ML12_3MMA
Label : Rail freight transport: %oya / %oya, 3mma
Definition : Rail freight transport : % over year ago / % over a year ago, 3-month moving average
Notes :
-
Data are from China’s National Bureau of Statistics.
-
The series counts the metric tons of total freight transported by rail. It is a crucial measure in China, given the extensive railway network across the country.
Ticker : FREIGHTROAD_NSA_P1M1ML12 / FREIGHTROAD_NSA_P1M1ML12_3MMA
Label : Road freight transport: %oya / %oya, 3mma
Definition : Road freight transport : % over year ago / % over a year ago, 3-month moving average
Notes :
-
Data are from China’s National Bureau of Statistics.
-
They measure the metric tons of total freight transported by road.
Ticker : FREIGHTAIR_NSA_P1M1ML12 / FREIGHTAIR_NSA_P1M1ML12_3MMA
Label : Air freight transport: %oya / %oya, 3mma
Definition : Air freight transport : % over year ago / % over a year ago, 3-month moving average
Notes :
-
Data are from China’s National Bureau of Statistics.
-
They measure metric tons of total freight transported by air, representing a smaller proportion of total freight.
Private consumption #
Ticker : RSALES_NSA_P1M1ML12 / RSALES_NSA_P1M1ML12_3MMA
Label : Retail sales: %oya / %oya, 3mma
Definition : Retail sales: % over year ago / % over a year ago, 3-month moving average
Notes :
-
Data are from China’s National Bureau of Statistics.
-
Retail sales refer to all consumer goods sold by all sectors to urban and rural consumers.
-
The series is nominal, not real.
Ticker : VEHICLESALES_NSA_P1M1ML12 / VEHICLESALES_NSA_P1M1ML12_3MMA
Label : Vehicle sales: %oya / %oya, 3mma
Definition : Vehicle sales: percent over year ago / percent over a year ago, 3-month moving average
Notes :
-
The data are from China’s Association of Automobile Manufacturers.
-
Vehicles here include passenger cars, trucks, and other vehicles.
Ticker : CARREGS_NSA_P1M1ML12 / CARREGS_NSA_P1M1ML12_3MMA
Label : Car registrations: %oya / %oya, 3mma
Definition : Car registrations: % over year ago / % over a year ago, 3-month moving average
Notes :
-
The data are from China’s Association of Automobile Manufacturers.
-
Registrations refer to passenger car sales only.
Ticker : CCONF_NSA_3MMA / CCONF_NSA_D3M3ML3 / CCONF_NSA_D3M3ML12
Label : Consumer confidence: 3mma / diff 3m/3m / diff oya, 3mma
Definition : Consumer confidence: 3 month moving average / difference of latest 3 months over previous 3 months / difference over a year ago, 3-month moving average
Notes :
-
Data are from China’s National Bureau of Statistics.
Housing #
Ticker : RESIDSALES_NSA_P1M1ML12 / RESIDSALES_NSA_P1M1ML12_3MMA
Label : Residential building sales: %oya / %oya, 3mma
Definition : Residential building sales: % over year ago / % over a year ago, 3-month average
Notes :
-
Data are from China’s National Bureau of Statistics.
-
They are based on the contract price agreed by two parties during the observation period.
-
January data are not released separately but included in the February value. We spread the February values evenly over January and February.
Ticker : URBANREALINV_NSA_P1M1ML12 / URBANREALINV_NSA_P1M1ML12_3MMA
Label : Investment in urban real estate: %oya / %oya, 3mma
Definition : Investment in urban real estate: % over year ago / % over a year ago, 3-month average
Notes :
-
Data are from China’s National Bureau of Statistics.
-
The series looks at the value of investment of real estate companies into urban real estate.
-
This figure encapsulates all types of investment involved in urban real estate, which generally include land purchase, construction, and infrastructure.
-
January data are not released separately but included in the February value. We spread the February values evenly over January and February.
Ticker : REALPRMPRICE_NSA_P1M1ML12 / REALPRMPRICE_NSA_P1M1ML12_3MMA
Label : Primary real estate prices: %oya / %oya, 3mma
Definition : Primary real estate prices: % over year ago / % over a year ago, 3-month average
Notes :
-
Data are from China’s National Bureau of Statistics.
-
The index is based on prices of new or first-hand properties that are being sold by developers or the government. These are homes that have not been previously owned or lived in.
Ticker : REALSECPRICE_NSA_P1M1ML12 / REALSECPRICE_NSA_P1M1ML12_3MMA
Label : Real estate prices secondary: %oya / %oya, 3mma
Definition : Real estate prices secondary: % over year ago / % over a year ago, 3-month average
Notes :
-
Data are from China’s National Bureau of Statistics.
-
The index is based on prices of previously owned or second-hand homes. These are properties that are being sold by individuals or entities that had previously owned or lived in them, as opposed to primary real estate, which refers to new homes being sold directly by developers or the government.
Finance #
Ticker : TSF_NSA_P1M1ML12 / TSF_NSA_P1M1ML12_3MMA
Label : Total social financing: %oya / %oya, 3mma
Definition : Total social financing: % over year ago / % over a year ago, 3-month average
Notes :
-
Data are collected from the People’s Bank of China.
-
Total Social Financing (TSF) is a broad measure of credit and liquidity in the economy used by the People’s Bank of China.
-
These data include off-balance sheet forms of financing that exist outside the conventional bank lending system, such as initial public offerings, loans from trust companies, and bond sales. They can also encompass financing activities that the People’s Bank of China views as potentially contributing to systemic risks within the Chinese financial system.
Ticker : BANKLOANS_NSA_P1M1ML12 / BANKLOANS_NSA_P1M1ML12_3MMA
Label : Bank loans: %oya / %oya, 3mma
Definition : Bank loans: % over year ago / % over a year ago, 3-month average
Notes :
-
This refers to loans to the private sector.
Ticker : NEWLOANS_NSA_P1M1ML12 / NEWLOANS_NSA_P1M1ML12_3MMA
Label : New loans: % oya / % oya, 3mma
Definition : New loans: % over year ago / % over a year ago, 3-month average
Notes :
-
The data are collected from the People’s Bank of China.
-
They measure the change in the value of bank loans to consumers and businesses.
Imports #
Only the standard Python data science packages and the specialized macrosynergy package are needed.
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import math
import json
import macrosynergy.management as msm
import macrosynergy.panel as msp
import macrosynergy.signal as mss
import macrosynergy.pnl as msn
from macrosynergy.download import JPMaQSDownload
from timeit import default_timer as timer
from datetime import timedelta, date, datetime
import warnings
warnings.simplefilter("ignore")
The JPMaQS indicators we consider are downloaded using the J.P. Morgan Dataquery API interface within the macrosynergy package. This is done by specifying ticker strings, formed by appending an indicator category code <category> to a currency area code <cross_section>. These constitute the main part of a full quantamental indicator ticker, taking the form DB(JPMAQS,<cross_section>_<category>,<info>), where <info> denotes the time series of information for the given cross-section and category.
The following types of information are available:
-
value giving the latest available values for the indicator
-
eop_lag referring to days elapsed since the end of the observation period
-
mop_lag referring to the number of days elapsed since the mean observation period
-
grading denoting a grade of the observation, giving a metric of real time information quality.
After instantiating the JPMaQSDownload class within the macrosynergy.download module, one can use the download(tickers, start_date, metrics) method to obtain the data. Here tickers is an array of ticker strings, start_date is the first release date to be considered and metrics denotes the types of information requested.
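The ticker convention described above can be made concrete with a short string-building sketch. It only assembles expression strings in the stated DB(JPMAQS,<cross_section>_<category>,<info>) form and does not call the API; the two categories are taken from this page, and the metric names follow the download call used in this notebook.

```python
cids = ["CNY"]
xcats = ["PMIMANU_SA_3MMA", "INDPROFIT_NSA_P1M1ML12"]
metrics = ["value", "eop_lag", "mop_lag", "grading"]

# Ticker = cross-section code + "_" + category code
tickers = [f"{cid}_{xcat}" for cid in cids for xcat in xcats]

# One full expression per ticker and type of information
expressions = [f"DB(JPMAQS,{t},{m})" for t in tickers for m in metrics]

print(expressions[0])  # DB(JPMAQS,CNY_PMIMANU_SA_3MMA,value)
print(len(expressions))  # 8
```

By the same logic, 56 tickers with four metrics each give the 224 expressions reported in the download log of this notebook.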
cids = ["CNY"]
main = [
"PMIMANU_SA_3MMA",
"PMIMANU_SA_D3M3ML3",
"PMIMANU_SA_D3M3ML12",
"PMISERV_SA_3MMA",
"PMISERV_SA_D3M3ML3",
"PMISERV_SA_D3M3ML12",
"INDPROFIT_NSA_P1M1ML12",
"INDPROFIT_NSA_P1M1ML12_3MMA",
"INDCAPUT_NSA_D1Q1QL1",
"INDCAPUT_NSA_D1Q1QL4",
"RESIDSALES_NSA_P1M1ML12",
"RESIDSALES_NSA_P1M1ML12_3MMA",
"URBANREALINV_NSA_P1M1ML12",
"URBANREALINV_NSA_P1M1ML12_3MMA",
"REALPRMPRICE_NSA_P1M1ML12",
"REALPRMPRICE_NSA_P1M1ML12_3MMA",
"REALSECPRICE_NSA_P1M1ML12",
"REALSECPRICE_NSA_P1M1ML12_3MMA",
"BANKLOANS_NSA_P1M1ML12",
"BANKLOANS_NSA_P1M1ML12_3MMA",
"NEWLOANS_NSA_P1M1ML12",
"NEWLOANS_NSA_P1M1ML12_3MMA",
"TSF_NSA_P1M1ML12",
"TSF_NSA_P1M1ML12_3MMA",
"VEHICLESALES_NSA_P1M1ML12",
"VEHICLESALES_NSA_P1M1ML12_3MMA",
"CARREGS_NSA_P1M1ML12",
"CARREGS_NSA_P1M1ML12_3MMA",
"FIXEDINV_NSA_P1M1ML12",
"FIXEDINV_NSA_P1M1ML12_3MMA",
"ELECPROD_NSA_P1M1ML12",
"ELECPROD_NSA_P1M1ML12_3MMA",
"FDI_NSA_P1M1ML12",
"FDI_NSA_P1M1ML12_3MMA",
"URBINV_NSA_P1M1ML12",
"URBINV_NSA_P1M1ML12_3MMA",
"URBINVP_NSA_P1M1ML12",
"URBINVP_NSA_P1M1ML12_3MMA",
"FREIGHT_NSA_P1M1ML12",
"FREIGHT_NSA_P1M1ML12_3MMA",
"FREIGHTRAIL_NSA_P1M1ML12",
"FREIGHTRAIL_NSA_P1M1ML12_3MMA",
"FREIGHTROAD_NSA_P1M1ML12",
"FREIGHTROAD_NSA_P1M1ML12_3MMA",
"FREIGHTAIR_NSA_P1M1ML12",
"FREIGHTAIR_NSA_P1M1ML12_3MMA",
"CCONF_NSA_3MMA",
"CCONF_NSA_D3M3ML3",
"CCONF_NSA_D3M3ML12",
"CARPROD_NSA_P1M1ML12",
"CARPROD_NSA_P1M1ML12_3MMA",
"PMIMANUORD_SA_3MMA",
"PMIMANUORD_SA_D3M3ML3",
"PMIMANUORD_SA_D3M3ML12",
"RSALES_NSA_P1M1ML12",
"RSALES_NSA_P1M1ML12_3MMA",
]
xcats = main
start_date = "1995-01-01"
tickers = [cid + "_" + xcat for cid in cids for xcat in xcats]
print(f"Maximum number of tickers is {len(tickers)}")
# Retrieve credentials
client_id: str = os.getenv("DQ_CLIENT_ID")
client_secret: str = os.getenv("DQ_CLIENT_SECRET")
# Download from DataQuery
with JPMaQSDownload(client_id=client_id, client_secret=client_secret) as downloader:
    start = timer()
    df = downloader.download(
        tickers=tickers,
        start_date=start_date,
        metrics=["value", "eop_lag", "mop_lag", "grading"],
        suppress_warning=True,
        show_progress=True,
    )
    end = timer()
dfd = df
print("Download time from DQ: " + str(timedelta(seconds=end - start)))
Maximum number of tickers is 56
Downloading data from JPMaQS.
Timestamp UTC: 2023-07-14 16:51:11
Connection successful!
Number of expressions requested: 224
Requesting data: 100%|█████████████████████████████████████████████████████████████████| 12/12 [00:03<00:00, 3.21it/s]
Downloading data: 100%|████████████████████████████████████████████████████████████████| 12/12 [00:15<00:00, 1.26s/it]
Download time from DQ: 0:00:23.309576
# Assign categories to high level labels to make subsequent analysis more readable
cat_dict = {
"Manufacturing": ["PMIMANUORD", "INDPROFIT", "INDCAPUT", "CARPROD", "PMIMANU"],
"Consumption": ["RSALES", "VEHICLESALES", "CCONF", "CARREGS"],
"Finance": [
"NEWLOANS",
"TSF",
"BANKLOANS",
],
"Housing": [
"RESIDSALES",
"URBANREALINV",
"REALPRMPRICE",
"REALSECPRICE",
],
"Freight": [
"FREIGHT",
"FREIGHTRAIL",
"FREIGHTROAD",
"FREIGHTAIR",
],
"Other": ["ELECPROD", "FIXEDINV", "URBINV", "URBINVP", "FDI", "PMISERV"],
}
order = ["Manufacturing", "Other", "Freight", "Consumption", "Housing", "Finance"]
def cat_dict_to_column(cat_string, cat_dict):
    # Return the position of the group that contains the category (None if absent)
    for cat_list in cat_dict:
        if cat_string in cat_dict[cat_list]:
            cat_keys = [*cat_dict.keys()]
            return cat_keys.index(cat_list)


def cat_dict_to_key(cat_string, cat_dict):
    # Return the name of the group that contains the category (None if absent)
    for cat_list in cat_dict:
        if cat_string in cat_dict[cat_list]:
            return cat_list
Availability #
msm.missing_in_df(dfd, xcats=xcats, cids=["CNY"])
Missing xcats across df: set()
Missing cids for BANKLOANS_NSA_P1M1ML12: set()
Missing cids for BANKLOANS_NSA_P1M1ML12_3MMA: set()
Missing cids for CARPROD_NSA_P1M1ML12: set()
Missing cids for CARPROD_NSA_P1M1ML12_3MMA: set()
Missing cids for CARREGS_NSA_P1M1ML12: set()
Missing cids for CARREGS_NSA_P1M1ML12_3MMA: set()
Missing cids for CCONF_NSA_3MMA: set()
Missing cids for CCONF_NSA_D3M3ML12: set()
Missing cids for CCONF_NSA_D3M3ML3: set()
Missing cids for ELECPROD_NSA_P1M1ML12: set()
Missing cids for ELECPROD_NSA_P1M1ML12_3MMA: set()
Missing cids for FDI_NSA_P1M1ML12: set()
Missing cids for FDI_NSA_P1M1ML12_3MMA: set()
Missing cids for FIXEDINV_NSA_P1M1ML12: set()
Missing cids for FIXEDINV_NSA_P1M1ML12_3MMA: set()
Missing cids for FREIGHTAIR_NSA_P1M1ML12: set()
Missing cids for FREIGHTAIR_NSA_P1M1ML12_3MMA: set()
Missing cids for FREIGHTRAIL_NSA_P1M1ML12: set()
Missing cids for FREIGHTRAIL_NSA_P1M1ML12_3MMA: set()
Missing cids for FREIGHTROAD_NSA_P1M1ML12: set()
Missing cids for FREIGHTROAD_NSA_P1M1ML12_3MMA: set()
Missing cids for FREIGHT_NSA_P1M1ML12: set()
Missing cids for FREIGHT_NSA_P1M1ML12_3MMA: set()
Missing cids for INDCAPUT_NSA_D1Q1QL1: set()
Missing cids for INDCAPUT_NSA_D1Q1QL4: set()
Missing cids for INDPROFIT_NSA_P1M1ML12: set()
Missing cids for INDPROFIT_NSA_P1M1ML12_3MMA: set()
Missing cids for NEWLOANS_NSA_P1M1ML12: set()
Missing cids for NEWLOANS_NSA_P1M1ML12_3MMA: set()
Missing cids for PMIMANUORD_SA_3MMA: set()
Missing cids for PMIMANUORD_SA_D3M3ML12: set()
Missing cids for PMIMANUORD_SA_D3M3ML3: set()
Missing cids for PMIMANU_SA_3MMA: set()
Missing cids for PMIMANU_SA_D3M3ML12: set()
Missing cids for PMIMANU_SA_D3M3ML3: set()
Missing cids for PMISERV_SA_3MMA: set()
Missing cids for PMISERV_SA_D3M3ML12: set()
Missing cids for PMISERV_SA_D3M3ML3: set()
Missing cids for REALPRMPRICE_NSA_P1M1ML12: set()
Missing cids for REALPRMPRICE_NSA_P1M1ML12_3MMA: set()
Missing cids for REALSECPRICE_NSA_P1M1ML12: set()
Missing cids for REALSECPRICE_NSA_P1M1ML12_3MMA: set()
Missing cids for RESIDSALES_NSA_P1M1ML12: set()
Missing cids for RESIDSALES_NSA_P1M1ML12_3MMA: set()
Missing cids for RSALES_NSA_P1M1ML12: set()
Missing cids for RSALES_NSA_P1M1ML12_3MMA: set()
Missing cids for TSF_NSA_P1M1ML12: set()
Missing cids for TSF_NSA_P1M1ML12_3MMA: set()
Missing cids for URBANREALINV_NSA_P1M1ML12: set()
Missing cids for URBANREALINV_NSA_P1M1ML12_3MMA: set()
Missing cids for URBINVP_NSA_P1M1ML12: set()
Missing cids for URBINVP_NSA_P1M1ML12_3MMA: set()
Missing cids for URBINV_NSA_P1M1ML12: set()
Missing cids for URBINV_NSA_P1M1ML12_3MMA: set()
Missing cids for VEHICLESALES_NSA_P1M1ML12: set()
Missing cids for VEHICLESALES_NSA_P1M1ML12_3MMA: set()
# Create start years (data availability) plot
dfs = msm.check_startyears(dfd)
df_new = dfs.iloc[0]
df_new = df_new.to_frame()
df_new = df_new.reset_index()
df_new["cat"] = df_new["xcat"].apply(lambda x: x.split("_")[0])
df_new["adj"] = df_new["xcat"].apply(lambda x: x.split("_")[1])
df_new["trans"] = df_new["xcat"].apply(lambda x: x.split("_")[2])
df_new["col"] = df_new["cat"].apply(lambda x: cat_dict_to_column(x, cat_dict))
df_new2 = df_new
df_new2["col"] = df_new["cat"].apply(lambda x: cat_dict_to_key(x, cat_dict))
df_new2 = df_new2.sort_values(by="CNY")
def add_row(df):
    # Number the entries within each group so they can be pivoted into rows
    df_return = pd.DataFrame()
    for col_value in df["col"].unique():
        df_reduced = df[df["col"] == col_value]
        df_reduced.insert(0, "row", range(0, len(df_reduced)))
        df_return = pd.concat([df_return, df_reduced])
    return df_return


def reorder_cols(df):
    # Order columns by their number of non-missing entries
    idea_count = []
    for idea in df.T.index:
        idea_count.append(df[idea].count())
    ideas = [*df.T.index]
    df_to_sort = pd.DataFrame(idea_count, ideas)
    df_sort = df_to_sort.sort_values(by=0)
    df_sort_index = [*df_sort.index]
    return df[df_sort_index]
df_new2 = add_row(df_new2)
df_new2_pivot = df_new2.pivot(index="row", columns="col", values="CNY")
df_new2_pivot = df_new2_pivot[order]
df_new2["label"] = (
df_new["cat"]
.str.cat(df_new["adj"].apply(lambda x: "\n" + x))
.str.cat(df_new["trans"].apply(lambda x: "\n" + x))
.str.cat(df_new2["CNY"].apply("{:.0f}".format), sep="\n")
)
labels = df_new2.pivot(index="row", columns="col", values="label")
labels = labels[order]
header = "Start years of quantamental indicators."
sns.set(rc={"figure.figsize": (25, 25)})
sns.heatmap(
df_new2_pivot, annot=labels, fmt="", cmap="YlOrBr", linewidths=10, yticklabels=[]
)
plt.tick_params(
axis="both",
which="major",
labelsize=18,
labelbottom=False,
bottom=False,
top=False,
labeltop=True,
)
plt.xlabel("")
plt.ylabel("")
plt.title(header, fontsize=25, pad=50)
plt.show()
# Create missing days plot
def business_day_dif(df: pd.DataFrame, maxdate: pd.Timestamp):
    """
    Number of business days between two respective business dates.

    :param <pd.DataFrame> df: DataFrame cross-sections rows and category columns. Each
        cell in the DataFrame will correspond to the start date of the respective series.
    :param <pd.Timestamp> maxdate: maximum release date found in the received DataFrame.
        In principle, all series should have values up until the respective business
        date. The difference will represent possible missing values.

    :return <pd.DataFrame>: DataFrame consisting of business day differences for all
        series.
    """
    year_df = maxdate.year - df.apply(lambda x: x.dt.year)
    year_df *= 52
    week_max = maxdate.week
    week_df = week_max - df.apply(lambda x: x.dt.isocalendar().week)
    # Account for difference over a year.
    week_df += year_df
    # Account for weekends.
    week_df *= 2
    df = (maxdate - df).apply(lambda x: x.dt.days)
    return df - week_df
df_ends = dfd[["cid", "xcat", "real_date"]].groupby(["cid", "xcat"]).max()
df_ends["real_date"] = df_ends["real_date"].dt.strftime("%Y-%m-%d")
df_end = df_ends.unstack().loc[:, "real_date"]
df_end = df_end.apply(pd.to_datetime)
maxdate = df_end.max().max()
maxdate
df_days = business_day_dif(df=df_end, maxdate=maxdate)
df_right = df_days.iloc[0].to_frame()
df_merged = df_new2.merge(df_right, on="xcat", how="inner")
df_merged = df_merged.reindex(df_new2.index)
df_days_trans = pd.DataFrame(
[df_days.iloc[0].to_frame().index, df_days.iloc[0].to_frame().values]
).T
df_days_trans = df_days_trans.rename(columns={0: "xcat", 1: "days"})
df_days_trans["days"] = df_days_trans["days"].apply(int)
df_merged = df_new2.merge(df_days_trans, on="xcat", how="inner")
df_merged = df_merged.reindex(df_new2.index)
df_merged = df_merged.dropna()
df_new2_pivot = df_merged.pivot(index="row", columns="col", values="days")
df_new2_pivot = df_new2_pivot[order]
df_new2_pivot = df_new2_pivot.astype(float)
df_merged["label"] = (
df_merged["cat"]
.str.cat(df_merged["adj"].apply(lambda x: "\n" + x))
.str.cat(df_merged["trans"].apply(lambda x: "\n" + x))
.str.cat(df_merged["days"].apply("{0:.0f}".format), sep="\n")
)
labels = df_merged.pivot(index="row", columns="col", values="label")
labels = labels[order]
header = "Missing days of quantamental indicators."
sns.set(rc={"figure.figsize": (25, 25)})
sns.heatmap(
df_new2_pivot, annot=labels, fmt="", cmap="YlOrBr", linewidths=10, yticklabels=[]
)
plt.tick_params(
axis="both",
which="major",
labelsize=18,
labelbottom=False,
bottom=False,
top=False,
labeltop=True,
)
plt.xlabel("")
plt.ylabel("")
plt.title(header, fontsize=25, pad=50)
plt.show()
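The business_day_dif helper above approximates business days by subtracting two weekend days per elapsed calendar week. As an alternative sketch (not the method used in this notebook), NumPy's `busday_count` counts Monday-to-Friday days directly; it ignores public holidays unless a holiday list is supplied. The function name `business_days_missing` and the example dates are hypothetical.

```python
import numpy as np
import pandas as pd


def business_days_missing(last_dates: pd.Series, maxdate: pd.Timestamp) -> pd.Series:
    """Count weekdays (Mon-Fri) from each series' last date up to,
    but not including, maxdate."""
    begin = last_dates.values.astype("datetime64[D]")
    end = np.datetime64(maxdate.date(), "D")
    return pd.Series(np.busday_count(begin, end), index=last_dates.index)


# Hypothetical last available dates for two series
last = pd.Series(
    pd.to_datetime(["2023-07-07", "2023-07-13"]), index=["A_XCAT", "B_XCAT"]
)
missing = business_days_missing(last, pd.Timestamp("2023-07-14"))
print(missing.tolist())  # [5, 1]
```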
Average vintage grading is consistently low.
# Create vintage grading chart
dfd_grad = dfd[dfd["real_date"].apply(lambda x: x.date()) > date(2000, 1, 3)]
dfd_grad = dfd_grad[["xcat", "cid", "real_date", "grading"]]
dfd_grad["grading"] = dfd_grad["grading"].astype(float).round(2)
df_ags = dfd_grad.drop(["cid"], axis=1).groupby(["xcat"]).mean()
df_merged = df_new2.merge(df_ags, on="xcat", how="left")
df_merged = df_merged.reindex(df_new2.index)
df_merged = df_merged.dropna()
df_new2_pivot = df_merged.pivot(index="row", columns="col", values="grading")
df_new2_pivot = df_new2_pivot[order]
df_new2_pivot = df_new2_pivot.astype(float)
df_merged["label"] = (
df_merged["cat"]
.str.cat(df_merged["adj"].apply(lambda x: "\n" + x))
.str.cat(df_merged["trans"].apply(lambda x: "\n" + x))
.str.cat(df_merged["grading"].apply("{0:.1f}".format), sep="\n")
)
labels = df_merged.pivot(index="row", columns="col", values="label")
labels = labels[order]
header = "Average grade of vintages since 2000-01-03."
sns.set(rc={"figure.figsize": (25, 25)})
sns.heatmap(
df_new2_pivot, annot=labels, fmt="", cmap="YlOrBr", linewidths=10, yticklabels=[]
)
plt.tick_params(
axis="both",
which="major",
labelsize=18,
labelbottom=False,
bottom=False,
top=False,
labeltop=True,
)
plt.xlabel("")
plt.ylabel("")
plt.title(header, fontsize=25, pad=50)
plt.show()
The main set of China indicators informs on the economy with average time lags between 0 and 60 days from the end of the observed periods and between 50 and 300 days from the median date of the observed period.
# Create lag plots
def group_mean_sd(df, val):
    # Mean and standard deviation of a metric, grouped by category
    mean = df.groupby(["xcat"])[val].mean()
    sd = df.groupby(["xcat"])[val].std()
    return mean, sd
df_range = dfd
df_range["cat"] = df_range["xcat"].apply(lambda x: x.split("_")[0])
df_range["adj"] = df_range["xcat"].apply(lambda x: x.split("_")[1])
df_range["trans"] = df_range["xcat"].apply(lambda x: x.split("_")[2])
df_range["idea"] = df_range["cat"].apply(lambda x: cat_dict_to_key(x, cat_dict))
df_range["label"] = (
df_range["cat"]
.str.cat(df_range["adj"].apply(lambda x: "\n" + x))
.str.cat(df_range["trans"].apply(lambda x: "\n" + x))
)
df_mean_eop, df_sd_eop = group_mean_sd(df_range, "eop_lag")
order_eop = df_mean_eop.sort_values().index
df_mean_mop, df_sd_mop = group_mean_sd(df_range, "mop_lag")
order_mop = df_mean_mop.sort_values().index
sns.set_theme(style="darkgrid")
g = sns.barplot(
y="xcat",
x="eop_lag",
data=df_range,
ci="sd",
order=order_eop,
hue="idea",
dodge=False,
)
plt.title("End of observation period lags", fontsize=25)
plt.show()
h = sns.barplot(
y="xcat",
x="mop_lag",
data=df_range,
ci="sd",
order=order_mop,
hue="idea",
dodge=False,
)
plt.title("Median of observation period lags", fontsize=25)
plt.show()
History #
Business surveys #
Industry and non-manufacturing surveys have posted similar sentiment swings but very different amplitudes across various episodes. For example, the COVID and lockdown crisis had a disproportionate impact on the service sector.
xcatx = ["PMIMANU", "PMISERV", "PMIMANUORD"]
calcs = []
for xc in xcatx:
    calcs += [f"{xc}_SA_3MMAv50 = {xc}_SA_3MMA - 50"]
dfa = msp.panel_calculator(dfd, calcs=calcs, cids=["CNY"])
xcatxx = [xc + "_SA_3MMAv50" for xc in xcatx[:2]]
msp.view_timelines(
dfa,
xcats=xcatxx,
cids=["CNY"],
title="China main business confidence levels, daily information states",
title_fontsize=20,
title_adj=1.02,
xcat_labels=[
"Manufacturing PMI, 3-month average (minus 50)",
"Non-manufacturing PMI, 3-month average (minus 50)",
],
label_adj=0,
legend_fontsize=8,
start="1995-01-01",
size=(10, 5),
)
Similarly, the amplitudes of survey swings have been very different across the two sectors. The manufacturing sector has been more volatile than the non-manufacturing sector in the 2000s/2010s but this reversed during the COVID crisis.
xcatx = ["PMIMANU", "PMISERV"]
xcatxx = [xc + "_SA_D3M3ML3" for xc in xcatx]
labels = [
"PMI manufacturing survey, sa: diff 3m/3m",
"PMI non-manufacturing survey, sa: diff 3m/3m",
]
msp.view_timelines(
dfd,
xcats=xcatxx,
cids=["CNY"],
title="China main business surveys, sa, diff 3m/3m, daily information states",
legend_fontsize=8,
xcat_labels=labels,
start="1995-01-01",
title_fontsize=20,
title_adj=1.02,
size=(10, 5),
)
Industry #
Industrial profit growth has been a lot more volatile than production or order growth and reached negative territory recurrently.
xcatx = ["INDPROFIT"]
xcatxx = [xc + "_NSA_P1M1ML12_3MMA" for xc in xcatx]
labels = ["Industrial profits: %oya, 3mma"]
cidx = ["CNY"]
msp.view_timelines(
dfd,
xcats=xcatxx,
cids=cidx,
title="China industrial profit, %oya, daily information states",
title_fontsize=20,
title_adj=1.02,
xcat_labels=labels,
start="1990-01-01",
size=(10, 5),
)
Changes in operating rates do not have a long history and have displayed rather abrupt swings between positive and negative values.
xcatx = ["INDCAPUT"]
xcatxx = [xc + "_NSA_D1Q1QL4" for xc in xcatx]
labels = ["Industrial capacity utilization: diff 1q/1q oya"]
msp.view_timelines(
dfd,
xcats=xcatxx,
cids=["CNY"],
title="China industry capacity usage, changes over a year ago in %-points, daily information states",
title_fontsize=20,
title_adj=1.02,
xcat_labels=labels,
start="1990-01-01",
size=(10, 5),
)
Car output growth trends have displayed many small cycles and two large swings.
xcatx = ["CARPROD"]
xcatxx = [xc + "_NSA_P1M1ML12_3MMA" for xc in xcatx]
labels = ["Car production: %oya, 3mma"]
msp.view_timelines(
dfd,
xcats=xcatxx,
cids=["CNY"],
title="China car production, %oya, daily information states",
title_fontsize=20,
title_adj=1.02,
xcat_labels=labels,
start="1990-01-01",
size=(10, 5),
)
Electricity production growth has been fairly volatile and its overall pattern has not always been in sync with industry or GDP growth.
xcatx = ["ELECPROD"]
xcatxx = [xc + "_NSA_P1M1ML12_3MMA" for xc in xcatx]
labels = ["Electricity production: %oya, 3mma"]
msp.view_timelines(
dfd,
xcats=xcatxx,
cids=["CNY"],
title="China electricity production, 3m over 3m a year ago, daily information states",
title_fontsize=20,
title_adj=1.02,
xcat_labels=labels,
start="1990-01-01",
size=(10, 5),
)
Freight #
All forms of freight transport follow similar trends, although the rates at which they grow and decline differ.
xcatx = ["FREIGHT", "FREIGHTRAIL", "FREIGHTROAD", "FREIGHTAIR"]
xcatxx = [xc + "_NSA_P1M1ML12_3MMA" for xc in xcatx]
labels = [
"Freight volume total: %oya, 3mma",
"Freight volume rail: %oya, 3mma",
"Freight volume road: %oya, 3mma",
"Freight volume air: %oya, 3mma",
]
msp.view_timelines(
dfd,
xcats=xcatxx,
cids=["CNY"],
title="China freight indicators, %oya, daily information states",
title_fontsize=20,
title_adj=1.02,
xcat_labels=labels,
legend_fontsize=7,
start="1990-01-01",
size=(10, 5),
)
Private consumption #
Car sales and registrations have been a lot more volatile than overall retail sales growth and, on some occasions, even displayed swings in the opposite direction.
xcatx = ["VEHICLESALES", "CARREGS", "RSALES"]
xcatxx = [xc + "_NSA_P1M1ML12_3MMA" for xc in xcatx]
labels = [
"Vehicle sales: %oya, 3mma",
"Car registrations: %oya, 3mma",
"Retail sales: %oya, 3mma",
]
msp.view_timelines(
dfd,
xcats=xcatxx,
cids=["CNY"],
title="China consumer indicators, %oya, daily information states",
title_fontsize=20,
title_adj=1.02,
xcat_labels=labels,
legend_fontsize=8,
start="1990-01-01",
size=(10, 5),
)
Consumer confidence has displayed a mixture of high autocorrelation and sudden large shifts.
calcs = ["CCONF_NSA_3MMAv100 = CCONF_NSA_3MMA - 100"]
dfa = msp.panel_calculator(dfd, calcs=calcs, cids=["CNY"])
msp.view_timelines(
dfa,
xcats=["CCONF_NSA_3MMAv100"],
cids=["CNY"],
title="China consumer confidence indicators, daily information states",
xcat_labels=["Consumer confidence level (minus 100)"],
title_fontsize=20,
title_adj=1.02,
start="1995-01-01",
size=(10, 5),
)
Housing #
House price cycles have been similar across primary and secondary markets, but naturally with greater amplitudes in the latter.
xcatx = [
"REALPRMPRICE",
"REALSECPRICE",
]
xcatxx = [xc + "_NSA_P1M1ML12_3MMA" for xc in xcatx]
labels = [
"Real estate prices primary: %oya, 3mma",
"Real estate prices secondary: %oya, 3mma",
]
msp.view_timelines(
dfd,
xcats=xcatxx,
cids=["CNY"],
title="China house price indicators, %oya, daily information states",
title_fontsize=20,
title_adj=1.02,
legend_fontsize=8,
xcat_labels=labels,
start="1990-01-01",
size=(10, 5),
)
Residential building sales growth has posted tremendous swings in the past, with extreme values of over 700%.
xcatx = ["RESIDSALES"]
xcatxx = [xc + "_NSA_P1M1ML12_3MMA" for xc in xcatx]
labels = ["Commercial residential building sales: %oya, 3mma"]
msp.view_timelines(
dfd,
xcats=xcatxx,
cids=["CNY"],
title="Residential building sales, %oya, daily information states",
title_fontsize=20,
title_adj=1.02,
xcat_labels=labels,
start="1990-01-01",
size=(10, 5),
)
Finance #
New bank loan growth and total social financing growth have not always been in sync.
xcatx = ["NEWLOANS", "TSF"]
xcatxx = [xc + "_NSA_P1M1ML12_3MMA" for xc in xcatx]
labels = ["New loans: %oya, 3mma", "Total social financing: %oya, 3mma"]
msp.view_timelines(
dfd,
xcats=xcatxx,
cids=["CNY"],
title="China finance indicators, %oya, daily information states",
title_fontsize=20,
title_adj=1.02,
xcat_labels=labels,
start="1990-01-01",
size=(10, 5),
)
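One way to quantify how closely two growth series move together over time is a rolling correlation. The sketch below is a minimal illustration on synthetic data, assuming a 24-month window; the series `loans` and `tsf` are random stand-ins, not the actual new loans or total social financing indicators, and the window length is an arbitrary choice.

```python
import numpy as np
import pandas as pd

# Synthetic monthly growth series sharing a common component (illustration only).
rng = np.random.default_rng(0)
dates = pd.date_range("2015-01-01", periods=96, freq="MS")
common = rng.normal(size=96)
loans = pd.Series(common + rng.normal(size=96), index=dates)
tsf = pd.Series(common + rng.normal(size=96), index=dates)

# Correlation over a trailing 24-month window; low or negative values
# would flag periods in which the two series were out of sync.
rolling_corr = loans.rolling(24).corr(tsf)
print(rolling_corr.dropna().describe())
```

With the actual indicators, the two columns would be taken from the relevant tickers in `dfd` after reshaping to a wide format with dates as the index.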
Investment #
Urban Investment #
Urban investment growth shows no clear cyclical pattern but occasional sharp swings.
xcatx = ["FIXEDINV"] # ["URBANREALINV","FIXEDINV","URBINV","URBINVP"]
xcatxx = [xc + "_NSA_P1M1ML12_3MMA" for xc in xcatx]
labels = ["Total investments into urban assets: %oya, 3mma"]
msp.view_timelines(
dfd,
xcats=xcatxx,
cids=["CNY"],
title="China urban investment, %oya, daily information states",
title_fontsize=20,
title_adj=1.02,
xcat_labels=labels,
start="1990-01-01",
size=(10, 5),
)
Foreign direct investment #
Foreign direct investment displays vastly greater volatility than production indicators.
xcatx = ["FDI"]
xcatxx = [xc + "_NSA_P1M1ML12_3MMA" for xc in xcatx]
labels = ["Foreign Direct Investment: %oya, 3mma"]
msp.view_timelines(
dfd,
xcats=xcatxx,
cids=["CNY"],
title="China foreign direct investment, %oya, daily information states",
xcat_labels=labels,
title_fontsize=20,
title_adj=1.02,
start="1990-01-01",
size=(10, 5),
)
Importance #
Relevant research #
“China consumes half of the world’s base metal supply. Its housing market is the most metal-intensive large sector…China housing…is a crucial ingredient of forecasting models for directional commodity trading.” Macrosynergy
“China consumes about one third of the world’s commodities. However, its influence on commodity prices goes beyond that. Chinese institutions are also major users of commodities as collateral. Empirical evidence shows a significant link between domestic lending and global commodity prices, particularly through so-called commodity financing deals.” Macrosynergy