Business sentiment and commodity future returns

Business sentiment is a key driver of inventory dynamics in global industry and, therefore, a powerful indicator of aggregate demand for industrial commodities. Changes in manufacturing business confidence can be aggregated by industry size across all major economies to give a powerful directional signal of global demand for metals and energy. An empirical analysis based on information states of sentiment changes and subsequent commodity futures returns shows a clear and highly significant predictive relation. Various versions of trading signals based on short-term survey changes all produce significant long-term alpha. The predictive relation and value generation apply to all liquid commodity futures contracts.

The post below is based on proprietary research of Macrosynergy.

A Jupyter notebook for audit and replication of the research results can be downloaded here. The notebook operation requires access to J.P. Morgan DataQuery to download data from JPMaQS, a premium service of quantamental indicators. J.P. Morgan offers free trials for institutional clients. Also, there is an academic research support program that sponsors data sets for relevant projects.

This post ties in with this site’s summary of systematic trading strategies based on macro trends.

Why business confidence matters for commodity markets

Industrial commodities serve as inputs to industrial production, either as base materials for products or as fuel for energy production. Industrial production is very cyclical and is the main driver of both larger business cycles and mini-cycles in the economy. Beyond fluctuations in final sales, this cyclicality also reflects inventory dynamics: when business prospects improve, businesses in various stages of production seek to increase their stocks of inputs and outputs. When business prospects deteriorate, they lower their inventories. As a result, aggregate demand for industrial products and, by extension, raw materials fluctuates much more than aggregate final sales.

Manufacturing business surveys have been taken for decades in the major economies. These surveys usually inform on past and future expected business developments and their various aspects, such as orders, output, and prices. Overall composite survey indices are typically interpreted as “business sentiment” or “business confidence.” They are published at monthly and – sometimes – quarterly frequency and with fairly short time lags after the end of their observation periods, typically from a few days to 3 weeks. While business confidence tracks managers’ assessment and not actual activity, it is indicative of customer order trends and has very real implications for inventory dynamics and markets’ assessment thereof.

Quantamental macro data on business confidence

For a meaningful analysis of business survey data and financial contract returns one needs to use point-in-time quantamental data, such as the indicators of the J.P. Morgan Macrosynergy Quantamental System (“JPMaQS”). Generally, quantamental indicators are real-time information states of the market with respect to an economic concept and, hence, are suitable for backtesting related trading strategies.

For this post, we use quantamental data based on “manufacturing confidence scores” (view documentation here). These are real-time standardized and seasonally adjusted measures of manufacturing business confidence and their changes, based on one or more publicly available surveys per country. Unlike most other economic data, survey scores require point-in-time estimates of the neutral level of a sentiment indicator and of its standard deviation, in order to assess whether the market would have read a specific index level as “good” or “bad” and whether the indication would have been considered significant. Therefore, survey vintages are standardized by using historical means and volatility on the survey level. For the early years of a survey with short history, weighted averages of theoretical and empirical neutral levels have been used. Standard deviations are estimated based on mean absolute values of deviations from neutral levels.
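The scoring logic described above can be illustrated with a minimal Python sketch. It is not the JPMaQS implementation: it ignores data revisions and seasonal adjustment, and the blending rule (a hypothetical `prior_weight_obs` parameter that shrinks the empirical mean towards a theoretical neutral level when history is short) is an assumption made purely for illustration.

```python
import numpy as np

def point_in_time_score(levels, theoretical_neutral, min_obs=3, prior_weight_obs=12):
    """Score each survey reading using only the history available at that date.

    Assumptions (not from the source): the series is unrevised, and the neutral
    level is a blend whose empirical weight is n / (n + prior_weight_obs).
    """
    levels = np.asarray(levels, dtype=float)
    scores = np.full(levels.shape, np.nan)
    for t in range(min_obs - 1, len(levels)):
        history = levels[: t + 1]  # information state at time t
        n = len(history)
        # Weighted average of empirical and theoretical neutral levels.
        w_emp = n / (n + prior_weight_obs)
        neutral = w_emp * history.mean() + (1 - w_emp) * theoretical_neutral
        # Scale by the mean absolute deviation from the neutral level.
        mad = np.abs(history - neutral).mean()
        if mad > 0:
            scores[t] = (levels[t] - neutral) / mad
    return scores
```

For example, `point_in_time_score(pmi_levels, theoretical_neutral=50)` would score a diffusion index whose theoretical breakeven is 50, where `pmi_levels` is a placeholder for a monthly series. The key property is that the score at each date depends only on readings up to that date.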

For the present research, we consider three standard versions of manufacturing confidence score changes across 32 economies, all defined as daily information states based on concurrent vintages: % change of the last month versus the previous month, % change of the last 3 months versus the previous 3 months, and % change of the last 6 months versus the previous 6 months. To facilitate subsequent aggregation across these metrics and countries, all changes have been normalized based on local standard deviations.
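As a rough illustration of these change metrics, the sketch below computes the change of one averaging window over the preceding window for a monthly score series and then normalizes it. Two assumptions are made here that the text does not confirm: the change is taken as a simple difference of window averages (scores can be zero or negative, so a ratio would be ill-defined), and the normalization uses the full-sample standard deviation for brevity, whereas a point-in-time version would use only data available at each date.

```python
import numpy as np

def window_change(scores, window):
    """Mean of the last `window` months minus the mean of the `window` months before."""
    scores = np.asarray(scores, dtype=float)
    out = np.full(scores.shape, np.nan)
    for t in range(2 * window - 1, len(scores)):
        recent = scores[t - window + 1 : t + 1].mean()
        previous = scores[t - 2 * window + 1 : t - window + 1].mean()
        out[t] = recent - previous
    return out

def normalize(changes):
    """Divide by the standard deviation of the non-missing changes (full sample)."""
    changes = np.asarray(changes, dtype=float)
    return changes / np.nanstd(changes)
```

Calling `window_change(scores, 1)`, `window_change(scores, 3)` and `window_change(scores, 6)` would give the three lookback versions. Longer windows need more history before the first value is available.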

The graph below shows the historical dynamics of the three types of changes across all countries. It illustrates the trade-off between timeliness and volatility across lookback horizons. In some currency areas, only quarterly surveys are available. In these cases, quarterly changes are used instead of 3-month averages, and monthly changes cannot be calculated.

We perform two types of aggregations based on the country survey changes:

  • First, we take for each country the mean of the three normalized changes. This composite change metric is agnostic as to what would be the most informative lookback period but accommodates different time series patterns across surveys and is a fairly objective benchmark for testing proofs of concept of the predictive power of survey changes.
  • Second, we calculate weighted means of the composite and of the lookback-specific changes across all countries. The weights are countries’ shares in world industry value added based on 1-year moving averages (view documentation here).

For both types of aggregation, we do not propagate missing values. This means that if a metric or a country is missing, the aggregate indicator is calculated based on the available metrics or cross-sections. This preserves valuable historic information.
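A minimal sketch of such an aggregation that does not propagate missing values is given below. The function name and the array layout (rows as dates, columns as metrics or countries) are assumptions for illustration. Weights are re-normalized on each date over the entries that are actually available.

```python
import numpy as np

def weighted_nanmean(values, weights):
    """Row-wise weighted mean that ignores missing entries.

    values:  2-D array, rows = dates, columns = metrics or countries.
    weights: 1-D array (one weight per column) or 2-D array of the same shape.
    """
    values = np.asarray(values, dtype=float)
    weights = np.broadcast_to(np.asarray(weights, dtype=float), values.shape)
    available = ~np.isnan(values)
    w = np.where(available, weights, 0.0)  # zero weight for missing entries
    total = w.sum(axis=1)
    weighted_sum = (np.where(available, values, 0.0) * w).sum(axis=1)
    with np.errstate(invalid="ignore", divide="ignore"):
        # Dates with no available entry at all remain missing.
        return np.where(total > 0, weighted_sum / total, np.nan)
```

With equal weights this gives the per-country composite of the three change metrics. With industry value-added shares as weights it gives the global aggregate.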

The charts below show information states of normalized changes in manufacturing confidence scores for the three types of lookback (first chart) and the composite (second chart).

Industrial commodity futures and vol-targeted returns

The present analysis focuses on dollar-denominated futures returns of the most liquid industrial commodities outside China. These returns are based on positions in the front contract of the most liquid available future of each type of commodity. We classify the following underlying materials as industrial commodities:

  • Base metals: London Metal Exchange aluminum (ALM), Comex copper (CPR), London Metal Exchange lead (LED), London Metal Exchange nickel (NIC), London Metal Exchange tin (TIN), and London Metal Exchange zinc (ZNC).
  • Industrial precious metals: NYMEX palladium (PAL) and NYMEX platinum (PLT)
  • Fuels: ICE Brent crude (BRT), NYMEX WTI light crude (WTI), NYMEX natural gas, Henry Hub (NGS), NYMEX RBOB gasoline (GSO), and NYMEX heating oil, New York Harbor ULSD (HOL)

The targets of the analysis are returns of vol-targeted futures positions, as available on JPMaQS (view documentation here). These are returns in % of risk capital on positions scaled to a 10% annualized volatility target based on the historical standard deviation for an exponential moving average with a half-life of 11 days. Volatility scales are rebalanced at the end of each month. Vol-targeting secures risk parity for a commodity basket and prevents individual contract returns from dominating the long-term panel analysis.
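The vol-targeting rule can be sketched as follows. This is an illustration under stated assumptions, not the JPMaQS calculation: volatility is estimated from an exponentially weighted average of squared daily returns (zero mean assumed), annualized with 252 trading days, and the leverage fixed at the end of one month is applied throughout the next. The `month_ids` argument is a hypothetical integer label per trading day, for example year times 12 plus month.

```python
import numpy as np

def vol_targeted_returns(returns, month_ids, target=10.0, halflife=11, ann_days=252):
    """Scale daily % returns to an annualized volatility target, rebalanced monthly."""
    returns = np.asarray(returns, dtype=float)
    month_ids = np.asarray(month_ids)
    decay = 0.5 ** (1.0 / halflife)  # weight halves every `halflife` days

    # Exponentially weighted variance of daily returns (zero-mean assumption).
    ew_var = np.empty_like(returns)
    var = returns[0] ** 2
    for t, r in enumerate(returns):
        var = decay * var + (1 - decay) * r ** 2
        ew_var[t] = var
    ann_vol = np.sqrt(ew_var * ann_days)

    # Leverage is reset on the first day of each month, using the volatility
    # estimate from the last day of the previous month.
    leverage = np.full(returns.shape, np.nan)
    current = np.nan
    for t in range(1, len(returns)):
        if month_ids[t] != month_ids[t - 1]:
            current = target / ann_vol[t - 1] if ann_vol[t - 1] > 0 else np.nan
        leverage[t] = current
    return leverage * returns, leverage
```

Returns in the first month are left missing because no month-end volatility estimate exists yet. A basket return could then be approximated as the cross-sectional mean of the scaled contract returns.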

The chart below shows cumulative daily returns of vol-targeted positions across all industrial commodities. Generally, returns have been positively correlated, but long-term performance has been very different across underlying materials.

A strong case can be made for testing the predictive power of business confidence changes on a broad basket to mitigate the influence of supply and sectoral demand shocks. A risk-parity industrial commodities futures basket based on vol-targeted positions in the above 13 contracts shows strong seasonality in performance, much more so than in equity markets. A long-only industrial commodity exposure would only have produced a positive PnL in 2003-2008. This emphasizes the need for some form of exposure management.

Predicting industrial commodity basket returns

We apply the standard predictive analysis used in other quantamental research posts as an objective proof of concept. The main target in the analysis below is the industrial commodities’ risk-parity basket return. The regression chart below shows a clear and highly significant predictive relation between the global composite manufacturing confidence change and subsequent monthly commodity basket returns from 1995 to 2023 (mid-August). The relationship is also highly significant at a weekly frequency.

Other measures of predictive power support this finding:

  • Non-parametric correlation of survey changes and subsequent monthly commodity returns, based on the Kendall metric, is likewise highly significant.
  • Accuracy and balanced accuracy of correct prediction of commodity market direction are both 56.6% at the monthly frequency.
  • The predictive power of confidence changes has been strong for all lookback horizons since 1995. All have been highly significant based on both parametric and non-parametric correlation estimates. Accuracy of monthly predictions of commodity market direction has ranged from 55% to near 60%, with the shortest lookback producing the strongest relation.
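The two accuracy measures in the list above can be computed as in the following sketch, which takes an end-of-period signal and the return of the subsequent period. The function name is hypothetical. Accuracy is the share of periods in which the sign of the signal matched the sign of the subsequent return, while balanced accuracy averages the hit rates for positive and non-positive return periods separately, so it is not flattered by a signal that is simply long most of the time in a rising market.

```python
import numpy as np

def direction_metrics(signal, next_return):
    """Accuracy and balanced accuracy of the signal sign versus the subsequent return sign."""
    signal = np.asarray(signal, dtype=float)
    next_return = np.asarray(next_return, dtype=float)
    pred_up = signal > 0
    real_up = next_return > 0
    accuracy = np.mean(pred_up == real_up)
    # Hit rate among periods with positive returns (true positive rate).
    tpr = np.mean(pred_up[real_up])
    # Hit rate among periods with non-positive returns (true negative rate).
    tnr = np.mean(~pred_up[~real_up])
    return accuracy, 0.5 * (tpr + tnr)
```

Note that the sketch returns a missing value for balanced accuracy if the sample contains no positive or no non-positive returns.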

To gauge the quality of value generation of survey-based commodity positions, we calculate stylized PnLs, i.e., dollar-based profit and loss developments over and above funding costs, according to standard rules applied in previous posts.

  • Positions are taken in the risk-parity industrial commodity futures basket following the normalized survey score changes. We consider proportionate positioning, in accordance with normalized signal and a limit of 3 standard deviations, and binary positioning, i.e., unit long or short positions in accordance with the sign of the survey score change.
  • Positions are re-calculated weekly at the end of the week and re-balanced in the following week with a one-day slippage for trading time.
  • The long-term volatility of the PnL for positions across all currency areas has been set to 10% annualized. This is not proper vol-targeting but mainly a scaling that makes it easy to compare different types of PnLs in graphs.

It is important to note that this PnL calculation method does not consider transaction costs or realistic risk management rules, which would depend on portfolio size and the institution that is trading. The purpose of naïve PnLs is to inform purely on the value generation of the factor.
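The rules above can be sketched in a few lines of Python. This is a simplified illustration, not the code behind the reported results. In particular, the timing convention is an assumption: a position decided at the close of the last day of a week is traded on the next day (the slippage day) and earns returns from the day after that. The `week_ids` argument is a hypothetical integer label per trading day.

```python
import numpy as np

def naive_pnl(signal, returns, week_ids, binary=False, cap=3.0,
              target_vol=10.0, ann_days=252):
    """Stylized daily PnL of a signal applied to a basket return series."""
    signal = np.asarray(signal, dtype=float)
    returns = np.asarray(returns, dtype=float)
    week_ids = np.asarray(week_ids)
    n = len(signal)

    # Proportionate positions are capped at +/- `cap`; binary positions are +/- 1.
    raw = np.sign(signal) if binary else np.clip(signal, -cap, cap)

    # Positions are re-calculated only on the last trading day of each week.
    decided = np.zeros(n)
    current = 0.0
    for t in range(n):
        if t + 1 < n and week_ids[t + 1] != week_ids[t]:
            current = raw[t]
        decided[t] = current

    # One day to trade plus one day of lag: decision at t earns returns from t + 2.
    held = np.zeros(n)
    held[2:] = decided[:-2]
    pnl = held * returns

    # Ex-post scaling to a common annualized volatility for comparability.
    vol = pnl.std() * np.sqrt(ann_days)
    return pnl * target_vol / vol if vol > 0 else pnl
```

Because the final scaling uses the full-sample volatility of the PnL, it only serves comparability across strategies and could not be replicated in real time.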

Using the global composite manufacturing confidence change as a factor, the naïve PnL points to considerable alpha generation. The long-term Sharpe ratio of the proportionate strategy from 1995 to 2023 (mid-August) would have been 0.79, and the Sharpe ratio of the binary strategy 0.77. Importantly, the correlation of this strategy with major benchmarks, such as the S&P500 and U.S. 10-year treasury return, has been near zero. For comparison, the long-term Sharpe ratio of a long-only position in the industrial commodities futures’ basket would have been 0.32, with a 30% correlation with U.S. equity market returns. This suggests that survey-based aggregate exposure management would greatly enhance the value of commodity exposure.
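For reference, the performance statistics quoted here can be computed from a daily PnL series as in the sketch below. It assumes 252 trading days per year and daily PnL values that are already net of funding costs, as in the stylized PnL definition. The function names are illustrative.

```python
import numpy as np

def sharpe_ratio(daily_pnl, ann_days=252):
    """Annualized mean PnL divided by annualized PnL volatility."""
    daily_pnl = np.asarray(daily_pnl, dtype=float)
    ann_mean = daily_pnl.mean() * ann_days
    ann_vol = daily_pnl.std(ddof=1) * np.sqrt(ann_days)
    return ann_mean / ann_vol

def benchmark_correlation(daily_pnl, benchmark_returns):
    """Pearson correlation between the strategy PnL and a benchmark return series."""
    return np.corrcoef(daily_pnl, benchmark_returns)[0, 1]
```

Comparing the Sharpe ratio of the signal-based PnL with that of a long-only basket, alongside their benchmark correlations, is the comparison made in the paragraph above.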

The survey-based commodity PnL has displayed seasonality. Generally, there has been more value generation in pronounced industry cycles. Also, there was no positive PnL generation in the 1990s when the range of surveys and number of tradable contracts was narrower.

Signals based on confidence changes have produced significant value for all lookback periods. Simulated Sharpe ratios have ranged between 0.50 and 0.91, with shorter lookbacks producing higher value.

Predicting individual contract returns

Finally, we take a look at the predictive power of information states of global manufacturing confidence changes on individual industrial commodity contracts. The main finding is that surveys help predict the returns of all industrial commodity futures.

Accuracy and balanced accuracy of the prediction of monthly return direction based on the global composite survey change have been above 50% for all commodities. Even the lowest monthly accuracy (gasoline) was 51%. The accuracy for nickel contracts was over 58%.

Similarly, the probability of a positive correlation, based on parametric and non-parametric tests, has generally been high. For 9 out of 13 contracts it has been above 95% or 99%. Only lead posted a sub-50% positive correlation probability.

Finally, all commodities have contributed to the survey-based futures contract PnL since 1995, notwithstanding the crude “one-size-fits-all” type of signal. There have been notable performance differences, however, with nickel recording the strongest PnL generation and U.S. gasoline the weakest.
