Predicting base metal futures returns with economic data

Unlike other derivatives markets, for commodity futures, there is a direct relation between economic activity and demand for the underlying assets. Data on industrial production and inventory build-ups indicate whether recent past demand for industrial commodities has been excessive or repressed. This helps to spot temporary price exaggerations. Moreover, changes in manufacturing sentiment should help predict turning points in demand. Empirical evidence based on real-time U.S. data and base metal futures returns confirms these effects. Simple strategies based on a composite score of inventory dynamics, past industry growth, and industry mood swings would have consistently added value to a commodities portfolio over the past 28 years, without adding aggregate commodity exposure or correlation with the broader (equity) market.

The below post is based on proprietary research of Macrosynergy Ltd. 

A Jupyter notebook for audit and replication of the research results can be downloaded here. The notebook operation requires access to J.P. Morgan DataQuery to download data from JPMaQS, a premium service of quantamental indicators. J.P. Morgan offers free trials for institutional clients.

This post ties in with this site’s summary of trading strategies based on macro trends.

Economic data and industrial commodity markets

Unlike for bonds and equity, in the commodity futures markets, economic information effectively also informs on past flows of the underlying asset. Specifically, data on manufacturing orders, production, and inventories are related to the purchases of raw materials by industrial consumers. In other financial markets, economic information mostly affects subsequent demand and supply. This is important for formulating hypotheses on the relationship between economic data and commodity returns.

In particular, this post focuses on predicting changes in the pace of physical commodity demand based on two effects:

  • Excess demand: Since physical commodities require storage and neither shortages nor excesses of stocks are desirable, it is plausible that commodity demand, to some extent, rises and falls with industrial activity. Hence, if past manufacturing activity has expanded at an above-par rate and inventories increased, we can diagnose temporary excess demand and increased probability of negative payback. Conversely, if past industry growth has been unusually low or negative and inventories decreased, we can diagnose a temporary shortfall and a higher likelihood of acceleration in demand.
  • Mood swings: If manufacturing business confidence has weakened in recent months, we expect, all other things equal, that industrial commodity demand will slow or even decline accordingly. As sentiment is often affected by orders or order prospects, rather than production, it should on balance have some predictive power for physical consumption.

We test these hypotheses for industrial or base metal futures returns. The underlying materials are aluminum (ALM), copper (CPR), lead (LED), nickel (NIC), tin (TIN), and zinc (ZNC). Put simply, reported excess demand and deteriorating manufacturing sentiment are both expected to be followed by lower metal futures returns.

Some words about generic metal excess returns

Generic metal futures returns are taken from the J.P. Morgan Macrosynergy Quantamental System or “JPMaQS”. The documentation page for commodity returns can be found here (requires J.P. Morgan Markets access). The return is simply the % change of the futures price. JPMaQS constructs the continuously rolling futures series directly from the individual contracts that are part of the regular trading cycle for that commodity. The return calculation assumes that futures positions are rolled (from front to second contract) on the first day of the month when the front contract is deliverable. The specific contracts are London Metal Exchange aluminum, lead, nickel, and tin, as well as Comex copper.

To disentangle the specific performance of base metals contracts from common factors that drive all commodity returns and to avoid aggregate exposure to (and risk premia of) the broad commodity market, we focus on the relative returns of each metal contract versus a basket of non-industrial commodity contracts. For this purpose, we first standardize all individual commodity futures returns to those of positions with 10% (annualized) volatility targets. Volatility is estimated as the historic standard deviation of returns, using an exponential moving average with a half-life of 11 days. Positions are rebalanced at the end of each month. Then we subtract the returns of a non-industrials’ basket from each base metal return. The non-industrials’ basket includes, in equal weights, sub-baskets of precious metals, U.S. agricultural commodities, other agricultural commodities, and livestock.
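The two steps above (volatility targeting, then subtracting the non-industrials basket) can be sketched in plain Python. This is only an illustration under stated assumptions, not the JPMaQS calculation: the function names are hypothetical, returns are daily decimals, the volatility estimate assumes a zero mean, annualization uses 252 trading days, and leverage set on a rebalancing day applies from the next day onward.

```python
import math

def ewm_vol(returns, halflife=11.0, periods_per_year=252):
    """Annualized exponentially weighted volatility; vols[t] uses returns up to day t.
    Assumption: zero-mean returns and a weight that halves every `halflife` days."""
    decay = 0.5 ** (1.0 / halflife)
    vols, num, den = [], 0.0, 0.0
    for r in returns:
        num = decay * num + r * r      # weighted sum of squared returns
        den = decay * den + 1.0        # sum of weights
        vols.append(math.sqrt(num / den * periods_per_year))
    return vols

def vol_targeted_returns(returns, rebalance_days, target=0.10, halflife=11.0):
    """Returns of a position whose leverage is reset on rebalancing days
    (e.g. month ends) to hit the annualized volatility target."""
    vols = ewm_vol(returns, halflife)
    leverage, out = 0.0, []
    for t, r in enumerate(returns):
        out.append(leverage * r)       # leverage was fixed before today's return
        if t in rebalance_days and vols[t] > 0:
            leverage = target / vols[t]
    return out

def equal_weight_basket(series_list):
    """Period-by-period average of several return series of equal length."""
    return [sum(vals) / len(vals) for vals in zip(*series_list)]

def relative_returns(metal, basket):
    """Vol-targeted metal return minus the non-industrials basket return."""
    return [m - b for m, b in zip(metal, basket)]
```

In this sketch, each base metal series and each non-industrial sub-basket would first pass through `vol_targeted_returns`, and `relative_returns` would then be applied per metal.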

The below panel shows cumulative outright and relative base metal futures returns. The different metals have their own long-term patterns, but also much shared short-term variation and a common “super-cycle” in the late 1990s and 2000s.

Some simple U.S. economic indicators

As usual, for a meaningful analysis of the impact of economic data on market returns, we use indicators of the J.P. Morgan Macrosynergy Quantamental System (“JPMaQS”). Quantamental indicators are real-time information states of the market and the public with respect to an economic concept and, hence, are suitable for testing relations with subsequent returns and backtesting related trading strategies.

Since commodity prices depend on global demand, the best economic data to use would, in principle, be global aggregates or proxies. However, here we limit the analysis to U.S. indicators for two reasons. First, U.S. point-in-time data vintages (unlike China’s) are available in good quality back to the mid-1990s. Second, industry cycles are globally correlated and here we only want to support the proof of concept for the relevance of those cycles for metals trading, rather than optimizing a trading factor.

The focus is on real-time information on the following economic statistics:

  • Business inventories of domestic manufacturing and trade business, as % over a year ago and as a 3-month moving average. This is a representation of the recent stock building.
  • Industrial production, as % over a year ago and as a 12-month moving average. This is a representation of the past cyclical state of the industry sector.
  • ISM manufacturing survey, in the form of differences in the main survey index over the past three months. This is a representation of the direction and magnitude of sentiment changes.
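The transformations named in the list above can be illustrated with a few small helpers. This is a minimal sketch on a single monthly series with hypothetical function names; it does not reproduce the point-in-time vintage handling that JPMaQS applies, which is the part that makes the indicators suitable for backtesting.

```python
def pct_over_year_ago(levels, lag=12):
    """% change of a monthly level series versus the same month a year earlier."""
    return [None if t < lag else 100.0 * (levels[t] / levels[t - lag] - 1.0)
            for t in range(len(levels))]

def moving_average(values, window):
    """Trailing moving average; None until a full window of valid values exists."""
    out = []
    for t in range(len(values)):
        chunk = values[max(0, t - window + 1): t + 1]
        if len(chunk) < window or any(v is None for v in chunk):
            out.append(None)
        else:
            out.append(sum(chunk) / window)
    return out

def change_over(values, months=3):
    """Difference of a survey index versus its value `months` months ago."""
    return [None if t < months else values[t] - values[t - months]
            for t in range(len(values))]
```

For example, the inventory indicator would correspond to `moving_average(pct_over_year_ago(levels), 3)` and the sentiment indicator to `change_over(ism_index, 3)`, where `levels` and `ism_index` are placeholder input series.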

In accordance with the above hypothesis of payback for excess demand and predictive power of mood swings, we expect the former two indicators to be negatively related to metals returns and the latter positively so. For testing, all indicators are z-scored, with values capped at 3 standard deviations in either direction, and appropriately combined.
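As an illustration of capped z-scoring, the sketch below normalizes each observation using only the history available up to that date. The expanding window, the use of the sample mean as the neutral level, and the function name are assumptions made for this example; the post does not specify these details.

```python
import statistics

def capped_zscores(values, cap=3.0, min_obs=2):
    """Expanding-window z-scores, winsorized at +/- `cap` standard deviations.
    Assumption: mean and standard deviation are re-estimated at each date
    from data up to and including that date, so no future data is used."""
    out = []
    for t in range(len(values)):
        history = values[: t + 1]
        if len(history) < min_obs:
            out.append(None)           # not enough data to estimate dispersion
            continue
        sd = statistics.stdev(history)
        if sd == 0:
            out.append(0.0)            # flat history: treat as neutral
            continue
        z = (values[t] - statistics.fmean(history)) / sd
        out.append(max(-cap, min(cap, z)))
    return out
```

Capping keeps a single extreme reading, such as a crisis-period collapse in production, from dominating the composite signal.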

Empirical evidence for the predictive power of excess demand

As hypothesized, both past inventory growth and past industrial production growth have negatively predicted future monthly or quarterly base metal excess returns (relative to non-industrial commodity returns) from 1995 to 2023. Across the full panel, directional monthly accuracy, the ratio of correct prediction of metal futures’ return direction, has been 52% for the two indicators separately and 53% for a composite score. The (negative) forward correlation coefficients have been 7-7.5% and are highly significant at a monthly or quarterly frequency.
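The two statistics quoted above can be computed as in the following sketch for a single contract. The function names are hypothetical, and the alignment assumes that the signal observed at the end of one period is paired with the return of the next period. The panel-level significance tests used in the research are not reproduced here.

```python
import math

def align_forward(signals, returns):
    """Pair the signal at the end of period t with the return of period t+1."""
    return signals[:-1], returns[1:]

def directional_accuracy(signals, next_returns):
    """Share of periods in which the signal sign matches the next return sign."""
    hits = [(s > 0) == (r > 0) for s, r in zip(signals, next_returns)]
    return sum(hits) / len(hits)

def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)
```

For indicators that are expected to predict returns negatively, such as inventory or production growth, the sign would be flipped before calling `directional_accuracy`.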

Across contracts, negative correlations prevailed in 5 of the 6 cross-sections, with lead being the odd metal out. The strongest negative forward return correlations were recorded for aluminum and tin, with quarterly forward correlation coefficients of over 20% in absolute terms.

Empirical evidence for the predictive power of sentiment changes

The data for 1995-2023 also confirm the predictive power of mood swings in manufacturing for subsequent monthly and quarterly metals futures excess returns (relative to non-industrial commodity returns). Monthly accuracy based on ISM manufacturing survey changes has been 51.5%. Pearson correlation across all contracts has been 11% and highly significant.

As an aside, the predictive power of changes in the Philadelphia Fed manufacturing survey for commodities has historically been much greater than for the ISM, with 52.4% monthly directional prediction accuracy and a correlation coefficient as high as 17%. As a regional survey (albeit a widely watched one), it is conceptually inferior to the ISM and therefore less suitable for a proof of concept. Still, the finding suggests that trading signals should consider a range of available surveys.

ISM survey changes have displayed a positive correlation with subsequent quarterly returns of all six base metal futures. However, the strength of the correlation has been different. On a quarterly basis, it was strong for aluminum, tin, and zinc, but quite faint for nickel and lead.

Empirical evidence for predictive power and investment value of a composite score

To test for correlation, accuracy, and trading value, we average the z-scores of excess demand and sentiment changes. The excess demand score itself is an average of the inventory and industrial production scores and is used as a negative input value. The sentiment score is merely based on ISM survey changes.
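The averaging described above amounts to the following arithmetic, shown here as a small sketch with a hypothetical function name. Inputs are assumed to be z-scores for a single date.

```python
def composite_score(z_inventory, z_production, z_sentiment):
    """Average of the negated excess demand score and the sentiment score."""
    excess_demand = (z_inventory + z_production) / 2.0   # average of the two z-scores
    return (-excess_demand + z_sentiment) / 2.0          # excess demand enters negatively

# Strong inventory and production growth with a mild sentiment improvement
# gives a negative composite signal for metals versus non-industrials.
example = composite_score(1.0, 2.0, 0.5)
```

Because the excess demand score is itself an average, inventories and production each carry a quarter of the weight in the composite, while the ISM change carries half.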

The composite score displayed monthly directional accuracy of 54.8% with respect to subsequent metals excess returns. The correlation coefficient was near 10% and positive for all metal contracts. Also, above-50% accuracy and positive correlation have been reasonably consistent across time, prevailing in 72% and 62% of all years respectively.

This low-key but consistent predictive value is reflected in the performance of related trading strategies. As for other proofs of concept, we simulated two simple strategies based on the composite score as the only positioning signal. One strategy takes positions proportionate to the score; the other takes positions of constant size but in accordance with the sign of the score. In both cases, the score is applied equally to all contracts, in the form of relative vol-targeted positions in metals versus the non-industrial commodity contracts. Rebalancing takes place monthly with one-day slippage. Transaction costs are not considered. PnLs are scaled to a 10% annualized volatility.
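A simplified version of the two strategies can be sketched as follows. The function names are hypothetical, and the sketch makes loud simplifications: it works on monthly data for a single relative-return series, applies the score from the end of one month to the next month's return, and ignores the one-day slippage and the multi-contract aggregation used in the actual simulation.

```python
import math
import statistics

def strategy_pnl(scores, relative_returns, binary=False):
    """PnL of positions set by the previous period's score.
    binary=False: position proportionate to the score.
    binary=True: constant-size position in the direction of the score."""
    pnl = []
    for t in range(1, len(relative_returns)):
        s = scores[t - 1]                      # signal known before period t
        if binary:
            position = 1.0 if s > 0 else -1.0 if s < 0 else 0.0
        else:
            position = s
        pnl.append(position * relative_returns[t])
    return pnl

def scale_to_vol(pnl, target=10.0, periods_per_year=12):
    """Rescale a PnL series ex post to an annualized volatility target (in %)."""
    ann_vol = statistics.stdev(pnl) * math.sqrt(periods_per_year)
    return [p * target / ann_vol for p in pnl]
```

The ex-post scaling only makes the PnL charts comparable; it does not change Sharpe or Sortino ratios.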

The below chart shows the naïve PnLs for both basic strategies and a “long only” strategy, which here is a strategy that is always long the base metal contracts versus the non-industrials.

The average Sharpe ratio has been around 0.5 and the average Sortino ratio 0.6-0.8 for the period 1995 to 2023 (February 22). The binary long-short strategy performed a little better.
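For reference, the two performance ratios can be computed along the following lines. This is a sketch with hypothetical function names, assuming a monthly PnL series. No risk-free rate is subtracted because futures returns are already excess returns, and the Sortino ratio here uses downside deviation relative to zero, which is one common convention and may differ from the one used in the research.

```python
import math
import statistics

def sharpe_ratio(pnl, periods_per_year=12):
    """Annualized mean return divided by annualized standard deviation."""
    return statistics.fmean(pnl) / statistics.stdev(pnl) * math.sqrt(periods_per_year)

def sortino_ratio(pnl, periods_per_year=12):
    """Annualized mean return divided by annualized downside deviation,
    where only returns below zero contribute to the denominator."""
    downside = math.sqrt(sum(min(p, 0.0) ** 2 for p in pnl) / len(pnl))
    return statistics.fmean(pnl) / downside * math.sqrt(periods_per_year)
```

A Sortino ratio above the Sharpe ratio, as reported above, indicates that a good part of the PnL volatility came from upside moves.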

The performance ratios may not look that high before transaction costs but are respectable given that the data input into the signals is very basic. More importantly, value generation has been consistent across time and not concentrated on just a few episodes. Also, the strategies did not require aggregate exposure to the commodities market and posted virtually no correlation with broad equity market returns. Thus, this simple economic data-based rule can be seen as an “add-on” return generator for a broad commodity futures portfolio.

