Real-time growth estimation with reinforcement learning

Survey data and asset prices can be combined to estimate high-frequency growth expectations. This is a specific form of nowcasting that implicitly captures all types of news on the economy, not just official data releases. Methods for estimation include the Kalman filter, MIDAS regression, and reinforcement learning. Because reinforcement learning is model-free, it requires fewer estimated parameters and can therefore be more efficient. A recent paper suggests that this efficiency gain substantially improves nowcasts of growth expectations. Nowcasting with reinforcement learning can also be applied to expectations for a variety of other macro variables.

Chaudhry, Aditya and Sangmin Oh (2020), “High-Frequency Expectations from Asset Prices: A Machine Learning Approach”.

The sections below are mostly quotes from the above paper. Text in italics and text in brackets have been added for clarity.
The post ties in with this site’s summary on quantitative methods for macro information efficiency.

The basic idea

“While…expectations-based research in macroeconomics and finance relies on low-frequency surveys…[a] multitude of events pass between survey dates…[calling for a] method [that] allows…to construct a daily time-series.”

“Investors price assets based on their beliefs about the joint distribution of the stochastic discount factor and the asset’s cash flows. One of the key drivers of investor expectations is news of macroeconomic events such as impending trade wars, interest rate changes, or announcements of new tax policy.”

“With a single asset, we cannot extract the component of asset returns driven solely by changes in expectations of macroeconomic growth. But with multiple assets, a suitable linear combination of them can cancel the extraneous sources of return variation and deliver an estimate of the change in growth expectations. In other words, the econometrician’s task can be interpreted as finding an optimal combination of asset returns that correlates maximally with the change [in] investors’ expectations of future macroeconomic growth.”
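
The cancellation argument can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the paper's specification: the two shocks, the factor loadings, and the names `r_equity`, `r_bond` and `r_squared` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000  # number of simulated periods (hypothetical)

# Two latent shocks: the change in growth expectations and an unrelated factor.
d_growth = rng.normal(size=n)
other = rng.normal(size=n)

# Hypothetical loadings: both assets load on both shocks, with different signs.
r_equity = 1.0 * d_growth + 1.0 * other
r_bond = -0.5 * d_growth + 1.0 * other


def r_squared(X, y):
    """R-squared of an OLS regression of y on X (with intercept)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()


# One asset alone cannot separate the two shocks.
r2_single = r_squared(r_equity[:, None], d_growth)
# Two assets can: r_equity - r_bond is proportional to d_growth.
r2_pair = r_squared(np.column_stack([r_equity, r_bond]), d_growth)
```

In this stylized setup the single-asset regression explains roughly half of the variance of the growth shock, while the two-asset combination recovers it almost exactly because the unrelated factor cancels. Real asset returns carry many more factors, so the fit would be far less clean.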

Estimation methods

“Since we are interested in measuring the expectations of growth, we utilize GDP growth forecasts from the Survey of Professional Forecasters. The survey occurs every quarter, asking participants for quarterly projections up to five quarters ahead as well as annual projections for the current year and the following year.”

“We construct a daily time series of investor expectations of macroeconomic growth. Since surveyed expectations are available at a quarterly frequency, our task is to recover the unobserved daily series of expectations between two quarterly survey release dates…We utilize daily asset prices that reflect investors’ updated beliefs about macroeconomic growth. Thus, as econometricians we tackle the inverse problem of extracting beliefs from daily asset prices.”

“We consider an economy in which expected returns and dividend growth across assets and over all horizons are linear in common factors, one of which is macroeconomic growth…We interpret asset prices broadly to include interest rates, spreads, returns, and various measures related to the value of financial assets. Since we seek to construct a daily time-series, we require assets for which liquid daily returns are available.”

“We find that the following pair of assets explain the greatest amount of variance in the quarterly forecast innovations (R2 of 38.3%): the U.S. Treasury five-year fixed-term index and the value-weighted [equity] portfolio. Other pairs of assets involving bond returns, credit spreads, and VIX also yield sizable R2 values of over 25%. Thus, we found empirically that asset returns contain useful information about forecast innovations.”

“We provide a comparison of three approaches…mixed data sampling (MIDAS)…Kalman filtering…[and] reinforcement learning…through which an econometrician can estimate the latent factor processes.”

  • “Ghysels and Wright (2009)…propose a mixed frequency data sampling (MIDAS) approach for using asset price data to predict the forecasts of professional forecasters…MIDAS regressions forecast low-frequency variables from higher-frequency predictors.” [For an introduction to nowcasting with MIDAS view post here.]
  • “[In] the Kalman filtering…rearranging the state-space and observation equations from our simulated economy yields a final system of state and observation equations…We fit the model in-sample using maximum likelihood and then use the estimated Kalman filter to obtain daily estimates out-of-sample.” [For an introduction to nowcasting with Kalman filters view post here.]
  • “We propose a simple reinforcement learning approach using asset prices to estimate high-frequency expectations of macroeconomic growth. Specifically, we provide a framework for constructing daily series of the cross-sectional mean of growth forecasts and find that our method proves efficient and robust to model specifications.”
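
To make the MIDAS idea more concrete, the sketch below shows one common way of collapsing many daily observations into a single low-frequency regressor: exponential Almon lag weights. The quoted text does not say which weighting scheme the cited authors use, so the scheme, the parameter values, and the function name `exp_almon_weights` are assumptions for illustration only.

```python
import numpy as np


def exp_almon_weights(n_lags, theta1, theta2):
    """Exponential Almon lag weights; positive and normalized to sum to one."""
    k = np.arange(1, n_lags + 1)
    raw = np.exp(theta1 * k + theta2 * k ** 2)
    return raw / raw.sum()


# Hypothetical parameters: two numbers shape the weights on 60 daily lags.
weights = exp_almon_weights(n_lags=60, theta1=0.05, theta2=-0.005)

# Simulated daily returns stand in for a real high-frequency predictor.
daily_returns = np.random.default_rng(3).normal(size=60)

# A single weighted aggregate that could enter a quarterly regression.
aggregated = float(weights @ daily_returns)
```

The appeal of such a scheme is parsimony: the whole lag profile is governed by two parameters rather than one coefficient per daily lag.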

“The core difficulty of our task is obtaining a daily law of motion for expectations given quarterly training data. The Kalman filter approach accomplishes this task by imposing parametric assumptions and using maximum likelihood. The reinforcement learning approach builds upon the Kalman filter approach by directly estimating the Kalman gain using a linear learning rule, and we have illustrated the bias-efficiency trade-off that occurs at this step. The MIDAS benchmark also estimates a linear learning rule from quarterly data.”
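
The contrast between the two approaches can be written down in textbook notation. The symbols below are assumptions for illustration and not the paper's own: x is the latent state, y_t the observation, H the observation matrix, K_t the Kalman gain, r_t the vector of daily asset returns, w a fixed weight vector, T_j the last day of quarter j, and J the number of quarters in the training sample.

```latex
% Textbook Kalman update: the gain K_t is derived from the fitted model parameters.
\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t \left( y_t - H \hat{x}_{t|t-1} \right)

% Linear learning rule: the gain w is estimated directly from end-of-quarter errors.
\hat{g}_{t} = \hat{g}_{t-1} + w^{\top} r_{t},
\qquad
w^{*} = \arg\min_{w} \frac{1}{J} \sum_{j=1}^{J}
        \left( \hat{g}_{T_j} - g^{\text{survey}}_{j+1} \right)^{2}
```

In the first line the gain is a by-product of a parametric model estimated by maximum likelihood. In the second, under these assumptions, the weights are chosen only to make the last daily estimate of each quarter match the next survey value.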

What is reinforcement learning anyway?

“Reinforcement learning is a subfield of machine learning that teaches an agent how to choose an action from its action space, within a particular environment…to maximize rewards over time. Reinforcement learning has four essential elements: [1] Agent [refers to] the program you train, with the aim of doing a job you specify. [2] Environment [refers to] the world, real or virtual, in which the agent performs actions. [3] Action [refers to] a move made by the agent, which causes a status change in the environment. [4] Rewards [refers to] the evaluation of an action, which can be positive or negative.” [from Dan Lee’s post on Medium]

[Embedded media: Reinforcement learning demonstrated by cats]

“In general, reinforcement learning algorithms enable an agent to learn the optimal policy that dictates what action to take given the current state. In our setting, the agent’s state is the current expectation of next quarter growth and the current asset returns. The policy is the function of the current state that yields the agent’s new growth expectation, and action is the agent’s updated growth expectation.”
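
The mapping of state, policy and action onto this setting can be sketched in a few lines. This assumes a linear policy; the function name `linear_policy`, the weights, and the return values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np


def linear_policy(expectation, returns, weights):
    """Policy: map the state (current expectation, today's returns) to the
    action, which is the updated growth expectation."""
    return expectation + float(np.dot(weights, returns))


weights = np.array([0.10, -0.05])  # hypothetical policy weights
expectation = 2.0                  # hypothetical last survey mean, in percent

# Three days of hypothetical returns on two assets.
daily_returns = np.array([[0.5, 0.2],
                          [-1.0, 0.4],
                          [0.3, -0.6]])

path = []
for r in daily_returns:
    # Each action becomes part of the next day's state.
    expectation = linear_policy(expectation, r, weights)
    path.append(expectation)
```

Under these made-up numbers the daily estimates move from 2.0 to 2.04, 1.92 and 1.98. The only objects to be learned are the policy weights.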

“Unlike the Kalman filter…reinforcement learning is model-free in that it does not require an explicit model of the underlying state transition dynamics of the environment. Instead of using maximum likelihood to fit model parameters and then computing the optimal Kalman gain, reinforcement learning directly learns the policy function. Therefore, reinforcement learning enables more efficient estimation by omitting the model of the state transitions.”

[Embedded media: Introduction to context, potential and limitations of reinforcement learning]

[Embedded media: Reinforcement learning in Python]

How is reinforcement learning applied to real-time growth estimation?

“In our setting, [reinforcement learning is based on] the observed cross-sectional mean expectations of a survey of forecasters in two consecutive quarters, and…a policy function that yields daily estimates of the latent cross-sectional mean between these two quarterly releases. Specifically, the reinforcement learning agent observes [the mean expectations] on the survey release date at the start of quarter j and iteratively uses observed asset returns to construct daily estimates…The agent continues this process until the next survey release date at the start of the quarter…at which point [the new mean expectation] is revealed and the loss function…can be computed. The optimal policy minimizes the average end-of-quarter loss.”
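
The procedure described above can be sketched on simulated data. This is a simplified illustration and not the paper's algorithm: it assumes a linear policy and a squared end-of-quarter loss, in which case the loss-minimizing weights coincide with a least-squares fit of quarterly survey changes on cumulated daily returns. All sample sizes, weights and names (`daily_path`, `end_of_quarter_loss`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_quarters, n_days, n_assets = 80, 60, 2  # hypothetical sample sizes
w_true = np.array([0.08, -0.04])          # hypothetical "true" daily weights

# Simulated daily asset returns: shape (quarters, days, assets).
returns = rng.normal(size=(n_quarters, n_days, n_assets))

# Survey means are observed only at quarter starts. Their quarterly change is
# driven by cumulated daily returns plus survey noise.
survey_change = returns.sum(axis=1) @ w_true + 0.05 * rng.normal(size=n_quarters)
survey = 2.0 + np.concatenate([[0.0], np.cumsum(survey_change)])


def daily_path(start, quarter_returns, w):
    """Roll the linear policy forward one day at a time within a quarter."""
    return start + np.cumsum(quarter_returns @ w)


def end_of_quarter_loss(w):
    """Average squared gap between the last daily estimate and the next survey."""
    errors = [daily_path(survey[q], returns[q], w)[-1] - survey[q + 1]
              for q in range(n_quarters)]
    return float(np.mean(np.square(errors)))


# Linear policy + squared loss: the minimizer is an ordinary least-squares fit.
w_hat, *_ = np.linalg.lstsq(returns.sum(axis=1), np.diff(survey), rcond=None)

# Daily nowcast for the first quarter, using the learned weights.
first_quarter_path = daily_path(survey[0], returns[0], w_hat)
```

Note that the training signal arrives only once per quarter, yet the learned weights produce an estimate for every trading day in between.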

“We construct a state-space of growth and returns that distinguishes discount rate shocks from cash flow shocks. This approach enables our algorithm to use multiple assets and extract only the unexpected component of returns in constructing our forecasts.”

The benefits of reinforcement learning for real-time growth estimation

“We take our reinforcement learning algorithm to the real data. Specifically, we…construct a daily series of investor expectations and disagreement. In a recursive out-of-sample estimation procedure, we train six models with different lookback windows and average the resulting policy weights. Across the entire out-of-sample period, we find that the constructed daily series of average growth expectations realizes an R2 of 82.3% against the true quarterly series. These results prove far superior to the results from the Kalman filter and MIDAS, which achieve R2 values of 2.3% and 39.2%, respectively.”
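
A recursive out-of-sample scheme with weight averaging across lookback windows might look roughly as follows. The quoted text does not state the window lengths or the fitting details, so the six windows, the least-squares fit, and the simulated data below are assumptions for illustration; the resulting R2 says nothing about the paper's results.

```python
import numpy as np


def fit_weights(X, y):
    """Least-squares policy weights from cumulated returns X and survey changes y."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def averaged_weights(X, y, t, lookbacks):
    """Average the weights fitted on the last L quarters before t, for each L."""
    fits = [fit_weights(X[max(0, t - L):t], y[max(0, t - L):t]) for L in lookbacks]
    return np.mean(fits, axis=0)


rng = np.random.default_rng(2)
n_quarters = 120
X = rng.normal(size=(n_quarters, 2))  # simulated cumulated quarterly returns
y = X @ np.array([0.08, -0.04]) + 0.02 * rng.normal(size=n_quarters)

lookbacks = [20, 30, 40, 50, 60, 70]  # six hypothetical window lengths (quarters)
start = 80                            # first out-of-sample quarter

# At each quarter t, only data before t is used to form the prediction for t.
preds = np.array([X[t] @ averaged_weights(X, y, t, lookbacks)
                  for t in range(start, n_quarters)])

actual = y[start:]
oos_r2 = 1.0 - np.sum((actual - preds) ** 2) / np.sum((actual - actual.mean()) ** 2)
```

Averaging across windows is a simple way to avoid committing to one sample length: short windows adapt faster to changing relationships, while long windows give more stable estimates.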

“Reinforcement learning achieves a significant gain in efficiency over traditional filtering techniques such as the Kalman filter. Reinforcement learning avoids an explicit model of the state dynamics and thus requires estimation of far fewer parameters…The efficiency gain from estimating fewer parameters lies at the core of why our reinforcement learning approach outperforms existing methods in the task of interest.”

“Our reinforcement learning approach can be applied to obtain a daily series of expectations for any macro variable for which a low-frequency panel of forecasts is available. Immediate candidates include interest rate expectations and inflation expectations.”

