Estimating portfolio risk in extreme situations means answering two questions: First, has the market entered an extreme state? Second, how are returns likely to be distributed in such an extreme state? There are three types of models that address these questions statistically. Conventional “extreme value theory” really only answers the second question, by fitting an appropriate limiting distribution over observations that exceed a fixed threshold. “Extreme value mixture models” estimate the threshold for extreme distributions and the extreme distribution itself simultaneously. This method seems appropriate when uncertainty over threshold values is high. Finally, “changepoint extreme value mixture models” go a step further and estimate the timing and nature of changes in extreme distributions. The assumption that extreme distributions change across episodes seems realistic, but it should make the method harder to apply out-of-sample.
Lattanzi, Chiara and Manuele Leonelli (2019), “A changepoint approach for the identification of financial extreme regimes”,
with some quotes from MacDonald, A. et al. (2010), “A Flexible Extreme Value Mixture Model”, and from Ben Shaver’s 2017 post “A Zero-Math Introduction to Markov Chain Monte Carlo Methods”.
The post ties in with SRSV’s summary lecture on systemic risk management, particularly the section on “calibrating tail risk”.
Below are quotes from the paper. Emphasis and cursive text have been added.
What is extreme value theory?
“The financial market is characterized by periods of turbulence where extreme events shock the system, potentially leading to huge profit losses. For this reason it is fundamental to understand and predict the tail distribution of financial returns…A portfolio is more affected by a few extreme movements in the market than by the sum of many small movements. This motivates risk managers to be primarily concerned with avoiding big unexpected losses. The tool to perform inference over such unexpected events is extreme value theory, which provides a coherent probabilistic framework to model the tail of a distribution.”
“Extreme value theory is used to derive…models for unusual or rare events, e.g. the upper or lower tails of a distribution…Extreme value theory is unlike most traditional statistical theory, which typically examines the ‘usual’ or ‘average’ behaviour of a process, in that it is used to motivate models for describing the unusual behaviour or rare events of a process.” [MacDonald et al.]
“Inference over tails is usually performed by fitting an appropriate limiting distribution over observations that exceed a fixed threshold… A common approach to model extremes, often referred to as peaks over threshold, studies the exceedances over a threshold.”
“At the heart of extreme value techniques is reliable extrapolation of risk estimates past the observed range of the sample data. Typically, a parametric extreme value model for describing the upper (or lower) tail of the data generating process is proposed, which is fitted to the available extreme value data. The model performance is evaluated by how well it describes the observed tail behaviour of the sample data. If the model provides a good fit then it is used for extrapolation of the quantities of interest, e.g. typically certain high quantiles, with estimation of the associated extrapolation uncertainty.” [MacDonald et al.]
“The choice of the threshold over which to fit a [probability distribution] is hard and arbitrary. Although tools to guide this choice exist, inference can greatly vary for different thresholds.”
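To make the peaks-over-threshold idea concrete, the sketch below fits a generalized Pareto distribution to exceedances over a fixed threshold and extrapolates a high loss quantile from the fitted tail. The simulated return series, the 95th-percentile threshold and the quantile level are illustrative assumptions, not choices made in the paper.

```python
import numpy as np
from scipy import stats

# Illustrative data: simulated heavy-tailed daily returns (stand-in for a real return series)
rng = np.random.default_rng(0)
returns = rng.standard_t(df=4, size=5000) * 0.01

# Peaks over threshold: fix a high threshold (here the 95th percentile of losses)
losses = -returns                      # work with losses so "extreme" means the upper tail
threshold = np.quantile(losses, 0.95)
exceedances = losses[losses > threshold] - threshold

# Fit a generalized Pareto distribution to the exceedances (location fixed at 0)
shape, loc, scale = stats.genpareto.fit(exceedances, floc=0)

# Extrapolate a high quantile (e.g. the 99.9% loss) from the fitted tail
p_exceed = len(exceedances) / len(losses)   # probability of exceeding the threshold
q = 0.999
tail_prob = (1 - q) / p_exceed              # conditional probability within the tail
var_999 = threshold + stats.genpareto.ppf(1 - tail_prob, shape, loc=0, scale=scale)
print(f"threshold={threshold:.4f}, xi={shape:.3f}, sigma={scale:.4f}, 99.9% VaR ~ {var_999:.4f}")
```

The next quote explains why the fixed threshold in such a fit is the weak point of the approach.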
What are extreme value mixture models?
“To overcome the difficulties associated with the selection of a threshold… extreme value mixture models have been recently defined, which formally use the full dataset and do not require a fixed threshold. These combine a flexible model for the bulk of the data below the threshold, a formally justifiable distribution for the tail and uncertainty measures for the threshold…[The figure below] illustrates the typical form of an extreme value mixture model using a flexible model for the bulk of the distribution, often defined as a mixture of density functions.”
“Mixture models have been proposed for the entire distribution function, simultaneously capturing the bulk of the distribution with the flexibility of an extreme value model for the upper and/or lower tails. These mixture models either explicitly include the threshold as a parameter to be estimated, or somewhat bypass this choice by the use of smooth transition functions between the bulk and tail components, thus overcoming the issues associated with threshold choice and uncertainty estimation.” [MacDonald et al.]
“The mixture model has the benefit of avoiding the subjectivity of the commonly used graphical diagnostic for threshold choice, and permits the complex uncertainties associated with threshold estimation to be fully accounted for…It is clear that, compared to the traditional fixed threshold approach, the extra uncertainty associated with threshold choice should be accounted for, and the mixture model presented herein appears to have successfully encapsulated this uncertainty.” [MacDonald et al.]
“When uncertainty about the threshold location is high, extreme value mixture models outperform the standard peaks over threshold approach.”
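A minimal sketch of the idea behind an extreme value mixture model: a single likelihood in which a bulk model holds below the threshold, a generalized Pareto tail holds above it, and the threshold itself is a free parameter to be estimated rather than fixed in advance. The normal bulk, the simulated data and the optimizer settings are assumptions for illustration; the paper and MacDonald et al. discuss more flexible bulk components and a Bayesian treatment.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

def neg_log_lik(params, x):
    """Bulk-normal / GPD-tail mixture with the threshold u as a parameter."""
    mu, log_sig, u, xi, log_beta = params
    sig, beta = np.exp(log_sig), np.exp(log_beta)
    below, above = x[x <= u], x[x > u]
    phi_u = stats.norm.cdf(u, mu, sig)      # bulk probability mass below the threshold
    if phi_u <= 0 or phi_u >= 1:
        return np.inf
    # Bulk: normal density below u; tail: (1 - phi_u) times a GPD density above u
    ll_bulk = stats.norm.logpdf(below, mu, sig).sum()
    ll_tail = (np.log(1 - phi_u) * len(above)
               + stats.genpareto.logpdf(above - u, xi, scale=beta).sum())
    return -(ll_bulk + ll_tail)

# Illustrative data and a rough starting point (all values are assumptions)
rng = np.random.default_rng(1)
x = rng.standard_t(df=4, size=3000)
start = [0.0, 0.0, np.quantile(x, 0.9), 0.1, np.log(0.5)]
fit = minimize(neg_log_lik, start, args=(x,), method="Nelder-Mead")
mu, log_sig, u, xi, log_beta = fit.x
print(f"estimated threshold u={u:.3f}, tail shape xi={xi:.3f}")
```

Because the threshold enters the likelihood as an ordinary parameter, its estimation uncertainty propagates into the tail estimates rather than being assumed away.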
What are changepoint extreme value mixture models?
“Changepoint models…explicitly study distributional changes in the structure of the extremes… The changepoints mark a distributional change in the extremes only, and not on the overall distribution of the data.”
“The structure and amplitude of extreme events usually change through time…The extreme behavior of financial returns often changes considerably in time and such changes occur by sudden shocks of the market… We introduce…a new class of models, termed changepoint extreme value mixture models, which…are also able to formally represent different extreme regimes caused by financial shocks… We demonstrate that this approach not only correctly identifies the location of such shocks, but also gives model-based uncertainty measures about these.”
“In financial settings extreme variations occur by sudden shocks caused by exogenous agents…Financial returns typically show clusters of observations in the tails, a phenomenon often termed volatility clustering. For this reason, inference can be expected to be more accurate by formally taking into account the nature of financial extreme events. Changepoint models allow for changes of the model distribution at multiple unknown time points and therefore can be faithfully used to represent and make inference over financial shocks.”
“We extend the extreme value mixture model class to formally take into account distributional extreme changepoints, by allowing for the presence of regime-dependent parameters modeling the tail of the distribution. This extension formally uses the full dataset to both estimate the thresholds and the extreme changepoint locations, giving uncertainty measures for both quantities.”
“Since financial markets are heavily affected by unexpected and abrupt variations, extreme regimes are well-captured using changepoint tools, identifying periods of changing volatility. Return levels, value-at-risk and expected shortfall measures are well estimated by our approach, making it a very powerful tool in a real-data context.”
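A hedged sketch of how regime-dependent tail parameters might enter the picture: the generalized Pareto shape and scale are refitted on either side of each candidate changepoint, and candidates are scored by the resulting tail likelihood. The single changepoint, the fixed threshold and the grid search are simplifying assumptions made only for illustration; the paper estimates thresholds and multiple changepoint locations jointly in a Bayesian framework.

```python
import numpy as np
from scipy import stats

def tail_log_lik(losses, u, xi, beta):
    """Log-likelihood of exceedances over threshold u under a GPD(xi, beta) tail."""
    exc = losses[losses > u] - u
    return stats.genpareto.logpdf(exc, xi, scale=beta).sum()

def profile_changepoint(losses, u, grid):
    """Score each candidate changepoint by refitting the GPD tail on each side."""
    scores = []
    for t in grid:
        left, right = losses[:t], losses[t:]
        ll = 0.0
        for seg in (left, right):
            exc = seg[seg > u] - u
            if len(exc) < 20:            # need enough exceedances to fit a tail
                ll = -np.inf
                break
            xi, _, beta = stats.genpareto.fit(exc, floc=0)
            ll += tail_log_lik(seg, u, xi, beta)
        scores.append(ll)
    return grid[int(np.argmax(scores))], scores

# Illustrative series whose tail heaviness changes halfway through (pure assumption)
rng = np.random.default_rng(2)
losses = np.concatenate([rng.standard_t(df=8, size=2000),
                         rng.standard_t(df=3, size=2000)])
u = np.quantile(losses, 0.95)
grid = np.arange(500, 3500, 50)
t_hat, _ = profile_changepoint(losses, u, grid)
print(f"estimated extreme changepoint near observation {t_hat}")
```

The changepoint here marks a shift in the tail behaviour only; the bulk of the distribution is left untouched, which mirrors the distinction drawn in the quote above.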
What are Markov chain Monte Carlo (MCMC) methods?
“Estimation of functions of interest in extreme value analyses is performed via… the Bayesian paradigm using the MCMC machinery, enabling us to straightforwardly deliver a wide variety of estimates and predictions of quantities of interest, e.g. high quantiles.”
“What are Markov chain Monte Carlo (MCMC) methods? The short answer is: MCMC methods are used to approximate the posterior distribution of a parameter of interest by random sampling in a probabilistic space.” [Shaver]
“A distribution is a mathematical representation of every possible value of our parameter and how likely we are to observe each one…In Bayesian statistics, the distribution representing our beliefs about a parameter is called the prior distribution, because it captures our beliefs prior to seeing any data… The likelihood distribution summarizes what the observed data are telling us, by representing a range of parameter values accompanied by the likelihood that each parameter explains the data we are observing…The key to Bayesian analysis…is to combine the prior and the likelihood distributions to determine the posterior distribution. This tells us which parameter values maximize the chance of observing the particular data that we did, taking into account our prior beliefs.” [Shaver]
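As a deliberately simple illustration of the prior/likelihood/posterior language, the snippet below uses a conjugate Beta-Binomial pair, where the posterior can be written down directly; the prior parameters and the observed counts are arbitrary values chosen only so the update is visible.

```python
from scipy import stats

# Prior belief about a success probability p: Beta(2, 2), roughly centred on 0.5
a_prior, b_prior = 2, 2

# Observed data: 7 successes in 10 trials (the likelihood is Binomial in p)
successes, trials = 7, 10

# Conjugacy: the posterior is again a Beta, with the data added to the prior counts
a_post, b_post = a_prior + successes, b_prior + (trials - successes)
posterior = stats.beta(a_post, b_post)
print(f"posterior mean ~ {posterior.mean():.3f}")   # pulled between prior mean 0.5 and data 0.7
```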
“We know that the posterior distribution is somewhere in the range of our prior distribution and our likelihood distribution, but for whatever reason, we can’t compute it directly. Using Markov chain Monte Carlo (MCMC) methods, we’ll effectively draw samples from the posterior distribution, and then compute statistics like the average on the samples drawn.” [Shaver]
“Monte Carlo simulations are just a way of estimating a fixed parameter by repeatedly generating random numbers. By taking the random numbers generated and doing some computation on them, Monte Carlo simulations provide an approximation of a parameter where calculating it directly is impossible or prohibitively expensive.” [Shaver]
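As a trivial illustration of the Monte Carlo idea, the snippet below approximates a quantity with a known closed form (the 99% quantile of a standard normal distribution) purely by generating random numbers and computing on them; the distribution and sample size are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
draws = rng.normal(loc=0.0, scale=1.0, size=1_000_000)  # repeatedly generate random numbers
estimate = np.quantile(draws, 0.99)                      # do some computation on them
print(f"Monte Carlo 99% quantile ~ {estimate:.3f} (exact value ~ 2.326)")
```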
“Markov chains…are simply sequences of events that are probabilistically related to one another. Each event comes from a set of outcomes, and each outcome determines which outcome occurs next, according to a fixed set of probabilities…An important feature of Markov chains is that they are memoryless: everything that you would possibly need to predict the next event is available in the current state.” [Shaver]
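A minimal Metropolis-Hastings sketch ties the two ideas together: each proposed parameter value depends only on the current one (the Markov chain), and acceptance is decided by comparing prior-times-likelihood at the two points, so the retained draws approximate the posterior (the Monte Carlo part). The normal prior, normal likelihood and step size are assumptions chosen for the example, not the samplers used in the paper.

```python
import numpy as np
from scipy import stats

# Toy setting: unknown mean mu, normal prior and normal likelihood (illustrative assumptions)
rng = np.random.default_rng(4)
data = rng.normal(loc=1.5, scale=1.0, size=50)

def log_post(mu):
    """Unnormalised log posterior: log prior + log likelihood."""
    return stats.norm.logpdf(mu, 0.0, 10.0) + stats.norm.logpdf(data, mu, 1.0).sum()

samples, mu = [], 0.0
for _ in range(20_000):
    proposal = mu + rng.normal(scale=0.5)   # next state depends only on the current state
    if np.log(rng.uniform()) < log_post(proposal) - log_post(mu):
        mu = proposal                       # accept and move the chain
    samples.append(mu)

posterior = np.array(samples[5_000:])       # drop burn-in draws
print(f"posterior mean ~ {posterior.mean():.3f}, 95% interval ~ "
      f"({np.quantile(posterior, 0.025):.3f}, {np.quantile(posterior, 0.975):.3f})")
```

Statistics computed on the retained samples, such as the mean or high quantiles above, are exactly the “estimates and predictions of quantities of interest” that the paper extracts from its MCMC output.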