1. Introduction
In order to estimate the potential future losses on an option it is necessary to determine the change in the current instrument price arising from changes in the values of the input variables, that is, changes in the value of the underlying, the implied volatility, the discount rate, the expected dividend and the time to expiration. For risk management purposes, we refer to these inputs as the risk factors. All the current risk factors are observable except for the implied volatility, which must be derived from the current option price and the observable risk factors.
A naive approach to risk management would then involve applying arbitrary stresses to the relevant risk factors for each instrument in the portfolio, usually the spot of the underlying and the implied vol, in order to obtain a new price for each instrument and thus the P&L of each position, which is simply the current position value minus the initial premium incurred to enter the position. The naive approach is obviously unsatisfactory for a number of reasons. Firstly, even on a stand-alone basis, the assumption that the "spot" and "implied vol" risk factors for a single instrument are independent is clearly incorrect. Changes in the implied vols of a particular security tend to exhibit a strong relationship with changes in the spot price of the security. Secondly, the approach should, but often fails to, incorporate reasonable "shocks" to the risk factors, based on empirical or simulated price processes.
Nevertheless, this methodology is applied by brokers to calculate the margining requirements for their counterparties, based on both a so-called "Rules-Based" approach (informed by the "Regulation T" framework) and a "Risk-Based" approach (informed by the "Portfolio Margining" regulations). The materialization of collateralisation obligations due to the mechanical application of these simple models has liquidity implications for the holder of the options positions. The likelihood and magnitude of future margining obligations and liquidity demands will however need to take into consideration the expected P&L of the options positions. In other words, an estimation of liquidity risk over a specified time horizon requires a reasonable estimation of the market risk of the portfolio over the same time horizon. A reasonable estimation of market risk of a portfolio must incorporate the expected volatilities (and correlations) of the portfolio's assets. For an options portfolio, this implies estimating the volatilities (and correlations) of all the risk factors for all the constituent positions.
The virtue and necessity of volatility estimation is not limited to the valuation and risk management of options portfolios. The effective computation of the ex-ante Value-at-Risk of a portfolio of cash equities requires the construction of covariance matrices which incorporate the updated information content available from the statistical distribution of asset price returns or from the implied distribution which can be extracted from traded option prices. The simple standard deviation of historic returns as employed in Markowitz's seminal paper on efficient portfolio formation is inadequate. One characteristic of VaR measures relying on this simple definition of volatility is the so-called "Ghost VaR", where sustained periods of low volatility carry an excessive weight because the variance applies equal weighting to historic observations, which leads to underestimation of risk during short episodes of elevated volatility. The "Ghost effect" refers to the VaR computation being haunted by a former low-volatility regime and the failure to effectively update the volatility forecast with current price dynamics.
2. Volatility Estimation with the Generalised Autoregressive Conditional Heteroskedasticity (GARCH) model
Recall that simple volatility estimation by statistical means assigns equal weights to all returns measured over the period. We know that over one day the mean return is small compared to the standard deviation. If we consider a simple $m$-period moving average, where $\sigma_n$ is the volatility of the return on day $n$, then with $\bar{\mu } \approx 0$, we have:
$$\sigma ^{2}_n=\frac{1}{m} \sum_{i=1}^{m}{\mu ^{2}_{n-i}}$$
where: $m$ is the number of returns, $\mu$ is return and $\sigma ^{2}$ is the variance. Volatility is the square root of variance.
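As a concrete illustration, the following is a minimal sketch of this equally weighted estimator, assuming a pandas Series of daily log returns named returns and an illustrative window of m = 21 days:
# Equally weighted (simple moving average) variance over an m-day window
import numpy as np
import pandas as pd
m = 21  # illustrative window length
# With the mean return assumed to be zero, the variance is the average of squared returns
ma_variance = returns.pow(2).rolling(window=m).mean()
ma_volatility = np.sqrt(ma_variance)                      # daily volatility
ma_volatility_annualised = ma_volatility * np.sqrt(252)   # annualised volatility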
Some phenomena are systematically observed in asset return time series. A good volatility estimation model should be able to capture most of these empirical "stylised" facts.
Volatility Clustering: The volatility is more likely to be high at time $n$ if it was also high at time $n-1$. That is, a shock at time $n-1$ increases not only the variance at time $n-1$ but also the variance at time $n$. In other words, the markets are more volatile in some periods, and they are more tranquil in others. The volatilities are clustered in time.
Fat Tails: Return time series generally present fat tails, also known as excess kurtosis, or leptokurtosis. That is, their kurtosis (the fourth central moment normalized by the square of the variance) is usually greater than three, the kurtosis of a Gaussian random variable. In fact, a popular statistical test for the hypothesis of Gaussianity of a distribution, the Jarque-Bera test, jointly tests whether the distribution is symmetric and whether its kurtosis equals three. If returns are fat-tailed, the probability of having extreme events (very high or very low returns) is higher than it would be in the normal (Gaussian) case.
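The Jarque-Bera test is available in scipy; a short sketch, assuming the same illustrative returns series as above:
# Jarque-Bera test: jointly tests for zero skewness and kurtosis equal to three
from scipy import stats
jb_stat, p_value = stats.jarque_bera(returns.dropna())
excess_kurtosis = stats.kurtosis(returns.dropna())   # 0 for a Gaussian (Fisher definition)
print(f'JB statistic: {jb_stat:.2f}, p-value: {p_value:.4f}, excess kurtosis: {excess_kurtosis:.2f}')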
Any large return within this $m$-period window will elevate the variance or standard deviation of returns until it drops out of the sample. This is often referred to as the "Ghost Effect". To address this effect, we adopt weighting schemes.
One weighting scheme, popularized by RiskMetrics, is the Exponentially Weighted Moving Average (EWMA), where we weight the squared returns in an exponentially declining fashion. The rate of the exponential decline of the weights is controlled by the $\lambda$ parameter, often called the persistence parameter, which following the RiskMetrics document is often set at 0.94. The recursive formula for the EWMA variance is the following:
$$\sigma _{n}^{2}=\lambda \sigma _{n-1}^{2}+(1-\lambda )\mu _{n-1}^{2}$$
where $\lambda$ is applied to the lagged variance of the previous day and $1 - \lambda$ is applied to the squared return of the previous day. The lower the persistence parameter, $\lambda$, the greater the weight of recent observations in this weighting scheme.
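A minimal sketch of the EWMA recursion, assuming a daily return series returns (as above) and the RiskMetrics value $\lambda = 0.94$:
# EWMA variance recursion: var_n = lambda * var_{n-1} + (1 - lambda) * return_{n-1}^2
import numpy as np
lam = 0.94                            # RiskMetrics persistence parameter
r = np.asarray(returns)
ewma_var = np.empty_like(r)
ewma_var[0] = r[0] ** 2               # seed the recursion with the first squared return
for n in range(1, len(r)):
    ewma_var[n] = lam * ewma_var[n - 1] + (1 - lam) * r[n - 1] ** 2
ewma_vol_annualised = np.sqrt(ewma_var * 252)   # annualised EWMA volatility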
Under the assumption that volatility is mean-reverting, the GARCH model adds a third variable, the long-run variance, denoted by $V_L$, and a third coefficient, the rate of mean reversion, denoted by $\gamma$. $\beta$ and $\alpha$ substitute for $\lambda$ and $1 - \lambda$ respectively. The three coefficients sum to one. Thus:
$$\sigma _{n}^{2}=\gamma V_{L}+\beta \sigma _{n-1}^{2}+\alpha\mu _{n-1}^{2}$$
For notational economy, the long run variance and the rate of mean reversion are often represented by the single input, $\omega$:
$$\sigma _{n}^{2}=\omega+\beta \sigma _{n-1}^{2}+\alpha\mu _{n-1}^{2}$$
Maximum Likelihood Estimation (MLE) is a statistical method used for fitting a model to the data. When using MLE, we first assume a distribution and then seek to determine the model parameters. To estimate the GARCH(1,1) parameters, we assume that the distribution of returns conditional on the variance is normal.
To derive ω , α and β, we maximize: $$\sum_{i=1}^{n}{\log \left[ \frac{1}{\sqrt{2\pi \sigma _{i}^{2}}} e^{\frac{-(\mu _{i}-\bar{\mu })^{2}}{2\sigma _{i}^{2}}}\right]}$$
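The maximization can be carried out numerically by minimizing the negative log-likelihood; the following is an illustrative sketch using scipy.optimize (the arch library used in the code section below does this more robustly), assuming zero-mean returns held in the array r:
# GARCH(1,1) parameter estimation by MLE, assuming conditionally normal, zero-mean returns
import numpy as np
from scipy.optimize import minimize
def garch_neg_loglik(params, r):
    omega, alpha, beta = params
    var = np.empty_like(r)
    var[0] = np.var(r)                                  # initialise with the sample variance
    for i in range(1, len(r)):
        var[i] = omega + alpha * r[i - 1] ** 2 + beta * var[i - 1]
    # negative of the log-likelihood quoted above (with zero mean)
    return 0.5 * np.sum(np.log(2 * np.pi * var) + r ** 2 / var)
r = np.asarray(returns)
res = minimize(garch_neg_loglik, x0=[1e-6, 0.10, 0.85], args=(r,),
               bounds=[(1e-12, None), (0.0, 1.0), (0.0, 1.0)], method='L-BFGS-B')
omega, alpha, beta = res.x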
3. Volatility Estimation with the GJR-GARCH model
The GJR model, proposed by Glosten, Jagannathan and Runkle, is a GARCH variant that includes leverage terms for modeling asymmetric volatility clustering. In the GJR formulation, large negative changes are more likely to be clustered than positive changes, in line with the stylized fact that negative shocks at time $t-1$ have a stronger impact on the variance at time $t$ than positive shocks. This asymmetry tends to be called the leverage effect because the increase in volatility was traditionally ascribed to the increased leverage induced by a negative shock to market value, that is: as market value falls, leverage, $debt/mktvalue$, increases and, therefore, risk also increases.
The GJR-GARCH model relies on two equations for the variance at time $t$, depending on whether the return surprise $\epsilon_{t} = R_{t} - \mu_{t}$ in the previous period was negative or positive, i.e. whether $\epsilon_{t-1} < 0$ or $\epsilon_{t-1} > 0$.
In the case of a positive surprise, we take the usual GARCH equation:
$$\sigma _{n}^{2}=\omega+\beta \sigma _{n-1}^{2}+\alpha\epsilon_{n-1}^{2}$$
In the case of a negative surprise, the predicted variance should be higher than after a positive surprise, which requires a larger coefficient multiplying the squared prediction error, namely $(\alpha+\gamma)\epsilon_{n-1}^{2}$ rather than $\alpha\epsilon_{n-1}^{2}$:
$$\sigma _{n}^{2}=\omega+\beta \sigma _{n-1}^{2}+(\alpha+\gamma)\epsilon_{n-1}^{2}$$
We can unify these two formulae in a single equation thus:
$$\sigma _{n}^{2}=\omega+\beta \sigma _{n-1}^{2}+(\alpha+\gamma I_{n-1})\epsilon_{n-1}^{2}$$
where the indicator $I_{n-1}$ is zero if $\epsilon_{n-1} > 0$ and one if $\epsilon_{n-1} < 0$.
We estimate all the parameters (μ, ω, α, γ, β) simultaneously, by maximizing the log-likelihood, as sketched below.
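With the arch library used in the code section at the end of this note, the GJR-GARCH specification is obtained simply by setting the asymmetry order o to 1; a brief sketch, assuming the same returns series:
# GJR-GARCH(1,1): the o=1 term adds the asymmetric (leverage) coefficient gamma
from arch import arch_model
gjr = arch_model(returns, mean='Constant', vol='Garch', p=1, o=1, q=1, dist='Normal')
gjr_fit = gjr.fit(disp='off')
print(gjr_fit.summary())    # reports mu, omega, alpha[1], gamma[1] and beta[1]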
4. VaR Estimation with Implied Volatility
Black and Scholes proposed the following stochastic differential equation to model asset prices, in which the Wiener process $W_t$ is a standard Brownian motion. The values of $S_t$ are log-normally distributed and the returns $dS_t / S_t$ are normally distributed. The instantaneous change in the price of an asset is driven by a deterministic drift component, $(r-q)$, that is, the risk-free rate minus the dividend yield in a risk-neutral world, and a random component given by the normal random variable and a constant volatility $\sigma$.
$$dS_{t} = (r-q) S_{t}dt + \sigma S_{t} d W_{t}$$
According to the Black-Scholes derivation, we can further state:
$$\frac{\partial V}{\partial t}+ \frac 1{2}{\sigma^2 S^2} \frac{\partial^2 V}{\partial S^2}+ r S \frac{\partial V}{\partial S}= rV$$
Or equivalently:
$$\frac{\partial V}{\partial t}+ \frac 1{2}{\sigma^2 S^2} \frac{\partial^2 V}{\partial S^2}+ r S \frac{\partial V}{\partial S}- rV = 0$$
Where $V$ is the option value
We know that the value of a call option is:
$$C = SN(d_1) - Ke^{-rt}N(d_2)$$
and, the corresponding put option price is: $$P = Ke^{-rt}N(-d_2) - SN(-d_1)$$
Where:
$d_1= \frac{1}{\sigma \sqrt{t}}\left[\ln{\left(\frac{S}{K}\right)} +{\left(r + \frac{\sigma^2}{2}\right)}t\right]$
$d_2= d_1 - \sigma \sqrt{t}$
$S$ is the spot price of the underlying asset
$K$ is the strike price
$r$ is the annualized continuous compounded risk free rate
$σ$ is the volatility of returns of the underlying asset
$t$ is time to maturity (expressed in years)
$N(.)$ is the standard normal cumulative distribution
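These formulae translate directly into code; a minimal sketch using scipy.stats.norm, with variable names mirroring the notation above (dividends are ignored, as in the formulae):
# Black-Scholes prices for European calls and puts
import numpy as np
from scipy.stats import norm
def bs_price(S, K, r, sigma, t, option='call'):
    d1 = (np.log(S / K) + (r + 0.5 * sigma ** 2) * t) / (sigma * np.sqrt(t))
    d2 = d1 - sigma * np.sqrt(t)
    if option == 'call':
        return S * norm.cdf(d1) - K * np.exp(-r * t) * norm.cdf(d2)
    return K * np.exp(-r * t) * norm.cdf(-d2) - S * norm.cdf(-d1)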
We can consider the call and put equations as functions of the volatility parameter $\sigma$. Finding the implied volatility thus requires solving the nonlinear problem $f(x)=0$ where $x=\sigma$, given a starting estimate, and where $C$ and $P$ are the call and put market prices.
For call options:
$$f(x) = SN(d_1(x)) - Ke^{-rt}N(d_2(x)) - C$$
For put options:
$$f(x) = Ke^{-rt}N(-d_2(x)) - SN(-d_1(x)) -P$$
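Solving $f(x)=0$ can be done with any one-dimensional root finder; a sketch using scipy.optimize.brentq and the bs_price function sketched above (the bracketing interval is an illustrative assumption):
# Implied volatility: find the sigma that reproduces the observed market price
from scipy.optimize import brentq
def implied_vol(market_price, S, K, r, t, option='call'):
    f = lambda sigma: bs_price(S, K, r, sigma, t, option) - market_price
    return brentq(f, 1e-4, 5.0)   # assumes the root lies between 0.01% and 500% volatility
# Example usage with illustrative inputs
iv = implied_vol(market_price=10.45, S=100, K=100, r=0.05, t=1.0, option='call')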
The underlying assumptions are that returns are normally distributed and that the risk-neutral probability density function of $S_T$ is lognormal, with $\ln S_T$ normally distributed with standard deviation equal to the time-scaled implied volatility, $\sigma \sqrt{T}$, and mean equal to the expected log price at expiry $T$, that is, $\ln(S_0) + [r - \frac{\sigma^2}{2}]T$.
The risk-neutral pdf of $S_T$ at time to expiry $T$ is:
$$g(S_T) = \frac{1}{S_T \sigma \sqrt{2\pi T}} e^{ \frac{-\left(\ln(S_T) - \ln(S_0) - \left[r - \frac{\sigma^2}{2}\right]T \right)^2}{2 \sigma^2 T}} $$
Evidently, we are assuming a single unique and constant implied volatility to estimate the distribution of stock prices over the period. In reality, implied volatilities vary with both time and strike.
In the Black-Scholes framework, where the underlying follows a log-normal distribution, the risk-neutral probability $P_K$ of being above a strike is $N(d_2)$, where $N(.)$ is the standard normal cumulative distribution function and $$d_{2} = \frac{\ln(\frac{F}{K}) - \frac{\sigma_{K}^{2}}{2}t}{\sigma_{K} \sqrt{t}}$$
with $F$, being the forward price:
$$F = S_0 e^{rt}$$
Thus, the risk neutral probability of being below a strike is:
$$1 - N(d_2)$$
Which means we can obtain an implied Value-at-Risk from the risk-neutral density function at a desired confidence level.
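One way to extract such a number, under the single constant-volatility assumption described above, is to read the (1 - confidence) quantile directly off the risk-neutral lognormal density; a minimal sketch with illustrative inputs:
# Implied VaR from the risk-neutral lognormal distribution of S_T
import numpy as np
from scipy.stats import norm
def implied_var(S0, r, sigma, T, confidence=0.95):
    # quantile of S_T at the (1 - confidence) level under the risk-neutral lognormal density
    z = norm.ppf(1 - confidence)
    S_quantile = S0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * z)
    return S0 - S_quantile        # loss relative to today's price
# Example: 95% implied VaR over 10 trading days with a 20% implied volatility
var_95 = implied_var(S0=100, r=0.02, sigma=0.20, T=10 / 252, confidence=0.95)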
5. Implied Volatility: Understanding the limitations
Whilst implied volatilities are certainly useful in gauging market expectations of risk, they should be used with caution as a predictive input to model stock returns or to generate forward-looking valuations of derivatives for risk management purposes. The assumptions of constant volatility and normally distributed stock returns (lognormally distributed prices) are violated by the empirical behaviour of prices in both the cash and options markets, notably the fat tails and volatility clustering of returns, the leverage effect, and the dependence of implied volatility on strike and maturity (the smile and term structure).
6. Risk metrics derived from the Implied Volatility Surface
The asymmetry of the option-implied distribution of asset returns allows us to extract a number of useful metrics which provide an indication of the bullish or bearish sentiment of market participants.
7. Determining Implied Risk-Neutral Distributions from the volatility smile with the Breeden-Litzenberger approach
The price of a European call option with strike $K$, as with any asset price, is simply the discounted value of its expected payoff:
$$ c = e^{-rT}\int_0^{\infty}\max\left(S_T - K,\ 0\right)g\left(S_T\right)dS_T $$
where $r$ is the risk-free rate, $S_T$ is the asset price at time $T$, and $g$ is the risk-neutral probability density function of $S_T$.
Equivalently, dropping the region where the payoff is zero: $$ c = e^{-rT}\int_K^{\infty}\left(S_T - K\right)g\left(S_T\right)dS_T $$
In other words, the fair value of the call is the sum of all possible payoffs weighted by their probability of occurring, discounted back to today.
If we differentiate the call price function with respect to the strike price, we obtain:
$$ \frac{\partial c}{\partial K} = -e^{-rT}\int_K^{\infty} g\left(S_T\right)dS_T $$
Differentiating the call price function a second time with respect to the strike price gives:
$$ \frac{\partial^{2} c}{\partial K^{2}} = e^{-rT}\, g\left(K\right) $$
Which implies:
$$ g\left(K\right) = e^{rT}\, \frac{\partial^{2} c}{\partial K^{2}} $$
Thus the risk-neutral density can be obtained by compounding the second derivative of the call price function at the risk-free rate.
This result allows risk-neutral probability distributions to be estimated from volatility smiles. Supposing that $c_1$, $c_2$, $c_3$ are the prices of $T$-year European call options with strike prices of $K-\delta$, $K$ and $K+\delta$ respectively, where $\delta$ is a small value, an estimate of $g(K)$ is obtained by approximating the second derivative with finite differences:
$$ g\left(K\right)\approx e^{rT}\, \frac{\left(c_1-c_2\right)+\left(c_3-c_2\right)}{\delta^{2}} = e^{rT}\, \frac{\left(c_1+c_3\right)-2c_2}{\delta^{2}} $$
A perhaps more intuitive way to understand the above is to view the combination of the four positions -- long $c_1$ and $c_3$ with strikes of $K-\delta$ and $K+\delta$, and short two $c_2$ with strikes of $K$ -- as a butterfly spread.
Considering the payoff triangle of the butterfly spread, we note that it has a height of $\delta$, reflecting the maximum payoff, and a width of $2\delta$. The area under the spike is therefore: $$ (0.5 \times 2\delta) \times \delta = \delta^{2} $$
Dividing the cost of these trades, $\left(c_1+c_3\right)-2c_2$, by their payoff area, $\delta^{2}$, and adjusting for the time value of money, yields the probability density of the future stock price as priced by the options market:
$$ g\left(K\right) = e^{rT}\, \frac{\left(c_1+c_3\right)-2c_2}{\delta^{2}} $$
Rearranging, we note that the price of the butterfly spread is, of course, the discounted value of its expected payoff: $$ e^{-rT}\, g\left(K\right)\delta^{2} = \left(c_1+c_3\right)-2c_2 $$
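A minimal numerical sketch of the same finite-difference construction, assuming a vector of call prices quoted on an evenly spaced strike grid (the smoothing and interpolation of the volatility smile needed in practice are omitted):
# Breeden-Litzenberger: risk-neutral density from a second finite difference of call prices
import numpy as np
def bl_density(strikes, call_prices, r, T):
    strikes = np.asarray(strikes, dtype=float)
    calls = np.asarray(call_prices, dtype=float)
    dK = strikes[1] - strikes[0]                       # assumes an evenly spaced strike grid
    # central second difference: c(K-dK) + c(K+dK) - 2*c(K), i.e. the cost of a butterfly spread
    butterfly = calls[:-2] + calls[2:] - 2 * calls[1:-1]
    density = np.exp(r * T) * butterfly / dK ** 2
    return strikes[1:-1], density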
8. Generating simulated values for the volatility level (and stock price) with the Heston Process
Given the restrictive nature of the Black-Scholes framework, a number of extensions to the model have been proposed. They aim to reflect the empirical evidence that volatility in asset prices is non-normal, time-varying and dependent on stock price dynamics, notably in the form of the leverage effect, and to reconcile the fact that the volatilities implied by traded option prices exhibit a term structure (dependence on time to expiry) and a volatility smirk (dependence on moneyness).
Practitioners came to focus on stochastic volatility models, in which the asset price and its volatility are both random processes that can change over time. The most popular stochastic volatility model was introduced by Heston. It factors in a possible correlation between a stock's price and its volatility, it treats volatility as mean-reverting, it gives a closed-form solution for option pricing and it does not require that the stock price follow a lognormal probability distribution.
In his influential paper, Heston proposed that the risk-neutral dynamics of the asset price are governed by the following system of stochastic differential equations (SDEs):
$$dS_{t} = (r-q) S_{t}\,dt + \sqrt{V_{t}}\, S_{t}\, dW_{t}$$
$$dV_{t} = \kappa (\theta -V_{t}) dt + \xi \sqrt{V_{t}} dZ_{t}$$
$$dW_{t}\, dZ_{t} = \rho\, dt$$
Where $S_{t}$ is the price of a dividend-paying stock at time $t$ and $V_{t}$ is its instantaneous variance. The parameter $r$ is the risk-free interest rate, and $q$ is the dividend yield; both are assumed to be constant over time. $\kappa$ is the rate of mean reversion to the mean-reversion level $\theta$, representing the long-run variance, and $\xi$ is the so-called volatility of volatility. $W_{t}$ and $Z_{t}$ are two correlated Brownian motions with $dW_{t}\, dZ_{t} = \rho\, dt$, where $\rho$ is the correlation coefficient.
This mean-reverting variance process is of course the square-root diffusion process proposed by Cox, Ingersoll and Ross in their 1985 paper to model the short interest rate.
In the Heston model, shocks to the asset price and to the volatility process may be negatively correlated, as empirical evidence suggests, meaning that a down move in the asset is accompanied by an up move in volatility and vice versa. This allows us to account for the aforementioned stylized fact, the leverage effect, which in essence states that volatility goes up in times of stress (declining markets) and goes down in bull markets (rising prices). Within the context of model parametrization, the more negatively correlated the stock and volatility processes, the more negatively skewed the implied distribution of stock prices. If we were to reverse-engineer the Black-Scholes implied vols from the option prices generated by the Heston process, we would expect a more pronounced risk-reversal spread, that is, a greater difference between the implied vols at lower strikes and those at higher strikes. This makes intuitive sense given that we are assuming a stronger leverage effect when we input a more negative correlation coefficient. A greater vol-of-vol parameter results in higher kurtosis in the implied distribution of stock prices; again, reverse-engineering the Black-Scholes implied vols from Heston-generated option prices, we would expect a more pronounced volatility smile. Greater smile curvature results, ceteris paribus, in higher butterfly spreads. A higher mean-reversion factor implies a faster rate of reversion to the long-run vol, which in turn implies a flatter volatility term structure.
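Simulated values for the volatility level and the stock price can be generated directly from the SDE system above; the following is a minimal Monte Carlo sketch using an Euler scheme with full truncation of the variance, under purely illustrative parameter values:
# Euler simulation of the Heston process with correlated Brownian increments
import numpy as np
def simulate_heston(S0, V0, r, q, kappa, theta, xi, rho, T, n_steps, n_paths, seed=42):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    S = np.full(n_paths, S0, dtype=float)
    V = np.full(n_paths, V0, dtype=float)
    for _ in range(n_steps):
        z1 = rng.standard_normal(n_paths)
        z2 = rho * z1 + np.sqrt(1 - rho ** 2) * rng.standard_normal(n_paths)   # correlated shock
        V_pos = np.maximum(V, 0.0)                 # full truncation keeps the variance non-negative
        S *= np.exp((r - q - 0.5 * V_pos) * dt + np.sqrt(V_pos * dt) * z1)
        V += kappa * (theta - V_pos) * dt + xi * np.sqrt(V_pos * dt) * z2
    return S, V
# Example: one-year terminal prices and variances under illustrative parameters
S_T, V_T = simulate_heston(S0=100, V0=0.04, r=0.02, q=0.0, kappa=2.0,
                           theta=0.04, xi=0.5, rho=-0.7, T=1.0, n_steps=252, n_paths=10_000)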
Calibration of the Heston Model: Heston allows that "One could estimate $\theta$ and other parameters by using values implied by option prices. Alternatively, one could estimate $\theta$ and $\kappa$ from the true spot-price process." It is computationally convenient to estimate the parameters with the QuantLib library. Here, the calibration of the model is the process of seeking the model parameters for which the model results best match the market option data. This data usually consists of Black-Scholes implied volatilities derived from market quoted prices. Calibrating the Heston model is thus equivalent to solving a non-linear constrained optimization problem: minimize the error between the market quotes of options and the option prices estimated by the Heston model. The objective function could be the sum of squared differences.
Formally, using the risk-neutral measure, the Heston model has five unknown parameters:
$$\Omega = \lbrace V_{0} ,\theta , \kappa , \xi , \rho \rbrace $$
By calibrating these parameter values, we seek to obtain an evolution for the underlying asset that is consistent with the current prices of plain vanilla options. In order to find the optimal parameter set, $\Omega$, we need to (i) define a measure to quantify the distance between model and market prices; and (ii) run an optimization scheme to determine the parameter values that minimize that distance.
$$\min_{\Omega} \frac{1}{N}\sum_{i=1}^{N}\lbrack C_{i}^{\Omega }(K_{i},T_{i})-C_{i}^{Mkt}(K_{i},T_{i})\rbrack ^{2}$$
where $C_{i}^{\Omega }(K_{i},T_{i})$ are the option values using the parameter set $\Omega$, for all strikes and expiries, and $C_{i}^{Mkt}(K_{i},T_{i})$ are the market-observed option prices, for all strikes and expiries.
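Since the text above points to QuantLib, the following is a calibration sketch along the lines of the standard QuantLib-Python Heston example; the evaluation date, flat curves, initial guesses and market quotes are all illustrative assumptions:
# Heston calibration with QuantLib: fit (v0, kappa, theta, xi, rho) to market implied vols
import QuantLib as ql
today = ql.Date(15, 6, 2023)
ql.Settings.instance().evaluationDate = today
calendar = ql.NullCalendar()
day_count = ql.Actual365Fixed()
spot, rate, div = 100.0, 0.02, 0.0
rf_ts = ql.YieldTermStructureHandle(ql.FlatForward(today, rate, day_count))
div_ts = ql.YieldTermStructureHandle(ql.FlatForward(today, div, day_count))
spot_handle = ql.QuoteHandle(ql.SimpleQuote(spot))
# Initial guesses for (v0, kappa, theta, xi, rho)
v0, kappa, theta, xi, rho = 0.04, 1.0, 0.04, 0.5, -0.5
process = ql.HestonProcess(rf_ts, div_ts, spot_handle, v0, kappa, theta, xi, rho)
model = ql.HestonModel(process)
engine = ql.AnalyticHestonEngine(model)
# Illustrative market quotes: (days to expiry, strike, Black-Scholes implied vol)
quotes = [(90, 90.0, 0.26), (90, 100.0, 0.22), (90, 110.0, 0.20),
          (180, 90.0, 0.25), (180, 100.0, 0.22), (180, 110.0, 0.21)]
helpers = []
for days, strike, vol in quotes:
    helper = ql.HestonModelHelper(ql.Period(days, ql.Days), calendar, spot, strike,
                                  ql.QuoteHandle(ql.SimpleQuote(vol)), rf_ts, div_ts)
    helper.setPricingEngine(engine)
    helpers.append(helper)
# Least-squares calibration with Levenberg-Marquardt
lm = ql.LevenbergMarquardt()
model.calibrate(helpers, lm, ql.EndCriteria(500, 50, 1.0e-8, 1.0e-8, 1.0e-8))
theta, kappa, xi, rho, v0 = model.params()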
1) Import Libraries
# Standard Python libraries
import pandas as pd
import numpy as np
# For plotting
%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
plt.style.use('seaborn')
from matplotlib.pyplot import rcParams
rcParams['figure.figsize'] = 16, 8
# Import arch library
from arch import arch_model
import warnings
warnings.filterwarnings('ignore')
2) Retrieve Price Data
# Load locally stored data
df = pd.read_csv('vol.csv', parse_dates=True, index_col=0, dayfirst=True)
# Check first 5 values
df.head()
3) Visualize Index Price and VIX Price
figure, axis_1 = plt.subplots()
axis_1.plot(df['S&P 500'], color='green', label='S&P 500')
axis_1.legend(['S&P 500'],loc='best')
plt.title('Prices')
plt.grid(True)
axis_2 = axis_1.twinx()
axis_2.plot(df['VIX'], color='red', label='VIX')
axis_2.legend(['VIX'],loc='best');
4) Calculate log returns of index
returns = np.log(df['S&P 500']).diff().fillna(0)
5) Visualize daily log returns
# Visualize S&P Index daily returns
plt.plot(returns, color='orange')
plt.title('S&P 500 Index Returns')
plt.grid(True)
6) Define and fit models
#Define the model (Normal Distribution)
am_n = arch_model(returns, mean = 'zero', vol='Garch', p=1, o=0, q=1, dist='Normal')
#Fit the model
garch1_1 = am_n.fit()
#Define the model (t-Distribution)
am_t = arch_model(returns, mean = 'zero', vol='Garch', p=1, o=0, q=1, dist='StudentsT')
tgarch1_1 = am_t.fit()
7) Obtain Model Output
print(garch1_1)
print(tgarch1_1)
8) Visualize Historic Annualized Conditional Vol
# Plot annualised vol
fig1 = garch1_1.plot(annualize='D')
fig2 = tgarch1_1.plot(annualize='D')
garch1_1.conditional_volatility*np.sqrt(252)
tgarch1_1.conditional_volatility*np.sqrt(252)
9) Visualize Historic Annualized Conditional Vol vs long run vol
# Model params
garch1_1.params
# Long-run variance implied by the fitted parameters: omega / (1 - alpha - beta)
lrv = garch1_1.params['omega']/(1-garch1_1.params['alpha[1]']-garch1_1.params['beta[1]'])
# Long-run annualised volatility (%)
np.sqrt(lrv*252)*100
plt.plot(garch1_1.conditional_volatility*np.sqrt(252), color='blue')
plt.axhline(y=np.sqrt(lrv*252), color='red');
10) Visualize Historic Annualized Conditional Vol vs VIX
# Visualise GARCH volatility and VIX
plt.title('Annualized Volatility')
plt.plot(returns.index, garch1_1.conditional_volatility*np.sqrt(252)*100, color='blue', label='GARCH')
plt.plot(returns.index, df['VIX'], color='orange', label = 'VIX')
plt.legend(loc=2)
plt.grid(True)
11) Forecast Volatility
# Forecast for next 60 days
model_forecast = garch1_1.forecast(horizon=60)
# Subsume forecast values into a dataframe
forecast_df = pd.DataFrame(np.sqrt(model_forecast.variance.dropna().T *252)*100)
forecast_df.columns = ['Cond_Vol']
forecast_df.head()
# Plot volatility forecast over a 60-day horizon
plt.plot(forecast_df, color='blue')
plt.axhline(y=np.sqrt(lrv*252)*100, color='red',)
plt.xlim(0,60)
plt.xticks(rotation=90)
plt.xlabel('Horizon (in days)')
plt.ylabel('Volatility (%)')
plt.title('Volatility Forecast : 60-days Ahead');
plt.grid(True)