Maximum likelihood estimation of the Heston
stochastic volatility model using asset and option prices: an application of
nonlinear filtering theory
Francesca Mariani, Graziella
Pacelli, Francesco
Zirilli
Warning: If your
browser does not allow you to see the content of this website click here to download a word file that reproduces the
material contained here.
Let
us suppose that the dynamics of the stock prices and of their stochastic
variance is described by the Heston model, that is by a system of two
stochastic differential equations with a suitable initial condition. Our aim is
to estimate the parameters of the Heston model and one component of the initial
condition, that is the initial stochastic variance, from the knowledge of the
stock and option prices observed at discrete times. The option prices
considered refer to an European call on the stock whose prices are described by
the Heston model. The method proposed to solve this problem is based on a
filtering technique to construct a likelihood function and on the maximization
of the likelihood function obtained. The estimated parameters and initial value
component are characterized as being a maximizer of the likelihood function
subject to some constraints. The solution of the filtering problem, used to
construct the likelihood function, is based on an integral representation of
the fundamental solution of the FokkerPlanck equation associated to the
Heston model, on the use of the wavelet expansions presented in [1], [2], [3] to approximate the integral kernel appearing
in the representation formula of the fundamental solution, on a simple
truncation procedure to exploit the sparsifying properties of the wavelet
expansions and on the use of the Fast Fourier Transform (FFT). The use of these
techniques generates a very efficient and fully parallelizable numerical
procedure to solve the filtering problem, this last fact makes possible to
evaluate very efficiently the likelihood function and its gradient. As a
byproduct of the solution of the filtering problem we have developed a
stochastic variance tracking technique that gives very good results in numerical
experiments. The maximum likelihood problem used in the estimation procedure is
a low dimensional constrained optimization problem, its solution with ad hoc
techniques is justified by the computational cost of evaluating the likelihood
function and its gradient. We use parallel computing and a variable metric
steepest ascent method to solve the maximum likelihood problem. Some numerical
examples of the estimation problem using synthetic data obtained with a
parallel implementation of the previous numerical method are presented. Very
impressive speed up factors are obtained in the numerical examples using the
parallel implementation of the numerical method proposed. This website contains
two animations and some auxiliary material that helps the understanding of the
paper [7] and makes available to the interested users the
computer programs used to produce the numerical experience presented here and
in [7] .
Assuming
the stock prices and their stochastic volatilities or variances as state
variables and the stock prices and option prices observed at discrete times as observations,
we study the problem of forecasting (tracking) the unobserved and timevarying
stochastic volatility or variance when the model parameters and initial
condition are known using the observations (filtering problem) and the
problem of estimating the model parameters and
the unknown initial volatility or variance using the observations (estimation
problem). We consider these
problems in the context of the Heston stochastic volatility model.
Let
S_{t} and v_{t} be the state variables at time t>0 that is,
respectively, the stock price and the stochastic variance at time t>0. The
Heston model assumes that the dynamics of the state variables (S_{t},v_{t}),
t>0, is described by the following system of stochastic differential
equations [4]:
_{}
where W_{t}^{1}, W_{t}^{2},
t>0, are standard Wiener processes such that W_{0}^{1}=W_{0}^{2}=0, dW_{t}^{1}, dW_{t}^{2}
are their stochastic differentials and <dW_{t}^{1}dW_{t}^{2}
>= rdt where < × > denotes the mean of × and rÎ[1,1] is a constant known as correlation parameter; m is a real constant related to the
behaviour of the mean of the stock log returns, associated to the stochastic
process S_{t} , t>0. Note
that the stochastic variance v_{t}, t>0, is assumed to be described by a meanreverting process with speed
g³0 and parameters q³0 and e³0, that is by equation (2). The equations (1),
(2) must be equipped with an initial condition, that is:
_{}
Here _{} and _{}denote random variables concentrated
with probability one in a point; for simplicity, we continue to denote these
points respectively with _{}and _{}.
We note that _{}, the stock price at t=0, is
observable while _{}, the stochastic variance at t=0,
cannot be observed. The quantities m, q, e, g
and r are real constants and are the
model parameters that we want to estimate. Hence, the parameters and unknown
initial condition component of the Heston model that we want to estimate
are m, q, e, g, r and _{}. We assume that 2gq/e^{2 }³1, this assumption guarantees that
if _{} is positive with probability one, v_{t}, solution of (1),
(2), remains positive with probability one for t>0.
We suppose to observe the prices of the stock
and the prices of a European call option of known maturity and strike price on
the underlying whose price S_{t},
t>0, satisfies (1), (2), (3), (4) at prescribed discrete times_{ }0<t_{1}<t_{2}<…<t_{n}<t_{n+1
}=+¥, let F_{t}= {(S_{i},C_{i}) : t_{i }£t}, t>0, be the set of the
observations up to time t>0 of the stock prices S_{t} and of the
option prices C_{t} that is for i=1,2,...,n let S_{i} be the
observation of the stock price and C_{i} be the observation of the
option price at time t_{i} and let _{} be the vector of the model
parameters and of the initial stochastic variance, where the superscript ^{T}
denotes the transpose operator. We use the notation t_{0} =0 and F_{0}={(_{})}, this
notation simplifies some of the formulae that follow. The option prices
considered refer to the prices of an European call option on the underlying
whose prices are described by S_{t}, t>0, with known strike price E
and maturity time T such that T>t_{n}, we denote with C(S,v,t;E,T,Q) the price of this option in the Heston
stochastic volatility model when 0£t<T, S_{t}=S and v_{t}=v. We assume that the option prices observed are given by the
theoretical option prices in the Heston model affected by a Gaussian error,
that is: for i=1,2,...,n, C_{i}=C(S_{i},v_{i,}t_{i};
E,T,Q)+u_{i}, where u_{i }is a Gaussian error term; that
is we assume that u_{i }is sampled from a Gaussian variable with mean
zero and known variance _{} and C(S_{i},v_{i,}t_{i};
E,T,Q) is the option price in the Heston
model assuming known the (unobserved) variance v_{i} at time t_{i}.
We introduce the stock log returns:
_{} (5)
without
loss of generality, we can assume S_{0}=_{}=1. From (1), (2), (3), (4) and (5) it is easy to see that (x_{t},v_{t}),
t>0, satisfy the following equations:
_{}
with
initial condition:
_{}
The joint probability density function _{} (x,v)Î(¥,+¥)_{}(0, +¥) , t>0, of the state variables x_{t},
v_{t}, t>0, conditioned to the observations F_{t},
t>0, is solution of the following
equations [5]: for i=0,1,...,n:
_{} _{ }(10)
with
initial condition: for i=0:
_{} (11)
and for
i=1,2,...,n:
_{}
_{}(12)
where for i=1,2,...,n: x_{i}=log(S_{i})
and _{}(x,v,t_{i};E,T,
Q) is the value in the Heston model
at time t=t_{i} when _{}, _{}of a call option on the stock whose price is S_{t},
t>0, with strike price E and maturity time T.
Once we know the joint probability density _{}(x,v,tF_{t},Q), (x,v) Î (¥,+¥)_{}(0, +¥), t>0, we can forecast the values of the
random variables (x_{t},v_{t}), t>0, as follows:
_{}
The
quantities _{} can be regarded as
the solution of the filtering problem.
Let G(x,v,t,x',v',t'Q), (x,v), (x',v') Î(¥,+¥)_{}(0, +¥) ,
t>t'>0, be the fundamental solution of (10), we have (see [6]
pag.605):
_{} where i denotes the imaginary unit an A and B are
explicitly known elementary functions [6] pag.605. Using the fundamental solution of (10) given in (15) the solution of (10),
(11), (12) can be written as follows:
_{} (16)
and
for i=1,2,…,n:
_{} (17)
where
_{} (18)
and
_{} (19)
The evaluation of_{ }, (x,v)Î (¥,+¥)_{}(0, +¥), t>0, for i=1,2,...,n, on a grid of N^{2}
points in the (x,v) variables with a naïve quadrature procedure applied to
formula (17) using N quadrature nodes in each variable requires O(N^{5})
complex multiplications when N goes to infinity. In fact (17) due to (15) must
be regarded as a three dimensional integral (i.e. an integral of the _{}k, l variables). Using
· a wavelet expansion [1],
[2], [3] of the integral kernel of (17)
and of the function (18),
· a truncation procedure that
exploits the sparsifying properties of the wavelet expansion,
· the Fast Fourier Transform
(FFT),
we reduce the computational cost of the
previous computation from O(N^{5}) to N_{1}O(N^{2}log
N) complex multiplications when N goes to infinity, where N_{1} is
independent of N. Note that N_{1 }is the number of the wavelet
functions remaining in the expansion after the use of the truncation procedure.
Using the estimates (13), (14), we can forecast
the values of the stock log return x_{t}, t>0, and of the stochastic
variance v_{t}, t>0, coherently with the observations F_{t}, t³0. The continuous update in time of the value
of the couple _{} is the “trajectory
tracking” of (6), (7), (8), (9). The continuous update in time of _{} is the “variance
tracking”.
The quality of the estimated values (_{},_{}),
t>0, depends on the variance of the random variables x_{t}, v_{t}, t>0, conditioned to the
observations, that is we can estimate the quality of the estimates (13), (14)
computing respectively the quantities:
The estimates (13), (14) are good, when the
variances (20), (21) are small.
In the trajectory tracking shown in the digital
movie and in Figure1 the solid line represents the true trajectory made of the
values (x_{t},v_{t}), tÎ[0,24/252.5], obtained computing
with Euler method one trajectory of the stochastic differential equations (6),
(7), (8), (9) with _{} =0 and _{}=0.5; the dashed line represents the
trajectory followed by the forecasted values (_{},_{}), tÎ[0,24/252.5],
obtained solving the filtering problem computing the integrals (13),
(14) and using the data shown in Table 1. The asterisks and the circles
indicate, respectively, the values of
(x_{t},v_{t})
and of _{} at the observation
times t=t_{i}, i=0,1,...,5. We have chosen Q = (0.026, 5.94, 0.306, 0.01159,
0.576, 0.5)^{T}. We have considered the option with strike price E =
0.5 and maturity time T = 126/252.5.
We assume that the non adimensional components
of the parameter vector Q are expressed in annualized unit,
that is in the unit 1/year. We use a “year” made of the average number of
trading days per year, that is 252.5 days, so that for example t=24/252.5
corresponds to the end of the trading day number 24 of the regular year.
i 
t_{i} 
x_{i} 
C(x_{i},v_{i},t_{i}E,Q) 
C_{i} 
0 
0 
0 


1 
4/252.5 
0.00371625 
0.13279039 
0.13440918 
2 
8/252.5 
0.08059790 
0.06952851 
0.06835849 
3 
12/252.5 
0.00498269 
0.12195051 
0.12349829 
4 
16/252.5 
0.00590680 
0.10456021 
0.10599176 
5 
20/252.5 
0.08869340 
0.18950671 
0.18758168 
Table 1. The data of the
filtering problem
Below we show two figures (Figure 1 and Figure
2) that help the understanding of the quality of the estimates obtained solving
the filtering problem. Figure 1 shows the trajectory tracking obtained as
solution of the filtering problem. In particular the solid line represents the
true trajectory made by the values (x_{t},v_{t}),
tÎ[0,24/252.5],
obtained solving with the Euler method one trajectory of the stochastic
differential equations (6), (7), (8), (9), the dashed line represents the
trajectory followed by the forecasted values (_{},_{}), tÎ[0,24/252.5]. Figure 2 shows S(v_{t } F_{t, }Q)
versus time when tÎ[0,24/252.5].
Figure 1. Trajectory tracking Figure
2. S(v_{t}F_{t },Q)_{ }versus
time t
In Section 3 we have assumed Q known
to solve the filtering problem but in real situations when the Heston model is
used Q is
not known and must be estimated from the knowledge of the observations. The
dynamics of the stochastic processes (x_{t},v_{t}), t>0, and the estimates (_{},_{}),
t>0, obtained from the solution of
the filtering problem, depend on the vector _{} Let _{}be the set of the admissible vectors Q. The
observations (x_{i},C_{i}), i=1,2,...,n,and the initial
condition _{}_{ }are realizations
(samples) of the random variable x_{t}, t³0,
at the observation times and at t=t_{0}=0 and of a random variable, the
option price C_{t}, t>0, derived from (x_{t},v_{t}),
t>0, at the observation times so that we can hope that the observations can
be used to characterize the vector Q.
Therefore, in order to derive forecasts of the stock log returns and of the
stochastic variance and in order to obtain derivative prices that are coherent
with the observed derivative prices it is essential to have a reliable estimate
of the vector Q. That
is we search an admissible vector Q
that makes most likely the observations
_{}_{ }and (x_{i},C_{i}), i=1,2,...,n.
We solve the problem of estimating
the vector Q
from the observations using the maximum likelihood method. The maximum
likelihood method consists in searching the vector QÎM
that maximizes the following (log)likelihood
function:
_{} (22)
Note that (22) represents the sum of
the logarithms of the integrals of the product between the transition
probability densities _{}(x_{i+1},v,t_{i+1}F_{t},Q)
and the weight functions _{},
i=0,1,...,n1. The addition of these weight functions to the expression of the
likelihood function considered in (22) has the purpose of taking care of the
fact that also the option prices C_{i}, i=1,2,...,n, are observed, assigning bigger weights to the values
of the v variable that make more likely the observation of C_{i} at
time t=t_{i}, i=1,2,...,n. The function F(Q)
represents the (log)likelihood of
the parameter vector Q given the observations F_{t}, t>0,
and _{}_{ }at time t=t_{0}.
In order to maximize the (log)likelihood
function shown in (22), we use an iterative optimization procedure that
searches the maximum likelihood estimate Q^{*}^{ }solution
of the maximization problem:
_{} (23)
starting from an initial guess Q^{0}ÎM.
The iterative procedure that solves the
optimization problem (23) using Q^{0 }as initial guess finds during the iterations
the admissible vectors Q^{k,},^{ }k=1,2,...,iter,
where iter is the maximum number of iterations allowed. The vectors Q^{k}, k=0,1,…,iter, are such that F(Q^{k+1})³F(Q^{k}), k=0,1,…,iter1.
The evaluation of the (log)likelihood function F(Q) is very
expensive computationally. We reduce the computational cost of solving the
filtering and estimation problems using parallel computing. Using
n_{p }processors the wall clock time required to execute the
program that finds the solution of the filtering problem reduces approximately
of a factor n_{p}. In particular if we denote with _{}the wall clock time needed to evaluate the (log)likelihood
function F(Q), for a
given choice of QÎM, using n_{p
}processors, we can measure the efficiency of the parallel implementation
considered computing the speed up factor s(n_{p}), where
_{} (26)
In Table 2 we show as a function of the number of
processors n_{p} the speed up factor and the wall clock time _{} needed to evaluate
the (log)likelihood function associated to the data shown in Table 1 on IBM
SP4 machine with 32 processors.
n_{p} 
s(n_{p}) 
_{}

1 
1 
4200 
2 
1.9927 
2107 
4 
3.9424 
1065 
8 
7.6549 
548.6 
16 
14.8287 
283.23 
Table 2. Speed up factors s(n_{p}) and
wall clock time _{}
needed to evaluate the likelihood function
associated to
the data of Table1 versus the number n_{p} of processors
employed
L.Fatone, M.C.Recchioni, F.Zirilli New
Scattering Problems and Numerical Methods in Acoustics, in Recent Research
Developments in Acoustics,S.G.Pandalai Managing Editor, Transworld Research
Network, Kerala, India, 2 (2005), 3969.
L.Fatone, M.C.Recchioni, F.Zirilli New
wavelet bases made of piecewise polynomial functions: approximation theory, quadrature
rules and applications to kernel sparsification and image compression, in
preparation.
L.Fatone, G.Rao, M.C.Recchioni,
F.Zirilli, High Performance Algorithms Based on a New Wavelet Expansion for
Time Dependent Acoustic Obstacle Scattering, submitted to Journal of
Computational Physics.
S.L.Heston, A ClosedForm
Solution for Options with Stochastic Volatility with Applications to Bond and
Currency Options , Review of Financial Studies, 6 (1993), 327343.
A.H. Jazwinski, Stochastic Processes and
Filtering Theory , Academic Press, New York, (1970).
A.Lipton, Mathematical Methods for Foreign
Exchange, , World Scientific Pubblishing Co. Pte. Ltd, Singapore, (2001).
F.Mariani, G.Pacelli, F.Zirilli, Maximum
Likelihood Estimation of the Heston Stochastic Volatility Model Using Asset and
Option Prices: an Application of Nonlinear Filtering Theory ,
Optimization Letters, 2 (2008), 177222.
The FORTRAN code distributed in this website computes the solution of the estimation problem (23) using the formulae outlined in Sections 3 and 4 and derived in [7] . The data of the estimation problem are those described in Section 4 and must be provided by the user. The FORTRAN code consists of two programs. The first program preproc.f reads and writes on files the data inserted by the user, initializes the parameter vector Q^{0 }and computes the number of the wavelet coefficients needed after the truncation procedure in order to approximate efficiently the integral kernel contained in (17) associated to the initial guess Q^{0}.
The second program estimate.f starting from the initial guess Q^{0 } generates a sequence of admissible vectors Q^{k}, k=1,2,…,iter,^{ }using a variable metric steepest ascent procedure. At every iteration the program solves the filtering problem associated to the current vector parameter Q^{k}.Both programs use parallel computing and MPI as message passing library. Therefore they can run on parallel machines.
Entry n. 8177