Maximum likelihood estimation of the parameters of a system of stochastic differential equations that models the returns of the index of some classes of hedge funds^{*}
Lorella Fatone, Francesca Mariani, Maria Cristina Recchioni, Francesco Zirilli
^{*}The numerical experience reported in this paper has been obtained using
the computing grid of ENEA (
In [1] we study an inverse problem for a parabolic partial differential equation: the Fokker Planck equation associated with a system of stochastic differential equations. The inverse problem consists in finding, from suitable data, the values of the parameters that appear in the coefficients of this Fokker Planck equation, that is, the parameters that define the system of stochastic differential equations. The data used in the reconstruction of the parameters are observations, made at discrete times, of the stochastic process that solves the system of stochastic differential equations. That is, the data of the inverse problem are a “sample” taken from some of the components of the vector solution of the stochastic differential equations and not, as is usual, observations made on the solution of the parabolic equation. The choice of the system of stochastic differential equations and of the data used in the inverse problem is motivated by applications in mathematical finance. The stochastic differential equations presented can be used to model the dynamics of the log-returns of the index of some classes of hedge funds, such as the so-called “long short equity” hedge funds, and of some auxiliary variables. The solution of the inverse problem is obtained by solving a filtering problem and an estimation problem. The solution of these two problems is based on the knowledge of the joint probability density function of the state variables of the model conditioned on the observations made and on the initial condition. This joint probability density function is the solution of an initial value problem for the Kushner equation which, in the circumstances considered here, can be written as a sequence of initial value problems, with appropriate initial conditions, for the Fokker Planck equation associated with the system of stochastic differential equations. An integral representation formula for this probability density function is derived and used to develop a numerical procedure that solves the estimation problem with the maximum likelihood method. The computational method proposed has been tested on synthetic data and the results obtained are presented. This website contains auxiliary material that helps the understanding of [1], together with some animations and the description of some numerical experiments. A more general reference to the work of the authors and of their coauthors in mathematical finance is the website: http://www.econ.univpm.it/recchioni/finance.
We take as state variables the log-return of the stock prices index, the log-return of the index of a suitable class of hedge funds (for example the “long short equity” class) and the stochastic variance of the stock prices index, and as observations the log-returns of the stock prices index and of the hedge funds class index observed at discrete times. We study two problems: the problem of forecasting (tracking) the unobserved, time-varying stochastic variance from the observations when the model parameters and the initial condition are known (filtering problem), and the problem of estimating the model parameters and the unknown initial stochastic variance from the observations (estimation problem). The solution of the estimation problem is needed in order to use the model proposed to forecast, in real situations, future log-returns of the stock prices and hedge funds indices.
Let x_{t}, z_{t} and v_{t} be the state variables at time t>0, that is, respectively, the log-return of the stock prices index, the log-return of the index of the hedge funds class and the stochastic variance of the stock prices index at time t>0. We assume that the stochastic process (x_{t}, z_{t}, v_{t}), t>0, is the solution of the following system of stochastic differential equations:
[equations (1), (2), (3) for dx_{t}, dz_{t}, dv_{t}]
where W_{t}^{1}, W_{t}^{2}, W_{t}^{3}, t>0, are standard Wiener processes such that W_{0}^{1}=W_{0}^{2}=W_{0}^{3}=0, dW_{t}^{1}, dW_{t}^{2}, dW_{t}^{3} are their stochastic differentials and <dW_{t}^{1}dW_{t}^{2}>=ρ_{1,2}dt, <dW_{t}^{1}dW_{t}^{3}>=ρ_{1,3}dt, <dW_{t}^{2}dW_{t}^{3}>=ρ_{2,3}dt, where <·> denotes the mean of · and ρ_{1,2}, ρ_{1,3}, ρ_{2,3}∈[-1,1] are constants known as correlation coefficients. The quantity μ is a real constant related to the behaviour of the mean of the log-return of the stock prices index, that is of the log-return associated with the stochastic process of the stock prices index S_{t}, t>0. Note that the stochastic variance v_{t}, t>0, is assumed to be described by a mean-reverting process with speed χ≥0 and parameters θ≥0 and ε≥0, that is by equation (3). The stochastic differential equations (1), (3) are known as the Heston model [2]. Equation (2) has been introduced to model the log-returns of the indices of suitable classes of hedge funds (such as, for example, the long short equity class). In fact equation (2) says that dz_{t} is the sum of βdx_{t} and of a random disturbance driven by dW_{t}^{3} whose size is controlled by γ, where β and γ are real constants. If we interpret z_{t}, t>0, as the log-return at time t of the index of one of these classes of hedge funds, x_{t}, t>0, as the log-return at time t of the stock prices index, for example the S&P500 index, and v_{t}, t>0, as the stochastic variance of x_{t}, t>0, equation (2) can be regarded as a reasonable translation to the continuous time setting of the findings of the time series analysis presented in [3], and the system (1), (2), (3) can be interpreted as a model that explains the dynamics of the log-return of the index of one of the classes of hedge funds mentioned previously using the log-return of the S&P500 index and its stochastic variance as explanatory variables. We remark that the models proposed in [3] are discrete time models.
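The displayed form of the system (1), (2), (3) is given in [1]. As a hedged sketch only, one form consistent with the description above (Heston dynamics for x_{t} and v_{t}, and dz_{t} equal to βdx_{t} plus a disturbance driven by dW_{t}^{3}) is the following. The constant drift μ in (1) and the factor \sqrt{v_t} in (2) are assumptions made here for illustration; [1] remains the authoritative source.

```latex
\begin{aligned}
dx_t &= \mu\,dt + \sqrt{v_t}\,dW_t^{1}, & t>0, \qquad & (1)\\
dz_t &= \beta\,dx_t + \gamma\sqrt{v_t}\,dW_t^{3}, & t>0, \qquad & (2)\\
dv_t &= \chi(\theta - v_t)\,dt + \varepsilon\sqrt{v_t}\,dW_t^{2}, & t>0. \qquad & (3)
\end{aligned}
```

With this form γ is dimensionless when v_{t} has dimension 1/time, since \sqrt{v_t}\,dW_t^{3} is then itself dimensionless.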
The equations (1), (2), (3) must be equipped with an initial condition, that is:
x_{0}=\tilde{x}_{0}, z_{0}=\tilde{z}_{0}, v_{0}=\tilde{v}_{0}. (4), (5), (6)
Here \tilde{x}_{0}, \tilde{z}_{0} and \tilde{v}_{0} denote random variables concentrated with probability one in a point; for simplicity, we continue to denote these points respectively with \tilde{x}_{0}, \tilde{z}_{0} and \tilde{v}_{0}. We note that \tilde{x}_{0}, \tilde{z}_{0}, that is the log-returns of the stock prices index and of the index of the hedge funds class at t=0, are observable, while \tilde{v}_{0}, that is the stochastic variance at t=0, cannot be observed.
The quantities μ, θ, χ, ε, γ, β and ρ_{1,2}, ρ_{1,3}, ρ_{2,3} are real constants and are the model parameters that we want to estimate. Hence the parameters and the unknown component of the initial condition of the model (1), (2), (3) that we want to estimate are μ, θ, χ, ε, γ, β, ρ_{1,2}, ρ_{1,3}, ρ_{2,3} and \tilde{v}_{0}. We assume that 2χθ/ε^{2}≥1; this assumption guarantees that if \tilde{v}_{0} is positive with probability one then v_{t}, solution of (3), (5), remains positive with probability one for t>0.
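As a small illustration, the positivity condition 2χθ/ε^{2}≥1 can be checked before a candidate parameter vector is used. The function names and the numerical values below are hypothetical and chosen only for this sketch.

```python
def feller_ratio(chi: float, theta: float, eps: float) -> float:
    """Return 2*chi*theta/eps**2, the quantity required to be >= 1 in the text."""
    if eps <= 0.0:
        raise ValueError("eps must be positive to form the ratio")
    return 2.0 * chi * theta / eps**2


def variance_stays_positive(chi: float, theta: float, eps: float) -> bool:
    """True when the condition 2*chi*theta/eps**2 >= 1 holds."""
    return feller_ratio(chi, theta, eps) >= 1.0


# Illustrative values only (not taken from the experiments of the paper).
print(variance_stays_positive(chi=2.0, theta=0.01, eps=0.01))
print(variance_stays_positive(chi=0.5, theta=0.04, eps=0.3))
```

A check of this kind is one simple way to test whether a candidate vector is admissible before the likelihood is evaluated.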
We suppose that at the discrete times 0=t_{0}<t_{1}<t_{2}<…<t_{n}<t_{n+1}=+∞ the log-return of the stock prices index and the log-return of the index of a special class of hedge funds are observed. For i=1,2,...,n let \tilde{x}_{i} be the observation of the log-return of the stock prices index and \tilde{z}_{i} be the observation of the log-return of the hedge funds index at time t=t_{i}, and let F_{t}={(\tilde{x}_{i},\tilde{z}_{i}) : t_{i}≤t}, t>0, be the set of the observations made up to time t>0. We assume that the observations are error free, that is we assume \tilde{x}_{i}=x_{t_{i}}, \tilde{z}_{i}=z_{t_{i}}, i=0,1,…,n. Let Θ be the vector of the model parameters and of the initial stochastic variance; in the formulae that follow the superscript ^{T} denotes the transpose operator. We use the notation t_{0}=0 and F_{0}={(\tilde{x}_{0},\tilde{z}_{0})} to simplify the formulae that follow.
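The joint probability density function p(x,z,v,t|F_{t},Θ), (x,z,v)∈(-∞,+∞)×(-∞,+∞)×(0,+∞), t>0, of having x_{t}=x, z_{t}=z, v_{t}=v, t>0, conditioned on the observations F_{t}, t>0, is the solution of the following equations [1]: for i=0,1,...,n, in the time interval t_{i}<t<t_{i+1} the density p=p_{i} satisfies the Fokker Planck equation associated with the system (1), (2), (3):
[Fokker Planck equation for p_{i}(x,z,v,t|F_{t},Θ), t_{i}<t<t_{i+1}] (7)
with initial condition: for i=0:
p_{0}(x,z,v,t_{0}|F_{0},Θ)=δ(x-\tilde{x}_{0})δ(z-\tilde{z}_{0})δ(v-\tilde{v}_{0}), (x,z,v)∈(-∞,+∞)×(-∞,+∞)×(0,+∞), t=t_{0}, (8)
and for i=1,2,...,n:
\[ p_{i}(x,z,v,t_{i}|F_{t_{i}},\Theta)=\frac{p_{i-1}(\tilde{x}_{i},\tilde{z}_{i},v,t_{i}|F_{t_{i-1}},\Theta)}{\int_{0}^{+\infty}p_{i-1}(\tilde{x}_{i},\tilde{z}_{i},v',t_{i}|F_{t_{i-1}},\Theta)\,dv'}\,\delta(x-\tilde{x}_{i})\,\delta(z-\tilde{z}_{i}), \quad (x,z,v)\in(-\infty,+\infty)\times(-\infty,+\infty)\times(0,+\infty), \; t=t_{i}. \] (9)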
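The explicit form of (7) depends on the coefficients of (1), (2), (3) and is given in [1]. As a hedged sketch, assume (this is an assumption, not a statement of [1]) that the system reads dx_t = \mu\,dt + \sqrt{v_t}\,dW_t^1, dz_t = \beta\,dx_t + \gamma\sqrt{v_t}\,dW_t^3, dv_t = \chi(\theta - v_t)\,dt + \varepsilon\sqrt{v_t}\,dW_t^2 with the correlations ρ_{1,2}, ρ_{1,3}, ρ_{2,3} defined above. Under that assumption the associated Fokker Planck equation for p = p_i, t_i < t < t_{i+1}, would be:

```latex
\begin{aligned}
\frac{\partial p}{\partial t} ={}& -\mu\,\frac{\partial p}{\partial x}
 - \beta\mu\,\frac{\partial p}{\partial z}
 - \frac{\partial}{\partial v}\bigl[\chi(\theta - v)\,p\bigr]
 + \frac{1}{2}\,\frac{\partial^{2}}{\partial x^{2}}(v\,p)
 + \frac{1}{2}\bigl(\beta^{2} + \gamma^{2} + 2\beta\gamma\rho_{1,3}\bigr)\frac{\partial^{2}}{\partial z^{2}}(v\,p) \\
 & + \frac{\varepsilon^{2}}{2}\,\frac{\partial^{2}}{\partial v^{2}}(v\,p)
 + (\beta + \gamma\rho_{1,3})\,\frac{\partial^{2}}{\partial x\,\partial z}(v\,p)
 + \varepsilon\rho_{1,2}\,\frac{\partial^{2}}{\partial x\,\partial v}(v\,p)
 + \varepsilon(\beta\rho_{1,2} + \gamma\rho_{2,3})\,\frac{\partial^{2}}{\partial z\,\partial v}(v\,p).
\end{aligned}
```

The second order coefficients are the entries of the instantaneous covariance of (dx_t, dz_t, dv_t) divided by dt, for example <dx_t dz_t> = v_t(\beta + \gamma\rho_{1,3})\,dt.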
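Once we know the joint probability density p(x,z,v,t|F_{t},Θ), (x,z,v)∈(-∞,+∞)×(-∞,+∞)×(0,+∞), t>0, we can forecast the values of the random variables (x_{t}, z_{t}, v_{t}), t>0, conditioned on the observations using their conditional means \hat{x}_{t}, \hat{z}_{t}, \hat{v}_{t} given by:
\[ \hat{x}_{t}=\int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty}dz\int_{0}^{+\infty}dv\; x\,p(x,z,v,t|F_{t},\Theta), \quad t>0, \] (10)
\[ \hat{z}_{t}=\int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty}dz\int_{0}^{+\infty}dv\; z\,p(x,z,v,t|F_{t},\Theta), \quad t>0, \] (11)
\[ \hat{v}_{t}=\int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty}dz\int_{0}^{+\infty}dv\; v\,p(x,z,v,t|F_{t},\Theta), \quad t>0. \] (12)
The quantities \hat{x}_{t}, \hat{z}_{t}, \hat{v}_{t}, t>0, can be regarded as the solution of the filtering problem.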
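Let G(x,z,v,t,x',z',v',t'|Θ), (x,z,v), (x',z',v')∈(-∞,+∞)×(-∞,+∞)×(0,+∞), t>t'>0, be the fundamental solution of (7), whose expression is given in [1]. Using the function G the solution of problem (7), (8), (9) can be written as follows:
p_{0}(x,z,v,t|F_{t},Θ)=G(x,z,v,t,\tilde{x}_{0},\tilde{z}_{0},\tilde{v}_{0},t_{0}|Θ), (x,z,v)∈(-∞,+∞)×(-∞,+∞)×(0,+∞), t_{0}<t<t_{1}, (13)
and for i=1,2,…,n:
\[ p_{i}(x,z,v,t|F_{t},\Theta)=\int_{0}^{+\infty}G(x,z,v,t,\tilde{x}_{i},\tilde{z}_{i},v',t_{i}|\Theta)\,f_{i}(v'|\Theta)\,dv', \quad (x,z,v)\in(-\infty,+\infty)\times(-\infty,+\infty)\times(0,+\infty), \; t_{i}<t<t_{i+1}, \] (14)
where
\[ f_{i}(v'|\Theta)=\frac{p_{i-1}(\tilde{x}_{i},\tilde{z}_{i},v',t_{i}|F_{t_{i-1}},\Theta)}{\int_{0}^{+\infty}p_{i-1}(\tilde{x}_{i},\tilde{z}_{i},v'',t_{i}|F_{t_{i-1}},\Theta)\,dv''}, \quad v'\in(0,+\infty). \] (15)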
The evaluation of p_{i}(x,z,v,t|F_{t},Θ), (x,z,v)∈(-∞,+∞)×(-∞,+∞)×(0,+∞), t>0, for i=1,2,...,n, on a grid of N^{3} points in the (x,z,v) variables with a naïve quadrature procedure applied to formula (14) using N quadrature nodes in each integration variable requires O(N^{6}) complex multiplications when N goes to infinity. In fact, since the fundamental solution G is itself obtained by computing a two dimensional integral, (14) can be regarded as a three dimensional integral. In [1] using:
· a wavelet expansion [1], [5], [6] of the integral kernel appearing in (14) and of the function (15),
· a truncation procedure that exploits the sparsifying properties of the wavelet expansion,
· a suitable expansion in powers of a time step and the Fast Fourier Transform (FFT),
we reduce the computational cost of the previous computation from O(N^{6}) to N_{1}O(N^{3}log N) complex multiplications when N goes to infinity, where N_{1} is independent of N. Note that N_{1} is the number of wavelet functions remaining in the expansion of the integral kernel after the use of the truncation procedure.
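To give a feeling for the size of this reduction, the two operation counts can be compared for a few grid sizes. This is only an order-of-magnitude sketch: the constants hidden in the O(·) symbols are set to one, the logarithm is taken in base 2, and the value of N_{1} is hypothetical.

```python
import math


def naive_cost(n: int) -> float:
    """Leading-order count N**6 of the naive quadrature (constant set to 1)."""
    return float(n) ** 6


def fast_cost(n: int, n1: int) -> float:
    """Leading-order count N1 * N**3 * log2(N) of the wavelet/FFT procedure."""
    return n1 * float(n) ** 3 * math.log2(n)


def speedup(n: int, n1: int) -> float:
    """Ratio of the two leading-order counts."""
    return naive_cost(n) / fast_cost(n, n1)


# n1 = 10 is a hypothetical number of retained wavelet functions.
for n in (32, 64, 128):
    print(n, round(speedup(n, n1=10), 1))
```

Since N_{1} does not grow with N, the ratio grows like N^{3}/log N.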
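Using the estimates (10), (11), (12) we can forecast the values of the log-return of the stock prices index x_{t}, t>0, of the log-return of the hedge funds class index z_{t}, t>0, and of the stochastic variance v_{t}, t>0, coherently with the observations F_{t}, t≥0. The continuous update in time of the forecasted values (\hat{x}_{t}, \hat{z}_{t}, \hat{v}_{t}) given by (10), (11), (12) is the “trajectory tracking” of (1), (2), (3), (4), (5), (6). The continuous update in time of \hat{v}_{t} is the corresponding “variance tracking”.
The quality of the forecasted values (\hat{x}_{t}, \hat{z}_{t}, \hat{v}_{t}) depends on the variance of the random variables x_{t}, z_{t}, v_{t}, t>0, conditioned on the observations, that is we can assess the quality of the estimates (10), (11), (12) by computing respectively the quantities:
\[ S(x_{t}|F_{t},\Theta)=\int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty}dz\int_{0}^{+\infty}dv\;(x-\hat{x}_{t})^{2}\,p(x,z,v,t|F_{t},\Theta), \quad t>0, \] (16)
\[ S(z_{t}|F_{t},\Theta)=\int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty}dz\int_{0}^{+\infty}dv\;(z-\hat{z}_{t})^{2}\,p(x,z,v,t|F_{t},\Theta), \quad t>0, \] (17)
\[ S(v_{t}|F_{t},\Theta)=\int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty}dz\int_{0}^{+\infty}dv\;(v-\hat{v}_{t})^{2}\,p(x,z,v,t|F_{t},\Theta), \quad t>0. \] (18)
The estimates (10), (11), (12) are good when the variances (16), (17), (18) are small.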
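As a minimal sketch of how forecasted values and their variances can be evaluated once a conditional density is tabulated on a grid, the following computes the mean and the variance of each state variable with a trapezoidal rule. The density used here is a stand-in product of Gaussians, not the density of [1], and the function names and the renormalization step are assumptions made for illustration.

```python
import numpy as np


def trapezoid_weights(grid):
    """Trapezoidal quadrature weights for a 1-D grid (uniform or not)."""
    grid = np.asarray(grid, dtype=float)
    w = np.empty_like(grid)
    w[0] = 0.5 * (grid[1] - grid[0])
    w[-1] = 0.5 * (grid[-1] - grid[-2])
    w[1:-1] = 0.5 * (grid[2:] - grid[:-2])
    return w


def conditional_moments(p, x, z, v):
    """Means and variances of (x, z, v) for a density p[i, j, k] = p(x[i], z[j], v[k])."""
    w = (trapezoid_weights(x)[:, None, None]
         * trapezoid_weights(z)[None, :, None]
         * trapezoid_weights(v)[None, None, :])
    q = w * p
    q = q / q.sum()  # renormalize on the grid: a numerical safeguard (assumption)
    X, Z, V = np.meshgrid(x, z, v, indexing="ij")
    means = np.array([(q * X).sum(), (q * Z).sum(), (q * V).sum()])
    variances = np.array([(q * (X - means[0]) ** 2).sum(),
                          (q * (Z - means[1]) ** 2).sum(),
                          (q * (V - means[2]) ** 2).sum()])
    return means, variances


def gaussian(u, m, s):
    """One-dimensional Gaussian density, used only to build the stand-in."""
    return np.exp(-0.5 * ((u - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))


# Stand-in density with known means and standard deviations (hypothetical values).
x = np.linspace(-0.29, 0.31, 121)
z = np.linspace(-0.22, 0.26, 121)
v = np.linspace(0.20, 0.80, 121)
p = (gaussian(x, 0.01, 0.05)[:, None, None]
     * gaussian(z, 0.02, 0.04)[None, :, None]
     * gaussian(v, 0.50, 0.05)[None, None, :])
means, variances = conditional_moments(p, x, z, v)
print(means, variances)
```

Small variances relative to the scale of the means indicate that the corresponding forecasted values are reliable.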
In the trajectory tracking shown in Figure 1 and in the digital movie showing the trajectory tracking, the solid line represents the true trajectory made of the values (x_{t}, z_{t}, v_{t}), t∈[0,24/252.5], obtained by computing numerically for t>0, with the Euler method, one trajectory of the stochastic differential equations (1), (2), (3) with the initial condition (4), (5), (6) given by \tilde{x}_{0}=0, \tilde{z}_{0}=0 and \tilde{v}_{0}=0.5; the dashed line represents the trajectory followed by the forecasted values (\hat{x}_{t}, \hat{z}_{t}, \hat{v}_{t}), t∈[0,24/252.5], obtained by solving the filtering problem, that is by computing the integrals (10), (11), (12) using the data shown in Table 1. The squares and the asterisks indicate, respectively, the values of (x_{t}, z_{t}, v_{t}) and of (\hat{x}_{t}, \hat{z}_{t}, \hat{v}_{t}) at the observation times t=t_{i}, i=0,1,...,5. We have chosen Θ = (0.026, 5.94, 0.306, 0.01159, 1, 0.1, -0.576, 0, 0, 0.5)^{T}. We note that β, γ, ρ_{1,2}, ρ_{1,3}, ρ_{2,3} are dimensionless quantities and that μ, θ, χ, ε, \tilde{v}_{0} have dimension 1/time. We assume that the components μ, θ, χ, ε, \tilde{v}_{0} of the parameter vector Θ are expressed in annualized units, that is in units of 1/year. We use a “year” made of the average number of trading days per year, that is of 252.5 days, so that, for example, t=24/252.5 corresponds to the end of trading day number 24 of the year.
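A trajectory of this kind can be generated with an Euler scheme along the following lines. This is a sketch under stated assumptions, not the authors' code: the system is assumed to have the form dx = μ dt + √v dW^{1}, dz = β dx + γ√v dW^{3}, dv = χ(θ - v) dt + ε√v dW^{2}; a negative variance produced by the discretization is replaced by zero inside the coefficients (a common safeguard that the text does not mention); and the assignment of the numbers in Θ to named parameters is an assumption.

```python
import numpy as np


def simulate_euler(mu, theta, chi, eps, gamma, beta, rho12, rho13, rho23,
                   x0, z0, v0, t_end, n_steps, seed=0):
    """Euler scheme for the assumed form of (1), (2), (3) on [0, t_end]."""
    rng = np.random.default_rng(seed)
    corr = np.array([[1.0, rho12, rho13],
                     [rho12, 1.0, rho23],
                     [rho13, rho23, 1.0]])
    chol = np.linalg.cholesky(corr)  # requires a positive definite correlation matrix
    dt = t_end / n_steps
    t = np.linspace(0.0, t_end, n_steps + 1)
    x = np.empty(n_steps + 1)
    z = np.empty(n_steps + 1)
    v = np.empty(n_steps + 1)
    x[0], z[0], v[0] = x0, z0, v0
    for k in range(n_steps):
        # Correlated Wiener increments (dW1, dW2, dW3) over one time step.
        dw = chol @ rng.standard_normal(3) * np.sqrt(dt)
        vk = max(v[k], 0.0)  # safeguard against negative variance (assumption)
        sq = np.sqrt(vk)
        dx = mu * dt + sq * dw[0]
        x[k + 1] = x[k] + dx
        z[k + 1] = z[k] + beta * dx + gamma * sq * dw[2]
        v[k + 1] = v[k] + chi * (theta - vk) * dt + eps * sq * dw[1]
    return t, x, z, v


# Illustrative parameter values; their mapping to the components of the
# vector given in the text is an assumption.
params = dict(mu=0.026, theta=0.306, chi=5.94, eps=0.01159, gamma=0.1, beta=1.0,
              rho12=-0.576, rho13=0.0, rho23=0.0, x0=0.0, z0=0.0, v0=0.5)
t, x, z, v = simulate_euler(**params, t_end=24 / 252.5, n_steps=240, seed=1)
print(x[-1], z[-1], v[-1])
```

Sampling the simulated arrays every 40 steps gives values at t_{i}=4i/252.5, the observation times used in Table 1.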
i | t_{i} | \tilde{x}_{i} | \tilde{z}_{i}
0 | 0 | 0 | 0
1 | 4/252.5 | -0.1046324981 | -0.09423814923
2 | 8/252.5 | -0.0389636566 | -0.03170475035
3 | 12/252.5 | -0.1106878449 | -0.09375359522
4 | 16/252.5 | 0.01355863752 | 0.036249189791
5 | 20/252.5 | 0.04209650394 | 0.053437008036
Table 1. The data of the filtering problem
Below we show two figures (Figure 1 and Figure 2) that help to assess the quality of the estimates obtained by solving the filtering problem. Recall that Figure 1 shows the trajectory tracking obtained as solution of the filtering problem. In particular the solid line represents the true trajectory made of the values (x_{t}, z_{t}, v_{t}), t∈[0,24/252.5], obtained by computing with the Euler method one trajectory of the initial value problem (1), (2), (3), (4), (5), (6), and the dashed line represents the trajectory followed by the forecasted values (\hat{x}_{t}, \hat{z}_{t}, \hat{v}_{t}), t∈[0,24/252.5]. Figure 2 shows S(x_{t}|F_{t},Θ), S(z_{t}|F_{t},Θ), S(v_{t}|F_{t},Θ) versus time when t∈[0,24/252.5]. In the figures the time is expressed in days.
Figure 1. Tracking of x_{t}, z_{t}, v_{t} versus time (in days)
Figure 2. S(x_{t}|F_{t},Θ), S(z_{t}|F_{t},Θ), S(v_{t}|F_{t},Θ) versus time (in days)
In Section 3 we have assumed Θ known in order to solve the filtering problem, but in real situations, when the model (1), (2), (3) is used, Θ is not known and must be estimated from the observations. The dynamics of the stochastic process (x_{t}, z_{t}, v_{t}), t>0, and of the estimates (\hat{x}_{t}, \hat{z}_{t}, \hat{v}_{t}), t>0, obtained from the solution of the filtering problem depend on the vector Θ. Let M be the set of the admissible vectors Θ.
We solve the problem of estimating the vector Θ from the observations using the maximum likelihood method. That is, we search for the vector Θ∈M that maximizes the following (log-)likelihood function:
\[ F(\Theta)=\sum_{i=0}^{n-1}\log\int_{0}^{+\infty}p_{i}(\tilde{x}_{i+1},\tilde{z}_{i+1},v,t_{i+1}|F_{t_{i}},\Theta)\,dv, \quad \Theta\in M. \] (19)
Note that (19) is the sum of the logarithms of the integrals with respect to v of the transition probability densities p_{i}(\tilde{x}_{i+1},\tilde{z}_{i+1},v,t_{i+1}|F_{t_{i}},Θ), i=0,1,...,n-1. The function F(Θ) represents the (log-)likelihood of the parameter vector Θ given the observations F_{t}, t>0, and the initial condition \tilde{x}_{0}, \tilde{z}_{0} at time t=t_{0}.
In order to maximize the (log-)likelihood function (19) we use an iterative optimization procedure that searches for the maximum likelihood estimate Θ^{*}, solution of the maximization problem:
\[ \max_{\Theta\in M}F(\Theta), \] (20)
starting from an initial guess Θ^{0}∈M.
The iterative procedure that solves the optimization problem (20) using Θ^{0} as initial guess generates during the iterations the admissible vectors Θ^{k}, k=1,2,...,iter, where iter is the maximum number of iterations allowed. The vectors Θ^{k}, k=0,1,…,iter, are such that F(Θ^{k+1})≥F(Θ^{k}), k=0,1,…,iter-1.
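The monotonicity property F(Θ^{k+1})≥F(Θ^{k}) together with admissibility of every iterate can be obtained, for example, with a gradient ascent step that is halved until the candidate is admissible and increases the objective. The sketch below illustrates only this mechanism: it is a plain steepest ascent with backtracking, not the variable metric procedure of the FORTRAN code, and the objective, the admissible set and all names are hypothetical stand-ins.

```python
import numpy as np


def numerical_gradient(f, theta, h=1e-6):
    """Central finite-difference gradient of f at theta."""
    grad = np.zeros_like(theta)
    for j in range(theta.size):
        e = np.zeros_like(theta)
        e[j] = h
        grad[j] = (f(theta + e) - f(theta - e)) / (2.0 * h)
    return grad


def monotone_ascent(f, is_admissible, theta0, max_iter=50, step0=1.0,
                    c=1e-4, tol=1e-12):
    """Steepest ascent with backtracking; every accepted step keeps the
    iterate admissible and does not decrease f."""
    theta = np.asarray(theta0, dtype=float)
    history = [f(theta)]
    for _ in range(max_iter):
        g = numerical_gradient(f, theta)
        step, accepted = step0, False
        while step > 1e-12:
            cand = theta + step * g
            # Sufficient-increase test with a small constant c.
            if is_admissible(cand) and f(cand) >= history[-1] + c * step * g.dot(g):
                accepted = True
                break
            step *= 0.5
        if not accepted:
            break
        theta = cand
        history.append(f(theta))
        if history[-1] - history[-2] < tol:
            break
    return theta, history


# Stand-in concave objective and admissible set (second component >= 0).
def toy_objective(th):
    return -(th[0] - 1.0) ** 2 - 2.0 * (th[1] - 0.5) ** 2


def toy_admissible(th):
    return th[1] >= 0.0


theta_star, history = monotone_ascent(toy_objective, toy_admissible, [0.0, 0.1])
print(theta_star, history)
```

In the estimation problem the stand-in objective would be replaced by the (log-)likelihood (19) and the admissibility test by membership in M.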
The second experiment consists in analyzing a series of daily observations made during a period of two years, that is a series made of 505 observation times. Recall that we consider a “year” made of 252.5 trading days. The 505 observation times correspond to 1010 observations \tilde{x}_{i}, \tilde{z}_{i}, i=0,1,…,504, where we have chosen t_{0}=0, x_{0}=\tilde{x}_{0}=0, z_{0}=\tilde{z}_{0}=0, v_{0}=\tilde{v}_{0}=0.5. The data are generated by integrating numerically with the Euler method one trajectory of (1), (2), (3), (4), (5), (6) over a period of two years.
We choose the vector Θ of the stochastic differential system that generates the data to be equal to the vector Θ_{1} used in the previous experiment in the first year of data (i.e. when t=t_{i}, i=0,1,…,252) and to the vector Θ_{2} in the second year of data (i.e. when t=t_{i}, i=253,254,…,504), where Θ_{1} and Θ_{2} are given by: Θ_{1}=(0.026, 5.94, 0.306, 0.01159, 1, 0.1, -0.576, 0, 0, 0.5)^{T}, Θ_{2}=(0.4, 2, 0.01, 0.01, 1, 0.01, 0.5, 0, 0, 0.012)^{T}.
The observations \tilde{x}_{i} and \tilde{z}_{i}, i=0,1,…,504, are contained in the downloadable files samplexx.txt and samplezz.txt respectively.
We solve problem (20) repeatedly, each time using a time window made of 9 consecutive observation times, corresponding to 18 observations. More precisely, we divide the time interval [0,504/252.5] into 63 consecutive disjoint time windows, each containing 8 consecutive observation times: for j=0,1,…,62 the time values t=t_{i,j}=t_{8j+i}, i=1,2,…,8, belong to the window j. Note that 8×63=504, so that the observation time t=t_{0} is not contained in any of these windows. To each window j we add the observation time t_{0,j}=t_{8j}, which for j≥1 is the last observation time of the window j-1 and for j=0 is t_{0,0}=t_{0}. Hence, for j=0,1,…,62, problem (20) is solved using the observations \tilde{x}_{8j+i}, \tilde{z}_{8j+i} made at the 9 consecutive observation times t=t_{i,j}=t_{8j+i}, i=0,1,…,8.
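The indexing t_{i,j}=t_{8j+i} can be made concrete with a few lines of code. The sketch below lists, for each window j, the indices of the 9 observation times used; the helper names are hypothetical, and the conversion t_{k}=k/252.5 assumes one observation per trading day as in the experiment described above.

```python
TRADING_DAYS_PER_YEAR = 252.5
WINDOW_STEPS = 8   # new observation times contributed by each window
N_WINDOWS = 63     # 8 * 63 = 504


def window_observation_indices(j, steps=WINDOW_STEPS):
    """Indices 8j+i, i=0,...,8, of the observation times used in window j."""
    if not 0 <= j < N_WINDOWS:
        raise ValueError("window index out of range")
    return [steps * j + i for i in range(steps + 1)]


def window_times(j):
    """Observation times (in years) of window j, assuming t_k = k/252.5."""
    return [k / TRADING_DAYS_PER_YEAR for k in window_observation_indices(j)]


windows = [window_observation_indices(j) for j in range(N_WINDOWS)]
print(windows[0], windows[1], windows[-1])
```

Consecutive windows share exactly one observation time, the point t_{8j}, which closes window j-1 and serves as the starting time of window j.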
We solve the 63 problems (20) starting from different initial guesses. In Figure 3 the diamond represents the vector Θ_{1} and the square represents the vector Θ_{2}. The unit length in Figure 3 is _{}. The origin is the mid point of the segment that joins the points Θ_{1} and Θ_{2}, and the length of the segment that joins the diamond and the square is proportional to the Euclidean norm of the vector Θ_{1}-Θ_{2}. The vectors determined by solving the problems (20) with the optimization procedure are represented by stars. A point P=Θ is plotted around Θ_{1} when the quantity _{} is smaller than the quantity _{}, otherwise it is plotted around Θ_{2}. The point P is plotted on the left or on the right of the point that represents Θ_{1} or Θ_{2} according to the sign of _{}, the eighth component of the vector Θ, that is the estimated correlation coefficient _{}. Finally the difference of the abscissae of the point Θ_{1} and of a point P=Θ of its cluster is _{}. A similar statement holds for Θ_{2} and a point P of the cluster associated with Θ_{2}.
Figure 3: Reconstruction of the parameter vectors Θ_{1}, Θ_{2}
Figure 3 and the digital movie that shows the solution of the 63 problems (20) show that the points obtained by solving the problems (20) are concentrated on two disjoint segments, one on the left and one on the right of the origin, and that they form two disjoint clusters. That is, the solution of the 63 problems (20) corresponding to the time windows described previously shows that two sets of parameters seem to generate the data studied. This is indeed the case. Moreover, as expected, the points of the cluster around Θ_{2} are those obtained from the analysis of the observations made in the second year.
[1] L. Fatone, F. Mariani, M.C. Recchioni, F. Zirilli, Maximum likelihood estimation of the parameters of a system of stochastic differential equations that models the returns of the index of some classes of hedge funds.
[2] S.L. Heston, A closed-form solution for options with stochastic volatility with applications to bond and currency options, Review of Financial Studies, 6, 327-343 (1993).
[3] P. Pillonel, L. Solanet, Predictability in hedge fund index returns and its application in fund of hedge funds style allocation, Master's Thesis in Banking and Finance at Université de Lausanne, Hautes Etudes Commerciales (HEC), (2006).
[4] L. Fatone, M.C. Recchioni, F. Zirilli, Wavelet Bases Made of Piecewise Polynomial Functions: Theory and Applications, Applied Mathematics, 2, 196-216 (2011).
[5] L. Fatone, G. Rao, M.C. Recchioni, F. Zirilli, High performance algorithms based on a new wavelet expansion for time dependent acoustic obstacle scattering, Communications in Computational Physics, 2(6), 1139-1173 (2007).
[6] F. Mariani, G. Pacelli, F. Zirilli, Maximum likelihood estimation of the Heston stochastic volatility model using asset and option prices: an application of nonlinear filtering theory, Optimization Letters, 2, 177-222 (2008).
The FORTRAN code distributed in this website computes the solution of the estimation problem (20) using the formulae outlined in Sections 3 and 4 and derived in [1]. The data of the estimation problem are those described in Section 4 and must be provided by the user. The FORTRAN code consists of two programs.
The first program, preproc.f90, reads the data given by the user and writes them on appropriate files, initializes the parameter vector Θ^{0} and computes the number of wavelet coefficients needed, after the truncation procedure, in order to approximate satisfactorily the integral kernel contained in (14) associated with the initial guess Θ^{0}.
The second program, estimate.f90, starting from the initial guess Θ^{0}, generates a sequence of admissible vectors Θ^{k}, k=1,2,…,iter, using a variable metric steepest ascent optimization procedure.