Method for approximate evaluation of integrals
In mathematics, Laplace's method, named after Pierre-Simon Laplace, is a technique used to approximate integrals of the form
-
where
is a twice-differentiable function, M is a large number, and the endpoints a and b could possibly be infinite. This technique was originally presented in Laplace (1774).
In Bayesian statistics, Laplace approximation can refer to either approximating the posterior normalizing constant with Laplace's method[1] or approximating the posterior distribution with a Gaussian centered at the maximum a posteriori estimate.[2] Laplace approximations play a central role in the integrated nested Laplace approximations method for fast approximate Bayesian inference.
The idea of Laplace's method [edit]
has a global maximum at 0.
is shown on top for M = 0.5, and at the bottom for M = 3 (both in blue). As M grows the approximation of this function by a Gaussian function (shown in red) improves. This observation underlies Laplace's method.
Suppose the function
has a unique global maximum at x 0. Let
be a constant and consider the following two functions:
-
Note that x 0 will be the global maximum of
and
as well. Now observe:
-
As M increases, the ratio for
will grow exponentially, while the ratio for
does not change. Thus, significant contributions to the integral of this function will come only from points x in a neighbourhood of x 0, which can then be estimated.
General theory of Laplace's method [edit]
To state and motivate the method, we need several assumptions. We will assume that x 0 is not an endpoint of the interval of integration, that the values
cannot be very close to
unless x is close to x 0, and that
We can expand
around x 0 by Taylor's theorem,
-
where
(see: big O notation).
Since
has a global maximum at x 0, and since x 0 is not an endpoint, it is a stationary point, so the derivative of
vanishes at x 0. Therefore, the function
may be approximated to quadratic order
-
for x close to x 0 (recall
). The assumptions ensure the accuracy of the approximation
-
(see the picture on the right). This latter integral is a Gaussian integral if the limits of integration go from −∞ to +∞ (which can be assumed because the exponential decays very fast away from x 0), and thus it can be calculated. We find
-
A generalization of this method and extension to arbitrary precision is provided by Fog (2008).
Formal statement and proof [edit]
Suppose
is a twice continuously differentiable function on
and there exists a unique point
such that:
-
Then:
-
Lower bound: Let
. Since
is continuous there exists
such that if
then
By Taylor's Theorem, for any
-
Then we have the following lower bound:
-
where the last equality was obtained by a change of variables
-
Remember
so we can take the square root of its negation.
If we divide both sides of the above inequality by
-
and take the limit we get:
-
since this is true for arbitrary
we get the lower bound:
-
Note that this proof works also when
or
(or both).
Upper bound: The proof is similar to that of the lower bound but there are a few inconveniences. Again we start by picking an
but in order for the proof to work we need
small enough so that
Then, as above, by continuity of
and Taylor's Theorem we can find
so that if
, then
-
Lastly, by our assumptions (assuming
are finite) there exists an
such that if
, then
.
Then we can calculate the following upper bound:
-
If we divide both sides of the above inequality by
-
and take the limit we get:
-
Since
is arbitrary we get the upper bound:
-
And combining this with the lower bound gives the result.
Note that the above proof obviously fails when
or
(or both). To deal with these cases, we need some extra assumptions. A sufficient (not necessary) assumption is that for
-
and that the number
as above exists (note that this must be an assumption in the case when the interval
is infinite). The proof proceeds otherwise as above, but with a slightly different approximation of integrals:
-
When we divide by
-
we get for this term
-
whose limit as
is
. The rest of the proof (the analysis of the interesting term) proceeds as above.
The given condition in the infinite interval case is, as said above, sufficient but not necessary. However, the condition is fulfilled in many, if not in most, applications: the condition simply says that the integral we are studying must be well-defined (not infinite) and that the maximum of the function at
must be a "true" maximum (the number
must exist). There is no need to demand that the integral is finite for
but it is enough to demand that the integral is finite for some
This method relies on 4 basic concepts such as
- 1. Relative error
The "approximation" in this method is related to the relative error and not the absolute error. Therefore, if we set
-
the integral can be written as
-
where
is a small number when
is a large number obviously and the relative error will be
-
Now, let us separate this integral into two parts:
region and the rest.
- 2.
around the stationary point when
is large enough
Let's look at the Taylor expansion of
around x 0 and translate x to y because we do the comparison in y-space, we will get
-
Note that
because
is a stationary point. From this equation you will find that the terms higher than second derivative in this Taylor expansion is suppressed as the order of
so that
will get closer to the Gaussian function as shown in figure. Besides,
-
- 3. The larger
is, the smaller range of
is related
Because we do the comparison in y-space,
is fixed in
which will cause
; however,
is inversely proportional to
, the chosen region of
will be smaller when
is increased.
- 4. If the integral in Laplace's method converges, the contribution of the region which is not around the stationary point of the integration of its relative error will tend to zero as
grows.
Relying on the 3rd concept, even if we choose a very large Dy , sDy will finally be a very small number when
is increased to a huge number. Then, how can we guarantee the integral of the rest will tend to 0 when
is large enough?
The basic idea is to find a function
such that
and the integral of
will tend to zero when
grows. Because the exponential function of
will be always larger than zero as long as
is a real number, and this exponential function is proportional to
the integral of
will tend to zero. For simplicity, choose
as a tangent through the point
as shown in the figure:
If the interval of the integration of this method is finite, we will find that no matter
is continue in the rest region, it will be always smaller than
shown above when
is large enough. By the way, it will be proved later that the integral of
will tend to zero when
is large enough.
If the interval of the integration of this method is infinite,
and
might always cross to each other. If so, we cannot guarantee that the integral of
will tend to zero finally. For example, in the case of
will always diverge. Therefore, we need to require that
can converge for the infinite interval case. If so, this integral will tend to zero when
is large enough and we can choose this
as the cross of
and
You might ask that why not choose
as a convergent integral? Let me use an example to show you the reason. Suppose the rest part of
is
then
and its integral will diverge; however, when
the integral of
converges. So, the integral of some functions will diverge when
is not a large number, but they will converge when
is large enough.
Based on these four concepts, we can derive the relative error of this Laplace's method.
Other formulations [edit]
Laplace's approximation is sometimes written as
-
where
is positive.
Importantly, the accuracy of the approximation depends on the variable of integration, that is, on what stays in
and what goes into
.[3]
The derivation of its relative error
First, use
to denote the global maximum, which will simplify this derivation. We are interested in the relative error, written as
,
-
where
-
So, if we let
-
and
, we can get
-
since
.
For the upper bound, note that
thus we can separate this integration into 5 parts with 3 different types (a), (b) and (c), respectively. Therefore,
-
where
and
are similar, let us just calculate
and
and
are similar, too, I'll just calculate
.
For
, after the translation of
, we can get
-
This means that as long as
is large enough, it will tend to zero.
For
, we can get
-
where
-
and
should have the same sign of
during this region. Let us choose
as the tangent across the point at
, i.e.
which is shown in the figure
is the tangent lines across the point at
.
From this figure you can find that when
or
gets smaller, the region satisfies the above inequality will get larger. Therefore, if we want to find a suitable
to cover the whole
during the interval of
,
will have an upper limit. Besides, because the integration of
is simple, let me use it to estimate the relative error contributed by this
.
Based on Taylor expansion, we can get
-
and
-
and then substitute them back into the calculation of
; however, you can find that the remainders of these two expansions are both inversely proportional to the square root of
, let me drop them out to beautify the calculation. Keeping them is better, but it will make the formula uglier.
-
Therefore, it will tend to zero when
gets larger, but don't forget that the upper bound of
should be considered during this calculation.
About the integration near
, we can also use Taylor's Theorem to calculate it. When
-
and you can find that it is inversely proportional to the square root of
. In fact,
will have the same behave when
is a constant.
Conclusively, the integral near the stationary point will get smaller as
gets larger, and the rest parts will tend to zero as long as
is large enough; however, we need to remember that
has an upper limit which is decided by whether the function
is always larger than
in the rest region. However, as long as we can find one
satisfying this condition, the upper bound of
can be chosen as directly proportional to
since
is a tangent across the point of
at
. So, the bigger
is, the bigger
can be.
In the multivariate case where
is a
-dimensional vector and
is a scalar function of
, Laplace's approximation is usually written as:
-
where
is the Hessian matrix of
evaluated at
and where
denotes matrix determinant. Analogously to the univariate case, the Hessian is required to be negative definite.[4]
By the way, although
denotes a
-dimensional vector, the term
denotes an infinitesimal volume here, i.e.
.
Laplace's method extension: Steepest descent [edit]
In extensions of Laplace's method, complex analysis, and in particular Cauchy's integral formula, is used to find a contour of steepest descent for an (asymptotically with large M) equivalent integral, expressed as a line integral. In particular, if no point x 0 where the derivative of
vanishes exists on the real line, it may be necessary to deform the integration contour to an optimal one, where the above analysis will be possible. Again the main idea is to reduce, at least asymptotically, the calculation of the given integral to that of a simpler integral that can be explicitly evaluated. See the book of Erdelyi (1956) for a simple discussion (where the method is termed steepest descents).
The appropriate formulation for the complex z-plane is
-
for a path passing through the saddle point at z 0. Note the explicit appearance of a minus sign to indicate the direction of the second derivative: one must not take the modulus. Also note that if the integrand is meromorphic, one may have to add residues corresponding to poles traversed while deforming the contour (see for example section 3 of Okounkov's paper Symmetric functions and random partitions).
Further generalizations [edit]
An extension of the steepest descent method is the so-called nonlinear stationary phase/steepest descent method. Here, instead of integrals, one needs to evaluate asymptotically solutions of Riemann–Hilbert factorization problems.
Given a contour C in the complex sphere, a function
defined on that contour and a special point, say infinity, one seeks a function M holomorphic away from the contour C, with prescribed jump across C, and with a given normalization at infinity. If
and hence M are matrices rather than scalars this is a problem that in general does not admit an explicit solution.
An asymptotic evaluation is then possible along the lines of the linear stationary phase/steepest descent method. The idea is to reduce asymptotically the solution of the given Riemann–Hilbert problem to that of a simpler, explicitly solvable, Riemann–Hilbert problem. Cauchy's theorem is used to justify deformations of the jump contour.
The nonlinear stationary phase was introduced by Deift and Zhou in 1993, based on earlier work of Its. A (properly speaking) nonlinear steepest descent method was introduced by Kamvissis, K. McLaughlin and P. Miller in 2003, based on previous work of Lax, Levermore, Deift, Venakides and Zhou. As in the linear case, "steepest descent contours" solve a min-max problem. In the nonlinear case they turn out to be "S-curves" (defined in a different context back in the 80s by Stahl, Gonchar and Rakhmanov).
The nonlinear stationary phase/steepest descent method has applications to the theory of soliton equations and integrable models, random matrices and combinatorics.
Laplace's method generalization: Median-point approximation [edit]
In the generalization, evaluation of the integral is considered equivalent to finding the norm of the distribution with density
-
Denoting the cumulative distribution
, if there is a diffeomorphic Gaussian distribution with density
-
the norm is given by
-
and the corresponding diffeomorphism is
-
where
denotes cumulative standard normal distribution function.
In general, any distribution diffeomorphic to the Gaussian distribution has density
-
and the median-point is mapped to the median of the Gaussian distribution. Matching the logarithm of the density functions and their derivatives at the median point up to a given order yields a system of equations that determine the approximate values of
and
.
The approximation was introduced in 2019 by D. Makogon and C. Morais Smith primarily in the context of partition function evaluation for a system of interacting fermions.
Complex integrals [edit]
For complex integrals in the form:
-
with
we make the substitution t = iu and the change of variable
to get the bilateral Laplace transform:
-
We then split g(c + ix) in its real and complex part, after which we recover u = t/i. This is useful for inverse Laplace transforms, the Perron formula and complex integration.
Example: Stirling's approximation [edit]
Laplace's method can be used to derive Stirling's approximation
-
for a large integerN.
From the definition of the Gamma function, we have
-
Now we change variables, letting
so that
Plug these values back in to obtain
-
This integral has the form necessary for Laplace's method with
-
which is twice-differentiable:
-
-
The maximum of
lies at z 0 = 1, and the second derivative of
has the value −1 at this point. Therefore, we obtain
-
See also [edit]
- Method of stationary phase
- Method of steepest descent
- Large deviations theory
- Laplace principle (large deviations theory)
Notes [edit]
- ^ Tierney, Luke; Kadane, Joseph B. (1986). "Accurate Approximations for Posterior Moments and Marginal Densities". J. Amer. Statist. Assoc. 81 (393): 82–86.
- ^ Amaral Turkman, M. Antónia; Paulino, Carlos Daniel; Müller, Peter (2019). "Methods Based on Analytic Approximations". Computational Bayesian Statistics: An Introduction. Cambridge University Press. pp. 150–171. ISBN978-1-108-70374-1.
- ^ Butler, Ronald W (2007). Saddlepoint approximations and applications. Cambridge University Press. ISBN978-0-521-87250-8.
- ^ MacKay, David J. C. (September 2003). Information Theory, Inference and Learning Algorithms. Cambridge: Cambridge University Press. ISBN9780521642989.
References [edit]
- Azevedo-Filho, A.; Shachter, R. (1994), "Laplace's Method Approximations for Probabilistic Inference in Belief Networks with Continuous Variables", in Mantaras, R.; Poole, D. (eds.), Uncertainty in Artificial Intelligence, San Francisco, CA: Morgan Kaufmann, CiteSeerX10.1.1.91.2064 .
- Deift, P.; Zhou, X. (1993), "A steepest descent method for oscillatory Riemann–Hilbert problems. Asymptotics for the MKdV equation", Ann. of Math., vol. 137, no. 2, pp. 295–368, arXiv:math/9201261, doi:10.2307/2946540, JSTOR 2946540 .
- Erdelyi, A. (1956), Asymptotic Expansions, Dover .
- Fog, A. (2008), "Calculation Methods for Wallenius' Noncentral Hypergeometric Distribution", Communications in Statistics, Simulation and Computation, vol. 37, no. 2, pp. 258–273, doi:10.1080/03610910701790269 .
- Laplace, P S (1774), "Mémoires de Mathématique et de Physique, Tome Sixième" [Memoir on the probability of causes of events.], Statistical Science, 1 (3): 366–367, JSTOR 2245476
- Wang, Xiang-Sheng; Wong, Roderick (2007). "Discrete analogues of Laplace's approximation". Asymptot. Anal. 54 (3–4): 165–180.
This article incorporates material from saddle point approximation on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.
0 Response to "A Fast Method for the Numerical Evaluation of Continuous Fourier and Laplace Transforms"
Post a Comment