 # Numerical methods

## Numerical methods Topics

### Generalized minimal residual method

The generalized minimal residual (GMRES) method (Saad and Schultz 1986) is an extension of the minimal residual method (MINRES), which is only applicable to symmetric systems, to unsymmetric systems. Like MINRES, it generates a sequence of orthogonal vectors, but in the absence of symmetry this can no longer be done with short recurrences; instead, all previously computed vectors in the orthogonal sequence have to be retained. For this reason, "restarted" versions of the method are used. In the conjugate gradient method, the residuals form an orthogonal basis for the space $\operatorname{span}\{r^{(0)}, Ar^{(0)}, A^2 r^{(0)}, \ldots\}$. In GMRES, this basis is formed explicitly:

$$\begin{aligned} &w^{(i)} = A v^{(i)} \\ &\text{for } k = 1, \ldots, i: \quad w^{(i)} = w^{(i)} - (w^{(i)}, v^{(k)})\, v^{(k)} \\ &v^{(i+1)} = w^{(i)} / \|w^{(i)}\| \end{aligned}$$

The reader may recognize this as a modified Gram-Schmidt orthonormalization. Applied to the Krylov sequence $\{A^k r^{(0)}\}$ this orthogonalization is called the "Arnoldi method" (Arnoldi 1951). The inner product coefficients $(w^{(i)}, v^{(k)})$ and $\|w^{(i)}\|$ are stored in an upper Hessenberg matrix. The GMRES iterates are constructed as $x^{(i)} = x^{(0)} + y_1 v^{(1)} + \cdots + y_i v^{(i)}$, where..

### Biconjugate gradient stabilized method

The biconjugate gradient stabilized (BCGSTAB) method was developed to solve nonsymmetric linear systems while avoiding the often irregular convergence patterns of the conjugate gradient squared method (van der Vorst 1992). Instead of computing the conjugate gradient squared method sequence $i \mapsto P_i^2(A)\, r^{(0)}$, BCGSTAB computes $i \mapsto Q_i(A) P_i(A)\, r^{(0)}$, where $Q_i$ is an $i$th degree polynomial describing a steepest descent update. BCGSTAB often converges about as fast as the conjugate gradient squared method (CGS), sometimes faster and sometimes not. CGS can be viewed as a method in which the biconjugate gradient method (BCG) "contraction" operator is applied twice. BCGSTAB can be interpreted as the product of BCG and repeated application of the generalized minimal residual method. At least locally, a residual vector is minimized, which leads to a considerably smoother convergence behavior. On the other hand, if the local generalized minimal residual method step stagnates,..

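### Biconjugate gradient method

The conjugate gradient method is not suitable for nonsymmetric systems because the residual vectors cannot be made orthogonal with short recurrences, as proved in Voevodin (1983) and Faber and Manteuffel (1984). The generalized minimal residual method retains orthogonality of the residuals by using long recurrences, at the cost of a larger storage demand. The biconjugate gradient method (BCG) takes another approach, replacing the orthogonal sequence of residuals by two mutually orthogonal sequences, at the price of no longer providing a minimization. The update relations for residuals in the conjugate gradient method are augmented in the biconjugate gradient method by relations that are similar but based on $A^{\mathsf T}$ instead of $A$. Thus we update two sequences of residuals

$$r^{(i)} = r^{(i-1)} - \alpha_i A p^{(i)} \tag{1}$$
$$\tilde r^{(i)} = \tilde r^{(i-1)} - \alpha_i A^{\mathsf T} \tilde p^{(i)} \tag{2}$$

and two sequences of search directions

$$p^{(i)} = r^{(i-1)} + \beta_{i-1} p^{(i-1)} \tag{3}$$
$$\tilde p^{(i)} = \tilde r^{(i-1)} + \beta_{i-1} \tilde p^{(i-1)} \tag{4}$$

The choices

$$\alpha_i = \frac{\tilde r^{(i-1)\mathsf T} r^{(i-1)}}{\tilde p^{(i)\mathsf T} A p^{(i)}} \tag{5}$$
$$\beta_i = \frac{\tilde r^{(i)\mathsf T} r^{(i)}}{\tilde r^{(i-1)\mathsf T} r^{(i-1)}} \tag{6}$$

ensure the orthogonality relations

$$\tilde r^{(i)\mathsf T} r^{(j)} = \tilde p^{(i)\mathsf T} A p^{(j)} = 0 \tag{7}$$

if $i \neq j$. Few theoretical results are known about the convergence..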

### Conjugate gradient squared method

In the biconjugate gradient method, the residual vector $r^{(i)}$ can be regarded as the product of $r^{(0)}$ and an $i$th degree polynomial in $A$, i.e.,

$$r^{(i)} = P_i(A)\, r^{(0)}. \tag{1}$$

This same polynomial satisfies

$$\tilde r^{(i)} = P_i(A^{\mathsf T})\, \tilde r^{(0)} \tag{2}$$

so that

$$\rho_i = (\tilde r^{(i)}, r^{(i)}) \tag{3}$$
$$= (P_i(A^{\mathsf T})\, \tilde r^{(0)}, P_i(A)\, r^{(0)}) \tag{4}$$
$$= (\tilde r^{(0)}, P_i^2(A)\, r^{(0)}). \tag{5}$$

This suggests that if $P_i(A)$ reduces $r^{(0)}$ to a smaller vector $r^{(i)}$, then it might be advantageous to apply this "contraction" operator twice, and compute $P_i^2(A)\, r^{(0)}$. The iteration coefficients can still be recovered from these vectors (as shown above), and it turns out to be easy to find the corresponding approximations for $x$. This approach is the conjugate gradient squared (CGS) method (Sonneveld 1989). Often one observes a speed of convergence for CGS that is about twice as fast as for the biconjugate gradient method, which is in agreement with the observation that the same "contraction" operator is applied twice. However, there is no reason that the contraction operator, even if it really reduces the initial residual $r^{(0)}$, should also reduce the once reduced vector $r^{(i)} = P_i(A)\, r^{(0)}$. This..

### Conjugate gradient method on the normal equations

The conjugate gradient method can be applied on the normal equations. The CGNE and CGNR methods are variants of this approach that are the simplest methods for nonsymmetric or indefinite systems. Since other methods for such systems are in general rather more complicated than the conjugate gradient method, transforming the system to a symmetric definite one and then applying the conjugate gradient method is attractive for its coding simplicity. CGNE solves the system

$$(A A^{\mathsf T})\, y = b \tag{1}$$

for $y$ and then computes the solution

$$x = A^{\mathsf T} y. \tag{2}$$

CGNR solves

$$(A^{\mathsf T} A)\, x = \tilde b \tag{3}$$

for the solution vector $x$, where

$$\tilde b = A^{\mathsf T} b. \tag{4}$$

If a system of linear equations $Ax = b$ has a nonsymmetric, possibly indefinite (but nonsingular) coefficient matrix, one obvious attempt at a solution is to apply the conjugate gradient method to a related symmetric positive definite system $A^{\mathsf T} A x = A^{\mathsf T} b$. While this approach is easy to understand and code, the convergence speed of the conjugate gradient method now depends on the square of the condition..
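As a rough illustration of the CGNR idea, the sketch below runs conjugate gradient iterations on $A^{\mathsf T} A x = A^{\mathsf T} b$ without ever forming $A^{\mathsf T} A$. The names `cgnr`, `matvec`, and `dot`, the zero starting vector, and the stopping tolerance are assumptions made for this example, and dense list-of-rows matrices are used only to keep it self-contained.

```python
def matvec(M, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def cgnr(A, b, tol=1e-12, max_iter=100):
    """Conjugate gradient applied to A^T A x = A^T b (illustrative sketch)."""
    At = [list(col) for col in zip(*A)]   # transpose of A
    x = [0.0] * len(At)                   # assumed starting guess x = 0
    r = list(b)                           # residual b - A x of the original system
    z = matvec(At, r)                     # residual of the normal equations
    p = list(z)                           # first search direction
    zz = dot(z, z)
    for _ in range(max_iter):
        if zz ** 0.5 < tol:
            break
        w = matvec(A, p)
        alpha = zz / dot(w, w)            # p^T (A^T A) p equals (A p)^T (A p)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * wi for ri, wi in zip(r, w)]
        z = matvec(At, r)
        zz_new = dot(z, z)
        p = [zi + (zz_new / zz) * pi for zi, pi in zip(z, p)]
        zz = zz_new
    return x


# Nonsymmetric 2x2 example: 4x + y = 1, 2x + 3y = 2
solution = cgnr([[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0])
```

Working with `A` and its transpose separately avoids building the product matrix, but the iteration still sees the squared condition number mentioned above.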

### Minimal residual method

The conjugate gradient method can be viewed as a special variant of the Lanczos method for positive definite symmetric systems. The minimal residual method (MINRES) and symmetric LQ method (SYMMLQ) are variants that can be applied to symmetric indefinite systems. The vector sequences in the conjugate gradient method correspond to a factorization of a tridiagonal matrix similar to the coefficient matrix. Therefore, a breakdown of the algorithm can occur corresponding to a zero pivot if the matrix is indefinite. Furthermore, for indefinite matrices the minimization property of the conjugate gradient method is no longer well-defined. The MINRES method is a variant of the conjugate gradient method that avoids the LU decomposition and does not suffer from breakdown. MINRES minimizes the residual in the 2-norm. The convergence behavior of the conjugate gradient and MINRES methods for indefinite systems was analyzed by Paige et..

### Chebyshev iteration

Chebyshev iteration is a method for solving nonsymmetric problems (Golub and van Loan 1996, §10.1.5; Varga 1962, Ch. 5). Chebyshev iteration avoids the computation of inner products that is necessary for the other nonstationary methods. For some distributed memory architectures these inner products are a bottleneck with respect to efficiency. The price one pays for avoiding inner products is that the method requires enough knowledge about the spectrum of the coefficient matrix $A$ that an ellipse enveloping the spectrum can be identified; this difficulty can be overcome, however, via an adaptive construction developed by Manteuffel (1977) and implemented by Ashby (1985). Chebyshev iteration is suitable for any nonsymmetric linear system for which the enveloping ellipse does not include the origin. Chebyshev iteration is similar to the conjugate gradient method except that no inner products are computed. Scalars $c$ and $d$ must..

### Symmetric LQ method

The conjugate gradient method can be viewed as a special variant of the Lanczos method for positive definite symmetric systems. The minimal residual method and symmetric LQ method (SYMMLQ) are variants that can be applied to symmetric indefinite systems. The vector sequences in the conjugate gradient method correspond to a factorization of a tridiagonal matrix similar to the coefficient matrix. Therefore, a breakdown of the algorithm can occur corresponding to a zero pivot if the matrix is indefinite. Furthermore, for indefinite matrices the minimization property of the conjugate gradient method is no longer well-defined. The MINRES and SYMMLQ methods are variants of the CG method that avoid the LU decomposition and do not suffer from breakdown. SYMMLQ solves the projected system, but does not minimize anything (it keeps the residual orthogonal to all previous ones). When $A$ is not positive definite, but symmetric, we can still construct..

### Comonotone approximation

The approximation of a piecewise monotonic function $f$ by a polynomial with the same monotonicity. Such comonotonic approximations can always be accomplished with $n$th degree polynomials, and have an error of $A\,\omega(f; 1/n)$ (Passow and Raymon 1974, Passow et al. 1974, Newman 1979).

### Bézier curve

Given a set of $n + 1$ control points $P_0$, $P_1$, ..., $P_n$, the corresponding Bézier curve (or Bernstein-Bézier curve) is given by

$$C(t) = \sum_{i=0}^{n} P_i B_{i,n}(t),$$

where $B_{i,n}(t)$ is a Bernstein polynomial and $t \in [0, 1]$. Bézier splines are implemented in the Wolfram Language as BezierCurve[pts]. A "rational" Bézier curve is defined by

$$C(t) = \frac{\sum_{i=0}^{n} B_{i,p}(t)\, w_i P_i}{\sum_{i=0}^{n} B_{i,p}(t)\, w_i},$$

where $p$ is the order, $B_{i,p}$ are the Bernstein polynomials, $P_i$ are control points, and the weight $w_i$ of $P_i$ is the last ordinate of the homogeneous point $P_i^w$. These curves are closed under perspective transformations, and can represent conic sections exactly. The Bézier curve always passes through the first and last control points and lies within the convex hull of the control points. The curve is tangent to $P_1 - P_0$ and $P_n - P_{n-1}$ at the endpoints. The "variation diminishing property" of these curves is that no line can have more intersections with a Bézier curve than with the curve obtained by joining consecutive points with straight line segments...
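A minimal sketch of evaluating a (non-rational) Bézier curve directly from the Bernstein form $C(t) = \sum_i P_i \binom{n}{i} t^i (1-t)^{n-i}$ follows. The function name `bezier_point` and the sample control points are assumptions chosen for illustration.

```python
from math import comb


def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] via Bernstein polynomials."""
    n = len(control_points) - 1
    dim = len(control_points[0])
    point = [0.0] * dim
    for i, p in enumerate(control_points):
        # Bernstein polynomial B_{i,n}(t) = C(n, i) t^i (1 - t)^(n - i)
        b = comb(n, i) * t**i * (1 - t) ** (n - i)
        for d in range(dim):
            point[d] += b * p[d]
    return tuple(point)


# Hypothetical cubic example with four planar control points
pts = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
start = bezier_point(pts, 0.0)
middle = bezier_point(pts, 0.5)
end = bezier_point(pts, 1.0)
```

Evaluating at $t = 0$ and $t = 1$ returns the first and last control points, which matches the endpoint property stated above.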

### Milne's method

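A predictor-corrector method for solution of ordinary differential equations. The third-order equations for predictor and corrector are

$$y_{n+1} = y_{n-3} + \tfrac{4}{3} h \left(2 y_n' - y_{n-1}' + 2 y_{n-2}'\right) + O(h^5) \tag{1}$$
$$y_{n+1} = y_{n-1} + \tfrac{1}{3} h \left(y_{n-1}' + 4 y_n' + y_{n+1}'\right) + O(h^5). \tag{2}$$

Abramowitz and Stegun (1972) also give the fifth order equations and formulas involving higher derivatives.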
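The predictor-corrector pair can be sketched as follows, using the standard Milne predictor $y_{n+1} = y_{n-3} + \tfrac{4}{3}h(2y_n' - y_{n-1}' + 2y_{n-2}')$ and corrector $y_{n+1} = y_{n-1} + \tfrac{1}{3}h(y_{n-1}' + 4y_n' + y_{n+1}')$. The method needs four starting values; generating them with classical Runge-Kutta steps is an assumption of this example, as are the function name `milne` and its signature.

```python
def milne(f, x0, y0, h, steps):
    """Integrate y' = f(x, y) with Milne's predictor-corrector (illustrative sketch)."""
    xs = [x0 + i * h for i in range(steps + 1)]
    ys = [y0]
    # Start-up: three classical Runge-Kutta steps (an assumed choice of starter)
    for i in range(min(3, steps)):
        x, y = xs[i], ys[i]
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        ys.append(y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6)
    fs = [f(x, y) for x, y in zip(xs, ys)]   # derivative values y'_0 .. y'_3
    for n in range(3, steps):
        # Predictor uses y_{n-3} and the three most recent derivatives
        pred = ys[n - 3] + 4 * h / 3 * (2 * fs[n] - fs[n - 1] + 2 * fs[n - 2])
        # Corrector is Simpson's rule over [x_{n-1}, x_{n+1}]
        corr = ys[n - 1] + h / 3 * (fs[n - 1] + 4 * fs[n] + f(xs[n + 1], pred))
        ys.append(corr)
        fs.append(f(xs[n + 1], corr))
    return xs, ys


# Example: y' = y with y(0) = 1, integrated to x = 1
xs, ys = milne(lambda x, y: y, 0.0, 1.0, 0.1, 10)
```

Only a single corrector pass is applied per step here; iterating the corrector is a common variation.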

### Isocline

The term isocline derives from the Greek words for "same slope." For a first-order ordinary differential equation $y' = f(t, y)$, a curve with equation $f(t, y) = C$ for some constant $C$ is known as an isocline. In other words, all the solutions of the ordinary differential equation intersecting that curve have the same slope $C$. Isoclines can be used as a graphical method of solving an ordinary differential equation. The term is also used to refer to points on maps of the world having identical magnetic inclinations.

### Galerkin method

A method of determining coefficients $\alpha_k$ in a power series solution

$$y(x) = y_0(x) + \sum_{k=1}^{n} \alpha_k y_k(x)$$

of the ordinary differential equation $\tilde L[y(x)] = 0$ so that $\tilde L[y(x)]$, the result of applying the ordinary differential operator to $y(x)$, is orthogonal to every $y_k(x)$ for $k = 1$, ..., $n$ (Itô 1980). Galerkin methods are equally ubiquitous in the solution of partial differential equations, and in fact form the basis for the finite element method.

### Collocation method

A method of determining coefficients $\alpha_k$ in an expansion

$$y(x) = y_0(x) + \sum_{k=1}^{q} \alpha_k y_k(x)$$

so as to nullify the values of an ordinary differential equation $L[y(x)] = 0$ at prescribed points.

### Adams' method

Adams' method is a numerical method for solving linear first-order ordinary differential equations of the form (1) Let (2) be the step interval, and consider the Maclaurin series of about , (3)(4) Here, the derivatives of are given by the backward differences (5)(6)(7) etc. Note that by (◇), is just the value of . For first-order interpolation, the method proceeds by iterating the expression (8) where . The method can then be extended to arbitrary order using the finite difference integration formula from Beyer (1987) (9) to obtain (10) Note that von Kármán and Biot (1940) confusingly use the symbol normally used for forward differences to denote backward differences.

### Chebyshev approximation formula

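Using a Chebyshev polynomial of the first kind $T(x)$, define

$$c_j \equiv \frac{2}{N} \sum_{k=1}^{N} f(x_k)\, T_j(x_k) \tag{1}$$
$$= \frac{2}{N} \sum_{k=1}^{N} f\!\left[\cos\!\left(\frac{\pi (k - \tfrac12)}{N}\right)\right] \cos\!\left(\frac{\pi j (k - \tfrac12)}{N}\right). \tag{2}$$

Then

$$f(x) \approx \sum_{k=0}^{N-1} c_k T_k(x) - \tfrac12 c_0. \tag{3}$$

It is exact for the $N$ zeros of $T_N(x)$. This type of approximation is important because, when truncated, the error is spread smoothly over $[-1, 1]$. The Chebyshev approximation formula is very close to the minimax polynomial.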

### Thiele's interpolation formula

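Let $\rho$ be a reciprocal difference. Then Thiele's interpolation formula is the continued fraction

$$f(x) = f(x_1) + \cfrac{x - x_1}{\rho(x_1, x_2) + \cfrac{x - x_2}{\rho_2(x_1, x_2, x_3) - f(x_1) + \cfrac{x - x_3}{\rho_3(x_1, x_2, x_3, x_4) - \rho(x_1, x_2) + \cdots}}}$$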

### Spline

A piecewise polynomial function that can have a locally very simple form, yet at the same time be globally flexible and smooth. Splines are very useful for modeling arbitrary functions, and are used extensively in computer graphics. Cubic splines are implemented in the Wolfram Language as BSplineCurve[pts, SplineDegree -> 3] (red), Bézier curves as BezierCurve[pts] (blue), and B-splines as BSplineCurve[pts].

### Lagrange interpolating polynomial

The Lagrange interpolating polynomial is the polynomial $P(x)$ of degree $\le (n - 1)$ that passes through the $n$ points $(x_1, y_1 = f(x_1))$, $(x_2, y_2 = f(x_2))$, ..., $(x_n, y_n = f(x_n))$, and is given by

$$P(x) = \sum_{j=1}^{n} P_j(x), \tag{1}$$

where

$$P_j(x) = y_j \prod_{\substack{k=1 \\ k \neq j}}^{n} \frac{x - x_k}{x_j - x_k}. \tag{2}$$

Written explicitly,

$$P(x) = \frac{(x - x_2)(x - x_3) \cdots (x - x_n)}{(x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_n)}\, y_1 + \cdots + \frac{(x - x_1)(x - x_2) \cdots (x - x_{n-1})}{(x_n - x_1)(x_n - x_2) \cdots (x_n - x_{n-1})}\, y_n. \tag{3}$$

The formula was first published by Waring (1779), rediscovered by Euler in 1783, and published by Lagrange in 1795 (Jeffreys and Jeffreys 1988). Lagrange interpolating polynomials are implemented in the Wolfram Language as InterpolatingPolynomial[data, var]. They are used, for example, in the construction of Newton-Cotes formulas. When constructing interpolating polynomials, there is a tradeoff between having a better fit and having a smooth well-behaved fitting function. The more data points that are used in the interpolation, the higher the degree of the resulting polynomial, and therefore the greater oscillation it will exhibit between the data points. Therefore, a high-degree interpolation may be a poor predictor of the function between points, although the accuracy..
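The product form $P(x) = \sum_j y_j \prod_{k \neq j} (x - x_k)/(x_j - x_k)$ translates almost directly into code. The sketch below is a direct, unoptimized evaluation; the function name `lagrange_interpolate` and the sample data are assumptions for illustration.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs[j], ys[j]) at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        term = yj
        for k, xk in enumerate(xs):
            if k != j:
                term *= (x - xk) / (xj - xk)   # factor of the j-th basis polynomial
        total += term
    return total


# Three samples of f(x) = x^2; the interpolant reproduces the parabola
value = lagrange_interpolate([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 1.5)
```

This costs $O(n^2)$ operations per evaluation point, which is acceptable for a handful of nodes.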

### Bicubic spline

A bicubic spline is a special case of bicubic interpolation which uses an interpolation function of the form (1)(2)(3)(4) where $c_{ij}$ are constants and $t$ and $u$ are parameters ranging from 0 to 1. For a bicubic spline, however, the partial derivatives at the grid points are determined globally by one-dimensional splines.

### Interpolation

The computation of points or values between ones that are known or tabulated using the surrounding points or values. In particular, given a univariate function $f = f(x)$, interpolation is the process of using known values $f(x_0), f(x_1), \ldots, f(x_n)$ to find values for $f(x)$ at points $x \neq x_i$, $i = 0, 1, \ldots, n$. In general, this technique involves the construction of a function called the interpolant which agrees with $f$ at the points $x = x_i$ and which is then used to compute the desired values. Unsurprisingly, one can talk about interpolation methods for multivariate functions as well, though these tend to be substantially more involved than their univariate counterparts.

### Interpolant

In univariate interpolation, an interpolant is a function which agrees with a particular function $f$ at a set of known points $x_0, x_1, \ldots, x_n$ and which is used to compute values for $f(x)$ at points $x \neq x_i$, $i = 0, 1, \ldots, n$. Modulo a change of notation, the above definition translates verbatim to multivariate interpolation models as well. Generally speaking, the properties required of the interpolant are the most fundamental distinctions between various interpolation models. For example, the main difference between the linear and spline interpolation models is that the interpolant of the former is required merely to be piecewise linear whereas spline interpolants are assumed to be piecewise polynomial and globally smooth.

### NURBS curve

A nonuniform rational B-spline curve defined by

$$C(t) = \frac{\sum_{i=0}^{n} N_{i,p}(t)\, w_i P_i}{\sum_{i=0}^{n} N_{i,p}(t)\, w_i},$$

where $p$ is the order, $N_{i,p}$ are the B-spline basis functions, $P_i$ are control points, and the weight $w_i$ of $P_i$ is the last ordinate of the homogeneous point $P_i^w$. These curves are closed under perspective transformations and can represent conic sections exactly.

### Internal knot

One of the "knots" $t_{p+1}$, ..., $t_{m-p-1}$ of a B-spline with control points $P_0$, ..., $P_n$ and knot vector $T = \{t_0, t_1, \ldots, t_m\}$, where $p = m - n - 1$.

### Newton's divided difference interpolation formula

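Let

$$\pi_n(x) \equiv \prod_{k=0}^{n} (x - x_k), \tag{1}$$

then

$$f(x) = f_0 + \sum_{k=1}^{n} \pi_{k-1}(x)\, [x_0, x_1, \ldots, x_k] + R_n, \tag{2}$$

where $[x_0, x_1, \ldots, x_k]$ is a divided difference, and the remainder is

$$R_n(x) = \pi_n(x)\, [x_0, \ldots, x_n, x] = \pi_n(x)\, \frac{f^{(n+1)}(\xi)}{(n+1)!} \tag{3}$$

for $x_0 < \xi < x_n$.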
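A short sketch of Newton's divided difference form follows: the coefficients $[x_0], [x_0, x_1], \ldots, [x_0, \ldots, x_n]$ are built in place and the polynomial is then evaluated by nested multiplication. The names `divided_differences` and `newton_eval` and the sample data are assumptions for this example.

```python
def divided_differences(xs, ys):
    """Return the coefficients [x0], [x0,x1], ..., [x0,...,xn] of the Newton form."""
    coef = list(ys)
    n = len(xs)
    for order in range(1, n):
        # Work from the end so lower-order differences are still available
        for i in range(n - 1, order - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - order])
    return coef


def newton_eval(xs, coef, x):
    """Evaluate the Newton-form polynomial at x by nested multiplication."""
    result = coef[-1]
    for k in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[k]) + coef[k]
    return result


# Samples of f(x) = 3x^2 - x + 1 at x = 0, 1, 2
nodes = [0.0, 1.0, 2.0]
coef = divided_differences(nodes, [1.0, 3.0, 11.0])
value = newton_eval(nodes, coef, 3.0)
```

Adding a new data point only appends one coefficient, which is the usual reason for preferring this form over the Lagrange form when nodes arrive incrementally.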

### Hermite's interpolating polynomial

Let be an th degree polynomial with zeros at , ..., . Then the fundamental Hermite interpolating polynomials of the first and second kinds are defined by(1)and(2)for , 2, ..., where the fundamental polynomials of Lagrange interpolation are defined by(3)They are denoted and , respectively, by Szegö (1975, p. 330).These polynomials have the properties(4)(5)(6)(7)for , 2, ..., . Now let , ..., and , ..., be values. Then the expansion(8)gives the unique Hermite interpolating fundamental polynomial for which(9)(10)If , these are called Hermite's interpolating polynomials.The fundamental polynomials satisfy(11)and(12)Also, if is an arbitrary distribution on the interval , then(13)(14)(15)(16)(17)(18)where are Christoffel numbers.

### Gauss's interpolation formula

where is a trigonometric polynomial of degree such that for , ..., , and

### Muller's method

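Generalizes the secant method of root finding by using quadratic 3-point interpolation

$$q \equiv \frac{x_n - x_{n-1}}{x_{n-1} - x_{n-2}}. \tag{1}$$

Then define

$$A \equiv q P(x_n) - q(1 + q) P(x_{n-1}) + q^2 P(x_{n-2}) \tag{2}$$
$$B \equiv (2q + 1) P(x_n) - (1 + q)^2 P(x_{n-1}) + q^2 P(x_{n-2}) \tag{3}$$
$$C \equiv (1 + q) P(x_n), \tag{4}$$

and the next iteration is

$$x_{n+1} = x_n - (x_n - x_{n-1}) \frac{2C}{\max\left(B \pm \sqrt{B^2 - 4AC}\right)}. \tag{5}$$

This method can also be used to find complex zeros of analytic functions.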

### Cubic spline

A cubic spline is a spline constructed of piecewise third-order polynomials which pass through a set of control points. The second derivative of each polynomial is commonly set to zero at the endpoints, since this provides a boundary condition that completes the system of equations. This produces a so-called "natural" cubic spline and leads to a simple tridiagonal system which can be solved easily to give the coefficients of the polynomials. However, this choice is not the only one possible, and other boundary conditions can be used instead. Cubic splines are implemented in the Wolfram Language as BSplineCurve[pts, SplineDegree -> 3]. Consider a 1-dimensional spline for a set of $n + 1$ points $(y_0, y_1, \ldots, y_n)$. Following Bartels et al. (1998, pp. 10-13), let the $i$th piece of the spline be represented by

$$Y_i(t) = a_i + b_i t + c_i t^2 + d_i t^3, \tag{1}$$

where $t$ is a parameter $t \in [0, 1]$ and $i = 0$, ..., $n - 1$. Then

$$Y_i(0) = y_i = a_i \tag{2}$$
$$Y_i(1) = y_{i+1} = a_i + b_i + c_i + d_i. \tag{3}$$

Taking the derivative of $Y_i(t)$ in each interval then gives

$$Y_i'(0) = D_i = b_i \tag{4}$$
$$Y_i'(1) = D_{i+1} = b_i + 2c_i + 3d_i. \tag{5}$$

Solving (2)-(5) for $a_i$, $b_i$, $c_i$, and $d_i$ then gives

$$a_i = y_i \tag{6}$$
$$b_i = D_i \tag{7}$$
$$c_i = 3(y_{i+1} - y_i) - 2D_i - D_{i+1} \tag{8}$$
$$d_i = 2(y_i - y_{i+1}) + D_i + D_{i+1}. \tag{9}$$

Now..

### Aitken interpolation

An algorithm similar to Neville's algorithm for constructing the Lagrange interpolating polynomial. Let be the unique polynomial of th polynomial order coinciding with at , ..., . Then (1)(2)(3)(4)

### Weierstrass approximation theorem

If $f$ is a continuous real-valued function on $[a, b]$ and if any $\epsilon > 0$ is given, then there exists a polynomial $P$ on $[a, b]$ such that

$$|f(x) - P(x)| < \epsilon$$

for all $x \in [a, b]$. In words, any continuous function on a closed and bounded interval can be uniformly approximated on that interval by polynomials to any degree of accuracy.

### Runge's theorem

Let $K \subseteq \mathbb{C}$ be compact, let $f$ be analytic on a neighborhood of $K$, and let $P \subseteq \mathbb{C}^* \setminus K$ contain at least one point from each connected component of $\mathbb{C}^* \setminus K$. Then for any $\epsilon > 0$, there is a rational function $r(z)$ with poles in $P$ such that

$$\max_{z \in K} |f(z) - r(z)| < \epsilon$$

(Krantz 1999, p. 143). A polynomial version can be obtained by taking $P = \{\infty\}$. Let $f(x)$ be an analytic function which is regular in the interior of a Jordan curve $C$ and continuous in the closed domain bounded by $C$. Then $f(x)$ can be approximated with arbitrary accuracy by polynomials (Szegö 1975, p. 5; Krantz 1999, p. 144).

### Jackson's theorem

Jackson's theorem is a statement about the error of the best uniform approximation to a real function on by real polynomials of degree at most . Let be of bounded variation in and let and denote the least upper bound of and the total variation of in , respectively. Given the function(1)then the coefficients(2)of its Fourier-Legendre series, where is a Legendre polynomial, satisfy the inequalities(3)Moreover, the Fourier-Legendre series of converges uniformly and absolutely to in .Bernstein (1913) strengthened Jackson's theorem to(4)A specific application of Jackson's theorem shows that if(5)then(6)

### Frobenius triangle identities

Let be a Padé approximant. Then(1)(2)(3)(4)where(5)and is the C-determinant.

### Thin plate spline

The thin plate spline is the two-dimensional analog of the cubic spline in one dimension. It is the fundamental solution to the biharmonic equation, and has the form

$$U(r) = r^2 \ln r.$$

Given a set of data points, a weighted combination of thin plate splines centered about each data point gives the interpolation function that passes through the points exactly while minimizing the so-called "bending energy." Bending energy is defined here as the integral over $\mathbb{R}^2$ of the squares of the second derivatives,

$$I[f(x, y)] = \iint_{\mathbb{R}^2} \left(f_{xx}^2 + 2 f_{xy}^2 + f_{yy}^2\right) dx\, dy.$$

Regularization may be used to relax the requirement that the interpolant pass through the data points exactly. The name "thin plate spline" refers to a physical analogy involving the bending of a thin sheet of metal. In the physical setting, the deflection is in the $z$ direction, orthogonal to the plane. In order to apply this idea to the problem of coordinate transformation, one interprets the lifting of the plate as a displacement of the $x$ or $y$ coordinates..

### Cardinal function

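Let $f$ be a function and let $h > 0$, and define the cardinal series of $f$ with respect to the interval $h$ as the formal series

$$\sum_{k=-\infty}^{\infty} f(kh)\, \operatorname{sinc}\!\left(\frac{x - kh}{h}\right),$$

where $\operatorname{sinc}(x)$ is the sinc function. If this series converges, it is known as the cardinal function (or Whittaker cardinal function) of $f$, denoted $C(f, h)$ (McNamee et al. 1971).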

### NURBS surface

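A nonuniform rational B-spline surface of degree $(p, q)$ is defined by

$$S(u, v) = \frac{\sum_{i=0}^{m} \sum_{j=0}^{n} N_{i,p}(u)\, N_{j,q}(v)\, w_{i,j} P_{i,j}}{\sum_{i=0}^{m} \sum_{j=0}^{n} N_{i,p}(u)\, N_{j,q}(v)\, w_{i,j}},$$

where $N_{i,p}$ and $N_{j,q}$ are the B-spline basis functions, $P_{i,j}$ are control points, and the weight $w_{i,j}$ of $P_{i,j}$ is the last ordinate of the homogeneous point $P_{i,j}^w$. NURBS surfaces are implemented in the Wolfram Language as BSplineSurface[array].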

### Moving average

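Given a sequence $\{a_i\}_{i=1}^{N}$, an $n$-moving average is a new sequence $\{s_i\}_{i=1}^{N-n+1}$ defined from the $a_i$ by taking the arithmetic mean of subsequences of $n$ terms,

$$s_i = \frac{1}{n} \sum_{j=i}^{i+n-1} a_j. \tag{1}$$

So the sequences $S_n$ giving $n$-moving averages are

$$S_2 = \tfrac12 (a_1 + a_2,\ a_2 + a_3,\ \ldots,\ a_{N-1} + a_N) \tag{2}$$
$$S_3 = \tfrac13 (a_1 + a_2 + a_3,\ a_2 + a_3 + a_4,\ \ldots,\ a_{N-2} + a_{N-1} + a_N) \tag{3}$$

and so on. The plot above shows the 2- (red), 4- (yellow), 6- (green), and 8- (blue) moving averages for a set of 100 data points. Moving averages are implemented in the Wolfram Language as MovingAverage[data, n].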
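The definition $s_i = \frac{1}{n}\sum_{j=i}^{i+n-1} a_j$ can be sketched with a running window sum, so each output costs constant work. The function name `moving_average` mirrors the Wolfram Language command only loosely and is an assumption of this example.

```python
def moving_average(data, n):
    """Return the n-moving average of data as a list of length len(data) - n + 1."""
    if n < 1 or n > len(data):
        raise ValueError("window length must satisfy 1 <= n <= len(data)")
    window_sum = sum(data[:n])
    averages = [window_sum / n]
    for i in range(n, len(data)):
        # Slide the window: add the entering term, drop the leaving one
        window_sum += data[i] - data[i - n]
        averages.append(window_sum / n)
    return averages


smoothed = moving_average([1, 2, 3, 4, 5], 2)
```

For long floating-point inputs the running sum can accumulate rounding error; recomputing each window sum directly is a simple alternative when that matters.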

### Gregory's formula

Gregory's formula is a formula that allows a definite integral of a function to be expressed by its sum and differences, or its sum by its integral and difference (Jordan 1965, p. 284). It is given by an equation discovered by Gregory in 1670 and reported to be the earliest formula in numerical integration (Jordan 1965, Roman 1984).

### Halley's method

A root-finding algorithm also known as the tangent hyperbolas method or Halley's rational formula. As in Halley's irrational formula, take the second-order Taylor series(1)A root of satisfies , so(2)Now write(3)giving(4)Using the result from Newton's method,(5)gives(6)so the iteration function is(7)This satisfies where is a root, so it is third order for simple zeros. Curiously, the third derivative(8)is the Schwarzian derivative. Halley's method may also be derived by applying Newton's method to . It may also be derived by using an osculating curve of the form(9)Taking derivatives,(10)(11)(12)which has solutions(13)(14)(15)so at a root, and(16)which is Halley's method.
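A minimal sketch of the resulting iteration, $x_{n+1} = x_n - \dfrac{2 f(x_n) f'(x_n)}{2 [f'(x_n)]^2 - f(x_n) f''(x_n)}$, is given below. The function name `halley`, the stopping rule on the step size, and the requirement that the caller supply both derivatives are assumptions made for illustration.

```python
def halley(f, df, d2f, x0, tol=1e-12, max_iter=50):
    """Find a root of f by Halley's method, given f' (df) and f'' (d2f)."""
    x = x0
    for _ in range(max_iter):
        fx, dfx, d2fx = f(x), df(x), d2f(x)
        denom = 2 * dfx * dfx - fx * d2fx
        if denom == 0:
            raise ZeroDivisionError("Halley denominator vanished")
        step = 2 * fx * dfx / denom
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Halley iteration did not converge")


# Example: square root of 2 as the positive root of x^2 - 2
root = halley(lambda x: x * x - 2, lambda x: 2 * x, lambda x: 2.0, 1.0)
```

Setting the second derivative to zero in the denominator reduces the update to the Newton step, which is a convenient sanity check.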

### Method of false position

An algorithm for finding roots which retains that prior estimate for which the function value has opposite sign from the function value at the current best estimate of the root. In this way, the method of false position keeps the root bracketed (Press et al. 1992). Using the two-point form of the line

$$y - y_1 = \frac{f(x_{n-1}) - f(x_1)}{x_{n-1} - x_1} (x_n - x_1)$$

with $y = 0$, using $y_1 = f(x_1)$, and solving for $x_n$ therefore gives the iteration

$$x_n = x_1 - \frac{x_{n-1} - x_1}{f(x_{n-1}) - f(x_1)}\, f(x_1).$$
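The bracketing behaviour can be sketched as follows: each step places the new estimate where the chord through the two bracket endpoints crosses zero, then keeps whichever endpoint preserves the sign change. The function name `false_position`, the residual-based stopping test, and the iteration cap are assumptions of this example.

```python
def false_position(f, a, b, tol=1e-10, max_iter=200):
    """Find a root of f in [a, b] by the method of false position (regula falsi)."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c = a
    for _ in range(max_iter):
        # Zero crossing of the chord through (a, fa) and (b, fb)
        c = b - fb * (b - a) / (fb - fa)
        fc = f(c)
        if abs(fc) < tol:
            return c
        if fa * fc < 0:
            b, fb = c, fc      # root lies in [a, c]
        else:
            a, fa = c, fc      # root lies in [c, b]
    return c


root = false_position(lambda x: x**3 - x - 2, 1.0, 2.0)
```

On convex or concave functions one endpoint tends to stay fixed, so convergence is typically linear rather than superlinear.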

### Halley's irrational formula

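A root-finding algorithm which makes use of a third-order Taylor series

$$f(x) = f(x_n) + f'(x_n)(x - x_n) + \tfrac12 f''(x_n)(x - x_n)^2 + \ldots \tag{1}$$

A root of $f(x)$ satisfies $f(x) = 0$, so

$$0 \approx f(x_n) + f'(x_n)(x_{n+1} - x_n) + \tfrac12 f''(x_n)(x_{n+1} - x_n)^2. \tag{2}$$

Using the quadratic equation then gives

$$x_{n+1} = x_n + \frac{-f'(x_n) \pm \sqrt{[f'(x_n)]^2 - 2 f(x_n) f''(x_n)}}{f''(x_n)}. \tag{3}$$

Picking the plus sign gives the iteration function

$$C_f(x) = x - \frac{1 - \sqrt{1 - \dfrac{2 f(x) f''(x)}{[f'(x)]^2}}}{\dfrac{f''(x)}{f'(x)}}. \tag{4}$$

This equation can be used as a starting point for deriving Halley's method. If the alternate form of the quadratic equation is used instead in solving (2), the iteration function becomes instead

$$C_f(x) = x - \frac{2 f(x)}{f'(x) \pm \sqrt{[f'(x)]^2 - 2 f(x) f''(x)}}. \tag{5}$$

This form can also be derived by setting $n = 2$ in Laguerre's method. Numerically, the sign in the denominator is chosen to maximize its absolute value. Note that in the above equation, if $f''(x) = 0$, then Newton's method is recovered. This form of Halley's irrational formula has cubic convergence, and is usually found to be substantially more stable than Newton's method. However, it does run into difficulty when both $f(x)$ and $f'(x)$ or $f'(x)$ and $f''(x)$ are simultaneously near zero...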

### Sturm function

Given a function , write and define the Sturm functions by(1)where is a polynomial quotient. Then construct the following chain of Sturm functions,(2)(3)(4)(5)(6)known as a Sturm chain. The chain is terminated when a constant is obtained.Sturm functions provide a convenient way for finding the number of real roots of an algebraic equation with real coefficients over a given interval. Specifically, the difference in the number of sign changes between the Sturm functions evaluated at two points and gives the number of real roots in the interval . This powerful result is known as the Sturm theorem. However, when the method is applied numerically, care must be taken when computing the polynomial quotients to avoid spurious results due to roundoff error.As a specific application of Sturm functions toward finding polynomial roots, consider the function , plotted above, which has roots , , , and 1.38879 (three of which are real). The derivative..

### Maehly's procedure

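A method for finding roots which defines

$$P_j(x) = \frac{P(x)}{(x - x_1) \cdots (x - x_j)}, \tag{1}$$

so the derivative is

$$P_j'(x) = \frac{P'(x)}{(x - x_1) \cdots (x - x_j)} - \frac{P(x)}{(x - x_1) \cdots (x - x_j)} \sum_{i=1}^{j} (x - x_i)^{-1}. \tag{2}$$

One step of Newton's method can then be written as

$$x_{k+1} = x_k - \frac{P(x_k)}{P'(x_k) - P(x_k) \sum_{i=1}^{j} (x_k - x_i)^{-1}}. \tag{3}$$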

### Graeffe's method

A root-finding method which was among the most popular methods for finding roots of univariate polynomials in the 19th and 20th centuries. It was invented independently by Graeffe, Dandelin, and Lobachevsky (Householder 1959, Malajovich and Zubelli 2001). Graeffe's method has a number of drawbacks, among which are that its usual formulation leads to exponents exceeding the maximum allowed by floating-point arithmetic and also that it can map well-conditioned polynomials into ill-conditioned ones. However, these limitations are avoided in an efficient implementation by Malajovich and Zubelli (2001).The method proceeds by multiplying a polynomial by and noting that(1)(2)so the result is(3)repeat times, then write this in the form(4)where . Since the coefficients are given by Vieta's formulas(5)(6)(7)and since the squaring procedure has separated the roots, the first term is larger than rest. Therefore,(8)(9)(10)giving(11)(12)(13)Solving..

### Schur transform

For(1)polynomial of degree , the Schur transform is defined by the -degree polynomial(2)(3)where is the reciprocal polynomial.

### Lambert's method

A root-finding algorithm also called Bailey's method and Hutton's method. For a function of the form , Lambert's method gives an iteration functionso

### Crout's method

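A root-finding algorithm used in LU decomposition. It solves the $N^2$ equations

$$\begin{aligned} i < j: &\quad l_{i1} u_{1j} + l_{i2} u_{2j} + \cdots + l_{ii} u_{ij} = a_{ij} \\ i = j: &\quad l_{i1} u_{1j} + l_{i2} u_{2j} + \cdots + l_{ii} u_{jj} = a_{ij} \\ i > j: &\quad l_{i1} u_{1j} + l_{i2} u_{2j} + \cdots + l_{ij} u_{jj} = a_{ij} \end{aligned}$$

for the $N^2 + N$ unknowns $l_{ij}$ and $u_{ij}$.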

### Laguerre's method

A root-finding algorithm which converges to a complex root from any starting position. To motivate the formula, consider an $n$th order polynomial and its derivatives, (1)(2)(3)(4) Now consider the logarithm and logarithmic derivatives of (5)(6)(7)(8)(9)(10) Now make "a rather drastic set of assumptions" that the root being sought is a distance from the current best guess, so (11) while all other roots are at the same distance , so (12) for , 3, ..., (Acton 1990; Press et al. 1992, p. 365). This allows and to be expressed in terms of and as (13)(14) Solving these simultaneously for gives (15) where the sign is taken to give the largest magnitude for the denominator. To apply the method, calculate for a trial value , then use as the next trial value, and iterate until becomes sufficiently small. For example, for the polynomial with starting point , the algorithm converges to the real root very quickly as (, , ). Setting $n = 2$ gives Halley's..

### Schröder's method

Two families of equations used to find roots of nonlinear functions of a single variable. The "B" family is more robust and can be used in the neighborhood of degenerate multiple roots while still providing a guaranteed convergence rate. Almost all other root-finding methods can be considered as special cases of Schröder's method. Householder humorously claimed that papers on root-finding could be evaluated quickly by looking for a citation of Schröder's paper; if the reference were missing, the paper probably consisted of a rediscovery of a result due to Schröder (Stewart 1993). One version of the "A" method is obtained by applying Newton's method to $f / f'$,

$$x_{n+1} = x_n - \frac{f(x_n)\, f'(x_n)}{[f'(x_n)]^2 - f(x_n)\, f''(x_n)}$$

(Scavo and Thoo 1995).

### Brent's method

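Brent's method is a root-finding algorithm which combines root bracketing, bisection, and inverse quadratic interpolation. It is sometimes known as the van Wijngaarden-Dekker-Brent method. Brent's method is implemented in the Wolfram Language as the undocumented option Method -> Brent in FindRoot[eqn, {x, x0, x1}]. Brent's method uses a Lagrange interpolating polynomial of degree 2. Brent (1973) claims that this method will always converge as long as the values of the function are computable within a given region containing a root. Given three points $x_1$, $x_2$, and $x_3$, Brent's method fits $x$ as a quadratic function of $y$, then uses the interpolation formula

$$x = \frac{[y - f(x_1)][y - f(x_2)]\, x_3}{[f(x_3) - f(x_1)][f(x_3) - f(x_2)]} + \frac{[y - f(x_2)][y - f(x_3)]\, x_1}{[f(x_1) - f(x_2)][f(x_1) - f(x_3)]} + \frac{[y - f(x_3)][y - f(x_1)]\, x_2}{[f(x_2) - f(x_3)][f(x_2) - f(x_1)]}. \tag{1}$$

Subsequent root estimates are obtained by setting $y = 0$, giving

$$x = x_2 + \frac{P}{Q}, \tag{2}$$

where

$$P = S\,[T (R - T)(x_3 - x_2) - (1 - R)(x_2 - x_1)] \tag{3}$$
$$Q = (T - 1)(R - 1)(S - 1) \tag{4}$$

with

$$R \equiv \frac{f(x_2)}{f(x_3)} \tag{5}$$
$$S \equiv \frac{f(x_2)}{f(x_1)} \tag{6}$$
$$T \equiv \frac{f(x_1)}{f(x_3)} \tag{7}$$

(Press et al. 1992).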

### Isograph

The substitution of $r e^{i\theta}$ for $z$ in a polynomial $p(z)$. $p(z)$ is then plotted as a function of $\theta$ for a given $r$ in the complex plane. By varying $r$ so that the curve passes through the origin, it is possible to determine a value for one root of the polynomial.

### Bisection

Bisection is the division of a given curve, figure, or interval into two equal parts (halves). A simple bisection procedure for iteratively converging on a solution which is known to lie inside some interval $[a, b]$ proceeds by evaluating the function in question at the midpoint of the original interval $x = (a + b)/2$ and testing to see in which of the subintervals $[a, (a + b)/2]$ or $[(a + b)/2, b]$ the solution lies. The procedure is then repeated with the new interval as often as needed to locate the solution to the desired accuracy. Let $a_n$ and $b_n$ be the endpoints at the $n$th iteration (with $a_1 = a$ and $b_1 = b$) and let $r_n$ be the $n$th approximate solution. Then the number of iterations required to obtain an error smaller than $\epsilon$ is found by noting that

$$b_n - a_n = \frac{b - a}{2^{n-1}} \tag{1}$$

and that $r_n$ is defined by

$$r_n = \tfrac12 (a_n + b_n). \tag{2}$$

In order for the error to be smaller than $\epsilon$,

$$|r_n - r| \le \tfrac12 (b_n - a_n) = 2^{-n} (b - a) < \epsilon. \tag{3}$$

Taking the natural logarithm of both sides then gives

$$-n \ln 2 < \ln \epsilon - \ln(b - a), \tag{4}$$

so

$$n > \frac{\ln(b - a) - \ln \epsilon}{\ln 2}. \tag{5}$$

..
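The iteration count bound $n > [\ln(b - a) - \ln \epsilon] / \ln 2$ means the number of halvings can be fixed in advance. The sketch below does exactly that; the function name `bisect`, its default tolerance, and the early exits on an exact zero are assumptions of this example.

```python
import math


def bisect(f, a, b, eps=1e-10):
    """Locate a root of f in [a, b] to within eps by repeated interval halving."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    # Number of halvings from n > [ln(b - a) - ln(eps)] / ln 2
    n = max(1, math.ceil((math.log(b - a) - math.log(eps)) / math.log(2)))
    for _ in range(n):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fm == 0:
            return mid
        if fa * fm < 0:
            b = mid              # sign change is in the left half
        else:
            a, fa = mid, fm      # sign change is in the right half
    return 0.5 * (a + b)


# Example: the root of cos(x) in [1, 2] is pi / 2
root = bisect(math.cos, 1.0, 2.0)
```

With the defaults this performs 34 halvings for an interval of unit length, independent of the function being solved.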

### Bairstow's method

A procedure for finding the quadratic factors for the complex conjugate roots of a polynomial with real coefficients. (1) Now write the original polynomial as (2)(3)(4)(5)(6)(7)(8) Now use the two-dimensional Newton's method to find the simultaneous solutions.

### Horner's method

A method for finding roots of a polynomial equation . Now find an equation whose roots are the roots of this equation diminished by , so(1)The expressions for , , ... are then found as in the following example, where(2)Write the coefficients , , ..., in a horizontal row, and let a new letter shown as a denominator stand for the sum immediately above it so, in the following example, . The result is the following table.Solving for the quantities , , , , and gives(3)(4)(5)(6)(7)so the equation whose roots are the roots of , each diminished by , is(8)(Whittaker and Robinson 1967).To apply the procedure, first determine the integer part of the root through whatever means are needed, then reduce the equation by this amount. This gives the second digit, by which the equation is once again reduced (after suitable multiplication by 10) to find the third digit, and so on.To see the method applied, consider the problem of finding the smallest positive root of(9)This..

### Point estimation theory

A theory of constructing initial conditions that provides safe convergence of a numerical root-finding algorithm for an equation $f(z) = 0$. Point estimation theory treats convergence conditions and the domain of convergence using only information about $f$ at the initial point $z_0$ (Petković et al. 1997, p. 1). An initial point that provides safe convergence of Newton's method is called an approximate zero. Point estimation theory should not be confused with point estimators of probability theory.

### Wynn's epsilon method

Wynn's $\epsilon$-method is a method for numerical evaluation of sums and products that samples a number of additional terms in the series and then tries to extrapolate them by fitting them to a polynomial multiplied by a decaying exponential. In particular, the method provides an efficient algorithm for implementing transformations of the form

$$T(S_n) = \frac{S_{n+1} S_{n-1} - S_n^2}{S_{n+1} - 2 S_n + S_{n-1}}, \tag{1}$$

where

$$S_n = \sum_{k=0}^{n} a_k \tag{2}$$

is the $n$th partial sum of a sequence $\{a_k\}$, which are useful for yielding series convergence improvement (Hamming 1986, p. 205). In particular, letting $\epsilon_{-1}(S_n) = 0$, $\epsilon_0(S_n) = S_n$, and

$$\epsilon_r(S_n) = \epsilon_{r-2}(S_{n+1}) + \frac{1}{\epsilon_{r-1}(S_{n+1}) - \epsilon_{r-1}(S_n)} \tag{3}$$

for $r = 1$, 2, ... (correcting the typo of Hamming 1986, p. 206). The values of $\epsilon_{2r}(S_n)$ are then equivalent to the results of applying $r$ transformations to the sequence $\{S_n\}$ (Hamming 1986, p. 206). Wynn's epsilon method can be applied to the terms of a series using the Wolfram Language command SequenceLimit[l]. Wynn's method may also be invoked in numerical summation and multiplication using Method -> Fit in the Wolfram Language's NSum and NProduct..
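A small sketch of the epsilon table is given below, assuming the usual recursion $\epsilon_{-1}(S_n) = 0$, $\epsilon_0(S_n) = S_n$, $\epsilon_r(S_n) = \epsilon_{r-2}(S_{n+1}) + 1/[\epsilon_{r-1}(S_{n+1}) - \epsilon_{r-1}(S_n)]$, with only the even columns treated as estimates of the limit. The function name `wynn_epsilon` and the choice to return the last even-column entry are assumptions of this example.

```python
def wynn_epsilon(partial_sums):
    """Extrapolate the limit of a sequence of partial sums with Wynn's epsilon table."""
    prev = [0.0] * (len(partial_sums) + 1)   # column eps_{-1}: all zeros
    curr = list(partial_sums)                # column eps_0: the partial sums
    best = curr[-1]
    r = 0
    while len(curr) > 1:
        nxt = []
        for n in range(len(curr) - 1):
            diff = curr[n + 1] - curr[n]
            if diff == 0:
                return best                  # table has converged exactly; stop
            nxt.append(prev[n + 1] + 1.0 / diff)
        prev, curr = curr, nxt
        r += 1
        if r % 2 == 0:
            best = curr[-1]                  # even columns approximate the limit
    return best


# Partial sums of 1 - 1/2 + 1/3 - ... , which converges slowly to ln 2
sums, s = [], 0.0
for k in range(11):
    s += (-1) ** k / (k + 1)
    sums.append(s)
accelerated = wynn_epsilon(sums)
```

The odd columns hold intermediate reciprocals that can be very large, so they are never returned as estimates.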

### Convergence improvement

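The improvement of the convergence properties of a series, also called convergence acceleration or accelerated convergence, such that a series reaches its limit to within some accuracy with fewer terms than required before. Convergence improvement can be effected by forming a linear combination with a series whose sum is known. Useful sums include

$$\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1 \tag{1}$$
$$\sum_{n=1}^{\infty} \frac{1}{n(n+1)(n+2)} = \frac14 \tag{2}$$
$$\sum_{n=1}^{\infty} \frac{1}{n(n+1)(n+2)(n+3)} = \frac{1}{18} \tag{3}$$
$$\sum_{n=1}^{\infty} \frac{1}{n(n+1) \cdots (n+p)} = \frac{1}{p \cdot p!}. \tag{4}$$

Kummer's transformation takes a convergent series

$$s = \sum_{k=0}^{\infty} a_k \tag{5}$$

and another convergent series

$$c = \sum_{k=0}^{\infty} c_k \tag{6}$$

with known $c$ such that

$$\lim_{k \to \infty} \frac{a_k}{c_k} = \lambda \neq 0. \tag{7}$$

Then a series with more rapid convergence to the same value is given by

$$s = \lambda c + \sum_{k=0}^{\infty} \left(1 - \lambda \frac{c_k}{a_k}\right) a_k \tag{8}$$

(Abramowitz and Stegun 1972). The Euler transform takes a convergent alternating series

$$\sum_{k=0}^{\infty} (-1)^k a_k \tag{9}$$

into a series with more rapid convergence to the same value,

$$s = \sum_{k=0}^{\infty} \frac{(-1)^k \Delta^k a_0}{2^{k+1}}, \tag{10}$$

where

$$\Delta^k a_0 \equiv \sum_{m=0}^{k} (-1)^m \binom{k}{m} a_{k-m} \tag{11}$$

(Abramowitz and Stegun 1972; Beeler et al. 1972). A general technique that can be used to accelerate the convergence of series is to expand them in a Taylor series about infinity and interchange the order of summation. In cases where a symbolic..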

### Clenshaw recurrence formula

The downward Clenshaw recurrence formula evaluates a sum of products of indexed coefficients by functions which obey a recurrence relation. If(1)and(2)where the s are known, then define(3)(4)for and solve backwards to obtain and .(5)(6)(7)(8)(9)(10)The upward Clenshaw recurrence formula is(11)(12)for .(13)
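As a concrete instance of the downward recurrence, the sketch below evaluates a Chebyshev sum $\sum_k c_k T_k(x)$, using the three-term relation $T_{k+1}(x) = 2x\,T_k(x) - T_{k-1}(x)$. The choice of Chebyshev polynomials and the name `clenshaw_chebyshev` are assumptions made for illustration; the entry itself covers any family obeying a recurrence relation.

```python
import math


def clenshaw_chebyshev(coeffs, x):
    """Evaluate sum(coeffs[k] * T_k(x)) without forming any T_k explicitly."""
    y_next, y_next2 = 0.0, 0.0            # y_{k+1} and y_{k+2}, both zero initially
    for c in reversed(coeffs[1:]):        # run k = N, N-1, ..., 1
        y_next, y_next2 = 2.0 * x * y_next - y_next2 + c, y_next
    return coeffs[0] + x * y_next - y_next2


def direct_chebyshev(coeffs, x):
    """Reference evaluation via T_k(x) = cos(k * acos(x)), valid for |x| <= 1."""
    theta = math.acos(x)
    return sum(c * math.cos(k * theta) for k, c in enumerate(coeffs))


print(clenshaw_chebyshev([1.0, 2.0, 3.0], 0.5))
print(direct_chebyshev([1.0, 2.0, 3.0], 0.5))
```

The two evaluations should agree to rounding error; the recurrence needs only two running values regardless of the number of coefficients.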

### Woolhouse's formulas

Let the values of a function be tabulated at points equally spaced by , so , , ..., . Then Woolhouse's formulas approximating the integral of are given by the Newton-Cotes-like formulas(1)(2)

### Numerical integration

Numerical integration is the approximate computation of an integral using numerical techniques. The numerical computation of an integral is sometimes called quadrature. Ueberhuber (1997, p. 71) uses the word "quadrature" to mean numerical computation of a univariate integral, and "cubature" to mean numerical computation of a multiple integral. A wide range of methods is available for numerical integration. A good source for such techniques is Press et al. (1992). Numerical integration is implemented in the Wolfram Language as NIntegrate[f, x, xmin, xmax]. The most straightforward numerical integration technique uses the Newton-Cotes formulas (also called quadrature formulas), which approximate a function tabulated at a sequence of regularly spaced intervals by various degree polynomials. If the endpoints are tabulated, then the 2- and 3-point formulas are called the trapezoidal rule and..

### Weddle's rule

Let the values of a function be tabulated at points equally spaced by , so , , .... Then Weddle's rule approximating the integral of is given by the Newton-Cotes-like formula

### Filon's integration formula

A formula for numerical integration,(1)where(2)(3)(4)(5)(6)(7)and the remainder term is(8)

### Monte Carlo integration

In order to integrate a function over a complicated domain $D$, Monte Carlo integration picks random points over some simple domain $D'$ which is a superset of $D$, checks whether each point is within $D$, and estimates the area of $D$ (volume, $n$-dimensional content, etc.) as the area of $D'$ multiplied by the fraction of points falling within $D$. Monte Carlo integration is implemented in the Wolfram Language as NIntegrate[f, ..., Method -> MonteCarlo].

Picking $N$ randomly distributed points $x_1$, $x_2$, ..., $x_N$ in a multidimensional volume $V$ to determine the integral of a function $f$ in this volume gives a result

$$\int f\,dV \approx V\langle f\rangle \pm V\sqrt{\frac{\langle f^2\rangle - \langle f\rangle^2}{N}}, \tag{1}$$

where

$$\langle f\rangle = \frac{1}{N}\sum_{i=1}^{N} f(x_i) \tag{2}$$

$$\langle f^2\rangle = \frac{1}{N}\sum_{i=1}^{N} f^2(x_i) \tag{3}$$

(Press et al. 1992, p. 295).
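The hit-or-miss procedure and the one-standard-deviation error estimate can be sketched as follows. This is a minimal example, not the Wolfram Language implementation: the function `monte_carlo_integrate`, its parameters, and the unit-disk test domain are all assumptions chosen for illustration.

```python
import math
import random


def monte_carlo_integrate(f, inside, bounds, n_samples, seed=0):
    """Integrate f over the region where inside(point) is true.

    bounds lists (low, high) per axis of the simple enclosing box.
    Returns (estimate, one-standard-deviation error estimate).
    """
    rng = random.Random(seed)
    volume = 1.0
    for lo, hi in bounds:
        volume *= hi - lo
    total = total_sq = 0.0
    for _ in range(n_samples):
        point = [rng.uniform(lo, hi) for lo, hi in bounds]
        value = f(point) if inside(point) else 0.0   # points outside contribute zero
        total += value
        total_sq += value * value
    mean = total / n_samples
    mean_sq = total_sq / n_samples
    error = volume * math.sqrt(max(mean_sq - mean * mean, 0.0) / n_samples)
    return volume * mean, error


# Area of the unit disk, sampled from the enclosing square [-1, 1] x [-1, 1].
area, err = monte_carlo_integrate(
    lambda p: 1.0,
    lambda p: p[0] ** 2 + p[1] ** 2 <= 1.0,
    [(-1.0, 1.0), (-1.0, 1.0)],
    200_000,
)
print(area, err)
```

The error shrinks only as $1/\sqrt{N}$, so each additional decimal digit costs roughly a hundred times more samples.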

### Durand's rule

Let the values of a function be tabulated at points equally spaced by , so , , ..., . Then Durand's rule approximating the integral of is given by the Newton-Cotes-like formula

### Trapezoidal rule

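The 2-point Newton-Cotes formula

$$\int_{x_1}^{x_2} f(x)\,dx = \tfrac{1}{2}h(f_1 + f_2) - \tfrac{1}{12}h^3 f''(\xi),$$

where $f_i = f(x_i)$, $h$ is the separation between the points, and $\xi$ is a point satisfying $x_1 \le \xi \le x_2$. Picking $\xi$ to maximize $f''(\xi)$ gives an upper bound for the error in the trapezoidal approximation to the integral.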
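Applying the 2-point formula on each of $n$ equal subintervals gives the composite rule sketched below. The function name `trapezoid` and the test integrand are illustrative assumptions.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))      # endpoints carry weight 1/2
    for i in range(1, n):
        total += f(a + i * h)        # interior points carry weight 1
    return h * total


estimate = trapezoid(math.sin, 0.0, math.pi, 100)
print(estimate)   # the exact integral of sin on [0, pi] is 2
```

Summing the per-panel error term over $n$ panels bounds the total error by $(b-a)h^2 \max|f''|/12$, which for this example is about $2.6 \times 10^{-4}$.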

### Lobatto quadrature

Also called Radau quadrature (Chandrasekhar 1960). A Gaussian quadrature with weighting function in which the endpoints of the interval are included in a total of abscissas, giving free abscissas. Abscissas are symmetrical about the origin, and the general formula is (1) The free abscissas for , ..., are the roots of the polynomial , where is a Legendre polynomial. The weights of the free abscissas are (2)(3) and of the endpoints are (4) The error term is given by (5) for . Beyer (1987) gives a table of parameters up to and Chandrasekhar (1960) up to (although Chandrasekhar's for is incorrect).

| $n$ | $x_i$ | $w_i$ |
|---|---|---|
| 3 | 0.000000 | 1.333333 |
| | ±1 | 0.333333 |
| 4 | ±0.447214 | 0.833333 |
| | ±1 | 0.166667 |
| 5 | 0.000000 | 0.711111 |
| | ±0.654654 | 0.544444 |
| | ±1 | 0.100000 |
| 6 | ±0.285232 | 0.554858 |
| | ±0.765055 | 0.378475 |
| | ±1 | 0.066667 |

### Cubature

Ueberhuber (1997, p. 71) and Krommer and Ueberhuber (1998, pp. 49 and 155-165) use the word "quadrature" to mean numerical computation of a univariate integral, and "cubature" to mean numerical computation of a multiple integral. Cubature techniques available in the Wolfram Language include Monte Carlo integration, implemented as NIntegrate[f, ..., Method -> MonteCarlo] or NIntegrate[f, ..., Method -> QuasiMonteCarlo], and the adaptive Genz-Malik algorithm, implemented as NIntegrate[f, ..., Method -> MultiDimensional].

### Simpson's rule

Simpson's rule is a Newton-Cotes formula for approximating the integral of a function using quadratic polynomials (i.e., parabolic arcs instead of the straight line segments used in the trapezoidal rule). Simpson's rule can be derived by integrating a third-order Lagrange interpolating polynomial fit to the function at three equally spaced points. In particular, let the function be tabulated at points , , and equally spaced by distance , and denote . Then Simpson's rule states that (1)(2) Since it uses quadratic polynomials to approximate functions, Simpson's rule actually gives exact results when approximating integrals of polynomials up to cubic degree. For example, consider (black curve) on the interval , so that , , and . Then Simpson's rule (which corresponds to the area under the blue curve obtained from the third-order interpolating polynomial) gives (3)(4)(5) whereas the trapezoidal rule (area under the red curve) gives and the..
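The exactness for cubics can be checked numerically with a short composite implementation. This sketch assumes the standard weights $1, 4, 2, 4, \ldots, 4, 1$ scaled by $h/3$; the function name `simpson` and the test integrand $x^3$ are illustrative choices, not the example plotted in the entry.

```python
def simpson(f, a, b, n):
    """Composite Simpson's rule with an even number n of subintervals on [a, b]."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd-indexed interior points get weight 4, even-indexed ones weight 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return h * total / 3


cubic = simpson(lambda x: x ** 3, 0.0, 2.0, 2)
print(cubic)   # the exact integral of x^3 on [0, 2] is 4
```

A single parabola through three points already integrates the cubic without error, because the cubic term's contribution cancels by symmetry about the midpoint.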

### Simpson's 3/8 rule

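Let the values of a function $f(x)$ be tabulated at points $x_i$ equally spaced by $h = x_{i+1} - x_i$, so $f_1 = f(x_1)$, $f_2 = f(x_2)$, ..., $f_4 = f(x_4)$. Then Simpson's 3/8 rule approximating the integral of $f(x)$ is given by the Newton-Cotes-like formula

$$\int_{x_1}^{x_4} f(x)\,dx = \tfrac{3}{8}h(f_1 + 3f_2 + 3f_3 + f_4) - \tfrac{3}{80}h^5 f^{(4)}(\xi).$$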

### Christoffel number

One of the quantities appearing in the Gauss-Jacobi mechanical quadrature. They satisfy(1)(2)and are given by(3)(4)(5)(6)where is the higher coefficient of .

### Shovelton's rule

Let the values of a function be tabulated at points equally spaced by , so , , ..., . Then Shovelton's rule approximating the integral of is given by the Newton-Cotes-like formula

### Hardy's rule

Let the values of a function be tabulated at points equally spaced by , so , , ..., . Then Hardy's rule approximating the integral of is given by the Newton-Cotes-like formula

### Chebyshev quadrature

A Gaussian quadrature-like formula for numerical estimation of integrals. It uses weighting function in the interval and forces all the weights to be equal. The general formula is (1) where the abscissas are found by taking terms up to in the Maclaurin series of (2) and then defining (3) The roots of then give the abscissas. The first few values are (4)(5)(6)(7)(8)(9)(10)(11)(12)(13) (OEIS A002680 and A101270). Because the roots are all real for $n < 8$ and $n = 9$ only (Hildebrand 1956), these are the only permissible orders for Chebyshev quadrature. The error term is (14) where (15) The first few values of are 2/3, 8/45, 1/15, 32/945, 13/756, and 16/1575 (Hildebrand 1956). Beyer (1987) gives abscissas up to and Hildebrand (1956) up to . The abscissas and weights can be computed analytically for small ..

### Radau quadrature

A Gaussian quadrature-like formula for numerical estimation of integrals. It requires points and fits all polynomials to degree , so it effectively fits exactly all polynomials of degree . It uses a weighting function in which the endpoint in the interval is included in a total of abscissas, giving free abscissas. The general formula is (1) The free abscissas for , ..., are the roots of the polynomial (2) where is a Legendre polynomial. The weights of the free abscissas are (3)(4) and of the endpoint (5) The error term is given by (6) for .

| $n$ | $x_i$ | $w_i$ |
|---|---|---|
| 2 | -1 | 0.5 |
| | 0.333333 | 1.5 |
| 3 | -1 | 0.222222 |
| | -0.289898 | 1.02497 |
| | 0.689898 | 0.752806 |
| 4 | -1 | 0.125 |
| | -0.575319 | 0.657689 |
| | 0.181066 | 0.776387 |
| | 0.822824 | 0.440924 |
| 5 | -1 | 0.08 |
| | -0.720480 | 0.446208 |
| | -0.167181 | 0.623653 |
| | 0.446314 | 0.562712 |
| | 0.885792 | 0.287427 |

The abscissas and weights can be computed analytically for small .

### Gaussian quadrature

Seeks to obtain the best numerical estimate of an integral by picking optimal abscissas at which to evaluate the function . The fundamental theorem of Gaussian quadrature states that the optimal abscissas of the -point Gaussian quadrature formulas are precisely the roots of the orthogonal polynomial for the same interval and weighting function. Gaussian quadrature is optimal because it fits all polynomials up to degree exactly. Slightly less optimal fits are obtained from Radau quadrature and Laguerre-Gauss quadrature. To determine the weights corresponding to the Gaussian abscissas , compute a Lagrange interpolating polynomial for by letting (1) (where Chandrasekhar 1967 uses instead of ), so (2) Then fitting a Lagrange interpolating polynomial through the points gives (3) for arbitrary points . We are therefore looking for a set of points and weights such that for a weighting function , (4)(5) with weight (6) The..

### Boole's rule

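Let the values of a function $f(x)$ be tabulated at points $x_i$ equally spaced by $h = x_{i+1} - x_i$, so $f_1 = f(x_1)$, $f_2 = f(x_2)$, ..., $f_5 = f(x_5)$. Then Boole's rule approximating the integral of $f(x)$ is given by the Newton-Cotes-like formula

$$\int_{x_1}^{x_5} f(x)\,dx = \tfrac{2}{45}h(7f_1 + 32f_2 + 12f_3 + 32f_4 + 7f_5) - \tfrac{8}{945}h^7 f^{(6)}(\xi).$$

This formula is frequently and mistakenly known as Bode's rule (Abramowitz and Stegun 1972, p. 886) as a result of a typo in an early reference, but is actually due to Boole (Boole and Moulton 1960).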
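A single 5-point panel with the standard Boole weights $(7, 32, 12, 32, 7)$ scaled by $2h/45$ can be sketched as below. The function name `boole` and the quintic test integrand are illustrative assumptions; a quintic is used because the rule's error term involves the sixth derivative, so polynomials up to degree 5 are integrated exactly.

```python
def boole(f, a, b):
    """Boole's rule on [a, b] using five equally spaced points."""
    h = (b - a) / 4
    f1, f2, f3, f4, f5 = (f(a + i * h) for i in range(5))
    return 2 * h / 45 * (7 * f1 + 32 * f2 + 12 * f3 + 32 * f4 + 7 * f5)


quintic = boole(lambda x: x ** 5, 0.0, 4.0)
print(quintic)   # the exact integral of x^5 on [0, 4] is 4096/6
```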

### Summation by parts

Summation by parts for discrete variables is the equivalent of integration by parts for continuous variables (1) or (2) where is the indefinite summation operator and the -operator is defined by (3) where is any constant.

### Markoff's formulas

Formulas obtained from differentiating Newton's forward difference formula, where is a binomial coefficient, and . Abramowitz and Stegun (1972) and Beyer (1987) give derivatives in terms of and derivatives in terms of and .

### Divided difference

The divided difference , sometimes also denoted (Abramowitz and Stegun 1972), on points , , ..., of a function is defined by and(1)for . The first few differences are(2)(3)(4)Defining(5)and taking the derivative(6)gives the identity(7)Consider the following question: does the property(8)for and a given function guarantee that is a polynomial of degree ? Aczél (1985) showed that the answer is "yes" for , and Bailey (1992) showed it to be true for with differentiable . Schwaiger (1994) and Andersen (1996) subsequently showed the answer to be "yes" for all with restrictions on or .
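The recursive definition lends itself to an in-place table computation. The sketch below assumes the usual convention that a zeroth-order divided difference is the function value itself and that each higher order is a quotient of two lower-order differences over the spread of their points; the name `divided_differences` is hypothetical.

```python
def divided_differences(xs, ys):
    """Return [f[x0], f[x0,x1], ..., f[x0,...,xn]] for distinct points xs."""
    coeffs = list(ys)
    n = len(xs)
    for order in range(1, n):
        # Sweep from the bottom so lower-order entries are still available.
        for i in range(n - 1, order - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - order])
    return coeffs


# f(x) = x^2 sampled at unequally spaced points 1, 2, 4.
print(divided_differences([1.0, 2.0, 4.0], [1.0, 4.0, 16.0]))
```

For a polynomial of degree $n$, the $n$th divided difference equals the leading coefficient for any choice of points, which is why the last entry here is 1.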

### Stirling's finite difference formula

(1)for , where is the central difference and(2)(3)with a binomial coefficient.

### Jackson's difference fan

If, after constructing a difference table, no clear pattern emerges, turn the paper through an angle of and compute a new table. If necessary, repeat the process. Each rotation reduces powers by 1, so the sequence multiplied by any polynomial in is reduced to 0s by a -fold difference fan. Call Jackson's difference fan sequence transform the -transform, and define as the -th -transform of the sequence , where and are complex numbers. This is denoted When , this is known as the binomial transform of the sequence. Greater values of give greater depths of this fanning process. The inverse -transform of the sequence is given by When , this gives the inverse binomial transform of .

### Steffenson's formula

(1)for , where is the central difference and(2)(3)(4)(5)where is a binomial coefficient.

### Difference quotient

It gives the slope of the secant line passing through and . In the limit , the difference quotient becomes the partial derivative

### Reciprocal difference

The reciprocal differences are closely related to the divided difference. The first few are explicitly given by (1)(2)(3)(4)

### Gauss's forward formula

Gauss's forward formula is(1)for , where is the central difference and(2)(3)where is a binomial coefficient.

### Gauss's backward formula

(1) for , where is the central difference and (2)(3) where is a binomial coefficient.

### Clairaut's difference equation

Clairaut's difference equation is a special case of Lagrange's equation (Sokolnikoff and Redheffer 1958) defined by (1) or in " notation," (2) (Spiegel 1970). It is so named by analogy with Clairaut's differential equation (3)

### Forward difference

The forward difference is a finite difference defined by (1) Higher order differences are obtained by repeated operations of the forward difference operator, (2) so (3)(4)(5)(6)(7) In general, (8) where is a binomial coefficient (Sloane and Plouffe 1995, p. 10). The forward finite difference is implemented in the Wolfram Language as DifferenceDelta[f, i]. Newton's forward difference formula expresses as the sum of the th forward differences (9) where is the first th difference computed from the difference table. Furthermore, if the differences , , , ..., are known for some fixed value of , then a formula for the th term is given by (10) (Sloane and Plouffe 1995, p. 10).
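A difference table and the resulting formula for the general term can be sketched as follows. This is a minimal illustration assuming the standard form $a_n = \sum_k \binom{n}{k}\Delta^k a_0$ with the sequence indexed from 0; the function names and the sequence of cubes are hypothetical choices.

```python
from math import comb


def forward_differences(values):
    """Return the difference table: row k holds the k-th forward differences."""
    table = [list(values)]
    while len(table[-1]) > 1:
        row = table[-1]
        table.append([b - a for a, b in zip(row, row[1:])])
    return table


def newton_forward_term(values, n):
    """Extrapolate the n-th term from the leading entry of each difference row."""
    leading = [row[0] for row in forward_differences(values)]
    return sum(comb(n, k) * d for k, d in enumerate(leading))


cubes = [0, 1, 8, 27, 64]
print(forward_differences(cubes)[3])   # third differences of the cubes
print(newton_forward_term(cubes, 5))   # predicted next term
```

The extrapolation is exact here because the tabulated sequence is a polynomial whose degree is smaller than the number of tabulated values.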

### Central difference

The central difference for a function tabulated at equal intervals is defined by (1) First and higher order central differences arranged so as to involve integer indices are then given by (2)(3)(4)(5)(6)(7) (Abramowitz and Stegun 1972, p. 877). Higher order differences may be computed for even and odd powers, (8)(9) (Abramowitz and Stegun 1972, p. 877).

### Newton's forward difference formula

Newton's forward difference formula is a finite difference identity giving an interpolated value between tabulated points in terms of the first value and the powers of the forward difference . For , the formula states (1) When written in the form (2) with the falling factorial, the formula looks suspiciously like a finite analog of a Taylor series expansion. This correspondence was one of the motivating forces for the development of umbral calculus. An alternate form of this equation using binomial coefficients is (3) where the binomial coefficient represents a polynomial of degree in . The derivative of Newton's forward difference formula gives Markoff's formulas.

### Finite difference

The finite difference is the discrete analog of the derivative. The finite forward difference of a function is defined as (1) and the finite backward difference as (2) The forward finite difference is implemented in the Wolfram Language as DifferenceDelta[f, i]. If the values are tabulated at spacings , then the notation (3) is used. The th forward difference would then be written as , and similarly, the th backward difference as . However, when is viewed as a discretization of the continuous function , then the finite difference is sometimes written (4)(5) where denotes convolution and is the odd impulse pair. The finite difference operator can therefore be written (6) An $n$th power has a constant $n$th finite difference. For example, take $n = 3$ and make a difference table, (7) The $\Delta^3$ column is the constant 6. Finite difference formulas can be very useful for extrapolating a finite amount of data in an attempt to find the general term. Specifically, if a function..

### Bessel's finite difference formula

An interpolation formula, sometimes known as the Newton-Bessel formula, given by (1) for , where is the central difference and (2)(3)(4)(5)(6)(7)(8)(9) where are the coefficients from Gauss's backward formula and Gauss's forward formula and and are the coefficients from Everett's formula. The s also satisfy (10)(11) for (12)

### Everett's formula

(1)for , where is the central difference and(2)(3)(4)(5)where are the coefficients from Gauss's backward formula and Gauss's forward formula and are the coefficients from Bessel's finite difference formula. The s and s also satisfy(6)(7)for(8)

### Backward difference

The backward difference is a finite difference defined by (1) Higher order differences are obtained by repeated operations of the backward difference operator, so (2)(3)(4) In general, (5) where is a binomial coefficient. The backward finite difference is implemented in the Wolfram Language as DifferenceDelta[f, i]. Newton's backward difference formula expresses as the sum of the th backward differences (6) where is the first th difference computed from the difference table.

### Secant method

A root-finding algorithm which assumes a function to be approximately linear in the region of interest. Each improvement is taken as the point where the approximating line crosses the axis. The secant method retains only the most recent estimate, so the root does not necessarily remain bracketed. The secant method is implemented in the Wolfram Language as the undocumented option Method -> Secant in FindRoot[eqn, x, x0, x1]. When the algorithm does converge, its order of convergence is (1) where is a constant and is the golden ratio. (2)(3)(4) so (5) The secant method can be implemented in the Wolfram Language as SecantMethodList[f_, {x_, x0_, x1_}, n_] := NestList[Last[#] - {0, (Function[x, f][Last[#]]* Subtract @@ #)/Subtract @@ Function[x, f] /@ #}&, {x0, x1}, n]
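For readers without the Wolfram Language, the same iteration can be sketched in Python. The function name `secant`, its tolerance and iteration cap, and the test equation $x^2 - 2 = 0$ are assumptions made for illustration.

```python
def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """Find a root of f from two starting estimates using the secant iteration."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:                 # secant line is horizontal; no crossing to take
            break
        # Point where the line through the two latest estimates crosses the axis.
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
        if abs(x1 - x0) < tol:
            break
    return x1


root = secant(lambda x: x * x - 2.0, 1.0, 2.0)
print(root)
```

Unlike bisection or the method of false position, nothing here keeps the two estimates on opposite sides of the root, so a poor starting pair can send the iterates away from it.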

### Newton's method

Newton's method, also called the Newton-Raphson method, is a root-finding algorithm that uses the first few terms of the Taylor series of a function in the vicinity of a suspected root. Newton's method is sometimes also known as Newton's iteration, although in this work the latter term is reserved to the application of Newton's method for computing square roots. For a polynomial, Newton's method is essentially the same as Horner's method. The Taylor series of about the point is given by (1) Keeping terms only to first order, (2) Equation (2) is the equation of the tangent line to the curve at , so is the place where that tangent line intersects the -axis. A graph can therefore give a good intuitive idea of why Newton's method works at a well-chosen starting point and why it might diverge with a poorly-chosen starting point. The expression above can be used to estimate the amount of offset needed to land closer to the root starting from an initial guess..
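The tangent-line step can be sketched as a short loop. This is a minimal example that assumes the derivative is supplied by the caller; the function name `newton`, its stopping rule, and the test equation $\cos x = x$ are illustrative choices.

```python
import math


def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Iterate x <- x - f(x)/f'(x) until the step size falls below tol."""
    x = x0
    for _ in range(max_iter):
        slope = fprime(x)
        if slope == 0.0:
            raise ZeroDivisionError("derivative vanished; try another starting point")
        step = f(x) / slope          # offset to where the tangent line crosses the axis
        x -= step
        if abs(step) < tol:
            break
    return x


root = newton(lambda x: math.cos(x) - x, lambda x: -math.sin(x) - 1.0, 1.0)
print(root)
```

The explicit check on the slope reflects the failure mode described above: a tangent that is horizontal, or nearly so, throws the next estimate far from the root.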

### Conjugate gradient method

The conjugate gradient method is an algorithm for finding the nearest local minimum of a function of variables which presupposes that the gradient of the function can be computed. It uses conjugate directions instead of the local gradient for going downhill. If the vicinity of the minimum has the shape of a long, narrow valley, the minimum is reached in far fewer steps than would be the case using the method of steepest descent. For a discussion of the conjugate gradient method on vector and shared memory computers, see Dongarra et al. (1991). For discussions of the method for more general parallel architectures, see Demmel et al. (1993) and Ortega (1988) and the references therein.

### Householder's method

A root-finding algorithm based on the iteration formula

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}\left\{1 + \frac{f(x_n)\,f''(x_n)}{2[f'(x_n)]^2}\right\}.$$

This method, like Newton's method, has poor convergence properties near any point where the derivative $f'(x) = 0$. A fractal is obtained by applying Householder's method to finding a root of . Coloring the basin of attraction (the set of initial points which converge to the same root) for each root a different color then gives the above plots.

### Wavelet matrix

Any discrete finite wavelet transform can be represented as a matrix, and such a wavelet matrix can be computed in steps, compared to for the Fourier matrix, where is the base-2 logarithm. A single wavelet matrix can be built using Haar functions.

### Wavelet

Wavelets are a class of functions used to localize a given function in both space and scaling. A family of wavelets can be constructed from a function , sometimes known as a "mother wavelet," which is confined in a finite interval. "Daughter wavelets" are then formed by translation () and contraction (). Wavelets are especially useful for compressing image data, since a wavelet transform has properties which are in some ways superior to a conventional Fourier transform. An individual wavelet can be defined by (1) Then (2) and Calderón's formula gives (3) A common type of wavelet is defined using Haar functions. The Season 1 episode "Counterfeit Reality" (2005) of the television crime drama NUMB3RS features wavelets.

### Lemarié's wavelet

A wavelet used in multiresolution representation to analyze the information content of images. The wavelet is defined by (1) where (2)(3) (Mallat 1989ab).

### Descartes' sign rule

A method of determining the maximum number of positive and negative real roots of a polynomial. For positive roots, start with the sign of the coefficient of the lowest (or highest) power. Count the number of sign changes as you proceed from the lowest to the highest power (ignoring powers which do not appear). Then is the maximum number of positive roots. Furthermore, the number of allowable roots is , , , .... For example, consider the polynomial (1) Since there are three sign changes, there are a maximum of three possible positive roots. For negative roots, starting with a polynomial , write a new polynomial with the signs of all odd powers reversed, while leaving the signs of the even powers unchanged. Then proceed as before to count the number of sign changes . Then is the maximum number of negative roots. For example, consider the polynomial (2) and compute the new polynomial (3) In this example, there are four sign changes, so there are a maximum of..
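The counting procedure is mechanical enough to sketch in code. In this minimal example the coefficients are listed from the lowest power to the highest, zero coefficients are skipped, and the bound for negative roots is obtained by reversing the signs of the odd powers as described above. The function names and the test polynomial $(x-1)(x-2)(x-3)$ are illustrative assumptions, not the polynomials of the entry.

```python
def sign_changes(coeffs):
    """Count sign changes between consecutive nonzero coefficients."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(a != b for a, b in zip(signs, signs[1:]))


def descartes_bounds(coeffs):
    """Return (max positive roots, max negative roots); coeffs[k] multiplies x**k."""
    reflected = [-c if k % 2 else c for k, c in enumerate(coeffs)]  # p(-x)
    return sign_changes(coeffs), sign_changes(reflected)


# x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
print(descartes_bounds([-6, 11, -6, 1]))
```

The rule gives only an upper bound; the actual count can fall short of it by an even number, which corresponds to pairs of complex conjugate roots.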

### Müntz's theorem

Müntz's theorem is a generalization of the Weierstrass approximation theorem, which states that any continuous function on a closed and bounded interval can be uniformly approximated by polynomials involving constants and any infinite sequence of powers whose reciprocals diverge.In technical language, Müntz's theorem states that the Müntz space is dense in iff