Gauss–Newton algorithm

Figure: Fitting of a noisy curve by an asymmetrical peak model, using the Gauss–Newton algorithm with a variable damping factor α. Top: raw data and model. Bottom: evolution of the normalised sum of squared errors.

The Gauss–Newton algorithm is used to solve non-linear least squares problems, which is equivalent to minimizing a sum of squared function values. It is an extension of Newton's method for finding a minimum of a non-linear function. Since a sum of squares must be nonnegative, the algorithm can be viewed as using Newton's method to iteratively approximate zeroes of the components of the sum, and thus minimize the sum. In this sense, the algorithm is also an effective method for solving overdetermined systems of equations. It has the advantage that second derivatives, which can be challenging to compute, are not required.

Non-linear least squares problems arise, for instance, in non-linear regression, where parameters in a model are sought such that the model is in good agreement with available observations.

The method is named after the mathematicians Carl Friedrich Gauss and Isaac Newton, and first appeared in Gauss's 1809 work Theoria motus corporum coelestium in sectionibus conicis solem ambientium.

Description

Given $m$ functions $\mathbf{r} = (r_1, \ldots, r_m)$ (often called residuals) of $n$ variables $\boldsymbol\beta = (\beta_1, \ldots, \beta_n)$, with $m \ge n$, the Gauss–Newton algorithm iteratively finds the values of the variables that minimize the sum of squares

$$S(\boldsymbol\beta) = \sum_{i=1}^m r_i(\boldsymbol\beta)^2.$$

Starting with an initial guess $\boldsymbol\beta^{(0)}$ for the minimum, the method proceeds by the iterations

$$\boldsymbol\beta^{(s+1)} = \boldsymbol\beta^{(s)} - \left(\mathbf{J_r}^\mathsf{T} \mathbf{J_r}\right)^{-1} \mathbf{J_r}^\mathsf{T} \mathbf{r}\left(\boldsymbol\beta^{(s)}\right),$$

where, if $\mathbf{r}$ and $\boldsymbol\beta$ are column vectors, the entries of the Jacobian matrix are

$$\left(\mathbf{J_r}\right)_{ij} = \frac{\partial r_i\left(\boldsymbol\beta^{(s)}\right)}{\partial \beta_j},$$

and the symbol $^\mathsf{T}$ denotes the matrix transpose.

At each iteration, the update $\Delta = \boldsymbol\beta^{(s+1)} - \boldsymbol\beta^{(s)}$ can be found by rearranging the previous equation in the following two steps:

$$\Delta = -\left(\mathbf{J_r}^\mathsf{T} \mathbf{J_r}\right)^{-1} \mathbf{J_r}^\mathsf{T} \mathbf{r}\left(\boldsymbol\beta^{(s)}\right),$$

$$\mathbf{J_r}^\mathsf{T} \mathbf{J_r}\, \Delta = -\mathbf{J_r}^\mathsf{T} \mathbf{r}\left(\boldsymbol\beta^{(s)}\right).$$

With substitutions $A = \mathbf{J_r}^\mathsf{T} \mathbf{J_r}$, $\mathbf{b} = -\mathbf{J_r}^\mathsf{T} \mathbf{r}\left(\boldsymbol\beta^{(s)}\right)$, and $\mathbf{x} = \Delta$, this turns into the conventional matrix equation of the form $A\mathbf{x} = \mathbf{b}$, which can then be solved by a variety of methods (see Notes).
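The normal-equations step above translates directly into code. The following is a minimal NumPy sketch of a single update; the function name `gauss_newton_step` and its calling convention are illustrative, not part of any standard library:

```python
import numpy as np

def gauss_newton_step(r, J):
    """Compute the Gauss-Newton update Delta by solving the normal
    equations (J^T J) Delta = -J^T r.

    r : residual vector at the current parameters, shape (m,)
    J : Jacobian of the residuals at the current parameters, shape (m, n)
    """
    A = J.T @ J       # A = J_r^T J_r, an n x n matrix
    b = -J.T @ r      # b = -J_r^T r(beta^(s))
    return np.linalg.solve(A, b)  # Delta, to be added to beta^(s)
```

In practice a QR or Cholesky factorization of the Jacobian is often preferred over forming $\mathbf{J_r}^\mathsf{T} \mathbf{J_r}$ explicitly, since the normal equations square the condition number of the problem.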

If $m = n$, the iteration simplifies to

$$\boldsymbol\beta^{(s+1)} = \boldsymbol\beta^{(s)} - \left(\mathbf{J_r}\right)^{-1} \mathbf{r}\left(\boldsymbol\beta^{(s)}\right),$$

which is a direct generalization of Newton's method in one dimension: for $m = n = 1$ it reduces to the familiar Newton iteration $\beta^{(s+1)} = \beta^{(s)} - r\left(\beta^{(s)}\right)/r'\left(\beta^{(s)}\right)$.

In data fitting, where the goal is to find the parameters $\boldsymbol\beta$ such that a given model function $y = f(x, \boldsymbol\beta)$ best fits some data points $(x_i, y_i)$, the functions $r_i$ are the residuals:

$$r_i(\boldsymbol\beta) = y_i - f\left(x_i, \boldsymbol\beta\right).$$

Then, the Gauss–Newton method can be expressed in terms of the Jacobian $\mathbf{J_f} = -\mathbf{J_r}$ of the function $f$ as

$$\boldsymbol\beta^{(s+1)} = \boldsymbol\beta^{(s)} + \left(\mathbf{J_f}^\mathsf{T} \mathbf{J_f}\right)^{-1} \mathbf{J_f}^\mathsf{T} \mathbf{r}\left(\boldsymbol\beta^{(s)}\right).$$

Note that $\left(\mathbf{J_f}^\mathsf{T} \mathbf{J_f}\right)^{-1} \mathbf{J_f}^\mathsf{T}$ is the left pseudoinverse of $\mathbf{J_f}$.
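To make the data-fitting recursion concrete, here is a minimal NumPy sketch that fits a hypothetical two-parameter exponential model $f(x, \boldsymbol\beta) = \beta_1 e^{\beta_2 x}$; the model, the function name, and the stopping rule are assumptions chosen for illustration:

```python
import numpy as np

def gauss_newton_fit(x, y, beta, n_iter=50, tol=1e-10):
    """Gauss-Newton fit of the illustrative model
    f(x, beta) = beta[0] * exp(beta[1] * x).

    Residuals are r_i = y_i - f(x_i, beta), so the update uses
    J_f = -J_r as in the text: beta <- beta + (J_f^T J_f)^{-1} J_f^T r.
    """
    beta = np.asarray(beta, dtype=float)
    for _ in range(n_iter):
        e = np.exp(beta[1] * x)
        r = y - beta[0] * e                          # residual vector r(beta)
        Jf = np.column_stack((e, beta[0] * x * e))   # Jacobian of f, shape (m, 2)
        delta = np.linalg.solve(Jf.T @ Jf, Jf.T @ r) # normal equations
        beta = beta + delta
        if np.linalg.norm(delta) < tol:              # assumed convergence test
            break
    return beta
```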

The Gauss–Newton iteration

$$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - J\left(\mathbf{x}^{(k)}\right)^+ \mathbf{r}\left(\mathbf{x}^{(k)}\right), \quad k = 0, 1, \ldots$$

is an effective method for solving overdetermined systems of equations in the form of $\mathbf{r}(\mathbf{x}) = \mathbf{0}$ with $m \ge n$, where $J(\mathbf{x})^+$ is the Moore–Penrose inverse (also known as the pseudoinverse) of the Jacobian matrix $J(\mathbf{x})$ of $\mathbf{r}(\mathbf{x})$. It can be considered an extension of Newton's method and enjoys the same local quadratic convergence toward isolated regular solutions.
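As a sketch of this pseudoinverse form (assuming user-supplied callables `r_func` and `J_func` returning the residual vector and Jacobian; the names are illustrative):

```python
import numpy as np

def gauss_newton_pinv(r_func, J_func, x, n_iter=20):
    """Iterate x_{k+1} = x_k - pinv(J(x_k)) @ r(x_k).

    Equivalent to the normal-equations form whenever J(x) has full
    column rank, but also defined when it does not.
    """
    x = np.asarray(x, dtype=float)
    for _ in range(n_iter):
        x = x - np.linalg.pinv(J_func(x)) @ r_func(x)
    return x
```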