# 2.8. Newton’s Method

Newton’s method is an efficient way to calculate roots of a function.

**Definition 2.8.1 (Newton’s method).** Let $\vec f:U \to \R^n$ be a differentiable map. *Newton’s method* consists of starting with some guess $a_0$ for a solution of $\vec f(x) = \vec 0$. Then linearize the equation at $a_0$: replace the increment to the function, $\vec f(x) - \vec f(a_0)$, by a linear function of the increment, $[D\vec f(a_0)](x-a_0)$. Now solve the corresponding linear equation:

\[\vec f(a_0) + [D\vec f(a_0)](x - a_0) = \vec 0.\]

This is a system of $n$ linear equations in $n$ unknowns. We can write it as

\[[D\vec f(a_0)](x-a_0)=-\vec f(a_0).\]

If $[D\vec f(a_0)]$ is invertible, which will usually be the case, then

\[x = a_0 - [D\vec f(a_0)]^{-1}\vec f(a_0).\]

This solution becomes the next guess $a_1$, and iterating gives $a_{n+1} = a_n - [D\vec f(a_n)]^{-1}\vec f(a_n)$. Newton’s method doesn’t work for every function. In the single-variable case it is clear when it fails: the tangent line at $a_n$ may be horizontal ($f'(a_n) = 0$), or the iterates may cycle or run away when the initial guess is far from a root.
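As a sketch (not from the text; the function and variable names are my own), the iteration can be implemented in Python with NumPy, solving a linear system at each step rather than forming the inverse:

```python
import numpy as np

def newton(f, Df, a0, tol=1e-12, max_iter=50):
    """Newton's method for f: R^n -> R^n, given the Jacobian Df."""
    a = np.asarray(a0, dtype=float)
    for _ in range(max_iter):
        # Solve [Df(a_n)] h = f(a_n) instead of inverting the derivative.
        h = np.linalg.solve(Df(a), f(a))
        a = a - h  # a_{n+1} = a_n - [Df(a_n)]^{-1} f(a_n)
        if np.linalg.norm(h) < tol:
            break
    return a

# One-variable sanity check: x^3 - 2x - 5 = 0 with initial guess 2.
root = newton(lambda x: np.array([x[0]**3 - 2*x[0] - 5]),
              lambda x: np.array([[3*x[0]**2 - 2]]),
              [2.0])
```

Solving the system with `np.linalg.solve` is both cheaper and numerically more stable than computing $[D\vec f(a_n)]^{-1}$ explicitly.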

## Lipschitz conditions

To determine at which points Newton’s method converges, we need an explicit bound on how good an approximation $[D\vec f(x_0)]\vec h$ is to $\vec f(x_0 + \vec h) - \vec f(x_0)$.

**Definition 2.8.4 (Lipschitz condition for a derivative).** Let $U \subset\R^n$ be open and let $f: U \to \R^n$ be a differentiable mapping. The derivative $[Df(x)]$ satisfies a Lipschitz condition on a subset $V \subset U$ with *Lipschitz ratio* $M$ if for all $x,y\in V$,

\[|[Df(x)] - [Df(y)]| \leq M|x-y|.\]

### Computing Lipschitz ratios using second partial derivatives

**Definition 2.8.7 (Second partial derivative).** Let $U \subset \R^n$ be open, and let $f: U \to \R$ be a differentiable function. If the function $D_if$ is itself differentiable, then its partial derivative with respect to the $j$th variable, $D_j(D_if)$, is called the *second partial derivative* of $f$.

**Proposition 2.8.9 (Derivative of a** $C^2$ **mapping is Lipschitz).** Let $U \subset \R^n$ be an open ball, and $f: U \to \R^n$ a $C^2$ mapping. If

\[\big|D_kD_jf_i(x)\big| \leq c_{i,j,k}\]

for any $x\in U$ and for all triples of indices $1\leq i,j,k \leq n$, then for $u,v \in U$,

\[|[Df(u)]-[Df(v)]| \leq \Big(\sum_{1\leq i,j,k\leq n}(c_{i,j,k})^2\Big)^{1/2}|u-v|.\]

## Kantorovich’s Theorem

**Theorem 2.8.13 (Kantorovich’s theorem).** Let $a_0$ be a point in $\R^n$, $U$ an open neighborhood of $a_0$ in $\R^n$, and $\vec f: U \to \R^n$ a differentiable mapping, with its derivative $[D\vec f(a_0)]$ invertible. Define

\[\vec h_0 = -[D\vec f(a_0)]^{-1}\vec f(a_0), \qquad a_1 = a_0 + \vec h_0, \qquad U_1 = B_{|\vec h_0|}(a_1).\]

If $\overline U_1 \subset U$ and the derivative $[D\vec f(x)]$ satisfies the Lipschitz condition

\[|[D\vec f(u_1)] - [D\vec f(u_2)]| \leq M|u_1 - u_2|\]

for all points $u_1, u_2 \in \overline U_1$, and if the inequality

\[|\vec f(a_0)|\,\big|[D\vec f(a_0)]^{-1}\big|^2 M \leq \frac{1}{2}\]

is satisfied, then the equation $\vec f(x) = \vec 0$ has a unique solution in the closed ball $\overline U_1$, and Newton’s method with initial guess $a_0$ converges to it.

**Example (let’s complicate life).** When does Newton’s method converge for $x^3-2x-5=0$?

Suppose we start at $x_0 = 2$. The distance between $x_0$ and $x_1$, written as $h_0$, is equal to

\[h_0 = -[Df(x_0)]^{-1}f(x_0) = -\frac{f(x_0)}{f'(x_0)} = \frac{1}{10}.\]

Now we compute $x_1 = x_0 + h_0 = 2 + 1/10 = 2.1$. To apply Kantorovich’s theorem, we need the supremum of $|f''(x)|$ on the ball $B_{|h_0|}(x_1)=(2,2.2)$. Because $f''(x) = 6x$ is increasing, the sup is $f''(2.2) = 13.2$. Using this value as $M$ in Kantorovich’s theorem,

\[\frac{|f(x_0)|M}{|f'(x_0)|^2} = \frac{(1)(13.2)}{10^2} = 0.132 < \frac{1}{2}.\]

So Newton’s method starting at $x_0 = 2$ does indeed converge.
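The arithmetic above can be checked numerically; this is a quick sketch, with $M$ taken as the sup of $f''$ on $(2, 2.2)$ as in the text:

```python
f  = lambda x: x**3 - 2*x - 5
df = lambda x: 3*x**2 - 2

x0 = 2.0
h0 = -f(x0) / df(x0)        # Newton step: 1/10
x1 = x0 + h0                # 2.1
M  = 6 * 2.2                # sup of f''(x) = 6x on (2, 2.2) is f''(2.2) = 13.2

kantorovich = abs(f(x0)) * M / df(x0)**2
print(kantorovich)          # 0.132, which is < 1/2
```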

**Example.**

Find $r$ such that if $a^2 + b^2 < r^2$, Newton’s method starting at $\begin{pmatrix}3\\4\end{pmatrix}$ will converge to a solution. We find the roots of the function $\vec f: U \to \R^2$ given by

\[f\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}x^2+y^2-25-a\\xy-12-b\end{pmatrix}.\]

By Kantorovich’s theorem, we need $|\vec f(a_0)|\,\big|[D\vec f(a_0)]^{-1}\big|^2 M \leq \frac{1}{2}$. We first calculate the Lipschitz ratio $M$ from the second partial derivatives.
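The Lipschitz ratio can be computed as in Proposition 2.8.9 (a sketch; since this map is quadratic, its second partials are constant and the bounds $c_{i,j,k}$ are exact):

```python
import math

# Nonzero second partials of f(x, y) = (x^2 + y^2 - 25 - a, xy - 12 - b):
# f_1: D_1D_1 f_1 = 2 and D_2D_2 f_1 = 2 (mixed partials vanish)
# f_2: D_1D_2 f_2 = D_2D_1 f_2 = 1 (pure partials vanish)
c = [2, 2, 1, 1]

# Proposition 2.8.9: M = sqrt of the sum of the squared bounds
M = math.sqrt(sum(cijk**2 for cijk in c))
print(M)  # sqrt(10)
```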

By Proposition 2.8.9, the nonzero second partials are bounded by $2, 2, 1, 1$, so $M = \sqrt{4+4+1+1} = \sqrt{10}$. Next, we calculate the inverse of the Jacobian. At $x=3$ and $y=4$,

\[[D\vec f(a_0)]^{-1} = \begin{bmatrix}6&8\\4&3\end{bmatrix}^{-1} = \frac{1}{14}\begin{bmatrix}-3&8\\4&-6\end{bmatrix}.\]

The square of the length of this matrix is $\frac{9+64+16+36}{196} = \frac{125}{196}$.

Now, let’s find the ball $\overline U_1$ in which Kantorovich’s theorem guarantees a unique solution: the ball of radius $|\vec h_0|$ centered at $a_1 = a_0 + \vec h_0$, where

\[\vec h_0 = -[D\vec f(a_0)]^{-1}\vec f(a_0) = -\frac{1}{14}\begin{bmatrix}-3&8\\4&-6\end{bmatrix}\begin{bmatrix}-a\\-b\end{bmatrix} = \frac{1}{14}\begin{bmatrix}-3a+8b\\4a-6b\end{bmatrix}.\]

The equation $\vec f\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix}$ has a unique solution in $\overline B_{|\vec h_0|}(a_1)$ provided $\sqrt{a^2+b^2}\cdot\frac{125}{196}\cdot\sqrt{10} \leq \frac{1}{2}$, which holds whenever $r \leq \frac{1}{2}\cdot\frac{196}{125}\cdot\frac{1}{\sqrt{10}}$.
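Putting the pieces together numerically (a sketch; the numbers mirror the computation above):

```python
import numpy as np

# Jacobian of f at a0 = (3, 4): rows are [2x, 2y] and [y, x].
Df = np.array([[6.0, 8.0],
               [4.0, 3.0]])
Dfinv = np.linalg.inv(Df)

len_sq = np.sum(Dfinv**2)   # square of the length of [Df(a0)]^{-1}
M = np.sqrt(10)             # Lipschitz ratio from the second partials

# |f(a0)| = sqrt(a^2 + b^2) < r, so the Kantorovich inequality
# r * len_sq * M <= 1/2 gives the radius bound:
r_max = 0.5 / (len_sq * M)
print(len_sq, r_max)        # 125/196 ≈ 0.6378 and r_max ≈ 0.2479
```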