3. Norms

A norm is how we measure the length/size of vectors and matrices.

\[\| . \|: V\to \R\quad (V \subset \R^n).\]

An example is relative error:

\[\frac{\|x-y\|}{\|x\|}.\]

$|.|$ is a norm if

Common Vector Norms

The infinity norm is the maximum value of a vector

\[\|\vec x\|_\infty =\max_i|x_i|.\]

This is very punishing of outliers.

The 1 norm is the sum of the absolute values

\[\|x\|_1 = \sum_{i=0}^n|x_i|.\]

The 2 norm is the square root of sum the squares.

These are all examples of $p$-norms, which are given by

\[\Bigg(\sum|x_i|\Bigg)^{1/p}.\]

The limit is what happens when $p\to \infty$.

Isometries are operations that do not change the norms.

Given two norms $|x|_a$ and $|x|_b$, there is a theorem that $\exists$ constants $c_1, c_2$ such that

\[c_1\|x\|_b \leq \|x\|_a \leq c_2\|x\|_b.\]

Example. The infinity norm is less than the 1 norm less than $n$ times the infinity norm.

\[\max_i|x_i| \leq \sum_{i=0}^n|x_i|\leq n \max_i|x_i|.\]

Note that they are equivalent when $x$ has exactly 1 non-zero element or all the entries are 0.

Suppose $A \in \R^{m\times n}$. The Frobenius norm is treating a matrix as a $mn$ dimension vector and taking the 2-norm.

We can also calculate the induced norm, or operator norm.

\[\|A\|_p = \sup_{\|x\|_p = 1} {\|Ax\|_p}.\]

A nice property of induced norms is that they are subordinate and submultiplicative. See Cauchy-Schwarz. 1.4. Geometry of the real numbers.

\[\|Av\|\leq \|A\|\|v\|.\\ \|AB\|\leq \|A\|\|B\|.\]

This is not a property of all norms. For example, suppose we want ot take the max norm

\[\Bigg\|\begin{bmatrix}2&2\\0&0\end{bmatrix}\begin{bmatrix}1&0\\1&0\end{bmatrix}\Bigg \|_{max} = 4.\]

Suppose $M_1$ and $M_2$ are permutation matrices.

\[\|M_1AM_2\|_p = \max_{\|x\|=1}\|M_1AM_2x\|_p.\]

This property also holds for frobenius norms.

Suppose we have $|A|^2_2=\max_{|x|_1} |Ax|^2_2 = x^TA^TAx$.

$A^TA$ is a symmetric matrix, so it has a full set of orthonormal vectors for its eigenvalues.

$A^TA = V\Lambda V^T$ where $V^TV = I$.

$|A|_2 = \sqrt{\lambda_{max} (A^TA)}$.