
Gradient

From Mathematics Is A Science

How do we understand the derivative of functions of several variables?

Fix a point $a$. Then

  • The directional derivative $\nabla_e f(a)$ is a number for each direction $e$ with $||e||=1$; there are infinitely many of them.
  • The partial derivative $\frac{\partial f}{\partial x_k}(a)$ is a number for each $k = 1, \ldots, n$; together these numbers form the gradient, which depends on the choice of coordinates.
  • The total derivative $f'(a)$ is a linear map from ${\bf R}^n$ to ${\bf R}$ and is given by a matrix of dimension $1 \times n$ if coordinates are chosen.


Theorem. Suppose $a \in {\rm int \hspace{3pt}}D(f)$, and $f$ is differentiable at $a$. Then

  1. $\nabla_e f(a) = f'(a)(e)$ ($f'(a)$ a linear map applied to $e$)
  2. $f'(a) = {\rm grad \hspace{3pt}}f(a)^T$ (transposed)
  3. $f'(a)(x) = < {\rm grad \hspace{3pt}}f(a), x >.$

Proof 1. Given $e$ with $||e|| = 1$, we plug in the definition of $f'(a)$:

$$\displaystyle\lim_{x \rightarrow a} \frac{f(x) - T(x)}{|| x - a ||} = 0.$$

Take $x = a + te$, $t > 0$. Then $|| x - a || = t || e || = t$, and $x \rightarrow a$ as $t \rightarrow 0$. Then

$$\displaystyle\lim_{t \rightarrow 0} \frac{f( a + te ) - T( a + te)}{t} = 0,$$

because $f$ is differentiable at $a$.

Recall $T$ is an affine function, i.e.

$$T(x) = f(a) + f'(a)( x - a ).$$

Substituting the above term into the limit, we obtain

$$\displaystyle\lim_{t \rightarrow 0} \frac{f( a + te ) - f(a) - f'(a)( te )}{t} = 0,$$

or

$$\displaystyle\lim_{t \rightarrow 0} \left[ \frac{ f( a + te ) - f(a) }{t} - \frac{f'(a)( te )}{t} \right] = 0.$$

Note that the limit of the first term is the directional derivative $\nabla_e f(a)$. Further, $f'(a)$ is a linear map, so $f'(a)(te) = t f'(a)(e)$ and we can cancel $t$. Hence

$$\displaystyle\lim_{t \rightarrow 0} \left[ \frac{f( a + te ) - f(a)}{t} - f'(a)( e ) \right] = 0.$$

Moving $f'(a)( e )$ to the right-hand side (which we can do because it is independent of $t$) yields

$$\displaystyle\lim_{t \rightarrow 0} \frac{f( a + te ) - f(a)}{t} = f'(a)( e ).$$

Prove 2 and 3 as an exercise.
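Parts 1 and 3 of the theorem can be checked numerically. The sketch below is only an illustration under assumptions: the function `f`, the point `a`, and the direction `e` are hypothetical choices, not taken from the text, and the limit is approximated with a central difference rather than the one-sided quotient used in the proof.

```python
import math


def f(x):
    # Hypothetical test function f(x1, x2) = x1^2 * x2 + sin(x2).
    x1, x2 = x
    return x1 ** 2 * x2 + math.sin(x2)


def grad_f(x):
    # Gradient of the test function, computed by hand.
    x1, x2 = x
    return (2 * x1 * x2, x1 ** 2 + math.cos(x2))


def directional_derivative(func, a, e, t=1e-6):
    # Central-difference approximation of lim (f(a + t e) - f(a)) / t.
    forward = func(tuple(ai + t * ei for ai, ei in zip(a, e)))
    backward = func(tuple(ai - t * ei for ai, ei in zip(a, e)))
    return (forward - backward) / (2 * t)


a = (1.0, 2.0)
e = (3 / 5, 4 / 5)  # unit vector: (3/5)^2 + (4/5)^2 = 1

numeric = directional_derivative(f, a, e)
exact = sum(gk * ek for gk, ek in zip(grad_f(a), e))  # < grad f(a), e >

print(numeric, exact)
```

The two printed values should agree up to the error of the finite-difference approximation.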


Algebraic Properties of $f'$ where $f: {\bf R}^n \rightarrow {\bf R}$, $f(x)$ a number.

1. Sum Rule

$$( f + g )' (a) = f'(a) + g'(a)$$

2. Product Rule

$$( f \cdot g )' (a) = f'(a) g(a) + g'(a) f(a)$$

(Note that $f'(a), g'(a)$ are vectors, $f(a), g(a)$ are scalars.)

3. Quotient Rule

$$\left( \frac{f}{g} \right)' (a) = \frac{f'(a) g(a) - g'(a) f(a)}{g(a)^2}, \quad g(a) \neq 0.$$
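The product and quotient rules can be checked numerically in the same spirit. This is a sketch under assumptions: `f`, `g`, and the point `a` are hypothetical examples, and the gradients are compared with central-difference approximations.

```python
import math


def f(x):
    # Hypothetical example: f(x1, x2) = x1 * x2.
    return x[0] * x[1]


def g(x):
    # Hypothetical example: g(x1, x2) = sin(x1) + x2^2.
    return math.sin(x[0]) + x[1] ** 2


def grad_f(x):
    return (x[1], x[0])


def grad_g(x):
    return (math.cos(x[0]), 2 * x[1])


def numeric_grad(func, a, h=1e-6):
    # Central-difference approximation of each partial derivative.
    grad = []
    for k in range(len(a)):
        up = list(a)
        down = list(a)
        up[k] += h
        down[k] -= h
        grad.append((func(up) - func(down)) / (2 * h))
    return tuple(grad)


a = (0.5, 1.5)

# Product rule: vectors f'(a), g'(a) scaled by the scalars g(a), f(a).
product_rule = tuple(
    df * g(a) + dg * f(a) for df, dg in zip(grad_f(a), grad_g(a))
)
product_numeric = numeric_grad(lambda x: f(x) * g(x), a)

# Quotient rule, valid here because g(a) != 0.
quotient_rule = tuple(
    (df * g(a) - dg * f(a)) / g(a) ** 2
    for df, dg in zip(grad_f(a), grad_g(a))
)
quotient_numeric = numeric_grad(lambda x: f(x) / g(x), a)

print(product_rule, product_numeric)
print(quotient_rule, quotient_numeric)
```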

Chain Rule Let

$$(a) f: {\bf R}^n \rightarrow {\bf R}, {\rm \hspace{3pt} and \hspace{3pt}} g: {\bf R} \rightarrow {\bf R}, {\rm \hspace{3pt} or}$$

$$(b) f: {\bf R} \rightarrow {\bf R}^n, {\rm \hspace{3pt} and \hspace{3pt}} g: {\bf R}^n \rightarrow {\bf R}.$$

Then

$$( g \circ f )' (a) = g'( f(a) ) \cdot f'(a).$$



Example. Let

$$f: {\bf R} \rightarrow {\bf R}^2,$$

$$f(t) = ( t {\rm cos \hspace{3pt}} t, t^2 {\rm sin \hspace{3pt}} t )$$

$$g: {\bf R}^2 \rightarrow {\bf R},$$


$$g( x_1, x_2 ) = x_1^2 + x_2^2.$$

Further, let $t = \frac{\pi}{2}$, so that $( x_1, x_2 ) = f( \frac{\pi}{2} ) = ( 0, \frac{\pi^2}{2^2} ).$

Then

$$\frac{\partial g}{\partial x_1} = 2x_1 {\rm \hspace{3pt} and \hspace{3pt}} \frac{\partial g}{\partial x_2} = 2x_2$$

$$\rightarrow g'( f \left( \frac{\pi}{2} \right) ) = \left[ 0, \frac{\pi^2}{2} \right],$$

$$f'(t) = ( {\rm cos \hspace{3pt}} t - t {\rm sin \hspace{3pt}} t, 2t {\rm sin \hspace{3pt}}t + t^2 {\rm cos \hspace{3pt}} t ),$$

$$f' \left( \frac{\pi}{2} \right) = \left( -\frac{\pi}{2}, \pi \right) = \left[ \begin{array}{c} -\frac{\pi}{2} \\ \pi \end{array} \right].$$

Exercise. With Chain Rule:

$$\begin{array}{} \frac{d}{dt} ( g \circ f )\left( \frac{\pi}{2} \right) &= g' \left( f \left( \frac{\pi}{2} \right) \right) \cdot f' \left( \frac{\pi}{2} \right) \\ &= < g' \left( f \left( \frac{\pi}{2} \right) \right), f' \left( \frac{\pi}{2} \right) > \\ &= < \left[ 0, \frac{\pi^2}{2} \right], \left[ -\frac{\pi}{2} , \pi \right] > \\ &= 0 \cdot \left( -\frac{\pi}{2} \right) + \frac{\pi^2}{2} \cdot \pi \\ &= \frac{\pi^3}{2} \end{array}$$

(Verify)
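One way to carry out the verification is numerically. The sketch below uses the $f$ and $g$ of this example, computes the chain-rule product at $t = \frac{\pi}{2}$, and compares it with a central-difference derivative of $g \circ f$; the step size `h` is an arbitrary assumption. Both values should be close to $4 \left( \frac{\pi}{2} \right)^3 = \frac{\pi^3}{2}$, which is what differentiating $( g \circ f )(t) = t^2 {\rm cos}^2 t + t^4 {\rm sin}^2 t$ directly gives at this point.

```python
import math


def f(t):
    # f: R -> R^2 from the example.
    return (t * math.cos(t), t ** 2 * math.sin(t))


def f_prime(t):
    # Componentwise derivative of f.
    return (
        math.cos(t) - t * math.sin(t),
        2 * t * math.sin(t) + t ** 2 * math.cos(t),
    )


def g(x):
    # g: R^2 -> R from the example.
    return x[0] ** 2 + x[1] ** 2


def g_prime(x):
    # Partial derivatives of g.
    return (2 * x[0], 2 * x[1])


t0 = math.pi / 2

# Chain rule: inner product of g'(f(t0)) and f'(t0).
chain = sum(gk * fk for gk, fk in zip(g_prime(f(t0)), f_prime(t0)))

# Direct check: central difference of the composition g(f(t)).
h = 1e-6
numeric = (g(f(t0 + h)) - g(f(t0 - h))) / (2 * h)

print(chain, numeric, math.pi ** 3 / 2)
```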


Chain Rule

$$\frac{d}{dx} f( g(x) ) = f'( g(x) ) \cdot g'(x) {\rm \hspace{3pt} with \hspace{3pt} Calc 1}$$

$$\frac{d}{dx} ( f \circ g )(a) = f'(b) \cdot g'(a), b = g(a).$$

The derivative of the composition is the product of the derivatives (as numbers or matrices), and, equivalently, the composition of the derivatives (as linear maps). These are the same because composing linear maps corresponds to multiplying their matrices.

Let ${\rm dim \hspace{3pt}} = 1, f'(b) = 3, g'(a) = 5$ (each of these is not just a number, it's a linear map).

Then

$$\frac{d}{dx}( f \circ g )(a) = 3 \cdot 5 = 15.$$

Let $f'(b)(u) = 3u$, linear with respect to $u$, and $g'(a)(v) = 5v$, linear with respect to $v$. Then $f'(b)( g'(a)(v) ) = 3 \cdot 5v = 15v$.


Let ${\rm dim \hspace{3pt}} = n.$

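$( f \circ g )'(a) = f'(b) \circ g'(a)$ is a composition of linear functions, and its matrix
$A_f \cdot A_g$ is the product of their matrices, where $A_f$ and $A_g$ are the matrices of $f'(b)$ and $g'(a)$.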
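The correspondence between composition and matrix product can be illustrated with a small sketch. The matrices below are hypothetical stand-ins for $A_f$ and $A_g$ (a $1 \times 3$ and a $3 \times 2$ matrix), and `matmul` and `apply_matrix` are helper names introduced only for this illustration.

```python
def matmul(A, B):
    # Product of matrices stored as lists of rows.
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def apply_matrix(A, v):
    # Apply the linear map with matrix A to the vector v.
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]


# Hypothetical derivative matrices: f'(b) is 1x3, g'(a) is 3x2.
A_f = [[1.0, 2.0, 0.0]]
A_g = [[1.0, 0.0], [0.0, 3.0], [2.0, 1.0]]
v = [1.0, -1.0]

# Composition of the linear maps: first g'(a), then f'(b).
composed = apply_matrix(A_f, apply_matrix(A_g, v))

# Single linear map given by the product matrix A_f * A_g.
product = apply_matrix(matmul(A_f, A_g), v)

print(composed, product)

# In dimension 1 the matrices are 1x1 and the product is the usual one.
print(matmul([[3.0]], [[5.0]]))
```

Both ways of applying the maps to `v` give the same vector, and the $1 \times 1$ case reproduces $3 \cdot 5 = 15$.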

Fastest growth - what is the direction?


Maximize the directional derivatives: Find $e$ with $|| e || = 1$ such that

$$\nabla_e f(a) \geq \nabla_u f(a)$$

for all other $u, || u || = 1$.

We maximize by changing $e$:

$$\begin{array}{} \nabla_e f(a) &= < f'(a), e > \\ &= || f'(a) || \cdot || e || {\rm cos \hspace{3pt}} \varphi, {\rm \hspace{3pt} where \hspace{3pt}} \varphi {\rm \hspace{3pt} is \hspace{3pt} the \hspace{3pt} angle \hspace{3pt} between \hspace{3pt}} e {\rm \hspace{3pt} and \hspace{3pt}} f'(a), \\ &= || f'(a) || \cdot {\rm cos \hspace{3pt}} \varphi, \end{array}$$

where $f'(a)$ is fixed and ${\rm cos \hspace{3pt}} \varphi$ varies.

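We maximize ${\rm cos \hspace{3pt}} \varphi$: its largest value is $1 = {\rm cos \hspace{3pt}} 0$, attained when $\varphi = 0$, i.e. when $e$ points in the same direction as $f'(a)$. So, provided $f'(a) \neq 0$, we choose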

$$e = \frac{f'(a)}{|| f'(a) ||} = \frac{\nabla f(a)}{|| \nabla f(a) ||},$$

i.e. the direction is that of the gradient.

Following the gradient therefore increases $f$ as fast as possible at each point; it may lead us to a local maximum, where the gradient vanishes. Also,

$$\max_{||e||=1} \nabla_e f(a) = || \nabla f(a) ||.$$
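The maximization can be illustrated by brute force in ${\bf R}^2$. The sketch below is built on assumptions: the function $f(x_1, x_2) = x_1^2 + 3 x_1 x_2$ and the point `a` are hypothetical, unit vectors are sampled as $e = ( {\rm cos \hspace{3pt}} \theta, {\rm sin \hspace{3pt}} \theta )$ on a grid of 3600 angles, and the directional derivative is computed as $< \nabla f(a), e >$, which relies on $f$ being differentiable.

```python
import math


def grad_f(a):
    # Gradient of the hypothetical function f(x1, x2) = x1^2 + 3 * x1 * x2.
    x1, x2 = a
    return (2 * x1 + 3 * x2, 3 * x1)


a = (1.0, 2.0)
grad = grad_f(a)
norm = math.hypot(*grad)  # || grad f(a) ||

# Sample unit vectors e = (cos(theta), sin(theta)) and keep the one
# with the largest directional derivative < grad f(a), e >.
steps = 3600
best_value, best_angle = max(
    (grad[0] * math.cos(theta) + grad[1] * math.sin(theta), theta)
    for theta in (2 * math.pi * k / steps for k in range(steps))
)

gradient_angle = math.atan2(grad[1], grad[0])

print(best_value, norm)            # largest sampled value vs. the norm
print(best_angle, gradient_angle)  # best sampled direction vs. gradient direction
```

The largest sampled value never exceeds the norm of the gradient and comes close to it, and the best sampled angle lies within one grid step of the angle of the gradient.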