This site is devoted to mathematics and its applications. Created and run by Peter Saveliev.

Several variables

From Mathematics Is A Science

A ball is thrown...

Let's review what we learned in Chapter 7 about motion of a ball (or a cannonball).

When a ball is thrown in the air at an angle, it moves in both the vertical and horizontal directions, simultaneously and independently. The dynamics in the two directions are very different. In the horizontal direction, as there is no force changing the velocity, the latter remains constant:

Constant speed.png

Meanwhile, the vertical velocity is constantly changed by gravity. The dependence of the height on time is quadratic:

Constant acceleration.png

We have three variables:

  • $t$ is time;
  • $x$ is the horizontal dimension, the depth, that depends on time;
  • $y$ is the vertical dimension, the height, that also depends on time.

The path of the ball will appear to an observer -- from the right angle -- as a curve. It is placed in the $xy$-plane positioned vertically:

Tx ty xy planes.png

First, the sequences.

We used these difference quotients to find the velocity and then the acceleration from the location: $$\begin{array}{l|l|l} &\text{ horizontal }&\text{ vertical }\\ \hline \text{ position }&x_n &y_n\\ \text{ velocity }&v_n=\frac{x_{n+1}-x_n}{h} &u_n=\frac{y_{n+1}-y_n}{h}\\ \text{ acceleration }&a_n=\frac{v_{n+1}-v_n}{h} &b_n=\frac{u_{n+1}-u_n}{h}\\ \end{array}$$ where $h$ is the increment of time.

These formulas can now be solved in order to be able to model the location as a function of time. The result is these recursive formulas for the Riemann sums: $$\begin{array}{l|l|l} &\text{ horizontal }&\text{ vertical }\\ \hline \text{ acceleration }&a_n &b_n\\ \text{ velocity }&v_{n+1}=v_n+ha_n&u_{n+1}=u_n+hb_n\\ \text{ position }&x_{n+1}=x_n+hv_n&y_{n+1}=y_n+hu_n\\ \end{array}$$

Problem: From a $200$ feet elevation, a cannon is fired horizontally at $200$ feet per second. How far will the cannonball go?

Cannon is fired horizontally.png

The physics is as follows:

  • horizontal: there is no force, hence $a_n=0$ for all $n$;
  • vertical: the force is constant and $b_n=-g$ for all $n$;

where $g$ is the acceleration due to gravity: $$g=32 \text{ ft/sec}^2.$$

Next, we acquire the initial conditions:

  • the initial location is: $x_0=0$ and $y_0=200$;
  • the initial velocity is: $v_0=200$ and $u_0=0$.

Example (how far). To find when and where the ball hits the ground, we scroll down to find the row with $y$ close to $0$.

Cannonball horizontal spreadsheet.png

It happens sometime between $t=3.5$ and $t=3.6$ seconds, say $t_1=3.55$ seconds. Second, the value of $x$ at that time is between $x=700$ and $x=720$ feet, say, $x_1=710$ feet. We also plot the graphs of $x$ and $y$ as functions of $t$ on the right.

The spreadsheet is constructed for $x$ and $y$ separately, as follows. The time is in the first column progressing from $0$ every $.05$. The second derivative is in the next, $0$ and $-32$, respectively. In the next column, the initial velocity is entered in the top cell, $200$ and $0$ respectively. Below, the velocity is computed as a Riemann sum function of the previous column, with the same formula: $$\texttt{=R[-1]C+(RC[-2]-R[-1]C[-2])*R[-1]C[-1]}$$ In the next column, the initial location is entered in the top cell, $0$ and $200$ respectively. Below, the location is computed as a Riemann sum function of the previous column, with the same formula: $$\texttt{=R[-1]C+(RC[-3]-R[-1]C[-3])*RC[-1]}$$

The results are shown below:

Cannon and numerical integration.png

To find the solution to the problem from this data, we find the interval during which the cannonball hits the ground, i.e., $y=0$. We go down the $y$ column until we find the value closest to $0$; it is $y=1.2$. We then find the corresponding value of $x$; it is $x=700$. $\square$
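The spreadsheet computation can be sketched in a few lines of Python. This is only an illustration, not the original worksheet: the function name `simulate` and the number of steps are our own choices. It follows the cell formulas quoted above, in which the velocity is advanced first and the position is then advanced using the velocity from the same row.

```python
# A sketch of the spreadsheet recursion for the cannonball.
# The names (simulate, rows, ...) are illustrative, not from the worksheet.
h = 0.05   # time increment, as in the first column
g = 32.0   # ft/sec^2

def simulate(x0, y0, v0, u0, steps):
    """Return rows (t, x, y). Velocity is updated first; position is then
    updated from the new velocity, mirroring the two cell formulas."""
    rows = [(0.0, x0, y0)]
    x, y, v, u = x0, y0, v0, u0
    for n in range(1, steps + 1):
        v = v + h * 0.0      # horizontal acceleration a_n = 0
        u = u + h * (-g)     # vertical acceleration b_n = -g
        x = x + h * v
        y = y + h * u
        rows.append((n * h, x, y))
    return rows

rows = simulate(x0=0.0, y0=200.0, v0=200.0, u0=0.0, steps=80)

# The row whose height is closest to 0.
t_hit, x_hit, y_hit = min(rows, key=lambda r: abs(r[2]))
print(round(t_hit, 2), round(x_hit, 1), round(y_hit, 1))  # 3.5 700.0 1.2
```

A smaller increment `h` would bring the estimate closer to the exact landing point.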

Plotting $x$ against $y$ produces the path of the cannonball:

Cannonball horizontal path.png

Exercise. Under the same conditions, solve numerically the problem of hitting a target $500$ feet away.

We start with the continuous case now:

  • horizontal: $x' '=0$;
  • vertical: $y' '=-g$.

We start at the same place as above: $$\begin{cases} x' '&=0,&x'(0)=200, &x(0)=0;\\ y' '&=-g,&y'(0)=0, &y(0)=200. \end{cases}$$

Since the velocity is an antiderivative of the acceleration, we integrate these. Then for horizontal, we have: $$x'=\int 0\, dt=C_x,$$ where $C_x$ is any constant. Next, for the vertical, $$y'=\int -g\, dt=-gt+C_y,$$ where $C_y$ is any constant.

Since the location is an antiderivative of the velocity, we integrate these. Then for horizontal, we have: $$x=\int x'\, dt=\int C_x\, dt=C_xt+K_x,$$ where $K_x$ is any constant. Next, for the vertical, $$y=\int y'\, dt=\int (-gt+C_y)\, dt=-\tfrac{1}{2}gt^2+C_yt+K_y,$$ where $K_y$ is any constant.

Thus, the general solution of this system of differential equations is: $$\begin{cases} x&=&&&C_xt&+&K_x,\\ y&=&-\tfrac{1}{2}gt^2&+&C_yt&+&K_y. \end{cases}$$ Any possible dynamics is found by specifying the values of the four constants: $$C_x,\ C_y,\ K_x,\ K_y.$$

The physics of the situation allows us to assign meanings to these four constants. First, $$\begin{cases} x'&=&&&C_x& \Longrightarrow & x'(0)=C_x,\\ y'&=&-gt&+&C_y& \Longrightarrow & y'(0)=C_y. \end{cases}$$ Therefore,

  • $C_x$ is the (constant) horizontal component of velocity;
  • $C_y$ is the initial vertical component of velocity.

Next, $$\begin{cases} x&=&&&C_xt&+&K_x& \Longrightarrow & x(0)=K_x,\\ y&=&-\tfrac{1}{2}gt^2&+&C_yt&+&K_y& \Longrightarrow & y(0)=K_y. \end{cases}$$ Therefore,

  • $K_x$ is the initial horizontal location (depth);
  • $K_y$ is the initial vertical location (height).

Thus, we have: $$\begin{cases} \text{depth }&=&\text{ initial depth }&+&\text{ initial horizontal velocity }&\cdot\text{ time },\\ \text{height}&=&\text{ initial height }&+&\text{ initial vertical velocity }&\cdot\text{ time }&-\tfrac{1}{2}g\cdot\text{ time }^2. \end{cases}$$

We used these two equations to solve a variety of problems about motion.

Example (how far). From a $200$ feet elevation, a cannon is fired horizontally at $200$ feet per second. How far will the cannonball go?

Cannon is fired horizontally.png

The initial conditions:

  • the initial location is: $0$ and $200$;
  • the initial velocity is: $200$ and $0$.

Then our equations become: $$\begin{cases} x&=&200t,\\ y&=200&&-16t^2. \end{cases}$$

Previously we solved the problem algebraically as follows. The height at the end of the flight is $y_1=0$, so to find the time, we set $y=200-16t^2=0$ and solve for $t$: $$t_1=\sqrt{\frac{200}{16}}\approx 3.54.$$ We substitute this value of $t$ into $x$ to find the corresponding depth: $$x_1=200t_1=200\frac{5\sqrt{2}}{2}\approx 707.$$ $\square$
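The same computation can be checked with a short Python sketch. The function name `position` and its argument names are ours; it simply evaluates the two equations of motion (depth and height) for given initial conditions.

```python
import math

g = 32.0  # ft/sec^2

def position(t, x0, y0, vx0, vy0):
    """Depth and height at time t from the two equations of motion."""
    x = x0 + vx0 * t
    y = y0 + vy0 * t - 0.5 * g * t**2
    return x, y

# Cannon: starts at (0, 200) with velocity (200, 0).
t1 = math.sqrt(200 / 16)           # solves 200 - 16 t^2 = 0
x1, y1 = position(t1, 0.0, 200.0, 200.0, 0.0)
print(round(t1, 2), round(x1, 1))  # 3.54 707.1
```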

What about the velocity as a function of time? We have: $$\begin{cases} \frac{dx}{dt}&=v_x,\\ \frac{dy}{dt}&=v_y&-gt, \end{cases}$$ where $v_x=C_x$ and $v_y=C_y$ are the initial horizontal and vertical velocities. Adding these two equations to the former two allows us to solve more profound problems.

Example (impact). In the setting of the last example, how hard does the ball hit the ground?

First, we examine the spreadsheet. Instead of the formulas, we compute the average velocities (i.e., the difference quotients) to approximate the velocities. The formula for $x'$ is: $$\texttt{=(RC[-2]-R[-1]C[-2])/(RC[-3]-R[-1]C[-3]),}$$ and the formula for $y'$ is: $$\texttt{=(RC[-2]-R[-1]C[-2])/(RC[-4]-R[-1]C[-4]).}$$ The denominators refer to the column that contains the time and the numerators refer to the columns that contain $x$ and $y$ respectively.

Cannonball horizontal spreadsheet -- velocities.png

Looking at the same row as before, we see that the vertical velocity at the moment of impact is between $-110.4$ and $-113.6$ feet per second.

Now, the algebra. The formulas for the velocities take this form: $$\begin{cases} \frac{dx}{dt}&=200,\\ \frac{dy}{dt}&=&-32t. \end{cases}$$ Let's find the velocity at the time of contact. We substitute the time we've found, $$t_1=\frac{5\sqrt{2}}{2},$$ into the formulas for velocity: $$\begin{cases} \frac{dx}{dt}\Big|_{t=t_1}&=200,\\ \frac{dy}{dt}\Big|_{t=t_1}&=&-32t_1=-32\frac{5\sqrt{2}}{2}=-80\sqrt{2}\approx -113. \end{cases}$$ The answer matches our estimate.

But which one of the two numbers represents how fast the ball hits the ground? It is the latter if the ball hits the (horizontal) surface and it is the former if this is a wall. Then, the general answer should be a combination of the two. This is how they should be combined via the Pythagorean Theorem:

Velocity of impact.png

Then, the impact is determined by this number: $$\sqrt{200^2+(-80\sqrt{2})^2}=\sqrt{52800}\approx 230.$$ $\square$
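As a numerical check of the impact computation, here is a minimal Python sketch. The variable names are ours; the only inputs are the initial conditions of the example and $g=32$.

```python
import math

g = 32.0                   # ft/sec^2
t1 = math.sqrt(200 / 16)   # time of impact, from 200 - 16 t^2 = 0

vx = 200.0                 # dx/dt is constant
vy = -g * t1               # dy/dt = -32 t evaluated at t = t1
speed = math.hypot(vx, vy) # Pythagorean combination of the two components

print(round(vy, 1), round(speed, 1))  # -113.1 229.8
```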

Introduction to parametric curves

Example (ball). This is how we understand the trajectory of a thrown ball as a parametric curve. There are two observers:

  • one is behind the throw and can see only the rise and fall of the ball and
  • the other is on the ground under the path and can only see the forward progress of the ball.
Basketball is thrown 2.png

If the two record where the ball was and at what time, they can use the time stamps to match the two coordinates and then plot this point on the $xy$-plane. These points will form the ball's trajectory, what a third, outside, observer would see. $\square$

Two functions that have nothing to do with each other except the inputs can be matched...

Definition. A parametric curve is a combination of two functions of the same variable: $$\begin{cases} x=f(t),\\ y=g(t). \end{cases}$$

Example (plotter). A curve may be plotted on a piece of paper by a computer. A pen is attached to a runner on a vertical bar, while that bar slides along a horizontal rail at the bottom edge of the paper:

Plotter.png

The computer commands the next location of both: for each moment of time $t$, the horizontal location of the bar (and the pen) is given by $x=f(t)$ and the vertical location of the pen is given by $y=g(t)$. $\square$

However, this view of parametric curves is most useful within the framework of multidimensional spaces and vectors. This theory is developed starting in this chapter. Of course, the motion metaphor -- $x$ and $y$ are coordinates in the space -- will be superseded. In contrast to this approach, we look at the two quantities and two functions that might have nothing to do with each other (except for $t$ of course). As a result, we don't have to worry about the coordinate plane having the same units for the axes in order to keep the curve proportional. Initially, there is no curve!

Example (commodities trader). Suppose a commodities trader follows the market. What he sees is the following:

  • $t$ is time,
  • $x$ is the price of wheat (say, in dollars per bushel), and
  • $y$ is the price of sugar (say, in dollars per ton).

We simply have two functions we -- initially -- look at separately.

First, let's imagine that the price of wheat is decreasing: $$x\ \searrow\ .$$ To make this specific, we can choose: $$x=f(t)=\frac{1}{t+1}.$$ We then can study this function within the confines of calculus of single variable. To see some actual data, we evaluate $x$ for several values of $t$: $$\begin{array}{l|ll} t&x\\ \hline 0&1.00\\ 1&.50\\ 2&.33 \end{array}$$ With more points acquired in a spreadsheet we can plot the graph on the $tx$-plane:

Price of wheat.png

At this point, we could, as we have in the past, proceed to use calculus to study the derivatives, the slopes, the extreme points, etc. of this function...

Second, suppose that the price of sugar is increasing and then decreasing: $$y\ \nearrow\ \searrow\ .$$ To make this specific, we can choose (an upside-down parabola): $$y=g(t)=-(t-1)^2+2.$$ We then again evaluate $y$ for several values of $t$: $$\begin{array}{l|ll} t&y\\ \hline 0&1.00\\ 1&2.00\\ 2&1.00 \end{array}$$ With more points acquired in a spreadsheet we plot the graph on the $ty$-plane:

Price of sugar.png

We are interested in finding hidden relations between these two commodities... Before we develop calculus of parametric curves -- to study the slopes of the curves, the tangents, the turning points, etc. -- we would like to simply visualize them. How do we combine the two plots?

As the two plots are made of (initially) disconnected points -- $(t,x)$ and $(t,y)$ -- so is the new plot. What are those points? The points are $(x,y)$s with $x$ and $y$ appropriately paired up. A value of $x$ is paired up with a value of $y$ when they appear along the same $t$ in both plots: $$\begin{array}{l|l|l} t&x&y\\ \hline 0&1.00&1.00\\ 1&.50&2.00\\ 2&.33&1.00 \end{array}$$ This is what happens to each pair:

Combine tx and ty into xy.png

As a result, since the independent variable is the same for both functions, only the dependent variables appear. Instead of plotting all points $(t,x,y)$, which belong to the $3$-dimensional space, we just plot $(x,y)$ on the $xy$-plane -- for each $t$.

Price of wheat and sugar.png
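The pairing of the two plots can be sketched in Python, using the two formulas chosen above for the prices of wheat and sugar. The list name `path` is ours; the values of $x$ are rounded to two decimals only to match the tables.

```python
def f(t):
    """Price of wheat."""
    return 1 / (t + 1)

def g(t):
    """Price of sugar."""
    return -(t - 1) ** 2 + 2

ts = [0, 1, 2]

# For each t, pair the value of x with the value of y; t itself is dropped.
path = [(round(f(t), 2), g(t)) for t in ts]
print(path)  # [(1.0, 1), (0.5, 2), (0.33, 1)]
```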

The direction matters! Since $t$ is missing, we have to make sure we know in which direction we are moving and indicate that with an arrow. Ideally, we also label the points in order to indicate not only “where” but also “when”.

Price of wheat and sugar labels.png

Thus, this is motion, just as before, but through what space? An abstract space of prices that we've made up. The space is comprised of all possible combinations of prices, i.e., a point $(x,y)$ stands for a combination of two prices: $x$ for wheat and $y$ for sugar.

How much information about the dynamics of the two prices contained in the original functions can we recover from the new graph? A lot. We can shrink the graph vertically to de-emphasize the change of $y$ and to reveal the qualitative behavior of $x$, and vice versa:

Price of wheat and sugar shrunk.png

We see the decrease of $x$ and then the increase followed by the decrease of $y$. In addition, the density of the points indicates the speed of the motion. $\square$

Thus the monotonicity of $x$ and the monotonicity of $y$ determine the direction of the parametric curve. We summarize this observation below: $$\begin{array}{l|lll} y\backslash x & \nearrow & \searrow\\ \hline \nearrow & \nearrow & \nwarrow\\ \searrow & \searrow & \swarrow\\ \end{array}$$

Example (abstract). We can do this in a fully abstract setting. When two functions, $f,g$, are represented by their respective lists of values (instead of formulas), they are easily combined into a parametric curve, $F$. We just need to eliminate the repeated column of inputs. Suppose we need to combine these two functions: $$\begin{array}{c|cc} t&x=f(t)\\ \hline 0&1\\ 1&2\\ 2&3\\ 3&0\\ 4&1 \end{array}\quad \& \quad \begin{array}{c|cc} t&y=g(t)\\ \hline 0&5\\ 1&-1\\ 2&2\\ 3&3\\ 4&0 \end{array}\quad=\quad?$$ We repeat the inputs column -- only once -- and then repeat the outputs of either function. First row: $$f:0\mapsto 1\quad \&\quad g:0\mapsto 5\quad \Longrightarrow\ F:0\mapsto (1,5).$$ Second row: $$f:1\mapsto 2\quad \&\quad g:1\mapsto -1\quad \Longrightarrow\ F:1\mapsto (2,-1).$$ And so on. This is the whole solution: $$ \begin{array}{c|cc} t&x=f(t)\\ \hline 0&1\\ 1&2\\ 2&3\\ 3&0\\ 4&1 \end{array}\quad \& \quad \begin{array}{c|cc} t&y=g(t)\\ \hline 0&5\\ 1&-1\\ 2&2\\ 3&3\\ 4&0 \end{array}\quad \Longrightarrow\quad \begin{array}{c|rlcr} t&P=&(f(t)&,&g(t))\\ \hline 0&&(1&,&5)\\ 1&&(2&,&-1)\\ 2&&(3&,&2)\\ 3&&(0&,&3)\\ 4&&(1&,&0) \end{array}$$ As you can see, there are no algebraic operations carried out and there is no new data, just the old data arranged in a new way. However, it is becoming clear that the list is also a function of some kind... $\square$
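When the two functions are given as lists, the combination amounts to zipping the lists together. A minimal Python sketch with the data of the example (the dictionary name `F` follows the text; the other names are ours):

```python
ts = [0, 1, 2, 3, 4]
xs = [1, 2, 3, 0, 1]    # values of x = f(t)
ys = [5, -1, 2, 3, 0]   # values of y = g(t)

# The inputs column appears once; each output is the pair (f(t), g(t)).
F = {t: (x, y) for t, x, y in zip(ts, xs, ys)}

print(F[0], F[1])  # (1, 5) (2, -1)
```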

Example (spreadsheet). This is a summary of how the parametric curve is formed from two functions provided with a spreadsheet. The three columns -- $t$, $x$, and $y$ -- are copied and then the last two are used to create a chart:

Parametric curve with spreadsheet.png

This chart is the path -- not the graph -- of the parametric curve. Note also that the curve isn't the graph of any function of one variable as the Vertical Line Test is violated. $\square$

Example (pattern). Plotting a parametric curve may reveal a relation between two quantities:

Price of wheat and sugar linear.png

$\square$

Parametric curves are functions!

This idea comes with certain obligations (Chapter 2). First, we have to name it, say $F$. Second, as we combine two functions, we use the following notation for this operation: $$F=(f,g):\ \begin{cases} x=f(t),\\ y=g(t). \end{cases}$$

Next, what is the independent variable? It is $t$. After all, this is the input of both of the functions involved. What is the dependent variable? It is the “combination” of the outputs of the two functions, i.e., $x$ and $y$. We know how to combine these; we form a pair, $P=(x,y)$. This $P$ is a point on the $xy$-plane!

To summarize, we do what we have done many times before (addition, multiplication, etc.) -- we create a new function from two old functions. We represent a function $f$ diagrammatically as a black box that processes the input and produces the output: $$\newcommand{\ra}[1]{\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} % \begin{array}{ccc} \text{input} & & \text{function} & & \text{output} \\ t & \mapsto & \begin{array}{|c|}\hline\quad f \quad \\ \hline\end{array} & \mapsto & x \end{array}$$ Now, what if we have another function $g$: $$\newcommand{\ra}[1]{\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} % \begin{array}{ccc} \text{input} & & \text{function} & & \text{output} \\ t & \mapsto & \begin{array}{|c|}\hline\quad g \quad \\ \hline\end{array} & \mapsto & y \end{array}$$ How do we represent $F=(f,g)$? To represent it as a single function, we need to “wire” their diagrams together side by side: $$\newcommand{\ra}[1]{\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} % \begin{array}{ccc} t & \mapsto & \begin{array}{|c|}\hline\quad f \quad \\ \hline\end{array} & \mapsto & x& \\ ||&&&&\updownarrow\\ t & \mapsto & \begin{array}{|c|}\hline\quad g \quad \\ \hline\end{array} & \mapsto & y \end{array}$$ It is possible because the input of $f$ is the same as the input of $g$. For the outputs, we can combine them even when they are of different nature. 
Then we have a diagram of a new function: $$\begin{array}{ccc} (f,g):& t & \mapsto & \begin{array}{|c|}\hline &t& \begin{array}{lllll} \nearrow &t &\mapsto &\begin{array}{|c|}\hline\quad f \quad \\ \hline\end{array} & \mapsto & x & \searrow\\ \\ \searrow &t &\mapsto &\begin{array}{|c|}\hline\quad g \quad \\ \hline\end{array} & \mapsto & y & \nearrow\\ \end{array}& (x,y)\\ \hline\end{array} & \mapsto & P&. \end{array}$$ We see how the input variable $t$ is copied into the two functions, processed by them in parallel, and finally the two outputs are combined together to produce a single output. The result can again be seen as a black box: $$\begin{array}{ccc} & t & \mapsto & \begin{array}{|c|}\hline \quad F \quad \\ \hline\end{array} & \mapsto & P&. \end{array}$$ The difference from all the functions we have seen until now is the nature of the output.

Next, what is the domain of $F=(f,g)$? It is supposed to be a recording of all possible inputs, i.e., all the $t$'s for which the output $P=(f(t),g(t))$ of the function makes sense. For this point to make sense, both of its coordinates have to make sense. Then, we can choose the domain of $F$ to be the intersection of the domains of $f$ and $g$.

Example (domain). What is the domain of this parametric curve below? $$F=(f,g):\ \begin{cases} x=\sqrt{t},\\ y=\frac{1}{t-1}. \end{cases}$$ The domain of $f$ is $t\ge 0$ and the domain of $g$ is $t\ne 1$. Therefore, the domain of $F$ is $$[0,+\infty) \cap \bigg( (-\infty,1)\cup (1,+\infty)\bigg) =[0,1)\cup (1,+\infty).$$ $\square$
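The intersection of the two domains can be illustrated with a small Python sketch. Returning `None` for inputs outside the domain is our own convention for this illustration.

```python
import math

def F(t):
    """Return (sqrt(t), 1/(t-1)), or None when t is outside the domain.

    The first coordinate requires t >= 0, the second requires t != 1;
    both conditions must hold for the point to make sense.
    """
    if t < 0 or t == 1:
        return None
    return (math.sqrt(t), 1 / (t - 1))

print(F(4))    # (2.0, 0.3333333333333333)
print(F(1))    # None
print(F(-2))   # None
```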

What about the image (the range of values) of $F=(f,g)$? It is supposed to be a recording of all possible outputs of $F$. The terminology used is often different though.

Definition. The path of a parametric curve $x=f(t),\ y=g(t)$ is the set of all such points $P=(f(t),g(t))$ on the $xy$-plane.

The path is typically a curve. We plot several of them below.

Example (path). In general, the two processes, $x=x(t)$ and $y=y(t)$, are independent. When we combine them to see the path of the object by plotting $(x,y)$ for each $t$, the result may be unexpected:

Path from x and y -- spreadsheet.png

$\square$

What about the graph of $F=(f,g)$? As we know from Chapter 2, the graph of a function is supposed to be a recording of all possible combinations of inputs and outputs of $F$. What if the outputs are $2$-dimensional?

Definition. The graph of a parametric curve $x=f(t),\ y=g(t)$ is the set of all such points $(t,x,y)=(t,f(t),g(t))$ in the $txy$-space.

The graph is built from:

  • the graph of $x=f(t)$ on the $tx$-plane (the floor), and
  • the graph of $y=g(t)$ on the $ty$-plane (the wall facing us).

It is a curve in space, akin to a piece of wire:

Tx ty planes and txy space.png

Then the shadow of this wire on the floor is the graph $x=f(t)$ (light from above). If the light is behind us, the shadow on the wall in front is the graph $y=g(t)$. In addition, pointing a flashlight from right to left will produce the path of the parametric curve on the $xy$-plane.

For parametric curves, we plot paths instead of graphs. The simple reason is that we can't plot in 3D by hand, for now. The drawback is a loss of information: the plot of the path tells us where but not when (unless we label the locations).

So, we have moved from a pair of quantities represented by functions to form a parametric curve and then to its path as a way to visualize the relation between the two variables. In reverse, a curve -- such as a road -- that needs to be studied will benefit from being represented as the path of a parametric curve.

Example (circle). Let's take the circle! All we need to know comes from trigonometry; this was the definition of $\sin$ and $\cos$ in terms of the angle with the $x$-axis (Chapter 4): $$x=\cos t,\ y=\sin t.$$ So, the $x$-coordinate of the point on the unit circle at angle $t$ is $\cos t$ and its $y$-coordinate is $\sin t$:

Circle parametrical.png

Then, the path of the following parametric curve is the circle of radius $R$: $$x=R\cos t,\ y=R\sin t.$$ Of course, we recognize the trig substitution we utilized in Chapter 12 to compute the area and the length of the circle; this wasn't a coincidence!

So, we either move at a constant horizontal velocity -- $y=\sqrt{R^2-x^2}$ -- as before or at a constant angular velocity now:

Circle graph and parametrized.png

The domain of the parametric curve that represents the upper half of the circle is $[0,\pi]$. Without such a restriction, the point continues to make full circles around the origin... We can also move along the circle in the backward direction: $$x=\cos (-t),\ y=\sin (-t),$$ or go twice as fast: $$x=\cos 2t,\ y=\sin 2t,$$ and so on. $\square$
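These parametrizations can be tried out in Python. The function `circle` and its parameter `speed` are our own names: `speed=-1` gives the backward direction and `speed=2` the doubled speed. The loop checks that the points satisfy $x^2+y^2=R^2$.

```python
import math

def circle(t, R=1.0, speed=1.0):
    """Point of the parametric curve x = R cos(speed*t), y = R sin(speed*t)."""
    return (R * math.cos(speed * t), R * math.sin(speed * t))

# Every point lies on the circle of radius R, whatever the speed or direction.
for k in range(8):
    t = k * math.pi / 4
    for s in (1.0, -1.0, 2.0):
        x, y = circle(t, R=3.0, speed=s)
        assert math.isclose(x * x + y * y, 9.0)

start = circle(0.0)  # all three versions start at (1, 0) on the unit circle
```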

Definition. We say that we parametrize a curve $C$ if we find a parametric curve $x=f(t),\ y=g(t)$, called its parametrization, the path of which is this curve $C$.

We go in the direction opposite of the one we went when we created a parametric curve from two numerical functions:

Parametrizing a curve.png

The idea is to think of $C$ as if it is a road being driven over by someone who is recording the coordinates of his locations over time. Thus, we get $x$ and $y$ from our GPS and $t$ from our clock. Of course, this can be done in a number of ways: same road, different drivers (speed, direction, etc.). Therefore, there are infinitely many different parametric curves with the same path...

Exercise. Parametrize the unit circle in such a way that it starts at $(0,1)$.

Exercise. Parametrize the unit circle in such a way that it starts at $(1,0)$ but never reaches $(0,1)$ even as $t\to +\infty$.

This is the summary of the terminology: $$\begin{array}{l|ll} \text{types of functions:}&\text{general functions}&\text{numerical functions}&\text{parametric curves}&\text{motion}\\ \hline \text{the set of all outputs:}&\text{image}&\text{range}&\text{path}&\text{trajectory}\\ \end{array}$$

Introduction to functions of several variables

We saw some functions of several variables in Chapter 10. Let's review some main ideas.

Example (baker). We will take a look at the example in the last section from a different angle. The time $t$ is not a part of our consideration anymore but we retain the two variables representing the two commodities:

  • $x$ is the price of wheat,
  • $y$ is the price of sugar.

We also add a product to the setup:

  • $z$ is the price of a loaf of bread.

What is the relation between these three? As those two are the two major ingredients in bread, we will assume that

  • $z$ depends on $x$ and $y$.

One can imagine a baker who every morning, upon receiving the updated prices of wheat and sugar, uses a table that he made up in advance to decide on the price of his bread for the rest of the day. Let's see how such a table might come about.

What kind of dependencies are these? Increasing the prices of the ingredients increases the cost and ultimately the price of the product: $$\begin{array}{lll} x\nearrow\ \Longrightarrow\ z\nearrow;\\ y\nearrow\ \Longrightarrow\ z\nearrow. \end{array}$$ At its simplest, such an increase is linear. In addition to some fixed costs,

  • each increase of $x$ leads to a proportional increase of $z$, and
  • each increase of $y$ leads to a proportional increase of $z$,

independently! A simple formula that captures this dependence may be this: $$z=p(x,y)=2x+y+1.$$ In order to visualize this function, we compute a few of its values:

  • $p(0,0)=1$,
  • $p(0,1)=2$,
  • $p(0,2)=3$,
  • $p(1,0)=3$,
  • $p(1,1)=4$,
  • etc.

Even though this is a list, we realize that the input variables don't fit into a list comfortably... they form a table! $$\begin{array}{ccc} (0,0)&(1,0)&(2,0)&...\\ (0,1)&(1,1)&(2,1)&...\\ (0,2)&(1,2)&(2,2)&...\\ ...&...&...&... \end{array}$$ In fact, we can align these pairs with $x$ in each column and $y$ in each row: $$\begin{array}{l|ccc} y\backslash x&0&1&2&...\\ \hline 0&(0,0)&(1,0)&(2,0)&...\\ 1&(0,1)&(1,1)&(2,1)&...\\ 2&(0,2)&(1,2)&(2,2)&...\\ ...&...&...&...&... \end{array}$$ Now, the values, $z=p(x,y)$: $$\begin{array}{l|ccc} y\backslash x&0&1&2&...\\ \hline 0&1&3&5&...\\ 1&2&4&6&...\\ 2&3&5&7&...\\ ...&...&...&...&... \end{array}$$ That's what the baker's table might look like...

Let's bring these two together: $$\begin{array}{l|ccc} y\backslash x&0&&&1&&&2&&&...\\ \hline 0&(0,0)&&&(1,0)&&&(2,0)&&&...\\ &&\searrow&&&\searrow&&&\searrow&&...\\ &&&1&&&3&&&5&...\\ 1&(0,1)&&&(1,1)&&&(2,1)&&&...\\ &&\searrow&&&\searrow&&&\searrow&&...\\ &&&2&&&4&&&6&...\\ 2&(0,2)&&&(1,2)&&&(2,2)&&&...\\ &&\searrow&&&\searrow&&&\searrow&&...\\ &&&3&&&5&&&7&...\\ ...&...&&&...&&&...&&&... \end{array}$$

In the past, we have visualized numerical functions by putting bars on top of the $x$-axis. Now, we visualize the values by building columns with appropriate heights on top of the $xy$-plane:

Function of 2 variables with bars.png

Notice that by fixing one of the variables -- $x=0,1,2$ or $y=0,1,2$ -- we create a function of one variable with respect to the other variable. We fix $x$ below and extract the columns from the table: $$x=0:\ \begin{array}{|l|c|} y&z\\ \hline 0&1\\ 1&2\\ 2&3\\ \end{array},\quad x=1:\ \begin{array}{|l|c|} y&z\\ \hline 0&3\\ 1&4\\ 2&5\\ \end{array},\quad x=2:\ \begin{array}{|l|c|} y&z\\ \hline 0&5\\ 1&6\\ 2&7\\ \end{array}.$$ A pattern is clear: growth by $1$. We next fix $y$ and extract the rows from the table: $$y=0:\ \begin{array}{r|cccc} \hline x&0&1&2\\ \hline z&1&3&5\\ \hline \end{array},\quad y=1:\ \begin{array}{r|cccc} \hline x&0&1&2\\ \hline z&2&4&6\\ \hline \end{array},\quad y=2:\ \begin{array}{r|cccc} \hline x&0&1&2\\ \hline z&3&5&7\\ \hline \end{array}.$$ A pattern is clear: growth by $2$. We have the total of six (linear) functions!
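The table and its slices can be reproduced with a short Python sketch. The names `table`, `column_x1`, and `row_y2` are ours; the layout follows the text, with $x$ running along the columns and $y$ along the rows.

```python
def p(x, y):
    """The baker's price of bread."""
    return 2 * x + y + 1

xs = [0, 1, 2]
ys = [0, 1, 2]

# Rows correspond to y, columns correspond to x.
table = [[p(x, y) for x in xs] for y in ys]
for row in table:
    print(row)
# [1, 3, 5]
# [2, 4, 6]
# [3, 5, 7]

column_x1 = [row[1] for row in table]  # fix x = 1: z grows by 1 with y
row_y2 = table[2]                      # fix y = 2: z grows by 2 with x
```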

Let's do the same with a spreadsheet. This is the data:

Function of two variables in spreadsheet -- data.png

The value in each cell is computed from the corresponding value of $x$ (all the way up) and from the corresponding value of $y$ (all the way left). This is the formula: $$\texttt{=2*R3C+RC2+1}.$$ The simplest way to visualize the function is by coloring each cell depending on its value (common in cartography: elevation, temperature, humidity, precipitation, population density, etc.):

Function of two variables -- heat map.png

The growth is visible: it grows the most in some diagonal direction but it's not $45$ degrees...

We can also visualize with a bar chart, just as before:

Function of two variables -- bars.png

If we used bars to represent the Riemann sums to compute the area, here we are after the volume...

The most common way, however, to visualize a function of two variables in mathematics is with its graph, which, in this case, is a surface:

Function of two variables -- surface.png

In this particular case, this is a plane. The second graph is the same surface but displayed as a wire-frame (or even a wire-fence). The wires are the graphs of those linear functions of one variable created from our function when we fix one variable at a time. Each of these wires comes from choosing either:

  • the row of $x$'s (top) and one other row in the table, or
  • the column of $y$'s (leftmost) and one other column in the table.

$\square$

Exercise. Provide a similar analysis for (a) the wind-chill and (b) the heat index.

Example (non-linear). Below we plot $q(x,y)=\sin(xy)$.

Sin(xy).png

$\square$

The functions of one variable created from our function $z=p(x,y)$ when we fix one variable at a time are: $$\begin{array}{ll} y=b& \longrightarrow &f_b(x)=p(x,b);\\ x=a& \longrightarrow &g_a(y)=p(a,y). \end{array}$$ There are infinitely many of them. Their graphs are the slices -- along the axes -- of the surface that is the graph of $p$.

Function of two variables.png

Therefore, the monotonicity of these functions tells us about the monotonicity of $p$ -- in the directions of the axes!
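Fixing one variable at a time is easy to express in code. In this Python sketch, the helper names `slice_y` and `slice_x` are ours; each takes a function of two variables and returns a function of one variable, here applied to $q(x,y)=\sin(xy)$ from the example above.

```python
import math

def q(x, y):
    return math.sin(x * y)

def slice_y(p, b):
    """Fix y = b: return the function f_b(x) = p(x, b)."""
    return lambda x: p(x, b)

def slice_x(p, a):
    """Fix x = a: return the function g_a(y) = p(a, y)."""
    return lambda y: p(a, y)

f_b = slice_y(q, 2.0)   # x -> sin(2x)
g_a = slice_x(q, 0.0)   # y -> sin(0) = 0, a constant function

print(f_b(0.5), g_a(7.0))
```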

Functions of two variables are functions...

This idea comes with certain questions to be answered. What is the input, the independent variable? Taking a clue from our analysis of parametric curves, we answer: it is the “combination” of the two inputs of the function, i.e., $x$ and $y$ that form a pair, $X=(x,y)$, which is a point on the $xy$-plane. What is the output, the dependent variable? It is $z$.

We represent a function $p$ diagrammatically as a black box that processes the input and produces the output: $$\newcommand{\ra}[1]{\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} % \begin{array}{ccc} \text{inputs} & & \text{function} & & \text{output} \\ x\\ &\searrow\\ & & \begin{array}{|c|}\hline\quad p \quad \\ \hline\end{array} & \mapsto & z&.\\ &\nearrow\\ y \end{array}$$ Instead, we would like to see a single input variable, $(x,y)$, decomposed into two, $x$ and $y$, to be processed by the function at the same time: $$\begin{array}{ccc} & (x,y) & \mapsto & \begin{array}{|c|}\hline \quad p \quad \\ \hline\end{array} & \mapsto & z&. \end{array}$$ The difference from all the functions we have seen until now is the nature of the input.

Next, what is the domain of $p$? It is supposed to be a recording of all possible inputs, i.e., all pairs $(x,y)$ for which the output $z=p(x,y)$ of the function makes sense. This requirement creates a subset of the $xy$-plane and, therefore, a relation between $x$ and $y$.

Example (domain). What is the domain of this function: $$p=\sqrt{x+y}?$$ Only $(x,y)$'s are allowed that satisfy $x+y\ge 0$. Therefore, the domain of $p$ is a half-plane:

Half-plane.png

$\square$

What about the image, i.e., the range of values of $p$? It is a recording of all possible outputs of $p$.

Definition. The image of a function of two variables $z=p(x,y)$ is the set of all such values $z$ on the $z$-axis.

What about the graph of $p$? It is supposed to be a recording of all possible combinations of inputs and outputs of $p$.

Definition. The graph of a function of two variables $z=p(x,y)$ is the set of all such points $\big( x,y,p(x,y) \big)$ in the $xyz$-space.

Introduction to calculus of several variables

When a parametric curve is formed from two functions of one variable: $$x=f(t),\ y=g(t),$$ are the derivatives of $f$ and $g$ visible in the shape (and the slope) of the path? Vice versa, can the derivatives be deduced from the slopes of the points of the path?

The slopes of the graphs of $f$ and $g$ produce the slope of the parametric curve according to a simple rule which is easy to discover from the case when both functions are linear:

Combine tx and ty into xy -- slopes.png

In other words, if $m$ and $n$ are the slopes of $f$ and $g$ respectively, then the slope of $(f,g)$ is $\frac{n}{m}$ (provided $m\ne 0$). Indeed, $$\frac{\text{change of }y}{\text{change of }x}=\frac{\text{change of }y / \text{change of }t}{\text{change of }x / \text{change of }t}.$$

When the functions are non-linear, the rule is the same but it is applied one point at a time: $$\text{slope at } (a,b)=\big(f(t),g(t)\big)\ \text{ is }\ \frac{g'(t)}{f'(t)}.$$ To see why, it suffices to zoom in on a point of the parametric curve as well as the corresponding points of the graphs of the two functions:

Slope of parametric curve = ratio of derivative of f and g.png

In other words, we have this: $$\frac{dy}{dx}= \left.\frac{dy}{dt} \middle/ \frac{dx}{dt}\right. .$$ The formula resembles the Chain Rule, not by coincidence.

Exercise. Prove the formula for the case when $f$ is one-to-one.

The two special cases are the following:

  • $f'(t)=0\ \Longrightarrow\ $ the slope is vertical, and
  • $g'(t)=0\ \Longrightarrow\ $ the slope is horizontal.

The former case was seen as “extreme” in calculus of one variable. It's not extreme in calculus of parametric curves!
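The slope formula $\frac{dy}{dx}=\frac{dy}{dt}\big/\frac{dx}{dt}$ can be checked numerically. The following is only a sketch: the curve $x=t^2,\ y=t^3$ and the helper names `derivative` and `parametric_slope` are assumptions made for this illustration and do not come from the text.

```python
def derivative(func, t, h=1e-6):
    """Central difference quotient approximating func'(t)."""
    return (func(t + h) - func(t - h)) / (2 * h)

def parametric_slope(f, g, t):
    """Slope dy/dx of the curve x=f(t), y=g(t), assuming f'(t) != 0."""
    return derivative(g, t) / derivative(f, t)

# Hypothetical curve: x = t^2, y = t^3, so that y = x^(3/2) for t > 0.
f = lambda t: t ** 2
g = lambda t: t ** 3

t0 = 2.0
slope = parametric_slope(f, g, t0)
# Slope of y = x^(3/2) computed directly at x = f(t0): (3/2) * sqrt(x).
direct = 1.5 * f(t0) ** 0.5
```

The two numbers, `slope` and `direct`, should agree up to the error of the difference quotients.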

Example (trader). Recall that the price of wheat and the price of sugar are represented by two functions combined into a parametric curve. With these functions sampled, we compute their rates of change: $$\begin{array}{c|c|c} t&x&x'\\ \hline 0&1.00\\ \downarrow&\downarrow\\ 1&.50&\frac{.5-1}{1-0}=-.5\\ \downarrow&\downarrow\\ 2&.33&\frac{.33-.5}{2-1}=-.17 \end{array}\quad\begin{array}{c|c|c} t&y&y'\\ \hline 0&1.00\\ \downarrow&\downarrow\\ 1&2.00&\frac{2.00-1.00}{1-0}=1\\ \downarrow&\downarrow\\ 2&1.00&\frac{1.00-2.00}{2-1}=-1 \end{array}$$ Note that there are fewer numbers than in the original because there are fewer segments than points.

Let's now confirm this result via actual differentiation of our functions: $$\begin{cases} x&=f(t)&=\frac{1}{t+1},\\ y&=g(t)&=-(t-1)^2+2. \end{cases}\ \Longrightarrow\ \begin{cases} x'&=f'(t)&=-\frac{1}{(t+1)^2},\\ y'&=g'(t)&=-2(t-1). \end{cases}$$ We have computed the derivatives of the two component functions. Combined they also form a parametric curve!

The signs of the two new functions tell us the increasing/decreasing behavior of the two original functions and, therefore, the direction of the parametric curve. For example, $x'<0$ so that the curve moves to the left and... moves up initially because $y'>0$ and... moves down later because $y'<0$. Let's visualize and confirm these results with a spreadsheet (without using the computed derivatives above):

Wheat and sugar -- derivatives.png

In order to approximate either of the two derivatives, we use the slope of the segment between two adjacent points. It is the average rate of change (also known as the difference quotient, the slope of the secant line, etc.): $$\frac{\text{change of }x}{\text{change of }t} \text{ and } \frac{\text{change of }y}{\text{change of }t}.$$ The change of $t$ is fixed as $h=\Delta t$. The rate for either $x$ or $y$ is computed via the same spreadsheet formula as before: $$\texttt{=(RC[-1]-R[-1]C[-1])/R2C1}.$$ Note that there is one fewer cell in this column because there is one fewer segment than points.

Wheat and sugar -- derivatives 2.png

In addition to $x'<0\ \Longrightarrow\ x\searrow$, we also get $x'\nearrow\ \Longrightarrow\ x\smile$ (concave up). Similarly, $y'>0\ \Longrightarrow\ y\nearrow$ and $y'\searrow\ \Longrightarrow\ y\frown$ (concave down) initially and then the opposite. Also, the apparent linearity of $y'$ indicates that $y$ might be quadratic... From the monotonicity of the component functions we conclude that initially the parametric curve goes $\nwarrow$ and then $\swarrow$ to confirm the picture. $\square$
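The spreadsheet computation of the rates of change can also be sketched in code. The numbers below are the sampled prices from the tables of this example; the function name `difference_quotients` and the direction labels are assumptions made for the illustration.

```python
# Sampled prices from the tables above (time step h = 1).
t = [0, 1, 2]
x = [1.00, 0.50, 0.33]   # wheat
y = [1.00, 2.00, 1.00]   # sugar

def difference_quotients(values, times):
    """Average rates of change over consecutive segments."""
    return [
        (values[k + 1] - values[k]) / (times[k + 1] - times[k])
        for k in range(len(values) - 1)
    ]

x_rates = difference_quotients(x, t)   # one fewer entry than x
y_rates = difference_quotients(y, t)

# Direction of the curve on each segment, read off the signs of the rates.
directions = [
    ("left" if dx < 0 else "right", "up" if dy > 0 else "down")
    for dx, dy in zip(x_rates, y_rates)
]
```

Each list of rates has one fewer entry than the list of sampled values, just as in the spreadsheet column.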

Exercise. What conclusions about the shape of the parametric curve can you draw from the concavity of its component functions?

Thus, we can say about a parametric curve that its derivative is made up of the derivatives of the functions involved. There are only two for a parametric curve... and infinitely many for a function of two variables!

Example (baker). The dependence of the price of bread on the prices of wheat and sugar is represented by the function $z=p(x,y)$ below. As it is sampled, we can compute the rates of change of this function in the horizontal and vertical directions: $$\text{over }x:\ \begin{array}{l|cccc} y\backslash x&0&\to&1&\to&2\\ \hline 0&1&\to&3&\to&5\\ \\ 1&2&\to&4&\to&6\\ \\ 2&3&\to&5&\to&7\\ \end{array}\quad\text{over }y:\ \begin{array}{l|cccc} y\backslash x&0&\quad&1&\quad&2\\ \hline 0&1&\quad&3&\quad&5\\ \downarrow&\downarrow&\quad&\downarrow&\quad&\downarrow&\\ 1&2&\quad&4&\quad&6\\ \downarrow&\downarrow&\quad&\downarrow&\quad&\downarrow&\\ 2&3&\quad&5&\quad&7\\ \end{array}$$ Recall that by fixing one of the variables we create a function of one variable with respect to the other variable. Now we approximate the derivatives of these functions just as before, via the average rate of change: $$\frac{\text{change of }z}{\text{change of }x} \text{ and } \frac{\text{change of }z}{\text{change of }y}.$$ First, we approximate the derivative in the direction of $y$: $$\begin{array}{l|c|ccc} y\ (x=0)&z&z'&\\ \hline 0\ \downarrow &1\ \downarrow &\\ 1\ \downarrow &2\ \downarrow &\frac{2-1}{1-0}=1\\ 2\ \downarrow &3\ \downarrow &\frac{3-2}{2-1}=1\\ \end{array}\quad \begin{array}{l|c|ccc} y\ (x=1)&z&z'&\\ \hline 0\ \downarrow &3\ \downarrow \\ 1\ \downarrow &4\ \downarrow &\frac{4-3}{1-0}=1\\ 2\ \downarrow &5\ \downarrow &\frac{5-4}{2-1}=1\\ \end{array}\quad \begin{array}{l|c|ccc} y\ (x=2)&z&z'&\\ \hline 0\ \downarrow &5\ \downarrow \\ 1\ \downarrow &6\ \downarrow &\frac{6-5}{1-0}=1\\ 2\ \downarrow &7\ \downarrow &\frac{7-6}{2-1}=1\\ \end{array}$$ All $1$s. Note that there are fewer numbers than in the original because there are fewer segments than points.
Similarly, we approximate the derivative in the direction of $x$: $$\begin{array}{r|lll} (y=0)\ x&0&\to\ 1&\to\ 2\\ \hline z&1&\to\ 3&\to\ 5\\ \hline z'&&\frac{3-1}{1-0}=2&\frac{5-3}{2-1}=2 \end{array}\quad \begin{array}{r|cccc} (y=1)\ x&0&1&2\\ \hline z&2&4&6\\ \hline z'&&2&2 \end{array}\quad \begin{array}{r|cccc} (y=2)\ x&0&1&2\\ \hline z&3&5&7\\ \hline z'&&2&2 \end{array}$$ All $2$s. We put these one-variable functions together; then the rates of change of $p$ with respect to $x$ and $y$ are these new functions of two variables respectively: $$\leadsto\quad\begin{array}{l|cccc} y\backslash x&0&1&2\\ \hline 0&&2&2\\ 1&&2&2\\ 2&&2&2\\ \end{array}\quad\&\quad \begin{array}{l|cccc} y\backslash x&0&1&2\\ \hline 0&\\ 1&1&1&1\\ 2&1&1&1\\ \end{array}\quad\leadsto\quad \begin{array}{l|cccc} y\backslash x&0&1&2\\ \hline 0&\\ 1&&(2,1)&(2,1)\\ 2&&(2,1)&(2,1)\\ \end{array}$$ The two functions are further combined on the right. As we shall see later, going $2$ horizontally and $1$ vertically is the direction of the fastest growth of the function $p$.

Let's now confirm this result via actual differentiation of our function: $$p(x,y)=2x+y+1.$$ Just as before, we fix one of the variables and differentiate with respect to the other. We call these two functions the partial derivatives of $p$ with respect to $x$ and $y$ respectively. For $x$, we declare $y$ fixed and differentiate over $x$ using the following two types of notation (following Leibniz and Lagrange as before): $$\frac{\partial p}{\partial x}=p'_x=\frac{\partial}{\partial x}\left(2x+y+1\right)=\frac{\partial}{\partial x}(2x)+0+0=2.$$ For $y$, we declare $x$ fixed and differentiate over $y$: $$\frac{\partial p}{\partial y}=p'_y=\frac{\partial}{\partial y}\left(2x+y+1\right)=0+\frac{\partial}{\partial y}(y)+0=1.$$ The conclusion might sound familiar: the derivative of a linear function is constant! The combination of the two partial derivatives will be seen as the derivative of $p$ called the gradient of $p$. This is a new kind of function to be discussed in Chapters 19 and 21.

We can confirm these results by examining the spreadsheet. Each line (wire) below on the right is the graph of a function of one variable:

Function of two variables -- wireframe.png

And each has its own derivative! As we move horizontally, the values of $x$ grow by $.1$ while the values of $z$ grow by $.2$. Therefore, $p'_x=2$. Similarly, as we move vertically, the values of $y$ grow by $.1$ and so do the values of $z$. Therefore, $p'_y=1$. $\square$
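The two tables of rates of change in this example can be reproduced with a short computation. This is a sketch only; the grid `z` holds the sampled values from the example, while the names `rate_x` and `rate_y` are assumptions made for the illustration.

```python
# Sampled values z = p(x, y) from the table: rows are y = 0, 1, 2,
# columns are x = 0, 1, 2.
z = [
    [1, 3, 5],
    [2, 4, 6],
    [3, 5, 7],
]
h = 1  # spacing of the grid in both directions

# Rate of change over x: compare horizontally adjacent entries.
rate_x = [[(row[j + 1] - row[j]) / h for j in range(len(row) - 1)] for row in z]

# Rate of change over y: compare vertically adjacent entries.
rate_y = [
    [(z[i + 1][j] - z[i][j]) / h for j in range(len(z[0]))]
    for i in range(len(z) - 1)
]
```

The table `rate_x` has one fewer column and `rate_y` has one fewer row than `z`, because there are fewer segments than points in the corresponding direction.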

Warning: Do not confuse partial differentiation with implicit differentiation that comes under related rates: $$\frac{\partial}{\partial x}\left(xy\right)=y\quad \text{vs. }\quad \frac{d}{dx}\left(xy\right)=y+x\frac{dy}{dx}.$$ The extra term on the right comes from the fact that the two variables are related! In the former case, they aren't -- as two independent variables -- and, therefore, $\frac{dy}{dx}=0$. In either case, the differentiation is on the $xy$-plane going in some direction: in the former case it can be only vertical or horizontal and in the latter case the direction is determined by the relation between the variables (for example, if $x-y=1$, the direction is $45$ degrees).

Example (non-linear). Let's consider this function again: $$q(x,y)=\sin(xy).$$ Compute the partial derivatives by the Chain Rule. First we declare $y$ an unknown and unspecified but fixed parameter and carry out differentiation with respect to $x$: $$\frac{\partial q}{\partial x}=\frac{\partial}{\partial x}\big(\sin(xy)\big)=\cos(xy)\cdot \frac{\partial}{\partial x}(xy)=\cos(xy)y.$$ This time $x$ is the parameter: $$\frac{\partial q}{\partial y}=\frac{\partial}{\partial y}\big(\sin(xy)\big)=\cos(xy)\cdot \frac{\partial}{\partial y}(xy)=\cos(xy)x.$$ Let's confirm these results by examining the graph of $q$ plotted with a spreadsheet:

Sin(xy).png

Note that the edge of the surface is a curve and it is the graph of the function given in the very last row of the table. We also notice that:

  • the surface is flat along the $x$-axis, because $\frac{\partial q}{\partial x}=0$ for $y=0$, and
  • the surface is flat along the $y$-axis, because $\frac{\partial q}{\partial y}=0$ for $x=0$.

We approximate these derivatives just as before, via the average rate of change, with the changes of $x$ and $y$ fixed as $h=\Delta x=\Delta y$. In a spreadsheet, the rate of change of $z$ in the direction of $x$ and $y$ is computed via the same formula but applied horizontally and vertically respectively: $$\texttt{=(R[-23]C-R[-23]C[-1])/R1C1 } ; \quad \texttt{=(RC[-23]-R[-1]C[-23])/R1C1 }.$$ The formulas produce the two partial derivatives:

Sin(xy) and its derivatives.png

At the point $(.1,.1)$, the values of the two partial derivatives are equal, which is why the direction of the fastest growth of $q$ is at $45$ degrees. Note also that the highest locations form a ridge; it is where both partial derivatives are equal to $0$. To find the location of the ridge we solve the equation $$\cos(xy)=0\ \Longrightarrow\ xy=\pi/2.$$ It's a hyperbola. $\square$
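The partial derivatives of $q$ can also be compared with their difference quotients in code. The sketch below is an illustration under assumptions: the step `h=1e-6` and the helper names `partial_x` and `partial_y` are not from the text.

```python
import math

def q(x, y):
    return math.sin(x * y)

def partial_x(func, x, y, h=1e-6):
    """Difference quotient in the x-direction, with y held fixed."""
    return (func(x + h, y) - func(x, y)) / h

def partial_y(func, x, y, h=1e-6):
    """Difference quotient in the y-direction, with x held fixed."""
    return (func(x, y + h) - func(x, y)) / h

x0, y0 = 0.1, 0.1
approx = (partial_x(q, x0, y0), partial_y(q, x0, y0))
# The formulas found above by the Chain Rule.
exact = (math.cos(x0 * y0) * y0, math.cos(x0 * y0) * x0)
```

At $(.1,.1)$ the two entries of `approx` coincide, which matches the observation that the two partial derivatives are equal there.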

The idea that the values of the partial derivatives indicate the direction of the fastest growth of the function can be illustrated approximately as follows: $$\begin{array}{l|ccc} y\quad\quad\backslash x & q'_x>0 & q'_x<0\\ \hline q'_y>0 & \nearrow & \nwarrow\\ q'_y<0 & \searrow & \swarrow\\ \end{array}$$

Example (bread buyers). We will take the two examples -- the commodity trader and the baker -- from the last two sections and ask, what price of bread have daily visitors to the bakery shop seen over time? These are the variables:

  • $t$ is time,

two variables representing these two commodities:

  • $x$ is the price of wheat,
  • $y$ is the price of sugar,

and a product:

  • $z$ is the price of a loaf of bread.

The visitors see how $z$ depends on $t$, via some function of a single variable: $$z=h(t).$$ What is it?

This is the summary of the setup: $$\begin{array}{|ccccc|} \hline &\text{Example 1 (trader)} & & \bigg|\\ \hline t&\longrightarrow & (x,y) &\longrightarrow & z\\ \hline &\bigg| & &\text{Example 2 (baker)}\\ \hline \end{array}$$ We realize that the problem is about compositions!

Recall that the price of wheat and the price of sugar are represented by a parametric curve: $$x=f(t)=\frac{1}{t+1},\ y=g(t)=-(t-1)^2+2.$$ Furthermore, the price of bread is computed from the other two prices by the function of two variables: $$z=p(x,y)=2x+y+1.$$ The two functions are visualized as follows:

Time -- wheat -- sugar -- bread.png

Then, of course, $h$ is the composition of these two: $$t\mapsto (x,y)\mapsto z,$$ computed via the following substitution: $$h(t)=p(f(t),g(t)).$$ To visualize what happens, imagine the parametric curve -- on the $xy$-plane -- being “lifted” to the graph of $p$:

Time -- wheat -- sugar -- bread 2.png

The elevation is then the value of $h$. The end result is below:

Price of bread over time.png

$\square$
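The substitution $h(t)=p(f(t),g(t))$ can be carried out directly in code. This is a minimal sketch; it assumes that the price of sugar is the function $g(t)=-(t-1)^2+2$ of time, which agrees with the sampled values $1,\ 2,\ 1$ at $t=0,\ 1,\ 2$ in the trader example.

```python
def f(t):
    """Price of wheat."""
    return 1 / (t + 1)

def g(t):
    """Price of sugar (assumed to be a function of t, see the lead-in)."""
    return -(t - 1) ** 2 + 2

def p(x, y):
    """Price of bread computed from the two commodity prices."""
    return 2 * x + y + 1

def h(t):
    """Composition t -> (x, y) -> z."""
    return p(f(t), g(t))

bread_prices = [h(t) for t in (0, 1, 2)]
```

The list `bread_prices` samples the price of bread seen by the visitors at $t=0,\ 1,\ 2$.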

In the past, we have found the derivative of the composition of two functions by the Chain Rule: we expressed it in terms of the derivatives of the two functions involved (their product). We then conjecture that in order to find the derivative of the composition above we need to understand the meaning of the following:

  • the derivative of a parametric curve, and
  • the derivative of a function of two variables.

The centroid of a flat object

Suppose we have a plate with uniform density and identical thickness (it is known as a “lamina”). How can we balance it on a single support called the centroid?

Centroid.png

There are a few heuristics that help. If the object has a “center”, such as a circle or a square, this is it.

Centroid from symmetry.png

Also, any axis of symmetry will have to contain the centroid.

The idea of centroid is related to the concept of the center of mass which is the center of rotation of the object when subjected to a force. We studied this concept previously but with the weight distributed within a straight segment, such as a seesaw:

Seesaw.png

We found that if one person is heavier than the other, the lighter person should sit farther from the center in order to balance the beam. In fact, if the former is twice as heavy, the distance should be twice as long!

Consider a variation of the seesaw. It is made of two beams nailed together to form a cross with a single point of support in the middle:

Seesaw4.png

It appears that four persons of equal weight will be in balance when located at equal distance from the point of support. But there is more: they will be balanced as long as either pair of persons facing each other is in balance! We can then use what we have learned from the $1$-dimensional case.

We explore this idea by replacing this construction with a square. Then the seesaws that we considered previously can be interpreted as this square balanced on a bar that goes all the way across:

Balanced seesaw -- square.png

We can spread the weight along the line parallel to the bar because only the distance to this bar matters for the leverage of each weight. Once we add the $x$- and $y$-axes to the picture, this distance is simply the $x$-coordinate:

Balanced square on xy-plane.png

Now, our problem is that of balancing the region below the graph of a function.

Balance on edge or on bar.png

Let's review how we do this. Suppose we have a non-negative function $y=f(x)$ integrable on segment $[a,b]$. For a given point $c$ the integral $$\int_a^b f(x)(x-c)\, dx$$ is called the total moment of the region with respect to $c$. The center of mass of the region is such a point $c$ that the total moment with respect to $c$ is zero: $$c=\frac{\int_a^b f(x)x\, dx}{\int_a^b f(x)\, dx}.$$

Example. Let's review how we can balance a triangle on its horizontal edge. Suppose it is the region under the graph $y=f(x)=x$ from $0$ to $1$:

Triangle balanced.png

We compute the total moment of the object: $$\begin{array}{ll} \int_0^1 f(x)x\, dx&=\int_0^1 x\cdot x\, dx\\ &=\int_0^1 x^2\, dx\\ &=x^3/3\Bigg|_0^1\\ &=1/3. \end{array}$$ Meanwhile, the mass is simply $1/2$. Therefore, the center of mass -- on the $x$-axis -- is $$c_1=\frac{1}{3}\div \frac{1}{2}=\frac{2}{3}.$$

What if we want to balance the triangle on its other edge? We place the $x$-axis along that edge; then the slanted edge is given by $y=g(x)=1-x$. We compute the total moment of the object: $$\begin{array}{ll} \int_0^1 g(x)x\, dx&=\int_0^1 (1-x)x\, dx\\ &=\int_0^1 (x-x^2)\, dx\\ &=x^2/2-x^3/3\Bigg|_0^1\\ &=1/2-1/3\\ &=1/6. \end{array}$$ Meanwhile, the mass is still $1/2$. Therefore, the center of mass -- along this edge -- is $$c_2=\frac{1}{6}\div \frac{1}{2}=\frac{1}{3}.$$

Triangle centroid.png

We can balance the triangle on either of two bars. Now we remove the bars and replace them with a single support placed at their intersection. $\square$
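The two centers of mass found in this example can be confirmed numerically. The sketch below replaces the integrals with midpoint Riemann sums; the helper names `midpoint_integral` and `center_of_mass` and the number of intervals are assumptions made for the illustration.

```python
def midpoint_integral(func, a, b, n=10_000):
    """Midpoint Riemann sum of func over [a, b]."""
    dx = (b - a) / n
    return sum(func(a + (k + 0.5) * dx) for k in range(n)) * dx

def center_of_mass(f, a, b):
    """Point c at which the total moment of the region under f vanishes."""
    moment = midpoint_integral(lambda x: f(x) * x, a, b)
    mass = midpoint_integral(f, a, b)
    return moment / mass

c1 = center_of_mass(lambda x: x, 0, 1)       # triangle under y = x
c2 = center_of_mass(lambda x: 1 - x, 0, 1)   # triangle under y = 1 - x
```

The values `c1` and `c2` should be close to $2/3$ and $1/3$ respectively.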

Definition. Suppose a function $y=f(x)$ is integrable on $[a,b]$. Then the total moment of the region under the graph of $f$ with respect to the line $x=c$ is defined to be: $$\int_a^b (x-c)f(x)\, dx;$$ then such a line is an axis of the region if the total moment is zero.

Example. Let's try the half-circle. One of the axes will go through the center perpendicular to the diameter. We just need to find the other. That's why we place the quarter of the disk adjacent to the origin:

Centroid of half-circle.png

Then the total moment is: $$\int_0^1 (x-c)\sqrt{1-x^2}\, dx=0.$$ Therefore, $$\int_0^1 x\sqrt{1-x^2}\, dx=c\int_0^1 \sqrt{1-x^2}\, dx.$$ The integral in the right-hand side is simply the area of the quarter circle and the one in the left-hand side is easily evaluated by substitution ($u=1-x^2$): $$\begin{array}{lll} \int_0^1 x\sqrt{1-x^2}\, dx&=\int_1^0 -\frac{1}{2}\sqrt{u}\, du \\ &= -\frac{1}{2}\frac{2}{3}u^{3/2}\Bigg|_1^0 \\ &=\frac{1}{3}. \end{array}$$ Therefore, we have: $$\frac{1}{3}=c\pi /4\ \Longrightarrow\ c=\frac{4}{3\pi}\approx .42.$$ $\square$
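As a sanity check, the two integrals for the quarter disk can be approximated with midpoint Riemann sums. This is a sketch under assumptions: the helper `midpoint_integral` and the number of intervals are chosen for the illustration only.

```python
import math

def midpoint_integral(func, a, b, n=100_000):
    """Midpoint Riemann sum of func over [a, b]."""
    dx = (b - a) / n
    return sum(func(a + (k + 0.5) * dx) for k in range(n)) * dx

quarter_disk = lambda x: math.sqrt(1 - x * x)

moment = midpoint_integral(lambda x: x * quarter_disk(x), 0, 1)
area = midpoint_integral(quarter_disk, 0, 1)
c = moment / area   # compare with 4 / (3 * pi)
```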

Exercise. Find the axes of the region bounded by the graph $y=\frac{1}{x+.5}-.5$ and the axes.

So far, we have been able to use this definition only to find the axes of regions with symmetries. Now the general case. We will have to use the $x$- and $y$-axes available to us.

Centroid of lamina.png

Definition. Suppose a function $y=f(x)$ is decreasing on $[0,A]$ and $f(0)=B>0$. Then the centroid of the region bounded by the graph of $f$, the $x$-axis, and the $y$-axis is a point $(c_x,c_y)$ such that the total moments of the region with respect to the lines $x=c_x$ and $y=c_y$ are zero; i.e., $$\int_0^A (x-c_x)f(x)\, dx=0,$$ and $$\int_0^B (y-c_y)f^{-1}(y)\, dy=0.$$

Then, the coordinates of the centroid are: $$c_x=\frac{1}{S}\int_0^A xf(x)\, dx,$$ and $$c_y=\frac{1}{S}\int_0^B yf^{-1}(y)\, dy,$$ where $S=\int_0^A f(x)\, dx$ is the area of the region.

Example. Let's find the centroid of the region bounded by the graph of $y=1-x^2$. $\square$
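A numerical sketch of this computation follows. It assumes that the region in question lies in the first quadrant, bounded by the graph and the two axes, so that $f(x)=1-x^2$ is decreasing on $[0,1]$ with $f^{-1}(y)=\sqrt{1-y}$; the helper `midpoint_integral` is also an assumption made for the illustration.

```python
import math

def midpoint_integral(func, a, b, n=100_000):
    """Midpoint Riemann sum of func over [a, b]."""
    dx = (b - a) / n
    return sum(func(a + (k + 0.5) * dx) for k in range(n)) * dx

f = lambda x: 1 - x ** 2               # decreasing on [0, 1], f(0) = 1
f_inverse = lambda y: math.sqrt(1 - y)

area = midpoint_integral(f, 0, 1)
# Zero total moment with respect to x = c_x and y = c_y respectively.
c_x = midpoint_integral(lambda x: x * f(x), 0, 1) / area
c_y = midpoint_integral(lambda y: y * f_inverse(y), 0, 1) / area
```

Under these assumptions, exact integration gives the area $2/3$ and the centroid $(3/8,\ 2/5)$, which the approximations should match closely.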

The general case of a plate of an arbitrary shape will be addressed in Chapter 20.

Coordinate systems; polar coordinates

If we want to study the Euclidean plane, and Euclidean geometry, algebraically, we superimpose the Cartesian grid over this plane:

Cartesian plane for Euclidean plane 2.png

The geometry on the piece of paper then determines what is going on, not a particular choice of a coordinate system. For example, the direction of fastest growth is determined by the surface itself:

Function of two variables -- heat map.png

We can place the coordinate system on top of our physical space in a number of ways...

We start with dimension $1$. The line can have different coordinate systems assigned to it and those are related to each other via some transformations:

Axis shifted along arrows.png

Above you see two ways to interpret the transformation:

  • (1) arrows are between the $x$-axis and the intact $y$-axis or
  • (2) we move the $y$-axis so that $y=f(x)$ is aligned with $x$.

We followed the former in Chapter 2 and we will follow the latter in this section.

We can think as if the whole $x$-axis is drawn on a pencil:

Shift of coordinate axis.png

Each letter will have a new coordinate in the new coordinate system.

These are the three main transformations of an axis: shift, flip, and stretch (left) and this is what happens to the coordinates (right):

Coordinate axis transformations.png

This is the algebra for the basic transformations of the axis, the old and the new coordinates: $$\newcommand{\ra}[1]{\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} \begin{array}{lcl} t& \ra{ \text{ shift by } k} & x=t-k;\\ t& \ra{ \text{ flip } } & x=-t;\\ t& \ra{ \text{ stretch by } k} & x=t/ k. \end{array}$$

Now dimension $2$, the plane.

Both $x$ and $y$-axes can be subjected to the transformations above. The change of coordinates under the resulting six basic transformations of the $xy$-plane is shown below: $$\begin{array}{lcccclcccclcccc}\text{vertical shift: }&\begin{array}{rlcll} (&x&,&y&)\\ (&x&,&y-k&)\end{array},&\text{flip: }&\begin{array}{rlcll} (&x&,&y&)\\ (&x&,&y\cdot(-1)&)\end{array},&\text{stretch: }&\begin{array}{rlcll} (&x&,&y&)\\ (&x&,&y/ k&)\end{array},\\ \text{horizontal shift: }&\begin{array}{rlcll} (&x&,&y&)\\ (&x-k&,&y&)\end{array},&\text{flip: }&\begin{array}{rlcll} (&x&,&y&)\\ (&x\cdot(-1)&,&y&)\end{array},&\text{stretch: }&\begin{array}{rlcll} (&x&,&y&)\\ (&x/ k&,&y&)\end{array}.\end{array}$$

Example. Some transformations cannot be reduced to a combination of these six. Recall from Chapter 3 that in order to find the graph of the inverse function, we execute a flip about the diagonal of the plane. We grab the end of the $x$-axis with the right hand and grab the end of the $y$-axis with the left hand and then interchange them:

Inverse by hand.png

We face the opposite side of the paper then, but the graph is still visible: the $x$-axis is now pointing up and the $y$-axis right. The axes can be rotated, together:

Rotated grids with axes.png

The coordinates will change but they will still unambiguously determine a location on the plane. The axes can be skewed:

Skewed grid.png

Even then the two numbers indicating the intersection of two lines will unambiguously determine a location on the plane. And so on... Further analysis is presented in Chapter 23. $\square$

There are also alternative coordinate systems. Not only are the axes not rectangular, they are also curved!

The circle is a very special parametric curve. It has replaced the representation of the circle as the union of the two arcs as graphs of these two functions:

Circle as two graphs.png

This curve will also supply us with a new way to record locations on the plane.

The $x$-coordinate of the point on the circle of radius $r$ at angle $\theta$ with the $x$-axis is $r\cos \theta$ and its $y$-coordinate is $r\sin \theta$:

Circle parametrical and polar.png

Then, this is the circle of radius $r$ traced infinitely many times: $$\begin{array}{ll} &x=r\cos \theta,& y=r\sin \theta\\ \text{with }& -\infty<\theta<+\infty,&r>0\ \text{ fixed}. \end{array}$$

What if we consider all of these circles along with the origin ($r=0$)? We have a family of concentric circles:

Circles covering plane.png

They cover the whole plane and they don't intersect each other! They are also parametrized (in a uniform way):

Circle stretched.png

Therefore, the pair of numbers, $$\begin{array}{ccc} \theta & \text{ and }& r\\ -\infty<\theta<+\infty&& -\infty<r<+\infty, \end{array}$$ unambiguously determines a location on the plane:

  • the latter number, $r$, determines which circle we pick, and
  • the former number, $\theta$, tells how far we go along this circle.

Definition. These two numbers, $\theta$ and $r$, are called the polar coordinates of a point $P$ on the $xy$-plane: $$P=(\theta,r).$$

Example (polar points). For each of these pairs $(\theta,r)$, we compute its counterpart on the $xy$-plane: $$\begin{array}{rrrrr} \theta&r&\to&x&y\\ 0.00 &1.00 &\to& 1.00 &0.00\\ 1.00 &0.00 &\to& 0.00 &0.00\\ 1.00 &1.00 &\to& 0.54 &0.84\\ 3.14 &1.00 &\to& -1.00 &0.00\\ 1.57 &-1.50 &\to& 0.00 &-1.50\\ \end{array}$$ We plot them here:

Polar points.png

We used the following formulas for the last two columns: $$\texttt{=RC[-9]*COS(RC[-10]) }\text{ and }\texttt{=RC[-10]*SIN(RC[-11])}.$$ $\square$
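The same conversion can be sketched in code. The pairs below are the ones from the table of this example; the function name `polar_to_cartesian` is an assumption made for the illustration.

```python
import math

def polar_to_cartesian(theta, r):
    """Convert polar coordinates (theta, r) to Cartesian (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)

# The (theta, r) pairs from the table above.
pairs = [(0.00, 1.00), (1.00, 0.00), (1.00, 1.00), (3.14, 1.00), (1.57, -1.50)]
points = [polar_to_cartesian(theta, r) for theta, r in pairs]
```

Rounded to two decimal places, `points` should reproduce the last two columns of the table. The same function can be used to test that $(\theta+2\pi,r)$ and $(\theta,r)$ give the same location.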

Warning: unlike the Cartesian system, the polar coordinate system does not give a unique representation of locations on the $xy$-plane: $$(\theta+2\pi,r)=(\theta,r),\quad (\theta,-r)=(\theta+\pi,r).$$

Now some curves...

Example (polar curves). Simple relations between $r$ and $\theta$ produce curves: on the $\theta r$-plane and on the $xy$-plane. The formula for the circle of radius $R$ is very simple, by design: $$r=R.$$ So, $r$ is fixed while $\theta$ varies:

Polar circle.png

Next, $\theta$ is fixed while $r$ varies; it's a ray:

Polar ray.png

If both vary, identically, we have this spiral:

Polar diagonal.png

$\square$

Example (spiral). Suppose we need to find a polar representation of this spiral, winding onto the origin:

Spiral.png

We realize that we need $r$ to approach $0$ as $\theta$ goes to infinity. For example, this will do: $$r=1/\theta,\ \ \theta>0.$$ $\square$

Exercise. Find a polar representation of this spiral:

Spiral wrapping around a circle.png

The conversion equations,

  • $x = r \cos \theta$,
  • $y = r \sin \theta$,

are understood as a transformation of the plane: $$(\theta,r)\to (x,y)=(r \cos \theta, r \sin \theta).$$ It curves the axes:

Polar coordinates -- grid.png

This transformation has the inverse given by:

  • $r=\sqrt{x^2+y^2}$,
  • $\theta=\tan^{-1}\frac{y}{x}$ for $x>0$ (for $x<0$, add $\pi$).
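In code, the arctangent of the ratio $y/x$ alone does not tell the quadrants apart. A common way around this, sketched below, is the standard library function `math.atan2`, which takes $y$ and $x$ separately; the function names `cartesian_to_polar` and `polar_to_cartesian` and the choice of the range $-\pi<\theta\le\pi$ with $r\ge 0$ are assumptions made for the illustration.

```python
import math

def cartesian_to_polar(x, y):
    """Return (theta, r) with r >= 0 and -pi < theta <= pi."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)   # handles x = 0 and all four quadrants
    return theta, r

def polar_to_cartesian(theta, r):
    """Convert polar coordinates (theta, r) back to Cartesian (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)

theta, r = cartesian_to_polar(-1.0, 1.0)   # a point in the second quadrant
x, y = polar_to_cartesian(theta, r)        # round trip back to (-1, 1)
```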


In the $3$-dimensional space we have, in addition to the Cartesian coordinates $(x,y,z)$, the spherical coordinates $(r,\ \theta,\ \phi)$, related as follows:

  • $x = r \cos \theta \sin \phi$,
  • $y = r \sin \theta \sin \phi$,
  • $z = r \cos \phi$,

where

  • $0 \le \theta \le 2\pi$,
  • $0 \le \phi \le \pi$,
  • $r \ge 0$.
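A short sketch of the conversion from spherical to Cartesian coordinates follows. It uses the standard convention in which $\phi$ is measured from the $z$-axis and $z=r\cos\phi$; the function name `spherical_to_cartesian` and the sample point are assumptions made for the illustration.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """Convert (r, theta, phi) to (x, y, z); phi is measured from the z-axis."""
    x = r * math.cos(theta) * math.sin(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(phi)
    return x, y, z

x, y, z = spherical_to_cartesian(2.0, math.pi / 3, math.pi / 4)
distance = math.sqrt(x * x + y * y + z * z)   # distance to the origin
```

The distance to the origin should come out equal to $r$, whatever the two angles are.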

Discrete forms

Some numerical simulations of motion that we have seen can be carried out with tools similar to, but simpler than, functions defined at the nodes or the secondary nodes of a partition.

Example. Recall what we started with in the very beginning. Suppose the speedometer is broken and in order to estimate how fast we are driving, we look at the odometer every hour:

Location as a function of time.png

That's a discrete $0$-form. To find the displacement for every hour we just look at the differences:

Location and velocity as functions of time.png

That's a discrete $1$-form. Alternatively, the odometer is broken and we look at the speedometer to sample the velocity and then, via the Riemann sums, find the displacement. $\square$

Let's start over.

Suppose we have a partition of some interval $[a,b]$ in the $x$-axis.

Partition.png

There are two types of pieces:

  • the nodes, or $0$-cells, $x=x_k,\ k=0,1,...,n$; and
  • the edges, or $1$-cells, $c_k=[x_{k-1},x_{k}],\ k=1,...,n$.

The increments of $x$ are $\Delta x_k=x_k-x_{k-1}$.

There are no secondary nodes this time!

Example. Specific representations can also be provided with a spreadsheet choosing, for example, $\Delta x=1$:

Domain R.png

You can see how every other cell is square and every other is stretched horizontally to emphasize the different nature of these cells: nodes vs. edges. $\square$

In the motion interpretation, there is a number -- the location -- associated with each node (just as before) and a number -- the displacement -- associated with each edge.

Definition. For a given partition (of an interval or the whole real line), a discrete form of

  • degree $0$ is a real-valued function with nodes as inputs; and
  • degree $1$ is a real-valued function with edges as inputs.

We use arrows to picture these functions as correspondences:

Forms as functions 2.png

Here we have two:

  • a discrete $0$-form $f: 0\mapsto 2,\ 1\mapsto 4,\ 2\mapsto 3,\ ...$; and
  • a discrete $1$-form $s: [0,1]\mapsto 3,\ [1,2]\mapsto .5,\ [2,3]\mapsto 1,\ ...$.

A more compact way to visualize is this:

Forms as functions 3.png

We can also list the values of the two functions:

  • a discrete $0$-form $f$ with $f(0)=2,\ f(1)=4,\ f(2)=3,\ ...$; and
  • a discrete $1$-form $s$ with $s\Big([0,1] \Big)=3,\ s\Big([1,2] \Big)=.5,\ s\Big([2,3] \Big)=1,\ ...$.

Discrete functions can be represented by tables (spreadsheets):

Discrete functions.png

The most common way to visualize a function is with its graph, which consists of points on the $xy$-plane with $y=f(x)$:

Forms as functions.png

For a discrete $0$-form, $x$ is a node, a number, and $y=f(x)$ is also a number. Together, they produce $(x,y)$, a point on the $xy$-plane (with the $x$-axis split into cells as shown above). For a discrete $1$-form, $[A,B]$ is an interval in the $x$-axis, and $y=g([A,B])$ is a number. Together, they produce a collection of points on the $xy$-plane such as $(x,y)$ for every $x$ in $[A,B]$. The result is a horizontal segment.

To underscore the difference between the two, the graph of a discrete $0$-form is shown with dots and that of a discrete $1$-form with horizontal bars:

Discrete functions -- graphs.png

Even though these functions may consist of unrelated pieces, it is possible that we can see a continuous curve if we zoom out:

Discrete functions -- zoomed out.png

Example. Let's consider an example of motion. Suppose a $0$-form $p$ gives the position of a person and suppose

  • at time $n$ hours we are at the $5$ mile mark: $p(n)=5$, and then
  • at time $n+1$ hours we are at the $7$ mile mark: $p(n+1)=7$.

We don't know what exactly has happened during this hour but the simplest assumption would be that we have been walking at a constant speed of $2$ miles per hour.

Difference as the change.png

Now, instead of our velocity function $v$ assigning this value to each instant of time during this period, it is assigned to the whole interval: $$v\Bigg|_{[n,n+1]}=2\ \text{, or better }\ v\Big( [n,n+1] \Big)=2.$$ This way, the elements of the domain of the velocity function are the edges and the resulting function is a discrete $1$-form! $\square$

The functions, when defined on the nodes, change abruptly and, consequently, the change over every interval $[A,B]$ is simply the difference of the values at its end-nodes, the right one minus the left one: $$f(B)-f(A).$$ The output of this simple computation is then assigned to the interval $[A,B]$: $$[A,B] \mapsto f(B)-f(A).$$

ExampleOfDiscreteForm3.png

Just as before, the difference stands for the change of the function.

Definition. The difference of a discrete $0$-form $f$ is a discrete $1$-form given by its values at each edge: $$\Delta f \, (c_k)=f(x_{k})-f(x_{k-1}).$$

The relation between a $0$-form and its difference is illustrated below:

Differential of a node function.png

This is how a spreadsheet computes the difference of a function given by the data in the first column:

Spreadsheet computes the differential.png

Example. When the discrete $0$-forms are represented by formulas, the computations are straightforward ($h=1$) with a chance of simplification: $$\begin{array}{llllll} (1)&f(n)=3n^2+1\ &\Longrightarrow\ &\Delta f\, (c_n)=(3n^2+1)-(3(n-1)^2+1)=6n-3;\\ (2)&g(n)=\frac{1}{n}\ &\Longrightarrow\ &\Delta g\, (c_n)= \frac{1}{n}-\frac{1}{n-1}=-\frac{1}{n(n-1)} \text{ for } n\ne 0,1;\\ (3)&p(n)=2^n\ &\Longrightarrow\ &\Delta p\, (c_n)= 2^{n}-2^{n-1}=2^{n-1}. \end{array}$$ $\square$
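The three formulas of this example can be verified on a range of edges. The sketch below uses exact arithmetic (integers and `fractions.Fraction`); the helper name `difference` and the range of $n$ tested are assumptions made for the illustration.

```python
from fractions import Fraction

def difference(f, n):
    """Value of the difference of f on the edge c_n = [n-1, n]."""
    return f(n) - f(n - 1)

f = lambda n: 3 * n ** 2 + 1
g = lambda n: Fraction(1, n)
p = lambda n: 2 ** n

# Compare with the simplified formulas; g is undefined at n = 0,
# so its edges start at c_2 = [1, 2].
f_checks = [difference(f, n) == 6 * n - 3 for n in range(1, 10)]
g_checks = [difference(g, n) == Fraction(-1, n * (n - 1)) for n in range(2, 10)]
p_checks = [difference(p, n) == 2 ** (n - 1) for n in range(1, 10)]
```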

Definition. The sum of a discrete $1$-form $g$ is a discrete $0$-form defined at each node $x_k,\ 1\le k\le n,$ of a partition of $[a,b]$ by: $$\sum_{[a,x_k]} g=g(c_1)+g(c_2)+...+g(c_k),$$ where $c_1,c_2,...,c_n$ are the edges of the partition.

The fundamental relation is between the differences and sums.

First, we have a $0$-form and a $1$-form:

  • if $f$ is defined at the nodes $x_k,\ k=0,1,2,...,n$, of the partition, then
  • the difference $g$ of $f$ is defined at the edges of the partition by:

$$g(c_{k})=f(x_{k})-f(x_{k-1}).$$

Theorem (Fundamental Theorem of Discrete Calculus I). Suppose $f$ is a discrete $0$-form. Then, for each node $x$ of the partition, we have: $$\sum_{[a,x]} (\Delta f) =f(x)-f(a).$$

Second, we have a $1$-form and a $0$-form:

  • if $g$ is defined at the edges $c_k,\ k=1,2,...,n$, of the partition, then
  • the sum $f$ of $g$ is defined recursively at the nodes of the partition by:

$$f(x_{k})=f(x_{k-1})+g(c_k).$$

Theorem (Fundamental Theorem of Discrete Calculus II). Suppose $g$ is a discrete $1$-form. Then, we have: $$\Delta\left( \sum_{[a,x]} g \right)=g.$$

FTDC -- spreadsheet.png

So, the two operations cancel each other in either order:

FTDC.png

As we see now, the Fundamental Theorem doesn't need quotients.
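Both parts of the theorem can be tested on lists of numbers, much like in the spreadsheet. This is only a sketch: the function names and the sample values are made up for illustration, and the sum over the empty interval $[a,a]$ is assumed to be $0$.

```python
from itertools import accumulate

def difference(node_values):
    """Difference of a 0-form: values at x_0..x_n -> values on the edges c_1..c_n."""
    return [node_values[k] - node_values[k - 1] for k in range(1, len(node_values))]

def running_sum(edge_values):
    """Sum of a 1-form: values on c_1..c_n -> values at the nodes x_1..x_n."""
    return list(accumulate(edge_values))

f_nodes = [5, 7, 4, 4, 10, 3]   # a made-up 0-form on the nodes x_0..x_5
g_edges = [1, 4, -2, 0, 3]      # a made-up 1-form on the edges c_1..c_5

# Part I: the sum of the differences over [a, x_k] is f(x_k) - f(a).
part1 = running_sum(difference(f_nodes))
assert part1 == [value - f_nodes[0] for value in f_nodes[1:]]

# Part II: the difference of the sum recovers g.
sums = [0] + running_sum(g_edges)   # assumed: the sum over [a, a] is 0
assert difference(sums) == g_edges
```

No division appears anywhere in the code, only subtraction and addition.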

Next, there are no compositions of forms. In order to create a composition $q\circ p$ of a $0$- or $1$-form $q$ with another function or form $p$, the values of $p$ have to be $0$- and $1$-cells respectively.

Definition. A cell function $y=p(x)$ is a function that assigns

  • a node to each node, and
  • an edge or a node to each edge,

in such a way that the end-points of each edge remain end-points: $$p\big([u,v]\big)=\big[p(u),p(v) \big].$$

Graph maps for 2 edges.png

The function $p$ assigns a $k$- or $(k-1)$-cell to each $k$-cell.

Realizations of simplicial maps dim 1.png

Because of the property, the values of a cell function on the edges can be reconstructed from its values on the nodes. The former is then analogous to the difference of the cell function.

For convenience, we assume that $\Delta$ is zero when computed over any node $x$.

Theorem (Chain Rule). The difference of the composition of two functions is the composition of the difference of the latter with the former; i.e., for any cell function $x=p(t)$ from $[a,b]$ to $[c,d]$ and any $0$-form $y=g(x)$ on $[c,d]$, the differences satisfy: $$\Delta (g\circ p)= \Delta g\, \circ p.$$

In other words, we have for each edge $s$: $$\Delta (g\circ p)(s)= \Delta g\, (p(s)).$$
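The edge-by-edge formula can be checked on a small example. The sketch below makes several assumptions of its own: nodes are labeled by integers, the edge $[j-1,j]$ is labeled by $j$, the values of $g$ and $p$ are invented, and the cell function is taken to be non-decreasing so that no edge reverses its direction.

```python
def difference(node_values):
    """Difference of a 0-form given by the list of its values at the nodes."""
    return [node_values[k] - node_values[k - 1] for k in range(1, len(node_values))]

g_nodes = [4, 1, 1, 6, 2]          # hypothetical 0-form g on the nodes 0..4 of [c,d]
p_nodes = [0, 1, 1, 2, 3, 3, 4]    # hypothetical cell function p on the nodes 0..6 of [a,b]

dg = difference(g_nodes)           # dg[j-1] is the value of the difference of g on [j-1, j]

def dg_at_image(i):
    """Difference of g evaluated at p(s) for the edge s = [i-1, i]."""
    u, v = p_nodes[i - 1], p_nodes[i]
    if u == v:
        return 0                   # the edge is collapsed to a node: zero by convention
    return dg[v - 1]               # p(s) is the edge [v-1, v]

lhs = difference([g_nodes[x] for x in p_nodes])           # difference of the composition
rhs = [dg_at_image(i) for i in range(1, len(p_nodes))]    # difference of g composed with p
assert lhs == rhs
print(lhs)  # [-3, 0, 0, 5, 0, -4]
```

The zeros in the output that correspond to collapsed edges come from the convention that the difference is zero over any node.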

Differential forms

They are the continuous analogs of discrete forms.

Question: Is the derivative $\frac{dy}{dx}$ a fraction?

The answer that followed the definition was an emphatic No!

A more advanced answer we give here is: Yes, here's why.

Suppose we have a function $y=f(x)$ and we are to study its behavior around a point $x=a$. The derivative at $a$ is $$\frac{dy}{dx}\bigg|_{x=a} = \text{ the slope of the tangent line through } (a,f(a)) = \frac{\text{rise}}{\text{run}}.$$ This is a fraction after all!

Example. Specifically, suppose $f(x)=x^2+2x$. At $a=0$, we have $f(0)=0$, so our interest is the point $(0,0)$. Then, $$ \frac{dy}{dx}\bigg|_{x=0} = 2x+2\bigg|_{x=0}=2. $$ If this is a fraction, what would be the meaning of this: $$dy = 2\cdot dx? $$

Y=x^2+2x and tangent.png

It is the equation of the tangent line written with respect to $dy$ and $dx$. $\square$

Thus, the equation $$dy=f'(a)\cdot dx$$ refers to a specific location, $x=a$ and $y=f(a)$, on the $xy$-plane and it is a relation between the two new variables as the old ones have been specified.

Can we see $dx$, $dy$ on the graph?

Tangent and differentials.png

Thus, we have:

  • $dx$ is the run of the tangent line, and
  • $dy$ is the rise of the tangent line.

They are called the differentials of $x$ and $y$ respectively.

Keep in mind that here $dx$ is just a certain variable related to $x$ (to emphasize this point, the formula can be re-written as $Y = f'(a)\cdot X$). The algebra may come from the example above:

  • $y$ depends on $x$ via $y=f(x)$, and
  • $dy$ depends on $x$ and $dx$ via $dy=f'(x)dx$.

Example (linearization). Given a function $f(x)=x^2$, find its best linear approximation at $a=1$. Since $f'(x)=2x$, we see that $f'(a) = f'(1) = 2$ and, therefore, the best linear approximation of $f$ at $a=1$ is $$T(x)= f(a) + f'(a)(x-a)= 1 + 2(x-1).$$ Now we interpret $x-a$ as $dx$. Then, if we ignore the constant part, we can write $dy = 2dx$. The equation expresses our derivative in terms of these new variables, the differentials. We capture the relation between the increment of $x$ and that of $y$ -- close to $a$. Indeed, $y$ grows two times as fast as $x$. We acquire this information by introducing a new coordinate system $(dx,dy)$. In this coordinate system, the best linear approximation (given by the tangent line) becomes simply a linear function. $\square$
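To see how well $dy=2dx$ describes the actual change of $y$ near $a=1$, one can compare $f$ with its linearization for a few small steps. This is an illustrative sketch only; the function names and the step sizes are our own choices.

```python
def f(x):
    return x**2

def T(x):
    """Best linear approximation of f at a = 1."""
    return 1 + 2 * (x - 1)

def error(dx):
    """Gap between the true value f(1 + dx) and the linear estimate T(1 + dx)."""
    return f(1 + dx) - T(1 + dx)

for dx in (0.1, 0.01, 0.001):
    print(dx, error(dx))
```

Algebraically, the gap is $(1+dx)^2-(1+2dx)=dx^2$, so it shrinks much faster than $dx$ itself.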

The analysis presented above applies to every point -- and to all points at once:

Tangent lines.png

Recall also from Chapter 12 how we learned to look at the integral differently: we change what we integrate. In the integral $$\int k(x)\, dx,$$ what we integrate is not a function, $k(x)$, but a differential form, $k(x)\cdot dx$. As presented above, the form comes from the following: $$y=f(x) \text{ at }x=a\ \Longrightarrow\ \frac{dy}{dx}=f'(a),$$ and, furthermore, $$\qquad \Longrightarrow\ \ dy=f'(a)\cdot dx.$$ This is a relation between the two extra variables, once the relation between the old ones has been specified. The dependence between the differentials varies from location to location. So, $dx$ is the differential of $x$, which is a variable separate from, but related to, $x$. Then, $f'(x)\cdot dx$ is just a function of two variables. The dependence of the differential form on the second variable is especially simple; it's a multiple.

Recall how the Chain Rule, in the Leibniz notation, is interpreted as, and it is, a “cancellation” of $du$ (when it's not zero): $$\frac{dy}{dx}=\frac{dy}{\not du}\frac{\not du}{dx}.$$ We represented the formula of integration by substitution as follows: $$\int f(u)\cdot \frac{du}{\not dx}\, \not dx=\int f(u)\, du.$$ So, under a substitution $u=g(x)$ in an integral, we also substitute: $$du=g'\, dx.$$

Definition. A differential form of degree $1$, or simply a $1$-form, is defined as a function of two variables, $$\varphi=\varphi (x,dx)=g(x) \cdot dx,$$ that is linear with respect to the second variable; here $y=g(x)$ is a function of $x$.

Warning: the symbol “$\cdot$” stands for multiplication and it is often omitted.

The point of the new concept is to make a careful distinction between the location, $x$, and the direction, $dx$.

Let's plot this function:

  • first we plot the graph of $g$ (green), which is the restriction of our function $\varphi$ to the fixed value $dx=1$;
  • then we observe that $\varphi$ is $0$ when $dx=0$ and plot those points on the $x$-axis (blue);
  • finally we connect these dots to the curve with straight lines (purple).

The result is this surface:

1-form plotted.png

Note that the sum and the difference, but not the product or quotient, of two $1$-forms is also a $1$-form.
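The remark can be illustrated by treating a $1$-form literally as a function of two variables. In the sketch below, the helper `one_form` and the two sample forms are hypothetical; we test what happens when the second variable, $dx$, is doubled.

```python
def one_form(g):
    """Build the 1-form g(x) * dx as a function of the location x and the direction dx."""
    return lambda x, dx: g(x) * dx

phi = one_form(lambda x: x**2)    # sample form: x^2 dx
psi = one_form(lambda x: 3 * x)   # sample form: 3x dx

total = lambda x, dx: phi(x, dx) + psi(x, dx)
product = lambda x, dx: phi(x, dx) * psi(x, dx)

x, dx = 2, 3
# The sum is linear in dx: doubling dx doubles the value.
assert total(x, 2 * dx) == 2 * total(x, dx)
# The product is quadratic in dx: doubling dx quadruples the value.
assert product(x, 2 * dx) == 4 * product(x, dx)
```

Since the product is not linear in $dx$, it cannot be written as $g(x)\cdot dx$.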

Integration of forms is understood in the same sense as before.

Definition. The integral of a $1$-form $\varphi=g\, dx$ over an interval $[a,b]$ is defined to be $$\int_a^b\varphi=\int_{[a,b]} \varphi=\int_a^bg(x)\, dx.$$ Then the form $g\, dx$ is integrable whenever $g$ is integrable.

Definition. A differential form of degree $0$, or simply a $0$-form, is any function $y=f(x)$ of $x$. Its exterior derivative $df$ is defined to be the $1$-form given by: $$df=f'(x)\, dx.$$

When the name of the function that relates $x$ and $y$ is not provided, the following notation is also common: $$dy=f'(x)\, dx.$$

Thus, the exterior derivative of a function contains all information about its derivative and vice versa. However, the former provides a direct answer to this question:

  • if we are at $x=a$ and make a step $dx$, what is the step $dy$ of $y$?

Example. Suppose $x$ is time and $y=f(x)$ is the location at time $x$. Let's re-state the above question:

  • if at time $x=a$ we are at $y=f(a)$ and then we move for a short time $dx$ more, how far will we go?

The distance is velocity multiplied by time: $$\text{displacement }=f'(a)\cdot dx,$$ but only when the velocity, $f'$, is constant! In the general case, that is just an estimate. $\square$

Example. We have used this algebra for integration by substitution. For the integral $$\int x\sin x^2\, dx,$$ we introduce a new variable $$u=x^2$$ and then compute the exterior derivative of this function: $$du=2x\, dx.$$ Our definition of differential form treats both the integrand above and the last expression as simple cases of multiplication of two numbers. That is why we are at liberty to algebraically manipulate these expressions the way we have. $\square$
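The substitution can also be verified numerically. The integral in the example is indefinite; for this sketch we assume, purely for illustration, the limits $0$ and $b=2$ for $x$, hence $0$ and $b^2$ for $u$. The midpoint Riemann sum below is our own helper, not a method taken from the text.

```python
import math

def midpoint_integral(g, a, b, n=10_000):
    """Midpoint Riemann sum of g over [a, b] with n equal intervals."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

b = 2.0  # assumed upper limit

# Before the substitution: the form x sin(x^2) dx over [0, b].
before = midpoint_integral(lambda x: x * math.sin(x**2), 0.0, b)

# After u = x^2, du = 2x dx: the form (1/2) sin(u) du over [0, b^2].
after = midpoint_integral(lambda u: 0.5 * math.sin(u), 0.0, b**2)

# Antiderivative of (1/2) sin(u) is -(1/2) cos(u).
exact = (1 - math.cos(b**2)) / 2

print(before, after, exact)
```

The three numbers agree up to the error of the Riemann sums.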

In Chapter 14, the main application of differential forms is via their discrete counterpart discussed in the next section. We will also see applications of differential forms in the multidimensional case in the following chapters.

The following is a simple re-statement with our new notation of a familiar theorem.

Theorem (Fundamental Theorem of Calculus). Suppose $\varphi$ is a $1$-form integrable on interval $[a,b]$. Then, $$\int_a^b\varphi =\int_{[a,b]}\varphi= F(b)-F(a),$$ for any $0$-form $F$ that satisfies: $$dF=\varphi.$$

In order to study a real-valued function $y=f(x)$, we now keep track of two variables:

  • the locations, $x$ vs. $y$, and
  • the directions, $dx$ vs. $dy$,

as follows: $$\begin{array}{|c|} \hline \\ \quad (x,dx)\mapsto (y,dy)=(f(x),f'(x)dx). \quad \\ \\ \hline \end{array}$$

A function can be sampled at

  • the nodes, producing a discrete $0$-form, or at
  • the secondary nodes, producing a discrete $1$-form.

Alternatively, we express this idea as sampling of the corresponding differential forms.

Theorem.

  • A differential $0$-form sampled at the nodes of the partition is a discrete $0$-form.
  • A differential $1$-form sampled at the intervals of the partition is a discrete $1$-form; i.e., if $\varphi$ is a differential $1$-form, the corresponding discrete $1$-form is defined by:

$$s\Big([A,B] \Big)=\int_{[A,B]} \varphi.$$
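The two kinds of sampling fit together: sampling $F$ at the nodes and $dF$ at the intervals produces a discrete $0$-form and its difference. The sketch below uses an example of our own choosing, $F(x)=x^3$ and $dF=3x^2\, dx$, on a partition of $[0,3]$ with step $1/2$. The integral over each interval is computed with Simpson's rule, which is exact for quadratic integrands, and exact fractions avoid rounding.

```python
from fractions import Fraction

def F(x):
    return x**3          # the differential 0-form (our example)

def g(x):
    return 3 * x**2      # its exterior derivative is dF = g dx

def simpson(g, A, B):
    """Simpson's rule on [A, B]; exact for polynomials of degree up to 3."""
    return (B - A) / 6 * (g(A) + 4 * g((A + B) / 2) + g(B))

nodes = [Fraction(k, 2) for k in range(7)]   # 0, 1/2, 1, ..., 3

F_nodes = [F(x) for x in nodes]                                   # discrete 0-form
s_edges = [simpson(g, A, B) for A, B in zip(nodes, nodes[1:])]    # discrete 1-form

# The sampled 1-form is the difference of the sampled 0-form ...
assert s_edges == [F_nodes[k] - F_nodes[k - 1] for k in range(1, len(nodes))]
# ... and its sum over the whole partition is F(b) - F(a).
assert sum(s_edges) == F(nodes[-1]) - F(nodes[0])
```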

Example (motion). To follow the idea from the last section, the exterior derivative provides a direct answer to this question:

  • Suppose $x$ is time and $y=f(x)$ is the location at time $x$. If at time $x=a$ we are at $y=f(a)$ and then we move for a short time $dx$ more, how far will we go?

The distance is velocity multiplied by time: $$\text{displacement }=f'(a)\cdot dx,$$ and this time the velocity, $f'$, is assumed to be constant throughout the interval. $\square$