
Matrices: part 2


Properties of matrix operations

The properties of matrix multiplication are very much like the ones for numbers.

What's not working is commutativity: $$AB \neq BA,$$ generally.

Indeed, let's try putting random numbers into two matrices and see if they "commute":

Non-commutativity of matrix multiplication.png

Already different!
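
Here is the same experiment as a quick script (a minimal Python/NumPy sketch; the entries are arbitrary, and almost any choice would show the same thing):

  import numpy as np

  # Two arbitrary 2x2 matrices -- any "random" choice is likely to work.
  A = np.array([[1, 2],
                [3, 4]])
  B = np.array([[0, 1],
                [1, 1]])

  print(A @ B)   # [[2 3]
                 #  [4 7]]
  print(B @ A)   # [[3 4]
                 #  [4 6]]
  # The products differ already in the first entry: AB != BA.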

To understand the idea of non-commutativity, one might think about matrices as operations carried out in manufacturing. Suppose $A$ is polishing and $B$ is painting. Clearly, $$AB \neq BA.$$

Mathematically, $$\begin{array}{} x \stackrel{A}{\rightarrow} y \stackrel{B}{\rightarrow} z\\ x \stackrel{B}{\rightarrow} y \stackrel{A}{\rightarrow} z \end{array}$$

Does this have to be the same? Not in general.

Let's put them together:

$$ \newcommand{\ra}[1]{\!\!\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} % \begin{array}{llllllllllll} x & \ra{A} & y \\ \da{B} & \searrow ^C & \da{B} \\ y & \ra{A} & z \end{array} $$

This is called a commutative diagram if $C$ makes sense, i.e., $C=AB=BA$.

Let's think of matrices as functions. Then non-commutativity isn't surprising. Indeed, take $$f(x)=\sin x, \quad g(x)=e^x.$$ Then $$\begin{array}{} gf &\neq fg \\ e^{\sin x} &\neq \sin e^x \end{array}$$

Functions don't commute, in general, and neither do matrices.

But some do. For example, $A^2=A \cdot A$ makes sense, as well as: $$A^n = A \cdot \ldots \cdot A$$ (where there are $n$ $A$'s). Then $$A^m \cdot A^n = A^n A^m,$$ i.e., powers commute.

Also, consider diagonal matrices $A = \{a_{ij}\}$ with $a_{ij}=0$ if $i \neq j$. Any two diagonal matrices commute with each other; moreover, a scalar matrix, $\lambda I$, commutes with every matrix.

DiagonalMatrix.png
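
A quick numerical illustration of the last point (Python/NumPy sketch; the matrices are arbitrary): two diagonal matrices commute with each other, but a diagonal matrix and a general matrix need not.

  import numpy as np

  D1 = np.diag([2, 3])                 # diagonal
  D2 = np.diag([5, 7])                 # diagonal
  M  = np.array([[1, 1],
                 [0, 1]])              # not diagonal

  print(np.array_equal(D1 @ D2, D2 @ D1))   # True: diagonal matrices commute
  print(np.array_equal(D1 @ M, M @ D1))     # False: D1 need not commute with M
  print(np.array_equal(2*np.eye(2) @ M, M @ (2*np.eye(2))))   # True: a scalar matrix commutes with every matrix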

In particular, define $$I_n = \left[ \begin{array}{} 1 & 0 & 0 & \ldots \\ 0 & 1 & 0 & \ldots \\ \vdots \\ 0 & 0 & \ldots & 1 \end{array} \right]$$

or $a_{ij}=\left\{ \begin{array}{} 1 & i = j \\ 0 & {\rm otherwise} \end{array} \right.$

It is called the identity matrix.

Property: $IA=A$ and $AI=A$.

Provided, of course, $A$ has the appropriate dimensions:

  • $I_nA=A$ makes sense if $A$ is $n \times m$.
  • $AI_n=A$ makes sense if $A$ is $m \times n$.
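
Before the proof, a quick numerical sanity check of the property and the dimension bookkeeping (Python/NumPy sketch; the matrix is an arbitrary illustration):

  import numpy as np

  A = np.array([[1, 2, 3],
                [4, 5, 6]])                     # A is 2 x 3

  print(np.array_equal(np.eye(2) @ A, A))       # I_2 A = A  (identity on the left)
  print(np.array_equal(A @ np.eye(3), A))       # A I_3 = A  (identity on the right)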

Proof (for $AI_n=A$): Go back to the definition of the product. Write $I_n = \{b_{ij}\}$ and suppose $AI_n = \{c_{pq}\}$. Then, by the definition of $AB$,

$$c_{pq} = a_{p1}b_{1q} + a_{p2}b_{2q} + \ldots + a_{pn}b_{nq}.$$

But

$b_{iq} = \left\{ \begin{array}{} 1 & i=q \\ 0 & i \neq q \end{array} \right.$, so, as $i$ runs from $1$ to $n$, the entries $b_{iq}$ are $$0,0,\ldots,0,1,0,\ldots,0,$$ with the single $1$ in the $q$th place.

Substitute these:

$$c_{pq}=a_{p1}\cdot 0 + a_{p2} \cdot 0 + \ldots + a_{p,q-1}\cdot 0 + a_{pq}\cdot 1 + a_{p,q+1} \cdot 0 + \ldots + a_{pn} \cdot 0 = a_{pq}.$$

The left hand side is the $pq$ entry of $AI_n$ and the right hand side is the $pq$ entry of $A$, so $AI_n = A$.

$I_nA=A$ is similar. $\blacksquare$


So, $I_n$ behaves like $1$ among numbers. It's also called the multiplicative identity, as opposed to the additive identity, $0$.

Both addition and multiplication can be carried out, without restriction, for $n \times n$ matrices, $n$ fixed.

You can take this idea quite far:

$$I + A + \frac{1}{2}A^2 + \frac{1}{3!}A^3 + \ldots + \frac{1}{n!}A^n + \ldots := e^A$$

Yes, the exponential of a matrix!
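
Computationally, $e^A$ is the limit of the partial sums of this series. A minimal sketch (Python/NumPy; the function name exp_series, the sample matrix, and the cut-off are illustrative only, not a claim about how any library computes it):

  import numpy as np
  from math import factorial

  def exp_series(A, terms=20):
      """Partial sum I + A + A^2/2! + ... of the matrix exponential series."""
      n = A.shape[0]
      S = np.eye(n)                  # running sum, starts at I
      P = np.eye(n)                  # running power of A
      for k in range(1, terms):
          P = P @ A                  # P is now A^k
          S = S + P / factorial(k)
      return S

  A = np.array([[0., 1.],
                [0., 0.]])           # A^2 = 0, so the series stops: e^A = I + A
  print(exp_series(A))               # [[1. 1.]
                                     #  [0. 1.]]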

Consider the main properties of matrix multiplication:

  • Associative Law 1: $(AB)C=A(BC)$, proof is routine.
  • Associative Law 2: $r(AB)=A(rB)=(rA)B$
  • Distributive Law 1: $A(B+C)=AB+AC$
  • Distributive Law 2: $(B+C)A = BA + CA$

Proof: Suppose $A = \{a_{ij}\}, B=\{b_{ij}\}, C=\{c_{ij}\}$ and $B+C=D=\{d_{ij}\}$, $S=(B+C)A = \{s_{ij}\}$.

Consider $$s_{pq} = d_{p1}a_{1q} + d_{p2}a_{2q} + \ldots + d_{pn}a_{nq},$$ where $d_{ij}=b_{ij}+c_{ij}$. Substitute

$$s_{pq} = (b_{p1}+c_{p1})a_{1q} + (b_{p2}+c_{p2})a_{2q} + \ldots + (b_{pn}+c_{pn})a_{nq}$$

expand and rearrange

$$\begin{array}{} &= b_{p1}a_{1q} + c_{p1}a_{1q} + \ldots + b_{pn}a_{nq} + c_{pn}a_{nq} \\ &= (b_{p1}a_{1q}+\ldots+b_{pn}a_{nq}) + (c_{p1}a_{1q} + \ldots + c_{pn}a_{nq}) \\ &= pq {\rm \hspace{3pt} entry \hspace{3pt} of \hspace{3pt}} BA + pq {\rm \hspace{3pt} entry \hspace{3pt} of \hspace{3pt}} CA. \end{array}$$

So $(B+C)A = BA + CA$. $\blacksquare$

Let's rewrite this computation with $\Sigma$-notation.

$$\begin{array}{} s_{pq} &= \sum_{i=1}^n d_{pi}a_{iq} \\ &= \sum_{i=1}^n(b_{pi}+c_{pi})a_{iq} \\ &= \sum_{i=1}^n(b_{pi}a_{iq} + c_{pi}a_{iq}) \\ &= \sum_{i=1}^nb_{pi}a_{iq} + \sum_{i=1}^n c_{pi}a_{iq}. \end{array}$$

Back to systems of linear equations. Recall

  • equation: $ax=b$, $a \neq 0 \rightarrow$ solve $x=\frac{b}{a}$
  • system: $AX=B$, ?? $\rightarrow$ solve $X = \frac{B}{A}$.

To do that we need to define division of matrices!

Specifically, define an analogue of the reciprocal $\frac{1}{a}$, the inverse $A^{-1}$ (both notation and meaning like inverse function!).

But is it $X=BA^{-1}$ or $A^{-1}B$?

We know these don't have to be the same!

We define division via multiplication. Recall:

  • $x=\frac{1}{a}$ if $ax=1$ or $xa=1$.

Similarly,

  • $X=A^{-1}$ if $AX=I$ or $XA=I$.

Question: Which one is it?

Answer: Both!

The first one is called the "right-inverse" of $A$ and the second one is called the "left-inverse" of $A$.

More about inverse matrices

Definition: Given an $n \times n$ matrix $A$, its inverse is a matrix $B$ that satisfies $AB=I$, $BA=I$.

Note: "its" is used to hide the issue. It should be "an inverse" or "the inverse".

Question: Is it well defined?

  • Existence: not always.

How do we know? The simplest example: take $n=1$; then matrices are just numbers, and $A=[0]$ does not have an inverse.

Definition: If the inverse exists, $A$ is called invertible. Otherwise, $A$ is singular.

  • Uniqueness: Yes.

Theorem: If $A$ is invertible, then its inverse is unique.

Proof: Suppose $B,B'$ are inverses of $A$. Then (compact proof):

$$B=BI=B(AB')=(BA)B'=IB'=B'.$$

Or, with more details.

Start with (1) $BA=I$ and (2) $AB'=I$.

Multiply equation (1) by $B'$ on the right:

$$(BA)B'=IB'.$$

Use associativity on the left and the identity property on the right:

$$B(AB')=B'.$$

Use the second equation:

$$BI=B'.$$

Use the identity property on the left:

$$B=B'.$$ $\blacksquare$

(The second proof is less compact, but it doesn't skip any details.)

So, if $A$ is invertible, we may speak of the inverse and denote it by $A^{-1}$.

Example: What about $I$? $I^{-1}=I$, because $II=I$.

Example: Verify: $$\left[ \begin{array}{} 3 & 0 \\ 0 & 3 \end{array} \right]^{-1} = \left[ \begin{array}{} \frac{1}{3} & 0 \\ 0 & \frac{1}{3} \end{array} \right]$$

It can be easily justified without computation: $$(3I)^{-1} = \frac{1}{3}I.$$

Generally, given $A = \left[ \begin{array}{} a & b \\ c & d \end{array} \right]$, find the inverse.

Start with:

$A^{-1} = \left[ \begin{array}{} x & y \\ u & v \end{array} \right]$.

Then

$$\left[ \begin{array}{} a & b \\ c & d \end{array} \right] \left[ \begin{array}{} x & y \\ u & v \end{array} \right] = \left[ \begin{array}{} 1 & 0 \\ 0 & 1 \end{array} \right]$$

Solve this matrix equation. Expand:

$$\left[ \begin{array}{} ax+bu & ay+bv \\ cx+du & cy+dv \end{array} \right] = \left[ \begin{array}{} 1 & 0 \\ 0 & 1 \end{array} \right]$$

Break apart: $$\begin{array}{} ax+bu=1, & ay+bv=0 \\ cx+du=0, & cy+dv=1 \end{array}$$

The equations are linear; solving them is left as an exercise.

A shortcut formula: $$A^{-1}=\frac{1}{ad-bc} \left[ \begin{array}{} d & -b \\ -c & a \end{array} \right]$$

ExplainingMatrixInverseShortcut.png

Let's verify: $$\begin{array}{} AA^{-1} &= \left[ \begin{array}{} a & b \\ c & d \end{array} \right] \frac{1}{ad-bc} \left[ \begin{array}{} d & -b \\ -c & a \end{array} \right] \\ &= \frac{1}{ad-bc}\left[ \begin{array}{} a & b \\ c & d \end{array} \right] \left[ \begin{array}{} d & -b \\ -c & a \end{array} \right] \\ &= \frac{1}{ad-bc} \left[ \begin{array}{} ad-bc & -ab+ba \\ cd - dc & -bc + da \end{array} \right] \\ &= \left[ \begin{array}{} 1 & 0 \\ 0 & 1 \end{array} \right] = I \end{array}$$

Verify also $A^{-1}A=I$.

To find $A^{-1}$, we solve a system, and it might not have a solution: what if $ad-bc=0$ in $\frac{1}{ad-bc}$?

Fact: For $A=\left[ \begin{array}{} a & b \\ c & d \end{array} \right]$, $A^{-1}$ exists if and only if $ad-bc \neq 0$.

$D=ad-bc$ is called the determinant of $A$.
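
The shortcut and the determinant test fit in a few lines (a Python sketch; the function name inverse_2x2 and the sample matrices are just illustrations):

  def inverse_2x2(a, b, c, d):
      """Inverse of [[a, b], [c, d]] by the shortcut formula, or None if singular."""
      det = a*d - b*c
      if det == 0:
          return None                        # determinant 0: no inverse
      return [[ d/det, -b/det],
              [-c/det,  a/det]]

  print(inverse_2x2(1, 1, 1, 2))    # [[2.0, -1.0], [-1.0, 1.0]]
  print(inverse_2x2(1, 1, 1, 1))    # None -- this matrix is singular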


Example: $0 = \left[ \begin{array}{} 0 & 0 \\ 0 & 0 \end{array} \right]$ is singular, by the formula.

In dimension $n$, the zero matrix is singular: $0 \cdot B = 0 \neq I$ for every $B$, so $AB=I$ is impossible.

Just as for $0 \in {\bf R}$: $\frac{1}{0}$ is undefined.


Example: Other singular matrices: $$\left[ \begin{array}{} 1 & 1 \\ 1 & 1 \end{array} \right], \left[ \begin{array}{} 0 & 0 \\ 0 & 1 \end{array} \right], \left[ \begin{array}{} 1 & 3 \\ 1 & 3 \end{array} \right]$$

What do they have in common?

What if we look at them as pairs of vectors?

$$\left\{ \left[ \begin{array}{} 1 \\ 1 \end{array} \right], \left[ \begin{array}{} 1 \\ 1 \end{array} \right] \right\}, \left\{ \left[ \begin{array}{} 0 \\ 0 \end{array} \right], \left[ \begin{array}{} 0 \\ 1 \end{array} \right] \right\}, \left\{ \left[ \begin{array}{} 1 \\ 1 \end{array} \right], \left[ \begin{array}{} 3 \\ 3 \end{array} \right] \right\}$$

What's so special about them as opposed to $\left\{ \left[ \begin{array}{} 1 \\ 0 \end{array} \right], \left[ \begin{array}{} 0 \\ 1 \end{array} \right] \right\}$?

The vectors are linearly dependent!

Theorem: If $A$ is invertible, then so is $A^{-1}$. And $({A^{-1}})^{-1}=A$.

Proof: Later.

Theorem: If $A,B$ are invertible, then so is $AB$, and $$(AB)^{-1} = B^{-1}A^{-1}.$$

This can be explained by thinking about matrices as functions:

InverseOfProduct.png

Proof: Later.
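
A numerical sanity check of the formula (NumPy sketch; the two invertible matrices are chosen arbitrarily):

  import numpy as np

  A = np.array([[1., 1.],
                [1., 2.]])
  B = np.array([[2., 1.],
                [1., 1.]])

  lhs = np.linalg.inv(A @ B)
  rhs = np.linalg.inv(B) @ np.linalg.inv(A)   # note the reversed order
  print(np.allclose(lhs, rhs))                # True
  print(np.allclose(lhs, np.linalg.inv(A) @ np.linalg.inv(B)))   # False, generally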


Computing inverses

It's easy in dimension 1: $$[a]^{-1} = \left[ \frac{1}{a} \right],$$ provided $a \neq 0$.

Next, dimension 2.

Given $A=\left[ \begin{array}{} a & b \\ c & d \end{array} \right]$, find the inverse.

How? $A^{-1} = \left[ \begin{array}{} x & y \\ u & v \end{array} \right]$.

Matrix multiplication 2x2.png

$$\left[ \begin{array}{} a & b \\ c & d \end{array} \right] \left[ \begin{array}{} x & y \\ u & v \end{array} \right] = \left[ \begin{array}{} 1 & 0 \\ 0 & 1 \end{array} \right].$$

Break the right hand side into two vectors. Then

$$A\left[ \begin{array}{} x \\ u \end{array} \right] = \left[ \begin{array}{} 1 \\ 0 \end{array} \right]$$ and

$$A\left[ \begin{array}{} y \\ v \end{array} \right] = \left[ \begin{array}{} 0 \\ 1 \end{array} \right]$$

We have two matrix equations, each a system of linear equations. Their augmented matrices are:

$$\left[ \begin{array}{cc|c} a & b & 1 \\ c & d & 0 \end{array} \right] {\rm and} \left[ \begin{array}{cc|c} a & b & 0 \\ c & d & 1 \end{array} \right].$$

Observation: Same left part!

It's $A$. Only the right hand side varies.

How do we solve these systems? The usual way: row operations, with the reduced row echelon form as the end result.

How do we usually decide on which row operations to do? We look only at the left hand side.

Conclusion: the same row operations will solve both systems and we can solve them simultaneously!

We create

$\left[ \begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array} \right]$,

the "expanded" augmented matrix.

Now we find the row reduced echelon form of this matrix.

Example: $A = \left[ \begin{array}{} 1 & 1 \\ 1 & 2 \end{array} \right]$, find $A^{-1}$.

$$\left[ \begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 1 & 2 & 0 & 1 \end{array} \right] \rightarrow R_1-R_2 \left[ \begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 0 & -1 & 1 & -1 \end{array} \right] \rightarrow -R_2 \left[ \begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 0 & 1 & -1 & 1 \end{array} \right] \rightarrow R_1-R_2 \left[ \begin{array}{cc|cc} 1 & 0 & 2 & -1 \\ 0 & 1 & -1 & 1 \end{array} \right]$$

This is the reduced row echelon form.

Note: we "killed" the $1$ in the top right and bottom left of the first matrix and we turned the $-1$ in the bottom right of the second matrix into a $1$ so that the matrix matched the identity matrix.

Yes, we have $I$ on the left.

We can now break this extended matrix into two again:

$$\left[ \begin{array}{} 1 & 0 \\ 0 & 1 \end{array} \right] \left[ \begin{array}{} x \\ u \end{array} \right] = \left[ \begin{array}{} 2 \\ -1 \end{array} \right]$$

$$\left[ \begin{array}{} 1 & 0 \\ 0 & 1 \end{array} \right] \left[ \begin{array}{} y \\ v \end{array} \right] = \left[ \begin{array}{} -1 \\ 1 \end{array} \right],$$

Since multiplying by $I$ changes nothing, we can drop it. Then we have:

$\left[ \begin{array}{} x \\ u \end{array} \right] = \left[ \begin{array}{} 2 \\ -1 \end{array} \right], \left[ \begin{array}{} y \\ v \end{array} \right] = \left[ \begin{array}{} -1 \\ 1 \end{array} \right].$

So $A^{-1} = \left[ \begin{array}{} 2 & -1 \\ -1 & 1 \end{array} \right]$ (verify).

We obtain the inverse at the end of Gaussian elimination applied to the extended augmented matrix. That's the general plan.

Theorem: Given a square matrix $A$, $n \times n$,

  • form the extended augmented matrix $[A | I_n]$

(which represents $n$ systems of linear equations with the same coefficient matrix, $A$, and right hand sides that come from the columns of $I_n$),

  • find the reduced row echelon form of this matrix: $[A'|B]$.

Then, if $A'=I_n$, then $A$ is invertible and $A^{-1}=B$; otherwise $A$ is singular.
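
Here is a minimal sketch of this procedure in Python with NumPy (the function name invert is just for this sketch, and partial pivoting is added for numerical stability; a hand computation follows the same steps):

  import numpy as np

  def invert(A, tol=1e-12):
      """Invert a square matrix by row-reducing [A | I]; return None if singular."""
      A = np.array(A, dtype=float)
      n = A.shape[0]
      M = np.hstack([A, np.eye(n)])              # the extended augmented matrix [A | I]
      for col in range(n):
          pivot = col + np.argmax(np.abs(M[col:, col]))
          if abs(M[pivot, col]) < tol:
              return None                        # no leading 1 in this column: singular
          M[[col, pivot]] = M[[pivot, col]]      # swap rows to bring the pivot up
          M[col] /= M[col, col]                  # scale to make the leading entry 1
          for row in range(n):
              if row != col:
                  M[row] -= M[row, col] * M[col] # clear the rest of the column
      return M[:, n:]                            # [I | B]: read off B = A^{-1}

  print(invert([[1, 1], [1, 2]]))   # [[ 2. -1.]
                                    #  [-1.  1.]]
  print(invert([[1, 1], [2, 2]]))   # None -- singular, as in the next example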


Example: Find $A^{-1}$ for $A = \left[ \begin{array}{} 1 & 1 \\ 2 & 2 \end{array} \right]$.

$$\left[ \begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 2 & 2 & 0 & 1 \end{array} \right] \rightarrow 2R_1-R_2 \left[ \begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 0 & 0 & 2 & -1 \end{array} \right]$$

The bottom row indicates that there is no solution. Therefore, there is no inverse: we cannot produce the identity on the left.

Exercise: Find $\left[ \begin{array}{} 1 & 2 & 3 \\ 0 & 1 & 1 \\ 1 & 2 & 0 \end{array} \right]^{-1}$.


Definition: The rank of a matrix is the number of leading $1$'s in its reduced row echelon form.

Example: Rank $=4$:

Rank4Matrix.png

Corollary: An $n \times n$ matrix is invertible if and only if its rank is equal to $n$.

Also, suppose $\{v_1,\ldots,v_n\}$ is a set of vectors. How many linearly independent vectors does this set contain (compare to the Reduction Theorem)?

Form a matrix $[v_1,v_2,\ldots,v_n]$ with $v_1,\ldots,v_n$ as columns. Then find its rank.
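
A quick way to carry this out numerically (NumPy sketch; np.linalg.matrix_rank computes the rank by a different internal method, but for small exact examples it agrees with counting the leading $1$'s in the reduced row echelon form):

  import numpy as np

  # Columns are the vectors v_1, v_2, v_3 (an arbitrary illustration).
  V = np.column_stack([[1, 1, 0],
                       [2, 2, 0],
                       [0, 1, 1]])
  print(np.linalg.matrix_rank(V))   # 2: v_2 = 2 v_1, so only two columns are independent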

Theorem: Given $n \times n$ matrices $A,B$, suppose

  • 1. $AB=I$;

then

  • 2. $BA=I$.

So $A,B$ are inverses of each other.

Note: the definition requires both 1. and 2.