Matrices: part 2
Properties of matrix operations
The properties of matrix multiplication are very much like the ones for numbers.
What's not working is commutativity: $$AB \neq BA,$$ generally.
Indeed, let's try putting random numbers in two matrices and see if they "commute":
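A quick numerical check (a sketch in Python; numpy is an assumption, any matrix library would do). The "random" entries are fixed here for reproducibility:

```python
import numpy as np

# Two arbitrarily chosen 2x2 matrices:
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

print(A @ B)   # rows: [2, 1] and [4, 3]
print(B @ A)   # rows: [3, 4] and [1, 2]
```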
Already different!
To understand the idea of non-commutativity, one might think of matrices as operations carried out in manufacturing. Suppose $A$ is polishing and $B$ is painting. Clearly, $$AB \neq BA.$$
Mathematically, $$\begin{array}{} x \stackrel{A}{\rightarrow} y \stackrel{B}{\rightarrow} z\\ x \stackrel{B}{\rightarrow} y \stackrel{A}{\rightarrow} z \end{array}$$
Does this have to be the same? Not in general.
Let's put them together:
$$ \newcommand{\ra}[1]{\!\!\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} % \begin{array}{llllllllllll} x & \ra{A} & y \\ \da{B} & \searrow ^C & \da{B} \\ y & \ra{A} & z \end{array} $$
This is called a commutative diagram if $C$ makes sense, i.e., $C=AB=BA$.
Let's think of matrices as functions. Then non-commutativity isn't surprising. Indeed, take $$f(x)=\sin x, \quad g(x)=e^x.$$ Then $$\begin{array}{} gf &\neq fg \\ e^{\sin x} &\neq \sin e^x \end{array}$$
Functions don't commute, in general, and neither do matrices.
But some do. For example, $A^2=A \cdot A$ makes sense, as well as: $$A^n = A \cdot \ldots \cdot A$$ (where there are $n$ $A$'s). Then $$A^m \cdot A^n = A^n A^m,$$ i.e., powers commute.
Also, consider diagonal matrices $A = \{a_{ij}\}$ with $a_{ij}=0$ if $i \neq j$. Diagonal matrices commute with each other; in particular, a scalar matrix $aI$ commutes with any matrix.
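A quick check of both claims (a sketch; numpy assumed). Note that a general diagonal matrix need not commute with a non-diagonal matrix; only scalar matrices commute with everything:

```python
import numpy as np

S = 3 * np.eye(2)               # scalar matrix 3I
D = np.diag([1.0, 2.0])         # general diagonal matrix
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])      # not diagonal

# A scalar matrix commutes with any matrix:
assert np.allclose(S @ A, A @ S)

# But a general diagonal matrix need not:
assert not np.allclose(D @ A, A @ D)
```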
In particular, define $$I_n = \left[ \begin{array}{} 1 & 0 & 0 & \ldots \\ 0 & 1 & 0 & \ldots \\ \vdots \\ 0 & 0 & \ldots & 1 \end{array} \right]$$
or $a_{ij}=\left\{ \begin{array}{} 1 & i = j \\ 0 & {\rm otherwise} \end{array} \right.$
It is called the identity matrix.
Property: $IA=A$ and $AI=A$.
Provided, of course, $A$ has the appropriate dimensions:
- $I_nA=A$ makes sense if $A$ is $n \times m$.
- $AI_n=A$ makes sense if $A$ is $m \times n$.
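These identities are easy to check numerically (a sketch; numpy assumed), with the two identity sizes made explicit:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])       # a 2x3 matrix

# I_2 A = A needs the 2x2 identity; A I_3 = A needs the 3x3 one.
assert np.array_equal(np.eye(2, dtype=int) @ A, A)
assert np.array_equal(A @ np.eye(3, dtype=int), A)
```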
Proof: Go back to the definition; we prove $AI_n=A$. Write $I_n=\{b_{ij}\}$ and suppose $AI_n = \{c_{pq}\}$. Then, by the definition of matrix multiplication,
$$c_{pq} = a_{p1}b_{1q} + a_{p2}b_{2q} + \ldots + a_{pn}b_{nq}.$$
But
$b_{iq} = \left\{ \begin{array}{} 1 & i=q \\ 0 & i \neq q \end{array} \right.$, so, as $i$ runs from $1$ to $n$, $$b_{1q},b_{2q},\ldots,b_{nq} = 0,0,\ldots,0,1,0,\ldots,0,$$ with the $1$ in the $q$th position.
Substitute these:
$$c_{pq}=a_{p1}\cdot 0 + a_{p2} \cdot 0 + \ldots + a_{p,q-1}\cdot 0 + a_{pq}\cdot 1 + a_{p,q+1} \cdot 0 + \ldots + a_{pn} \cdot 0 = a_{pq}.$$
The left hand side is the $pq$ entry of $AI_n$ and the right hand side is the $pq$ entry of $A$, so $AI_n = A$.
$I_nA=A$ is similar. $\blacksquare$
So, $I_n$ behaves like $1$ among numbers. It's also called the multiplicative identity, as opposed to the additive identity, $0$.
Both addition and multiplication can be carried out within the set of $n \times n$ matrices, $n$ fixed.
You can take this idea quite far:
$$e^A := I + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \ldots + \frac{1}{n!}A^n + \ldots$$
Yes, the exponential of a matrix!
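As a sanity check, one can sum the series numerically (a sketch; numpy assumed, and `exp_series` is a made-up helper name). For the nilpotent matrix $A$ below, $A^2=0$, so the series terminates: $e^A=I+A$.

```python
import numpy as np
from math import factorial

def exp_series(A, terms=20):
    """Partial sum I + A + A^2/2! + ... of the matrix exponential."""
    result = np.zeros_like(A, dtype=float)
    power = np.eye(A.shape[0])          # A^0 = I
    for n in range(terms):
        result += power / factorial(n)
        power = power @ A               # next power of A
    return result

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
print(exp_series(A))    # equals I + A, since A^2 = 0
```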
Consider the main properties of matrix multiplication:
- Associative Law 1: $(AB)C=A(BC)$, proof is routine.
- Associative Law 2: $r(AB)=A(rB)=(rA)B$
- Distributive Law 1: $A(B+C)=AB+AC$
- Distributive Law 2: $(B+C)A = BA + CA$
Proof: Suppose $A = \{a_{ij}\}, B=\{b_{ij}\}, C=\{c_{ij}\}$ and $B+C=D=\{d_{ij}\}$, $S=(B+C)A = \{s_{ij}\}$.
Consider $$s_{pq} = d_{p1}a_{1q} + d_{p2}a_{2q} + \ldots + d_{pn}a_{nq},$$ where $d_{ij}=b_{ij}+c_{ij}$. Substitute
$$s_{pq} = (b_{p1}+c_{p1})a_{1q} + (b_{p2}+c_{p2})a_{2q} + \ldots + (b_{pn}+c_{pn})a_{nq},$$
expand and rearrange
$$\begin{array}{} &= b_{p1}a_{1q} + c_{p1}a_{1q} + \ldots + b_{pn}a_{nq} + c_{pn}a_{nq} \\ &= (b_{p1}a_{1q}+\ldots+b_{pn}a_{nq}) + (c_{p1}a_{1q} + \ldots + c_{pn}a_{nq}) \\ &= pq {\rm \hspace{3pt} entry \hspace{3pt} of \hspace{3pt}} BA + pq {\rm \hspace{3pt} entry \hspace{3pt} of \hspace{3pt}} CA. \end{array}$$
So $(B+C)A = BA + CA$. $\blacksquare$
Let's rewrite this computation with $\Sigma$-notation.
$$\begin{array}{} s_{pq} &= \sum_{i=1}^n d_{pi}a_{iq} \\ &= \sum_{i=1}^n(b_{pi}+c_{pi})a_{iq} \\ &= \sum_{i=1}^n(b_{pi}a_{iq} + c_{pi}a_{iq}) \\ &= \sum_{i=1}^nb_{pi}a_{iq} + \sum_{i=1}^n c_{pi}a_{iq}. \end{array}$$
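The distributive laws are also easy to spot-check numerically (a sketch; numpy assumed):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = np.array([[1, 0], [0, 1]])

# Distributive Law 1: A(B + C) = AB + AC
assert np.array_equal(A @ (B + C), A @ B + A @ C)
# Distributive Law 2: (B + C)A = BA + CA
assert np.array_equal((B + C) @ A, B @ A + C @ A)
```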
Back to systems of linear equations. Recall
- equation: $ax=b$, $a \neq 0 \rightarrow$ solve $x=\frac{b}{a}$
- system: $AX=B$, ?? $\rightarrow$ solve $X = \frac{B}{A}$.
To do that we need to define division of matrices!
Specifically, define an analogue of the reciprocal $\frac{1}{a}$, the inverse $A^{-1}$ (both notation and meaning like inverse function!).
But is it $X=BA^{-1}$ or $A^{-1}B$?
We know these don't have to be the same!
We define division via multiplication. Recall:
- $x=\frac{1}{a}$ if $ax=1$ or $xa=1$.
Similarly,
- $X=A^{-1}$ if $AX=I$ or $XA=I$.
Question: Which one is it?
Answer: Both!
The first one is called the "right-inverse" of $A$ and the second one is called the "left-inverse" of $A$.
More about inverse matrices
Definition: Given an $n \times n$ matrix $A$, its inverse is a matrix $B$ that satisfies $AB=I$, $BA=I$.
Note: "its" is used to hide the issue of whether it should be "an inverse" or "the inverse".
Question: Is it well defined?
- Existence: not always.
How do we know? As simply as this: go to $n=1$, where matrices are numbers. Then $A=0$ does not have an inverse.
Definition: If the inverse exists, $A$ is called invertible. Otherwise, $A$ is singular.
- Uniqueness: Yes.
Theorem: If $A$ is invertible, then its inverse is unique.
Proof: Suppose $B,B'$ are inverses of $A$. Then (compact proof):
$$B=BI=B(AB')=(BA)B'=IB'=B'.$$
Or, with more details.
Try (1) $BA=I$ and (2) $AB'=I$.
Multiply the equation by $B'$ on both sides:
$$(BA)B'=IB'.$$
Use associativity on the left and the definition of multiplicative inverse on the right:
$$B(AB')=B'.$$
Use the second equation:
$$BI=B'.$$
Use the definition of multiplicative inverse on the left:
$$B=B'.$$ $\blacksquare$
(The second proof was less compact, but doesn't miss details.)
So, if $A$ is invertible, then $A^{-1}$ is the inverse.
Example: What about $I$? $I^{-1}=I$, because $II=I$.
Example: Verify: $$\left[ \begin{array}{} 3 & 0 \\ 0 & 3 \end{array} \right]^{-1} = \left[ \begin{array}{} \frac{1}{3} & 0 \\ 0 & \frac{1}{3} \end{array} \right]$$
It can be easily justified without computation: $$(3I)^{-1} = \frac{1}{3}I.$$
Generally, given $A = \left[ \begin{array}{} a & b \\ c & d \end{array} \right]$, find the inverse.
Start with:
$A^{-1} = \left[ \begin{array}{} x & y \\ u & v \end{array} \right]$.
Then
$$\left[ \begin{array}{} a & b \\ c & d \end{array} \right] \left[ \begin{array}{} x & y \\ u & v \end{array} \right] = \left[ \begin{array}{} 1 & 0 \\ 0 & 1 \end{array} \right]$$
Solve this matrix equation. Expand:
$$\left[ \begin{array}{} ax+bu & ay+bv \\ cx+du & cy+dv \end{array} \right] = \left[ \begin{array}{} 1 & 0 \\ 0 & 1 \end{array} \right]$$
Break apart: $$\begin{array}{} ax+bu=1, & ay+bv=0 \\ cx+du=0, & cy+dv=1 \end{array}$$
The equations are linear. Solve them (exercise).
A shortcut formula: $$A^{-1}=\frac{1}{ad-bc} \left[ \begin{array}{} d & -b \\ -c & a \end{array} \right]$$
Let's verify: $$\begin{array}{} AA^{-1} &= \left[ \begin{array}{} a & b \\ c & d \end{array} \right] \frac{1}{ad-bc} \left[ \begin{array}{} d & -b \\ -c & a \end{array} \right] \\ &= \frac{1}{ad-bc}\left[ \begin{array}{} a & b \\ c & d \end{array} \right] \left[ \begin{array}{} d & -b \\ -c & a \end{array} \right] \\ &= \frac{1}{ad-bc} \left[ \begin{array}{} ad-bc & ab-ba \\ cd - dc & -bc + da \end{array} \right] \\ &= \left[ \begin{array}{} 1 & 0 \\ 0 & 1 \end{array} \right] = I \end{array}$$
Verify also $A^{-1}A=I$.
To find $A^{-1}$, we solve a system, and it might not have a solution: what is $\frac{1}{ad-bc}$ if $ad-bc=0$?
Fact: For $A=\left[ \begin{array}{} a & b \\ c & d \end{array} \right]$, $A^{-1}$ exists if and only if $ad-bc \neq 0$.
$D=ad-bc$ is called the determinant of $A$.
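The shortcut formula is a one-liner in code (a sketch; numpy assumed, and `inverse_2x2` is a made-up helper name):

```python
import numpy as np

def inverse_2x2(A):
    """Inverse of a 2x2 matrix via the shortcut formula; None if singular."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    if det == 0:
        return None          # determinant zero: the matrix is singular
    return np.array([[ d, -b],
                     [-c,  a]]) / det

A = np.array([[1.0, 1.0],
              [1.0, 2.0]])
Ainv = inverse_2x2(A)
print(Ainv)                          # rows: [2, -1] and [-1, 1]
assert np.allclose(A @ Ainv, np.eye(2))
```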
Example: $0 = \left[ \begin{array}{} 0 & 0 \\ 0 & 0 \end{array} \right]$ is singular, by the formula.
In dimension $n$, the zero matrix is singular: $0 \cdot B = 0 \neq I$ for every $B$, so $AB=I$ has no solution. Just as for $0 \in {\bf R}$: $\frac{1}{0}$ is undefined.
Example: Other singular matrices:
$$\left[ \begin{array}{} 1 & 1 \\ 1 & 1 \end{array} \right], \left[ \begin{array}{} 0 & 0 \\ 0 & 1 \end{array} \right], \left[ \begin{array}{} 1 & 3 \\ 1 & 3 \end{array} \right]$$
What do they have in common?
What if we look at them as pairs of vectors?
$$\left\{ \left[ \begin{array}{} 1 \\ 1 \end{array} \right], \left[ \begin{array}{} 1 \\ 1 \end{array} \right] \right\}, \left\{ \left[ \begin{array}{} 0 \\ 0 \end{array} \right], \left[ \begin{array}{} 0 \\ 1 \end{array} \right] \right\}, \left\{ \left[ \begin{array}{} 1 \\ 1 \end{array} \right], \left[ \begin{array}{} 3 \\ 3 \end{array} \right] \right\}$$
What's so special about them as opposed to $\left\{ \left[ \begin{array}{} 1 \\ 0 \end{array} \right], \left[ \begin{array}{} 0 \\ 1 \end{array} \right] \right\}$?
The vectors are linearly dependent!
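Consistent with the determinant criterion: for each of these three matrices, $ad-bc=0$ (a quick check; numpy assumed):

```python
import numpy as np

singular = [np.array([[1, 1], [1, 1]]),
            np.array([[0, 0], [0, 1]]),
            np.array([[1, 3], [1, 3]])]

for M in singular:
    a, b = M[0]
    c, d = M[1]
    print(a * d - b * c)   # 0 each time: the columns are linearly dependent
```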
Theorem: If $A$ is invertible, then so is $A^{-1}$. And $({A^{-1}})^{-1}=A$.
Proof: Later.
Theorem: If $A,B$ are invertible, then so is $AB$, and $$(AB)^{-1} = B^{-1}A^{-1}.$$
This can be explained by thinking about matrices as functions: to undo the composition "first $B$, then $A$", one undoes $A$ first and then $B$.
Proof: Later.
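A numerical spot check of the order-reversal formula (a sketch; numpy's `linalg.inv` assumed):

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 2.0]])   # det = 1, invertible
B = np.array([[2.0, 1.0], [1.0, 1.0]])   # det = 1, invertible

# Inverting a product reverses the order of the factors:
assert np.allclose(np.linalg.inv(A @ B),
                   np.linalg.inv(B) @ np.linalg.inv(A))
```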
Computing inverses
It's easy in dimension 1: $$[a]^{-1} = \left[ \frac{1}{a} \right],$$ provided $a \neq 0$.
Next, dimension 2.
Given $A=\left[ \begin{array}{} a & b \\ c & d \end{array} \right]$, find the inverse.
How? $A^{-1} = \left[ \begin{array}{} x & y \\ u & v \end{array} \right]$.
$$\left[ \begin{array}{} a & b \\ c & d \end{array} \right] \left[ \begin{array}{} x & y \\ u & v \end{array} \right] = \left[ \begin{array}{} 1 & 0 \\ 0 & 1 \end{array} \right].$$
Break the right hand side into two vectors. Then
$$A\left[ \begin{array}{} x \\ u \end{array} \right] = \left[ \begin{array}{} 1 \\ 0 \end{array} \right]$$ and
$$A\left[ \begin{array}{} y \\ v \end{array} \right] = \left[ \begin{array}{} 0 \\ 1 \end{array} \right]$$
We have two matrix equations, each a system of linear equations. Their augmented matrices are:
$$\left[ \begin{array}{cc|c} a & b & 1 \\ c & d & 0 \end{array} \right] \text{ and } \left[ \begin{array}{cc|c} a & b & 0 \\ c & d & 1 \end{array} \right].$$
Observation: Same left part!
It's $A$. Only the right hand side varies.
How do we solve these systems? The usual way: row operations, until we reach the reduced row echelon form.
How do we usually decide on which row operations to do? We look only at the left hand side.
Conclusion: the same row operations will solve both systems and we can solve them simultaneously!
We create
$\left[ \begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array} \right]$,
the "expanded" augmented matrix.
Now we find the reduced row echelon form of this matrix.
Example: $A = \left[ \begin{array}{} 1 & 1 \\ 1 & 2 \end{array} \right]$, find $A^{-1}$.
$$\left[ \begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 1 & 2 & 0 & 1 \end{array} \right] \rightarrow R_1-R_2 \left[ \begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 0 & -1 & 1 & -1 \end{array} \right] \rightarrow -R_2 \left[ \begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 0 & 1 & -1 & 1 \end{array} \right] \rightarrow R_1-R_2 \left[ \begin{array}{cc|cc} 1 & 0 & 2 & -1 \\ 0 & 1 & -1 & 1 \end{array} \right]$$
This is the reduced row echelon form.
Note: the row operations "killed" the off-diagonal $1$'s of the left half and turned the $-1$ into a $1$, so that the left half matches the identity matrix.
Yes, we have $I$ on the right.
We can now break this extended matrix into two again:
$$\left[ \begin{array}{} 1 & 0 \\ 0 & 1 \end{array} \right] \left[ \begin{array}{} x \\ u \end{array} \right] = \left[ \begin{array}{} 2 \\ -1 \end{array} \right]$$
$$\left[ \begin{array}{} 1 & 0 \\ 0 & 1 \end{array} \right] \left[ \begin{array}{} y \\ v \end{array} \right] = \left[ \begin{array}{} -1 \\ 1 \end{array} \right],$$
Cancel $I$ here. Then we have:
$\left[ \begin{array}{} x \\ u \end{array} \right] = \left[ \begin{array}{} 2 \\ -1 \end{array} \right], \left[ \begin{array}{} y \\ v \end{array} \right] = \left[ \begin{array}{} -1 \\ 1 \end{array} \right].$
So $A^{-1} = \left[ \begin{array}{} 2 & -1 \\ -1 & 1 \end{array} \right]$ (verify).
We acquire the inverse at the end of the Gaussian elimination of the extended augmented matrix. That's the plan for how to do it in general.
Theorem: Given a square matrix $A$, $n \times n$,
- form the extended augmented matrix $[A | I_n]$
(which is $n$ systems of linear equations with same coefficients, $A$, and right hand side that comes from $I_n$),
- find the reduced row echelon form of this matrix: $[A'|B]$.
Then, if $A'=I_n$, then $A$ is invertible and $A^{-1}=B$; otherwise $A$ is singular.
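The theorem translates directly into an algorithm: row-reduce the expanded augmented matrix $[A\,|\,I_n]$ and read off the right half. A minimal sketch, not a production implementation (`invert` is a made-up name; numpy assumed, and partial pivoting is added for numerical stability):

```python
import numpy as np

def invert(A, tol=1e-12):
    """Gauss-Jordan elimination on [A | I]; returns A^{-1}, or None if singular."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])       # the "expanded" augmented matrix
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col])) # pick the largest pivot below
        if abs(M[pivot, col]) < tol:
            return None                               # no leading 1 in this column: singular
        M[[col, pivot]] = M[[pivot, col]]             # swap rows
        M[col] /= M[col, col]                         # scale to get a leading 1
        for row in range(n):
            if row != col:
                M[row] -= M[row, col] * M[col]        # eliminate the rest of the column
    return M[:, n:]                                   # left half is I, right half is A^{-1}

print(invert(np.array([[1, 1],
                       [1, 2]])))    # rows: [2, -1] and [-1, 1]
print(invert(np.array([[1, 1],
                       [2, 2]])))    # None: singular
```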
Example: Find $A^{-1}$ for $A = \left[
\begin{array}{}
1 & 1 \\
2 & 2
\end{array}
\right]$.
$$\left[ \begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 2 & 2 & 0 & 1 \end{array} \right] \rightarrow 2R_1-R_2 \left[ \begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 0 & 0 & 2 & -1 \end{array} \right]$$
The bottom row indicates that there is no solution. Therefore there is no inverse: the left part cannot be reduced to the identity.
Exercise: Find $\left[ \begin{array}{} 1 & 2 & 3 \\ 0 & 1 & 1 \\ 1 & 2 & 0 \end{array} \right]^{-1}$.
Definition: The rank of a matrix is the number of leading $1$'s in its reduced row echelon form.
Example: a matrix whose reduced row echelon form has four leading $1$'s has rank $=4$.
Corollary: An $n \times n$ matrix is invertible if and only if its rank is equal to $n$.
Also, suppose $\{v_1,\ldots,v_n\}$ are vectors. How many linearly independent vectors does it contain (compare to Reduction Theorem)?
Form a matrix $[v_1,v_2,\ldots,v_n]$ with $v_1,\ldots,v_n$ as columns. Then find its rank.
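With numpy this is a single call; `numpy.linalg.matrix_rank` computes the rank numerically (via singular values rather than row reduction, but the answer agrees):

```python
import numpy as np

# Columns [1,1] and [3,3] are linearly dependent: rank 1.
V = np.column_stack([[1, 1], [3, 3]])
print(np.linalg.matrix_rank(V))   # 1

# Columns [1,0] and [0,1] are independent: rank 2, i.e., full rank (invertible).
W = np.column_stack([[1, 0], [0, 1]])
print(np.linalg.matrix_rank(W))   # 2
```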
Theorem: Given $n \times n$ matrices $A,B$, suppose
- 1. $AB=I$; then
- 2. $BA=I$.
So $A,B$ are inverses of each other.
Note: the definition requires both 1. and 2.; the theorem says that, for square matrices, either one implies the other.