This site is devoted to mathematics and its applications. Created and run by Peter Saveliev.

Topology of graphs

From Mathematics Is A Science
Jump to navigationJump to search

Graphs and their realizations

Here we take our first step toward algebraic topology.

We will concentrate initially on only discrete structures, such as graphs. The reason why has become clear from our attempts to answer some obvious topological questions about specific spaces. Given a subset of the Euclidean space:

  • verifying that it is path-connected requires testing infinitely many pairs of points; while
  • verifying that it is simply-connected requires testing infinitely many pairs of loops.

Even though there are a few theorems that we can use as shortcuts, we have no algebraic means for solving these problems computationally, nor for computing the number of path-components or the number of holes.

In the rest of the chapter we will:

  • define discrete structures and find discrete analogues of these topological ideas,
  • develop algebraic methods for solving the two topological problems above for discrete structures, and
  • develop further methods for studying transformations of these structures.

The very first data structure that needs to be analyzed topologically is graphs:

Graph examples.png

The examples of graphs shown come as networks of bridges, atoms in a molecule, related individuals, or computers.

Definition. A graph $G =(N,E)$ consists of two finite sets:

  • the set of nodes $N=N_G$, and
  • the set of edges $E=E_G$, which is a set of pairs of nodes.

We don't allow distinct edges for the same pair of nodes, nor doubles, such as $AA$, and we assume that $AB=BA$.

Example. In the example of a field of flowers, each type is represented by a node of a graph and the overlapping areas by its edges:

Flower field and its graph.png


Example. Let's define a graph $G$ by:

  • $N:=\{ 1,2,3,4,5 \}$, and
  • $E:=\{12,23,45,14 \}$.

Let us be clear from the beginning that the dataset given above is the whole graph and what's below is just an illustration:

Graph example.png

To draw an illustration, we first put the nodes in a more or less circular arrangement and then add the corresponding edges.


A common representation of a graph is via its incidence matrix, i.e., the matrix with a $1$ in the $ij$ position if the graph contains edge $ij$ and $0$s elsewhere. For $G$ we have: $$\begin{array}{c|cccccc} N & 1 & 2 & 3 & 4 & 5 \\ \hline 1 & 0 & 1 & 0 & 1 & 0 \\ 2 & 1 & 0 & 1 & 0 & 0 \\ 3 & 0 & 1 & 0 & 0 & 0 \\ 4 & 1 & 0 & 0 & 0 & 1 \\ 5 & 0 & 0 & 0 & 1 & 0 . \end{array}$$

One can also think of a graph as just a collection of points in space, also called “vertices”, or “nodes” connected by paths, called “edges”. Thus, a graph is made of finitely many pieces that are known to be path-connected! One can then study the topology of such sets by means of graphs represented as discrete structures:

Graph of the circle.png

Definition. A realization $|G|$ of graph $G$ is a subset of the Euclidean space that is the union of the following two subsets of the space:

  • a collection of points $|N|$, one for each node $N$ in $G$, and
  • a collection of paths $|E|$, one for each edge $E$ in $G$, with no intersections other than the points in $|N|$.

In other words:

  • $a,b \in |N|$ are connected by a path $p \in |E|$ if and only if there is an edge in $AB \in E$, where $A,B$ are the nodes corresponding to points $a,b$:
  • $a=|A|,\ b=|B|$, and
  • $p=|AB|$.

The definition is justified by the following.

Theorem. Every finite graph can be realized in ${\bf R}^3$.

Of course, there may be many different realizations of the same graph.

Exercise. Provide an explicit realization of a graph with $4$ nodes in ${\bf R}^3$. Hint: start with two nodes.

Now we provide discrete analogs of our topological definitions.

Connectedness and components

We start with the discrete analog of the most basic topological property.

Definition. A edge-path, or simply path, from node $A \in N_G$ to node $B \in N_G$ in graph $G$ is a sequence of edges $$A_0A_1,\ A_1A_2,...\ , A_{n-1}A_n \in E_G$$ with $A_0=A,\ A_n=B$. A graph $G$ is called edge-connected, or connected, if for any pair of nodes $A,B \in N$, there is a path from $A$ to $B$.


We now connect this new concept to its origin.

Theorem. A graph $G$ is edge-connected if and only if its realization $|G|$ is path-connected.

Proof. $[\Rightarrow ]$ Suppose $G$ is an edge-connected graph and $a,b \in |G|$ are two points in its realization. Then $a,b$ belong to the realizations of some edges of the graph: $a \in |AA'|,b \in |BB'|$. Since $G$ is edge-connected, there is an edge-path $$A_0A_{1},A_1A_{2},...,A_{n-1}A_{n}$$ in the graph from $A$ to $B$. Then a path from $a$ to $b$ is constructed from these pieces:

  • the piece of $|AA'|$ that connects $a$ to $|A|$,
  • the realizations $|A_iA_{i+1}|$ of the edges $A_iA_{i+1}$, and
  • the piece of $|BB'|$ that connects $|B|$ to $b$.
Edge-connectedness vs path-connectedness.png

$[ \Leftarrow ]$ Suppose we have a realization $|G|$ of $G$ and it is path-connected and suppose $A,B \in N_G$ are two nodes of the graph. Then we know that there is a path $p:[0,1]\to |G|$ from $|A|$ to $|B|$ in $|G|$. Now, starting from $t=0$ every time $p(t)$ is a realization of a node $Q$, add $Q$ to the list $L$ of nodes. Now, this list may be infinite. Replace each repetition $QQ$ in $L$ with $Q$, then $L$ is finite: $$L=\{Q_0=A,Q_1,...,Q_n=B \}.$$ Finally, our edge-path in $G$ from $A$ to $B$ is $$P=\{Q_0Q_1,Q_1Q_2,...,Q_{n-1}Q_n \}.$$


Exercise. Provide a formula for the path in the first part of the proof and prove that it is continuous.

Exercise. Show that, in the second part of the proof, the list $L$ may indeed be infinite. Also prove that, after removing repetitions, $L$ becomes finite.

Just as before, in order to evaluate the topology of the graph, we make a transition to homology.

Definition. Two nodes $A,B \in N_G$ of graph $G$ are called homologous if there is an edge-path from $A$ to $B$.

To avoid confusion, let's emphasize here that homology is a relation between the nodes of the graph. It certainly is not a relation between the points of a realization, nor is it a relation between the edges of the graph! After all, the nodes in a graph normally represent some agents while the edges may represent some pair-wise relations between them, such as: people and their family relations. That's why we can't allow them to mix.

Theorem. Homology is an equivalence relation on the set of nodes of the graph.

Exercise. Prove the theorem.

The equivalence classes are called edge-components, or simply components, of $G$.

Theorem. For any graph $G$,

$\#$ of edge-components of $G = \#$ of path-components of $|G|$.

Exercise. Prove the theorem.

The problem is solved: we have expressed the topological property entirely in terms of data and now the computer can do this job.

That's homology of dimension $0$ (nodes) considered for objects of dimension $1$ (graphs). Higher dimensions will pose a more significant challenge.

Holes vs cycles

Definition. An edge-path in a graph $G$ from node $A$ to node $A$ is called a cycle, or a $1$-cycle, in $G$.

In the last section, we studied holes and tunnels in objects by capturing them with $1$-cycles. Then, in order to deal with the unavoidable redundancy, we declared $1$-cycles homologous and, therefore, indistinguishable, if they form the boundary of some surface. However, as there is no meaningful way to fit a surface into a graph, homology won't be useful or necessary.

Is there still redundancy? Yes, but of a different kind. The realization of graph $G$ below is the figure eight and it appears to have $2$ holes. However, the graph itself has (at least) $3$ cycles, $a,b,c$!

TopologicalFigure8 and its cycles.png

How do we remove the redundancy? The observation is algebraic: $$a+b=c.$$ In other words, the cycles are linearly dependent! That's why $c$ doesn't count.

This brings us to the general idea:

$\#$ of holes $= \#$ of linearly independent cycles.

It is time now to start to recognize the need for algebra in topology.

We might try something familiar first in order to turn nodes and edges into algebraic entities, such as the union. Unfortunately, the algebra of unions is inadequate as there is no appropriate meaning for subtraction: sometimes $(A\cup B) \setminus B \ne A$. The algebra that does work is familiar.

Example. Let's consider our graph again:

TopologicalFigure8 as a graph.png

Now consider our two cycles: $$\begin{array}{llllll} a & =12 + 25 + 56 + 61 \\ b & =23 + 34 + 45 + 52 \\ a+b & =12 + 25 + 56 + 61 + 23 + 34 + 45 + 52 \\ & =12 + (25 + 25) + 56 + 61 + 23 + 34 + 45 \\ & =12 + 56 + 61 + 23 + 34 + 45 \\ & = c \end{array}$$ Here we cancel the edge that appears twice.


Such cancelling can be assured if we use the binary arithmetic: $$x+x=0,\ \forall x.$$ It is the arithmetic of integers modulo $2$, i.e., the algebra of ${\bf Z}_2$.

This idea takes us deep into algebra. One unexpected conclusion is that

$0$ is also a cycle!

Now, we list the four $1$-cycles that have no repeated edges: $$Z_1:=\{ 0 ,a,b,a+b\}.$$ Since $a$ and $b$ generate the rest of the cycles, these two are the only ones that matter. Therefore,

$\#$ of holes $= 2.$

Note: If we choose to stay within the realm of linear algebra, we have to either think of $Z_1$ as a vector space with coefficients over ${\bf Z}_2$ or to enlarge $Z_1$ by including all of its real-valued linear combinations: $ra+sb,\ \forall r,s \in {\bf R}$. Then $\{a,b \}$ is a basis of this vector space and the idea becomes:

$\#$ of holes $= \dim Z_1 .$

However, in order to retain the cancellation property, we would have to deal with the complexity of directed edges, with $AB \ne BA$. We will postpone this option until later.

Above, we have justified the use of binary arithmetic for edges but it is just as appropriate for the algebra of nodes. It suffices to consider the boundaries of edge-paths. Clearly, the boundary of an edge-path is the sum of its end-nodes. For example, the path $12+23$ in the above graph has boundary $1+3$. On the other hand, its boundary is the sum of the boundaries of $12$ and $23$. Those are $1+2$ and $2+3$. We end up with $$1+3 = (1+2) + (2+3).$$ Hence, $2+2=0$. That's binary arithmetic again.

Let's review:

  • for dimension $0$, we captured components by means of nodes, and
  • for dimension $1$, we captured holes by means of cycles.

In either case, there is redundancy: many nodes per component and many cycles per hole. To resolve the redundancy, we did the following:

  • for dimension $0$, we counted the equivalence classes of nodes, and
  • for dimension $1$, we counted the linearly independent cycles.

These are two very different solutions to two similar problems. In our study of dimension $2$ (e.g., surfaces) and above, we'll need to use a combination of these two techniques.

The Euler characteristic

Definition. The Euler characteristic of graph $G$ is $$\chi(G) = \# \text{ of vertices } - \# \text{ of edges}.$$

The topological significance of this number is revealed if we consider a simple, i.e., without self-intersections, curve $C$. It might be a realization of various graphs $G$, but let's suppose $G$ is a sequence of $n$ consecutive edges:

Euler of segment.png

Then $$\chi (G) = n-(n-1)=1.$$ So, the number is independent from $n$! Therefore, $\chi (G)$ could be an attribute, i.e., a “characteristic”, of the curve itself that distinguishes it from curves that do have self-intersections. To justify this conclusion, however, we need the following.

Theorem. If a simple path is a realization of a graph, this graph is a sequence of $n\ge 1$ consecutive edges.

Exercise. Prove the theorem.

Exercise. Show that if $C$ consists of $k$ path-components each of which is a simple curve and $C=|G|$, then $\chi (G) =k$.

Then, is $\chi (G)$ the number of path-components of $|G|$? The example of a triangle $T$ shows that this is not correct as $\chi (T) = 0$. Below is best we can do.

Theorem. If $T$ is a tree, i.e., a connected graph with no cycles, then $$\chi(T) = 1.$$

Proof. Idea: Remove an edge then use induction.

So, we use induction on the number of edges on the graph.

First, tree $T$ with a single edge has $\chi(T) = 1.$ Now, we assume that any tree with fewer than $n$ edges satisfies this identity. Suppose that $T$ is a tree with $n$ edges.

Tree splitting.png

Remove any edge, $e=AB$, from $T$. The result is a new graph $G$ and we have $$N_G=G_T,E_G=E_T - \{e \}.$$

What kind of graph is it?

It is disconnected. Indeed, let $H:=[A],\ K:=[B]$, the edge-components of $A,B$ respectively. As we know, either graph is connected. Secondly, removing an edge can't create cycles, so both are trees. Thirdly, there is no path from $A$ to $B$, because if there was one, say, $P$, then the combination of $P$ and $e$ would be a path from $A$ to $A$, a cycle in the tree $T$. Therefore, $H$ and $K$ are disjoint.

So, $G$ splits into two trees $H$ and $K$ and each has fewer than $n$ edges. Therefore, by assumption, we have $$\chi(H) = \chi(K) = 1.$$

Let's compute now: $$\begin{array}{llllll} \chi(T) & =&&\# \text{ of vertices of } T - \# \text{ of edges of } T\\ & =&&\# \text{ of vertices of } G - ( \# \text{ of edges of } G + 1) \\ & =&&\# \text{ of vertices of } G - \# \text{ of edges of } G - 1 \\ & =&& \binom {\# \text{ of vertices of } H} {+ \# \text{ of vertices of } K} - \binom {\# \text{ of edges of } H} {+ \# \text{ of edges of } K} -1\\ & =&&\big( \# \text{ of vertices of } H - \# \text{ of edges of } H \big)\\ & &+&\big(\# \text{ of vertices of } K - \# \text{ of edges of } K \big) - 1\\ & =&&\chi(H) \\ & &+&\chi(K) -1\\ &=&&1+1-1\\ &=&&1. \end{array}$$


Exercise. Find and fix the two gaps in the proof. Hint: First prove that removing an edge can't create cycles.

Exercise. Repeat the proof of the theorem for $e$ an “end-edge”.

Exercise. What about the converse?

Corollary. If $G$ consists of $n$ disjoint trees, then $\chi (G) =n$.

Exercise. Prove the corollary.

Exercise. Prove the following generalization of the above theorem.

Theorem. If $G$ is a graph, then $$\chi(T) \le 1.$$ Moreover, $\chi(G) = 1$ if and only if $G$ is a tree.

Holes of planar graphs

What is a hole in the graph? The question isn't as simple as it appears. Let's investigate.

First, a tree has no holes! At the same time, it has no cycles. This looks like a match. Maybe holes are cycles?

In general, we can't say what a hole is. Even though we now know how to count holes -- as the number of linearly independent cycles -- we can't point at one of the cycles and say “That's a hole and that one isn't”. The example below shows that which of the cycles “naturally” look like holes depends on the realization of the graph:

TopologicalFigure8, 2 realizations, cycles.png


  • 1. for the first realization, the holes “are” cycles $a$ (red) and $b$ (blue) but not $c$ (green); while
  • 2. for the second realization, it's $b$ and $c$.

This ambiguity is caused by our topological point of view; we are allowed to bend stretch etc., and the results have to hold. But this ambiguity is, in fact, a good news because it is fully matched by the ambiguity of the algebra. Indeed, if $Z_1=\{0,a,b,c=a+b\}$, then

  • 1. $a$ and $b$ are generators, but
  • 2. so are $b$ and $c$!

(In the language of linear algebra, these are two bases of this vector space.)

We still would like to confirm our intuition about what holes in a graph are. For that, we will limit our attention to a simpler kind -- planar graphs. Those are the graphs that can be realized in the plane:

Planar graph and components.png

The idea is, once a realization $|G|$ of planar graph $G$ is chosen, the holes become visible as path-components of the complement of $|G|$.

We rely on the following two theorems (for the proof of the first see Munkres, Topology A First Course, p. 374).

Theorem (Jordan Curve Theorem). The complement of a simple closed curve in the plane has two path-components (one bounded and one unbounded).

Theorem. If a simple closed curve is a realization of a graph, then this graph is a cycle of $n\ge 1$ consecutive edges.

Exercise. Prove the last theorem. Hint: Try to answer these questions about $G$: how many components? cycles? holes? forks? end-points?

So, we have a one-to-one correspondence between:

  • the path-components of ${\bf R}^2 - |G|$,
  • the loops in $G$ the realization of which bound them, and
  • a certain basis of the space of cycles of $G$.

Let's count holes in this environment.

Suppose we are given a connected planar graph $G$ with a specific realization $|G|$ in the plane. The idea is to remove some edges one by one until we have a tree.

Planar graph and removing edges.png

The process is inductive. We start with our graph $G_0:=G$. Then at each step $k$, we pick an edge in graph $G_k,\ k=0,1,...$, that is a part of a cycle that bounds a hole in $G_k$ and remove it from the graph. This creates a new graph $G_{k+1}$. Of course, removing an edge in a cycle that surrounds a hole “kills” the hole by merging it with another or with the outside. Then $$\# \text{ of holes in } |G_{k+1}| = \# \text{ of holes in } |G_k| -1,$$ and $$\# \text{ of edges in } G_{k+1} = \# \text{ of edges in } G_k -1.$$

Exercise. Prove that after every step the new graph is connected.

Theorem. If a plane realization $|G|$ of a connected graph $G$ has $n$ holes then $$\chi(G) = 1 - n.$$

Proof. Above, we have shown that the graph with $n$ holes will need $n$ edges removed to turn it into a tree. Then the last theorem applies. $\blacksquare$

Example (houses and wells). Suppose we have three houses and three wells arranged as below. All three wells are public but, since the house-owners don't get along, they want to make paths from their houses to the wells that don't intersect. The first attempt fails -- there is no way to find a path from the third house to the second well:

Houses and wells.png

Let's prove that this is indeed impossible. The graph has $6$ nodes and $9$ edges, therefore, by the formula above, there must be $4$ holes in this planar graph. However, the way the paths have to be arranged, each hole, and the outside area too, has to be bounded by at least $4$ edges. This makes it necessary to have $20$ edges, with some repetitions. How many? Each edge is shared by exactly two holes, counting the outside area. Therefore, there must be at least $10$ edges, a contradiction.


Exercise. Can we rearrange the houses and wells so that the paths can be found?

Exercise. Can this problem be solved on a sphere, a cylinder, a torus? Suggest others.

An important way to rephrase the last theorem is the following.

Corollary. For any connected planar graph $G$, $$\# \text{ of holes of } |G|= 1 - \chi(G).$$

Therefore, since the number on the right is independent from a choice of realization, so is the number on the left. This confirms the results of our algebraic analysis of cycles presented above.

While useful, this non-algebraic approach is a poor-man's solution of this problem.

Exercise. Suppose graph $G$ is the disjoint union of $m$ trees, find its Euler characteristic.

Exercise. Suppose graph $G$ has $m$ components and $n$ holes, find its Euler characteristic.

The Euler Formula

We have demonstrated that the Euler characteristic of a graph $G$, $$\chi (G) =\# \text{ of nodes} -\# \text{ of edges },$$ captures enough topology of $G$ so that we can tell trees from non-trees. We have also noticed that it is the minus sign in the formula that ensures that this number is preserved when we add or remove “topologically irrelevant” edges.

Following these ideas, we move from dimension $1$ to dimension $2$ and define the Euler characteristic of a polyhedron $P$ as $$\chi (P) =\# \text{ of vertices } -\# \text{ of edges } +\# \text{ of faces },$$ in hope of capturing some or all of its topology.

A pattern was originally noticed by observing a lot polyhedra:

Convex polyhedra.png

Theorem (Euler Formula). For any convex polyhedron $P$, we have $$\chi (P) =2.$$

The formula will prove to be useful, even though convexity isn't a topological property.

We will consider two proofs. The first one will use a result from the last subsection about planar graphs.

We start by observing that the collection of vertices (nodes) and edges of polyhedron $P$ is always a realization of some graph $G_P$, but is it planar? It is, under an appropriate choice of stereographic projection:

Stereographic projection.png

However, it is easier to see such a realization as it appears on the sphere ${\bf S}^2$ -- via the radial projection -- demonstrated on this cube:

Cube on sphere.png

Then, the corollary from the last subsection is used in the following computation: $$\begin{array}{lllllll} \chi(P) & \\ =\# \text{ of vertices of } P-\# \text{ of edges of } P +\# \text{ of faces of } P \\ =\# \text{ of vertices of } G_P-\# \text{ of edges of } G_P +(\# \text{ of holes of } G_P +1)\\ =\chi(G_P) +\# \text{ of holes of } G_P +1\\ =\chi(G_P) +(1-\chi(G_P)) +1\\ =1+1\\ =2. \end{array}$$

Unfortunately, this proof relies on the convexity of the polyhedron. Meanwhile, we observe that it might take the same number of vertices, edges, and faces to either

  • add a sloped roof to a cube, or
  • make an indentation in it.

Therefore, both constructions preserve the Euler characteristic:

Cube with roofs.png

While the former object is convex, the latter isn't! This suggests that the Euler characteristic has nothing to do with convexity and may depend only on the topology of the polyhedron.

Exercise. Try the torus.

An alternative approach to the proof of the formula is outlined below.

We assume that $P$ is a polyhedron that satisfies these topological properties:

  • 1. $P$ is path-connected;
  • 2. every loop cuts $P$ into two disjoint pieces.

The idea of the proof is to build two trees on $P$ with the following properties:

  • $T$ captures all vertices of $P$ and some edges;
  • $G$ captures all faces of $P$ and the rest of the edges.

Now, this is what we know about graphs: $$V_T-E_T=1,$$ $$V_{G}-E_{G}=1.$$ Add the equations and re-arrange: $$V_T-(E_T+E_G)+V_G=2.$$ An examination reveals that these three terms are:

$\#$ of vertices - $\#$ of edges + $\#$ of faces

of $P$. And the theorem follows.

We just need to build $T$ and $G$ satisfying these conditions.

Exercise. Prove that every graph has a subgraph which

  • contains all of the original vertices, and
  • is a tree.

We apply this result to the graph of $P$. Then, we have its subgraph $T$ which is a tree and

vertices of $T$ = vertices of $P$.

Next, we build $G$ from $T$:

  • $G$'s vertices are the middle points of the faces.
  • $G$'s edges connect $G$'s vertices by crossing the edges of $P$, but not the edges of $T$.
Euler formula proof.png

Then $G$ is called the “dual graph” of $T$.

Exercise. It is always possible to get from one face to any adjacent face without crossing $T$. Why?

Exercise. A regular polyhedron is a polyhedron the faces of which have the same number of edges and the vertices of which have the same number of faces adjacent. Prove that there are exactly five of them (called the Platonic solids):

Platonic solids.png