Discrete Calculus: Applied Analysis on Graphs for Computational Science by Grady and Polimeni

Discrete Calculus: Applied Analysis on Graphs for Computational Science by Leo J. Grady, Jonathan R. Polimeni

Description: "The field of discrete calculus, also known as "discrete exterior calculus", focuses on finding a proper set of definitions and differential operators that make it possible to operate the machinery of multivariate calculus on a finite, discrete space. In contrast to traditional goals of finding an accurate discretization of conventional multivariate calculus, discrete calculus establishes a separate, equivalent calculus that operates purely in the discrete space without any reference to an underlying continuous process."

This is an informal review...

Best quote:

Calculus is topology.

The reason is that the matrix of the exterior derivative is simply the transpose of the matrix of the boundary operator. The fact is well known, but I had never seen the depth of this connection. The practical consequences are also significant. Indeed, if you know the boundary of each $k$-cell in a cell complex in terms of $(k-1)$-cells, you also know the exterior derivative of all discrete differential forms (cochains). So, you know calculus.
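A minimal worked example of this (mine, not the book's): for a single oriented edge $e=[u,v]$, the boundary operator gives $\partial e = v - u$, so for a $0$-cochain $f$, $$(df)(e) = f(\partial e) = f(v) - f(u).$$ In matrix terms, if the column of $\partial$ for $e$ is $(-1,1)^T$ with respect to the basis $\{u,v\}$, then the row of $d$ for $e$ is $(-1,1)$: the matrix of $d$ is indeed $\partial^T$.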

This idea inspired me to come up with a formula of my own:

calculus / algebra = topology.

Roughly, you can take all the linear algebra out of calculus via a certain equivalence relation (cohomology). What you end up with are the Betti numbers.

Some random thoughts and quotes below, section by section.

Part I: A Brief Review of Discrete Calculus

Chapter 2. Introduction to Discrete Calculus

p. 14. "...vector calculus -- which is defined only for up to three spatial dimensions..." This is fair only if you limit yourself to physics.

p. 17. It would make sense to state from the beginning that the exterior product of two vectors is a number, to avoid any confusion. Just to say that the exterior product is a "product" satisfying certain properties is circular.

p. 51. "... given a primal $n$-complex, the boundary operator on primal $p$-cells is the transpose of the boundary operator on dual $(n-p+1)$-cells". That's Poincaré duality.

p. 60. "... the boundary operator and the exterior derivative are defined by the structure of the complex itself. In other words, the structure of the derivative operator depends on the topological structure of the space -- in a sense, the graph is the operator."

p. 61. The Laplace-de Rham operator is defined in terms of the exterior derivative and the codifferential: $$\Delta=dd^*+d^*d.$$ "The Laplacian matrix also represents the connectivity of the graph structure, but not the orientation of the edges." For a graph, it can be defined directly (see the sketch below): for each pair of vertices $u,v$,

  • $L_{uv}=-1$ if $u,v$ are adjacent,
  • $L_{uu}=d_u$, the degree of $u$,
  • $L_{uv}=0$ otherwise.
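Here's a minimal sketch of this (my own Python/NumPy, not the book's code): for $0$-forms, $\Delta=d^*d$, so the Laplacian matrix is $d^Td$, where $d$ is the signed incidence matrix of the graph. Reversing any edge's orientation leaves $L$ unchanged, matching the quote.

  import numpy as np

  # The graph Laplacian as d^T d, where d is the matrix of the exterior
  # derivative on 0-forms; the graph is a triangle: edges 0-1, 1-2, 0-2
  edges = [(0, 1), (1, 2), (0, 2)]
  n = 3

  d = np.zeros((len(edges), n))
  for e, (u, v) in enumerate(edges):
      d[e, u], d[e, v] = -1, 1      # (df)[u,v] = f(v) - f(u)

  L = d.T @ d      # degrees on the diagonal, -1 for adjacent pairs
  print(L)
  # [[ 2. -1. -1.]
  #  [-1.  2. -1.]
  #  [-1. -1.  2.]]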

p. 68. Unlike its smooth counterpart, the discrete Fundamental Theorem of Calculus makes sense even for integration over graphs, not just over intervals.

p. 70. The example shows that the difference replacement for the smooth gradient does not produce the desired result: the line integral of the gradient of a function around a $1 \times 1$ square isn't zero, so path independence fails. Score for discrete exterior calculus. The same point can be made with the FTC, but it's not as convincing. Typo: "the central differences gradient approximation" given is in fact the sum (formula 2.116).
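A quick numerical illustration of the same point (my own sketch, not the book's example; the random data, grid, and trapezoid rule are my choices): the circulation of the exterior derivative $df$ around a square is exactly zero by telescoping, while the circulation of the central-difference gradient generally isn't.

  import numpy as np

  rng = np.random.default_rng(0)
  f = rng.standard_normal((5, 5))   # a scalar 0-form on a 5x5 grid

  def central_grad(f, i, j):
      # central-difference gradient at an interior vertex (i, j)
      return np.array([(f[i+1, j] - f[i-1, j]) / 2,
                       (f[i, j+1] - f[i, j-1]) / 2])

  # oriented boundary of a 1x1 square
  loop = [(1, 1), (2, 1), (2, 2), (1, 2), (1, 1)]

  # circulation of the central-difference "gradient" (trapezoid rule per edge)
  circ_central = sum((central_grad(f, *a) + central_grad(f, *b)) / 2
                     @ np.subtract(b, a) for a, b in zip(loop, loop[1:]))

  # circulation of the exterior derivative: (df)[a,b] = f[b] - f[a]
  circ_df = sum(f[b] - f[a] for a, b in zip(loop, loop[1:]))

  print(circ_central)  # generally nonzero: path independence fails
  print(circ_df)       # exactly 0: the sum telescopes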

The most useful chapter.

Chapter 3. Circuit Theory and Other Discrete Physical Models

p. 92. The electric potential is a $0$-form while the current is a $1$-form. All on a graph.
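A sketch of this dictionary (mine, with unit conductances assumed): Ohm's law on a graph reads $i=-c\,df$, with the potential $f$ living on the vertices and the current $i$ on the edges.

  import numpy as np

  # Ohm's law on a graph: the potential f is a 0-form on the vertices,
  # the current i is a 1-form on the edges, and i = -c df; here c = 1
  edges = [(0, 1), (1, 2)]
  f = np.array([5.0, 3.0, 0.0])     # potentials at vertices 0, 1, 2

  df = np.array([f[v] - f[u] for u, v in edges])   # exterior derivative of f
  i = -df      # current flows from high to low potential
  print(i)     # [2. 3.]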


Part II: Applications of Discrete Calculus

Chapter 4. Building a Weighted Complex from Data

p. 125. "... in this chapter we will employ the term 'cycle' to describe a $2$-cell since a cycle of $1$-cells specifies a unique $2$-cell." Hmmm...

Chapter 5. Filtering on Graphs

Meaning: denoising of data given by a $0$-form. The main tool is the discrete Fourier transform (DFT).

p. 155. "... we may generally assume that the noise has high spatial frequency." So, it's the part of the data that changes a lot from one vertex to the next. "Therefore, the goal of most filtering operations is to remove the high frequency noise, while being careful to preserve the high [low?] frequency signal (often modeled as spatial discontinuities)." The latter is an abrupt but one-time change in the value.

p. 156. Filtering = smoothing.
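A minimal sketch of such a filter (my own, not the book's): expand a noisy $0$-form in the eigenvectors of the graph Laplacian, ordered by eigenvalue (frequency), and zero out the high-frequency coefficients.

  import numpy as np

  # Low-pass filtering of a 0-form on a path graph via the graph Fourier
  # transform: the eigenvectors of the Laplacian L = D - A, ordered by
  # eigenvalue, play the role of the frequencies
  n = 50
  A = np.zeros((n, n))
  for i in range(n - 1):
      A[i, i+1] = A[i+1, i] = 1
  L = np.diag(A.sum(axis=1)) - A

  t = np.linspace(0, 1, n)
  noisy = np.sin(2 * np.pi * t) + 0.3 * np.random.default_rng(1).standard_normal(n)

  eigvals, U = np.linalg.eigh(L)    # eigenvalues come in ascending order
  coeffs = U.T @ noisy              # graph Fourier transform of the data
  coeffs[8:] = 0                    # keep only the 8 lowest-frequency modes
  denoised = U @ coeffs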

p. 170. The problem with all of these energy minimization algorithms is the assumption that all high frequencies represent noise.

Chapter 6. Clustering and Segmentation

This part is much less mathematical. As a result, a lot of background (and authors' expertise) is left out.

Chapter 7. Manifold Learning and Ranking

The main topic here is dimensionality reduction. The main approach is via geometry that relies on the metric acquired from the embedding of the manifold in the Euclidean space. The topological approach would be to use the Vietoris-Rips complex and persistent homology.

Terminology: "Nodes" = vertices, "network" = $1$-dimensional cell complex.

p. 243. "... we will interpret the data associated with each node as the node coordinates."

p. 244. "The starting point for almost all manifold learning techniques is to apply some method for connecting data points as nodes via edges to form a network."

p. 244. "The classical approach to dimensionality reduction is Principal Component Analysis (PCA)... finds a suitable set of basis vectors and projects the high-dimensional data onto these vectors."
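A minimal PCA sketch (mine, via the SVD; the book doesn't prescribe an implementation, and the toy data is made up): center the data and project onto the top $k$ right singular vectors.

  import numpy as np

  # PCA via the SVD: project high-dimensional data onto the top-k principal axes
  rng = np.random.default_rng(0)
  X = rng.standard_normal((100, 10))   # 100 toy data points in R^10

  Xc = X - X.mean(axis=0)              # center the data
  U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
  k = 2
  Y = Xc @ Vt[:k].T                    # k-dimensional coordinates of each point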

p. 245. "The standard approach to creating an edge structure is to find the k-nearest neighbors of each node ... then assign distance weights by the Euclidean distance..." The metric is local, then extended globally. Why does the metric have to be Euclidean? For data, the Euclidean distance is meaningless. In fact, why does the dataset have to be a metric space in the first place? The coordinate system is made up. And there are non-Euclidean topologies too.
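For concreteness, here is a sketch of that standard construction (mine; Euclidean by convention, as the book says, but nothing stops you from swapping in another metric):

  import numpy as np

  # k-nearest-neighbor graph with Euclidean distance weights
  rng = np.random.default_rng(2)
  X = rng.standard_normal((20, 3))     # 20 data points in R^3
  k = 3

  D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)   # pairwise distances
  edges = {}
  for u in range(len(X)):
      for v in np.argsort(D[u])[1:k+1]:    # skip u itself (distance 0)
          edges[(min(u, v), max(u, v))] = D[u, v]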

p. 251. "... the clustering problem is therefore a reduction to the ultimate low-dimensionality of one dimension." Hmmm, shouldn't that be dimension zero?

p. 259. "The significance of a node is represented numerically by a variable assigned to each node which is called a node rank." BTW, the meaning of the word "ranking" is supposed to be "relative position", not a score or a rating. Regardless, ranking of data means global optimization based on the scores of the nodes, or local ranking. The latter might be thought of as a flow, a discrete vector field, or a $1$-form $g$. If $g=df$, we're done -- $f$ is the rating. If not, use the Hodge decomposition (see the sketch below).
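A sketch of that recipe (my own, in the spirit of HodgeRank; the graph and numbers are made up): least squares recovers the gradient part $df$ of $g$, and the residual measures the inconsistency of the comparisons.

  import numpy as np

  # Pairwise comparisons as a 1-form g: edge (u, v) carries how much
  # v is preferred over u; the edge 0->2 disagrees slightly with 0->1->2
  edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
  g = np.array([1.0, 1.0, 2.5, 0.5])
  n = 4

  d = np.zeros((len(edges), n))       # matrix of the exterior derivative
  for e, (u, v) in enumerate(edges):
      d[e, u], d[e, v] = -1, 1

  # least-squares potential: the gradient part of the Hodge decomposition
  f, *_ = np.linalg.lstsq(d, g, rcond=None)
  residual = g - d @ f                # the non-gradient (cyclic) part
  print(f - f.min())                  # the ratings, normalized to start at 0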

p. 259. "The first principle underpinning the PageRank algorithm is that the rank of a node should be high if (i) the ranks of the nodes linking to that node are high and (ii) should be low if the node has few incoming edges or if the nodes linking to the node have low rank." (Google should have stuck to this principle; now everything is a secret.) The algorithm always converges, if the graph is aperiodic, to the same thing, but is the principle enough to make the result unique? NO. It's interesting that PageRank, as well as other algorithms mentioned here, uses made-up parameters.
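A sketch of the standard power-iteration version (mine; the damping factor $0.85$ is exactly one of those made-up parameters):

  import numpy as np

  # PageRank by power iteration on a tiny directed graph
  links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}   # node -> nodes it links to
  n, alpha = 4, 0.85

  M = np.zeros((n, n))                # column-stochastic link matrix
  for u, outs in links.items():
      for v in outs:
          M[v, u] = 1 / len(outs)

  r = np.full(n, 1 / n)
  for _ in range(100):
      r = alpha * M @ r + (1 - alpha) / n   # teleportation term forces convergence
  print(r)      # node 2 ranks highest: it has the most incoming links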

p. 257. "... the PageRank algorithm has two natural interpretations as a PDE on a graph (diffusion and advection)..."

p. 258. The example of dimensionality reduction for a dataset of images of livers discovers two essential parameters (corresponding to some protrusions). The dataset has 108(!) images. Not very convincing. A better experiment: create a digital geometric figure with no symmetries, create the dataset of all photos (projections) from all possible directions, run your algorithm -- it's supposed to discover the rotation group $SO(3)$.

p. 260. "... a manifold learning technique should be able to flatten a low-dimensional dataset embedded into a high-dimensional space regardless of how the dataset is embedded." This analogy doesn't seem to apply to closed manifolds (without boundary), like the sphere.

Do these algorithms ever fail?

Chapter 8. Measuring Networks

p. 279. "Computation of the $p$-th Betti number is possible from the incidence matrices via the formula" $$B_p=|S_p|-\operatorname{rank} N_{p+1}-\operatorname{rank} N_p,$$ where $|S_p|$ is the number of $p$-cells in the complex. Interestingly, the formula is referenced to Lefschetz's (of the Lefschetz number) book from 1942.
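A sketch of this formula (my own Python; signed incidence matrices and real coefficients assumed):

  import numpy as np

  def betti_numbers(cells, N):
      # cells[p] = number of p-cells; N[p] = signed incidence matrix of the
      # p-cells (their boundaries in terms of (p-1)-cells); N[0] is unused
      top = len(cells) - 1
      rank = lambda p: np.linalg.matrix_rank(N[p]) if 1 <= p <= top else 0
      return [cells[p] - rank(p + 1) - rank(p) for p in range(top + 1)]

  # hollow triangle: 3 vertices, 3 edges, no 2-cells
  N1 = np.array([[-1,  0,  1],
                 [ 1, -1,  0],
                 [ 0,  1, -1]])
  print(betti_numbers([3, 3], [None, N1]))   # -> [1, 1]: one component, one loop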

p. 280. The Betti numbers can be computed "by counting the number of [zero] eigenvalues of $L_p$."
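Continuing the example, a sketch (mine) verifying the same Betti numbers as the dimensions of the kernels of the Laplacians:

  import numpy as np

  # Betti numbers as the numbers of zero eigenvalues of the Laplacians,
  # again for the hollow triangle
  d0 = np.array([[-1,  1,  0],
                 [ 0, -1,  1],
                 [ 1,  0, -1]])      # exterior derivative: 0-forms -> 1-forms
  L0 = d0.T @ d0                     # 0-Laplacian (there is no d_{-1} term)
  L1 = d0 @ d0.T                     # 1-Laplacian (no 2-cells, so no d_1 term)

  zeros = lambda L: int(np.sum(np.abs(np.linalg.eigvalsh(L)) < 1e-10))
  print(zeros(L0), zeros(L1))        # -> 1 1, i.e. B_0 = B_1 = 1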

p. 283. "Gaussian curvature is typically defined in terms of the principal curvatures. However, an alternative definition of Gaussian curvature is provided by a special case of the Gauss-Bonnet theorem that applies equally well to the continuous and discrete cases... based on the idea that, in the plane, the angles between adjacent edges connecting a given vertex to its neighbors always sum to $2\pi$." $$\iint_M K\,dV+\int_{\partial M}k_g\,ds=2\pi\chi(M),$$ where $k_g$ is the geodesic curvature along the boundary of the manifold and $\chi(M)$ is the Euler characteristic. The discrete case is: $$K(v)=\frac{2\pi-\sum_u\theta_u}{A_v},$$ where the summation of the angles is over all adjacent vertices $u$ and $A_v$ is the area of the dual (Voronoi) cell.
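A sketch of the discrete formula (mine; the neighbors are assumed given in cyclic order around the vertex, and the dual cell area is taken as $1$): at a corner of a cube the three right angles give a deficit of $\pi/2$.

  import numpy as np

  def angle_deficit_curvature(v, neighbors, area):
      # v: vertex position; neighbors: adjacent vertex positions in cyclic
      # order around v; area: the area A_v of the dual (Voronoi) cell
      total = 0.0
      for a, b in zip(neighbors, neighbors[1:] + neighbors[:1]):
          u1, u2 = a - v, b - v
          c = u1 @ u2 / (np.linalg.norm(u1) * np.linalg.norm(u2))
          total += np.arccos(np.clip(c, -1, 1))
      return (2 * np.pi - total) / area

  # corner of a cube: three right angles, deficit 2*pi - 3*pi/2 = pi/2
  v = np.zeros(3)
  nbrs = [np.array(p, dtype=float) for p in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]]
  print(angle_deficit_curvature(v, nbrs, area=1.0))   # ~ 1.5708 = pi/2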

p. 284. The definition of the Gaussian curvature "holds for finer and finer mesh spacing and is equivalent to the continuous definition in the limit. In contrast, mean curvature does not possess an analogous integral definition." A similar problem arises with lengths of curves.


Appendix A. Representation and Storage of a Graph and Complex

Appendix B. Optimization

p. 295. "The preliminary mechanism for applying these discrete calculus operators was to formulate algorithms as variational or energy minimization problems."

Appendix C. The Hodge Theorem: A Generalization of the Helmholtz Decomposition

See Hodge decomposition.

The black-and-white illustrations sometimes have captions that mention colors. Clearly, this is a compilation of older papers.

It was, overall, very educational for me.

At $106.02, it's expensive. Online version is here.