This site is being phased out.

Social choice

From Mathematics Is A Science
Revision as of 18:48, 30 November 2015 by imported>WikiSysop
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigationJump to search

The paradox of social choice

Let's review what we have learned about the problem of social choice.

Suppose we are to develop a procedure for finding a fair compromise on a choice of location (such as the camp on the shore of a lake) for two individuals:

Beech fork circular lake.png

We discovered that there is no such solution when the homology of the forest is non-trivial, such as one with a lake in the middle.

This is the general setup. There are $m$ voters, or agents, making their selections:

  • the space of choices is a simplicial complex $W$;
  • the choice made by the $k$th voter is a vertex $A$ in $W$.

Then a solution to the social choice problem is a compromise-producing rule, i.e., a function $$f:(A_1,...,A_m) \mapsto B,$$ where $B$ is some vertex in $K$, that satisfies the following three conditions:

  • Continuity (Stability Axiom). The function $f:W^m \to W$ generated by its values on the vertices is a simplicial map.
  • Symmetry (Anonymity Axiom). The function is symmetric:

$$fs=f,\ \forall s\in S_m.$$

  • Diagonality (Unanimity Axiom). The function restricted to the diagonal of $W^m$ is the identity:

$$f\delta = \operatorname{Id}.$$ Here, and below, $\delta$ is the diagonal function: $$\delta (x)=(x,...,x).$$

The analysis relied on the concept of a mean of degree $m$ on set $X$ as a function $f:X^m\to X$ which is both symmetric and diagonal. Algebraic means are homomorphisms while topological means are continuous maps.

Theorem. There is an algebraic mean of degree $m$ on an abelian group $G$ if and only if $G$ allows division by $m$. In that case, the mean is unique and is given by the standard formula.

Theorem. Suppose $G$ is a free abelian group and suppose $f:G^m\to G$ is a symmetric homomorphism such that, for some integer $q$, we have: $$f\delta = q\operatorname{Id}.$$ Then $m\big| q$. In that case, $f$ is a multiple of the sum, $$\Sigma(x_1,...,x_m)=x_1+...+x_m.$$

From these theorems, we concluded that the topological diagram below cannot be completed, because the algebraic one cannot be completed (over ring $R={\bf Z}$): $$ \newcommand{\ra}[1]{\!\!\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} % \begin{array}{cccccccccccc} & X & \ra{\delta} & X^m &&&& H_p(X) & \ra{\delta _*} & H_p(X^m) \\ & _{\operatorname{Id}} & \searrow & \ \ \da{f=?} &&\ra{\quad H \quad} && _{\operatorname{Id}} & \searrow & \ \ \ \da{f_*=?} \\ & & & X&&&& & & H_p(X). \end{array} $$ The topological conclusion is the following.

Theorem. Suppose $X$ is path-connected and has torsion-free integral homology. If there is a topological mean on $X$ then $X$ is acyclic.

As a corollary, we proved the following “impossibility theorem”.

Theorem (Impossibility). Suppose the space of choices $W$ is path-connected and has torsion-free homology. Then the social choice problem on $W$ has no solution unless $W$ is acyclic.

The conclusion, though surprising, isn't counter-intuitive. It is a common sense observation that excluding some options might make compromise impossible. Of course, a compromise-producing rule also becomes impossible.

Example (political creeds). The six statements below are meant to represent an open cover of the political spectrum (in the US). In other words, we assume that every person will support at least one of these statements:

  • 1. Government is a problem solver.
  • 2. Government is a force for social justice.
  • 3. Private property is a cornerstone of society.
  • 4. Capitalism is the problem.
  • 5. Government is the problem.
  • 6. State should not exist.

We call the sets of individuals supporting these statements $U_1,...,U_6$ respectively. We now assign letters to the intersections of these sets. This way we classify all individuals based on which statements they support under the assumption that no-one supports more than two:

  • 1 and 2: $D:=U_1\cap U_2$,
  • 1 and 3: $R:=U_1\cap U_3$,
  • 2 and 4: $S:=U_2\cap U_4$,
  • 3 and 5: $L:=U_3\cap U_5$,
  • 4 and 6: $C:=U_4\cap U_6$,
  • 5 and 6: $A:=U_5\cap U_6$.

We now build the nerve of this cover. The intersections become the vertices and the sets become the edges:

Political spectrum as S1.png

The result is the circle ${\bf S}^1$. Keep in mind that only the adjacency is represented in the complex and only the adjacency is illustrated; none of these concepts apply: “left-right”, “close-far”, “between”, “opposite”, or “extreme”:

Political spectrum as S1 distorted.png

The theorem above then claims that, if these issues are put to the vote, there can be no electoral system that would always produce a fair compromise.

$\square$

Exercise. For the above example, use the possible sizes of the voter populations to create a filtration and evaluate its persistent homology.

A similar analysis shows that a compromise on colors may also be impossible -- without gray:

Colors.png

In an attempt to resolve the “paradox”, we will allow choices that are more complex than just a single location or outcome. The topology of the space of choices will no longer be a problem.

Ratings, comparisons, ranking, and preferences

Instead of a single location, each hiker can express a more complex set of preferences.

For example, the hiker can assign a number to each location in the forest reflecting his appreciation (i.e., its utility) of this choice. In the discrete interpretation, we suppose there are three camp sites, $A,B,C$, set up on the shore of the lake. Then the hiker $X$ assigns a number to each camp site.

Beech fork as triangle.png

Then $X$'s vote is a triple $(a,b,c)\in R^3$, and so is $Y$'s. The goal is to find a triple that best approximates the desires of the two. A compromise-producing rule for the two hikers is then a function: $$f: R^3\times R^3 \to R^3,$$ where $R={\bf R}$ or ${\bf Z}$ is our ring of coefficients. Even though Continuity doesn't apply anymore, Symmetry and Diagonality are natural requirements. We will see that such a function might not exist, depending on our choice of $R$.

The main examples of this, more sophisticated choice-making are:

  • ratings: $X(A)=1,\ X(B)=4$, etc.;
  • comparisons: $x(AB)=-1,\ x(BC)=3$, etc.;
  • rankings: $X(A)=\#1,\ X(B)=\#2$, etc.; and
  • preferences: $A < _x B,\ B < _x C$, etc.

Can any of these votes be combined into one compromise vote following the fairness principles presented in the last subsection?

Generally, there are $n$ alternatives or candidates: $$A:=\{0,1,2,...,n-1\}=\{A_0,A_1, A_2, ... ,A_{n-1}\}.$$ They are ordered. We also suppose that these candidates are the vertices of a simplicial complex $K$. All candidates are subject to evaluation but the presence of edge $AB$ in $K$ reflects the fact that $A$ and $B$ are comparable.

Simplex of choices.png

We will look into the first two options as they are subject to the algebra we have developed in this chapter.

A vote is an element of $C^*=C^*(K)$. In particular, a rating vote is a combination of numbers assigned to each candidate. Therefore, this is a $0$-cochain on $K$, $X\in C^0(K)$.

Example (utility). Once such a vote is given, we may also produce a vote for every linear combination of candidates: $$X\bigg(\sum_i p_iA_i\bigg):=\sum_i p_iX(A_i).$$ We recognize this as the expected utility on $K$:

Utility function.png

$\square$

Furthermore, a comparison vote is a combination of numbers assigned to each pair of candidates. Therefore, this is a $1$-cochain on $K$, $x\in C^1(K)$.

Note: One can also study rankings via the complex of orderings (subsection 4.4.4).

How these votes are tallied is discussed in the next subsection. Suppose the elections have produced a single vote, who wins?

If this is a rating $X$, the winner is the one (possibly tied) with the largest rating. If this is a comparison vote $x$, it can be used to find a corresponding rating vote $X\in d^{-1}(x)$, and then, again, maximize it.

The algebra of vote aggregation

There may be several types of elections, or evaluating procedures, to run -- based on the types of votes allowed. For example, we may limit the votes to: $$Q=C^k, Z^k, B^k,...,$$ as subgroups of $C^*(K)$. For a given $Q$, an election is any combination of $m$ votes from $Q$. Then the set of all possible elections is the product, $Q^m$.

For any type of elections, one has to convert an election, an element of $Q^m$, into a single vote of the same type, an element of $Q$.

Definition. A tally is a function: $$f: Q^m \to Q,$$ written as: $$x=f(x_1, ..., x_m).$$

Exercise. Show that a tally doesn't have to be generated by a simplicial map.

The simplest and the most natural way to tally is to add the votes of the $m$ voters: $$x=\Sigma(x_1, ..., x_m):=x_1+ ...+ x_m.$$ We will call $\Sigma$ the sum tally.

For $R={\bf R}$, there is also the mean tally: $$x=M(x_1, ..., x_m):=\frac{1}{m}(x_1+ ...+ x_m).$$

Proposition. The sum and the mean tallies are well-defined for both non-circular, $f^k:(Z^k)^m \to Z^k$, and rating, $f^k:(B^k)^m \to B^k$, comparison votes.

They are both homomorphisms but does any tally have to be?

Suppose, hypothetically, that two elections of the same type are run two days in a row. Suppose every voter casts his vote twice with the possibility that he changes his mind by the next day or simply splits his vote arbitrarily. We want to have a single outcome. The first option is to take these two votes and add them together to create a combined vote of this voter, and after this is done for all voters, tally the votes according to the election rule to find the outcome vote. The second option is to tally at the end of the first day, then, separately, at the end of the second, and then add the two outcomes together. The results should match.

We introduce a new axiom.

Additivity. Tally $f$ is a homomorphism: $$f(x_1+y_1,...,x_m+y_m)=f(x_1,...,x_m)+f(y_1,...,y_m).$$

Exercise. What if we replace “add” with “average”?

Exercise. What about scalar multiplication?

Now, what makes a good electoral system?

A fair tally would have to be additive and but also satisfy these two axioms:

  • If two voters flip their votes, it won't affect the tallied vote (Symmetry).
  • If all voters vote identically, then the tallied vote coincides with this vote (Diagonality).

Therefore, a fair tally $f$ is a mean on $Q$! What we know about means on groups implies the following.

Theorem.

  • Over ${\bf Z}$, there is no fair tally.
  • Over ${\bf R}$, the only fair tally is the mean tally.

Aggregating rating votes

So far, the votes have been treated as mere elements of an arbitrary subgroup $Q\subset C^*(K)$ and a tally for such an election has to satisfy some generic rule: Additivity and Symmetry.

This time, we consider ratings only and choose $Q:=C^0(K)$. Below, we speak of a rating tally $$f:(C^0)^m\to C^0.$$ We restate the latter axiom.

Symmetry I. The election result is independent of the permutation of the votes: $$f(sX)=f(X),\ \forall s\in S_m.$$

In other words, as the voters, $1,...,m$, interchange their votes, $X_1,...,X_m$, the election outcome remains unchanged.

This requirement is meant to guarantee that there is no special voter, a dictator. But what about a special candidate?

Written coordinate-wise, a tally is a function of matrices (over $R$). This matrix represents an election and its $(i,j)$-entry is the rating assigned by the $j$th voter to the $i$th candidate: $$\begin{array}{c|cccccc} &\text{voters}&\\ Ballot&\begin{array}{ccc}\ \ X_1&...&X_m\end{array}&&\text{outcome}\\ \hline \begin{array}{ccccc} \text{candidate } A_1\\ ...\\ \text{candidate } A_n \end{array} &f\left( \begin{array}{cccccc} a_{11}&...&a_{1m}\\ ...&&...\\ a_{n1}&...&a_{nm} \end{array} \right) &=& \left[ \begin{array}{ccc} c_1\\ ...\\ c_n \end{array} \right] \end{array}$$

According to Symmetry I,

  • interchanging the columns in the matrix doesn't affect the outcome.

In addition, we will require that

  • interchanging the rows in the matrix interchanges the entries in the outcome accordingly.

This requirement is restated as follows:

Symmetry II. For an election $X$, the election result $f(X):C_0\to R$ is independent of the permutation of the candidates: $$f(X)(sa)=s^{-1}f(X)(a),\ \forall X\in (C^0)^m, \ \forall s\in S_n,\ \forall a\in C_0.$$

Exercise. Here, $s$ is meant to be an $n\times n$ matrix that permutes the entries of the vector. Describe such a matrix.

So, renumbering the candidates, $A_1,...,A_n$, in the ballots produces the same election outcome once we renumber them back.

Exercise. Show that a “constant” tally, such as, $f(E)=A_1^*$ for every combination of vertices $E\in (K^{(0)})^m,$ satisfies Symmetry I but doesn't satisfy Symmetry II.

Proposition. The sum tally $\Sigma$ satisfies Symmetry II.

We no longer require that the outcome of a unanimous election be exactly the same quantitatively only that the order of the candidates is to be preserved. Suppose, the voters vote unanimously for candidate $A$, i.e., for all $i=1,.., m$, we have $$X_i(A)=1,\ X_i(B)=0,$$ where $B$ is any other candidate. In other words, $X_i=A^*$. Then the outcome of the elections should also be a zero vote for each candidate but $A$. Therefore, there is a positive coefficient $k_A$ such that: $$f(A^*,...,A^*)=k_AA^*.$$ We restate this condition as follows:

Diagonality. A rating tally preserves unanimous votes: the matrix of $f\delta$ is diagonal.

Next, Symmetry II implies that $k_A$ is independent of $A$ and we have the final version of our new axiom.

Proposition. Under Symmetry II and Additivity, Diagonality is equivalent to the following $$f\delta = k\operatorname{Id},\ k\in R.$$

Theorem. A rating tally that satisfies Additivity, Symmetry I, Symmetry II, and Diagonality is a multiple of the sum tally: $$f=k\Sigma,\ k\in R.$$

Exercise. Prove the theorem.

The following axiom is meant to prevent the voters from manipulating the outcome by voting strategically.

Monotonicity. (Independence of irrelevant alternatives) If no voter has changed his preference for a given pair of candidates from the election to the next, then the tallied preference vote for the candidates hasn't changed either: $$\begin{array}{llllll} \forall a,b \in C_0,\ X,Y \in (C^0)^m,\\ \operatorname{sign}\big(X_i(a) - X_i(b)\big) = \operatorname{sign}\big(Y_i(a) - Y_i(b)\big) \quad \Rightarrow\\ \operatorname{sign}\big(f(X)(a) - f(X)(b)\big) = \operatorname{sign}\big(f(Y)(a) - f(Y)(b)\big). \end{array}$$

Note: Tally $f$ is a function of $X=(X_1,...,X_m)\in (C^0)^m$. But with $a,b$ fixed, $X_i$ becomes an element of $R$ and so does $f(X)$, while $X$ is an element of $R^m$. Then we can restate the condition as follows: at two locations, if the input of the function increases (or decreases), then the output simultaneously increases or decreases in both locations. Such a function can only be increasing or decreasing, i.e., monotonic.

Monotonicity of IIA.png

Exercise. What is the order relation on $R^m$ being used?

Exercise. Explain the meaning of $s$ in the axiom as an operator and describe its matrix.

Example (election manipulation). Let $n=4$ and $m=3$. Consider this election and its outcome: $$\begin{array}{c|cccccccc} &A&B&C&D\\ \hline X&4&3&2&1\\ Y&3&4&2&1\\ Z&3&4&2&1\\ \hline U&10&11&6&3 \end{array}$$ Here $U=\Sigma(X,Y,Z)$ is the resulting vote.

We can also express these ratings as rankings: $$\begin{array}{cccccc} X:& A>B>C>D\\ Y:& B>A>C>D\\ Z:& B>A>C>D\\ \hline U:& B>A>C>D \end{array}$$ Here, each candidate receive four points for coming first, three for coming second, etc.

We have the election result: $$U:\ B > A.$$

Now suppose that in another election, voter $X$, now $X'$, moves $B$ from second place to last on his list: $$\begin{array}{c|ccccccc} &A&B&C&D\\ \hline X'&4&1&3&2\\ Y&3&4&2&1\\ Z&3&4&2&1\\ \hline V&10&9&7&4 \end{array}$$ Here $V=\Sigma(X',Y,Z)$ is the resulting vote.

We can express these ratings as rankings: $$\begin{array}{cccccc} X':& A>C>D>B\\ Y:& B>A>C>D\\ Z:& B>A>C>D\\ \hline V:& A>B>C>D \end{array}$$

We have the election result: $$V:\ A > B.$$

As we can see, while each voter's ranking of $A$ with respect to $B$ is the same in the two elections, the final rankings are different.

What happened is that voter $X$ -- wishing victory for $A$ -- correctly predicted that $B$ was the most dangerous competition. He then pushed $B$ to the bottom of his list to reduce $B$'s rating as much as possible. This manipulative step alone lead to $A$'s victory.

$\square$

Exercise. What is the minimal number of candidates that allows such a manipulation?

Exercise. If we normalize the tallied votes, the result is a “lottery” on $K$. Running an actual lottery can then be used to determine the winner. What is the effect of the manipulation described above on this lottery?

Thus, we have proven the following.

Proposition. The sum tally (and its multiples) does not satisfy Monotonicity.

Combined with the last theorem the proposition gives us the following.

Theorem (Impossibility). There is no rating tally $f:(C^0)^m \to C^0$ that satisfies Additivity, Symmetry I, Symmetry II, Diagonality, and Monotonicity.

The theorem is a version of Arrow's Impossibility Theorem.

It appears that ratings alone aren't versatile enough to be a basis of a fair electoral system. We will consider higher order votes next.

Aggregating comparison votes

This time, we consider comparison votes and choose our group to be $Q:=C^1(K)$. Below, we speak of a comparison tally $$f:(C^1)^m\to C^1.$$ We will restate some of the axioms. The first is without change.

Symmetry I. The election result is independent of the permutation of the votes: $$f(sX)=f(X),\ \forall s\in S_m.$$

Here, the voters, $1,...,m$, interchange their votes, $X_1,...,X_m$, each of which is pairwise comparison, such as: $X_1(AB)=3$.

This time, a tally is a function of $3$-dimensional matrices. Each such matrix represents a comparison election and its $(i,j,k)$-entry is the score assigned by the $k$th voter to the pair of $i$th and $j$th candidates. This is what a single ballot looks like: $$\begin{array}{c|cccccc} &\text{candidates}&\\ &\begin{array}{ccccc}A_1&...&A_n\end{array}&&\\ \hline \begin{array}{ccccc} \text{candidate } A_1\\ ...\\ \text{candidate } A_n \end{array} &\begin{array}{cccccc} a_{11}&...&a_{1n}\\ ...&&...\\ a_{n1}&...&a_{nn} \end{array} \end{array}$$ A tallied election outcome is also meant to be such a table.

According to Symmetry I, reshuffling the ballots doesn't affect the outcome. In addition, we require that interchanging the rows and columns -- identically -- in all of these matrices interchanges the rows and columns in the outcome table in the exact same way.

Symmetry II. For an election $X$, the election result $f(X):C_1\to R$ is independent of the permutation of the candidates: $$f(X)(sa)=s^{-1}f(X)(a),\ \forall X\in (C^1)^m, \ \forall s\in S_n,\ \forall a\in C_1.$$

In other words, just as before, renumbering the candidates, $A_1,...,A_n$, in the ballots produces the same election outcome once we renumber them back.

Proposition. The sum tally $\Sigma$ satisfies Symmetry II.

We still require that the outcome of a unanimous election preserves the order of the candidates. Suppose, the voters vote unanimously for candidate $A$ over $B$, i.e., for all $i=1,.., m$, we have $$X_i(AB)=1,\ X_i(CD)=0,$$ where $CD$ is any other pair of candidate (including $AC$ and $BC$ for $C\ne A,B$). In other words, $X_i=AB^*$. Then the outcome of the elections should also produce a zero vote for each pair but $AB$. Therefore, there is a positive coefficient $k_{AB}\in R$ such that: $$f(AB^*,...,AB^*)=k_{AB}AB^*.$$ This condition takes the same form as before.

Diagonality. A rating tally preserves unanimous votes: the matrix of $f\delta$ is diagonal.

Proposition. Under Symmetry II and Additivity, Diagonality is equivalent to the following $$f\delta = k\operatorname{Id},\ k\in {\bf Z}.$$

Theorem. A comparison tally that satisfies Additivity, Symmetry I, Symmetry II, and Diagonality is a multiple of the sum tally: $$f=k\Sigma,\ k\in {\bf Z}.$$

Monotonicity is easily restated following the familiar link between $0$- and $1$-cochains: $$d(X_1)(AB)=X_1(B)-X_1(A).$$ In other words, the differences of rating votes for candidates $A,B$ are understood as a comparison vote for $AB$.

Positivity. (Independence of irrelevant alternatives) If no voter has changed his preference for a given pair of candidates from the election to the next, then the tallied preference vote for the candidates hasn't changed either: $$\begin{array}{llllll} \forall a \in C_1,\ X,Y \in (C^1)^m,\\ \operatorname{sign}X_i(a) = \operatorname{sign}Y_i(a) \quad \Rightarrow\\ \operatorname{sign}f(X)(a) = \operatorname{sign}f(Y)(a). \end{array}$$

After all, the derivative of a monotonic function is either all positive or all negative.

Example (election manipulation). Let $n=4$ and $m=3$. Consider this election (and its outcome) from the last section. Each voter assigns $1$ to each pairwise preference: $$\begin{array}{c|ccccccc} &AB&AC&AD&BC&BD&CD\\ \hline X&1&1&1&1&1&1\\ Y&-1&1&1&1&1&1\\ Z&-1&1&1&1&1&1\\ \hline U&-1&3&3&3&3&3 \end{array}$$ Here $U=\Sigma(X,Y,Z)$ is the resulting vote.

In the second election, voter $X$, now $X'$, moves $B$ from the second place to the last on his list: $$\begin{array}{c|ccccccc} &AB&AC&AD&BC&BD&CD\\ \hline X'&1&1&1&-1&-1&1\\ Y&-1&1&1&1&1&1\\ Z&-1&1&1&1&1&1\\ \hline V&-1&3&3&1&1&3 \end{array}$$ Here $V=\Sigma(X',Y,Z)$ is the resulting vote.

The election results match: $$U,V:\ B > A.$$

Thus, even though $X$ pushed $B$ to the bottom of his list, $B$ is still ahead of $A$. The manipulation failed.

$\square$

Unlike ratings,

  • aggregation of comparison votes doesn't violate independence of irrelevant alternatives!

That's why there is no impossibility theorem.

The choice of a fair tally for comparison elections is obvious and so is the proof of the following:

Theorem (Possibility). The sum tally $\Sigma:(C^1)^m \to C^1$ is a comparison tally that satisfies Additivity, Symmetry I, Symmetry II, Diagonality, and Positivity.

Exercise. Provide a similar analysis for votes of degree $2$, i.e., $x\in C^2(K)$.

What remains to be seen, with the comparison votes successfully tallied, who is the winner?

Google's PageRank

PageRank is a way to compute the importance of web-pages based on their incoming and outgoing links. It is a mathematical idea that was at the core of Google's web search algorithm. According to Google, “PageRank thinks of links as 'votes', where a page linking to another page is essentially casting a vote for that page... PageRank also considers the importance of each page that casts a vote, as votes from some pages are considered to have greater value, thus giving the linked page greater value.” This is how web search is supposed to work; however, the actual formula for PageRank shows that it's not an electoral system in the sense we have discussed.

Let's review the basics of the mathematics of PageRank and subject them to mathematical scrutiny...

The idea of the definition is recursive.

Pagerank flow.png

Suppose a given page has three (or any other number) incoming links and five outgoing links. Let's assume that at the last stage the page received $9$ points of “PageRank” from the incoming links. Then, at the next stage, these $9$ points are distributed equally to the five outgoing links that carry these points to those pages. Consequently, each carries $9/5$ points.

Starting with an equal number of points for every page, we let the flow go around the web following the links and every time we re-compute the PageRank for every page. Once everything settles, the pages have accumulated their PageRanks as ratings and they are then ranked accordingly.

The simple formula fails when there are loops in the graph of links:

Pagerank loop.png

Indeed, a proportion of the initial PageRank of one of these pages will travel the full loop.

Example. Consider the propagation of PageRank through this simple graph: $$ \newcommand{\ra}[1]{\!\!\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} % \begin{array}{ccccccc} & 1 & \ra{} & 1 \\ & & \nwarrow & \da{}\\ & 1 & \ra{} & 1 \end{array} , \begin{array}{ccccccc} & 1 & \ra{} & 1 \\ & & \nwarrow & \da{}\\ & 0 & \ra{} & 2 \end{array} , \begin{array}{ccccc} & 2 & \ra{} & 1 \\ & & \nwarrow & \da{}\\ & 0 & \ra{} & 1 \end{array} , \begin{array}{ccccc} & 1 & \ra{} & 2 \\ & & \nwarrow & \da{}\\ & 0 & \ra{} & 1 \end{array} ,\ ... \longrightarrow \ ? $$ There is no convergence!

$\square$

In order to ensure its convergence, the algorithm is modified. This time, only a proportion $r$ of the current PageRank is passed to the target pages. This value, called the decay coefficient, is commonly chosen to be $r=.85$. As there is no justification for choosing this over another value, $r$ is a non-mathematical, made-up parameter of this model.

Even after this modification, there are examples of undesirable behavior, such as accumulation of PageRank at the dead-ends.

Example. Suppose pages are linked to each other consecutively: $$A \to B \to C \to ... \to Y \to Z. $$ Then all pages will eventually pass their entire PageRanks to $Z$. As a result, pages $A-Y$ are tied at $0$. Then, the seemingly obvious ranking, $$A < B < C < ... < Y < Z,$$ is lost!

$\square$

In order to ensure that every page will get some of the PageRank, the algorithm is modified, again. This time, it is assumed that pages with no outbound links are linked to all other pages.

Further examples may cause (and possibly may have caused) further modifications of the algorithm. Instead, we simply present the most common way to compute the PageRank.

Definition. Suppose we represent the web as a directed graph and let $E$ be the set of edges and $N$ the total number of nodes in the graph. Then the PageRank is defined by the recursive formula: $$P_i=(1-r )\sum _{(i,j)\in E}\frac{P_j}{d_j}+\frac{r}{N},$$ where $P_i$ is the PageRank of the $i$th node, $r$ is the decay coefficient, and $d_j$ is the degree of node $j$.

At its limit, the totality of PageRanks is a rating (not a ranking).

There are still a number of surprising features. One is:

  • adding outbound links to your page may improve its PageRank.

Link to file: Spreadsheets.

Example (adding outbound links). Suppose we have ten “web-pages”: $A, B, ..., J$. They are listed in the table as both rows and columns. The links are the entries in the table.

First we suppose that they link to each other consecutively: $$A \to B \to C \to ... \to J. $$

Pagerank-chain.png

The PageRank points, as well as the rankings, of the pages are computed and displayed on the right. The scores for $A, ..., J$ are increasing in the obvious way: from $0$ to $.1$. In particular, $F$ is $\#$5.

Next, suppose $F$ adds a link to $A$, $F \to A$, which completes the loop. You can see that $1$ has appeared in the first column of the table:

Pagerank-chain-with-cycle.png

Now, suddenly $F$ ranks first! Thus, adding an outbound link has brought this page from $\#$5 to $\#$1.

$\square$

Another unexpected feature is:

  • changing the decay coefficient may change your rankings.

PageRank's paradigm is about passing something valuable (popularity, significance, authority, etc., but not a vote) from a page to the next along its links. Then, the PageRank points show who, after all this “redistribution of wealth”, ends up on top. This process is called advection: a fluid flow carries some substance, such as sand, around and gradually deposits it in different locations. Certainly, the amount of sand accumulated “at the end of the day” at a given location is expected to depend on the percentage of sand deposited per unit of time.

Exercise. Give examples of how the rankings may change depending on the value of the decay coefficient.

Our approach explained in this section is applied to web search as follows:

  • the presence of a link from $A$ to $B$ is a comparison vote of $-1$ on $AB$.

Combining ratings with comparisons

The outcome of an election is a single vote. However, it may be a comparison vote only and it can be “inconsistent” (even when each vote isn't), such as: $$A>B>C>A.$$ When the differences are identical, this is a “perfectly circular vote” and should be treated as a tie (left):

Circular voting.png

However, more complex votes may contain non-circular components (right): $$A>B>C>A,\quad D>A,D>B,D>C.$$ Here, $D$ is the winner.

We need to learn how to extract the circular component (left) from a given vote so that we can discard it: $$A=B=C,\quad D>A,D>B,D>C.$$

A similar task is seen, for example, when we study the motion on an inclined surface. Then only the component of the force parallel to the surface matters and the normal component is to be discarded:

Work along a straight path.png

Thus, we need the concept of the orthogonal complement of a subset $P$ of an inner product space $V$: $$P^\perp := \{v\in V: v\perp u,\ \forall u \in P\}.$$ It is a submodule.

Proposition. Suppose $P$ is a subset of an inner product space $V$. Then its orthogonal complement is a summand: $$V=< P >\oplus P^\perp.$$ Moreover, every element of $V$ is uniquely represented as the sum of its parallel and orthogonal components: $$v=v^{||}+v^\perp,\quad v^{||}\in < P >,v^\perp\in P^\perp.$$

Orthogonal decomposition.png

We next consider progressively more complex voting situations, which eventually will require us to use this concept.

In a basic electoral system, we have $n$ candidates located at the vertices of a simplicial complex $K$. We assume that the ring of coefficients $R$ is either ${\bf Z}$ or ${\bf R}$. A vote $x$ is an element of $C^*=C^*(K)$: $$a=a^0+a^1+a^2+...,\ a^i\in C^i(K).$$ The vote may be cast by a single voter or is the result of an election. In either case, we don't assume any consistency or strategy on behalf of the voter and, therefore, no interdependence among $a^0,a^1,a^2,...$.

Who is the winner?

Definition. For a given vote $a$, the winner of degree $0$ is the candidate(s) with the largest value of $a^0$.

Thus, $$winner:=\arg\max _{i\in K^{(0)}} a^0(i).$$ Above, $a^0$ is a rating vote and $a^1$ is a comparison vote. We discarded the comparison vote $a^1$ in the definition. However, it is unwise to throw out meaningful information and it is unacceptable when there is a tie.

Example. We have already seen that for the vote $$a^0=1,\quad a^1(AB)=1,\ a^1(BC)=-1,\ a^1(CA)=0,$$ the tie in dimension $0$ is broken by using the fact that $a^1$ is a coboundary: $$b^0(A)=0,\ b^0(B)=1,\ b^0(C)=0 \Rightarrow \partial^0 b^0=a^1.$$

$\square$

Definition. Suppose we are given a vote $a$ with the comparison component $a^1$ which is a coboundary. Then the winner of degree $1$ is the candidate(s) with the largest value of $a^0+b^0$, where $b^0$ is any $0$-chain that satisfies $\partial^0b^0=a^1$.

However, being a coboundary isn't necessary for the comparison component to be useful.

Example. We consider the simple case when the ratings are tied but comparison isn't:

  • $a^0(A)=a^0(B)$,
  • $a^1(A) < a^1(B)$.
Voting degree 0 1 with tie inexact.png

Then, of course, $B$ is the winner!

$\square$

Exercise. Show that $a^1$ doesn't have to be a coboundary.

Exercise. Show that using the comparison component of the vote may change the outcome even when there is no tie.

Example. In the next example, we have a circular vote. Then, subtracting the average of its values reveals a rating vote:

Circular vote and its rating.png

$\square$

Exercise. State and prove a theorem that generalizes the above example.

Now, what if we can't see the winner by just an examination of the vote as in the last example? We have an algebraic procedure for the case when $a^1$ happens to be a coboundary and now we need one for the general case.

Example. We consider the simple case above: $$a^0=(1,1,1),\ a^1=(1,0,0).$$ Then $a^1$ isn't a coboundary: there is no $b^0$ with $\partial^0b^0=a^1$ to be used to determine the winner. The idea is to find the best coboundary approximation of $a^1$.

First, let's find the space of $1$-coboundaries (i.e., rating comparison votes), $$B^1=\partial^0C^0 \subset C^1.$$ If $x,y,z$ are the values of a coboundary on the three edges, then there are some values $p,q,r$ of a $0$-cochain on the vertices such that $$x=q-p,\ y=r-q,\ z=p-r.$$ This is equivalent to the following: $$x+y+z=0.$$ This plane -- in the Euclidean space $C^1$ -- is the coboundary group $B^1$. Therefore, the best coboundary approximation of $a^1=(1,0,0)$ is $$b^1:=\arg\min \{ (x-1)^2+y^2+z^2: x+y+z=0\}.$$

Alternatively, one can think of $b^1$ as the projection of $a^1$ to $B^1$. Then the solution is as follows. A vector normal to this plane is chosen, $N=(1,1,1)$. Therefore, $$b^1=a^1+tN=(1,0,0)+t(1,1,1)=(1+t,t,t)$$ for some $t$. Since $b^1\in B^1$, we have $$(1+t)+t+t=0 \Rightarrow t=-1/3.$$ Then, the best coboundary approximation of $(1,0,0)$ is $$b^1=(2/3,-1/3,-1/3).$$ It is the coboundary of: $$b^0=(0,2/3,1/3).$$ Then, $$a^0+b^0=(1,1,1)+(0,2/3,1/3)=(1,5/3,4/3),$$ and $B$ is, again, the winner!

Voting degree 0 1 with tie inexact w projection.png

$\square$

Exercise. Provide a similar analysis for the following vote: $$a^0=(1,1,0),\ a^1=(1,0,1).$$

Decycling: how to extract ratings from comparisons

To be able to speak of projections, we need the cochain group $C^1(K)$ to be an inner product space.

It is uncomplicated. In fact, we will supply such a structure to the whole cochain group $C^*(K)$, by specifying an orthonormal basis.

A basic rule we have been enforcing is the equality between the candidates, i.e., the vertices of our complex $K$. Then the set of vertices has been the standard basis of $C_0(K)$; this time, in addition, we declare that

  • the vertices $\{A\}$ form an orthonormal basis of $C_0(K)$, and
  • the duals of vertices $\{A^*\}$ form an orthonormal basis of $C^0(K)$.

Similarly, the equality between pairs of candidates is also required. Then the set of edges of $K$ has been the standard basis of $C_1(K)$; this time, in addition, we declare that

  • the edges $\{a\}$ form an orthonormal basis of $C_1(K)$, and
  • the duals of edges $\{a^*\}$ form an orthonormal basis of $C^1(K)$.

This assumption provides us with an inner product, $$\langle x,y \rangle = \sum _i x_iy_i,$$ where $x_i,y_i$ are the coordinates of $x,y$ with respect to the basis, as well as a norm.

Furthermore, suppose $\{e_i\}$ is the set of edges of $K$, the chosen basis of $C_1(K)$. Then, $$x_i=x(e_i).$$ Furthermore, $$\langle x,y \rangle = \sum _i x(e_i)y(e_i)= \sum _i (xy)(e_i)= (xy)\left(\sum _i e_i\right).$$ In terms of $1$-forms, this is restated as: $$\langle x,y \rangle =\int_Kxy.$$

Definition. The best rating approximation of a comparison vote $a^1\in C^1$ is defined to be $$b^1:=\arg\min \{ ||x-a^1||: x\in B^1\}.$$

Definition. For a given vote $a$, the winner of degree $1$ is the candidate(s) with the largest value of $a^0+b^0$, where $b^0$ is any $0$-chain that satisfies $\partial^0b^0=b^1$ and $b^1$ is the best rating approximation of $a^1$.

Exercise. Define the winner of degree $2$ and show how it can be used.

The construction is based on the following decomposition: $$C^1=B^1\oplus (B^1)^\perp.$$ What is the meaning of $(B^1)^\perp $ in this context?

Example. Observe that in the above example, the difference between $a^1=(1,0,0)$ and its best rating approximation is $$a^1-b^1=(1,0,0)-\left(\frac{2}{3},-\frac{1}{3},-\frac{1}{3}\right) =\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)=\frac{1}{3}(1,1,1).$$ Of course, this is a vector perpendicular to the space of coboundaries and it is also a multiple of a circular vote with $1$s assigned to each edge.

Decomposition of comparison votes.png

Thus, each comparison vote is decomposed into the sum of a non-circular vote and a “perfectly” circular vote.

$\square$

Definition. The projection $D:C^1\to B^1$ will be called the decycling operator.

Exercise. Generalize the conclusion of the last example to the case of an $n$-simplex.

Example. These $1$-cochains have the same value assigned to each edge of the oriented triangle: $$x(AB)=x(BC)=x(AC)=p.$$ We recognize them as the duals of the $1$-boundaries: $$x=p(\partial_2ABC)^*.$$ Next, let's consider the coboundary of the dual of a vertex $A$: $$c=\partial^0A^*.$$ Then, for every vertex $B_i$ adjacent to $A$, we have $$c(AB_i)=A^*(B_i)-A^*(A)=-1.$$ The rest are $0$s. The inner product of such $c$ and $x$ is illustrated below:

Cochains orthogonal to coboundaries.png

$\square$

Theorem. $$(B^1)^\perp=(B_1)^*.$$

Exercise. Finish the proof of the theorem.

Thus, what we remove by decycling isn't just nonsensical, such as: $$... < \text{ rock } < \text{ paper } < \text{ scissors } < \text{ rock } < ... ,$$ this component doesn't contribute anything to a ranking.

Of course, we shouldn't expect decycling to preserve the presumed (but non-existent) ordering contained in a given comparison vote.

Example. Under decycling, the vote of $A>B$ is replaced with $A<B$:

Decomposition of comparison votes w reversal.png

$\square$

Algebraically, will a point with a positive first coordinate always have a projection with a positive first coordinate? No, after all, this could be the projection of the space onto an inclined plane, just as shown in the illustration in the last subsection.

Exercise. Find explicit formulas for the decycling operator for $n=3$.

As $D$ is an orthogonal projection, its range and the kernel are orthogonal subspaces. Then, for every $x,y\in C^1$, we have $$\langle Dx, (y-Dy) \rangle = 0,$$ or: $$\langle Dx, y \rangle = \langle Dx, Dy \rangle.$$ Now we apply the formula to $x=y=e^i$, the dual of $i$th edge of $K$. Then, $$D_{ii}=\langle D_i, D_i \rangle,$$ where $D_{ii}$ is the $(i,i)$-entry of the matrix of $D$ and $D_i$ is its $i$th column. We have proven the following useful property.

Proposition. The matrix of the decycling operator $D$ has only non-negative entries on the diagonal.

In search of fair elections

In the face of both an impossibility and a possibility theorems, what is our conclusion? Is there a fair way to run elections?

Let's suppose that the outcome will have to be a ranking. Then the possible avenues are outlined in the diagram below: $$ \newcommand{\ra}[1]{\!\!\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} \newcommand{\la}[1]{\!\!\!\!\!\!\!\xleftarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\ua}[1]{\left\uparrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} % \begin{array}{ccccc} \text{Election:} & & &&\text{Outcome:} \\ \text{comparison votes} & \ra{tally} & \text{a comparison vote} & \ra{decycling} & \text{a rating comparison vote}\\ \ua{(\partial^0)^m} & & & & \da{(\partial^0)^{-1}}\\ \text{rating votes} & \cdots\cdots & \text{impossible} & \cdots\cdots > & \text{a rating vote} \end{array} $$

Let’s make a few observations.

First, we cannot run a rating election, because we cannot tally it fairly according to the impossibility theorem in this section. (Moreover, if we replace ratings with rankings, there is still no fair tally according to the original Arrow's Impossibility Theorem.)

Second, we can run a comparison election.

Third, we can bypass the impossible direct route from rating election to the total rating: we first convert each voter's rating vote into a comparison vote, then tally the votes into a total comparison vote, and then finally convert it to a rating vote via decycling. However, since the start and the end points of the route remain the same as in the first case, this scheme fails due to the commutative nature of the diagram.

Fourth, we can truly circumvent the impossibility theorem by choosing a different starting point: run a comparison election and then proceed to the ratings as described above. In other words, the election is implemented by the following composition of homomorphisms: $$ \newcommand{\ra}[1]{\!\!\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} \newcommand{\la}[1]{\!\!\!\!\!\!\!\xleftarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\ua}[1]{\left\uparrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} % \begin{array}{ccccc} \text{Election:} & & &&\text{Outcome:} \\ {(C^1)^m} & \ra{\quad \Sigma\quad } & {C^1} & \ra{\quad D\quad } & {B^0}\\ & & & & \quad \ \da{(\partial^0)^{-1}}\\ & & & & {C^0} \end{array} $$ The result may be called a decycled comparison election.

We can use this approach to rank web-pages:

  • if there is a link from page $A$ to page $B$, edge $AB$ is given a value of $-1$.

The total is a comparison vote.

Example. Let's try to decycle this simple web presented as a comparison vote: $$ \newcommand{\ra}[1]{\!\!\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!} \newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.} % \begin{array}{ccccccc} & C & \ra{-1} & D \\ & & \nwarrow^{-1} &\ \ \ \da{-1}\\ & A & \ra{-1} & B \end{array} \ \ \leadsto \ \begin{array}{ccccccc} & C & \ra{0} & D \\ & & \nwarrow^{0} &\ \da{0}\\ & A & \ra{-1} & B \end{array} \quad \text{OR} \begin{array}{ccccccc} & C & \ra{0} & D \\ & & \nwarrow^{-1/4} &\ \ \ \da{1/4}\\ & A & \ra{1/2} & B \end{array}\quad ? $$ Even though removing the obvious circular vote gives us the former, the latter is the correct answer. The resulting ranking is: $$A<C=D<B.$$

$\square$

Exercise. Justify the answer in the example. Hint: the diagrams are incomplete.

According to the proposition in the last subsection, the decycling operator $D$ is a square matrix that has only positive entries on the diagonal: $$D_{ii}\ge 0.$$ Then, $$\frac{\partial D_i}{\partial x_i} \ge 0,$$ where $D_i$ is the $i$th component (column) of $D$. Therefore, the $i$th variable is non-decreasing under $D$. This is our conclusion:

  • adding outbound links won't increase your rating.

However,

  • the decycling scheme fails to satisfy the independence of irrelevant alternatives.

Example (election manipulation). We consider a simple comparison vote $P\in C^1$: $$P(AB)=1,\ P(BC)=-1,\ P(AC)=0.$$ It is a rating vote: $$B>A=C.$$ Suppose this vote came -- via the sum tally -- from the following comparison election: $$\begin{array}{c|ccccccc} &AB&BC&CA&\\ \hline X&1&0&0\\ Y&0&-1&0\\ rest&0&0&0\\ \hline P&1&-1&0\\ \end{array}$$ The winner is $B$...

Suppose the voters have changed their votes -- but not about $A$ vs. $B$: $$\begin{array}{c|ccccc} &AB&BC&CA&\\ \hline X&1&0&0\\ Y&0&-1&0\\ rest&0&5&4\\ \hline Q&1&4&4\\ \end{array}$$ Now, $Q$ is not a rating comparison vote. As we know, it is decomposed, and decycled, as follows: $$Q=(1,4,4)=(-2,1,1)+(3,3,3).$$ The outcome of the second election is then: $$DQ=(-2,1,1),$$ and the winner is $A$!

Even though the axiom is violated, the result isn't unexpected or manipulative. When enough voters change their minds about $A$ vs. $C$ or $B$ vs. $C$, the outcome of $A$ vs. $B$ may change too.

$\square$

What makes the case of interlinked websites to truly stand out is the source and the nature of the votes. In a standard electoral system, there are voters and there are candidates and the former vote for the latter. On the web, there are only pages and they vote for each other. What's crucial, each page can only vote against itself.

This idea may serve as a foundation of a distributed, self-voting electoral system. In such a system, everyone is both a voter and a candidate but he can only vote against himself. Compare:

  • standard electoral system:
    • voter $X$: candidate $A$ $>$ candidate $B$;
  • self-voting electoral system:
    • voter $X$: candidate $A$ $>$ candidate $X$.

By voting this way, voter $X$ says: voter $A$ is better than me to decide on this issue. By its nature, this is a transitive order relation. With the circular patterns discarded by decycling, this mode of voting creates

  • a flow towards better and better voters,

or, which is the same thing,

  • a flow towards better and better candidates.

One can then follow this flow, which is a directed graph, to its end(s) to find the best candidate(s).

Decycling with a spreadsheet

We start with a comparison vote as an antisymmetric $n\times n$ matrix $X$. We need to construct a rating vote, as an $n$-vector $R$, so that $\partial^0R$ is the closest to $X$.

As we have seen, there is a direct way, via linear algebra, to compute $R$ from $X$. However, we will treat the task as an optimization problem instead: $$R:=\arg\min_R ||X-\partial^0R||.$$ To find the function to be minimized, we replace the norm with its square and then expand: $$\Phi (R_1,...,R_n):=\sum _{ij}(X_{ij}-(R_i-R_j))^2.$$ We will “chase” (the negative of) its gradient. The increment of $R_j$ is then proportional to: $$\begin{array}{lllll} \Delta R_j&:=\sum _{i\ne j}(X_{ij}-(R_i-R_j))\\ &=\sum _{i\ne j}X_{ij}-\sum _{i\ne j}R_i+\sum _{i\ne j}R_j\\ &=\sum _{i}X_{ij}-\sum _{i}R_i+nR_j. \end{array}$$ These three terms are visible in the spreadsheet:

  • the sum of the terms in the $j$th column of the matrix of the comparison vote,
  • the sum of the terms of the current vector of the rating vote, and
  • the $j$th entry of that vector times $n$.

The spreadsheet shown below finds the ranking of the webpages for the web in the last subsection:

Decycling comparison elections.png

We compute the iteration: $$R_j'=R_j+h\Delta R_j,$$ where $h$ is the parameter of the process that controls the length of its step.

Link to file: Spreadsheets.

Exercise. Prove or disprove the “majority criterion”: if a majority of voters vote $A>B$ for all $B$, then $A$ wins.

Exercise. Prove or disprove the “monotonicity criterion”: if $A$ wins, he still wins even after some of the voters change their votes from $A<B$ to $A>B$.

Exercise. Prove or disprove the “participation criterion”: adding a voter who votes $A>B$ to an existing election will not change the winner from $A$ to $B$.

Exercise. Prove or disprove the “consistency criterion”: if the voters are divided into parts and $A$ wins in each of the separate elections, he also wins in an election of all voters.

Exercise. Prove or disprove the “independence of clones criterion”: if $A$ is a winner and $B\ne A$, then even after adding a new candidate $C$ candidate $A$ still wins provided there is no voter with a vote: $B\le D\le C$ or $C\le D\le B$ for any candidate $D\ne B,C$.

Exercise. (a) Modify the formula to account for possibility of “incomparable” candidates. (b) Modify the spreadsheet accordingly.