The Principle of Mathematical Induction

Let’s say that you are looking at a line of dominoes. You know that if you knock over one domino, the next domino will be knocked over, too. Now you knock over the first domino. What happens?

All of the dominoes are knocked over.

That might seem trivial, but it’s not. Amazingly, this falling-dominoes principle applies to the natural numbers, and you can use it for all sorts of things.

The principle that the dominoes obey is an example of the principle of mathematical induction, which says:

Let P be a property of natural numbers. Assume that, for every natural number n, if P(n) is true, then P(n + 1) is true, too. Also assume that P(n_0) is true for some natural number n_0. Then P(n) is true for all natural numbers greater than or equal to n_0.

So, for example, let P(n) equal “The nth domino is knocked over”. We know that, for every domino in our line, if P(n) is true, then P(n + 1) is true, because if the nth domino is knocked over, then the (n + 1)th domino is knocked over, too.

Now let P(n_0) be true, where n_0 corresponds to the first domino in line. What happens?

All of the dominoes are knocked over.

As you can tell from this example, you can extend the principle of mathematical induction to domains beyond the set of natural numbers. But it can also do more than tell you something obvious about dominoes. In a future post, I will show how you can use the principle to design iterative algorithms.
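In the meantime, here is the textbook example of the principle at work: a proof that the sum of the first n positive integers is n(n + 1)/2. Let P(n) be the statement “1 + 2 + ... + n = n(n + 1)/2”. P(1) is true, since 1 = 1(1 + 1)/2. And if P(n) is true, then

1 + 2 + ... + n + (n + 1) = n(n + 1)/2 + (n + 1) = (n + 1)(n + 2)/2,

so P(n + 1) is true, too. By the principle of mathematical induction, P(n) is true for every natural number n greater than or equal to 1.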

Miscellaneous Implementations

Over the past year, I have implemented several algorithms and data structures. I decided to do a brief round-up:

I implemented insertion sort for lists of numbers in Python.

I implemented insertion sort for linked lists in Python.

I implemented insertion sort for linked lists in Java.

I implemented prime factorization in Java.

I implemented binary search in Python.

I implemented tries in Python.

Q-Learning

In reinforcement learning, an agent takes actions within an environment. When an agent acts, a reward function assigns feedback to the agent that either positively or negatively reinforces the agent’s action. The agent’s goal is to use this feedback to learn an optimal policy.

A policy tells the agent which action to take in any given state of the environment. An optimal policy tells the agent the best action to take. Of course, there are several different ways to define what the best action is. Let’s say that an optimal policy maximizes the sum of discounted rewards over time.

Say that Q(s, a) is the reward received from taking action a from state s, plus the discounted value of following the optimal policy thereafter. So, if the agent knows Q, then it can execute the optimal policy. Whenever the agent is in state s, the optimal policy is to take the action a that maximizes Q(s, a). So, the agent’s task is to learn Q.
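In symbols, and assuming a deterministic environment (this follows the notation in Mitchell (1997), where r(s, a) is the immediate reward, δ(s, a) is the state that results from taking action a in state s, and γ is the discount factor), Q satisfies the recurrence

Q(s, a) = r(s, a) + γ · max over a' of Q(δ(s, a), a').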

To learn Q, create a table Q[s][a] and initialize the entries to zero. Then, over and over again, have the agent take actions from states and receive the ensuing rewards, updating Q[s][a] after each step using the observed reward and the current estimates for the next state. The entries of Q[s][a] converge to the true values of Q provided that each state-action pair is visited infinitely often.
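As a rough sketch of that loop (not the program linked below), and assuming a hypothetical environment object that exposes its states, its actions, a reset function, and a step function, the training procedure might look like this in Python:

import random

def q_learn(env, episodes=1000, gamma=0.9):
    # `env` is a hypothetical environment with `env.states`, `env.actions`,
    # `env.reset()` -> initial state, and `env.step(s, a)` -> (next_state, reward, done).
    # Initialize the table of Q estimates to zero.
    Q = {s: {a: 0.0 for a in env.actions} for s in env.states}
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Pick a random action; any strategy that visits every state-action
            # pair infinitely often in the limit suffices for convergence.
            a = random.choice(env.actions)
            s_next, r, done = env.step(s, a)
            # Deterministic-world update: the immediate reward plus the discounted
            # value of the best action available from the next state.
            Q[s][a] = r + gamma * max(Q[s_next].values())
            s = s_next
    return Q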

I implemented Q-learning, and I applied it to a simple environment. The code is up at my GitHub.

References:

Mitchell, Tom. Machine Learning. McGraw-Hill, 1997.

Quines

A quine is a computer program that prints its own source code. Writing a quine is not as easy as you might think.

To learn how to write quines, I made use of David Madore’s discussion of quines and of Introduction to the Theory of Computation, by Michael Sipser. I prefer Sipser’s discussion; I found it to be a lot clearer than Madore’s.

Sipser does not talk about computer programs. Instead, he talks about Turing machines. A computer program corresponds to a Turing machine. So, a quine corresponds to a Turing machine that outputs a description of itself.

Let QUINE be a Turing machine that outputs a description of itself. QUINE has two parts, C and D, each of which is itself a Turing machine. C outputs a description of D, and D outputs a description of C. So, QUINE outputs a description of its entire self.

D runs first. D outputs onto its tape a description of C. Refer to that description of C as <C>.

Let q(w) be the description of a Turing machine that outputs w; q is a computable function. In fact, given any string w, it is easy to construct a Turing machine P_{w} that outputs w: on any input, P_{w} simply writes w onto its tape, ignoring the input.

Now, C reads <C> off D’s tape. Then C computes q(<C>). So, now C has the description of a Turing machine that outputs <C>. But that’s D! So, C writes that description of D in front of the <C> already on the tape, and the tape now holds <D><C>, which is a description of QUINE itself.

So, a quine has two parts, the code and the data. The data represents the code. The code uses the data to print the code and the data.
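For illustration (this is a sketch of the idea, not the program linked below), here is a minimal Python quine in that style, where the string s is the data and the print call is the code:

s = 's = %r\nprint(s %% s)'
print(s % s)

Formatting s with its own repr reproduces the first line of the source, and the remainder of s reproduces the second, so the output is exactly the program itself.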

I wrote a quine in C and a quine in Python. To verify that the C program is a quine, type the following commands at the terminal:

gcc -o quine quine.c
diff <(./quine) quine.c

To verify that the Python program is a quine, type the following command:

diff <(python quine.py) quine.py

You will find in both cases that diff produces no output, which means that there are no differences between the programs' outputs and their source code.

Turing Machines

In a previous post, I gave an informal explanation of the P=NP? problem. In that post, I described both deterministic and nondeterministic Turing machines. A couple of months ago, I decided to implement a deterministic Turing machine in Python.

I also implemented a special deterministic Turing machine that decides whether a given positive integer is divisible by 4, just like the hypothetical Turing machine that I describe in my previous post.

I include a Bash script that tests that implementation. The script inputs every multiple of 4, ranging from 4 to 40, into the Turing machine program. The program passes every test.
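To give a flavor of what such an implementation involves (this is a simplified sketch, not the code in my repo), the core of a deterministic Turing machine simulator in Python can be a single transition loop:

def run_dtm(transitions, start_state, accept_state, reject_state, tape_input, blank='_'):
    # `transitions` maps (state, symbol) to (next_state, symbol_to_write, direction),
    # where direction is 'L' or 'R'.
    tape = dict(enumerate(tape_input))  # sparse tape; unwritten cells hold the blank symbol
    state, head = start_state, 0
    while state not in (accept_state, reject_state):
        symbol = tape.get(head, blank)
        if (state, symbol) not in transitions:
            return False  # no applicable transition; treat as a rejection
        state, write, direction = transitions[(state, symbol)]
        tape[head] = write
        head += 1 if direction == 'R' else -1
    return state == accept_state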

Ranked Pairs and “The Wire”

A few days ago, a redditor asked the people of /r/TheWire to rank the five seasons of the HBO television show “The Wire”, and several people posted their rankings. I wanted to take their individual rankings and uncover one aggregate ranking that captured the overall preferences of the entire group of posters. In other words, I wanted to implement a preferential voting scheme.

I was aware of Arrow’s theorem, which says that no voting scheme can satisfy a certain small set of reasonable criteria when voters are ranking three or more alternatives. However, I was also aware that a number of countries employ preferential voting systems. So, I didn’t give up, and I searched for a preferential voting system that would fit the bill.

I found one that worked: the ranked pairs voting scheme. In the ranked pairs voting system, you first count, for every pair of alternatives (X, Y), the number of times that voters ranked Alternative X ahead of Alternative Y. If voters ranked Alternative X ahead of Alternative Y more often than they ranked Y ahead of X, then Alternative X is a majority over Alternative Y.

Next, you sort the majorities from strongest to weakest. For example, say that voters ranked Alternative X ahead of Alternative Y three times and that they ranked Alternative A ahead of Alternative B five times. Then the latter majority comes before the former in the sort.

But what if the list of majorities contains a cycle? For example, say that we have [(A, B), (B, C), (C, A)]. Then A is a majority over B, B is a majority over C, and C is a majority over A. If we treated “is a majority over” as a ranking, then, by transitivity, A would be a majority over A. But an alternative X can be a majority over an alternative Y only if X was ranked ahead of Y, and A can’t be ranked ahead of itself.

So, here’s how you resolve that issue. You build a directed acyclic graph (dag) whose vertices are the alternatives and whose edges run from X to Y when X is a majority over Y. You go down the sorted list of majorities, and for each majority you try to add the corresponding edge to the dag. If adding that edge would produce a cycle, then you don’t add it. If it wouldn’t produce a cycle, then you do add it, and that majority gets locked in to the final list of majorities.

The winner of the election is the source of the dag, i.e., the vertex that doesn’t have any incoming edges. But I didn’t want to know just the winner. I wanted to know the aggregate ranking. So, I sorted the alternatives by the number of outgoing edges that their corresponding vertices have in the dag; the out-degree of a vertex is the number of alternatives that its alternative has a locked-in majority over.
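Here is a condensed sketch of the whole procedure in Python (a simplified illustration rather than the program described below, and it assumes each ballot is a list of alternatives ordered from most preferred to least preferred):

def ranked_pairs(ballots):
    # Each ballot is assumed to be a complete list of alternatives, most preferred first.
    alternatives = {a for ballot in ballots for a in ballot}

    # Count how often X is ranked ahead of Y, for every ordered pair (X, Y).
    ahead = {(x, y): 0 for x in alternatives for y in alternatives if x != y}
    for ballot in ballots:
        for i, x in enumerate(ballot):
            for y in ballot[i + 1:]:
                ahead[(x, y)] += 1

    # X is a majority over Y if X was ranked ahead of Y more often than
    # Y was ranked ahead of X. Sort the majorities from strongest to weakest.
    majorities = [(x, y) for (x, y) in ahead if ahead[(x, y)] > ahead[(y, x)]]
    majorities.sort(key=lambda pair: ahead[pair], reverse=True)

    # Lock in majorities one at a time, skipping any that would create a cycle.
    beats = {a: set() for a in alternatives}  # locked dag: X -> set of alternatives X beats

    def reachable(start, goal):
        # Depth-first search: is there a path from start to goal in the locked dag?
        stack, seen = [start], set()
        while stack:
            v = stack.pop()
            if v == goal:
                return True
            if v not in seen:
                seen.add(v)
                stack.extend(beats[v])
        return False

    for x, y in majorities:
        if not reachable(y, x):  # adding X -> Y creates a cycle only if Y already reaches X
            beats[x].add(y)

    # Rank the alternatives by out-degree: the number of locked-in majorities each one has.
    return sorted(alternatives, key=lambda a: len(beats[a]), reverse=True)

Called on ballots like [4, 1, 3, 2, 5], it returns the seasons in aggregate order.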

I collected the data on the night of September 1. I simply scrolled down the list of comments and recorded the rankings that commenters reported. The only complication was that a couple of commenters expressed indifference between some seasons. So, for example, one person wrote, “4 > 1 > 3/2 > 5”. In such a case, I ranked seasons in the order that they were written. So, in this particular case, I placed Season 3 before Season 2.

Then I ran the program, and the result was 4 > 1 > 2 > 3. Season 5 didn’t even make the list.

You can find my program up at my GitHub.

Further references:

https://users.cs.duke.edu/~conitzer/comsocchapter.pdf
http://www.geeksforgeeks.org/detect-cycle-in-a-graph/