Miscellaneous Implementations

Over the past year, I have implemented several algorithms and data structures. I decided to do a brief round-up:

I implemented insertion sort for lists of numbers in Python.

I implemented insertion sort for linked lists in Python.

I implemented insertion sort for linked lists in Java.

I implemented prime factorization in Java.

I implemented binary search in Python.

I implemented tries in Python.

Q-Learning

In reinforcement learning, an agent takes actions within an environment. When an agent acts, a reward function assigns feedback to the agent that either positively or negatively reinforces the agent’s action. The agent’s goal is to use this feedback to learn an optimal policy.

For any given state of the environment, a policy tells the agent the action to take when the environment is in that state. An optimal policy tells the agent the best action to take. Of course, there are several different ways to define what a best action is. Let’s say that an optimal policy maximizes the sum of discounted rewards over time.

Say that Q(s, a) is the reward received from taking action a from state s, plus the discounted value of following the optimal policy thereafter. So, if the agent knows Q, then it can execute the optimal policy. Whenever the agent is in state s, the optimal policy is to take the action a that maximizes Q(s, a). So, the agent’s task is to learn Q.

To learn Q, create a table Q[s][a] and initialize the entries to zero. Then, over and over again, have the agent take actions from states and receive the ensuing rewards. Update Q[s][a] each time. Q[s][a] will converge to the true value of Q as the number of times that each state-action pair is visited converges to infinity.

I implemented Q-learning, and I applied it to a simple environment. The code is up at my GitHub.

References:

Mitchell, Tom. Machine Learning. McGraw-Hill, 1997.

Turing Machines

In a previous post, I gave an informal explanation of the P=NP? problem. In that post, I described both deterministic and nondeterministic Turing machines. A couple of months ago, I decided to implement the deterministic Turing machine in Python.

I also implemented a special deterministic Turing machine that decides whether a given positive integer is divisible by 4, just like the hypothetical Turing machine that I describe in my previous post.

I include a Bash script that tests that implementation. The script inputs every multiple of 4, ranging from 4 to 40, into the Turing machine program. The program passes every test.

Ranked Pairs and “The Wire”

A few days ago, a redditor asked the people of /r/TheWire to rank the five seasons of the HBO television show “The Wire”, and several people posted their rankings. I wanted to take their individual rankings and uncover one aggregate ranking that captured the overall preferences of the entire group of posters. In other words, I wanted to implement a preferential voting scheme.

I was aware of Arrow’s theorem, which says that no voting scheme can satisfy a certain small set of reasonable criteria when voters are ranking three or more alternatives. However, I was also aware that a number of countries employ preferential voting systems. So, I didn’t give up, and I searched for a preferential voting system that would fit the bill.

I found one that worked: the ranked pairs voting scheme. In the ranked pairs voting system, you first count all the times that voters ranked Alternative X ahead of Alternative Y, for all pairs of alternatives (X, Y). If voters ranked Alternative X ahead of Alternative Y more often, then Alternative X is a majority over Alternative Y.

Next you sort the majorities. For example, say that voters ranked Alternative X ahead of Alternative Y three times and that they ranked Alternative A ahead of Alternative B five times. Then the latter majority comes before the former majority in the sort.

But what if the list of majorities creates a cycle? For example, say that we have [(A, B), (B, C), (C, A)]. Then A is a majority over B and B is a majority over C. But C is a majority over A. So, by transitivity, A is a majority over A. But an alternative X can be a majority over an alternative Y only if X was ranked ahead of Y, and A can’t be ranked ahead of itself.

So, here’s how you resolve that issue. You treat the list of majorities like a directed acyclic graph where the dag includes a directed edge from X to Y if X is a majority over Y. And you just go down the sorted list of majorities. For each majority, you try to add a corresponding edge to the dag. If adding that edge would produce a cycle, then you don’t add it. If it wouldn’t produce a cycle, then you do add it, and that majority gets locked in to the final list of majorities.

The winner of the election is the source of the dag, i.e., the vertex that doesn’t have any incoming edges. But I didn’t want to know just the winner. I wanted to know the aggregate ranking. So, I sorted the alternatives by the number of outgoing edges that their corresponding vertices have in the dag. The number of outgoing edges of a vertex corresponds to the number of alternatives that an alternative is a majority over.

I collected the data on the night of September 1. I simply scrolled down the list of comments and recorded the rankings that commenters reported. The only complication was that a couple of commenters expressed indifference between some seasons. So, for example, one person wrote, “4 > 1 > 3/2 > 5”. In such a case, I ranked seasons in the order that they were written. So, in this particular case, I placed Season 3 before Season 2.

Then I ran the program, and the result was 4 > 1 > 2 > 3. Season 5 didn’t even make the list.

You can find my program up at my GitHub.

Further references:

https://users.cs.duke.edu/~conitzer/comsocchapter.pdf
http://www.geeksforgeeks.org/detect-cycle-in-a-graph/

Swapping Cards

Earlier today I searched through the ACM-ICPC archive of problems that have been used in ACM-ICPC Regionals and World Finals, looking for some interesting problems to solve. I found one that reminded me of another ACM-ICPC problem that I solved. You can read the full description of it, but I’ll put it in a nutshell:

You’re given a stack of playing cards. You can swap adjacent cards. You want to find the minimum number of swaps it takes to sort the cards so that all the diamonds and hearts in the stack appear before all the spades and clubs.

I realized that you are searching for a certain state of the stack of cards. Because you want to find that state in the minimum number of swaps, you can perform a version of breadth-first search to solve the problem.

I remembered the Marbles in Three Baskets problem, another ACM-ICPC problem that I solved. I applied the same basic technique and philosophy of problem-solving to this problem that I outline in the post about the Marbles in Three Baskets problem.

You can find my solution to this problem at my GitHub.

The Sieve of Eratosthenes

Given an integer N > 1, you want to find all of the prime numbers from 1 to N. For example, if N = 11, then you get {2, 3, 5, 7, 11}.

The Sieve of Eratosthenes is a simple way to solve this problem. The basic idea behind it is this: Construct the list of all integers from 1 to N, and remove the composite numbers.

It is easy to construct the list of all integers from 1 to N. So, the question is: How do you remove the composite numbers?

It is a fact of number theory that every composite number k has a prime factor that is less than or equal to sqrt(k). To see this, consider any composite number k = mn. If both m > sqrt(k) and n > sqrt(k), then k < mn, which contradicts the fact that k = mn. So, k must have a prime factor that is less than or equal to sqrt(k).

So, every composite number from 1 to N must have a prime factor that is less than or equal to sqrt(N). So, just iterate over each prime number from 1 to sqrt(N) and remove all of its multiples in the list. In that way you remove all of the composite numbers from the list.

I implemented the Sieve of Eratosthenes in C, and I implemented the Sieve of Eratosthenes in Python. Both of my implementations are up at my GitHub.

References:

https://proofwiki.org/wiki/Sieve_of_Eratosthenes
http://pages.pacificcoast.net/~cazelais/222/sieve.pdf

Vertex Coloring and the Min-Conflicts Algorithm

In the vertex coloring problem, you are given a graph and a set of colors. You want to assign colors to the vertices of the graph in such a way that no adjacent vertices have the same color.

The vertex coloring problem appears in a lot of different forms in the real world. To take a rather frivolous example, just consider Sudoku. Each cell corresponds to a vertex, the numbers 1-9 correspond to colors that you can assign to the vertices, and two vertices are adjacent if the corresponding cells are in the same row, column, or subgrid.

The vertex coloring problem is a constraint satisfaction problem. In a constraint satisfaction problem, you are given a set of variables, a set of possible values for each variable, and a set of constraints. You want to find an assignment of values to the variables that satisfies the constraints.

In the vertex coloring problem, the vertices are the variables, the colors are the possible values for each variable, and the constraint is that no adjacent vertices can have the same color.

There are a number of ways to solve constraint satisfaction problems. One way is recursive backtracking. The n queens problem is another constraint satisfaction problem, and I have applied recursive backtracking to solve the n queens problem.

Today I want to talk about the min-conflicts algorithm. Like simulated annealing, which I have also talked about, the min-conflicts algorithm is a type of local search. You start with some assignment of values to variables. Unless you get really lucky, that initial assignment will produce conflicts. For example, if a vertex is colored green and two of its neighboring vertices are colored green, then its being colored green produces two conflicts. So, at each iteration, the min-conflicts algorithm picks a conflicting variable at random and assigns to it a value that produces the minimal number of conflicts. If there is more than one such value, then it chooses one at random. It then repeats this process again and again, until it either finds a complete assignment of values that satisfies all of the constraints or goes through the maximum number of iterations that it may go through.

I implemented the min-conflicts algorithm, and I applied it to the vertex covering problem. I also decided to go back and apply the min-conflicts algorithm to the n queens problem. You can find more of my programs at my GitHub.