# Miscellaneous Implementations

Over the past year, I have implemented several algorithms and data structures. I decided to do a brief round-up:

- Insertion sort for lists of numbers (Python)
- Insertion sort for linked lists (Python)
- Insertion sort for linked lists (Java)
- Prime factorization (Java)
- Binary search (Python)
- Tries (Python)
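As a taste of the list above, here is a minimal sketch of insertion sort for a Python list of numbers. It is not the exact code in my repositories, just the idea:

```python
def insertion_sort(items):
    """Sort a list of numbers in place using insertion sort."""
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        # Shift larger elements one slot to the right to open a gap.
        while j >= 0 and items[j] > key:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = key
    return items
```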

# The ID3 Algorithm

You are given the voting records of a set of Democrats and the voting records of a set of Republicans. Now you are given the voting record of some unidentified politician, and you want to decide whether they are a Democrat or a Republican. How do you do that?

One way to do it is to take the voting records that you’re given and construct a decision tree. Each internal node in the decision tree corresponds to an attribute of a politician’s voting record, and each edge in the tree corresponds to a value of an attribute. To classify a given politician, you start at the root of the tree and determine the value of the corresponding attribute. Then you take the edge that corresponds to the value of that attribute. Now you’re at a new node, and you repeat the process. In this way you walk down the tree until you reach a leaf which designates the party that the politician probably belongs to.

The ID3 algorithm is a procedure to generate decision trees. ID3 is essentially a greedy search through the space of decision trees. You start with an empty tree and then iteratively construct the decision tree, starting at the root. At each iteration, you add the node whose corresponding attribute would yield the greatest information gain if you were given its value.
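To make the greedy step concrete, here is a simplified sketch of ID3 (not my actual implementation). Records are dicts mapping attribute names to values, and the tree is a nested dict whose leaves are class labels:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(records, labels, attribute):
    """Expected reduction in entropy from splitting on `attribute`."""
    by_value = {}
    for record, label in zip(records, labels):
        by_value.setdefault(record[attribute], []).append(label)
    remainder = sum(len(subset) / len(labels) * entropy(subset)
                    for subset in by_value.values())
    return entropy(labels) - remainder

def id3(records, labels, attributes):
    # All records agree on a label: return a leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # No attributes left to split on: return the majority label.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: split on the attribute with the greatest gain.
    best = max(attributes, key=lambda a: information_gain(records, labels, a))
    tree = {best: {}}
    for value in set(r[best] for r in records):
        sub_records = [r for r in records if r[best] == value]
        sub_labels = [l for r, l in zip(records, labels) if r[best] == value]
        tree[best][value] = id3(sub_records, sub_labels,
                                [a for a in attributes if a != best])
    return tree
```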

I implemented the ID3 algorithm in Python. My implementation is up at my GitHub.

Each path from the root to a leaf in a decision tree corresponds to a rule. For example, a politician who voted against freezing physician fees and in favor of increasing spending on education is, as a rule, a Democrat. An entire decision tree corresponds to a set of rules. I wrote and included in my implementation a function that builds a ruleset out of a given decision tree.
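A function along those lines might look like the following sketch, assuming the tree is a nested dict where each internal node has the shape `{attribute: {value: subtree, ...}}` and each leaf is a class label (this shape is my assumption, not necessarily the one in my repository):

```python
def rules(tree, path=()):
    """Yield one (conditions, label) pair per root-to-leaf path.

    `conditions` is a tuple of (attribute, value) pairs collected
    along the way down; `label` is the class at the leaf.
    """
    if not isinstance(tree, dict):
        yield (path, tree)
        return
    (attribute, branches), = tree.items()
    for value, subtree in branches.items():
        yield from rules(subtree, path + ((attribute, value),))
```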

And I applied the ID3 algorithm to solve the problem that I gave above. The solution is also up at my GitHub.

# Of Dijkstra and Priority Queues

Although I’ve implemented variants of Dijkstra’s algorithm in the past, I wanted to implement Dijkstra’s algorithm with my own implementation of the priority queue. So, today I bring to you a reinvention of the wheel.

Especially when the project is small, I tend to develop software in Python and then translate the code into another language. Python is a clean, easy-to-use language that has a REPL. Consequently, it is quick and fun to develop in Python. Oftentimes there’s a better tool for the job, but I like to use Python when I can.

In this case, I implemented Dijkstra’s algorithm and the priority queue in Python and then translated the code into Java. So, I have both a Python and a Java implementation.

One of the many interesting things that I learned from my artificial intelligence class is that the big difference between breadth-first search, depth-first search, and Dijkstra’s algorithm is the data structure that stores vertices.

In breadth-first search, the data structure is a queue. The first item that you put into it is the first item that you remove from it.

In depth-first search, the data structure is a stack. The last item that you put into it is the first item that you remove from it.

In Dijkstra’s algorithm, the data structure is a priority queue. The item that has the highest priority is the item that gets removed.

You can actually think of stacks and queues as types of priority queues. A queue is a priority queue where elements have more precedence the earlier they enter the container. A stack is a priority queue where elements have more precedence the later they enter the container.

Priority queues are commonly implemented as binary heaps, and that is how I implemented my priority queue. A binary heap is a complete binary tree that is ordered such that, for any given node, neither child has higher priority than the node itself.
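Here is a minimal sketch of a binary max-heap stored in a flat Python list, in the spirit of my priority queue (the real one lives at my GitHub). Node `i`'s children sit at indices `2*i + 1` and `2*i + 2`:

```python
class MaxHeap:
    """A binary max-heap stored in a Python list; index 0 is the root."""

    def __init__(self):
        self._items = []

    def push(self, item):
        # Add at the end, then sift up until the heap order is restored.
        self._items.append(item)
        i = len(self._items) - 1
        while i > 0 and self._items[(i - 1) // 2] < self._items[i]:
            parent = (i - 1) // 2
            self._items[i], self._items[parent] = self._items[parent], self._items[i]
            i = parent

    def pop(self):
        # Remove the root, move the last item to the root, and sift down.
        items = self._items
        top = items[0]
        last = items.pop()
        if items:
            items[0] = last
            i = 0
            while True:
                left, right = 2 * i + 1, 2 * i + 2
                largest = i
                if left < len(items) and items[left] > items[largest]:
                    largest = left
                if right < len(items) and items[right] > items[largest]:
                    largest = right
                if largest == i:
                    break
                items[i], items[largest] = items[largest], items[i]
                i = largest
        return top
```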

Dijkstra’s algorithm solves the single-source shortest path problem. In the single-source shortest path problem, you are given a graph and some distinguished vertex, and you want to find the shortest paths between the distinguished vertex and every other vertex in the graph. So, for example, let’s say that you are in Atlanta, and you want to find the quickest routes to Dallas, Chicago, and New York. Represent the highway system as a graph, and then you can apply Dijkstra’s algorithm to solve your problem.
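For reference, here is a compact sketch of Dijkstra’s algorithm. It uses Python’s standard-library `heapq` rather than my hand-rolled heap, and it represents the graph as a dict mapping each vertex to a list of (neighbor, weight) pairs:

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from `source` in a graph with nonnegative weights.

    `graph` maps each vertex to a list of (neighbor, weight) pairs.
    Returns a dict mapping each reachable vertex to its distance.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float('inf')):
            continue  # Stale heap entry; a shorter path was already found.
        for u, w in graph.get(v, []):
            if d + w < dist.get(u, float('inf')):
                dist[u] = d + w
                heapq.heappush(heap, (d + w, u))
    return dist
```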

At this page, I found “a data file that represents a significant portion of the interstate highway system”. So, I decided to actually compute the shortest paths from Atlanta to Dallas, Chicago, and New York. In case you are dying to know, here are the results:

The shortest distance between Atlanta and New York is 897 miles. The route is Atlanta -> Spartanburg -> Charlotte -> Greensboro -> Petersburg -> Richmond -> Washington -> Baltimore -> Philadelphia -> Newark -> New York.

The shortest distance between Atlanta and Dallas is 791 miles. The route is Atlanta -> Birmingham -> Meridian -> Jackson -> Shreveport -> Dallas.

The shortest distance between Atlanta and Chicago is 729 miles. The route is Atlanta -> Chattanooga -> Wildwood -> Nashville -> Louisville -> Indianapolis -> Gary -> Chicago.

Of course, the quality of these results depends on the quality of the data and the quality of the program that operates on the data. However, these results are close to the results that Google turns up. So, both the data and my implementation of Dijkstra’s algorithm seem to be good.

For more of my programs, check out my GitHub.

(If you aren’t familiar with graph theory, I advise you to read my post on minimum spanning trees, where I define several important graph-theoretic terms that are relevant to this entry.)

There exist several ways to represent a graph. For example, you can represent a graph as an adjacency matrix, as an edge list, or as an adjacency list. In my Python implementation of Kruskal’s algorithm, I used an edge list.

Today I will discuss adjacency lists, because I’m going to use adjacency lists in some algorithms that I will post about soon.

The idea of an adjacency list is very simple. In its most basic form, an adjacency list is just a collection of lists of adjacent vertices. So, if we want to know which vertices are adjacent to a vertex v and our adjacency list is L, then L[v] is the list of vertices adjacent to v.

More precisely, L[v] is the list of vertices that v has an out-edge to. Adjacency lists can represent directed graphs. So, even if u is an element of L[v], v is not necessarily an element of L[u].

If nonnegative integers name the vertices or if we want to create a correspondence between the nonnegative integers and the vertices, then you can very easily implement an adjacency list with an array of arrays. Similarly, you can also implement an adjacency list with an array whose cells point to a linked list of the adjacent vertices.

If nonnegative integers don’t name the vertices and we don’t want to create a correspondence between the nonnegative integers and the vertices, then you can very easily implement an adjacency list with an associative array whose keys are the vertices’ names and whose values are lists of the adjacent vertices’ names.

You can implement the adjacency-list representation of a weighted graph with an array of arrays of vertex-weight pairs. For example, consider a weighted directed graph. Vertex #0 has an out-edge of weight 5 to Vertex #1 and an out-edge of weight 10 to Vertex #2, Vertex #1 has an out-edge of weight 15 to Vertex #2, and Vertex #2 has no out-edges. You can represent that graph as {{{1, 5}, {2, 10}}, {{2, 15}}, {}}.
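In Python, that same weighted digraph can be written as a list of lists of pairs:

```python
# The weighted digraph described above: vertex 0 has an out-edge of
# weight 5 to vertex 1 and one of weight 10 to vertex 2, vertex 1 has
# an out-edge of weight 15 to vertex 2, and vertex 2 has no out-edges.
graph = [
    [(1, 5), (2, 10)],  # out-edges of vertex 0
    [(2, 15)],          # out-edges of vertex 1
    [],                 # out-edges of vertex 2
]

def neighbors(graph, v):
    """Return the vertices that v has an out-edge to."""
    return [u for u, _ in graph[v]]
```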

Because an adjacency list takes up only O(V + E) memory, adjacency lists are useful for representing a sparse graph, which is a graph that has far fewer edges than the maximum possible number of edges.

# Mixes of data structures

Sometimes it’s easier to solve a problem if you use a mix of data structures. For example, you can easily determine whether a sequence of characters is a palindrome if you use both a stack and a queue. Just add the characters of the sequence to each data structure, and then remove a character from each data structure one at a time and compare them. The sequence is a palindrome if and only if the characters are always the same.
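The approach above can be sketched in a few lines of Python, using a list as the stack and `collections.deque` as the queue:

```python
from collections import deque

def is_palindrome(text):
    """Check whether `text` reads the same forwards and backwards
    by comparing a stack (LIFO) against a queue (FIFO)."""
    stack = []
    queue = deque()
    for ch in text:
        stack.append(ch)
        queue.append(ch)
    while stack:
        # The stack yields the characters in reverse order and the queue
        # yields them in original order; for a palindrome, the two agree.
        if stack.pop() != queue.popleft():
            return False
    return True
```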

Here is my C++ implementation of a palindrome checker.

Here is a Python implementation of a palindrome checker.

Both implementations are up at my GitHub.

# Queues

The last element you push onto a stack will be the first element that’s popped off the stack. A stack is a last-in, first-out (or LIFO) data structure.

In contrast, a queue is a first-in, first-out (or FIFO) data structure. The first element you enqueue will be the first element you dequeue.

To understand the concept, think of an everyday queue: say you are standing in line at the bank. The first person in the line will be the first person the bank teller calls to the window. The computer-science concept of queues is an abstraction of the queues that we experience in everyday life.
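The FIFO/LIFO contrast is easy to see side by side in Python (the customer names are invented for the example):

```python
from collections import deque

queue = deque()  # FIFO
stack = []       # LIFO
for customer in ["Ann", "Bob", "Cat"]:
    queue.append(customer)  # enqueue
    stack.append(customer)  # push

served = [queue.popleft() for _ in range(3)]  # first in, first out
popped = [stack.pop() for _ in range(3)]      # last in, first out
```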

Here is my C++ implementation of the queue. It is up at my GitHub.

# Disjoint-Set Data Structures

The disjoint-set data structure is a data structure that represents set partitions.

A set partition of a set S is a collection of nonempty, disjoint subsets of S whose union is S.

Sets are disjoint if and only if they have no elements in common, and the union of a collection of sets is the set obtained by combining the elements of each set. So, when we partition a set, we put all of its elements into a collection of subsets, making sure that none of the subsets share any elements. For example, the set partitions of {1,2,3} are {{1,2,3}}, {{1},{2},{3}}, {{1},{2,3}}, {{1,2},{3}}, and {{1,3},{2}}.

Again, a disjoint-set data structure lets us efficiently model a set partition on a computer. It has two operations defined on it:

1. Union: Replaces two subsets with their union.

2. Find: Locates an element’s subset.

For this reason, disjoint-set data structures are also called “union-find data structures.”
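Here is a minimal sketch of a union-find structure with path compression and union by size, two standard optimizations that keep both operations nearly constant-time in practice:

```python
class DisjointSet:
    """Union-find with path compression and union by size."""

    def __init__(self, elements):
        # Initially, every element is in its own singleton subset.
        self.parent = {x: x for x in elements}
        self.size = {x: 1 for x in elements}

    def find(self, x):
        """Return the representative of x's subset."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point everything on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        """Replace x's subset and y's subset with their union."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        # Union by size: attach the smaller tree under the larger one.
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        self.size[rx] += self.size[ry]
```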

Kruskal’s algorithm uses disjoint-set data structures to find minimum spanning trees.