The k-Nearest Neighbors Algorithm

The k-Nearest Neighbors algorithm is a simple machine-learning algorithm that classifies a new data point according to the classes of the data points closest to it. An example should illustrate the nature and effectiveness of the algorithm.

Consider a given apartment renter. If three out of the five renters living nearest to our renter make >$75k/yr, then our renter probably makes >$75k/yr, too. Of course, it’s possible that he or she doesn’t, but it’s likely that he or she does. The k-Nearest Neighbors algorithm operates on this principle. Given an integer k, the algorithm classifies a new data point as the class that appears most frequently among the k data points nearest to the new data point.

Here is my JavaScript implementation of the kNN algorithm.

As this great blog post explains, it’s crucial to pick an appropriate metric for measuring the nearness of neighbors. In my implementation, I used a common metric, the Euclidean distance.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s