Machine Learning – K-Nearest Neighbour (Part 1)
January 6, 2010 1 Comment
Cats and Dogs
Lets take 2 example classes. Say we wanted to be able to recognise the difference between a Cat and a Dog, how would you do this?
Well, firstly we have to identify some features that can easily distinguish them both. Lets have a quick example of a cat and a dog – yeah, seriously 😛
So then, the one on the left is a dog, and on the right is a cat (how cuuuute 😀 ) – i hope you realised this? Note: yes i was tempted to use a lolcat ¬.¬
So then, the features. Well, generally speaking, a dog is bigger than cat, yet a cat can have more fur than a dog. So here we can have the identified features of:
- Furriness (I think that is a word….)
With this, we can draw up a very quick graph, with the axis of Size and Furriness.
So, as we can see the Cats will normally cluster around the points in the top-left, and the dogs at the bottom-right.
However, what is the star – is it a Cat or a Dog?
This is how the K-Nearest Neighbour algorithm works. Firstly we have a variable “K”. With this, you assign “K” a value from 1 to the total number of examples. In the case above we have 7 examples (you don’t include the new example, the star)
From my experience the best value to chose for K is a low odd number – so for here I shall chose the value of 3. So K=3.
With this value of “K”, we look at the 3 closest points on the graph from the location of the new example point:
Ok, so from the graph above, I have drawn lines from the new point, to the 3 most closest points on the graph (because K=3, if it were 5, then there would be 5 lines, if it were 7, then there would be lines to all other points.)
The next part is to assign the new point a class, the class type will be determined from the majority of the closest points. For this it will be a Cat, since Cats out number Dogs 2-1.
Algorithm in Short
- Assign K a value – preferably a small odd number
- Find the closest number of K points
- Assign the new point from the majority of classes.
Next in Part 2 is Decision Boundaries, and what happens if everything goes horribly wrong! 😀