Machine Learning – Decision Trees
January 10, 2010
I bet you thought that was going to say Decision Boundaries again 😀 – well… that is… if you’ve read the first 4 Machine Learning posts 😉
Nope, this time it’s Decision Trees, which are very similar to the tree structures you meet in programming, such as Binary Trees.
No, do NOT go and get any beer… although I’m not exactly going to be able to stop you, am I? 🙂
The best way to see a Decision Tree is with an example, and the best example is with Students and Beer! – no, I’m not saying every student drinks, but seriously… the majority do… even if you’re like me and would rather have a nice cup of hot chocolate; just go with it 😉
Let’s say we have 2 kinds of people, Students and Teachers. We have 1 kind of drink: Beer. Now, a Student (go with it) only drinks Beer when there isn’t an exam on, and a Teacher will only drink Beer when it’s the weekend. With this we can say:
- IF student AND exam THEN no beer
- IF student AND not exam THEN beer
- IF teacher AND weekday THEN no beer
- IF teacher AND weekend THEN beer
With this, we can easily construct a Decision Tree:
Here we can follow, say, the first example above down the tree:
If Person=student AND Exam=yes THEN Beer
Before anybody says anything, yes, I do realise the mistake I’ve made. “If exam then beer” should be “If exam then no beer”. I apologise 😦
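If you’d rather read a tree as code, here is a rough sketch of the same idea in Python. It uses the corrected student rule (exam means no beer). The function name `drinks_beer` and its parameters are just names I’ve made up for illustration; each `if` is one branch of the tree, with the first split on Person and the second on either Exam or Weekend.

```python
def drinks_beer(person, exam=False, weekend=False):
    """Walk the Students-and-Beer decision tree and return True for 'beer'."""
    # Root node: split on the kind of person.
    if person == "student":
        # Student branch: beer only when there is no exam on.
        return not exam
    if person == "teacher":
        # Teacher branch: beer only at the weekend.
        return weekend
    # The tree only knows about the two kinds of people in the example.
    raise ValueError(f"unknown person: {person!r}")


for person, exam, weekend in [
    ("student", True, False),
    ("student", False, False),
    ("teacher", False, False),
    ("teacher", False, True),
]:
    answer = "beer" if drinks_beer(person, exam=exam, weekend=weekend) else "no beer"
    print(person, "exam" if exam else "no exam", "weekend" if weekend else "weekday", "->", answer)
```

Notice that the student branch never looks at the day and the teacher branch never looks at exams. That is exactly what the tree says: once you’ve gone down one branch, only the questions on that branch matter.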
Yep, that’s right. Time to revisit an old friend: over-fitting 😀
So, as described before, if we assume that the data we have is all the data there is, it can lead to over-fitting, which means our techniques only work well on the Training Data. With a Decision Tree there is an extra symptom: over-fitting can lead to very large and obscure trees.
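To get a feel for how a tree blows up, here is a small sketch in Python. It is not a real learning algorithm: `grow` is a deliberately naive helper I’ve made up that splits on each feature in the order given and only stops when a leaf is pure. The `day` feature and the one odd observation (a student having a beer on a Friday despite an exam) are also invented purely for illustration.

```python
from collections import Counter


def grow(rows, features):
    """Naively grow a tree: keep splitting until every leaf is pure."""
    labels = [label for _, label in rows]
    if len(set(labels)) == 1 or not features:
        # Leaf: the most common label among the rows that reached it.
        return Counter(labels).most_common(1)[0][0]
    feature, rest = features[0], features[1:]
    groups = {}
    for x, label in rows:
        groups.setdefault(x[feature], []).append((x, label))
    # Internal node: (feature name, {feature value: subtree}).
    return (feature, {value: grow(group, rest) for value, group in groups.items()})


def count_leaves(tree):
    """Count the leaves of a tree built by grow()."""
    if isinstance(tree, tuple):
        return sum(count_leaves(child) for child in tree[1].values())
    return 1


DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]

# Clean student data: exam means no beer, whatever the day.
clean = [
    ({"exam": exam, "day": day}, "no beer" if exam == "yes" else "beer")
    for exam in ("yes", "no")
    for day in DAYS
]

# Same data with one odd row: beer on a Friday even though there is an exam.
noisy = [
    (x, "beer") if (x["exam"], x["day"]) == ("yes", "Fri") else (x, label)
    for x, label in clean
]

clean_tree = grow(clean, ["exam", "day"])
noisy_tree = grow(noisy, ["exam", "day"])

print(count_leaves(clean_tree))  # 2
print(count_leaves(noisy_tree))  # 6
```

One odd row takes the tree from 2 leaves to 6, because the tree now needs a separate leaf for every day just to get that single Friday right. It fits the Training Data perfectly, but the simple rule “exam means no beer” is now buried under a pile of day-of-the-week branches.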