You yearn to learn? Me too!
Log into public GitHub (no account? Create one!).
Go to c9.io. Click on the GitHub icon (top right).
Create a new workspace and enter the following GitHub repo: https://github.com/dotninjas/dotninjas.github.io.
Then press the big green button (don't worry about the templates; they will work themselves out).
When you are in:

    cd ninja
    sh ninja
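(No c9.io? A rough local equivalent is sketched below; it assumes you have git plus a POSIX shell, and that the repo layout matches the cd above, i.e. a ninja directory holding a ninja script.)

    git clone https://github.com/dotninjas/dotninjas.github.io
    cd dotninjas.github.io/ninja
    sh ninja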
Once it's running, type "eg0" to show (a) some data and (b) a decision tree learned from that data:
@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
overcast 64 65 TRUE yes
overcast 72 90 TRUE yes
overcast 81 75 FALSE yes
overcast 83 86 FALSE yes
rainy 65 70 TRUE no
rainy 68 80 FALSE yes
rainy 70 96 FALSE yes
rainy 71 91 TRUE no
rainy 75 80 FALSE yes
sunny 69 70 FALSE yes
sunny 72 95 FALSE no
sunny 75 70 TRUE yes
sunny 80 90 TRUE no
sunny 85 85 FALSE no
outlook = sunny
| humidity <= 75: yes (2.0)
| humidity > 75: no (3.0)
outlook = overcast: yes (4.0)
outlook = rainy
| windy = TRUE: no (2.0)
| windy = FALSE: yes (3.0)
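To read that tree: each root-to-leaf path is one rule, and the number in parentheses counts the training examples landing in that leaf. To make the model concrete, here is the same tree hand-translated into a small shell function (a reading aid only, not how WEKA applies models):

    #!/bin/sh
    # The learned tree as code. Args: outlook temperature humidity windy
    # (note the tree never tests temperature).
    classify() {
      case $1 in
        sunny)    [ "$3" -le 75 ] && echo yes || echo no ;;  # humidity test
        overcast) echo yes ;;
        rainy)    [ "$4" = TRUE ] && echo no || echo yes ;;  # windy test
      esac
    }
    classify sunny 69 70 FALSE   # humidity 70 <= 75, so: yes
    classify rainy 65 70 TRUE    # windy = TRUE, so: no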
If you want to know more, then read on.
This is the command-line WEKA tool, run by (e.g.):

    Weka="java -Xmx2048M -cp weka.jar"
    learner=weka.classifiers.trees.J48
    $Weka $learner -p 0 -C 0.25 -M 2 -t train.arff -T test.arff
This grows a decision tree downwards, stopping when any further split would leave fewer than -M 2 examples in a leaf. Then it prunes sub-trees: a sub-tree dies if, after pruning, the overall error does not get worse by more than the confidence threshold set by -C 0.25.
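(You can check those flags yourself: if you save the data above as, say, weather.arff (a filename assumed here for illustration) and omit the -T test file, WEKA falls back to 10-fold cross-validation on the -t file.)

    Weka="java -Xmx2048M -cp weka.jar"
    # No -T test file given, so WEKA 10-fold cross-validates on weather.arff.
    $Weka weka.classifiers.trees.J48 -C 0.25 -M 2 -t weather.arff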
Why those magic numbers? Engineering judgement; i.e., the generated model is the result of decisions made by the analyst. We'll get back to that.
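For example, one quick way to see that judgement in action (again assuming the data sits in weather.arff, as above) is to sweep -M and -C and watch the tree change:

    Weka="java -Xmx2048M -cp weka.jar"
    # Rerun J48 under different analyst choices; each run prints its own tree.
    for m in 2 5 10; do
      for c in 0.1 0.25 0.5; do
        echo "=== -M $m -C $c ==="
        $Weka weka.classifiers.trees.J48 -C $c -M $m -t weather.arff
      done
    done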
For more examples, see
For the theory behind decision tree learning, see here