This document provides an overview of decision trees and the CART (Classification and Regression Trees) algorithm, explaining their components such as root nodes, nodes, and leaves, as well as impurity measures like Gini index and entropy. It discusses the advantages and disadvantages of decision trees, including their interpretability and the risk of overfitting, while also highlighting their applications in various fields like business management and healthcare. Additionally, it includes references for further reading and resources for coding with decision trees.
Decision tree algorithm
● Decision trees are also termed CART (Classification and Regression Trees) algorithms.
● They are used for
○ Classification and
○ Regression
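As a minimal sketch of the two CART variants, the snippet below fits a classification tree and a regression tree with scikit-learn; the toy data and the query point 1.5 are invented purely for illustration.

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Tiny toy dataset, invented for illustration only.
X = [[0.0], [1.0], [2.0], [3.0]]
y_class = [0, 0, 1, 1]          # categorical target -> classification tree
y_value = [0.1, 0.9, 2.1, 2.9]  # continuous target  -> regression tree

clf = DecisionTreeClassifier().fit(X, y_class)
reg = DecisionTreeRegressor().fit(X, y_value)

print(clf.predict([[1.5]]))  # predicted class label
print(reg.predict([[1.5]]))  # predicted numeric value
```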
Decision tree components
● Root node
○ The starting node of the tree; it corresponds to the split with the
maximum information gain.
● Node
○ An internal node tests a condition (question) and has multiple
outgoing branches, one per outcome.
● Leaf
○ The end point of a branch; it holds the final decision once no
further conditions are tested.
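To make these components concrete, the sketch below fits a shallow scikit-learn tree on the Iris dataset and prints its structure; the dataset and the max_depth value are arbitrary choices for the example.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Load a small, well-known dataset (the choice of dataset is arbitrary here).
X, y = load_iris(return_X_y=True)

# Fit a shallow tree so the printed structure stays readable.
tree = DecisionTreeClassifier(max_depth=2, criterion="gini", random_state=0)
tree.fit(X, y)

# The first condition printed is the root node's split; indented conditions are
# internal nodes, and the "class: ..." lines are the leaves (final decisions).
print(export_text(tree, feature_names=load_iris().feature_names))
```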
Information Gain ( IG )
Each node is split so that the split yields the maximum information, which is
achieved by choosing the split with the highest information gain (IG).
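A minimal sketch of the information-gain calculation, written generically so that any of the impurity metrics introduced in the next section can be plugged in; the function name and signature are mine, not from any library.

```python
def information_gain(impurity, parent_counts, children_counts):
    """IG = I(parent) - sum over children of (N_child / N_parent) * I(child).

    `impurity` is any node-impurity function of per-class counts, e.g. the
    Gini index, entropy, or classification error defined in the next section.
    """
    n_parent = sum(parent_counts)
    weighted_children = sum(
        sum(child) / n_parent * impurity(child) for child in children_counts
    )
    return impurity(parent_counts) - weighted_children
```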
Impurity Metrics
Information gain can be calculated using the impurity measure of each split:
1. Gini index ( I_G )
2. Entropy ( I_H )
3. Classification error ( I_E )
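A minimal sketch of the three impurity measures as plain Python functions of per-class counts (the function names are mine); these plug directly into the information_gain sketch above.

```python
import math

def gini(counts):
    # I_G = 1 - sum_i p_i^2
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def entropy(counts):
    # I_H = -sum_i p_i * log2(p_i)   (classes with zero count contribute 0)
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

def classification_error(counts):
    # I_E = 1 - max_i p_i
    return 1.0 - max(counts) / sum(counts)
```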
Principle of splitting nodes
● The root node is split to obtain the maximum information gain.
● Increasing the number of nodes in the tree causes overfitting.
● Splitting continues until every leaf is pure (contains only one of the
possible outcomes).
● Pruning can also be done, meaning the removal of branches that use
features of low importance (see the sketch after this list).
● Gini index ≅ Entropy (they usually select very similar splits).
● For a uniform class distribution, entropy is 1 (its maximum for two classes).
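As a hedged sketch of controlling overfitting, the snippet below compares a fully grown scikit-learn tree with one that is depth-limited and cost-complexity pruned; the dataset and the parameter values are illustrative choices, not recommendations.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fully grown tree: splitting continues until every leaf is pure, which tends to overfit.
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Pruned tree: ccp_alpha > 0 removes branches whose improvement is too small to
# justify their complexity (cost-complexity pruning); max_depth caps tree growth.
pruned = DecisionTreeClassifier(
    max_depth=4, ccp_alpha=0.01, random_state=0
).fit(X_train, y_train)

print("full tree test accuracy:  ", full.score(X_test, y_test))
print("pruned tree test accuracy:", pruned.score(X_test, y_test))
```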
Example : two candidate splits of the same parent dataset
Split A
Parent dataset → 40 items in class 1 and 40 items in class 2
Child 1 → 30 items in class 1 and 10 items in class 2
Child 2 → 10 items in class 1 and 30 items in class 2
Split B
Parent dataset → 40 items in class 1 and 40 items in class 2
Child 1 → 20 items in class 1 and 40 items in class 2
Child 2 → 20 items in class 1 and 0 items in class 2
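As a check on which split the metrics prefer, the self-contained sketch below plugs the class counts listed above into the Gini index and the classification error, using the standard formulas rather than any particular library.

```python
def gini(counts):
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def classification_error(counts):
    return 1.0 - max(counts) / sum(counts)

def information_gain(impurity, parent, children):
    n = sum(parent)
    return impurity(parent) - sum(sum(c) / n * impurity(c) for c in children)

parent = (40, 40)
split_a = [(30, 10), (10, 30)]   # Child 1 and Child 2 of Split A
split_b = [(20, 40), (20, 0)]    # Child 1 and Child 2 of Split B

for name, metric in [("Gini index", gini),
                     ("Classification error", classification_error)]:
    ig_a = information_gain(metric, parent, split_a)
    ig_b = information_gain(metric, parent, split_b)
    print(f"{name}: IG(Split A) = {ig_a:.3f}, IG(Split B) = {ig_b:.3f}")
```

The classification error rates both splits equally (gain 0.25), whereas the Gini index gives Split B a higher gain (≈ 0.167 vs 0.125); entropy behaves like the Gini index here, which is one reason Gini and entropy are preferred over the classification error when growing a tree.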
Comparison of all Impurity Metrics
● Scaled entropy = entropy / 2 (scaling makes the curves easier to compare).
● The Gini index takes intermediate values of impurity, lying between the
classification error and the scaled entropy.
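A hedged sketch that reproduces this comparison numerically: it evaluates the Gini index, scaled entropy, and classification error over class proportions p in [0, 1] and plots the curves; the formulas are the standard binary-class ones, and matplotlib is assumed to be available.

```python
import numpy as np
import matplotlib.pyplot as plt

p = np.linspace(0.01, 0.99, 199)   # proportion of class 1 at the node

entropy = -p * np.log2(p) - (1 - p) * np.log2(1 - p)
gini = 1 - (p ** 2 + (1 - p) ** 2)
error = 1 - np.maximum(p, 1 - p)

plt.plot(p, entropy / 2, label="Scaled entropy (entropy / 2)")
plt.plot(p, gini, label="Gini index")
plt.plot(p, error, label="Classification error")
plt.xlabel("p (proportion of class 1)")
plt.ylabel("Impurity")
plt.legend()
plt.show()
```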
Pros :
● Simple to understand, interpret, and visualize.
● Effective with both numerical and categorical data.
● Requires little effort from users for data preparation.
● Nonlinear relationships between parameters do not affect tree
performance.
● Able to handle irrelevant attributes (their information gain is 0, so they
are never chosen for a split).
Cons :
● May grow an overly complex tree if its depth is not limited.
● Unstable: a small variation in the input data may result in a
completely different tree being generated.
● As it is a greedy algorithm, it may not find the globally best tree for a
data set.