Decision Tree Induction

A decision tree is a tree-structured model that supports decision making. It builds classification or regression models in the form of a tree.
It breaks a data set down into smaller and smaller subsets while the associated decision tree is developed incrementally. A decision node has at least two branches, and a leaf node represents a classification or decision. Decision trees can handle both categorical and numerical data.

Key factors

Entropy is a common measure of impurity. It quantifies the randomness or impurity in a data set.
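As a rough illustration, entropy can be computed from the class proportions in a set of labels. The sketch below assumes the usual Shannon definition in bits (the sum of p * log2(1/p) over the classes); the function name and the sample labels are made up for this example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Sum p * log2(1 / p) over the classes that actually occur.
    return sum((n / total) * math.log2(total / n) for n in counts.values())

mixed = ["yes", "yes", "no", "no"]    # evenly mixed: maximally impure for two classes
pure = ["yes", "yes", "yes", "yes"]   # a single class: no impurity

print(entropy(mixed))
print(entropy(pure))
```

An evenly mixed two-class set gives the highest value, and a set containing a single class gives zero.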

Information gain refers to the decrease in entropy after a data set is split on an attribute. It is also called entropy reduction.
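The reduction in entropy can be sketched as the entropy before a split minus the size-weighted entropy of the subsets after it. The helper names, the record layout (a list of dicts), and the tiny data set are assumptions made for this example only.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return sum((n / total) * math.log2(total / n) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target before the split minus the weighted entropy after it."""
    before = entropy([row[target] for row in rows])
    # Group the target labels by the value each row has for the attribute.
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    after = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return before - after

# Made-up records: "outlook" fully determines "play", "windy" tells us nothing.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]

print(information_gain(rows, "outlook", "play"))
print(information_gain(rows, "windy", "play"))
```

An attribute that separates the classes perfectly removes all of the entropy, while an attribute unrelated to the class removes none.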

A decision tree is like a flowchart in which the terminal nodes show the decisions.

It enables us to analyze the possible consequences.
It provides us with a framework to measure the values of outcomes.
It helps us to make the best decisions based on existing data.
The decision tree model comprises a set of rules for partitioning a large heterogeneous population into smaller, more homogeneous, mutually exclusive classes. Given data described by attributes together with their class, a decision tree creates a set of rules that can be used to identify the class. One rule is applied after another, resulting in a hierarchy of segments within segments.
This hierarchy is known as the tree, and each segment is called a node. With each successive division, the members of the resulting sets become more and more similar to each other. The algorithm used to build a decision tree is therefore referred to as recursive partitioning. One well-known algorithm of this kind is CART (Classification and Regression Trees).
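To make the idea of rules applied one after another more concrete, here is a minimal sketch of a hand-written tree and a function that walks it. The tree, its attribute names, and its class labels are hypothetical; internal nodes are represented as (attribute, branches) tuples and leaves as plain class labels.

```python
# Hypothetical tree: internal nodes are (attribute, {value: subtree}),
# leaves are class labels.
tree = ("outlook", {
    "sunny": ("windy", {"yes": "stay in", "no": "play"}),
    "rainy": "stay in",
})

def classify(node, record):
    """Apply one rule after another until a leaf (a class label) is reached."""
    while isinstance(node, tuple):
        attribute, branches = node
        node = branches[record[attribute]]
    return node

print(classify(tree, {"outlook": "sunny", "windy": "no"}))
print(classify(tree, {"outlook": "rainy", "windy": "yes"}))
```

Each step down the tree narrows the record into a smaller segment, and the leaf it ends in identifies the class.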
Consider the example of a factory where the management team needs to make a data-driven decision on whether or not to expand, based on the given data.

Net Expand = (0.6 * 8 + 0.4 * 6) - 3 = $4.2M
Net Not Expand = (0.6 * 4 + 0.4 * 2) - 0 = $3.2M
Since $4.2M > $3.2M, the factory should be expanded.
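The comparison above can be reproduced with a few lines of code. This sketch assumes, based on the formulas, that 0.6 and 0.4 are the probabilities of two scenarios, that 8, 6, 4, and 2 are the corresponding payoffs in millions of dollars, and that 3 is the cost of expanding; the function name is made up for this example.

```python
def expected_net_value(outcomes, cost):
    """Probability-weighted payoff minus the up-front cost (all in $M)."""
    return sum(probability * payoff for probability, payoff in outcomes) - cost

# (probability, payoff) pairs taken from the two formulas.
expand = expected_net_value([(0.6, 8), (0.4, 6)], cost=3)
not_expand = expected_net_value([(0.6, 4), (0.4, 2)], cost=0)

print(f"Net Expand: ${expand:.1f}M")
print(f"Net Not Expand: ${not_expand:.1f}M")
print("Decision:", "expand" if expand > not_expand else "do not expand")
```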

The algorithm takes three parameters: D, attribute_list, and Attribute_selection_method. We refer to D as a data partition.

D - Initially, the entire set of training tuples and their associated class labels.
attribute_list - The set of candidate attributes describing the tuples.
Attribute_selection_method - A heuristic procedure for choosing the attribute that "best" discriminates the given tuples according to class. This procedure applies an attribute selection measure, such as information gain.
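The three parameters can be seen working together in a minimal sketch of recursive partitioning. It is not a full CART implementation: it assumes categorical attributes, multiway splits, and information gain as the selection measure, and all function names, the tuple layout, and the training data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    total = len(labels)
    return sum((n / total) * math.log2(total / n) for n in Counter(labels).values())

def partition(D, attribute):
    """Group the (tuple, label) pairs of D by the tuple's value for `attribute`."""
    groups = defaultdict(list)
    for row, label in D:
        groups[row[attribute]].append((row, label))
    return groups

def information_gain_method(D, attribute_list):
    """Assumed selection method: pick the attribute with the highest information gain."""
    def gain(attribute):
        after = sum(len(g) / len(D) * entropy([label for _, label in g])
                    for g in partition(D, attribute).values())
        return entropy([label for _, label in D]) - after
    return max(attribute_list, key=gain)

def generate_decision_tree(D, attribute_list, attribute_selection_method):
    labels = [label for _, label in D]
    # Stop when the partition is pure or no attributes remain; use the majority class.
    if len(set(labels)) == 1 or not attribute_list:
        return Counter(labels).most_common(1)[0][0]
    best = attribute_selection_method(D, attribute_list)
    remaining = [a for a in attribute_list if a != best]
    # Recurse on each subset produced by splitting on the chosen attribute.
    branches = {value: generate_decision_tree(subset, remaining, attribute_selection_method)
                for value, subset in partition(D, best).items()}
    return (best, branches)

# Made-up training tuples with their class labels.
D = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay in"),
    ({"outlook": "rainy", "windy": "no"}, "stay in"),
    ({"outlook": "rainy", "windy": "yes"}, "stay in"),
    ({"outlook": "rainy", "windy": "no"}, "stay in"),
]

tree = generate_decision_tree(D, ["outlook", "windy"], information_gain_method)
print(tree)
```

Passing the selection method in as a function keeps the recursion independent of the measure, so a different heuristic could be swapped in without changing generate_decision_tree.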

Missing values in the data also do not affect the process of building a decision tree to any considerable extent.
A decision tree does not require scaling of the data.
A decision tree does not require standardization of the data.
A decision tree model is intuitive and easy to explain to technical teams as well as stakeholders.
Compared to other algorithms, decision trees require less effort for data preparation during pre-processing.