Information gain (ig): This is an impurity‐based criterion that uses the entropy (e) measure (originating in information theory) as the impurity measure:

ig(ai, S) = e(y, S) − Σ_{vi,j ∈ dom(ai)} (|σ_{ai = vi,j} S| / |S|) · e(y, σ_{ai = vi,j} S)

where

e(y, S) = − Σ_{cj ∈ dom(y)} (|σ_{y = cj} S| / |S|) · log2 (|σ_{y = cj} S| / |S|) (2.14)

Here σ denotes selection: σ_{ai = vi,j} S is the subset of the sample S in which the attribute ai takes the value vi,j.
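As a brief illustration (a minimal sketch, not code from the book), Eq. (2.14) can be computed directly from arrays of attribute and target values; the function names and the use of NumPy are our own choices:

import numpy as np

def entropy(y):
    # e(y, S): entropy of the target values in the sample, in bits
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(a, y):
    # ig(ai, S): entropy of y minus the weighted entropy of each subset sigma_{ai = v} S
    weighted = sum((a == v).mean() * entropy(y[a == v]) for v in np.unique(a))
    return entropy(y) - weighted

# example: a three-valued attribute against a binary target
a = np.array(["low", "mid", "high", "low", "mid", "high"])
y = np.array([0, 0, 1, 0, 1, 1])
print(information_gain(a, y))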
Gini index: This is an impurity‐based criterion that measures the divergence between the probability distributions of the target attribute’s values. The Gini (G) index is defined as
G(y, S) = 1 − Σ_{cj ∈ dom(y)} (|σ_{y = cj} S| / |S|)² (2.15)
Consequently, the evaluation criterion for selecting the attribute ai is defined as the Gini gain (GG):
GG(ai, S) = G(y, S) − Σ_{vi,j ∈ dom(ai)} (|σ_{ai = vi,j} S| / |S|) · G(y, σ_{ai = vi,j} S) (2.16)
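The Gini index and Gini gain of Eqs. (2.15) and (2.16) follow the same pattern as the entropy sketch above; again the function names are our own:

import numpy as np

def gini(y):
    # G(y, S) of Eq. (2.15)
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def gini_gain(a, y):
    # GG(ai, S) of Eq. (2.16): Gini of y minus the weighted Gini of each subset
    weighted = sum((a == v).mean() * gini(y[a == v]) for v in np.unique(a))
    return gini(y) - weighted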
Likelihood ratio chi‐squared statistics: The likelihood ratio (lr) is defined as
lr(ai, S) = 2 · ln 2 · |S| · ig(ai, S) (2.17)
This ratio is useful for measuring the statistical significance of the information gain criterion. The null hypothesis (H0) is that the input attribute and the target attribute are conditionally independent. If H0 holds, the test statistic is distributed as χ2 with degrees of freedom equal to (|dom(ai)| − 1) · (|dom(y)| − 1).
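The significance test can be sketched as follows (our own helper names; SciPy's chi2.sf gives the upper-tail probability of the χ2 distribution). The factor 2 · ln 2 converts the information gain, measured in bits, into a natural-log deviance statistic:

import numpy as np
from scipy.stats import chi2

def _entropy(y):
    p = np.unique(y, return_counts=True)[1] / len(y)
    return -np.sum(p * np.log2(p))

def _info_gain(a, y):
    return _entropy(y) - sum((a == v).mean() * _entropy(y[a == v]) for v in np.unique(a))

def likelihood_ratio_test(a, y, alpha=0.05):
    lr = 2.0 * np.log(2.0) * len(y) * _info_gain(a, y)    # Eq. (2.17)
    dof = (len(np.unique(a)) - 1) * (len(np.unique(y)) - 1)
    p_value = chi2.sf(lr, dof)                            # P(chi2_dof >= lr)
    return lr, p_value, p_value < alpha                   # reject H0 when p < alpha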
Normalized impurity‐based criterion: The impurity‐based criteria described above are biased toward attributes with larger domains; that is, they prefer input attributes with many values over attributes with fewer values. For instance, an input attribute that represents a social security number will probably obtain the highest information gain, since it uniquely identifies each instance. However, adding this attribute to a decision tree will result in poor generalization accuracy. For this reason, it is useful to “normalize” the impurity‐based measures, as described in the subsequent paragraphs.
Gain ratio (gr): This ratio “normalizes” the information gain (ig) as follows: gr(ai, S) = ig(ai, S)/e(ai, S), where e(ai, S) is the entropy of the attribute ai itself. Note that this ratio is not defined when the denominator is zero, and that it may tend to favor attributes for which the denominator is very small. Consequently, a two‐stage selection procedure is suggested: first, the information gain is calculated for all attributes; then, considering only those attributes that have performed at least as well as the average information gain, the attribute with the best gain ratio is selected. It has been shown that the gain ratio tends to outperform the simple information gain criterion, both in accuracy and in terms of classifier complexity.
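The two‐stage procedure might be sketched as below, reusing the _entropy and _info_gain helpers from the previous sketch; the dictionary input (attribute name mapped to its value array) is our own convention:

import numpy as np

def gain_ratio(a, y):
    # gr(ai, S) = ig(ai, S) / e(ai, S); undefined when the attribute entropy is zero
    split_entropy = _entropy(a)
    return _info_gain(a, y) / split_entropy if split_entropy > 0 else float("nan")

def select_by_gain_ratio(attributes, y):
    # stage 1: keep attributes with at least the average information gain
    gains = {name: _info_gain(a, y) for name, a in attributes.items()}
    average_gain = np.mean(list(gains.values()))
    candidates = [name for name, g in gains.items() if g >= average_gain]
    # stage 2: among the candidates, pick the best gain ratio
    return max(candidates, key=lambda name: gain_ratio(attributes[name], y))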
Distance measure (dm): Similar to the gain ratio, this measure also normalizes the impurity measure, but in a different manner:

dm(ai, S) = ig(ai, S) / e(ai, y, S)

where

e(ai, y, S) = − Σ_{vi,j ∈ dom(ai)} Σ_{ck ∈ dom(y)} (|σ_{ai = vi,j ∧ y = ck} S| / |S|) · log2 (|σ_{ai = vi,j ∧ y = ck} S| / |S|) (2.18)

is the joint entropy of the input attribute and the target attribute.
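A sketch of Eq. (2.18), again with our own names and reusing _info_gain from the sketch above; the denominator is estimated from the empirical joint distribution of (attribute value, class) pairs:

from collections import Counter
import numpy as np

def joint_entropy(a, y):
    # entropy of the joint (attribute value, class) distribution: denominator of Eq. (2.18)
    counts = np.array(list(Counter(zip(a, y)).values()))
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def distance_measure(a, y):
    # dm(ai, S): information gain normalized by the joint entropy
    return _info_gain(a, y) / joint_entropy(a, y)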
Binary criteria: These are used for creating binary decision trees. These measures are based on the division of the input attribute domain into two subdomains.
Let β(ai, d1, d2, S) denote the binary criterion value for attribute ai over sample S when d1 and d2 are its corresponding subdomains. The value obtained for the optimal division of the attribute domain into two mutually exclusive and exhaustive subdomains is used for comparing attributes, namely

β*(ai, S) = max β(ai, d1, d2, S) subject to d1 ∪ d2 = dom(ai); d1 ∩ d2 = ∅ (2.19)
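Since Eq. (2.19) maximizes over all two‐way partitions of dom(ai), a direct (exponential‐time) search can be sketched as follows; the callable beta is any binary criterion with our assumed signature beta(a, y, d1, d2):

from itertools import combinations
import numpy as np

def best_binary_split(a, y, beta):
    # evaluate beta over every partition dom(ai) = d1 ∪ d2 with d1 ∩ d2 = ∅ (Eq. (2.19))
    values = list(np.unique(a))
    best_value, best_split = -np.inf, None
    for r in range(1, len(values) // 2 + 1):
        for d1 in combinations(values, r):
            d2 = tuple(v for v in values if v not in d1)
            score = beta(a, y, set(d1), set(d2))
            if score > best_value:
                best_value, best_split = score, (set(d1), set(d2))
    # note: when |dom(ai)| is even, the half-size partitions are visited twice
    # (once per orientation), which is harmless when taking the maximum
    return best_value, best_split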
Twoing criterion: The Gini index may encounter problems when the domain of the target attribute is relatively wide. In such cases, the use of the binary criterion called the twoing (tw) criterion has been suggested. This criterion is defined as

tw(ai, d1, d2, S) = 0.25 · (|σ_{ai ∈ d1} S| / |S|) · (|σ_{ai ∈ d2} S| / |S|) · (Σ_{ck ∈ dom(y)} | |σ_{ai ∈ d1 ∧ y = ck} S| / |σ_{ai ∈ d1} S| − |σ_{ai ∈ d2 ∧ y = ck} S| / |σ_{ai ∈ d2} S| |)² (2.20)
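A sketch of Eq. (2.20) with our own function name; it matches the beta(a, y, d1, d2) signature assumed by best_binary_split above, so the two can be combined to find the optimal twoing split:

import numpy as np

def twoing(a, y, d1, d2):
    # tw(ai, d1, d2, S) of Eq. (2.20)
    in1, in2 = np.isin(a, list(d1)), np.isin(a, list(d2))
    n1, n2 = in1.sum(), in2.sum()
    if n1 == 0 or n2 == 0:
        return 0.0
    # sum over classes of |P(ck | ai in d1) - P(ck | ai in d2)|
    diff = sum(abs((y[in1] == c).mean() - (y[in2] == c).mean()) for c in np.unique(y))
    return 0.25 * (n1 / len(a)) * (n2 / len(a)) * diff ** 2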