Information gain ratio correction

This paper presents an improvement to the information gain function used by many decision tree machine learning algorithms. It was published on arXiv.org.

26/02/2018

R&D

Abstract

Information gain ratio

Decision tree algorithms use a gain function to select the best split during tree induction. This function is crucial for obtaining trees with high predictive accuracy. Some gain functions are biased when they compare splits of different arities. Quinlan introduced a gain ratio in C4.5's information gain function to correct this bias. In this paper, we present an updated version of the gain ratio that performs better, as it aims to correct the gain ratio's own bias toward unbalanced trees and toward some splits with low predictive interest.
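To make the arity bias concrete, here is a minimal Python sketch of the standard information gain and of Quinlan's gain ratio (information gain divided by the split information). It does not implement the correction proposed in the paper; the function names and the toy data are assumptions chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(labels, branches):
    """Entropy of the labels minus the weighted entropy after the split.

    `branches[i]` is the branch that sample i falls into.
    """
    n = len(labels)
    parts = {}
    for label, branch in zip(labels, branches):
        parts.setdefault(branch, []).append(label)
    remainder = sum(len(p) / n * entropy(p) for p in parts.values())
    return entropy(labels) - remainder


def gain_ratio(labels, branches):
    """Information gain normalised by the split information (C4.5 style)."""
    split_info = entropy(branches)  # entropy of the branch sizes
    if split_info == 0:  # every sample in one branch: no real split
        return 0.0
    return information_gain(labels, branches) / split_info


# Toy data (hypothetical): 4 "yes" and 4 "no" samples.
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
# A binary split that separates the classes only partially.
binary = ["a", "a", "b", "b", "a", "b", "b", "a"]
# An identifier-like split: one branch per sample (arity 8).
ids = list(range(8))

print(round(information_gain(labels, binary), 3))  # 0.189
print(round(information_gain(labels, ids), 3))     # 1.0
print(round(gain_ratio(labels, binary), 3))        # 0.189
print(round(gain_ratio(labels, ids), 3))           # 0.333
```

On this toy data, plain information gain gives the identifier-like split the maximum score even though it cannot generalise, while the gain ratio penalises its high arity and narrows the gap considerably. The ratio still ranks it above the binary split here, so the normalisation reduces the bias in this example without removing it.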

You can also download our paper on GitHub.
