Autobin: A predictive approach towards automatic binning using data splitting

Tanja Verster

doi:10.37920/sasj.2018.52.2.3

Authors

Tanja Verster Centre for Business Mathematics and Informatics, North-West University, Potchefstroom, South Africa

DOI:

https://doi.org/10.37920/sasj.2018.52.2.3

Keywords:

Binning, Credit scoring, Data splitting, Predictive models

Abstract

Binning is known by many names: discretisation, classing, grouping and quantisation. It entails mapping continuous or categorical data into discrete bins. Binning is an important pre-processing step in most predictive models and is considered a basic data preparation step in building a credit scorecard. Credit scorecards are mathematical models that attempt to provide a quantitative estimate of the probability that a customer will display a defined behaviour (e.g. default) with respect to their current credit position with a lender. Among the practical advantages of binning are the removal of the effects of outliers and a way to handle missing values. Many binning methods exist, but they are often time-consuming to carry out. We propose a new method, Autobin, that is based on data splitting and on maximising a cross-validation form of the predicted log-likelihood. Autobin has the advantage of being nearly automatic and requires very little by way of tuning parameters. In a limited simulation study, Autobin was found to outperform its competitors.
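To make the idea of choosing a binning by data splitting concrete, the following is a minimal sketch and not the Autobin algorithm itself, whose details are not given in this abstract. It assumes a single continuous predictor, a binary outcome, equal-frequency bins, and a small hypothetical grid of candidate bin counts; the function names (`bin_edges`, `fit_rates`, `heldout_loglik`, `choose_num_bins`) are illustrative only. For each candidate, bins and per-bin event rates are estimated on the training folds, and the Bernoulli log-likelihood is evaluated on the held-out fold.

```python
import numpy as np

def bin_edges(x, k):
    """Interior cut points for k equal-frequency bins (assumed binning rule)."""
    qs = np.linspace(0, 1, k + 1)[1:-1]
    return np.unique(np.quantile(x, qs))

def fit_rates(x, y, edges):
    """Event rate per bin, with add-one smoothing so no rate is exactly 0 or 1."""
    idx = np.searchsorted(edges, x, side="right")
    n_bins = len(edges) + 1
    events = np.bincount(idx, weights=y, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    return (events + 1.0) / (counts + 2.0)

def heldout_loglik(x, y, edges, rates):
    """Bernoulli log-likelihood of (x, y) under rates fitted on other data."""
    p = rates[np.searchsorted(edges, x, side="right")]
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

def choose_num_bins(x, y, candidates=(2, 3, 5, 8, 12), n_folds=5, seed=0):
    """Pick the bin count with the highest cross-validated log-likelihood."""
    rng = np.random.default_rng(seed)
    fold = rng.permutation(len(x)) % n_folds  # balanced random fold labels
    scores = {}
    for k in candidates:
        total = 0.0
        for f in range(n_folds):
            train, test = fold != f, fold == f
            edges = bin_edges(x[train], k)
            rates = fit_rates(x[train], y[train], edges)
            total += heldout_loglik(x[test], y[test], edges, rates)
        scores[k] = total / len(x)  # mean held-out log-likelihood per case
    best = max(scores, key=scores.get)
    return best, scores

# Synthetic illustration only: default probability rises with x.
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, size=2000)
y = (rng.uniform(size=2000) < 0.05 + 0.6 * x).astype(float)
best, scores = choose_num_bins(x, y)
print(best, scores)
```

Because the bins and rates are always scored on data they were not fitted to, adding more bins is rewarded only when it improves out-of-sample prediction, which is the general motivation for a cross-validated criterion of this kind.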
