Safe Level Graph for Synthetic Minority Over-sampling Techniques

2013 | International Symposium on Communications and Information Technologies (ISCIT)
Share via twitter Share via email Download PDF

Chumphol Bunkhumpornpat


—In the class imbalance problem, most existent classifiers which are designed by the distribution of balance datasets fail to recognize minority classes since a large number of negative instances can dominate a few positive instances. Borderline-SMOTE and Safe-Level-SMOTE are over-sampling techniques which are applied to handle this situation by generating synthetic instances in different regions. The former operates on the border of a minority class while the latter works inside the class far from the border. Unfortunately, a data miner is unable to conveniently justify a suitable SMOTE for each dataset. In this paper, a safe level graph is proposed as a guideline tool for selecting an appropriate SMOTE and describes the characteristic of a minority class in an imbalance dataset. Relying on advice of a safe level graph, the experimental success rate is shown to reach 73% when an F-measure is used as the performance measure and 78% for satisfactory AUCs.