Imbalanced Data Problem

When using technology to solve real-world problems, the most frequent obstacles are probably heavy noise and extreme data imbalance, which show up in forms that are hard to anticipate. In this blog post, we would like to share our efforts to deal with data imbalance.

1. What is imbalanced data?

1-1. A notion

Imbalanced data is data in which the number of observations in the normal category differs significantly from the number of observations in the abnormal category. For example, there are far fewer people with cancer than people without it, and far fewer fraudulent credit card transactions than normal ones. Such data can be regarded as imbalanced.

1-2. The point at issue

Of the two goals, accurately classifying the normal observations and accurately classifying the abnormal ones, the second is generally more important. This is because the abnormal data is usually the target value. When you look at the picture, blue represents normal ob...