Posts

Imbalanced Data Problem

When using technology to solve real-world problems, two of the most frequent obstacles are heavy noise and extreme data imbalance, which can show up in unexpected forms. In this post, we share our efforts to resolve data imbalance.

1. What is imbalanced data?

1-1. Concept
Imbalanced data is data in which the number of observations in the normal category differs significantly from the number of observations in the abnormal category. For example, there are far fewer people with cancer than without, and far fewer fraudulent credit card transactions than normal ones. Such data can be considered imbalanced.

1-2. The point at issue
Between classifying the normal cases accurately and classifying the abnormal cases accurately, the latter is generally more important, because the abnormal class is usually the target value. When you look at the picture, blue represents normal ob...
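
To make the point about abnormal cases concrete, here is a minimal Python sketch. The labels are made up for illustration (95 normal, 5 abnormal) and are not from the post. It shows how a baseline that always predicts "normal" can look accurate while missing every abnormal case.

```python
from collections import Counter

# Hypothetical labels for illustration only: 0 = normal, 1 = abnormal.
labels = [0] * 95 + [1] * 5

counts = Counter(labels)
imbalance_ratio = counts[0] / counts[1]

# A naive baseline that always predicts the majority ("normal") class.
predictions = [0] * len(labels)

# Accuracy: share of all observations classified correctly.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall on the abnormal class: share of abnormal cases that were caught.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / counts[1]

print(f"imbalance ratio: {imbalance_ratio:.0f}:1")
print(f"accuracy: {accuracy:.2f}, recall on abnormal class: {recall:.2f}")
```

Under these assumed numbers, accuracy is high even though no abnormal case is detected, which is why accuracy alone can mislead on imbalanced data.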

Visualizing decision tree partition and decision boundaries

This visualization shows exactly where the trained decision tree predicts that Titanic passengers survived (blue regions) or did not survive (red regions), based on their Age and Pclass.
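
As a rough illustration of how a decision tree carves the Age/Pclass plane into rectangular regions, here is a toy Python sketch. The function `predict_survived` and its split thresholds are invented for this example. They are not the splits learned by the tree in the post and say nothing about the real data.

```python
def predict_survived(age: float, pclass: int) -> bool:
    """Toy hand-written tree. Thresholds are illustrative, NOT learned from data."""
    if pclass <= 1:      # first split: passenger class
        return True
    if age <= 10:        # second split: age
        return True
    return False


# Print a coarse text version of the partition:
# B = predicted survived (blue), R = predicted not survived (red).
for pclass in (1, 2, 3):
    row = "".join(
        "B" if predict_survived(age, pclass) else "R"
        for age in range(0, 80, 5)
    )
    print(f"Pclass {pclass}: {row}")
```

Each `if` corresponds to one axis-aligned cut, so every region in the printed grid is a rectangle in the (Age, Pclass) plane.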

What is a Probability Distribution?

A probability distribution is a statistical function that describes all the possible values of a random variable within a given range, together with their probabilities. This range is bounded by the minimum and maximum possible values, but where a given value falls on the distribution depends on several factors, including the mean (average), standard deviation, skewness, and kurtosis of the distribution.
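
The quantities named above can be computed from a sample with a short Python sketch. The helper name `describe` and the sample values are assumptions for illustration. The formulas use population (biased) moments, and kurtosis is reported in its non-excess form, where a normal distribution scores 3.

```python
import math


def describe(values):
    """Return range, mean, standard deviation, skewness and kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4 (population form, divided by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Made-up sample for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
stats = describe(sample)
for name, value in stats.items():
    print(f"{name}: {value}")
```

A positive skewness, as in this sample, indicates a longer tail toward larger values.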

[Thousand and One Codes] [Git-Hub] 0001 ~ 0010

GitHub is one of the most important services to learn when you start coding. It is unfamiliar to beginner programmers but well known to experienced ones, so this post introduces how beginners can use it well. Basic Git commands to run in the terminal:

    git status
    git add file-name
    git add .
    git commit -m "message"
    git checkout branch-name

What is data science?

Data science defined

Data science combines multiple fields, including statistics, scientific methods, artificial intelligence (AI), and data analysis, to extract value from data. Those who practice data science are called data scientists, and they combine a range of skills to analyze data collected from the web, smartphones, customers, sensors, and other sources to derive actionable insights. Data science encompasses preparing data for analysis, including cleansing, aggregating, and manipulating the data to perform advanced data analysis. Analytic applications and data scientists can then review the results to uncover patterns and enable business leaders to draw informed insights.