3 Stages of model building in machine learning

1. Data collection, preprocessing, and data cleaning

    In this step we collect the data, that is, the features and the labels.

    Data preprocessing:

    1. Data cleaning includes simple techniques such as removing outliers and handling missing values, replacing them with the mean or median for numeric (regression) data and the mode for categorical (classification) data; see the sketch after this list.

  • Mean is the average of a sequence: the sum of the values divided by the number of values.
  • Median is the middle value (or the mean of the two middle values) of a sorted sequence; it is used when outliers in the sequence might skew the average.
  • Mode is the value that appears most often in a sequence.
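
    A minimal sketch of this kind of imputation, assuming a hypothetical pandas DataFrame df with a numeric "salary" column and a categorical "city" column:

        import pandas as pd

        # Hypothetical data with missing values
        df = pd.DataFrame({
            "salary": [40000, 52000, None, 61000],
            "city": ["Pune", None, "Pune", "Mumbai"],
        })

        # Numeric column: fill missing values with the mean (or use .median())
        df["salary"] = df["salary"].fillna(df["salary"].mean())

        # Categorical column: fill missing values with the mode
        df["city"] = df["city"].fillna(df["city"].mode()[0])

        print(df)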

    2. Handling or encoding categorical variables:

    Transform categorical variables into numbers; tool: LabelEncoder().
    Differentiate those categories without implying an order; tools: OneHotEncoder(), get_dummies(). See the sketch below.
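
    A minimal sketch of both approaches, assuming a hypothetical "city" column, using scikit-learn's LabelEncoder and pandas' get_dummies:

        import pandas as pd
        from sklearn.preprocessing import LabelEncoder

        df = pd.DataFrame({"city": ["Pune", "Mumbai", "Pune", "Delhi"]})

        # LabelEncoder: each category becomes an integer code
        df["city_label"] = LabelEncoder().fit_transform(df["city"])

        # get_dummies: one 0/1 column per category, so no order is implied
        dummies = pd.get_dummies(df["city"], prefix="city")

        print(pd.concat([df, dummies], axis=1))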

    3. Feature scaling:

    Standardize or equalize the numeric features; scaling brings all values to the same magnitude (see the sketch after this list).
  • Normalization rescales values to the range 0 to 1. It is also known as Min-Max scaling.
  • Standardization centers values around the mean with a unit standard deviation.
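
    A minimal sketch of both scalers, using a hypothetical single-feature array:

        import numpy as np
        from sklearn.preprocessing import MinMaxScaler, StandardScaler

        X = np.array([[20.0], [35.0], [50.0], [80.0]])  # hypothetical feature values

        # Normalization (Min-Max scaling): results lie between 0 and 1
        print(MinMaxScaler().fit_transform(X))

        # Standardization: results have mean 0 and unit standard deviation
        print(StandardScaler().fit_transform(X))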

    4. Splitting the data set into training and testing sets (see the sketch below)
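
    A minimal sketch with scikit-learn's train_test_split, assuming hypothetical arrays X (features) and y (labels):

        import numpy as np
        from sklearn.model_selection import train_test_split

        # Hypothetical feature matrix and labels
        X = np.arange(20).reshape(10, 2)
        y = np.arange(10)

        # Hold out 20% of the data for testing; random_state makes the split reproducible
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, random_state=42
        )

        print(X_train.shape, X_test.shape)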


2. Algorithm selection 

After examining the data set, we have to decide which algorithm should be chosen.

If we have a regression problem, such as salary prediction or price evaluation,
we use linear regression, polynomial regression, or support vector regression.

If we have a classification problem, such as spam detection or customer churn prediction,
we use logistic regression, K-nearest neighbors classifier, support vector classifier, decision tree, or random forest.

If we have a clustering problem, we use the k-means algorithm (see the sketch below).
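
A minimal sketch of how this choice looks in code, using hypothetical toy arrays; every scikit-learn estimator exposes the same fit/predict interface, so swapping algorithms is mostly a one-line change:

    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression
    from sklearn.cluster import KMeans

    X = np.array([[1.0], [2.0], [3.0], [4.0]])  # hypothetical feature matrix

    # Regression: the target is continuous (e.g. a salary)
    y_reg = np.array([30.0, 35.0, 42.0, 50.0])
    print(LinearRegression().fit(X, y_reg).predict([[5.0]]))

    # Classification: the target is a discrete class label
    y_cls = np.array([0, 0, 1, 1])
    print(LogisticRegression().fit(X, y_cls).predict([[2.5]]))

    # Clustering: there are no labels at all
    print(KMeans(n_clusters=2, n_init=10).fit_predict(X))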

3. Model evaluation and error analysis

  • Finding the accuracy of the model; determining or evaluating model performance for regression and classification problems.
  • For regression, the most commonly used metrics are the mean squared error (MSE) and the root mean squared error (RMSE).
  • In classification we use a confusion matrix for model analysis (see the sketch below).
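
A minimal sketch of both kinds of evaluation with scikit-learn metrics, using hypothetical true and predicted values:

    import numpy as np
    from sklearn.metrics import mean_squared_error, confusion_matrix

    # Regression: MSE and its square root, RMSE
    y_true = np.array([3.0, 5.0, 7.0])
    y_pred = np.array([2.5, 5.5, 8.0])
    mse = mean_squared_error(y_true, y_pred)
    rmse = np.sqrt(mse)
    print(mse, rmse)

    # Classification: confusion matrix of true vs. predicted class labels
    y_true_cls = [1, 0, 1, 1, 0]
    y_pred_cls = [1, 0, 0, 1, 0]
    print(confusion_matrix(y_true_cls, y_pred_cls))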





