Classification & Confusion Matrix & Accuracy Paradox

Classification 

Classification works by assigning an object to the class with the highest probability, like a vote among the candidate classes.


  •  There are two types of classification:

  1. Binary classification: there are exactly two classes, e.g. male/female, cat/dog, yes/no.
  2. Multiclass classification: there are more than two classes, e.g. traffic signs, face recognition, flower species, digit recognition (a sketch of both cases follows below).
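
As a minimal sketch of both cases, using scikit-learn's LogisticRegression on made-up toy data (the features and labels here are illustrative assumptions, not from this post):

from sklearn.linear_model import LogisticRegression

# Binary classification: exactly two classes (0 = cat, 1 = dog)
X_bin = [[0.1], [0.4], [0.9], [1.2]]
y_bin = [0, 0, 1, 1]
binary_clf = LogisticRegression().fit(X_bin, y_bin)
print(binary_clf.predict_proba([[0.8]]))  # per-class probabilities; the higher one wins

# Multiclass classification: more than two classes (e.g. digits 0, 1, 2)
X_multi = [[0.0], [0.1], [1.0], [1.1], [2.0], [2.1]]
y_multi = [0, 0, 1, 1, 2, 2]
multi_clf = LogisticRegression().fit(X_multi, y_multi)
print(multi_clf.predict([[1.9]]))         # picks the most probable of the three classes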


Confusion matrix:

A confusion matrix is a technique for evaluating the accuracy of a model on a classification problem. It lays out how many of the positive and negative data points the model predicts correctly and incorrectly.
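
As a minimal sketch (the labels below are made up for illustration), scikit-learn's confusion_matrix lays these counts out directly:

from sklearn.metrics import confusion_matrix

# 1 = positive class, 0 = negative class (illustrative labels)
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]

# For binary labels, rows are actual classes and columns are
# predicted classes:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))   # [[4 1]
                                          #  [1 2]]

Each off-diagonal count is a mistake: the FP cell holds false alarms and the FN cell holds misses.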


The main terms to consider are accuracy, precision and recall.
Accuracy is an appealing metric because it is a single number, whereas precision and recall (sensitivity) are two separate numbers. To summarize a model's performance in a single number again, we combine precision and recall into the F1 score.

Here is the F1 score's mathematical formula:

F1 = 2 × (precision × recall) / (precision + recall)



Terms and their meanings:

  •  Sensitivity (Recall, True Positive Rate): true positives per total actual positives
  •  Specificity (True Negative Rate): true negatives per total actual negatives
  •  Precision (Positive Predictive Value): true positives per total predicted positives
  •  Negative Predictive Value: true negatives per total predicted negatives
  •  Accuracy: correct predictions per total number of test samples
  •  Fallout (False Positive Rate): false positives per total actual negatives
  •  False Discovery Rate: false positives per total predicted positives
  •  Miss Rate (False Negative Rate): false negatives per total actual positives
  •  False Omission Rate: false negatives per total predicted negatives
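
As a minimal sketch, all of these terms (and the F1 formula above) can be computed from the four confusion-matrix counts; the counts below are arbitrary illustrative numbers:

# Illustrative counts: true/false positives and negatives
tp, fp, tn, fn = 40, 10, 45, 5

recall      = tp / (tp + fn)            # sensitivity / true positive rate
specificity = tn / (tn + fp)            # true negative rate
precision   = tp / (tp + fp)            # positive predictive value
npv         = tn / (tn + fn)            # negative predictive value
accuracy    = (tp + tn) / (tp + fp + tn + fn)
fallout     = fp / (fp + tn)            # false positive rate
fdr         = fp / (fp + tp)            # false discovery rate
miss_rate   = fn / (fn + tp)            # false negative rate
f_omission  = fn / (fn + tn)            # false omission rate

f1 = 2 * precision * recall / (precision + recall)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")

Note that each rate pairs a mistake count with either an actual total (a row of the matrix) or a predicted total (a column).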

Accuracy Paradox

    The accuracy paradox is the contradiction that arises when a model's high accuracy turns out to be misleading. A highly accurate model is sometimes not what we need and can in practice be a poor model: if one model has a higher accuracy rate than another, that does not necessarily mean it is the better model. As developers, we have to look further into the results and the data to judge whether a model is actually good.
    
    Let's say we have a test set of 100 places, and we have to predict which of them have a chance of an earthquake and which do not. In reality, earthquakes occur at 5 of those places.
     
    Model 1: predicts that two of those 5 places have no chance of an earthquake, but earthquakes do occur there; everything else it gets right. Its accuracy is therefore 98%.
     
    Model 2: predicts that 8 extra places have a chance of an earthquake even though none occurs there, while still flagging all 5 real sites. Its accuracy is 92%.
     
    Comparing the two, Model 1 is more accurate than Model 2. But Model 1 fails to warn the two places where earthquakes actually strike, which would be dangerous. Model 2 raises alarms at 8 places where nothing happens, yet it catches every real earthquake, so people would get out of danger. Judged this way, Model 2 is far more useful in practice.
    
    So we can conclude that accuracy alone is not a great judge of a classification model. We need to delve deeper into how the model behaves on the given dataset, using metrics such as precision and recall.
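
The earthquake example can be checked numerically. This sketch (with labels constructed to match the counts above) shows how recall, not accuracy, exposes the difference between the two models:

from sklearn.metrics import accuracy_score, recall_score

# 100 places; earthquakes actually occur at the first 5
y_true = [1] * 5 + [0] * 95

# Model 1: misses 2 of the 5 earthquake sites, no false alarms
model1 = [1, 1, 1, 0, 0] + [0] * 95

# Model 2: flags all 5 real sites plus 8 false alarms
model2 = [1] * 5 + [1] * 8 + [0] * 87

for name, pred in [("Model 1", model1), ("Model 2", model2)]:
    print(name,
          "accuracy:", accuracy_score(y_true, pred),   # 0.98 vs 0.92
          "recall:", recall_score(y_true, pred))       # 0.6  vs 1.0

Model 1 wins on accuracy, but its recall of 0.6 reveals the missed earthquakes; Model 2's perfect recall is what actually matters here.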
