
# Hands-On Practice for Precision and Recall

Let us take the same Titanic dataset and build a machine learning model. For model evaluation we will use all the metrics we have discussed so far.

     

If you have worked on a machine learning model before, you will be familiar with the flow. It consists of these steps in sequence:

1) Importing Libraries

2) Reading the Data

3) Preprocessing the Data

4) Training the Model

5) Making Predictions

6) Evaluating the Model Output

 

## Import Libraries 

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score, accuracy_score, f1_score, fbeta_score

## Import Data 

data = pd.read_csv("titanic-data.csv")

# keep only the columns needed for this example
data = data[['Fare', 'Age', 'Sex', 'Survived']]
print("No. of records:", data.shape[0])
data.head()

   

(Output: first five rows of the Fare, Age, Sex and Survived columns)

  

## Preprocessing

# encode Sex as a number: male -> 0, female -> 1
def replace_string(value):
    if value == "male":
        return 0
    if value == "female":
        return 1

data['Sex'] = data['Sex'].apply(replace_string)

# fill missing ages with the mean age
data['Age'] = data['Age'].fillna(data['Age'].mean())
data['Age'].describe()

data.head()

   

(Output: first five rows after encoding Sex and filling the missing Age values)
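As a side note, the same Sex encoding can be written more compactly with pandas' map. This is an optional alternative to the apply-based version above, not an additional step:

# alternative to replace_string: map the categories directly
data['Sex'] = data['Sex'].map({'male': 0, 'female': 1})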

 

## Model Training 

X = data.drop(['Survived'], axis = 1)
y = data['Survived']
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42, test_size=0.3)

rc = RandomForestClassifier()
rc.fit(X_train, y_train)
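
RandomForestClassifier is a randomized model, so your exact numbers may differ slightly from the ones shown below. If you want reproducible results, you can optionally fix the seed (not part of the original run):

# optional: fix the seed so repeated runs give the same metrics
rc = RandomForestClassifier(random_state=42)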


## Model Predictions 

predictions = rc.predict(X_test)

# reset the index so that y_test lines up with the predictions array
y_test = y_test.reset_index(drop=True)

result = pd.DataFrame()
result['Actual_Target'] = y_test
result['Predicted_Target'] = predictions
result.head()

 

(Output: first five rows of Actual_Target vs Predicted_Target)

 

## Confusion Matrix 

pd.crosstab(result['Predicted_Target'], result['Actual_Target'])

 

Output:

Actual_Target       0    1
Predicted_Target
0                 133   34
1                  32   69
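
If you prefer, scikit-learn can produce the same counts directly with confusion_matrix. Note that confusion_matrix puts actual classes on the rows and predicted classes on the columns, i.e. the transpose of the crosstab above:

from sklearn.metrics import confusion_matrix

# rows = actual class (0, 1), columns = predicted class (0, 1)
print(confusion_matrix(y_test, predictions))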

 

## Calculating Metrics - Manually 


tp = result[(result['Actual_Target'] == 1) & (result['Predicted_Target'] == 1)].shape[0]
fn = result[(result['Actual_Target'] == 1) & (result['Predicted_Target'] == 0)].shape[0]
tn = result[(result['Actual_Target'] == 0) & (result['Predicted_Target'] == 0)].shape[0]
fp = result[(result['Actual_Target'] == 0) & (result['Predicted_Target'] == 1)].shape[0]

Accuracy = (tp + tn) / (tp + fn + tn + fp)
Precision = (tp) / (tp + fp)
Recall = (tp) / (tp + fn)
F1Score = 2*Precision*Recall/(Precision+Recall)

Precision, Recall, Accuracy, F1Score

Output:

(0.6831683168316832, 0.6699029126213593, 0.753731343283582, 0.6764705882352942)
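
To tie these numbers back to the confusion matrix above: this test split has 268 records, with tp = 69, fp = 32, fn = 34 and tn = 133. Plugging these in gives Precision = 69 / 101 ≈ 0.6832, Recall = 69 / 103 ≈ 0.6699, Accuracy = 202 / 268 ≈ 0.7537 and F1 Score = 2 × 0.6832 × 0.6699 / (0.6832 + 0.6699) ≈ 0.6765, which matches the output.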

  

## Calculating Metrics - Using Library

print("Precision:", precision_score(y_test, predictions))
print("Recall:", recall_score(y_test, predictions))
print("Accuracy:", accuracy_score(y_test, predictions))
print("F1 Score:", f1_score(y_test, predictions))
print("FBeta Score - Equal weights:", fbeta_score(y_test, predictions, beta=1))
print("FBeta Score - 0.5 weight:", fbeta_score(y_test, predictions, beta=0.5))
print("FBeta Score - 2 weight:", fbeta_score(y_test, predictions, beta=2))

Output:

Precision: 0.6831683168316832
Recall: 0.6699029126213593
Accuracy: 0.753731343283582
F1 Score: 0.6764705882352942
FBeta Score - Equal weights: 0.6764705882352942
FBeta Score - 0.5 weight: 0.6804733727810651
FBeta Score - 2 weight: 0.672514619883041
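
The three F-Beta values follow from the general formula F-Beta = (1 + beta²) × Precision × Recall / (beta² × Precision + Recall). With beta = 1 it reduces to the F1 Score, beta = 0.5 gives more weight to precision, and beta = 2 gives more weight to recall. As a quick sanity check, here is a small helper (not part of the original notebook) that reproduces the library values from the Precision and Recall variables computed above:

# manual F-Beta; should match fbeta_score for the same beta
def f_beta(precision, recall, beta):
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

print(f_beta(Precision, Recall, 0.5))   # ~0.6805
print(f_beta(Precision, Recall, 2))     # ~0.6725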

 

If you want to download the Jupyter Notebook file for the above code, click here.
