Cyber Security using Machine Learning model Confusion Matrix.

Sanchita Agrawal
4 min read · Jun 6, 2021

Hello guys😊

I’ve come up with a new blog article in which you’ll learn about the confusion matrix and how it is used in today’s security world.

You all know that Machine Learning has a wide range of uses. We use ML models in various fields, including the security world.

So let us start..!!👇👇

How do we use Machine Learning in the Cyber Security world?

In cyber security we use machine learning because an ML model works by observing patterns in data. Models are trained on those patterns to make predictions.

Cyber security teams use this to stay a step ahead of attackers. They predict the severity or likelihood of an attack so they can take precautions beforehand and prevent severe damage.

This is where we use the confusion matrix from sklearn. There are many metrics available, but here we use the confusion matrix.

from sklearn.metrics import confusion_matrix

What is Confusion Matrix?

A confusion matrix is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known. From it we can calculate performance metrics like accuracy, precision, recall, and F1-score.

It is relatively simple, but its terms can be confusing. Let’s look at an example of a confusion matrix.

I’m using made-up numbers here, just as an example👇

Confusion Matrix table

Where,

TN = True Negative

TP = True Positive

FP = False Positive

FN = False Negative

In binary classification the output falls into one of two classes, i.e. ‘yes’/‘no’ or ‘positive’/‘negative’.

From the above matrix, we know..

📍There were 150 students in total. The model predicted that 100 students would pass the exam and 50 would fail.

So, here the 100 predicted passes are the positive values and the 50 predicted fails are the negative values.

📍Of the 100 predicted positives, suppose 80 students really passed and 20 actually failed. The 80 who passed are True Positives (TP) and the 20 who failed are False Positives (FP).

📍Of the 50 predicted negatives, suppose 40 really failed and 10 actually passed. The 40 who failed are True Negatives (TN) and the 10 who passed are False Negatives (FN).

False Positive(FP) is the Type 1 Error.

False Negative(FN) is the Type 2 Error.
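Here is a minimal sketch of how the student example could be fed to sklearn. I am assuming the counts are 80 correct "pass" predictions, 20 wrong "pass" predictions, 40 correct "fail" predictions and 10 wrong "fail" predictions. The label lists `y_true` and `y_pred` are built by hand just for illustration, they are not real data.

```python
from sklearn.metrics import confusion_matrix

# Assumed encoding: 1 = pass (positive), 0 = fail (negative)
# 80 predicted pass & really passed, 20 predicted pass & really failed,
# 40 predicted fail & really failed, 10 predicted fail & really passed
y_true = [1] * 80 + [0] * 20 + [0] * 40 + [1] * 10
y_pred = [1] * 80 + [1] * 20 + [0] * 40 + [0] * 10

# Rows are the true classes, columns are the predicted classes,
# both in the order given by labels=[0, 1]
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
print(cm)

# For a binary problem, ravel() gives the cells in this order
tn, fp, fn, tp = cm.ravel()
print("TN:", tn, "FP:", fp, "FN:", fn, "TP:", tp)
# TN: 40 FP: 20 FN: 10 TP: 80
```

Note that sklearn puts the negative class first when the labels are ordered `[0, 1]`, so the top-left cell is TN, not TP.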

📍This is a list of rates that are often computed from a confusion matrix for a binary classifier:

Accuracy: Overall, how often is the classifier correct?

(TP+TN)/total = (80+40)/150 = 0.8

Confusion Matrix calculations
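The same arithmetic can be written as a few lines of plain Python. This is only a sketch using the example counts (TP = 80, FP = 20, TN = 40, FN = 10); precision, recall and F1-score are the standard formulas for a binary classifier.

```python
# Counts taken from the student example
tp, fp, tn, fn = 80, 20, 40, 10
total = tp + fp + tn + fn  # 150 students

accuracy = (tp + tn) / total                         # how often is the model correct?
precision = tp / (tp + fp)                           # of predicted positives, how many are real?
recall = tp / (tp + fn)                              # of real positives, how many were found?
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of precision and recall

print(round(accuracy, 3))   # 0.8
print(round(precision, 3))  # 0.8
print(round(recall, 3))     # 0.889
print(round(f1, 3))         # 0.842
```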

📍So, cyber security teams use the confusion matrix to monitor cyber attacks. For any company there are two cases: an attack happens or it does not.

📍Cyber attacks can be of many types, for example personal data theft, organization data theft, spoofing, hacking social media accounts, stealing bank details, online money theft and many more. We have two types of error: Type I Error and Type II Error.

📍A Type I Error, i.e. a False Positive, is sometimes known as a False Alarm: the model predicts an attack but none actually happens. It costs the security team some time, but it keeps them alert. A Type II Error, i.e. a False Negative, can be really dangerous: the model says everything is fine while attackers have already entered the database. If the security team fully relies on the ML model here, the attack goes unnoticed. That’s why security teams don’t depend fully on the model and stay alert at all times.

📍So, ML models are not 100% accurate, and we cannot depend fully on them. Which error is more critical varies from use case to use case, so we usually look for a tradeoff between the two types of error.
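To make the tradeoff concrete, here is a small sketch in plain Python. Everything in it is hypothetical: the risk scores, the labels and the helper `count_errors` are made up for illustration and do not come from a real model or library. The idea is that a model gives each event a score, and we raise an alert when the score reaches a threshold. Lowering the threshold catches more attacks (fewer False Negatives) but raises more false alarms (more False Positives).

```python
# Hypothetical events: (risk score from a model, 1 = real attack / 0 = normal)
events = [
    (0.95, 1), (0.80, 1), (0.65, 1), (0.40, 1), (0.30, 1),
    (0.55, 0), (0.35, 0), (0.20, 0), (0.10, 0), (0.05, 0),
]

def count_errors(events, threshold):
    """Return (false_positives, false_negatives) for a given alert threshold."""
    fp = sum(1 for score, attack in events if score >= threshold and attack == 0)
    fn = sum(1 for score, attack in events if score < threshold and attack == 1)
    return fp, fn

for threshold in (0.5, 0.25):
    fp, fn = count_errors(events, threshold)
    print(f"threshold={threshold}: false alarms={fp}, missed attacks={fn}")
# threshold=0.5: false alarms=1, missed attacks=2
# threshold=0.25: false alarms=2, missed attacks=0
```

In a security setting a missed attack is usually the more costly error, so a team might accept the extra false alarms that come with the lower threshold.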
