Support Vector Machine

Support Vector Machine Algorithm In Python

Support Vector Machine (SVM) is a supervised machine learning algorithm that can solve both classification and regression problems. An SVM separates classes with a hyperplane and places two margin lines on either side of it, which makes the classes easier to tell apart. Many hyperplanes may separate the classes, but SVM selects the one that maximizes the marginal distance.

The margin lines are parallel to the hyperplane and pass through the nearest points of each class. The distance between a margin line and the hyperplane is called the marginal distance. A wide margin helps the trained model generalize. If the marginal distance is small, the model is more likely to misclassify new data points.

A support vector is a data point that lies on a margin line.
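To make the ideas of margin and support vector concrete, here is a minimal sketch on a tiny hand-made data set. The six points and the large value of C (used to approximate a hard margin) are assumptions chosen for illustration and are not part of the tutorial's data. For a linear kernel the distance between the two margin lines is 2 / ||w||, where w is the weight vector stored in coef_.

```python
import numpy as np
from sklearn import svm

# Tiny, linearly separable data set (illustrative values only)
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A large C approximates a hard-margin SVM on separable data
model = svm.SVC(kernel="linear", C=1000.0)
model.fit(X, y)

# Points that lie on the margin lines
support_vectors = model.support_vectors_

# Distance between the two margin lines: 2 / ||w||
w = model.coef_[0]
margin_width = 2.0 / np.linalg.norm(w)

print("support vectors:\n", support_vectors)
print("margin width:", margin_width)
```

Only the points closest to the other class end up in support_vectors_; the remaining points could be moved or removed without changing the hyperplane.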

If the data points are linearly separable, a hyperplane can separate them directly. If they are not linearly separable, an SVM kernel maps the data from a lower dimension into a higher dimension, where a separating hyperplane can be found.

Support Vector Machine Kernels

  • linear
  • polynomial
  • Radial Basis Function (RBF)
  • sigmoid
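As a rough illustration of why the kernel choice matters, the sketch below trains one SVC per kernel on data that a straight line cannot separate. The make_circles data set, its parameters, and the default SVC settings are assumptions made for this example; they are not the data used later in this tutorial.

```python
from sklearn import svm
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split

# Two concentric rings: not separable by a straight line in two dimensions
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Train one model per kernel and record its accuracy on the test set
scores = {}
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    clf = svm.SVC(kernel=kernel)
    clf.fit(X_train, y_train)
    scores[kernel] = clf.score(X_test, y_test)
    print(kernel, scores[kernel])
```

On data like this the RBF kernel should score well above the linear kernel, because the rings only become separable after the implicit mapping to a higher dimension.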

Install Required Modules To Implement Support Vector Machine

pip install numpy
pip install pandas
pip install scikit-learn
pip install matplotlib

Classification Using Support Vector Machine

In this example we predict a class label from feature values. x1 and x2 are the features, and Class is the class label.

Data Set

Download Data set

Import required Python modules

Here we use numpy for array handling, pandas for reading data from a csv file, and sklearn for data preprocessing, model training, and model evaluation. Matplotlib is used to plot graphs that help us understand the data visually.

# this module is used to read data from a csv file
import pandas as pd
# this module is used for array handling
import numpy as np
# this function is used to split data into train and test sets
from sklearn.model_selection import train_test_split
# this module is used to plot graphs
import matplotlib.pyplot as plt
# this module is used to train the model
from sklearn import svm
# this function is used for model evaluation
from sklearn.metrics import jaccard_score

Read Data And Print

Here we use the read_csv method of the pandas module to read data from a csv file. After reading, we display the data with the head function, which shows the first five rows of the data frame.

data = pd.read_csv("data.csv")
data.head()

Plot Data Into Scatter Plot

Here we draw the data frame as a scatter plot. Each class gets a different colour, so it is easy to see whether the data are linearly separable by a hyperplane.

# plot class 4 in red, then draw class 2 in blue on the same axes
ax = data[data['Class'] == 4][:100].plot(kind='scatter', x='x1', y='x2', color='red', label='4')
data[data['Class'] == 2][:100].plot(kind='scatter', x='x1', y='x2', color='blue', label='2', ax=ax)
plt.show()

From the graph above we observe that the data are easily separable by a hyperplane.

Data Preprocessing And Train Test Split

First split the data frame into features and class labels, then split the data into training and test sets. Here we use 80% of the data for training and 20% for testing. Evaluating the model on data it has not seen during training helps detect overfitting. The train_test_split method of the sklearn module performs this split.

x = data[['x1', 'x2']]
y = data[['Class']].values.ravel()
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2, random_state=4)

Model Training

To train a support vector machine classification model, use the SVC class available in the sklearn.svm module. It takes a kernel argument that selects the kernel used for training. Here we use the linear kernel.

model = svm.SVC(kernel='linear')
model.fit(xtrain, ytrain)
SVC(C=1.0, break_ties=False, cache_size=200, class_weight=None, coef0=0.0, decision_function_shape='ovr', degree=3, gamma='scale', kernel='linear', max_iter=-1, probability=False, random_state=None, shrinking=True, tol=0.001, verbose=False)

Plot Hyperplane Line In Graph

# redraw the scatter plot and keep the axes for the hyperplane line
ax = data[data['Class'] == 4][:100].plot(kind='scatter', x='x1', y='x2', color='red', label='4')
data[data['Class'] == 2][:100].plot(kind='scatter', x='x1', y='x2', color='blue', label='2', ax=ax)
# the boundary is w[0]*x1 + w[1]*x2 + intercept = 0, solved here for x2
w = model.coef_[0]
a = -w[0] / w[1]
xx = np.linspace(0, 8)
yy = a * xx - (model.intercept_[0]) / w[1]
ax.plot(xx, yy)
plt.show()
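
The line drawn above comes from the equation of the decision boundary, w[0]*x1 + w[1]*x2 + b = 0, solved for x2. The margin lines satisfy the same equation with +1 or -1 on the right-hand side. The sketch below wraps this in a small helper; the function name boundary_line and the sample values of w and b are hypothetical and only illustrate the arithmetic.

```python
import numpy as np

def boundary_line(w, b, x1, offset=0.0):
    """Return x2 such that w[0]*x1 + w[1]*x2 + b = offset.

    offset=0 gives the hyperplane; offset=+1 and offset=-1 give
    the two margin lines. Assumes w[1] is not zero.
    """
    return (offset - b - w[0] * x1) / w[1]

# Illustrative weights and intercept (not taken from a trained model)
w = np.array([1.0, 2.0])
b = -4.0
x1 = np.linspace(0, 4, 5)

x2_plane = boundary_line(w, b, x1)
x2_upper = boundary_line(w, b, x1, offset=1.0)
x2_lower = boundary_line(w, b, x1, offset=-1.0)

# Every returned point satisfies the equation it was solved from
residual = w[0] * x1 + w[1] * x2_plane + b
print(residual)
```

With a trained linear model, w would be model.coef_[0] and b would be model.intercept_[0], as in the plotting code above.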

Model Evaluation And Prediction

For model evaluation we use the Jaccard score. Many other evaluation metrics are available, such as F1-score and log loss.

ypred = model.predict(xtest)
ypred[0:5]
print("Jaccard score : ", jaccard_score(ytest, ypred, pos_label=2))
print("predicted class value for (5,1) :", model.predict([[5, 1]])[0])
Jaccard score : 0.8979591836734694
predicted class value for (5,1) : 2
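
The Jaccard score is not the same as accuracy, so it can help to compute several metrics side by side. The sketch below does this on a small pair of hand-written label lists; the labels are made up for illustration and are not the predictions from the model above. The functions accuracy_score, f1_score and jaccard_score all come from sklearn.metrics.

```python
from sklearn.metrics import accuracy_score, f1_score, jaccard_score

# Made-up true and predicted labels, using the class values 2 and 4
y_true = [2, 2, 2, 4, 4, 2, 4, 2]
y_pred = [2, 2, 4, 4, 4, 2, 2, 2]

# Fraction of all predictions that are correct
accuracy = accuracy_score(y_true, y_pred)
# Harmonic mean of precision and recall for class 2
f1 = f1_score(y_true, y_pred, pos_label=2)
# Intersection over union of true and predicted members of class 2
jaccard = jaccard_score(y_true, y_pred, pos_label=2)

print("accuracy :", accuracy)
print("f1-score :", f1)
print("jaccard  :", jaccard)
```

The same three calls can be applied to ytest and ypred from the example above.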
