A Comprehensive Guide On Machine Learning With Python

This comprehensive guide on Machine Learning with Python gives you a detailed discussion of using Python for ML. We start with the basics of Python and why it is worth using. Thereafter we give an overview of machine learning and the things you need to learn before using Python for ML. We also give a step-by-step tutorial to work through.
Python has been gaining popularity amongst developers in recent times. It came onto the scene in the 1990s, and the language has been updated regularly over the years since. Today, we will talk about Machine Learning with Python in detail.

The version we see now is the most powerful yet, while remaining simple to comprehend and work with. Let us see what exactly Python is.

Why Opt For Python Programming Language?

According to a SlashData report, 8.2 million developers use Python, while 7.6 million use Java. Thus, Python’s importance can in no way be underestimated, and the numbers tell a similar story.

If you are a fresher, you need to pick a language that will keep you on track with your career targets. You cannot just randomly pick a language and master it.

However, if you already have a job, going out of your way to learn a new programming language will give your career an added boost.

A prime example: if you are looking to get into the world of Machine Learning and Artificial Intelligence, then learning Python is one of the best moves you can make.

Some of the properties which make Python a must-learn language are mentioned below.

Simple Syntax

The idea behind a simple syntax is to let you spend more time solving actual problems rather than debugging syntax errors.

Just think about the time you can save, which can be channeled into other productive areas.

Python is a programming language that gives you the freedom to work with a syntax that is simple to write and easy to read.

For instance, Python does not require a type declaration when you assign a variable. It also does not use curly braces to delimit blocks, and it does not need semicolons to end statements.
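As a small illustration of these points, the sketch below assigns variables without type declarations and relies on indentation instead of braces. The function name `describe` and the sample values are made up for this example.

```python
# No type declarations: a name is simply bound to a value.
count = 3
price = 9.99
label = "widgets"


def describe(quantity, unit_price, name):
    # Indentation, not curly braces, marks the body of the function.
    total = quantity * unit_price
    if total > 20:
        size = "large"
    else:
        size = "small"
    # No semicolon is needed at the end of a statement.
    return f"{quantity} {name}: {total:.2f} ({size} order)"


print(describe(count, price, label))
```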

A Strong Community

First of all, Python is open-source, so you have some of the best developers in the community to help you in various situations. Add to that the regular workshops, conferences, and community meetups, which help a beginner feel at home.

There are plenty of resources online to learn about Python. Various blogs and YouTube channels regularly keep aspiring developers updated about recent changes and provide useful tips and suggestions.

Extensibility

Python arms developers with rich frameworks and libraries, which make working on various projects easy. The robust standard library has modules for implementing various functionality without having to code it manually.

Python works well with Mac OS, Windows, and Linux. It is also extremely portable, as it allows developers to run the same code on different operating systems.

In case you need in-depth details about the Python library, you can visit the official website and access the documentation.

Things To Learn For Working With Python For Machine Learning

Brush Up Mathematical Skills

Having mathematical skills is an indispensable part of working with Python for Machine Learning. Now, it does not mean that you need a master’s degree in mathematics, but conceptual clarity on the key topics will come in handy.

Now, let us see what the role of mathematics in Machine Learning is. To put it simply, mathematics is a tool for developing a model to analyze data.

Using an apt mathematical approach, ML algorithms can be designed to extract the required information from large volumes of data fed into the system.

So, a mathematical aptitude helps you handle the dataset more efficiently, using algorithms that perform a specific task to a ‘T’.

How You Can Approach Learning the Mathematics Required?

Linear Algebra

In the linear algebra section, you need to learn about elements like vectors and scalars. You will also have to learn about matrices.

You also need to know how to switch between a system of linear equations and its matrix form. This will help in Machine Learning with Python.

Get the essential knowledge first. For instance, while working on regression, you need a firm grasp of matrix multiplication.

So, amidst all of the different things that you can learn about matrices, try to acquaint yourself with multiplication first. Do not miss out on learning the basic operations like matrix addition.
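To make the link between linear equations and matrices concrete, here is a minimal sketch using NumPy. The two equations and the matrices `A` and `B` are invented for illustration.

```python
import numpy as np

# The system  2x + y = 5  and  x + 3y = 10  written in matrix form A @ v = b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

# Solve for v = [x, y].
v = np.linalg.solve(A, b)
print(v)

# Matrix addition is element-wise; matrix multiplication uses the @ operator.
B = np.array([[1.0, 0.0],
              [0.0, 1.0]])
matrix_sum = A + B
matrix_product = A @ B
print(matrix_sum)
print(matrix_product)
```

Multiplying `A` by the identity matrix `B` returns `A` unchanged, which is a quick way to check that you have the multiplication the right way round.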

Now, for working with PCA (Principal Component Analysis), you need to be familiar with the concept of eigenvectors.

Principal Component Analysis: It is a dimensionality reduction method. This means that, with PCA, a large dataset with ‘n’ variables is reduced to fewer variables in such a fashion that most of the important information is retained.

Eigenvectors: For PCA, you need an understanding of eigenvectors. An eigenvector of a matrix is a vector whose direction does not change when the matrix is applied to it; it is only scaled, by a factor called the eigenvalue. In PCA, the eigenvectors of the data’s covariance matrix give the directions along which the data varies the most, and these directions are kept for accurate analysis.
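The idea can be sketched in a few lines of NumPy. The data below is hypothetical (the second column is roughly twice the first), and this is a bare-bones illustration rather than a replacement for a library implementation.

```python
import numpy as np

# Hypothetical 2-D data: the second column is roughly twice the first.
X = np.array([[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]])

# Center the data, then compute the covariance matrix of the columns.
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)

# eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# The eigenvector with the largest eigenvalue is the first principal component.
first_component = eigenvectors[:, -1]

# Project the 2-D points onto that direction to get a 1-D representation.
X_reduced = X_centered @ first_component

# Share of the total variance kept by the first component.
explained = eigenvalues[-1] / eigenvalues.sum()
print(first_component, explained)
```

Because the two columns are almost perfectly related, nearly all of the variance survives the reduction from two dimensions to one.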
Multivariate Calculus

The second area of mathematical knowledge that you need while working with Machine Learning models is differentiation.

Again, you don’t need to go in-depth with calculus. For instance, in ML you will mainly be dealing with first-order derivatives, so higher-order derivatives can wait until you need them.

Things you need to learn about derivatives, especially first-order ones, include the chain rule, sum rule, power rule, and the other basic differentiation rules. You also need to be familiar with partial differentiation.
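One way to build intuition for these rules is to compare a hand-derived derivative with a numerical estimate. The functions `f` and `g` and the helper `numerical_derivative` below are made up for this sketch.

```python
import math


def numerical_derivative(func, x, h=1e-6):
    # Central difference approximation of the derivative of func at x.
    return (func(x + h) - func(x - h)) / (2 * h)


# Chain rule: for f(x) = sin(x**2), f'(x) = cos(x**2) * 2x.
def f(x):
    return math.sin(x ** 2)


def f_prime(x):
    return math.cos(x ** 2) * 2 * x


x0 = 1.3
print(numerical_derivative(f, x0), f_prime(x0))


# Partial differentiation: for g(x, y) = x**2 * y + y**3,
# dg/dx = 2xy (y held fixed) and dg/dy = x**2 + 3y**2 (x held fixed).
def g(x, y):
    return x ** 2 * y + y ** 3


dg_dx = numerical_derivative(lambda x: g(x, 2.0), 3.0)
dg_dy = numerical_derivative(lambda y: g(3.0, y), 2.0)
print(dg_dx, dg_dy)
```

At the point (3, 2) the formulas give 2·3·2 = 12 and 3² + 3·2² = 21, and the numerical estimates should agree closely with both.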

But the question is, why do you need to study differentiation? The short answer is that differentiation helps to optimize Machine Learning models so that they fit the data better.

How does it work?

When you take the first-order partial derivatives of a function with several inputs, you get its gradient; for a function with several outputs, these derivatives are collected in the Jacobian. The gradient points in the direction in which the function increases fastest, which tells an optimizer which way to move.

One of the properties of the Jacobian is that it is naturally presented as a matrix. Also, the Jacobian gives the best linear approximation of a nonlinear function near a given point.

Now, when you differentiate again, you get the Hessian, the matrix of second-order partial derivatives. It describes the curvature of the function, which helps to tell a minimum from a maximum and can help gradient-based methods converge in fewer steps. We look into Gradient Descent in much more detail in the next section.
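As a rough sketch of first- and second-order derivatives in practice, the code below estimates the gradient and the Hessian of a small quadratic function numerically. The function `f`, the helpers `gradient` and `hessian`, and the evaluation point are all assumptions made for this example.

```python
import numpy as np


def f(p):
    x, y = p
    return x ** 2 + x * y + 2 * y ** 2


def gradient(func, p, h=1e-5):
    # Central differences, one coordinate at a time.
    p = np.asarray(p, dtype=float)
    grad = np.zeros_like(p)
    for i in range(p.size):
        step = np.zeros_like(p)
        step[i] = h
        grad[i] = (func(p + step) - func(p - step)) / (2 * h)
    return grad


def hessian(func, p, h=1e-4):
    # Differentiate the gradient again to get second-order partial derivatives.
    p = np.asarray(p, dtype=float)
    n = p.size
    hess = np.zeros((n, n))
    for i in range(n):
        step = np.zeros_like(p)
        step[i] = h
        hess[:, i] = (gradient(func, p + step) - gradient(func, p - step)) / (2 * h)
    return hess


point = [1.0, -2.0]
grad_at_point = gradient(f, point)   # analytically [2x + y, x + 4y]
H = hessian(f, point)                # analytically [[2, 1], [1, 4]]
print(grad_at_point)
print(H)
print(np.linalg.eigvalsh(H))
```

Both eigenvalues of this Hessian are positive, which is the second-order signal that the function curves upward in every direction and therefore has a minimum.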

Gradient Descent
Gradient Descent is another important part of machine learning, closely tied to neural networks and derivatives. Without apt mathematical knowledge, it’s tough to understand the idea of gradient descent.

Building a neural network from scratch is the best way to get close to the mathematics used in machine learning.

There are various resources online which will help you to learn how to build a neural network from scratch.

Here, you don’t need to get into the depths of neural networks; a basic understanding, coupled with following guidelines and writing code, would suffice.

Gradient Descent is an iterative process that minimizes the cost function by repeatedly stepping in the direction opposite to the gradient until it reaches a local minimum. Another way to understand gradient descent is that it optimizes the weights of a model.

A plain example would be reaching an output of 0.3 from an input of 0.1. Here, we repeatedly adjust the weight until the output is close to the target of 0.3.
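The 0.1-to-0.3 example can be written out as a minimal sketch with a single weight. The model (output = weight × input), the squared-error cost, the starting weight, and the learning rate are all assumptions chosen for this illustration.

```python
def train_weight(x, target, weight=0.5, learning_rate=10.0, steps=200):
    for _ in range(steps):
        prediction = weight * x
        # Cost is the squared error (prediction - target) ** 2;
        # its derivative with respect to the weight is 2 * error * x.
        gradient = 2 * (prediction - target) * x
        # Step against the gradient to reduce the cost.
        weight -= learning_rate * gradient
    return weight


w = train_weight(x=0.1, target=0.3)
print(w, w * 0.1)
```

The learning rate looks large only because the input is small, which makes the gradient small too. With these settings the weight settles near 3, so that 3 × 0.1 lands on the target of 0.3.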

Learning Basic Python Syntax

There are two ways to look at the importance of learning Python syntax for Machine Learning.

According to the first line of thought, you can’t avoid learning the syntax, nor can you just read the documentation and get the hang of it.

You need to combine practice and learning to get a firm grasp of the syntax.

As per the second line of thought, you don’t need to go deep into the syntax of Python while working with machine learning.

A lot of complex elements can be skipped when acquiring Python syntax knowledge for machine learning applications. So, learning the basic syntax is the key.

You can learn Python through two mediums: online and offline. The online section lists learning sites, while the offline section mentions the books that can help you in the process of learning Python.

Online Medium

There are various online learning sites that you can turn to for learning Python. They provide comprehensive coverage of the topic. Let’s see some online sites that provide resources for Python.

Codecademy

First in the list of online resources is Codecademy, which is one of the best online resources to learn a programming language. It is an apt platform for both experts and rookies.

Dataquest

Dataquest is yet another important online source to learn Python. It has a unique way of presenting the knowledge of Python.

Dataquest has pooled together data science and Python; the latter is taught in the context of the former.

Python Official Documentation and Tutorials

None of the online resources can replace the official documentation on Python’s website.

The tutorials available on the website are also a great source to learn different elements of Python, like how the syntax works.

Books: For AI & Python

Learn Python the Hard Way is an excellent resource to learn Python and understand its intricacies, moving up the ladder level by level.

The author, Zed Shaw, creates an atmosphere of active learning for readers by guiding them to other resources that they can dig into while reading the book.

Knowing Data Science Libraries

As mentioned earlier, Python has numerous frameworks and libraries, which make the task of working with data much simpler.

A library simply consists of objects and functions that can be imported into a script to save time and get better results.

Executing a task might sometimes require a huge amount of code, but with the objects and functions present in a library, numerous lines of code can be shrunk into a single line.
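A small sketch of this trade-off, using only the standard library: the same population standard deviation is computed first by hand and then with one call to the `statistics` module. The sample values are made up.

```python
import math
import statistics

values = [4.3, 5.1, 5.8, 6.4, 7.9]

# Written by hand: population standard deviation.
mean = sum(values) / len(values)
squared_diffs = [(v - mean) ** 2 for v in values]
manual_std = math.sqrt(sum(squared_diffs) / len(values))

# With the statistics module: a single call.
library_std = statistics.pstdev(values)

print(manual_std, library_std)
```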

Before heading into Python libraries, you must understand what Jupyter Notebook is and how you can use it. It will help you in Python Machine Learning.

So, Let Us First See What Is Jupyter Notebook?

A Jupyter notebook is a web application that combines three crucial components in a single document. First, you have cells for text, which serve an explanatory purpose.

Next, you have cells for the live execution of code. And finally, the output of the code, such as graphs and other rich media, is displayed inline for better visualization.

The standard Python installer does not include Jupyter Notebook, but distributions such as Anaconda bundle it, and it can also be installed separately.

Below are the instructions on how you can use Jupyter Notebook to learn more about the libraries and use them in a better way.

Steps to leverage Jupyter Notebook to understand Python library:

  • Access Jupyter Notebook.
  • Have a thorough glance at the library documentation.
  • After you have some info about the library documentation, you need to import the library into the concerned notebook.
  • Follow the steps mentioned in the guidelines for using the library.
  • Go through the library documentation to tap into hidden features.
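The steps above can be sketched in a single notebook cell. The standard-library `statistics` module stands in here for whichever library you are exploring; it is used only because it needs no installation.

```python
import inspect
import statistics

# List the public names the library exposes.
public_names = [name for name in dir(statistics) if not name.startswith("_")]
print(public_names)

# Read the documentation string of one function.
doc = inspect.getdoc(statistics.median)
print(doc)

# Try the function on a tiny input before using it on real data.
result = statistics.median([3, 1, 2])
print(result)
```

In a notebook, the built-in `help()` function gives a similar view of a function's documentation without leaving the page.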

We have compiled a list of open-source libraries based on the functionalities, which are divided into two parts: Data processing & Modeling, and Visualisation.

Data Processing and Modelling

Numpy

NumPy, short for Numerical Python, makes working with arrays and matrices in Python easy and effective.

NumPy takes care of the different mathematical operations on arrays and speeds them up considerably.

Further, with NumPy, several other functionalities come to the fore, like numeric conversion and different operations based on linear algebra.
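A few of these array operations are shown below as a short sketch. The arrays are invented, and "numeric conversion" is interpreted here as converting an array from one numeric type to another, which is an assumption about what the text means.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

elementwise_sum = a + b          # element-wise addition
scaled = a * 2                   # every element multiplied by a scalar
product = a @ b                  # matrix product
column_means = a.mean(axis=0)    # mean of each column

# Convert the whole array from integers to floats in one call.
as_float = a.astype(float)

print(elementwise_sum)
print(scaled)
print(product)
print(column_means)
print(as_float.dtype)
```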

Pandas

When you work with data in Python, you need to know pandas well. There is hardly anything in pandas that you can skip and still move forward.

Pandas provides a plethora of tools to work with data. You can reshape data into almost any form, and you can add or remove data. Every change made to the data frame is reflected almost immediately.
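Here is a minimal sketch of adding, removing, and reshaping data with pandas. The column names and values are hypothetical, loosely modelled on flower measurements.

```python
import pandas as pd

# Hypothetical measurements, made up for this example.
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 6.3],
    "sepal_width": [3.5, 3.0, 3.3],
    "species": ["setosa", "setosa", "virginica"],
})

# Add a derived column.
df["sepal_ratio"] = df["sepal_length"] / df["sepal_width"]

# Remove a column and filter rows.
without_width = df.drop(columns=["sepal_width"])
long_sepals = df[df["sepal_length"] > 5.0]

# Reshape: one row per species with the mean of each numeric column.
per_species = df.groupby("species").mean(numeric_only=True)

print(without_width)
print(long_sepals)
print(per_species)
```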

Scikit learn

Scikit-learn is one of the most popular Machine Learning libraries for Python. It was initially named scikits.learn and later became just scikit-learn, where ‘scikit’ is the compressed form of ‘SciPy Toolkit’.

Data scientists across the globe mainly use scikit-learn for various machine learning and data mining tasks such as dimensionality reduction, model selection, regression, and clustering.

Data Visualisation

Matplotlib

Matplotlib is a Python library that helps you comprehend data better through graphical representation.

It helps in the visual representation of large chunks of data as graphs or in some other visual form.

Common plots can be produced with little to no code adjustment.

However, if you want a more advanced graphical representation, you need to be prepared to code a few extra lines.

Machine Learning In Python: Step-by-Step Tutorial

By now you know why Python is a strong choice for Machine Learning. Now, to integrate it into your project, here is a step-by-step tutorial that you can follow to get hands-on with it.

Step 1: Downloading & Installing Python SciPy

The very first and obvious step to start working with Machine Learning in Python is to have Python and SciPy installed on your system. If you are a developer, you will know how to do the needful.

1.1 Install the Python-based SciPy libraries

This step-by-step guide works with Python 2.7 or 3.6+.

Here you need to begin by installing five crucial Python libraries: SciPy, NumPy, Matplotlib, Pandas, and scikit-learn (imported as sklearn).

Installing the SciPy libraries is quite easy on different platforms like Mac OS X, Windows, and Linux. The ease of installation comes from the clear instructions on the SciPy installation pages.

To put it simply, for Mac OS X, using MacPorts is an easy way to handle the installation. For Linux, you can use your package manager. Finally, for Windows, you can begin by downloading Anaconda, which includes all the libraries.

Check the versions

It is crucial that your Python version and environment are set up correctly. For this, you can use the script below. Also, start by working on the Python command line instead of using an IDE.

Use the following script to check the versions.

# Check the versions of libraries

# Python version
import sys
print('Python: {}'.format(sys.version))

# scipy
import scipy
print('scipy: {}'.format(scipy.__version__))

# numpy
import numpy
print('numpy: {}'.format(numpy.__version__))

# matplotlib
import matplotlib
print('matplotlib: {}'.format(matplotlib.__version__))

# pandas
import pandas
print('pandas: {}'.format(pandas.__version__))

# scikit-learn
import sklearn
print('sklearn: {}'.format(sklearn.__version__))

Sample output:

Python: 3.6.9 (default, Oct 19 2019, 05:21:45)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)]
scipy: 1.3.1
numpy: 1.17.3
matplotlib: 3.1.1
pandas: 0.25.1
sklearn: 0.21.3

Also, keep in mind the APIs do not change too frequently, so if you have a version from the recent past, it should work fine.

Step 2: Load The Data

Here we will be using the ‘Hello World’ of Machine Learning datasets: the iris dataset.

The dataset has information about 150 iris flowers. It includes several features of the different species of the iris flower. Loading the data is an essential step in a Python Machine Learning project.

2.1 Importing libraries

At this point, we import every element we need from the libraries: functions, objects, and modules. Every import should load without any errors.

# Load libraries
from pandas import read_csv
from pandas.plotting import scatter_matrix
from matplotlib import pyplot
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

2.2 Time to load iris dataset

The iris dataset originally comes from the UCI Machine Learning Repository; here it is loaded directly from a CSV copy hosted on GitHub. Pandas is used for loading the data and, later, for exploring and visualizing it. Each column is named while loading the data, so that exploring it at later stages is easier.

# Load dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = read_csv(url, names=names)

Step 3: Summarizing The Dataset

Here, we will work through the data and summarise it in a few different ways.

We will assess the data from different angles. Each of these views takes a single command.

These commands are worth remembering, as you can reuse them in future projects.

3.1 Working out details about instances and attributes

Here we use the shape property to ascertain the number of rows (instances) and columns (attributes).

# shape
print(dataset.shape)

For the iris dataset, you should see 150 rows (instances) and 5 columns (attributes).

(150, 5)

3.2 Assess the dataset manually

It is advisable to peek at the data to begin with.

# head
print(dataset.head(20))

Ideally, glance through at least the first 20 rows of the dataset.

    sepal-length  sepal-width  petal-length  petal-width        class
0            5.1          3.5           1.4          0.2  Iris-setosa
1            4.9          3.0           1.4          0.2  Iris-setosa
2            4.7          3.2           1.3          0.2  Iris-setosa
3            4.6          3.1           1.5          0.2  Iris-setosa
4            5.0          3.6           1.4          0.2  Iris-setosa
5            5.4          3.9           1.7          0.4  Iris-setosa
6            4.6          3.4           1.4          0.3  Iris-setosa
7            5.0          3.4           1.5          0.2  Iris-setosa
8            4.4          2.9           1.4          0.2  Iris-setosa
9            4.9          3.1           1.5          0.1  Iris-setosa
10           5.4          3.7           1.5          0.2  Iris-setosa
11           4.8          3.4           1.6          0.2  Iris-setosa
12           4.8          3.0           1.4          0.1  Iris-setosa
13           4.3          3.0           1.1          0.1  Iris-setosa
14           5.8          4.0           1.2          0.2  Iris-setosa
15           5.7          4.4           1.5          0.4  Iris-setosa
16           5.4          3.9           1.3          0.4  Iris-setosa
17           5.1          3.5           1.4          0.3  Iris-setosa
18           5.7          3.8           1.7          0.3  Iris-setosa
19           5.1          3.8           1.5          0.3  Iris-setosa

3.3 Statistical Summary

Here, we summarise each attribute. The summary includes the count, mean, standard deviation, min and max values, and some percentiles.

# descriptions
print(dataset.describe())

As you can see below, all the numerical values are in centimeters and range between 0 and 8 cm.

       sepal-length  sepal-width  petal-length  petal-width
count    150.000000   150.000000    150.000000   150.000000
mean       5.843333     3.054000      3.758667     1.198667
std        0.828066     0.433594      1.764420     0.763161
min        4.300000     2.000000      1.000000     0.100000
25%        5.100000     2.800000      1.600000     0.300000
50%        5.800000     3.000000      4.350000     1.300000
75%        6.400000     3.300000      5.100000     1.800000
max        7.900000     4.400000      6.900000     2.500000

3.4 Class Distribution

Here, we look at the class distribution to get the absolute number of instances in each class.

# class distribution
print(dataset.groupby('class').size())

As you can see below, each class has the same number of instances, i.e. 50.

class
Iris-setosa        50
Iris-versicolor    50
Iris-virginica     50

3.5 Complete Example

Below is an example that combines the above steps into a single script.

# summarize the data
from pandas import read_csv

# Load dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = read_csv(url, names=names)

# shape
print(dataset.shape)

# head
print(dataset.head(20))

# descriptions
print(dataset.describe())

# class distribution
print(dataset.groupby('class').size())

Step 4: Data Visualization

After assessing the data in the raw format, it’s time to give the data a visual representation for a better understanding of the dataset. We use univariate and multivariate plots for this.

4.1 Univariate Plots

Univariate plots are plots of each individual attribute.

Here, since the input variables are numeric, we create a box and whisker plot of each.

# box and whisker plots
dataset.plot(kind='box', subplots=True, layout=(2,2), sharex=False, sharey=False)
pyplot.show()

To get an idea of the distribution, we can plot a histogram of each input variable.

# histograms
dataset.hist()
pyplot.show()

Here, we can see that two of the input variables appear to have a roughly Gaussian distribution, and we can use this information when choosing algorithms that rely on that assumption.

4.2 Multivariate Plots

With multivariate plots, we look at the interactions between the input variables. A scatter plot matrix helps in understanding the relationship between each pair of attributes.

# scatter plot matrix
scatter_matrix(dataset)
pyplot.show()

In the scatter plot matrix, you can see a diagonal grouping of some pairs of attributes. This indicates a high correlation and a predictable relationship.
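Correlation can also be put into a number. The sketch below computes the Pearson correlation coefficient by hand for the first ten rows shown in the head() output of section 3.2. All ten rows are Iris-setosa, so the value describes only that small sample, not the whole dataset. The helper name `pearson` is made up for this example.

```python
import math

# First ten rows from the head() output in section 3.2.
sepal_length = [5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9]
sepal_width = [3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1]


def pearson(xs, ys):
    # Covariance of the two lists divided by the product of their spreads.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


r = pearson(sepal_length, sepal_width)
print(round(r, 2))
```

A value close to 1 means the points line up along a rising diagonal, which is the pattern the scatter plot matrix makes visible.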

4.3 Complete Example

All the elements of data visualisation are brought together in a single reference script.

# visualize the data
from pandas import read_csv
from pandas.plotting import scatter_matrix
from matplotlib import pyplot

# Load dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = read_csv(url, names=names)

# box and whisker plots
dataset.plot(kind='box', subplots=True, layout=(2,2), sharex=False, sharey=False)
pyplot.show()

# histograms
dataset.hist()
pyplot.show()

# scatter plot matrix
scatter_matrix(dataset)
pyplot.show()

Step 5: Evaluating Some Algorithms

It is time to create some models of the data and estimate their accuracy on unseen data.

5.1 Creating a validation dataset

Here, we need to know whether the model we create is any good.

We will follow a dual approach. First, we will estimate the accuracy of the models using statistical methods.

Secondly, we hold back some data that the algorithms never see, and use it to get an independent estimate of the accuracy of the best model.

Here, 80% of the dataset will be used to train, evaluate, and select among our models. The remaining 20% will be held back as a validation dataset.

# Split-out validation dataset
array = dataset.values
X = array[:,0:4]
y = array[:,4]
X_train, X_validation, Y_train, Y_validation = train_test_split(X, y, test_size=0.20, random_state=1)

5.2 Test Harness

To estimate model accuracy, we use stratified 10-fold cross-validation. The training data is split into 10 parts; the model is trained on 9 parts and tested on the remaining 1, and this is repeated for every combination of train-test splits. Stratified means that each fold aims to keep the same class distribution as the whole training dataset.

To make sure that every algorithm is evaluated on the same splits, we fix the random seed using the random_state argument.
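To show what a fold split looks like under the hood, here is a simplified sketch in plain Python. It implements ordinary k-fold splitting with a fixed seed and does not stratify by class, so it is not equivalent to the StratifiedKFold used in this tutorial. The function name `k_fold_indices` is made up, and 120 is the size of the 80% training portion of the 150 rows.

```python
import random


def k_fold_indices(n_samples, n_splits=10, seed=1):
    # Shuffle the row indices once, using a fixed seed so the split is repeatable.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_splits folds of near-equal size.
    folds = [indices[i::n_splits] for i in range(n_splits)]
    for i in range(n_splits):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(120, n_splits=10, seed=1))
for train, test in splits[:2]:
    print(len(train), len(test))
```

Each row lands in the test part exactly once, and calling the function again with the same seed reproduces the same folds, which is what makes the comparison between algorithms fair.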

The models are evaluated using the accuracy metric. It is the ratio of the number of correctly predicted instances to the total number of instances in the dataset, multiplied by 100 to give a percentage (e.g. 95% accurate).
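The accuracy ratio is simple enough to compute by hand. In the sketch below, the labels and predictions are made up for illustration, and the helper name `accuracy` is hypothetical.

```python
def accuracy(y_true, y_pred):
    # Fraction of positions where the prediction matches the true label.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels, made up for this example.
y_true = ["setosa", "versicolor", "virginica", "versicolor", "setosa"]
y_pred = ["setosa", "versicolor", "versicolor", "versicolor", "setosa"]

score = accuracy(y_true, y_pred)
print(f"{score:.2%}")
```

Four of the five predictions match, so the script reports 80.00%.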

5.3. Build Models

We do not know in advance which algorithm will work well on this problem or which configuration to use, so we evaluate a blend of simple linear and nonlinear algorithms.

Linear algorithms: Logistic Regression (LR) and Linear Discriminant Analysis (LDA). Nonlinear algorithms: K-Nearest Neighbors (KNN), Classification and Regression Trees (CART), Gaussian Naive Bayes (NB), and Support Vector Machines (SVM).

# Spot Check Algorithms
models = []
models.append(('LR', LogisticRegression(solver='liblinear', multi_class='ovr')))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB', GaussianNB()))
models.append(('SVM', SVC(gamma='auto')))

# evaluate each model in turn
results = []
names = []
for name, model in models:
    # the same seeded 10-fold split is used for every model
    kfold = StratifiedKFold(n_splits=10, random_state=1, shuffle=True)
    cv_results = cross_val_score(model, X_train, Y_train, cv=kfold, scoring='accuracy')
    results.append(cv_results)
    names.append(name)
    # mean accuracy, with the standard deviation in brackets
    print('%s: %f (%f)' % (name, cv_results.mean(), cv_results.std()))

5.4 Select Best model

We now have six models and an accuracy estimate for each. We need to compare them and select the most accurate one. One of the best ways to compare the mean accuracy and spread of the models is to plot the results.

Since each algorithm was evaluated 10 times, the results for each can be shown as a box and whisker plot of their distribution so that they can be compared with ease.

The printed results show the mean accuracy of each model, with the standard deviation in brackets.

LR: 0.960897 (0.052113)
LDA: 0.973974 (0.040110)
KNN: 0.957191 (0.043263)
CART: 0.957191 (0.043263)
NB: 0.948858 (0.056322)
SVM: 0.983974 (0.032083)

5.5 Example

The complete example is given below for reference purposes.

# compare algorithms
from pandas import read_csv
from matplotlib import pyplot
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Load dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = read_csv(url, names=names)

# Split-out validation dataset
array = dataset.values
X = array[:,0:4]
y = array[:,4]
X_train, X_validation, Y_train, Y_validation = train_test_split(X, y, test_size=0.20, random_state=1, shuffle=True)

# Spot Check Algorithms
models = []
models.append(('LR', LogisticRegression(solver='liblinear', multi_class='ovr')))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB', GaussianNB()))
models.append(('SVM', SVC(gamma='auto')))

# evaluate each model in turn
results = []
names = []
for name, model in models:
    kfold = StratifiedKFold(n_splits=10, random_state=1, shuffle=True)
    cv_results = cross_val_score(model, X_train, Y_train, cv=kfold, scoring='accuracy')
    results.append(cv_results)
    names.append(name)
    print('%s: %f (%f)' % (name, cv_results.mean(), cv_results.std()))

# Compare Algorithms
pyplot.boxplot(results, labels=names)
pyplot.title('Algorithm Comparison')
pyplot.show()

Step 6:  Make Predictions

Now we pick an algorithm to make predictions.

In step 5, you might have observed that the SVM model had the highest estimated accuracy, so we will use this model.

Here, we ascertain the accuracy of the model on the validation dataset that we kept aside earlier.

6.1 Make Predictions

We can fit the model on the entire training dataset and make predictions on the validation dataset.

# Make predictions on validation dataset
model = SVC(gamma='auto')
model.fit(X_train, Y_train)
predictions = model.predict(X_validation)

6.2 Evaluate predictions

Here the predictions are evaluated by comparing them to the expected results in the validation set. We then calculate the classification accuracy, along with a confusion matrix and a classification report.

# Evaluate predictions
print(accuracy_score(Y_validation, predictions))
print(confusion_matrix(Y_validation, predictions))
print(classification_report(Y_validation, predictions))

Below you can see that the accuracy is about 96.67% on the validation dataset. The confusion matrix shows the single error made, and the classification report gives the precision, recall, f1-score, and support for each class.

0.9666666666666667
[[11  0  0]
 [ 0 12  1]
 [ 0  0  6]]
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00        11
Iris-versicolor       1.00      0.92      0.96        13
 Iris-virginica       0.86      1.00      0.92         6

       accuracy                           0.97        30
      macro avg       0.95      0.97      0.96        30
   weighted avg       0.97      0.97      0.97        30
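To see where the numbers in the report come from, the sketch below recomputes them by hand from the confusion matrix printed above. It assumes the scikit-learn convention that rows are true classes and columns are predicted classes; the variable names are made up for this example.

```python
# Confusion matrix from the output above: rows are true classes, columns are predictions.
labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
matrix = [
    [11, 0, 0],
    [0, 12, 1],
    [0, 0, 6],
]

total = sum(sum(row) for row in matrix)
accuracy = sum(matrix[i][i] for i in range(len(labels))) / total
print(f"accuracy: {accuracy:.4f}")

report = {}
for i, label in enumerate(labels):
    true_positive = matrix[i][i]
    predicted_as_class = sum(row[i] for row in matrix)  # column sum
    support = sum(matrix[i])                            # row sum
    precision = true_positive / predicted_as_class
    recall = true_positive / support
    f1 = 2 * precision * recall / (precision + recall)
    report[label] = (precision, recall, f1, support)
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} "
          f"f1={f1:.2f} support={support}")
```

The one off-diagonal entry is a versicolor flower predicted as virginica, which is why versicolor loses recall and virginica loses precision.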

6.3 Complete Example

The complete example for reference is given below in a single script.

# make predictions
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

# Load dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = read_csv(url, names=names)

# Split-out validation dataset
array = dataset.values
X = array[:,0:4]
y = array[:,4]
X_train, X_validation, Y_train, Y_validation = train_test_split(X, y, test_size=0.20, random_state=1)

# Make predictions on validation dataset
model = SVC(gamma='auto')
model.fit(X_train, Y_train)
predictions = model.predict(X_validation)

# Evaluate predictions
print(accuracy_score(Y_validation, predictions))
print(confusion_matrix(Y_validation, predictions))
print(classification_report(Y_validation, predictions))

Conclusion

With this, we come to the end of this comprehensive guide on Machine Learning with Python. We have delved deep into the ‘whys’ and ‘hows’ of Machine Learning and Python.

We would like to end with a single point: you should not feel bogged down if you do not have prior programming knowledge.

The specialty of Python is that it is easy to pick up and depends only minimally on knowledge of other programming languages. So, you can start afresh without any apprehension.

We hope you had a great time reading this article and that it proves to be of great value for any Python Development Company. Thank You!

Harikrishna Kundariya
Harikrishna Kundariya is a marketer, developer, app lover, technology enthusiast, designer, co-founder, and Director of eSparkBiz, a mobile app development company. His 8+ years of experience enable him to provide digital solutions to new start-ups based on app development.
