R.SOWMIYA (30323U09086).pptx data science with python
1.
ARCOT SRI MAHALAKSHMI WOMEN’S COLLEGE
ADVANCED DATA SCIENCE USING PYTHON
NAAN MUDHALVAN
SUBJECT CODE: 23UNM40A
Presented By,
Name: SOWMIYA.R.
Reg. no: 30323U09086
Bachelor of Computer Applications
2.
INTRODUCTION TO DATA SCIENCE
INTRODUCTION:
Data Science is a combination of multiple
disciplines that uses statistics, data analysis, and
machine learning to analyse data and to extract
knowledge and insights from it.
By using Data Science, companies are able to make:
1. Better decisions (should we choose A or B?)
2. Predictive analysis (what will happen next?)
3. Pattern discoveries (find patterns, or maybe hidden
information, in the data)
3.
INTRODUCTION TO DATA SCIENCE
Applications:
Data Science is used in almost every industry today to
predict customer behaviour and trends and to identify new
opportunities.
Businesses can use it to make informed decisions about
product development and marketing. It is also used as a tool to
detect fraud and optimize processes.
4.
INTRODUCTION TO DATA SCIENCE
Key points:
Data science is really a progression of three steps.
We collect data, then analyse the trends within the data,
and lastly we make decisions based on the data.
Data science is a process in which the goal is to make
better choices.
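The three steps can be sketched in a few lines of Python. This is only an illustration: the sales figures, the option names "A" and "B", and the helper average_weekly_change are all made up for the example, and "trend" is taken to mean the average week-over-week change.

```python
from statistics import mean

# Step 1: collect data (hypothetical weekly sales for two product options).
sales = {
    "A": [120, 135, 150, 160],
    "B": [140, 138, 135, 130],
}


def average_weekly_change(values):
    """Average week-over-week change; a positive value means an upward trend."""
    changes = [later - earlier for earlier, later in zip(values, values[1:])]
    return mean(changes)


# Step 2: analyse the trend within the data.
trends = {name: average_weekly_change(values) for name, values in sales.items()}

# Step 3: make a decision based on the data (pick the strongest upward trend).
decision = max(trends, key=trends.get)

print(trends)
print("Choose option", decision)
```

With real data, the "analyse" step would usually be richer than a single average, but the collect, analyse, decide order stays the same.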
5.
EXPLORATORY DATA ANALYSIS
INTRODUCTION:
Exploratory Data Analysis (EDA) is an analysis approach that identifies general
patterns in the data. These patterns include outliers and features of the data that
might be unexpected.
EDA is an important first step in any data analysis.
The goal of this tutorial document is to walk through some of the common
issues encountered in the early stages of an exploratory analysis on a set of data. It
gives examples of common problem areas in:
1. reading in data
2. dealing with blanks
3. dealing with factors
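The three problem areas listed above can be sketched with Pandas as follows. The data set is hypothetical, and "factors" is read here as categorical columns, which Pandas stores with the "category" dtype. Filling blanks with a placeholder or the column mean is one possible choice, not the only one.

```python
import io

import pandas as pd

# Hypothetical raw data; the empty fields stand in for "blanks".
raw = io.StringIO(
    "name,city,score\n"
    "Asha,Chennai,82\n"
    "Bala,,75\n"
    "Chitra,Vellore,\n"
    "Deepa,Chennai,91\n"
)

# 1. Reading in data: read_csv turns empty fields into NaN by default.
df = pd.read_csv(raw)

# 2. Dealing with blanks: count them per column, then fill them.
blank_counts = df.isna().sum()
df["city"] = df["city"].fillna("Unknown")
df["score"] = df["score"].fillna(df["score"].mean())

# 3. Dealing with factors: store the categorical column as "category".
df["city"] = df["city"].astype("category")
city_levels = list(df["city"].cat.categories)

print(blank_counts)
print(df)
print(city_levels)
```

In a real analysis the StringIO object would be replaced by a file path, and dropping rows with df.dropna() may be more appropriate than filling them.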
EXPLORATORY DATA ANALYSIS
Benefits of exploratory data analysis:
1. Deeper Insights
2. Improved Data Quality
3. Better Decision-Making
4. Enhanced Communication
8.
PYTHON FOR DATA SCIENCE
INTRODUCTION:
Python is a programming language widely used by
Data Scientists.
Python has built-in mathematical libraries and
functions, making it easier to solve mathematical problems
and to perform data analysis.
Python's Pandas library provides tools for reading
and writing data in various formats, such as CSV, Excel, and
SQL databases.
It is particularly useful for working with tabular data,
such as data in spreadsheets or databases.
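A minimal sketch of reading and writing tabular data with Pandas is shown below. The product table is hypothetical, and an in-memory text buffer is used in place of a file on disk so the example is self-contained. Pandas also offers read_excel and read_sql for the other formats mentioned, but those need an extra package or a database connection and are not shown here.

```python
import io

import pandas as pd

# Hypothetical tabular data, like rows in a spreadsheet.
df = pd.DataFrame(
    {
        "product": ["pen", "book", "bag"],
        "price": [10, 50, 300],
        "quantity": [5, 2, 1],
    }
)

# Add a computed column, as one would in a spreadsheet.
df["total"] = df["price"] * df["quantity"]

# Write the table as CSV text (a file path would work the same way).
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Read the CSV text back into a new DataFrame.
buffer.seek(0)
loaded = pd.read_csv(buffer)

print(loaded)
```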
9.
PYTHON FOR DATA SCIENCE
Python has libraries with large collections of mathematical
functions and analytical tools.
1. Pandas - This library is used for structured data
operations, like importing CSV files, creating data frames, and
data preparation.
2. NumPy - This is a mathematical library. It has a powerful
N-dimensional array object, linear algebra, Fourier
transforms, etc.
3. Matplotlib - This library is used for visualization of data.
4. SciPy - This library builds on NumPy and has modules for
linear algebra and other scientific computing tasks.
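The NumPy features named above (N-dimensional arrays, linear algebra, Fourier transforms) can be tried with a short sketch like this one. The numbers are arbitrary and chosen only to keep the results easy to check by hand.

```python
import numpy as np

# N-dimensional array: 24 numbers arranged as a 2 x 3 x 4 block.
grid = np.arange(24).reshape(2, 3, 4)
print(grid.ndim, grid.shape)

# Linear algebra: solve the system
#   2x +  y = 3
#    x + 3y = 5
matrix = np.array([[2.0, 1.0], [1.0, 3.0]])
rhs = np.array([3.0, 5.0])
solution = np.linalg.solve(matrix, rhs)
print(solution)

# Fourier transform: find the dominant frequency of a sine wave
# that completes 3 cycles over 32 samples.
t = np.arange(32) / 32
signal = np.sin(2 * np.pi * 3 * t)
spectrum = np.abs(np.fft.rfft(signal))
dominant = int(np.argmax(spectrum))
print(dominant)
```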
10.
PYTHON FOR DATA SCIENCE
Key features:
Python's key features for data analysis include
its simplicity, expressive syntax, large library ecosystem,
easy integration with other languages, and scalability.
These features enable data scientists to perform complex
tasks efficiently and effectively.
11.
EXPLORE MACHINE LEARNING USING PYTHON
DEFINITION:
Machine learning is a branch of Artificial Intelligence
(AI) that aims at making a machine learn from experience
and do the work automatically without being explicitly
programmed for the task.
The Python programming language fits machine
learning well due to its platform independence and its popularity in
the programming community.
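As a small illustration of "learning from experience", the sketch below fits a straight line to a handful of example points and then uses it to predict an unseen case. The data (hours studied against exam score) is invented, and ordinary least squares with NumPy is just one simple technique chosen for the example. Libraries such as scikit-learn offer many more models.

```python
import numpy as np

# Hypothetical "experience": hours studied and the resulting exam score.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
scores = np.array([52.0, 57.0, 61.0, 68.0, 72.0])

# Learn slope and intercept with ordinary least squares.
# Each row of the design matrix is [hours, 1] so the model is
# score = slope * hours + intercept.
design = np.column_stack([hours, np.ones_like(hours)])
(slope, intercept), *_ = np.linalg.lstsq(design, scores, rcond=None)


def predict(hours_studied):
    """Predict a score from the learned line; no rule was hand-coded."""
    return slope * hours_studied + intercept


print(slope, intercept)
print(predict(6.0))
```

The program is never told the relationship between hours and score. It estimates it from the examples, which is the sense in which the machine "learns".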
EXPLORE MACHINE LEARNING USING PYTHON
Advantages of machine learning:
1. Automation of routine tasks.
2. Wide range of applications.
3. Scope for improvement.
4. Useful in education.
5. Efficient handling of data.
14.
DATA VISUALISATION USING PYTHON
DEFINITION:
The process of finding trends and correlations in our
data by representing it pictorially is called Data
Visualization.
To perform data visualization in Python, we can use
various Python data visualization modules such as
Matplotlib, Seaborn, Plotly, etc.
15.
DATA VISUALISATION USING PYTHON
Types of data visualisation:
1. Bar chart
2. Pie chart
3. Line chart
4. Scatter plot
5. Box plot
6. Histogram
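All six chart types in the list above can be drawn with Matplotlib. The sketch below puts one of each on a single figure. The data values are arbitrary, and the figure is rendered off-screen into an in-memory PNG so the script runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render off-screen, no window needed
import matplotlib.pyplot as plt

# Arbitrary example data.
categories = ["A", "B", "C"]
counts = [5, 3, 8]
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]
samples = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]

fig, axes = plt.subplots(2, 3, figsize=(12, 7))

axes[0, 0].bar(categories, counts)
axes[0, 0].set_title("Bar chart")

axes[0, 1].pie(counts, labels=categories)
axes[0, 1].set_title("Pie chart")

axes[0, 2].plot(x, y)
axes[0, 2].set_title("Line chart")

axes[1, 0].scatter(x, y)
axes[1, 0].set_title("Scatter plot")

axes[1, 1].boxplot(samples)
axes[1, 1].set_title("Box plot")

axes[1, 2].hist(samples, bins=5)
axes[1, 2].set_title("Histogram")

fig.tight_layout()

# Save the figure as PNG bytes; pass a file name instead to write to disk.
png_buffer = io.BytesIO()
fig.savefig(png_buffer, format="png")
```

In a Jupyter notebook the Agg line can be dropped and the figure is shown inline.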
16.
DATA VISUALISATION USING PYTHON
Python visualisation libraries:
1. Matplotlib is one of the most widely used Python visualization libraries
for generating simple yet powerful visualizations.
It is a 2-D plotting library that can be used in various ways,
including Python scripts, IPython shells, and Jupyter notebooks.
2. Seaborn is a popular Python library for data visualization,
which offers a variety of plot types and visual styles.
It is built on Matplotlib, is designed to work closely with Pandas data
frames, and is widely used for statistical visualization.
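A short sketch of Seaborn working directly on a Pandas data frame is given below. The column names "group" and "value" and the numbers are hypothetical, and a box plot is used only as one example of a statistical plot.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen, no window needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical measurements for two groups.
df = pd.DataFrame(
    {
        "group": ["A"] * 4 + ["B"] * 4,
        "value": [3, 4, 5, 6, 6, 7, 8, 12],
    }
)

fig, ax = plt.subplots()

# Seaborn reads the columns by name straight from the data frame.
sns.boxplot(data=df, x="group", y="value", ax=ax)
ax.set_title("Value by group")
```

Because Seaborn draws onto a Matplotlib axes object, the usual Matplotlib calls such as set_title and savefig still apply.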