We live in a world where data is everywhere, so data analysis and modelling skills are essential. According to the TIOBE Index, Python overtook Java and C in October 2021 to become the most popular programming language. Python also leads the top data science and machine learning platforms in the KDnuggets poll.
This course uses a real-world project and dataset, together with well-known Python libraries, to show you how to explore data, find and fix its problems, and develop classic statistical regression models and machine learning regression models step by step in an easy-to-understand way. It is especially suitable for beginner and intermediate learners, but many of the methods are also helpful for advanced learners.
You will be able to set up a Python environment to start data analysis and modelling.
Installing Anaconda Python
24:28
Required Python Packages
03:34
Installing Required Packages
10:45
Creating and Accessing Working Directory
08:27
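As a rough illustration of the last lecture in this section, the sketch below creates a working directory and switches into it using only the Python standard library. The function name `enter_working_directory` and the folder name `regression_project` are hypothetical placeholders, not names taken from the course, and a temporary folder stands in for a real location.

```python
import os
import tempfile
from pathlib import Path


def enter_working_directory(base: Path, name: str = "regression_project") -> Path:
    """Create base/name if it does not exist, switch into it, and return its path."""
    work_dir = base / name
    work_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    os.chdir(work_dir)                           # make it the current directory
    return work_dir


# A temporary folder stands in for a real location such as Path.home().
base = Path(tempfile.mkdtemp())
work_dir = enter_working_directory(base)
print(Path.cwd())  # shows the directory that relative file paths now resolve against
```

In practice you would pass a folder of your own choosing as `base`, so that relative paths used when reading and writing data resolve inside the project folder.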
Data Exploration
You will be able to read local and online datasets into a pandas DataFrame and perform basic data exploration.
Download the Data
02:37
Reading and Writing Data
14:28
Accessing Basic Information of DataFrame
10:15
Renaming Columns of DataFrame
13:30
Slicing DataFrame
19:39
Sorting DataFrame
07:45
Filtering DataFrame
17:02
Grouping DataFrame
08:21
Calculating Summary Statistics of DataFrame
15:21
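The exploration steps listed in this section can be sketched in a few lines of pandas. This is a minimal illustration, not the course's own code: the toy data and the column names (`Area`, `Rooms`, `City`, `Price`) are hypothetical and stand in for the course dataset.

```python
import io

import pandas as pd

# Hypothetical toy data standing in for the course dataset.
csv_text = """Area,Rooms,City,Price
50,2,A,150
80,3,B,240
65,2,A,200
120,4,B,400
"""

# pd.read_csv also accepts a local file path or a URL instead of a buffer.
df = pd.read_csv(io.StringIO(csv_text))

df.info()  # column types and non-null counts

# Renaming columns
df = df.rename(columns={"Area": "area_m2", "Rooms": "rooms",
                        "City": "city", "Price": "price"})

# Slicing by column labels
subset = df.loc[:, ["area_m2", "price"]]

# Sorting, most expensive first
by_price = df.sort_values("price", ascending=False)

# Filtering with a boolean mask
large = df[df["area_m2"] > 60]

# Grouping and aggregating
mean_price = df.groupby("city")["price"].mean()

# Summary statistics for the numeric columns
summary = df.describe()
print(summary)
```

Each step returns a new object and leaves `df` unchanged (apart from the explicit reassignment when renaming), which makes it easy to try the operations one at a time in a notebook.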
Data Preparation
You will be able to clean and preprocess the data for further analysis and model building.
Detecting Missing Values
15:50
Imputing Missing Values
20:12
Detecting Outliers
24:28
Treating Outliers
17:23
Correlation Analysis and Feature Selection
30:04
Encoding Categorical Values
26:46
Data Splitting
10:08
Data Normalization
29:51
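The preparation steps in this section can be sketched with pandas and NumPy alone. The techniques chosen here (median imputation, the 1.5 × IQR rule with clipping, one-hot encoding, a random 80/20 split and min-max scaling) are assumptions picked for illustration; the course may use other methods or libraries such as scikit-learn. The toy data and column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value and one extreme value.
df = pd.DataFrame({
    "area": [50.0, 80.0, np.nan, 65.0, 70.0, 1000.0],
    "city": ["A", "B", "A", "B", "A", "B"],
    "price": [150.0, 240.0, 200.0, 210.0, 220.0, 260.0],
})

# Detect missing values, then impute them with the column median.
n_missing = int(df["area"].isna().sum())
df["area"] = df["area"].fillna(df["area"].median())

# Detect outliers with the 1.5 * IQR rule, then treat them by clipping.
q1, q3 = df["area"].quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = (df["area"] < lower) | (df["area"] > upper)
df["area"] = df["area"].clip(lower, upper)

# Correlation between the numeric feature and the target.
corr = df[["area", "price"]].corr()

# Encode the categorical column as dummy variables (drops the first level).
df = pd.get_dummies(df, columns=["city"], drop_first=True)

# Split into training and test sets (about 80% / 20%).
train = df.sample(frac=0.8, random_state=0)
test = df.drop(train.index)

# Min-max normalization using statistics from the training set only,
# so no information from the test set leaks into the scaling.
area_min, area_max = train["area"].min(), train["area"].max()
train = train.assign(area_scaled=(train["area"] - area_min) / (area_max - area_min))
test = test.assign(area_scaled=(test["area"] - area_min) / (area_max - area_min))

print(train)
```

Scaled test values can fall slightly outside the 0 to 1 range, because the minimum and maximum come from the training rows only.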
Classic Statistical Linear Regression Models
You will be able to develop a classic statistical linear regression model, interpret the results, improve the model, evaluate it, and visualize the results.