Data Science Training in Marathahalli, Bangalore

Module 1

Introduction To Data Science

Jargon Busting
Analytics Problem Solving Framework
Language of Data Analysts
Business and Data Understanding
Overview of analytics tools & their popularity
Data Dictionary & Data Granularity
Data Quality & Cleaning
Data Preparation
Data Visualization
Case Study

Module 2

Python For Data Science

Overview of Python
Need for Python in data science
Introduction to installing Python
Introduction to Python Editors & IDEs (Canopy, PyCharm, Jupyter, Rodeo, IPython etc.)
Understand Jupyter notebook & Customize Settings
Concept of Packages/Libraries - Important packages (NumPy, SciPy, scikit-learn, Pandas, Matplotlib, etc.)
Installing & loading Packages & Namespaces
Data Types & Data objects/structures (strings, Tuples, Lists, Dictionaries etc.)
Variable & Value Labels – Date & Time Values
Basic Operations - Mathematical, string, date
Importing Data from various sources (csv, txt, excel, xml etc.)
Database Input (Connecting to database)
Viewing Data objects - subsetting methods
Exporting Data to various formats (Different File systems)
Important Python modules (NumPy, SciPy, scikit-learn, Pandas, Matplotlib, etc.)
Puzzles/Exercises
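
As a rough illustration of the import, data type, date and export topics listed in this module, here is a minimal sketch. It uses only the Python standard library so that it runs without installing Pandas; the sample data, column names and the helper names `load_rows` and `dump_rows` are hypothetical and not part of the course material.

```python
import csv
import io
from datetime import date

# Hypothetical sample data standing in for a CSV file on disk.
RAW_CSV = """name,joined,score
Asha,2021-03-15,82.5
Ravi,2020-11-02,74.0
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries with typed values."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({
            "name": record["name"],                           # string
            "joined": date.fromisoformat(record["joined"]),   # date value
            "score": float(record["score"]),                  # numeric value
        })
    return rows


def dump_rows(rows):
    """Export the rows back to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "joined", "score"], lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        # Dates are written back in ISO format (YYYY-MM-DD).
        writer.writerow({**row, "joined": row["joined"].isoformat()})
    return buffer.getvalue()


rows = load_rows(RAW_CSV)

# Subsetting: take the first two rows and keep only the names, as a tuple.
first_two_names = tuple(row["name"] for row in rows[:2])

# Date arithmetic: subtracting two dates gives a timedelta.
days_between = (rows[0]["joined"] - rows[1]["joined"]).days
```

In practice the same steps are usually done with Pandas (for example `pandas.read_csv` and `DataFrame.to_csv`); the standard-library version is shown only to keep the sketch self-contained.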

Module 3

Data Wrangling In Python

Cleansing Data with Python
Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, Data type conversions, renaming, formatting etc.)
Data manipulation helpers (Operators, Functions, Packages, control structures, Loops, arrays etc.)
Python Built-in Functions (Text, numeric, date, utility functions)
Python User Defined Functions (UDFs)
Formatting data
Puzzles/Exercise
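
The manipulation steps named in this module (duplicates, data type conversions, derived variables, filtering, sorting, user defined functions) can be sketched roughly as follows. This is a plain-Python sketch under assumed data: the order records, field names and the functions `drop_duplicates` and `add_total` are hypothetical examples, not the course's own exercises.

```python
# Hypothetical order records read as text; the third row repeats order A101.
orders = [
    {"order_id": "A101", "city": "Bangalore", "qty": "2", "unit_price": "150.0"},
    {"order_id": "A102", "city": "Mysore", "qty": "1", "unit_price": "900.0"},
    {"order_id": "A101", "city": "Bangalore", "qty": "2", "unit_price": "150.0"},
    {"order_id": "A103", "city": "Bangalore", "qty": "5", "unit_price": "40.0"},
]


def drop_duplicates(records, key):
    """Keep the first record seen for each value of `key`."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


def add_total(record):
    """User defined function: convert types and add a derived variable."""
    qty = int(record["qty"])                 # text -> integer
    price = float(record["unit_price"])      # text -> float
    return {**record, "qty": qty, "unit_price": price, "total": qty * price}


# Remove duplicates, then convert types and derive the order total.
clean = [add_total(record) for record in drop_duplicates(orders, "order_id")]

# Filtering: keep only the Bangalore orders.
bangalore = [record for record in clean if record["city"] == "Bangalore"]

# Sorting: largest order total first.
by_total = sorted(clean, key=lambda record: record["total"], reverse=True)
```

With Pandas the equivalent operations would typically be `drop_duplicates`, `astype`, boolean indexing and `sort_values` on a DataFrame.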

Module 4

Introduction to Statistics

Basic Statistics - Measures of Central Tendency and Variance
Building blocks - Probability Distributions - Normal distribution - Central Limit Theorem
Descriptive statistics, Frequency Tables and summarization
Univariate Analysis and Bivariate Analysis
Inferential Statistics - Sampling - Concept of Hypothesis Testing
Statistical Methods - Z/t-tests (One sample, independent, paired), ANOVA, Correlations and Chi-square

Data Visualization And Statistics Using Python

Introduction to exploratory data analysis

Descriptive statistics, Frequency Tables and summarization
Univariate Analysis (Distribution of data & Graphical Analysis)
Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
Creating Graphs - Bar/pie/line chart/histogram/boxplot/scatter/density
Important Packages for Exploratory Analysis and for statistical methods (NumPy Arrays, Matplotlib, seaborn, Pandas and scipy.stats etc.)
Case Study
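
To make the descriptive statistics, frequency table and one-sample t-test topics concrete, a minimal sketch is given below. It assumes a small made-up sample and a hypothesised mean of 12.0, and uses only the standard library `statistics` module; the function name `one_sample_t` is hypothetical.

```python
import math
import statistics
from collections import Counter

# Hypothetical sample, e.g. units sold per day over one week.
sample = [12.0, 15.0, 11.0, 14.0, 13.0, 15.0, 18.0]

# Measures of central tendency and spread.
mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)
stdev = statistics.stdev(sample)   # sample standard deviation (divides by n - 1)

# Frequency table: how many times each value occurs.
frequency = Counter(sample)


def one_sample_t(values, mu0):
    """t statistic for H0: the population mean equals mu0."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu0) / standard_error


# Test the sample against an assumed population mean of 12.0.
t_stat = one_sample_t(sample, mu0=12.0)
```

The sketch stops at the test statistic. Turning it into a p-value needs the t distribution with n - 1 degrees of freedom, which is what `scipy.stats.ttest_1samp` provides.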

Module 5

Introduction to Machine Learning & Predictive Modelling

Types of Business problems - Mapping of Techniques - Regression vs. classification vs. segmentation vs. Forecasting
Machine Learning Framework
Major Classes of Learning Algorithms - Supervised vs. Unsupervised Learning
Different Phases of Predictive Modelling (Data Pre-processing, Sampling, Model Building, Validation)
Overfitting (Bias-Variance Trade-off) & Performance Metrics
Feature engineering & dimension reduction (PCA)
Concept of optimization & cost function
Overview of gradient descent algorithm
Overview of Cross-validation (Bootstrapping, K-Fold validation etc.)
Model performance metrics (R-square, adjusted R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, and confusion matrix)
Linear Regression (SLR, MLR, Generalised Linear Regression, Regularized Regression)
Supervised Classification (K-NN, Naïve Bayes, Logistic Regression, Support Vector Machines, Decision Trees, Neural Networks)
Concept of Distance and related math background
Unsupervised Learning (K-Means Clustering, Hierarchical Clustering)
Time series forecasting, Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition
Basic Techniques of time series - Averages, Smoothing, etc.
Advanced Techniques of time series - AR Models, ARIMA, etc.
Understanding Forecasting Accuracy of time series - MAPE, MAD, MSE, etc.
Concept of Ensembling and Methods of Ensembling
Association Rule Mining
Case Study and project: applying different algorithms to solve business problems and benchmark the results
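
The cost function, gradient descent and RMSE topics in this module fit together as in the following sketch, which fits a simple linear regression by batch gradient descent on the mean squared error. The data, learning rate and step count are assumptions chosen so that the example converges; the function names are hypothetical, and in practice a library such as scikit-learn would normally be used instead.

```python
import math

# Hypothetical training data generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def mse(slope, intercept, xs, ys):
    """Cost function: mean squared error of the line on the data."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Minimise the MSE by repeatedly stepping against its gradient."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors under the current parameters.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to each parameter.
        grad_slope = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2.0 / n * sum(errors)
        # Move a small step downhill.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


slope, intercept = gradient_descent(xs, ys)

# RMSE is the square root of the cost at the fitted parameters.
rmse = math.sqrt(mse(slope, intercept, xs, ys))
```

A learning rate that is too large makes the updates diverge instead of converge, which is why the value here is kept small for this particular data set.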

Module 6