AD3301 DATA EXPLORATION AND VISUALIZATION Anna University Syllabus R2021

 

AD3301 DATA EXPLORATION AND VISUALIZATION    L T P C: 3 0 2 4

OBJECTIVES:

  • To outline an overview of exploratory data analysis.
  • To implement data visualization using Matplotlib.
  • To perform univariate data exploration and analysis.
  • To apply bivariate data exploration and analysis.
  • To use data exploration and visualization techniques for multivariate and time series data.


UNIT I EXPLORATORY DATA ANALYSIS 9

EDA fundamentals – Understanding data science – Significance of EDA – Making sense of data –
Comparing EDA with classical and Bayesian analysis – Software tools for EDA – Visual aids for
EDA – Data transformation techniques: merging databases, reshaping and pivoting – Transformation
techniques – Grouping datasets – Data aggregation – Pivot tables and cross-tabulations.

UNIT II VISUALIZING USING MATPLOTLIB 9

Importing Matplotlib – Simple line plots – Simple scatter plots – Visualizing errors – Density and
contour plots – Histograms – Legends – Colors – Subplots – Text and annotation – Customization –
Three-dimensional plotting – Geographic data with Basemap – Visualization with Seaborn.

UNIT III UNIVARIATE ANALYSIS 9

Introduction to Single variable: Distributions and Variables - Numerical Summaries of Level and
Spread - Scaling and Standardizing – Inequality - Smoothing Time Series.
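
As a rough illustration of the Unit III topics "numerical summaries of level and spread" and "scaling and standardizing", the sketch below uses only Python's standard `statistics` module. The sample values and variable names are invented for this example and are not part of the syllabus; the prescribed textbook may use different conventions (for instance, standardizing with the median and midspread rather than the mean and standard deviation).

```python
import statistics

# Hypothetical sample: made-up values purely for illustration.
values = [2, 4, 4, 5, 7, 9, 11]

# Level: the median is a resistant summary of the centre.
median = statistics.median(values)

# Spread: the midspread (interquartile range) is the distance
# between the upper and lower quartiles.
q1, q2, q3 = statistics.quantiles(values, n=4)
midspread = q3 - q1

# Standardizing (one common choice): subtract the mean and divide
# by the population standard deviation.
mean = statistics.mean(values)
sd = statistics.pstdev(values)
z_scores = [(v - mean) / sd for v in values]

print("median:", median)
print("midspread:", midspread)
print("z-scores:", [round(z, 2) for z in z_scores])
```

After standardizing this way, the scores have mean 0 and standard deviation 1, which makes batches measured on different scales comparable.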

UNIT IV BIVARIATE ANALYSIS 9

Relationships between Two Variables – Percentage Tables – Analyzing Contingency Tables –
Handling Several Batches – Scatterplots and Resistant Lines – Transformations.
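
To make the Unit IV topics "percentage tables" and "contingency tables" concrete, here is a minimal sketch that builds a two-way table of counts and converts it to row percentages using only the standard library. The observations, category labels, and the helper `row_percentages` are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical (row category, column category) observations,
# invented purely for illustration.
observations = [
    ("urban", "yes"), ("urban", "yes"), ("urban", "no"),
    ("rural", "yes"), ("rural", "no"), ("rural", "no"), ("rural", "no"),
]

# Contingency table: count of each (row, column) combination.
counts = Counter(observations)

# Row totals, used as the base for row percentages.
row_totals = Counter(row for row, _ in observations)


def row_percentages(counts, row_totals):
    """Express each cell as a percentage of its row total."""
    return {
        (row, col): 100 * n / row_totals[row]
        for (row, col), n in counts.items()
    }


table = row_percentages(counts, row_totals)
for (row, col), pct in sorted(table.items()):
    print(f"{row:5s} {col:3s} {pct:5.1f}%")
```

Percentaging within rows (rather than columns) suits the case where the row variable is treated as the explanatory one; swapping the roles of the two variables gives column percentages instead.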

UNIT V MULTIVARIATE AND TIME SERIES ANALYSIS 9

Introducing a Third Variable – Causal Explanations – Three-Variable Contingency Tables and
Beyond – Longitudinal Data – Fundamentals of TSA – Characteristics of time series data – Data
Cleaning – Time-based indexing – Visualizing – Grouping – Resampling.

PRACTICAL EXERCISES: 30 PERIODS

1. Install a data analysis and visualization tool: R / Python / Tableau Public / Power BI.
2. Perform exploratory data analysis (EDA) on datasets such as an email data set. Export all your
emails as a dataset, import them into a pandas data frame, visualize them, and draw different
insights from the data.
3. Work with NumPy arrays, pandas data frames, and basic plots using Matplotlib.
4. Explore various variable and row filters in R for cleaning data. Apply various plot features in R
on sample data sets and visualize.
5. Perform Time Series Analysis and apply the various visualization techniques.
6. Perform data analysis and representation on a map using various map data sets, with mouse
rollover effects, user interaction, etc.
7. Build cartographic visualizations for multiple datasets involving various countries of the world,
states and districts in India, etc.
8. Perform EDA on the Wine Quality data set.
9. Take a case study on a data set, apply the various EDA and visualization techniques, and
present an analysis report.

COURSE OUTCOMES:
At the end of this course, the students will be able to:
CO1: Understand the fundamentals of exploratory data analysis.
CO2: Implement data visualization using Matplotlib.
CO3: Perform univariate data exploration and analysis.
CO4: Apply bivariate data exploration and analysis.
CO5: Use data exploration and visualization techniques for multivariate and time series data.
TOTAL: 75 PERIODS

TEXT BOOKS:

1. Suresh Kumar Mukhiya, Usman Ahmed, “Hands-On Exploratory Data Analysis with Python”,
Packt Publishing, 2020. (Unit 1)
2. Jake VanderPlas, “Python Data Science Handbook: Essential Tools for Working with Data”,
O’Reilly, 1st Edition, 2016. (Unit 2)
3. Catherine Marsh, Jane Elliott, “Exploring Data: An Introduction to Data Analysis for Social
Scientists”, Wiley Publications, 2nd Edition, 2008. (Units 3, 4, 5)

REFERENCES:

1. Eric Pimpler, “Data Visualization and Exploration with R”, Geospatial Training Services, 2017.
2. Claus O. Wilke, “Fundamentals of Data Visualization”, O’Reilly, 2019.
3. Matthew O. Ward, Georges Grinstein, Daniel Keim, “Interactive Data Visualization:
Foundations, Techniques, and Applications”, 2nd Edition, CRC Press, 2015.

                                                                                              
