Module 2: Project Data Wrangling
Instructions
For this portion of the project, you will examine your dataset for incorrect data. Any incorrect data should be removed, corrected, or imputed. Follow these steps:
Remove irrelevant data. If you are unsure whether something is irrelevant, keep it.
Remove duplicate records.
Make sure numbers are interpreted as numerical data types.
Fix typos.
Standardize.
Investigate outliers.
Check and manage missing values.
Format and normalize data if needed.
Change categorical values into numbers if needed.
Once you have completed this, you will need to provide a Word document summarizing the pre-processing steps performed on your dataset.
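The steps above can be sketched with pandas. This is only an illustration under stated assumptions: the toy table, the column names (`id`, `city`, `price`, `notes`), the typo mapping, the median imputation, and the 1.5 * IQR outlier rule are all hypothetical choices, not requirements of the assignment. Your own dataset will call for its own decisions at each step.

```python
import pandas as pd

# Hypothetical toy data; every column name and value here is illustrative only.
raw = pd.DataFrame({
    "id": [1, 2, 2, 3, 4, 5],
    "city": ["Boston", "boston ", "boston ", "Bostn", "Chicago", "Chicago"],
    "price": ["200", "250", "250", "n/a", "300", "9000"],
    "notes": ["", "", "", "", "", ""],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    df = df.drop(columns=["notes"])   # remove a column judged irrelevant
    df = df.drop_duplicates()         # remove duplicate records
    df = df.assign(
        # interpret numbers as numeric; unparseable strings like "n/a" become NaN
        price=pd.to_numeric(df["price"], errors="coerce"),
        # standardize whitespace and capitalization, then fix a known typo
        city=df["city"].str.strip().str.title().replace({"Bostn": "Boston"}),
    )
    # manage missing values: impute with the median (one option among several)
    df["price"] = df["price"].fillna(df["price"].median())
    # change categorical values into numbers via integer category codes
    df["city_code"] = df["city"].astype("category").cat.codes
    return df.reset_index(drop=True)


cleaned = clean(raw)

# investigate outliers: flag values outside 1.5 * IQR of the quartiles
q1, q3 = cleaned["price"].quantile([0.25, 0.75])
iqr = q3 - q1
cleaned["price_outlier"] = (cleaned["price"] < q1 - 1.5 * iqr) | (
    cleaned["price"] > q3 + 1.5 * iqr
)

# normalize: min-max scale price into the range [0, 1]
price = cleaned["price"]
cleaned["price_scaled"] = (price - price.min()) / (price.max() - price.min())

print(cleaned)
```

Flagging outliers rather than deleting them keeps the decision visible, which makes it easier to justify in the written summary.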
Module 3: Project Exploratory Analysis
Instructions
In this assignment, you will perform an exploratory analysis that will allow you to get a feel for the data and start exploring potential relationships. This may include:
Descriptive statistics
Histograms
Bar charts
Heat maps
Line graphs
Box plots
Frequency tables
Once your analysis is complete, you will need to provide a Word document showing and describing the results of your exploratory analysis.
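As a rough starting point, the tabular parts of such an analysis (descriptive statistics, a frequency table, and the correlation matrix that a heat map would display) might look like the sketch below. The DataFrame and its columns (`region`, `units`, `revenue`) are hypothetical stand-ins for your own dataset.

```python
import pandas as pd

# Hypothetical example data; replace with your own cleaned dataset.
df = pd.DataFrame({
    "region": ["East", "West", "East", "South", "West", "East"],
    "units": [10, 20, 30, 40, 50, 60],
    "revenue": [15.0, 25.0, 35.0, 45.0, 55.0, 65.0],
})

numeric = df[["units", "revenue"]]

# Descriptive statistics: count, mean, std, min, quartiles, max
summary = numeric.describe()

# Frequency table for a categorical field
freq = df["region"].value_counts()

# Pairwise Pearson correlations; this matrix is what a heat map would color
corr = numeric.corr()

print(summary)
print(freq)
print(corr)

# In a Jupyter notebook with matplotlib installed, calls such as
# df["units"].plot.hist() or df.boxplot(column="units", by="region")
# would draw the corresponding histogram and box plot.
```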
Using your chosen dataset, reevaluate the heat map from the last module.
Consider ways to perform a visual check to see if there is a relationship between fields.
With this insight, develop a model using either linear regression or multiple linear regression.
Report the intercept, slope(s), model accuracy, a comparison of actual to predicted output, and a scatterplot with a line portraying the model.
Once you complete these steps, you will need to provide a Word document showing and explaining the results of your model development.
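A minimal sketch of the simple linear regression case is shown below, assuming an ordinary least-squares fit with NumPy's `polyfit` and R-squared as the accuracy measure. The `x` and `y` values are made up for illustration, and other tools (for example scikit-learn or statsmodels) would serve equally well.

```python
import numpy as np

# Hypothetical data: x is the predictor field, y is the field to predict.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

# Degree-1 least-squares fit; polyfit returns the highest power first
slope, intercept = np.polyfit(x, y, deg=1)

# Model predictions for the observed x values
predicted = slope * x + intercept

# R-squared: share of the variance in y explained by the fitted line
ss_res = np.sum((y - predicted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Actual vs. predicted comparison, one row per observation
comparison = np.column_stack([y, predicted])

print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.4f}")
print(comparison)

# In a notebook with matplotlib installed, plt.scatter(x, y) followed by
# plt.plot(x, predicted) would draw the scatterplot with the fitted line.
```

For multiple linear regression the same idea extends to several predictor columns, with one slope reported per predictor.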
After finishing the proposal, create a final report of 5-6 pages.
Use Python and Jupyter, and show the visuals of the data analysis along with an introduction and a conclusion.
—-
Module 2: Project Data Manipulation
Instructions
In this section of the project, you will look for errors in your dataset. Any inaccurate information should be removed, updated, or imputed. Take the following steps:
Remove any unnecessary information. If you’re not sure whether something is irrelevant, keep it.
Remove duplicate records.
Ensure that numbers are treated as numerical data types.
Correct any typos.
Standardize.
Look into outliers.
Examine and manage missing values.
If necessary, format and normalize the data.
If necessary, convert category values to numbers.
Once you’ve finished this, you’ll need to submit a Word document outlining the pre-processing steps you took with your dataset.
Module 3: Project Exploratory Analysis
Instructions
In this assignment, you will perform an exploratory analysis that