Homework Assignment 2: [15 points] Prepare a line graph - Dissertation App

January 26th, 2023 Dissertations

Homework Assignment 2:
1. [15 points] Prepare a line graph of the Amtrak ridership data from the beginning of 1991 to March 2004, with both axes labeled. Print your R command and the line graph. After observing the behavior of the ridership in your graph, answer the following questions using the related statistics:
In which year and month do the maximum and minimum ridership occur?
What are the range and IQR of the ridership? Are there any outliers?
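The steps in Question 1 can be sketched in base R as follows. This is only an illustration: the series is random placeholder data (not the real Amtrak figures), the variable names are made up for the example, and the 1.5 * IQR rule is one common convention for flagging outliers rather than something the assignment prescribes.

```r
# Placeholder series: random numbers standing in for the real Amtrak data.
# Replace `values` with the ridership column of your own file.
set.seed(1)
n_months <- 159                          # Jan 1991 to Mar 2004 inclusive
values <- 1800 + 150 * sin(2 * pi * seq_len(n_months) / 12) +
  rnorm(n_months, sd = 60)
dates <- seq(as.Date("1991-01-01"), by = "month", length.out = n_months)
ridership <- ts(values, start = c(1991, 1), frequency = 12)

# Line graph with labeled axes
plot(ridership, xlab = "Time", ylab = "Ridership",
     main = "Amtrak ridership (placeholder data)")

# Year and month of the maximum and minimum
max_month <- format(dates[which.max(values)], "%Y-%m")
min_month <- format(dates[which.min(values)], "%Y-%m")

# Range, IQR, and points outside the 1.5 * IQR fences
value_range <- diff(range(values))
value_iqr <- IQR(values)
quartiles <- quantile(values, c(0.25, 0.75))
is_outlier <- values < quartiles[1] - 1.5 * value_iqr |
  values > quartiles[2] + 1.5 * value_iqr
outliers <- values[is_outlier]

print(c(max = max_month, min = min_month))
print(c(range = value_range, IQR = value_iqr, n_outliers = length(outliers)))
```

With the real data, only the construction of `values` changes; the remaining lines should carry over as written.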

2. [15 points] Prepare a boxplot and a histogram of the ridership. Print both plots along with the R commands that generate them. What can you say about the outliers (are there any?) and about the distribution (is it symmetric or skewed and, if skewed, is it skewed to the right or to the left)?
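A minimal sketch of the two plots in Question 2, again on random placeholder data rather than the real ridership values. The mean-minus-median comparison at the end is only a rough hint about skewness and should be read together with the histogram.

```r
# Placeholder data: random numbers, NOT the real Amtrak ridership.
set.seed(1)
ridership <- 1800 + 150 * sin(2 * pi * seq_len(159) / 12) +
  rnorm(159, sd = 60)

# Boxplot; boxplot() also returns the points drawn beyond the whiskers
bp <- boxplot(ridership, ylab = "Ridership",
              main = "Boxplot of ridership (placeholder data)")
print(bp$out)                 # empty if no point lies beyond the whiskers

# Histogram of the same values
h <- hist(ridership, xlab = "Ridership",
          main = "Histogram of ridership (placeholder data)")

# Rough skewness hint: a mean well above the median suggests right skew,
# a mean well below the median suggests left skew.
skew_hint <- mean(ridership) - median(ridership)
print(skew_hint)
```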

3. [20 points] Use the Excel data set Pollution (its variables are explained below). Prepare a heatmap of the correlations between every pair of numeric variables in Pollution, with the correlation coefficients printed in the cells. Print the heatmap and the R command(s) that produce it. Which variable pairs have the highest and lowest positive and negative correlations? As a data analyst, how do you interpret the highest positive and negative correlations? Do these correlations make sense?

The Pollution.xlsx data set includes regional climate, pollution, and population demographic statistics from 1960 in the United States. Below are the descriptions for these variables.

Variable  Description
PREC      Average annual precipitation in inches
JANT      Average January temperature in degrees F
JULT      Average July temperature in degrees F
OVR65     Percent population aged 65 or older
POPN      Average household size
EDUC      Median school years completed by those over 22
HOUS      Percent housing units, which are in good repair and with all facilities
DENS      Population per square mile in urbanized areas, 1960
NONW      Percent non-white population in urbanized areas, 1960
WWDRK     Percent employed in white collar occupations
POOR      Percent of families with income < $3000
HC        Relative hydrocarbon pollution potential
NOX       Relative nitric oxides pollution potential
SO2       Relative sulfur dioxide pollution potential
HUMID     Annual average % relative humidity at 1:00 pm
MORT      Total age-adjusted mortality rate per 100,000
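One possible way to build the heatmap in Question 3 with base R only is sketched below. The data frame is filled with random placeholder numbers under a few of the variable names listed above, so its correlations say nothing about the real data. With the real file, one would first load Pollution.xlsx (for example with readxl::read_excel, assuming that package is installed). The helper name pair_table is made up for the example.

```r
# Placeholder data: random numbers under real column names, NOT Pollution.xlsx.
set.seed(42)
n_obs <- 50                               # arbitrary number of rows
Pollution <- data.frame(
  PREC = rnorm(n_obs, 40, 10),
  JANT = rnorm(n_obs, 35, 10),
  EDUC = rnorm(n_obs, 11, 1),
  NOX  = rexp(n_obs, rate = 1 / 20),
  MORT = rnorm(n_obs, 940, 60)
)

corr <- cor(Pollution)                    # pairwise correlation matrix
k <- ncol(corr)

# Heatmap: image() draws corr[i, j] at position (i, j)
image(1:k, 1:k, corr, zlim = c(-1, 1),
      col = colorRampPalette(c("blue", "white", "red"))(21),
      axes = FALSE, xlab = "", ylab = "",
      main = "Correlation heatmap (placeholder data)")
axis(1, at = 1:k, labels = rownames(corr), las = 2)
axis(2, at = 1:k, labels = colnames(corr), las = 2)
text(row(corr), col(corr), labels = sprintf("%.2f", corr))

# List each variable pair once, sorted from most negative to most positive
idx <- which(upper.tri(corr), arr.ind = TRUE)
pair_table <- data.frame(
  var1 = rownames(corr)[idx[, "row"]],
  var2 = colnames(corr)[idx[, "col"]],
  r    = corr[upper.tri(corr)]
)
pair_table <- pair_table[order(pair_table$r), ]
print(pair_table)
```

The first and last rows of pair_table give the most negative and most positive correlations, which is a convenient cross-check against the numbers printed in the cells.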

4. [15 points] Select any five variables of Pollution and prepare a matrix scatter plot of them. Print the matrix scatter plot and the R command(s) that produce it. Interpret the diagonal and off-diagonal elements of this matrix. Are the correlation coefficients shown in the plot in line with your answer to Question 3?
Hint: Install the GGally package first, then use the ggpairs command, like the one in your textbook.
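The hint points to GGally's ggpairs. The sketch below instead uses base R's pairs() with two small custom panels, so that it runs without extra packages; it is a substitute for the hinted command, not the textbook's version. The data are random placeholders, the five variables are an arbitrary choice, and panel_hist and panel_cor are helper names invented for the example.

```r
# Placeholder data: random numbers under real column names, NOT Pollution.xlsx.
set.seed(42)
n_obs <- 50
Pollution <- data.frame(
  PREC = rnorm(n_obs, 40, 10),
  JANT = rnorm(n_obs, 35, 10),
  EDUC = rnorm(n_obs, 11, 1),
  NOX  = rexp(n_obs, rate = 1 / 20),
  MORT = rnorm(n_obs, 940, 60)
)
vars <- c("PREC", "JANT", "EDUC", "NOX", "MORT")   # any five variables

# Diagonal panel: histogram of a single variable
panel_hist <- function(x, ...) {
  old <- par(usr = c(par("usr")[1:2], 0, 1.5))
  on.exit(par(old))
  h <- hist(x, plot = FALSE)
  rect(head(h$breaks, -1), 0, tail(h$breaks, -1),
       h$counts / max(h$counts), col = "grey80")
}

# Upper panel: correlation coefficient of the two variables
panel_cor <- function(x, y, ...) {
  old <- par(usr = c(0, 1, 0, 1))
  on.exit(par(old))
  text(0.5, 0.5, sprintf("%.2f", cor(x, y)))
}

# Lower panels keep the default scatter plots
pairs(Pollution[, vars], diag.panel = panel_hist, upper.panel = panel_cor)

# With GGally installed, the hinted one-line alternative would be:
# GGally::ggpairs(Pollution[, vars])

# Coefficients to compare with the heatmap from Question 3
print(round(cor(Pollution[, vars]), 2))
```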
5. [15 points] Use the Pollution data set for PCA. How many components are required to explain at least 99.9% of the variance? Provide a table from the output to support your claim.
Hint: Refer to the related lecture notes and textbook section. You can use an R command like
“pcs <- prcomp(data.frame(Pollution))” to run the PCA.
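As a hedged illustration of Question 5, the sketch below runs the hinted prcomp call on random placeholder data and counts the components needed to reach the 99.9% threshold. The column values are invented, so the resulting count does not apply to the real Pollution data.

```r
# Placeholder data: random numbers under real column names, NOT Pollution.xlsx.
set.seed(42)
n_obs <- 50
Pollution <- data.frame(
  PREC = rnorm(n_obs, 40, 10),
  EDUC = rnorm(n_obs, 11, 1),
  DENS = rnorm(n_obs, 4000, 1500),
  NOX  = rexp(n_obs, rate = 1 / 20),
  MORT = rnorm(n_obs, 940, 60)
)

pcs <- prcomp(data.frame(Pollution))      # unscaled PCA, as in the hint
print(summary(pcs))                       # table incl. "Cumulative Proportion"

# Same cumulative proportions computed directly from the standard deviations
var_explained <- pcs$sdev^2 / sum(pcs$sdev^2)
cum_explained <- cumsum(var_explained)
n_needed <- which(cum_explained >= 0.999)[1]
print(n_needed)
```

The summary() table is the output the question asks for; the direct computation avoids relying on the rounded values that summary() reports.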

6. [20 points] Run the PCA once more with scaling, and print its output. How many components are required to explain at least 80% of the variance?
Hint: Modify the R command you used in Q5 with a line like “pcs <- prcomp(data.frame(Pollution), scale. = TRUE)”.
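The scaled run in Question 6 could look like the sketch below, once more on random placeholder data. The function components_needed is a hypothetical helper written for this example, not part of any package.

```r
# Placeholder data: random numbers under real column names, NOT Pollution.xlsx.
set.seed(42)
n_obs <- 50
Pollution <- data.frame(
  PREC = rnorm(n_obs, 40, 10),
  EDUC = rnorm(n_obs, 11, 1),
  DENS = rnorm(n_obs, 4000, 1500),
  NOX  = rexp(n_obs, rate = 1 / 20),
  MORT = rnorm(n_obs, 940, 60)
)

# Hypothetical helper: smallest number of components whose cumulative
# share of variance reaches `threshold`
components_needed <- function(pca, threshold) {
  cum_explained <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
  which(cum_explained >= threshold)[1]
}

pcs_scaled <- prcomp(data.frame(Pollution), scale. = TRUE)
print(summary(pcs_scaled))
print(components_needed(pcs_scaled, 0.80))

# After scaling, each variable has unit variance, so the total variance
# equals the number of variables.
print(sum(pcs_scaled$sdev^2))
```

Scaling matters here because variables measured on large scales (DENS in the placeholder data, for instance) would otherwise dominate the leading components.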

7. [10 points, BONUS] Express the observations of the Pollution data set in terms of its principal components by assigning them to “scores”, i.e. use a command like “scores = pcs$x”. Print the first five observations of the data set in terms of its principal components, i.e. type an R command like “head(scores, 5)”. Compute the correlation between any two columns of scores to confirm that the principal components are uncorrelated, i.e. type an R command like “cor(scores[ ,1], scores[ ,2])”. Keep in mind that a correlation coefficient like “-5.360244e-17” in R is practically 0.
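The commands quoted in Question 7 can be tried end to end as sketched below. The data are random placeholders, and the 1e-10 cutoff used to treat a correlation as zero is an arbitrary choice for the example.

```r
# Placeholder data: random numbers under real column names, NOT Pollution.xlsx.
set.seed(42)
n_obs <- 50
Pollution <- data.frame(
  PREC = rnorm(n_obs, 40, 10),
  EDUC = rnorm(n_obs, 11, 1),
  DENS = rnorm(n_obs, 4000, 1500),
  NOX  = rexp(n_obs, rate = 1 / 20),
  MORT = rnorm(n_obs, 940, 60)
)

pcs <- prcomp(data.frame(Pollution), scale. = TRUE)
scores <- pcs$x                       # observations in principal-component form
print(head(scores, 5))                # first five observations

r12 <- cor(scores[, 1], scores[, 2])  # correlation between PC1 and PC2
print(r12)                            # tiny value, zero up to rounding error
print(abs(r12) < 1e-10)

# All pairs at once: rounds to the identity matrix
print(round(cor(scores), 10))
```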


Published by

Dissertations
