Econ 382 Econometrics
Instructor: Ma
Fall 2020
Data Project
Due 11:59pm Sunday December 20
Submit your project as a Word document in Blackboard. If you use Pages on a Mac, please export your Pages document to a PDF. Only Word or PDF will be accepted.
This project MUST be done individually and independently. Identical or substantially similar work will result in an F for all authors involved.
This project is designed to give you a taste of how quantitative research is conducted in the real world, and how some of the econometric methods we discussed in class are applied. There are several statistical packages available, and you are required to conduct this project using R/RStudio.
This project will involve the following tasks:
1. Explore data on the CDC website, find out what variables are available, and how the variables are defined.
2. Formulate an empirical model with six or more variables. What you want to investigate with the model must make sense.
3. Download at least two datasets as SAS transport (XPT) files (the variables you choose must come from two or more datasets).
4. Load the SAS transport data files into RStudio.
5. Merge the datasets into one.
6. Summarize descriptive statistics of the variables of interest.
7. Run a regression or regressions to estimate the coefficients.
8. Estimate the coefficients, and conduct hypothesis tests.
9. Discuss the findings and any problems with the model or the results. For example, are you missing any key variables? Is this likely to lead to omitted variable bias?
10. Write a 5-to-6-page report or memorandum in 12pt Times New Roman, double-spaced, with an appendix that includes all the commands and outputs from RStudio (this page count does not include the appendix). The report should describe in detail what you have done, and how, for every item listed above. You are expected to be able to use the data to back up your arguments, and tell a complete story.
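For tasks 6 to 8, the general shape of the R commands can be sketched as follows. This is only an illustration on simulated data: the variable names (bmi, age, educ) and all the numbers are made up, and the model has fewer variables than the six or more that the project requires.

```r
# Simulated data for illustration only (not NHANES data).
set.seed(382)
n    <- 200
age  <- runif(n, 20, 70)
educ <- sample(1:5, n, replace = TRUE)
bmi  <- 22 + 0.05 * age - 0.3 * educ + rnorm(n, sd = 3)
dat  <- data.frame(bmi, age, educ)

# Task 6: descriptive statistics of the variables of interest.
summary(dat)

# Task 7: estimate the coefficients by OLS.
fit <- lm(bmi ~ age + educ, data = dat)

# Task 8: coefficient estimates with standard errors, t statistics, and p-values,
# followed by 95% confidence intervals.
summary(fit)
confint(fit, level = 0.95)
```

The t statistics and p-values reported by summary(fit) test the null hypothesis that each coefficient equals zero.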
Where to Get R and RStudio?
Download R at http://www.r-project.org/, and install R on your computer. After installing R, download RStudio at http://www.rstudio.com/, and install RStudio on your computer.
Where to Get Data?
Go to the CDC (Centers for Disease Control and Prevention) homepage at https://www.cdc.gov/. Scroll down, and click on “CDC Organization”. On the CDC organization page, click on “National Center for Health Statistics” (NCHS). Next, under “Population Surveys”, click on “National Health and Nutrition Examination Survey” (NHANES). On the NHANES page, click on “Questionnaires, Datasets, and Related Documentation”, then select the “NHANES 2017-2018” data. There, you will see five categories of data that you have access to: Demographics Data, Dietary Data, Examination Data, Laboratory Data, and Questionnaire Data. Each category has multiple datasets (except for Demographics Data). You are required to use at least TWO datasets. I do suggest that you include Demographics Data, where you can find most of the common demographic variables such as age, education, income, and so on.
How to Download the Datasets as SAS Transport (XPT) Files?
Right click on the XPT links to the right of the datasets that you choose, then select “save link as”, and save them on your flash drive or C: drive. You can rename them as you like.
How to Load the SAS XPT Files into RStudio?
Hint: You will need the package “Hmisc”. I have demonstrated in class quite a few times how to install packages. After installing and loading the package, you will need the command “sasxport.get” to open the XPT files.
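A minimal sketch of the loading step, assuming the Hmisc package is already installed. The file names below are placeholders for whatever you named your downloaded files, and the files are assumed to sit in your working directory.

```r
# install.packages("Hmisc")   # run once if the package is not installed yet
library(Hmisc)

# Placeholder file names (assumptions): replace with the names of your own files.
demo_path <- "DEMO_J.XPT"
exam_path <- "BMX_J.XPT"

# Only attempt to read the files if they are present in the working directory.
if (file.exists(demo_path) && file.exists(exam_path)) {
  demo <- sasxport.get(demo_path)   # reads an XPT file into a data frame
  exam <- sasxport.get(exam_path)
  print(dim(demo))                  # number of rows and columns
  print(head(names(demo)))          # first few variable names
}
```

If the files are stored elsewhere, either give the full path or change the working directory with setwd() first.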
How to Merge the Datasets?
Each individual in the datasets has an ID number called “seqn” (you can see it in the Variable Lists), and you can merge the datasets by this common variable. Hint: use the “merge” command. Find out by yourself how this command works.
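As a starting point for your own exploration, here is a small sketch of “merge” on two toy data frames. The values and the variable names other than seqn are made up; your real datasets will have many more rows and columns.

```r
# Two toy data frames standing in for two NHANES datasets (made-up values).
demo <- data.frame(seqn = c(1, 2, 3, 4), age = c(34, 51, 27, 60))
exam <- data.frame(seqn = c(2, 3, 4, 5), bmi = c(24.1, 30.5, 22.8, 27.3))

# Default behavior: keep only individuals that appear in both datasets.
both <- merge(demo, exam, by = "seqn")

# all = TRUE keeps everyone from either dataset; unmatched values become NA.
everyone <- merge(demo, exam, by = "seqn", all = TRUE)

print(both)
print(everyone)
```

Which version is appropriate depends on your model: individuals missing from one dataset cannot contribute to a regression that uses variables from both.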
Different.
There may be missing/empty values (shown as NA) or values that do not make sense (for instance 99) in some variables for some observations/individuals. When you compute the summary statistics, you will need to deal with them appropriately (for example, you should change these strange values to NA, and tell RStudio to ignore the NA when calculating).
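One way this recoding could look, sketched on a toy variable. The variable name educ and its values are made up, and 99 is only an example code: check the documentation of each variable you use to see which values are special codes.

```r
# Toy data (made-up values); 99 stands in for a code that does not make sense.
df <- data.frame(seqn = 1:5, educ = c(3, 99, 4, NA, 5))

# Change the special code to NA. %in% returns FALSE for existing NA values,
# so the logical index is safe to use for assignment.
df$educ[df$educ %in% c(99)] <- NA

# Tell R to ignore NA when calculating.
mean_educ <- mean(df$educ, na.rm = TRUE)
print(mean_educ)

# summary() reports the number of NA values separately.
summary(df$educ)
```

Without na.rm = TRUE, mean() returns NA whenever the variable contains any missing value.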