Probability and Statistics for Engineers
Project #1: Probability Distributions for Real-World Populations
This project is the primary assessment mechanism in IE342 for the Probability portion of the course. It is
the culmination of our study of Probability Theory over Modules 2-9, which included the following
topics:
• sample spaces,
• events,
• additive rules of probability,
• conditional probability,
• Bayes’ Theorem,
• random variables,
• general properties of probability distributions,
• expected value and other mathematical expectation definitions,
• joint probability distributions,
• covariance,
• correlation,
• linear combinations of random variables,
• Chebyshev’s Theorem,
• parameterized discrete probability distributions (specifically the binomial distribution),
• the normal distribution, and
• parameterized asymmetric continuous probability distributions.
After a half-semester focused on these topics, you are now tasked with applying the concepts of
Probability Theory to real-world populations.
Project #1 Description
I. Define Random Variables
Pick one or more real-world systems or scenarios where a population can be modeled with four
random variables that have the following characteristics:
• a random variable with a normal distribution
• a continuous random variable with an asymmetric distribution (e.g., exponential or lognormal)
• a random variable with a uniform distribution (discrete or continuous)
• a binomial random variable with the number of trials n large enough that it can be accurately modeled using
a normal approximation
Keep in mind that this is an exercise in modeling, and no model is perfect. Thus, you will need to
make assumptions about the nature of the population and how you are quantifying observations to
define the random variables. This project does NOT require any real data associated with the
population of choice. Instead, assign distribution types and parameters based on your understanding of
the system. Data may be used to justify parameter choices, but it is not required. Discuss how
probabilistic modeling is useful for the system of choice.
Here are example definitions related to winter weather in Chicago (DO NOT USE THESE
DEFINITIONS FOR YOUR PROJECT; THE BLUE FONT IS USED FOR EXAMPLES):
• (normal) daily high temperature during winter in Chicago
• (asymmetric continuous) total daily snowfall, in inches, during winter in Chicago
• (uniform) the day, indexed from the start of winter, with the largest snowfall amount
• (binomial) number of winter days with nonzero snowfall
Within these sample random variable definitions, modeling with a uniform distribution likely has
some inconsistencies with reality, in that there are systematic date preferences for higher/lower
snowfall amounts. However, it is okay to assume that a uniform distribution model captures the overall
tendencies well.
Some other ideas for real-world systems that would work nicely for the probabilistic modeling for this
project include:
• COVID-19 – number of confirmed cases, hospitalizations, deaths, etc. across different states,
countries, or worldwide and potentially through time; efficacy of COVID tests, antibody tests, or
vaccines; adherence rates for public health guidance, etc.
• Federal, State, and/or Local Elections – voting rates and preferences for various demographic
groups or through time,
• Public Health, Poverty, and Social Justice – use U.S. Census or WHO data to study matters
related to housing, economic conditions, public health, racial inequalities, etc.
• NASA Exoplanet Exploration – exoplanet radius, star radius, distance to its star, equilibrium
temperature, orbit period, habitability for life, number of exoplanets per stellar system, etc.
• Sports – world record marathon times, time between goals in soccer, most common score in
basketball, etc.
• Etc. – find something that you are passionate about; there is a great deal of flexibility here, so
you should be able to make it work with just about any real-world system or scenario of interest
to you.
II. Set Population Parameters
With the four random variables set, choose realistic values for the population parameters. Provide
references and justification for all parameter choices.
For the example random variables above, a quick internet search leads to reasonable choices such as:
• mean daily high temperature = 35℉ (https://www.currentresults.com/Weather/Illinois/Places/chicago-temperatures-by-month-average.php),
• standard deviation of daily high temperature = 10℉ (a reasonable choice based on my experience living in Chicagoland),
• = 0.12 (https://www.currentresults.com/Weather/Illinois/Places/chicago-snowfall-totals-snow-accumulation-averages.php),
• n = 91 (number of winter days ≈ 365/4)
• etc.
III. Produce Distribution Plots
Define the probability distributions and generate distribution plots for all four random variables. The
plot for the binomial random variable should include both the binomial distribution AND the normal curve for the approximation.
It is strongly recommended that you use MATLAB for this task, using the tools developed on the HW
assignments.
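As a rough illustration of the binomial-versus-normal comparison, the sketch below uses Python's standard library rather than MATLAB. The value n = 91 follows the Chicago example, while p = 0.25 and the names binomial_pmf, pmf_values, curve_values, exact_tail, and approx_tail are assumptions made only for illustration. The two lists hold the values you would plot as bars (binomial) and as a curve (normal) in the tool of your choice.

```python
import math
from statistics import NormalDist


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Exact binomial probability of k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n = 91    # number of winter days, from the Chicago example
p = 0.25  # hypothetical probability of nonzero snowfall on a given day

# Normal approximation uses the binomial mean and standard deviation
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
approx = NormalDist(mu, sigma)

# Values to plot: exact binomial pmf (bars) and normal density (curve)
ks = list(range(n + 1))
pmf_values = [binomial_pmf(k, n, p) for k in ks]
curve_values = [approx.pdf(k) for k in ks]

# Compare one tail probability, P(more than 20 snow days)
exact_tail = sum(pmf_values[21:])       # exact sum over k = 21, ..., n
approx_tail = 1 - approx.cdf(20.5)      # normal tail with continuity correction

print(f"exact: {exact_tail:.4f}, normal approximation: {approx_tail:.4f}")
```

The continuity correction (20.5 instead of 20) is one common way to line up a discrete count with a continuous curve; whether the approximation is accurate enough for your own n and p is something to check and discuss.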
IV. Define a Joint Probability Distribution
Choose two of the random variables from the list of four to produce a joint probability distribution.
Consider the dependency of the two random variables. If the two random variables can be reasonably
assumed to be independent, then provide justification for the assumption, and quickly get to a joint
probability distribution (this is the straightforward approach). Otherwise, build a joint distribution that
reasonably models the joint nature of the two random variables (this is more complicated, but is more
likely to produce more accurate models). Again, provide justification for the distribution form applied.
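For the straightforward (independent) case, the joint density is just the product of the two marginal densities, and a joint probability over a rectangular region is the product of the two marginal probabilities. The sketch below shows this in Python for a normal temperature and an exponential snowfall. The exponential form, the mean of 1.5 inches, and the names temp, snow_mean, and joint_density are hypothetical choices for illustration, not values from the example above.

```python
import math
from statistics import NormalDist

# Marginal models (the snowfall mean is a hypothetical value)
temp = NormalDist(mu=35, sigma=10)  # daily high temperature, in degrees F
snow_mean = 1.5                     # assumed mean daily snowfall, in inches


def joint_density(x: float, y: float) -> float:
    """Joint density under the independence assumption: f(x, y) = f_X(x) * f_Y(y)."""
    f_x = temp.pdf(x)
    f_y = math.exp(-y / snow_mean) / snow_mean if y >= 0 else 0.0
    return f_x * f_y


# P(temperature < 25 and snowfall > 5) factors into two marginal probabilities
p_cold = temp.cdf(25)                # P(temperature < 25)
p_snowy = math.exp(-5 / snow_mean)   # P(snowfall > 5) for an exponential model
p_joint = p_cold * p_snowy

print(f"P(cold) = {p_cold:.4f}, P(snowy) = {p_snowy:.4f}, joint = {p_joint:.5f}")
```

If independence is not a reasonable assumption for your pair of variables, this product form no longer applies and the joint distribution has to be built another way.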
V. Calculate Meaningful Probabilities and Expected Values
Use the probability distribution definitions to calculate at least two meaningful probabilities for each
random variable AND two meaningful joint probabilities. Thus, at least 10 probability calculations are
required. Use your plots to help visualize the probability values (i.e. show the areas under the curves).
For example, using the Chicago weather random variable definitions from above, it would be
interesting to know the following:
• P(26 < daily high temperature < 35), as that relates to the likelihood of having icy road conditions
• P(daily snowfall > 3), as at least 3 inches of snow are needed for sledding
• P(daily high temperature < 25, daily snowfall > 5), as that gives the likelihood of a cold, snowy day where the snow may stay awhile
• P(number of snow days > 20), as 20 snow days was regular in the 1980s, and winter 2021 had that many
Additionally, calculate and report the mean and variance for each random variable.
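As a minimal sketch of these calculations in Python, the code below computes one interval probability for the normal temperature model and then tabulates the mean and variance of each distribution type from its standard formula. It assumes the 10℉ value in the example is a standard deviation, and it uses a hypothetical exponential mean of 1.5 inches, a discrete uniform distribution over days 1 to 91, and a hypothetical binomial p = 0.25; all variable names are illustrative.

```python
from statistics import NormalDist

# Normal model for daily high temperature (10 is assumed to be the standard deviation)
temp = NormalDist(mu=35, sigma=10)

# P(26 < temperature < 35) as a difference of cumulative probabilities
p_icy = temp.cdf(35) - temp.cdf(26)

# Hypothetical parameters for the other three models
snow_mean = 1.5   # exponential mean daily snowfall, in inches
n_days = 91       # days 1..91 for the discrete uniform and binomial models
p_snow = 0.25     # binomial probability of nonzero snowfall on a given day

# (mean, variance) from the standard formula for each distribution type
moments = {
    "normal": (temp.mean, temp.variance),
    "exponential": (snow_mean, snow_mean**2),
    "discrete uniform": ((n_days + 1) / 2, (n_days**2 - 1) / 12),
    "binomial": (n_days * p_snow, n_days * p_snow * (1 - p_snow)),
}

print(f"P(26 < temperature < 35) = {p_icy:.4f}")
for name, (mean, variance) in moments.items():
    print(f"{name}: mean = {mean:.4f}, variance = {variance:.4f}")
```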
Lastly, define a fifth random variable as a meaningful linear combination of two (or more) of the
original random variables. A convenient choice would be the two random variables from part IV.
Then, calculate and report the mean and variance of the new random variable.
For example, using the Chicago weather random variable definitions, define U = 32 − (daily high temperature) + 5 × (daily snowfall) as a
measure of the intensity of winter weather, where the colder and snowier days produce larger values of
U, with an inch of snowfall having equal impact to a 5-degree Fahrenheit drop in temperature.
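The mean and variance of a linear combination follow from the rules E[c + aX + bY] = c + aE[X] + bE[Y] and Var(c + aX + bY) = a²Var(X) + b²Var(Y) + 2ab·Cov(X, Y). The Python sketch below applies them to the winter-intensity example, writing X for temperature and Y for snowfall so that U = 32 − X + 5Y. The labels X and Y, the function name linear_combo_moments, and the snowfall mean and variance are assumptions for illustration only.

```python
def linear_combo_moments(c, a, b, mean_x, var_x, mean_y, var_y, cov_xy=0.0):
    """Mean and variance of U = c + a*X + b*Y.

    cov_xy defaults to 0, which corresponds to uncorrelated X and Y.
    """
    mean_u = c + a * mean_x + b * mean_y
    var_u = a**2 * var_x + b**2 * var_y + 2 * a * b * cov_xy
    return mean_u, var_u


# U = 32 - X + 5Y with X = temperature and Y = snowfall
# Temperature: mean 35, variance 10**2; snowfall values are hypothetical
mean_u, var_u = linear_combo_moments(
    c=32, a=-1, b=5,
    mean_x=35, var_x=100,
    mean_y=1.5, var_y=2.25,
)
print(f"E[U] = {mean_u}, Var(U) = {var_u}")

# Same combination with a hypothetical negative covariance between X and Y
mean_dep, var_dep = linear_combo_moments(32, -1, 5, 35, 100, 1.5, 2.25, cov_xy=-3.0)
print(f"With Cov = -3: E[U] = {mean_dep}, Var(U) = {var_dep}")
```

Note that the constant 32 shifts the mean but has no effect on the variance, and that a nonzero covariance changes the variance only.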
VI. Discussion of Results and Conclusions
Summarize your calculations and discuss the overall validity of the models. Do the probabilities and
expected values match what would be expected for the real-world system? Where do the models work
well and where is their accuracy limited, for each random variable? Are the potential issues related to
the parameter values, the assumed form of the distribution, or both? Suggest potential improvements
to the models. Lastly, discuss how to design a statistical experiment that could be implemented to test
the population parameters. How can you ensure a random sample?
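One simple way to think about random sampling is to let a random number generator, rather than a person, choose which members of the population to observe. The Python sketch below draws a simple random sample of winter days without replacement. The sample size of 30, the seed, and the variable names are hypothetical, and other sampling designs may suit your system better.

```python
import random

n_days = 91       # population: winter days indexed 1..91
sample_size = 30  # hypothetical number of days to observe

# A seeded generator makes the selection reproducible for the report
rng = random.Random(342)

# Simple random sample without replacement: every day is equally likely to be chosen
sampled_days = sorted(rng.sample(range(1, n_days + 1), sample_size))

print(sampled_days)
```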
Deliverables
I. Report (submitted as a group if working with collaborators): written presentation of your
work, with supporting figures. The report should include six sections, associated with I-VI above,
with full descriptions for each distribution definition, plot, and calculated value. Use an easy-to-follow format with proper grammar. Make sure figures are properly labeled, captioned, and
referenced in the text. Practice conciseness by finding an optimal balance between rigor
and brevity. Submit your MATLAB (or other visualization tool) code with your report.
II. Video Highlight (submitted individually): 2-3 minute video presentation of ONE probabilistic
model and ONE associated probability calculation. Approach this as a summary highlight of ONE
aspect of your work presented to your boss or a client in a limited timeframe. Your highlight
should include the following:
• Introduce the real-world system being studied in your overall work and how
probabilistic modeling is useful for that system
• Define the ONE random variable you have chosen to highlight and justify the distribution
type chosen
• Present the distribution using a visualization (i.e. a distribution plot) and relevant
parameters for the chosen random variable, with justification for each.
• Explain all steps of the ONE probability calculation you have chosen to highlight.
• Discuss the significance of the probability value calculated and the relevance of the
probabilistic model chosen for the random variable and the real-world system.
Prepare a short (1-3 slides) PowerPoint presentation to organize the summary highlight discussion.
Then, record yourself presenting the slides as an individual. Record the video using Panopto,
Zoom, or any other method used for Quizzes all semester. If working with a partner or group on
this project, all members must choose a different random variable to highlight, so some
coordination is required here. Submit both the presentation slides and a link to the video recording.
Project #1 GRADING RUBRIC
All criteria are graded using this scale:
• Excellent [5 points]: Criterion is fully satisfied and/or demonstrates top-level critical thinking
• Very Good [4 points]: Criterion is mostly satisfied and/or demonstrates a high level of critical thinking
• Acceptable [3 points]: Criterion is partially satisfied and/or demonstrates some critical thinking
• Minimal [2 points]: Criterion is not satisfied and/or demonstrates a low level of critical thinking
• Missing [0 points]: Criterion is not addressed or is missing completely
Criteria Score
1. Report – Define RVs: introduction to real-world system or scenario, four RVs clearly defined
2. Report – Population Parameters: realistic values, justification and/or references provided for each
3. Report – Distribution Plots: mathematical definitions for all RV distributions are presented
4. Report – Distribution Plots: visualize all RV distributions, corresponding descriptions highlight key features
5. Report – Joint Distribution: independence discussed with full justification
6. Report – Joint Distribution: mathematical definition for joint distribution with derivation presented
7. Report – Calculated Values: two probabilities for each RV, two joint probabilities, with discussions
8. Report – Calculated Values: means and variances for each RV, with discussions
9. Report – Calculated Values: linear combination of RVs defined, mean and variance reported and discussed
10. Report – Conclusions: validity of models, where they fall short, potential improvements
11. Report – Conclusions: discussion of statistical experiment design to ensure random sampling
12. Report – Organization & Format: easy to follow, six sections, proper grammar, neat style, labeled plots, etc.
13. Report – Submission: report uses clearly readable file format; visualization code (e.g. MATLAB) is included*
14. Video Highlight – System Introduction: benefits of probabilistic modeling for real-world system discussed
15. Video Highlight – Random Variable Definition: clear definition; distribution type justified
16. Video Highlight – Distribution & Parameters: distribution plot shown and discussed; parameters justified
17. Video Highlight – Probability Calculation: all steps justified and explained clearly for ONE probability value
18. Video Highlight – Significance: probability value and probabilistic model relevance for real-world system
19. Video Highlight – Organization & Format: easy to follow slides and verbal explanations, within 2-3 minutes
20. Video Highlight – Submission: accessible video link and/or valid video format; slides included
TOTAL (out of 100):