Assignment due Monday, April 15, 2024 by 11:00pm.

The .ipynb file should be submitted as a searchable PDF and not as a picture. Read the instructions carefully. Please download your assignment here: https://colab.research.google.com/drive/1L9NNWit1wTfH2jEZFJr43E9h37D4IVql

Submission Requirements:

- Submit all answers along with their corresponding code as a **searchable PDF**.
- All generated csv and .ipynb files must be submitted in a zip-folder as a secondary source. Ensure your zip folder contains four csv files (i.e., the csv files for each of the four questions below: YourName.csv, RelianceRetailVisits_ordered.csv, Scores.csv, Vaccinated.csv).
- You may use Jupyter notebook or Colab as per your convenience.
- You should ONLY use the concepts and techniques covered in the course to generate your answers. Statistical techniques that are NOT covered in the course will NOT be evaluated.

Note: Reach out to your instructor with any questions regarding the csv files, the code, or the zip-folder.

Non-compliance with the above instructions will result in a 0 grade on the relevant portions of the assignment. Your instructor will grade your assignment based on what you submitted. Failure to submit the assignment, or submitting an assignment intended for another class, will result in a 0 grade, and resubmission will not be allowed.

Make sure that you submit your original work. Suspected cases of plagiarism will be treated as potential academic misconduct and will be reported to the College Academic Integrity Committee for a formal investigation. As part of this procedure, your instructor may require you to meet with them for an oral exam on the assignment.

A. Statistical Intuitions in Mental Health

**Question 1:** (#Probability, #CompProgramDesign, #Visualizations)

We are going to work with a dataset that was collected on mental health issues. In total, 824 individuals (teenagers, college students, housewives, business professionals, and other groups) completed the survey.
Their data provide valuable insights into the prevalence of, and factors associated with, mental health issues in different groups. Using the sample dataset from the CSV file you generated, answer the following questions:

**Question 1a.** Is each of the two variables returned by the code independent of being female? Explain your reasoning. Make sure to include a two-way table of each of these two variables against gender, and show all your calculations to support your answers.

**Question 1b.** Is there a relationship between the two variables returned by the code? Explain your reasoning. Make sure you include a two-way table, a stacked bar graph, and all your probability calculations in your answer.

**Question 1c.** Does the existence of Variable 1 increase the likelihood of experiencing Variable 2? If so, by how much? Explain your reasoning. Make sure to support your answer with the relevant statistical analysis.

**Question 1d.** Look back at your answers to Questions 1a-c. Now use what you learned to answer the following question: Imagine ZU wanted to use the insights from this research to improve its mental health support program. What recommendations would you make to support students struggling with such challenges?

B. Statistical Intuition in Store Ratings

**Question 2:** (#Distributions, #Probability)

Imagine you are the manager of an electronics store in Dubai Mall. You are curious about the distribution of customer ratings of your overall store services, so you ask random customers who visit the store to complete a short survey, recording variables such as their age group and overall experience rating.

**Question 2a.** Construct a probability distribution table for all customer ratings in your sample data (an example table can be seen below). Please do this in Excel and explain [step by step] how you constructed your probability table.

**Question 2b.** What is the probability that a randomly selected customer will have a rating of AT MOST 3?

**Question 2c.**
Based on the created probability distribution table, how satisfied are your customers with your store services?

**Question 2d.** Find the expected rating of your store. Show your work and interpret your answer in context.

**Question 2e.** Interpret the standard deviation in context. What rating is considered unusual? Explain.

**Question 2f.** Identify any trends or differences in customer satisfaction levels (and variability) among the different age groups. Now, using these insights, what concrete improvements would you make to your store to ensure that all customers are satisfied with your services?

C. Statistical Intuition in SAT Exams

**Question 3:** (#Probability, #Distributions)

Imagine you are working for a prestigious university in the UAE. It is your job to decide which students are admitted to the university. To help you do this, you analyze the high school (SAT) scores of potential students. These scores help you understand their academic readiness and potential for success at the university. You have just received the scores of applicants who would like to join the university in September 2024. These scores follow a normal distribution. Use the Scores dataset and the statistics provided by the code to answer the following questions.

IMPORTANT: Make sure to support your answers by explaining and showing how you came to your conclusions.

- _If you use online calculators, please include screenshots of those calculators as part of your work._
- _Please **do not** use code to solve these questions. The questions are designed to test your understanding._

**Question 3a.** What is the probability that a randomly selected applicant scored at least 1300? Show your work.

**Question 3b.** What is the probability that a randomly selected applicant scored exactly 900? Show your work.

**Question 3c.** What percentage of applicants scored between 900 and 1000? Show your work.

**Question 3d.** Calculate the 40th percentile of scores among the applicants.
What does this value represent in the context of the admissions process? Show your work.

**Question 3e.** Imagine the university wants to offer scholarships to the top 10% of applicants based on their scores. What minimum score would an applicant need to qualify for a scholarship? Show your work.

**Question 3f.** Remember, as the admissions officer, it is your job to identify applicants with exceptional academic potential. Would you automatically recommend that applicants with SAT scores above 1400 be admitted to the university? Or do you think additional criteria should also be considered? Explain your reasoning.

D. Statistical Intuition in Public Health

**Question 4:** (#InferentialStats)

Now imagine that it is the year 2034 and you are working as a public health researcher in the UAE. You are working on a project to assess vaccination coverage for a new global pandemic. The UAE government has implemented a widespread vaccination campaign to combat the spread of the virus and achieve herd immunity. You want to determine the proportion of individuals who have received the new vaccine among a sample of 100 residents in different parts of the country. Use the dataset to answer the following questions.

IMPORTANT: Make sure to support your answers by explaining and showing how you came to your conclusions.

- _Please do not use code to solve these questions. The questions are designed to test your understanding._

**Question 4a.** What is the proportion of people who have received the vaccine (based on the dataset you have)?

**Question 4b.** Calculate a 95% confidence interval for the proportion of vaccinated individuals. What does this interval tell us about the likely range of vaccination coverage in the entire population? Show your work.

**Question 4c.** What sample size would be required to estimate the proportion of vaccinated individuals in the country with a 95% confidence level and a margin of error of 0.02? Show your work.

**Question 4d.**
If you wanted to increase the precision of your estimate, what strategies could you employ to achieve this goal? Explain your reasoning.

**Question 4e.** Analyze the effectiveness of the current vaccination campaign using the proportion of vaccinated individuals and the confidence interval. What recommendations would you make for future campaigns?

Assignment Information

Weight: 18%

Learning Outcomes Added

- CompProgramDesign: Generate working programs in a computer language that can solve computational problems; find and fix bugs that appear in them.
- Distributions: Identify different types of distributions and make inferences based on samples from distributions appropriately.
- Visualizations: Interpret, analyze, and create data visualizations.
- InferentialStats: Apply and interpret confidence intervals, statistical significance, and regression.
- Probability: Apply and interpret fundamental concepts of probability, including conditional and Bayesian probabilities.