Create the project described below for the attached sample data.

Submit in Excel or Word format.

1) Test of two means. Select a hypothesis you are interested in testing, then use a test of two means to test it. For example, you may be interested in testing whether the GPA of females is higher than that of males in the class. Draw the inference using a random sample, with replacement, of size 25 from each group. Perform the test at significance level α = .05 and report the p-value. Use graphical methods to present the two populations of interest.
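As a rough illustration of the mechanics in part 1 (the submission itself is still expected in Excel or Word), here is a minimal Python sketch. It assumes a one-sided Welch (unequal-variance) test, which the assignment does not prescribe, and it uses randomly generated placeholder GPAs rather than the attached data. The names `t_sf`, `welch_t`, `female_gpa`, and `male_gpa` are hypothetical.

```python
import math
import random
import statistics


def t_sf(t, df, steps=2000):
    """Upper-tail probability P(T > t) for Student's t, via Simpson's rule."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    # Integrate the density from 0 to |t| (steps must be even for Simpson).
    h = abs(t) / steps
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    upper = 0.5 - total * h / 3
    return upper if t >= 0 else 1 - upper


def welch_t(x, y):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(x), len(y)
    q1 = statistics.variance(x) / n1
    q2 = statistics.variance(y) / n2
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(q1 + q2)
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return t, df


random.seed(0)
# PLACEHOLDER populations (not the attached data): replace with the real columns.
female_gpa = [random.gauss(3.2, 0.4) for _ in range(60)]
male_gpa = [random.gauss(3.0, 0.4) for _ in range(60)]

# Random samples WITH replacement, size 25 from each group.
females = random.choices(female_gpa, k=25)
males = random.choices(male_gpa, k=25)

# H0: mean(F) = mean(M)   vs   H1: mean(F) > mean(M), at alpha = .05
t_stat, df = welch_t(females, males)
p_value = t_sf(t_stat, df)
print(f"t = {t_stat:.3f}, df = {df:.1f}, one-sided p = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```

If equal variances are assumed instead, a pooled-variance t test with 48 degrees of freedom would replace `welch_t`; the sampling and decision steps stay the same.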

2) Paired difference test. Select another hypothesis that interests you, chosen so that a paired difference test is appropriate. For example, to test whether right arm length equals left arm length, a paired difference test is appropriate because both measurements come from the same person. Draw the inference using a random sample, with replacement, of 25 pairs. Perform the test at a significance level α of your choice.
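The paired test in part 2 reduces to a one-sample t test on the within-pair differences. The sketch below assumes α = .05 and a two-sided alternative (the assignment leaves α to you), and it uses made-up arm lengths, not the attached data. The names `pairs`, `diffs`, and `T_CRIT` are hypothetical.

```python
import math
import random
import statistics

random.seed(1)
# PLACEHOLDER data: (right, left) arm lengths in cm for 40 imaginary students.
pairs = []
for _ in range(40):
    left = random.gauss(60, 3)
    pairs.append((left + random.gauss(0.2, 0.5), left))

# Sample 25 PAIRS with replacement so each right/left pair stays together.
sample = random.choices(pairs, k=25)
diffs = [right - left for right, left in sample]

n = len(diffs)
d_bar = statistics.mean(diffs)   # mean difference
s_d = statistics.stdev(diffs)    # sample standard deviation of the differences
t_stat = d_bar / (s_d / math.sqrt(n))

# Two-sided critical value for alpha = .05 with n - 1 = 24 degrees of freedom.
T_CRIT = 2.064
reject = abs(t_stat) > T_CRIT

print(f"mean diff = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t_stat:.3f}")
print("Reject H0: mean difference = 0" if reject else "Fail to reject H0")
```

A different α would need a different critical value from a t table, with the degrees of freedom still equal to the number of pairs minus one.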

3) Regression and correlation. Pick two columns whose correlation coefficient is greater than 0.6 or less than -0.6. If several pairs of columns qualify, pick the pair with the highest absolute correlation.

a. Draw the scatter diagram of Y against X, and explain anything notable that you observe.

b. Compute the correlation coefficient (ρ or r). What do you find? Make sure to explain your interpretation thoroughly.

c. Obtain a and b of the regression equation Y = a + bX, and the coefficient of determination (r²), from the Excel regression output. What can you tell? What is the relationship between r² and ρ?

d. Compute the statistics in c) step by step using ΣXiYi, ΣXi, ΣYi, ΣXi², ΣYi² from Excel, and compare them with the results in c).

e. Draw the fitted regression line on the scatter diagram, obtain the residuals and plot them on the scatter diagram too. Explain your findings.

f. Write a paragraph or so on any observations you may have on the data, the regression estimates, or the regression residuals.

g. Calculate the y values for at least five other x values that do not appear in the data. Include that information in your report, and comment on whether you believe each calculated y value seems realistic and consistent with the other information you have calculated in the parts above.
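The step-by-step computation from the raw sums, the residuals, and the predictions for new x values can be sketched as follows. This is only an illustration on five made-up (x, y) points, not the attached data, and it assumes the usual least-squares formulas; the variable names are hypothetical. Cross-check the results against the Excel regression output.

```python
import math

# PLACEHOLDER data: replace with the two columns chosen in part 3.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# The five sums the assignment asks for.
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)
sum_y2 = sum(yi * yi for yi in y)

# Least-squares slope b and intercept a for Y = a + bX.
sxy = n * sum_xy - sum_x * sum_y
sxx = n * sum_x2 - sum_x ** 2
syy = n * sum_y2 - sum_y ** 2
b = sxy / sxx
a = sum_y / n - b * sum_x / n

# Correlation coefficient and coefficient of determination.
r = sxy / math.sqrt(sxx * syy)
r2 = r * r

# Residuals: observed y minus fitted y.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Predicted y for five x values that do not appear in the data.
new_x = [1.5, 2.5, 3.5, 4.5, 6.0]
predicted = [a + b * xi for xi in new_x]

print(f"a = {a:.4f}, b = {b:.4f}, r = {r:.4f}, r^2 = {r2:.4f}")
print("residuals:", [round(e, 4) for e in residuals])
print("predictions:", [round(p, 4) for p in predicted])
```

Note that 6.0 lies outside the observed range of x, so its prediction is an extrapolation and deserves extra caution when judging whether it is realistic.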