Discussion: Regression and Correlation Coefficient

Collaborate Summary: four points for a two-page summary of the Collaborate lecture. Bullet and outline formats are fine. Students can also annotate the written lecture document with thoughtful notes as another way to get credit.

CNU BST 322 Regression and Correlation Coefficient
week_four_collaborate_slides_revised_june_2020.pptx

BST 322 Week Four Slides
Revised June 22, 2020
Brooks Ensign, MBA, M.Acc.

Deadlines
• Week Four (end of course): Final Exam in MyStatLab, Independent Project, Wk 4 HW, Discussion Questions, MyStatLab
ASK ME FOR HELP!!!
– MyStatLab Final Exam

This Week: Week Four
Our agenda this week: PREDICTIONS?
• Scatterplot → Correlation calculation →
• Correlation calculation → Derive regression equation →
• Regression equation → use to “predict” (“maybe,” if “significant”)
• Consider confounding variables and multivariate regression
• ANCOVA: introduce (lightly)

This Week: Week Four
• Review Correlation from Week One (Ch. 4)
• Algebra: draw a line through two points and get the slope and intercept; that gives you the equation (a simplified regression process)
• Regression: simple bivariate (two variables)
• Multivariate: more than one independent variable (x1, x2, x3)
• Ch. 9: simple bivariate; Ch. 10: multivariate
• Ch. 11 (just the first 6 pages): Intro to ANCOVA

This Week: “Regression”
• Chapters 9 and 10 (and a tiny intro bit of 11)
• With interval or ratio data:
• From scatterplots, to correlations, to regressions: deriving an equation to describe the data, and (maybe) using the equation to predict values
• Y “prime” = Y’ = a plus (b times x)
• Y’ = the predicted value of y
• b = slope, and a = intercept

“Regression”
• Regress: step back and analyze
• In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable and one or more independent variables.
• Regression equations can be used to predict values, if …

“Maybe” predict?
• Explanation:
• MyStatLab has a strict rule: if the correlation is statistically significant (p value less than 0.05), then, and only then, you can use the regression equation to predict. Otherwise, you simply use the mean (average) value of the dependent variable.
• Class in DQ1: gray area: the p value is 0.06, but we will use the equation anyway (borderline)

Preview: Correlation → Regression → Prediction
• This week we focus on Chapter 9 (all of it), the first half of Chapter 10 (light treatment), and the first third of Chapter 11 (very light treatment).
• This week, our test statistics are “r” and “R”:
• “r” (lowercase r): for the simple correlation and regression with two variables
• “R” (uppercase R): for multivariate regression, with more than one independent variable; we give multivariate regression a light treatment

Significance?
• In order to declare that our results are “significant” (i.e., “probably not random”),
• we need to “reject the null hypothesis,” and for that we need:
• a LARGE test statistic and a very small p value
• Test statistics: t … F … Chi (χ²) … and now … “r” and “R”
• For significance of r, see page 199 and page 418
• For significance of R, see page 231 (stay tuned)

“They Work Together:”
Think of the test statistic and the p value as the opposite ends of a seesaw: they move in opposite directions. For statistical significance, we want a large r or R test statistic (larger than the table value) and a small p value (smaller than 0.05, i.e., alpha).
• r or R test statistic greater than the table value
• P value less than alpha (0.05), the level of significance

Review
If the absolute value of the test statistic is greater than the tabled value (at a certain level of significance, α), or if the P value is less than 0.05, then the null hypothesis is rejected and the result is significant.

Correlation and Regression (Review)
• We did correlation when we covered scatterplots in Ch. 4 (Pearson’s r)
• This value r is calculated from a sample of data
• The population value of the correlation coefficient is “rho” (ρ), the lowercase Greek r
• We study “r” in a sample as an estimate of the “rho” (ρ) correlation in the population
• Remember that r is easily calculated in StatCrunch

Scatterplot → Correlation, and now …
• Correlation → Regression equation → Prediction
• Regression: derive an equation from the correlation (“if” the correlation is statistically significant and strong enough to be predictive)
• y’ = a plus (b times x), with a = intercept and b = slope
• “y prime” = y’ (the predicted value of y)

Correlation in StatCrunch
“Click:” Stats → Summary Stats → Correlation

Slope, Correlation and R²
CONTRAST THESE (they are different)
• 1. Slope: “rise / run”; the “b” in y = a plus (b times x)
• Slope: “steep?”
• 2. r = Correlation: from Week One, “r” = does a change in y relate to a change in x?
• Strong correlation can have a low slope!
• 3. R² = regression answer: “proportion of variance” (how much of the variance is explained?); also known as the “coefficient of determination”
• R² = r times r

Correlation is not “slope” (rise over run)
Tight fit? Or “messy”?
• Correlation is the degree of “fit” to a line: is it “tight” (very close to being a line, i.e., a correlation of 0.7 to 0.9), or is it …
• Weak or no correlation: a “messy cloud” (zero or low correlation, e.g., 0.1)

[Figure: perfect correlation (r = 1.0) with a slope of 0.1; strong correlation, with low slope]

Correlation as a test statistic
• Now we can look at the Pearson’s r value as a test statistic
• Are the values we see significant?
• The null hypothesis here is that the correlation value is …
H0: r = ?
A. zero: there is no relationship
B. not zero: there is a relationship
C. 0.5: there is a weak relationship
Vote now!

Correlation as a test statistic
H0: ρ (rho) = 0
H1: ρ (rho) ≠ 0
The null hypothesis here is that there is … (no relationship, no correlation, r very small, close to zero)
Any ideas from students? What is the null hypothesis? What is the alternative hypothesis? Discuss …

Correlation as a test statistic
H0: ρ (rho) = 0
H1: ρ (rho) ≠ 0
• The null hypothesis here is that there is no relationship between the variables in the population (ρ = 0). SEE PAGE 199
• So we compare the test statistic r (which we use as an estimate of “rho,” ρ) to the critical value in the table (p. 418)
• Again, if the absolute value of the test statistic is greater than the critical value, then the null hypothesis is rejected and the result is significant
• Or we let the computer tell us, by making it calculate the exact P value (and just compare that to 0.05)

Easy Way
• The easy way to determine the statistical significance of the regression: use the p value of the slope (not the p value of the intercept)
• Is the p value of the slope less than 0.05?
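The contrast between slope, r, and R² drawn in the slides above can be checked with a few lines of arithmetic. The following is a minimal sketch using only the Python standard library. The function names `pearson_r` and `fit_line` and the data points are made up for illustration: the points lie exactly on a line with slope 0.1 (the intercept of 2.0 is arbitrary), mirroring the “perfect correlation, low slope” figure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient r for paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y' = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical points that sit exactly on y = 2.0 + 0.1 * x.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [2.0 + 0.1 * x for x in xs]

r = pearson_r(xs, ys)
a, b = fit_line(xs, ys)
r_squared = r * r  # coefficient of determination for one x variable

print(f"slope b = {b:.3f}, intercept a = {a:.3f}")
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```

With these points the slope should come out near 0.1 while r comes out near 1.0: the fit is perfectly “tight” even though the line is not steep.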
Correlation-Regression Example: Better Charts
[Figure: “bad” and “good” versions of the chart “Weight Gain After Overeating”: fat gain (kilograms) plotted against nonexercise activity (calories), with fitted line y = -0.0033x + 3.3413 and R² = 0.6211]
See Polit p. 35 for more tips on graphs.

Significance of r
• This was optional in week one; it is now required (easy with StatCrunch: p value?)
• Four slides from Week One (see p. 199, top):
• Follow these instructions to test the significance of your correlation in the Independent Project, #5 and #6
• Required for Question Six in the Independent Project: test the significance of your correlation coefficient (page 199)

Meaning of r value (page 71)
• Pearson’s r is between 0 and 1 for a positive correlation, and between 0 and -1 for a negative correlation.
• Positive correlation: 0 < r < 0.2 is weak, 0.2 to 0.5 is moderate, 0.5 to 0.7 is stronger, and > 0.7 is very strong (these are “rough” descriptions)
• Negative correlation: strong if less than -0.5, weak if between -0.2 and 0
• Value of zero or near zero: no correlation

Is “r” value “significant?”
• The easiest way:
– Look at the p value of the slope in your StatCrunch results (bottom right corner)
– Is the p value of the slope < 0.05?

Parameter estimates:
Parameter   Estimate     Std. Err.    Alternative  DF  T-Stat       P-Value
Intercept   665.7143     131.6546     ≠ 0          5   5.0565214    0.0039
Slope       -0.6989286   0.29438862   ≠ 0          5   -2.3741696   0.0636

Is “r” value “significant?”
• Week 2: “Significant” in statistics means “not random” (rather than “important”)
• Test of significance for Pearson’s r
• (top of page 199 and page 418)
• Calculate d.f. (degrees of freedom): N - 2, with N being the number of data points
• Notice: at the very bottom of the table, a low r value can be significant with a large data sample
• At the very top of these tables: small samples require LARGE test statistics
• Discussion Question One: a large r value may not be significant with a small sample size (top rows in the tables)
• Contrast this with the bottom of the table: a small r value may be significant with a large data set

Is “r” value “significant?”
• Refer to page 199 (top)
• Refer to page 418: use the shaded column (0.05)
• Find the row that corresponds to d.f. (degrees of freedom); e.g., 10 - 2 = 8 d.f.
• If the absolute value of your calculated “r” is greater than the table value, then the calculated “r” value is significant (“non-random”).

Is “r” value “significant?”
• Question 14 in W1 homework (week one):
• Ten data points, d.f. = N - 2 = 8
• Page 418: shaded column (α = 0.05)
• Page 418: Table A.6, row: d.f. = 8
• Table value: 0.632
• r value is significant if greater than 0.632
• Is “r” value “significant?” (yes, 0.91 > 0.632)
• Test the significance of the r value in the Independent Project

StatCrunch: Discussion Question One
• W4 DQ one: the 0.73 r value “seems” large (and significant?), but it is not quite significant, because the data set is very small
• Remember: we predict “maybe?”
• This is the only “close call” in our course
• 0.728 < table value of 0.754 (how did I find this table value on page 418 using the guidance from page 199?)

Regression
• Regression: use the equation derived from the correlation / scatterplot
• StatCrunch does all of this for us
• Regression: use the equation to PREDICT

Regression in StatCrunch
• Click: Stat → Regression → Simple linear

Regression in StatCrunch: fill in the template

Prediction in StatCrunch (using Regression)

StatCrunch: Discussion Question One
Simple linear regression results:
Equation: y = intercept plus (b times x); here the slope b is negative
Dependent Variable: Cholesterol
Independent Variable: Caffeine
Cholesterol = 665 - 0.69 times Caffeine (approximately)
Cholesterol = 665.7143 - 0.6989286 Caffeine
Sample size: 7
R and R-squared:
R (correlation coefficient) = -0.728
R-sq = 0.52992857
Estimate of error standard deviation: 155.77582

Parameter estimates:
Parameter   Estimate     Std. Err.    Alternative  DF  T-Stat       P-Value
Intercept   665.7143     131.6546     ≠ 0          5   5.0565214    0.0039
Slope       -0.6989286   0.29438862   ≠ 0          5   -2.3741696   0.0636

Significance? Look for the p value of the slope: 0.06 > 0.05, so not significant (but very close)

Answers
• How do you answer the questions in the discussion questions and the homework?
• See the next few slides!

Discussion Question One (use this for Homework Q-7 also)
• Q: What is the correlation coefficient r and what does it mean in this case?
• A: The correlation coefficient r = -0.728, which means there is a strong, negative correlation.
• Q: What is the coefficient of determination (r²) and what does it mean in this case?
• A: The coefficient of determination is r². In this case it is equal to 0.53. This means that 53% of the variation in cholesterol is explained by the independent variable (caffeine).
• Q: Is there a statistically significant correlation between caffeine intake and cholesterol levels in this case?
• A: The table value is 0.754 and the absolute value of r is 0.73. Because the calculated value does not exceed the table value, the correlation is not statistically significant (“very close,” but not quite).

Discussion Question One
• The correlation seems strong, but it is not quite significant …
• How many more data points do you need? (one or two)
• Note: this is a “strong” correlation that lacks significance (very small sample)
• We can also have a weak correlation with significance (in a large sample): look at the bottom of page 418 (small values)

Discussion Question One
• Using regressions to PREDICT:
• Difference in methodology:
• MyStatLab teaches us that we “only” use regressions to predict if the regression is statistically significant. Otherwise we just use the average value …
• But this Discussion Question asks you to predict using this equation, which is “not quite” significant
• Sometimes statistical approaches differ; this is the only “borderline” example in this class, but there are many in real life

StatCrunch: Discussion Question One
The numbers here are slightly different from your discussion question. USE THE NUMBERS IN THE DQ; DON’T JUST COPY THESE NUMBERS.

Discussion Question One: Predictions
• Q: What is the intercept? (Or: what would your cholesterol level be while ingesting no caffeine?)
• A: The intercept is 665.714. That would be the cholesterol level while ingesting 0 mg of caffeine.
• Q: What is the slope? (Or: what is the value we call b in the linear regression equation?)
• A: The slope (b in the linear regression equation) is -0.636

Discussion Question One
• Use the p value of the slope; it is the same as the p value for the correlation.
• 0.06 is greater than 0.05, so the results are not quite statistically significant.
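The significance check and the prediction arithmetic for Discussion Question One, including the part (c) algebra on the next slide, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the course's method. The values r = -0.728, n = 7, the table values 0.632 and 0.754, and the intercept and slope (665.714 and -0.636) are the ones quoted in the slides. The function names are made up for this example, and the identity t = r·√(n−2)/√(1−r²) is the standard test statistic for H0: ρ = 0 rather than something stated in the slides.

```python
import math

# Critical values of r at alpha = 0.05, as quoted in the slides
# (Table A.6, page 418). Only the two quoted rows are included here.
CRITICAL_R_05 = {8: 0.632, 5: 0.754}


def r_is_significant(r, n):
    """Table method: is |r| greater than the critical value for d.f. = n - 2?"""
    return abs(r) > CRITICAL_R_05[n - 2]


def t_from_r(r, n):
    """t statistic for H0: rho = 0 (standard identity, not from the slides)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def predict_y(x, intercept, slope, significant, mean_y):
    """MyStatLab rule: use y' = a + b*x only if significant, else the mean of y."""
    return intercept + slope * x if significant else mean_y


def solve_for_x(target_y, intercept, slope):
    """Invert y' = a + b*x to find the x that gives target_y."""
    return (target_y - intercept) / slope


# Values reported in the slides for Discussion Question One.
r, n = -0.728, 7
significant = r_is_significant(r, n)  # compares 0.728 with the table value 0.754
t = t_from_r(r, n)

# Part (c): caffeine needed for a predicted cholesterol of 150 mg/dL,
# using the intercept and slope from the slide's answer.
caffeine_mg = solve_for_x(150, 665.714, -0.636)
cups = caffeine_mg / 100  # 1 cup of coffee = 100 mg of caffeine

print(f"significant at 0.05? {significant}")
print(f"t = {t:.2f} with {n - 2} d.f.")
print(f"caffeine = {caffeine_mg:.0f} mg, about {cups:.1f} cups")
```

The t value computed from r lands very close to the T-Stat that StatCrunch reports for the slope (-2.374), which is consistent with the slide's remark that the p value of the slope is the same as the p value for the correlation. `predict_y` needs the mean of the dependent variable as an input; the slides do not report it, so it has to come from the data.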
P value of the slope is 0.06

Discussion Question One: use a regression to predict
• c) How many cups of coffee must you drink to lower your total cholesterol to 150 mg/dL (given that 1 cup of coffee equals 100 mg of caffeine)?
ALGEBRA
• x = (150 - 665.714) / (-0.636)
• x ≈ 810 mg of caffeine; 810 mg / (100 mg per cup) ≈ 8.1
• About 8 cups
• Better way: use the StatCrunch prediction tool for the DQ and for HW Q7

StatCrunch: Prediction
Scroll down in the Simple Linear Regression screen until you see this: [screenshot omitted]
Enter the value of X (the assumed value of X) and StatCrunch will calculate the predicted Y value, based on the regression equation.

EC Discussion Question Three
• Optional, but interesting (fun and easy)
• The Most Important Question in This Course
• No math! This is your chance to use what you have learned in this course to …
• Recognize the mistakes and misconceptions in the medical literature; some statistical studies are poorly designed or executed …
• Misadventures …

Skim the Vox article (design of medical research studies)
http://www.vox.com/2015/1/5/7482871/types-of-study-design

Misadventures?
Look at “Misadventures” on this site:
http://www.improvingmedicalstatistics.com/index.html
http://www.improvingmedicalstatistics.com/entry_media.htm
http://www.improvingmedicalstatistics.com/entry_high_school.htm
http://www.improvingmedicalstatistics.com/Biased%20protocol.htm
Choose one of the examples cited. Write a short paragraph: identify the article and identify the abuse or misuse of statistical analysis.
NOTE: These research articles are prominent, recent medical articles (WITH MISTAKES!).

Regression & Multiple Regression
Regression (bivariate, in Ch. 9): one x variable
- used to make predictions about the values of variables once we know their relationship
- easiest: linear
- use the equation of a line to predict y variable values, with one x
Multiple Regression (multivariate, in Ch. 10)
- an extension of simple linear regression where we use two or more x variables (“factors”) to predict the value of the dependent variable

YOU LEARNED IN THIS CLASS:
The word “factor” is used in this class instead of “cause.” We recognize that explanations usually involve “multiple factors.”

What is the “cause” of my hypertension?
• Trick question, because there is not “one” cause of hypertension (or of many other medical conditions)
• There are “multiple ‘contributing’ factors”: salt, stress, genetics, diet, exercise, caffeine, decongestants, medicine
• “Multiple factors”: these questions are addressed with “multivariate” regression in Ch. 10 (and MF ANOVA, in Ch. 7)

Capital R: Multiple Regression (we just want the basics in Ch. 10)
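The step from one x variable to several can be illustrated with the prediction equation alone. The sketch below assumes the usual form y’ = a + b1·x1 + b2·x2 + …, which extends the y’ = a + b·x equation used earlier. The function name `predict_multiple` and all of the numbers are placeholders chosen for easy arithmetic; they are not estimates from any data set in this course.

```python
def predict_multiple(intercept, slopes, xs):
    """Predicted value y' = a + b1*x1 + b2*x2 + ... for several x variables."""
    if len(slopes) != len(xs):
        raise ValueError("need exactly one slope per x variable")
    return intercept + sum(b * x for b, x in zip(slopes, xs))


# Placeholder numbers only: a = 1.0, b1 = 2.0, b2 = 3.0, x1 = 4.0, x2 = 5.0.
y_prime = predict_multiple(1.0, [2.0, 3.0], [4.0, 5.0])
print(y_prime)  # 1.0 + 2.0*4.0 + 3.0*5.0 = 24.0
```

Each slope plays the same role as b in the one-variable case: it is the predicted change in y for a one-unit change in that factor, with the other factors held fixed.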
