# California National University BST 322 Inferential Statistic Case Analysis

Collaborate Summary: four points for a two-page summary of the Collaborate lecture. Bullets and outline format are fine. Students can annotate the written lecture document with thoughtful notes as another way to get credit.

Slide files: week_twoa__collaborate_slides_jan_17_2020_2.pptx, week_twoa__collaborate_slides_jan_17_2020_1.pptx, week_two_b_collaborate_slides_jan_17_ch_six_and_seven.pptx

BST 322 Week Two Slides (first half, 1 of 2 parts). Revised: Jan 17, 2020. Brooks Ensign, MBA

Deadlines: Thursdays
- Week One: Jan. 19
- Week Two: Jan. 23
- We will catch up (the class gets easier, and faster, after chapter five).

Week One Grades (preliminary)
- The grades you have seen so far are just meant as (very) preliminary feedback.
- MyStatLab: 15 points total for Weeks One and Two. This is a long assignment with important material; future assignments will be shorter.
- Discussion Questions: 2 points if you posted twice in DQ-1; 2 points per question, with 4 points total.
- It is still early: you have time to get all the points!

Homework
- Make sure you use the homework suggestions and the data files.
- Required: in your first sentence, tell me you used the suggestions (or I will not grade it).
- Required: use the actual Word doc in the shell, not a blank page, and do not use an assignment from a different (prior) section.
- Name at top, name in file name.
- Ask for help.

Week One was descriptive; Weeks 2-3 are INFERENTIAL.

Week Two: the hardest week
I have personally written many guidance documents to help you get started, and I have sent them to you. Have you read them?

Conceptual change: last week was descriptive statistics; now we move to INFERENTIAL STATISTICS.

Week Two: Key Concepts
- We learned descriptive statistics last week.
- Inferential: what can we learn from the data?
- We learn inferential statistics this week. There are two approaches.
- Parameter estimation (confidence interval): we give this a light treatment, with one HW question.
- Hypothesis testing: we give this very HEAVY EMPHASIS in this course, especially in Weeks 2-3.

Week Two YouTube Videos
Some students have found these videos helpful. Look for videos on our key concepts: hypothesis testing, confidence intervals, t test, ANOVA. Examples:
- https://www.youtube.com/watch?v=rWFDXt-MlNs
- https://www.youtube.com/watch?v=cW16A7hXbTo
- https://www.youtube.com/watch?v=Rao8TTcviS0

Week Two: 3 Discussion Questions (Let's Discuss)
- These three questions are critical for your learning this week.
- I have posted guidance and data files, and I will help you with your posts.
- I will add new questions, with new data files, to keep it interesting.
- Two posts for each question, please. Why two? This is a discussion, so respond to your classmates, ask and answer questions, etc.

Discussion Questions Introduce Us to Inferential Statistics
Hypothesis testing method: our three Discussion questions all involve the hypothesis testing method, which is integral to our course (pages 98-105 in the textbook, especially page 105). READ PAGE 105 many times!

Common to all questions: we have an experimental (alternative) hypothesis, along the following lines: one group is different from the population (DQ1), the two groups are different (DQ2), and the three groups are different (DQ3).
However, we cannot prove this hypothesis directly, because the results may just have been random.

Start with the experimental hypothesis
- Experimental hypothesis: the new thing is different, the drug works, the nursing technique is superior, Group A is different from the population or from Group B, etc.
- But: what if these results are random?
- The null hypothesis is based on this risk of random results, and it represents the opposite of the experimental hypothesis.

Null Hypothesis (H0)
- Null (H0): the groups are equal (the results are random).
- Instead of proving the experimental hypothesis, we create the H0 null hypothesis, which is its opposite: one group is NOT different from the population (DQ1), the two groups are NOT different (DQ2), and the three groups are NOT different (DQ3).
- The null hypothesis says that the apparent differences we see between the samples are just random fluctuations.
- Our goal is to REJECT the null hypothesis by showing that the differences probably could not have occurred randomly.
- "Probably" refers to the p value: there will still be a small chance that the results were random (less than five percent).

Shortcut: is the p value < 0.05?
It is very easy to get confused here. To avoid confusion, use this shortcut in answering hypothesis testing questions:
- p value: the probability of random occurrence (more precisely, the probability of seeing a result at least this extreme if the null hypothesis were true).
- If the p value is LESS than 0.05, the results are significant (0.05 = five percent).
- If the p value is GREATER than 0.05, the results are NOT significant.

Statistically Significant
Know this definition: statistical significance means the result probably did not happen by random chance.
- It only means this.
- It does not mean important, newsworthy, clinically meaningful, valuable, exciting, etc.
- It still could have happened by random chance (but probably not, i.e., < 5% chance).
- It does not mean clinically significant.

P value and alpha (α): is the p value < alpha (α)?
- The risk of random occurrence is defined by the p value, which is the probability that the results could have occurred randomly. We thus want the p value to be very low: we usually want it to be below 5% (< 0.05).
- Our alpha (α) is this 0.05 threshold: the level of significance (burden of proof), the acceptable risk that we do have a random result.
- The popular alpha in this class and in many scientific discussions is 0.05, five percent. We will almost always use 0.05 for alpha, unless the problem states otherwise.

You will know: look at your calculated test statistic (t or F)
- Remember the very first DQ in Week One? The student's heart rate of 127 was not a big deal, because his z score was very low (0.13).
- In the same way, we look at the test statistic and ask: is it large enough (in absolute value) to be greater than the critical value, i.e., to be statistically significant?
- Football season: in order to score a touchdown, you have to get beyond the goal line (the critical value threshold) at the far right end of the distribution.
- The extreme zone on the right (black on the slide) is the critical region.

Test Statistics: r, z, t, F and chi-square
- Remember the z score? It measures how far a value is from the mean, in standard deviation units. It was usual to have a z score < 2, and unusual for the z score to be greater than 2.
- The z score is one example of a test statistic.
- Other test statistics: t and F (this week), chi-square (Week Three), and r (Week Four, and Week One).
- The test statistic measures how far the result is from the typical value (mean). Larger test statistics are unusual and may be statistically significant (not random). But how large?

Test statistic (t and F) and table (critical) value
We want the calculated (observed) test statistic (t in the first two questions, F in the third) to be GREATER than the table (critical) value.
We get the calculated (observed) test statistic from StatCrunch, and we get the table value from the tables at the back of the book (pages 412, 413, 414, and 415) or from Excel.

Test statistic and p value: they work together
Think of the test statistic and the p value as the opposite ends of a seesaw: they work in opposite directions. For statistical significance, we want a large test statistic (larger than the table value) and a small p value (smaller than alpha, i.e., 0.05).
- t test statistic > t table value
- p value < alpha (0.05), the level of significance

DO NOT, DO NOT, DO NOT (mistakes by prior students)
- Do not compare the test statistic with alpha!
- Do not compare the p value with the table value!
- Why? The test statistic and the table value are measured on the same scale (t or F units), while the p value and alpha are both probabilities. Please review the last two slides and study the next one.
- DO: compare the test statistic with the table value.
- DO: compare the p value with alpha.

ALWAYS (follow p. 105)
For every hypothesis testing question:
- State the null hypothesis.
- State the alternative hypothesis.
- State the assumptions (random sampling, normal distribution, approximately equal variance).
- Compare the calculated test statistic to the table (critical) value.
- Compare the p value to alpha (0.05).
- Only two choices: either REJECT the null hypothesis or FAIL TO REJECT it (RETAIN; please do not say "accept").

Our goal: REJECT the null
Desired result: if the calculated (observed) test statistic is greater than the table (critical) value, you REJECT the null hypothesis and describe the results as statistically significant. The p value will be less than alpha (0.05), indicating that the difference probably did not occur by chance. You support (but you have not proven) the alternative hypothesis. You have not proven this experimental hypothesis because of the risk of random occurrence.
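
The two comparisons described above (test statistic against the table value, p value against alpha) can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and the example numbers are hypothetical and are not taken from the course materials or from StatCrunch.

```python
def decide(test_statistic: float, critical_value: float,
           p_value: float, alpha: float = 0.05) -> str:
    """Apply the two matching comparisons from the slides.

    Hypothetical helper: compare |test statistic| with the table
    (critical) value, and compare the p value with alpha.
    """
    # Compare like with like: statistic vs. table value ...
    significant_by_statistic = abs(test_statistic) > critical_value
    # ... and probability vs. probability.
    significant_by_p = p_value < alpha

    # For a correctly matched table value and p value the two
    # comparisons should agree; a mismatch suggests a lookup error.
    if significant_by_statistic != significant_by_p:
        raise ValueError("test statistic and p value disagree; "
                         "check the table value and d.f.")

    return "reject H0" if significant_by_statistic else "fail to reject H0"


# Illustrative (made-up) inputs:
print(decide(2.50, 1.98, 0.014))   # large statistic, small p value
print(decide(-1.33, 1.98, 0.19))   # small statistic, large p value
```

Note that the function never returns "accept": the only two outcomes are rejecting the null hypothesis or failing to reject (retaining) it.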

If we cannot REJECT the null
Otherwise: if the calculated (observed) test statistic is NOT greater than the table (critical) value, you FAIL TO REJECT the null hypothesis (we can also say RETAIN, but try not to say "accept") and describe the results as NOT statistically significant. The p value will NOT be less than alpha (0.05), indicating that the difference could plausibly have occurred by chance. You DO NOT support the alternative hypothesis (your evidence does not support it). Note: your experimental hypothesis COULD still be valid, but your evidence in this experiment does not support it. Time for another experiment (if you have the money and desire).

"Fail to Reject" and "Retain" are better than "Accept"
The book does use the word "accept", rather than "retain" or the preferred language, "fail to reject". The problem with "accept" is that it might suggest that you are proving the null hypothesis, or even just supporting it. You are not doing either of these things. You are trying to reject it. If you cannot, you just fail to reject it and retain it.

Probability: the p value defined
The probability (p) of some event occurring is defined as:
P(event) = (number of ways it can happen) / (total number of possibilities)
Probabilities are expressed as proportions.

Probability of Consecutive Events (pages 85-86)
- p value: what are the odds of something happening by chance?
- Probability of one card being a spade = 13/52 = 1/4 = 25%, so p = 0.25 (write it this way).
- Probability of three spades in a row? MULTIPLY.

HW: Consecutive Events (multiply)
- If the events are independent (cards are returned to the deck and reshuffled): p = 13/52 = 1/4 = 0.25 for each draw. Multiply for consecutive events: (1/4) x (1/4) x (1/4) = 1/64 = 0.0156.
- But if each spade drawn is discarded, the events are not independent: (13/52) x (12/51) x (11/50) = 0.0129.

Convert z scores to probabilities
Some challenging questions in MyStatLab are made much easier this way: start with page 411, and then use the David Lane online calculator.
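
As a rough check on the card arithmetic and the z-score conversion above, here is a minimal Python sketch using only the standard library. The function names are hypothetical, and `statistics.NormalDist` stands in for the online calculator; it is not the tool the course uses.

```python
from fractions import Fraction
from statistics import NormalDist


def spades_in_a_row(draws: int, replace: bool) -> Fraction:
    """Probability of drawing `draws` spades in a row from a 52-card deck."""
    spades, deck = 13, 52
    p = Fraction(1)
    for i in range(draws):
        if replace:
            # Independent events: the deck is restored before each draw.
            p *= Fraction(spades, deck)
        else:
            # Dependent events: one fewer spade and one fewer card each time.
            p *= Fraction(spades - i, deck - i)
    return p


def z_to_lower_tail(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return NormalDist().cdf(z)


def lower_tail_to_z(p: float) -> float:
    """z score with area p to its left (inverse of z_to_lower_tail)."""
    return NormalDist().inv_cdf(p)


with_replacement = spades_in_a_row(3, replace=True)
without_replacement = spades_in_a_row(3, replace=False)
print(with_replacement, round(float(with_replacement), 4))
print(without_replacement, round(float(without_replacement), 4))

print(round(z_to_lower_tail(1.96), 3))
print(round(lower_tail_to_z(0.975), 2))
```

Using `Fraction` keeps the card probabilities exact until the final rounding, which makes it easy to compare against the hand calculation.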

The David Lane calculator can convert from z scores to probabilities, and from probabilities to z scores: http://davidmlane.com/hyperstat/z_table.html
Watch this video: https://www.youtube.com/watch?v=NQ5j-WTWs-s

Statistical Inference
- p value: what are the odds of something happening by chance?
- Inference: a tentative (trial) conclusion that needs to be tested.
- The process of making conclusions using data that is subject to chance.
- Is what we observe just a fluke?
- How do we know it is not?

Deciding to reject or fail to reject the null hypothesis (Polit p. 105): THIS SLIDE IS CRITICAL
- If the absolute value of the computed statistic (we also say "calculated" and "observed") is greater than the table value (the critical value), the null hypothesis can be rejected, and the result is said to be statistically significant at the specified probability level.
- Statistical significance: the results seen are not attributable to chance (they PROBABLY did not happen by chance, but still could have!).

Test statistic and a p value
- Use my guidance document (look for the p value!).
- Values beyond the critical value are significant (here: 1.96).

State the Null Hypothesis
- It is the opposite of your experimental hypothesis (why?).
- Examples: the IV does not affect the DV; different groups will have equal results on the DV (unaffected by the IV); the therapy (drug) has no effect; etc.
- This implies that any apparent effect or difference is random.
- The groups are equal because the separation into groups (the IV) does not change the DV.

RETAIN or Reject?
Avoid "Accept"
- Our choice is either to reject the null hypothesis (we want to do so, and are trying to do so) or to fail to reject it.
- When we fail to reject, we retain.
- "Retain" is a better word than "accept", because we are not proving or even supporting the null hypothesis; we are just failing to reject it.
- We NEVER support the null hypothesis; we do our best to try to REJECT it.

Seven Steps in Hypothesis Testing
- See page 105 (critical instructions).
- STEP ZERO (required in this class, before you even start the seven steps): state the null hypothesis and the alternative hypothesis.
- Complete all seven steps (page 105).
- Step 7: decide to RETAIN or REJECT the null hypothesis. Do you have significance? With significance, we REJECT the null hypothesis.

Interpreting your p value
- For small p values (usually < 0.05), reject H0: your data don't support H0, and your evidence is beyond a reasonable doubt (the result probably did not happen by chance, but still could have; the p value is never zero).
- For large p values (usually > 0.05), you can't reject H0: you don't have enough evidence against it.
- If your p value is close to 0.05, your results are marginal (could go either way).

Where did we get the 0.05 from? (Vote now!)
- A. Google said so
- B. Excel found it to be best
- C. It is known that chance occurs that much
- D. We just decided to live with 5% chance

Answer: D. We just decided to live with 5% chance (R.A. Fisher's suggestion, from many years ago). Welcome to the world of stats!

The 0.05 level of significance (alpha, α)
- Five percent: the burden of proof.
- The acceptable risk that we may be wrong and the results may be random.
- A popular convention (rule of thumb) in statistics (not just this class).
- But you could choose different levels for alpha: 0.01, 0.001, 0.1, etc.

Review
- Test statistic: computed from a formula and compared with the critical region of the distribution.
- If the absolute value we find for the test statistic is greater than a certain critical value, then the null hypothesis is rejected and the result is significant.
- Statistical significance: the results seen are probably not attributable to chance.
- P value (probability value, also called significance): the probability that the result occurred by chance.
- Is the p value less than α (alpha), our risk threshold, our level of significance?
- The test statistic and the p value work together like the opposite ends of a seesaw: for statistical significance, we want a test statistic greater than the table value and a p value less than alpha (0.05).

SEM: Standard Error of the Mean
- Reminder: SEM is the standard error of the mean, used in sample testing.
- SEM = (standard deviation) / √n, i.e., the standard deviation divided by the square root of n.
- Use this for the homework problem.

Know the Assumptions!
- Each week, know the ASSUMPTIONS for the tests we are learning.
- StatCrunch does the math; you need to know the assumptions so that you can choose the correct tool.
- This week, the assumptions are: random sampling (representative samples), normal distribution, and equal variance.

Preparing for Discussion Question One: THE ONE-SAMPLE T TEST (PAGES 98-105)

Simple t test
Given: sample mean = 73, SD = 15, α
= 0.05, n = 100
Similar to the HW question: assume we are talking about heart rates (but these numbers could represent other variables also).
- H0: µ = 75
- H1: µ ≠ 75 (2-sided)
- Here µ (mu) is the population mean.
- H0: null hypothesis (H sub zero, or H naught)
- H1: alternative hypothesis (H sub one)
- For HW-4: µ = 72 (not zero!)

Discussion Q1: two hypotheses
- H0, null hypothesis: the sample is just like the population; the sample mean should really be equal to the population mean, and any apparent difference is just random. H0: µ = 75
- H1, alternative hypothesis: the sample is NOT like the population; the sample mean should NOT be equal to the population mean, and the difference is NOT just random. H1: µ ≠ 75 (µ is not equal to 75)

The t statistic is NOT negative
With negative values, TAKE THE ABSOLUTE VALUE of t (p. 100). Absolute value means make the negative number positive. Why? Size matters: we care about the magnitude, not the direction.

Simple t test: t = ? And is it significant? (See Polit text p. 100)
- Given: mean = 73, SD = 15, α = 0.05, n = 100; H0: µ = 75; H1: µ ≠ 75 (2-sided)
- t = (sample mean − population mean) / SEM
- t = (73 − 75) / (15 / √100)
- t = −2 / 1.5 = −1.33 (this is the calculated or observed value)
- With negative values, take the absolute value of t (p. 100): |t| = 1.33
- Compare the absolute value to a critical value (the table value) to evaluate the t statistic.

Where do we find the critical value? (Vote now!)
- A. Google it!
- B. Guess
- C. Use Excel (TINV)
- D. In a chart of the t distribution (p. 412)
- E. I have no idea

Where do we find the critical value? (answer)
- A. Google it!
- B. Guess
- C. Excel: =TINV(alpha, d.f.), here =TINV(.05,99)
- D. In a chart of the t distribution (p. 412, Table A.2): t = 1.98
- E. I have no idea

Discuss interpolation. Note: d.f. = n − 1 for the one-sample t test. Our |t| of 1.33 is not greater than 1.98 and therefore not significant!

Finding the answer in Excel
- Excel: =TINV(alpha, d.f.)
- Alpha (α) is the level of significance: α = 0.05, and d.f. = n − 1 (for this test).
- =TINV(0.05,99) = 1.98, which matches the table value (p. 412, t = 1.98).
- Our |t| is not greater than that and therefore not significant!

Another t test example
- Our sample is of salaries in one region of the country, given in a dataset of some 474 people surveyed.
- Let's run a one-sample t test on this data, given that the national salary average is $32,000.
- What is the null hypothesis? The null hypothesis is that our group (in San Diego, for example) is not different from the national average.

What is the null hypothesis?
- A. The salaries of the people in our region are below the national average
- B. The salaries of the
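
Returning to the heart-rate example worked earlier in these slides (sample mean 73, SD 15, n = 100, µ = 75), the SEM and t calculations can be reproduced with a short Python sketch. The function name `one_sample_t` is hypothetical, and the critical value 1.98 is simply the table value quoted in the slides for α = 0.05 and d.f. = 99; it is not computed here.

```python
import math


def one_sample_t(sample_mean: float, pop_mean: float,
                 sd: float, n: int) -> tuple[float, float]:
    """Return (SEM, t) for a one-sample t test."""
    sem = sd / math.sqrt(n)              # SEM = SD / sqrt(n)
    t = (sample_mean - pop_mean) / sem   # t = (sample mean - mu) / SEM
    return sem, t


# Values from the slide example.
sem, t = one_sample_t(sample_mean=73, pop_mean=75, sd=15, n=100)

# Table value quoted in the slides (alpha = 0.05, d.f. = n - 1 = 99).
T_CRITICAL = 1.98

# Two-sided test: compare the absolute value of t with the table value.
significant = abs(t) > T_CRITICAL

print(f"SEM = {sem:.2f}")
print(f"t = {t:.2f}, |t| = {abs(t):.2f}")
print("reject H0" if significant else "fail to reject H0")
```

The sketch compares the test statistic only with the table value; the matching p value would come from StatCrunch and be compared with alpha.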