A Gentle Introduction to Statistics Using SAS Studio. Ron Cody

if the probability of obtaining the difference by chance is less than 5%. Be really careful here. The term significant is also used by non-statisticians to mean important. In a statistical setting, significant only means that the probability of getting the result by chance is less than 5% or some other number that you specified before the experiment began. Because statisticians like to use Greek letters for lots of things, the predefined significance level is called alpha (α).

      Now for some terminology: You already know about the null hypothesis and a significance level. If caffeine had no effect on heart rate, what is the probability that you would see a difference of 4 points or more by chance alone? This probability is called the p-value. If the p-value is less than alpha, you reject the null hypothesis and accept the alternate hypothesis. The alternate hypothesis in this example is that caffeine does affect a person’s heart rate. Most studies will reject the null hypothesis whether the difference is positive or negative. As strange as this sounds, this means that if the decaf group had a significantly higher heart rate than the regular coffee group, you would also reject the null hypothesis. The reason you ran this experiment was that you expected that caffeine would increase heart rate. If this were a well-established fact, supported by several clinical trials, there would be no need to conduct the study. But if the effect of caffeine on heart rate has never been tested, you need to account for the possibility that it could either increase or decrease heart rate. Looking for a difference in either direction is called a 2-tailed test or a non-directional test.

      As you saw in this example, it is possible for the null hypothesis to be true (caffeine has no effect on heart rate) and, by chance, you reject the null hypothesis and say that caffeine does affect heart rate. This is called a type I error (a false positive result). If there is something called a type I error, there is probably something called a type II error, and there is. This other error occurs when the alternate hypothesis is actually true (caffeine increases heart rate) but you fail to reject the null hypothesis. How could this happen? The most common reason for a type II error is that the experimenter did not have a large enough sample (usually a group of subjects). This makes sense: If you had only a few people in each group, chance variation among individuals could easily hide a real difference between the group means. If you had several thousand people in each group, and caffeine really had an effect on heart rate, you would be pretty sure of confirming the fact. Just as you set a significance level before you started the experiment (the 5% value that statisticians call alpha), you can also compute the probability that you make the wrong decision and claim that caffeine has no effect on heart rate when it really does (a false negative). The probability of a type II error is called beta (β) (more Greek).
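
A small simulation can show why alpha is the probability of a type I error. The sketch below is an illustration under stated assumptions, not code from this book: both groups are drawn from the same normal distribution (a mean of 75 and a standard deviation of 5 are invented values), the standard deviation is treated as known so that a simple z-test can be used, and `false_positive_rate` is a hypothetical helper name. Because the null hypothesis is true in every simulated experiment, every rejection is a false positive.

```python
import random
from statistics import NormalDist, fmean


def false_positive_rate(n_per_group=10, n_experiments=5_000, alpha=0.05,
                        true_mean=75.0, sd=5.0, seed=42):
    """Fraction of simulated experiments that reject a true null hypothesis."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of the difference between two group means (sd assumed known).
    se = sd * (2 / n_per_group) ** 0.5
    rejections = 0
    for _ in range(n_experiments):
        # Both groups come from the same distribution, so there is no real effect.
        group_a = [rng.gauss(true_mean, sd) for _ in range(n_per_group)]
        group_b = [rng.gauss(true_mean, sd) for _ in range(n_per_group)]
        z = (fmean(group_a) - fmean(group_b)) / se
        p_value = 2 * (1 - std_normal.cdf(abs(z)))  # 2-tailed p-value
        if p_value < alpha:
            rejections += 1  # a type I error
    return rejections / n_experiments


rate = false_positive_rate()
print(f"share of experiments with a false positive: {rate:.3f}")
```

The printed share should land close to the chosen alpha of 0.05, although the exact figure depends on the random seed.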

      Rather than think about the probability of a false negative, it is easier to think about the probability of a true positive. This probability is called power, and it is computed as 1 – beta. The last chapter of this book shows how to compute power for different statistical tests. Typically, the only way to increase power is to have a larger sample size.
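
The relationship between sample size, beta, and power can be sketched with a normal approximation. This is only a rough illustration, not the procedure shown in the last chapter of this book: it assumes a 2-tailed, two-sample z-test with equal group sizes and a known standard deviation, and the effect of 4 points and standard deviation of 5 are invented values. The function name `power_two_sample_z` is hypothetical.

```python
from statistics import NormalDist


def power_two_sample_z(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a 2-tailed, two-sample z-test with equal group sizes."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # critical value for a 2-tailed test
    se = sd * (2 / n_per_group) ** 0.5          # standard error of the difference
    shift = effect / se                         # true effect in standard-error units
    # Probability that the test statistic falls in either rejection region.
    upper = 1 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower


for n in (5, 10, 20, 40, 80):
    power = power_two_sample_z(effect=4.0, sd=5.0, n_per_group=n)
    print(f"n per group = {n:3d}   power = {power:.2f}   beta = {1 - power:.2f}")
```

Running the loop shows power rising and beta falling as the number of subjects per group grows, with everything else held fixed.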

      Before we leave this chapter, there are two more terms that you should be familiar with. Remember that in our coffee experiment, we assigned people to drink regular or decaf coffee. The 10 people in each group are called a sample. When you do your experiment and state your conclusion, you are not merely making a statement about your sample. You are saying that anyone who drinks regular coffee will have a higher heart rate, by an amount that you estimate to be 4 points. In other words, you are making a statement about anyone who drinks regular or decaf coffee, and this theoretical group of people that you are drawing conclusions about is called a population. In practice, you define your population (everyone in the US or the world, for example), take samples from this population, and make inferences about the effect of your intervention on an outcome. That is why this branch of statistics is called inferential statistics.

      ● Measures of central tendency – statistics such as a mean or median that describe the center of a group of data values.

      ● Dispersion – a measure that describes how spread out the data values are. The most common measure of dispersion is the standard deviation.

      ● Sample – the group of subjects on whom you are conducting your experiment.

      ● Population – a theoretical group of subjects on whom you make inferences, based on the results from your sample.

      ● Type I error – a false positive result from a study. For example, concluding a drug or treatment works when it does not.

      ● Alpha (α) – the significance level. It is the probability you are willing to accept for a type I error, usually set at .05.

      ● p-value – the probability of obtaining a result at least as extreme as the one you observed, by chance alone, if the null hypothesis is true. If the p-value is less than alpha, you declare the results as significant. Some researchers prefer to just list the p-value and not set a specific alpha level.

      ● Type II error – a false negative result. For example, you have a drug or treatment that works, but the results from the study are not significant (the p-value is greater than alpha).

      ● Beta (β) – the probability of getting a false negative (type II) result.

      ● Power – the probability of a positive result when there truly is an effect. For example, you claim that your drug or treatment is better than a placebo or standard treatment, and you are correct in this decision. The power is the probability of the study rejecting the null hypothesis (finding the effect). Typically, powers of 80% or higher are considered acceptable. Large expensive studies may require powers of 90% or even 95%.

      Chapter 2: Study Designs

       Introduction

       Double-Blind, Placebo-Controlled Clinical Trials

       Cohort Studies

       Case-Control Studies

       Conclusion

      Whether you are reading a study or designing your own, one of the most important factors that you should consider is the study design. For example, are you going to randomly assign subjects to different treatments, or are you simply going to observe outcomes in people who have some trait in common? This chapter describes some of the commonly used study designs. The discussion will help you judge the quality of study results and decide how much faith you should have in them, regardless of the p-value or size of the effect.

      The double-blind, placebo-controlled clinical trial is the “gold standard” of all study designs. Even though the term “clinical” is in the title, this study design can be used in all fields of study. The caffeine study described in the previous chapter is an example of this study design. The first step is to select a representative sample of people from the population on which you want to make inferences. The population might be limited by age, gender, or ethnicity. For example, for your caffeine study, you would probably not include very young children. After you conduct an analysis to determine how many subjects you should include in your study (discussed in Chapter 15), you randomly assign the subjects to as many groups as necessary. This design does not necessarily have to include a placebo group even though the study design includes the word “placebo” in the title. It might not be ethical to include a placebo group if this could cause harm to the subjects in this group. For example, if you are comparing drugs to lower blood pressure in a population of subjects with high blood pressure, you might choose to have one group take one of the standard medicines and the other groups take one or more new treatments that you want to study.
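
The random assignment step can be sketched in a few lines of Python. This is one simple way to do it, shown only as an illustration: the subject IDs, the group names, and the `randomize` helper are all hypothetical, and a real trial would follow its own randomization protocol. The sketch shuffles the subjects and then deals them into the groups in turn so that the groups end up (nearly) equal in size.

```python
import random


def randomize(subject_ids, group_names, seed=None):
    """Shuffle subjects and deal them into groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)  # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    assignment = {name: [] for name in group_names}
    for position, subject in enumerate(shuffled):
        # Deal subjects to the groups in rotation, like dealing cards.
        group = group_names[position % len(group_names)]
        assignment[group].append(subject)
    return assignment


subjects = [f"S{i:02d}" for i in range(1, 21)]  # 20 hypothetical subject IDs
groups = randomize(subjects, ["regular", "decaf"], seed=2024)
for name, members in groups.items():
    print(name, sorted(members))
```

Passing more group names, for example one standard medicine and two new treatments, splits the same subjects into as many groups as the study needs.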

      Next, we need to discuss the term double-blind. Double-blind means that neither the subject nor the person evaluating the subject knows what treatment or drug a subject is receiving. This is quite easy to do if the

