We often use inferential statistics to make decisions or judgments about the value of a parameter, such as a population mean. For example, we might need to decide whether the mean weight, 𝜇, of all bags of pretzels packaged by a particular company differs from the advertised weight of 454 grams, or we might want to determine whether the mean age, 𝜇, of all cars in use has increased from the year 2000 mean of 9.0 years. One of the most commonly used methods for making such decisions or judgments is to perform a hypothesis test. A hypothesis is a statement that something is true. For example, the statement “the mean weight of all bags of pretzels packaged by the company differs from the advertised weight of 454 g” is a hypothesis.

Typically, a hypothesis test involves two hypotheses: the null hypothesis and the alternative hypothesis (or research hypothesis). For instance, in the pretzel packaging example, the null hypothesis might be “the mean weight of all bags of pretzels packaged equals the advertised weight of 454 g,” and the alternative hypothesis might be “the mean weight of all bags of pretzels packaged differs from the advertised weight of 454 g.”

**The first step in setting up a hypothesis test is to decide on the null hypothesis and the alternative hypothesis**. The null hypothesis for a hypothesis test concerning a population mean, 𝜇, always specifies a single value for that parameter. Hence, we can express the null hypothesis as

*H*0: 𝜇 = 𝜇0

The choice of the alternative hypothesis depends on and should reflect the purpose of the hypothesis test. Three choices are possible for the alternative hypothesis.

- If the primary concern is deciding whether a population mean, 𝜇, is different from a specified value 𝜇0, we express the alternative hypothesis as *H*a: 𝜇 ≠ 𝜇0. A hypothesis test whose alternative hypothesis has this form is called **a two-tailed test**.
- If the primary concern is deciding whether a population mean, 𝜇, is less than a specified value 𝜇0, we express the alternative hypothesis as *H*a: 𝜇 < 𝜇0. A hypothesis test whose alternative hypothesis has this form is called **a left-tailed test**.
- If the primary concern is deciding whether a population mean, 𝜇, is greater than a specified value 𝜇0, we express the alternative hypothesis as *H*a: 𝜇 > 𝜇0. A hypothesis test whose alternative hypothesis has this form is called **a right-tailed test**.

Note: A hypothesis test is called a one-tailed test if it is either left-tailed or right-tailed.
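The three alternative forms can be sketched in code. This is a minimal illustration, assuming a z-test whose standardized test statistic z follows the standard normal distribution under the null hypothesis; the function name `reject` and the numeric inputs are hypothetical.

```python
# Sketch: how each alternative hypothesis maps to a rejection criterion
# for a standardized test statistic z (assumed z-test; stdlib only).
from statistics import NormalDist

def reject(z, alpha, tail):
    """Return True if H0 is rejected at significance level alpha."""
    nd = NormalDist()
    if tail == "two-tailed":    # Ha: mu != mu0 -> reject in both tails
        return abs(z) > nd.inv_cdf(1 - alpha / 2)
    if tail == "left-tailed":   # Ha: mu < mu0  -> reject in the lower tail
        return z < nd.inv_cdf(alpha)
    if tail == "right-tailed":  # Ha: mu > mu0  -> reject in the upper tail
        return z > nd.inv_cdf(1 - alpha)
    raise ValueError(tail)

# A hypothetical z of -2.56 rejects a two-tailed test at alpha = 0.05,
# but not a right-tailed test (the evidence points the wrong way).
print(reject(-2.56, 0.05, "two-tailed"))
print(reject(-2.56, 0.05, "right-tailed"))
```

Note that the same data can reject one form of the alternative and not another, which is why the alternative must be chosen before looking at the results.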

After we have chosen the null and alternative hypotheses, we must decide whether to reject the null hypothesis in favor of the alternative hypothesis. In practice, we need a precise criterion for making that decision, which involves **a test statistic**, that is, a statistic calculated from the sample data that serves as the basis for deciding whether the null hypothesis should be rejected.
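As a concrete sketch, one common test statistic for a hypothesis about a population mean is the z-statistic, which standardizes the distance between the sample mean and the null value 𝜇0. The numeric values below (a sample of 25 bags averaging 450 g, population standard deviation 7.8 g) are hypothetical; the original text gives only 𝜇0 = 454 g.

```python
# Minimal sketch of a z test statistic, assuming the population standard
# deviation sigma is known (hypothetical sample values).
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """Standardized distance of the sample mean from the null value mu0."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

z = z_statistic(sample_mean=450, mu0=454, sigma=7.8, n=25)
print(round(z, 2))  # -2.56: the sample mean sits below the advertised weight
```

A large magnitude of z indicates that the observed sample mean would be unusual if the null hypothesis were true, which is the basis for rejecting it.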

**Type I and Type II Errors**

In statistics, a type I error is rejecting the null hypothesis when it is in fact true, whereas a type II error is not rejecting the null hypothesis when it is in fact false. The probabilities of both types of error are essential for evaluating the effectiveness of a hypothesis test, because they quantify the chances of making an incorrect decision. The probability of a type I error, that is, of rejecting a true null hypothesis, is commonly called the significance level of the hypothesis test and is denoted 𝛼. The probability of a type II error, that is, of not rejecting a false null hypothesis, is denoted 𝛽.

Ideally, both type I and type II errors should have small probabilities. Then the chance of making an incorrect decision would be small, regardless of whether the null hypothesis is true or false. We can design a hypothesis test to have any specified significance level. So, for instance, if not rejecting a true null hypothesis is important, we should specify a small value for 𝛼. However, in making our choice for 𝛼, we must keep Key Fact 9.1 in mind: for a fixed sample size, the smaller we specify the significance level 𝛼, the larger will be the probability 𝛽 of not rejecting a false null hypothesis. Consequently, we must always assess the risks involved in committing both types of errors and use that assessment to balance the type I and type II error probabilities.
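The trade-off between 𝛼 and 𝛽 can be illustrated with a small Monte Carlo sketch. All numeric values here (a true mean of 451 g, 𝜎 = 7.8 g, n = 25) are hypothetical, chosen so the null value 𝜇0 = 454 g is actually false; the simulation then estimates how often a two-tailed z-test fails to reject it.

```python
# Illustrative simulation (assumed setup): estimate beta, the type II error
# probability of a two-tailed z-test, at two significance levels. Shrinking
# alpha should raise beta for a fixed sample size.
import math
import random
import statistics

random.seed(1)

def beta_hat(alpha, true_mu, mu0=454, sigma=7.8, n=25, trials=2000):
    """Monte Carlo estimate of beta when the true mean is true_mu != mu0."""
    z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / math.sqrt(n)
    fails = 0
    for _ in range(trials):
        xbar = random.gauss(true_mu, se)   # sample mean when H0 is false
        z = (xbar - mu0) / se
        if abs(z) <= z_crit:               # fail to reject the false null
            fails += 1
    return fails / trials

beta_05 = beta_hat(alpha=0.05, true_mu=451)
beta_01 = beta_hat(alpha=0.01, true_mu=451)
print(beta_05 < beta_01)  # the smaller alpha comes at the cost of a larger beta
```

Increasing the sample size is the standard way to drive both error probabilities down at once, which is why the trade-off is stated for a fixed sample size.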

The significance level, 𝛼, is the probability of making a type I error, that is, of rejecting a true null hypothesis. Therefore, if the hypothesis test is conducted at a small significance level (e.g., 𝛼 = 0.05), the chance of rejecting a true null hypothesis will be small. Thus, if we do reject the null hypothesis, we can be reasonably confident that the null hypothesis is false. In other words, if we do reject the null hypothesis, we conclude that the data provide sufficient evidence to support the alternative hypothesis.
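The decision criterion is often phrased in terms of a p-value: reject the null hypothesis when the p-value falls below the chosen significance level. A minimal sketch for a two-tailed z-test follows; the z value of -2.56 is a hypothetical observation, not a figure from the original text.

```python
# Sketch of the p-value decision rule for an assumed two-tailed z-test.
from statistics import NormalDist

def two_tailed_p(z):
    """P(|Z| >= |z|) under the standard normal null distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
p = two_tailed_p(-2.56)   # z from a hypothetical sample
print(p < alpha)          # True -> reject H0 at the 5% significance level
```

Here the p-value (about 0.01) is below 𝛼 = 0.05, so the data would be judged to provide sufficient evidence against the null hypothesis.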

However, we usually do not know the probability, 𝛽, of making a type II error, that is, of not rejecting a false null hypothesis. **Consequently, if we do not reject the null hypothesis, we simply reserve judgment about which hypothesis is true**. In other words, if we do not reject the null hypothesis, we conclude only that the data do not provide sufficient evidence to support the alternative hypothesis; we do not conclude that the data provide sufficient evidence to support the null hypothesis. In short, there might be a real difference, but the power of the statistical procedure might not be high enough to detect it.