Introduction to Hypothesis Testing
To decide which of two competing claims is correct, a hypothesis test is carried out to ascertain whether there is enough statistical evidence in favour of a certain belief or hypothesis about a population parameter.
A hypothesis test consists of: collecting data from a sample, evaluating the data, and deciding if there is sufficient evidence in the data to reject the null hypothesis.
Definition:
Null Hypothesis ($H_0$)
Definition:
Alternate Hypothesis ($H_1$)
Definition:
Test Statistic
Definition:
Level of Significance
Definition:
$P$-value
Thus, a high $P$-value indicates that the sample data is likely if the null hypothesis is true, and a low $P$-value indicates that the sample data is unlikely if the null hypothesis is true.
Furthermore, a low $P$-value suggests that the sample provides enough evidence to reject the null hypothesis for the entire population.
So if the $P$-value is less than the significance level, $\alpha$, then we can reject the null hypothesis in favour of the alternate hypothesis. On the other hand, if the $P$-value is greater than the level of significance, $\alpha$, then we fail to reject the null hypothesis.
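The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function name `decide` is hypothetical, and treating the boundary case $P = \alpha$ as "fail to reject" is an assumption of this sketch, since the text only covers the strictly smaller and strictly greater cases.

```python
def decide(p_value: float, alpha: float) -> str:
    """Return the test decision from a P-value and a significance level."""
    if not (0.0 <= p_value <= 1.0 and 0.0 < alpha < 1.0):
        raise ValueError("p_value must be in [0, 1] and alpha in (0, 1)")
    # Reject H0 only when the P-value is strictly below alpha.
    # Assumption of this sketch: p_value == alpha counts as "fail to reject".
    return "reject H0" if p_value < alpha else "fail to reject H0"


print(decide(0.02, 0.05))   # P-value below alpha
print(decide(0.045, 0.01))  # P-value above alpha
```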
Guidelines On How To Conduct A Hypothesis Test
$k=$ the historical value of the parameter.
The null hypothesis always follows the form: $H_0: \theta=k$
Depending on the data obtained from the sample or on how the problem is framed, choose only one of the following alternate hypotheses to be $H_1$ $$ \begin{array}{lll} H_1: \theta < k & \Rightarrow & \text { left tail test } \\ H_1: \theta > k & \Rightarrow & \text { right tail test } \\ H_1: \theta \neq k & \Rightarrow & \text { double tail test } \end{array} $$
The test statistic for each population parameter will be calculated differently. For the purpose of this course, the parameters of interest will be the mean when the population variance is known, the mean when the population variance is unknown, and the population proportion. Each will be treated in its own section.
If the test statistic is in the rejection zone, then there is sufficient evidence to reject the null hypothesis in favour of the alternate hypothesis.
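As a rough illustration of how the choice of $H_1$ fixes the rejection zone, the sketch below computes critical values and checks whether a test statistic falls in the rejection zone. It assumes the null distribution is the standard normal (as in the $Z$-based tests of this course); the function names `z_critical` and `in_rejection_zone` and the `tail` labels are hypothetical.

```python
from statistics import NormalDist


def z_critical(alpha: float, tail: str) -> tuple:
    """Critical value(s) of the standard normal for a left, right or two tail test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)  # alpha is split between both tails
        return (-z, z)
    raise ValueError("tail must be 'left', 'right' or 'two'")


def in_rejection_zone(z_stat: float, alpha: float, tail: str) -> bool:
    """True when the test statistic lies in the rejection zone."""
    crit = z_critical(alpha, tail)
    if tail == "left":
        return z_stat < crit[0]
    if tail == "right":
        return z_stat > crit[0]
    return z_stat < crit[0] or z_stat > crit[1]


print(z_critical(0.05, "two"))              # roughly (-1.96, 1.96)
print(in_rejection_zone(2.74, 0.05, "two"))
print(in_rejection_zone(-1.20, 0.05, "left"))
```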
Example
A company claims that the average lifetime of its light bulbs is 1,200 hours. A consumer protection group suspects that the average lifetime is actually less than 1,200 hours. What is the appropriate alternative hypothesis?
C. $H_1: \mu<1200$
Solution
Example
A pharmaceutical company claims that its new drug has no effect on blood pressure, and the average systolic blood pressure remains 120 mmHg. A doctor wants to test if the drug has any effect, either increasing or decreasing blood pressure. What is the appropriate alternative hypothesis?
B. $H_1: \mu\neq 120$
Solution
Example
A researcher believes that the mean weight of apples in a certain orchard has increased compared to last year's mean of $150$ grams. What is the appropriate alternative hypothesis?
D. $H_1: \mu> 150$
Solution
Example
The average test score of a math class is believed to be 75 points, but a teacher hypothesizes that the new teaching method will result in scores that are higher than 75 points. What is the appropriate alternative hypothesis?
D. $H_1: \mu> 75$
Solution
Example
The average daily sales in a small café are $\$ 800$, but the manager suspects that sales have decreased since a new competitor opened nearby. What is the appropriate alternative hypothesis?
C. $H_1: \mu< 800$
Solution
Example
A political poll states that $60 \%$ of voters support a new policy. A researcher believes that the actual proportion of voters who support the policy is different from $60 \%$. What is the appropriate alternative hypothesis?
B. $H_1: p\neq 0.6$
Solution
Example
A manufacturing company claims that $90 \%$ of its products meet quality standards. An auditor suspects that the actual proportion of products meeting quality standards is less than $90 \%$. What is the appropriate alternative hypothesis?
C. $H_1: p<0.9$
Solution
Example
A machine fills cereal boxes with an average of 500 g per box. The standard deviation is known to be 10 g. A random sample of 30 boxes has a mean weight of 505 g. The test is conducted at a $5 \%$ significance level to determine if the mean weight is different from 500 g.
The calculated test statistic is $z=2.74$, and the critical values are $\pm 1.96$. What decision should be made?
A. Reject the null hypothesis because $|z|>1.96$.
Solution
Example
A researcher claims that the average score on a math test is 75. A random sample of $25$ students has a mean score of 78 and a standard deviation of 5. The test is conducted at a $1 \%$ significance level to determine if the average score is higher than 75.
The calculated test statistic is $t=3.00$, and the critical value is $t_{0.01,24}=2.492$. What decision should be made?
A. Reject the null hypothesis because $t>2.492$.
Solution
Example
A survey reports that $40 \%$ of adults prefer online shopping. A random sample of 100 adults finds that $48 \%$ prefer online shopping. The test is conducted at a $5 \%$ significance level to determine if the true proportion is different from $40\%$.
The calculated test statistic is $z=1.63$, and the critical values are $\pm 1.96$. What decision should be made?
B. Fail to reject the null hypothesis because $|z|<1.96$.
Solution
Example
A factory claims that the mean lifetime of a type of battery is 300 hours. The standard deviation is known to be 50 hours. A random sample of 36 batteries has a mean lifetime of 290 hours. The test is conducted at a $5 \%$ significance level to determine if the mean lifetime is less than 300 hours.
The calculated test statistic is $z=-1.20$, and the critical value is $z_{0.05}=-1.645$. What decision should be made?
B. Fail to reject the null hypothesis because $z>-1.645$.
Solution
Example
It is claimed that $70\%$ of students regularly use the library. A random sample of $200$ students finds that 150 students ( $75 \%$ ) regularly use the library. The test is conducted at a $1 \%$ significance level to determine if the proportion is higher than $70 \%$.
The calculated test statistic is $z=1.54$, and the critical value is $z_{0.01}=2.33$. What decision should be made?
B. Fail to reject the null hypothesis because $z<2.33$.
Solution
Example
A nutritionist tests whether the average calorie content of a type of snack bar is less than 200 calories. The $P$-value of the test is 0.02, and the significance level is $\alpha=0.05$. What decision should be made?
A. Reject the null hypothesis because $p<\alpha$.
Solution
Example
A factory claims that the mean diameter of its bolts is 5 mm. A test yields a $P$-value of 0.045. The significance level is $\alpha=0.01$. What decision should be made?
B. Fail to reject the null hypothesis because $p>\alpha$.
Solution
Example
A researcher tests whether the proportion of students who own a tablet is greater than $40 \%$. The sample yields a $P$-value of 0.18. The significance level is $\alpha=0.05$. What decision should be made?
B. Fail to reject the null hypothesis because $p>\alpha$.
Solution
Example
A study tests whether the mean weight of a certain fish species is 20 kg. A random sample produces a $P$-value of 0.008. The test is conducted at a significance level of $\alpha=0.01$. What decision should be made?
A. Reject the null hypothesis because $p<\alpha$.
Solution
Example
A poll claims that $50 \%$ of voters support a new policy. A random sample produces a test statistic with a $P$-value of 0.12. The test is conducted at a significance level of $\alpha=0.10$. What decision should be made?
B. Fail to reject the null hypothesis because $p>\alpha$.
Solution
Example
A company claims that the average processing time for an application is 15 minutes. A random sample yields a $P$-value of 0.03 in a two-tailed test. The significance level is $\alpha=0.05$. What decision should be made?
A. Reject the null hypothesis because $p<\alpha$.
Solution
Errors
Definition:
Type I Error
Definition:
Type II Error
Definition:
Power
Factors Influencing the Power of a Test
Example
An industrial engineer tests whether the average time to assemble a product differs from the standard time of 45 minutes. The null hypothesis is $H_0: \mu=45$, and the alternative hypothesis is $H_1: \mu\neq 45$. What is a Type II error in this context?
B. Concluding that the average assembly time is 45 minutes when it actually differs.
Solution
Example
A public health official tests whether the proportion of households with access to clean water in a region is different from $70 \%$. The null hypothesis is $H_0: p=0.70$, and the alternative hypothesis is $H_1: p \neq 0.70$. What is a Type I error in this context?
A. Concluding that the proportion of households with access to clean water is different from $70 \%$ when it does not.
Solution
Example
A clinical trial tests whether a new drug reduces blood pressure below the current average of 120 mmHg. The null hypothesis is $H_0: \mu=120$, and the alternative hypothesis is $H_1: \mu<120$. What is a Type I error in this context?
A. Concluding that the drug reduces blood pressure when it does not.
Solution
Example
A manufacturer claims that $95 \%$ of its products meet quality standards. The null hypothesis is $H_0: p=0.95$, and the alternative hypothesis is $H_1: p<0.95$. What is a Type II error in this context?
D. Failing to detect a decrease in the proportion meeting quality standards.
Solution
Example
A researcher tests whether the average temperature in a region has increased above the historical average of $15^{\circ} C$. The null hypothesis is $H_0: \mu=15$, and the alternative hypothesis is $H_1: \mu>15$. What is a Type I error in this context?
A. Concluding that the average temperature has increased when it has not.
Solution
Example
A health department tests whether the proportion of people vaccinated in a community is less than the required $80 \%$ for herd immunity. The null hypothesis is $H_0: p=0.80$, and the alternative hypothesis is $H_1: p<0.80$. What is a Type II error in this context?
C. Failing to detect that the vaccination rate is lower than $80 \%$.
Solution
Example
A biologist tests whether the proportion of a bird species migrating each year is different from $60 \%$. The null hypothesis is $H_0: p=0.60$, and the alternative hypothesis is $H_1: p \neq 0.60$. What is the Type I error in this context?
A. Concluding that the migration proportion differs from $60 \%$ when it does not.
Solution
Example
An agricultural scientist tests whether a new fertilizer increases crop yield above the standard average of 50 bushels per acre. The null hypothesis is $H_0: \mu=50$, and the alternative hypothesis is $H_1: \mu>50$. If the test leads to rejecting $H_0$ when $H_0$ is actually true, what type of error is this?
A. Type I error
Solution
Tests on the Mean When Population Variance is Known
As with interval estimation, assumptions for normality and implications from the Central Limit Theorem hold:
That is
Since the sampling distribution of the means follows a normal distribution, the standard normal ($Z$-table) will serve as the null (statistical) distribution for determining critical values and computing $P$-values.
How To Conduct A Test on The Mean When $\sigma$ Is Known
1. State the null hypothesis
$H_0:\mu=k$
2. State the alternate hypothesis
$H_1: \mu < k $
$H_1: \mu > k $ or
$H_1: \mu \neq k $
3. Draw the rejection zone if the level of significance, $\alpha$, is given, and determine the critical value(s), $Z_c$, associated with the rejection region.
4. Using the information obtained from the sample, calculate the test statistic, $Z_t$, $$Z_t=\frac{\bar{x}-k}{\sigma / \sqrt{n}}$$
5. Make a statistical decision. Do this with a graph or by comparing the test statistic to the critical value(s).
6. Make a conclusion in the context of the problem.
7. Calculate the $P$-value and make a decision based on the $P$-value.
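The steps above can be sketched in Python using only the standard library. This is a minimal illustration, not a definitive implementation: the function name `z_test_mean` and the `tail` labels are hypothetical, and the numbers reuse the cereal-box example from earlier in these notes (sample mean 505 g, $k=500$ g, $\sigma=10$ g, $n=30$, $\alpha=0.05$, two-tailed).

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(x_bar, k, sigma, n, alpha, tail):
    """Z-test on the mean with known sigma.

    Returns (Z_t, P-value, reject_H0). `tail` is 'left', 'right' or 'two'.
    """
    std_normal = NormalDist()
    # Step 4: test statistic Z_t = (x_bar - k) / (sigma / sqrt(n))
    z_t = (x_bar - k) / (sigma / sqrt(n))
    # Step 7: P-value is the tail area beyond Z_t under the standard normal
    if tail == "left":
        p_value = std_normal.cdf(z_t)
    elif tail == "right":
        p_value = 1 - std_normal.cdf(z_t)
    elif tail == "two":
        p_value = 2 * (1 - std_normal.cdf(abs(z_t)))
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")
    # Decision: reject H0 when the P-value is below alpha
    return z_t, p_value, p_value < alpha


z_t, p_value, reject = z_test_mean(505, 500, 10, 30, alpha=0.05, tail="two")
print(f"Z_t = {z_t:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

Comparing the $P$-value with $\alpha$ leads to the same decision as comparing $Z_t$ with the critical value(s), so the sketch only implements the $P$-value route.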
Remark
Rule of Thumb
`` At the _____ level of significance, there is sufficient/insufficient evidence to indicate that the mean is _____ ``
`` At the _____ level of significance, there is/is not enough evidence to indicate that the mean is _____ ``
** Write the alternate hypothesis in English, and in the context of the problem, in the last blank.
Remark
Remark
Failing to reject $H_0$ could occur for several reasons, such as
Example
Consider a hypothesis test where $H_0: \mu=29$ and $H_1: \mu \neq 29$. A random sample of $25$ observations taken from a population produced a sample mean of $25.3$. The population is normally distributed with $\sigma=8$.
Example
Consider a hypothesis test where $H_0: \mu=30$ and $H_1: \mu<30$. A random sample of $36$ observations taken from a population produced a sample mean of $27.6$. The population has a standard deviation of $\sigma=10$.
Example
Consider a hypothesis test where $H_0: \mu=54$ and $H_1: \mu>54$. A random sample of 40 observations taken from a population produced a sample mean of $56.78$. The population has a standard deviation of $\sigma=5.25$.
Example
A certain colleague of mine who teaches Differential Equations suspects that the $10$ ounce bag of fancy Swiss cheese he gets at the supermarket actually weighs less than $10$ ounces. He took a random sample of $20$ such packages and found that the mean weight for the sample was $9.955$ ounces. The population follows a normal distribution with a standard deviation of $0.15$ ounces.
Example
A study claims that senior citizens living in Mirabel spend an average of $14$ hours gardening during the weekend. A random sample of $200$ people showed that these senior citizens spend an average of $14.65$ hours on gardening during the weekend. Suppose that the standard deviation is known to be $3$ hours.
Example
The life in hours of a battery is known to be normally distributed with standard deviation $\sigma=1.25$ hours. A random sample of 10 batteries has a mean life of $\bar{x}=40.5$ hours.
Example
A melting point test of $n=10$ samples of a binder used in manufacturing a rocket propellant resulted in $\bar{x}=154.2^{\circ} F$. Assume that the melting point is normally distributed with $\sigma=1.5^{\circ} F$.
Example
An engineer who is studying the tensile strength of a steel alloy intended for use in golf club shafts knows that tensile strength is approximately normally distributed with $\sigma=60$ psi. A random sample of 12 specimens has a mean tensile strength of $\bar{x}=3450$ psi.
Example
A phone company claims that the mean duration of long-distance calls made by its residential customers is 10 minutes. A random sample of 100 long-distance calls made by its residential customers found that the mean duration for these calls was 10.20 minutes. Suppose that the standard deviation is known to be 3.80 minutes.
Example
At a dairy farm, a machine is set to fill 32-ounce cartons with milk. However, the machine does not put exactly 32 ounces into each carton; the amount varies from carton to carton, but the volume is known to be normally distributed. When the machine is working correctly, the mean volume dispensed into each carton is 32 ounces, with a standard deviation of 1.5 ounces. A quality control inspector takes 25 cartons and finds that the average volume of milk in the containers is 31.93 ounces.
Tests on the Mean When Population Variance is Unknown
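When the population variance is unknown, the sample standard deviation $s$ replaces $\sigma$ in the test statistic: $$Z_t=\frac{\bar{x}-k}{\sigma / \sqrt{n}} \Rightarrow T_t=\frac{\bar{x}-k}{s / \sqrt{n}} $$ As a result, the $t$-distribution serves as the null (statistical) distribution, and the $P$-values are estimated differently than when the population variance is known.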
How To Conduct A Test on The Mean When $\sigma$ Is Unknown
1. State the null hypothesis
$H_0:\mu=k$
2. State the alternate hypothesis
$H_1: \mu < k $
$H_1: \mu > k $ or
$H_1: \mu \neq k $
3. Draw the rejection zone if the level of significance, $\alpha$, is given, and determine the critical value(s), $t_{c, n-1}$, associated with the rejection region. The degrees of freedom associated with this test are $df=n-1$.
4. Using the information obtained from the sample, calculate the test statistic, $T_t$, $$T_t=\frac{\bar{x}-k}{s / \sqrt{n}}$$
5. Make a statistical decision. Do this with a graph or by comparing the test statistic to the critical value(s).
6. Make a conclusion in the context of the problem.
7. Estimate the $P$-value and make a decision based on the $P$-value.
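Steps 4 and 5 above can be sketched in Python as follows. Python's standard library has no $t$-distribution, so this sketch assumes the critical value is read from a $t$-table and passed in as a positive number; the function names `t_statistic` and `t_decision` are hypothetical. The numbers reuse the math-test example from earlier in these notes ($n=25$, $\bar{x}=78$, $s=5$, $k=75$, $t_{0.01,24}=2.492$, right tail).

```python
from math import sqrt


def t_statistic(x_bar, k, s, n):
    """Test statistic T_t = (x_bar - k) / (s / sqrt(n)), with df = n - 1."""
    if n < 2:
        raise ValueError("need at least 2 observations to have df = n - 1 >= 1")
    return (x_bar - k) / (s / sqrt(n))


def t_decision(t_t, t_crit, tail):
    """Compare T_t with a positive table value t_crit.

    For a two tail test, t_crit should be the table value for alpha / 2.
    Returns True when H0 is rejected.
    """
    if tail == "left":
        return t_t < -t_crit
    if tail == "right":
        return t_t > t_crit
    if tail == "two":
        return abs(t_t) > t_crit
    raise ValueError("tail must be 'left', 'right' or 'two'")


t_t = t_statistic(x_bar=78, k=75, s=5, n=25)
print(f"T_t = {t_t:.2f}, df = {25 - 1}")
print("reject H0:", t_decision(t_t, t_crit=2.492, tail="right"))
```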
Remark
Remark
Rule of Thumb
`` At the _____ level of significance, there is sufficient/insufficient evidence to indicate that the mean is _____ ``
`` At the _____ level of significance, there is/is not enough evidence to indicate that the mean is _____ ``
** Write the alternate hypothesis in English, and in the context of the problem, in the last blank.
Example
Consider a hypothesis test where $H_0: \mu=205$ and $H_1: \mu>205$. A random sample of 14 observations taken from a population that is normally distributed produced a sample mean of 212.37 and a standard deviation of 16.35.
Example
Consider a hypothesis test where $H_0: \mu=50$ and $H_1: \mu<50$. A random sample of 8 observations taken from a population that is normally distributed produced a sample mean of 44.98 and a standard deviation of 6.77.
Example
Consider a hypothesis test where $H_0: \mu=10.70$ and $H_1: \mu \neq 10.70$. A random sample of 47 observations taken from a population produced a sample mean of 12.025 and a standard deviation of 4.90.
Example
The President of a university claims that the mean time spent partying by students at this university is less than 11 hours per week. A random sample of 40 students taken from this university showed that they spent an average of 10.5 hours partying, with a standard deviation of 2.3 hours.
Example
A team of physicists is studying the vibration frequency of a newly designed tuning fork. The manufacturer claims that the tuning fork vibrates at an average frequency of 256 Hz. The physicists suspect that the actual mean frequency differs from the claimed value. A random sample of 15 tuning forks produced a sample mean of 253 Hz and a standard deviation of 3.5 Hz.
Example
The body temperatures for 25 female subjects resulted in a sample average of $\bar{x}=98.264^{\circ} F$ and a standard deviation of $s=0.4821^{\circ}F$.
Example
A manufacturer of running shoes knows that the average lifetime for a particular model of shoes is 15 months. Someone in the research and development division of the shoe company claims to have developed a longer lasting product. This new product was worn by 36 individuals and lasted on average for 17 months. The variability of the original shoe is estimated based on the standard deviation of the new group which is 5.5 months.
Example
A company claims that its cookies have a shelf life of 5 years. A random sample of 200 cookies taken from the warehouse had an average shelf life of 58 months with a standard deviation of 4.5 months. Assume that the population is normally distributed.
Tests on the Population Proportion
$$Z=\frac{\hat{p}-k}{\sqrt{\frac{k(1-k)}{n}}}$$
where $\hat{p}$ is the sample proportion, $k$ is the claimed or historical value of the population proportion, and $n$ is the sample size.
How To Conduct A Test On The Population Proportion
1. State the null hypothesis
$H_0:p=k$
2. State the alternate hypothesis
$H_1: p < k $
$H_1: p > k $ or
$H_1: p \neq k $
3. Draw the rejection zone if the level of significance, $\alpha$, is given, and determine the critical value(s), $Z_{c}$, associated with the rejection region.
4. Using the information obtained from the sample, calculate the test statistic, $Z_t$, $$Z_t=\frac{\hat{p}-k}{ \sqrt{\frac{k(1-k)}{n}}} \quad ;\quad \hat{p}=\frac{x}{n}$$
5. Make a statistical decision. Do this with a graph or by comparing the test statistic to the critical value(s).
6. Make a conclusion in the context of the problem.
7. Calculate the $P$-value and make a decision based on the $P$-value.
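The steps above can be sketched in Python with the standard library. This is an illustrative sketch only: the function name `z_test_proportion` and the `tail` labels are hypothetical. The numbers come from the frog example later in this section ($k=0.40$, $x=72$, $n=150$, right tail); that example does not state a significance level, so $\alpha=0.05$ is an assumption made here.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion(x, n, k, alpha, tail):
    """Z-test on a population proportion.

    x: number of successes, n: sample size, k: claimed proportion.
    Returns (p_hat, Z_t, P-value, reject_H0).
    """
    std_normal = NormalDist()
    p_hat = x / n  # sample proportion
    # Step 4: the standard error uses the hypothesised proportion k, not p_hat
    z_t = (p_hat - k) / sqrt(k * (1 - k) / n)
    # Step 7: tail area beyond Z_t under the standard normal
    if tail == "left":
        p_value = std_normal.cdf(z_t)
    elif tail == "right":
        p_value = 1 - std_normal.cdf(z_t)
    elif tail == "two":
        p_value = 2 * (1 - std_normal.cdf(abs(z_t)))
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")
    return p_hat, z_t, p_value, p_value < alpha


# alpha = 0.05 is assumed; the frog example does not specify it.
p_hat, z_t, p_value, reject = z_test_proportion(72, 150, 0.40, alpha=0.05, tail="right")
print(f"p_hat = {p_hat:.2f}, Z_t = {z_t:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
```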
Remark
Rule of Thumb
`` At the _____ level of significance, there is sufficient/insufficient evidence to indicate that the population proportion is _____ ``
`` At the _____ level of significance, there is/is not enough evidence to indicate that the population percentage is _____ ``
** Write the alternate hypothesis in English, and in the context of the problem, in the last blank.
Example
A food company is planning to market a new type of frozen yoghurt. However, before marketing this yoghurt, the company wants to find the percentage of people who like it. The company's management has decided only to market this yoghurt if at least $35\%$ of people like it. The company's research team selected a random sample of 400 people and asked them to taste this yoghurt. Of these, 150 said that they liked it.
Example
A study in 2015 claimed that $11\%$ of all children in the US currently live with at least one grandparent. In 2020, a random sample of 1600 children found that 180 did currently live with at least one grandparent.
Example
A company that sells computer parts claims that more than $90\%$ of its orders are mailed within 72 hours of being received. The quality control department took a random sample of 150 orders and found that 140 were mailed within 72 hours of the order being placed.
Example
A biologist is studying the population of a specific species of frogs in a wetland area. Previous research suggests that $40 \%$ of frogs in this region carry a gene that makes them resistant to a common fungal infection. The biologist hypothesizes that the proportion of resistant frogs has increased due to recent conservation efforts. A random sample of 150 frogs was taken and 72 were found to carry the gene.
Example
A biologist is studying a population of butterflies in a particular region. Historically, it is known that $35 \%$ of these butterflies carry a genetic marker that makes them resistant to a certain plant toxin. A random sample of 100 butterflies was taken and 30 were found to carry the genetic marker.
Example
A tech company claims that its new AI chatbot correctly answers $85 \%$ of user queries. The company has recently implemented an update, and the development team believes that the proportion of correct responses has improved. A random sample of 200 queries was taken, and 180 were found to be answered correctly.
Example
A chemical manufacturing company produces a catalyst that is supposed to speed up a reaction in $95 \%$ of trials. Due to a change in the production process, the company claims that the proportion of successful reactions has decreased. To test this claim, a random sample of $120$ trials using the new catalyst is taken, and the reaction is observed to be successful in $110$ trials.
Example
A company specializing in facial recognition software claims that their AI algorithm correctly identifies faces $98\%$ of the time. A recent update to the algorithm was released, and engineers are concerned that the update might have changed the accuracy of the system. A random sample of 500 faces was taken, and the algorithm correctly identified 480 of them.
Example
A researcher claims that at least $10 \%$ of all football helmets have manufacturing flaws that could potentially cause injury to the wearer. A sample of 200 helmets revealed that 24 helmets contained such defects.