Introduction to Parameter Estimation
Key methods in inferential techniques include:
Parameter Estimation and Confidence Intervals
Definition:
Point Estimate
Remark
Definition:
Interval Estimate
This interval of values is called a confidence interval.
Remark
Definition:
Standard Error
Example
Suppose a random sample of 100 students has a mean test score of 85, and the population standard deviation is 10. The standard error of the mean is: $$SE_{\bar{X}}=\frac{\sigma}{\sqrt{n}}=\frac{10}{\sqrt{100}}=1$$
Solution
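The standard error of the mean is $1$. This means that, across repeated samples of 100 students, the sample mean typically differs from the true population mean by about 1 point. Roughly $95\%$ of sample means fall within about 2 standard errors (2 points) of the true mean.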
Example
Suppose you conduct a survey of 200 people to determine the proportion who prefer tea over coffee. Out of the 200 people, 120 say they prefer tea. The sample proportion $p$ is: $$ p=\frac{\text { Number of people preferring tea }}{\text { Total number of people surveyed }}=\frac{120}{200}=0.6 $$ The formula for the standard error of the sample proportion is: $$SE_p=\sqrt{\frac{p(1-p)}{n}}=\sqrt{\frac{(0.6)(1-0.6)}{200}}=0.0346$$
Solution
The standard error of the proportion is approximately 0.0346. This means that if you repeated the survey many times, the sample proportion would typically vary by about 0.0346 (or 3.46 percentage points) from the true population proportion.
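The two standard error formulas used in the examples above can be checked with a few lines of Python. This is a minimal sketch; the function names `se_mean` and `se_proportion` are illustrative choices, not part of the original notes.

```python
import math

def se_mean(sigma: float, n: int) -> float:
    """Standard error of the sample mean when the population sigma is known."""
    return sigma / math.sqrt(n)

def se_proportion(p_hat: float, n: int) -> float:
    """Estimated standard error of a sample proportion."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Test-score example: sigma = 10, n = 100
print(se_mean(10, 100))                         # 1.0

# Tea survey example: 120 of 200 respondents prefer tea
print(round(se_proportion(120 / 200, 200), 4))  # 0.0346
```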
Confidence Intervals
The interval is constructed using the sample statistic (e.g., mean or proportion), the standard error, and a critical value determined by the desired confidence level.
Confidence intervals are widely used in research and decision-making to quantify the precision of an estimate and communicate the reliability of conclusions drawn from data.
Definition:
Confidence Level
Definition:
Margin of Error
Remark
Definition:
Confidence Interval
Remark
Remark
Definition:
Confidence Interval
Rule of Thumb
Estimating the Mean of A Population (Known Variance)
Remark
Formula:
Confidence Interval for the Mean; $\sigma$ Known
Remark
Example
A random sample of 25 students is taken from a population of students with a known variance of 100. The sample mean is 75. Find the 95% confidence interval for the population mean.
Solution
Given: $n=25$, $\sigma=10$, $\bar{x}=75$, $\alpha=0.05$. The critical value for a 95% confidence interval is $Z_{0.025}=1.96$. The margin of error is $1.96 \times \frac{10}{\sqrt{25}}=3.92$. The 95% confidence interval for the population mean is $75 \pm 3.92 = (71.08, 78.92)$. Interpretation: We are 95% confident that the true population mean lies between 71.08 and 78.92.
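The calculation above can be reproduced in Python using only the standard library. This is a sketch under the same assumptions as the example (random sample, known $\sigma$); the function name `z_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value Z_{alpha/2}
    margin = z * sigma / math.sqrt(n)         # margin of error
    return x_bar - margin, x_bar + margin

# Student example: n = 25, sigma = 10, sample mean 75, 95% confidence
low, high = z_interval(75, 10, 25, 0.95)
print(round(low, 2), round(high, 2))          # 71.08 78.92
```

Because `inv_cdf` returns the exact normal quantile rather than a table value rounded to two decimals, the last digit can differ slightly from a hand calculation at other confidence levels.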
Example
A biologist is studying the average weight of a specific species of frogs in a population. From previous research, the population variance ( $\sigma^2$ ) is known to be $4 g^2$ (so $\sigma=2 g$ ). The biologist collects a random sample of $n=25$ frogs and finds that the sample mean weight is $\bar{x}=12.5g$. Construct a $98 \%$ confidence interval for the true mean weight of the frogs $(\mu)$.
Solution
Given: $n=25$, $\sigma=2$, $\bar{x}=12.5$, $\alpha=0.02$. The critical value for a 98% confidence interval is $Z_{0.01}=2.33$. The margin of error is $2.33 \times \frac{2}{\sqrt{25}}=0.932$. The 98% confidence interval for the population mean is $12.5 \pm 0.932 = (11.568, 13.432)$. Interpretation: We are 98% confident that the true population mean weight of the frogs lies between $11.568$ and $13.432$ grams.
Example
A physicist measures the speed of sound in a different medium over $n=25$ trials, finding a sample mean $\bar{x}=343 m/s$. The population standard deviation is known to be $\sigma=3 m/s$.
Sample Sizes
Lowering the level of confidence narrows the interval, but it is not advisable because the interval then becomes less likely to capture the true parameter. The preferable way to reduce the margin of error is therefore to increase the sample size.
Formula:
Sample Size for Estimating the Mean; $\sigma$ Known
Example
A researcher wants to estimate the mean weight of a population of frogs with a margin of error of $0.5g$. The population standard deviation is known to be $2g$. What sample size is needed to achieve this margin of error with a $95 \%$ confidence level?
Solution
Given: $ME=0.5$, $\sigma=2$, $\alpha=0.05$. The critical value for a 95% confidence interval is $Z_{0.025}=1.96$. The sample size needed is $n=\left(\frac{1.96 \times 2}{0.5}\right)^2=61.4656$. Therefore, the researcher should take a sample size of $n=62$ to estimate the mean weight of the population with a margin of error of $0.5g$ and a $95 \%$ confidence level.
Example
A scientist wants to estimate the mean speed of sound in a medium with a margin of error of $0.5 m/s$. The population standard deviation is known to be $3 m/s$. What sample size is needed to achieve this margin of error with a $99 \%$ confidence level?
Solution
Given: $ME=0.5$, $\sigma=3$, $\alpha=0.01$. The critical value for a 99% confidence interval is $Z_{0.005}=2.58$. The sample size needed is $n=\left(\frac{2.58 \times 3}{0.5}\right)^2=(15.48)^2=239.6304$. Rounding up, the scientist should take a sample size of $n=240$ to estimate the mean speed of sound in the medium with a margin of error of $0.5 m/s$ and a $99 \%$ confidence level.
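The sample size calculation lends itself to a small helper. The sketch below assumes a known $\sigma$ and always rounds up, since a fractional observation is not possible; the name `sample_size_for_mean` is illustrative only.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(margin, sigma, confidence):
    """Smallest n whose margin of error is at most `margin` (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    return math.ceil((z * sigma / margin) ** 2)         # always round up

# Frog example: ME = 0.5 g, sigma = 2 g, 95% confidence
print(sample_size_for_mean(0.5, 2, 0.95))               # 62
```

The function uses the exact normal quantile instead of a table value rounded to two decimals, so its result can occasionally differ by one from a hand calculation.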
Example
A manufacturer claims that their lightbulbs have an average lifetime of $\mu=1200$ hours. A quality control team tests a random sample of $n=40$ lightbulbs and finds a sample mean lifetime of $\bar{x}=1185$ hours. The population standard deviation is known to be $\sigma=100$ hours.
Single Sided Confidence Bounds for the Mean; $\sigma$ Known
Formula:
One-Sided Confidence Bound for the Mean; $\sigma$ Known
Example
An engineering team is testing the tensile strength of a new type of steel alloy. From a sample of $n=50$ test pieces, they measure a sample mean tensile strength of $\bar{x}=850 MPa$. The population standard deviation is known to be $\sigma=40 MPa$. Construct and interpret a $98 \%$ lower confidence bound for the true mean tensile strength of the steel alloy $(\mu)$.
Solution
Given: $n=50$, $\sigma=40$, $\bar{x}=850$, $\alpha=0.02$. Because the bound is one-sided, all of $\alpha$ sits in a single tail, so the critical value is $Z_{0.02}=2.054$. The margin of error is $2.054 \times \frac{40}{\sqrt{50}}=11.62$. The 98% lower confidence bound for the population mean is $850-11.62 = 838.38$. Interpretation: We are 98% confident that the true population mean tensile strength of the steel alloy is at least 838.38 MPa. This lower bound indicates that the true average tensile strength is unlikely to fall below this value, providing engineers with a conservative estimate for performance guarantees.
Example
A veterinarian is studying the weight of a new breed of puppies at 3 months old. From a sample of $n=30$ puppies, the average weight is found to be $\bar{x}=12.5 kg$, with a population standard deviation of $\sigma=2 kg$. Construct and interpret a $95 \%$ upper confidence bound for the true mean weight of the puppies $(\mu)$.
Solution
Given: $n=30$, $\sigma=2$, $\bar{x}=12.5$, $\alpha=0.05$. The critical value for a one-sided 95% confidence bound is $Z_{0.05}=1.645$. The margin of error is $1.645 \times \frac{2}{\sqrt{30}}=0.601$. The 95% upper confidence bound for the population mean is $12.5+0.601 = 13.101$. Interpretation: We are 95% confident that the true population mean weight of the puppies is at most 13.101 kg. This upper bound indicates that the true average weight is unlikely to exceed this value, providing veterinarians with a conservative estimate for feeding guidelines.
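A one-sided bound differs from a two-sided interval only in the critical value: the whole of $\alpha$ goes into one tail, so $Z_{\alpha}$ replaces $Z_{\alpha/2}$. The sketch below illustrates this; the function name `one_sided_bound` and its `side` argument are assumptions made for the example.

```python
import math
from statistics import NormalDist

def one_sided_bound(x_bar, sigma, n, confidence, side):
    """Lower or upper confidence bound for the mean when sigma is known."""
    z = NormalDist().inv_cdf(confidence)   # Z_alpha: all of alpha in one tail
    margin = z * sigma / math.sqrt(n)
    if side == "lower":
        return x_bar - margin
    if side == "upper":
        return x_bar + margin
    raise ValueError("side must be 'lower' or 'upper'")

# Steel alloy example: 98% lower bound
print(round(one_sided_bound(850, 40, 50, 0.98, "lower"), 2))

# Puppy example: 95% upper bound
print(round(one_sided_bound(12.5, 2, 30, 0.95, "upper"), 3))
```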
Estimating the Mean of A Population; Variance Unknown
The $t$-distribution adjusts for this added uncertainty and depends on the sample size through its degrees of freedom $(n-1)$. This method allows researchers to construct confidence intervals and perform hypothesis tests for the population mean even when the population variance is not known, which makes it a widely applicable statistical approach.
Remark
The $t-$Distribution
Remark
Example
Find the $t-$value of a $t-$distribution with $12$ degrees of freedom and $5\%$ in the right tail.
Solution
$t_{0.05,12}=1.782$
Example
Find the $t-$value of a $t-$distribution with $58$ degrees of freedom with $12.5\%$ of the area in the right tail.
Solution
$t_{0.125,58} \approx 1.162$
Example
Find the $t-$value of a $t-$distribution with $20$ degrees of freedom and $0.005$ in the left tail.
Solution
Because the area is in the left tail, the value is negative: $-t_{0.005,20}=-2.845$
Example
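Find the $t-$value of a $t-$distribution with $1500$ degrees of freedom with $1\%$ of the area in the left tail.
Solution
With $1500$ degrees of freedom the $t-$distribution is practically indistinguishable from the standard normal, so the $\infty$ row of the table is used: $-t_{0.01,\infty}=-2.326$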
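Table lookups like these can be cross-checked numerically. The sketch below uses only the Python standard library: it integrates the $t$ density with Simpson's rule and finds the critical value by bisection. The function names (`t_pdf`, `t_cdf`, `t_upper`), the step count, and the search bracket of $\pm 100$ are all choices made for this illustration, and statistical packages offer more precise routines.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_upper(alpha, df):
    """Critical value t_{alpha, df} with P(T > t) = alpha, by bisection."""
    lo, hi = -100.0, 100.0           # assumed bracket, ample for table values
    for _ in range(80):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > alpha:
            lo = mid                 # right-tail area still too large
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_upper(0.05, 12), 3))    # right tail, 12 df: 1.782
print(round(-t_upper(0.005, 20), 3))  # left tail, 20 df: -2.845
```

A left-tail value is obtained by negating the right-tail value, which relies on the symmetry of the $t$-distribution about zero.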
Confidence Interval for the Population Mean; $\sigma$ Unknown
Formula:
Confidence Interval for the Population Mean; $\sigma$ Unknown
Remark
Example
A biologist is studying the wing length of a specific butterfly species. From a random sample of $n=15$ butterflies, the average wing length is found to be $\bar{x}=12.4 cm$, with a sample standard deviation of $s=0.8 cm$. Construct and interpret a $90\%$ confidence interval for the true average wing length of this butterfly species.
Solution
With $n-1=14$ degrees of freedom, the critical value is $t_{0.05,14}=1.761$. The $90\%$ confidence interval for the true average wing length of this butterfly species is $12.4 \pm 1.761 \left(\frac{0.8}{\sqrt{15}}\right) = 12.4 \pm 0.364$ cm. We are $90\%$ confident that the true average wing length of this butterfly species falls between $12.036$ and $12.764$ cm.
Example
A random sample of $n=25$ students is taken to estimate the average number of hours students spend studying per week. The sample mean is $\bar{x}=10.5$ hours, and the sample standard deviation is $s=2.3$ hours. Construct a $95\%$ confidence interval for the true average number of hours students spend studying per week.
Solution
With $n-1=24$ degrees of freedom, the critical value is $t_{0.025,24}=2.064$. The $95\%$ confidence interval for the true average number of hours students spend studying per week is $10.5 \pm 2.064 \left(\frac{2.3}{\sqrt{25}}\right) = 10.5 \pm 0.949$ hours. We are $95\%$ confident that the true average number of hours students spend studying per week falls between $9.551$ and $11.449$ hours.
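Once the critical value has been read from a $t$ table, the interval itself is simple arithmetic. The following sketch takes the table value as an argument rather than computing it; the function name `t_interval` and the parameter `t_crit` are hypothetical.

```python
import math

def t_interval(x_bar, s, n, t_crit):
    """Two-sided interval for the mean when sigma is unknown.

    t_crit is t_{alpha/2, n-1}, looked up in a t table by the caller.
    """
    margin = t_crit * s / math.sqrt(n)   # uses the sample standard deviation s
    return x_bar - margin, x_bar + margin

# Butterfly example: n = 15, s = 0.8, t_{0.05,14} = 1.761
low, high = t_interval(12.4, 0.8, 15, t_crit=1.761)
print(round(low, 3), round(high, 3))

# Study-hours example: n = 25, s = 2.3, t_{0.025,24} = 2.064
low, high = t_interval(10.5, 2.3, 25, t_crit=2.064)
print(round(low, 3), round(high, 3))
```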
One Sided Confidence Bounds for the Population Mean; $\sigma$ Unknown
Formula:
One Sided Confidence Bound for the Population Mean; $\sigma$ Unknown
Example
A chemist is studying the purity percentage of a newly synthesized chemical compound. From a random sample of $n=12$ batches, the sample mean purity is found to be $\bar{x}=98.6 \%$, with a sample standard deviation of $s=0.5 \%$. Calculate and interpret a $95 \%$ lower bound for the true mean purity ( $\mu$ ) of the compound.
Solution
With $n-1=11$ degrees of freedom, the one-sided critical value is $t_{0.05,11}=1.796$. The $95 \%$ lower bound for the true mean purity of the compound is $98.6 - 1.796 \left(\frac{0.5}{\sqrt{12}}\right) = 98.6 - 0.259 = 98.341 \%$. With 95% confidence, the true mean purity of the chemical compound is at least 98.34%. This lower bound provides a conservative estimate of the compound's purity, giving the chemist a reliable minimum value for quality assurance.
Example
A random sample of $n=20$ patients is taken to estimate the average time it takes for a new medication to take effect. The sample mean is $\bar{x}=3.5$ hours, and the sample standard deviation is $s=0.8$ hours. Calculate and interpret a $90 \%$ upper bound for the true average time it takes for the medication to take effect.
Solution
With $n-1=19$ degrees of freedom and all of $\alpha=0.10$ in one tail, the critical value is $t_{0.10,19}=1.328$. The $90 \%$ upper bound for the true average time it takes for the medication to take effect is $3.5 + 1.328 \left(\frac{0.8}{\sqrt{20}}\right) = 3.5 + 0.238 = 3.738$ hours. We are $90 \%$ confident that the true average time it takes for the medication to take effect is at most $3.738$ hours.
When to Use the t-Distribution
For larger samples ( $n \geq 30$ ), the Central Limit Theorem ensures that the sampling distribution of the sample mean is approximately normal, and the $t$-distribution with $n-1$ degrees of freedom is itself close to the standard normal, so the $t$-distribution and normal distribution give nearly identical results.
When to Use the Normal Distribution
Rule of Thumb
Confidence Interval for A Population Proportion
Recall that a binomial distribution can be completely described by the number of independent trials in the experiment, $n$, and by the probability of success in each trial, $p$. Moreover, if $n p>5$ and $n(1-p)>5$, then the normal distribution can be used to approximate the binomial distribution.
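The condition for using the normal approximation is easy to check before building an interval. A minimal sketch, in which the function name `normal_approx_ok` and the `threshold` parameter are illustrative:

```python
def normal_approx_ok(n, p, threshold=5):
    """True when both n*p and n*(1-p) exceed the threshold stated above."""
    return n * p > threshold and n * (1 - p) > threshold

print(normal_approx_ok(200, 0.85))  # True: 170 and 30 both exceed 5
print(normal_approx_ok(20, 0.10))   # False: n*p = 2
```

In practice $p$ is unknown, so the check is usually applied with the sample proportion in its place.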
Point Estimator for a Population Proportion
Definition:
Point Estimate for a Population Proportion
Sampling Distribution For Population Proportion Statistic, $\hat{p}$
Confidence Intervals for a Population Proportion
Formula:
Confidence Interval for a Population Proportion
Remark
Rule of Thumb
Example
A physicist is testing a batch of LED bulbs to determine the proportion that meets the required energy efficiency standards. Out of a random sample of $n=200$ bulbs, $x=170$ bulbs are found to be energy-efficient. Construct and interpret a $90\%$ confidence interval for the true proportion of energy-efficient LED bulbs in the batch.
Solution
The point estimate for the population proportion is $\hat{p}=\frac{x}{n}=\frac{170}{200}=0.85$. The critical value for a $90\%$ confidence interval is $Z_{\alpha / 2}=1.645$. The margin of error is $Z_{\alpha / 2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}=1.645 \sqrt{\frac{0.85(1-0.85)}{200}}=0.042$. Therefore, the $90\%$ confidence interval for the true proportion of energy-efficient LED bulbs in the batch is $0.85 \pm 0.042$ or $(0.808, 0.892)$. This means that we are $90\%$ confident that the true proportion of energy-efficient LED bulbs in the batch is between $80.8\%$ and $89.2\%$.
Example
A software company is testing a new algorithm for detecting malware. In a random sample of $n=500$ files, the algorithm correctly identifies $x=460$ malware-infected files. Construct and interpret a $96\%$ confidence interval for the true proportion of malware-infected files that the algorithm can detect.
Solution
The point estimate for the population proportion is $\hat{p}=\frac{x}{n}=\frac{460}{500}=0.92$. The critical value for a $96\%$ confidence interval is $Z_{\alpha / 2}=2.05$. The margin of error is $Z_{\alpha / 2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}=2.05 \sqrt{\frac{0.92(1-0.92)}{500}}=0.025$. Therefore, the $96\%$ confidence interval for the true proportion of malware-infected files that the algorithm can detect is $0.92 \pm 0.025$ or $(0.895, 0.945)$. This means that we are $96\%$ confident that the true proportion of malware-infected files that the algorithm can detect is between $89.5\%$ and $94.5\%$.
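The proportion interval follows the same pattern as the interval for a mean, with $\sqrt{\hat{p}(1-\hat{p})/n}$ as the standard error. The sketch below assumes the normal approximation is appropriate for the sample; the function name `proportion_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_interval(x, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = x / n                                        # point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # Z_{alpha/2}
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# LED bulb example: 170 of 200 bulbs are energy-efficient, 90% confidence
low, high = proportion_interval(170, 200, 0.90)
print(round(low, 3), round(high, 3))                     # 0.808 0.892
```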
Example
A biologist is studying a population of frogs in a wetland to determine the proportion that are carriers of a specific gene mutation. Out of a random sample of $n=400$ frogs, $x=72$ are found to carry the mutation.