6 🧪 Hypothesis Testing Fundamentals
This chapter introduces the fundamental concepts of hypothesis testing: formulating null and alternative hypotheses, calculating p-values, and testing hypotheses about averages.
6.1 Learning Objectives
By the end of this chapter, you will be able to:
- Understand the concept of hypothesis testing
- Formulate null and alternative hypotheses
- Calculate and interpret p-values
- Perform hypothesis tests on averages
- Make statistical decisions based on test results
6.2 Introduction to Hypothesis Testing
Hypothesis testing is a statistical method used to make decisions about population parameters based on sample data. It involves:
- Formulating hypotheses: Stating a null hypothesis (H₀) and an alternative hypothesis (H₁)
- Collecting data: Gathering sample data relevant to the hypothesis
- Calculating test statistics: Computing appropriate test statistics
- Making decisions: Comparing the test statistic to a critical value, or the p-value to the significance level (α)
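The four steps above can be sketched in R as follows. This is a minimal illustration, not a prescribed workflow: the `scores` vector is invented for the example, and the hypothesized mean of 50 and α = 0.05 are assumptions chosen only to make the steps concrete.

```r
# Hypothetical sample of 12 measurements (values invented for illustration)
scores <- c(52, 48, 55, 51, 49, 53, 56, 50, 54, 47, 52, 55)

# Step 1: formulate hypotheses and fix the significance level in advance
#   H0: mu = 50   versus   H1: mu != 50
mu0   <- 50
alpha <- 0.05

# Steps 2 and 3: with the data in hand, let t.test() compute the statistic
result <- t.test(scores, mu = mu0, alternative = "two.sided")
t_stat <- unname(result$statistic)   # observed t
dof    <- unname(result$parameter)   # degrees of freedom (n - 1)

# Step 4a: compare |t| with the two-sided critical value
t_crit <- qt(1 - alpha / 2, df = dof)
reject_by_critical_value <- abs(t_stat) > t_crit

# Step 4b: compare the p-value with alpha
reject_by_p_value <- result$p.value < alpha

cat("t =", round(t_stat, 3), " critical value =", round(t_crit, 3), "\n")
cat("p-value =", round(result$p.value, 4), "\n")
cat("Reject H0?", reject_by_p_value, "\n")
```

The two decision rules in step 4 are equivalent for a given test, so they always lead to the same conclusion; which one you report is a matter of convention.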
6.3 Null and Alternative Hypotheses
6.3.1 Null Hypothesis (H₀)
The null hypothesis represents the status quo or the claim we want to test. It typically states that there is no effect, no difference, or no relationship.
Examples:
- H₀: μ = 50 (population mean equals 50)
- H₀: μ₁ = μ₂ (two population means are equal)
- H₀: ρ = 0 (no correlation between variables)
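Each of these example null hypotheses corresponds to a standard test in base R. The sketch below shows one plausible mapping; the simulated vectors `x` and `y` are assumptions made purely so the calls have data to run on.

```r
# Simulated data, for illustration only
set.seed(123)
x <- rnorm(30, mean = 50, sd = 5)
y <- rnorm(30, mean = 52, sd = 5)

# H0: mu = 50 (one-sample t-test against a specified value)
test_mean <- t.test(x, mu = 50)

# H0: mu1 = mu2 (two-sample t-test; the null difference in means is 0)
test_two_means <- t.test(x, y)

# H0: rho = 0 (test of no correlation between two variables)
test_correlation <- cor.test(x, y)

# Each result stores the value assumed under H0
print(test_mean$null.value)
print(test_two_means$null.value)
print(test_correlation$null.value)
```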
6.4 P-values and Statistical Significance
6.4.1 What is a P-value?
The p-value is the probability of observing a test statistic as extreme as, or more extreme than, the one calculated from the sample data, assuming the null hypothesis is true.
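To make this definition concrete, the sketch below computes p-values as tail areas of a t distribution. The observed statistic `t_obs = 2.1` and the 15 degrees of freedom are hypothetical values chosen for illustration, not results from any dataset in this chapter.

```r
# Hypothetical observed test statistic and degrees of freedom
t_obs <- 2.1
dof   <- 15

# H1: mu > mu0  ->  "more extreme" means a larger t (upper tail)
p_greater <- pt(t_obs, df = dof, lower.tail = FALSE)

# H1: mu < mu0  ->  "more extreme" means a smaller t (lower tail)
p_less <- pt(t_obs, df = dof)

# H1: mu != mu0 ->  "more extreme" means larger in absolute value (both tails)
p_two_sided <- 2 * pt(abs(t_obs), df = dof, lower.tail = FALSE)

cat("One-sided (greater):", p_greater, "\n")
cat("One-sided (less):   ", p_less, "\n")
cat("Two-sided:          ", p_two_sided, "\n")
```

What counts as "more extreme" depends on the alternative hypothesis, which is why the same test statistic can yield different p-values.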
6.5 Hypothesis Testing on Averages
6.5.1 One-Sample t-test
Used to test whether a population mean differs from a specified value.
Assumptions:
- Data is normally distributed (or large sample size)
- Observations are independent
- Random sampling
Test Statistic:
t = (x̄ - μ₀) / (s/√n)
Where:
- x̄ = sample mean
- μ₀ = hypothesized population mean
- s = sample standard deviation
- n = sample size
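As a quick check on the formula, the following sketch computes t by hand and compares it with the statistic reported by `t.test()`. The data vector `x` and the hypothesized mean `mu0 = 50` are invented for this example.

```r
# Hypothetical sample (values invented for illustration)
x   <- c(51.2, 49.8, 52.5, 50.9, 53.1, 48.7, 52.0, 51.6)
mu0 <- 50

# Ingredients of the formula
n     <- length(x)   # sample size
x_bar <- mean(x)     # sample mean
s     <- sd(x)       # sample standard deviation (n - 1 in the denominator)

# t = (x_bar - mu0) / (s / sqrt(n))
t_manual <- (x_bar - mu0) / (s / sqrt(n))

# The same statistic as computed by the built-in one-sample t-test
t_builtin <- unname(t.test(x, mu = mu0)$statistic)

cat("Manual t:  ", t_manual, "\n")
cat("Built-in t:", t_builtin, "\n")
```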
6.6 Practical Example
Let’s work through a practical example using R:
# t.test() is part of base R (stats package), so no extra packages are needed
# Example: Testing if a new teaching method improves test scores
# H₀: μ_new = μ_old (no difference in means)
# H₁: μ_new > μ_old (new method is better)
# Sample data
old_method <- c(65, 70, 68, 72, 69, 71, 67, 73, 70, 68)
new_method <- c(72, 75, 78, 74, 76, 79, 73, 77, 75, 74)
# Perform a one-sided two-sample t-test (Welch's test, the default in t.test())
t_test_result <- t.test(new_method, old_method, alternative = "greater")
print(t_test_result)
# Extract p-value
p_value <- t_test_result$p.value
cat("P-value:", p_value, "\n")
# Make decision
if (p_value < 0.05) {
  cat("Reject H₀: New method significantly improves scores\n")
} else {
  cat("Fail to reject H₀: No significant improvement\n")
}
6.7 Type I and Type II Errors
6.7.1 Type I Error (α)
- Definition: Rejecting H₀ when it’s actually true
- Probability: α (significance level, typically 0.05)
- Consequence: False positive
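One way to see what α means in practice is a small simulation in which H₀ is true by construction. The sketch below is illustrative only: the sample size, population parameters, and number of simulations are arbitrary assumptions.

```r
set.seed(42)

alpha  <- 0.05   # significance level
n_sims <- 5000   # number of simulated experiments (arbitrary)

# In every simulated experiment H0 is true: the population mean really is 50
p_values <- replicate(
  n_sims,
  t.test(rnorm(20, mean = 50, sd = 10), mu = 50)$p.value
)

# Proportion of experiments in which H0 is (wrongly) rejected
type1_rate <- mean(p_values < alpha)
cat("Estimated Type I error rate:", type1_rate, "\n")
```

Because every rejection here is a false positive, the estimated rate should land close to α, up to simulation noise.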
6.8 Best Practices
- State hypotheses clearly before collecting data
- Choose an appropriate significance level (usually α = 0.05)
- Check assumptions before performing tests
- Report effect sizes along with p-values
- Avoid p-hacking (don’t change hypotheses after seeing results)
- Consider multiple comparisons when testing many hypotheses
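For the last point, base R's `p.adjust()` offers several standard corrections. The sketch below applies two of them to a set of hypothetical p-values; the numbers are invented and do not come from any analysis in this chapter.

```r
# Hypothetical raw p-values from five separate tests
p_raw <- c(0.001, 0.012, 0.030, 0.045, 0.200)

# Bonferroni: multiply each p-value by the number of tests (capped at 1)
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Holm: a step-down procedure that is never less powerful than Bonferroni
p_holm <- p.adjust(p_raw, method = "holm")

# Compare how many tests remain significant at alpha = 0.05
alpha <- 0.05
comparison <- data.frame(
  raw        = p_raw,
  bonferroni = p_bonf,
  holm       = p_holm
)
print(comparison)
cat("Significant (raw):       ", sum(p_raw  < alpha), "\n")
cat("Significant (Bonferroni):", sum(p_bonf < alpha), "\n")
cat("Significant (Holm):      ", sum(p_holm < alpha), "\n")
```

Which correction is appropriate depends on the study design; these two control the family-wise error rate, while other methods available in `p.adjust()` target the false discovery rate instead.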
6.9 Summary
Hypothesis testing is a powerful statistical tool for making data-driven decisions. Key points to remember:
- Always formulate clear null and alternative hypotheses
- Understand what p-values do and do not represent
- Consider both statistical and practical significance
- Be aware of Type I and Type II errors
- Follow best practices to ensure valid results