Confidence Interval
with Normal Distribution
Hypothesis Testing
Two-sample z-test with known variance
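Reject $H_0: \mu_1 = \mu_2$ at significance level $\alpha$ if $|Z| > z_{\alpha/2}$ (two-sided), where $Z = \dfrac{\bar{X} - \bar{Y}}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$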
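Base R has no built-in two-sample z-test, so a minimal sketch of the rule is given here. The helper name `two_sample_z_test`, the data, and the "known" standard deviations are all made up for illustration.

```r
# Hypothetical helper (not a base R function): two-sided two-sample z-test
# with known population standard deviations sigma_x and sigma_y.
two_sample_z_test <- function(x, y, sigma_x, sigma_y, alpha = 0.05) {
  # Standard error of the difference in sample means, using the known sigmas
  se <- sqrt(sigma_x^2 / length(x) + sigma_y^2 / length(y))
  z <- (mean(x) - mean(y)) / se
  # Two-sided p-value from the standard normal distribution
  p_value <- 2 * pnorm(-abs(z))
  list(statistic = z, p.value = p_value,
       reject = abs(z) > qnorm(1 - alpha / 2))
}

# Made-up data; sigma = 0.5 is assumed known for both groups
x <- c(5.1, 4.9, 6.2, 5.8, 5.5)
y <- c(4.2, 4.8, 4.5, 5.0, 4.1)
res_z <- two_sample_z_test(x, y, sigma_x = 0.5, sigma_y = 0.5)
res_z
```

Rejecting when $|Z|$ exceeds the critical value is the same decision as rejecting when the two-sided p-value is below $\alpha$.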
Two-sample t-test with unknown equal variance
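Reject $H_0: \mu_1 = \mu_2$ at significance level $\alpha$, using $T = \dfrac{\bar{X} - \bar{Y}}{S_p\sqrt{1/n_1 + 1/n_2}}$ with pooled variance $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$, if
- Two-sided test: $|T| > t_{\alpha/2,\, n_1 + n_2 - 2}$
- One-sided greater test: $T > t_{\alpha,\, n_1 + n_2 - 2}$
- One-sided less test: $T < -t_{\alpha,\, n_1 + n_2 - 2}$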
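The three alternatives can be tried with `t.test()` by setting `var.equal = TRUE`. This is a sketch on made-up data; it also recomputes the pooled statistic by hand so the link to the formula is visible.

```r
# Made-up illustrative samples
x <- c(5.1, 4.9, 6.2, 5.8, 5.5)
y <- c(4.2, 4.8, 4.5, 5.0, 4.1)

# Pooled (equal-variance) t-test under each alternative hypothesis
two_sided <- t.test(x, y, var.equal = TRUE, alternative = "two.sided")
greater   <- t.test(x, y, var.equal = TRUE, alternative = "greater")
less      <- t.test(x, y, var.equal = TRUE, alternative = "less")

# Same statistic by hand: pooled variance with n1 + n2 - 2 degrees of freedom
n1 <- length(x)
n2 <- length(y)
sp2 <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2)
t_manual <- (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))

c(manual = t_manual, from_t_test = unname(two_sided$statistic))
```

All three calls share the same statistic and degrees of freedom; only the rejection region, and therefore the p-value, changes.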
Two-sample t-test with unknown unequal variance (Welch’s t-test)
Use Welch’s t-test by default if no information about the variances is given
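In R, `t.test()` runs Welch’s test unless `var.equal = TRUE` is set. The sketch below uses made-up data and recomputes the statistic and the Welch-Satterthwaite degrees of freedom by hand for comparison.

```r
# Made-up samples with visibly different spreads
x <- c(10, 12, 11, 13)
y <- c(14, 19, 13, 20, 16)

# Default call: var.equal = FALSE, i.e. Welch's t-test
welch <- t.test(x, y)

# Per-group squared standard errors
vx <- var(x) / length(x)
vy <- var(y) / length(y)

# Welch statistic and Welch-Satterthwaite degrees of freedom by hand
t_manual  <- (mean(x) - mean(y)) / sqrt(vx + vy)
df_manual <- (vx + vy)^2 /
  (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

c(t = t_manual, df = df_manual)
```

The degrees of freedom are generally not an integer, which is one quick way to recognise Welch output.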
Power t-test (optional)
# n is left out, so power.t.test() solves for the required sample size per group
power.t.test(delta = 2.5, sd = 2, sig.level = 0.05, power = 0.80, type = "two.sample", alternative = "two.sided")
ANOVA
- Compare two or more groups of data/samples
- A factor is a categorical variable denoting the different groups the data come from
- The possible values of a factor are called levels
- The main numerical variable of the sample values is called the dependent variable
- If there is only 1 factor, one-way ANOVA
- If there are multiple factors, multi-way ANOVA (optional for this course)
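To make the vocabulary concrete, here is a small sketch in R. The data frame `toy` and its values are made up; the column names `time` and `treatment` simply mirror the names used in the Tukey example in these notes.

```r
# Made-up data: 'treatment' is the factor, 'time' is the dependent variable
toy <- data.frame(
  time = c(4.1, 3.8, 5.2, 5.0, 6.3, 6.1),
  treatment = factor(c("A", "A", "B", "B", "C", "C"))
)

levels(toy$treatment)    # the levels of the factor
nlevels(toy$treatment)   # number of groups being compared

# Mean of the dependent variable within each level
group_means <- tapply(toy$time, toy$treatment, mean)
group_means
```

With a single factor such as `treatment`, the model `time ~ treatment` is a one-way ANOVA.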
Median
- Order the data points from small to large
- If n is odd, then the median is the $\frac{n+1}{2}$-th data point
- If n is even, then the median is the average of the $\frac{n}{2}$-th and $(\frac{n}{2} + 1)$-th data points
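The odd/even rule can be sketched directly in R, taking the middle position $(n+1)/2$ for odd n and the two middle positions $n/2$ and $n/2 + 1$ for even n. The function name `median_by_rule` is made up; base R's `median()` is used only as a cross-check.

```r
# Hypothetical helper implementing the odd/even median rule
median_by_rule <- function(x) {
  s <- sort(x)          # order the data points from small to large
  n <- length(s)
  if (n %% 2 == 1) {
    s[(n + 1) / 2]                     # odd n: the middle data point
  } else {
    (s[n / 2] + s[n / 2 + 1]) / 2      # even n: average of the two middle points
  }
}

odd_sample  <- c(7, 1, 5)
even_sample <- c(7, 1, 5, 3)

median_by_rule(odd_sample)
median_by_rule(even_sample)
```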
Quantile
- Order the data points from small to large and calculate $k = p(n + 1)$ for the $p$-quantile
- If k is an integer, the k-th data point is the quantile
- If k is not an integer, the quantile is the average of the $\lfloor k \rfloor$-th and $(\lfloor k \rfloor + 1)$-th data points, where $\lfloor k \rfloor$ is the largest integer smaller than k.
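A minimal sketch of this rule follows. It assumes the position is $k = p(n + 1)$, which is the choice that reproduces the median rule at $p = 0.5$; if the course defines k differently, only that one line changes. The function name `quantile_by_rule` and the clamping of k to the range 1..n are also assumptions added here.

```r
# Hypothetical helper: quantile by the "k-th data point" rule
quantile_by_rule <- function(x, p) {
  s <- sort(x)
  n <- length(s)
  k <- p * (n + 1)            # ASSUMPTION: position formula k = p(n + 1)
  k <- min(max(k, 1), n)      # ASSUMPTION: clamp so the index stays valid
  if (abs(k - round(k)) < 1e-8) {
    s[round(k)]               # k is an integer: take the k-th data point
  } else {
    lo <- floor(k)            # largest integer smaller than k
    (s[lo] + s[lo + 1]) / 2   # average of the two neighbouring data points
  }
}

quantile_by_rule(1:7, 0.25)
quantile_by_rule(c(7, 1, 5, 3), 0.5)
```

Base R's `quantile()` interpolates linearly by default, so its answers can differ from this averaging rule.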
Boxplot
- Points outside $[Q_1 - 1.5 \cdot IQR,\ Q_3 + 1.5 \cdot IQR]$ are potential outliers, where $IQR = Q_3 - Q_1$
- LEFT: Smallest value greater than lower quartile minus 1.5 times IQR
- RIGHT: Largest value less than upper quartile plus 1.5 times IQR
- Think of it as shrinking the 2 bars in order to fit to a data point
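The whisker rule can be sketched as follows on made-up data. Quartiles are taken from `quantile()` with its default method, which is an assumption; `boxplot()` itself uses hinges, so its whiskers may differ slightly on some data sets.

```r
# Made-up data with one obviously extreme value
x <- c(2, 4, 4, 5, 6, 7, 8, 9, 30)

q1  <- quantile(x, 0.25, names = FALSE)   # lower quartile
q3  <- quantile(x, 0.75, names = FALSE)   # upper quartile
iqr <- q3 - q1

lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr

# Whiskers "shrink" from the fences to the nearest actual data point
left_whisker  <- min(x[x >= lower_fence])
right_whisker <- max(x[x <= upper_fence])

# Anything beyond the fences is flagged as a potential outlier
outliers <- x[x < lower_fence | x > upper_fence]

c(left = left_whisker, right = right_whisker)
outliers
```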
One-way ANOVA
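- Group 1: $X_{11}, \dots, X_{1 n_1}$, iid samples from $N(\mu_1, \sigma_1^2)$
- Group k: $X_{k1}, \dots, X_{k n_k}$, iid samples from $N(\mu_k, \sigma_k^2)$
- Assume samples from different groups are independent
- Assume variance unknown but equal, i.e. $\sigma_1^2 = \dots = \sigma_k^2 = \sigma^2$
$X_{ij} = \mu_i + \varepsilon_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma^2)$ is noise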
Types of Variance
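- Total variance, sum of squares total: $SS_{Total} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X})^2$
$\bar{X}$ is the mean of all the sample data
- Between (group) variance, sum of squares treatment: $SS_{Treat} = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X})^2$, where $\bar{X}_i$ is the mean of group $i$
- Within (group) variance, sum of squares error: $SSE = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2$
- Relation: $SS_{Total} = SS_{Treat} + SSE$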
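The decomposition (total sum of squares = treatment sum of squares + error sum of squares) can be checked numerically. This is a sketch on a made-up data set with three groups of three observations; all variable names are illustrative.

```r
# Made-up data: 3 groups, 3 observations each
dat <- data.frame(
  value = c(10, 12, 11, 14, 15, 13, 20, 19, 21),
  group = factor(rep(c("A", "B", "C"), each = 3))
)

grand_mean  <- mean(dat$value)                        # mean of all sample data
group_means <- tapply(dat$value, dat$group, mean)     # one mean per group
group_sizes <- tapply(dat$value, dat$group, length)   # n_i per group

# ave() returns, for each observation, the mean of its own group
own_group_mean <- ave(dat$value, dat$group)

ss_total <- sum((dat$value - grand_mean)^2)
ss_treat <- sum(group_sizes * (group_means - grand_mean)^2)
ss_error <- sum((dat$value - own_group_mean)^2)

c(total = ss_total, treat = ss_treat, error = ss_error)
```

For this data the three sums are 132, 126 and 6, so the between-group part dominates.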
F Statistics
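If the null hypothesis $H_0: \mu_1 = \dots = \mu_k$ is true, the between-group sum of squares $SS_{Treat}$ should be close to 0 relative to $SSE$
A large $SS_{Treat}$ supports $H_1$
$F = \dfrac{MS_{Treat}}{MSE}$
$MS_{Treat} = SS_{Treat} / (k - 1)$ = Mean square treatment
$MSE = SSE / (n - k)$ = Mean square error
The error degrees of freedom are $n - k$ because each group contributes $n_i - 1$ degrees of freedom; summed over the $k$ groups this gives $n - k$, where $n$ is the total sample size
Reject $H_0$ if F is large
Under $H_0$, $F \sim F_{k-1,\, n-k}$
Reject $H_0$ if $F > F_{\alpha,\, k-1,\, n-k}$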
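As a sketch, the F statistic can be computed by hand and compared with R's ANOVA table. The data are made up (three groups of three), and the critical value uses $\alpha = 0.05$ as an example.

```r
# Made-up data: k = 3 groups, n = 9 observations in total
dat <- data.frame(
  value = c(10, 12, 11, 14, 15, 13, 20, 19, 21),
  group = factor(rep(c("A", "B", "C"), each = 3))
)
n <- nrow(dat)
k <- nlevels(dat$group)

# Between-group and within-group sums of squares
group_means <- ave(dat$value, dat$group)              # each row's group mean
ss_treat <- sum((group_means - mean(dat$value))^2)    # sums n_i * (group mean - grand mean)^2
ss_error <- sum((dat$value - group_means)^2)

ms_treat <- ss_treat / (k - 1)    # mean square treatment
mse      <- ss_error / (n - k)    # mean square error
f_stat   <- ms_treat / mse

f_crit  <- qf(0.95, df1 = k - 1, df2 = n - k)          # critical value at alpha = 0.05
p_value <- pf(f_stat, k - 1, n - k, lower.tail = FALSE)

# Cross-check against R's own ANOVA table
anova_tab <- anova(lm(value ~ group, data = dat))
c(by_hand = f_stat, from_anova = anova_tab[["F value"]][1])
```

Here the hand-computed F far exceeds the critical value, so the null hypothesis of equal group means would be rejected for this toy data.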
Relation to t-test
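If $k = 2$, we can apply both the two-sided two-sample t-test and ANOVA
- The equal-variance t-test and one-way ANOVA are equivalent if $k = 2$: $F = T^2$ and the p-values are identical
- Welch’s t-test is equivalent to Welch’s ANOVA at $k = 2$, again with $F = T^2$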
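Both equivalences can be checked numerically with `t.test()` and `oneway.test()`, where `var.equal` switches between the classical and Welch versions. The two-group data below are made up.

```r
# Made-up data with two groups of unequal size and spread
value <- c(10, 12, 11, 13, 14, 19, 13, 20, 16)
group <- factor(rep(c("A", "B"), times = c(4, 5)))

# Equal-variance versions: pooled t-test vs classical one-way ANOVA
t_pooled <- t.test(value ~ group, var.equal = TRUE)
f_pooled <- oneway.test(value ~ group, var.equal = TRUE)

# Unequal-variance versions: Welch's t-test vs Welch's ANOVA (the defaults)
t_welch <- t.test(value ~ group)
f_welch <- oneway.test(value ~ group)

# In both cases the F statistic is the square of the t statistic
c(t_squared = unname(t_pooled$statistic)^2, F = unname(f_pooled$statistic))
c(t_squared = unname(t_welch$statistic)^2,  F = unname(f_welch$statistic))
```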
Tukey’s Honestly Significant Difference (HSD)
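# Fit the one-way ANOVA first; TukeyHSD() takes the fitted aov object
res_aov <- aov(time ~ treatment, data = rat_poison)
# Pairwise differences between treatment levels with adjusted p-values and confidence intervals
TukeyHSD(res_aov)
# Plot the intervals; an interval that excludes 0 indicates a significant pairwise difference
plot(TukeyHSD(res_aov))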