1

a

x = c(121, 105, 111, 119, 108, 101, 90, 131, 106, 112)
y = c(101, 110, 107, 98, 89, 103, 86, 117, 113, 87)
 
len_x = length(x)
len_y = length(y)
 
mean_x = mean(x)
mean_y = mean(y)
 
var_x = var(x)
var_y = var(y)
 
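# Welch-Satterthwaite approximation to the degrees of freedom
v = (var_x/len_x+var_y/len_y)^2 /
    ( (var_x/len_x)^2/(len_x-1) + ((var_y/len_y)^2/(len_y-1)) )
 
# Upper 2.5% critical value of the t distribution with v degrees of freedom
t_v = qt(0.05/2, df=v, lower.tail=FALSE)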
 
c(mean_x-mean_y-t_v*sqrt(var_x/len_x+var_y/len_y),
  mean_x-mean_y+t_v*sqrt(var_x/len_x+var_y/len_y))
 
# OR
t.test(x, y, mu=0, alternative="two.sided", conf.level=0.95)
[1] -1.2458 19.8458

	Welch Two Sample t-test

data:  x and y
t = 1.8529, df = 17.979, p-value = 0.08039
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.2458 19.8458
sample estimates:
mean of x mean of y 
    110.4     101.1

We use Welch’s t-test to obtain the following 95% CI for $\mu_x - \mu_y$: $(-1.2458,\ 19.8458)$

We do not reject $H_0: \mu_x - \mu_y = 0$ at the 5% level because 0 is in the CI, i.e. we do not have enough evidence to conclude that the two means differ
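
The link between the CI and the test decision can be checked directly. The sketch below reuses the data from this question and the standard `conf.int` and `p.value` components of the `t.test` result; the names `zero_in_ci` and `fail_to_reject` are illustrative only.

```r
# Data from question 1
x <- c(121, 105, 111, 119, 108, 101, 90, 131, 106, 112)
y <- c(101, 110, 107, 98, 89, 103, 86, 117, 113, 87)

# Welch two-sample t-test (var.equal = FALSE is the default)
res <- t.test(x, y, mu = 0, alternative = "two.sided", conf.level = 0.95)

# Two-sided test at level 0.05 and 95% CI lead to the same decision:
# 0 lies inside the CI exactly when the p-value exceeds 0.05
ci <- res$conf.int
zero_in_ci <- ci[1] <= 0 && 0 <= ci[2]
fail_to_reject <- res$p.value > 0.05

print(c(zero_in_ci = zero_in_ci, fail_to_reject = fail_to_reject))
```

Both flags should agree for any two-sided test whose level matches the confidence level of the interval.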

b

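# Test statistic for H0: mu_x - mu_y = 25 against H1: mu_x - mu_y > 25
t_val = (mean_x-mean_y-25) / sqrt(var_x/len_x+var_y/len_y)
# Upper-tail p-value, using the Welch degrees of freedom v from part a
p_val = pt(t_val, df=v, lower.tail=FALSE)
# Upper 5% critical value (rejection region: t_val > t_sta)
t_sta = qt(0.05, df=v, lower.tail=FALSE)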
 
print(c(t_val, p_val, t_sta))
 
# OR
t.test(x, y, mu=25, alternative="greater", conf.level=0.95)
[1] -3.1279976  0.9970909  1.7341734

	Welch Two Sample t-test

data:  x and y
t = -3.128, df = 17.979, p-value = 0.9971
alternative hypothesis: true difference in means is greater than 25
95 percent confidence interval:
 0.5958623       Inf
sample estimates:
mean of x mean of y 
    110.4     101.1

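The p-value is 0.9970909, which is not less than 0.05. Equivalently, the test statistic $-3.128$ is below the critical value $1.734$

We do not reject $H_0: \mu_x - \mu_y = 25$, i.e. we do not have enough evidence to conclude that the mean difference is greater than 25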
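
As a cross-check, the critical-value rule and the p-value rule can be compared on the `t.test` result itself. This is a minimal sketch that reuses the data from part a; `reject_by_crit` and `reject_by_p` are illustrative names.

```r
# Data from question 1
x <- c(121, 105, 111, 119, 108, 101, 90, 131, 106, 112)
y <- c(101, 110, 107, 98, 89, 103, 86, 117, 113, 87)

# One-sided Welch test of H0: mu_x - mu_y = 25 vs H1: mu_x - mu_y > 25
res <- t.test(x, y, mu = 25, alternative = "greater", conf.level = 0.95)

t_obs  <- unname(res$statistic)   # observed t statistic
df_w   <- unname(res$parameter)   # Welch-Satterthwaite degrees of freedom
t_crit <- qt(0.05, df = df_w, lower.tail = FALSE)

# The two decision rules are equivalent for an upper-tailed test
reject_by_crit <- t_obs > t_crit
reject_by_p    <- res$p.value < 0.05

print(c(reject_by_crit = reject_by_crit, reject_by_p = reject_by_p))
```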

2

a

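  1. Group 1: $X_{11}, \dots, X_{1n_1}$, iid samples from $N(\mu_1, \sigma^2)$
  2. Group 2: $X_{21}, \dots, X_{2n_2}$, iid samples from $N(\mu_2, \sigma^2)$
  3. Group 5: $X_{51}, \dots, X_{5n_5}$, iid samples from $N(\mu_5, \sigma^2)$

Groups 3 and 4 are defined in the same way. Assume samples from different groups are independent and $\sigma^2$ is unknown but equal across groups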

concrete_data = read.csv("concrete_data.csv")
 
x1 = concrete_data$X1
x2 = concrete_data$X2
x3 = concrete_data$X3
x4 = concrete_data$X4
x5 = concrete_data$X5
 
boxplot(concrete_data)
 
dat = data.frame(
  moisture=c(x1, x2, x3, x4, x5),
  aggregate=as.factor(
    c(rep(1,length(x1)),
      rep(2,length(x2)),
      rep(3,length(x3)),
      rep(4,length(x4)),
      rep(5,length(x5))
    )
  )
)
 
# Critical value of the F distribution at the 2% level
print(qf(0.02, df1=5-1, df2=30-5, lower.tail=FALSE))
 
res_aov = aov(moisture~aggregate, data=dat)
summary(res_aov)
[1] 3.549423
            Df Sum Sq Mean Sq F value  Pr(>F)   
aggregate    4  85356   21339   4.302 0.00875 **
Residuals   25 124020    4961                   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

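Since $F = 4.302 > F_{0.02,\,4,\,25} = 3.549$ (p-value $= 0.00875 < 0.02$), we reject $H_0: \mu_1 = \dots = \mu_5$, i.e. at least two of the means differ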
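
The entries of the ANOVA table can be rechecked by hand. The sketch below starts from the rounded sums of squares printed above (so the last digits may differ slightly from the `aov` output) and recomputes the F value, the p-value and the 2% critical value; all variable names are illustrative.

```r
# Sums of squares and degrees of freedom as printed in the ANOVA table
ss_treat <- 85356;  df_treat <- 4
ss_error <- 124020; df_error <- 25

# Mean squares are sums of squares divided by their degrees of freedom
ms_treat <- ss_treat / df_treat
ms_error <- ss_error / df_error

# F statistic, upper-tail p-value, and critical value at alpha = 0.02
f_value <- ms_treat / ms_error
p_value <- pf(f_value, df1 = df_treat, df2 = df_error, lower.tail = FALSE)
f_crit  <- qf(0.02, df1 = df_treat, df2 = df_error, lower.tail = FALSE)

print(c(f_value = f_value, p_value = p_value, f_crit = f_crit))
```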

b

TukeyHSD(res_aov, conf.level=0.98)
  Tukey multiple comparisons of means
    98% family-wise confidence level

Fit: aov(formula = moisture ~ aggregate, data = dat)

$aggregate
            diff         lwr        upr     p adj
2-1  -57.1666667 -193.158527  78.825194 0.6297485
3-1 -145.3333333 -281.325194  -9.341473 0.0116387
4-1  -41.1666667 -177.158527  94.825194 0.8472695
5-1    0.1666667 -135.825194 136.158527 1.0000000
3-2  -88.1666667 -224.158527  47.825194 0.2243248
4-2   16.0000000 -119.991861 151.991861 0.9946026
5-2   57.3333333  -78.658527 193.325194 0.6272414
4-3  104.1666667  -31.825194 240.158527 0.1088202
5-3  145.5000000    9.508139 281.491861 0.0115253
5-4   41.3333333  -94.658527 177.325194 0.8453941

The TukeyHSD output shows that for the pairs 3-1 and 5-3 the 98% CI does not contain 0 and the adjusted p-value is less than 0.02, so we have enough evidence to conclude that the means of these two pairs differ

The signs of the differences also show that the mean for aggregate 3 is lower than the means for aggregates 1 and 5
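
All ten intervals above have the same width, which is what a balanced design produces. As a rough check, the sketch below recomputes the common half-width from the studentized range quantile, assuming 6 observations per aggregate (30 observations over 5 groups; the group size is inferred from the output, not stated in the source) and using the rounded residual sum of squares from the ANOVA table.

```r
k <- 5                 # number of aggregates
n_per_group <- 6       # assumption: balanced design, 30 / 5 observations
df_error <- 25
ms_error <- 124020 / df_error   # residual mean square from the ANOVA table

# Tukey HSD half-width for equal group sizes: q * sqrt(MSE / n)
q_crit <- qtukey(0.98, nmeans = k, df = df_error)
half_width <- q_crit * sqrt(ms_error / n_per_group)

# A pair differs at the 2% family-wise level when |diff| exceeds the half-width
diffs <- c("3-1" = -145.3333333, "5-3" = 145.5, "4-3" = 104.1666667)
significant <- abs(diffs) > half_width

print(half_width)
print(significant)
```

Only the differences whose absolute value exceeds the half-width have intervals that exclude 0, which matches the pairs flagged in the output.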

c

oneway.test(moisture~aggregate, data=dat)
	One-way analysis of means (not assuming equal variances)

data:  moisture and aggregate
F = 5.4163, num df = 4.000, denom df = 12.372, p-value = 0.009433

Using Welch’s ANOVA, which does not assume equal variances, the p-value is 0.009433

3

k = 3
ni = 12
n = ni*k
 
ma = 32
mb = 40
mc = 30
m = (ma+mb+mc)/k
 
va = 145
vb = 138
vc = 150
 
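# Between-group (treatment) sum of squares and mean square
SSTreat = ni*(ma-m)^2 + ni*(mb-m)^2 + ni*(mc-m)^2
MSTreat = SSTreat/(k-1)
 
# Within-group (error) sum of squares, rebuilt from the sample variances
SSE = (ni-1)*va + (ni-1)*vb + (ni-1)*vc
MSE = SSE/(n-k)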
 
F_value = MSTreat/MSE
print(F_value)
 
# Critical value of F at the 5% level (despite its name, not the test statistic)
F_stat = qf(0.05, df1=k-1, df2=n-k, lower.tail=FALSE)
print(F_stat)
 
p_value = pf(F_value, df1=k-1, df2=n-k, lower.tail=FALSE)
print(p_value)
[1] 2.327945
[1] 3.284918
[1] 0.1133019

Since $F = 2.328 < 3.285$ (p-value $= 0.1133 > 0.05$), we do not reject $H_0: \mu_A = \mu_B = \mu_C$, i.e. we do not have enough evidence to conclude that the mean time to clear a mild asthmatic attack differs among the three steroids
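
The summary-statistic calculation can be sanity-checked against `aov`. The sketch below builds synthetic samples (not the original data, which is not given) that are rescaled to have exactly the stated group means and variances, then fits a one-way ANOVA. The helper `exact_sample` and the column names `time` and `steroid` are hypothetical.

```r
set.seed(1)

# Hypothetical helper: n values with exactly the requested sample mean and variance
exact_sample <- function(n, m, v) {
  z <- as.numeric(scale(rnorm(n)))  # sample mean 0, sample sd 1
  m + sqrt(v) * z
}

ni <- 12
# Synthetic data only; it reproduces the summary statistics of question 3
dat3 <- data.frame(
  time = c(exact_sample(ni, 32, 145),
           exact_sample(ni, 40, 138),
           exact_sample(ni, 30, 150)),
  steroid = factor(rep(c("A", "B", "C"), each = ni))
)

# The F value depends only on group sizes, means and variances,
# so it should match the hand calculation
fit <- aov(time ~ steroid, data = dat3)
f_from_aov <- summary(fit)[[1]][["F value"]][1]
print(f_from_aov)
```

Because the F statistic is a function of the group sizes, means and variances alone, any data set with these summaries gives the same F value and p-value.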