So the problem is to investigate whether the observed difference in means is too large to be the result of random selection. The sample means certainly do look different, but what about the population means?
We will be asking ourselves:
is the difference in the population means great enough
that we can rule out chance? Note that we will investigate this claim using the available sample data.
I will be using ANOVA to test for equality of all means, and later perform
a post hoc analysis called Tukey HSD (Honestly Significant Difference) in R.
The one-way analysis of variance (ANOVA) is used to determine whether
there are any statistically significant differences between the means of
three or more independent (unrelated) groups.
The ANOVA test tells you whether you have an overall difference between your
groups, but it does not tell you which specific groups differed; post
hoc tests do. Because post hoc tests are run to confirm where the
differences occurred between groups, they should only be run when you
have shown an overall statistically significant difference in group
means.
ANOVA
Test for Equality of All Means
So our ANOVA procedure tests these hypotheses (H0 = null hypothesis, H1 =
alternative hypothesis):
H0: μ1 = μ2 = μ3 = ... = μn (all the population means are the same)
H1: at least one mean is different from the others
Let’s test these hypotheses at the α = 0.05
significance level.
One important thing to note is that the format of the data really matters when performing ANOVA in R: the data must be stacked, with one column holding the measured values and another holding the group labels.
Then we go ahead and perform the ANOVA test and view the result.
Figure: Original Data
Figure: Stacked Data
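Here is a minimal sketch of the stacking step and the ANOVA call. It uses made-up numbers, not the data shown in the figures: the data frame `wide`, its column names, and its values are all hypothetical. `stack()` and `aov()` are base R functions.

```r
# Hypothetical wide-format data: one column per group (NOT the post's data)
wide <- data.frame(
  A = c(12.1, 11.8, 12.5, 12.0),
  B = c(14.2, 13.9, 14.8, 14.1),
  C = c(11.0, 10.7, 11.4, 10.9)
)

# stack() returns two columns: `values` (the measurements) and
# `ind` (a factor naming the column each value came from)
stacked <- stack(wide)
names(stacked) <- c("value", "group")

# One-way ANOVA: does the mean of `value` differ across `group`?
fit <- aov(value ~ group, data = stacked)
summary(fit)
```

The formula `value ~ group` only works on the stacked layout, which is why the wide table has to be reshaped first.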
Observations:
- Note that the mean square between samples (63.43) is much larger than the mean square within the samples (2.74).
- The ratio of the between-groups mean square to the within-groups mean square is called the F statistic (F = 63.43/2.74 ≈ 23.15 in our case). It tells you how much more variability there is between the sample groups than within them.
Since the F value is large, we are more confident
in rejecting the null hypothesis, which was that all means are equal.
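To make the ratio concrete, the sketch below computes the two mean squares by hand and checks the resulting F against what `aov()` reports. The data are hypothetical (they are not the post's data), and the variable names are chosen only for illustration.

```r
# Hypothetical stacked data (NOT the post's data)
stacked <- data.frame(
  value = c(12.1, 11.8, 12.5, 12.0,
            14.2, 13.9, 14.8, 14.1,
            11.0, 10.7, 11.4, 10.9),
  group = factor(rep(c("A", "B", "C"), each = 4))
)

k <- nlevels(stacked$group)   # number of groups
n <- nrow(stacked)            # total number of observations

grand_mean  <- mean(stacked$value)
group_means <- tapply(stacked$value, stacked$group, mean)
group_sizes <- tapply(stacked$value, stacked$group, length)

# Between-groups mean square: spread of the group means around the grand mean
ms_between <- sum(group_sizes * (group_means - grand_mean)^2) / (k - 1)

# Within-groups mean square: spread of each observation around its own group mean
own_mean  <- group_means[as.character(stacked$group)]
ms_within <- sum((stacked$value - own_mean)^2) / (n - k)

f_stat <- ms_between / ms_within

# The F value reported in the ANOVA table, for comparison
f_aov <- summary(aov(value ~ group, data = stacked))[[1]][["F value"]][1]
all.equal(f_stat, f_aov)
```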
Conclusion:
The p-value, which is
0.0000000000000002 (very close to zero), is below our
significance level of 0.05. It would be very unlikely to see differences this large
among our sample means if there were no real differences among the population means.
Therefore we reject H0 in favor of
H1, concluding that the means are not all the same.
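The decision rule can be sketched in R as follows. The p-value is the upper-tail area of the F distribution beyond the observed F statistic, which `pf()` gives directly. The data here are hypothetical, so the numbers will not match the ones quoted above.

```r
# Hypothetical stacked data (NOT the post's data)
stacked <- data.frame(
  value = c(12.1, 11.8, 12.5, 12.0,
            14.2, 13.9, 14.8, 14.1,
            11.0, 10.7, 11.4, 10.9),
  group = factor(rep(c("A", "B", "C"), each = 4))
)

alpha <- 0.05   # significance level chosen before looking at the data

# ANOVA table as a data frame: row 1 is the group term, row 2 the residuals
tab <- summary(aov(value ~ group, data = stacked))[[1]]

f_stat <- tab[["F value"]][1]
df1    <- tab[["Df"]][1]   # between-groups degrees of freedom
df2    <- tab[["Df"]][2]   # within-groups (residual) degrees of freedom

# Upper-tail probability of the F distribution: same value as tab[["Pr(>F)"]][1]
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)

if (p_value < alpha) {
  message("Reject H0: the means are not all equal")
} else {
  message("Fail to reject H0: no evidence that the means differ")
}
```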
Since our ANOVA test shows that the means aren’t all equal,
our next step is to determine which means are
different, again at the α = 0.05 level of significance. To do so I performed a post hoc analysis called Tukey's HSD, using the agricolae package.
Samples sharing the same letter are not significantly different at the chosen level
(5% by default), while those with different letters are significantly different.
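A sketch of the post hoc step, again on hypothetical data rather than the data analysed above. Base R's `TukeyHSD()` reports every pairwise difference with an adjusted p-value. `HSD.test()` from agricolae adds the letter groupings described here. The agricolae call is guarded because it assumes the package is installed.

```r
# Hypothetical stacked data (NOT the post's data)
stacked <- data.frame(
  value = c(12.1, 11.8, 12.5, 12.0,
            14.2, 13.9, 14.8, 14.1,
            11.0, 10.7, 11.4, 10.9),
  group = factor(rep(c("A", "B", "C"), each = 4))
)

fit <- aov(value ~ group, data = stacked)

# Base R: pairwise differences, confidence intervals and adjusted p-values
tukey <- TukeyHSD(fit, conf.level = 0.95)
print(tukey)

# agricolae (only if installed): group means labelled with letters;
# groups that share a letter are not significantly different
if (requireNamespace("agricolae", quietly = TRUE)) {
  hsd <- agricolae::HSD.test(fit, "group", group = TRUE)
  print(hsd$groups)
}
```

With three groups there are three pairwise comparisons, one row each in the `TukeyHSD()` output.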