# 19 Analysis of Variance

[latex]\newcommand{\pr}[1]{P(#1)} \newcommand{\var}[1]{\mbox{var}(#1)} \newcommand{\mean}[1]{\mbox{E}(#1)} \newcommand{\sd}[1]{\mbox{sd}(#1)} \newcommand{\Binomial}[3]{#1 \sim \mbox{Binomial}(#2,#3)} \newcommand{\Student}[2]{#1 \sim \mbox{Student}(#2)} \newcommand{\Normal}[3]{#1 \sim \mbox{Normal}(#2,#3)} \newcommand{\Poisson}[2]{#1 \sim \mbox{Poisson}(#2)} \newcommand{\se}[1]{\mbox{se}(#1)} \newcommand{\prbig}[1]{P\left(#1\right)} \newcommand{\degc}{$^{\circ}$C}[/latex]

# Analysis of Variance for Regression

In Chapter 18 we saw two ways of assessing whether there is a linear relationship between two variables. The correlation coefficient, [latex]r[/latex], gives a summary measure of the strength and direction of a linear association and we can use its sampling distribution to see if it is significantly different from a population correlation of [latex]\rho = 0[/latex]. Least-squares regression gives a sample estimate, [latex]b_1[/latex], of the slope of a linear relationship and again we can use its sampling distribution to see if it is significantly different from a population slope of [latex]\beta_1 = 0[/latex].

An alternative is based on looking at the variability in the response. At the very beginning we emphasised the role of data analysis in describing and explaining variability. The method described in this chapter uses this idea literally to make inferences from data.

The figure below shows a scatter plot of basal oxytocin level from the data in the oxytocin example. The horizontal axis is age but here we are just modelling the response with the mean oxytocin level, 4.6 pg/mL. The oxytocin levels cover a range of about 1.5 pg/mL and the sample standard deviation from the mean line is [latex]s[/latex] = 0.3560 pg/mL.

In Chapter 18 we saw that there was a significant linear relationship between basal oxytocin level and age. One way of thinking about this is that age explains some of the variability we see in the oxytocin levels. The figure below shows the same scatter plot with the least-squares line added. Visually there does not seem to be as much variability in the oxytocin level around this line as there was around the mean line in the previous figure.

The following figure shows the variability of the **residual** deviations about the line. The mean of the residuals is always 0; they have a range of about 1.12 pg/mL and a standard deviation from the mean of [latex]s[/latex] = 0.3049 pg/mL. (We will actually use the residual standard deviation, [latex]s_U[/latex] = 0.3117 pg/mL, to measure the residual variability. This has the same sum of squared deviations but different degrees of freedom, as discussed in Chapter 18.)

## Breaking Up Variance

To measure how much the variability has been reduced we first split the sample variance, the square of the sample standard deviation, into two components. Recall that the sample variance is

\[ s^2 = \frac{\sum (x_i - \overline{x})^2}{n-1},\]

the sum of squared deviations from the sample mean divided by the degrees of freedom. For regression we also saw the residual variance

\[ s_{U}^2 = \frac{\sum (y_i - \hat{y}_i)^2}{n-2},\]

another sum of squared deviations divided by its degrees of freedom. For analysis of variance we treat these components separately. We will denote a sum of squared deviations by ‘SS’ and degrees of freedom by ‘DF’.

For basal oxytocin alone we saw [latex]s = 0.3560[/latex] from [latex]n = 24[/latex] observations. We can rearrange the formula for standard deviation to find

\[ \sum (x_i - \overline{x})^2 = (n-1)s^2 = 23 \times 0.3560^2 = 2.9149 \]

This is just a trick for finding the sum of squares when all you have is a calculator with a standard deviation button. In practice a computer can simply work out the sum of squared deviations for us. We call this the **total** sum of squared deviations and write

\[ \mbox{SST} = 2.9149, \; \mbox{DFT} = 23. \]

Once we have fitted the line we have the **residual** variability measured by [latex]s_U = 0.3117[/latex]. We can similarly split this into two components with

\[ \mbox{SSR} = 2.1375, \; \mbox{DFR} = 22. \]

Note that residual variability is often referred to as **error** variability.
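These sum-of-squares calculations are easy to reproduce. The short Python sketch below recovers SST and SSR from the standard deviations quoted above for the [latex]n = 24[/latex] oxytocin observations:

```python
n = 24
s = 0.3560      # sample standard deviation of oxytocin level
s_u = 0.3117    # residual standard deviation about the least-squares line

sst = (n - 1) * s**2    # total sum of squares, 23 df
ssr = (n - 2) * s_u**2  # residual sum of squares, 22 df
print(round(sst, 4), round(ssr, 4))
```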

The sum of squares and degrees of freedom that have disappeared between the total and the residual variability are attributed to the presence of the linear relationship. These can be calculated by difference so that

\[ \mbox{SSL} = \mbox{SST} - \mbox{SSR} = 2.9149 - 2.1375 = 0.7774 \]

and [latex]\mbox{DFL} = \mbox{DFT} - \mbox{DFR} = 1[/latex]. The ‘L’ stands for ‘line’. In this context you can think of DFL as the number of extra variables we are using to describe the response. Originally we just used the sample mean and so we had [latex]n-1[/latex] degrees of freedom but then we added age and went down to [latex]n-2[/latex], so DFL = 1. If we had tried explaining oxytocin using age and weight, for example, the residual degrees of freedom would have been [latex]n-3[/latex] instead and DFL = 2.

The above numbers suggest that some of the total sum of squared deviations seems to be related to the line. This is captured by the **coefficient of determination**

\[ R^2 = \frac{\mbox{SSL}}{\mbox{SST}} = \frac{0.7774}{2.9149} = 0.2667 \]

Thus around 27% of the total variability has been explained by the least-squares line. This is not very high since the relationship between oxytocin level and age is rather weak, but it does seem to be more than 0%. It is no coincidence that we have used [latex]R[/latex] here, the capital of the same letter we used for the correlation coefficient. As we saw in Chapter 18, the correlation for this relationship is [latex]r = -0.5165[/latex] and indeed [latex](-0.5165)^2 = 0.2667[/latex]. So correlation, least-squares regression, and analysis of variance are all interrelated, different ways of viewing the same idea of linear modelling.
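The connection between [latex]R^2[/latex] and the squared correlation can be checked numerically; a minimal sketch using the sums of squares above (the values agree up to rounding of the quoted figures):

```python
sst = 2.9149             # total sum of squares
ssl = sst - 2.1375       # sum of squares for the line
r = -0.5165              # correlation coefficient from Chapter 18

r_squared = ssl / sst    # coefficient of determination
# SSL/SST and r**2 agree to three decimal places
print(round(r_squared, 4), round(r**2, 4))
```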

## The F statistic

We have seen then that the variability in oxytocin level has been reduced by incorporating age. The question that should be obvious by now is whether the reduction we observed could simply be due to sampling variability. To decide this we would like to carry out a test of significance. If there was no relationship between oxytocin and age ([latex]H_0[/latex]) then we would expect the variability explained by the line to be about the same as the natural variability (measured by the residuals). To compare these we look at the **mean sum of squares**, MS = SS/DF. That is,

\[ \mbox{MSL} = \frac{\mbox{SSL}}{\mbox{DFL}} = \frac{0.7774}{1} = 0.7774 \]

and

\[ \mbox{MSR} = \frac{\mbox{SSR}}{\mbox{DFR}} = \frac{2.1375}{22} = 0.0972.\]

Note that the **residual mean square**, MSR, is simply the residual variance, [latex]s_{U}^{2} = 0.3117^2 = 0.0972[/latex]. Similarly MST, which we don’t use here, is the sample variance [latex]s^2 = 0.3560^2 = 0.1267[/latex]. Using mean sums of squares to measure variability is an idea we have been following from the beginning.

If the slope of the line was 0 then MSL would also be 0 since the residual variability would be exactly the same as the total variability. However, because of sampling variability we would not actually get a sample value for MSL of 0. Instead we expect it to have variability of the order of the estimated residual variability, MSR. Thus if there was no relationship between the variables then we would expect MSL and MSR to be similar and so we would expect the ratio

\[ F = \frac{\mbox{MSL}}{\mbox{MSR}} \]

to be close to 1. Here we find [latex]F = 8.00[/latex], implying that the variability explained by the line is about 8 times the underlying variability. How likely is a value like this to occur by chance if there was really no association?
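The mean squares and [latex]F[/latex] statistic follow directly from the sums of squares and degrees of freedom found earlier:

```python
ssl, dfl = 0.7774, 1     # line
ssr, dfr = 2.1375, 22    # residual

msl = ssl / dfl          # mean square for the line
msr = ssr / dfr          # residual mean square (the residual variance)
f_stat = msl / msr
print(round(msl, 4), round(msr, 4), round(f_stat, 2))
```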

One way to answer this is with a randomisation test. Here the null hypothesis says there is no association between the observed oxytocin levels and ages, so that the pairing of levels with ages is essentially arbitrary. We can therefore see how unusual our [latex]F[/latex] value of 8.00 is by randomly re-ordering the age values and recalculating [latex]F[/latex] each time. The figure below shows the distribution of the [latex]F[/latex] values obtained from 10000 randomisations. Since [latex]F[/latex] can never be negative this is a skewed distribution. It is clear that 8.00 is a pretty unusual value: only 88/10000 = 0.0088 of the random allocations gave values as large as 8.00. Thus the estimated [latex]P[/latex]-value is 0.0088, strong evidence to suggest that oxytocin level is related to age.
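A sketch of this randomisation test in Python, using illustrative synthetic data in place of the original oxytocin measurements (which are not reproduced here), so the numerical output will differ from the chapter's:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative synthetic data standing in for the oxytocin/age sample
n = 24
age = rng.uniform(20.0, 40.0, n)
oxytocin = 5.5 - 0.03 * age + rng.normal(0.0, 0.3, n)

def f_statistic(x, y):
    """F statistic for simple linear regression.

    With a single predictor, R^2 = r^2, so
    F = MSL/MSR = (SSL/1) / (SSR/(n-2)) = (n-2) r^2 / (1 - r^2).
    """
    r = np.corrcoef(x, y)[0, 1]
    return (len(y) - 2) * r**2 / (1 - r**2)

f_obs = f_statistic(age, oxytocin)

# Randomisation test: re-pair the ages with the oxytocin levels at random
# and count how often the shuffled F is at least as large as the observed one.
n_perm = 10000
count = sum(f_statistic(rng.permutation(age), oxytocin) >= f_obs
            for _ in range(n_perm))
p_value = count / n_perm
print(f"F = {f_obs:.2f}, randomisation P-value = {p_value:.4f}")
```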

# F Distribution

An alternative to randomisation is to use the theoretical sampling distribution of this [latex]F[/latex] statistic under the assumption that the null hypothesis of no association is true.

This distribution is called the **F distribution**. There are two degrees of freedom involved with this distribution, one for the numerator (DFL) and one for the denominator (DFR). The figure below shows the [latex]F(1,22)[/latex] distribution, the distribution that our statistic, [latex]F_{1,22} = 8.00[/latex], would have if the null hypothesis of no association were true. The probability from this distribution is [latex]\pr{F_{1,22} \ge 8.00} = 0.0098[/latex], close to the estimated value from the randomisation test.
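This tail probability can be computed directly; a sketch assuming scipy is available (any statistics package has an equivalent function):

```python
from scipy.stats import f

# Upper-tail probability of the F(1, 22) distribution at the observed statistic
p_value = f.sf(8.00, 1, 22)
print(round(p_value, 4))
```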

As with correlation, a test based on the [latex]F[/latex] statistic doesn’t really tell us much about the nature of the association, such as the magnitude of the slope. However, analysis of variance is important because it provides a very general method that can be used in a range of settings. Ronald A. Fisher developed these tools in the early part of the twentieth century, using the term “analysis of variance” in 1918 (Fisher, 1918). He was later knighted for his work and the letter [latex]F[/latex] is used in his honour. The following sections will show some of the other applications of ANOVA.

The [latex]F[/latex] distribution is another continuous distribution like the Normal and the [latex]t[/latex] distributions, though it is the first skewed density curve we have seen. You can imagine having tables for the [latex]F[/latex] distribution but there will be a lot of these since you get different probabilities for each combination of the numerator and denominator degrees of freedom. As an example, the table below shows probabilities for the [latex]F(1,22)[/latex] distribution that we have used in this section (and will also use later in this chapter). You can see from this table that the [latex]P[/latex]-value from [latex]F = 8.00[/latex] is approximately 0.010.

## [latex]F(1,22)[/latex] distribution

The rows give the integer part of [latex]f[/latex] and the columns give the first decimal place of [latex]f[/latex].

[latex]f[/latex] | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
---|---|---|---|---|---|---|---|---|---|---
0 | 1.000 | 0.755 | 0.659 | 0.589 | 0.534 | 0.487 | 0.447 | 0.412 | 0.381 | 0.353
1 | 0.328 | 0.306 | 0.285 | 0.266 | 0.249 | 0.234 | 0.219 | 0.206 | 0.193 | 0.182
2 | 0.171 | 0.161 | 0.152 | 0.144 | 0.136 | 0.128 | 0.121 | 0.115 | 0.108 | 0.103
3 | 0.097 | 0.092 | 0.087 | 0.083 | 0.079 | 0.075 | 0.071 | 0.067 | 0.064 | 0.061
4 | 0.058 | 0.055 | 0.053 | 0.050 | 0.048 | 0.045 | 0.043 | 0.041 | 0.039 | 0.038
5 | 0.036 | 0.034 | 0.033 | 0.031 | 0.030 | 0.028 | 0.027 | 0.026 | 0.025 | 0.024
6 | 0.023 | 0.022 | 0.021 | 0.020 | 0.019 | 0.018 | 0.018 | 0.017 | 0.016 | 0.015
7 | 0.015 | 0.014 | 0.014 | 0.013 | 0.012 | 0.012 | 0.012 | 0.011 | 0.011 | 0.010
8 | 0.010 | 0.009 | 0.009 | 0.009 | 0.008 | 0.008 | 0.008 | 0.007 | 0.007 | 0.007
9 | 0.007 | 0.006 | 0.006 | 0.006 | 0.006 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005
10 | 0.005 | 0.004 | 0.004 | 0.004 | 0.004 | 0.004 | 0.004 | 0.003 | 0.003 | 0.003
11 | 0.003 | 0.003 | 0.003 | 0.003 | 0.003 | 0.003 | 0.003 | 0.002 | 0.002 | 0.002
12 | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 | 0.002
13 | 0.002 | 0.002 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001
14 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001
15 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001
16 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | | |

This table gives [latex]\pr{F_{1,22} \ge f}[/latex].

Below is a table of [latex]F(1,d)[/latex] critical values where each line corresponds to a different denominator degrees of freedom, [latex]d[/latex], identical in structure to the table of critical values for the [latex]t[/latex] distribution. With [latex]d = 22[/latex] the best we can get for the [latex]P[/latex]-value from [latex]F = 8.00[/latex] is that it is somewhere between 0.005 and 0.01, though very close to 0.01.

## [latex]F(1,d)[/latex] distribution

The columns give the upper-tail probability [latex]p[/latex].

[latex]d[/latex] | 0.25 | 0.10 | 0.05 | 0.025 | 0.01 | 0.005 | 0.001 | 0.0005 | 0.0001
---|---|---|---|---|---|---|---|---|---
2 | 2.571 | 8.526 | 18.51 | 38.51 | 98.50 | 198.5 | 998.5 | 1999 | 9999
3 | 2.024 | 5.538 | 10.13 | 17.44 | 34.12 | 55.55 | 167.0 | 266.5 | 784.0
4 | 1.807 | 4.545 | 7.709 | 12.22 | 21.20 | 31.33 | 74.14 | 106.2 | 241.6
5 | 1.692 | 4.060 | 6.608 | 10.01 | 16.26 | 22.78 | 47.18 | 63.61 | 124.9
6 | 1.621 | 3.776 | 5.987 | 8.813 | 13.75 | 18.63 | 35.51 | 46.08 | 82.49
7 | 1.573 | 3.589 | 5.591 | 8.073 | 12.25 | 16.24 | 29.25 | 36.99 | 62.17
8 | 1.538 | 3.458 | 5.318 | 7.571 | 11.26 | 14.69 | 25.41 | 31.56 | 50.69
9 | 1.512 | 3.360 | 5.117 | 7.209 | 10.56 | 13.61 | 22.86 | 27.99 | 43.48
10 | 1.491 | 3.285 | 4.965 | 6.937 | 10.04 | 12.83 | 21.04 | 25.49 | 38.58
11 | 1.475 | 3.225 | 4.844 | 6.724 | 9.646 | 12.23 | 19.69 | 23.65 | 35.06
12 | 1.461 | 3.177 | 4.747 | 6.554 | 9.330 | 11.75 | 18.64 | 22.24 | 32.43
13 | 1.450 | 3.136 | 4.667 | 6.414 | 9.074 | 11.37 | 17.82 | 21.14 | 30.39
14 | 1.440 | 3.102 | 4.600 | 6.298 | 8.862 | 11.06 | 17.14 | 20.24 | 28.77
15 | 1.432 | 3.073 | 4.543 | 6.199 | 8.683 | 10.80 | 16.59 | 19.51 | 27.45
16 | 1.425 | 3.048 | 4.494 | 6.115 | 8.531 | 10.58 | 16.12 | 18.89 | 26.36
17 | 1.419 | 3.026 | 4.451 | 6.042 | 8.400 | 10.38 | 15.72 | 18.37 | 25.44
18 | 1.413 | 3.007 | 4.414 | 5.978 | 8.285 | 10.22 | 15.38 | 17.92 | 24.66
19 | 1.408 | 2.990 | 4.381 | 5.922 | 8.185 | 10.07 | 15.08 | 17.53 | 23.99
20 | 1.404 | 2.975 | 4.351 | 5.871 | 8.096 | 9.944 | 14.82 | 17.19 | 23.40
21 | 1.400 | 2.961 | 4.325 | 5.827 | 8.017 | 9.830 | 14.59 | 16.89 | 22.89
22 | 1.396 | 2.949 | 4.301 | 5.786 | 7.945 | 9.727 | 14.38 | 16.62 | 22.43
23 | 1.393 | 2.937 | 4.279 | 5.750 | 7.881 | 9.635 | 14.20 | 16.38 | 22.03
24 | 1.390 | 2.927 | 4.260 | 5.717 | 7.823 | 9.551 | 14.03 | 16.17 | 21.66
25 | 1.387 | 2.918 | 4.242 | 5.686 | 7.770 | 9.475 | 13.88 | 15.97 | 21.34
26 | 1.384 | 2.909 | 4.225 | 5.659 | 7.721 | 9.406 | 13.74 | 15.79 | 21.04
27 | 1.382 | 2.901 | 4.210 | 5.633 | 7.677 | 9.342 | 13.61 | 15.63 | 20.77
28 | 1.380 | 2.894 | 4.196 | 5.610 | 7.636 | 9.284 | 13.50 | 15.48 | 20.53
29 | 1.378 | 2.887 | 4.183 | 5.588 | 7.598 | 9.230 | 13.39 | 15.35 | 20.30
30 | 1.376 | 2.881 | 4.171 | 5.568 | 7.562 | 9.180 | 13.29 | 15.22 | 20.09
40 | 1.363 | 2.835 | 4.085 | 5.424 | 7.314 | 8.828 | 12.61 | 14.35 | 18.67
50 | 1.355 | 2.809 | 4.034 | 5.340 | 7.171 | 8.626 | 12.22 | 13.86 | 17.88
60 | 1.349 | 2.791 | 4.001 | 5.286 | 7.077 | 8.495 | 11.97 | 13.55 | 17.38
70 | 1.346 | 2.779 | 3.978 | 5.247 | 7.011 | 8.403 | 11.80 | 13.33 | 17.03
80 | 1.343 | 2.769 | 3.960 | 5.218 | 6.963 | 8.335 | 11.67 | 13.17 | 16.78
90 | 1.341 | 2.762 | 3.947 | 5.196 | 6.925 | 8.282 | 11.57 | 13.05 | 16.58
100 | 1.339 | 2.756 | 3.936 | 5.179 | 6.895 | 8.241 | 11.50 | 12.95 | 16.43
[latex]\infty[/latex] | 1.323 | 2.706 | 3.841 | 5.024 | 6.635 | 7.879 | 10.83 | 12.11 | 15.14

This table gives [latex]f^{*}[/latex] such that [latex]\pr{F_{1,d} \ge f^{*}} = p[/latex]

This is still for a numerator degrees of freedom of 1, corresponding to a single term in our model. We will often have more terms than this and so finally the following table shows a smaller number of critical values for differing degrees of freedom in the numerator and the denominator. See if you can find the [latex]P[/latex]-value for [latex]F_{1,22} = 8.00[/latex] in this final table.

## [latex]F[/latex] distribution

[latex]d[/latex] | [latex]p[/latex] | [latex]n=[/latex]1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
---|---|---|---|---|---|---|---|---|---|---
2 | 0.100 | 8.53 | 9.00 | 9.16 | 9.24 | 9.29 | 9.33 | 9.35 | 9.37 | 9.38
2 | 0.050 | 18.5 | 19.0 | 19.2 | 19.2 | 19.3 | 19.3 | 19.4 | 19.4 | 19.4
2 | 0.010 | 98.5 | 99.0 | 99.2 | 99.2 | 99.3 | 99.3 | 99.4 | 99.4 | 99.4
2 | 0.001 | 999 | 999 | 999 | 999 | 999 | 999 | 999 | 999 | 999
3 | 0.100 | 5.54 | 5.46 | 5.39 | 5.34 | 5.31 | 5.28 | 5.27 | 5.25 | 5.24
3 | 0.050 | 10.1 | 9.55 | 9.28 | 9.12 | 9.01 | 8.94 | 8.89 | 8.85 | 8.81
3 | 0.010 | 34.1 | 30.8 | 29.5 | 28.7 | 28.2 | 27.9 | 27.7 | 27.5 | 27.3
3 | 0.001 | 167 | 148 | 141 | 137 | 135 | 133 | 132 | 131 | 130
4 | 0.100 | 4.54 | 4.32 | 4.19 | 4.11 | 4.05 | 4.01 | 3.98 | 3.95 | 3.94
4 | 0.050 | 7.71 | 6.94 | 6.59 | 6.39 | 6.26 | 6.16 | 6.09 | 6.04 | 6.00
4 | 0.010 | 21.2 | 18.0 | 16.7 | 16.0 | 15.5 | 15.2 | 15.0 | 14.8 | 14.7
4 | 0.001 | 74.1 | 61.2 | 56.2 | 53.4 | 51.7 | 50.5 | 49.7 | 49.0 | 48.5
5 | 0.100 | 4.06 | 3.78 | 3.62 | 3.52 | 3.45 | 3.40 | 3.37 | 3.34 | 3.32
5 | 0.050 | 6.61 | 5.79 | 5.41 | 5.19 | 5.05 | 4.95 | 4.88 | 4.82 | 4.77
5 | 0.010 | 16.3 | 13.3 | 12.1 | 11.4 | 11.0 | 10.7 | 10.5 | 10.3 | 10.2
5 | 0.001 | 47.2 | 37.1 | 33.2 | 31.1 | 29.8 | 28.8 | 28.2 | 27.6 | 27.2
6 | 0.100 | 3.78 | 3.46 | 3.29 | 3.18 | 3.11 | 3.05 | 3.01 | 2.98 | 2.96
6 | 0.050 | 5.99 | 5.14 | 4.76 | 4.53 | 4.39 | 4.28 | 4.21 | 4.15 | 4.10
6 | 0.010 | 13.7 | 10.9 | 9.78 | 9.15 | 8.75 | 8.47 | 8.26 | 8.10 | 7.98
6 | 0.001 | 35.5 | 27.0 | 23.7 | 21.9 | 20.8 | 20.0 | 19.5 | 19.0 | 18.7
7 | 0.100 | 3.59 | 3.26 | 3.07 | 2.96 | 2.88 | 2.83 | 2.78 | 2.75 | 2.72
7 | 0.050 | 5.59 | 4.74 | 4.35 | 4.12 | 3.97 | 3.87 | 3.79 | 3.73 | 3.68
7 | 0.010 | 12.2 | 9.55 | 8.45 | 7.85 | 7.46 | 7.19 | 6.99 | 6.84 | 6.72
7 | 0.001 | 29.2 | 21.7 | 18.8 | 17.2 | 16.2 | 15.5 | 15.0 | 14.6 | 14.3
8 | 0.100 | 3.46 | 3.11 | 2.92 | 2.81 | 2.73 | 2.67 | 2.62 | 2.59 | 2.56
8 | 0.050 | 5.32 | 4.46 | 4.07 | 3.84 | 3.69 | 3.58 | 3.50 | 3.44 | 3.39
8 | 0.010 | 11.3 | 8.65 | 7.59 | 7.01 | 6.63 | 6.37 | 6.18 | 6.03 | 5.91
8 | 0.001 | 25.4 | 18.5 | 15.8 | 14.4 | 13.5 | 12.9 | 12.4 | 12.0 | 11.8
9 | 0.100 | 3.36 | 3.01 | 2.81 | 2.69 | 2.61 | 2.55 | 2.51 | 2.47 | 2.44
9 | 0.050 | 5.12 | 4.26 | 3.86 | 3.63 | 3.48 | 3.37 | 3.29 | 3.23 | 3.18
9 | 0.010 | 10.6 | 8.02 | 6.99 | 6.42 | 6.06 | 5.80 | 5.61 | 5.47 | 5.35
9 | 0.001 | 22.9 | 16.4 | 13.9 | 12.6 | 11.7 | 11.1 | 10.7 | 10.4 | 10.1
10 | 0.100 | 3.29 | 2.92 | 2.73 | 2.61 | 2.52 | 2.46 | 2.41 | 2.38 | 2.35
10 | 0.050 | 4.96 | 4.10 | 3.71 | 3.48 | 3.33 | 3.22 | 3.14 | 3.07 | 3.02
10 | 0.010 | 10.0 | 7.56 | 6.55 | 5.99 | 5.64 | 5.39 | 5.20 | 5.06 | 4.94
10 | 0.001 | 21.0 | 14.9 | 12.6 | 11.3 | 10.5 | 9.93 | 9.52 | 9.20 | 8.96
11 | 0.100 | 3.23 | 2.86 | 2.66 | 2.54 | 2.45 | 2.39 | 2.34 | 2.30 | 2.27
11 | 0.050 | 4.84 | 3.98 | 3.59 | 3.36 | 3.20 | 3.09 | 3.01 | 2.95 | 2.90
11 | 0.010 | 9.65 | 7.21 | 6.22 | 5.67 | 5.32 | 5.07 | 4.89 | 4.74 | 4.63
11 | 0.001 | 19.7 | 13.8 | 11.6 | 10.3 | 9.58 | 9.05 | 8.66 | 8.35 | 8.12
12 | 0.100 | 3.18 | 2.81 | 2.61 | 2.48 | 2.39 | 2.33 | 2.28 | 2.24 | 2.21
12 | 0.050 | 4.75 | 3.89 | 3.49 | 3.26 | 3.11 | 3.00 | 2.91 | 2.85 | 2.80
12 | 0.010 | 9.33 | 6.93 | 5.95 | 5.41 | 5.06 | 4.82 | 4.64 | 4.50 | 4.39
12 | 0.001 | 18.6 | 13.0 | 10.8 | 9.63 | 8.89 | 8.38 | 8.00 | 7.71 | 7.48
13 | 0.100 | 3.14 | 2.76 | 2.56 | 2.43 | 2.35 | 2.28 | 2.23 | 2.20 | 2.16
13 | 0.050 | 4.67 | 3.81 | 3.41 | 3.18 | 3.03 | 2.92 | 2.83 | 2.77 | 2.71
13 | 0.010 | 9.07 | 6.70 | 5.74 | 5.21 | 4.86 | 4.62 | 4.44 | 4.30 | 4.19
13 | 0.001 | 17.8 | 12.3 | 10.2 | 9.07 | 8.35 | 7.86 | 7.49 | 7.21 | 6.98
14 | 0.100 | 3.10 | 2.73 | 2.52 | 2.39 | 2.31 | 2.24 | 2.19 | 2.15 | 2.12
14 | 0.050 | 4.60 | 3.74 | 3.34 | 3.11 | 2.96 | 2.85 | 2.76 | 2.70 | 2.65
14 | 0.010 | 8.86 | 6.51 | 5.56 | 5.04 | 4.69 | 4.46 | 4.28 | 4.14 | 4.03
14 | 0.001 | 17.1 | 11.8 | 9.73 | 8.62 | 7.92 | 7.44 | 7.08 | 6.80 | 6.58
15 | 0.100 | 3.07 | 2.70 | 2.49 | 2.36 | 2.27 | 2.21 | 2.16 | 2.12 | 2.09
15 | 0.050 | 4.54 | 3.68 | 3.29 | 3.06 | 2.90 | 2.79 | 2.71 | 2.64 | 2.59
15 | 0.010 | 8.68 | 6.36 | 5.42 | 4.89 | 4.56 | 4.32 | 4.14 | 4.00 | 3.89
15 | 0.001 | 16.6 | 11.3 | 9.34 | 8.25 | 7.57 | 7.09 | 6.74 | 6.47 | 6.26
16 | 0.100 | 3.05 | 2.67 | 2.46 | 2.33 | 2.24 | 2.18 | 2.13 | 2.09 | 2.06
16 | 0.050 | 4.49 | 3.63 | 3.24 | 3.01 | 2.85 | 2.74 | 2.66 | 2.59 | 2.54
16 | 0.010 | 8.53 | 6.23 | 5.29 | 4.77 | 4.44 | 4.20 | 4.03 | 3.89 | 3.78
16 | 0.001 | 16.1 | 11.0 | 9.01 | 7.94 | 7.27 | 6.80 | 6.46 | 6.19 | 5.98
17 | 0.100 | 3.03 | 2.64 | 2.44 | 2.31 | 2.22 | 2.15 | 2.10 | 2.06 | 2.03
17 | 0.050 | 4.45 | 3.59 | 3.20 | 2.96 | 2.81 | 2.70 | 2.61 | 2.55 | 2.49
17 | 0.010 | 8.40 | 6.11 | 5.18 | 4.67 | 4.34 | 4.10 | 3.93 | 3.79 | 3.68
17 | 0.001 | 15.7 | 10.7 | 8.73 | 7.68 | 7.02 | 6.56 | 6.22 | 5.96 | 5.75
18 | 0.100 | 3.01 | 2.62 | 2.42 | 2.29 | 2.20 | 2.13 | 2.08 | 2.04 | 2.00
18 | 0.050 | 4.41 | 3.55 | 3.16 | 2.93 | 2.77 | 2.66 | 2.58 | 2.51 | 2.46
18 | 0.010 | 8.29 | 6.01 | 5.09 | 4.58 | 4.25 | 4.01 | 3.84 | 3.71 | 3.60
18 | 0.001 | 15.4 | 10.4 | 8.49 | 7.46 | 6.81 | 6.35 | 6.02 | 5.76 | 5.56
19 | 0.100 | 2.99 | 2.61 | 2.40 | 2.27 | 2.18 | 2.11 | 2.06 | 2.02 | 1.98
19 | 0.050 | 4.38 | 3.52 | 3.13 | 2.90 | 2.74 | 2.63 | 2.54 | 2.48 | 2.42
19 | 0.010 | 8.18 | 5.93 | 5.01 | 4.50 | 4.17 | 3.94 | 3.77 | 3.63 | 3.52
19 | 0.001 | 15.1 | 10.2 | 8.28 | 7.27 | 6.62 | 6.18 | 5.85 | 5.59 | 5.39
20 | 0.100 | 2.97 | 2.59 | 2.38 | 2.25 | 2.16 | 2.09 | 2.04 | 2.00 | 1.96
20 | 0.050 | 4.35 | 3.49 | 3.10 | 2.87 | 2.71 | 2.60 | 2.51 | 2.45 | 2.39
20 | 0.010 | 8.10 | 5.85 | 4.94 | 4.43 | 4.10 | 3.87 | 3.70 | 3.56 | 3.46
20 | 0.001 | 14.8 | 9.95 | 8.10 | 7.10 | 6.46 | 6.02 | 5.69 | 5.44 | 5.24
21 | 0.100 | 2.96 | 2.57 | 2.36 | 2.23 | 2.14 | 2.08 | 2.02 | 1.98 | 1.95
21 | 0.050 | 4.32 | 3.47 | 3.07 | 2.84 | 2.68 | 2.57 | 2.49 | 2.42 | 2.37
21 | 0.010 | 8.02 | 5.78 | 4.87 | 4.37 | 4.04 | 3.81 | 3.64 | 3.51 | 3.40
21 | 0.001 | 14.6 | 9.77 | 7.94 | 6.95 | 6.32 | 5.88 | 5.56 | 5.31 | 5.11
22 | 0.100 | 2.95 | 2.56 | 2.35 | 2.22 | 2.13 | 2.06 | 2.01 | 1.97 | 1.93
22 | 0.050 | 4.30 | 3.44 | 3.05 | 2.82 | 2.66 | 2.55 | 2.46 | 2.40 | 2.34
22 | 0.010 | 7.95 | 5.72 | 4.82 | 4.31 | 3.99 | 3.76 | 3.59 | 3.45 | 3.35
22 | 0.001 | 14.4 | 9.61 | 7.80 | 6.81 | 6.19 | 5.76 | 5.44 | 5.19 | 4.99
23 | 0.100 | 2.94 | 2.55 | 2.34 | 2.21 | 2.11 | 2.05 | 1.99 | 1.95 | 1.92
23 | 0.050 | 4.28 | 3.42 | 3.03 | 2.80 | 2.64 | 2.53 | 2.44 | 2.37 | 2.32
23 | 0.010 | 7.88 | 5.66 | 4.76 | 4.26 | 3.94 | 3.71 | 3.54 | 3.41 | 3.30
23 | 0.001 | 14.2 | 9.47 | 7.67 | 6.70 | 6.08 | 5.65 | 5.33 | 5.09 | 4.89
24 | 0.100 | 2.93 | 2.54 | 2.33 | 2.19 | 2.10 | 2.04 | 1.98 | 1.94 | 1.91
24 | 0.050 | 4.26 | 3.40 | 3.01 | 2.78 | 2.62 | 2.51 | 2.42 | 2.36 | 2.30
24 | 0.010 | 7.82 | 5.61 | 4.72 | 4.22 | 3.90 | 3.67 | 3.50 | 3.36 | 3.26
24 | 0.001 | 14.0 | 9.34 | 7.55 | 6.59 | 5.98 | 5.55 | 5.23 | 4.99 | 4.80
25 | 0.100 | 2.92 | 2.53 | 2.32 | 2.18 | 2.09 | 2.02 | 1.97 | 1.93 | 1.89
25 | 0.050 | 4.24 | 3.39 | 2.99 | 2.76 | 2.60 | 2.49 | 2.40 | 2.34 | 2.28
25 | 0.010 | 7.77 | 5.57 | 4.68 | 4.18 | 3.85 | 3.63 | 3.46 | 3.32 | 3.22
25 | 0.001 | 13.9 | 9.22 | 7.45 | 6.49 | 5.89 | 5.46 | 5.15 | 4.91 | 4.71
26 | 0.100 | 2.91 | 2.52 | 2.31 | 2.17 | 2.08 | 2.01 | 1.96 | 1.92 | 1.88
26 | 0.050 | 4.23 | 3.37 | 2.98 | 2.74 | 2.59 | 2.47 | 2.39 | 2.32 | 2.27
26 | 0.010 | 7.72 | 5.53 | 4.64 | 4.14 | 3.82 | 3.59 | 3.42 | 3.29 | 3.18
26 | 0.001 | 13.7 | 9.12 | 7.36 | 6.41 | 5.80 | 5.38 | 5.07 | 4.83 | 4.64
27 | 0.100 | 2.90 | 2.51 | 2.30 | 2.17 | 2.07 | 2.00 | 1.95 | 1.91 | 1.87
27 | 0.050 | 4.21 | 3.35 | 2.96 | 2.73 | 2.57 | 2.46 | 2.37 | 2.31 | 2.25
27 | 0.010 | 7.68 | 5.49 | 4.60 | 4.11 | 3.78 | 3.56 | 3.39 | 3.26 | 3.15
27 | 0.001 | 13.6 | 9.02 | 7.27 | 6.33 | 5.73 | 5.31 | 5.00 | 4.76 | 4.57
28 | 0.100 | 2.89 | 2.50 | 2.29 | 2.16 | 2.06 | 2.00 | 1.94 | 1.90 | 1.87
28 | 0.050 | 4.20 | 3.34 | 2.95 | 2.71 | 2.56 | 2.45 | 2.36 | 2.29 | 2.24
28 | 0.010 | 7.64 | 5.45 | 4.57 | 4.07 | 3.75 | 3.53 | 3.36 | 3.23 | 3.12
28 | 0.001 | 13.5 | 8.93 | 7.19 | 6.25 | 5.66 | 5.24 | 4.93 | 4.69 | 4.50
29 | 0.100 | 2.89 | 2.50 | 2.28 | 2.15 | 2.06 | 1.99 | 1.93 | 1.89 | 1.86
29 | 0.050 | 4.18 | 3.33 | 2.93 | 2.70 | 2.55 | 2.43 | 2.35 | 2.28 | 2.22
29 | 0.010 | 7.60 | 5.42 | 4.54 | 4.04 | 3.73 | 3.50 | 3.33 | 3.20 | 3.09
29 | 0.001 | 13.4 | 8.85 | 7.12 | 6.19 | 5.59 | 5.18 | 4.87 | 4.64 | 4.45
30 | 0.100 | 2.88 | 2.49 | 2.28 | 2.14 | 2.05 | 1.98 | 1.93 | 1.88 | 1.85
30 | 0.050 | 4.17 | 3.32 | 2.92 | 2.69 | 2.53 | 2.42 | 2.33 | 2.27 | 2.21
30 | 0.010 | 7.56 | 5.39 | 4.51 | 4.02 | 3.70 | 3.47 | 3.30 | 3.17 | 3.07
30 | 0.001 | 13.3 | 8.77 | 7.05 | 6.12 | 5.53 | 5.12 | 4.82 | 4.58 | 4.39
[latex]\infty[/latex] | 0.100 | 2.71 | 2.30 | 2.08 | 1.94 | 1.85 | 1.77 | 1.72 | 1.67 | 1.63
[latex]\infty[/latex] | 0.050 | 3.84 | 3.00 | 2.60 | 2.37 | 2.21 | 2.10 | 2.01 | 1.94 | 1.88
[latex]\infty[/latex] | 0.010 | 6.63 | 4.61 | 3.78 | 3.32 | 3.02 | 2.80 | 2.64 | 2.51 | 2.41
[latex]\infty[/latex] | 0.001 | 10.8 | 6.91 | 5.42 | 4.62 | 4.10 | 3.74 | 3.47 | 3.27 | 3.10

This table gives [latex]f^{*}[/latex] such that [latex]\pr{F_{n,d} \ge f^{*}} = p[/latex].

Including tables like these in print is rather silly and some textbooks, such as Wild and Seber (2000), no longer include [latex]F[/latex] tables. The reason is that the calculations for ANOVA are complicated and tedious and are always done by computer these days. So if you are using a computer to do the calculations then you might as well get it to calculate the [latex]P[/latex]-value for you too. We have included some [latex]F[/latex] tables here to give you a feel for what the distribution is like and the kind of [latex]F[/latex] values you might expect.
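For example, with scipy's [latex]F[/latex] distribution functions (assuming scipy is available; any statistics package has equivalents) you can reproduce the table values directly:

```python
from scipy.stats import f

# P-value for the observed statistic, instead of reading the F(1,22) table
print(round(f.sf(8.00, 1, 22), 4))   # upper-tail probability, ~0.0098

# Critical values f* with P(F >= f*) = p, as in the F(1,d) table row d = 22
for p in (0.05, 0.01, 0.005):
    print(p, round(f.isf(p, 1, 22), 3))
```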

We noted above that [latex]R^2[/latex], the square of the correlation coefficient, is intimately related to analysis of variance. It should be no surprise then that the [latex]t[/latex] test for slope we carried out in Chapter 18 is also related to ANOVA. The value we calculated was [latex]t_{22} = -2.83[/latex] and if you square this number you get [latex]t_{22}^2 = (-2.83)^2 = 8.00[/latex]. This is the value we found above for [latex]F_{1,22}[/latex]. The [latex]F[/latex] distribution is a generalisation of the [latex]t[/latex] distribution, since this always holds when the numerator degrees of freedom is 1. However we can apply ANOVA and the [latex]F[/latex] test for cases when the numerator degrees of freedom is more than 1, such as for regression with more than one predictor or for comparing more than two means (see the relevant section later in this chapter).
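The equivalence of the two-sided [latex]t[/latex] test and the [latex]F[/latex] test with 1 numerator degree of freedom can be checked numerically (a sketch assuming scipy is available):

```python
from scipy.stats import f, t

t_stat = -2.83            # slope t statistic from Chapter 18
f_stat = t_stat ** 2      # = 8.0089, matching the F statistic up to rounding

# The two-sided t P-value equals the one-sided F(1, d) P-value
p_t = 2 * t.sf(abs(t_stat), 22)
p_f = f.sf(f_stat, 1, 22)
print(round(p_t, 4), round(p_f, 4))
```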

Note though that if we had a *positive* relationship of the same strength, so that [latex]t_{22} = +2.83[/latex], the [latex]F[/latex] statistic would still be 8.00. Thus ANOVA does not give any information about the **direction** of a significant association.

# ANOVA Tables

Statistical software usually summarises an analysis of variance in the form of an **ANOVA table**, as shown in the following table. The “Source” column shows the source of the variability, breaking the total degrees of freedom (“df” column) and sum of squares (“SS” column) into the residual component and then the component attributed to the specified variable. The mean sums of squares are then calculated (“MS”) and then the [latex]F[/latex] statistic for the variable is given, along with the [latex]P[/latex]-value. All of these values were described in the above example. Values from software are slightly different to those we calculated by hand since the software keeps more decimal places at each step.

## ANOVA table for oxytocin level (pg/mL) by age (years)

Source | df | SS | MS | F | P
---|---|---|---|---|---
Age | 1 | 0.7777 | 0.7777 | 8.00 | 0.0098
Residuals | 22 | 2.1377 | 0.0972 | |
Total | 23 | 2.9154 | | |

As a comparison, consider the least-squares fit shown in the figure below for the relationship between basal oxytocin level and weight.

## ANOVA table for oxytocin level (pg/mL) by weight (kg)

Source | df | SS | MS | F | P
---|---|---|---|---|---
Weight | 1 | 0.0621 | 0.0621 | 0.479 | 0.4963
Residuals | 22 | 2.8533 | 0.1297 | |
Total | 23 | 2.9154 | | |

The ANOVA table given above has the same total sum of squares and degrees of freedom since this model is breaking down the same response variable as before. This time we see that very little of the variability in oxytocin level is explained by weight ([latex]R^2 = 0.021[/latex]), giving a smaller [latex]F[/latex] statistic and so a higher [latex]P[/latex]-value, [latex]p = 0.4963[/latex]. Thus there is no evidence of an association between basal oxytocin level and weight (and any suggestion of evidence from the line is likely due to the influential value on the right anyway).

# Comparing Two Means

In the previous section we looked at analysis of variance for regression. This is a useful way to start thinking about ANOVA because we were already familiar with the context. In particular, we already saw in Chapter 18 the idea of a residual standard deviation in comparison with the sample standard deviation for one variable. In this section we look at another application of ANOVA to a setting we are already familiar with: comparing two population means.

The figure above shows the same oxytocin level data as before but this time the predictor variable is relationship status. This is a similar picture to the preceding regression examples but since the predictor is categorical we are using a side-by-side dot plot instead of a scatter plot. Rather than having a straight line as the model we now simply imagine that each group has its own mean, joined by the dashed line, and that there is variability about this central value. As with regression we assume that this **residual** variability is Normally distributed and that its size does not depend on the group. This is just the setting of the **pooled** comparison described in Chapter 16.

Rather than work through the details of the analysis of variance we will begin by showing the completed ANOVA table below. You should be able to read this quite easily if you followed the discussion in the previous section.

## ANOVA table for oxytocin level (pg/mL) by relationship status

Source | df | SS | MS | F | P
---|---|---|---|---|---
Single | 1 | 0.7004 | 0.7004 | 6.96 | 0.0150
Residuals | 22 | 2.2150 | 0.1007 | |
Total | 23 | 2.9154 | | |

The total sum of squares, SST = 2.9154, is the same as for the examples in the previous sections since we are modelling the same response variable. The residual sum of squares, [latex]\mbox{SSR}[/latex], is the sum of squared deviations for single women plus the sum of squared deviations for women in a relationship. The sample standard deviation for single women was [latex]s_1 = 0.2670[/latex] from [latex]n_1 = 12[/latex] observations, giving a sum of squares of [latex]11 \times 0.2670^2 = 0.7843[/latex]. For women in a relationship [latex]s_2 = 0.3606[/latex] from [latex]n_2 = 12[/latex] observations, giving a sum of squares of [latex]11 \times 0.3606^2 = 1.4307[/latex].

Thus [latex]\mbox{SSR} = 0.7843 + 1.4307 = 2.2150[/latex]. Note that this is identical to the calculation we carried out in Chapter 16 when pooling the standard deviations for a two-sample [latex]t[/latex] test.

The degrees of freedom of this residual error are [latex](12-1) + (12-1) = 22[/latex]. This is [latex]n-2[/latex], as it was in the regression between oxytocin and age, but it is [latex]n-2[/latex] for a different reason. The first [latex]-1[/latex] came from losing a degree of freedom because we had to estimate the mean level for single women. The second came from estimating the mean level for women in a relationship. Thus if, for example, our explanatory variable had had 4 categories instead of 2 then the degrees of freedom would be [latex]n-4[/latex] instead of [latex]n-2[/latex] (see the next section).
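The two-group analysis of variance can be reproduced from the summary statistics alone; a Python sketch using the values quoted above (the last decimal place may differ slightly from the software table, which keeps more decimal places):

```python
from scipy.stats import f

n1, s1 = 12, 0.2670    # single women
n2, s2 = 12, 0.3606    # women in a relationship
sst = 2.9154           # total sum of squares for oxytocin level

ssr = (n1 - 1) * s1**2 + (n2 - 1) * s2**2   # pooled residual SS, 22 df
ssg = sst - ssr                             # group SS, 1 df
f_stat = ssg / (ssr / (n1 + n2 - 2))
p_value = f.sf(f_stat, 1, n1 + n2 - 2)
print(round(f_stat, 2), round(p_value, 4))
```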

The sum of squares and degrees of freedom for the predictor can then be found by difference. Here we use SSG for the **group** sum of squares.

We can calculate the [latex]R^2[/latex] value as before, even though we cannot calculate the correlation coefficient in this setting. We find

\[ R^2 = \frac{\mbox{SSG}}{\mbox{SST}} = \frac{0.7004}{2.9154} = 0.2402, \]

so about 24% of variability in oxytocin level is explained by relationship status. This is a bit lower than the [latex]R^2[/latex] values for age (0.2670) and weight (0.3000) so knowing whether a woman is single does not explain quite as much variability in the basal oxytocin level.
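In code (a sketch, not part of the original analysis), the group sum of squares found by difference and the resulting [latex]R^2[/latex] come straight from the table entries:

```python
# Group sum of squares by difference, and R^2, using the (rounded)
# sums of squares from the ANOVA table above.
sst = 2.9154            # total sum of squares
ssr = 2.2150            # residual sum of squares
ssg = sst - ssr         # group sum of squares, 0.7004
dfg = 23 - 22           # group degrees of freedom, 1

r_squared = ssg / sst   # about 0.24
```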

However the [latex]F[/latex] statistic and [latex]P[/latex]-value suggest strong evidence of an association between oxytocin level and relationship status. Since there are only two categories in this predictor we can interpret this as strong evidence of a difference between single women and women in a relationship. We will see in the next section that the conclusion is more complicated for more than two groups.

Finally, note that the pooled [latex]t[/latex] test in Chapter 16 gave a statistic of [latex]t_{22} = 2.638[/latex]. Squaring this we find [latex]t_{22}^2 = 2.638^2 = 6.96[/latex], the same as the [latex]F[/latex] statistic in the table above. Again, analysis of variance is a generalisation of the pooled procedures we have seen before. It will not distinguish between a positive and a negative difference, but it will allow us to test for differences amongst more than two means.
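This equivalence can be checked numerically. The sketch below (assuming SciPy is available; it is not part of the original text) squares the [latex]t[/latex] statistic and confirms that the two-sided [latex]t[/latex] [latex]P[/latex]-value matches the [latex]F[/latex] [latex]P[/latex]-value on [latex](1, 22)[/latex] degrees of freedom.

```python
from scipy.stats import t as t_dist, f as f_dist

t_stat = 2.638                    # pooled two-sample t statistic, 22 df
f_stat = t_stat ** 2              # about 6.96, the ANOVA F statistic

p_t = 2 * t_dist.sf(t_stat, 22)   # two-sided t P-value, about 0.015
p_f = f_dist.sf(f_stat, 1, 22)    # F P-value with (1, 22) df

# p_t and p_f agree: the F test is the two-sided pooled t test.
```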

# Comparing More Than Two Means

A standard **one-way analysis of variance** compares the mean response between two or more groups. It is a straightforward generalisation of the method described in the previous section where we now pool squared deviations from more than two groups.

Wind Speed and Transpiration

Thirty-five string bean plants were grown from seed for 2 weeks under constant artificial light. Fifteen plants which were similar in shape and size were then chosen to study the effect of wind on transpiration. These had their roots removed as well as all leaves except one, of similar size, on each plant. The plants were each placed in a graduated cylinder with the same amount of water and the top sealed to prevent evaporation. Five of the plants were arranged in front of a fan at the low setting, 5 were arranged in front of a fan at the high setting, while 5 were not exposed to a fan. After 1 hour the decrease in water in each cylinder was recorded, with results shown in the table below.

## Water loss (mL) for different wind speeds

Wind | Water loss | ||||
---|---|---|---|---|---|

High | 8.5 | 6.5 | 6.0 | 5.0 | 7.0 |

Low | 4.5 | 4.5 | 6.0 | 3.5 | 5.0 |

None | 1.0 | 1.5 | 3.0 | 2.0 | 2.5 |

The figure below shows the resulting water loss for the 15 plants exposed to the different wind speeds. Is there any evidence that the water loss is related to fan speed?

The null hypothesis in this case is the usual one of no difference. Here we would say that there is no difference between any of the three mean water losses, [latex]\mu_{\small{\mbox{None}}}[/latex], [latex]\mu_{\small{\mbox{Low}}}[/latex], and [latex]\mu_{\small{\mbox{High}}}[/latex]. That is,

\[ H_0 :\mu_{\small{\mbox{None}}} = \mu_{\small{\mbox{Low}}} = \mu_{\small{\mbox{High}}}. \]

The alternative hypothesis is less straightforward since there are a lot of ways in which [latex]H_0[/latex] could be false. For example, it might be that low wind speed and no wind have the same average water loss but the high wind speed gives a higher water loss, or maybe all three speeds give different average losses. Before doing the study you may have been looking for one of these patterns in particular. However, all of these cases would lead to evidence against [latex]H_0[/latex]. We simply state the alternative hypothesis [latex]H_1[/latex] as “not all means are the same”.

Deviations from [latex]H_0[/latex] can be detected using analysis of variance to see if adding the categorical variable explains any of the response variability. The total sum of squares and degrees of freedom, the components of the sample variance of the response ignoring the explanatory variable, are calculated as before. We find the total sum of squared deviations is [latex]\mbox{SST } = 65.93[/latex], with 14 degrees of freedom.

The residual sum of squares is the sum of those squared deviations within each group, and so is often called the **within-groups** sum of squares. We calculate it in the same way we calculated the pooled standard deviation but now we are pooling three groups together.

## Summary statistics for water loss by wind speed

Wind | [asciimath]n[/asciimath] | [asciimath]\overline{x}[/asciimath] | [asciimath]s[/asciimath] |
---|---|---|---|

None | 5 | 2.0 | 0.7906 |

Low | 5 | 4.7 | 0.9083 |

High | 5 | 6.6 | 1.2942 |

The table above shows the summary statistics for the water losses split by wind speed. We can work out the residual sum of squares as

\[ \mbox{SSR} = 4 \times 0.7906^2 + 4 \times 0.9083^2 + 4 \times 1.2942^2 = 12.50, \]

and the associated degrees of freedom are

\[ \mbox{DFR} = 4 + 4 + 4 = 12. \]

These degrees of freedom are [latex]n - 3[/latex]. We lost one for each sample mean we had to estimate. If we were comparing [latex]k[/latex] groups then this would be [latex]n - k[/latex]. Return to Chapter 16 to see how this formula relates to the pooled standard deviation for two means.
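The same pooling can be written out compactly (a Python sketch, not part of the original text); with [latex]k[/latex] groups the pattern is [latex]\mbox{SSR} = \sum_j (n_j - 1) s_j^2[/latex] on [latex]n - k[/latex] degrees of freedom.

```python
# Pooled within-group (residual) sum of squares and degrees of freedom
# from the summary statistics for the three wind-speed groups.
groups = [(5, 0.7906), (5, 0.9083), (5, 1.2942)]  # (n_j, s_j) per group

ssr = sum((n - 1) * s ** 2 for n, s in groups)  # about 12.50
dfr = sum(n - 1 for n, _ in groups)             # 12, i.e. n - k
```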

The sum of squared deviations attributable to adding wind speed is then

\[ \mbox{SSG} = 65.93 - 12.50 = 53.43 \]

with degrees of freedom [latex]\mbox{DFG} = 14 - 12 = 2[/latex]. We can calculate

\[ R^2 = \frac{53.43}{65.93} = 0.8104, \]

so wind speed seems to explain around 81% of the variability in water loss.
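Since the raw water losses are small enough to list, the whole decomposition can be checked directly. This sketch (assuming SciPy is available; it is not part of the original text) computes SST, SSR and SSG from the data and confirms the [latex]F[/latex] statistic with SciPy's one-way ANOVA routine.

```python
from scipy.stats import f_oneway

# Water loss (mL) for the three wind speeds, from the table above.
high = [8.5, 6.5, 6.0, 5.0, 7.0]
low = [4.5, 4.5, 6.0, 3.5, 5.0]
none = [1.0, 1.5, 3.0, 2.0, 2.5]
data = high + low + none

def ss_about_mean(xs):
    """Sum of squared deviations about the sample mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

sst = ss_about_mean(data)                                  # about 65.93
ssr = ss_about_mean(high) + ss_about_mean(low) + ss_about_mean(none)  # 12.50
ssg = sst - ssr                                            # about 53.43

f_stat, p_value = f_oneway(high, low, none)                # F about 25.65
```

Here `f_oneway` carries out the same calculation as the ANOVA table: it compares the between-group mean square to the within-group mean square.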

The table below shows the ANOVA table for this analysis. The [latex]P[/latex]-value is less than 0.001, so there is substantial evidence of differences in the mean water loss between the three wind speeds.

## ANOVA table for water loss by wind speed

Source | df | SS | MS | F | P |
---|---|---|---|---|---|

Wind | 2 | 53.43 | 26.72 | 25.65 | <0.001 |

Residuals | 12 | 12.50 | 1.042 | ||

Total | 14 | 65.93 |

Oxytocin and Emotions

We can finally answer the main research question behind the data in the original oxytocin example: was there any difference in oxytocin response between the three stimulus events? The figure below shows a side-by-side plot of the results.

The null hypothesis is that the change in oxytocin is the same across all three stimulus events and the ANOVA table to test this is given in the following table. Note that the group degrees of freedom are again 2, since we are comparing three groups. Given the differences in the plot, it is not too surprising that the evidence for overall differences between the groups is very strong ([latex]R^2= 0.7996[/latex]). The question is then whether there is evidence of pairwise differences, such as for the change in oxytocin between reliving happy memories and receiving a massage. We will turn to that question in the next chapter.

## ANOVA table for change in oxytocin (pg/mL) by stimulus event

Source | df | SS | MS | F | P |
---|---|---|---|---|---|

Group | 2 | 1.6655 | 0.8327 | 41.91 | <0.001 |

Residuals | 21 | 0.4173 | 0.0199 | ||

Total | 23 | 2.0828 |
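The entries in this ANOVA table can be checked against one another. The sketch below (assuming SciPy is available; not part of the original text) recomputes the mean squares, [latex]F[/latex] statistic, [latex]R^2[/latex] and [latex]P[/latex]-value from the sums of squares alone.

```python
from scipy.stats import f as f_dist

ssg, dfg = 1.6655, 2    # between-group (stimulus event)
ssr, dfr = 0.4173, 21   # residual

msg = ssg / dfg                # mean square for groups, about 0.8327
msr = ssr / dfr                # residual mean square, about 0.0199
f_stat = msg / msr             # about 41.91

r_squared = ssg / (ssg + ssr)  # about 0.7996
p_value = f_dist.sf(f_stat, dfg, dfr)  # well below 0.001
```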

# Assumptions for ANOVA

The assumptions for regression analysis of variance are basically the same as those for linear regression:

- Observations are independent;
- The mean response is a linear function of the explanatory variable;
- The residuals have a Normal distribution;
- The variability of the residuals does not depend on the explanatory variable.

The model for one-way analysis of variance is simpler, with the observations of the response [latex]y[/latex] coming from

\[ Y = \mu_j + U, \]

where [latex]\mu_j[/latex] is the mean response for the population that group [latex]j[/latex] was sampled from, and [latex]\Normal{U}{0}{\sigma}[/latex], with [latex]\sigma[/latex] constant over the groups.

In words this model is saying that we assume each group has a mean response and that there is Normally distributed variability about the mean with a common standard deviation. We also have the general assumption, required for estimating the standard deviation, that the samples are independent. These assumptions can be checked in the same way we checked the assumptions for linear regression in Chapter 18. The residuals in this case are the differences between the observed responses and the sample mean for their group.

Rate of Dissolution

A study examined the effect of water temperature on dissolving times of soluble aspirin tablets. Three temperature conditions were used: hot (78-90[latex]^{\circ}[/latex]C), cold (7-10[latex]^{\circ}[/latex]C) and tap water (15-17[latex]^{\circ}[/latex]C). A total of 60 tablets, 20 at each temperature, were dissolved in a glass containing 150 mL of water. These glasses were continually rinsed and cycled during the experiment, and were placed on an insulating material while tablets were dissolving. The table below shows the recorded times for each tablet to dissolve.

## Tablet dissolving times (s)

Water | Dissolving times | ||||
---|---|---|---|---|---|

Cold | 98.0 | 120.7 | 128.3 | 129.0 | 137.4 |

| 145.7 | 146.8 | 149.7 | 151.4 | 155.1 |

| 83.0 | 87.8 | 131.1 | 139.9 | 149.7 |

| 149.8 | 156.5 | 121.5 | 88.7 | 119.4 |

Tap | 70.9 | 72.0 | 74.2 | 74.7 | 75.0 |

| 76.7 | 78.4 | 82.0 | 82.1 | 86.7 |

| 87.0 | 87.9 | 89.8 | 93.1 | 96.1 |

| 99.6 | 72.6 | 78.7 | 87.5 | 104.5 |

Hot | 21.9 | 23.6 | 24.0 | 23.5 | 23.4 |

| 23.2 | 22.1 | 23.0 | 23.0 | 22.7 |

| 21.3 | 22.3 | 22.1 | 22.6 | 21.4 |

| 22.2 | 21.1 | 22.1 | 20.6 | 22.6 |

This is a straightforward experiment to carry out, but the results obtained do not satisfy the assumptions for ANOVA. From the figure below it can be seen that the three distributions are fairly symmetric, so Normality is not necessarily an issue, but they have very different standard deviations. This is a common pattern: measurements of larger magnitude often show larger variability.

One solution is to try transforming the data to **stabilise** the variability. The authors of this study initially tried a log transform, as we have used in Chapter 14 and Chapter 16. Log transforms are often suitable in this type of problem, since they turn a multiplicative effect, whose variability typically scales with the size of the response, into an additive effect. However, after taking logs there was still a large difference between the variability in the Cold and Hot groups.

The authors instead tried taking the reciprocal of each observation so that a measurement of 98 s would become 1/98 = 0.0102 s[latex]^{-1}[/latex]. This is a physically meaningful quantity, giving the **rate of dissolution** of the tablet. The value of 0.0102 indicates that the tablet was dissolving at a rate of 0.0102 tablets per second. This transformation also did an excellent job of stabilising the variability, as shown in the figure below.
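The effect of the reciprocal transformation can be seen numerically. This sketch (an illustration, not part of the original study) computes the group standard deviations of the dissolving times from the table above, on the original and on the rate scale.

```python
import statistics

# Dissolving times (s) from the table above, by water temperature.
cold = [98.0, 120.7, 128.3, 129.0, 137.4, 145.7, 146.8, 149.7, 151.4,
        155.1, 83.0, 87.8, 131.1, 139.9, 149.7, 149.8, 156.5, 121.5,
        88.7, 119.4]
tap = [70.9, 72.0, 74.2, 74.7, 75.0, 76.7, 78.4, 82.0, 82.1, 86.7,
       87.0, 87.9, 89.8, 93.1, 96.1, 99.6, 72.6, 78.7, 87.5, 104.5]
hot = [21.9, 23.6, 24.0, 23.5, 23.4, 23.2, 22.1, 23.0, 23.0, 22.7,
       21.3, 22.3, 22.1, 22.6, 21.4, 22.2, 21.1, 22.1, 20.6, 22.6]

# On the original scale the standard deviations differ wildly...
time_sds = [statistics.stdev(g) for g in (cold, tap, hot)]

# ...but on the reciprocal (rate of dissolution) scale they are comparable.
rate_sds = [statistics.stdev([1 / x for x in g]) for g in (cold, tap, hot)]
```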

The transformed values can then be used to carry out an analysis of variance in the usual way.

Summary

- Analysis of variance (ANOVA) breaks down the total variability in a response into the residual variability and the variability that can be explained by the model.
- Each variability component is summarised by a sum of squared deviations and a degrees of freedom.
- The [latex]R^2[/latex] value gives the proportion of the total sum of squared deviations that can be explained by the model.
- The [latex]F[/latex] statistic is used in conjunction with the [latex]F[/latex] distribution to test whether the variability due to the model is significantly more than the residual variability.
- The assumptions for regression ANOVA are the same as for linear regression. As a test of association the [latex]F[/latex] test gives identical results to the two-sided linear regression test of slope.
- One-way analysis of variance breaks total variability into the residual variability within groups and the variability between groups.

Exercise 1

Find one pair of values from Student's [latex]t[/latex] distribution table and the [latex]F[/latex] distribution table to demonstrate the relationship between the [latex]t[/latex] and [latex]F[/latex] distributions.

Exercise 2

An experiment to determine optimal conditions for celery storage divided 15 stalks into three groups, each with similar numbers of thick and thin pieces to try to make the groups equal. Each group was placed in the fridge in different storage conditions, one group standing in a glass of water, another in a plastic bag wrapped up tightly and the last group just put directly in the fridge with no coverings. The celery was left untouched in the fridge for five days. After five days the angle of bend was measured by holding half in a straight line and bending the other half until it broke. This was done over a protractor and the angle at the breaking point was measured, as given in the table below.

## Celery bend angle (degrees) for different storage methods

Method | Bend Angle | ||||
---|---|---|---|---|---|

Water | 15 | 19 | 13 | 16 | 30 |

Bag | 9 | 32 | 26 | 35 | 22 |

None | 80 | 47 | 44 | 53 | 61 |

Use a one-way analysis of variance [latex]F[/latex] test to determine whether there is a difference in mean celery bend angle between the three storage conditions. Comment on the validity of the assumptions underlying the test.

Exercise 3

Researchers conducted an observational study of green tea and coffee consumption from 537 men and women at two workplaces in Japan (Pham et al., 2013). Subjects were classified into three groups for green tea consumption: ‘[latex]\le[/latex] 1 cup/day’, ‘2-3 cups/day’ and ‘[latex]\ge[/latex] 4 cups/day’. The following table shows an incomplete ANOVA table for comparing the mean body mass index (BMI) between the three green tea groups. Complete the table and use it to determine whether there is any evidence of an association between BMI and green tea consumption.

## ANOVA table for BMI (kg/m[asciimath]^2[/asciimath]) by green tea consumption

Source | df | SS | MS | F | P |
---|---|---|---|---|---|

Green Tea | | 64.42 | | | |

Residuals | | 5809.64 | | | |

Total | | 5873.83 | | | |

Exercise 4

## Summary statistics for age (years) by coffee consumption

Consumption | [asciimath]n[/asciimath] | Mean | SD |
---|---|---|---|

< 1 cup/day | 207 | 42.0 | 11.8 |

1 cup/day | 114 | 43.7 | 11.2 |

≥ 2 cups/day | 216 | 46.2 | 10.0 |

The table above gives the summary statistics for comparing the mean age between three coffee consumption groups (Pham et al., 2013).

- Use the given standard deviations to calculate the pooled residual sum of squares and degrees of freedom.
- Use the means to calculate the total of the ages in each group, [latex]T_j[/latex], and the total of all the ages, [latex]T = T_1 + T_2 + T_3[/latex].
- The sum of squares between coffee groups can be then calculated using

\[ \mbox{SSG} = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \frac{T_3^2}{n_3} - \frac{T^2}{n}, \]

where [latex]n = n_1 + n_2 + n_3[/latex] is the total number of subjects.

- Use the values you have calculated to construct an ANOVA table and determine whether there is evidence of a difference in mean age between the three coffee consumption groups.