ANCOVA : Help

ANCOVA (Analysis of Covariance) is applied to compare two or more independent sample means, while controlling for a covariate.

Suppose you want to compare the mean marks scored in 12^th standard by students in two different schools. The results show that the mean marks for school A are significantly higher than those for school B. Here, we tend to infer that school A is much better than school B.

Then comes the next issue: school A may selectively admit more intelligent students. Now you want to test whether the difference in marks is due to the quality of education or to the quality of the students admitted to school A. To measure the quality of the admitted students, you can consider the marks they obtained in previous years, before admission to these schools. For example, marks obtained in 10^th standard (assuming students change school after 10^th standard).

Now you want to compare the mean marks in 12^th AFTER controlling for marks in 10^th. Here the covariate is marks in 10^th standard. So, in this example, the dependent variable is marks in 12^th standard, the independent variable is the school attended for 12^th standard, and the covariate is the marks obtained before admission to the school. ANCOVA can be applied here to compare the two means after controlling for the covariate.

Assumptions
1. Dependent variable and covariate should be continuous and normally distributed in all the categories of independent variable.
2. Samples are drawn using random sampling technique.
3. The observations are independent. No observation belongs to more than one group.
4. The residuals should have equal variance in all the categories of independent variable (homogeneity of variance / homoscedasticity). It can be tested by Levene's test of homogeneity of variance of errors.
5. There are no outliers.
6. There is one independent variable, which is nominal or ordinal, with two or more possible levels / categories.
7. Covariate should be linearly related to the dependent variable for each category of independent variable. It can be checked with a scatter plot of dependent variable against covariate, one for each category of independent variable.

Null hypothesis and Alternate Hypothesis

Null Hypothesis

A. Population means for all categories of the independent variable are equal (μ_i = μ_j), after controlling for covariate.

Alternate Hypothesis

A. There are at least two means (at least a pair of means), which are different from each other (μ_i ≠ μ_j), after controlling for covariate.

Where μ_i and μ_j can be any two category means.

Following is the sample data for marks in 12^th, school and previous marks.

Marks_12_Standard	School	Previous_marks
80	B	64
77	B	61
78	B	60
70	B	54
95	A	90
90	A	92
78	B	52
76	B	57
93	A	85
92	A	90
75	B	57
75	B	65
73	B	55
95	A	88
75	B	51
75	B	69
80	B	60
82	A	83
80	A	98
99	A	92
80	B	56
91	A	94
80	B	68
77	B	65
83	B	67
85	B	70

You can copy the above table and paste it in the textbox on the ANCOVA page. (You may need to paste it in Excel first, to get the data in the required format.)
Please choose your alpha (usually 5%). Select type III as model type for SSQ. Type III SSQ is more appropriate for most situations. For details, please read about types of SSQ in the two-way ANOVA help page.
Click on "Run test". You will get the following output.

Results

Descriptive: School

Category	Mean	SD	Sample Size
B	77.4706	3.6762	17
A	90.7778	6.1599	9

Reference Category = B
(Explanation: Above table gives the raw (unadjusted) mean, SD and sample size of the dependent variable (12th marks) for both the schools.)
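The descriptive table can be reproduced by hand. The following is a minimal sketch using only the Python standard library; the variable and function names (marks, describe) are illustrative and are not part of the tool. It assumes the SD column is the sample standard deviation (divisor n - 1).

```python
from statistics import mean, stdev

# (12th-standard marks, school) pairs copied from the sample data
marks = [
    (80, "B"), (77, "B"), (78, "B"), (70, "B"), (95, "A"), (90, "A"),
    (78, "B"), (76, "B"), (93, "A"), (92, "A"), (75, "B"), (75, "B"),
    (73, "B"), (95, "A"), (75, "B"), (75, "B"), (80, "B"), (82, "A"),
    (80, "A"), (99, "A"), (80, "B"), (91, "A"), (80, "B"), (77, "B"),
    (83, "B"), (85, "B"),
]

def describe(school):
    """Return (mean, sample SD, n) of 12th-standard marks for one school."""
    values = [m for m, s in marks if s == school]
    return mean(values), stdev(values), len(values)

for school in ("B", "A"):
    m, sd, n = describe(school)
    print(f"{school}: mean={m:.4f}  SD={sd:.4f}  n={n}")
```

These are raw means only; they do not yet take the covariate into account.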

ANCOVA Results

Tests of between-subjects effects
Dependent variable = Marks_12_Standard

Source	Sum of Squares (Type III)	df	Mean Sum of Squares	F	P
Corrected Model	1078.0207	2	539.0104	25.6234	0
Previous_marks	35.9654	1	35.9654	1.7097	0.2039
School	35.7206	1	35.7206	1.6981	0.2054
Error	483.8254	23	21.0359
Corrected Total	1561.8462	25

Non-significant p value for School (P = 0.2054): The means for the categories of School are not significantly different, after adjusting for Previous_marks.
(Explanation: You need to look at the row for our independent variable: School. Here F=1.6981, p=0.2054, which is non-significant. See the interpretation above.)
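To see where the F value for School comes from, the model can be fitted twice by ordinary least squares: once with School and once without it. The drop in residual sum of squares is the sum of squares for School (for this model without an interaction term, that matches the Type III value). This is a sketch in plain Python, not the tool's own implementation; fit_sse is a hypothetical helper written for this illustration.

```python
# (12th marks, school, previous marks) copied from the sample data
rows = [
    (80, "B", 64), (77, "B", 61), (78, "B", 60), (70, "B", 54), (95, "A", 90),
    (90, "A", 92), (78, "B", 52), (76, "B", 57), (93, "A", 85), (92, "A", 90),
    (75, "B", 57), (75, "B", 65), (73, "B", 55), (95, "A", 88), (75, "B", 51),
    (75, "B", 69), (80, "B", 60), (82, "A", 83), (80, "A", 98), (99, "A", 92),
    (80, "B", 56), (91, "A", 94), (80, "B", 68), (77, "B", 65), (83, "B", 67),
    (85, "B", 70),
]

def fit_sse(columns, y):
    """Least-squares fit with an intercept; returns (coefficients, residual SS)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[best] = A[best], A[p]
        for r in range(k):
            if r != p:
                f = A[r][p] / A[p][p]
                A[r] = [a - f * b for a, b in zip(A[r], A[p])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    sse = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
              for i in range(n))
    return beta, sse

y = [float(r[0]) for r in rows]
previous = [float(r[2]) for r in rows]
is_school_a = [1.0 if r[1] == "A" else 0.0 for r in rows]  # B is the reference

beta, sse_full = fit_sse([previous, is_school_a], y)  # covariate + School
_, sse_without_school = fit_sse([previous], y)        # covariate only

df_error = len(y) - 3                 # 26 observations, 3 parameters
ss_school = sse_without_school - sse_full
f_school = ss_school / (sse_full / df_error)
print(f"SS(School)={ss_school:.4f}  SS(Error)={sse_full:.4f}  F={f_school:.4f}")
```

The p value is then read from the F distribution with 1 and 23 degrees of freedom; that step needs a statistics library or an F table and is left out here.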

Parameter Estimates

Parameter	Beta Coefficients	Std Error	t	P	LB	UB
Intercept	64.2726	10.1547	6.3294	0	43.2661	85.2792
Previous_marks	0.2176	0.1664	1.3076	0.2039	-0.1267	0.5619
A	6.8711	5.2729	1.3031	0.2054	-4.0367	17.7788

Explanation: Using this regression table, we can predict 12^th marks for a student. The regression equation will be as follows
DV = 64.2726 + 0.2176 * (Previous_marks) + 6.8711 * (School)
Here, DV= 12th marks
School = 1, for school A
School = 0, for school B
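As a small worked example, the regression equation can be written as a function. The coefficients are copied from the Parameter Estimates table; the function name predict_marks_12 is only illustrative.

```python
# Coefficients copied from the "Parameter Estimates" table
INTERCEPT = 64.2726
B_PREVIOUS = 0.2176   # slope for Previous_marks
B_SCHOOL_A = 6.8711   # shift for school A relative to reference school B

def predict_marks_12(previous_marks, school):
    """Predicted 12th-standard marks from previous marks and school ('A' or 'B')."""
    school_code = 1 if school == "A" else 0
    return INTERCEPT + B_PREVIOUS * previous_marks + B_SCHOOL_A * school_code

# Two students with the same previous marks (60), one in each school
print(round(predict_marks_12(60, "B"), 4))
print(round(predict_marks_12(60, "A"), 4))
```

For any given value of previous marks, the two predictions differ by exactly the coefficient for A (6.8711).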

Adjusted Means

Category	Adjusted Mean	Standard Error	LB	UB
B	79.6985	2.0348	75.4891	83.9078
A	86.5696	3.563	79.1988	93.9403

Explanation: Above table shows average 12^th marks, after adjusting for covariate. Please compare the means before and after adjustment.
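An adjusted mean is the predicted 12^th mark for a school when the covariate is set to its overall mean. The sketch below assumes this is how the tool computes the table (the usual definition); it uses the rounded coefficients from the Parameter Estimates table, so the last decimals can differ slightly from the tool's output.

```python
from statistics import mean

# Previous_marks for all 26 students, copied from the sample data
previous_marks = [
    64, 61, 60, 54, 90, 92, 52, 57, 85, 90, 57, 65, 55,
    88, 51, 69, 60, 83, 98, 92, 56, 94, 68, 65, 67, 70,
]

# Rounded coefficients copied from the "Parameter Estimates" table
INTERCEPT, B_PREVIOUS, B_SCHOOL_A = 64.2726, 0.2176, 6.8711

grand_mean_previous = mean(previous_marks)  # covariate mean over both schools

adjusted_b = INTERCEPT + B_PREVIOUS * grand_mean_previous
adjusted_a = adjusted_b + B_SCHOOL_A

print(f"Mean of Previous_marks = {grand_mean_previous:.4f}")
print(f"Adjusted mean, school B = {adjusted_b:.4f}")
print(f"Adjusted mean, school A = {adjusted_a:.4f}")
```

The raw difference between the schools (90.7778 - 77.4706 = 13.3072) shrinks to 6.8711 after adjustment, because students in school A had much higher previous marks.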

Column chart showing Unadjusted and adjusted means

Post-hoc Group-wise comparisons

Group-wise comparisons

Comparison Between	Difference between means	Standard Error	t	P^#	LB	UB
B AND A	-6.8711	5.2729	-1.3031	0.2049	-17.7788	4.0367

^# Multiple comparisons adjustments: Bonferroni
Explanation: If the table for "Tests of between-subjects effects" shows a significant p value for the independent variable, it means that at least one pair of category means of the independent variable is significantly different. Above table provides results of the post-hoc test (Bonferroni) to identify such pair/pairs of means.
(In this example, the p value for the independent variable School is non-significant in the table for "Tests of between-subjects effects". So the post-hoc test is actually not needed.)
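The Bonferroni adjustment itself is simple: each pairwise p value is multiplied by the number of comparisons and capped at 1. The sketch below shows the standard rule, not the tool's code; the function names are illustrative.

```python
def number_of_pairs(k):
    """Number of pairwise comparisons among k categories."""
    return k * (k - 1) // 2

def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# With two schools there is a single comparison, so nothing changes
print(number_of_pairs(2), bonferroni([0.2054]))

# With three categories there are three comparisons (hypothetical p values)
print(number_of_pairs(3), bonferroni([0.01, 0.04, 0.5]))
```

With only two categories, as in this example, the adjustment has no effect on the p value.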

Levene's test for equality of error variances

F	df1	df2	P
4.1021	1	24	0.0541

Levene test is not significant (P = 0.0541): The assumption of equality of variances (homoscedasticity) is met.
Explanation: If Levene test is significant, then the assumption of equality of error variances is violated. In this situation, the results of ANCOVA may not be reliable.
In this example, Levene test is non significant, indicating that we can proceed with ANCOVA.
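The Levene statistic can be reproduced approximately from the residuals. The sketch below assumes the mean-centred form of the test, applied to the residuals of the fitted ANCOVA model; the help page does not state which variant the tool uses, so treat this as an illustration. It uses the rounded coefficients from the Parameter Estimates table.

```python
from statistics import mean

# (12th marks, school, previous marks) copied from the sample data
rows = [
    (80, "B", 64), (77, "B", 61), (78, "B", 60), (70, "B", 54), (95, "A", 90),
    (90, "A", 92), (78, "B", 52), (76, "B", 57), (93, "A", 85), (92, "A", 90),
    (75, "B", 57), (75, "B", 65), (73, "B", 55), (95, "A", 88), (75, "B", 51),
    (75, "B", 69), (80, "B", 60), (82, "A", 83), (80, "A", 98), (99, "A", 92),
    (80, "B", 56), (91, "A", 94), (80, "B", 68), (77, "B", 65), (83, "B", 67),
    (85, "B", 70),
]
INTERCEPT, B_PREVIOUS, B_SCHOOL_A = 64.2726, 0.2176, 6.8711

# Residual = observed marks minus marks predicted by the regression equation
residuals = {"A": [], "B": []}
for marks_12, school, previous in rows:
    predicted = (INTERCEPT + B_PREVIOUS * previous
                 + (B_SCHOOL_A if school == "A" else 0.0))
    residuals[school].append(marks_12 - predicted)

# Absolute deviation of each residual from its own group's mean residual
z = {g: [abs(r - mean(res)) for r in res] for g, res in residuals.items()}

# One-way ANOVA F statistic computed on the absolute deviations
all_z = [v for values in z.values() for v in values]
grand = mean(all_z)
ss_between = sum(len(v) * (mean(v) - grand) ** 2 for v in z.values())
ss_within = sum((x - mean(v)) ** 2 for v in z.values() for x in v)
df1 = len(z) - 1
df2 = len(all_z) - len(z)
levene_f = (ss_between / df1) / (ss_within / df2)
print(f"Levene F = {levene_f:.4f}  (df1 = {df1}, df2 = {df2})")
```

A larger F means that the spread of residuals differs more between the schools. Here the residuals of school A are more spread out than those of school B, which is why the p value (0.0541) is close to the 5% level.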

Lack of Fit test

Source	Sum of Squares	df	Mean Sum of Squares	F	P
Lack of fit	434.3254	18	24.1292	2.4373	0.1646
Pure Error	49.5	5	9.9

Lack of Fit test is not significant (P = 0.1646): The linear relationship fits the model adequately.
Explanation: If Lack of Fit is significant, then there may be another relationship between the variables, which fits the data much better than this linear relationship.
In this example, Lack of Fit is non-significant, hence we can say that linear relationship in ANCOVA can be accepted.
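The Lack of Fit table splits the error sum of squares into two parts. Pure error comes from students who share the same school and the same previous marks but scored differently in 12^th; the rest is lack of fit. The sketch below takes the Error row of the ANCOVA table as given and reproduces the split; the variable names are illustrative.

```python
from collections import defaultdict

# (12th marks, school, previous marks) copied from the sample data
rows = [
    (80, "B", 64), (77, "B", 61), (78, "B", 60), (70, "B", 54), (95, "A", 90),
    (90, "A", 92), (78, "B", 52), (76, "B", 57), (93, "A", 85), (92, "A", 90),
    (75, "B", 57), (75, "B", 65), (73, "B", 55), (95, "A", 88), (75, "B", 51),
    (75, "B", 69), (80, "B", 60), (82, "A", 83), (80, "A", 98), (99, "A", 92),
    (80, "B", 56), (91, "A", 94), (80, "B", 68), (77, "B", 65), (83, "B", 67),
    (85, "B", 70),
]

# Error row copied from the "Tests of between-subjects effects" table
SS_ERROR, DF_ERROR = 483.8254, 23

# Group students who have identical predictor values (school, previous marks)
cells = defaultdict(list)
for marks_12, school, previous in rows:
    cells[(school, previous)].append(marks_12)

# Pure error: variation of 12th marks within each group of replicates
pure_ss = sum(sum((y - sum(ys) / len(ys)) ** 2 for y in ys)
              for ys in cells.values())
pure_df = len(rows) - len(cells)

lack_of_fit_ss = SS_ERROR - pure_ss
lack_of_fit_df = DF_ERROR - pure_df
f_lack_of_fit = (lack_of_fit_ss / lack_of_fit_df) / (pure_ss / pure_df)

print(f"Pure error:  SS={pure_ss:.4f}  df={pure_df}")
print(f"Lack of fit: SS={lack_of_fit_ss:.4f}  df={lack_of_fit_df}  F={f_lack_of_fit:.4f}")
```

In this data only five pairs of students share the same school and previous marks, so the pure error has just 5 degrees of freedom.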

How to report ANCOVA results:
A one-way ANCOVA was conducted to compare the marks in 12^th standard obtained by students in two schools, whilst controlling for marks obtained before admission to the school. Levene's test was carried out to test the assumption of equality of error variances, and the assumption was met (F=4.1021, p=0.0541). The ANCOVA revealed no significant difference in mean marks in 12^th standard between the two schools (F=1.6981, p=0.2054), after controlling for previous marks.

@ Sachin Mumbare

ANCOVA (Analysis of Covariance)

AIM: To compare (two or more) independent sample means while controlling for a covariate