| UP |
Firstly, we need to create a data.frame. On this dataframe we execute a three-way ANOVA. Description of the Data-set in Neter et al, 1996.
Steps to follow:
| 1) |
Create the data-frame on the data (textfile Ch30pr11 ; right click on the file and save on C:\Temp ). Look to the file in R by File/display file. |
| 2) | Create factors in the frame |
| 3) | Execute the ANOVA three-way without interaction;
Check the the graphically the predictions, residuals, QQ-plot, Cook distance plot...for every ANOVA and note your observations |
| 4) | Balanced and fixed factors; so SS-decomposition is useful and correct
The General Linear Test or classical F tests can be done. |
| 5) | If appropiate execute multiple comparison or test specific contrast questions. |
We create a data-frame by typing (or copy/paste) the red letters after > followed by <ENTER> The blue letters are the answer from the R-programme. We check the content by typing in the name of the data-frame. Remark systems like R originate in Unix in which they discriminate uppercase and undercase and replace the usual backslash "\" by two "\\".
Ch30pr11<-read.table("c:\\temp\\ch30pr11.txt");Ch30pr11
V1 V2 V3 V4
1 0.02 1 1
1
2 0.15 1 2
2
3 0.45 1 3
4
(rest of dataset)
Or directly from the webserver
Ch30pr11<-read.table("http://www.biw.kuleuven.be/vakken/statisticsbyR/datasetsTXT/CH30PR11.txt")
We attach the data.frame for direct access to the data.
names(Ch30pr11)=c("Differ","Person","Period","Medication");attach(Ch30pr11)
Important is to transform the qualitative variables expressed by numbers into factors.
Person<-factor(Person);Period<-factor(Period);Medication<-factor(Medication)
The Period and Person are used in a 4 by 4 Latin square ( this assumes crossing of this blocking factors and no carry-over effects which is avoided by a intervening month without treatment)
An useful visualisation is by calling the interaction plot function with (Factor A, Factor B , Response Var) as arguments
interaction.plot(Person,Medication,Differ)
Parallel lines or not? Look and judge carefully. What is your opinion?

We can also look to:
interaction.plot(Period,Medication,Differ)
Now we can execute a three-way factor without interaction ANOVA (the data are balanced so the decompostion of SSTR in SSA SSB is allowed and correct).
As a general approach (valid for unbalanced and balanced) we execute the GLT. In R just execute the Full model and the Reduced model and then ask execute the 'anova' function.
LatSqfull<-lm(Differ~Medication+Person+Period)
anova(LatSqfull)
Analysis of Variance Table
Response: Differ
Df Sum Sq Mean Sq
F value Pr(>F)
Medication 3 0.43332
0.14444 95.8935 1.851e-05 ***
Person 3
0.03462 0.01154 7.6611
0.01783 *
Period 3
0.00592 0.00197 1.3098
0.35497
Residuals 6
0.00904 0.00151
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
What can we conclude for the factors; are significantly different from the H0?
We check the plots of residuals
layout(matrix(c(1,2,3,4),2,2))
plot(LatSqfull)
What are your answers to question for problem 30.11 page 1231? Are we happy with the residuals???
Calculation of the interaction contrast problem 30.12 c (NKNW pag 1231).
We can pick up the MSE out of the list by the anova function by extracting the corresponding
MSE= anova(LatSqfull)[4,3]; MSE
0.00150625
Averages according to medication:
muidotdot=tapply(Differ,Medication,mean);muidotdot
1
2 3
4
0.0050 0.1950 0.1775 0.4650
Coefficients for contrast to test the interaction:
ci=c(-1,1,1,-1); ci
[1] -1 1 1 -1
Number of observations for every medication
ni=tapply(Differ,Medication,length);ni
1 2 3 4
4 4 4 4
The standard error on the estimate (equation: 17.23 page 721)
sLest=sqrt(MSE*sum(ci^2/ni));sLest
0.03881044
Single contrast (equation 30.10 a pag 1215) use t-score:
tscore=qt(1-alfa/2,(rlevels-1)*(rlevels-2));tscore
[1] 1.943180
Contrast:
Lest=sum(ci*muidotdot);Lest
[1] -0.0975
Problem 30.12 with the same data; now we treat the medication as a factor A and a factor B.
In R we can use a boolean (or logical) variable (value "TRUE" or "FALSE") in a numeric calculation and the boolean is used as a 0 (false) and 1 (true). In one go we also ensure that the new vectors are treated as factors and not mistaken as numerical values.
trtA = factor ( ( Medication==2) + (Medication==4))
trtB = factor ( ( Medication==3) + (Medication==4)); trtB
[1] 0 0 1 1 0 1 0 1 1 1 0 0 1 0 1 0
Levels: 0 1
So in trtA and trtB we have 1 if the component is present and 0 if
absent.
Now we can test interaction by
LatSqinter<-lm(Differ~trtA+trtB+trtA:trtB+Person+Period)
anova( LatSqinter)
Analysis of Variance Table
Response: Differ
Df Sum Sq Mean
Sq F value
Pr(>F)
trtA 1
0.228006 0.228006 151.3734 1.757e-05 ***
trtB 1
0.195806 0.195806 129.9959 2.729e-05
***
Person 3 0.034619
0.011540 7.6611
0.01783 *
Period 3 0.005919
0.001973 1.3098
0.35497
trtA:trtB 1 0.009506
0.009506 6.3112
0.04577 *
Residuals 6 0.009038 0.001506
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Use this results to answer 30.12 part b.
Compare last ANOVA table with the previous one [anova(LatSqfull)]. What do you see? Is the sequence of factors in the model important and why?
Alternative Mixed model approach (NIET voor Proeftechniek)
Under construction
A Latin Square can be seen as a mixed model: fixed factor is the factor under investigation while the row/column factors are two crossed random factors.
Firstly load the package nlme
standard package but not in the base
either via menu: "Package/Load package" and select "nlme")
or by the comand:
library(nlme)
Loading required package: nls
Loading required package: lattice
and then ...... to be completed and checked.....
Compare the result from the "lme" to the classical ANOVA analysis by the "lm" function. In the first step we create a groupedData special dataframe for the data and in the secund step we need a complex random model
Ch30pr11GD=groupedData(Differ~Medication|Period/Person,data=Ch30pr11)
Latlme=lme(Differ~Medication,data=Ch30pr11GD,random=pdBlocked(list(pdIdent(~Person-1),pdIdent(~Period-1))))
Warning message:
Fewer observations than random effects in all level 2 groups in:
lme.formula(Differ ~ Medication, data = Ch30pr11GD, random =
pdBlocked(list(pdIdent(~Person -
.......etc.......
summary(Latlme)
Linear mixed-effects model fit by REML
Data: Ch30pr11GD
AIC BIC
logLik
-8.159266 -3.685865 11.07963
Random effects:
Composite Structure: Blocked
Block 1: Person
Formula: ~Person - 1 | Period
Person
StdDev: 0.0001662544
Block 2: Period
Formula: ~Period - 1 | Period
Period
StdDev: 0.0001255021
etc.....
| UP |
14 May, 2003 by Guido Wyseure