| UP |
Firstly, we need to create a data.frame. On this dataframe we execute a random effect model.
Steps to follow:
| 1) |
Create the data-frame on the data (textfile ch24pr07 ; right click on the file and save on C:\Temp ). Look to the file in R by File/display file. |
| 2) | Create factors in the frame |
| 3) | Execute the ANOVA two-way without interaction;
Check the the graphically the predictions, residuals, QQ-plot, Cook distance plot...for every ANOVA and note your observations |
| 4) | Balanced and fixed factors; so SS-decomposition is useful and correct
The General Linear Test or classical F tests can be done. |
| 5) | If appropiate execute multiple comparison or test specific contrast questions. |
We create a data-frame by typing (or copy/paste) the red letters after > followed by <ENTER> The blue letters are the answer from the R-programme. We check the content by typing in the name of the data-frame. Remark systems like R originate in Unix in which they discriminate uppercase and undercase and replace the usual backslash "\" by two "\\".
>Ch24pr07<-read.table("c:\\temp\\Ch24pr07.txt");Ch24pr07
V1 V2 V3
1 24.4
1 1
2 22.6
1 2
3 23.8
1 3
(rest of dataset)
Or directly from the webserver:
Ch24pr07<-read.table("http://www.biw.kuleuven.be/VAKKEN/statisticsbyR/datasetsTXT/CH24PR07.txt")
We give names to the variables.
names(Ch24pr07)=c("NaConc","Bier","Flesje")
We attach the data.frame for direct access to the data.
attach(Ch24pr07)
Important is to transfrom the qualitative variables expressed by numbers into factors.
Bier<-factor(Bier)
Flesje<-factor(Flesje)
An useful visualisation is by calling the interaction plot function with (Factor A, Factor B , Response Var) as arguments
interaction.plot(Bier,Flesje,NaConc)
Parallel lines or not? Look and judge carefully. What is your opinion?

We see differences depending on the Beer type (Bier)
More meaningfull is
boxplot(NaConc~Bier)

What do you see?
Firstly check on the replications and the number of levels
tapply(NaConc,Bier,length)
1 2 3 4 5 6
8 8 8 8 8 8
rlevels=nlevels(Bier);rlevels
[1] 6
We have 8 bottles for 6 types of beer. Our problem is balanced.
nreplic=mean(tapply(NaConc,Bier,length));nreplic
[1] 8
Now we can execute a one-way factor by lm
(the data are balanced so
lm can be used; however the interpretation in the case of random effects
is specific; see chapter 24 ).
Let us fit the ANOVA model like we would do for a fixed effects; however the interpretation will be different.
ResLml<-lm(NaConc~Bier)
anova(ResLml)
Analysis of Variance Table
Response: NaConc
Df Sum Sq
Mean Sq F value
Pr(>F)
Bier 5
854.53 170.91
238.71 < 2.2e-16 ***
Residuals 42 30.07
0.72
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Compare at alfa 1%:
qf(0.99,5,42)
[1] 3.488235
Do we accept Ho or Ha ?? What does it mean?
Estimate the sigma^2 for mu (see eq 24.22 page 970 in Neter et al, 1996)
ResLm2<-lm(NaConc~Bier-1);ResLm2
Call:
lm(formula = NaConc ~ Bier - 1)
Coefficients:
Bier1 Bier2 Bier3
Bier4 Bier5 Bier6
23.64 10.67 19.34 17.50
14.21 20.41
Compare with:
tapply(NaConc,Bier,mean)
1
2 3
4 5
6
23.6375 10.6750 19.3375 17.5000 14.2125 20.4125
Ydotdot=mean(NaConc);Ydotdot
[1] 17.62917
Test with alfa 1%:
alfa=0.01
For exercise 24.8 a:
qf(1-alfa/2,rlevels-1,rlevels*(nreplic-1))
[1] 3.952827
qf(alfa/2,rlevels-1,rlevels*(nreplic-1))
[1] 0.0799256
Confidence interval on MSE-estimate for 24.8 b and c:
qchisq(1-alfa/2,rlevels*(nreplic-1))
[1] 69.336
qchisq(alfa/2,rlevels*(nreplic-1))
[1] 22.13846
layout(matrix(c(1,2,3,4),2,2))
plot(ResLml)
Are we happy with the residuals???
Remark (niet voor Proeftechniek):
This analysis can also be done by specific functions for random and mixed models:
Firstly load the package nlme (standard package but not in the base via menu: "Package/Load package" and select "nlme") and then
NLME normally uses a special data structure: groupedData composed by the dataframe and a formula, which structures the data
Ch24pr07GD=groupedData(formula=NaConc~1|Bier,data=Ch24pr07)
Ch24pr07GD$Bier<-factor(Ch24pr07GD$Bier)
ResLme=lme(NaConc~1,random=~1|Bier,data=Ch24pr07GD)
summary(ResLme)
Linear mixed-effects model fit by REML
Data: Ch24pr07GD
AIC BIC
logLik
154.9230 160.4735
-74.46152
Random effects:
Formula: ~1 | Bier
(Intercept) Residual
StdDev: 4.612346 0.8461397
Fixed effects: NaConc ~ 1
Value Std.Error DF
t-value p-value
(Intercept) 17.62917 1.886939
42 9.342733 <.0001
Standardized Within-Group Residuals:
Min
Q1 Med
Q3 Max
-1.90551292 -0.68337933 0.08232268 0.79246858 1.64968962
Number of Observations: 48
Number of Groups: 6
This is a mixed model procedure. In this case not a Least Squares criterium is used but
the REML (REstricted ML; a type Maximum Likelihood) is used for fitting. This procedure is possible
for unbalanced data.
plot(ResLme)
A plot with the standardized residuals appears.
Confidence intervals on the estimations with 95% confidence:
intervals(ResLme,0.95)
Approximate 95% confidence intervals
Fixed effects:
lower
est. upper
(Intercept) 13.82117 17.62917 21.43716
Random Effects:
Level: Bier
lower est.
upper
sd((Intercept)) 2.475231 4.612346 8.594646
Within-group standard error:
lower
est. upper
0.6832301 0.8461397
1.0478935
We see that the estimates for σb = 4.61; σ = 0.846 and β=17.63. Is this different from the lm (linear model) estimation?
| UP |
23 April, 2003 by Guido Wyseure