Homogeneous structures: vector, factor, matrix and array. Only one mode in the structure.
| Object | Max dimension | One mode of |
| vector | 1 | numeric, character, complex or logical |
| factor | 1 | numeric or character |
| matrix | 2 | numeric, character, complex or logical |
| array | n | numeric, character, complex or logical |
Other structures can have a mixture of modes:
| Object | Main use | Several modes allowed in one object |
| data.frame | dataset for analysis | numeric, character, complex or logical |
| ts | time series data | numeric, character, complex or logical |
| list | results; information from analysis | numeric, character, complex, logical, function, expression and formula |
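A small sketch of the difference between the two tables, using made-up values that are not part of the elastic band data below: c() coerces mixed input to a single mode, whereas list() keeps the mode of each element.

```r
# Illustrative values only; they are not part of the tutorial data.
v <- c(1, "a", TRUE)      # a vector holds one mode: everything becomes character
mode(v)                   # "character"

l <- list(1, "a", TRUE)   # a list keeps the mode of each element
sapply(l, mode)           # "numeric" "character" "logical"
```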
Homogeneous structures: vector, factor, matrix and array
The concatenate function c() is useful for entering a few values into a structure. Here we create a vector "Stretch".
> Stretch<-c(46,54,48,50,44,42,52)
> Stretch
[1] 46 54 48 50 44 42 52
The individual elements in the vector can be extracted by indices between square brackets "[ ]". Extracting the third element:
> Stretch[3]
[1] 48
Or even by "deselecting" with negative indices. Here we select all elements except the first three. The ":" creates a range, here from -1 to -3.
> Stretch[-1:-3]
[1] 50 44 42 52
To deselect elements 1 and 3 but not 2 we need to give a vector of indices, because a "," inside the brackets means another dimension.
> Stretch[-c(1,3)]
[1] 54 50 44 42 52
Because a vector has only one dimension, the -3 after the comma tries to exclude a nonexistent second dimension:
> Stretch[-1,-3]
Error in Stretch[-1, -3] : incorrect number of dimensions
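The indexing forms above can be collected in one short sketch. The small matrix m is a hypothetical helper, built only to show that a comma inside the brackets is meaningful once an object really has two dimensions.

```r
Stretch <- c(46, 54, 48, 50, 44, 42, 52)

Stretch[3]            # third element: 48
Stretch[-(1:3)]       # same as Stretch[-1:-3]: 50 44 42 52
Stretch[-c(1, 3)]     # drop elements 1 and 3: 54 50 44 42 52

# In a matrix the comma separates the two dimensions (rows, columns).
m <- matrix(Stretch[1:6], nrow = 3)   # 3 x 2 matrix, filled column by column
m[-1, -2]             # drop row 1 and column 2: 54 48
```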
We enter another vector:
> Distance<-c(148,182,173,166,109,141,166)
> Distance
[1] 148 182 173 166 109 141 166
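Note that a vector created this way carries no names. Names can be assigned, but the vector of names must have the same length as the vector itself; in the R version used here a single name for seven elements is rejected: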
> names(Distance)
NULL
> names(Distance)<-c("cm")
Error in "names<-.default"(*tmp*, value = c("cm")) :
names attribute must be the same length as the vector
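As a sketch of an assignment that does succeed: supplying one name per element attaches names to the vector. The labels "obs1" to "obs7" are made up for illustration.

```r
Distance <- c(148, 182, 173, 166, 109, 141, 166)

# One name per element; the labels are hypothetical.
names(Distance) <- paste("obs", 1:7, sep = "")
names(Distance)       # "obs1" "obs2" ... "obs7"
Distance["obs3"]      # elements can now be extracted by name: 173
```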
Now we construct a matrix, M.Elastic.Band, by placing the vectors in columns with the function cbind().
> M.Elastic.Band<-cbind(Stretch,Distance)
> M.Elastic.Band
     Stretch Distance
[1,]      46      148
[2,]      54      182
[3,]      48      173
[4,]      50      166
[5,]      44      109
[6,]      42      141
[7,]      52      166
We check the type of object:
> is.matrix(M.Elastic.Band)
[1] TRUE
> is.data.frame(M.Elastic.Band)
[1] FALSE
An alternative is to pass the elements to the function matrix() and indicate the dimensions. This is not recommended here, because the column names are lost.
> M.El.Ba<-matrix(c(Stretch,Distance),nrow=7,ncol=2)
> M.El.Ba
     [,1] [,2]
[1,]   46  148
[2,]   54  182
[3,]   48  173
[4,]   50  166
[5,]   44  109
[6,]   42  141
[7,]   52  166
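If matrix() is used anyway, column names can be supplied through its dimnames argument. The following is only a sketch of that option, not something done in the original session.

```r
Stretch  <- c(46, 54, 48, 50, 44, 42, 52)
Distance <- c(148, 182, 173, 166, 109, 141, 166)

# dimnames takes a list: row names first (NULL = none), then column names.
M.El.Ba <- matrix(c(Stretch, Distance), nrow = 7, ncol = 2,
                  dimnames = list(NULL, c("Stretch", "Distance")))
colnames(M.El.Ba)                            # "Stretch" "Distance"
all(M.El.Ba == cbind(Stretch, Distance))     # TRUE: same values as the cbind() result
```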
For statistical analysis, however, we need data.frames:
> lm(M.Elastic.Band$Stretch~M.Elastic.Band$Distance,
data=M.Elastic.Band)
Error in model.frame.default(formula =
M.Elastic.Band$Stretch ~ M.Elastic.Band$Distance, :
`data' must be a data.frame, not a matrix or array
We can do operations on vectors, matrices and arrays.
We can operate on the individual elements:
> Dist2<-Distance*2
> Dist2
[1] 296 364 346 332 218 282 332
Or we can perform matrix operations, e.g. matrix multiplication of the transposed vector Stretch (function t()) with the vector Distance. Matrix multiplication uses the operator "%*%", a sign between percent signs; matrix addition uses the ordinary "+".
> SDmat<-t(Stretch)%*%Distance
> SDmat
[,1]
[1,] 52590
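A quick check of what this product means: for two vectors of equal length it is the sum of the elementwise products, returned as a 1 x 1 matrix rather than as a plain number.

```r
Stretch  <- c(46, 54, 48, 50, 44, 42, 52)
Distance <- c(148, 182, 173, 166, 109, 141, 166)

SDmat <- t(Stretch) %*% Distance   # matrix product, a 1 x 1 matrix
dim(SDmat)                         # 1 1
sum(Stretch * Distance)            # the same value as a plain number: 52590
```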
Application:
We would like to determine the sample size for a t-test. We are interested in detecting a difference between the means that is equal to one standard deviation.
> Nsample<-c(5:30)
> Nsample
 [1]  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
[26] 30
Use help to find the default values of the function power.t.test:
> help(power.t.test)
We supply the argument delta; sd and sig.level keep their default values, which are shown in the output.
> power.t.test(n=Nsample,delta=1)
Two-sample t test power calculation
n = 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30
delta = 1
sd = 1
sig.level = 0.05
power = 0.2859276, 0.3471565, 0.4056971, 0.4611759, 0.5133256, 0.5619846,
0.6070844, 0.6486344, 0.6867055, 0.7214166, 0.7529210, 0.7813965, 0.8070359,
0.8300400, 0.8506120, 0.8689528, 0.8852576, 0.8997136, 0.9124983, 0.9237780,
0.9337076, 0.9424303, 0.9500772, 0.9567684, 0.9626126, 0.9677083
alternative = two.sided
NOTE: n is number in *each* group
We see that the function worked on the individual elements of the vector Nsample. We can see that n=17 is the smallest sample size that achieves a power of 80%, which is often accepted as sufficient power.
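Instead of reading the value off the printout, the smallest sufficient n can be extracted from the result. This is a minimal sketch; the names Pow and Nmin are introduced here for illustration only.

```r
Nsample <- 5:30
Pow <- power.t.test(n = Nsample, delta = 1)   # sd and sig.level keep their defaults

# The result is a list; its elements n and power are vectors of equal length.
Nmin <- min(Pow$n[Pow$power >= 0.80])
Nmin    # 17
```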
Data.frames and lists
Most analyses have to be run on data.frames. The result of an analysis is a list.
We can create a data.frame from the matrix:
> Elastic.Band<-data.frame(M.Elastic.Band)
> Elastic.Band
  Stretch Distance
1      46      148
2      54      182
3      48      173
4      50      166
5      44      109
6      42      141
7      52      166
R has remembered the names of the vectors and uses them as column names in the data.frame.
> names(Elastic.Band)
[1] "Stretch" "Distance"
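A data.frame can also be built directly from the vectors, without going through a matrix. The sketch below adds a hypothetical character column, Band, only to show that the columns of a data.frame may differ in mode.

```r
Stretch  <- c(46, 54, 48, 50, 44, 42, 52)
Distance <- c(148, 182, 173, 166, 109, 141, 166)

Elastic.Band <- data.frame(Stretch, Distance)   # column names come from the vectors
names(Elastic.Band)                             # "Stretch" "Distance"

# Hypothetical extra column with made-up labels, one per row.
Elastic.Band$Band <- c("a", "b", "c", "d", "e", "f", "g")
sapply(Elastic.Band, mode)                      # numeric, numeric, character
```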
For some functions, like plot, either a matrix or a data.frame can be used:
> plot(Distance~Stretch,data=M.Elastic.Band)
> plot(Distance~Stretch,data=Elastic.Band)
The same plot will appear.
However, to perform a statistical analysis like "lm" (linear model) for regression, the data has to be a data.frame. The commands below are equivalent and give the same result.
> lm(Elastic.Band$Distance~Elastic.Band$Stretch)
> lm(Distance~Stretch,data=Elastic.Band)
> attach(Elastic.Band)
> lm(Distance~Stretch)
Normally we prefer to store the result in a list:
> RegEB<-lm(Distance~Stretch)
> RegEB
Call:
lm(formula = Distance ~ Stretch)
Coefficients:
(Intercept) Stretch
-63.571 4.554
We see the same output each time. However, there is more in the list than what is displayed.
With the following commands we can see the full list.
> Nelem<-length(RegEB)
> Nelem
[1] 12
> RegEB[1:Nelem]
This gives a very long display of the 12 different elements in the list. The content will be studied later. What matters here is understanding the concept of a list. One example out of the long list:
$fitted.values
1 2 3 4 5 6 7
145.8929 182.3214 155.0000 164.1071 136.7857 127.6786 173.2143
We can get this element by its name or by its position in the list (5):
> RegEB[5]
$fitted.values
       1        2        3        4        5        6        7
145.8929 182.3214 155.0000 164.1071 136.7857 127.6786 173.2143
> RegEB$fitted.values
       1        2        3        4        5        6        7
145.8929 182.3214 155.0000 164.1071 136.7857 127.6786 173.2143
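The two forms are not quite the same, which the sketch below tries to make visible: single brackets return a list of length one (hence the "$fitted.values" header in the printout), while "$" and double brackets return the element itself. The use of names() and "[[ ]]" here is an addition to the original session.

```r
Stretch  <- c(46, 54, 48, 50, 44, 42, 52)
Distance <- c(148, 182, 173, 166, 109, 141, 166)
RegEB <- lm(Distance ~ Stretch)

names(RegEB)          # the names of the 12 elements of the list
class(RegEB[5])       # "list": single brackets keep the list wrapper
class(RegEB[[5]])     # "numeric": double brackets return the element itself
identical(RegEB[[5]], RegEB$fitted.values)   # TRUE
```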
It is useful to save the Workspace under a name at the end of a session (and also during it). At the beginning of the next session the Workspace can be loaded, so that the objects already created are available again.
Similarly the history can be saved.
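From the command line the same can be done with save(), save.image() and load(). The sketch below writes to a temporary file so that it leaves nothing behind; the file names are hypothetical, and in a real session a name of your own choice in the working directory would be used.

```r
Stretch <- c(46, 54, 48, 50, 44, 42, 52)

ws_file <- tempfile(fileext = ".RData")   # hypothetical file name
save(Stretch, file = ws_file)             # save selected objects
# save.image(file = ws_file)              # would save every object in the Workspace

rm(Stretch)                               # simulate the start of a new session
load(ws_file)                             # the saved objects are available again
Stretch

# savehistory("mysession.Rhistory")       # interactive sessions only; name is hypothetical
```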
10 May 2005 by Guido Wyseure