Open Rstudio to do the practicals. Note that tasks with * are optional.
In this practical, a number of R packages are used. The packages used (with versions that were used to generate the solutions) are:
survival
(version: 3.3.1)R version 4.2.1 (2022-06-23 ucrt)
For this practical, we will use the heart and
retinopathy data sets from the survival
package. More details about the data sets can be found in:
https://stat.ethz.ch/R-manual/R-devel/library/survival/html/heart.html
https://stat.ethz.ch/R-manual/R-devel/library/survival/html/retinopathy.html
Examine the following code in R.
<- function(k){
h if (k <= 20){
3 * k
else {
} 2 * k
} }
Investigate what this code takes as an input (e.g. scalar, vector, matrix, array, data.frame, list) and what the output will be in every case.
<- function(k){
h if (k <= 20){
3 * k
else {
} 2 * k
}
}
# scalar:
<- 10
scalar_test
# vector:
<- c(1:10)
vec_test
# matrix:
<- matrix(data = 1:6, nrow = 3, ncol = 3) mat_test
## Warning in matrix(data = 1:6, nrow = 3, ncol = 3): data length differs from size of matrix: [6 != 3
## x 3]
#array:
<- array(data = 1:5, dim = c(2, 2, 2))
array_test
# data frame:
<- data.frame(v1 = 1:10, v2 = sample(0:1, 10, replace = TRUE))
df_test
# list:
<- list(l1 = 1:30, l2 = c("m", "f"), l3 = c(TRUE, TRUE, FALSE, FALSE, FALSE))
list_test
# Test the function (after you remove the #):
# h(scalar_test)
# h(vec_test)
# h(mat_test)
# h(array_test)
# h(df_test)
# h(list_test)
Examine the following code in R.
<- function(k){
h for(i in 1:length(k)){
<- if (k[i] <= 20){
k[i]3 * k[i]
else {
} 2 * k[i]
}
}
k }
Investigate what this code takes as an input (e.g. scalar, vector, matrix, array, data.frame, list) and what the output will be in every case.
<- function(k){
h for(i in 1:length(k)){
<- if (k[i] <= 20){
k[i]3 * k[i]
else {
} 2 * k[i]
}
}
k
}
# scalar:
<- 10
scalar_test
# vector:
<- c(1:10)
vec_test
# matrix:
<- matrix(data = 1:6, nrow = 3, ncol = 3) mat_test
## Warning in matrix(data = 1:6, nrow = 3, ncol = 3): data length differs from size of matrix: [6 != 3
## x 3]
#array:
<- array(data = 1:5, dim = c(2, 2, 2))
array_test
# data frame:
<- data.frame(v1 = 1:10, v2 = sample(0:1, 10, replace = TRUE))
df_test
# list:
<- list(l1 = 1:30, l2 = c("m", "f"), l3 = c(TRUE, TRUE, FALSE, FALSE, FALSE))
list_test
# h(scalar_test)
# h(vec_test)
# h(mat_test)
# h(array_test)
# h(df_test)
# h(list_test)
Define a function (with the name mysummary
) that takes
one parameter (let’s call it x
) and investigates if it is a
data frame, a factor, or a numeric variable and returns the message
“This is a data.frame” if x
is a data frame, “This is a
factor” is x
is a factor and “This is numeric” if
x
is numeric.
Use an if statement. When you want to display results from the function you can use the print(…) function.
<- function(x){
mysummary if(is.data.frame(x)){
print('This is a data.frame')
else if(is.factor(x)){
}print('This is a factor')
else if(is.numeric(x)){
}print('This is numeric')
}
}<- data.frame(v1 = 1:10, v2 = sample(0:1, 10, replace = TRUE))
df_test <- as.factor(c("y", "n", "n", "n"))
vec1_test <- 1:10
vec2_test mysummary(df_test)
## [1] "This is a data.frame"
mysummary(vec1_test)
## [1] "This is a factor"
mysummary(vec2_test)
## [1] "This is numeric"
Extend the previous function mysummary
to take again one
parameter (let’s call it x
) but to return also some
descriptive statistics. We therefore modify the function in such a way
that a frequency table is given for factors and the mean and standard
deviation are given for numeric data. For a data frame, return only an
extra message that indicates that this is not implemented yet.
Use the functions table(…), mean(…) and sd(…).
<- function(x){
mysummary if(is.data.frame(x)){
print('This is a data.frame')
print('Not yet implemented')
else if(is.factor(x)){
}print('This is a factor')
print(table(x))
else if(is.numeric(x)){
}print('This is numeric')
print('mean')
print(mean(x))
print('sd')
print(sd(x))
}
}<- data.frame(v1 = 1:10, v2 = sample(0:1, 10, replace = TRUE))
df_test <- as.factor(c("y", "n", "n", "n"))
vec1_test <- 1:10
vec2_test mysummary(df_test)
## [1] "This is a data.frame"
## [1] "Not yet implemented"
mysummary(vec1_test)
## [1] "This is a factor"
## x
## n y
## 3 1
mysummary(vec2_test)
## [1] "This is numeric"
## [1] "mean"
## [1] 5.5
## [1] "sd"
## [1] 3.02765
The previous function mysummary
still does not do
anything useful when we apply it on the whole data.frame. Extend the
function to take again one parameter (let’s call it x
) but
to return also some descriptive statistics for the data frame.
This can be solved in the following way: whenever the function is called using a data.frame as argument, we use a for loop to loop over its columns. Within this loop we let the function call itself but now using the column as argument.
<- function(x){
mysummary if(is.data.frame(x)){
print('This is a data.frame')
for(i in 1:ncol(x)){
print(names(x)[i])
mysummary(x[,i])
} else if(is.factor(x)){
}print('This is a factor')
print(table(x))
else if(is.numeric(x)){
}print('This is numeric')
print('mean')
print(mean(x))
print('sd')
print(sd(x))
}
}<- data.frame(v1 = 1:10, v2 = sample(0:1, 10, replace = TRUE))
df1_test <- data.frame(v1 = 1:10, v2 = sample(c("no", "yes"), 10, replace = TRUE))
df2_test mysummary(df1_test)
## [1] "This is a data.frame"
## [1] "v1"
## [1] "This is numeric"
## [1] "mean"
## [1] 5.5
## [1] "sd"
## [1] 3.02765
## [1] "v2"
## [1] "This is numeric"
## [1] "mean"
## [1] 0.2
## [1] "sd"
## [1] 0.421637
mysummary(df2_test)
## [1] "This is a data.frame"
## [1] "v1"
## [1] "This is numeric"
## [1] "mean"
## [1] 5.5
## [1] "sd"
## [1] 3.02765
## [1] "v2"
Create a function with the name std_num
that takes as
input a data frame (let’s call it DF
), standardizes all the
numerical variables and returns the new data frame. Apply this function
to the heart data set.
Use an if statement and a for loop. Note that you will have to transform some variables to factors first.
$event <- as.factor(heart$event)
heart$surgery <- as.factor(heart$surgery)
heart$transplant <- as.factor(heart$transplant)
heart$id <- as.factor(heart$id)
heart
<- function(DF){
std_num for (j in 1:dim(DF)[2]){
if (is.numeric(DF[,j])){
<- (DF[,j] - mean(DF[,j]))/sd(DF[,j])
DF[,j]
}
}
DF
}std_num(heart)
Create a for loop that goes through the columns of the retinopathy data set and plots each column. If the column is a numerical variable then use a density plot, if the column is a categorical variable then use a barchart. Create a plot with multiple panels.
Use an if statement. Use the function par(mfrow = …) to create multiple plots. Note that you will have to transform some variables to factors first.
$trt <- as.factor(retinopathy$trt)
retinopathy$status <- as.factor(retinopathy$status)
retinopathy
par(mfrow = c(3, 3))
for (j in 1:dim(retinopathy)[2]){
if (is.numeric(retinopathy[,j])){
plot(density(retinopathy[,j]), col = rgb(0,0,1,0.5))
polygon(density(retinopathy[,j]), col = rgb(0,0,1,0.5), border = "blue")
else if (is.factor(retinopathy[,j])){
} plot(retinopathy[,j])
} }
Create a function that applies the previous code in any data set. The
name of the function will be plot_summary
and it will take
as input a data frame (let’s call it dt
) and the dimension
of the panel that includes the plots. In particular,
dim_row
represents the rows and dim_col
the
columns. The function will return a plot for each column. If the column
is a numerical variable then use a density plot, if the column is
categorical variable then use a barchart. Apply this function to the
retinopathy data set.
Use an if statement and a for loop. Use the function par(mfrow = …) to create multiple plots. Note that you will have to transform some variables to factors first.
$trt <- as.factor(retinopathy$trt)
retinopathy$status <- as.factor(retinopathy$status)
retinopathy
<- function(dt, dim_row, dim_col){
plot_summary par(mfrow = c(dim_row, dim_col))
for (j in 1:dim(dt)[2]){
if (is.numeric(dt[,j])){
plot(density(dt[,j]), col = rgb(0,0,1,0.5))
polygon(density(dt[,j]), col = rgb(0,0,1,0.5), border = "blue")
else if (is.factor(dt[,j])){
} plot(dt[,j])
}
}
}plot_summary(retinopathy, 3, 3)
Extend the previous function plot_summary
to include
also plot titles indicating the variable. Apply this function to the
retinopathy data set.
Use the paste0(…) function.
$trt <- as.factor(retinopathy$trt)
retinopathy$status <- as.factor(retinopathy$status)
retinopathy
<- function(dt, dim_row, dim_col){
plot_summary par(mfrow = c(dim_row, dim_col))
for (j in 1:dim(dt)[2]){
if (is.numeric(dt[,j])){
plot(density(dt[,j]), col = rgb(0,0,1,0.5), main = paste0(colnames(dt)[j]))
polygon(density(dt[,j]), col = rgb(0,0,1,0.5), border = "blue")
else if (is.factor(dt[,j])){
} plot(dt[,j], main = paste0(colnames(dt)[j]))
}
}
}plot_summary(retinopathy, 3, 3)
For each data set (heart and retinopathy), select only the observations that report an event.
Note that the variable name of the events is diffent for the two data sets.
$event == 1, ] heart[heart
$status == 1, ] retinopathy[retinopathy
Create a function with the name dt_events
that takes as
input a list with data sets (let’s call it dt_list
) and a
vector indicating the name of the event variable for each data set
(let’s call it event_name
). The function will return a list
consisting of each data set but including only the observations/rows
that had an event. Create a list that consists of the data sets
heart and retinopathy and apply the
function.
Use a for loop. Note that the variable name of the events is different for the two data sets.
<- function(dt_lists, event_name){
dt_events <- list()
new_dt for (k in 1:length(dt_lists)) {
<- dt_lists[[k]]
dt <- dt[dt[[event_name[k]]] == 1, ]
new_dt[[k]]
}print(new_dt)
}
dt_events(dt_lists = list(heart, retinopathy), event_name = c("event", "status"))
## [[1]]
## start stop event age year surgery transplant id
## 1 0.0 50 1 -17.15537303 0.12320329 0 0 1
## 2 0.0 6 1 3.83572895 0.25462012 0 0 2
## 4 1.0 16 1 6.29705681 0.26557153 0 1 3
## 6 36.0 39 1 -7.73716632 0.49007529 0 1 4
## 7 0.0 18 1 -27.21423682 0.60780287 0 0 5
## 8 0.0 3 1 6.59548255 0.70088980 0 0 6
## 10 51.0 675 1 2.86926762 0.78028747 0 1 7
## 11 0.0 40 1 -2.65023956 0.83504449 0 0 8
## 12 0.0 85 1 -0.83778234 0.85694730 0 0 9
## 14 12.0 58 1 -5.49760438 0.86242300 0 1 10
## 16 26.0 153 1 -0.01916496 0.87337440 0 1 11
## 17 0.0 8 1 5.19370294 0.96372348 0 0 12
## 19 17.0 81 1 6.57357974 0.96919918 0 1 13
## 21 37.0 1387 1 6.01232033 0.97193703 0 1 14
## 22 0.0 1 1 5.81519507 0.99110198 1 0 15
## 24 28.0 308 1 1.44832307 1.07049966 0 1 16
## 25 0.0 36 1 -27.66872005 1.07597536 0 0 17
## 27 20.0 43 1 8.84873374 1.08692676 0 1 18
## 28 0.0 37 1 11.12388775 1.13347023 0 0 19
## 30 18.0 28 1 7.27994524 1.33059548 0 1 20
## 32 8.0 1032 1 -4.65708419 1.33880903 0 1 21
## 34 12.0 51 1 -5.21560575 1.46201232 0 1 22
## 36 3.0 733 1 10.35728953 1.52772074 0 1 23
## 38 83.0 219 1 3.80013689 1.56605065 0 1 24
## 42 0.0 263 1 -39.21423682 1.59069131 0 0 27
## 44 71.0 72 1 6.02327173 1.68377823 0 1 28
## 45 0.0 35 1 2.43394935 1.78507871 0 0 29
## 47 16.0 852 1 -3.08829569 1.88364134 0 1 30
## 48 0.0 16 1 6.88569473 1.89459274 0 0 31
## 50 17.0 77 1 16.40793977 1.91101985 0 1 32
## 55 0.0 12 1 -4.53388090 2.30800821 0 0 35
## 57 46.0 100 1 0.92539357 2.50787132 0 1 36
## 59 19.0 66 1 13.50034223 2.56536619 0 1 37
## 61 4.5 5 1 -6.52977413 2.59274470 0 1 38
## 63 2.0 53 1 2.51882272 2.63381246 0 1 39
## 68 0.0 3 1 -11.55920602 2.88843258 0 0 42
## 69 0.0 2 1 -4.60780287 3.05817933 1 0 43
## 70 0.0 40 1 -5.42094456 3.16495551 1 0 44
## 72 1.0 45 1 -11.81656400 3.26351814 0 1 45
## 74 2.0 996 1 0.61054073 3.27720739 1 1 46
## 76 21.0 72 1 -0.90075291 3.34017796 0 1 47
## 77 0.0 9 1 8.03559206 3.34839151 0 0 48
## 81 83.0 980 1 -2.11362081 3.37577002 1 1 50
## 83 32.0 285 1 0.73374401 3.47707050 0 1 51
## 84 0.0 102 1 -6.75154004 3.56468172 0 0 52
## 86 41.0 188 1 -0.65708419 3.75085558 0 1 53
## 87 0.0 3 1 -0.20807666 3.75085558 0 0 54
## 89 10.0 61 1 4.45448323 3.85489391 0 1 55
## 92 0.0 149 1 -6.73511294 3.95071869 0 0 57
## 94 21.0 343 1 0.01642710 3.97809719 1 1 58
## 98 3.0 68 1 1.05407255 4.13141684 0 1 60
## 99 0.0 2 1 4.56399726 4.17522245 0 0 61
## 100 0.0 69 1 -8.64613279 4.18891170 0 0 62
## 104 33.0 584 1 0.81587953 4.33675565 1 1 64
## 106 12.0 78 1 3.29363450 4.42984257 0 1 65
## 107 0.0 32 1 5.21286790 4.46817248 0 0 66
## 109 57.0 285 1 -28.44900753 4.47638604 0 1 67
## 111 3.0 68 1 -2.75975359 4.51745380 0 1 68
## 115 5.0 30 1 5.00205339 4.71184120 0 1 70
## 121 27.0 90 1 8.33127995 4.94729637 0 1 73
## 123 5.0 17 1 -18.83367556 4.96646133 0 1 74
## 124 0.0 2 1 4.18069815 4.99657769 0 0 75
## 127 0.0 21 1 -6.88843258 5.01574264 0 0 77
## 131 67.0 96 1 5.78234086 5.16632444 0 1 79
## 138 32.0 80 1 5.30869268 5.31690623 0 1 83
## 140 37.0 334 1 -5.28131417 5.33333333 0 1 84
## 141 0.0 5 1 -0.01916496 5.35249829 0 0 85
## 145 60.0 110 1 -1.74674880 5.47022587 0 1 87
## 149 139.0 207 1 3.04722793 5.51129363 0 1 89
## 151 160.0 186 1 4.03285421 5.51403149 1 1 90
## 152 0.0 340 1 -0.40520192 5.53319644 0 0 91
## 158 4.0 165 1 -4.15879535 5.95482546 1 1 94
## 160 2.0 16 1 -7.71800137 5.97672827 0 1 95
## 167 0.0 21 1 1.83436003 6.23408624 0 0 99
## 172 0.0 6 1 -8.68446270 -0.04928131 0 0 103
##
## [[2]]
## id laser eye age type trt futime status risk
## 4 14 argon right 12 juvenile 0 31.30 1 6
## 10 29 xenon left 13 juvenile 0 0.30 1 10
## 12 46 xenon right 12 juvenile 0 54.27 1 9
## 14 49 argon right 8 juvenile 0 10.80 1 6
## 20 71 argon right 21 adult 0 13.83 1 9
## 21 100 argon left 23 adult 1 46.43 1 9
## 24 112 argon right 44 adult 0 7.90 1 12
## 27 127 xenon right 48 adult 1 30.83 1 6
## 28 127 xenon right 48 adult 0 38.57 1 10
## 30 133 argon right 26 adult 0 14.10 1 9
## 31 150 argon left 10 juvenile 1 20.17 1 9
## 32 150 argon left 10 juvenile 0 6.90 1 10
## 34 167 argon left 23 adult 0 41.40 1 9
## 44 214 xenon right 45 adult 0 0.60 1 12
## 45 220 argon right 11 juvenile 1 10.27 1 10
## 46 220 argon right 11 juvenile 0 1.63 1 10
## 49 255 xenon left 10 juvenile 1 5.67 1 12
## 50 255 xenon left 10 juvenile 0 13.83 1 9
## 52 264 argon right 12 juvenile 0 29.97 1 11
## 54 266 argon right 36 adult 0 26.37 1 11
## 55 284 argon left 53 adult 1 5.77 1 10
## 56 284 argon left 53 adult 0 1.33 1 12
## 57 295 xenon left 10 juvenile 1 5.90 1 11
## 58 295 xenon left 10 juvenile 0 35.53 1 11
## 59 300 xenon left 25 adult 1 25.63 1 10
## 60 300 xenon left 25 adult 0 21.90 1 9
## 61 302 argon left 14 juvenile 1 33.90 1 9
## 62 302 argon left 14 juvenile 0 14.80 1 9
## 63 315 argon left 16 juvenile 1 1.73 1 10
## 64 315 argon left 16 juvenile 0 6.20 1 8
## 66 324 xenon left 38 adult 0 22.00 1 8
## 69 335 argon left 10 juvenile 1 30.20 1 9
## 70 335 argon left 10 juvenile 0 22.00 1 11
## 73 349 xenon right 44 adult 1 25.80 1 11
## 74 349 xenon right 44 adult 0 13.87 1 8
## 75 357 argon left 21 adult 1 5.73 1 9
## 76 357 argon left 21 adult 0 48.30 1 9
## 79 385 argon right 13 juvenile 1 1.90 1 10
## 81 396 argon left 40 adult 1 9.90 1 10
## 82 396 argon left 40 adult 0 9.90 1 9
## 86 409 argon left 48 adult 0 2.67 1 12
## 88 419 xenon right 42 adult 0 13.83 1 10
## 90 429 xenon right 24 adult 0 4.27 1 10
## 92 433 argon left 55 adult 0 13.90 1 10
## 97 468 argon right 6 juvenile 1 1.70 1 10
## 98 468 argon right 6 juvenile 0 1.70 1 6
## 99 480 xenon right 19 juvenile 1 1.77 1 9
## 100 480 xenon right 19 juvenile 0 43.03 1 12
## 105 503 xenon left 27 adult 1 8.30 1 9
## 106 503 xenon left 27 adult 0 8.30 1 10
## 108 515 argon right 43 adult 0 18.43 1 9
## 112 538 argon left 45 adult 0 31.63 1 9
## 115 550 xenon left 3 juvenile 1 18.70 1 10
## 116 550 xenon left 3 juvenile 0 6.53 1 11
## 120 557 argon right 13 juvenile 0 22.23 1 8
## 122 561 xenon right 15 juvenile 0 14.00 1 9
## 123 568 argon left 10 juvenile 1 42.17 1 9
## 124 568 argon left 10 juvenile 0 42.17 1 10
## 126 572 xenon left 6 juvenile 0 5.33 1 9
## 128 576 xenon left 17 juvenile 0 59.80 1 10
## 130 581 argon left 37 adult 0 5.83 1 11
## 132 606 xenon right 18 juvenile 0 2.17 1 12
## 133 610 xenon left 13 juvenile 1 14.30 1 9
## 134 610 xenon left 13 juvenile 0 48.43 1 8
## 141 631 argon left 11 juvenile 1 13.33 1 12
## 142 631 argon left 11 juvenile 0 9.60 1 10
## 147 653 xenon left 15 juvenile 1 14.27 1 9
## 148 653 xenon left 15 juvenile 0 7.60 1 12
## 149 662 argon left 7 juvenile 1 34.57 1 12
## 150 662 argon left 7 juvenile 0 1.80 1 12
## 152 664 argon left 2 juvenile 0 4.30 1 12
## 153 683 xenon left 22 adult 1 4.10 1 8
## 154 683 xenon left 22 adult 0 12.20 1 6
## 160 706 xenon right 27 adult 0 12.73 1 10
## 162 717 xenon right 53 adult 0 54.10 1 11
## 164 722 xenon right 10 juvenile 0 9.40 1 12
## 165 731 xenon left 13 juvenile 1 21.57 1 12
## 166 731 xenon left 13 juvenile 0 9.90 1 10
## 181 778 xenon left 25 adult 1 26.23 1 8
## 184 780 argon right 15 juvenile 0 18.03 1 11
## 189 804 xenon right 23 adult 1 7.07 1 10
## 191 810 xenon left 13 juvenile 1 13.77 1 10
## 192 810 xenon left 13 juvenile 0 13.77 1 12
## 194 815 xenon left 45 adult 0 9.63 1 12
## 198 834 argon left 8 juvenile 0 1.50 1 12
## 199 838 xenon left 30 adult 1 33.63 1 9
## 200 838 xenon left 30 adult 0 33.63 1 9
## 203 866 argon left 39 adult 1 63.33 1 11
## 204 866 argon left 39 adult 0 27.60 1 10
## 205 887 argon right 26 adult 1 38.47 1 10
## 206 887 argon right 26 adult 0 1.63 1 10
## 210 910 xenon left 34 adult 0 25.30 1 10
## 212 920 xenon left 10 juvenile 0 46.20 1 8
## 214 925 argon right 40 adult 0 1.70 1 10
## 221 949 xenon right 13 juvenile 1 10.33 1 9
## 222 949 xenon right 13 juvenile 0 0.83 1 10
## 223 952 argon left 9 juvenile 1 6.13 1 12
## 226 962 argon left 5 juvenile 0 25.93 1 9
## 230 971 argon right 23 adult 0 19.40 1 10
## 232 978 argon left 2 juvenile 0 21.97 1 10
## 235 987 xenon left 7 juvenile 1 26.20 1 10
## 238 1002 argon left 13 juvenile 0 18.03 1 11
## 239 1017 xenon left 50 adult 1 13.83 1 8
## 240 1017 xenon left 50 adult 0 1.57 1 9
## 242 1029 xenon right 20 adult 0 13.37 1 12
## 243 1034 argon left 15 juvenile 1 11.07 1 11
## 244 1034 argon left 15 juvenile 0 1.97 1 9
## 246 1037 xenon right 30 adult 0 22.20 1 9
## 251 1074 xenon left 4 juvenile 1 6.10 1 10
## 253 1098 argon left 3 juvenile 1 2.10 1 10
## 254 1098 argon left 3 juvenile 0 11.30 1 11
## 255 1102 argon right 10 juvenile 1 17.73 1 10
## 265 1145 argon left 44 adult 1 12.93 1 10
## 266 1145 argon left 44 adult 0 4.97 1 9
## 268 1148 argon left 48 adult 0 26.47 1 12
## 271 1184 argon left 46 adult 1 24.43 1 12
## 272 1184 argon left 46 adult 0 9.87 1 11
## 275 1205 xenon left 12 juvenile 1 13.97 1 11
## 276 1205 xenon left 12 juvenile 0 30.40 1 12
## 278 1213 xenon right 12 juvenile 0 43.33 1 10
## 288 1267 xenon right 50 adult 0 38.57 1 11
## 290 1281 xenon right 44 adult 0 3.67 1 10
## 291 1287 argon left 8 juvenile 1 48.87 1 10
## 296 1296 xenon left 18 juvenile 0 15.83 1 10
## 298 1309 xenon right 56 adult 0 8.83 1 8
## 304 1321 xenon right 5 juvenile 0 22.13 1 9
## 305 1333 argon right 1 juvenile 1 6.30 1 9
## 308 1347 xenon right 1 juvenile 0 18.93 1 10
## 309 1361 xenon left 14 juvenile 1 13.80 1 10
## 310 1361 xenon left 14 juvenile 0 19.00 1 9
## 313 1373 argon left 8 juvenile 1 13.57 1 11
## 314 1373 argon left 8 juvenile 0 5.43 1 11
## 320 1413 argon right 3 juvenile 0 7.10 1 12
## 322 1425 argon left 35 adult 0 26.17 1 11
## 324 1447 argon left 8 juvenile 0 24.73 1 11
## 328 1469 xenon right 51 adult 0 10.97 1 12
## 330 1480 argon left 42 adult 0 21.10 1 9
## 332 1487 argon right 20 adult 0 43.70 1 8
## 336 1499 xenon right 22 adult 0 14.37 1 9
## 343 1533 argon left 41 adult 1 1.50 1 9
## 346 1537 xenon right 19 juvenile 0 38.40 1 10
## 348 1552 xenon right 4 juvenile 0 2.83 1 11
## 352 1562 xenon right 20 adult 0 6.57 1 9
## 354 1572 xenon right 24 adult 0 46.27 1 9
## 359 1596 argon right 16 juvenile 1 34.37 1 9
## 366 1619 xenon left 20 adult 0 61.83 1 12
## 367 1627 xenon left 10 juvenile 1 6.57 1 10
## 369 1636 argon right 16 juvenile 1 38.87 1 6
## 371 1640 xenon left 10 juvenile 1 42.43 1 11
## 372 1640 xenon left 10 juvenile 0 46.63 1 9
## 375 1649 argon right 1 juvenile 1 2.70 1 10
## 383 1688 xenon left 5 juvenile 1 13.10 1 11
## 388 1717 argon left 22 adult 0 42.33 1 11
## 390 1727 argon right 33 adult 0 2.90 1 10
## 392 1746 argon right 3 juvenile 0 1.43 1 10
For each data set (heart and retinopathy), obtain the mean age per event status.
tapply(heart$age, heart$event, mean)
## 0 1
## -3.421947 -1.270983
tapply(retinopathy$age, retinopathy$status, mean)
## 0 1
## 20.35146 21.44516
Create a function with the name summary_list
that takes
as input a list with data sets (let’s call it dt_list
), a
vector with the names of the continuous variables (let’s call it
dt_cont
) and a vector with the names of the categorical
variables (let’s call it dt_cat
) and returns the mean of
the continuous variable per group (categorical variable) for each data
set. Apply the function to the heart and
retinopathy data sets. In particular, use the
continuous variables year
(from the heart
data set) and risk
(from the retinopathy
data set) and the categorical variables surgery
(from the
heart data set) and type
(from the
retinopathy data set).
Use a for loop. Use the print(…) function to return the results.
<- function(dt_list, dt_cont, dt_cat){
summary_list for (i in 1:length(dt_list)){
<- dt_list[[i]]
dt print(tapply(dt[[dt_cont[i]]], dt[[dt_cat[i]]], mean))
}
}
summary_list(dt_list = list(heart, retinopathy), dt_cont = c("year", "risk"), dt_cat = c("surgery", "type"))
## 0 1
## 3.320975 4.105738
## juvenile adult
## 9.745614 9.632530
Extend the previous by including an extra input argument (let’s call
it calc
) which will indicate each time the measure that we
want to calculate e.g, the mean. So that the user can decide which
function to use.
<- function(dt_list, dt_cont, dt_cat, calc){
summary_list for (i in 1:length(dt_list)){
<- dt_list[[i]]
dt print(tapply(dt[[dt_cont[i]]], dt[[dt_cat[i]]], calc))
}
}
summary_list(dt_list = list(heart, retinopathy), dt_cont = c("year", "risk"), dt_cat = c("surgery", "type"), calc = median)
## 0 1
## 3.564682 3.978097
## juvenile adult
## 10 9
© Eleni-Rosalina Andrinopoulou