Open RStudio to do the practicals. Note that tasks marked with * are optional.
In this practical, a number of R packages are used. The packages used (with versions that were used to generate the solutions) are:
survival (version: 3.3.1)
R version 4.2.1 (2022-06-23 ucrt)
For this practical, we will use the heart and retinopathy data sets from the survival package. More details about the data sets can be found in:
https://stat.ethz.ch/R-manual/R-devel/library/survival/html/heart.html
https://stat.ethz.ch/R-manual/R-devel/library/survival/html/retinopathy.html
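Obtain the mean of the variables start, stop, event, age, year, surgery of the heart data set.
Obtain the mean of the variables age, futime, risk of the retinopathy data set.
library(survival)  # provides the heart and retinopathy data sets
apply(heart[, c("start", "stop", "event", "age", "year", "surgery")], 2, mean)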
## start stop event age year surgery
## 15.5145349 201.2936047 0.4360465 -2.4840266 3.4532894 0.1686047
apply(retinopathy[, c("age", "futime", "risk")], 2, mean)
## age futime risk
## 20.78173 35.57929 9.69797
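As a cross-check, column means computed with apply(..., 2, mean) should agree with colMeans(). The sketch below uses a small made-up data frame (toy is a hypothetical name, not part of the practical) so that it runs without the survival package.

```r
# Toy data frame; the values are made up for illustration only.
toy <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))

# Column means via apply() over margin 2 (columns).
means_apply <- apply(toy, 2, mean)

# The same quantity with the dedicated helper colMeans().
means_colmeans <- colMeans(toy)

print(means_apply)
print(all.equal(means_apply, means_colmeans))
```

Both calls return a named numeric vector with one entry per column.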
Create the matrix dataset1 <- cbind(A = 1:30, B = sample(1:100, 30)) and find the row sum of dataset1.
dataset1 <- cbind(A = 1:30, B = sample(1:100, 30))
apply(dataset1, 1, sum)
## [1] 5 25 75 45 12 56 66 21 96 26 65 13 92 104 110 48 20 95 116 104 90 40 93 113
## [25] 93 36 84 33 58 86
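Because sample() draws random numbers, the row sums printed above will differ from run to run. A minimal sketch of a reproducible version follows; the seed value 1 is an arbitrary assumption, and rowSums() is used only as a cross-check of the apply() call.

```r
set.seed(1)  # arbitrary seed, assumed here only to make the draw reproducible
dataset1 <- cbind(A = 1:30, B = sample(1:100, 30))

# Row sums via apply() over margin 1 (rows).
row_totals <- apply(dataset1, 1, sum)

# rowSums() computes the same totals (as doubles rather than integers).
print(all.equal(row_totals, rowSums(dataset1)))
```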
Create the following function: DerivativeFunction <- function(x) { log10(x) + 10 }. Apply DerivativeFunction to dataset1 <- cbind(A = 1:30, B = sample(1:100, 30)). The output should be a list.
DerivativeFunction <- function(x) { log10(x) + 10 }
dataset1 <- cbind(A = 1:30, B = sample(1:100, 30))
lapply(dataset1, DerivativeFunction)
## [[1]]
## [1] 10
##
## [[2]]
## [1] 10.30103
##
## [[3]]
## [1] 10.47712
##
## [[4]]
## [1] 10.60206
##
## [[5]]
## [1] 10.69897
##
## [[6]]
## [1] 10.77815
##
## [[7]]
## [1] 10.8451
##
## [[8]]
## [1] 10.90309
##
## [[9]]
## [1] 10.95424
##
## [[10]]
## [1] 11
##
## [[11]]
## [1] 11.04139
##
## [[12]]
## [1] 11.07918
##
## [[13]]
## [1] 11.11394
##
## [[14]]
## [1] 11.14613
##
## [[15]]
## [1] 11.17609
##
## [[16]]
## [1] 11.20412
##
## [[17]]
## [1] 11.23045
##
## [[18]]
## [1] 11.25527
##
## [[19]]
## [1] 11.27875
##
## [[20]]
## [1] 11.30103
##
## [[21]]
## [1] 11.32222
##
## [[22]]
## [1] 11.34242
##
## [[23]]
## [1] 11.36173
##
## [[24]]
## [1] 11.38021
##
## [[25]]
## [1] 11.39794
##
## [[26]]
## [1] 11.41497
##
## [[27]]
## [1] 11.43136
##
## [[28]]
## [1] 11.44716
##
## [[29]]
## [1] 11.4624
##
## [[30]]
## [1] 11.47712
##
## [[31]]
## [1] 11.60206
##
## [[32]]
## [1] 11.86332
##
## [[33]]
## [1] 11.89209
##
## [[34]]
## [1] 11.91908
##
## [[35]]
## [1] 11.17609
##
## [[36]]
## [1] 11.39794
##
## [[37]]
## [1] 11.74036
##
## [[38]]
## [1] 11.57978
##
## [[39]]
## [1] 11.72428
##
## [[40]]
## [1] 11.5682
##
## [[41]]
## [1] 11.9345
##
## [[42]]
## [1] 11.98227
##
## [[43]]
## [1] 10.60206
##
## [[44]]
## [1] 11.91381
##
## [[45]]
## [1] 11.80618
##
## [[46]]
## [1] 10.77815
##
## [[47]]
## [1] 11.36173
##
## [[48]]
## [1] 11.716
##
## [[49]]
## [1] 11.62325
##
## [[50]]
## [1] 11.79934
##
## [[51]]
## [1] 11.70757
##
## [[52]]
## [1] 11.99123
##
## [[53]]
## [1] 11.51851
##
## [[54]]
## [1] 11.86923
##
## [[55]]
## [1] 12
##
## [[56]]
## [1] 11.61278
##
## [[57]]
## [1] 11.83885
##
## [[58]]
## [1] 10.47712
##
## [[59]]
## [1] 11.94939
##
## [[60]]
## [1] 11.77815
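The list above has 60 elements because a matrix is a vector with dimensions, so lapply() visits every cell. If one list element per column is wanted instead, one option (an assumption about intent, not part of the original solution) is to convert the matrix to a data frame first, as sketched here.

```r
DerivativeFunction <- function(x) { log10(x) + 10 }
set.seed(1)  # arbitrary seed, only for reproducibility
dataset1 <- cbind(A = 1:30, B = sample(1:100, 30))

# On a matrix, lapply() iterates over all 30 * 2 cells.
by_cell <- lapply(dataset1, DerivativeFunction)
print(length(by_cell))

# A data frame is a list of columns, so lapply() iterates over 2 columns.
by_column <- lapply(as.data.frame(dataset1), DerivativeFunction)
print(length(by_column))
print(names(by_column))
```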
Create a list with the variables age and year from the heart data set and the variable risk from the retinopathy data set. Give the name list1 to this list. Then obtain the median of each element of list1.
list1 <- list(heart$age, heart$year, retinopathy$risk)
lapply(list1, median)
## [[1]]
## [1] -0.1136208
##
## [[2]]
## [1] 3.750856
##
## [[3]]
## [1] 10
Create the following function: Function2 <- function(x) { exp(x) + 0.1 }. Apply Function2 to dataset2 <- cbind(A = c(1:10), B = rnorm(10, 0, 1)). The output should be simplified.
Function2 <- function(x) { exp(x) + 0.1 }
dataset2 <- cbind(A = c(1:10), B = rnorm(10, 0, 1))
sapply(dataset2, Function2)
## [1] 2.818282e+00 7.489056e+00 2.018554e+01 5.469815e+01 1.485132e+02 4.035288e+02 1.096733e+03
## [8] 2.981058e+03 8.103184e+03 2.202657e+04 5.829599e-01 1.706285e+00 1.994492e+00 3.778803e+00
## [15] 1.213039e+00 1.416347e+00 8.806216e-01 7.133495e-01 4.448679e+00 2.392794e+00
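Note that sapply() on a matrix returns a flat vector of 20 values, so the column structure is lost. Since Function2 is built from vectorised operations, calling it on the matrix directly keeps the 10 x 2 shape. The following sketch compares the two; the seed is an arbitrary assumption.

```r
Function2 <- function(x) { exp(x) + 0.1 }
set.seed(1)  # arbitrary seed, only for reproducibility
dataset2 <- cbind(A = c(1:10), B = rnorm(10, 0, 1))

# sapply() visits each cell and simplifies the result to a plain vector.
flat <- sapply(dataset2, Function2)
print(length(flat))

# Calling the vectorised function directly preserves the matrix shape.
shaped <- Function2(dataset2)
print(dim(shaped))

# Both hold the same numbers, in column-major order.
print(all.equal(flat, as.vector(shaped)))
```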
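Create a list with the variable transplant from the heart data set and the variable status from the retinopathy data set. Give the name list2 to this list. Then obtain the percentages per category for each element of list2.
list2 <- list(heart$transplant, retinopathy$status)
library(memisc)  # provides percent()
sapply(list2, function(x) { percent(x) } )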
## [,1] [,2]
## 0 59.88372 60.6599
## 1 40.11628 39.3401
## N 172.00000 394.0000
Recall the practical Control_Flow_and_Functions: Writing your own function (Task 1 and 2). Now create the same function again (called summary_df), but avoid using a for loop. Apply the function to the retinopathy data set. Use the functions summary_continuous() and summary_categorical().
summary_continuous <- function(x) {
  paste0(round(mean(x), 1), " (", round(sd(x), 1), ")")
}
summary_categorical <- function(x) {
  tab <- prop.table(table(x))
  paste0(round(tab * 100, 1), "% ", names(tab), collapse = ", ")
}
summary_continuous <- function(x) {
  paste0(round(mean(x), 1), " (", round(sd(x), 1), ")")
}
summary_categorical <- function(x) {
  tab <- prop.table(table(x))
  paste0(round(tab * 100, 1), "% ", names(tab), collapse = ", ")
}
summary_df <- function(dat) {
  vec_categorical <- sapply(dat, is.factor)
  print(sapply(dat[, vec_categorical], summary_categorical))
  vec_continuous <- sapply(dat, is.numeric)
  print(sapply(dat[, vec_continuous], summary_continuous))
}
summary_df(dat = retinopathy)
## laser eye type
## "50.8% xenon, 49.2% argon" "45.2% right, 54.8% left" "57.9% juvenile, 42.1% adult"
## id age trt futime status risk
## "873.2 (495.5)" "20.8 (14.8)" "0.5 (0.5)" "35.6 (21.4)" "0.4 (0.5)" "9.7 (1.5)"
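One caveat with selecting columns by a logical vector: if only one column matches, dat[, index] drops to a plain vector and sapply() would then iterate over individual values instead of columns. A hedged sketch of a more defensive selection using drop = FALSE is shown below; toy and is_cat are hypothetical names and the data are made up.

```r
summary_categorical <- function(x) {
  tab <- prop.table(table(x))
  paste0(round(tab * 100, 1), "% ", names(tab), collapse = ", ")
}

# Toy data with exactly one factor column (made-up values).
toy <- data.frame(group = factor(c("a", "a", "b", "b")),
                  value = c(1, 2, 3, 4))

is_cat <- sapply(toy, is.factor)

# drop = FALSE keeps a one-column data frame, so sapply() still
# applies the summary once per column.
res <- sapply(toy[, is_cat, drop = FALSE], summary_categorical)
print(res)
```

The result is a named character vector with one summary string per factor column.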
Obtain the median of year per transplant group using the heart data set.
Obtain the median of futime per status group using the retinopathy data set.
tapply(heart$year, heart$transplant, median)
## 0 1
## 3.47707 3.92334
tapply(retinopathy$futime, retinopathy$status, median)
## 0 1
## 48.53 13.83
Apply the function Fun1 <- function(x) { mean(x)/(length(x) - 2) } to year per transplant and surgery group using the heart data set.
Obtain the mean of futime per status, type and trt group using the retinopathy data set.
Fun1 <- function(x) { mean(x)/(length(x) - 2) }
tapply(heart$year, list(heart$transplant, heart$surgery), Fun1)
## 0 1
## 0 0.03820181 0.2818764
## 1 0.06362334 0.3910915
tapply(retinopathy$futime, list(retinopathy$status, retinopathy$type, retinopathy$trt), mean)
## , , 0
##
## juvenile adult
## 0 45.22127 48.42273
## 1 18.65137 19.25160
##
## , , 1
##
## juvenile adult
## 0 45.62218 47.92323
## 1 16.66944 21.32833
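To see how tapply() cross-classifies by several grouping variables, the following sketch applies Fun1 to a small made-up data frame (toy, g1 and g2 are hypothetical names). Each cell of the result corresponds to one combination of group levels. Note that Fun1 divides by length(x) - 2, so a cell with exactly two observations would give Inf.

```r
Fun1 <- function(x) { mean(x) / (length(x) - 2) }

# Made-up data: 12 values, two grouping variables with 2 levels each,
# giving 3 observations per cell.
toy <- data.frame(
  y  = 1:12,
  g1 = rep(c("a", "b"), each = 6),
  g2 = rep(c("u", "v"), times = 6)
)

# Passing a list of grouping vectors gives a 2 x 2 table of results.
res <- tapply(toy$y, list(toy$g1, toy$g2), Fun1)
print(res)
```

For example, the cell (a, u) holds the values 1, 3 and 5, so Fun1 returns 3 / (3 - 2) = 3.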
© Eleni-Rosalina Andrinopoulou