multiLME-vignette

Installation

devtools::install_github("ERandrinopoulou/multiLME", dependencies = TRUE, force = TRUE)

library(JMbayes2)
library(jagsUI)
library(multiLME)
library(lattice)

Multivariate mixed-effects model

This package fits a multivariate mixed-effects model, assuming different association structures between the longitudinal outcomes.

Value parameterization

Let’s assume that we want to fit the following multivariate mixed-effects model:

\[ \begin{align} y_{i2}(t) &= x_{2}^\top(t) \beta_2 + z_{i2}^\top(t) b_{i2} + \epsilon_{i2}(t) = m_{i2}(t) + \epsilon_{i2}(t) \\ y_{i1}(t) &= x_{1}^\top(t) \beta_1 + z_{i1}^\top(t) b_{i1} + \epsilon_{i1}(t) = m_{i1}(t) + \alpha m_{i2}(t) + \epsilon_{i1}(t) \end{align} \]

where:

$y_{i1}(t)$ and $y_{i2}(t)$ are the longitudinal outcomes 1 and 2, respectively, for individual $i$ at time $t$,
$x_{j}(t)$ and $z_{ij}(t)$ are the design vectors for the fixed and random effects of outcome $j \in \{1, 2\}$,
$\beta_j$ are the fixed effect coefficients, $b_{ij}$ are the individual-specific random effects, and $\epsilon_{ij}(t)$ are residual error terms,
$m_{ij}(t) = x_{j}^\top(t) \beta_j + z_{ij}^\top(t) b_{ij}$ denotes the mean (noise-free) trajectory of outcome $j$.

The model for outcome 1 includes the underlying value of outcome 2 at time $t$, represented by the term

\[ m_{i2}(t), \]

which captures the instantaneous level of outcome 2. Its effect on outcome 1 is scaled by the parameter $\alpha$.

Example

First we simulate data:

n = 100 

K <- 10  # number of planned repeated measurements per subject, per outcome
t.max <- 10 # maximum follow-up time

# parameters for the linear mixed effects model 1
betaa1 <- c("(Intercept)" = 3.01, "Time1" = 0.47, "Group" = -2)
sigma1.y <- 1.017671  # measurement error standard deviation

# parameters for the linear mixed effects model 2
betaa2 <- c("(Intercept)" = 2.15, "Time1" = 0.91)
sigma2.y <- 1  # measurement error standard deviation

# association parameter
alphaa <- 0.9 # association parameter - value

# Variance-covariance matrix for random effects
invDvec <- c(1, 0, 0, 0,
             0, 1, 0, 0,
             0, 0, 1, 0,
             0, 0, 0, 1)

invDmat <- matrix(invDvec, 4, 4)
D <- solve(invDmat)

# design matrices for the longitudinal measurement model
times <- c(replicate(n, c(0, sort(runif(K-1, 0, t.max))))) 
group <- sample(c(0,1), n, replace = TRUE)

DF <- data.frame(year = times, drug = factor(rep(group, each = K)))

X1 <- model.matrix(~ year + drug, data = DF)
Z1 <- model.matrix(~ year, data = DF)

X2 <- model.matrix(~ year, data = DF)
Z2 <- model.matrix(~ year, data = DF)

#simulate random effects
library(MASS)

b <- mvrnorm(n, rep(0, nrow(D)), D)

# simulate longitudinal responses
id <- rep(1:n, each = K)

eta.y2 <- as.vector(X2 %*% betaa2 + rowSums(Z2 * b[id, 3:4])) 
y2 <- rnorm(n * K, eta.y2, sigma2.y)

eta.y1 <- as.vector(X1 %*% betaa1 + rowSums(Z1 * b[id, 1:2])) + alphaa * eta.y2 
y1 <- rnorm(n * K, eta.y1, sigma1.y)

dat <- DF[, ]
dat$id <- id
dat$y1 <- y1
dat$y2 <- y2

names(dat) <- c("year", "drug", "id", "y1", "y2")
data <- dat

Then, we run the multivariate model as follows:

formulas <- list(y1 ~ year + drug + (year | id),
                 y2 ~ year + (year | id))
families <- c("gaussian", "gaussian")

  
fit <- mv_lme(formulas = formulas, data = data, 
                     families = families, 
                     corr_RE = FALSE,
                     assoc = TRUE,
                     assoc_from = 2,
                     assoc_to = 1,
                     extraForm = NULL,
                     time_var = "year")

Accessing Results

Once the model is fitted, results can be accessed via the fit object. For example:

Posterior means are available through fit$postMeans.

Corresponding 95% credible intervals, computed from the 2.5th and 97.5th percentiles of the posterior distributions, can be retrieved using fit$CIs.

Additional output elements, such as mcmc.info, Deviance Information Criterion (DIC), effective number of parameters (pD), and convergence diagnostics (Rhat), are returned by jagsUI::jags() function.

Individualized Predictions

Value parameterization

Now, let us consider the scenario where individualized predictions are desired:

library(gifski)

object = fit
newdata_all = data[data$id == 1, ]
newdata <- newdata_all[1:6,]


res_pred <- DynPred_mv_lme(object = fit, newdata = newdata, 
                             families, 
                             level = 0.95, IdVar = "id", timeVar = "year",
                             M = 200, times = NULL,
                             assoc = TRUE,
                             assoc_from = 2,
                             assoc_to = 1,
                             extraForm = NULL)



p1 <- xyplot(res_pred$Mean1 + res_pred$Lower1 + res_pred$Upper1 ~ res_pred$Time,
         xlim = c(0, 10), 
         ylim = c(min(y1), max(y1)), 
         strip = FALSE, xlab = "Year", ylab = "Y1",
         lwd = c(3, 2, 2), lty = c(1, 3, 3), type = "l", col = 1,
         panel = function(x, y,  ...) {
           panel.xyplot(x, y, ...)
           panel.abline(v = max(newdata$year), col = 2, lty = 2)
           panel.points(newdata_all$year, newdata_all$y1, pch = 16, cex = 0.8, col = "blue")
           })

print(p1)

The object res_pred returns a data frame including the posterior means with the 95/% prediction interval.

Alternative Association Parameter Models

Slope parameterization

We aim to fit the following multivariate mixed-effects model:

\[ \begin{align} y_{i2}(t) &= x_{2}^\top(t) \beta_2 + z_{i2}^\top(t) b_{i2} + \epsilon_{i2}(t) = m_{i2}(t) + \epsilon_{i2}(t) \\ y_{i1}(t) &= x_{1}^\top(t) \beta_1 + z_{i1}^\top(t) b_{i1} + \epsilon_{i1}(t) = m_{i1}(t) + \alpha \frac{d}{dt} m_{i2}(t) + \epsilon_{i1}(t) \end{align} \]

where:

$y_{i1}(t)$ and $y_{i2}(t)$ are the longitudinal outcomes 1 and 2, respectively, for individual $i$ at time $t$,
$x_{j}(t)$ and $z_{ij}(t)$ are the design vectors for the fixed and random effects of outcome $j \in \{1, 2\}$,
$\beta_j$ are the fixed effect coefficients, $b_{ij}$ are the individual-specific random effects, and $\epsilon_{ij}(t)$ are residual error terms,
$m_{ij}(t) = x_{j}^\top(t) \beta_j + z_{ij}^\top(t) b_{ij}$ denotes the mean (noise-free) trajectory of outcome $j$.

The model for outcome 1 includes the instantaneous rate of change of outcome 2: specifically, the derivative term

\[ \frac{d}{dt} m_{i2}(t) \]

represents the slope of outcome 2 at time $t$, capturing its dynamic influence on outcome 1. This effect is scaled by the parameter $\alpha$.

Example

First we simulate data:

n = 100 

K <- 10  # number of planned repeated measurements per subject, per outcome
t.max <- 10 # maximum follow-up time

# parameters for the linear mixed effects model 1
betaa1 <- c("(Intercept)" = 3.01, "Time1" = 0.47, "Group" = -2)
sigma1.y <- 1.017671  # measurement error standard deviation

# parameters for the linear mixed effects model 2
betaa2 <- c("(Intercept)" = 2.15, "Time1" = 0.91)
sigma2.y <- 1  # measurement error standard deviation

# association parameter
alphaa <- 0.5

invDvec <- c(1, 0, 0, 0,
             0, 1, 0, 0,
             0, 0, 1, 0,
             0, 0, 0, 1)


invDmat <- matrix(invDvec, 4, 4)
D <- solve(invDmat)

# design matrices for the longitudinal measurement model
times <- c(replicate(n, c(0, sort(runif(K-1, 0, t.max))))) 
group <- sample(c(0,1), n, replace = TRUE)

DF <- data.frame(year = times, drug = factor(rep(group, each = K)))

X1 <- model.matrix(~ year + drug, data = DF)
Z1 <- model.matrix(~ year, data = DF)

X2 <- model.matrix(~ year, data = DF)
Z2 <- model.matrix(~ year, data = DF)

# design matrices for the slope of the 1st longitudinal outcome
extraFormY2 <- list(fixed = ~ 0 + 1,
                    random = ~ 0 + 1,
                    indFixed = 2, indRandom = 2)

mfX_derivY2 <- model.frame(terms(extraFormY2$fixed), data = DF)
mfZ_derivY2 <- model.frame(terms(extraFormY2$random), data = DF)
XderivY2 <- model.matrix(extraFormY2$fixed, mfX_derivY2)
ZderivY2 <- model.matrix(extraFormY2$random, mfZ_derivY2)

#simulate random effects
library(MASS)

b <- mvrnorm(n, rep(0, nrow(D)), D)

# simulate longitudinal responses
id <- rep(1:n, each = K)

eta.y2 <- as.vector(X2 %*% betaa2 + rowSums(Z2 * b[id, 3:4])) 
y2 <- rnorm(n * K, eta.y2, sigma2.y)
deriv.y2 <- as.vector(XderivY2 %*% betaa2[2] + rowSums(ZderivY2 * b[id, 4]))

eta.y1 <- as.vector(X1 %*% betaa1 + rowSums(Z1 * b[id, 1:2])) + alphaa * deriv.y2 
y1 <- rnorm(n * K, eta.y1, sigma1.y)

dat <- DF[, ]
dat$id <- id
dat$y1 <- y1
dat$y2 <- y2

names(dat) <- c("year", "drug", "id", "y1", "y2")
data <- dat

Then, we run the multivariate model as follows:

formulas <- list(y1 ~ year + drug + (year | id),
                 y2 ~ year + (year | id))
families <- c("gaussian", "gaussian")
extraForm <- list(fixed = ~ 0 + 1,
                  random = ~ 0 + 1,
                  indFixed = 2, indRandom = 2)
  
fit <- mv_lme(formulas = formulas, data = data, 
                     families = families, 
                     corr_RE = FALSE,
                     assoc = TRUE,
                     assoc_from = 2,
                     assoc_to = 1,
                     extraForm = extraForm,
                     time_var = "year")

Truncated area parameterization

We aim to fit the following multivariate mixed-effects model:

\[ \begin{align} y_{i2}(t) &= x_{2}^\top(t) \beta_2 + z_{i2}^\top(t) b_{i2} + \epsilon_{i2}(t) = m_{i2}(t) + \epsilon_{i2}(t) \\ y_{i1}(t) &= x_{1}^\top(t) \beta_1 + z_{i1}^\top(t) b_{i1} + \epsilon_{i1}(t) = m_{i1}(t) + \alpha \int_{t-2}^{t} m_{i2}(s) \, ds + \epsilon_{i1}(t) \end{align} \] where:

$y_{i1}(t)$ and $y_{i2}(t)$ are the longitudinal outcomes 1 and 2, respectively, for individual $i$ at time $t$,
$x_{j}(t)$ and $z_{ij}(t)$ are the design vectors for the fixed and random effects of outcome $j \in \{1, 2\}$,
$\beta_j$ are the fixed effect coefficients, $b_{ij}$ are the individual-specific random effects, and $\epsilon_{ij}(t)$ are residual error terms,
$m_{ij}(t) = x_{j}^\top(t) \beta_j + z_{ij}^\top(t) b_{ij}$ denotes the mean (noise-free) trajectory of outcome $j$.

The model for outcome 1 includes a historical effect of outcome 2: specifically, the integral term

\[ \int_{t-2}^{t} m_{i2}(s) \, ds \]

represents the area under the trajectory of outcome 2 over the two years preceding time $t$, capturing its cumulative influence. This term is scaled by the parameter $\alpha$.

Example

First we simulate data:

n = 100

K <- 10  # number of planned repeated measurements per subject, per outcome
t.max <- 10 # maximum follow-up time

# parameters for the linear mixed effects model 1
betaa1 <- c("(Intercept)" = 3.01, "Time1" = 0.47, "Group" = -2)
sigma1.y <- 1.017671  # measurement error standard deviation

# parameters for the linear mixed effects model 2
betaa2 <- c("(Intercept)" = 2.15, "Time1" = 0.91)
sigma2.y <- 1  # measurement error standard deviation

# association parameter
alphaa <- 0.5

invDvec <- c(1, 0, 0, 0,
             0, 1, 0, 0,
             0, 0, 1, 0,
             0, 0, 0, 1)


invDmat <- matrix(invDvec, 4, 4)
D <- solve(invDmat)

# design matrices for the longitudinal measurement model
times <- c(replicate(n, c(0, sort(runif(K-1, 0, t.max))))) 
group <- sample(c(0,1), n, replace = TRUE)

DF <- data.frame(year = times, drug = factor(rep(group, each = K)))

X1 <- model.matrix(~ year + drug, data = DF)
Z1 <- model.matrix(~ year, data = DF)

X2 <- model.matrix(~ year, data = DF)
Z2 <- model.matrix(~ year, data = DF)

# design matrices for the area of the 1st longitudinal outcome
AextraFormY2 <- list(fixed = ~ -1 + year + I(year^2/2),
                     random = ~ -1 + year + I(year^2/2),
                     indFixed = 1:2, indRandom = 1:2)

mfXA_derivY2 <- model.frame(terms(AextraFormY2$fixed), data = DF)
mfZA_derivY2 <- model.frame(terms(AextraFormY2$random), data = DF)
XAderivY2 <- model.matrix(AextraFormY2$fixed, mfXA_derivY2)
ZAderivY2 <- model.matrix(AextraFormY2$random, mfZA_derivY2)


AextraFormY2_tr <- list(fixed = ~ -1 + I(year-2) + I((year-2)^2/2),
                     random = ~ -1 + I(year-2) + I((year-2)^2/2),
                     indFixed = 1:2, indRandom = 1:2)

mfXA_derivY2_tr <- model.frame(terms(AextraFormY2_tr$fixed), data = DF)
mfZA_derivY2_tr <- model.frame(terms(AextraFormY2_tr$random), data = DF)
XAderivY2_tr <- model.matrix(AextraFormY2_tr$fixed, mfXA_derivY2_tr)
ZAderivY2_tr <- model.matrix(AextraFormY2_tr$random, mfZA_derivY2_tr)

XAderivY2 <- XAderivY2 - XAderivY2_tr
ZAderivY2 <- ZAderivY2 - ZAderivY2_tr

#simulate random effects
library(MASS)

b <- mvrnorm(n, rep(0, nrow(D)), D)

# simulate longitudinal responses
id <- rep(1:n, each = K)

eta.y2 <- as.vector(X2 %*% betaa2 + rowSums(Z2 * b[id, 3:4])) 
y2 <- rnorm(n * K, eta.y2, sigma2.y)
auc.y2 <- as.vector(XAderivY2 %*% betaa2 + rowSums(ZAderivY2 * b[id, 3:4]))
eta.y1 <- as.vector(X1 %*% betaa1 + rowSums(Z1 * b[id, 1:2])) + alphaa * (auc.y2) 

dat <- DF[, ]
dat$id <- id
dat$y1 <- y1
dat$y2 <- y2

names(dat) <- c("year", "drug", "id", "y1", "y2")
data <- dat

Then, we run the multivariate model as follows:

formulas <- list(y1 ~ year + drug + (year | id),
                 y2 ~ year + (year | id))
families <- c("gaussian", "gaussian")
extraForm <- list(fixed = ~ -1 + year + I(year^2/2),
                  random = ~ -1 + year + I(year^2/2),
                  indFixed = 1:2, indRandom = 1:2)

extraForm_tr <- list(fixed = ~ -1 + I(year-2) + I((year-2)^2/2),
                     random = ~ -1 + I(year-2) + I((year-2)^2/2),
                     indFixed = 1:2, indRandom = 1:2)

  
fit <- mv_lme(formulas = formulas, data = data, 
                     families = families, 
                     corr_RE = FALSE,
                     assoc = TRUE,
                     assoc_from = 2,
                     assoc_to = 1,
                     extraForm = extraForm,
                     extraForm_tr = extraForm_tr,
                     time_var = "year")

Value (polynomials) parameterization

We aim to fit the following multivariate mixed-effects model:

\[ \begin{align} y_{i2}(t) &= x_{2}^\top(t) \beta_2 + z_{i2}^\top(t) b_{i2} + \epsilon_{i2}(t) = m_{i2}(t) + \epsilon_{i2}(t) \\ y_{i1}(t) &= x_{1}^\top(t) \beta_1 + z_{i1}^\top(t) b_{i1} + \epsilon_{i1}(t) = m_{i1}(t) + \alpha m_{i2}(t) + \epsilon_{i1}(t) \end{align} \]

where:

$y_{i1}(t)$ and $y_{i2}(t)$ are the longitudinal outcomes 1 and 2, respectively, for individual $i$ at time $t$,
$x_{j}(t)$ and $z_{ij}(t)$ are the design vectors for the fixed and random effects of outcome $j \in \{1, 2\}$,
$\beta_j$ are the fixed effect coefficients, $b_{ij}$ are the individual-specific random effects, and $\epsilon_{ij}(t)$ are residual error terms,
$m_{ij}(t) = x_{j}^\top(t) \beta_j + z_{ij}^\top(t) b_{ij}$ denotes the mean (noise-free) trajectory of outcome $j$.

The model for outcome 1 includes the underlying value of outcome 2 at time $t$, represented by the term

\[ m_{i2}(t), \]

which captures the instantaneous level of outcome 2. Its effect on outcome 1 is scaled by the parameter $\alpha$.

Example

First we simulate data:

n = 100 

K <- 10  # number of planned repeated measurements per subject, per outcome
t.max <- 10 # maximum follow-up time

# parameters for the linear mixed effects model 1
betaa1 <- c("(Intercept)" = 4.2206, "Group1" = 0.2885, "Time1" =  1.3009,
           "Time2" = 2.3196)
sigma1.y <- 1.017671  # measurement error standard deviation

# parameters for the linear mixed effects model 2
betaa2 <- c("(Intercept)" = 2.15, "Time1" = 0.91)
sigma2.y <- 1  # measurement error standard deviation

# association parameter
alphaa <- 1.6 # association parameter - value

invDvec <- c(1, 0, 0, 0, 0, 
             0, 1, 0, 0, 0, 
             0, 0, 1, 0, 0, 
             0, 0, 0, 1, 0, 
             0, 0, 0, 0, 1)

invDmat <- matrix(invDvec, 5, 5)
D <- solve(invDmat)

# design matrices for the longitudinal measurement model
times <- c(replicate(n, c(0, sort(runif(K-1, 0, t.max))))) 
group <- sample(c(0,1), n, replace = TRUE)

DF <- data.frame(year = times, drug = factor(rep(group, each = K)))

X1 <- model.matrix(~ drug + year + I(year^2), data = DF)
Z1 <- model.matrix(~ year + I(year^2), data = DF)

X2 <- model.matrix(~ year, data = DF)
Z2 <- model.matrix(~ year, data = DF)

#simulate random effects
library(MASS)

b <- mvrnorm(n, rep(0, nrow(D)), D)

# simulate longitudinal responses
id <- rep(1:n, each = K)

eta.y2 <- as.vector(X2 %*% betaa2 + rowSums(Z2 * b[id, 4:5])) 
y2 <- rnorm(n * K, eta.y2, sigma2.y)

eta.y1 <- as.vector(X1 %*% betaa1 + rowSums(Z1 * b[id, 1:3])) + alphaa * eta.y2 
y1 <- rnorm(n * K, eta.y1, sigma1.y)

dat <- DF[, ]
dat$id <- id
dat$y1 <- y1
dat$y2 <- y2

names(dat) <- c("year", "drug", "id", "y1", "y2")
data <- dat

Then, we run the multivariate model as follows:

formulas <- list(y1 ~ drug + year + I(year^2) +
                     (  year + I(year^2) | id),
                   y2 ~ year + (year | id))
families <- c("gaussian", "gaussian")


fit <- mv_lme(formulas = formulas, data = data,
                     families = families,
                     corr_RE = FALSE,
                     assoc = TRUE,
                     assoc_from = 2,
                     assoc_to = 1,
                     extraForm = NULL,
                     time_var = "year")

More details about the specification of the association parameters can be also found here: Rizopoulos, D. (2012). Joint models for longitudinal and time-to-event data: With applications in R. CRC press. section 5.1