Open RStudio to do the practicals. Note that tasks marked with * are optional.
R version 4.3.1 (2023-06-16 ucrt)
Explore the pre-loaded data set ToothGrowth.
Use the R functions hist(…) / boxplot(…) to visually explore the distribution of a numeric vector.
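For instance, a minimal sketch of this exploration (the plot titles and labels are our choice):

```r
# Visually explore the distribution of tooth length
hist(ToothGrowth$len, main = "Tooth length", xlab = "len")
boxplot(ToothGrowth$len, ylab = "len")
```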
Since the data look normally distributed and we have 60 subjects, we can use the t-test. Use the R function t.test() to investigate whether the sample mean is different from a particular value.
To extract a particular result from an object use the $ symbol.
The one-sample t-test is a special case of the regression model. Keep in mind that the p-value of a linear regression model with only the intercept investigates whether the coefficient is equal to 0. Here we want to investigate whether the mean is 20.
Null hypothesis: The mean tooth length is equal to 20.
Alternative hypothesis: The mean tooth length is different from 20.
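The output below comes from a call along these lines; storing the result in an object (here named test_res, our choice) makes it possible to extract individual elements later with the $ symbol:

```r
# One-sample t-test of H0: mean tooth length = 20
test_res <- t.test(x = ToothGrowth$len, mu = 20)
test_res
```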
##
## One Sample t-test
##
## data: ToothGrowth$len
## t = -1.2017, df = 59, p-value = 0.2343
## alternative hypothesis: true mean is not equal to 20
## 95 percent confidence interval:
## 16.83731 20.78936
## sample estimates:
## mean of x
## 18.81333
If we assume that the significance level is 0.05, the output indicates that we cannot reject the null hypothesis that the mean tooth length is 20.
# Extract the test statistic using the $ symbol:
t.test(x = ToothGrowth$len, mu = 20)$statistic
## t
## -1.201661
The test statistic of a one-sample t-test is \(\frac{\bar{x} - \mu_0}{sd(x)/\sqrt{n}}\)
# Calculate the test statistic manually
test_statistic <- (mean(ToothGrowth$len) - 20)/(sd(ToothGrowth$len)/sqrt(length(ToothGrowth$len)))
test_statistic
## [1] -1.201661
The confidence interval is:
[\(\bar{x} - critical \ value_{\alpha/2} \frac{sd(x)}{\sqrt{n}}\), \(\bar{x} + critical \ value_{\alpha/2} \frac{sd(x)}{\sqrt{n}}\)]
The sampling distribution of the mean is assumed to follow the Student’s t-distribution. We will use this distribution to obtain the critical values and the p-value.
# Calculate the confidence interval manually
# lower limit:
mean(ToothGrowth$len) + qt(0.025, df = length(ToothGrowth$len) - 1) * (sd(ToothGrowth$len)/sqrt(length(ToothGrowth$len)))
## [1] 16.83731
# or
mean(ToothGrowth$len) - qt(0.025, df = length(ToothGrowth$len) - 1, lower.tail = FALSE) * (sd(ToothGrowth$len)/sqrt(length(ToothGrowth$len)))
## [1] 16.83731
# upper limit:
mean(ToothGrowth$len) + qt(0.025, df = length(ToothGrowth$len) - 1, lower.tail = FALSE) * (sd(ToothGrowth$len)/sqrt(length(ToothGrowth$len)))
## [1] 20.78936
# Calculate the p-value manually
# (the test statistic is negative here; in general use -abs(test_statistic)):
p_value <- 2 * pt(test_statistic, df = length(ToothGrowth$len) - 1)
p_value
## [1] 0.2342964
Definition of the p-value: if the null hypothesis is true, then there is a 23% chance of obtaining a difference between the sample mean and 20 at least as extreme as the one observed.
The one-sample t-test is a special case of the regression model. Keep in mind that the p-value of a linear regression model with only the intercept investigates whether the coefficient is equal to 0. Here we want to investigate whether the mean is equal to 20. Therefore, we assume our new outcome to be \(y-20\).
# Subtract the value 20:
y <- ToothGrowth$len - 20
# Fit the linear regression model with only intercept:
summary(lm(y ~ 1))
##
## Call:
## lm(formula = y ~ 1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.6133 -5.7383 0.4367 6.4617 15.0867
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.1867 0.9875 -1.202 0.234
##
## Residual standard error: 7.649 on 59 degrees of freedom
The p-value is 0.234, the df is 59 and the t-value is -1.202. These results were also observed in the t-test:
##
## One Sample t-test
##
## data: ToothGrowth$len
## t = -1.2017, df = 59, p-value = 0.2343
## alternative hypothesis: true mean is not equal to 20
## 95 percent confidence interval:
## 16.83731 20.78936
## sample estimates:
## mean of x
## 18.81333
The coefficient in the linear regression model can be obtained as the sample mean minus 20 (18.81333 - 20 = -1.18667).
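As a quick check (a sketch, using the shifted outcome from above):

```r
# The sample mean minus 20 ...
mean(ToothGrowth$len) - 20
# ... equals the intercept of the intercept-only model on the shifted outcome
unname(coef(lm(I(len - 20) ~ 1, data = ToothGrowth)))
```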
Explore the pre-loaded data set ToothGrowth.
Use the R function wilcox.test() to investigate whether the sample median is different from a particular value. It is recommended to use the exact p-value calculation when there are no ties, and to use continuity correction if an exact calculation is not possible.
Null hypothesis: The median tooth length of the supplement type VC is
equal to 20.
Alternative hypothesis: The median tooth length of the supplement type
VC is different from 20.
test_res_b <- wilcox.test(x = ToothGrowth$len[ToothGrowth$supp == "VC"],
mu = 20, exact = FALSE)
test_res_b
##
## Wilcoxon signed rank test with continuity correction
##
## data: ToothGrowth$len[ToothGrowth$supp == "VC"]
## V = 140, p-value = 0.05842
## alternative hypothesis: true location is not equal to 20
# Because there are ties in the data, the exact calculation of the p-value could not be used and the normal approximation was used instead. Continuity correction was assumed (by default).
If we assume that the significance level is 0.05, the output indicates that we cannot reject the null hypothesis that the median length of the VC group is 20. Keep in mind that the p-value is very close to the significance level.
The test statistic is the sum of the ranks of the absolute differences over the pairs where the difference is positive (equivalently, one can use the negative differences):
# Calculate the test statistic manually
x <- ToothGrowth$len[ToothGrowth$supp == "VC"]
y <- 20
# Calculate the rank of the absolute differece:
res <- rank(abs(x - y))
# Calculate the sum of the ranks where the difference was positive:
sum(res[(x - y) > 0])
## [1] 140
# Calculate the sum of the ranks where the difference was negative:
sum(res[(x - y) < 0])
## [1] 325
# Extract the test statistic from the test object:
test_res_b$statistic
## V
## 140
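As a sanity check (a small sketch): the positive and negative rank sums must add up to the total rank sum \(n(n+1)/2\), where \(n\) is the number of non-zero differences:

```r
x <- ToothGrowth$len[ToothGrowth$supp == "VC"]
n <- sum(x != 20)  # number of non-zero differences
n * (n + 1) / 2    # total rank sum: 140 + 325 = 465
```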
Explore the pre-loaded data set ToothGrowth.
This is a one-tailed test.
Use the R function wilcox.test() to investigate whether the sample median is different from a particular value. It is recommended to use the exact p-value calculation when there are no ties, and to use continuity correction if an exact calculation is not possible.
In the R function wilcox.test() check the argument alternative.
Null hypothesis: The median tooth length of dose equal to 2 is equal to 20.
Alternative hypothesis: The median tooth length of dose equal to 2 is larger than 20.
wilcox.test(x = ToothGrowth$len[ToothGrowth$dose == "2"],
mu = 20, exact = FALSE, alternative = c("greater"))
##
## Wilcoxon signed rank test with continuity correction
##
## data: ToothGrowth$len[ToothGrowth$dose == "2"]
## V = 208.5, p-value = 5.972e-05
## alternative hypothesis: true location is greater than 20
If we assume that the significance level is 0.05, the output indicates that we can reject the null hypothesis that the median length for dose equal to 2 is 20 (or less).
# Calculate the test statistic manually
x <- ToothGrowth$len[ToothGrowth$dose == "2"]
mu <- 20
# Calculate the rank of the absolute difference:
res <- rank(abs(x - mu))
# Calculate the sum of the ranks where the difference was positive:
sum(res[(x - mu) > 0])
## [1] 208.5
# Calculate the sum of the ranks where the difference was negative:
sum(res[(x - mu) < 0])
## [1] 1.5
Explore the pre-loaded data set iris. Visually explore the distribution of Sepal.Length per species.
Use the R functions boxplot(…) / hist(…) to visually explore the distribution of a numeric vector per group.
Since the data look normally distributed and we have 50 subjects per group, we can use the t-test. Use the R function t.test() to investigate whether the mean values of two groups are different.
To extract a particular result from an object use the $ symbol.
par(mfrow = c(1, 3))
hist(iris$Sepal.Length[iris$Species == "setosa"],
main = "setosa", xlab = "Length")
hist(iris$Sepal.Length[iris$Species == "versicolor"],
main = "versicolor", xlab = "Length")
hist(iris$Sepal.Length[iris$Species == "virginica"],
main = "virginica", xlab = "Length")
Null hypothesis: The mean sepal length is equal for setosa and
versicolor.
Alternative hypothesis: The mean sepal length is different for setosa
and versicolor.
test_res2 <- t.test(x = iris$Sepal.Length[iris$Species == "setosa"],
y = iris$Sepal.Length[iris$Species == "versicolor"])
test_res2
##
## Welch Two Sample t-test
##
## data: iris$Sepal.Length[iris$Species == "setosa"] and iris$Sepal.Length[iris$Species == "versicolor"]
## t = -10.521, df = 86.538, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.1057074 -0.7542926
## sample estimates:
## mean of x mean of y
## 5.006 5.936
If we assume that the significance level is 0.05, the output indicates that we can reject the null hypothesis that the mean (sepal) length for setosa and versicolor is the same.
Let’s calculate everything also manually.
We first test the homogeneity of the variances. The \(F\) statistic is \(\frac{highest \ variance}{lowest \ variance}\). Moreover, the degrees of freedom are \(n_1 - 1\) and \(n_2 - 1\).
# Calculate the test statistic manually
x <- iris$Sepal.Length[iris$Species == "setosa"]
y <- iris$Sepal.Length[iris$Species == "versicolor"]
# Test whether the variances are equal
# Test statistic:
F_statistic = var(y)/var(x)
# P-value:
pf(F_statistic, df1 = length(x) - 1, df2 = length(y) - 1, lower.tail = FALSE)
## [1] 0.004328594
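The same comparison can be carried out with the built-in function var.test() (note that it is two-sided by default, so its p-value is twice the one-tailed value above):

```r
# Recreate the two groups
x <- iris$Sepal.Length[iris$Species == "setosa"]
y <- iris$Sepal.Length[iris$Species == "versicolor"]
# F test for the ratio of the two variances
var.test(y, x)
```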
The test statistic is \(t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{sd^2(x_1)}{n_1} + \frac{sd^2(x_2)}{n_2}}}\), where \(\mu_1\) and \(\mu_2\) are 0 since we want to investigate whether the difference is 0.
The degrees of freedom are: \(df = \frac{\bigg[\frac{sd^2(x_1)}{n_1} + \frac{sd^2(x_2)}{n_2} \bigg]^2}{ \frac{[sd^2(x_1)/n_1]^2}{n_1 - 1} + \frac{[sd^2(x_2)/n_2]^2}{n_2 - 1} }\)
# Test statistic:
Test_statistic <- (mean(x) - mean(y))/sqrt(var(x)/length(x) + var(y)/length(y))
Test_statistic
## [1] -10.52099
# Degrees of freedom:
degrees_fr <- (var(x)/length(x) + var(y)/length(y))^2/( (var(x)/length(x))^2/(length(x) - 1) + (var(y)/length(y))^2/(length(y) - 1) )
# P-value (the test statistic is negative here; in general use -abs(Test_statistic)):
2 * pt(Test_statistic, df = degrees_fr, lower.tail = TRUE)
## [1] 3.746743e-17
# Extract the test statistic and the p-value from the test object:
test_res2$statistic
## t
## -10.52099
test_res2$p.value
## [1] 3.746743e-17
The confidence interval is:
\(\bigg[(\bar{x}_1-\bar{x}_2) - critical \ value_{\alpha/2} \sqrt{\frac{sd^2(x_1)}{n_1} + \frac{sd^2(x_2)}{n_2}}\), \((\bar{x}_1-\bar{x}_2) + critical \ value_{\alpha/2} \sqrt{\frac{sd^2(x_1)}{n_1} + \frac{sd^2(x_2)}{n_2}}\bigg]\)
The sampling distribution of the mean is assumed to follow the Student’s t-distribution. We will use this distribution to obtain the critical values.
# Calculate the confidence interval manually
# lower limit:
(mean(x) - mean(y)) + qt(0.025, df = degrees_fr) * sqrt(var(x)/length(x) + var(y)/length(y))
## [1] -1.105707
# or
(mean(x) - mean(y)) - qt(0.025, df = degrees_fr, lower.tail = FALSE) * sqrt(var(x)/length(x) + var(y)/length(y))
## [1] -1.105707
# upper limit:
(mean(x) - mean(y)) + qt(0.025, df = degrees_fr, lower.tail = FALSE) * sqrt(var(x)/length(x) + var(y)/length(y))
## [1] -0.7542926
Explore the pre-loaded data set iris. Visually explore the distribution of Petal.Width per species.
The distributions of the variables do not look normally distributed. Therefore, we will use the non-parametric test. Use the R function wilcox.test() to investigate whether the two populations have the same distribution. It is recommended to use the exact p-value calculation when there are no ties, and to use continuity correction if an exact calculation is not possible.
par(mfrow = c(1, 3))
hist(iris$Petal.Width[iris$Species == "setosa"],
main = "setosa", xlab = "Width")
hist(iris$Petal.Width[iris$Species == "versicolor"],
main = "versicolor", xlab = "Width")
hist(iris$Petal.Width[iris$Species == "virginica"],
main = "virginica", xlab = "Width")
Null hypothesis: The distributions of petal width are equal for setosa and versicolor.
Alternative hypothesis: The distributions of petal width are different for setosa and versicolor.
The distributions of width (petal) for setosa and versicolor do not look normally distributed. We will, therefore, use the non-parametric test.
test_res2b <- wilcox.test(x = iris$Petal.Width[iris$Species == "setosa"],
y = iris$Petal.Width[iris$Species == "versicolor"],
exact = FALSE)
test_res2b
##
## Wilcoxon rank sum test with continuity correction
##
## data: iris$Petal.Width[iris$Species == "setosa"] and iris$Petal.Width[iris$Species == "versicolor"]
## W = 0, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
If we assume that the significance level is 0.05, the output indicates that we can reject the null hypothesis that the distributions of the (petal) width for setosa and versicolor are the same.
# Extract the test statistic and the p-value from the test object:
test_res2b$statistic
## W
## 0
test_res2b$p.value
## [1] 2.284669e-18
Explore the pre-loaded data set airquality. Visually explore the distribution of Wind per month group. Investigate whether Wind is different between month 5 and month 8. Let's assume that the measurements are paired.
The distributions of the variables do not look normally distributed. Therefore, we will use the non-parametric test. Use the R function wilcox.test() to investigate whether the median values are different. Check the argument paired in the R function.
par(mfrow = c(1, 2))
hist(airquality$Wind[airquality$Month == "5"],
main = "Month 5", xlab = "Wind")
hist(airquality$Wind[airquality$Month == "8"],
main = "Month 8", xlab = "Wind")
We notice that the distribution of the variable of interest is skewed (at least in one of the groups) and might not follow a normal distribution. Moreover, the sample size is small. We decide, therefore, to use a non-parametric test.
wilcox.test(x = airquality$Wind[airquality$Month == "5"],
y = airquality$Wind[airquality$Month == "8"],
paired = TRUE, exact = FALSE)
##
## Wilcoxon signed rank test with continuity correction
##
## data: airquality$Wind[airquality$Month == "5"] and airquality$Wind[airquality$Month == "8"]
## V = 359, p-value = 0.00952
## alternative hypothesis: true location shift is not equal to 0
# Alternatively:
wilcox.test(Wind ~ Month,
data = airquality[airquality$Month == "5" | airquality$Month == "8",],
paired = TRUE, exact = FALSE)
##
## Wilcoxon signed rank test with continuity correction
##
## data: Wind by Month
## V = 359, p-value = 0.00952
## alternative hypothesis: true location shift is not equal to 0
wilcox.test(Wind ~ Month, data = airquality,
subset = Month %in% c(5, 8),
paired = TRUE, exact = FALSE)
##
## Wilcoxon signed rank test with continuity correction
##
## data: Wind by Month
## V = 359, p-value = 0.00952
## alternative hypothesis: true location shift is not equal to 0
If we assume that the significance level is 0.05, the output indicates that we can reject the null hypothesis that the median wind of month 5 is equal to that of month 8.
# Calculate the test statistic manually
x <- airquality$Wind[airquality$Month == "5"]
y <- airquality$Wind[airquality$Month == "8"]
# Create a data frame including the variables, the difference and the absolute difference:
mat <- data.frame(x, y, diff = c(x-y), abs_diff = abs(x-y))
# Remove the pairs that scored equal (zero difference; here only the 10th pair):
mat <- mat[mat$diff != 0, ]
# Add the ranks of the absolute differences as an extra column:
mat <- data.frame(mat, ranks = rank(abs(mat$x-mat$y)))
# Calculate the sum of the ranks where the difference was positive:
sum(mat$ranks[mat$diff > 0])
## [1] 359
# Calculate the sum of the ranks where the difference was negative:
sum(mat$ranks[mat$diff < 0])
## [1] 106
Explore the pre-loaded data set airquality.
We notice that the distribution of the variable of interest is skewed (at least in one of the groups) and might not follow a normal distribution. Moreover, the sample size is small. We decide, therefore, to use a non-parametric test.
Null hypothesis: The median temperature of month 5 is equal to the
median temperature of month 8.
Alternative hypothesis: The median temperature of month 5 is lower than
the median temperature of month 8.
wilcox.test(x = airquality$Temp[airquality$Month == "5"],
y = airquality$Temp[airquality$Month == "8"],
alternative = 'less', paired = TRUE, exact = FALSE)
##
## Wilcoxon signed rank test with continuity correction
##
## data: airquality$Temp[airquality$Month == "5"] and airquality$Temp[airquality$Month == "8"]
## V = 1, p-value = 6.733e-07
## alternative hypothesis: true location shift is less than 0
# Alternatively...
# When using the formula specification, x refers to the first level or values and y refers to the second level or value.
wilcox.test(Temp ~ Month,
data = airquality[airquality$Month == "5" | airquality$Month == "8",],
alternative = 'less', paired = TRUE, exact = FALSE)
##
## Wilcoxon signed rank test with continuity correction
##
## data: Temp by Month
## V = 1, p-value = 6.733e-07
## alternative hypothesis: true location shift is less than 0
If we assume that the significance level is 0.05, we can reject the null hypothesis that the median values of the two groups are equal.
Explore the pre-loaded data set iris.
# Create a new data set where the first variable indicates the length and the second variable the group (setosa or versicolor):
var1 = iris$Sepal.Length[iris$Species == "setosa"]
var2 = iris$Sepal.Length[iris$Species == "versicolor"]
dat <- data.frame(outcome = c(var1, var2),
group = c(rep(0, length(var1)),
rep(1, length(var2))))
# Fit a linear regression model:
summary(lm(outcome ~ group, data = dat))
##
## Call:
## lm(formula = outcome ~ group, data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.0360 -0.3135 -0.0060 0.2715 1.0640
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.00600 0.06250 80.09 <2e-16 ***
## group 0.93000 0.08839 10.52 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.442 on 98 degrees of freedom
## Multiple R-squared: 0.5304, Adjusted R-squared: 0.5256
## F-statistic: 110.7 on 1 and 98 DF, p-value: < 2.2e-16
The p-value is <0.0001 and the t-value is 10.52. The same results can be observed from the t-test:
var1 = iris$Sepal.Length[iris$Species == "setosa"]
var2 = iris$Sepal.Length[iris$Species == "versicolor"]
t.test(var2, var1)
##
## Welch Two Sample t-test
##
## data: var2 and var1
## t = 10.521, df = 86.538, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.7542926 1.1057074
## sample estimates:
## mean of x mean of y
## 5.936 5.006
The slope coefficient in the linear regression model can be obtained as the mean difference (5.936 - 5.006 = 0.93).
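As a quick check (a sketch): the slope equals the difference of the two sample means:

```r
var1 <- iris$Sepal.Length[iris$Species == "setosa"]
var2 <- iris$Sepal.Length[iris$Species == "versicolor"]
# Difference in sample means equals the slope of the regression
mean(var2) - mean(var1)
```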
We can also investigate the confidence intervals:
# Obtain the confidence interval from the t-test:
t.test(var2, var1)$conf.int
## [1] 0.7542926 1.1057074
## attr(,"conf.level")
## [1] 0.95
# Obtain the confidence interval from the linear regression:
confint(lm(outcome ~ group, data = dat))
## 2.5 % 97.5 %
## (Intercept) 4.8819618 5.130038
## group 0.7545835 1.105417
Explore the pre-loaded data set airquality.
Use the R functions boxplot(…) / hist(…) to visually explore the distribution of a numeric vector per group.
We do not seem to have normally distributed data. Moreover, the sample size is small. Therefore, we can use the non-parametric Kruskal-Wallis test. Use the R function kruskal.test() to investigate whether the samples originate from the same distribution.
par(mfrow = c(2, 3))
hist(airquality$Temp[airquality$Month == "5"],
main = "Month 5", xlab = "Temp")
hist(airquality$Temp[airquality$Month == "6"],
main = "Month 6", xlab = "Temp")
hist(airquality$Temp[airquality$Month == "7"],
main = "Month 7", xlab = "Temp")
hist(airquality$Temp[airquality$Month == "8"],
main = "Month 8", xlab = "Temp")
hist(airquality$Temp[airquality$Month == "9"],
main = "Month 9", xlab = "Temp")
We notice that the distribution of the variable of interest is skewed (at least in one of the groups) and might not follow a normal distribution. Moreover, the sample size is small. We decide, therefore, to use a non-parametric test.
Null hypothesis: the distribution of temperature is similar per month
group.
Alternative hypothesis: the distribution of temperature is different per
month group.
kruskal.test(Temp ~ Month, data = airquality)
##
## Kruskal-Wallis rank sum test
##
## data: Temp by Month
## Kruskal-Wallis chi-squared = 73.328, df = 4, p-value = 4.496e-15
The output shows that, if the significance level is 0.05, we can reject the null hypothesis that the groups have similar distributions.
We have normally distributed data and a large sample size. Use the R function aov() to investigate whether the samples originate from the same distribution.
# Generate data
set.seed(123)
BMI1 <- rnorm(80, 20, 10)
BMI2 <- rnorm(80, 40, 6)
BMI3 <- rnorm(80, 15, 20)
# First we will explore the data with plots
boxplot(BMI1, BMI2, BMI3)
The distributions are normal and the sample size is large. We decide, therefore, to use an ANOVA test.
BMI <- c(BMI1, BMI2, BMI3)
group <- c(rep(1, 80), rep(2, 80), rep(3, 80))
summary(aov(BMI ~ as.factor(group)))
## Df Sum Sq Mean Sq F value Pr(>F)
## as.factor(group) 2 26720 13360 83.18 <2e-16 ***
## Residuals 237 38066 161
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The output shows that, if the significance level is 0.05, we can reject the null hypothesis that the mean values of the groups are equal.
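The one-way ANOVA is again a special case of the linear regression model: the overall F statistic of lm() matches the one from aov(). A sketch, regenerating the same simulated data:

```r
# Regenerate the simulated data with the same seed
set.seed(123)
BMI1 <- rnorm(80, 20, 10)
BMI2 <- rnorm(80, 40, 6)
BMI3 <- rnorm(80, 15, 20)
BMI <- c(BMI1, BMI2, BMI3)
group <- factor(rep(1:3, each = 80))
# The overall F test of the regression equals the ANOVA F test
summary(lm(BMI ~ group))$fstatistic
```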
Explore the pre-loaded data set iris. Create a scatterplot of Sepal.Length and Sepal.Width, obtain the correlation coefficient of Sepal.Length and Sepal.Width, and perform the correlation test for Sepal.Length and Sepal.Width.
Use the R function plot(…) to create a scatterplot.
Use the R function cor(…) to obtain the correlation coefficient.
Use the R function cor.test(…) to perform the correlation test.
# Obtain the correlation coefficient:
cor(x = iris$Sepal.Length, y = iris$Sepal.Width)
## [1] -0.1175698
# Perform the correlation test:
cor.test(x = iris$Sepal.Length, y = iris$Sepal.Width)
##
## Pearson's product-moment correlation
##
## data: iris$Sepal.Length and iris$Sepal.Width
## t = -1.4403, df = 148, p-value = 0.1519
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.27269325 0.04351158
## sample estimates:
## cor
## -0.1175698
Explore the pre-loaded data set iris. Create a scatterplot of Sepal.Length and Sepal.Width in the setosa group, obtain the correlation coefficient of Sepal.Length and Sepal.Width in the setosa group, and perform the (Spearman) correlation test for Sepal.Length and Sepal.Width in the setosa group.
Use the R function cor(…) to obtain the correlation coefficient. Check the argument method.
Use the R function cor.test(…) to perform the correlation test. Check the argument method.
# Create new vectors with the desired variables
var1 <- iris$Sepal.Length[iris$Species == "setosa"]
var2 <- iris$Sepal.Width[iris$Species == "setosa"]
plot(x = var1, y = var2)
# Obtain the (Pearson) correlation coefficient:
cor(x = var1, y = var2)
## [1] 0.7425467
# Perform the Spearman correlation test:
cor.test(x = var1, y = var2, method = "spearman")
## Warning in cor.test.default(x = var1, y = var2, method = "spearman"): Cannot compute exact p-value
## with ties
##
## Spearman's rank correlation rho
##
## data: var1 and var2
## S = 5095.1, p-value = 2.317e-10
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## 0.7553375
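Note that Spearman's rho is simply the Pearson correlation computed on the ranks (a small sketch):

```r
var1 <- iris$Sepal.Length[iris$Species == "setosa"]
var2 <- iris$Sepal.Width[iris$Species == "setosa"]
# Pearson correlation of the ranks = Spearman's rho
cor(rank(var1), rank(var2))
```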
Explore the pre-loaded data set iris. Perform the correlation test for Sepal.Length and Sepal.Width, and compare it with a linear regression of Sepal.Length on Sepal.Width.
Use the R function cor.test(…) to perform the correlation test.
Use the R function lm(…) to perform a linear regression.
The slope of the linear regression equals the correlation coefficient if the two variables have identical standard deviations (i.e., they are standardized).
# Perform the correlation test:
cor.test(x = iris$Sepal.Length, y = iris$Sepal.Width)
##
## Pearson's product-moment correlation
##
## data: iris$Sepal.Length and iris$Sepal.Width
## t = -1.4403, df = 148, p-value = 0.1519
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.27269325 0.04351158
## sample estimates:
## cor
## -0.1175698
# Linear regression
# We standardize the variables:
y1 <- (iris$Sepal.Length - mean(iris$Sepal.Length))/sd(iris$Sepal.Length)
y2 <- (iris$Sepal.Width - mean(iris$Sepal.Width))/sd(iris$Sepal.Width)
# We fit a linear regression model:
summary(lm(y1 ~ y2))
##
## Call:
## lm(formula = y1 ~ y2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.8793 -0.7648 -0.1352 0.6738 2.6840
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.759e-16 8.136e-02 0.00 1.000
## y2 -1.176e-01 8.163e-02 -1.44 0.152
##
## Residual standard error: 0.9964 on 148 degrees of freedom
## Multiple R-squared: 0.01382, Adjusted R-squared: 0.007159
## F-statistic: 2.074 on 1 and 148 DF, p-value: 0.1519
# Alternatively, standardize with scale():
summary(lm(scale(iris$Sepal.Length) ~ scale(iris$Sepal.Width)))
##
## Call:
## lm(formula = scale(iris$Sepal.Length) ~ scale(iris$Sepal.Width))
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.8793 -0.7648 -0.1352 0.6738 2.6840
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.759e-16 8.136e-02 0.00 1.000
## scale(iris$Sepal.Width) -1.176e-01 8.163e-02 -1.44 0.152
##
## Residual standard error: 0.9964 on 148 degrees of freedom
## Multiple R-squared: 0.01382, Adjusted R-squared: 0.007159
## F-statistic: 2.074 on 1 and 148 DF, p-value: 0.1519
The correlation coefficient and the slope coefficient are both -0.118. The p-value from the correlation test and the p-value of the slope coefficient are both 0.152.
© Eleni-Rosalina Andrinopoulou