\[\Huge r_{cor_xy,z} = \frac{r_{cor_xy} - r_{cor_xz} \cdot r_{cor_yz}}{\sqrt{(1-r_{cor_xz}^2) \cdot (1 - r_{cor_yz}^2)}}\]
Partial correlation lets us control for variables that may confound the relationship between two other variables in a data set. It estimates what the correlation between two variables would be with a third variable held constant. The same idea extends to controlling for more than one variable.
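As a minimal sketch, the formula above can be wrapped in a small helper. The function name `partial_cor_first_order` and its argument names are illustrative assumptions, not part of any package; it takes the three pairwise Pearson correlations and returns the first-order partial correlation.

```r
# Hypothetical helper (name and arguments are illustrative, not from a package).
# r_xy, r_xz, r_yz: pairwise Pearson correlations among X, Y, and the control Z.
partial_cor_first_order <- function(r_xy, r_xz, r_yz) {
  numerator   <- r_xy - r_xz * r_yz
  denominator <- sqrt((1 - r_xz^2) * (1 - r_yz^2))
  numerator / denominator
}

# Example with made-up correlations: when Z is uncorrelated with both X and Y,
# the partial correlation equals the plain correlation r_xy.
partial_cor_first_order(r_xy = 0.5, r_xz = 0, r_yz = 0)
```

Working from the three correlations rather than the raw data keeps the helper close to the formula, which makes each term easy to check by hand.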
To compute a first-order partial correlation by hand, we first need the pairwise correlation between each pair of variables in question. Here we use the following variables: horsepower (hp), miles per gallon (mpg), and number of cylinders (cyl). If you would like to follow along, we are using the mtcars data set that is built into R.
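The structure listing of the data set can be reproduced with base R's `str()`. This is a small sketch; `mtcars` ships with the `datasets` package, so nothing needs to be installed.

```r
# mtcars is part of base R's datasets package, so no install is needed.
data(mtcars)

# Show the type and first few values of every column.
str(mtcars)

# Peek at only the three variables used in this walkthrough.
head(mtcars[, c("mpg", "hp", "cyl")])
```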
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
A data frame with 32 observations on 11 (numeric) variables.
Here we find the correlation between mpg and hp.
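A call along these lines produces this kind of test output. The object name `cor_test_1` is an assumption, chosen to mirror the `cor_test_2` object used for the hp and cyl test.

```r
# Pearson correlation test between miles per gallon and horsepower.
# cor_test_1 is an assumed name, mirroring cor_test_2 used for hp and cyl.
cor_test_1 <- cor.test(mtcars$mpg, mtcars$hp)
print(cor_test_1)

# The point estimate alone can be pulled out of the result object.
unname(cor_test_1$estimate)
```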
##
## Pearson's product-moment correlation
##
## data: mtcars$mpg and mtcars$hp
## t = -6.7424, df = 30, p-value = 1.788e-07
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.8852686 -0.5860994
## sample estimates:
## cor
## -0.7761684
Here we find the correlation between hp and cyl.
library(broom) # provides tidy()
cor_test_2 <- cor.test(mtcars$hp, mtcars$cyl)
tidy_result_2 <- tidy(cor_test_2) # Convert result to a neat table
knitr::kable(tidy_result_2, caption = "Correlation Test Results", row.names = TRUE)
| | estimate | statistic | p.value | parameter | conf.low | conf.high | method | alternative |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.8324475 | 8.228604 | 0 | 30 | 0.6816016 | 0.9154223 | Pearson’s product-moment correlation | two.sided |
##
## Pearson's product-moment correlation
##
## data: mtcars$hp and mtcars$cyl
## t = 8.2286, df = 30, p-value = 3.478e-09
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.6816016 0.9154223
## sample estimates:
## cor
## 0.8324475
Here we find the correlation between mpg and cyl.
##
## Pearson's product-moment correlation
##
## data: mtcars$mpg and mtcars$cyl
## t = -8.9197, df = 30, p-value = 6.113e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.9257694 -0.7163171
## sample estimates:
## cor
## -0.852162
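The three pairwise correlations above can also be obtained in a single call by passing a data frame to `cor()`. This is a sketch; the names `vars` and `cor_matrix` are assumptions introduced here.

```r
# Pairwise Pearson correlations among the three variables in one call.
# vars and cor_matrix are assumed names, not from the original post.
vars <- mtcars[, c("mpg", "hp", "cyl")]
cor_matrix <- cor(vars)
round(cor_matrix, 4)

# Individual entries can be looked up by row and column name.
cor_matrix["mpg", "cyl"]
```

Unlike `cor.test()`, the matrix form returns only the estimates, with no p-values or confidence intervals.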
###Partial Correlation when controlling for another variable
###X is hp
###Y is mpg
###Z is cyl
cor_xy <- cor(mtcars$mpg,mtcars$hp)
cor_xz <- cor(mtcars$hp,mtcars$cyl)
cor_yz <- cor(mtcars$mpg,mtcars$cyl)
print(cor_xy)
print(cor_xz)
print(cor_yz)
## [1] -0.7761684
## [1] 0.8324475
## [1] -0.852162
\[\large r_{cor_xy} - r_{cor_xz} \cdot r_{cor_yz} \]
\[\large -0.7761684 - (0.8324475 \cdot -0.852162) \]
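The numerator can be computed directly from the stored correlations. In this sketch the variable name `top_of_formula` is an assumption, picked to pair with `bottom_of_formula`; the correlations are recomputed so the snippet runs on its own.

```r
# Pairwise correlations, recomputed so this snippet is self-contained.
cor_xy <- cor(mtcars$mpg, mtcars$hp)
cor_xz <- cor(mtcars$hp, mtcars$cyl)
cor_yz <- cor(mtcars$mpg, mtcars$cyl)

# Top of formula (top_of_formula is an assumed name).
top_of_formula <- cor_xy - (cor_xz * cor_yz)
print(top_of_formula)
```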
## [1] -0.06678832
\[\large \sqrt{(1-r_{cor_xz}^2) \cdot (1 - r_{cor_yz}^2)} \]
\[\large \sqrt{(1-0.8324475^2) \cdot (1 - (-0.852162)^2)} \]
###Bottom of formula
bottom_of_formula<- sqrt((1-(cor_xz^2))*(1-(cor_yz^2)))
print(bottom_of_formula)
## [1] 0.2899505
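Dividing the numerator by the denominator gives the partial correlation, which is then rounded as the last step. This is a sketch of that final step; `top_of_formula` and `partial_r` are assumed names, and everything is recomputed so the snippet runs on its own.

```r
# Pairwise correlations, recomputed so this snippet is self-contained.
cor_xy <- cor(mtcars$mpg, mtcars$hp)
cor_xz <- cor(mtcars$hp, mtcars$cyl)
cor_yz <- cor(mtcars$mpg, mtcars$cyl)

# Numerator and denominator of the first-order partial correlation.
top_of_formula    <- cor_xy - (cor_xz * cor_yz)
bottom_of_formula <- sqrt((1 - cor_xz^2) * (1 - cor_yz^2))

# Full-precision result first, rounded value second.
partial_r <- top_of_formula / bottom_of_formula
print(partial_r)
print(round(partial_r, 2))
```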
## [1] -0.2303439
## [1] -0.23
Note: Round only at the very end of the calculation.
\[\large r = \frac{-0.06678832}{0.2899505} \] \[\large r = -0.2303439 \]
A first-order partial correlation coefficient is the correlation between two variables with one additional variable partialed out of both.
Here we can use the ppcor package as an easy-button method to calculate the partial correlation. You will need to install the ppcor package from CRAN first.
###Partial Correlation when controlling for another variable
##Load the ppcor library
library(ppcor)
## Loading required package: MASS
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
###ppcor function controlling for cylinders in the mtcars dataset
ppcor::pcor.test(mtcars$mpg, mtcars$hp, mtcars[, c("cyl")])
###ppcor function controlling for cylinders and displacement in the mtcars dataset
ppcor::pcor.test(mtcars$mpg, mtcars$hp, mtcars[, c("cyl","disp")])
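One way to sanity-check the result without any extra package is the residual view of partial correlation: regress each variable on the control variable(s), then correlate the residuals. The sketch below uses base R's `lm()` and is not a description of how ppcor computes its result internally; all object names are assumptions.

```r
# Remove the linear effect of cyl from each variable, then correlate what is left.
resid_mpg <- resid(lm(mpg ~ cyl, data = mtcars))
resid_hp  <- resid(lm(hp ~ cyl, data = mtcars))
partial_r_resid <- cor(resid_mpg, resid_hp)
print(partial_r_resid)

# Same idea with two control variables (cyl and disp).
resid_mpg_2 <- resid(lm(mpg ~ cyl + disp, data = mtcars))
resid_hp_2  <- resid(lm(hp ~ cyl + disp, data = mtcars))
partial_r_resid_2 <- cor(resid_mpg_2, resid_hp_2)
print(partial_r_resid_2)
```

The first value should agree with the hand calculation of about -0.23, since correlating residuals is algebraically equivalent to the first-order formula.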
Hatcher, L. (2013). Advanced statistics in research: Reading, understanding, and writing up data analysis results. Shadow Finch Media.
Henderson, H. V., & Velleman, P. F. (1981). Building multiple regression models interactively. Biometrics, 37, 391–411.
Kim, S. (2015). ppcor: Partial and Semi-Partial (Part) Correlation. R package version 1.1. https://CRAN.R-project.org/package=ppcor