
Chronological and gestational DNAm age estimation using different methylation-based clocks
Dolors Pelegri and Juan R Gonzalez
Institute for Global Health (ISGlobal), Barcelona, Spain Bioinformatics Research Group in Epidemiolgy (BRGE) http://brge.isglobal.org
2022-12-15
Source:vignettes/methylclock.Rmd
methylclock.Rmd
Description of implemented clocks
This manual describes how to estimate chronological and gestational DNA methylation (DNAm) age as well as biological age using different methylation clocks. The package includes the following estimators:
Chronological DNAm age (in years)
-
Horvath’s clock: It uses 353 CpGs described in
Horvath (2013). It was trained using 27K
and 450K arrays in samples from different tissues. Other three different
age-related biomarkers are also computed:
- AgeAcDiff (DNAmAge acceleration difference): Difference between DNAmAge and chronological age.
- IEAA Residuals obtained after regressing DNAmAge and chronological age adjusted by cell counts.
- EEAA Residuals obtained after regressing DNAmAge and chronological age. This measure was also known as DNAmAge acceleration residual in the first Horvath’s paper.
-
Hannum’s clock: It uses 71 CpGs described in Hannum et al. (2013). It was trained using 450K
array in blood samples. Another are-related biomarer is also computed:
- AMAR (Apparent Methylomic Aging Rate): Measure proposed in Hannum et al. (2013) computed as the ratio between DNAm age and the chronological age.
- BNN: It uses Horvath’s CpGs to train a Bayesian Neural Network (BNN) to predict DNAm age as described in Alfonso and Gonzalez (2020).
- Horvath’s skin+blood clock (skinHorvath): Epigenetic clock for skin and blood cells. It uses 391 CpGs described in Horvath et al. (2018). It was trained using 450K and EPIC arrays in skin and blood sampels.
- PedBE clock: Epigenetic clock from buccal epithelial swabs. It’s intended purpose is buccal samples from individuals aged 0-20 years old. It uses 84 CpGs described in McEwen et al. (2019). The authors gathered 1,721 genome-wide DNAm profiles from 11 different cohorts with individuals aged 0 to 20 years old.
- Wu’s clock: It uses 111 CpGs described in Wu et al. (2019). It is designed to predict age in children. It was trained using 27K and 450K.
-
BLUP clock: It uses 319607 CpGs described in Zhang et al. (2019). It was trained using 450K
and EPIC arrays in blood (13402 samples) and saliva (259 samples). Age
predictors based on training sets with various sample sizes using Best
Linear Unbiased Prediction (BLUP)
- EN clock: It uses 514 CpGs described in Zhang et al. (2019). It was trained using 450K and EPIC arrays in blood (13402 samples) and saliva (259 samples). Age predictors based on training sets with various sample sizes using Elastic Net (EN)
-
Neonatal Epigenetic Estimator of age (NEOage)
clocks: These are four epigenetic clocks that are focused on
age estimation of preterm infants based on their DNAm profile measured
in buccal epithelial cells. The number of CpGs within each clock range
from 303-522 CpGs with varying degrees of overlap between the clocks
described in Graw et al. (2021)
- PMA 450K Post-menstrual age (PMA), it uses 409 CpGs and was trained using 450K arrays.
- PMA EPIC Post-menstrual age (PMA), it uses 522 CpGs and was trained using EPIC arrays.
- PNA 450K Postnatal age (PNA) is equivalent to chronological age and is the time elapsed after birth, it uses 303 CpGs and was trained using 450K arrays.
- PNA EPIC Postnatal age (PNA) is equivalent to chronological age and is the time elapsed after birth, it uses 509 CpGs and was trained using EPIC arrays.
Gestational DNAm age (in weeks)
- Knight’s clock: It uses 148 CpGs described in Knight et al. (2016). It was trained using 27K and 450K arrays in cord blood samples.
- Bohlin’s clock: It uses 96 CpGs described in Bohlin et al. (2016). It was trained using 450K array in cord blood samples.
- Mayne’s clock: It uses 62 CpGs described in Mayne et al. (2017). It was trained using 27K and 450K.
- EPIC clock: EPIC-based predictor of gestational age. It uses 176 CpGs described in Haftorn et al. (2021). It was trained using EPIC arrays in cord blood samples.
-
Lee’s clocks: Three different biological clocks
described in Lee et al. (2019) are
implemented. It was trained for 450K and EPIC arrays in placenta
samples.
- RPC clock: Robust placental clock (RPC). It uses 558 CpG sites.
- CPC clock: Control placental clock (CPC). It usses 546 CpG sites.
- Refined RPC clock: Useful for uncomplicated term pregnancies (e.g. gestational age >36 weeks). It uses 396 CpG sites.
The biological DNAm clocks implemented in our package are:
- Levine’s clock (also know as PhenoAge): It uses 513 CpGs described in Levine et al. (2018). It was trained using 27K, 450K and EPIC arrays in blood samples.
- Telomere Length’s clock (TL): It uses 140 CpGs described in Lu et al. (2019) It was trained using 450K and EPIC arrays in blood samples.
The main aim of this package is to facilitate the interconnection
with R and Bioconductor’s infrastructure and, hence, avoiding submitting
data to online calculators. Additionally, methylclock
also
provides an unified way of computing DNAm age to help downstream
analyses.
Getting started
The package depends on some R packages that can be previously installed into your computer by:
Then methylclock
package is installed into your computer
by executing:
library(devtools)
install_github("isglobal-brge/methylclock")
The package is loaded into R as usual:
These libraries are required to reproduce this document:
DNA Methylation clocks
The main function to estimate chronological and biological mDNA age
is called DNAmAge
while the gestational DNAm age is
estimated using DNAmGA
function. Both functions have
similar input arguments. Next subsections detail some of the important
issues to be consider before computind DNAm clocks.
Data format
The methylation data is given in the argument x
. They
can be either beta or M values. The argument toBetas
should
be set to TRUE when M values are provided. The x
object can
be:
A matrix with CpGs in rows and individuals in columns having the name of the CpGs in the rownames.
A data frame or a tibble with CpGs in rows and individuals in columns having the name of the CpGs in the first column (e.g. cg00000292, cg00002426, cg00003994, …) as required in the Horvath’s DNA Methylation Age Calculator website (https://dnamage.genetics.ucla.edu/home).
A GenomicRatioSet object, the default method to encapsulate methylation data in
minfi
Bioconductor package.An ExpressionSet object as obtained, for instance, when downloading methylation data from GEO (https://www.ncbi.nlm.nih.gov/geo/).
Data nomalization
In principle, data can be normalized by using any of the existing
standard methods such as QN, ASMN, PBC, SWAN, SQN, BMIQ (see a revision
of those methods in Wang et al. (2015)).
DNAmAge
function includes the BMIQ method proposed by Teschendorff et al. (2012) using Horvath’s
robust implementation that basically consists of an optimal R code
implementation and optimization procedures. This normalization is
recommended by Horvath since it improves the predictions for his clock.
This normalization procedure is very time-consuming. In order to
overcome these difficulties, we have parallelize this process using
BiocParallel
library. This step is not mandatory, so that,
you can use your normalized data and set the argument
normalize
equal to FALSE (default).
Missing individual’s data
All the implemented methods require complete cases.
DNAmAge
function has an imputation method based on KNN
implemented in the function knn.impute
from
impute
Bioconductor package. This is performed when missing
data is present in the CpGs used in any of the computed clocks. There is
also another option based on a fast imputation method that imputes
missing values by the median of required CpGs as recommended in Bohlin et al. (2016). This is recommended when
analyzing 450K arrays since knn.impute
for large datasets
may be very time consuming. Fast imputation can be performed by setting
fastImp=TRUE
which is not the default value.
Missing CpGs of DNAm clocks
By default the package computes the different clocks when there are
more than 80% of the required CpGs of each method. Nothing is required
when having missing CpGs since the main functions will return NA for
those estimators when this criteria is not meet. Let us use a test
dataset (TestDataset
) which is available within the package
to illustrate the type of information we are obtaining:
cpgs.missing <- checkClocks(TestDataset)
There are some clocks that cannot be computed since your data do not contain the required CpGs.
These are the total number of missing CpGs for each clock :
clock Cpgs_in_clock missing_CpGs percentage
1 Horvath 353 2 0.6
2 Hannum 71 64 90.1
3 Levine 513 3 0.6
4 SkinHorvath 391 283 72.4
5 PedBE 94 91 96.8
6 Wu 111 2 1.8
7 TL 140 137 97.9
8 BLUP 319607 300288 94.0
9 EN 514 476 92.6
10 NEOaPMA450K 409 393 96.1
11 NEOaPNA450K 303 295 97.4
12 NEOaPMAEPIC 522 514 98.5
13 NEOaPNAEPIC 509 489 96.1
cpgs.missing.GA <- checkClocksGA(TestDataset)
There are some clocks that cannot be computed since your data do not contain the required CpGs.
These are the total number of missing CpGs for each clock :
clock Cpgs_in_clock missing_CpGs percentage
1 Knight 148 0 0.0
2 Bohlin 96 87 90.6
3 Mayne 62 0 0.0
4 Lee 1125 1072 95.3
5 EPIC 176 170 96.6
The objects cpgs.missing
and
cpgs.missing.GA
are lists having the missing CpGs of each
clock
names(cpgs.missing)
[1] "Horvath" "Hannum" "Levine" "skinHorvath" "PedBE"
[6] "Wu" "TL" "BLUP" "EN" "NEOaPMA450K"
[11] "NEOaPNA450K" "NEOaPMAEPIC" "NEOaPNAEPIC"
We can see which are those CpGs for a given clock (for example Hannum) just typing:
cpgs.missing$Hannum
[1] "cg20822990" "cg22512670" "cg25410668" "cg04400972"
[5] "cg16054275" "cg10501210" "ch.2.30415474F" "cg22158769"
[9] "cg02085953" "cg06639320" "cg22454769" "cg24079702"
[13] "cg23606718" "cg22016779" "cg03607117" "cg07553761"
[17] "cg00481951" "cg25478614" "cg25428494" "cg02650266"
[21] "cg08234504" "cg23500537" "cg20052760" "cg16867657"
[25] "cg06685111" "cg00486113" "cg13001142" "cg20426994"
[29] "cg14361627" "cg08097417" "cg07955995" "cg22285878"
[33] "cg03473532" "cg08540945" "cg07927379" "cg16419235"
[37] "cg07583137" "cg22796704" "cg19935065" "cg23091758"
[41] "cg23744638" "cg04940570" "cg11067179" "cg22213242"
[45] "cg06419846" "cg02046143" "cg00748589" "cg18473521"
[49] "cg01528542" "ch.13.39564907R" "cg03032497" "cg04875128"
[53] "cg09651136" "cg03399905" "cg04416734" "cg07082267"
[57] "cg14692377" "cg06874016" "cg21139312" "cg02867102"
[61] "cg19283806" "cg14556683" "cg07547549" "cg08415592"
In Section @ref(section-example) we describe how to change this 80% threshold.
Cell counts
The EEAA method requires to estimate cell counts. We use the package
meffil
(Min et al. (2018))
that provides some functions to estimate cell counts using predefined
datasets. This is performed by setting cell.count=TRUE
(default value). The reference panel is passed through the argument
cell.count.reference
. So far, the following options are
available:
- “blood gse35069 complete”: methylation profiles from Reinius et al. (2012) for purified blood cell types. It includes CD4T, CD8T, Mono, Bcell, NK, Neu and Eos.
- “blood gse35069”: methylation profiles from Reinius et al. (2012) for purified blood cell types. It includes CD4T, CD8T, Mono, Bcell, NK and Gran.
- “blood gse35069 chen”: methylation profiles from Chen et al. (2017) blood cell types. It includes CD4T, CD8T, Mono, Bcell, NK, Neu and Eos.
- “andrews and bakulski cord blood”. Cord blood reference from Bakulski et al. (2016). It includes Bcell, CD4T, CD8T, Gran, Mono, NK and nRBC.
- “cord blood gse68456” Cord blood methylation profiles from Goede et al. (2015). It includes CD4T, CD8T, Mono, Bcell, NK, Neu, Eos and RBC.
-
“gervin and lyle cord blood” Cord blood reference
generated by Kristina Gervin and Robert Lyle, available at
miffil
package. It includes CD14, Bcell, CD4T, CD8T, NK, Gran. - “saliva gse48472”: Reference generated from the multi-tissue pannel from Slieker et al. (2013). It includes Buccal, CD4T, CD8T, Mono, Bcell, NK, Gran.
- “guintivano dlpfc”: Reference generated from Guintivano, Aryee, and Kaminsky (2013). It includes dorsolateral prefrontal cortex, NeuN_neg and NeuN_pos.
- “combined cord blood”: References generated based in samples assayed by Bakulski et al, Gervin et al., de Goede et al., and Lin et al. It includes umbilical cord blood, Bcell, CD4T, CD8T, Gran, Mono, NK and nRBC
- “placenta”: References from Yuan et al. (2021), Yuan (2022). It includes placental cells from third trimester Trophoblasts, Stromal, Hofbauer, Endothelial, nRBC, Syncytiotrophoblast.
Chronological and biological DNAm age estimation
Next we illustrate how to estimate the chronological DNAm age using several datasets which aim to cover different data input formats.
IMPORTANT NOTE: On some systems we can find an error
in the DNAmAge()
function when parameter
cell.count = TRUE
. This error is related to
preprocessCore
package and can be fixed by disabling
multi-threading when installing the preprocessCore package using the
command
BiocManager::install("preprocessCore",
configure.args = "--disable-threading",
force = TRUE)
Data in Horvath’s format (e.g. csv
with CpGs in
rows)
Let us start by reproducing the results proposed in Horvath (2013). It uses the format available in
the file ’MethylationDataExample55.csv” from his tutorial (available here). These data are
available at methylclock
package. Although these data can
be loaded into R by using standard functions such as
read.csv
we hihgly recommend to use functions from
tidiverse
, in particular read_csv
from
readr
package. The main reason is that currently
researchers are analyzing Illumina 450K or EPIC arrays that contains a
huge number of CpGs that can take a long time to be loaded when using
basic importing R function. These functions import csv
data
as tibble which is one of the possible formats of DNAmAge
function
library(tidyverse)
path <- system.file("extdata", package = "methylclock")
MethylationData <- read_csv(file.path(path, "MethylationDataExample55.csv"))
MethylationData
[38;5;246m# A tibble: 27,578 × 17
[39m
ProbeID GSM946048 GSM946…¹ GSM94…² GSM94…³ GSM94…⁴ GSM94…⁵ GSM94…⁶ GSM94…⁷
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[38;5;250m 1
[39m cg00000292 0.706 0.730 0.705 0.751 0.715 0.634 0.682 0.635
[38;5;250m 2
[39m cg00002426 0.272 0.274 0.311 0.279 0.178 0.269 0.330 0.501
[38;5;250m 3
[39m cg00003994 0.037
[4m0
[24m 0.014
[4m7
[24m 0.017
[4m1
[24m 0.029
[4m0
[24m 0.016
[4m3
[24m 0.024
[4m3
[24m 0.012
[4m7
[24m 0.020
[4m6
[24m
[38;5;250m 4
[39m cg00005847 0.133 0.120 0.121 0.107 0.110 0.129 0.102 0.124
[38;5;250m 5
[39m cg00006414 0.030
[4m9
[24m 0.019
[4m2
[24m 0.021
[4m7
[24m 0.013
[4m2
[24m 0.018
[4m1
[24m 0.024
[4m3
[24m 0.019
[4m9
[24m 0.014
[4m3
[24m
[38;5;250m 6
[39m cg00007981 0.070
[4m0
[24m 0.071
[4m5
[24m 0.065
[4m5
[24m 0.071
[4m9
[24m 0.091
[4m4
[24m 0.050
[4m8
[24m 0.029
[4m4
[24m 0.056
[4m4
[24m
[38;5;250m 7
[39m cg00008493 0.993 0.993 0.993 0.994 0.991 0.994 0.993 0.996
[38;5;250m 8
[39m cg00008713 0.021
[4m5
[24m 0.020
[4m2
[24m 0.018
[4m7
[24m 0.016
[4m9
[24m 0.016
[4m2
[24m 0.014
[4m3
[24m 0.017
[4m2
[24m 0.018
[4m9
[24m
[38;5;250m 9
[39m cg00009407 0.010
[4m5
[24m 0.005
[4m1
[24m
[4m8
[24m 0.004
[4m1
[24m
[4m0
[24m 0.006
[4m7
[24m
[4m1
[24m 0.007
[4m5
[24m
[4m8
[24m 0.005
[4m1
[24m
[4m8
[24m 0.005
[4m4
[24m
[4m3
[24m 0.006
[4m2
[24m
[4m4
[24m
[38;5;250m10
[39m cg00010193 0.634 0.635 0.621 0.639 0.599 0.591 0.594 0.583
[38;5;246m# … with 27,568 more rows, 8 more variables: GSM946064 <dbl>, GSM946065 <dbl>,
[39m
[38;5;246m# GSM946066 <dbl>, GSM946067 <dbl>, GSM946073 <dbl>, GSM946074 <dbl>,
[39m
[38;5;246m# GSM946075 <dbl>, GSM946076 <dbl>, and abbreviated variable names
[39m
[38;5;246m# ¹GSM946049, ²GSM946052, ³GSM946054, ⁴GSM946055, ⁵GSM946056, ⁶GSM946059,
[39m
[38;5;246m# ⁷GSM946062
[39m
IMPORTANT NOTE: Be sure that the first column contains the
CpG names. Sometimes, your imported data look like this one (it can
happen, for instance, if the csv
file was created in R
without indicating row.names=FALSE
)
> mydata
# A tibble: 473,999 x 6
X1 Row.names BIB_15586_1X BIB_33043_1X EDP_5245_1X KAN_584_1X
<int> <chr> <dbl> <dbl> <dbl> <dbl>
1 1 cg000000~ 0.635 0.575 0.614 0.631
2 2 cg000001~ 0.954 0.948 0.933 0.950
3 3 cg000001~ 0.889 0.899 0.901 0.892
4 4 cg000001~ 0.115 0.124 0.107 0.123
5 5 cg000002~ 0.850 0.753 0.806 0.815
6 6 cg000002~ 0.676 0.771 0.729 0.665
7 7 cg000002~ 0.871 0.850 0.852 0.863
8 8 cg000003~ 0.238 0.174 0.316 0.206
If so, the first column must be removed before being used as the
input object in DNAmAge
funcion. It can be done using
dplyr
function
> mydata2 <- select(mydata, -1)
# A tibble: 473,999 x 5
Row.names BIB_15586_1X BIB_33043_1X EDP_5245_1X KAN_584_1X
<chr> <dbl> <dbl> <dbl> <dbl>
1 cg000000~ 0.635 0.575 0.614 0.631
2 cg000001~ 0.954 0.948 0.933 0.950
3 cg000001~ 0.889 0.899 0.901 0.892
4 cg000001~ 0.115 0.124 0.107 0.123
5 cg000002~ 0.850 0.753 0.806 0.815
6 cg000002~ 0.676 0.771 0.729 0.665
7 cg000002~ 0.871 0.850 0.852 0.863
8 cg000003~ 0.238 0.174 0.316 0.206
In any case, if you use the object mydata
that contains
the CpGs in the second column, you will see this error message:
> DNAmAge(mydata)
Error in DNAmAge(mydata) : First column should contain CpG names
Once data is in the proper format, DNAmAge can be estimated by simply:
age.example55 <- DNAmAge(MethylationData)
Warning in predAge(cpgs.imp, coefHannum, intercept = FALSE, min.perc): The number of missing CpGs for Hannum clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefSkin, intercept = TRUE, min.perc): The number of missing CpGs for Skin clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefPedBE, intercept = TRUE, min.perc): The number of missing CpGs for PedBE clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefTL, intercept = TRUE, min.perc): The number of missing CpGs for TL clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefBlup, intercept = TRUE, min.perc): The number of missing CpGs for Blup clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefEN, intercept = TRUE, min.perc): The number of missing CpGs for EN clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPMA450K, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPMA450K clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPNA450K, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPNA450K clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPMAEPIC, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPMAEPIC clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPNAEPIC, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPNAEPIC clock exceeds 80 %.
---> This DNAm clock will be NA.
age.example55
[38;5;246m# A tibble: 16 × 15
[39m
id Horvath Hannum Levine BNN skinHor…¹ PedBE Wu TL BLUP EN
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[38;5;250m 1
[39m GSM946048 51.8
[31mNA
[39m -
[31m30
[39m
[31m.
[39m
[31m3
[39m 56.4
[31mNA
[39m
[31mNA
[39m 1.08
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 2
[39m GSM946049 39.8
[31mNA
[39m -
[31m29
[39m
[31m.
[39m
[31m6
[39m 42.1
[31mNA
[39m
[31mNA
[39m 0.808
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 3
[39m GSM946052 26.4
[31mNA
[39m -
[31m33
[39m
[31m.
[39m
[31m3
[39m 25.6
[31mNA
[39m
[31mNA
[39m 0.772
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 4
[39m GSM946054 34.0
[31mNA
[39m -
[31m36
[39m
[31m.
[39m
[31m0
[39m 28.0
[31mNA
[39m
[31mNA
[39m 0.941
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 5
[39m GSM946055 10.1
[31mNA
[39m -
[31m52
[39m
[31m.
[39m
[31m8
[39m 13.4
[31mNA
[39m
[31mNA
[39m 0.456
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 6
[39m GSM946056 20.4
[31mNA
[39m -
[31m42
[39m
[31m.
[39m
[31m2
[39m 16.7
[31mNA
[39m
[31mNA
[39m 0.621
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 7
[39m GSM946059 6.00
[31mNA
[39m -
[31m44
[39m
[31m.
[39m
[31m8
[39m 7.54
[31mNA
[39m
[31mNA
[39m 0.258
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 8
[39m GSM946062 34.6
[31mNA
[39m -
[31m23
[39m
[31m.
[39m
[31m2
[39m 34.6
[31mNA
[39m
[31mNA
[39m 0.624
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 9
[39m GSM946064 7.91
[31mNA
[39m -
[31m49
[39m
[31m.
[39m
[31m8
[39m 12.0
[31mNA
[39m
[31mNA
[39m 0.237
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m10
[39m GSM946065 4.72
[31mNA
[39m -
[31m48
[39m
[31m.
[39m
[31m2
[39m 6.43
[31mNA
[39m
[31mNA
[39m 0.396
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m11
[39m GSM946066 29.6
[31mNA
[39m -
[31m39
[39m
[31m.
[39m
[31m9
[39m 28.5
[31mNA
[39m
[31mNA
[39m 0.413
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m12
[39m GSM946067 1.38
[31mNA
[39m -
[31m48
[39m
[31m.
[39m
[31m3
[39m 3.48
[31mNA
[39m
[31mNA
[39m 0.122
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m13
[39m GSM946073 56.0
[31mNA
[39m -
[31m26
[39m
[31m.
[39m
[31m7
[39m 47.3
[31mNA
[39m
[31mNA
[39m 0.714
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m14
[39m GSM946074 24.0
[31mNA
[39m -
[31m39
[39m
[31m.
[39m
[31m7
[39m 23.3
[31mNA
[39m
[31mNA
[39m 0.676
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m15
[39m GSM946075 9.38
[31mNA
[39m -
[31m45
[39m
[31m.
[39m
[31m4
[39m 11.9
[31mNA
[39m
[31mNA
[39m 0.251
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m16
[39m GSM946076 38.8
[31mNA
[39m -
[31m27
[39m
[31m.
[39m
[31m5
[39m 41.4
[31mNA
[39m
[31mNA
[39m 0.599
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;246m# … with 4 more variables: NEOaPMA450K <lgl>, NEOaPNA450K <lgl>,
[39m
[38;5;246m# NEOaPMAEPIC <lgl>, NEOaPNAEPIC <lgl>, and abbreviated variable name
[39m
[38;5;246m# ¹skinHorvath
[39m
As mention in Section @ref(section-missingCpGs) some clocks returns NA when there are more than 80% of the required CpGs are missing as we can see when typing
missCpGs <- checkClocks(MethylationData)
There are some clocks that cannot be computed since your data do not contain the required CpGs.
These are the total number of missing CpGs for each clock :
clock Cpgs_in_clock missing_CpGs percentage
1 Horvath 353 0 0.0
2 Hannum 71 64 90.1
3 Levine 513 0 0.0
4 SkinHorvath 391 282 72.1
5 PedBE 94 91 96.8
6 Wu 111 0 0.0
7 TL 140 137 97.9
8 BLUP 319607 300192 93.9
9 EN 514 476 92.6
10 NEOaPMA450K 409 393 96.1
11 NEOaPNA450K 303 295 97.4
12 NEOaPMAEPIC 522 514 98.5
13 NEOaPNAEPIC 509 489 96.1
Here we can observe that 72.1% of the required CpGs for SkinHorvath
clock are missing. We could estimate DNAm age using this clock just
changing the argument min.perc
in DNAmAge
. For
example, we can indicate that the minimum amount of required CpGs for
computing a given clock should be 25%.
age.example55.2 <- DNAmAge(MethylationData, min.perc = 0.25)
Warning in predAge(cpgs.imp, coefHannum, intercept = FALSE, min.perc): The number of missing CpGs for Hannum clock exceeds 25 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefPedBE, intercept = TRUE, min.perc): The number of missing CpGs for PedBE clock exceeds 25 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefTL, intercept = TRUE, min.perc): The number of missing CpGs for TL clock exceeds 25 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefBlup, intercept = TRUE, min.perc): The number of missing CpGs for Blup clock exceeds 25 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefEN, intercept = TRUE, min.perc): The number of missing CpGs for EN clock exceeds 25 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPMA450K, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPMA450K clock exceeds 25 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPNA450K, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPNA450K clock exceeds 25 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPMAEPIC, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPMAEPIC clock exceeds 25 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPNAEPIC, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPNAEPIC clock exceeds 25 %.
---> This DNAm clock will be NA.
age.example55.2
[38;5;246m# A tibble: 16 × 15
[39m
id Horvath Hannum Levine BNN skinHor…¹ PedBE Wu TL BLUP EN
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[38;5;250m 1
[39m GSM946048 51.8
[31mNA
[39m -
[31m30
[39m
[31m.
[39m
[31m3
[39m 56.4 7.15
[31mNA
[39m 1.08
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 2
[39m GSM946049 39.8
[31mNA
[39m -
[31m29
[39m
[31m.
[39m
[31m6
[39m 42.1 7.09
[31mNA
[39m 0.808
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 3
[39m GSM946052 26.4
[31mNA
[39m -
[31m33
[39m
[31m.
[39m
[31m3
[39m 25.6 5.93
[31mNA
[39m 0.772
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 4
[39m GSM946054 34.0
[31mNA
[39m -
[31m36
[39m
[31m.
[39m
[31m0
[39m 28.0 6.34
[31mNA
[39m 0.941
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 5
[39m GSM946055 10.1
[31mNA
[39m -
[31m52
[39m
[31m.
[39m
[31m8
[39m 13.4 5.76
[31mNA
[39m 0.456
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 6
[39m GSM946056 20.4
[31mNA
[39m -
[31m42
[39m
[31m.
[39m
[31m2
[39m 16.7 5.79
[31mNA
[39m 0.621
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 7
[39m GSM946059 6.00
[31mNA
[39m -
[31m44
[39m
[31m.
[39m
[31m8
[39m 7.54 5.64
[31mNA
[39m 0.258
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 8
[39m GSM946062 34.6
[31mNA
[39m -
[31m23
[39m
[31m.
[39m
[31m2
[39m 34.6 5.55
[31mNA
[39m 0.624
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 9
[39m GSM946064 7.91
[31mNA
[39m -
[31m49
[39m
[31m.
[39m
[31m8
[39m 12.0 5.06
[31mNA
[39m 0.237
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m10
[39m GSM946065 4.72
[31mNA
[39m -
[31m48
[39m
[31m.
[39m
[31m2
[39m 6.43 5.48
[31mNA
[39m 0.396
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m11
[39m GSM946066 29.6
[31mNA
[39m -
[31m39
[39m
[31m.
[39m
[31m9
[39m 28.5 6.19
[31mNA
[39m 0.413
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m12
[39m GSM946067 1.38
[31mNA
[39m -
[31m48
[39m
[31m.
[39m
[31m3
[39m 3.48 4.91
[31mNA
[39m 0.122
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m13
[39m GSM946073 56.0
[31mNA
[39m -
[31m26
[39m
[31m.
[39m
[31m7
[39m 47.3 7.07
[31mNA
[39m 0.714
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m14
[39m GSM946074 24.0
[31mNA
[39m -
[31m39
[39m
[31m.
[39m
[31m7
[39m 23.3 6.23
[31mNA
[39m 0.676
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m15
[39m GSM946075 9.38
[31mNA
[39m -
[31m45
[39m
[31m.
[39m
[31m4
[39m 11.9 5.57
[31mNA
[39m 0.251
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m16
[39m GSM946076 38.8
[31mNA
[39m -
[31m27
[39m
[31m.
[39m
[31m5
[39m 41.4 6.69
[31mNA
[39m 0.599
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;246m# … with 4 more variables: NEOaPMA450K <lgl>, NEOaPNA450K <lgl>,
[39m
[38;5;246m# NEOaPMAEPIC <lgl>, NEOaPNAEPIC <lgl>, and abbreviated variable name
[39m
[38;5;246m# ¹skinHorvath
[39m
In that case, we see as SkinHorvath clock is estimated (though it can be observed that the estimation is not very accurate - this is why we considered at least having 80% of the required CpGs).
By default all available clocks (Horvath, Hannum, Levine, BNN,
skinHorvath, …) are estimated. One may select a set of clocks by using
the argument clocks
as follows:
[38;5;246m# A tibble: 16 × 3
[39m
id Horvath BNN
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[38;5;250m 1
[39m GSM946048 51.8 56.4
[38;5;250m 2
[39m GSM946049 39.8 42.1
[38;5;250m 3
[39m GSM946052 26.4 25.6
[38;5;250m 4
[39m GSM946054 34.0 28.0
[38;5;250m 5
[39m GSM946055 10.1 13.4
[38;5;250m 6
[39m GSM946056 20.4 16.7
[38;5;250m 7
[39m GSM946059 6.00 7.54
[38;5;250m 8
[39m GSM946062 34.6 34.6
[38;5;250m 9
[39m GSM946064 7.91 12.0
[38;5;250m10
[39m GSM946065 4.72 6.43
[38;5;250m11
[39m GSM946066 29.6 28.5
[38;5;250m12
[39m GSM946067 1.38 3.48
[38;5;250m13
[39m GSM946073 56.0 47.3
[38;5;250m14
[39m GSM946074 24.0 23.3
[38;5;250m15
[39m GSM946075 9.38 11.9
[38;5;250m16
[39m GSM946076 38.8 41.4
Age acceleration
However, in epidemiological studies one is interested in assessing whether age acceleration is associated with a given trait or condition. Three different measures can be computed:
- ageAcc: Difference between DNAmAge and chronological age.
- ageAcc2: Residuals obtained after regressing chronological age and DNAmAge (similar to IEAA).
- ageAcc3: Residuals obtained after regressing chronological age and DNAmAge adjusted for cell counts (similar to EEAA).
All this estimates can be obtained for each clock when providing
chronological age through age
argument. This information is
normally provided in a different file including different covariates
(metadata or sample annotation data). In this example data are available
at ‘SampleAnnotationExample55.csv’ file that is also available at
methylclock
package:
[38;5;246m# A tibble: 16 × 14
[39m
OriginalOr…¹ id title geo_a…² Tissu…³ Tissue disea…⁴ Age PostM…⁵ Cause…⁶
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<chr>
[39m
[23m
[38;5;250m 1
[39m 3 GSM9… Auti… GSM946… Fresh … occip… 1 60 26.5 cancer
[38;5;250m 2
[39m 4 GSM9… Cont… GSM946… Fresh … occip… 0 39
[31mNA
[39m cardiac
[38;5;250m 3
[39m 7 GSM9… Auti… GSM946… Fresh … occip… 1 28 43 cancer
[38;5;250m 4
[39m 9 GSM9… Auti… GSM946… Fresh … occip… 1 39 14 cardiac
[38;5;250m 5
[39m 10 GSM9… Auti… GSM946… Fresh … occip… 1 8 22.2 cancer
[38;5;250m 6
[39m 11 GSM9… Auti… GSM946… Fresh … occip… 1 22 25 hypoxia
[38;5;250m 7
[39m 14 GSM9… Cont… GSM946… Fresh … occip… 0 4 17 cardiac
[38;5;250m 8
[39m 17 GSM9… Cont… GSM946… Fresh … occip… 0 28 13 other
[38;5;250m 9
[39m 19 GSM9… Auti… GSM946… Fresh … occip… 1 5 25.5 hypoxia
[38;5;250m10
[39m 20 GSM9… Auti… GSM946… Fresh … occip… 1 2 4 hypoxia
[38;5;250m11
[39m 21 GSM9… Auti… GSM946… Fresh … occip… 1 30 16 cardiac
[38;5;250m12
[39m 22 GSM9… Cont… GSM946… Fresh … occip… 0 1 19 unknown
[38;5;250m13
[39m 28 GSM9… Cont… GSM946… Fresh … occip… 0 60 24.2 unknown
[38;5;250m14
[39m 29 GSM9… Cont… GSM946… Fresh … occip… 0 22 21.5 unknown
[38;5;250m15
[39m 30 GSM9… Cont… GSM946… Fresh … occip… 0 8 5 cardiac
[38;5;250m16
[39m 31 GSM9… Cont… GSM946… Fresh … occip… 0 30 15 hypoxia
[38;5;246m# … with 4 more variables: individual <dbl>, Female <dbl>, Caucasian <lgl>,
[39m
[38;5;246m# FemaleOriginal <lgl>, and abbreviated variable names ¹OriginalOrder,
[39m
[38;5;246m# ²geo_accession, ³TissueDetailed, ⁴diseaseStatus, ⁵PostMortemInterval,
[39m
[38;5;246m# ⁶CauseofDeath
[39m
In this case, chronological age is available at Age
column:
age <- covariates$Age
head(age)
[1] 60 39 28 39 8 22
The different methylation clocks along with their age accelerated estimates can be simply computed by:
age.example55 <- DNAmAge(MethylationData, age=age,
cell.count=TRUE)
Warning in predAge(cpgs.imp, coefHannum, intercept = FALSE, min.perc): The number of missing CpGs for Hannum clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefSkin, intercept = TRUE, min.perc): The number of missing CpGs for Skin clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefPedBE, intercept = TRUE, min.perc): The number of missing CpGs for PedBE clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefTL, intercept = TRUE, min.perc): The number of missing CpGs for TL clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefBlup, intercept = TRUE, min.perc): The number of missing CpGs for Blup clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefEN, intercept = TRUE, min.perc): The number of missing CpGs for EN clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPMA450K, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPMA450K clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPNA450K, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPNA450K clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPMAEPIC, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPMAEPIC clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPNAEPIC, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPNAEPIC clock exceeds 80 %.
---> This DNAm clock will be NA.
age.example55
[38;5;246m# A tibble: 16 × 28
[39m
id Horvath ageAc…¹ ageAc…² ageAc…³ Hannum Levine ageAc…⁴ ageAc…⁵ ageAc…⁶
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[38;5;250m 1
[39m GSM946… 51.8 -
[31m8
[39m
[31m.
[39m
[31m22
[39m -
[31m4
[39m
[31m.
[39m
[31m45
[39m -
[31m4
[39m
[31m.
[39m
[31m91
[39m
[31mNA
[39m -
[31m30
[39m
[31m.
[39m
[31m3
[39m -
[31m90
[39m
[31m.
[39m
[31m3
[39m -
[31m5
[39m
[31m.
[39m
[31m94
[39m -
[31m6
[39m
[31m.
[39m
[31m19
[39m
[38;5;250m 2
[39m GSM946… 39.8 0.754 2.00 1.59
[31mNA
[39m -
[31m29
[39m
[31m.
[39m
[31m6
[39m -
[31m68
[39m
[31m.
[39m
[31m6
[39m 3.10 5.38
[38;5;250m 3
[39m GSM946… 26.4 -
[31m1
[39m
[31m.
[39m
[31m59
[39m -
[31m1
[39m
[31m.
[39m
[31m67
[39m -
[31m1
[39m
[31m.
[39m
[31m86
[39m
[31mNA
[39m -
[31m33
[39m
[31m.
[39m
[31m3
[39m -
[31m61
[39m
[31m.
[39m
[31m3
[39m 3.72 2.20
[38;5;250m 4
[39m GSM946… 34.0 -
[31m5
[39m
[31m.
[39m
[31m00
[39m -
[31m3
[39m
[31m.
[39m
[31m76
[39m -
[31m0
[39m
[31m.
[39m
[31m463
[39m
[31mNA
[39m -
[31m36
[39m
[31m.
[39m
[31m0
[39m -
[31m75
[39m
[31m.
[39m
[31m0
[39m -
[31m3
[39m
[31m.
[39m
[31m29
[39m -
[31m0
[39m
[31m.
[39m
[31m339
[39m
[38;5;250m 5
[39m GSM946… 10.1 2.06 -
[31m0
[39m
[31m.
[39m
[31m428
[39m 2.82
[31mNA
[39m -
[31m52
[39m
[31m.
[39m
[31m8
[39m -
[31m60
[39m
[31m.
[39m
[31m8
[39m -
[31m7
[39m
[31m.
[39m
[31m77
[39m 0.964
[38;5;250m 6
[39m GSM946… 20.4 -
[31m1
[39m
[31m.
[39m
[31m61
[39m -
[31m2
[39m
[31m.
[39m
[31m42
[39m -
[31m2
[39m
[31m.
[39m
[31m88
[39m
[31mNA
[39m -
[31m42
[39m
[31m.
[39m
[31m2
[39m -
[31m64
[39m
[31m.
[39m
[31m2
[39m -
[31m2
[39m
[31m.
[39m
[31m76
[39m -
[31m3
[39m
[31m.
[39m
[31m70
[39m
[38;5;250m 7
[39m GSM946… 6.00 2.00 -
[31m0
[39m
[31m.
[39m
[31m971
[39m -
[31m0
[39m
[31m.
[39m
[31m827
[39m
[31mNA
[39m -
[31m44
[39m
[31m.
[39m
[31m8
[39m -
[31m48
[39m
[31m.
[39m
[31m8
[39m 1.81 2.88
[38;5;250m 8
[39m GSM946… 34.6 6.65 6.57 5.32
[31mNA
[39m -
[31m23
[39m
[31m.
[39m
[31m2
[39m -
[31m51
[39m
[31m.
[39m
[31m2
[39m 13.9 7.97
[38;5;250m 9
[39m GSM946… 7.91 2.91 0.058
[4m9
[24m -
[31m2
[39m
[31m.
[39m
[31m61
[39m
[31mNA
[39m -
[31m49
[39m
[31m.
[39m
[31m8
[39m -
[31m54
[39m
[31m.
[39m
[31m8
[39m -
[31m3
[39m
[31m.
[39m
[31m61
[39m -
[31m6
[39m
[31m.
[39m
[31m59
[39m
[38;5;250m10
[39m GSM946… 4.72 2.72 -
[31m0
[39m
[31m.
[39m
[31m489
[39m 1.46
[31mNA
[39m -
[31m48
[39m
[31m.
[39m
[31m2
[39m -
[31m50
[39m
[31m.
[39m
[31m2
[39m -
[31m0
[39m
[31m.
[39m
[31m845
[39m 2.53
[38;5;250m11
[39m GSM946… 29.6 -
[31m0
[39m
[31m.
[39m
[31m427
[39m -
[31m0
[39m
[31m.
[39m
[31m268
[39m -
[31m1
[39m
[31m.
[39m
[31m37
[39m
[31mNA
[39m -
[31m39
[39m
[31m.
[39m
[31m9
[39m -
[31m69
[39m
[31m.
[39m
[31m9
[39m -
[31m3
[39m
[31m.
[39m
[31m64
[39m -
[31m2
[39m
[31m.
[39m
[31m62
[39m
[38;5;250m12
[39m GSM946… 1.38 0.375 -
[31m2
[39m
[31m.
[39m
[31m95
[39m -
[31m2
[39m
[31m.
[39m
[31m19
[39m
[31mNA
[39m -
[31m48
[39m
[31m.
[39m
[31m3
[39m -
[31m49
[39m
[31m.
[39m
[31m3
[39m -
[31m0
[39m
[31m.
[39m
[31m513
[39m -
[31m3
[39m
[31m.
[39m
[31m85
[39m
[38;5;250m13
[39m GSM946… 56.0 -
[31m4
[39m
[31m.
[39m
[31m0
[39m
[31m1
[39m -
[31m0
[39m
[31m.
[39m
[31m242
[39m 1.62
[31mNA
[39m -
[31m26
[39m
[31m.
[39m
[31m7
[39m -
[31m86
[39m
[31m.
[39m
[31m7
[39m -
[31m2
[39m
[31m.
[39m
[31m32
[39m -
[31m1
[39m
[31m.
[39m
[31m0
[39m
[31m1
[39m
[38;5;250m14
[39m GSM946… 24.0 2.03 1.23 -
[31m0
[39m
[31m.
[39m
[31m669
[39m
[31mNA
[39m -
[31m39
[39m
[31m.
[39m
[31m7
[39m -
[31m61
[39m
[31m.
[39m
[31m7
[39m -
[31m0
[39m
[31m.
[39m
[31m227
[39m 0.021
[4m1
[24m
[38;5;250m15
[39m GSM946… 9.38 1.38 -
[31m1
[39m
[31m.
[39m
[31m11
[39m -
[31m0
[39m
[31m.
[39m
[31m885
[39m
[31mNA
[39m -
[31m45
[39m
[31m.
[39m
[31m4
[39m -
[31m53
[39m
[31m.
[39m
[31m4
[39m -
[31m0
[39m
[31m.
[39m
[31m362
[39m -
[31m2
[39m
[31m.
[39m
[31m37
[39m
[38;5;250m16
[39m GSM946… 38.8 8.76 8.92 5.85
[31mNA
[39m -
[31m27
[39m
[31m.
[39m
[31m5
[39m -
[31m57
[39m
[31m.
[39m
[31m5
[39m 8.78 4.73
[38;5;246m# … with 18 more variables: BNN <dbl>, ageAcc.BNN <dbl>, ageAcc2.BNN <dbl>,
[39m
[38;5;246m# ageAcc3.BNN <dbl>, skinHorvath <lgl>, PedBE <lgl>, Wu <dbl>,
[39m
[38;5;246m# ageAcc.Wu <dbl>, ageAcc2.Wu <dbl>, ageAcc3.Wu <dbl>, TL <lgl>, BLUP <lgl>,
[39m
[38;5;246m# EN <lgl>, NEOaPMA450K <lgl>, NEOaPNA450K <lgl>, NEOaPMAEPIC <lgl>,
[39m
[38;5;246m# NEOaPNAEPIC <lgl>, age <dbl>, and abbreviated variable names
[39m
[38;5;246m# ¹ageAcc.Horvath, ²ageAcc2.Horvath, ³ageAcc3.Horvath, ⁴ageAcc.Levine,
[39m
[38;5;246m# ⁵ageAcc2.Levine, ⁶ageAcc3.Levine
[39m
By default, the argument cell.count
is set equal to TRUE
and, hence, can be omitted. This implies that ageAcc3
will
be computed for all clocks. In some occassions this can be very time
consuming. In such cases one can simply estimate DNAmAge, accAge and
accAge2 by setting cell.count=FALSE
. NOTE: see section 3.5
to see the reference panels available to estimate cell counts.
Then, we can investigate, for instance, whether the accelerated age is associated with Autism. In that example we will use a non-parametric test (NOTE: use t-test or linear regression for large sample sizes)
autism <- covariates$diseaseStatus
kruskal.test(age.example55$ageAcc.Horvath ~ autism)
Kruskal-Wallis rank sum test
data: age.example55$ageAcc.Horvath by autism
Kruskal-Wallis chi-squared = 1.3346, df = 1, p-value = 0.248
kruskal.test(age.example55$ageAcc2.Horvath ~ autism)
Kruskal-Wallis rank sum test
data: age.example55$ageAcc2.Horvath by autism
Kruskal-Wallis chi-squared = 3.1875, df = 1, p-value = 0.0742
kruskal.test(age.example55$ageAcc3.Horvath ~ autism)
Kruskal-Wallis rank sum test
data: age.example55$ageAcc3.Horvath by autism
Kruskal-Wallis chi-squared = 2.8235, df = 1, p-value = 0.09289
Chronological age prediction using ExpressionSet
data
One may be interested in assessing association between chronologial
age and DNA methylation age or evaluating how well chronological age is
predicted by DNAmAge. In order to illustrate this analysis we downloaded
data from GEO corresponding to a set of healthy individuals (GEO
accession number GSE58045). Data can be retrieved into R by using
GEOquery
package as an ExpressionSet
object
that can be the input of our main function.
dd <- GEOquery::getGEO("GSE58045")
gse58045 <- dd[[1]]
gse58045
ExpressionSet (storageMode: lockedEnvironment)
assayData: 27578 features, 172 samples
element names: exprs
protocolData: none
phenoData
sampleNames: GSM1399890 GSM1399891 ... GSM1400061 (172 total)
varLabels: title geo_accession ... twin:ch1 (43 total)
varMetadata: labelDescription
featureData
featureNames: cg00000292 cg00002426 ... cg27665659 (27578 total)
fvarLabels: ID Name ... ORF (38 total)
fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
pubMedIds: 22532803
Annotation: GPL8490
The chronological age is obtained by using pData
function from Biobase
package that is able to deal with
ExpressionSet
objects:
pheno <- pData(gse58045)
age <- as.numeric(pheno$`age:ch1`)
And the different DNA methylation age estimates are obtained by using
DNAmAge
function (NOTE: as there are missing values, the
program automatically runs impute.knn
function to get
complete cases):
age.gse58045 <- DNAmAge(gse58045, age=age)
Imputing missing data of the entire matrix ....
Data imputed. Starting DNAm clock estimation ...
Warning in predAge(cpgs.imp, coefHannum, intercept = FALSE, min.perc): The number of missing CpGs for Hannum clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefSkin, intercept = TRUE, min.perc): The number of missing CpGs for Skin clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefPedBE, intercept = TRUE, min.perc): The number of missing CpGs for PedBE clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefTL, intercept = TRUE, min.perc): The number of missing CpGs for TL clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefBlup, intercept = TRUE, min.perc): The number of missing CpGs for Blup clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefEN, intercept = TRUE, min.perc): The number of missing CpGs for EN clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPMA450K, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPMA450K clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPNA450K, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPNA450K clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPMAEPIC, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPMAEPIC clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPNAEPIC, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPNAEPIC clock exceeds 80 %.
---> This DNAm clock will be NA.
age.gse58045
[38;5;246m# A tibble: 172 × 28
[39m
id Horvath ageAc…¹ ageAc…² ageAc…³ Hannum Levine ageAc…⁴ ageAc…⁵ ageAc…⁶
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[38;5;250m 1
[39m GSM139… 65.6 1.07 4.58 5.46
[31mNA
[39m 50.7 -
[31m13
[39m
[31m.
[39m
[31m8
[39m 2.91 4.28
[38;5;250m 2
[39m GSM139… 66.3 0.197 4.06 5.06
[31mNA
[39m 51.3 -
[31m14
[39m
[31m.
[39m
[31m8
[39m 2.45 5.12
[38;5;250m 3
[39m GSM139… 53.9 -
[31m5
[39m
[31m.
[39m
[31m31
[39m -
[31m2
[39m
[31m.
[39m
[31m98
[39m -
[31m2
[39m
[31m.
[39m
[31m42
[39m
[31mNA
[39m 40.5 -
[31m18
[39m
[31m.
[39m
[31m7
[39m -
[31m3
[39m
[31m.
[39m
[31m52
[39m -
[31m2
[39m
[31m.
[39m
[31m54
[39m
[38;5;250m 4
[39m GSM139… 40.6 -
[31m5
[39m
[31m.
[39m
[31m23
[39m -
[31m5
[39m
[31m.
[39m
[31m89
[39m -
[31m6
[39m
[31m.
[39m
[31m14
[39m
[31mNA
[39m 31.3 -
[31m14
[39m
[31m.
[39m
[31m5
[39m -
[31m3
[39m
[31m.
[39m
[31m21
[39m -
[31m4
[39m
[31m.
[39m
[31m67
[39m
[38;5;250m 5
[39m GSM139… 50.1 0.982 1.06 1.28
[31mNA
[39m 41.1 -
[31m8
[39m
[31m.
[39m
[31m00
[39m 4.27 3.37
[38;5;250m 6
[39m GSM139… 63.7 -
[31m0
[39m
[31m.
[39m
[31m895
[39m 2.64 2.92
[31mNA
[39m 48.1 -
[31m16
[39m
[31m.
[39m
[31m6
[39m 0.223 -
[31m1
[39m
[31m.
[39m
[31m25
[39m
[38;5;250m 7
[39m GSM139… 44.7 -
[31m0
[39m
[31m.
[39m
[31m875
[39m -
[31m1
[39m
[31m.
[39m
[31m59
[39m -
[31m1
[39m
[31m.
[39m
[31m76
[39m
[31mNA
[39m 29.2 -
[31m16
[39m
[31m.
[39m
[31m4
[39m -
[31m5
[39m
[31m.
[39m
[31m17
[39m -
[31m4
[39m
[31m.
[39m
[31m57
[39m
[38;5;250m 8
[39m GSM139… 59.7 -
[31m8
[39m
[31m.
[39m
[31m55
[39m -
[31m4
[39m
[31m.
[39m
[31m20
[39m -
[31m3
[39m
[31m.
[39m
[31m48
[39m
[31mNA
[39m 41.0 -
[31m27
[39m
[31m.
[39m
[31m2
[39m -
[31m9
[39m
[31m.
[39m
[31m37
[39m -
[31m7
[39m
[31m.
[39m
[31m0
[39m
[31m6
[39m
[38;5;250m 9
[39m GSM139… 48.4 -
[31m5
[39m
[31m.
[39m
[31m84
[39m -
[31m4
[39m
[31m.
[39m
[31m63
[39m -
[31m2
[39m
[31m.
[39m
[31m50
[39m
[31mNA
[39m 43.8 -
[31m10
[39m
[31m.
[39m
[31m4
[39m 3.31 7.18
[38;5;250m10
[39m GSM139… 59.3 -
[31m3
[39m
[31m.
[39m
[31m93
[39m -
[31m0
[39m
[31m.
[39m
[31m719
[39m -
[31m0
[39m
[31m.
[39m
[31m609
[39m
[31mNA
[39m 46.1 -
[31m17
[39m
[31m.
[39m
[31m1
[39m -
[31m0
[39m
[31m.
[39m
[31m693
[39m 0.339
[38;5;246m# … with 162 more rows, 18 more variables: BNN <dbl>, ageAcc.BNN <dbl>,
[39m
[38;5;246m# ageAcc2.BNN <dbl>, ageAcc3.BNN <dbl>, skinHorvath <lgl>, PedBE <lgl>,
[39m
[38;5;246m# Wu <dbl>, ageAcc.Wu <dbl>, ageAcc2.Wu <dbl>, ageAcc3.Wu <dbl>, TL <lgl>,
[39m
[38;5;246m# BLUP <lgl>, EN <lgl>, NEOaPMA450K <lgl>, NEOaPNA450K <lgl>,
[39m
[38;5;246m# NEOaPMAEPIC <lgl>, NEOaPNAEPIC <lgl>, age <dbl>, and abbreviated variable
[39m
[38;5;246m# names ¹ageAcc.Horvath, ²ageAcc2.Horvath, ³ageAcc3.Horvath, ⁴ageAcc.Levine,
[39m
[38;5;246m# ⁵ageAcc2.Levine, ⁶ageAcc3.Levine
[39m
Figure \(\ref{fig:horvath_age}\) shows the correlation between DNAmAge obtained from Horvath’s method and the chronological age, while Figure \(\ref{fig:bnn_age}\) depicts the correlation of a new method based on fitting a Bayesian Neural Network to predict DNAmAge based on Horvath’s CpGs.
plotDNAmAge(age.gse58045$Horvath, age)
plotDNAmAge(age.gse58045$BNN, age, tit="Bayesian Neural Network")
Use of DNAmAge in association studies
Let us illustrate how to use DNAmAge information in association studies (e.g case/control, smokers/non-smokers, responders/non-responders, …). GEO number GSE19711 contains transcriptomic and epigenomic data of a study in lung cancer. Data can be retrieved into R by
dd <- GEOquery::getGEO("GSE19711")
gse19711 <- dd[[1]]
The object gse19711
is an ExpressionSet
that
can contains CpGs and phenotypic (e.g clinical) information
gse19711
ExpressionSet (storageMode: lockedEnvironment)
assayData: 27578 features, 540 samples
element names: exprs
protocolData: none
phenoData
sampleNames: GSM491937 GSM491938 ... GSM492476 (540 total)
varLabels: title geo_accession ... stage:ch1 (58 total)
varMetadata: labelDescription
featureData
featureNames: cg00000292 cg00002426 ... cg27665659 (27578 total)
fvarLabels: ID Name ... ORF (38 total)
fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
pubMedIds: 20219944
Annotation: GPL8490
Let us imagine we are interested in comparing the accelerated age between cases and controls. Age and case/control status information can be obtained by:
pheno <- pData(gse19711)
age <- as.numeric(pheno$`ageatrecruitment:ch1`)
disease <- pheno$`sample type:ch1`
table(disease)
disease
bi-sulphite converted genomic whole blood DNA from Case
266
bi-sulphite converted genomic whole blood DNA from Control
274
disease[grep("Control", disease)] <- "Control"
disease[grep("Case", disease)] <- "Case"
disease <- factor(disease, levels=c("Control", "Case"))
table(disease)
disease
Control Case
274 266
The DNAmAge estimates of different methods is computed by
age.gse19711 <- DNAmAge(gse19711, age=age)
Imputing missing data of the entire matrix ....
Data imputed. Starting DNAm clock estimation ...
Warning in predAge(cpgs.imp, coefHannum, intercept = FALSE, min.perc): The number of missing CpGs for Hannum clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefSkin, intercept = TRUE, min.perc): The number of missing CpGs for Skin clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefPedBE, intercept = TRUE, min.perc): The number of missing CpGs for PedBE clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefTL, intercept = TRUE, min.perc): The number of missing CpGs for TL clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefBlup, intercept = TRUE, min.perc): The number of missing CpGs for Blup clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefEN, intercept = TRUE, min.perc): The number of missing CpGs for EN clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPMA450K, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPMA450K clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPNA450K, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPNA450K clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPMAEPIC, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPMAEPIC clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPNAEPIC, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPNAEPIC clock exceeds 80 %.
---> This DNAm clock will be NA.
We can observe there are missing data. The funcion automatically
impute those using impute.knn
function from
impute
package since complete cases are required to compute
the different methylation clocks. The estimates are:
age.gse19711
[38;5;246m# A tibble: 540 × 28
[39m
id Horvath ageAc…¹ ageAc…² ageAc…³ Hannum Levine ageAc…⁴ ageAc…⁵ ageAc…⁶
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[38;5;250m 1
[39m GSM491… 62.9 -
[31m5
[39m
[31m.
[39m
[31m14
[39m -
[31m0
[39m
[31m.
[39m
[31m351
[39m -
[31m1
[39m
[31m.
[39m
[31m10
[39m
[31mNA
[39m 61.1 -
[31m6
[39m
[31m.
[39m
[31m90
[39m 7.63 3.65
[38;5;250m 2
[39m GSM491… 68.8 -
[31m12
[39m
[31m.
[39m
[31m2
[39m -
[31m2
[39m
[31m.
[39m
[31m85
[39m -
[31m2
[39m
[31m.
[39m
[31m13
[39m
[31mNA
[39m 57.0 -
[31m24
[39m
[31m.
[39m
[31m0
[39m -
[31m5
[39m
[31m.
[39m
[31m48
[39m -
[31m1
[39m
[31m.
[39m
[31m54
[39m
[38;5;250m 3
[39m GSM491… 60.0 3.96 4.54 4.37
[31mNA
[39m 43.0 -
[31m13
[39m
[31m.
[39m
[31m0
[39m -
[31m2
[39m
[31m.
[39m
[31m13
[39m -
[31m2
[39m
[31m.
[39m
[31m0
[39m
[31m3
[39m
[38;5;250m 4
[39m GSM491… 57.9 -
[31m4
[39m
[31m.
[39m
[31m13
[39m -
[31m1
[39m
[31m.
[39m
[31m45
[39m -
[31m1
[39m
[31m.
[39m
[31m38
[39m
[31mNA
[39m 40.9 -
[31m21
[39m
[31m.
[39m
[31m1
[39m -
[31m8
[39m
[31m.
[39m
[31m40
[39m -
[31m6
[39m
[31m.
[39m
[31m35
[39m
[38;5;250m 5
[39m GSM491… 59.0 -
[31m13
[39m
[31m.
[39m
[31m0
[39m -
[31m6
[39m
[31m.
[39m
[31m79
[39m -
[31m6
[39m
[31m.
[39m
[31m98
[39m
[31mNA
[39m 57.0 -
[31m15
[39m
[31m.
[39m
[31m0
[39m 0.756 -
[31m1
[39m
[31m.
[39m
[31m77
[39m
[38;5;250m 6
[39m GSM491… 57.0 -
[31m4
[39m
[31m.
[39m
[31m00
[39m -
[31m1
[39m
[31m.
[39m
[31m66
[39m -
[31m1
[39m
[31m.
[39m
[31m0
[39m
[31m9
[39m
[31mNA
[39m 44.7 -
[31m16
[39m
[31m.
[39m
[31m3
[39m -
[31m3
[39m
[31m.
[39m
[31m92
[39m -
[31m1
[39m
[31m.
[39m
[31m35
[39m
[38;5;250m 7
[39m GSM491… 61.9 -
[31m3
[39m
[31m.
[39m
[31m0
[39m
[31m8
[39m 0.657 0.183
[31mNA
[39m 47.9 -
[31m17
[39m
[31m.
[39m
[31m1
[39m -
[31m3
[39m
[31m.
[39m
[31m47
[39m -
[31m4
[39m
[31m.
[39m
[31m21
[39m
[38;5;250m 8
[39m GSM491… 59.1 -
[31m11
[39m
[31m.
[39m
[31m9
[39m -
[31m6
[39m
[31m.
[39m
[31m0
[39m
[31m7
[39m -
[31m5
[39m
[31m.
[39m
[31m53
[39m
[31mNA
[39m 50.0 -
[31m21
[39m
[31m.
[39m
[31m0
[39m -
[31m5
[39m
[31m.
[39m
[31m60
[39m -
[31m3
[39m
[31m.
[39m
[31m0
[39m
[31m6
[39m
[38;5;250m 9
[39m GSM491… 60.7 -
[31m16
[39m
[31m.
[39m
[31m3
[39m -
[31m8
[39m
[31m.
[39m
[31m33
[39m -
[31m9
[39m
[31m.
[39m
[31m33
[39m
[31mNA
[39m 47.7 -
[31m29
[39m
[31m.
[39m
[31m3
[39m -
[31m12
[39m
[31m.
[39m
[31m0
[39m -
[31m15
[39m
[31m.
[39m
[31m7
[39m
[38;5;250m10
[39m GSM491… 51.1 -
[31m7
[39m
[31m.
[39m
[31m93
[39m -
[31m6
[39m
[31m.
[39m
[31m30
[39m -
[31m6
[39m
[31m.
[39m
[31m33
[39m
[31mNA
[39m 52.5 -
[31m6
[39m
[31m.
[39m
[31m54
[39m 5.26 1.30
[38;5;246m# … with 530 more rows, 18 more variables: BNN <dbl>, ageAcc.BNN <dbl>,
[39m
[38;5;246m# ageAcc2.BNN <dbl>, ageAcc3.BNN <dbl>, skinHorvath <lgl>, PedBE <lgl>,
[39m
[38;5;246m# Wu <dbl>, ageAcc.Wu <dbl>, ageAcc2.Wu <dbl>, ageAcc3.Wu <dbl>, TL <lgl>,
[39m
[38;5;246m# BLUP <lgl>, EN <lgl>, NEOaPMA450K <lgl>, NEOaPNA450K <lgl>,
[39m
[38;5;246m# NEOaPMAEPIC <lgl>, NEOaPNAEPIC <lgl>, age <dbl>, and abbreviated variable
[39m
[38;5;246m# names ¹ageAcc.Horvath, ²ageAcc2.Horvath, ³ageAcc3.Horvath, ⁴ageAcc.Levine,
[39m
[38;5;246m# ⁵ageAcc2.Levine, ⁶ageAcc3.Levine
[39m
The association between disease status and DNAmAge estimated using Horvath’s method can be computed by
mod.horvath1 <- glm(disease ~ ageAcc.Horvath ,
data=age.gse19711,
family="binomial")
summary(mod.horvath1)
Call:
glm(formula = disease ~ ageAcc.Horvath, family = "binomial",
data = age.gse19711)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.358 -1.160 -1.030 1.184 1.771
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.10995 0.09771 -1.125 0.2605
ageAcc.Horvath -0.02023 0.01154 -1.753 0.0795 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 748.48 on 539 degrees of freedom
Residual deviance: 745.25 on 538 degrees of freedom
AIC: 749.25
Number of Fisher Scoring iterations: 4
mod.skinHorvath <- glm(disease ~ ageAcc2.Horvath ,
data=age.gse19711,
family="binomial")
summary(mod.skinHorvath)
Call:
glm(formula = disease ~ ageAcc2.Horvath, family = "binomial",
data = age.gse19711)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.279 -1.163 -1.082 1.189 1.589
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.02970 0.08617 -0.345 0.730
ageAcc2.Horvath -0.01315 0.01209 -1.087 0.277
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 748.48 on 539 degrees of freedom
Residual deviance: 747.27 on 538 degrees of freedom
AIC: 751.27
Number of Fisher Scoring iterations: 3
mod.horvath3 <- glm(disease ~ ageAcc3.Horvath ,
data=age.gse19711,
family="binomial")
summary(mod.horvath3)
Call:
glm(formula = disease ~ ageAcc3.Horvath, family = "binomial",
data = age.gse19711)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.338 -1.163 -1.046 1.185 1.771
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.02993 0.08626 -0.347 0.729
ageAcc3.Horvath -0.01927 0.01283 -1.502 0.133
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 748.48 on 539 degrees of freedom
Residual deviance: 746.13 on 538 degrees of freedom
AIC: 750.13
Number of Fisher Scoring iterations: 4
We do not observe statistical significant association between age acceleration estimated using Horvath method and the risk of developing lung cancer. It is worth to notice that Horvath’s clock was created to predict chronological age and the impact of age acceleration of this clock on disease may be limited. On the other hand, Levine’s clock aimed to distinguish risk between same-aged individuals. Let us evaluate whether this age acceleration usin Levine’s clock is associated with lung cancer
mod.levine1 <- glm(disease ~ ageAcc.Levine , data=age.gse19711,
family="binomial")
summary(mod.levine1)
Call:
glm(formula = disease ~ ageAcc.Levine, family = "binomial", data = age.gse19711)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.592 -1.149 -0.939 1.174 1.733
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.40956 0.17894 2.289 0.02209 *
ageAcc.Levine 0.03178 0.01133 2.806 0.00502 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 748.48 on 539 degrees of freedom
Residual deviance: 740.17 on 538 degrees of freedom
AIC: 744.17
Number of Fisher Scoring iterations: 4
mod.levine2 <- glm(disease ~ ageAcc2.Levine , data=age.gse19711,
family="binomial")
summary(mod.levine2)
Call:
glm(formula = disease ~ ageAcc2.Levine, family = "binomial",
data = age.gse19711)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.7053 -1.1328 -0.8614 1.1529 1.8015
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.02925 0.08718 -0.336 0.737225
ageAcc2.Levine 0.04430 0.01234 3.589 0.000332 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 748.48 on 539 degrees of freedom
Residual deviance: 734.49 on 538 degrees of freedom
AIC: 738.49
Number of Fisher Scoring iterations: 4
mod.levine3 <- glm(disease ~ ageAcc3.Levine , data=age.gse19711,
family="binomial")
summary(mod.levine3)
Call:
glm(formula = disease ~ ageAcc3.Levine, family = "binomial",
data = age.gse19711)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.354 -1.161 -1.057 1.187 1.408
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.02962 0.08622 -0.344 0.731
ageAcc3.Levine 0.01679 0.01244 1.350 0.177
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 748.48 on 539 degrees of freedom
Residual deviance: 746.62 on 538 degrees of freedom
AIC: 750.62
Number of Fisher Scoring iterations: 3
Here we observe as the risk of developing lung cancer increases 3.23
percent per each unit in the age accelerated variable
(ageAcc
). Similar conclusion is obtained when using
ageAcc2
and ageAcc3
variables.
In some occasions cell composition should be used to assess
association. This information is calculated in DNAmAge
function and it can be incorporated in the model by:
cell <- attr(age.gse19711, "cell_proportion")
mod.cell <- glm(disease ~ ageAcc.Levine + cell, data=age.gse19711,
family="binomial")
summary(mod.cell)
Call:
glm(formula = disease ~ ageAcc.Levine + cell, family = "binomial",
data = age.gse19711)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.9605 -1.0832 -0.6241 1.0742 2.3395
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -9.768206 4.380382 -2.230 0.025748 *
ageAcc.Levine 0.003959 0.012208 0.324 0.745746
cellCD4T -3.339693 3.833531 -0.871 0.383656
cellMono 10.165096 4.594096 2.213 0.026922 *
cellNeu 16.319534 4.584745 3.560 0.000372 ***
cellNK -0.882134 4.296498 -0.205 0.837326
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 748.48 on 539 degrees of freedom
Residual deviance: 686.56 on 534 degrees of freedom
AIC: 698.56
Number of Fisher Scoring iterations: 4
Here we observe as the positive association disapears after adjusting for cell counts.
Use of DNAm age in children
dd <- GEOquery::getGEO("GSE109446")
gse109446 <- dd[[1]]
controls <- pData(gse109446)$`diagnosis:ch1`=="control"
gse <- gse109446[,controls]
age <- as.numeric(pData(gse)$`age:ch1`)
age.gse <- DNAmAge(gse, age=age)
Warning in predAge(cpgs.imp, coefNEOaPMAEPIC, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPMAEPIC clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefNEOaPNAEPIC, intercept = TRUE, min.perc): The number of missing CpGs for NEOaPNAEPIC clock exceeds 80 %.
---> This DNAm clock will be NA.
plotCorClocks(age.gse)
Gestational DNAm age estimation
Model predicion
Let us start by reproducing the example provided in Knight et al. (2016) as a test data set (file
‘TestDataset.csv’). It consists on 3 individuals whose methylation data
are available as supplementary data of their paper. The data is also
available at methylclock
package as a data frame.
TestDataset[1:5,]
CpGName Sample1 Sample2 Sample3
1 cg00000292 0.72546496 0.72350947 0.69023377
2 cg00002426 0.85091763 0.80077888 0.80385777
3 cg00003994 0.05125853 0.05943935 0.05559333
4 cg00005847 0.08775420 0.11722333 0.10845113
5 cg00006414 0.03982478 0.06146891 0.03491992
The Gestational Age (in months) is simply computed by
ga.test <- DNAmGA(TestDataset)
Warning in DNAmGA(TestDataset): CpGs in all Gestational Age clocks are not present in your data. Try 'checkClocksGA' function
to find the missing CpGs of each method.
Warning in predAge(cpgs.imp, coefBohlin, intercept = TRUE, min.perc): The number of missing CpGs for Bohlin clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefEPIC, intercept = TRUE, min.perc): The number of missing CpGs for EPIC clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in DNAmGA(TestDataset): The number of missing CpGs for Lee clocks exceeds 80%.
---> This DNAm clock will be NA.
ga.test
[38;5;246m# A tibble: 3 × 6
[39m
id Knight Bohlin Mayne EPIC Lee
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[38;5;250m1
[39m Sample1 38.2
[31mNA
[39m 35.8
[31mNA
[39m
[31mNA
[39m
[38;5;250m2
[39m Sample2 38.8
[31mNA
[39m 36.5
[31mNA
[39m
[31mNA
[39m
[38;5;250m3
[39m Sample3 40.0
[31mNA
[39m 36.6
[31mNA
[39m
[31mNA
[39m
like in DNAmAge we can use the parameter min.perc
to set
the minimum missing percentage.
The results are the same as those described in the additional file 7 of Knight et al. (2016) (link here)
Let us continue by illustrating how to compute GA of real examples.
The PROGRESS cohort data is available in the additional file 8 of Knight et al. (2016). It is available at
methylclock
as a tibble
:
progress_data
[38;5;246m# A tibble: 148 × 151
[39m
CpGmarker `784` `1052` `1048` `1017` `956` `1038` `989` `946` `941` `1024`
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[38;5;250m 1
[39m cg00022866 0.289 0.372 0.347 0.351 0.313 0.300 0.298 0.294 0.322 0.313
[38;5;250m 2
[39m cg00466249 0.658 0.724 0.700 0.717 0.695 0.665 0.710 0.686 0.692 0.704
[38;5;250m 3
[39m cg00546897 0.682 0.711 0.684 0.717 0.627 0.605 0.684 0.716 0.684 0.666
[38;5;250m 4
[39m cg00575744 0.312 0.381 0.300 0.331 0.294 0.348 0.284 0.305 0.319 0.325
[38;5;250m 5
[39m cg00689340 0.566 0.576 0.556 0.571 0.521 0.569 0.599 0.575 0.532 0.564
[38;5;250m 6
[39m cg01056568 0.558 0.620 0.529 0.600 0.577 0.574 0.590 0.576 0.548 0.555
[38;5;250m 7
[39m cg01184449 0.712 0.718 0.667 0.744 0.668 0.676 0.710 0.744 0.685 0.717
[38;5;250m 8
[39m cg01348086 0.195 0.186 0.180 0.194 0.212 0.208 0.183 0.129 0.161 0.144
[38;5;250m 9
[39m cg02100629 0.329 0.330 0.340 0.344 0.268 0.280 0.288 0.314 0.283 0.346
[38;5;250m10
[39m cg02813863 0.819 0.858 0.832 0.874 0.861 0.830 0.894 0.873 0.895 0.863
[38;5;246m# … with 138 more rows, and 140 more variables: `1047` <dbl>, `1035` <dbl>,
[39m
[38;5;246m# `988` <dbl>, `939` <dbl>, `936` <dbl>, `748` <dbl>, `1031` <dbl>,
[39m
[38;5;246m# `903` <dbl>, `864` <dbl>, `874` <dbl>, `898` <dbl>, `1013` <dbl>,
[39m
[38;5;246m# `971` <dbl>, `966` <dbl>, `866` <dbl>, `924` <dbl>, `931` <dbl>,
[39m
[38;5;246m# `1007` <dbl>, `954` <dbl>, `958` <dbl>, `1037` <dbl>, `965` <dbl>,
[39m
[38;5;246m# `1008` <dbl>, `1005` <dbl>, `962` <dbl>, `979` <dbl>, `881` <dbl>,
[39m
[38;5;246m# `876` <dbl>, `764` <dbl>, `743` <dbl>, `987` <dbl>, `930` <dbl>, …
[39m
This file also contains different variables that are available in
this tibble
.
progress_vars
[38;5;246m# A tibble: 150 × 3
[39m
id birthweight EGA
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[38;5;250m 1
[39m 784 2.62 38
[38;5;250m 2
[39m 1052 2.59 38.3
[38;5;250m 3
[39m 1048 3.20 38
[38;5;250m 4
[39m 1017 3.28 38.6
[38;5;250m 5
[39m 956 2.79 37.1
[38;5;250m 6
[39m 1038 2.89 38.1
[38;5;250m 7
[39m 989 2.47 38
[38;5;250m 8
[39m 946 2.42 37.7
[38;5;250m 9
[39m 941 2.96 36.7
[38;5;250m10
[39m 1024 2.61 38.6
[38;5;246m# … with 140 more rows
[39m
The Clinical Variables including clinical assesment of gestational
age (EGA) are available at this tibble
.
The Gestational Age (in months) is simply computed by
ga.progress <- DNAmGA(progress_data)
Warning in DNAmGA(progress_data): CpGs in all Gestational Age clocks are not present in your data. Try 'checkClocksGA' function
to find the missing CpGs of each method.
Warning in predAge(cpgs.imp, coefBohlin, intercept = TRUE, min.perc): The number of missing CpGs for Bohlin clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefMayneGA, intercept = TRUE, min.perc): The number of missing CpGs for Mayne clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefEPIC, intercept = TRUE, min.perc): The number of missing CpGs for EPIC clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in DNAmGA(progress_data): The number of missing CpGs for Lee clocks exceeds 80%.
---> This DNAm clock will be NA.
ga.progress
[38;5;246m# A tibble: 150 × 6
[39m
id Knight Bohlin Mayne EPIC Lee
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[38;5;250m 1
[39m 784 38.8
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 2
[39m 1052 37.2
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 3
[39m 1048 40.3
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 4
[39m 1017 39.2
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 5
[39m 956 38.9
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 6
[39m 1038 39.2
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 7
[39m 989 37.2
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 8
[39m 946 35.4
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m 9
[39m 941 33.5
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;250m10
[39m 1024 37.4
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[38;5;246m# … with 140 more rows
[39m
We can compare these results with the clinical GA available in the variable EGA
plotDNAmAge(ga.progress$Knight, progress_vars$EGA,
tit="GA Knight's method",
clock="GA")
Figure 3b (only for PROGRESS dataset) in Knight et al. (2016) representing the correlation between GA acceleration and birthweight can be reproduced by
library(ggplot2)
progress_vars$acc <- ga.progress$Knight - progress_vars$EGA
p <- ggplot(data=progress_vars, aes(x = acc, y = birthweight)) +
geom_point() +
geom_smooth(method = "lm", se=FALSE, color="black") +
xlab("GA acceleration") +
ylab("Birthweight (kgs.)")
p
Finally, we can also estimate the “accelerated gestational age” using
two of the the three different estimates previously described
(accAge
, accAge2
) by provinding information of
gestational age through age
argument. Notice that in that
case accAge3
cannot be estimates since we do not have all
the CpGs required by the default reference panel to estimate cell counts
for gestational age which is “andrews and bakulski cord blood”.
accga.progress <- DNAmGA(progress_data,
age = progress_vars$EGA,
cell.count=FALSE)
Warning in DNAmGA(progress_data, age = progress_vars$EGA, cell.count = FALSE): CpGs in all Gestational Age clocks are not present in your data. Try 'checkClocksGA' function
to find the missing CpGs of each method.
Warning in predAge(cpgs.imp, coefBohlin, intercept = TRUE, min.perc): The number of missing CpGs for Bohlin clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefMayneGA, intercept = TRUE, min.perc): The number of missing CpGs for Mayne clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in predAge(cpgs.imp, coefEPIC, intercept = TRUE, min.perc): The number of missing CpGs for EPIC clock exceeds 80 %.
---> This DNAm clock will be NA.
Warning in DNAmGA(progress_data, age = progress_vars$EGA, cell.count = FALSE): The number of missing CpGs for Lee clocks exceeds 80%.
---> This DNAm clock will be NA.
accga.progress
[38;5;246m# A tibble: 150 × 9
[39m
id Knight ageAcc.Knight ageAcc2.Knight Bohlin Mayne EPIC Lee age
[3m
[38;5;246m<chr>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<lgl>
[39m
[23m
[3m
[38;5;246m<dbl>
[39m
[23m
[38;5;250m 1
[39m 784 38.8 0.792 1.27
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m 38
[38;5;250m 2
[39m 1052 37.2 -
[31m1
[39m
[31m.
[39m
[31m0
[39m
[31m5
[39m -
[31m0
[39m
[31m.
[39m
[31m488
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m 38.3
[38;5;250m 3
[39m 1048 40.3 2.29 2.77
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m 38
[38;5;250m 4
[39m 1017 39.2 0.643 1.28
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m 38.6
[38;5;250m 5
[39m 956 38.9 1.75 1.99
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m 37.1
[38;5;250m 6
[39m 1038 39.2 1.09 1.61
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m 38.1
[38;5;250m 7
[39m 989 37.2 -
[31m0
[39m
[31m.
[39m
[31m774
[39m -
[31m0
[39m
[31m.
[39m
[31m292
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m 38
[38;5;250m 8
[39m 946 35.4 -
[31m2
[39m
[31m.
[39m
[31m36
[39m -
[31m1
[39m
[31m.
[39m
[31m96
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m 37.7
[38;5;250m 9
[39m 941 33.5 -
[31m3
[39m
[31m.
[39m
[31m18
[39m -
[31m3
[39m
[31m.
[39m
[31m0
[39m
[31m6
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m 36.7
[38;5;250m10
[39m 1024 37.4 -
[31m1
[39m
[31m.
[39m
[31m12
[39m -
[31m0
[39m
[31m.
[39m
[31m486
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m
[31mNA
[39m 38.6
[38;5;246m# … with 140 more rows
[39m
One can also check which clocks can be estimated given the CpGs available in the methylation data by
checkClocksGA(progress_data)
There are some clocks that cannot be computed since your data do not contain the required CpGs.
These are the total number of missing CpGs for each clock :
clock Cpgs_in_clock missing_CpGs percentage
1 Knight 148 0 0.0
2 Bohlin 96 94 97.9
3 Mayne 62 61 98.4
4 Lee 1125 1125 100.0
5 EPIC 176 176 100.0
Correlation among DNAm clocks
We can compute the correlation among biological clocks using the
function plotCorClocks
that requires the package
ggplot2
and ggpubr
to be installed in your
computer.
We can obtain, for instance, the correlation among the clocks estimated for the healthy individuals study previosuly analyze (GEO accession number GSE58045) by simply executing:
plotCorClocks(age.gse58045)
Session Info
R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=Spanish_Spain.utf8 LC_CTYPE=Spanish_Spain.utf8
[3] LC_MONETARY=Spanish_Spain.utf8 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] forcats_0.5.2 stringr_1.4.1 dplyr_1.0.10
[4] purrr_0.3.5 readr_2.1.3 tidyr_1.2.1
[7] tidyverse_1.3.2 GEOquery_2.66.0 ggpmisc_0.5.1
[10] ggpp_0.4.5 ggplot2_3.4.0 impute_1.72.1
[13] tibble_3.1.8 Biobase_2.58.0 BiocGenerics_0.44.0
[16] methylclock_0.8.0.3 quadprog_1.5-8 devtools_2.4.5
[19] usethis_2.1.6 BiocStyle_2.26.0
loaded via a namespace (and not attached):
[1] rappdirs_0.3.3 SparseM_1.81
[3] rtracklayer_1.58.0 R.methodsS3_1.8.2
[5] ragg_1.2.4 bumphunter_1.40.0
[7] minfi_1.44.0 bit64_4.0.5
[9] knitr_1.40 DelayedArray_0.24.0
[11] R.utils_2.12.2 data.table_1.14.4
[13] KEGGREST_1.38.0 RCurl_1.98-1.9
[15] generics_0.1.3 GenomicFeatures_1.50.2
[17] preprocessCore_1.60.0 callr_3.7.3
[19] RSQLite_2.2.18 bit_4.0.4
[21] tzdb_0.3.0 xml2_1.3.3
[23] lubridate_1.9.0 httpuv_1.6.6
[25] SummarizedExperiment_1.28.0 assertthat_0.2.1
[27] gargle_1.2.1 xfun_0.34
[29] hms_1.1.2 jquerylib_0.1.4
[31] evaluate_0.18 promises_1.2.0.1
[33] fansi_1.0.3 restfulr_0.0.15
[35] scrime_1.3.5 progress_1.2.2
[37] dbplyr_2.2.1 readxl_1.4.1
[39] DBI_1.1.3 htmlwidgets_1.5.4
[41] reshape_0.8.9 googledrive_2.0.0
[43] stats4_4.2.1 ellipsis_0.3.2
[45] ggpubr_0.5.0 backports_1.4.1
[47] bookdown_0.30 annotate_1.76.0
[49] biomaRt_2.54.0 sparseMatrixStats_1.10.0
[51] MatrixGenerics_1.10.0 vctrs_0.5.0
[53] remotes_2.4.2 quantreg_5.94
[55] abind_1.4-5 cachem_1.0.6
[57] withr_2.5.0 vroom_1.6.0
[59] GenomicAlignments_1.34.0 xts_0.12.2
[61] prettyunits_1.1.1 mclust_6.0.0
[63] crayon_1.5.2 genefilter_1.80.0
[65] labeling_0.4.2 pkgconfig_2.0.3
[67] GenomeInfoDb_1.34.3 nlme_3.1-160
[69] pkgload_1.3.2 rlang_1.0.6
[71] lifecycle_1.0.3 miniUI_0.1.1.1
[73] MatrixModels_0.5-1 filelock_1.0.2
[75] BiocFileCache_2.6.0 modelr_0.1.10
[77] cellranger_1.1.0 rprojroot_2.0.3
[79] matrixStats_0.62.0 rngtools_1.5.2
[81] base64_2.0.1 Matrix_1.5-3
[83] carData_3.0-5 boot_1.3-28
[85] Rhdf5lib_1.20.0 zoo_1.8-11
[87] reprex_2.0.2 processx_3.8.0
[89] googlesheets4_1.0.1 png_0.1-7
[91] rjson_0.2.21 bitops_1.0-7
[93] R.oo_1.25.0 rhdf5filters_1.10.0
[95] Biostrings_2.66.0 blob_1.2.3
[97] DelayedMatrixStats_1.20.0 doRNG_1.8.2
[99] nor1mix_1.3-0 rstatix_0.7.1
[101] S4Vectors_0.36.0 ggsignif_0.6.4
[103] scales_1.2.1 memoise_2.0.1
[105] magrittr_2.0.3 plyr_1.8.8
[107] zlibbioc_1.44.0 compiler_4.2.1
[109] BiocIO_1.8.0 RColorBrewer_1.1-3
[111] illuminaio_0.40.0 Rsamtools_2.14.0
[113] cli_3.4.1 XVector_0.38.0
[115] urlchecker_1.0.1 ps_1.7.2
[117] PerformanceAnalytics_2.0.4 MASS_7.3-58.1
[119] mgcv_1.8-41 tidyselect_1.2.0
[121] stringi_1.7.8 textshaping_0.3.6
[123] highr_0.9 yaml_2.3.6
[125] askpass_1.1 locfit_1.5-9.6
[127] grid_4.2.1 sass_0.4.2
[129] polynom_1.4-1 tools_4.2.1
[131] timechange_0.1.1 parallel_4.2.1
[133] rstudioapi_0.14 foreach_1.5.2
[135] gridExtra_2.3 farver_2.1.1
[137] digest_0.6.30 BiocManager_1.30.19
[139] shiny_1.7.3 Rcpp_1.0.9
[141] GenomicRanges_1.50.1 car_3.1-1
[143] siggenes_1.72.0 broom_1.0.1
[145] later_1.3.0 httr_1.4.4
[147] AnnotationDbi_1.60.0 colorspace_2.0-3
[149] rvest_1.0.3 XML_3.99-0.12
[151] fs_1.5.2 IRanges_2.32.0
[153] splines_4.2.1 pkgdown_2.0.6
[155] confintr_0.2.0 multtest_2.54.0
[157] sessioninfo_1.2.2 systemfonts_1.0.4
[159] xtable_1.8-4 jsonlite_1.8.3
[161] R6_2.5.1 profvis_0.3.7
[163] pillar_1.8.1 htmltools_0.5.3
[165] mime_0.12 glue_1.6.2
[167] fastmap_1.1.0 BiocParallel_1.32.1
[169] beanplot_1.3.1 codetools_0.2-18
[171] pkgbuild_1.3.1 utf8_1.2.2
[173] lattice_0.20-45 bslib_0.4.1
[175] curl_4.3.3 openssl_2.0.4
[177] survival_3.4-0 limma_3.54.0
[179] rmarkdown_2.18 desc_1.4.2
[181] munsell_0.5.0 rhdf5_2.42.0
[183] GenomeInfoDbData_1.2.9 iterators_1.0.14
[185] HDF5Array_1.26.0 haven_2.5.1
[187] gtable_0.3.1