3 Data sets
3.1 Exposome dataset
The exposome is composed of three different files (in *.csv
, *.tsv
or *.txt
format). Those files are refered inside the Shiny as exposures, description and phenotypes. Their content is the following:
- The
exposures
file contains the measures of each exposure for all the individuals included on the analysis. It is a matrix-like file having a row per individual and a column per exposures. It must includes a column with the subject’s identifier. - The
description
file contains a row for each exposure and, at last, defined the families of exposures. Usually, this file incorporates a description of the exposures, the matrix where it was obtained and the units of measurement among others. - The
phenotypes
file contains the covariates to be included in the analysis as well as the health outcomes of interest. It contains a row per individual included in the analysis and a column for each covariate and outcome. Moreover, it must include a column with the individual’s identifier.
Some remarks regarding this files:
- All three files have to share the same separator element, for
*.csv
files is typical to use a comma (,) but it could also be a semicolon (;). - The exposure names have to start with a character [a-z/A-Z], leading special characters will cause the data entry to return errors.
- Exactly the same exposures have to be present on the description and exposures files.
- Exactly the same samples have to be present on the exposures and phenotypes files.
- The exposures and phenotypes files have an ID column, the description file does not have an ID column nor row.
A visual representation of the three matrices and how they correlate is the following.

Exposures data file example:
id bde100 bde138 bde209 PFOA ...
sub01 2.4665 0.7702 1.6866 2.0075 ...
sub02 0.7799 1.4147 1.2907 1.0153 ...
sub03 -1.6583 -0.9851 -0.8902 -0.0806 ...
sub04 -1.0812 -0.6639 -0.2988 -0.4268 ...
sub05 -0.2842 -0.1518 -1.5291 -0.7365 ...
... ... ... ... ...
Description data file example:
exposure family matrix description
bde100 PBDEs colostrum BDE 100 - log10
bde138 PBDEs colostrum BDE 138 - log10
bde209 PBDEs colostrum BDE 209 - log10
PFOA PFAS cord blood PFOA - log10
PFNA PFAS cord blood PFNA - log10
PFOA PFAS maternal serum PFOA - log10
PFNA PFAS maternal serum PFNA - log10
hg Metals cord blood hg - log 10
Co Metals urine Co (creatinine) - log10
Zn Metals urine Zn (creatinine) - log10
Pb Metals urine Pb (creatinine) - log10
THM Water --- Average total THM uptake - log10
CHCL3 Water --- Average Chloroform uptake - log10
BROM Water --- Average Brominated THM uptake - log10
NO2 Air --- NO2 levels whole pregnancy- log10
Ben Air --- Benzene levels whole pregnancy- log10
Phenotypes data file example:
id asthma BMI sex age ...
sub01 control 23.2539 boy 4 ...
sub02 asthma 24.4498 girl 5 ...
sub03 asthma 15.2356 boy 4 ...
sub04 control 25.1387 girl 4 ...
sub05 control 22.0477 boy 5 ...
... ... ... ... ...
3.2 Plain datasets
If the researcher has gathered all the data on a single file which contains both phenotype and exposure data, this file can be used too. The user interface has a selector for it, more information on the correspondent section.
A visual representation of a plain dataset is the following.

Plain dataset example (3 exposures + 2 phenotypes):
id bde100 bde138 bde209 asthma BMI ...
sub01 2.4665 0.7702 1.6866 control 23.2539 ...
sub02 0.7799 1.4147 1.2907 asthma 24.4498 ...
sub03 -1.6583 -0.9851 -0.8902 asthma 15.2356 ...
sub04 -1.0812 -0.6639 -0.2988 control 25.1387 ...
sub05 -0.2842 -0.1518 -1.5291 control 22.0477 ...
... ... ... ... ... ...
3.3 Omics dataset
The omics data inputed to the Shiny must be provided as an *.RData
. This file has to contain an ExpressionSet, which is an S4 object. This object is a data container of the Bioconductor toolset.
For further information on ExpressionSet and how to create and manipulate them, please visit the official documentation and this selected vignette.