Omics QC & Preprocessing

1 Overview

Omics-specific quality control: imputation of missing SNP values, filtering by missingness rate, and filtering by Minor Allele Frequency.

2 Functions

2.1 impute_snps

Fills NA entries in SNP data by replacing each one with the mean of the non-missing values in its column or row. Intended for 0/1/2-coded diploid genotype matrices.

2.2 filter_low_coverage

Removes columns (SNPs) or rows (samples) whose proportion of missing values (NAs) exceeds a specified threshold. Writes the result to a new dataset.

When the output location is left unspecified (the default), the result is written alongside the input dataset, with a suffix appended to its name.
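As a rough illustration of the missingness filter in this subsection, the sketch below drops columns or rows whose fraction of missing entries is strictly greater than a threshold. The names filter_by_missingness, threshold and by_column are hypothetical, missing values are represented as None, and the matrix is assumed to be non-empty and held in memory rather than written to a dataset.

```python
def filter_by_missingness(matrix, threshold, by_column=True):
    """Drop columns (or rows) whose missing fraction exceeds `threshold`."""

    def missing_rate(values):
        # Fraction of entries that are missing in one column or row.
        return sum(v is None for v in values) / len(values)

    if by_column:
        keep = [
            j for j, col in enumerate(zip(*matrix))
            if missing_rate(col) <= threshold
        ]
        return [[row[j] for j in keep] for row in matrix]
    return [list(row) for row in matrix if missing_rate(row) <= threshold]


# Rows are samples, columns are SNPs; None marks a missing call.
genotypes = [
    [0, None, 1],
    [1, None, 2],
    [2, 1, None],
    [0, 2, 1],
]

# Column missing rates are 0.0, 0.5 and 0.25, so only the middle SNP is dropped.
filtered = filter_by_missingness(genotypes, threshold=0.3)
print(filtered)  # [[0, 1], [1, 2], [2, None], [0, 1]]
```

With by_column=False the same rule is applied per sample: here the first three rows each have one of three calls missing, which is above 0.3, so only the last row would remain.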

2.3 filter_maf

Removes columns or rows whose Minor Allele Frequency (MAF) exceeds a specified threshold. Designed for 0/1/2-coded diploid genotype matrices.

When the output location is left unspecified (the default), the result is written alongside the input dataset, with a suffix appended to its name.
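For a 0/1/2-coded diploid SNP, the frequency of the counted allele is p = sum(g) / (2 * n) over the n non-missing genotypes, and the MAF is min(p, 1 - p). The sketch below computes that value and applies the filter as worded in this subsection, removing columns or rows whose MAF is above the threshold. The names minor_allele_frequency, filter_by_maf, threshold and by_column are hypothetical, missing values are represented as None, and dropping a vector with no observed genotypes is an assumption of this example.

```python
def minor_allele_frequency(genotypes):
    """MAF of one SNP from 0/1/2 genotype codes, ignoring None entries."""
    observed = [g for g in genotypes if g is not None]
    if not observed:
        return None  # MAF is undefined without any observed genotype
    # Each diploid genotype carries two alleles, hence the factor 2.
    p = sum(observed) / (2 * len(observed))
    return min(p, 1 - p)


def filter_by_maf(matrix, threshold, by_column=True):
    """Drop columns (or rows) whose MAF exceeds `threshold`."""

    def keep(values):
        maf = minor_allele_frequency(values)
        # Vectors with undefined MAF are dropped (assumption).
        return maf is not None and maf <= threshold

    if by_column:
        kept = [j for j, col in enumerate(zip(*matrix)) if keep(col)]
        return [[row[j] for j in kept] for row in matrix]
    return [list(row) for row in matrix if keep(row)]


# Rows are samples, columns are SNPs; None marks a missing call.
genotypes = [
    [0, 1, 2],
    [0, 1, 2],
    [0, 2, 2],
    [1, 0, None],
]

# Column MAFs are 0.125, 0.5 and 0.0, so the middle SNP is removed.
filtered = filter_by_maf(genotypes, threshold=0.2)
print(filtered)  # [[0, 2], [0, 2], [0, 2], [1, None]]
```

If the opposite convention is wanted (discarding rare variants whose MAF falls below the threshold), only the comparison inside keep needs to be reversed.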