HDF5 Statistics

1 Overview

Descriptive statistics, normalization, correlation, and domain-specific preprocessing over HDF5.

2 Functions

2.1 bdapply_Function_hdf5

bdapply_Function_hdf5

2.2 bdNormalize_hdf5

bdNormalize_hdf5

2.3 bdImputeSNPs_hdf5

Performs imputation of missing values in SNP (Single Nucleotide Polymorphism) data stored in HDF5 format.
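The imputation strategy is not described in this section, so the following is only an illustrative in-memory sketch of the general idea, not the function's actual method. It assumes genotypes are coded 0/1/2, that missing calls use the hypothetical code 3, and that each missing call is replaced by the most frequent observed value for that SNP. The name `impute_snp` is invented for illustration.

```python
from collections import Counter

MISSING = 3  # assumed missing-value code; not stated in this section


def impute_snp(genotypes, missing=MISSING):
    """Replace missing calls in one SNP with its most frequent observed call."""
    observed = [g for g in genotypes if g != missing]
    if not observed:
        # Every call is missing, so there is nothing to impute from.
        return list(genotypes)
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if g == missing else g for g in genotypes]


snp = [0, 1, 3, 0, 2, 3, 0]
imputed = impute_snp(snp)
```

Mode imputation is chosen here only because it is deterministic and easy to check; other strategies, such as sampling from the observed genotype distribution, follow the same per-SNP pattern.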

2.4 bdCorr_hdf5

Computes a Pearson or Spearman correlation matrix for matrices stored in HDF5 format. It automatically selects between direct computation for small matrices and block-wise processing for large matrices to optimize memory usage and performance.
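To illustrate why block-wise processing keeps memory bounded, here is a minimal sketch of a Pearson correlation between columns that reads the rows in blocks and keeps only running sums. It is a plain-Python illustration under assumed names (`blockwise_pearson`, `block_size`), not the package's implementation, and it works on an in-memory list rather than an HDF5 dataset. A Spearman variant could apply the same accumulation to ranks instead of raw values.

```python
import math


def blockwise_pearson(rows, block_size=2):
    """Pearson correlation between columns, consuming `rows` block by block."""
    p = len(rows[0])
    n = 0
    sums = [0.0] * p
    cross = [[0.0] * p for _ in range(p)]  # running sums of x_i * x_j
    for start in range(0, len(rows), block_size):
        block = rows[start:start + block_size]  # only this block is processed at once
        for row in block:
            n += 1
            for i in range(p):
                sums[i] += row[i]
                for j in range(p):
                    cross[i][j] += row[i] * row[j]
    # Centered cross-products; the common scaling factor cancels in the ratio below.
    cov = [[cross[i][j] - sums[i] * sums[j] / n for j in range(p)] for i in range(p)]
    return [
        [cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(p)]
        for i in range(p)
    ]


data = [
    [1.0, 2.0, 9.0],
    [2.0, 4.0, 7.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 1.0],
]
corr = blockwise_pearson(data, block_size=2)
```

In this toy data the second column is exactly twice the first, so their correlation is 1, while the third column decreases as the first increases, giving a negative correlation. The block size affects only how much data is handled at a time, not the result.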

Correlation types supported: Pearson and Spearman.

For omics data analysis:

2.5 bdgetSDandMean_hdf5

Computes standard deviation and/or mean statistics for a matrix stored in HDF5 format, with support for row-wise or column-wise computations.
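The row-wise versus column-wise distinction can be sketched with Python's standard `statistics` module. The function name `sd_and_mean` and the `by_rows` flag are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption, since this section does not say which estimator is used.

```python
import statistics


def sd_and_mean(matrix, by_rows=False):
    """Return (sds, means) computed per row or per column of a list-of-lists."""
    # zip(*matrix) transposes the matrix so each tuple is one column.
    vectors = matrix if by_rows else list(zip(*matrix))
    means = [statistics.mean(v) for v in vectors]
    sds = [statistics.stdev(v) for v in vectors]  # sample SD; an assumption
    return sds, means


matrix = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]
col_sds, col_means = sd_and_mean(matrix)                # one value per column
row_sds, row_means = sd_and_mean(matrix, by_rows=True)  # one value per row
```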

2.6 bdRemovelowdata_hdf5

Removes SNPs (Single Nucleotide Polymorphisms) with low representation from genomic data stored in HDF5 format.
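This section does not define "low representation". One plausible reading is that a SNP is dropped when too large a fraction of its calls are missing. The sketch below assumes that reading, a hypothetical missing-value code of 3, and an invented `max_missing` threshold; none of these are taken from the function's documentation.

```python
MISSING = 3  # assumed missing-value code; not stated in this section


def remove_low_data(snps, max_missing=0.5, missing=MISSING):
    """Keep only SNPs whose fraction of missing calls is at most `max_missing`."""
    kept = {}
    for name, genotypes in snps.items():
        missing_fraction = sum(1 for g in genotypes if g == missing) / len(genotypes)
        if missing_fraction <= max_missing:
            kept[name] = genotypes
    return kept


snps = {
    "rs_a": [0, 1, 3, 0],  # 25% missing
    "rs_b": [3, 3, 3, 1],  # 75% missing
    "rs_c": [2, 2, 1, 0],  # 0% missing
}
kept = remove_low_data(snps, max_missing=0.5)
```

With a threshold of 0.5, only `rs_b` exceeds the allowed missing fraction and is removed.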

2.7 bdRemoveMAF_hdf5

Filters SNPs (Single Nucleotide Polymorphisms) based on Minor Allele Frequency (MAF) in genomic data stored in HDF5 format.
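As a rough illustration of MAF filtering, the sketch below assumes diploid genotypes coded as the count of one allele (0, 1, or 2) with no missing values, computes that allele's frequency p, and takes the MAF as min(p, 1 - p). The names `minor_allele_frequency`, `filter_by_maf`, and `min_maf`, and the choice to keep SNPs at or above the threshold, are assumptions for illustration rather than the function's documented behavior.

```python
def minor_allele_frequency(genotypes):
    """MAF for one SNP, with genotypes coded as allele counts 0, 1, or 2."""
    p = sum(genotypes) / (2 * len(genotypes))  # frequency of the counted allele
    return min(p, 1 - p)


def filter_by_maf(snps, min_maf=0.05):
    """Keep only SNPs whose MAF is at least `min_maf`."""
    return {
        name: genotypes
        for name, genotypes in snps.items()
        if minor_allele_frequency(genotypes) >= min_maf
    }


snps = {
    "rs_a": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # monomorphic, MAF = 0
    "rs_b": [0, 1, 2, 1, 0, 1, 0, 0, 1, 0],  # 6 of 20 alleles, MAF = 0.3
    "rs_c": [2, 2, 2, 2, 2, 2, 2, 2, 1, 1],  # 18 of 20 alleles, MAF = 0.1
}
kept = filter_by_maf(snps, min_maf=0.05)
```

The monomorphic SNP `rs_a` carries no variation and is filtered out, while the other two pass the threshold.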