---
title: "bdImputeSNPs_hdf5"
subtitle: "bdImputeSNPs_hdf5"
---

<span class="category-badge hdf5_statistics">HDF5_STATISTICS</span>

## Description

Performs imputation of missing values in SNP (Single Nucleotide Polymorphism) data stored in HDF5 format.

## Usage

```r
bdImputeSNPs_hdf5(filename, group, dataset,
                  outgroup = NULL, outdataset = NULL,
                  bycols = TRUE, paral = NULL,
                  threads = NULL, overwrite = NULL)
```

## Arguments

::: {.param-table}

| Parameter | Description |
|-----------|-------------|
| `filename` | Character string. Path to the HDF5 file. |
| `group` | Character string. Path to the group containing the input dataset. |
| `dataset` | Character string. Name of the dataset to impute. |
| `outgroup` | Character string (optional). Output group path. If NULL, uses the input group. |
| `outdataset` | Character string (optional). Output dataset name. If NULL, overwrites the input dataset. |
| `bycols` | Logical (optional). Whether to impute by columns (TRUE) or rows (FALSE). Default is TRUE. |
| `paral` | Logical (optional). Whether to use parallel processing. |
| `threads` | Integer (optional). Number of threads for parallel processing. |
| `overwrite` | Logical (optional). Whether to overwrite an existing dataset. |

:::

## Value

::: {.return-value}
List with components:

- **`fn`**: Character string with the HDF5 filename
- **`ds`**: Character string with the full dataset path to the imputed data (group/dataset)
:::

## Details

This function provides efficient imputation capabilities for genomic data with support for:

- Imputation options:
  - Row-wise or column-wise imputation
  - Parallel processing
  - Configurable thread count
- Output options:
  - Custom output location
  - In-place modification
  - Overwrite protection
- Implementation features:
  - Memory-efficient processing
  - Safe file operations
  - Error handling

The function supports both in-place modification and creation of new datasets.

## Examples

```{r}
#| eval: false
#| code-fold: show
library(BigDataStatMeth)

# Create test data with missing values
data <- matrix(sample(c(0, 1, 2, NA), 100, replace = TRUE), 10, 10)

# Save to HDF5
fn <- "snp_data.hdf5"
bdCreate_hdf5_matrix(fn, data, "genotype", "snps",
                     overwriteFile = TRUE)

# Impute missing values and write the result to a new group/dataset
bdImputeSNPs_hdf5(filename = fn,
                  group = "genotype",
                  dataset = "snps",
                  outgroup = "genotype_imputed",
                  outdataset = "snps_complete",
                  bycols = TRUE,
                  paral = TRUE)

# Cleanup
if (file.exists(fn)) {
  file.remove(fn)
}
```

## See Also

::: {.see-also}
- [bdCreate_hdf5_matrix](../hdf5_io_management/bdCreate_hdf5_matrix.html) for creating HDF5 matrices
:::
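
## Illustration: column-wise vs. row-wise imputation

The `bycols` argument selects the direction along which missing values are filled. The in-memory sketch below may help show why that choice matters. It is not the package's implementation: this page does not state which imputation rule `bdImputeSNPs_hdf5` applies, so the sketch assumes a simple "most frequent observed genotype" rule, and `impute_mode` is a hypothetical helper written in base R only.

```r
# Hypothetical helper (NOT part of BigDataStatMeth): replace each NA with the
# most frequent observed value along the chosen dimension. The "mode" rule is
# an assumption made purely for illustration.
impute_mode <- function(x, bycols = TRUE) {
  fill_one <- function(v) {
    observed <- v[!is.na(v)]
    if (length(observed) == 0L) return(v)  # no observed values: leave untouched
    counts <- table(observed)
    mode_value <- as.numeric(names(counts)[which.max(counts)])
    v[is.na(v)] <- mode_value
    v
  }
  if (bycols) {
    apply(x, 2, fill_one)       # one call per column
  } else {
    t(apply(x, 1, fill_one))    # one call per row; apply() returns rows as columns
  }
}

# Small genotype matrix with two missing entries
geno <- matrix(c(0, 0, NA,
                 2, NA, 2,
                 1, 1, 2,
                 0, 1, 2), nrow = 4, byrow = TRUE)

by_col <- impute_mode(geno, bycols = TRUE)   # fills from values in the same column
by_row <- impute_mode(geno, bycols = FALSE)  # fills from values in the same row
```

Under this assumed rule the two directions fill the same cells with different values: `geno[1, 3]` takes its value from column 3 in `by_col` and from row 1 in `by_row`. Which direction is appropriate depends on whether SNPs are stored in columns or in rows of the dataset.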