bdapply_Function_hdf5

HDF5_STATISTICS

1 Usage

bdapply_Function_hdf5(filename, group, datasets, outgroup, func, b_group = NULL, b_datasets = NULL, overwrite = FALSE, transp_dataset = FALSE, transp_bdataset = FALSE, fullMatrix = FALSE, byrows = FALSE, threads = 2L)

2 Arguments

Parameter Description
filename Character array, indicating the name of the HDF5 file that holds the input datasets; results are written to the same file
group Character array, indicating the input group that contains the datasets to process
datasets Character array, indicating the input datasets to be used
outgroup Character array, indicating the group where the results will be saved. If NULL, output datasets are stored in the input group
func Character array, function to be applied:
  - "QR": QR decomposition via bdQR()
  - "CrossProd": Cross product via bdCrossprod()
  - "tCrossProd": Transposed cross product via bdtCrossprod()
  - "invChol": Inverse via Cholesky decomposition
  - "blockmult": Matrix multiplication
  - "CrossProd_double": Cross product with two matrices
  - "tCrossProd_double": Transposed cross product with two matrices
  - "solve": Matrix equation solving
  - "sdmean": Standard deviation and mean computation
b_group Optional character array indicating the input group for secondary datasets (used in two-matrix operations)
b_datasets Optional character array indicating the secondary datasets for two-matrix operations
overwrite Optional boolean. If true, overwrites existing results
transp_dataset Optional boolean. If true, transposes first dataset
transp_bdataset Optional boolean. If true, transposes second dataset
fullMatrix Optional boolean for Cholesky operations. If true, stores complete matrix; if false, stores only lower triangular
byrows Optional boolean for statistical operations. If true, computes by rows; if false, by columns
threads Optional integer specifying number of threads for parallel processing
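
The effect of byrows on statistical operations such as "sdmean" can be pictured with plain in-memory R. This is only a sketch: base R functions stand in for the HDF5 computation, the matrix M is made up, and it is an assumption that the package uses the sample (n - 1) standard deviation as sd() does. The layout of the stored output is not shown here.

```r
# Made-up 2 x 3 matrix used only for illustration
M <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)

# byrows = FALSE analogue: one mean and one sd per column
col_stats <- rbind(mean = colMeans(M), sd = apply(M, 2, sd))

# byrows = TRUE analogue: one mean and one sd per row
row_stats <- rbind(mean = rowMeans(M), sd = apply(M, 1, sd))

dim(col_stats)  # 2 statistics x 3 columns
dim(row_stats)  # 2 statistics x 2 rows
```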

3 Value

Modifies the HDF5 file in place, adding computed results

4 Details

For matrix multiplication operations (blockmult, CrossProd_double, tCrossProd_double), the datasets and b_datasets vectors must have the same length. Operations are paired by position: the i-th entry of b_datasets is the second operand for the i-th entry of datasets.

Example: If datasets = c("A1", "A2", "A3") and b_datasets = c("B1", "B2", "B3"), the operations executed are A1 %*% B1, A2 %*% B2, and A3 %*% B3.
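
The pairing rule can be sketched in memory with base R. This is an illustration of the semantics only, not how the package computes on disk; the lists A and B and their contents are hypothetical.

```r
# Hypothetical in-memory stand-ins for the HDF5 datasets
set.seed(1)
A <- list(A1 = matrix(rnorm(6), 2, 3),
          A2 = matrix(rnorm(6), 2, 3),
          A3 = matrix(rnorm(6), 2, 3))
B <- list(B1 = matrix(rnorm(12), 3, 4),
          B2 = matrix(rnorm(12), 3, 4),
          B3 = matrix(rnorm(12), 3, 4))

# Both vectors must have the same length
stopifnot(length(A) == length(B))

# Map() walks both lists in parallel, so the i-th A is multiplied
# only by the i-th B; no cross combinations are formed
products <- Map(function(a, b) a %*% b, A, B)

length(products)      # one product per pair
dim(products[["A1"]]) # rows of A1 by columns of B1
```

Three pairs give three products, never the nine combinations a full cross of the two vectors would produce.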

5 Examples

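library(BigDataStatMeth)

# Sample matrices for illustration (sizes and values are arbitrary)
set.seed(123)
Y <- matrix(rnorm(100), 10, 10)
X <- matrix(rnorm(100), 10, 10)
Z <- matrix(rnorm(100), 10, 10)

# Create HDF5 datasets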
bdCreate_hdf5_matrix(filename = "test_temp.hdf5", 
                    object = Y, group = "data", dataset = "Y",
                    transp = FALSE,
                    overwriteFile = TRUE, overwriteDataset = TRUE, 
                    unlimited = FALSE)

bdCreate_hdf5_matrix(filename = "test_temp.hdf5", 
                    object = X,  group = "data",  dataset = "X",
                    transp = FALSE,
                    overwriteFile = FALSE, overwriteDataset = TRUE, 
                    unlimited = FALSE)

bdCreate_hdf5_matrix(filename = "test_temp.hdf5",
                    object = Z,  group = "data",  dataset = "Z",
                    transp = FALSE,
                    overwriteFile = FALSE, overwriteDataset = TRUE,
                    unlimited = FALSE)

dsets <- bdgetDatasetsList_hdf5("test_temp.hdf5", group = "data")
dsets

# Apply function: QR decomposition
bdapply_Function_hdf5(filename = "test_temp.hdf5",
                     group = "data", datasets = dsets,
                     outgroup = "QR", func = "QR",
                     overwrite = TRUE)
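
A two-matrix operation follows the same pattern, with b_group and b_datasets supplying the second operands. The sketch below uses only arguments listed in the Usage section; the temporary file, the group name "products", the dataset names, and the 5 x 5 sizes are all hypothetical, and the names of the output datasets are not assumed.

```r
library(BigDataStatMeth)

# Hypothetical file and dataset names, chosen only for this sketch
h5file <- tempfile(fileext = ".hdf5")
set.seed(42)
mats <- list(A1 = matrix(rnorm(25), 5, 5), A2 = matrix(rnorm(25), 5, 5),
             B1 = matrix(rnorm(25), 5, 5), B2 = matrix(rnorm(25), 5, 5))

# Write every matrix to group "data"; only the first call creates the file
for (i in seq_along(mats)) {
  bdCreate_hdf5_matrix(filename = h5file,
                       object = mats[[i]], group = "data",
                       dataset = names(mats)[i],
                       transp = FALSE,
                       overwriteFile = (i == 1), overwriteDataset = TRUE,
                       unlimited = FALSE)
}

# Paired multiplication: A1 with B1, A2 with B2
bdapply_Function_hdf5(filename = h5file,
                      group = "data", datasets = c("A1", "A2"),
                      outgroup = "products", func = "blockmult",
                      b_group = "data", b_datasets = c("B1", "B2"),
                      overwrite = TRUE)

# List whatever was written to the output group
bdgetDatasetsList_hdf5(h5file, group = "products")
```

Square matrices are used so that the products are conformable regardless of how the datasets are oriented on disk.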