bdblockSubstract_hdf5

BLOCKWISE_OPS

1 Usage

bdblockSubstract_hdf5(filename, group, A, B, groupB = NULL, block_size = NULL, paral = NULL, threads = NULL, outgroup = NULL, outdataset = NULL, overwrite = NULL)

2 Arguments

Parameter    Description
filename     String indicating the HDF5 file path.
group        String indicating the group containing matrix A.
A            String specifying the dataset name for matrix A.
B            String specifying the dataset name for matrix B.
groupB       Optional string indicating the group containing matrix B. If NULL, the same group as A is used.
block_size   Optional integer specifying the block size for processing. If NULL, it is determined automatically from the matrix dimensions.
paral        Optional boolean indicating whether to use parallel processing. Default is FALSE.
threads      Optional integer specifying the number of threads for parallel processing. If NULL, the maximum number of available threads is used.
outgroup     Optional string specifying the output group. Default is "OUTPUT".
outdataset   Optional string specifying the output dataset name. Default is "A_-_B".
overwrite    Optional boolean indicating whether to overwrite existing datasets. Default is FALSE.
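
The optional arguments can be combined as in the following sketch, which stores A and B in different groups and therefore sets groupB. The file name, group names, and output names are hypothetical and chosen only for illustration; the calls follow the signature shown under Usage, and the thread count of 2 is an arbitrary choice.

```r
library(BigDataStatMeth)

# Hypothetical file name, used only for this illustration
h5_file <- "test_groups.hdf5"

set.seed(555)
a <- matrix(rnorm(200 * 100), 200, 100)
b <- matrix(rnorm(200 * 100), 200, 100)

# Store A and B in two different groups of the same file
bdCreate_hdf5_matrix(h5_file, a, "groupA", "A",
                     overwriteFile = TRUE)
bdCreate_hdf5_matrix(h5_file, b, "groupB", "B",
                     overwriteFile = FALSE)

# groupB names the group that holds B; threads is only relevant
# because paral = TRUE. overwrite = TRUE allows the call to be re-run.
res <- bdblockSubstract_hdf5(h5_file, group = "groupA", A = "A", B = "B",
                             groupB = "groupB",
                             paral = TRUE, threads = 2,
                             outgroup = "results",
                             outdataset = "A_minus_B",
                             overwrite = TRUE)
```

Leaving block_size as NULL lets the function pick the block size itself, as described in the table above.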

3 Value

A list containing the location of the subtraction result:

  • fn: Character string. Path to the HDF5 file containing the result
  • ds: Character string. Full dataset path to the subtraction result (A - B) within the HDF5 file
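
As a small sketch of how the returned list might be used, the snippet below splits ds into its group and dataset parts with base R. The list here is a hand-written stand-in with hypothetical values, not actual output of the function, and the exact form of the path (for example, whether it starts with a slash) is an assumption.

```r
# Hand-written stand-in for the documented return value (hypothetical values)
res <- list(fn = "test.hdf5", ds = "results/diff")

# Split the dataset path into its group and dataset name
out_group   <- dirname(res$ds)
out_dataset <- basename(res$ds)

cat("File:   ", res$fn, "\n")
cat("Group:  ", out_group, "\n")
cat("Dataset:", out_dataset, "\n")
```

The two components can then be passed as the group and dataset arguments of a later call that takes the result as input.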

4 Details

The function implements optimized subtraction through:

Operation modes:

  • Matrix-matrix subtraction (A - B)
  • Matrix-vector subtraction
  • Vector-matrix subtraction

Block processing:

  • Automatic block size selection
  • Memory-efficient operations
  • Parallel computation support

Block size optimization based on:

  • Matrix dimensions
  • Available memory
  • Operation type (matrix/vector)

Error handling:

  • Dimension validation
  • Resource management
  • Exception handling
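
The idea of block processing can be illustrated with a minimal in-memory sketch in base R. It is not the package's implementation: block_subtract is a hypothetical helper, it covers only the matrix-matrix mode, and it works on ordinary matrices rather than HDF5 datasets. It shows how the result is assembled one block at a time after the dimensions have been validated.

```r
# Hypothetical helper, not part of BigDataStatMeth:
# subtract B from A one block_size x block_size tile at a time.
block_subtract <- function(A, B, block_size = 256L) {
  # Dimension validation: both operands must be non-empty and equally shaped
  stopifnot(is.matrix(A), is.matrix(B),
            length(A) > 0, identical(dim(A), dim(B)))

  out <- matrix(0, nrow(A), ncol(A))
  row_starts <- seq(1L, nrow(A), by = block_size)
  col_starts <- seq(1L, ncol(A), by = block_size)

  for (i in row_starts) {
    rows <- i:min(i + block_size - 1L, nrow(A))
    for (j in col_starts) {
      cols <- j:min(j + block_size - 1L, ncol(A))
      # Only one tile of A and one tile of B are combined per step
      out[rows, cols] <- A[rows, cols, drop = FALSE] -
                         B[rows, cols, drop = FALSE]
    }
  }
  out
}

set.seed(555)
a <- matrix(rnorm(700 * 500), 700, 500)
b <- matrix(rnorm(700 * 500), 700, 500)

d <- block_subtract(a, b, block_size = 256L)
stopifnot(isTRUE(all.equal(d, a - b)))
```

With data stored on disk, each tile would be read from and written to the file instead of being indexed in memory, which is what keeps the memory footprint bounded by the block size rather than by the full matrix.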

5 Examples

library(BigDataStatMeth)

# Create test matrices
N <- 1500
M <- 1500
set.seed(555)
a <- matrix(rnorm(N*M), N, M)
b <- matrix(rnorm(N*M), N, M)

# Save to HDF5
bdCreate_hdf5_matrix("test.hdf5", a, "data", "A",
                     overwriteFile = TRUE)
bdCreate_hdf5_matrix("test.hdf5", b, "data", "B",
                     overwriteFile = FALSE)

# Perform blockwise subtraction (A - B) in parallel;
# the result is written to the dataset "diff" in group "results"
bdblockSubstract_hdf5("test.hdf5", "data", "A", "B",
                      outgroup = "results",
                      outdataset = "diff",
                      block_size = 1024,
                      paral = TRUE)