---
title: "bdblockSubstract_hdf5"
subtitle: "bdblockSubstract_hdf5"
---

<span class="category-badge blockwise_ops">BLOCKWISE_OPS</span>

## Usage

```r
bdblockSubstract_hdf5(filename, group, A, B, groupB = NULL, block_size = NULL,
                      paral = NULL, threads = NULL, outgroup = NULL,
                      outdataset = NULL, overwrite = NULL)
```

## Arguments

::: {.param-table}

| Parameter | Description |
|-----------|-------------|
| `filename` | String indicating the HDF5 file path |
| `group` | String indicating the group containing matrix A |
| `A` | String specifying the dataset name for matrix A |
| `B` | String specifying the dataset name for matrix B |
| `groupB` | Optional string indicating group containing matrix B. If NULL, uses same group as A |
| `block_size` | Optional integer specifying block size for processing. If NULL, automatically determined based on matrix dimensions |
| `paral` | Optional boolean indicating whether to use parallel processing. Default is false |
| `threads` | Optional integer specifying number of threads for parallel processing. If NULL, uses maximum available threads |
| `outgroup` | Optional string specifying output group. Default is "OUTPUT" |
| `outdataset` | Optional string specifying output dataset name. Default is "A_-_B" |
| `overwrite` | Optional boolean indicating whether to overwrite existing datasets. Default is false |

:::

## Value

::: {.return-value}

A list containing the location of the subtraction result:

- **`fn`**: Character string. Path to the HDF5 file containing the result
- **`ds`**: Character string. Full dataset path to the subtraction result (A - B) within the HDF5 file

:::

## Details

The function implements optimized subtraction through:

Operation modes:

- Matrix-matrix subtraction (A - B)
- Matrix-vector subtraction
- Vector-matrix subtraction

Block processing:

- Automatic block size selection
- Memory-efficient operations
- Parallel computation support

Block size optimization based on:

- Matrix dimensions
- Available memory
- Operation type (matrix/vector)

Error handling:

- Dimension validation
- Resource management
- Exception handling

## Examples

```{r}
#| eval: false
#| code-fold: show
library(BigDataStatMeth)

# Create test matrices
N <- 1500
M <- 1500
set.seed(555)
a <- matrix(rnorm(N*M), N, M)
b <- matrix(rnorm(N*M), N, M)

# Save to HDF5
bdCreate_hdf5_matrix("test.hdf5", a, "data", "A",
                     overwriteFile = TRUE)
bdCreate_hdf5_matrix("test.hdf5", b, "data", "B",
                     overwriteFile = FALSE)

# Perform subtraction
bdblockSubstract_hdf5("test.hdf5", "data", "A", "B",
                      outgroup = "results",
                      outdataset = "diff",
                      block_size = 1024,
                      paral = TRUE)
```
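
The block processing listed under Details can be illustrated in plain R. The following is a minimal sketch, not the package's implementation: `block_subtract` is a hypothetical helper that works on in-memory matrices and walks row blocks of `block_size` rows, whereas `bdblockSubstract_hdf5` operates on HDF5 datasets and may partition the matrices differently.

```r
# Hypothetical helper for illustration only; not part of BigDataStatMeth.
# Subtracts B from A one row block at a time, so only `block_size` rows
# of each operand are used per iteration.
block_subtract <- function(A, B, block_size = 1024L) {
  # Dimension validation, analogous to the check listed under "Error handling"
  stopifnot(is.matrix(A), is.matrix(B), identical(dim(A), dim(B)))
  stopifnot(nrow(A) >= 1, block_size >= 1)

  out <- matrix(0, nrow(A), ncol(A))
  starts <- seq(1L, nrow(A), by = block_size)
  for (s in starts) {
    # Row indices of the current block; the last block may be shorter
    rows <- s:min(s + block_size - 1L, nrow(A))
    out[rows, ] <- A[rows, , drop = FALSE] - B[rows, , drop = FALSE]
  }
  out
}

set.seed(555)
a <- matrix(rnorm(1500 * 200), 1500, 200)
b <- matrix(rnorm(1500 * 200), 1500, 200)

d <- block_subtract(a, b, block_size = 1024L)

# The blockwise result should match the direct in-memory subtraction
stopifnot(isTRUE(all.equal(d, a - b)))
```

With 1500 rows and `block_size = 1024`, the loop runs twice: once over rows 1 to 1024 and once over the remaining 476 rows. Splitting by rows is an assumption made here for simplicity; it keeps each block independent of the others, which is also what makes this kind of operation straightforward to run in parallel.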