bdblockSubstract_hdf5

BLOCKWISE_OPS

1 Usage

bdblockSubstract_hdf5(filename, group, A, B, groupB = NULL, block_size = NULL, paral = NULL, threads = NULL, outgroup = NULL, outdataset = NULL, overwrite = NULL)

2 Arguments

Parameter	Description
`filename`	String indicating the HDF5 file path
`group`	String indicating the group containing matrix A
`A`	String specifying the dataset name for matrix A
`B`	String specifying the dataset name for matrix B
`groupB`	Optional string naming the group that contains matrix B. If NULL, the group given in `group` is used
`block_size`	Optional integer specifying the block size for processing. If NULL, it is determined automatically from the matrix dimensions
`paral`	Optional logical indicating whether to use parallel processing. Default is FALSE
`threads`	Optional integer specifying the number of threads for parallel processing. If NULL, the maximum number of available threads is used
`outgroup`	Optional string specifying output group. Default is “OUTPUT”
`outdataset`	Optional string specifying output dataset name. Default is “A_-_B”
`overwrite`	Optional logical indicating whether to overwrite existing datasets. Default is FALSE

3 Value

A list containing the location of the subtraction result:

fn: Character string. Path to the HDF5 file containing the result
ds: Character string. Full dataset path to the subtraction result (A - B) within the HDF5 file
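
The returned list is meant to be handed on to whatever reads the result. The following is a minimal sketch of that step, assuming the list has the shape described above. The `res` object is built by hand so the snippet runs without an HDF5 file, and the `"results/diff"` path is an assumed example value, not documented output.

```r
# Stand-in for the value returned by bdblockSubstract_hdf5();
# the concrete strings are assumptions for illustration only.
res <- list(fn = "test.hdf5", ds = "results/diff")

# Split the full dataset path into its group and dataset name.
result_group   <- dirname(res$ds)
result_dataset <- basename(res$ds)

cat("File:   ", res$fn, "\n")
cat("Group:  ", result_group, "\n")
cat("Dataset:", result_dataset, "\n")
```

With an HDF5 reader such as the rhdf5 package installed, `rhdf5::h5read(res$fn, res$ds)` would load the dataset into memory. Whether the matrix comes back in its original orientation depends on how the file was written, so it is worth checking the dimensions after reading.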

4 Details

The function implements optimized subtraction through:

Operation modes:
- Matrix-matrix subtraction (A - B)
- Matrix-vector subtraction
- Vector-matrix subtraction
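
How the function aligns a vector against a matrix is not stated here. As an in-memory analogy only, the sketch below shows the three operand combinations in base R, where a vector with one value per row is recycled down each column. The HDF5 function may align vectors differently.

```r
A <- matrix(1:6, nrow = 2)   # 2 x 3 matrix
B <- matrix(6:1, nrow = 2)   # 2 x 3 matrix
v <- c(10, 20)               # one value per row of A

mm <- A - B   # matrix-matrix: element-wise difference
mv <- A - v   # matrix-vector: v is recycled down each column
vm <- v - A   # vector-matrix: operands swapped, so every sign flips

print(mm)
print(mv)
print(vm)
```

Subtraction is not commutative, which is why matrix-vector and vector-matrix are listed as separate modes: the second result is the negation of the first.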

Block processing:
- Automatic block size selection
- Memory-efficient operations
- Parallel computation support
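
The idea behind block processing can be sketched in base R. This is not the package's implementation: in-memory matrices stand in for the HDF5 datasets, and `block_subtract` is a hypothetical helper written for this illustration.

```r
# Subtract B from A one square tile at a time.
# In-memory matrices stand in for HDF5 datasets here.
block_subtract <- function(A, B, block_size = 2L) {
  stopifnot(identical(dim(A), dim(B)))
  out <- matrix(0, nrow(A), ncol(A))
  for (i in seq(1, nrow(A), by = block_size)) {
    rows <- i:min(i + block_size - 1, nrow(A))
    for (j in seq(1, ncol(A), by = block_size)) {
      cols <- j:min(j + block_size - 1, ncol(A))
      # Only this tile of A and B is touched in each iteration.
      out[rows, cols] <- A[rows, cols, drop = FALSE] -
        B[rows, cols, drop = FALSE]
    }
  }
  out
}

set.seed(555)
A <- matrix(rnorm(35), 5, 7)
B <- matrix(rnorm(35), 5, 7)

# The blockwise result matches the direct subtraction.
stopifnot(isTRUE(all.equal(block_subtract(A, B, 2L), A - B)))
```

Because each tile depends only on the matching tiles of A and B, the tiles can be computed independently, which is what makes a parallel version possible.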

Block size optimization based on:
- Matrix dimensions
- Available memory
- Operation type (matrix/vector)
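
The actual selection rule is not documented here. To illustrate how matrix dimensions and available memory might interact, the following sketch uses a hypothetical helper, `choose_block_size`, with an assumed memory budget. It ignores the operation type and should not be read as the package's heuristic.

```r
# Hypothetical helper, not part of BigDataStatMeth.
# Picks the largest square block such that three double-precision
# tiles (A, B and the result) fit within mem_bytes.
choose_block_size <- function(n_rows, n_cols, mem_bytes = 256 * 1024^2) {
  bytes_per_value <- 8   # size of an R double
  tiles_in_memory <- 3   # A tile, B tile, output tile
  max_side <- floor(sqrt(mem_bytes / (bytes_per_value * tiles_in_memory)))
  # Never exceed the matrix itself, and always return at least 1.
  max(1L, as.integer(min(max_side, n_rows, n_cols)))
}

# Small matrix: the block is limited by the matrix dimensions.
small <- choose_block_size(1500, 1500)

# Large matrix with a tight budget: the block is limited by memory.
large <- choose_block_size(1e6, 1e6, mem_bytes = 24e6)

print(c(small = small, large = large))
```

Passing an explicit `block_size` to the function bypasses automatic selection, which can be useful when the memory available to the R session is known in advance.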

Error handling:
- Dimension validation
- Resource management
- Exception handling
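
Dimension validation can also be done on the caller's side before any file is touched. The sketch below shows one way to do that for the matrix-matrix case. `check_subtract_dims` is a hypothetical pre-check written for this illustration, not the package's own validation code.

```r
# Hypothetical pre-check, not the package's own validation code.
check_subtract_dims <- function(dim_a, dim_b) {
  dim_a <- as.integer(dim_a)
  dim_b <- as.integer(dim_b)
  if (!identical(dim_a, dim_b)) {
    stop(sprintf("Dimension mismatch: A is %d x %d, B is %d x %d",
                 dim_a[1], dim_a[2], dim_b[1], dim_b[2]))
  }
  invisible(TRUE)
}

# Matching dimensions pass silently.
ok <- check_subtract_dims(c(1500, 1500), c(1500, 1500))

# Mismatched dimensions raise an error, captured here as a message.
msg <- tryCatch(
  check_subtract_dims(c(1500, 1500), c(1500, 1000)),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Wrapping the real call in `tryCatch` in the same way lets a script report a failed subtraction and continue, instead of stopping at the first error.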

5 Examples

library(BigDataStatMeth)

# Create test matrices
N <- 1500
M <- 1500
set.seed(555)
a <- matrix(rnorm(N*M), N, M)
b <- matrix(rnorm(N*M), N, M)

# Save to HDF5
bdCreate_hdf5_matrix("test.hdf5", a, "data", "A",
                     overwriteFile = TRUE)
bdCreate_hdf5_matrix("test.hdf5", b, "data", "B",
                     overwriteFile = FALSE)

# Perform subtraction
bdblockSubstract_hdf5("test.hdf5", "data", "A", "B",
                      outgroup = "results",
                      outdataset = "diff",
                      block_size = 1024,
                      paral = TRUE)
