bdcomputeMatrixVector_hdf5

BLOCKWISE_OPS

1 Description

Performs element-wise operations between a matrix and a vector stored in HDF5 format. The function supports addition, subtraction, multiplication, division and power operations, with options for row-wise or column-wise application and parallel processing.

2 Usage

bdcomputeMatrixVector_hdf5(filename, group, dataset, vectorgroup, vectordataset, outdataset, func, outgroup = NULL, byrows = NULL, paral = NULL, threads = NULL, overwrite = FALSE)

3 Arguments

Parameter	Description
`filename`	String. Path to the HDF5 file containing the datasets.
`group`	String. Path to the group containing the matrix dataset.
`dataset`	String. Name of the matrix dataset.
`vectorgroup`	String. Path to the group containing the vector dataset.
`vectordataset`	String. Name of the vector dataset.
`outdataset`	String. Name for the output dataset.
`func`	String. Operation to perform: "+", "-", "*", "/", or "pow".
`outgroup`	Optional string. Output group path. If not provided, results are stored in the same group as the input matrix.
`byrows`	Logical. If TRUE, applies the operation by rows. If FALSE (the behaviour when left unset), applies the operation by columns.
`paral`	Logical. If TRUE, enables parallel processing.
`threads`	Integer. Number of threads for parallel processing. Ignored if paral is FALSE.
`overwrite`	Logical. If TRUE, allows overwriting existing datasets.

4 Value

List with components:

fn: Character string with the HDF5 filename
gr: Character string with the HDF5 group
ds: Character string with the full dataset path (group/dataset)

5 Details

This function provides a flexible interface for performing element-wise operations between matrices and vectors stored in HDF5 format. It supports:

- Five operations:
  - Addition (+): Adds vector elements to matrix rows/columns
  - Subtraction (-): Subtracts vector elements from matrix rows/columns
  - Multiplication (*): Multiplies matrix rows/columns by vector elements
  - Division (/): Divides matrix rows/columns by vector elements
  - Power (pow): Raises matrix rows/columns to the power of the vector elements
- Processing options:
  - Row-wise or column-wise operations
  - Parallel processing for improved performance
  - Configurable thread count for parallel execution
  - Memory-efficient processing for large datasets

The function performs extensive validation:

- Checks matrix and vector dimensions for compatibility
- Validates operation type
- Verifies HDF5 file and dataset accessibility
- Ensures proper data structures (matrix vs. vector)
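To make the row-wise and column-wise semantics concrete, the following is a minimal in-memory sketch in base R. The helper `matrix_vector_op()` is hypothetical and is not part of BigDataStatMeth. It assumes that `byrows = TRUE` pairs the vector with the entries of each row (so the vector length must equal the number of columns) and that `byrows = FALSE` pairs it with the entries of each column; the package's actual orientation convention should be checked against its own output.

```r
# Hypothetical in-memory reference for the five operations; not the package's
# implementation. The orientation convention is an assumption (see lead-in).
matrix_vector_op <- function(M, v, func = c("+", "-", "*", "/", "pow"),
                             byrows = FALSE) {
  func <- match.arg(func)          # rejects unknown operation names
  v <- as.numeric(v)               # accept a plain vector or an n x 1 matrix

  # Map the operation name to the corresponding base R operator
  op <- switch(func,
               "+"   = `+`,
               "-"   = `-`,
               "*"   = `*`,
               "/"   = `/`,
               "pow" = `^`)

  # byrows = TRUE: v[j] is combined with column j, i.e. along every row
  # byrows = FALSE: v[i] is combined with row i, i.e. along every column
  margin   <- if (byrows) 2L else 1L
  expected <- if (byrows) ncol(M) else nrow(M)
  if (length(v) != expected) {
    stop("vector length ", length(v), " does not match matrix dimension ",
         expected)
  }

  sweep(M, margin, v, FUN = op)
}

M <- matrix(1:6, nrow = 2)                      # 2 x 3 matrix
matrix_vector_op(M, c(1, 2, 3), "*", byrows = TRUE)
matrix_vector_op(M, c(10, 20), "+", byrows = FALSE)
```

The dimension check mirrors the compatibility validation described above: an operation is only attempted when the vector length matches the matrix dimension it is applied along.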

6 Examples

library(BigDataStatMeth)
    
# Create test data
set.seed(123)
Y <- matrix(rnorm(100), 10, 10)
X <- matrix(rnorm(10), 10, 1)
        
# Save to HDF5
bdCreate_hdf5_matrix("test.hdf5", Y, "data", "Y",
                     overwriteFile = TRUE,
                     overwriteDataset = FALSE,
                     unlimited = FALSE)
bdCreate_hdf5_matrix("test.hdf5", X, "data", "X",
                     overwriteFile = FALSE,
                     overwriteDataset = FALSE,
                     unlimited = FALSE)
            
# Multiply matrix rows by vector
bdcomputeMatrixVector_hdf5("test.hdf5",
                           group = "data",
                           dataset = "Y",
                           vectorgroup = "data",
                           vectordataset = "X",
                           outdataset = "ProdComputed",
                           func = "*",
                           byrows = TRUE,
                           overwrite = TRUE)
    
# Subtract vector from matrix rows
bdcomputeMatrixVector_hdf5("test.hdf5",
                           group = "data",
                           dataset = "Y",
                           vectorgroup = "data",
                           vectordataset = "X",
                           outdataset = "SubsComputed",
                           func = "-",
                           byrows = TRUE,
                           overwrite = TRUE)
    
# Subtract vector from matrix columns (overwrites the previous "SubsComputed" result)
bdcomputeMatrixVector_hdf5("test.hdf5",
                           group = "data",
                           dataset = "Y",
                           vectorgroup = "data",
                           vectordataset = "X",
                           outdataset = "SubsComputed",
                           func = "-",
                           byrows = FALSE,
                           overwrite = TRUE)
                           
# Cleanup
if (file.exists("test.hdf5")) {
  file.remove("test.hdf5")
}
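The calls above leave `paral` and `threads` unset and discard the return value. The sketch below shows a division with parallel processing requested and inspects the returned list. It uses only arguments listed in the Usage section; the temporary file path, the dataset name "DivComputed", and the thread count of 2 are choices made for this illustration, and the block is skipped when BigDataStatMeth is not installed.

```r
# Sketch only: the file path and dataset names are illustrative choices.
h5file <- tempfile(fileext = ".hdf5")

if (requireNamespace("BigDataStatMeth", quietly = TRUE)) {
  library(BigDataStatMeth)

  set.seed(123)
  Y <- matrix(rnorm(100), 10, 10)
  X <- matrix(rnorm(10, mean = 5), 10, 1)  # shifted away from zero before dividing

  bdCreate_hdf5_matrix(h5file, Y, "data", "Y",
                       overwriteFile = TRUE,
                       overwriteDataset = FALSE,
                       unlimited = FALSE)
  bdCreate_hdf5_matrix(h5file, X, "data", "X",
                       overwriteFile = FALSE,
                       overwriteDataset = FALSE,
                       unlimited = FALSE)

  # Divide matrix rows by the vector, requesting two threads
  res <- bdcomputeMatrixVector_hdf5(h5file,
                                    group = "data",
                                    dataset = "Y",
                                    vectorgroup = "data",
                                    vectordataset = "X",
                                    outdataset = "DivComputed",
                                    func = "/",
                                    byrows = TRUE,
                                    paral = TRUE,
                                    threads = 2L,
                                    overwrite = TRUE)

  # Documented components: fn (file), gr (group), ds (group/dataset path)
  str(res)
}

# Cleanup
if (file.exists(h5file)) {
  file.remove(h5file)
}
```

Because `outgroup` is not supplied, the result is expected to land in the same group as the input matrix, as described in the Arguments section.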

7 See Also

bdCreate_hdf5_matrix for creating HDF5 matrices
