library(BigDataStatMeth)
# Create test data
set.seed(123)
Y <- matrix(rnorm(100), 10, 10)
X <- matrix(rnorm(10), 10, 1)
# Save to HDF5
bdCreate_hdf5_matrix("test.hdf5", Y, "data", "Y",
                     overwriteFile = TRUE,
                     overwriteDataset = FALSE,
                     unlimited = FALSE)
bdCreate_hdf5_matrix("test.hdf5", X, "data", "X",
                     overwriteFile = FALSE,
                     overwriteDataset = FALSE,
                     unlimited = FALSE)
# Multiply matrix rows by vector
bdcomputeMatrixVector_hdf5("test.hdf5",
                           group = "data",
                           dataset = "Y",
                           vectorgroup = "data",
                           vectordataset = "X",
                           outdataset = "ProdComputed",
                           func = "*",
                           byrows = TRUE,
                           overwrite = TRUE)
# Subtract vector from matrix rows
bdcomputeMatrixVector_hdf5("test.hdf5",
                           group = "data",
                           dataset = "Y",
                           vectorgroup = "data",
                           vectordataset = "X",
                           outdataset = "SubsComputed",
                           func = "-",
                           byrows = TRUE,
                           overwrite = TRUE)
# Subtract vector from matrix columns
bdcomputeMatrixVector_hdf5("test.hdf5",
                           group = "data",
                           dataset = "Y",
                           vectorgroup = "data",
                           vectordataset = "X",
                           outdataset = "SubsComputed",
                           func = "-",
                           byrows = FALSE,
                           overwrite = TRUE)
# Cleanup
if (file.exists("test.hdf5")) {
  file.remove("test.hdf5")
}
bdcomputeMatrixVector_hdf5
BLOCKWISE_OPS
1 Description
Performs element-wise operations between a matrix and a vector stored in HDF5 format. The function supports addition, subtraction, multiplication, division and power operations, with options for row-wise or column-wise application and parallel processing.
2 Usage
bdcomputeMatrixVector_hdf5(filename, group, dataset, vectorgroup, vectordataset, outdataset, func, outgroup = NULL, byrows = NULL, paral = NULL, threads = NULL, overwrite = FALSE)
3 Arguments
| Parameter | Description |
|---|---|
| filename | String. Path to the HDF5 file containing the datasets. |
| group | String. Path to the group containing the matrix dataset. |
| dataset | String. Name of the matrix dataset. |
| vectorgroup | String. Path to the group containing the vector dataset. |
| vectordataset | String. Name of the vector dataset. |
| outdataset | String. Name for the output dataset. |
| func | String. Operation to perform: `"+"`, `"-"`, `"*"`, `"/"`, or `"pow"`. |
| outgroup | Optional string. Output group path. If not provided, results are stored in the same group as the input matrix. |
| byrows | Logical. If TRUE, applies operation by rows. If FALSE (default), applies operation by columns. |
| paral | Logical. If TRUE, enables parallel processing. |
| threads | Integer. Number of threads for parallel processing. Ignored if paral is FALSE. |
| overwrite | Logical. If TRUE, allows overwriting existing datasets. |
4 Value
List with components:
- fn: Character string with the HDF5 filename
- gr: Character string with the HDF5 group
- ds: Character string with the full dataset path (group/dataset)
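The returned components can be captured and inspected after a call. The following is a minimal sketch, assuming the return value is the list described above; the temporary file name and the output dataset name `DivComputed` are chosen here for illustration only. It also passes `paral` and `threads`, which appear in the usage signature, and it does nothing when BigDataStatMeth is not installed.

```r
# Sketch only: runs when BigDataStatMeth is installed, otherwise does nothing.
if (requireNamespace("BigDataStatMeth", quietly = TRUE)) {
  library(BigDataStatMeth)

  # Temporary file, so the sketch does not touch existing data.
  h5file <- tempfile(fileext = ".hdf5")

  set.seed(123)
  Y <- matrix(rnorm(100), 10, 10)
  X <- matrix(rnorm(10), 10, 1)

  bdCreate_hdf5_matrix(h5file, Y, "data", "Y",
                       overwriteFile = TRUE,
                       overwriteDataset = FALSE,
                       unlimited = FALSE)
  bdCreate_hdf5_matrix(h5file, X, "data", "X",
                       overwriteFile = FALSE,
                       overwriteDataset = FALSE,
                       unlimited = FALSE)

  # "DivComputed" is an illustrative output name, not one used by the package.
  res <- bdcomputeMatrixVector_hdf5(h5file,
                                    group = "data",
                                    dataset = "Y",
                                    vectorgroup = "data",
                                    vectordataset = "X",
                                    outdataset = "DivComputed",
                                    func = "/",
                                    byrows = TRUE,
                                    paral = TRUE,
                                    threads = 2,
                                    overwrite = TRUE)

  # Components listed in the Value section (assumed names: fn, gr, ds).
  print(res$fn)
  print(res$gr)
  print(res$ds)

  file.remove(h5file)
}
```

Keeping the returned list is convenient when the result feeds a later step, since `res$fn` and `res$ds` together identify where the output was written.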
5 Details
This function provides a flexible interface for performing element-wise operations between matrices and vectors stored in HDF5 format. It supports:
- Five operations:
  - Addition (+): Adds vector elements to matrix rows/columns
  - Subtraction (-): Subtracts vector elements from matrix rows/columns
  - Multiplication (*): Multiplies matrix rows/columns by vector elements
  - Division (/): Divides matrix rows/columns by vector elements
  - Power (pow): Raises matrix rows/columns to the power of the vector elements
- Processing options:
  - Row-wise or column-wise operations
  - Parallel processing for improved performance
  - Configurable thread count for parallel execution
  - Memory-efficient processing for large datasets
The function performs extensive validation:
- Checks matrix and vector dimensions for compatibility
- Validates operation type
- Verifies HDF5 file and dataset accessibility
- Ensures proper data structures (matrix vs. vector)
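For small data that fits in memory, the same kind of element-wise matrix-vector operation can be sketched in base R with `sweep()`. This is an illustration of the arithmetic only, not the package's implementation: the helper `matrix_vector_op` and its `along_rows` argument are hypothetical, and how `byrows` maps onto the two directions shown here is an assumption to check against the package's actual output.

```r
# In-memory sketch using base R only; it does not call BigDataStatMeth.
set.seed(123)
Y <- matrix(rnorm(100), 10, 10)
x <- rnorm(10)

# Map the documented operation names to R functions ("pow" becomes `^`).
ops <- list("+" = `+`, "-" = `-`, "*" = `*`, "/" = `/`, "pow" = `^`)

# Hypothetical helper.
# along_rows = TRUE pairs x[i] with every element of row i (MARGIN = 1);
# along_rows = FALSE pairs x[j] with every element of column j (MARGIN = 2).
matrix_vector_op <- function(M, v, func, along_rows = TRUE) {
  stopifnot(func %in% names(ops))           # validate the operation type
  margin <- if (along_rows) 1L else 2L
  stopifnot(length(v) == dim(M)[margin])    # check dimension compatibility
  sweep(M, margin, v, FUN = ops[[func]])
}

prod_rows <- matrix_vector_op(Y, x, "*", along_rows = TRUE)
sub_cols  <- matrix_vector_op(Y, x, "-", along_rows = FALSE)
```

A result computed this way can serve as a reference when checking a dataset written by `bdcomputeMatrixVector_hdf5` on a small test matrix.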
6 Examples
7 See Also
- bdCreate_hdf5_matrix for creating HDF5 matrices