bdDiag_multiply_hdf5

HDF5_ALGEBRA

1 Description

Multiplies the diagonals of two datasets stored in an HDF5 file, element by element. The function detects whether each input is a matrix (its diagonal is extracted) or a vector (used directly) and chooses the corresponding code path. Because only the diagonal elements are multiplied, rather than full matrices, it is reported to be roughly 50-250x faster than traditional matrix operations.

2 Usage

bdDiag_multiply_hdf5(filename, group, A, B, groupB = NULL, target = NULL, outgroup = NULL, outdataset = NULL, paral = NULL, threads = NULL, overwrite = NULL)

3 Arguments

Parameter	Description
`filename`	String. Path to the HDF5 file containing the datasets.
`group`	String. Group path containing the first dataset (A).
`A`	String. Name of the first dataset (matrix or vector).
`B`	String. Name of the second dataset (matrix or vector).
`groupB`	Optional string. Group path containing dataset B. If NULL, uses same group as A.
`target`	Optional string. Where to write result: “A”, “B”, or “new” (default: “new”).
`outgroup`	Optional string. Output group path. Default is “OUTPUT”.
`outdataset`	Optional string. Output dataset name. Default is “A_*_B” with .diag suffix if appropriate.
`paral`	Optional logical. Whether to use parallel processing. Default is FALSE.
`threads`	Optional integer. Number of threads for parallel processing. If NULL, uses maximum available threads.
`overwrite`	Optional logical. Whether to overwrite existing datasets. Default is FALSE.

4 Value

List with components:

fn: Character string with the HDF5 filename
ds: Character string with the full dataset path to the diagonal multiplication result (group/dataset)
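
As a small illustration of working with the returned list, the sketch below splits `ds` into its group and dataset parts using base R only. The `result` object here is a hand-built stand-in (an assumption, not actual function output) that mirrors the `outgroup` and `outdataset` used in the example call; the exact form of the path returned by the function, such as whether it has a leading slash, may differ.

```r
# Hypothetical return value, built by hand for illustration;
# it mirrors outgroup = "results" and outdataset = "diagonal_product".
result <- list(fn = "test.hdf5", ds = "results/diagonal_product")

# ds is documented as "group/dataset", so base-R path helpers can split it
out_group   <- dirname(result$ds)    # group part of the path
out_dataset <- basename(result$ds)   # dataset name

cat("file:", result$fn, "\n")
cat("group:", out_group, "\n")
cat("dataset:", out_dataset, "\n")
```

The two pieces can then be passed to whichever HDF5 reader is in use to load the stored vector.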

5 Details

This function provides flexible diagonal multiplication with automatic optimization:

- Operation modes:
  - Matrix * Matrix: Extract diagonals → vector multiplication → save as vector
  - Matrix * Vector: Extract diagonal → vector multiplication → save as vector
  - Vector * Vector: Direct vector multiplication (most efficient)
- Performance features:
  - Uses optimized vector operations for maximum efficiency
  - Automatic type detection and dimension validation
  - Memory-efficient processing for large datasets
  - Parallel processing support for improved performance
- Mathematical properties:
  - Element-wise multiplication (not matrix multiplication)
  - Commutative operation: A * B = B * A
  - Handles overflow according to IEEE 754 standards
  - Preserves sign information correctly
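
The operation modes and mathematical properties listed above can be reproduced in memory with base R. This is a minimal sketch of the arithmetic only: it does not read or write HDF5 and does not call BigDataStatMeth, and the variable names (`mm`, `mv`, `vv`, `v`, `w`) are chosen here for illustration.

```r
# Base-R illustration of the arithmetic; no HDF5 access is involved.
set.seed(123)
N <- 4
A <- matrix(rnorm(N * N), N, N)
B <- matrix(rnorm(N * N), N, N)
v <- rnorm(N)
w <- rnorm(N)

# Matrix * Matrix: extract both diagonals, then multiply element-wise
mm <- diag(A) * diag(B)

# Matrix * Vector: extract one diagonal, multiply by the vector
mv <- diag(A) * v

# Vector * Vector: direct element-wise product
vv <- v * w

# Each result is a length-N vector, not an N x N matrix
length(mm)

# Commutative: swapping the operands gives the same values
identical(mm, diag(B) * diag(A))

# Same values as the diagonal of the product of two diagonal matrices,
# but computed without forming any N x N intermediate
all.equal(mm, diag(diag(diag(A)) %*% diag(diag(B))))

# Overflow in double arithmetic yields Inf, as IEEE 754 prescribes
.Machine$double.xmax * 2

# Signs behave as in ordinary multiplication
c(-2, 2) * c(3, -3)
```

The comparison against the diagonal-matrix product shows where the savings come from: the element-wise form touches N values, while the matrix form builds and multiplies N x N operands.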

6 Examples

library(BigDataStatMeth)

# Create test matrices
N <- 1000
set.seed(123)
A <- matrix(rnorm(N*N), N, N)
B <- matrix(rnorm(N*N), N, N)

# Save to HDF5
bdCreate_hdf5_matrix("test.hdf5", A, "data", "matrixA",
                     overwriteFile = TRUE)
bdCreate_hdf5_matrix("test.hdf5", B, "data", "matrixB",
                     overwriteFile = FALSE)

# Multiply diagonals (element-wise)
result <- bdDiag_multiply_hdf5("test.hdf5", "data", "matrixA", "matrixB",
                              outgroup = "results",
                              outdataset = "diagonal_product",
                              paral = TRUE)
