bdReduce_hdf5_dataset

HDF5_IO_MANAGEMENT

1 Description

Reduces the datasets within an HDF5 group to a single dataset by applying an arithmetic operation (addition or subtraction).

2 Usage

bdReduce_hdf5_dataset(filename, group, reducefunction, outgroup = NULL, outdataset = NULL, overwrite = FALSE, remove = FALSE)

3 Arguments

Parameter       Description
filename        Character string. Path to the HDF5 file.
group           Character string. Path to the group containing datasets.
reducefunction  Character. Operation to apply, either "+" or "-".
outgroup        Character string (optional). Output group path. If NULL, uses input group.
outdataset      Character string (optional). Output dataset name. If NULL, uses input group name.
overwrite       Logical (optional). Whether to overwrite existing dataset. Default is FALSE.
remove          Logical (optional). Whether to remove source datasets after reduction. Default is FALSE.

4 Value

List with the following components. If an error occurs, all string values are returned as empty strings (""):

  • fn: Character string with the HDF5 filename
  • ds: Character string with the full dataset path to the reduced dataset (group/dataset)
  • func: Character string with the reduction function applied
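Since string values come back empty when an error occurs, a caller may want to check the returned list before using the dataset path. The following is a minimal sketch in base R. The lists `res_ok` and `res_err` are hand-built stand-ins for a return value (their contents are hypothetical), and `reduction_succeeded` is an illustrative helper, not part of BigDataStatMeth.

```r
# Hypothetical stand-ins for the list returned by bdReduce_hdf5_dataset()
res_ok  <- list(fn = "test.hdf5", ds = "results/sum_matrix", func = "+")
res_err <- list(fn = "", ds = "", func = "")

# Illustrative helper (not part of BigDataStatMeth): TRUE only when all
# three documented components are present and are non-empty strings
reduction_succeeded <- function(res) {
  vals <- unlist(res[c("fn", "ds", "func")])
  length(vals) == 3 && all(nzchar(vals))
}

reduction_succeeded(res_ok)   # TRUE
reduction_succeeded(res_err)  # FALSE
```

In real use, the list returned by bdReduce_hdf5_dataset() would take the place of `res_ok`.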

5 Details

This function provides efficient dataset reduction capabilities with:

  • Operation options:
      - Addition of datasets
      - Subtraction of datasets
  • Output options:
      - Custom output location
      - Configurable dataset name
      - Overwrite protection
  • Implementation features:
      - Memory-efficient processing
      - Safe file operations
      - Optional source cleanup
      - Comprehensive error handling

The function processes datasets efficiently while maintaining data integrity.
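To make the two operations concrete, the sketch below computes an in-memory reference result in base R for the same small matrices used in the Examples section. It assumes the reduction is element-wise and applied as a left fold over the datasets, in the order in which they are listed. This page does not state the order, which matters for subtraction, so treat that part as an assumption rather than documented behavior.

```r
# Same test matrices as in the Examples section
X1 <- matrix(1:100, 10, 10)
X2 <- matrix(101:200, 10, 10)
X3 <- matrix(201:300, 10, 10)
mats <- list(X1, X2, X3)

# Assumed semantics: element-wise left fold over the datasets
sum_ref  <- Reduce(`+`, mats)   # (X1 + X2) + X3
diff_ref <- Reduce(`-`, mats)   # (X1 - X2) - X3

sum_ref[1, 1]    # 1 + 101 + 201 = 303
diff_ref[1, 1]   # 1 - 101 - 201 = -301
```

A reference like `sum_ref` can be compared against the dataset written to the HDF5 file to confirm that the on-disk reduction matches expectations.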

6 Examples

library(BigDataStatMeth)

# Create test matrices
X1 <- matrix(1:100, 10, 10)
X2 <- matrix(101:200, 10, 10)
X3 <- matrix(201:300, 10, 10)

# Save to HDF5
fn <- "test.hdf5"
bdCreate_hdf5_matrix(fn, X1, "data", "matrix1",
                     overwriteFile = TRUE)
bdCreate_hdf5_matrix(fn, X2, "data", "matrix2",
                     overwriteFile = FALSE)
bdCreate_hdf5_matrix(fn, X3, "data", "matrix3",
                     overwriteFile = FALSE)

# Reduce datasets by addition
bdReduce_hdf5_dataset(
  filename = fn,
  group = "data",
  reducefunction = "+",
  outgroup = "results",
  outdataset = "sum_matrix",
  overwrite = TRUE
)

# Cleanup
if (file.exists(fn)) {
  file.remove(fn)
}

7 See Also