bdReduce_hdf5_dataset

HDF5_IO_MANAGEMENT

1 Description

Reduces multiple datasets within an HDF5 group using arithmetic operations (addition or subtraction).

2 Usage

bdReduce_hdf5_dataset(filename, group, reducefunction, outgroup = NULL, outdataset = NULL, overwrite = FALSE, remove = FALSE)

3 Arguments

Parameter	Description
`filename`	Character string. Path to the HDF5 file.
`group`	Character string. Path to the group containing datasets.
`reducefunction`	Character. Operation to apply, either "+" or "-".
`outgroup`	Character string (optional). Output group path. If NULL, uses input group.
`outdataset`	Character string (optional). Output dataset name. If NULL, uses input group name.
`overwrite`	Logical (optional). Whether to overwrite existing dataset. Default is FALSE.
`remove`	Logical (optional). Whether to remove source datasets after reduction. Default is FALSE.

4 Value

A list with the following components. If an error occurs, all string values are returned as empty strings (""):

fn: Character string with the HDF5 filename
ds: Character string with the full dataset path to the reduced dataset (group/dataset)
func: Character string with the reduction function applied

5 Details

This function provides efficient dataset reduction capabilities with:

- Operation options:
  - Addition of datasets
  - Subtraction of datasets
- Output options:
  - Custom output location
  - Configurable dataset name
  - Overwrite protection
- Implementation features:
  - Memory-efficient processing
  - Safe file operations
  - Optional source cleanup
  - Comprehensive error handling

The function processes datasets efficiently while maintaining data integrity.
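
The arithmetic can be pictured in memory with base R. This sketch assumes the reduction is element-wise and applied left to right across the datasets in the group. The order in which the function visits datasets is not stated here, and it matters for subtraction. The code illustrates the semantics only and is not the package's implementation:

```r
# Three small matrices standing in for the datasets in one group
X1 <- matrix(1:4, 2, 2)
X2 <- matrix(5:8, 2, 2)
X3 <- matrix(9:12, 2, 2)
datasets <- list(X1, X2, X3)

# Element-wise sum of all matrices: X1 + X2 + X3
sum_matrix <- Reduce(`+`, datasets)

# Left-to-right difference (assumed order): (X1 - X2) - X3
diff_matrix <- Reduce(`-`, datasets)

print(sum_matrix)
print(diff_matrix)
```

Unlike this in-memory version, the documented function reads its inputs from an HDF5 group and writes the result back to the file.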

6 Examples

library(BigDataStatMeth)

# Create test matrices
X1 <- matrix(1:100, 10, 10)
X2 <- matrix(101:200, 10, 10)
X3 <- matrix(201:300, 10, 10)

# Save to HDF5
fn <- "test.hdf5"
bdCreate_hdf5_matrix(fn, X1, "data", "matrix1",
                     overwriteFile = TRUE)
bdCreate_hdf5_matrix(fn, X2, "data", "matrix2",
                     overwriteFile = FALSE)
bdCreate_hdf5_matrix(fn, X3, "data", "matrix3",
                     overwriteFile = FALSE)

# Reduce datasets by addition
bdReduce_hdf5_dataset(
  filename = fn,
  group = "data",
  reducefunction = "+",
  outgroup = "results",
  outdataset = "sum_matrix",
  overwrite = TRUE
)

# Cleanup
if (file.exists(fn)) {
  file.remove(fn)
}

7 See Also

bdCreate_hdf5_matrix for creating HDF5 matrices
