# View current options
hdf5matrix_options()
# Enable parallelization with 8 threads
hdf5matrix_options(paral = TRUE, threads = 8)
# Set block size to 1000 elements
hdf5matrix_options(block_size = 1000)
# Disable compression for benchmarking
hdf5matrix_options(compression = 0)
# Reset to defaults
hdf5matrix_options(paral = NULL, threads = NULL, block_size = NULL, compression = NULL)
hdf5matrix_options
HDF5MATRIX_CORE
1 Description
Configure global settings for parallelization, block processing and compression in HDF5Matrix operations. These settings affect all HDF5Matrix computations unless explicitly overridden in individual method calls.
2 Usage
hdf5matrix_options(...)
3 Arguments
| Parameter | Description |
|---|---|
| paral | Logical or NULL. Enable OpenMP parallelization? |
| block_size | Integer or NULL. Number of elements per block for block-wise processing. |
| threads | Integer or NULL. Number of OpenMP threads to use. |
| compression | Integer (0-9) or NULL. gzip compression level for created datasets. |
4 Value
A list of all current options. When called with arguments, the list is returned invisibly; when called without arguments, it is returned visibly.
5 Details
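BigDataStatMeth achieves high performance through two key mechanisms, block-wise processing and OpenMP parallelization. A third setting, compression, trades speed for disk space: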
Block-wise processing: Large matrices are processed in chunks that fit in memory. The block_size parameter controls chunk size. Smaller blocks use less memory but require more I/O operations. Larger blocks are faster but require more RAM.
OpenMP parallelization: Operations are distributed across CPU cores. The paral and threads parameters control this. Parallelization can provide near-linear speedup for compute-intensive operations.
Compression: Datasets are created with gzip compression (level 6 by default). This reduces disk usage by 60-80% at the cost of additional CPU time for compression and decompression. For benchmarks or workflows where speed is critical, set compression = 0. For long-term storage or large datasets, keep the default.
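The memory versus I/O trade-off behind block_size can be illustrated with a minimal base R sketch. This is not the BigDataStatMeth implementation: blockwise_col_sums() is a hypothetical helper, the matrix lives in memory rather than in an HDF5 file, and the block size is counted in rows (a simplification of the elements-per-block setting described above).

```r
# Illustrative only: a base R stand-in for block-wise processing.
# blockwise_col_sums() is a hypothetical helper, not a package function.
blockwise_col_sums <- function(x, block_size) {
  stopifnot(is.matrix(x), nrow(x) >= 1, block_size >= 1)
  totals <- numeric(ncol(x))
  starts <- seq(1, nrow(x), by = block_size)
  for (start in starts) {
    end <- min(start + block_size - 1, nrow(x))
    # Only rows start..end are handled in this pass; with on-disk data
    # this would correspond to one read operation.
    block <- x[start:end, , drop = FALSE]
    totals <- totals + colSums(block)
  }
  totals
}

set.seed(1)
m <- matrix(rnorm(10000 * 5), nrow = 10000)

# Small blocks: 100 passes, each holding 100 rows at a time.
small_blocks <- blockwise_col_sums(m, block_size = 100)
# Large blocks: 2 passes, each holding 5000 rows at a time.
large_blocks <- blockwise_col_sums(m, block_size = 5000)
```

Both calls should agree with colSums(m) up to floating-point rounding. Only the number of passes and the amount of data held at once differ, which is the trade-off block_size controls.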
Priority: Options set here serve as defaults. Individual method calls can override them, for example: A$multiply(B, paral = TRUE, threads = 4, block_size = 2000)
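One way to picture this precedence rule is the base R sketch below. It is an assumption about the general pattern, not the package's internal code: global_opts and resolve_options() are hypothetical names, and the default values shown are made up for illustration (only compression level 6 comes from the text above).

```r
# Hypothetical stand-ins for the global option store and its resolver.
global_opts <- list(
  paral = FALSE,       # made-up default
  threads = 2L,        # made-up default
  block_size = 1000L,  # made-up default
  compression = 6L     # default level stated in this page
)

resolve_options <- function(defaults, ...) {
  overrides <- list(...)
  # Drop NULL overrides so that "not supplied" falls back to the default
  # (modifyList() would otherwise delete the entry).
  overrides <- overrides[!vapply(overrides, is.null, logical(1))]
  utils::modifyList(defaults, overrides)
}

# Per-call arguments win; anything not supplied keeps the global value.
effective <- resolve_options(global_opts,
                             paral = TRUE, threads = 4L, block_size = 2000L)

# A NULL argument is treated as "use the global setting".
fallback <- resolve_options(global_opts, threads = NULL)
```

Here effective carries the per-call values for paral, threads and block_size while compression stays at the global setting, mirroring how a single method call can override the defaults without changing them.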
Recommendations: