Computes selected eigenvalues, and optionally the matching eigenvectors, of a large matrix stored in an HDF5 file using the Spectra library. Results are consistent with the RSpectra package, and both symmetric and non-symmetric matrices are supported.
2 Usage
```r
bdEigen_hdf5(filename, group = NULL, dataset = NULL, k = NULL, which = NULL,
             ncv = NULL, bcenter = NULL, bscale = NULL, tolerance = NULL,
             max_iter = NULL, compute_vectors = NULL, overwrite = NULL,
             threads = NULL)
```
3 Arguments
| Parameter | Description |
|-----------|-------------|
| `filename` | Character string. Path to the HDF5 file containing the input matrix. |
| `group` | Character string. Path to the group containing the input dataset. |
| `dataset` | Character string. Name of the input dataset to decompose. |
| `k` | Integer. Number of eigenvalues to compute (default = 6, following Spectra convention). |
| `which` | Character string. Which eigenvalues to compute (default = "LM"). "LM": largest magnitude; "SM": smallest magnitude; "LR": largest real part (non-symmetric matrices); "SR": smallest real part (non-symmetric matrices); "LI": largest imaginary part (non-symmetric matrices); "SI": smallest imaginary part (non-symmetric matrices); "LA": largest algebraic (symmetric matrices); "SA": smallest algebraic (symmetric matrices). |
| `ncv` | Integer. Number of Arnoldi vectors (default = 0, auto-selected as `max(2*k+1, 20)`). |
| `bcenter` | Logical. If TRUE, centers the data by subtracting column means (default = FALSE). |
| `bscale` | Logical. If TRUE, scales the centered columns by their standard deviations (default = FALSE). |
| `tolerance` | Numeric. Convergence tolerance for Spectra algorithms (default = 1e-10). |
| `max_iter` | Integer. Maximum number of iterations for Spectra algorithms (default = 1000). |
| `compute_vectors` | Logical. If TRUE (default), computes both eigenvalues and eigenvectors. |
| `overwrite` | Logical. If TRUE, allows overwriting existing results (default = FALSE). |
| `threads` | Integer. Number of threads for parallel computation (default = NULL, uses available cores). |
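To make the `ncv` default and the difference between magnitude-based and algebraic selection concrete, here is a minimal sketch in base R. `default_ncv()` and `select_eigs()` are hypothetical helpers written only for illustration; they are not part of BigDataStatMeth or Spectra, and the sketch covers just the rules that apply to a real spectrum ("LM", "SM", "LA", "SA").

```r
# Hypothetical helper: the ncv rule quoted in the argument table.
default_ncv <- function(k) max(2 * k + 1, 20)

# Hypothetical helper: pick k values from a real spectrum by selection rule.
select_eigs <- function(values, k, which = c("LM", "SM", "LA", "SA")) {
  which <- match.arg(which)
  ord <- switch(which,
    LM = order(abs(values), decreasing = TRUE),  # largest magnitude first
    SM = order(abs(values)),                     # smallest magnitude first
    LA = order(values, decreasing = TRUE),       # largest signed value first
    SA = order(values)                           # smallest signed value first
  )
  values[ord[seq_len(k)]]
}

spectrum <- c(-5, 3, 1, -0.5)   # toy eigenvalues, chosen by hand

lm_vals <- select_eigs(spectrum, 2, "LM")   # -5  3
sm_vals <- select_eigs(spectrum, 2, "SM")   # -0.5  1
la_vals <- select_eigs(spectrum, 2, "LA")   #  3  1
sa_vals <- select_eigs(spectrum, 2, "SA")   # -5 -0.5

ncv_small <- default_ncv(6)    # 20, because 2*6+1 = 13 is below the floor of 20
ncv_large <- default_ncv(15)   # 31
```

Note how "LM" and "LA" disagree as soon as the spectrum contains a large negative eigenvalue, which is why the algebraic rules are listed separately for symmetric matrices.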
4 Value
List with components:
- `fn`: Character string with the HDF5 filename.
- `values`: Character string with the full dataset path (group/dataset) to the real part of the eigenvalues.
- `vectors`: Character string with the full dataset path (group/dataset) to the real part of the eigenvectors.
- `values_imag`: Character string with the full dataset path to the imaginary part of the eigenvalues, or NULL if all eigenvalues are real.
- `vectors_imag`: Character string with the full dataset path to the imaginary part of the eigenvectors, or NULL if all eigenvectors are real.
- `is_symmetric`: Logical indicating whether the matrix was detected as symmetric.
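Because real and imaginary parts are stored in separate datasets, a caller who wants complex eigenvalues has to recombine them. The sketch below shows one way to do that in base R. `combine_parts()` is a hypothetical helper, and the two numeric vectors are hand-written stand-ins for what would be read from the `values` and `values_imag` datasets.

```r
# Hypothetical helper: rebuild eigenvalues from the two stored parts.
# `im` may be NULL, mirroring a NULL `values_imag` entry in the result list.
combine_parts <- function(re, im = NULL) {
  if (is.null(im)) {
    return(as.numeric(re))   # purely real spectrum: keep a numeric vector
  }
  complex(real = as.numeric(re), imaginary = as.numeric(im))
}

# Stand-ins for data read from the HDF5 file (values chosen by hand)
re_part <- c(2.0, 0.5, 0.5)
im_part <- c(0.0, 1.5, -1.5)

lambda      <- combine_parts(re_part, im_part)  # complex vector of length 3
lambda_real <- combine_parts(re_part)           # numeric vector, no imaginary part
magnitudes  <- Mod(lambda)                      # the quantity that "LM" ranks by
```

The same pattern would apply to eigenvectors, with matrices in place of vectors.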
5 Details
This function uses the Spectra library (same as RSpectra) for eigenvalue computation, ensuring consistent results. Key features include:

- Automatic detection of symmetric vs non-symmetric matrices
- Support for both real and complex eigenvalues/eigenvectors
- Memory-efficient block-based processing for large matrices
- Parallel processing support
- Various eigenvalue selection criteria
- Consistent interface with RSpectra::eigs()

The implementation automatically:

- Detects matrix symmetry and uses the appropriate solver (SymEigsSolver vs GenEigsSolver)
- Handles complex eigenvalues for non-symmetric matrices
- Saves imaginary parts separately when non-zero
- Provides the same results as the RSpectra::eigs() function
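The symmetric versus general split described above can be illustrated with base R alone. This is only an analogy: `eig_dispatch()` is a hypothetical helper that uses `isSymmetric()` and `eigen()` on small in-memory matrices, and it says nothing about how the package itself tests symmetry on HDF5 data.

```r
# Hypothetical dispatcher: choose the symmetric or general path in base R.
eig_dispatch <- function(A) {
  sym <- isSymmetric(unname(A))
  e <- eigen(A, symmetric = sym)   # symmetric = TRUE guarantees real output
  list(values = e$values, vectors = e$vectors, is_symmetric = sym)
}

S <- matrix(c(2, 1, 1, 3), 2, 2)    # symmetric: eigenvalues are real
R <- matrix(c(0, 1, -1, 0), 2, 2)   # 90-degree rotation: eigenvalues are +/- 1i

res_s <- eig_dispatch(S)
res_r <- eig_dispatch(R)

is.complex(res_s$values)   # FALSE: no imaginary part to store
is.complex(res_r$values)   # TRUE: an imaginary part would need its own dataset
```

The rotation matrix shows why a non-symmetric real input can still need separate storage for imaginary parts: its eigenvalues form a complex conjugate pair.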
6 Examples
```r
library(BigDataStatMeth)
library(rhdf5)
library(RSpectra)

# Create a sample matrix (can be non-symmetric)
set.seed(123)
A <- matrix(rnorm(2500), 50, 50)

fn <- "test_eigen.hdf5"
bdCreate_hdf5_matrix_file(filename = fn, object = A, group = "data", dataset = "matrix")

# Compute eigendecomposition with BigDataStatMeth
res <- bdEigen_hdf5(fn, "data", "matrix", k = 6, which = "LM")

# Compare with RSpectra (should give same results)
rspectra_result <- eigs(A, k = 6, which = "LM")

# Extract results from HDF5
eigenvals_bd <- h5read(res$fn, res$values)
eigenvecs_bd <- h5read(res$fn, res$vectors)

# Compare eigenvalues (should be identical)
all.equal(eigenvals_bd, Re(rspectra_result$values), tolerance = 1e-12)

# For non-symmetric matrices, check imaginary parts
if (!is.null(res$values_imag)) {
  eigenvals_imag <- h5read(res$fn, res$values_imag)
  all.equal(eigenvals_imag, Im(rspectra_result$values), tolerance = 1e-12)
}

# Remove file
if (file.exists(fn)) {
  file.remove(fn)
}
```
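Comparing eigenvalues is straightforward, but eigenvectors are only defined up to a scalar factor, so a direct `all.equal()` on two eigenvector matrices can fail even when both are correct. The sketch below shows two checks that avoid this problem, using base R on a small symmetric matrix so it runs without an HDF5 file. `eigen_residual()` and `same_up_to_sign()` are hypothetical helpers, and the sign-only comparison assumes real, unit-norm eigenvectors with distinct eigenvalues.

```r
# Hypothetical helper: largest residual norm ||A v - lambda v|| over all pairs.
eigen_residual <- function(A, values, vectors) {
  resid <- A %*% vectors - vectors %*% diag(values, nrow = length(values))
  max(sqrt(colSums(Mod(resid)^2)))
}

# Hypothetical helper: compare two sets of real, unit-norm eigenvectors
# column by column, ignoring the sign of each column.
same_up_to_sign <- function(V1, V2, tol = 1e-8) {
  dots <- abs(colSums(V1 * V2))   # |<v1_j, v2_j>| equals 1 when they match
  all(abs(dots - 1) < tol)
}

set.seed(123)
M <- crossprod(matrix(rnorm(100), 10, 10))   # symmetric 10 x 10 test matrix
e <- eigen(M, symmetric = TRUE)

res_norm <- eigen_residual(M, e$values, e$vectors)   # rounding error only

flipped   <- e$vectors %*% diag(rep(c(1, -1), 5))    # flip every other column
match_ok  <- same_up_to_sign(e$vectors, flipped)          # TRUE
match_bad <- same_up_to_sign(e$vectors, e$vectors[, 10:1])  # FALSE: wrong order
```

A residual check of this kind is independent of any reference implementation, so it can complement a comparison against `RSpectra::eigs()`.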
7 See Also
- [bdSVD_hdf5](bdSVD_hdf5.html) for Singular Value Decomposition
- [bdPCA_hdf5](bdPCA_hdf5.html) for Principal Component Analysis
- `RSpectra::eigs` for the R equivalent function