bdEigen_hdf5

HDF5_ALGEBRA

1 Description

Computes a partial eigenvalue decomposition (the k requested eigenvalues and, optionally, their eigenvectors) of a large matrix stored in an HDF5 file, using the Spectra library. Results are consistent with the RSpectra package, and both symmetric and non-symmetric matrices are supported.

2 Usage

bdEigen_hdf5(filename, group = NULL, dataset = NULL, k = NULL, which = NULL, ncv = NULL, bcenter = NULL, bscale = NULL, tolerance = NULL, max_iter = NULL, compute_vectors = NULL, overwrite = NULL, threads = NULL)

3 Arguments

Parameter Description
filename Character string. Path to the HDF5 file containing the input matrix.
group Character string. Path to the group containing the input dataset.
dataset Character string. Name of the input dataset to decompose.
k Integer. Number of eigenvalues to compute (default = 6, following Spectra convention).
which Character string. Which eigenvalues to compute (default = “LM”):
  • “LM”: Largest magnitude
  • “SM”: Smallest magnitude
  • “LR”: Largest real part (non-symmetric matrices)
  • “SR”: Smallest real part (non-symmetric matrices)
  • “LI”: Largest imaginary part (non-symmetric matrices)
  • “SI”: Smallest imaginary part (non-symmetric matrices)
  • “LA”: Largest algebraic (symmetric matrices)
  • “SA”: Smallest algebraic (symmetric matrices)
ncv Integer. Number of Arnoldi vectors (default = 0, auto-selected as max(2*k+1, 20)).
bcenter Logical. If TRUE, centers the data by subtracting column means (default = FALSE).
bscale Logical. If TRUE, scales the centered columns by their standard deviations (default = FALSE).
tolerance Numeric. Convergence tolerance for Spectra algorithms (default = 1e-10).
max_iter Integer. Maximum number of iterations for Spectra algorithms (default = 1000).
compute_vectors Logical. If TRUE (default), computes both eigenvalues and eigenvectors.
overwrite Logical. If TRUE, allows overwriting existing results (default = FALSE).
threads Integer. Number of threads for parallel computation (default = NULL, uses available cores).
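
The selection rules for which and the automatic choice of ncv can be illustrated on a small, fully known spectrum. The sketch below is base R only and is not how the package selects eigenvalues internally; select_eigs() and auto_ncv() are hypothetical helpers written for this illustration, the selection keys are a literal reading of the descriptions in the table, and the ncv rule is the one stated above.

```r
# Illustrative only: rank a known set of eigenvalues by each `which` rule.
# select_eigs() and auto_ncv() are hypothetical helpers, not package functions.
select_eigs <- function(values, k, which = "LM") {
  key <- switch(which,
    LM = -Mod(values),          # largest magnitude first
    SM =  Mod(values),          # smallest magnitude first
    LR = , LA = -Re(values),    # largest real part / largest algebraic
    SR = , SA =  Re(values),    # smallest real part / smallest algebraic
    LI = -Im(values),           # literal reading of "largest imaginary part"
    SI =  Im(values),           # literal reading of "smallest imaginary part"
    stop("unknown 'which': ", which)
  )
  values[order(key)][seq_len(k)]
}

# ncv rule as stated in the arguments table: max(2*k+1, 20)
auto_ncv <- function(k) max(2 * k + 1, 20)

vals <- c(-5, -1, 0.5, 3)       # toy real spectrum
lm2 <- select_eigs(vals, 2, "LM")
sm2 <- select_eigs(vals, 2, "SM")
la2 <- select_eigs(vals, 2, "LA")
sa2 <- select_eigs(vals, 2, "SA")
```

On this toy spectrum “LM” and “LA” differ: the largest-magnitude pair includes -5, while the largest-algebraic pair does not. This is the practical reason to pick the criterion deliberately for matrices with negative eigenvalues.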

4 Value

List with components:

  • fn: Character string with the HDF5 filename
  • values: Character string with the full dataset path to the eigenvalues (real part) (group/dataset)
  • vectors: Character string with the full dataset path to the eigenvectors (real part) (group/dataset)
  • values_imag: Character string with the full dataset path to the eigenvalues (imaginary part), or NULL if all eigenvalues are real
  • vectors_imag: Character string with the full dataset path to the eigenvectors (imaginary part), or NULL if all eigenvectors are real
  • is_symmetric: Logical indicating whether the matrix was detected as symmetric
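
Because real and imaginary parts are stored in separate datasets, a caller who wants complex eigenvalues or eigenvectors has to recombine them. The following is a minimal base R sketch of that step; combine_parts() is a hypothetical helper, and the toy vectors stand in for what rhdf5::h5read() would return for the real and imaginary datasets.

```r
# Hypothetical helper: rebuild complex values from separately stored parts.
# `re` and `im` stand in for the arrays read from the real and imaginary
# datasets; `im` is NULL when no imaginary dataset was written.
combine_parts <- function(re, im = NULL) {
  if (is.null(im)) return(re)                 # purely real result
  stopifnot(identical(dim(re), dim(im)), length(re) == length(im))
  out <- complex(real = as.vector(re), imaginary = as.vector(im))
  dim(out) <- dim(re)                         # keep vector/matrix shape
  out
}

# Toy data mimicking a conjugate pair plus one real eigenvalue
re <- c(1.5, 1.5, -0.2)
im <- c(0.7, -0.7, 0)
lambda    <- combine_parts(re, im)
real_only <- combine_parts(re)
```

With an actual result list, the same helper could be applied to h5read(res$fn, res$values) and, when res$values_imag is not NULL, h5read(res$fn, res$values_imag).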

5 Details

This function uses the Spectra library (same as RSpectra) for eigenvalue computation, ensuring consistent results. Key features include:

  • Automatic detection of symmetric vs non-symmetric matrices
  • Support for both real and complex eigenvalues/eigenvectors
  • Memory-efficient block-based processing for large matrices
  • Parallel processing support
  • Various eigenvalue selection criteria
  • Consistent interface with RSpectra::eigs()
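
Iterative eigensolvers of this kind need only matrix-vector products, which is what makes block-based processing possible. The sketch below shows the general idea in base R and should not be read as the package's actual implementation: block_matvec() and read_rows() are hypothetical names, and read_rows() slices an in-memory matrix where a real implementation would read a row slab from the HDF5 file.

```r
# Illustrative sketch: compute y = A %*% x while holding only `block_rows`
# rows of A at a time. read_rows() is a hypothetical stand-in for reading a
# slab of rows from disk; here it simply slices an in-memory matrix.
block_matvec <- function(read_rows, n, x, block_rows = 16L) {
  y <- numeric(n)
  for (start in seq(1L, n, by = block_rows)) {
    idx <- start:min(start + block_rows - 1L, n)
    y[idx] <- read_rows(idx) %*% x   # only this slab is needed in memory
  }
  y
}

set.seed(1)
A <- matrix(rnorm(50 * 50), 50, 50)
x <- rnorm(50)
y_block <- block_matvec(function(idx) A[idx, , drop = FALSE], nrow(A), x)
```

The block size trades memory for the number of reads: larger slabs mean fewer reads but a larger working set.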

The implementation automatically:

  • Detects matrix symmetry and uses the appropriate solver (SymEigsSolver vs GenEigsSolver)
  • Handles complex eigenvalues for non-symmetric matrices
  • Saves imaginary parts separately when they are non-zero
  • Provides the same results as the RSpectra::eigs() function
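
The symmetry-based dispatch can be pictured with a small in-memory example. This is only a sketch under stated assumptions: choose_solver() is a hypothetical helper, the tolerance is base R's isSymmetric() default rather than anything the package documents, and the Spectra solver names are returned purely as labels.

```r
# Illustrative sketch of the dispatch described above, on in-memory matrices.
# choose_solver() is a hypothetical helper; the tolerance is an assumption.
choose_solver <- function(A, tol = 100 * .Machine$double.eps) {
  if (isSymmetric(A, tol = tol)) "SymEigsSolver" else "GenEigsSolver"
}

set.seed(42)
G <- matrix(rnorm(25), 5, 5)   # generic random matrix: not symmetric
S <- crossprod(G)              # t(G) %*% G is symmetric by construction

solver_G <- choose_solver(G)
solver_S <- choose_solver(S)

# A symmetric real matrix has real eigenvalues, so no imaginary part
# would need to be stored for it.
sym_values_are_real <- !is.complex(eigen(S)$values)
```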

6 Examples

library(BigDataStatMeth)
library(rhdf5)
library(RSpectra)

# Create a sample matrix (can be non-symmetric)
set.seed(123)
A <- matrix(rnorm(2500), 50, 50)

fn <- "test_eigen.hdf5"
bdCreate_hdf5_matrix_file(filename = fn, object = A, group = "data", dataset = "matrix")

# Compute eigendecomposition with BigDataStatMeth
res <- bdEigen_hdf5(fn, "data", "matrix", k = 6, which = "LM")

# Compare with RSpectra (should give the same results)
rspectra_result <- eigs(A, k = 6, which = "LM")

# Extract results from HDF5
eigenvals_bd <- h5read(res$fn, res$values)
eigenvecs_bd <- h5read(res$fn, res$vectors)

# Compare eigenvalues (should agree to within the given tolerance)
all.equal(eigenvals_bd, Re(rspectra_result$values), tolerance = 1e-12)

# For non-symmetric matrices, check imaginary parts
if (!is.null(res$values_imag)) {
  eigenvals_imag <- h5read(res$fn, res$values_imag)
  all.equal(eigenvals_imag, Im(rspectra_result$values), tolerance = 1e-12)
}

# Remove file
if (file.exists(fn)) {
  file.remove(fn)
}
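
Beyond comparing against RSpectra, a computed decomposition can be checked directly through the residual of each eigenpair, A v - lambda v. The sketch below is base R only and uses eigen() as a stand-in for the values and vectors that would be read back from the HDF5 file; eig_residuals() is a hypothetical helper, not a package function.

```r
# Hypothetical helper: relative residual ||A v - lambda v|| / ||v|| for each
# eigenpair. Works for real or complex values/vectors.
eig_residuals <- function(A, values, vectors) {
  AV <- A %*% vectors
  LV <- sweep(vectors, 2, values, `*`)   # scale column j by values[j]
  sqrt(colSums(Mod(AV - LV)^2)) / sqrt(colSums(Mod(vectors)^2))
}

set.seed(123)
A <- matrix(rnorm(2500), 50, 50)

# Stand-in for the solver output: the 6 eigenpairs of largest magnitude
e   <- eigen(A)
top <- order(Mod(e$values), decreasing = TRUE)[1:6]
res_norm <- eig_residuals(A, e$values[top], e$vectors[, top, drop = FALSE])
max(res_norm)   # should be close to machine precision
```

For output of bdEigen_hdf5, the values and vectors arguments would be the arrays read with h5read(), recombined with their imaginary parts when those datasets exist.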

7 See Also

  • bdSVD_hdf5 for Singular Value Decomposition
  • bdPCA_hdf5 for Principal Component Analysis
  • RSpectra::eigs for the R equivalent function