\donttest{
csv_file <- tempfile(fileext = ".csv")
hdf5_file <- tempfile(fileext = ".h5")
# Write sample numeric data
write.table(matrix(rnorm(50), nrow = 10, ncol = 5),
            csv_file, sep = ",", row.names = FALSE, col.names = TRUE)
# Import CSV to HDF5
mat <- hdf5_import(
  source = csv_file,
  filename = hdf5_file,
  dataset = "raw/data",
  sep = ","
)
dim(mat)
hdf5_close_all()
unlink(c(csv_file, hdf5_file))
}
hdf5_import
HDF5MATRIX_CORE
1 Description
Modern wrapper for importing CSV, TSV, or other delimited text files into HDF5 format. Returns an HDF5Matrix object ready for use.
2 Usage
hdf5_import(...)
3 Arguments
| Parameter | Description |
|---|---|
| source | Character. Path to a local file or URL to import. Supports compressed files (.gz, .tar.gz, .zip, .bz2). |
| filename | Character. Path to the HDF5 output file (created if it does not exist). |
| dataset | Character. Full dataset path (e.g., “data/imported” or “group/dataset”). |
| sep | Character. Field separator. By default, auto-detected from the file extension: “,” for .csv, “\t” for .tsv, “\t” otherwise. |
| header | Logical or character vector. If TRUE, the first row contains column names. If a character vector, its values are used as column names. Default . |
| rownames | Logical or character vector. If TRUE, the first column contains row names. If a character vector, its values are used as row names. Default . |
| overwrite | Logical. If TRUE, overwrite the dataset if it already exists. Default . |
| parallel | Logical. Use parallel processing for the import. Default . |
| threads | Integer. Number of threads for parallel processing. By default, uses all available cores. |
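The separator auto-detection described for sep can be sketched in base R. This is only an illustration of the documented rule, not the package's code: guess_sep is a hypothetical helper, and it looks only at the final file extension, so how hdf5_import treats names such as data.csv.gz is not covered here.

```r
# Hypothetical helper (not part of the package): mirrors the documented
# rule "," for .csv, "\t" for .tsv, and "\t" for anything else.
guess_sep <- function(path) {
  ext <- tolower(tools::file_ext(path))  # final extension, without the dot
  if (ext == "csv") "," else "\t"
}

guess_sep("counts.csv")  # returns ","
guess_sep("counts.tsv")  # returns "\t"
guess_sep("counts.txt")  # returns "\t"
```

Passing sep explicitly, as in the example call, avoids relying on the extension at all.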
4 Value
An HDF5Matrix object pointing to the imported data.
5 Details
This function is a modern, user-friendly wrapper around bdImportData_hdf5 and bdImportTextFile_hdf5.
Supported formats: CSV, TSV, and other delimited text files, either plain or compressed (.gz, .tar.gz, .zip, .bz2).
Memory efficiency: Import is done in a streaming fashion, so very large files can be imported without loading them entirely into memory.
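To make the idea of streaming concrete, here is a minimal base R sketch that reads a delimited file a few rows at a time and keeps only running totals in memory. It is not the implementation used by hdf5_import or bdImportData_hdf5; stream_col_sums and chunk_rows are hypothetical names chosen for illustration, and the sketch assumes a numeric file with a header row.

```r
# Hypothetical helper: accumulate column sums chunk by chunk, so only
# `chunk_rows` rows are held in memory at any time.
stream_col_sums <- function(path, sep = ",", chunk_rows = 4L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # The first line holds the column names
  header <- scan(text = readLines(con, n = 1L), what = character(),
                 sep = sep, quiet = TRUE)
  sums <- numeric(length(header))
  n_rows <- 0L
  repeat {
    lines <- readLines(con, n = chunk_rows)  # next chunk of raw lines
    if (length(lines) == 0L) break           # end of file reached
    chunk <- as.matrix(read.table(text = lines, sep = sep, header = FALSE))
    sums <- sums + colSums(chunk)
    n_rows <- n_rows + nrow(chunk)
  }
  list(n_rows = n_rows, col_sums = setNames(sums, header))
}

# Usage on a small temporary file
tmp <- tempfile(fileext = ".csv")
m <- matrix(1:20, nrow = 10, ncol = 2, dimnames = list(NULL, c("a", "b")))
write.table(m, tmp, sep = ",", row.names = FALSE, col.names = TRUE,
            quote = FALSE)
res <- stream_col_sums(tmp, chunk_rows = 4L)
res$n_rows  # 10
unlink(tmp)
```

A real importer would write each chunk to the HDF5 dataset instead of summing it, but the memory profile is the same: it depends on the chunk size rather than on the file size.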
6 Examples
7 See Also
bdImportData_hdf5 for the underlying implementation; hdf5_create_matrix for creating matrices from R objects.