\donttest{
hdf5_file <- tempfile(fileext = ".h5")
csv_file <- tempfile(fileext = ".csv")
# Create a test CSV file
data <- matrix(rnorm(100), 10, 10)
write.csv(data, csv_file, row.names = FALSE)
# Import to HDF5
bdImportTextFile_hdf5(
  filename = csv_file,
  outputfile = hdf5_file,
  outGroup = "data",
  outDataset = "matrix1",
  sep = ",",
  header = TRUE,
  overwriteFile = TRUE
)
# Cleanup
unlink(c(csv_file, hdf5_file))
}
bdImportTextFile_hdf5
HDF5_IO_MANAGEMENT
1 Description
Converts a text file (e.g., CSV, TSV) to HDF5 format, providing efficient storage and access capabilities.
2 Usage
bdImportTextFile_hdf5(filename, outputfile, outGroup, outDataset, sep = NULL, header = FALSE, rownames = FALSE, overwrite = FALSE, paral = NULL, threads = NULL, overwriteFile = NULL)
3 Arguments
| Parameter | Description |
|---|---|
| `filename` | Character string. Path to the input text file. |
| `outputfile` | Character string. Path to the output HDF5 file. |
| `outGroup` | Character string. Name of the group to create in the HDF5 file. |
| `outDataset` | Character string. Name of the dataset to create. |
| `sep` | Character string (optional). Field separator, default is `"\t"`. |
| `header` | Logical (optional). Whether the first row contains column names. |
| `rownames` | Logical (optional). Whether the first column contains row names. |
| `overwrite` | Logical (optional). Whether to overwrite an existing dataset. |
| `paral` | Logical (optional). Whether to use parallel processing. |
| `threads` | Integer (optional). Number of threads for parallel processing. |
| `overwriteFile` | Logical (optional). Whether to overwrite an existing HDF5 file. |
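As an illustration of how `sep`, `header`, and `rownames` fit together, the following sketch imports a tab-separated file that has a header row and no row names. It assumes the function is exported by the BigDataStatMeth package (this page does not name the package), so the call is guarded and skipped when that package is not installed. The file names, group name, and dataset name are arbitrary.

```r
# Sketch only: the package name BigDataStatMeth is an assumption.
tsv_file  <- tempfile(fileext = ".tsv")
hdf5_file <- tempfile(fileext = ".h5")

# Write a small tab-separated file with a header row and no row names
m <- matrix(seq_len(20), nrow = 5, ncol = 4,
            dimnames = list(NULL, paste0("col", 1:4)))
write.table(m, tsv_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = TRUE)

# First line of the file, i.e. the header the importer will see
header_line <- readLines(tsv_file, n = 1)

if (requireNamespace("BigDataStatMeth", quietly = TRUE)) {
  BigDataStatMeth::bdImportTextFile_hdf5(
    filename      = tsv_file,
    outputfile    = hdf5_file,
    outGroup      = "data",
    outDataset    = "tsv_matrix",
    sep           = "\t",   # matches the separator used by write.table above
    header        = TRUE,   # first row holds column names
    rownames      = FALSE,  # no row-name column in this file
    overwriteFile = TRUE
  )
}

# Cleanup
unlink(c(tsv_file, hdf5_file))
```

Passing `sep` explicitly keeps the import settings aligned with how the file was written, rather than relying on the default.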
4 Value
List with components:
- fn: Character string with the HDF5 filename
- ds: Character string with the full dataset path to the imported data (group/dataset)
- ds_rows: Character string with the full dataset path to the row names
- ds_cols: Character string with the full dataset path to the column names
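The returned list can be inspected to find out where the imported data ended up. The sketch below prints whichever of the documented components are present in the result. As before, the BigDataStatMeth package name is an assumption, and the call is skipped when that package is not available.

```r
# Sketch only: the package name BigDataStatMeth is an assumption.
csv_file  <- tempfile(fileext = ".csv")
hdf5_file <- tempfile(fileext = ".h5")
write.csv(matrix(rnorm(20), 5, 4), csv_file, row.names = FALSE)

# Component names as listed in the Value section
expected_fields <- c("fn", "ds", "ds_rows", "ds_cols")

if (requireNamespace("BigDataStatMeth", quietly = TRUE)) {
  res <- BigDataStatMeth::bdImportTextFile_hdf5(
    filename      = csv_file,
    outputfile    = hdf5_file,
    outGroup      = "data",
    outDataset    = "matrix1",
    sep           = ",",
    header        = TRUE,
    overwriteFile = TRUE
  )
  # Print each documented component that is present in the result
  for (field in intersect(expected_fields, names(res))) {
    cat(field, ":", res[[field]], "\n")
  }
}

# Cleanup
unlink(c(csv_file, hdf5_file))
```

Keeping the result in a variable means later steps can refer to `res$fn` and `res$ds` instead of repeating the file and dataset paths by hand.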
5 Details
This function provides flexible text file import capabilities with support for:
- Input format options:
  - Custom field separators
  - Header row handling
  - Row names handling
- Processing options:
  - Parallel processing
  - Memory-efficient import
  - Configurable thread count
- File handling:
  - Safe file operations
  - Overwrite protection
  - Comprehensive error handling
The function supports parallel processing for large files and provides memory-efficient import capabilities.
6 Examples
7 See Also
hdf5_create_matrix for creating HDF5 matrices directly