split_dataset

AGGREGATIONS

1 Description

Splits an HDF5Matrix into equal-sized sub-matrices, each stored as a separate dataset in the same HDF5 file.

Output datasets are named <out_group>/<out_dataset>.0, <out_group>/<out_dataset>.1, and so on, where the numeric suffix is a 0-based block index.

Exactly one of n_blocks or block_size must be provided.
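
As a rough illustration of the naming scheme and the either/or rule above, the following base R sketch computes block boundaries and dataset names for one dimension. The helper plan_blocks is hypothetical and not part of the package, and the out_group and out_dataset values are placeholders. Letting the last block be shorter when the dimension does not divide evenly is an assumption, since the documentation only describes equal-sized blocks.

```r
# Hypothetical helper (not part of the package): plan how a dimension of
# length `n` would be cut into blocks and how the outputs would be named.
plan_blocks <- function(n, n_blocks = NULL, block_size = NULL,
                        out_group = "blocks", out_dataset = "M") {
  # Exactly one of n_blocks / block_size may be supplied.
  if (is.null(n_blocks) == is.null(block_size)) {
    stop("Provide exactly one of 'n_blocks' or 'block_size'.")
  }
  # Derive the block size from the block count when needed.
  if (is.null(block_size)) {
    block_size <- ceiling(n / n_blocks)
  }
  starts <- seq(1L, n, by = block_size)
  # Assumption: a trailing partial block is allowed and simply ends at n.
  ends <- pmin(starts + block_size - 1L, n)
  data.frame(
    # The suffix is the 0-based block index.
    name  = paste0(out_group, "/", out_dataset, ".", seq_along(starts) - 1L),
    start = starts,
    end   = ends,
    stringsAsFactors = FALSE
  )
}

plan <- plan_blocks(6L, n_blocks = 3L)
print(plan)
```

Calling plan_blocks(6L) with neither argument, or with both, stops with an error, which mirrors the rule that exactly one of the two must be given.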

2 Usage

split_dataset(...)

3 Arguments

Parameter Description
x An HDF5Matrix.
n_blocks Integer or NULL. Number of blocks.
block_size Integer or NULL. Rows or columns per block.
bycols Logical. Split by columns (TRUE) or rows (default FALSE).
out_group Character. Output HDF5 group (default ).
out_dataset Character or NULL. Base dataset name.
overwrite Logical. Overwrite existing blocks (default ).
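
To make the effect of bycols concrete, here is an in-memory analogue that uses only base R on an ordinary matrix. It does not call split_dataset or touch an HDF5 file. The helper split_matrix is hypothetical, and it assumes the split dimension divides evenly by block_size.

```r
# Hypothetical in-memory analogue (base R only): split an ordinary matrix
# into blocks of `block_size` rows or columns.
split_matrix <- function(m, block_size, bycols = FALSE) {
  n <- if (bycols) ncol(m) else nrow(m)
  stopifnot(n %% block_size == 0)   # assumption: evenly divisible
  # Block index for every row/column: 0, 0, ..., 1, 1, ...
  grp <- (seq_len(n) - 1L) %/% block_size
  idx <- split(seq_len(n), grp)     # list names are "0", "1", ...
  lapply(idx, function(i) {
    if (bycols) m[, i, drop = FALSE] else m[i, , drop = FALSE]
  })
}

m <- matrix(1:60, 6, 10)
by_rows <- split_matrix(m, block_size = 2L)                 # 3 blocks of 2 x 10
by_cols <- split_matrix(m, block_size = 5L, bycols = TRUE)  # 2 blocks of 6 x 5
```

Binding the row blocks back together with rbind reproduces the original matrix, which is a quick way to check that a split lost nothing.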

4 Value

A named list of HDF5Matrix objects, one per block.

5 Examples

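\donttest{
# Create a temporary HDF5 file holding a 6 x 10 matrix
tmp  <- tempfile(fileext = ".h5")
M    <- hdf5_create_matrix(tmp, "data/M", data = matrix(1:60, 6, 10))

# Split by rows (the default) into 3 blocks
blks <- split_dataset(M, n_blocks = 3L)
length(blks)

# Close every block and the source matrix, then delete the file
lapply(blks, close)
close(M)
unlink(tmp)
}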

6 See Also

cbind.HDF5Matrix (in package BigDataStatMeth)