---
title: "bdSort_hdf5_dataset"
subtitle: "bdSort_hdf5_dataset"
---

<span class="category-badge hdf5_io_management">HDF5_IO_MANAGEMENT</span>

## Description

Sorts a dataset in an HDF5 file based on a predefined ordering specified through a list of sorting blocks.

## Usage

```r
bdSort_hdf5_dataset(filename, group, dataset, outdataset, blockedSortlist, func, outgroup = NULL, overwrite = FALSE)
```

## Arguments

::: {.param-table}

| Parameter | Description |
|-----------|-------------|
| `filename` | Character string. Path to the HDF5 file. |
| `group` | Character string. Path to the group containing input dataset. |
| `dataset` | Character string. Name of the dataset to sort. |
| `outdataset` | Character string. Name for the sorted dataset. |
| `blockedSortlist` | List of data frames. Each data frame specifies the sorting order for a block of elements. See Details for structure. |
| `func` | Character string. Function to apply: "sortRows" for row-wise sorting, "sortCols" for column-wise sorting. |
| `outgroup` | Character string (optional). Output group path. If NULL, uses input group. |
| `overwrite` | Logical (optional). Whether to overwrite existing dataset. Default is FALSE. |

:::

## Value

::: {.return-value}
List with components. If an error occurs, all string values are returned as empty strings (""):

- **`fn`**: Character string with the HDF5 filename
- **`ds`**: Character string with the full dataset path to the sorted dataset (group/dataset)
:::

## Details

This function provides efficient dataset sorting capabilities with:

- Sorting options:
  - Row-wise sorting
  - Column-wise sorting
  - Block-based processing
- Implementation features:
  - Memory-efficient processing
  - Block-based operations
  - Safe file operations
  - Progress reporting

The sorting order is specified through a list of data frames, where each data frame represents a block of elements to be sorted. Each data frame must contain:

- Row names (current identifiers)
- chr (new identifiers)
- order (current positions)
- newOrder (target positions)

Example sorting blocks structure:

Block 1 (maintaining order):

| Row name | chr | order | newOrder | Diagonal |
|----------|-----|-------|----------|----------|
| TCGA-OR-A5J1 | TCGA-OR-A5J1 | 1 | 1 | 1 |
| TCGA-OR-A5J2 | TCGA-OR-A5J2 | 2 | 2 | 1 |
| TCGA-OR-A5J3 | TCGA-OR-A5J3 | 3 | 3 | 1 |
| TCGA-OR-A5J4 | TCGA-OR-A5J4 | 4 | 4 | 1 |

Block 2 (reordering with new identifiers):

| Row name | chr | order | newOrder | Diagonal |
|----------|-----|-------|----------|----------|
| TCGA-OR-A5J5 | TCGA-OR-A5JA | 10 | 5 | 1 |
| TCGA-OR-A5J6 | TCGA-OR-A5JB | 11 | 6 | 1 |
| TCGA-OR-A5J7 | TCGA-OR-A5JC | 12 | 7 | 0 |
| TCGA-OR-A5J8 | TCGA-OR-A5JD | 13 | 8 | 1 |

Block 3 (reordering with identifier swaps):

| Row name | chr | order | newOrder | Diagonal |
|----------|-----|-------|----------|----------|
| TCGA-OR-A5J9 | TCGA-OR-A5J5 | 5 | 9 | 1 |
| TCGA-OR-A5JA | TCGA-OR-A5J6 | 6 | 10 | 1 |
| TCGA-OR-A5JB | TCGA-OR-A5J7 | 7 | 11 | 1 |
| TCGA-OR-A5JC | TCGA-OR-A5J8 | 8 | 12 | 1 |
| TCGA-OR-A5JD | TCGA-OR-A5J9 | 9 | 13 | 0 |

In this example:

- Block 1 maintains the original order
- Block 2 assigns new identifiers (A5JA-D) to elements
- Block 3 swaps identifiers between elements
- The Diagonal column indicates whether the element is on the diagonal (1) or not (0)

## Examples

```{r}
#| eval: false
#| code-fold: show
library(BigDataStatMeth)

# Create test data
data <- matrix(rnorm(100), 10, 10)
rownames(data) <- paste0("TCGA-OR-A5J", 1:10)

# Save to HDF5
fn <- "test.hdf5"
bdCreate_hdf5_matrix(fn, data, "data", "matrix1",
                     overwriteFile = TRUE)

# Create sorting blocks
block1 <- data.frame(
  chr = paste0("TCGA-OR-A5J", c(2, 1, 3, 4)),
  order = 1:4,
  newOrder = c(2, 1, 3, 4),
  row.names = paste0("TCGA-OR-A5J", 1:4)
)

block2 <- data.frame(
  chr = paste0("TCGA-OR-A5J", c(6, 5, 8, 7)),
  order = 5:8,
  newOrder = c(6, 5, 8, 7),
  row.names = paste0("TCGA-OR-A5J", 5:8)
)

# Sort dataset
bdSort_hdf5_dataset(
  filename = fn,
  group = "data",
  dataset = "matrix1",
  outdataset = "matrix1_sorted",
  blockedSortlist = list(block1, block2),
  func = "sortRows"
)

# Cleanup
if (file.exists(fn)) {
  file.remove(fn)
}
```

## See Also

::: {.see-also}
- [bdCreate_hdf5_matrix](bdCreate_hdf5_matrix.html) for creating HDF5 matrices
:::
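As a supplement to the block structure described in Details, the following sketch rebuilds Block 1 and Block 2 of that example as plain data frames, using base R only. The helper `check_sort_block` is hypothetical and not part of BigDataStatMeth; it only verifies that the columns listed in Details are present. Whether the `Diagonal` column is required by `bdSort_hdf5_dataset` is not stated in this page, so it is included here simply because it appears in the example tables.

```r
# Hypothetical helper (NOT part of BigDataStatMeth): checks that a sorting
# block carries the columns listed in Details (chr, order, newOrder).
check_sort_block <- function(block) {
  required <- c("chr", "order", "newOrder")
  missing_cols <- setdiff(required, names(block))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Block 1 from Details: identifiers and positions are unchanged.
block1 <- data.frame(
  chr = paste0("TCGA-OR-A5J", 1:4),
  order = 1:4,
  newOrder = 1:4,
  Diagonal = c(1, 1, 1, 1),
  row.names = paste0("TCGA-OR-A5J", 1:4),
  stringsAsFactors = FALSE
)

# Block 2 from Details: new identifiers (A5JA to A5JD), positions 10:13 -> 5:8.
block2 <- data.frame(
  chr = paste0("TCGA-OR-A5J", c("A", "B", "C", "D")),
  order = 10:13,
  newOrder = 5:8,
  Diagonal = c(1, 1, 0, 1),
  row.names = paste0("TCGA-OR-A5J", 5:8),
  stringsAsFactors = FALSE
)

# Collect the blocks in the list form expected by `blockedSortlist`.
blocked_sort_list <- list(block1, block2)

# Validate every block before passing the list to the sorting function.
invisible(lapply(blocked_sort_list, check_sort_block))
```

The resulting `blocked_sort_list` could then be supplied as the `blockedSortlist` argument, in the same way as `list(block1, block2)` in the Examples section.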