Wide-to-tall Data Reshaping Using Regular Expressions and the nc Package

Research output: Contribution to journal › Article › peer-review

Abstract

Regular expressions are powerful tools for extracting tables from non-tabular text data. Capturing regular expressions that describe the information to extract from column names can be especially useful when reshaping a data table from wide (few rows with many regularly named columns) to tall (fewer columns with more rows). We present the R package nc (short for named capture), which provides functions for wide-to-tall data reshaping using regular expressions. We describe the main new ideas of nc, and provide detailed comparisons with related R packages (stats, utils, data.table, tidyr, tidyfast, tidyfst, reshape2, cdata).
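The core idea of the abstract, using capture groups to pull variables out of regularly named columns while reshaping, can be illustrated with a minimal sketch. The code below is plain Python and is not the nc API. The function name `melt_by_pattern`, the sample rows, and the group names `part` and `dim` are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical wide table: one row per flower, with regularly named columns.
wide_rows = [
    {"Species": "setosa", "Sepal.Length": 5.1, "Sepal.Width": 3.5,
     "Petal.Length": 1.4, "Petal.Width": 0.2},
    {"Species": "versicolor", "Sepal.Length": 7.0, "Sepal.Width": 3.2,
     "Petal.Length": 4.7, "Petal.Width": 1.4},
]

# Named capture groups describe what to extract from each column name.
COLUMN_PATTERN = re.compile(r"(?P<part>[^.]+)[.](?P<dim>[^.]+)")


def melt_by_pattern(rows, pattern, value_name="value"):
    """Reshape wide rows to tall rows (illustrative helper, not from nc).

    Columns whose names fully match `pattern` are melted; each named group
    becomes an output column. All other columns are copied as identifiers.
    """
    tall = []
    for row in rows:
        id_part = {k: v for k, v in row.items()
                   if pattern.fullmatch(k) is None}
        for name, value in row.items():
            match = pattern.fullmatch(name)
            if match is None:
                continue  # identifier column, already copied into id_part
            tall.append({**id_part, **match.groupdict(), value_name: value})
    return tall


tall_rows = melt_by_pattern(wide_rows, COLUMN_PATTERN)
for tall_row in tall_rows:
    print(tall_row)
```

Each of the two wide rows has four matching columns, so the tall result has eight rows, each carrying `Species`, `part`, `dim`, and `value`.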

Original language: English (US)
Pages (from-to): 69-82
Number of pages: 14
Journal: R Journal
Volume: 13
Issue number: 1
State: Published - 2021
Externally published: Yes

ASJC Scopus subject areas

  • Statistics and Probability
  • Numerical Analysis
  • Statistics, Probability and Uncertainty
