---
title: "Introduction to dataSDA"
author: "Po-Wei Chen"
date: "2023-06-13"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to dataSDA}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup, include = FALSE}
# Chunk defaults for the whole vignette
knitr::opts_chunk$set(
  echo = TRUE,
  warning = FALSE,
  message = FALSE,
  out.width = "100%",
  fig.align = "center"
)
library(knitr)
library(dataSDA)
library(RSDA)
```

## Example of Interval-Valued Data

### Built-in Data

```{r}
data(mushroom)
head(mushroom)
```

### Set variable format

Converts the set variables in the data to the format expected by RSDA.

+ `data` = the conventional data frame
+ `location` = the column position of the set variable in the data
+ `var` = the name of the set variable in the data

```{r}
# Column 8 holds the set variable "Species"
mushroom.set <- set_variable_format(data = mushroom, location = 8, var = "Species")
head(mushroom.set)
```

### RSDA format

Converts the data to the format expected by RSDA.

+ `data` = the conventional data frame
+ `sym_type1` = labels for the variables given in `location`: `I` for an interval variable, `S` for a set variable
+ `location` = the column positions of the variables labelled by `sym_type1`
+ `sym_type2` = labels for the variables given in `var`: `I` for an interval variable, `S` for a set variable
+ `var` = the names of the variables labelled by `sym_type2`

```{r}
mushroom.tmp <- RSDA_format(data = mushroom.set,
                            sym_type1 = c("I", "S"),
                            location = c(25, 31),
                            sym_type2 = c("S", "I", "I"),
                            var = c("Species", "Stipe.Length_min", "Stipe.Thickness_min"))
head(mushroom.tmp)
```

### Clean the column names

Cleans up the variable names so that they conform to the RSDA format.

+ `data` = the conventional data frame

```{r}
mushroom.clean <- clean_colnames(data = mushroom.tmp)
head(mushroom.clean)
```

### Write a symbolic data table to a CSV file

```{r, eval = FALSE}
write_csv_table(data = mushroom.clean, file = 'mushroom_interval.csv')
```

### Read the symbolic data table and check the format

```{r}
# read.sym.table() comes from RSDA; the file uses ';' as the field separator
mushroom.int <- read.sym.table(file = 'mushroom_interval.csv', header = TRUE, sep = ';', dec = '.', row.names = 1)
head(mushroom.int)
```

## Example of iGAP Format Data

### Built-in Data

```{r}
data(Abalone.iGAP)
head(Abalone.iGAP)
```

### iGAP to MM

Converts data in iGAP format to MM format.

+ `data` = the conventional data frame (iGAP format)
+ `location` = the column positions of the symbolic variables in the data

```{r}
# Columns 1 to 7 are the symbolic variables
Abalone <- iGAP_to_MM(data = Abalone.iGAP, location = 1:7)
head(Abalone)
```
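
To make the conversion above more concrete, the following base R sketch applies the same idea to a toy data frame. It assumes that an iGAP column stores each interval as a single `"min,max"` string and that the MM layout uses one `_min` and one `_max` column per variable. The helper `split_intervals()` and the `toy` data are hypothetical and are not part of dataSDA; the package's own implementation may differ.

```{r}
# Toy data (made up): each interval is one "min,max" string, an assumed iGAP-like layout
toy <- data.frame(
  Length = c("0.28,0.66", "0.30,0.75"),
  Height = c("0.09,0.18", "0.10,0.24"),
  row.names = c("F-10", "M-12"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not a dataSDA function: splits each column listed in
# `location` into numeric <name>_min and <name>_max columns
split_intervals <- function(data, location) {
  out <- data[, -location, drop = FALSE]  # keep the non-interval columns
  for (j in location) {
    parts <- strsplit(data[[j]], ",", fixed = TRUE)
    name <- names(data)[j]
    out[[paste0(name, "_min")]] <- as.numeric(vapply(parts, `[`, character(1), 1))
    out[[paste0(name, "_max")]] <- as.numeric(vapply(parts, `[`, character(1), 2))
  }
  out
}

toy.mm <- split_intervals(toy, location = c(1, 2))
toy.mm
```

The sketch appends the new columns after any non-interval columns, so the column order of its result need not match the output of `iGAP_to_MM()`.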