Function to determine the duplicated patterns in an input data matrix

Description

dcDuplicated determines the duplicated patterns in a matrix or data frame, where each pattern is a column-wise or row-wise vector. It returns an integer vector in which each value is the index of the pattern that the corresponding pattern duplicates.

Usage

dcDuplicated(data, pattern.wise = c("column", "row"), verbose = T)

Arguments

data
an input data matrix or data frame
pattern.wise
a character string specifying how patterns are defined from the input data. It can be 'column' for column-wise vectors or 'row' for row-wise vectors
verbose
a logical indicating whether messages are displayed on the screen. By default, it is set to TRUE (messages are displayed)

Value

an integer vector in which each entry is the index of the pattern that it duplicates. For column-wise (or row-wise) patterns, the returned integer vector has the same length as the number of columns (or rows) of the input data.
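To make the meaning of the returned vector concrete, the following is a minimal base-R sketch that produces the same kind of index vector. It is not the package's implementation: the helper name first_occurrence is hypothetical, and the approach (collapsing each pattern into a string key and calling match) is an assumption chosen only for illustration.

```r
# Hypothetical helper, NOT part of the package: for every pattern, return the
# index of the first pattern identical to it.
first_occurrence <- function(data, pattern.wise = c("column", "row")) {
  pattern.wise <- match.arg(pattern.wise)
  margin <- if (pattern.wise == "column") 2 else 1
  # Collapse each column (or row) into a single string key.
  # Assumes the values themselves never contain a tab character.
  keys <- apply(as.matrix(data), margin, paste, collapse = "\t")
  # match() gives the position of the first occurrence of each key
  unname(match(keys, keys))
}

# Same matrix as in the Examples section
data <- cbind(C1 = c(0, 1, 1, 1, 0, 0),
              C2 = c(0, 0, 0, 0, 1, 1),
              C3 = c(0, 1, 1, 1, 0, 0),
              C4 = c(1, 0, 0, 0, 1, 1))

by_col <- first_occurrence(data, "column")  # 1 2 1 4
by_row <- first_occurrence(data, "row")     # 1 2 2 2 5 5
```

An entry equal to its own position marks a pattern seen for the first time. Any other value points back to the earlier pattern it repeats.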

Note

none

Examples

# an input data matrix storing discrete states for tips (in rows) X four characters (in columns)
data1 <- matrix(c(0,rep(1,3),rep(0,2)), ncol=1)
data2 <- matrix(c(rep(0,4),rep(1,2)), ncol=1)
data3 <- matrix(c(1,rep(0,3),rep(1,2)), ncol=1)
data <- cbind(data1, data2, data1, data3)
colnames(data) <- c("C1", "C2", "C3", "C4")
data
     C1 C2 C3 C4
[1,]  0  0  0  1
[2,]  1  0  1  0
[3,]  1  0  1  0
[4,]  1  0  1  0
[5,]  0  1  0  1
[6,]  0  1  0  1
# determine the duplicated patterns from input data matrix
res <- dcDuplicated(data, pattern.wise="column")
Merge 4 patterns (2015-07-23 12:50:53) ...
Sort 4 patterns (2015-07-23 12:50:53) ...
Find index from 4 patterns (2015-07-23 12:50:53) ...
For the input data (with 4 patterns), there are 3 unique 'column'-wise patterns (2015-07-23 12:50:53)
## return an integer vector
res
[1] 1 2 1 4
## get index for unique patterns
ind <- sort(unique(res))
## As seen above, the returned integer vector shows that there are 3 unique patterns:
## they are in columns (1, 2, 4). Column 3 is a duplicate of column 1.
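As a possible follow-up, the index vector can be used to keep only the unique columns and to list which columns share a pattern. This sketch uses base R only and hard-codes res to the value printed in the example, so that it runs without the package. The names data_unique and groups are illustrative.

```r
# Same matrix as in the example
data <- cbind(C1 = c(0, 1, 1, 1, 0, 0),
              C2 = c(0, 0, 0, 0, 1, 1),
              C3 = c(0, 1, 1, 1, 0, 0),
              C4 = c(1, 0, 0, 0, 1, 1))

# Value returned by dcDuplicated in the example, copied here by hand
res <- c(1L, 2L, 1L, 4L)

# Indices of the unique patterns
ind <- sort(unique(res))

# Keep one representative column per pattern (drop = FALSE keeps a matrix)
data_unique <- data[, ind, drop = FALSE]

# Group column names by the pattern they duplicate
groups <- split(colnames(data), res)
```

Here data_unique holds columns C1, C2 and C4, and groups shows that C1 and C3 share one pattern.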