How to Cluster Sequential Categorical Data in R


Consider a data set in which users can choose among 3 activities, and I have data on each user's choice for their first 10 activities. Example data:

for (i in 1:10) {
  # sample 1000 values from a list of 3 strings using set probabilities
  x <- sample(c("a", "b", "c"), 1000, replace = TRUE, prob = c(0.5, 0.3, 0.2))
  # assign to a variable created on the fly (cat1, cat2, ..., cat10)
  assign(paste("cat", i, sep = ""), x)
}

# one row per user, one column per activity position
first10 <- data.frame(cat1, cat2, cat3, cat4, cat5, cat6, cat7, cat8, cat9, cat10)

What's the best approach in R to cluster users according to their activity sequence?

I've looked around on Stack Overflow, and similar questions ask how to cluster categorical data in R (which is part of this analysis), but that in and of itself doesn't account for the sequential nature of the data. Are there R packages well-suited to this analysis?

Look into frequent itemset mining instead of clustering.

Most clustering methods are designed for continuous numerical data, and assume a vector space. They take every position into account.
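To see why taking every position into account can hurt here, consider a small sketch with two made-up sequences (`s1` and `s2` are hypothetical, not taken from the example data). Both contain the same run "a b c", but shifted by one step, so a position-by-position comparison such as the Hamming distance treats them as almost entirely different:

```r
# Two hypothetical activity sequences that share the run "a b c",
# shifted by one position.
s1 <- c("a", "b", "c", "a", "a")
s2 <- c("c", "a", "b", "c", "a")

# Position-wise (Hamming) distance: number of positions that differ.
hamming <- sum(s1 != s2)

# Pattern view: does each sequence contain the contiguous run "abc"?
has_abc <- c(
  s1 = grepl("abc", paste(s1, collapse = ""), fixed = TRUE),
  s2 = grepl("abc", paste(s2, collapse = ""), fixed = TRUE)
)

hamming
has_abc
```

A distance of this kind only sees that four of the five positions disagree, while the shared pattern shows up as soon as you look for it directly.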

A frequent pattern, however, may be just a part of a sequence, a sequence may exhibit multiple (or none) of these patterns, and the patterns may have gaps in between. All of these properties are desirable.
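As a rough illustration of the idea, here is a minimal base-R sketch that counts how many users exhibit each contiguous pattern of a given length. The helper `pattern_support` and the `toy` data are made up for this example and are not part of any package. It only handles contiguous patterns; patterns with gaps need a proper sequence-mining algorithm, for which dedicated CRAN packages such as arulesSequences are probably a better starting point.

```r
# Hypothetical helper (not from any package): for each contiguous pattern
# of length n, compute its support, i.e. the share of users whose
# sequence contains that pattern at least once.
pattern_support <- function(seqs, n = 3) {
  seqs <- as.matrix(seqs)                 # rows = users, columns = positions
  starts <- seq_len(ncol(seqs) - n + 1)   # start index of every window
  per_user <- lapply(seq_len(nrow(seqs)), function(u) {
    windows <- vapply(
      starts,
      function(k) paste(seqs[u, k:(k + n - 1)], collapse = ""),
      character(1)
    )
    unique(windows)                       # count each pattern once per user
  })
  counts <- table(unlist(per_user))
  sort(counts / nrow(seqs), decreasing = TRUE)
}

# Small made-up data set: 4 users, 4 activities each.
toy <- rbind(
  c("a", "b", "c", "a"),
  c("a", "b", "c", "c"),
  c("c", "a", "b", "a"),
  c("b", "b", "b", "b")
)

support  <- pattern_support(toy, n = 2)
frequent <- support[support >= 0.5]       # keep patterns seen in >= 50% of users
frequent
```

On the question's data this could be called as `pattern_support(first10, n = 3)`. Users could then be grouped by which frequent patterns they exhibit, and one user may fall into several groups or none.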

