R - How to check the consistency of specific variables for different IDs recorded repeatedly in a data frame


I would like to know how I can detect whether individuals that were captured repeatedly have the same value in specific variables across the different measures.

Specifically, I have repeated measures of individuals (column id) with the values of different variables over time (e.g. sex, weight).

I want to check that each individual is assigned the same sex every time, taking the last measure as the reference because that measure is the most reliable.

Later, I want to store every row (register) that does not match its reference in one data frame.

    id <- c("1", "2", "3", "1", "2", "3", "1", "2", "3")
    sex <- c("m", "f", "m", "m", "m", "m", "f", "f", "m")
    weight <- c(20, 15, 30, 22, 18, 32, 26, 21, 36)
    time <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
    df <- data.frame(id, sex, weight, time)
    df

To do that, I have selected the last register of each id:

    library(data.table)
    dt <- as.data.table(df)
    # last row of each id (rows are already ordered by time)
    dt_last_register <- dt[, .SD[.N], by = id]
    dt_last_register

Then I want to create a loop that, for each id, selects the registers that do not match and stores them in a new data frame (e.g. df_no_match):

    # create a vector of ids
    id_vector <- unique(df$id)

    # loop over the ids
    for (i in seq_along(id_vector)) {
      x <- id_vector[i]
      df_subset <- subset(df, id == x)  # select the registers of one individual
      ...
      ...
    }

I don't know how to continue from this step, that is, how to check the registers of each individual. Does anyone know how to do it?

Finally, I want to change the value of the variable sex in the registers that haven't matched the reference, and store the database with the changes in a new data frame, e.g. df_final:

    id <- c("1", "2", "3", "1", "2", "3", "1", "2", "3")
    sex <- c("f", "f", "m", "f", "f", "m", "f", "f", "m")
    weight <- c(20, 15, 30, 22, 18, 32, 26, 21, 36)
    time <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
    df_final <- data.frame(id, sex, weight, time)
    df_final

Thanks in advance.

I'm not 100% clear on the goal, but this seems to work. The key is self-merging the data.table.

    library(data.table)
    setDT(df)

    # get the gender of the final observation for each id
    df[df[, sex[.N], by = id], recent_sex := i.V1, on = "id"]

    # find if there are any mismatches within each id
    df[, mismatch := any(recent_sex != sex), by = id]

    # overwrite the erroneous genders
    df[, sex_new := recent_sex]
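Note that sex[.N] takes the last row of each id, so it only gives the latest measure when the rows are already ordered by time within each id, as they are in the example. A minimal sketch for the case where that ordering cannot be assumed (dt_check and the shuffled row order are made up for illustration) picks the row with the largest time instead:

```r
library(data.table)

# Same example data as in the question
dt_check <- data.table(
  id     = c("1", "2", "3", "1", "2", "3", "1", "2", "3"),
  sex    = c("m", "f", "m", "m", "m", "m", "f", "f", "m"),
  weight = c(20, 15, 30, 22, 18, 32, 26, 21, 36),
  time   = c(1, 1, 1, 2, 2, 2, 3, 3, 3)
)

# Deliberately break the time ordering (hypothetical shuffle)
dt_check <- dt_check[c(7, 2, 9, 4, 8, 3, 1, 5, 6)]

# Reference sex = value at the latest time of each id, whatever the row order
dt_check[, recent_sex := sex[which.max(time)], by = id]
dt_check
```

If two rows of one id share the same maximum time, which.max() keeps the first of them, so ties would need a rule of their own.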

If you want to separate out the mismatched observations, you can do:

    df_mismatches <- df[(mismatch)]

(Note that the parentheses are necessary to force [.data.table to interpret mismatch as a logical vector; otherwise it expects mismatch to be a data.table that we're merging with df.)
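Because mismatch is computed with any() by id, it flags every row of an id that has at least one inconsistent register. If only the individual rows that disagree with the reference are wanted, plus the corrected table from the question, a sketch along these lines might do. It reuses the self-merge from above and the names df_no_match and df_final from the question; dt is a fresh copy of the example data, and the rows are assumed to be ordered by time within each id:

```r
library(data.table)

# Example data from the question
dt <- data.table(
  id     = c("1", "2", "3", "1", "2", "3", "1", "2", "3"),
  sex    = c("m", "f", "m", "m", "m", "m", "f", "f", "m"),
  weight = c(20, 15, 30, 22, 18, 32, 26, 21, 36),
  time   = c(1, 1, 1, 2, 2, 2, 3, 3, 3)
)

# Reference sex: last register of each id, merged back onto every row
dt[dt[, sex[.N], by = id], recent_sex := i.V1, on = "id"]

# Only the rows whose sex differs from the reference (row level, not per id)
df_no_match <- as.data.frame(dt[sex != recent_sex, .(id, sex, weight, time)])

# Corrected copy: sex replaced by the reference, helper column dropped
df_final <- as.data.frame(dt[, .(id, sex = recent_sex, weight, time)])

df_no_match
df_final
```

With the example data, df_no_match should hold the two earlier registers of id 1 and the second register of id 2, and df_final should match the table shown in the question.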

