R - dplyr having trouble redefining type with group_by()
I have the following problem:
When I use dplyr to mutate a numeric column after group_by(), the mutate command fails if a group contains only one value and that value is NaN.
If the grouped column contains numbers, it is correctly classified as dbl. As soon as one group contains only a NaN, it fails because dplyr defines that group as lgl while all the other groups are dbl.
My first (and more general) question is: is there a way to tell dplyr, when using group_by(), to always define a column in a certain way?
Secondly, can someone help me with a hack for the problem explained in the MWE below:
library(dplyr)

# Error: this produces a column that triggers the error mentioned:
df <- data_frame(a = c(rep(letters[1:2], 4), "c"),
                 g = c(rep(letters[5:7], 3)),
                 x = c(7, 8, 3, 5, 9, 2, 4, 7, 8)) %>% tbl_df()
df <- df %>% group_by(a) %>% mutate_each(funs(sd(., na.rm = TRUE)), x)
df <- df %>% mutate(winsorise = ifelse(x > 2, 2, x))

# No error (as no group has a single entry that is NaN):
df2 <- data_frame(a = c(rep(letters[1:2], 4), "c"),
                  g = c(rep(letters[5:7], 3)),
                  x = c(7, 8, 3, 5, 9, 2, 4, 7, 8)) %>% tbl_df()
df2 <- df2 %>% group_by(a) %>% mutate_each(funs(sd(., na.rm = TRUE)), x)
# Update the group for the row with NA - works
df2[9, 1] <- "a"
df2 <- df2 %>% mutate(winsorise = ifelse(x > 3, 3, x))

# Reason for the error: it happens for groups with 1 member = NaN,
# although we want the winsorise column to be dbl, not lgl:
df3 <- data_frame(g = "a", x = NaN)
df3 <- df3 %>% mutate(winsorise = ifelse(x > 3, 3, x))
The reason is, as you rightly pointed out with df3, that the mutate result is cast to logical when the source column is NaN/NA.
To circumvent this, cast the answer to numeric:
df <- df %>% mutate(winsorise = as.numeric(ifelse(x>2,2,x)))
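A minimal base-R sketch of why the cast above helps, not taken from the original post. It assumes the logical type comes from ifelse() itself: when the test is NA, ifelse() returns a logical NA regardless of the types of the other arguments. The names x_single and x_group are hypothetical stand-ins for the x values of a one-member group and an ordinary group. pmin() is shown only as one possible type-stable alternative for capping values, not as the answerer's method.

# sd() of a single value is a numeric (double) NA
x_single <- sd(8)
# hypothetical values for a group with several members
x_group <- c(0.5, 2.9, 3.4)

# The test `x_single > 2` is NA, so ifelse() returns a logical NA
type_single <- typeof(ifelse(x_single > 2, 2, x_single))
# With a non-missing test the result is numeric
type_group <- typeof(ifelse(x_group > 2, 2, x_group))

# Wrapping the result in as.numeric() gives a double in both cases
type_cast <- typeof(as.numeric(ifelse(x_single > 2, 2, x_single)))

# pmin() caps values at 2 and keeps the input type, so NA stays a double
capped_single <- pmin(x_single, 2)
capped_group <- pmin(x_group, 2)

print(c(type_single, type_group, type_cast, typeof(capped_single)))
print(capped_group)

Because every group then yields a double, the per-group results have the same type when dplyr combines them.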
Perhaps @hadley could shed some light on why the mutate result is cast to lgl?