dplyr - R: adding together previous rows in a dataframe
df
   patient.id index.admission.  adm_date dish_date bi
 1        124            FALSE  2/7/2009  2/8/2009  0
 2        124             TRUE  3/5/2009 3/15/2009  1
 3        124            FALSE  4/5/2011  4/7/2011  0
 4        124            FALSE 3/25/2012 3/27/2012  0
 5        124             TRUE  5/5/2012 5/20/2012  1
 6        124             TRUE  9/8/2013 9/15/2013  1
 7        124            FALSE  1/5/2014 1/15/2014  0
 8        233            FALSE  1/1/2010  1/8/2010  0
 9        233            FALSE  1/1/2011  1/5/2011  0
10        233             TRUE  2/2/2011 2/25/2011  1
11        233            FALSE 1/25/2012 1/28/2012  0
12        542             TRUE  3/5/2015 3/15/2015  1
13       1243             TRUE  2/5/2009  2/8/2009  1
14       1243             TRUE  2/5/2011 2/19/2011  1
I need to create a new column that adds up bi as a running total, grouped by patient.
My data should look like this:
   patient.id index.admission.  adm_date dish_date bi num_index_ad
 1        124            FALSE  2/7/2009  2/8/2009  0            0
 2        124             TRUE  3/5/2009 3/15/2009  1            1
 3        124            FALSE  4/5/2011  4/7/2011  0            1
 4        124            FALSE 3/25/2012 3/27/2012  0            1
 5        124             TRUE  5/5/2012 5/20/2012  1            2
 6        124             TRUE  9/8/2013 9/15/2013  1            3
 7        124            FALSE  1/5/2014 1/15/2014  0            3
 8        233            FALSE  1/1/2010  1/8/2010  0            0
 9        233            FALSE  1/1/2011  1/5/2011  0            0
10        233             TRUE  2/2/2011 2/25/2011  1            1
11        233            FALSE 1/25/2012 1/28/2012  0            1
12        542             TRUE  3/5/2015 3/15/2015  1            1
13       1243             TRUE  2/5/2009  2/8/2009  1            1
14       1243             TRUE  2/5/2011 2/19/2011  1            2
Using dplyr, I have:
df1 <- df %>%
  group_by(patient.id) %>%
  for (i in df) {
    mutate(num_index_ad = bi[lag(i),] + bi[i,])
  }
This gives the error: "Error in .subset2(x, i, exact = exact) : subscript out of bounds"
Thanks in advance:
> dput(df)
structure(list(patient.id = c(124L, 124L, 124L, 124L, 124L, 124L, 124L, 233L, 233L, 233L, 233L, 542L, 1243L, 1243L),
    index.admission. = c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE),
    adm_date = structure(c(8L, 10L, 12L, 9L, 13L, 14L, 4L, 1L, 2L, 5L, 3L, 11L, 6L, 7L),
        .Label = c("1/1/2010", "1/1/2011", "1/25/2012", "1/5/2014", "2/2/2011", "2/5/2009", "2/5/2011", "2/7/2009", "3/25/2012", "3/5/2009", "3/5/2015", "4/5/2011", "5/5/2012", "9/8/2013"),
        class = "factor"),
    dish_date = structure(c(7L, 8L, 11L, 10L, 12L, 13L, 1L, 4L, 3L, 6L, 2L, 9L, 7L, 5L),
        .Label = c("1/15/2014", "1/28/2012", "1/5/2011", "1/8/2010", "2/19/2011", "2/25/2011", "2/8/2009", "3/15/2009", "3/15/2015", "3/27/2012", "4/7/2011", "5/20/2012", "9/15/2013"),
        class = "factor"),
    bi = c(0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1)),
    .Names = c("patient.id", "index.admission.", "adm_date", "dish_date", "bi"),
    row.names = c(NA, -14L), class = "data.frame")
I didn't find a general dupe, so here are some additional solutions:
df$num_index_ad <- with(df, ave(bi, patient.id, FUN = cumsum))
or
library(dplyr)
df %>%
  group_by(patient.id) %>%
  mutate(num_index_ad = cumsum(bi))
or
library(data.table)
setDT(df)[, num_index_ad := cumsum(bi), by = patient.id]
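As a quick sanity check, here is a minimal sketch that rebuilds only the patient.id and bi columns from the question and confirms that a grouped cumulative sum reproduces the desired num_index_ad column. The `expected` vector is an illustrative name, with values copied by hand from the desired output in the question. A running total depends on row order, so this assumes the rows are already sorted by admission date within each patient, as they appear to be in the sample data.

```r
# Only the two columns the calculation needs, taken from the question's data.
df <- data.frame(
  patient.id = c(124L, 124L, 124L, 124L, 124L, 124L, 124L,
                 233L, 233L, 233L, 233L, 542L, 1243L, 1243L),
  bi = c(0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1)
)

# Running total of bi within each patient (base R, no packages required).
df$num_index_ad <- ave(df$bi, df$patient.id, FUN = cumsum)

# Desired column, copied from the expected output in the question.
expected <- c(0, 1, 1, 1, 2, 3, 3, 0, 0, 1, 1, 1, 1, 2)

# Stops with an error if the result does not match the desired output.
stopifnot(isTRUE(all.equal(df$num_index_ad, expected)))
```

If the real data is not in chronological order, it would need to be sorted within each patient before taking the cumulative sum.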