Modify a column of dates using the R data.table package
I have a data file with roughly 1.7 million rows, and it grows weekly. I'm trying to use R to create a script that summarizes quality based on our performance over time (product age plays heavily into this) and trends our problems in the field. I considered dplyr with read.csv() versus data.table with fread(). The speed difference is driving me toward data.table, but I'm struggling with the syntax.
The data is stored in a CSV file using date codes such as 201501 (January 2015) or 20150127 (January 27, 2015). I'm trying to convert these codes to standard dates so that I can calculate product age (manufacture date to date of service call). For example, I want to change 201601 to 2016-01-31.
I tried the following using the zoo package. I expected a date but got the original 6-digit code back, and my laptop was running for quite a while on this.
    dt <- dt[, mfrdate := as.Date(as.yearmon(as.character(mfrdate), "%Y%m"), frac = 1)]

I searched Google and the data.table cheat sheet and thought I must be approaching this the wrong way; set() appears to be the correct way to do it. So I tried the following:

    set(dt, i = .N, j = "mfrdate", value = as.Date(as.yearmon(as.character(dt[,2]), "%Y%m"), frac = 1))

I get the following error:

    Error in set(dt, i = .N, j = "mfrdate", value = as.Date(as.yearmon(as.character(dt[, :
      i[1] 1821628 out of range [1,nrow=1761094].

I thought i = .N was incorrect, so I took it out and mistakenly ran the command before changing anything else. It ran without warnings or errors, but it changed the whole column to NAs. I'm missing something.
Any help is appreciated.
    > sessionInfo()
    R version 3.2.3 (2015-12-10)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 7 x64 (build 7601) Service Pack 1

    locale:
    [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252  LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C                           LC_TIME=English_United States.1252

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] RevoUtilsMath_3.2.3

    loaded via a namespace (and not attached):
    [1] tools_3.2.3
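Your first syntax works as expected on this example. The converted dates are written to a separate column (named mfr_date here) so that the original codes stay available for the second pass:

    require(data.table)
    require(zoo)
    require(stringr)

    dt <- data.table(r = c(1, 2, 3), mfrdate = c(200101, 20010228, 200103))

    # 6-digit codes (YYYYMM): convert to the last day of the month
    dt <- dt[str_length(mfrdate) == 6, mfr_date := as.Date(as.yearmon(as.character(mfrdate), "%Y%m"), frac = 1)]
    # 8-digit codes (YYYYMMDD): as.yearmon keeps only year and month, so this also yields the month end
    dt <- dt[str_length(mfrdate) == 8, mfr_date := as.Date(as.yearmon(as.character(mfrdate), "%Y%m%d"), frac = 1)]

    head(dt)
       r  mfrdate   mfr_date
    1: 1   200101 2001-01-31
    2: 2 20010228 2001-02-28
    3: 3   200103 2001-03-31

So the error you reported is most likely linked to incorrect data in your dataset.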
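If the day in an 8-digit code should be kept (so that 20150127 becomes 2015-01-27 rather than the month end), a base R helper along the following lines might be a starting point. This is only a sketch: parse_date_code is a hypothetical name that does not appear in the question, and it assumes that every valid code is exactly 6 digits (YYYYMM) or 8 digits (YYYYMMDD). Anything else comes back as NA, which also gives a quick way to spot the bad rows.

```r
# Hypothetical helper (not from the original post): convert YYYYMM or
# YYYYMMDD date codes, numeric or character, to Date.
parse_date_code <- function(code) {
  code <- as.character(code)
  n <- nchar(code)
  out <- rep(as.Date(NA), length(code))

  # 8-digit codes carry a full date, so parse them directly.
  is8 <- !is.na(n) & n == 8
  out[is8] <- as.Date(code[is8], format = "%Y%m%d")

  # 6-digit codes: build the first day of the month, jump into the next
  # month (+31 days always lands there), snap back to its first day,
  # then subtract one day to get the month end.
  is6 <- !is.na(n) & n == 6
  first <- as.Date(paste0(code[is6], "01"), format = "%Y%m%d")
  next_first <- as.Date(format(first + 31, "%Y-%m-01"), format = "%Y-%m-%d")
  out[is6] <- next_first - 1

  out  # codes of any other length, or with an invalid month/day, stay NA
}

codes <- c(201601, 20150127, 200102, 201613)  # the last one has month 13
dates <- parse_date_code(codes)
bad   <- codes[is.na(dates)]                  # codes that failed to parse
```

With a data.table like dt above, the helper could be applied by reference with dt[, mfr_date := parse_date_code(mfrdate)], and dt[is.na(mfr_date)] would then list the rows whose codes did not parse.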