r - How to do a data.table rolling join? -


i have 2 data tables i'm trying merge. 1 data on company market values through time , other company dividend history through time. i'm trying find out how each company has paid each quarter , put value next market value data through time.

library(magrittr) library(data.table) library(zoo) library(lubridate)  set.seed(1337) # data table of company  market values companies <-      data.table(companyid = 1:10,                sedol = rep(c("91772e", "7a662b"), each = 5),                date = (as.date("2005-04-01") + months(seq(0, 12, 3))) - days(1),                mktcap = c(100 + cumsum(rnorm(5,5)),                           50 + cumsum(rnorm(5,1,5)))) %>%     setkey(sedol, date)  # data table of dividends dividends <-      data.table(divid = 1:7,                sedol = c(rep('91772e', each = 4), rep('7a662b', each = 3)),                date = as.date(c('2004-11-19', '2005-01-13', '2005-01-29',                                 '2005-10-01', '2005-06-29', '2005-06-30',                                 '2006-04-17')),                divamnt = rnorm(7, .8, .3)) %>%     setkey(sedol, date) 

i believe situation use data.table rolling join, like:

dividends[companies, roll = "nearest"] 

to try , dataset looks

       divid  sedol       date   divamnt companyid    mktcap     1:    na 7a662b       <na>        na         6  61.21061     2:     5 7a662b 2005-06-29 0.7772631         7  66.92951     3:     6 7a662b 2005-06-30 1.1815343         7  66.92951     4:    na 7a662b       <na>        na         8  78.33914     5:    na 7a662b       <na>        na         9  88.92473     6:    na 7a662b       <na>        na        10  87.85067     7:     2 91772e 2005-01-13 0.2964291         1 105.19249     8:     3 91772e 2005-01-29 0.8472649         1 105.19249     9:    na 91772e       <na>        na         2 108.74579    10:     4 91772e 2005-10-01 1.2467408         3 113.42261    11:    na 91772e       <na>        na         4 120.04491    12:    na 91772e       <na>        na         5 124.35588 

(note i've matched dividends company market values exact quarter)

but i'm not sure how execute it. cran pdf rather vague number or should if roll value (can pass dates? number quantify days forward carry? number of obersvations?) , changing rollends around doesn't seem me want.

in end, ended mapping dividend dates quarter end , joining on that. solution, not useful if end needing know how perform rolling joins. in answer, describe situation rolling joins solution me understand how perform them?

instead of rolling join, may want use overlap join foverlaps function of data.table:

# create interval in 'companies' datatable companies[, `:=` (start = compdate - days(90), end = compdate + days(15))] # create second date in 'dividends' datatable dividends[, date2 := divdate]  # set keys 2 datatable setkey(companies, sedol, start, end) setkey(dividends, sedol, ddate, date2)  # create vector of columnnames can removed afterwards deletecols <- c("date2","start","end")  # perform overlap join , remove helper columns res <- foverlaps(companies, dividends)[, (deletecols) := null] 

the result:

> res      sedol divid    divdate   divamnt companyid   compdate    mktcap  1: 7a662b    na       <na>        na         6 2005-03-31  61.21061  2: 7a662b     5 2005-06-29 0.7772631         7 2005-06-30  66.92951  3: 7a662b     6 2005-06-30 1.1815343         7 2005-06-30  66.92951  4: 7a662b    na       <na>        na         8 2005-09-30  78.33914  5: 7a662b    na       <na>        na         9 2005-12-31  88.92473  6: 7a662b    na       <na>        na        10 2006-03-31  87.85067  7: 91772e     2 2005-01-13 0.2964291         1 2005-03-31 105.19249  8: 91772e     3 2005-01-29 0.8472649         1 2005-03-31 105.19249  9: 91772e    na       <na>        na         2 2005-06-30 108.74579 10: 91772e     4 2005-10-01 1.2467408         3 2005-09-30 113.42261 11: 91772e    na       <na>        na         4 2005-12-31 120.04491 12: 91772e    na       <na>        na         5 2006-03-31 124.35588 

used data (the same in question, without creation of keys):

set.seed(1337) companies <- data.table(companyid = 1:10, sedol = rep(c("91772e", "7a662b"), each = 5),                         compdate = (as.date("2005-04-01") + months(seq(0, 12, 3))) - days(1),                         mktcap = c(100 + cumsum(rnorm(5,5)), 50 + cumsum(rnorm(5,1,5)))) dividends <- data.table(divid = 1:7, sedol = c(rep('91772e', each = 4), rep('7a662b', each = 3)),                         divdate = as.date(c('2004-11-19','2005-01-13','2005-01-29','2005-10-01','2005-06-29','2005-06-30','2006-04-17')),                         divamnt = rnorm(7, .8, .3)) 

Comments

Popular posts from this blog

php - Wordpress website dashboard page or post editor content is not showing but front end data is showing properly -

How to get the ip address of VM and use it to configure SSH connection dynamically in Ansible -

javascript - Get parameter of GET request -