Let's have one irregular time series table, e.g. buy/sell transactions and resulting positions on stocks (here Apple):
library(data.table)
txns = data.table(Instrument=rep("AAPL", 3),
Date=as.Date(c("2014-05-10","2014-05-15", "2014-05-20")),
Txn.Qty=c(100, -20, -30), #
key=c("Instrument", "Date"))
txns[, Pos.Qty:=cumsum(Txn.Qty), by="Instrument"]
txns
# Instrument Date Txn.Qty Pos.Qty
# 1: AAPL 2014-05-10 100 100
# 2: AAPL 2014-05-15 -20 80
# 3: AAPL 2014-05-20 -30 50
Then we want to track positions (and possibly daily valuation) in a second, mostly regular time-series table:
positions = data.table(Instrument=rep("AAPL", 13),
Date=as.Date(setdiff(as.Date("2014-05-07") + 1:15,
txns$Date[1:2])), # gaps are possible
key=c("Instrument", "Date"))
out = merge(positions, txns, all=T)
out[!txns, Txn.Qty:=0]
out[, Pos.Qty:=zoo::na.locf(Pos.Qty, na.rm=F), by=Instrument] # `zoo` dependency
out
# Instrument Date Txn.Qty Pos.Qty
# 1: AAPL 2014-05-08 0 NA
# 2: AAPL 2014-05-09 0 NA
# 3: AAPL 2014-05-10 100 100
# 4: AAPL 2014-05-11 0 100
# 5: AAPL 2014-05-12 0 100
# 6: AAPL 2014-05-13 0 100
# 7: AAPL 2014-05-14 0 100
# 8: AAPL 2014-05-15 -20 80
# 9: AAPL 2014-05-16 0 80
# 10: AAPL 2014-05-17 0 80
# 11: AAPL 2014-05-18 0 80
# 12: AAPL 2014-05-19 0 80
# 13: AAPL 2014-05-20 -30 50
# 14: AAPL 2014-05-21 0 50
# 15: AAPL 2014-05-22 0 50
I was hoping to speed up the above (i.e. roll last observation forward in some columns, fill non-joins with zeros for other columns) with a single, efficient join and roll argument, performing selectively only on some of the columns (similar to .SDcols philosophy).
Is it possible to achieve something like this using data.table's roll argument on selected columns only?
Also, additional argument fill in [.data.table would help, equivalent to na.fill (we can think about roll corresponding to na.locf)