combinations - How to omit reciprocal pairs of words in a data.frame using R
I have been searching for an answer online but cannot seem to find anything close.
I have a set of tickers and used expand.grid() to find all combinations of them:
    # tickers
    a <- c("air", "afap", "aal", "cece", "asa", "avx")
    # find combinations
    b <- expand.grid(a, a, stringsAsFactors = FALSE)
I want to omit the reciprocals. For example, row 2 and row 7 below are reciprocals of each other, and I want to keep only one of the two combinations, not both.
    head(b, 10)
       Var1 Var2
    1   air  air
    2  afap  air
    3   aal  air
    4  cece  air
    5   asa  air
    6   avx  air
    7   air afap
    8  afap afap
    9   aal afap
    10 cece afap
Using the initial output from the OP, we can sort 'b' by row using apply with MARGIN=1, get the logical index of non-duplicated rows of the sorted result 'd1' with duplicated, and use that index to subset 'b'.
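    # sort the two entries within each row so reciprocal pairs look identical
    d1 <- as.data.frame(t(apply(b, 1, sort)))
    # keep only the first occurrence of each unordered pair
    b1 <- b[!duplicated(d1), ]
    head(b1, 10)
    #   Var1 Var2
    #1   air  air
    #2  afap  air
    #3   aal  air
    #4  cece  air
    #5   asa  air
    #6   avx  air
    #8  afap afap
    #9   aal afap
    #10 cece afap
    #11  asa afap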
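The same "sort each pair, then drop duplicates" idea can also be sketched without apply, by ordering each pair element-wise with pmin and pmax. This is only an illustrative variant, not part of the original answer; the names 'key' and 'b2' are made up for the example.

```r
a <- c("air", "afap", "aal", "cece", "asa", "avx")
b <- expand.grid(a, a, stringsAsFactors = FALSE)

# Put the smaller string of each pair in 'lo' and the larger in 'hi',
# so that (afap, air) and (air, afap) produce the same key row.
# 'key' is a hypothetical helper name used only in this sketch.
key <- data.frame(lo = pmin(b$Var1, b$Var2),
                  hi = pmax(b$Var1, b$Var2),
                  stringsAsFactors = FALSE)

# Keep the first occurrence of every unordered pair
b2 <- b[!duplicated(key), ]

nrow(b2)  # 6 self-pairs + 15 distinct pairs = 21 rows
```

Because pmin and pmax are vectorised, this avoids looping over rows, which may matter for a long ticker list.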
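Another compact option uses data.table: build the cross join with CJ and keep only the rows where the first column is not smaller than the second.
    library(data.table)
    # name the columns explicitly so the filter does not depend on CJ's default column names
    CJ(V1 = a, V2 = a)[V1 >= V2]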
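For readers without data.table, the comparison filter can be approximated in base R on the expand.grid result. This is a rough sketch under the same assumptions as the question (a character vector 'a' of unique tickers); 'b3' is a name invented for the example.

```r
a <- c("air", "afap", "aal", "cece", "asa", "avx")
b <- expand.grid(a, a, stringsAsFactors = FALSE)

# Keep a row only when its first entry sorts at or after its second entry.
# Exactly one of (x, y) and (y, x) passes this test when x != y,
# and every self-pair (x, x) passes.
b3 <- b[b$Var1 >= b$Var2, ]

nrow(b3)  # 21
```

This may keep a different member of each reciprocal pair than the duplicated approach does (which keeps whichever row comes first), but the set of unordered pairs is the same.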