What is the most appropriate distance measure for e-commerce data (explained in detail below) when applying a nearest-neighbour algorithm?
I have a dataset from an e-commerce website. The data is arranged in a matrix where the number of rows equals the number of transactions (each a set of products bought) and the number of columns equals the total number of products available on the website. Each cell [i, j] of the matrix is 1 if product j was bought in transaction i, and 0 otherwise. When a new transaction comes in, I want to find its k nearest neighbour transactions. What would be an appropriate distance measure for such data? E.g. if the data is binary (1/0), then Hamming distance won't make sense.
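To make the concern about Hamming distance concrete, here is a minimal sketch on made-up data. The 10-product catalogue, the three transactions, and the helper names `to_row` and `hamming` are all assumptions for illustration, not part of the real dataset. It encodes each transaction as a 0/1 row as described above and compares a new transaction against one that contains all of its products and one that shares none.

```python
# Hypothetical toy catalogue of 10 products, indexed 0..9 (illustrative only).
N_PRODUCTS = 10


def to_row(products):
    """Encode a set of product indices as a 0/1 row of the transaction matrix."""
    return [1 if j in products else 0 for j in range(N_PRODUCTS)]


def hamming(a, b):
    """Count the positions where two binary rows differ."""
    return sum(x != y for x, y in zip(a, b))


new_tx = to_row({0, 1})                   # incoming transaction
big_overlap = to_row({0, 1, 2, 3, 4, 5})  # contains both products of new_tx
no_overlap = to_row({9})                  # shares no product with new_tx

print(hamming(new_tx, big_overlap))  # 4
print(hamming(new_tx, no_overlap))   # 3
```

In this toy case the transaction with no products in common ends up closer than the one containing everything in the new basket, because every product that both transactions did not buy counts as agreement. With a large catalogue and small baskets, those shared zeros would presumably dominate the distance.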