python - NumPy: Tuple-Indexing of rows in ndarray (for later selection of subsets)
I'm new to NumPy and not an experienced Python programmer, so please excuse me if this seems trivial ;)
I am writing a script to extract specific data out of several molecular-dynamics simulations. To do that, I read data out of several files, modify it, truncate it to a uniform length, and add it row-wise to form a 2D array for each simulation run.
These arrays are then appended to each other to form a 3D array, where each slice along the z-axis represents the dataset of a specific simulation run. The goal is easy manipulation later on, e.g. averaging over simulation runs.
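A minimal sketch of what such averaging over runs could look like, using made-up data. The names `run_1`, `run_2`, `runs` and the tiny array sizes are placeholders for illustration, not the real simulation data; runs are placed along axis 2, as described above.

```python
import numpy as np

# Hypothetical stand-in data: 3 atom-pair rows, 4 time steps, 2 runs.
run_1 = np.array([[True, False, False, True],
                  [False, False, False, False],
                  [True, True, True, True]])
run_2 = np.array([[True, True, False, False],
                  [False, False, False, False],
                  [True, True, False, True]])

# np.stack joins equally shaped 2D arrays along a new axis;
# here the runs end up along axis 2 (the "z-axis").
runs = np.stack([run_1, run_2], axis=2)

# Averaging over the run axis gives, per pair and time step,
# the fraction of runs in which the entry is True.
mean_over_runs = runs.mean(axis=2)
```

With boolean data the mean is a float between 0 and 1, so `mean_over_runs[0, 1]` is 0.5 here because the entry is True in only one of the two runs.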
The following should give a basic idea of what is done:
    import numpy as np

    a = np.zeros((2000), dtype=bool)
    a = a.reshape((1, 2000))

    # appending different rows to form a '2d-matrix',
    # the actual data per simulation run
    for i in range(1, 103):
        b = np.zeros((2000), dtype=bool)
        b = b.reshape((1, 2000))
        a = np.concatenate((a, b), axis=0)

    print(a.shape)
    # >>> (103, 2000)

    c = np.expand_dims(a, axis=2)
    a = np.expand_dims(a, axis=2)
    print(a.shape)
    # >>> (103, 2000, 1)

    # appending different '2d-matrices' to form a 3d array,
    # each slice along the z-axis representing one simulation run
    for i in range(1, 50):
        a = np.concatenate((a, c), axis=2)

    print(a.shape)
    # >>> (103, 2000, 50)

So far so good. Now to the actual question:
In one 2D array, each row represents a different set of interacting atom pairs. Later on I want to create subsets of the array depending on different criteria, e.g. 'show me all pairs with a distance x where 10 < x <= 20'.
So when I first add the rows in the first for loop above, I want to include an indexing of the rows, with a set of ints for each row. The data on the atom pairs is there anyway; at the moment I'm just not including it in the ndarray.
I was thinking of a tuple, so that a 2D array would look like

    [ [('int' a, 'int' b), [False, True, False, ...]],
      [('int' a, 'int' d), [True, False, True, ...]],
      ... ]

or like this:

    [ [['int' a], ['int' b], [False, True, False, ...]],
      [['int' a], ['int' d], [True, False, True, ...]],
      ... ]

Can you think of another or easier approach for this kind of filtering? I'm not quite sure if I'm on the right track here, and it doesn't seem very straightforward to have different datatypes in an array like that.
Also note that the indices are ordered in the same way in each 2D array, because I sort them (at the moment based on a string) and add np.zeros() rows for pairs that only occur in other simulation runs. Maybe a lookup table is the right approach?
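As a rough sketch of the lookup-table idea: because the row order is the same in every 2D array, the pair labels can live outside the ndarray and map to row indices. The names `pairs`, `row_of`, and `data`, as well as the array sizes, are assumptions made up for this illustration.

```python
import numpy as np

# Hypothetical pair labels; row i of every 2D array belongs to pairs[i].
pairs = [(1, 2), (3, 4), (3, 5)]

# Lookup table from atom pair to row index.
row_of = {pair: i for i, pair in enumerate(pairs)}

# Made-up data: 3 pair rows, 5 time steps, 2 runs along the last axis.
data = np.zeros((3, 5, 2), dtype=bool)
data[row_of[(3, 4)], 0, :] = True

# Fetch the row of one pair across all runs.
pair_34 = data[row_of[(3, 4)], :, :]
```

This keeps the ndarray homogeneous (booleans only), while the integer labels stay in an ordinary Python structure.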
Thanks a lot!
Update/Answer:
Sorry, I know the question was a little bit specific, and the bloated code wasn't really relevant to the question.
I answered the question myself, and for the sake of documentation you can find the answer below. It is specific, but maybe it helps someone handle indexing in NumPy.
Short, general answer:
I created a look-up table as a Python list and did a simple NumPy slicing operation with a selection mask containing the indices:

    import numpy as np

    a = [[[1, 2], [3, 4], [5, 6]],
         [[7, 8], [9, 10], [11, 12]]]
    a = np.asarray(a)

    # selects rows 1 and 2 of each 2D array
    mask = [1, 2]
    b = a[:, mask, :]

which gives b:

    [[[ 3  4]
      [ 5  6]]

     [[ 9 10]
      [11 12]]]

Complete answer, specific to the question above:
This is the 2D array:

    a = [[True, False, False, False, False],
         [False, True, False, False, False],
         [False, False, True, False, False]]
    a = np.asarray(a)

The rows are indexed with tuples, which for my specific problem look like, e.g.:

    lut = [(1, 2), (3, 4), (3, 5)]

Append another 2D array to form a 3D array:
    c = np.expand_dims(a, axis=0)
    a = np.expand_dims(a, axis=0)
    a = np.concatenate((a, c), axis=0)

This is the 3D array a:

    [[[ True False False False False]
      [False  True False False False]
      [False False  True False False]]

     [[ True False False False False]
      [False  True False False False]
      [False False  True False False]]]

Selecting the rows that contain "3" in the look-up table:
    mask = [i for i, v in enumerate(lut) if 3 in v]
    # >>> [1, 2]

Applying the mask to the 3D array:

    b = a[:, mask, :]

Now b is the 3D array a after selection:

    [[[False  True False False False]
      [False False  True False False]]

     [[False  True False False False]
      [False False  True False False]]]

To keep track of the new indices of b, create a new look-up table for further computation:

    newlut = [v for i, v in enumerate(lut) if i in mask]
    # >>> [(3, 4), (3, 5)]
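The same selection can probably also be written with NumPy operations only, which makes it easy to combine several criteria, such as the distance range mentioned in the question. This is a sketch under assumptions: the `distance` values are invented for illustration, and storing the look-up table as an integer array (instead of a list of tuples) is a swapped-in variant, not the approach used above.

```python
import numpy as np

# Row labels as an integer array of shape (n_rows, 2).
lut = np.array([(1, 2), (3, 4), (3, 5)])
# Hypothetical per-pair distances, invented for this example.
distance = np.array([8.0, 12.5, 20.0])

a = np.array([[True, False, False, False, False],
              [False, True, False, False, False],
              [False, False, True, False, False]])
a = np.stack([a, a], axis=0)  # shape (2, 3, 5), runs along axis 0

# Rows whose pair contains atom 3 ...
contains_3 = (lut == 3).any(axis=1)
# ... and whose distance x satisfies 10 < x <= 20.
in_range = (distance > 10) & (distance <= 20)

# Integer row indices where both criteria hold.
mask = np.flatnonzero(contains_3 & in_range)

b = a[:, mask, :]      # selected rows of every run
new_lut = lut[mask]    # labels that belong to the rows of b
```

Indexing `lut` with the same `mask` keeps the labels and the data rows aligned, so no separate list comprehension is needed for the new look-up table.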