python - Sum up non-unique rows in DataFrame
I have a DataFrame like this:

    import pandas as pd

    id = [1, 1, 2, 3]
    x1 = [0, 1, 1, 2]
    x2 = [2, 3, 1, 1]
    df = pd.DataFrame({'id': id, 'x1': x1, 'x2': x2})

    df
    id  x1  x2
     1   0   2
     1   1   3
     2   1   1
     3   2   1

Some rows have the same id. I want to sum such rows (over x1 and x2) to obtain a new DataFrame with unique ids:

    df_new
    id  x1  x2
     1   1   5
     2   1   1
     3   2   1

An important detail is that the real number of columns x1, x2, ... is large, so I cannot apply a function that requires manual input of column names.
As discussed, you can use the pandas groupby function to sum based on the id value:

    df.groupby(df.id).sum()
    # or
    df.groupby('id').sum()

If you don't want id to become the index, you can use:

    df.groupby('id').sum().reset_index()
    # or
    df.groupby('id', as_index=False).sum()  # @john_gait
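For completeness, here is a minimal end-to-end sketch of the groupby approach on the sample data from the question. The second part uses a made-up wide frame (50 columns named x1..x50, purely an assumption for illustration) to show that no column names have to be typed out: groupby sums every non-key column on its own.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "id": [1, 1, 2, 3],
    "x1": [0, 1, 1, 2],
    "x2": [2, 3, 1, 1],
})

# Sum all non-key columns within each id.
# as_index=False keeps id as a regular column instead of the index.
df_new = df.groupby("id", as_index=False).sum()
print(df_new)

# Hypothetical wide frame (not from the question): 50 value columns
# x1..x50, where column xi holds the value i in every row.
wide = pd.DataFrame({
    "id": [1, 1, 2, 3],
    **{f"x{i}": [i] * 4 for i in range(1, 51)},
})

# Same call, no column names listed: every x column is summed per id.
wide_new = wide.groupby("id", as_index=False).sum()
print(wide_new.shape)
```

The first print should reproduce the df_new table from the question. If the real data also contains non-numeric columns, they may need to be dropped or handled separately before summing.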