python - Sum up non-unique rows in DataFrame
I have a DataFrame like this:

    import pandas as pd

    id = [1, 1, 2, 3]
    x1 = [0, 1, 1, 2]
    x2 = [2, 3, 1, 1]
    df = pd.DataFrame({'id': id, 'x1': x1, 'x2': x2})
    df

    id  x1  x2
     1   0   2
     1   1   3
     2   1   1
     3   2   1
Some rows have the same id. I want to sum such rows (over x1, x2) to obtain a new DataFrame with unique ids:
    df_new

    id  x1  x2
     1   1   5
     2   1   1
     3   2   1
An important detail is that the real number of columns x1, x2, ... is large, so I cannot apply a function that requires manual input of column names.
As discussed, you can use the pandas groupby function to sum based on the id value:
    df.groupby(df.id).sum()
    # or
    df.groupby('id').sum()
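As a minimal sketch on the example data from the question (the variable name `summed` is just illustrative), grouping by id and summing never names x1 or x2 explicitly, so the same call should work however many value columns there are:

```python
import pandas as pd

# Example data from the question.
df = pd.DataFrame({'id': [1, 1, 2, 3],
                   'x1': [0, 1, 1, 2],
                   'x2': [2, 3, 1, 1]})

# Sum every non-key column within each id group.
# No column names other than the grouping key are listed.
summed = df.groupby('id').sum()
print(summed)
```

Here the two rows with id 1 collapse into a single row with x1 = 1 and x2 = 5, and id becomes the index of the result.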
If you don't want id to become the index, you can do:
    df.groupby('id').sum().reset_index()
    # or
    df.groupby('id', as_index=False).sum()  # @john_gait
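To compare the two variants side by side, here is a small sketch on the question's data (the names `a` and `b` are only for illustration). Both are expected to give a frame in which id is an ordinary column rather than the index:

```python
import pandas as pd

# Example data from the question.
df = pd.DataFrame({'id': [1, 1, 2, 3],
                   'x1': [0, 1, 1, 2],
                   'x2': [2, 3, 1, 1]})

# Variant 1: group and sum, then move the id index back into a column.
a = df.groupby('id').sum().reset_index()

# Variant 2: tell groupby not to use id as the index in the first place.
b = df.groupby('id', as_index=False).sum()

print(a)
print(b)
```

Which one to use is mostly a matter of taste; `as_index=False` avoids the extra `reset_index()` step.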