python - Expressing a pandas subsetting operation using pipe
Is there a way to express the pandas operations below using the pipe operator?

    df_a = df[df.index.year != 2000]
    df_b = df_a[(df_a['month'].isin([3, 4, 5])) & (df_a['region'] == 'usa')]

I am not sure why you want to use a pipe operation here.
pipe is intended to make the syntax easier for chained processing of a DataFrame, where a chain of functions each modify the incoming DataFrame (see the docs).
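As a rough illustration of the kind of chaining pipe is aimed at, here is a minimal sketch. The helper names add_month and flag_region and the small hand-built frame are assumptions made up for this example; they are not part of the question or of pandas itself.

```python
import pandas as pd

def add_month(frame):
    """Return a copy of `frame` with a 'month' column taken from its DatetimeIndex."""
    return frame.assign(month=frame.index.month)

def flag_region(frame, region):
    """Return a copy of `frame` with a boolean column marking rows in `region`."""
    return frame.assign(in_region=frame['region'] == region)

# Hypothetical sample data: 24 month-start dates with alternating regions
dates = pd.date_range('2014-01-01', periods=24, freq='MS')
df = pd.DataFrame({'region': ['usa', 'non-usa'] * 12}, index=dates)

# Each step receives the frame produced by the previous step
result = df.pipe(add_month).pipe(flag_region, region='usa')
```

Each function takes a DataFrame and returns a new one, so the steps read left to right instead of being nested as flag_region(add_month(df), 'usa').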
What you are trying to do is filter a DataFrame with a number of filters (or masks).
Just to illustrate, using a pipe operation for this is cumbersome:
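
    import numpy as np
    import pandas as pd

    np.random.seed(123)

    # Generate data: one row per month end for 2014 and 2015
    # (newer pandas releases prefer the alias 'ME' over 'M' for month end)
    dates = pd.date_range('2014-01-01', '2015-12-31', freq='M')
    df = pd.DataFrame({'region': np.random.choice(['usa', 'non-usa'], len(dates))}, index=dates)
    df['month'] = df.index.month
    print(df.head())

                 region  month
    2014-01-31      usa      1
    2014-02-28  non-usa      2
    2014-03-31      usa      3
    2014-04-30      usa      4
    2014-05-31      usa      5

Your original filter (with 2014 in place of 2000, since the sample data only covers 2014 and 2015) yields: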

    df_a = df[df.index.year != 2014]
    df_b = df_a[(df_a['month'].isin([3, 4, 5])) & (df_a['region'] == 'usa')]
    print(df_b)

               region  month
    2015-03-31    usa      3
    2015-05-31    usa      5

Here is how to use pipe to get the same output:

    def masker(df, mask):
        # Keep only the rows where the boolean mask is True
        return df[mask]

    mask1 = df.index.year != 2014
    mask2 = df['month'].isin([3, 4, 5])
    mask3 = df['region'] == 'usa'
    print(df.pipe(masker, mask1).pipe(masker, mask2).pipe(masker, mask3))

               region  month
    2015-03-31    usa      3
    2015-05-31    usa      5

However, pandas is able to process this filtering in a rather simple way (in this particular case):

    # Combine the three boolean masks and index once
    print(df[mask1 & mask2 & mask3])

               region  month
    2015-03-31    usa      3
    2015-05-31    usa      5
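If a pipe chain is still wanted, one possible variation is to pass a function that builds the mask from whatever frame each step receives, rather than a mask computed up front on the full frame. This is only a sketch: keep_rows is a hypothetical helper name, and the sample data here is hand-built and deterministic, so it differs from the random data in the answer.

```python
import pandas as pd

def keep_rows(frame, predicate):
    """Return the rows of `frame` for which predicate(frame) is True."""
    return frame[predicate(frame)]

# Hypothetical sample data: 24 month-start dates with alternating regions
dates = pd.date_range('2014-01-01', periods=24, freq='MS')
df = pd.DataFrame({'region': ['usa', 'non-usa'] * 12}, index=dates)
df['month'] = df.index.month

# Each predicate is evaluated on the already-filtered frame it is handed
result = (
    df.pipe(keep_rows, lambda d: d.index.year != 2014)
      .pipe(keep_rows, lambda d: d['month'].isin([3, 4, 5]))
      .pipe(keep_rows, lambda d: d['region'] == 'usa')
)
print(result)
```

Because every mask is derived from the frame it filters, its length and index always match that frame. For a plain filter like this one, combining the masks and indexing once is still the shorter option.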