python - Specifying a DataFrame index as dates at the time of creation
I'm a newcomer to Python (3.5) from an R background, and I'm struggling a little with the differences in the way DataFrames are created and used. In particular, I want to create a DataFrame using a series of dates as the index. The following experimental code (note the commented-out index argument) works more or less as I expect:
    import pandas as pd
    import numpy as np

    np.random.seed(123456)
    num_periods = 5
    monthindex = pd.date_range('1/1/2014', periods=num_periods, freq='MS')
    dd = pd.DataFrame(
        data={'date': monthindex,
              'c1': pd.Series(np.random.uniform(10, 20, size=num_periods)),
              'c2': pd.Series(np.random.uniform(30, 40, size=num_periods))},
        # index=monthindex,
    )
    print(dd)
...and gives me this output:
              c1         c2       date
    0  11.269698  33.362217 2014-01-01
    1  19.667178  34.513765 2014-02-01
    2  12.604760  38.402551 2014-03-01
    3  18.972365  31.231021 2014-04-01
    4  13.767497  35.430262 2014-05-01
...and I can specify the index after creation like this:
    dd.index = monthindex
    print(dd)
...which gets me this, which looks right:
                       c1         c2       date
    2014-01-01  11.269698  33.362217 2014-01-01
    2014-02-01  19.667178  34.513765 2014-02-01
    2014-03-01  12.604760  38.402551 2014-03-01
    2014-04-01  18.972365  31.231021 2014-04-01
    2014-05-01  13.767497  35.430262 2014-05-01
But if I uncomment the index argument in the code above, I get the dates in the index but am left with NaN values, like this:
                c1  c2       date
    2014-01-01 NaN NaN 2014-01-01
    2014-02-01 NaN NaN 2014-02-01
    2014-03-01 NaN NaN 2014-03-01
    2014-04-01 NaN NaN 2014-04-01
    2014-05-01 NaN NaN 2014-05-01
I suspect this may be because the two Series objects don't share any values with the index, but I don't really understand what is going on.
What is happening, and how should I specify a date index during creation of the DataFrame rather than tacking it on after the call to DataFrame()?
Using NumPy arrays directly, without creating Series first, works:
    import pandas as pd
    import numpy as np

    np.random.seed(123456)
    num_periods = 5
    monthindex = pd.date_range('1/1/2014', periods=num_periods, freq='MS')
    dd = pd.DataFrame(
        data={'date': monthindex,
              'c1': np.random.uniform(10, 20, size=num_periods),
              'c2': np.random.uniform(30, 40, size=num_periods)},
        index=monthindex,
    )
    print(dd)
Output:
                       c1         c2       date
    2014-01-01  11.269698  33.362217 2014-01-01
    2014-02-01  19.667178  34.513765 2014-02-01
    2014-03-01  12.604760  38.402551 2014-03-01
    2014-04-01  18.972365  31.231021 2014-04-01
    2014-05-01  13.767497  35.430262 2014-05-01
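The Series come with their own indices (the default integer labels 0 to 4), which do not match the month index, so pandas aligns them by label and finds no matching values. NumPy arrays don't have an index, so they simply use the index you provide.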
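To make the alignment behaviour concrete, here is a minimal sketch rather than a definitive recipe. The names `month_index`, `s_default`, `s_dated`, `misaligned`, `aligned` and `from_array` are made up for illustration, and `to_numpy()` assumes a reasonably recent pandas (on older versions `.values` should do the same job).

```python
import numpy as np
import pandas as pd

num_periods = 5
month_index = pd.date_range('2014-01-01', periods=num_periods, freq='MS')
values = np.random.uniform(10, 20, size=num_periods)

# A Series built without an explicit index gets the default integer labels 0..4.
s_default = pd.Series(values)

# Passing index=month_index makes the constructor reindex each Series to those
# dates. No integer label matches a date, so every value comes back as NaN.
misaligned = pd.DataFrame({'c1': s_default}, index=month_index)

# Option 1: give the Series the date index up front, so the labels line up.
s_dated = pd.Series(values, index=month_index)
aligned = pd.DataFrame({'c1': s_dated}, index=month_index)

# Option 2: drop the Series index and pass the raw array, which has no labels
# to align and is placed against the supplied index by position.
from_array = pd.DataFrame({'c1': s_default.to_numpy()}, index=month_index)
```

If the values already live in Series, option 1 keeps everything labelled; if they are freshly generated arrays anyway, skipping the Series wrapper as in the answer above is probably the simplest route.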