python - Specifying a DataFrame index as dates at time of creation


I'm a newcomer to Python (3.5) from an R background, and I'm struggling a little with the differences in the way DataFrames are created and used. In particular, I want to create a DataFrame using a series of dates as the index. The following experimental code (note the commented-out index) works more or less as I expect:

    import pandas as pd
    import numpy as np

    np.random.seed(123456)
    num_periods = 5
    # 'MS' = month-start frequency
    monthindex = pd.date_range('1/1/2014', periods=num_periods, freq='MS')
    dd = pd.DataFrame(data={'date': monthindex,
                            'c1': pd.Series(np.random.uniform(10, 20, size=num_periods)),
                            'c2': pd.Series(np.random.uniform(30, 40, size=num_periods))},
                      # index=monthindex,
                      )
    print(dd)

...and gives me this output:

              c1         c2       date
    0  11.269698  33.362217 2014-01-01
    1  19.667178  34.513765 2014-02-01
    2  12.604760  38.402551 2014-03-01
    3  18.972365  31.231021 2014-04-01
    4  13.767497  35.430262 2014-05-01

...and I can specify the index after creation like this:

    dd.index = monthindex
    print(dd)

...which gets me this, which looks right:

                       c1         c2       date
    2014-01-01  11.269698  33.362217 2014-01-01
    2014-02-01  19.667178  34.513765 2014-02-01
    2014-03-01  12.604760  38.402551 2014-03-01
    2014-04-01  18.972365  31.231021 2014-04-01
    2014-05-01  13.767497  35.430262 2014-05-01

But if I uncomment the index argument in the code above, the dates appear in the index but the columns are left with NaN values, like this:

                c1  c2       date
    2014-01-01 NaN NaN 2014-01-01
    2014-02-01 NaN NaN 2014-02-01
    2014-03-01 NaN NaN 2014-03-01
    2014-04-01 NaN NaN 2014-04-01
    2014-05-01 NaN NaN 2014-05-01

I suspect this may be because the two Series objects don't share any values with the index, but I don't really understand what is going on.

What is happening, and how should I specify the date index during creation of the DataFrame rather than tacking it on after the call to DataFrame?

Using NumPy arrays directly, without creating Series first, works:

    import pandas as pd
    import numpy as np

    np.random.seed(123456)
    num_periods = 5
    monthindex = pd.date_range('1/1/2014', periods=num_periods, freq='MS')
    # Plain NumPy arrays carry no index, so they are laid out positionally
    # against the index passed to the constructor.
    dd = pd.DataFrame(data={'date': monthindex,
                            'c1': np.random.uniform(10, 20, size=num_periods),
                            'c2': np.random.uniform(30, 40, size=num_periods)},
                      index=monthindex,
                      )
    print(dd)

Output:

                       c1         c2       date
    2014-01-01  11.269698  33.362217 2014-01-01
    2014-02-01  19.667178  34.513765 2014-02-01
    2014-03-01  12.604760  38.402551 2014-03-01
    2014-04-01  18.972365  31.231021 2014-04-01
    2014-05-01  13.767497  35.430262 2014-05-01

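The Series come with their own indices (the default integer labels 0 to 4), which do not match the month index. When an index is passed, the DataFrame constructor aligns each Series to it by label; none of the integer labels is found among the dates, so every value becomes NaN. NumPy arrays don't have an index, so they simply use the index you provide.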
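If you would rather keep working with Series, the label alignment described above suggests two possible workarounds. The sketch below is only an illustration: the names `month_index`, `plain`, `dated`, `misaligned`, `aligned` and `promoted` are made up for this example and are not from the original post. It first reproduces the NaN result, then either gives the Series the date index up front or promotes the date column with `set_index` after construction.

```python
import numpy as np
import pandas as pd

num_periods = 5
month_index = pd.date_range("2014-01-01", periods=num_periods, freq="MS")

# A Series built without an index gets the default labels 0..4.
plain = pd.Series(np.random.uniform(10, 20, size=num_periods))

# Passing index= makes the constructor align the Series on those labels;
# none of 0..4 is a date in month_index, so every value becomes NaN.
misaligned = pd.DataFrame({"c1": plain}, index=month_index)

# Option 1: give the Series the date index up front, so the labels match.
dated = pd.Series(plain.to_numpy(), index=month_index)
aligned = pd.DataFrame({"c1": dated}, index=month_index)

# Option 2: build with the default index, then promote the date column.
promoted = pd.DataFrame({"date": month_index, "c1": plain}).set_index("date")

print(misaligned)
print(aligned)
print(promoted)
```

Option 1 keeps everything inside the constructor call, which is what the question asks for. Option 2 removes the date column from the data and uses it as the index instead, which may or may not be what you want.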

