python - Record Array to json.dumps
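I need to generate JSON from a pandas DataFrame. Using df.to_json gives me a segmentation error, so I want to find another way to create the JSON. The only thing I have managed so far is to create a records array from the DataFrame.
Now I need to build the input for json.dumps with the column names from the file as keys, like this:
{ "id":123, "name":"myname"}
This is the code I've managed to write, and the file (http://pastebin.com/iyeweftg):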
    import pandas as pd
    import json

    columns = [u'salesorderid', u'orderdate', u'duedate', u'shipdate',
               u'salesordernumber', u'title', u'firstname', u'middlename',
               u'lastname', u'suffix', u'phonenumber', u'phonenumbertype',
               u'emailaddress', u'emailpromotion', u'addresstype',
               u'addressline1', u'addressline2', u'city', u'stateprovincename',
               u'postalcode', u'countryregionname', u'subtotal', u'taxamt',
               u'freight', u'totaldue', u'unitprice', u'productname',
               u'productsubcategory', u'productcategory']

    # Read the tab-separated file and convert it to a NumPy record array.
    data = pd.read_csv('../uploads/txtdatasimplified.txt', header=0, names=columns, sep='\t')
    data2 = data.to_records(index=0)

    arrayjson = []
    for r in data2:
        for c in columns:
            d=[]
            d[c] = r.__getattribute__(c)
            arrayjson.append(d)

I need the JSON to look like this:
    [
      {
        'city':'sooke',
        'firstname':'devin',
        'title':nan,
        'lastname':'phillips',
        'subtotal':'189,97',
        'orderdate':'2014-06-30 00:00:00.000',
        'addresstype':'home',
        'phonenumbertype':'home',
        'taxamt':'15,1976',
        'addressline2':nan,
        'addressline1':'2742 cincerto circle',
        'duedate':'2014-07-12 00:00:00.000',
        'totaldue':'209,9169',
        'shipdate':'2014-07-07 00:00:00.000',
        'stateprovincename':'british columbia',
        'middlename':nan,
        'productcategory':'accessories',
        'phonenumber':'425-555-0163',
        'countryregionname':'canada',
        'postalcode':'v0',
        'salesordernumber':'so75123',
        'suffix':nan,
        'productname':'all-purpose bike stand',
        'salesorderid':75123,
        'emailaddress':'devin38@adventure-works.com',
        'emailpromotion':0,
        'freight':'4,7493',
        'unitprice':'159',
        'productsubcategory':'bike stands'
      },
      {
        'city':'sooke',
        'firstname':'devin',
        'title':nan,
        'lastname':'phillips',
        'subtotal':'189,97',
        'orderdate':'2014-06-30 00:00:00.000',
        'addresstype':'home',
        'phonenumbertype':'home',
        'taxamt':'15,1976',
        'addressline2':nan,
        'addressline1':'2742 cincerto circle',
        'duedate':'2014-07-12 00:00:00.000',
        'totaldue':'209,9169',
        'shipdate':'2014-07-07 00:00:00.000',
        'stateprovincename':'british columbia',
        'middlename':nan,
        'productcategory':'clothing',
        'phonenumber':'425-555-0163',
        'countryregionname':'canada',
        'postalcode':'v0',
        'salesordernumber':'so75123',
        'suffix':nan,
        'productname':'awc logo cap',
        'salesorderid':75123,
        'emailaddress':'devin38@adventure-works.com',
        'emailpromotion':0,
        'freight':'4,7493',
        'unitprice':'8,99',
        'productsubcategory':'caps'
      }
    ]

And the error I'm getting is:
    Traceback (most recent call last):
      File "/home/ubuntu/workspace/python/tests2.py", line 11, in <module>
        d[c] = r.__getattribute__(c)
    TypeError: list indices must be integers, not unicode

What I would really appreciate is the final result: I've been trading one error for another and still can't get what I want. I need the JSON to insert into MongoDB.
As the error says, d is a list, and you are trying to index it with unicode strings. You have to change it to a dictionary (d = {}).
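To see the difference in isolation, here is a minimal sketch. The two column names and the row values are a hypothetical subset chosen for illustration, not read from the actual file:

```python
columns = ["city", "firstname"]   # hypothetical subset of the real columns
row = ("sooke", "devin")          # hypothetical values for one record

# Indexing a list with a string key raises TypeError.
as_list = []
try:
    as_list["city"] = "sooke"
    list_error = None
except TypeError as exc:
    list_error = exc

# A dict accepts string keys, so the same kind of assignment works.
as_dict = {}
for name, value in zip(columns, row):
    as_dict[name] = value

print(type(list_error).__name__)
print(as_dict)
```

The exact wording of the TypeError message differs between Python versions, so the sketch only checks the exception type.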
However, the output still wouldn't be what you'd expect. Instead, you can do this:
    for r in data2:
        arrayjson.append(dict(zip(columns, r.tolist())))

or this:
    arrayjson = [dict(zip(columns, r.tolist())) for r in data2]

tolist() converts the record r to a plain Python sequence containing native Python values, which can be serialized by json.dumps. json.dumps may still emit values such as NaN, though, which is not valid JSON. You can replace these values in the DataFrame using: data.fillna(value="", inplace=True).
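The NaN issue can be reproduced with the standard library alone. The sketch below uses a made-up record; the dict comprehension only stands in for what fillna does on the DataFrame side, and allow_nan=False is an optional strictness check rather than part of the original answer:

```python
import json
import math

record = {"title": float("nan"), "city": "sooke"}  # hypothetical record

# By default json.dumps writes NaN as the bare token NaN, which is not valid JSON.
default_output = json.dumps(record)

# With allow_nan=False, json.dumps raises ValueError instead of emitting NaN.
try:
    json.dumps(record, allow_nan=False)
    strict_failed = False
except ValueError:
    strict_failed = True

# Replace NaN with an empty string before serializing.
cleaned = {
    key: ("" if isinstance(value, float) and math.isnan(value) else value)
    for key, value in record.items()
}
clean_output = json.dumps(cleaned, allow_nan=False)

print(default_output)
print(clean_output)
```

Passing allow_nan=False is a cheap way to make sure no NaN slips through before the documents are sent to a store that expects strict JSON.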
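Putting it together, it would look like this:

    import pandas as pd
    import json

    columns = [...]

    data = pd.read_csv('../uploads/txtdatasimplified.txt', header=0, names=columns, sep='\t')
    # Replace NaN with empty strings so the output is valid JSON.
    data.fillna(value="", inplace=True)
    data2 = data.to_records(index=0)
    # One dict per row, keyed by column name, holding native Python values.
    arrayjson = [dict(zip(columns, r.tolist())) for r in data2]
    print(json.dumps(arrayjson))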
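For trying this out without the original file, here is a self-contained sketch of the same steps. The three-column, in-memory sample is hypothetical and only stands in for txtdatasimplified.txt, and fillna is used without inplace, which is a small variation on the code above:

```python
import io
import json

import pandas as pd

# Hypothetical tab-separated sample standing in for txtdatasimplified.txt.
sample = "salesorderid\ttitle\tcity\n75123\t\tsooke\n75124\tmr\tvictoria\n"
columns = ["salesorderid", "title", "city"]

data = pd.read_csv(io.StringIO(sample), header=0, names=columns, sep="\t")
data = data.fillna("")                  # the empty title would otherwise be NaN
records = data.to_records(index=False)  # NumPy record array, one record per row

# tolist() turns each record into native Python values that json can serialize.
arrayjson = [dict(zip(columns, r.tolist())) for r in records]
payload = json.dumps(arrayjson)
print(payload)
```

Each element of arrayjson is a plain dict, so the list can also be handed on as-is instead of going through json.dumps first, if the consumer accepts Python dicts.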