Wednesday, March 31, 2021

How to read a TXT file containing JSON data into a Python pandas DataFrame

I am trying to load JSON data that is stored in a .txt file into a pandas DataFrame. The .txt file is very large (3 GB). I have followed the answers from this question, yet I am getting an error.

Here is the code I'm using:

import pandas as pd
import json

# file to import
file = 'mag_authors_10.txt'

with open(file) as f:
    string = f.read()

jsonData = json.loads(string)
print(pd.DataFrame(jsonData))

Here is the error:

  File "my_path.py", line 23, in <module>
    jsonData = json.loads(string)
  File "C:\Users\Seb\anaconda3\lib\json\__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "C:\Users\Seb\anaconda3\lib\json\decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
JSONDecodeError: Extra data
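For context, json.loads raises "Extra data" when it has finished parsing one complete JSON value and then finds more non-whitespace text after it. A minimal reproduction, using a made-up two-object string rather than the real file:

```python
import json

# Hypothetical sample: two JSON objects back to back.
# Each one is valid on its own, but together they are not a single JSON document.
sample = '{"id": 1} {"id": 2}'

error_message = None
error_pos = None
try:
    json.loads(sample)
except json.JSONDecodeError as exc:
    # exc.msg is the short reason; exc.pos is the offset where the leftover text starts.
    error_message = exc.msg
    error_pos = exc.pos

print(error_message, error_pos)
```

If the real file fails the same way, it most likely holds more than one top-level JSON value.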

My data looks like this:

{
  "id": "53e9ab9eb7602d970354a97e",
  "title": "Data mining: concepts and techniques",
  "authors": [
    {
      "name": "jiawei han",
      "org": "department of computer science university of illinois at urbana champaign"
    },
    {
      "name": "micheline kamber",
      "org": "department of computer science university of illinois at urbana champaign"
    },
    {
      "name": "jian pei",
      "org": "department of computer science university of illinois at urbana champaign"
    }
  ],
  "year": 2000,
  "keywords": [
    "data mining",
    "structured data",
    "world wide web",
    "social network",
    "relational data"
  ],
  "fos": [
    "relational database",
    "data model",
    "social network"
  ],
  "n_citation": 29790,
  "references": [
    "53e99ef4b7602d97027c2346",
    "53e9aa23b7602d970338fb5e",
    "53e99cf5b7602d97025aac75"
  ],
  "doc_type": "book",
  "lang": "en",
  "publisher": "Elsevier",
  "isbn": "1-55860-489-8",
  "doi": "10.4114/ia.v10i29.873",
  "pdf": "//static.aminer.org/upload/pdf/1254/370/239/53e9ab9eb7602d970354a97e.pdf",
  "url": [
    "http://dx.doi.org/10.4114/ia.v10i29.873",
    "http://polar.lsi.uned.es/revista/index.php/ia/article/view/479"
  ]
}
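One possible way past the error is sketched below. It assumes mag_authors_10.txt holds many records like the one above, written one after another. That is an assumption, since only a single record is shown. Under that assumption, the records can be decoded one at a time with json.JSONDecoder.raw_decode and collected into a DataFrame. The helper name iter_json_objects is hypothetical, and the demo string stands in for the real file contents.

```python
import json

import pandas as pd


def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it by hand.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the decoded value and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Stand-in for f.read(): two made-up records, one compact and one pretty-printed.
demo = '''
{"id": "a1", "title": "First", "year": 2000}
{
  "id": "b2",
  "title": "Second",
  "year": 2001
}
'''

records = list(iter_json_objects(demo))
df = pd.DataFrame(records)
print(df)
```

This still reads the whole text into memory, which may be a problem at 3 GB. If the file turns out to have exactly one record per line, pd.read_json(file, lines=True, chunksize=...) may be a lighter option, since it reads the file in pieces.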
https://stackoverflow.com/questions/66895239/how-to-read-txt-file-containing-json-data-into-python-pandas-dataframe April 01, 2021 at 04:59AM
