I am trying to load JSON data stored in a .txt file into a pandas DataFrame. The file is very large (3 GB). I have followed the answers to this question, yet I am still getting an error.
Here is the code I'm using:
    import pandas as pd
    import json

    # file to import
    file = 'mag_authors_10.txt'

    with open(file) as f:
        string = f.read()

    jsonData = json.loads(string)
    print(pd.DataFrame(jsonData))
Here is the error:
      File "my_path.py", line 23, in <module>
        jsonData = json.loads(string)
      File "C:\Users\Seb\anaconda3\lib\json\__init__.py", line 357, in loads
        return _default_decoder.decode(s)
      File "C:\Users\Seb\anaconda3\lib\json\decoder.py", line 340, in decode
        raise JSONDecodeError("Extra data", s, end)
    JSONDecodeError: Extra data
My data looks like this:
{ "id": "53e9ab9eb7602d970354a97e", "title": "Data mining: concepts and techniques", "authors": [ { "name": "jiawei han", "org": "department of computer science university of illinois at urbana champaign" }, { "name": "micheline kamber", "org": "department of computer science university of illinois at urbana champaign" }, { "name": "jian pei", "org": "department of computer science university of illinois at urbana champaign" } ], "year": 2000, "keywords": [ "data mining", "structured data", "world wide web", "social network", "relational data" ], "fos": [ "relational database", "data model", "social network" ], "n_citation": 29790, "references": [ "53e99ef4b7602d97027c2346", "53e9aa23b7602d970338fb5e", "53e99cf5b7602d97025aac75" ], "doc_type": "book", "lang": "en", "publisher": "Elsevier", "isbn": "1-55860-489-8", "doi": "10.4114/ia.v10i29.873", "pdf": "//static.aminer.org/upload/pdf/1254/370/239/53e9ab9eb7602d970354a97e.pdf", "url": [ "http://dx.doi.org/10.4114/ia.v10i29.873", "http://polar.lsi.uned.es/revista/index.php/ia/article/view/479" ] }
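A possible explanation, not confirmed by anything in this post: json.loads raises "Extra data" when more text follows the first complete JSON document, which is what happens if the file holds one record like the sample per line rather than a single array. Under that assumption, the file could be parsed line by line and turned into DataFrames in batches, so the whole 3 GB string never has to sit in memory. The sketch below is only that; the helper names (iter_records, iter_batches, iter_frames), the batch size, and the UTF-8 encoding are all hypothetical choices.

```python
import json
from itertools import islice


def iter_records(lines):
    """Yield one parsed object per non-empty line (assumes one JSON object per line)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def iter_batches(records, size):
    """Group an iterable of records into lists of at most `size` items."""
    records = iter(records)
    while True:
        batch = list(islice(records, size))
        if not batch:
            return
        yield batch


def iter_frames(path, size=100_000):
    """Yield one pandas DataFrame per batch of records read from `path`."""
    import pandas as pd  # imported here so the helpers above need only the stdlib

    # Assumption: the file is UTF-8 encoded.
    with open(path, encoding="utf-8") as f:
        for batch in iter_batches(iter_records(f), size):
            yield pd.DataFrame(batch)
```

With these helpers, something like `for df in iter_frames('mag_authors_10.txt'): ...` would process the file one batch at a time. If the file is already known to be line-delimited, `pd.read_json(path, lines=True, chunksize=...)` is an existing pandas call that does much the same thing.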
https://stackoverflow.com/questions/66895239/how-to-read-txt-file-containing-json-data-into-python-pandas-dataframe April 01, 2021 at 04:59AM