The file should contain thousands of rows, but the code below returns only the first few rows in the DataFrame.
File https://www.hkex.com.hk/eng/services/trading/securities/securitieslists/ListOfSecurities.xlsx
Failed example
import pandas as pd

url = 'https://www.hkex.com.hk/eng/services/trading/securities/securitieslists/ListOfSecurities.xlsx'
df = pd.read_excel(url, engine='openpyxl', header=2, usecols='A:D', verbose=True)
print(df.shape)

# output - only 5 rows
# Reading sheet 0
# (5, 4)
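One way to narrow this down is to look at the range the workbook itself declares for each sheet. The sketch below rests on an assumption the post does not confirm: that the published file stores a `<dimension>` record smaller than the real data, and that a reader which trusts that record stops early. It uses only the standard library, and `declared_dimensions` is a hypothetical helper name.

```python
import re
import zipfile


def declared_dimensions(xlsx_source):
    """Return {worksheet part name: declared range, or None if absent}.

    xlsx_source may be a file path or a binary file-like object.
    An .xlsx file is a zip archive; each sheet is an XML part that can
    carry a <dimension ref="A1:D100"/> element describing the used range.
    """
    result = {}
    with zipfile.ZipFile(xlsx_source) as archive:
        for name in archive.namelist():
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                # The dimension element normally appears near the top of
                # the part, so only the first 64 KB is inspected here.
                with archive.open(name) as part:
                    head = part.read(65536).decode("utf-8", errors="replace")
                match = re.search(r'<(?:\w+:)?dimension\b[^>]*\bref="([^"]+)"', head)
                result[name] = match.group(1) if match else None
    return result
```

Calling `declared_dimensions` on a local copy of ListOfSecurities.xlsx before and after re-saving it in Excel would show whether the declared range changes. A range far smaller than the real data in the original download would be consistent with the truncated result.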
Working example
Same file. I downloaded it first, opened it in Excel, modified some text and saved it (keeping the xlsx format), then used read_excel() to open the local file.
import os

import pandas as pd
import wget  # third-party package

url = 'https://www.hkex.com.hk/eng/services/trading/securities/securitieslists/ListOfSecurities.xlsx'
path = os.path.join(os.path.dirname(__file__), 'download')
os.makedirs(path, exist_ok=True)  # make sure the download folder exists
wget.download(url, out=path)
file = os.path.join(path, 'ListOfSecurities.xlsx')
# open to edit and then save in Excel
df = pd.read_excel(file, engine='openpyxl', header=2, usecols='A:D', verbose=True)
print(df.shape)

# output
# Reading sheet 0
# (17490, 4)
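Since the only difference between the two runs is the re-save in Excel, one plausible explanation (an assumption, not something confirmed in this post) is that the published file declares a worksheet range smaller than its real data, and Excel rewrites that range on save. If so, a possible workaround is to open the workbook with openpyxl in read-only mode, call `reset_dimensions()` so the declared range is ignored, and build the DataFrame from the rows. The helper name and its parameters below are hypothetical; `header_row=2` and `n_cols=4` only approximate `header=2` and `usecols='A:D'` from the code above.

```python
import pandas as pd
from openpyxl import load_workbook


def read_sheet_ignoring_dimensions(source, header_row=2, n_cols=4):
    """Read the first sheet into a DataFrame without trusting its declared range.

    source: file path or binary file-like object holding an .xlsx workbook.
    header_row: 0-based index of the row holding the column names.
    n_cols: number of leading columns to keep (4 corresponds to A:D).
    """
    wb = load_workbook(source, read_only=True, data_only=True)
    try:
        ws = wb.worksheets[0]
        # Discard the stored dimensions so iteration runs until the data ends.
        ws.reset_dimensions()
        rows = ws.iter_rows(min_row=header_row + 1, max_col=n_cols, values_only=True)
        header = next(rows)
        return pd.DataFrame(list(rows), columns=list(header))
    finally:
        wb.close()
```

With the file saved locally, `read_sheet_ignoring_dimensions(file).shape` can be compared against the `(17490, 4)` reported above. Unlike `read_excel`, this sketch does no type inference or blank-row handling, so the row count may differ slightly.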
https://stackoverflow.com/questions/65432992/failed-to-download-full-rows-using-pandas-read-excel-for-xlsx-file December 24, 2020 at 09:40AM