I have a multi-sheet xlsx file. I want to process selected sheets from it and finally save them as CSV.
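To make the goal concrete, here is a minimal sketch of the workflow I am aiming for: read only the selected sheets and write each one to its own CSV file. The helper name `export_sheets_to_csv` and the one-file-per-sheet naming are assumptions made for illustration, not part of my actual code.

```python
from pathlib import Path

import pandas as pd


def export_sheets_to_csv(xlsx_path, sheet_names, out_dir):
    """Read the selected sheets and write each one to <out_dir>/<sheet>.csv.

    Hypothetical helper: the name and the output layout are assumptions.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Passing a list to sheet_name returns a dict of {sheet name: DataFrame}.
    sheets = pd.read_excel(
        xlsx_path, sheet_name=list(sheet_names), engine="openpyxl", header=0
    )

    written = []
    for name, frame in sheets.items():
        target = out_dir / f"{name}.csv"
        frame.to_csv(target, index=False)  # drop the numeric index column
        written.append(target)
    return written
```

Called as `export_sheets_to_csv("myexelfile.xlsx", ["myselectedsheetname"], "csv_out")`, this would write `csv_out/myselectedsheetname.csv`.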
This is a snapshot of a few rows from one sheet:
I use this code to load all sheets and then process them one by one:
import pandas as pd

def load_raw_excel_file(file_full_name):
    # sheet_name=None loads every sheet into a dict of {sheet name: DataFrame}
    df = pd.read_excel(file_full_name, sheet_name=None, engine="openpyxl", header=0)
    sheets_name = list(df.keys())
    return df, sheets_name
The output of the code (from the same page) looks like this:
dfs, shs = load_raw_excel_file("myexelfile.xlsx")
dfs['myselectedsheetname']
As you can see, some values in the Contract column have been converted to dates, but I want them left exactly as they appear in the file. I've tried using converters and dtype in pd.read_excel, but neither worked:
df = pd.read_excel(file_full_name, sheet_name=None, engine="openpyxl", header=0, dtype=str)
or
df = pd.read_excel("myexelfile.xlsx", sheet_name='selectedsheetname', header=0, converters={'Contract':str})
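One check that might narrow this down is to look at the raw cells with openpyxl and see whether the workbook itself stores those Contract values as dates. If it does, `dtype=str` and `converters` would most likely only stringify values that have already been parsed as datetimes. The sketch below is an assumption-laden diagnostic, not a fix: the helper name `find_date_cells` is hypothetical, and it assumes the header is in the first row.

```python
from openpyxl import load_workbook


def find_date_cells(xlsx_file, sheet_name, column_header):
    """List the cells in one column that the workbook stores as dates.

    Hypothetical helper; assumes the column headers are in row 1.
    """
    workbook = load_workbook(xlsx_file)  # accepts a path or a file-like object
    sheet = workbook[sheet_name]

    # Locate the column by its header text (openpyxl columns are 1-based).
    headers = [cell.value for cell in sheet[1]]
    col_idx = headers.index(column_header) + 1

    found = []
    for (cell,) in sheet.iter_rows(min_row=2, min_col=col_idx, max_col=col_idx):
        # is_date is True when the cell holds a date/time value or carries
        # a date number format.
        if cell.is_date:
            found.append((cell.coordinate, cell.value, cell.number_format))
    return found
```

For example, `find_date_cells("myexelfile.xlsx", "myselectedsheetname", "Contract")` would return the coordinates, parsed values, and number formats of any Contract cells stored as dates.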
Any ideas?
https://stackoverflow.com/questions/66791588/how-to-convert-column-values-to-str-when-reading-multi-sheet-xlsx-using-pd-read March 25, 2021 at 09:00AM