I have a Pandas dataframe that looks similar to this:
datetime                 data1  data2
2021-01-23 00:00:31.140  a1     a2
2021-01-23 00:00:31.140  b1     b2
2021-01-23 00:00:31.140  c1     c2
2021-01-23 00:01:29.021  d1     d2
2021-01-23 00:02:10.540  e1     e2
2021-01-23 00:02:10.540  f1     f2

The real dataframe is very large and for each unique timestamp, there are a few thousand rows.
I want to save this dataframe to a Parquet file so that I can quickly read all the rows that have a specific datetime index, without loading the whole file or looping through it. How do I save it correctly in Python and how do I quickly read only the rows for one specific datetime?
After reading, I would like to have a new dataframe that contains all the rows for that specific datetime. For example, I want to read only the rows for datetime "2021-01-23 00:00:31.140" from the Parquet file and receive this dataframe:
datetime                 data1  data2
2021-01-23 00:00:31.140  a1     a2
2021-01-23 00:00:31.140  b1     b2
2021-01-23 00:00:31.140  c1     c2

I appreciate any help, thank you very much in advance!
https://stackoverflow.com/questions/66502174/read-only-specific-timestamp-multiple-rows-from-parquet-file-in-python-pandas March 06, 2021 at 11:47AM