Wednesday, December 23, 2020

Pandas dataframe - move N number of rows from one dataframe to another

I have a training set and a test set for machine learning, but the training set contains too many rows and the test set too few. I calculated that I need to move 245 rows from the training set to the test set to produce a better split. How can I do this? The training set has 5116 rows in total.

First I randomized the rows of the training set using this:

train_df = train_df.sample(n = len(train_df)).reset_index(drop=True)  
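For reference, a full shuffle like this is often written with `frac=1`. The sketch below uses a small made-up frame in place of the real training set, and the `random_state` seed is an assumption added only to make the shuffle repeatable.

```python
import pandas as pd

# Toy stand-in for train_df (hypothetical data, not the real training set)
train_df = pd.DataFrame({"feature": range(10), "label": [0, 1] * 5})

# frac=1 samples every row exactly once, i.e. a full shuffle;
# random_state (an assumed seed) makes the result repeatable
shuffled_df = train_df.sample(frac=1, random_state=42).reset_index(drop=True)

print(len(shuffled_df) == len(train_df))  # True
```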

Then I wanted to grab the last 245 rows and move them to test_df.

I found these two solutions here

Pandas dataframe - move rows from one dataframe to another

and

Pandas move rows from 1 DF to another DF

However, they select the rows based on a condition, which I don't have. I would like to do it the way you would in Python with slices on arrays, if that's possible.

Maybe something like this (rows from 5116 - 245 onward, and all columns starting from 0):

transferdata_df = train_df.iloc[5115 - 245:, 0:]  
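A quick check on a frame of the same length suggests that slice is off by one: positions run from 0 to 5115, so starting at 5115 - 245 = 4870 keeps 246 rows. A negative start counts from the end and avoids the arithmetic. The frame below is a placeholder with a single made-up column.

```python
import pandas as pd

# Placeholder frame with the same row count as the training set (5116)
train_df = pd.DataFrame({"x": range(5116)})

# Slice from the question: starts at position 4870 and runs to the end
from_question = train_df.iloc[5115 - 245:, 0:]
print(len(from_question))  # 246

# Negative start: the last 245 rows, whatever the frame's length
last_245 = train_df.iloc[-245:]
print(len(last_245))  # 245
```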

Then append that to the test set like this:

test_df.append(transferdata_df)  

I'm not sure whether this is the correct way.
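One way the whole move could look, as a sketch rather than a definitive answer: `DataFrame.append` returns a new frame instead of modifying `test_df` in place, and it was removed in pandas 2.0, so the example uses `pd.concat` and reassigns both frames. The toy sizes and the column name are made up, and `n_move` is a hypothetical name standing in for the 245.

```python
import pandas as pd

# Toy frames standing in for the real sets (hypothetical sizes and column)
train_df = pd.DataFrame({"x": range(20)})
test_df = pd.DataFrame({"x": range(100, 105)})
n_move = 3  # would be 245 for the real data; must be greater than 0

# Take the last n_move rows of the (already shuffled) training set
transferdata_df = train_df.iloc[-n_move:]

# concat returns a new frame, so the result has to be assigned back
test_df = pd.concat([test_df, transferdata_df], ignore_index=True)

# Keep everything except the moved rows in the training set
train_df = train_df.iloc[:-n_move].reset_index(drop=True)

print(len(train_df), len(test_df))  # 17 8
```

Because both frames are reassigned, the moved rows end up only in the test set and the combined row count stays the same.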

https://stackoverflow.com/questions/65433114/pandas-dataframe-move-n-number-of-rows-from-one-dataframe-to-another December 24, 2020 at 10:00AM
