This is a multipart problem. I have found solutions for each separate part, but when I try to combine these solutions, I don't get the outcome I want.
Let's say this is my dataframe:
    import pandas as pd

    df = pd.DataFrame(list(zip([1, 3, 6, 7, 7, 8, 4], [6, 7, 7, 9, 5, 3, 1])), columns=['Values', 'Vals'])
    df

       Values  Vals
    0       1     6
    1       3     7
    2       6     7
    3       7     9
    4       7     5
    5       8     3
    6       4     1
Let's say I want to find the pattern [6, 7, 7] in the 'Values' column. I can use a modified version of the second solution given here: Pandas: How to find a particular pattern in a dataframe column?
    pattern = [6, 7, 7]
    pat_i = [df[i-len(pattern):i]                                # Get the matching rows
             for i in range(len(pattern), len(df) + 1)           # for each window of 3 consecutive elements (+ 1 so the last window is checked too)
             if all(df['Values'][i-len(pattern):i] == pattern)]  # if the pattern matched
    pat_i

    [   Values  Vals
     2       6     7
     3       7     9
     4       7     5]
The only way I've found to narrow this down to just index values is the following:
    pat_i = [df.index[i-len(pattern):i]                          # Get the index
             for i in range(len(pattern), len(df) + 1)           # for each window of 3 consecutive elements (+ 1 so the last window is checked too)
             if all(df['Values'][i-len(pattern):i] == pattern)]  # if the pattern matched
    pat_i

    [RangeIndex(start=2, stop=5, step=1)]
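For comparison, here is a minimal sketch of one way to get plain integer positions rather than RangeIndex objects. It assumes NumPy 1.20 or later (for `sliding_window_view`); the names `windows`, `starts` and `match_positions` are mine, not from the linked answer.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

df = pd.DataFrame({'Values': [1, 3, 6, 7, 7, 8, 4],
                   'Vals': [6, 7, 7, 9, 5, 3, 1]})
pattern = [6, 7, 7]

# Every window of len(pattern) consecutive values, one window per row of a 2-D array.
windows = sliding_window_view(df['Values'].to_numpy(), len(pattern))

# Positions (not index labels) at which a whole window equals the pattern.
starts = np.flatnonzero((windows == pattern).all(axis=1))

# Expand each start position into the list of row positions it covers.
match_positions = [list(range(s, s + len(pattern))) for s in starts.tolist()]
```

These are positions, so they line up with the index labels only while the dataframe still has its default 0..n-1 index.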
Once I've found the pattern, what I want to do, within the original dataframe, is reorder the pattern to [7, 7, 6], moving each entire row along with its value. In other words, going by the index, I want to get output that looks like this:
    df.reindex([0, 1, 3, 4, 2, 5, 6])

       Values  Vals
    0       1     6
    1       3     7
    3       7     9
    4       7     5
    2       6     7
    5       8     3
    6       4     1
Then, finally, I want to reset the index so that the values in all the columns stay in their new, re-ordered place:

       Values  Vals
    0       1     6
    1       3     7
    2       7     9
    3       7     5
    4       6     7
    5       8     3
    6       4     1
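The two steps above (reorder by label, then renumber) can be chained. This is only a sketch with the row order hard-coded for this one example dataframe; `reordered` is a name I made up.

```python
import pandas as pd

df = pd.DataFrame({'Values': [1, 3, 6, 7, 7, 8, 4],
                   'Vals': [6, 7, 7, 9, 5, 3, 1]})

# Reorder rows by label, then renumber them 0..n-1.
# drop=True discards the old labels instead of keeping them as a new column.
reordered = df.reindex([0, 1, 3, 4, 2, 5, 6]).reset_index(drop=True)
```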
In order to use pat_i as a basis for re-ordering, I've tried to modify the second solution given here: Python Pandas: How to move one row to the first row of a Dataframe?
    target_row = 2
    # Move target row to first element of list.
    idx = [target_row] + [i for i in range(len(df)) if i != target_row]
However, I can't figure out how to exploit the pat_i RangeIndex object to use it with this code. The solution, when I find it, will be applied to hundreds of dataframes, each of which will contain the [6, 7, 7] pattern that needs to be re-ordered in one place, but not the same place in each dataframe.
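To make the goal concrete, here is a rough sketch of the kind of function I have in mind. It assumes each dataframe contains the pattern exactly once and has a default integer index; `move_pattern_head_to_tail` is a hypothetical helper of my own, not something from pandas. (For what it's worth, `list(pat_i[0])` gives `[2, 3, 4]`, the same block of positions the loop below finds.)

```python
import pandas as pd

def move_pattern_head_to_tail(df, column, pattern):
    """Hypothetical helper: find the first run of `pattern` in df[column] and
    move its first row to the end of the run, e.g. [6, 7, 7] -> [7, 7, 6]."""
    n = len(pattern)
    values = df[column].tolist()
    for start in range(len(values) - n + 1):
        if values[start:start + n] == list(pattern):
            block = list(range(start, start + n))   # positions of the matched rows
            rotated = block[1:] + block[:1]         # first matched row goes last
            order = list(range(start)) + rotated + list(range(start + n, len(df)))
            # Select rows by position in the new order, then renumber the index.
            return df.iloc[order].reset_index(drop=True)
    return df.copy()  # assumption: leave the frame unchanged when nothing matches

df = pd.DataFrame({'Values': [1, 3, 6, 7, 7, 8, 4],
                   'Vals': [6, 7, 7, 9, 5, 3, 1]})
result = move_pattern_head_to_tail(df, 'Values', [6, 7, 7])
```

It uses `iloc` rather than `reindex` so that it works on row positions, which is why it does not depend on the index labels matching the positions.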
Any help appreciated... and I'm sure there must be an elegant, pythonic way of doing this, as it seems like it should be a common enough challenge. Thank you.
https://stackoverflow.com/questions/65911081/find-pattern-in-pandas-dataframe-reorder-it-row-wise-and-reset-index January 27, 2021 at 08:02AM