I have a pandas DataFrame with a column that contains values such as ['cat, pet','dog, pet','dog','bird', 'bird, pet','tail', 'cat, tail'], and I want to find all rows where that column contains both 'cat' and 'pet', and extract the rows that match.
Note that this question is not about the specific case ['cat', 'pet']: the solution should take a dynamic input and handle any list of search terms, and the DataFrame is more than 10,000 rows long.
My goal is to filter the rows based on the values contained in a specific column.
I know that if I want to find 'cat' OR 'pet', I can filter like this:
search = ['cat', 'pet']
df[df['column'].str.contains('|'.join(search))]
But what if I want to match 'cat' AND 'pet', or other lists with different combinations of values?
I tried:
df[df['column'].str.contains('&'.join(search))]
But that does not work for me. I also tried:
np.logical_and.reduce([df['column'].str.contains(word) for word in search])
https://stackoverflow.com/questions/66165025/how-to-test-if-a-dfcolumn-contains-one-of-the-substrings-in-a-list-in-panda
February 12, 2021 at 08:43AM
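For what it is worth, the '&'.join attempt most likely fails because regular expressions have no AND operator: the pattern 'cat&pet' only matches the literal text "cat&pet". Below is a minimal sketch of two ways the AND filter could be written, using the sample values from the question. The helper names filter_all and filter_all_regex are hypothetical, and treating each term as a plain substring is an assumption about what is wanted.

```python
import re

import numpy as np
import pandas as pd

# Sample values from the question; the column name "column" is assumed.
df = pd.DataFrame({
    "column": ["cat, pet", "dog, pet", "dog", "bird", "bird, pet", "tail", "cat, tail"]
})


def filter_all(frame, col, terms):
    """Rows of `frame` whose `col` value contains every term (hypothetical helper)."""
    if not terms:
        # With no terms there is nothing to require, so keep every row.
        return frame
    # One boolean mask per term; regex=False treats each term as a literal substring.
    masks = [frame[col].str.contains(term, regex=False, na=False) for term in terms]
    # AND the masks together element-wise and use the result to select rows.
    return frame[np.logical_and.reduce(masks)]


def filter_all_regex(frame, col, terms):
    """Same filter with one regex built from lookaheads (hypothetical helper)."""
    # For ['cat', 'pet'] this builds (?=.*cat)(?=.*pet): both must appear somewhere.
    pattern = "".join(f"(?=.*{re.escape(term)})" for term in terms)
    return frame[frame[col].str.contains(pattern, regex=True, na=False)]


result = filter_all(df, "column", ["cat", "pet"])
print(result)
```

Note that np.logical_and.reduce on its own only returns a boolean mask, so it still has to be passed to df[...] to extract the rows. Both helpers match substrings, so 'cat' would also match 'category'. If whole comma-separated items are needed, the column would have to be split first.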