I am a newbie and am trying to figure out whether I can run the following code without using apply and lambda functions. Any help is greatly appreciated.
The code is:
    df1['Category'] = df[key_column].apply(lambda x: process_df1(x, 'category'))

where df1 is a DataFrame, key_column is the column identified to be operated upon, and process_df1 is a function defined to run on df1.
The problem is that I am trying to avoid the warning: "A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead". I don't want to ignore or suppress the warning, or set pd.options.mode.chained_assignment = None.
Is there an alternative besides these two?
I have tried using

    df.loc[df1['Category'] = df[key_column].apply(lambda x: process_df1(x, 'category'))]

but it still produces the same message.
Apologies if it is a confusing question.
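For context, this warning usually appears when the frame being assigned to (here df1) was itself produced by slicing or filtering another DataFrame. The sketch below assumes that is the case; the sample data, the filter on qty, and the simplified process_df1 are hypothetical stand-ins, not the real code. It takes an explicit .copy() before adding the column, and passes the extra argument through apply's args parameter so no lambda is needed.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the real frame.
df = pd.DataFrame({"name": ["apple", "carrot", "beef", None],
                   "qty": [1, 2, 3, 4]})

# Assumption: df1 is a filtered slice of df. The explicit .copy() makes it
# an independent DataFrame, so adding a column to it is unambiguous.
df1 = df[df["qty"] > 1].copy()


def process_df1(value, item_type):
    # Simplified stand-in for the real lookup function (item_type unused here).
    lookup = {"carrot": "vegetable", "beef": "meat"}
    if pd.isnull(value):
        return np.nan
    return lookup.get(value, np.nan)


key_column = "name"

# Series.apply forwards extra positional arguments via args, so the
# lambda wrapper is not required.
df1["Category"] = df1[key_column].apply(process_df1, args=("category",))
```

The original df is left without a Category column; only the copy is modified.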
Update:
Here is what process_df1 is doing:
    def process_df1(string, item_type):
        d = {'category': 0, 'classification': 1, 'subclassification': 2, 'line_item': 3}
        if not pd.isnull(string):
            if string not in exception_list:
                string = get_cleaned_word(string)
                for tup in x1:
                    if tup[2] == string:
                        return tup[0][d[item_type]]
        return np.nan

And key_column is find_key_column(df).

    def find_key_column(df):
        columns = df.columns
        max_count = 0
        most_likely_column = None
        for col in columns:
            count = 0
            for string in df[col].values:
                if (type(string) == str):
                    for item in flat_list:
                        if (fuzz.token_set_ratio(string, item) > 95):
                            count += 1
                else:
                    continue
            if count > max_count:
                most_likely_column = col
                max_count = count
        return most_likely_column

https://stackoverflow.com/questions/65622765/is-there-an-alternative-to-apply-and-lambda-in-python
January 08, 2021 at 10:48AM
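Since process_df1 boils down to "clean the string, then find the first tuple in x1 whose third element matches", one possible way to drop apply and lambda entirely is to build a dictionary once and use Series.map. This is only a sketch under assumptions: the contents of x1 and exception_list and the body of get_cleaned_word below are hypothetical stand-ins, categorize and build_lookup are names introduced here for illustration, and values in exception_list are assumed to yield NaN.

```python
import pandas as pd

# Hypothetical stand-ins for the real objects referenced by process_df1.
x1 = [
    (("Produce", "Fruit", "Fresh fruit", "Apples"), "raw apple", "apple"),
    (("Meat", "Red meat", "Fresh meat", "Beef"), "raw beef", "beef"),
]
exception_list = ["total"]


def get_cleaned_word(word):
    # Assumed behaviour: normalise whitespace and case.
    return word.strip().lower()


ITEM_INDEX = {"category": 0, "classification": 1,
              "subclassification": 2, "line_item": 3}


def build_lookup(item_type):
    # Map cleaned key -> requested field; setdefault keeps the first
    # match, mirroring the early return in the original loop.
    idx = ITEM_INDEX[item_type]
    lookup = {}
    for tup in x1:
        lookup.setdefault(tup[2], tup[0][idx])
    return lookup


def categorize(series, item_type):
    # Blank out missing values and exceptions, clean the rest, then look up.
    kept = series.where(series.notna() & ~series.isin(exception_list))
    cleaned = kept.map(get_cleaned_word, na_action="ignore")
    return cleaned.map(build_lookup(item_type))


names = pd.Series([" Apple ", "BEEF", "total", None, "unknown"])
result = categorize(names, "category")
```

With the real objects in place, the assignment would then read df1['Category'] = categorize(df1[key_column], 'category'), and the lookup is built once per call rather than scanning x1 for every row.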