I am trying to create a new dataframe from the values of an existing one. The code below accepts a dataframe called dfhiddencols, which has 3 columns:
parent, childlist, formula
It then creates a new dataframe called newdf with 2 columns:
parent, child
Next it loops through each row of dfhiddencols looking for a particular pattern. When it finds the pattern, it adds a new row to newdf, built from the parent column value of dfhiddencols and the matched pattern string.
However, when this new record is added, 2 additional columns appear in newdf:
childlist, formula
These 2 columns are not defined when creating the dictionary createrow. Do you know why these columns are being passed to the new dataframe, and how to avoid this?

def extracthiddencolumns(dfhiddencols):
    newdf = pd.DataFrame(columns=['child', 'parent'])
    createrow = {}
    for idx, row in dfhiddencols.iterrows():
        #if len(str(row['formula'])) > 3:
        for formula in row['formula'].split('|||'):
            if formula != '' and '??' in formula:
                formula = formula.strip('\n')
                formula = formula.strip('\t')
                for i in re.findall(r"\[\?\?([A-Za-z0-9_]+)\.([A-Za-z0-9_]+)\?\?\]", formula):
                    strconcat = i[0] + "." + i[1]
                    parent = row['parent']
                    createrow = {'child': parent, 'parent': strconcat}
                    newdf = dfhiddencols.append(createrow, ignore_index=True)
                    createrow = {}
    newdf.drop(columns=['childlist', 'formula'])
    return newdf

https://stackoverflow.com/questions/66620708/pandas-dataframe-columns-getting-passed-to-the-new-data-dataframe
March 14, 2021 at 11:06AM
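
Judging from the code in the question, two things look like the likely cause. The new row is appended to dfhiddencols rather than to newdf, so the result carries every column of dfhiddencols. In addition, the result of newdf.drop(...) is discarded, because drop returns a new dataframe unless it is reassigned. Note also that DataFrame.append was removed in pandas 2.0. One way to sidestep all three issues is to collect plain dictionaries in a list and build the dataframe once at the end. The following is only a sketch under assumptions: the helper name extract_rows and the sample data are hypothetical, and the input is assumed to be a list of row dictionaries such as dfhiddencols.to_dict('records') would give.

```python
import re

# Pattern copied from the question: matches tokens like [??Table.Column??]
PATTERN = re.compile(r"\[\?\?([A-Za-z0-9_]+)\.([A-Za-z0-9_]+)\?\?\]")


def extract_rows(records):
    """Return one {'child', 'parent'} dict per pattern match.

    `records` is assumed to be an iterable of dicts with 'parent' and
    'formula' keys (hypothetical helper, not part of the original code).
    """
    rows = []
    for record in records:
        # str() guards against non-string formula values such as NaN.
        for formula in str(record['formula']).split('|||'):
            # findall returns an empty list when nothing matches, so the
            # explicit '' and '??' checks from the question are not needed.
            for left, right in PATTERN.findall(formula):
                # Same key assignment as the question's createrow dict.
                rows.append({'child': record['parent'],
                             'parent': left + '.' + right})
    return rows


# Made-up sample rows, standing in for dfhiddencols.to_dict('records').
sample = [
    {'parent': 'T1.A', 'childlist': 'x',
     'formula': '[??T2.B??] + 1|||\t[??T3.C??]\n'},
    {'parent': 'T1.D', 'childlist': 'y', 'formula': 'no match here'},
]
rows = extract_rows(sample)
```

With pandas available, pd.DataFrame(rows, columns=['child', 'parent']) would then build a dataframe containing only those two columns, since childlist and formula never enter the row dictionaries. Building the frame once from a list also avoids growing a dataframe row by row inside the loop.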