Saturday, March 13, 2021

Pandas DataFrame columns getting passed to the new data dataframe

I am trying to create a new dataframe from the values of an existing dataframe. The code below accepts a dataframe called dfhiddencols, which has 3 columns:

parent, childlist, formula

It then creates a new dataframe called newdf with 2 columns:

parent, child

It then loops through each row of dfhiddencols looking for a particular pattern. When it finds the pattern, it adds a new row to newdf, built from the parent column value in dfhiddencols and the matched pattern string.

However, when this new record is added, it adds 2 additional columns to newdf:

childlist, formula

These 2 columns are not defined when creating the dictionary createrow. Do you know why the columns are getting passed to the new dataframe, and how to avoid this?

def extracthiddencolumns(dfhiddencols):
    newdf = pd.DataFrame(columns=['child', 'parent'])
    createrow = {}
    for idx, row in dfhiddencols.iterrows():
        #if len(str(row['formula'])) > 3:
            for formula in row['formula'].split('|||'):
                if formula != '' and '??' in formula:
                    formula = formula.strip('\n')
                    formula = formula.strip('\t')
                    for i in re.findall(r"\[\?\?([A-Za-z0-9_]+)\.([A-Za-z0-9_]+)\?\?\]", formula):
                        strconcat = i[0] + "." + i[1]
                        parent = row['parent']
                        createrow = {'child': parent, 'parent': strconcat}
                        newdf = dfhiddencols.append(createrow, ignore_index=True)
                createrow = {}
    newdf.drop(columns=['childlist', 'formula'])
    return newdf
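One likely reading of the code above: the line `newdf = dfhiddencols.append(createrow, ignore_index=True)` appends to the input dataframe rather than to newdf, so the result carries all of dfhiddencols' columns. The later `newdf.drop(...)` call also returns a new dataframe that is never assigned back. Below is a minimal sketch of one possible fix, not a definitive implementation. It collects the matched rows in a plain list and builds the dataframe once at the end, which also avoids `DataFrame.append` (removed in pandas 2.0). The function name `extract_hidden_columns`, the constant `PATTERN`, and the sample data are hypothetical, introduced only for illustration. The child/parent mapping is kept as in the question.

```python
import re

import pandas as pd

# Pattern copied from the question; it matches text such as [??Table.Column??].
PATTERN = re.compile(r"\[\?\?([A-Za-z0-9_]+)\.([A-Za-z0-9_]+)\?\?\]")


def extract_hidden_columns(dfhiddencols):
    """Return a dataframe with only 'child' and 'parent' columns."""
    rows = []  # collect plain dicts, build the dataframe once at the end
    for _, row in dfhiddencols.iterrows():
        # str() guards against non-string cells such as NaN
        for formula in str(row['formula']).split('|||'):
            for left, right in PATTERN.findall(formula):
                # Same mapping as the question's createrow dictionary
                rows.append({'child': row['parent'],
                             'parent': left + "." + right})
    # Passing columns= keeps the two columns even when nothing matched
    return pd.DataFrame(rows, columns=['child', 'parent'])


# Hypothetical sample input with the three columns described above
sample = pd.DataFrame({
    'parent': ['p1', 'p2'],
    'childlist': ['a', 'b'],
    'formula': ['[??T1.c1??] + 1|||[??T2.c2??]', 'no pattern here'],
})
result = extract_hidden_columns(sample)
print(list(result.columns))
```

Because the rows are never appended to dfhiddencols, the childlist and formula columns have no way to reach the result, and no drop call is needed.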
https://stackoverflow.com/questions/66620708/pandas-dataframe-columns-getting-passed-to-the-new-data-dataframe March 14, 2021 at 11:06AM
