Friday, May 7, 2021

Pass a folder containing 18,000 email text files into a function to extract all emails and subjects from the text

I defined a function to extract email addresses into a list, and also to extract the subject, from a folder containing numerous emails stored as text files. The function is:

def preprocess(text):
    em = []
    st = ""
    for i in re.findall(r'[\w\-\.]+@[\w\.-]+\b', text):
        temp = []
        temp = i.split('@')[1]
        temp = temp.split('.')
        if 'com' in temp:
            temp.remove('com')
        for i in temp:
            if len(i) > 2:
                em.append(i)
    for i in em:
        st += i
        st += " "
    return em, st

To pass each text file to the function above, I did:

os.chdir(path)
myFiles = glob.glob('*.txt')
print(type(myFiles))
for text in myFiles:
    email_list, subject = preprocess(text)

I get an empty email_list and an empty subject, but when I pass a single text file, the function returns output. How do I pass all the text files in a folder to the function so that I can extract the emails and subject from each one?
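One likely cause, assuming the loop shown is the whole calling code: glob.glob returns file names, so preprocess receives a string such as "mail1.txt" rather than the file's contents, and a file name contains no "@" for the regex to match. The loop also overwrites email_list and subject on every iteration, so only the last file's result would survive. Below is a minimal sketch that opens each file, reads its text, and keeps one result per file. The helper name preprocess_folder, the UTF-8 encoding, and errors='ignore' are assumptions for illustration and are not part of the original question.

```python
import glob
import os
import re


def preprocess(text):
    # Same logic as the function in the question; only the loop variables are renamed.
    em = []
    st = ""
    for address in re.findall(r'[\w\-\.]+@[\w\.-]+\b', text):
        parts = address.split('@')[1].split('.')
        if 'com' in parts:
            parts.remove('com')
        for part in parts:
            if len(part) > 2:
                em.append(part)
    for item in em:
        st += item
        st += " "
    return em, st


def preprocess_folder(path):
    """Hypothetical helper: run preprocess on the contents of every .txt file in path."""
    results = {}
    pattern = os.path.join(glob.escape(path), '*.txt')
    for filename in sorted(glob.glob(pattern)):
        # Assumption: the files are (mostly) UTF-8; undecodable bytes are skipped.
        with open(filename, encoding='utf-8', errors='ignore') as f:
            text = f.read()  # pass the file's text, not its name
        results[os.path.basename(filename)] = preprocess(text)
    return results
```

Calling preprocess_folder(path) returns a dict that maps each file name to its (email_list, subject) pair, so results from all files remain available after the loop finishes.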

https://stackoverflow.com/questions/67439057/pass-folder-containing-18000-email-text-files-into-function-to-extract-all-email May 08, 2021 at 01:05AM
