I have the following two strings, and I'd like to use a regex to pull out matches that start at AUG and end with one of UAA, UAG, or UGA:

    string1 = 'AGCCAUGUAGCUAACUCAGGUUACAUGGGGAUGACCCCGCGACUUGGAUUAGAGUCUCUUUUGGAAUAAGCCUGAAUGAUCCGAGUAGCAUCUCAG'
    string2 = 'CUGAGAUGCUACUCGGAUCAUUCAGGCUUAUUCCAAAAGAGACUCUAAUCCAAGUCGCGGGGUCAUCCCCAUGUAACCUGAGUUAGCUACAUGGCU'

The matches I'm looking for are:

    'AUGUAG'
    'AUGGGGAUGACCCCGCGACUUGGAUUAGAGUCUCUUUUGGAAUAA'
    'AUGACCCCGCGACUUGGAUUAGAGUCUCUUUUGGAAUAA'
    'AUGCUACUCGGAUCAUUCAGGCUUAUUCCAAAAGAGACUCUAAUCCAAGUCGCGGGGUCAUCCCCAUGUAACCUGAGUUAG'

I tried the following pattern, but it didn't work. Any explanation why?

    import re

    pattern = re.compile(r'AUG\w*(UAA|UAG|UGA)')
    matches1 = pattern.finditer(string1)
    matches2 = pattern.finditer(string2)

And while I'm at it, I'm also curious whether a list such as ['UAA', 'UAG', 'UGA'] can be built into a regex pattern (instead of writing (UAA|UAG|UGA) by hand). Thanks so much!
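
A possible explanation, offered as a sketch rather than a definitive answer: `\w*` is greedy and knows nothing about codons, so each match runs from an AUG to the last stop codon it can reach, and `finditer` never returns overlapping matches. The version below assumes the intended rule is "read in steps of three bases from each AUG and stop at the first in-frame stop codon, allowing overlaps". It puts the whole pattern inside a lookahead so that overlapping matches are reported, uses a lazy codon-wise repeat, and builds the alternation from a list with `'|'.join`. The names `stop_codons`, `stop_group`, `orf_pattern`, and `find_orfs` are made up for illustration.

```python
import re

string1 = 'AGCCAUGUAGCUAACUCAGGUUACAUGGGGAUGACCCCGCGACUUGGAUUAGAGUCUCUUUUGGAAUAAGCCUGAAUGAUCCGAGUAGCAUCUCAG'
string2 = 'CUGAGAUGCUACUCGGAUCAUUCAGGCUUAUUCCAAAAGAGACUCUAAUCCAAGUCGCGGGGUCAUCCCCAUGUAACCUGAGUUAGCUACAUGGCU'

# Keep the stop codons in a list and join them into a regex alternation.
# re.escape is not strictly needed for plain letters, but it keeps the
# join safe if an entry ever contains a regex metacharacter.
stop_codons = ['UAA', 'UAG', 'UGA']
stop_group = '|'.join(map(re.escape, stop_codons))  # 'UAA|UAG|UGA'

# (?=( ... )) is a zero-width lookahead wrapping a capture group: the regex
# engine consumes no characters, so a match may start at every AUG, even
# one that lies inside an earlier match.
# (?:[ACGU]{3})*? steps through the sequence one codon at a time, lazily,
# so the match ends at the first in-frame stop codon.
orf_pattern = re.compile(rf'(?=(AUG(?:[ACGU]{{3}})*?(?:{stop_group})))')

def find_orfs(seq):
    """Return every AUG...stop stretch in seq, overlapping ones included."""
    return [m.group(1) for m in orf_pattern.finditer(seq)]

for orf in find_orfs(string1) + find_orfs(string2):
    print(orf)
```

Under this assumption, `string1` yields exactly the three matches listed in the question. For `string2` the sketch returns the long match from the question plus a short `'AUGUAA'`, which sits in a different reading frame inside the long match. If that one is unwanted, the rule needs an extra condition (for example a minimum length) that the question does not spell out.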