I have the following two strings, and I'd like to use a regex to pull out matches that start with AUG
and end with any one of UAA, UAG, or UGA.
string1 = 'AGCCAUGUAGCUAACUCAGGUUACAUGGGGAUGACCCCGCGACUUGGAUUAGAGUCUCUUUUGGAAUAAGCCUGAAUGAUCCGAGUAGCAUCUCAG'
string2 = 'CUGAGAUGCUACUCGGAUCAUUCAGGCUUAUUCCAAAAGAGACUCUAAUCCAAGUCGCGGGGUCAUCCCCAUGUAACCUGAGUUAGCUACAUGGCU'
The matches I'm looking for are:
'AUGUAG'
'AUGGGGAUGACCCCGCGACUUGGAUUAGAGUCUCUUUUGGAAUAA'
'AUGACCCCGCGACUUGGAUUAGAGUCUCUUUUGGAAUAA'
'AUGCUACUCGGAUCAUUCAGGCUUAUUCCAAAAGAGACUCUAAUCCAAGUCGCGGGGUCAUCCCCAUGUAACCUGAGUUAG'
I tried the following pattern, but it didn't work. Can anyone explain why?
import re
pattern = re.compile(r'AUG\w*(UAA|UAG|UGA)')
matches1 = pattern.finditer(string1)
matches2 = pattern.finditer(string2)
While I'm at it, I'm also curious whether a list such as ['UAA','UAG','UGA']
can be built into a regex pattern (instead of hard-coding (UAA|UAG|UGA)).
Thanks so much!
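A possible explanation, offered as a sketch rather than a definitive answer: `\w*` is greedy, so it runs on to the last stop codon in the string instead of the first one; it does not step in codons of three, so it ignores the reading frame; and `finditer` never returns overlapping matches, so an AUG inside an earlier match is skipped. The version below assumes that matches must stay in frame with their AUG and may overlap, which is what the expected list suggests. It wraps the pattern in a lookahead, steps lazily through `[ACGU]{3}`, and builds the alternation from the list with `'|'.join` and `re.escape`. The names `STOP_CODONS`, `orf_pattern`, and `find_orfs` are hypothetical, chosen only for this illustration.

```python
import re

string1 = 'AGCCAUGUAGCUAACUCAGGUUACAUGGGGAUGACCCCGCGACUUGGAUUAGAGUCUCUUUUGGAAUAAGCCUGAAUGAUCCGAGUAGCAUCUCAG'
string2 = 'CUGAGAUGCUACUCGGAUCAUUCAGGCUUAUUCCAAAAGAGACUCUAAUCCAAGUCGCGGGGUCAUCCCCAUGUAACCUGAGUUAGCUACAUGGCU'

STOP_CODONS = ['UAA', 'UAG', 'UGA']

# Build the alternation from the list. re.escape changes nothing for plain
# letters, but keeps the approach safe if the list ever holds metacharacters.
stop_alternation = '|'.join(map(re.escape, STOP_CODONS))

# (?=...) is a zero-width lookahead: the scan advances one character at a
# time, so overlapping matches are reported through capture group 1.
# (?:[ACGU]{3})*? consumes whole codons lazily, stopping at the first
# in-frame stop codon after the AUG.
orf_pattern = re.compile(rf'(?=(AUG(?:[ACGU]{{3}})*?(?:{stop_alternation})))')


def find_orfs(sequence):
    """Return every in-frame AUG...stop substring, overlaps included."""
    return [match.group(1) for match in orf_pattern.finditer(sequence)]


for name, sequence in (('string1', string1), ('string2', string2)):
    print(name, find_orfs(sequence))
```

Under these assumptions `string1` yields the three matches listed in the question. `string2` yields the long match from the list plus the short 'AUGUAA' near its end, which the question's list does not include; if that one is unwanted, the results would need an extra filter, for example on minimum length.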