Sunday, April 4, 2021

Regex to strip text after comment char

I know the short answer to this question is to 'use a parser', but I'm curious if this simple rule can be implemented with a regex only. I want to remove comments from a line, where a comment is a ; and everything after that until a newline. For example:

1+2+3 ; this is my comment
2+3

The regex I can use for this is:

>>> re.sub(r";.+", " ", s).split()
['1+2+3', '2+3']
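For reference, here is a self-contained version of that snippet. The value of `s` is an assumption: I am taking it to be the two-line example above joined by a newline.

```python
import re

# Assumed input: the two-line example from above, joined by a newline.
s = "1+2+3 ; this is my comment\n2+3"

# '.' does not match a newline by default, so ';.+' stops at the end of the line.
stripped = re.sub(r";.+", " ", s)
tokens = stripped.split()
print(tokens)
```

Note that `;.+` needs at least one character after the `;`, so a bare `;` at the end of a line is left in place; `;.*` would remove that too.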

Now I would like to introduce a string type: anything between double quotes, such as "this is a string". It also recognizes an escape character inside the string, so "this is a \";string;\"" would be interpreted as: this is a ";string;".

Is it possible to remove comments with a regex when the string type is present as well? Here are a few example inputs:

1+2+3 ; this is my comment
1+2+"h;\";;;\"ello;" ; another comment
1+2+;"a comment ;;;
1+2+ "; \"" ";\"hello" ; comment

To match a single string, I have: "[^\\"]*(?:\\.[^\\"]*)*". Here is a regex101 with both a string-only and a comment-only regex, plus some example patterns: https://regex101.com/r/3WTWtY/1.
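One way the two patterns might be combined is sketched below. This is only a sketch: an alternation tries the string pattern first and captures it, and otherwise falls back to the comment pattern, so a `;` inside a complete string is consumed as part of the string and never starts a comment. The names `TOKEN` and `strip_comments` are made up for illustration. Treating a `;` that comes before an unterminated quote as a comment start is my assumption about the intended behavior.

```python
import re

# Group 1: a double-quoted string (the single-string pattern from above).
# Second alternative: a comment running from ';' to the end of the line.
TOKEN = re.compile(r'("[^\\"]*(?:\\.[^\\"]*)*")|;.*')

def strip_comments(line):
    """Remove a ';' comment from one line, leaving quoted strings intact."""
    # If group 1 matched, the match is a string: put it back unchanged.
    # Otherwise the match is a comment: replace it with nothing.
    without_comment = TOKEN.sub(lambda m: m.group(1) or "", line)
    # Drop the whitespace that preceded the removed comment.
    return without_comment.rstrip()

examples = [
    r'1+2+3 ; this is my comment',
    r'1+2+"h;\";;;\"ello;" ; another comment',
    r'1+2+;"a comment ;;;',
    r'1+2+ "; \"" ";\"hello" ; comment',
]
for line in examples:
    print(strip_comments(line))
```

The order of the alternatives matters: because the regex engine scans left to right and tries the string branch first, a quote is always given the chance to open a string before any `;` inside it is seen.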

https://stackoverflow.com/questions/66945015/regex-to-strip-text-after-comment-char April 05, 2021 at 03:20AM
