I know the short answer to this question is 'use a parser', but I'm curious whether this simple rule can be implemented with a regex alone. I want to remove comments from a line, where a comment is a ; and everything after it up to the newline. For example:

    1+2+3 ; this is my comment
    2+3
The regex I can use for this is:

    >>> import re
    >>> s = "1+2+3 ; this is my comment\n2+3"
    >>> re.sub(r";.+", " ", s).split()
    ['1+2+3', '2+3']
Now I would like to introduce a string type, which is anything between double quotes, such as "this is a string", and which recognizes an escape character within it, so that "this is a \";string;\"" would be interpreted as: this is a ";string;".
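To make the problem concrete, here is a small sketch of what the comment-only regex does once a string contains a semicolon. The helper name naive_strip and the sample input are made up for illustration; only the pattern comes from the snippet above.

```python
import re

def naive_strip(line: str) -> str:
    # Comment-only regex from above: cuts at the first ';' it sees,
    # with no notion of whether that ';' sits inside a quoted string.
    return re.sub(r";.+", " ", line)

# Hypothetical input: the first ';' is inside the string "a;b",
# so the string gets cut in half along with the real comment.
print(naive_strip('1+2+"a;b" ; c'))
```

The string is truncated at its own semicolon, which is exactly the case the regex needs to learn to skip.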
Is it possible to remove comments via a regex with the string type as well? Here are a few example inputs:

    1+2+3 ; this is my comment
    1+2+"h;\";;;\"ello;" ; another comment
    1+2+;"a comment ;;;
    1+2+ "; \"" ";\"hello" ; comment
To match a single string, I have:

    "[^\\"]*(?:\\.[^\\"]*)*"

Here is a regex101 with both a string-only and a comment-only regex, along with some example patterns: https://regex101.com/r/3WTWtY/1.
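One way the two patterns might be combined, as a rough sketch rather than a definitive answer: put the string pattern first in an alternation so that a complete string is consumed before any ; inside it can start a comment, then keep strings and drop comments in a re.sub callback. The names STRING, TOKEN, and strip_comments are made up for illustration. The sketch assumes an unterminated quote is not treated as a string (so a ; after it still starts a comment), and that trailing whitespace before a comment is left in place.

```python
import re

# String pattern from the question: a double-quoted string with backslash escapes.
STRING = r'"[^\\"]*(?:\\.[^\\"]*)*"'

# Alternation: a whole string (captured in group 1) or a comment up to end of line.
# The string branch comes first, so semicolons inside a string are consumed with it.
TOKEN = re.compile(rf'({STRING})|;[^\n]*')

def strip_comments(text: str) -> str:
    # Strings are put back verbatim; comments (group 1 is None) become "".
    return TOKEN.sub(lambda m: m.group(1) or "", text)

examples = [
    r'1+2+3 ; this is my comment',
    r'1+2+"h;\";;;\"ello;" ; another comment',
    r'1+2+;"a comment ;;;',
    r'1+2+ "; \"" ";\"hello" ; comment',
]
for line in examples:
    print(repr(strip_comments(line)))
```

Note that the string pattern as written can also match across newlines, since its character classes do not exclude them; if strings must stay on one line, the classes would need \n added.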