I have to parse a string like this:
yo (p:abc-123-def) meets \(p:2) \(in the cinema\) \\ (p:3) (p:4\) won't
What I need to extract are all (<entity>:<id>) markups, while ignoring escaped things like \(in the cinema\) or \\. From the above example, the regex should match only (p:abc-123-def) and (p:3); it should not match \(p:2) or (p:4\), since their brackets are escaped.
Now, I am still able to modify that markup, so if there is a simpler way to do the whole thing, I’m open to suggestions. If not, I’d need to be able to get those (<entity>:<id>) markups from a regex.
Something like this
would work but look-behind groups are not supported by all browsers.
It can get complex when backslashes are repeated many times, like:
\\\\\\\\\\\\\\(p:1). You would need to know whether the number of backslashes is even or odd in order to know whether the
( is escaped or not.
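To make the even/odd rule concrete, here is a minimal JavaScript sketch that counts the backslashes directly in front of a given position. The helper name isEscaped is made up for illustration and is not part of the original answer.

```javascript
// Hypothetical helper: returns true when the character at `index` is
// preceded by an odd number of consecutive backslashes, i.e. it is escaped.
function isEscaped(text, index) {
  let backslashes = 0;
  // Walk backwards over the run of backslashes that ends right before `index`.
  for (let i = index - 1; i >= 0 && text[i] === "\\"; i--) {
    backslashes++;
  }
  return backslashes % 2 === 1;
}

console.log(isEscaped(String.raw`\(p:1)`, 1));   // true: one backslash
console.log(isEscaped(String.raw`\\(p:1)`, 2));  // false: two backslashes
```

An even count means every backslash is itself escaped, so the parenthesis that follows is a real one.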
Secondly, a colon occurring within the parentheses might be escaped as well, and would then presumably not count as the separator.
So I would suggest working with something like (?:\\.|[^:)\\])*, which deals with escaped characters (\\.) and puts some requirements on unescaped characters: per the class [^:)\\], they cannot be a colon, a closing parenthesis, or a backslash.
So this is the result:
This uses look-behind, which is supported in the latest versions of popular browsers.
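The pattern itself is not reproduced here, so the following is only a sketch of what a look-behind version could look like, built around the (?:\\.|[^:)\\])* sub-pattern. The exact regex, the \w+ used for the entity part, and the function name findMarkups are assumptions.

```javascript
// Sketch, not necessarily the answer's exact pattern.
// (?<!\\)    : the backslash run must start here (no backslash before it)
// (?:\\\\)*  : an even number of backslashes, i.e. only escaped backslashes
// \((\w+):   : an unescaped "(" followed by the entity (assumed to be \w+)
// ((?:\\.|[^:)\\])*)\) : the id, allowing escaped characters, then ")"
const MARKUP = /(?<!\\)(?:\\\\)*\((\w+):((?:\\.|[^:)\\])*)\)/g;

function findMarkups(text) {
  return [...text.matchAll(MARKUP)].map(([, entity, id]) => ({ entity, id }));
}

const sample = String.raw`yo (p:abc-123-def) meets \(p:2) \(in the cinema\) \\ (p:3) (p:4\) won't`;
console.log(findMarkups(sample));
// [ { entity: 'p', id: 'abc-123-def' }, { entity: 'p', id: '3' } ]
```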
If look-behind is not an option, then capture the character that precedes the potential backslashes, and make a capture group for the part you need:
So here you need to work with the first captured group.
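For engines without look-behind, a sketch along the lines described could be the following. The regex and the name extractMarkups are assumptions, not the answer's exact code. One caveat of consuming the preceding character is that a markup directly following another one, as in (p:1)(p:2), would be skipped, so this sketch steps lastIndex back by one after each match.

```javascript
// Sketch without look-behind.
// (?:^|[^\\]) consumes the character before the backslash run (or matches
// at the start of the string); group 1 is the whole markup, group 2 the
// entity (assumed to be \w+), group 3 the id.
const MARKUP_NO_LB = /(?:^|[^\\])(?:\\\\)*(\((\w+):((?:\\.|[^:)\\])*)\))/g;

function extractMarkups(text) {
  const found = [];
  MARKUP_NO_LB.lastIndex = 0;
  let m;
  while ((m = MARKUP_NO_LB.exec(text)) !== null) {
    found.push({ markup: m[1], entity: m[2], id: m[3] });
    // Step back one character so the closing ")" can serve as the
    // "preceding character" of a markup that follows immediately.
    MARKUP_NO_LB.lastIndex -= 1;
  }
  return found;
}

console.log(extractMarkups("(p:1)(p:2)").map((x) => x.markup));
// [ '(p:1)', '(p:2)' ]
```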
Answered By – trincot