I’m currently experimenting with regular expressions, and I’m writing code that is supposed to take a String and find every place where a string literal is defined in code, like "test" or ""this text is in quotes"".
I set the following rules:
It has to begin with a " without a \ coming before
Between the beginning and the end there should be as few other characters as possible
It has to end with a " without a \ coming before
I came up with the following regex: "(?!\\\\)\".*?(?!\\\\)\""
However, my code seems to ignore the end condition. I think this is because, when a \ comes right before the closing ", the \ is treated as part of the .*? match.
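To check that suspicion, here is a minimal sketch in plain Java that compares the pattern above with a variant that uses a negative lookbehind (?<!...) instead of the negative lookahead (?!...). A lookahead tests the character that follows the current position, which at that point is the " itself, so the condition always passes; a lookbehind tests the character before it. The class name LookaroundDemo and the helper allMatches are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Hypothetical helper: collects every substring of input that regex matches.
    static List<String> allMatches(String input, String regex) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String test = "\"test1\\\"\" test2 \"test 3\"";
        // Negative lookahead: checks the character AFTER the position (the quote itself).
        String lookahead = "(?!\\\\)\".*?(?!\\\\)\"";
        // Negative lookbehind: checks the character BEFORE the quote.
        String lookbehind = "(?<!\\\\)\".*?(?<!\\\\)\"";
        System.out.println(allMatches(test, lookahead));
        System.out.println(allMatches(test, lookbehind));
    }
}
```

With the lookahead version the first match stops at the escaped quote, and the next match then starts at the real closing quote, so the text between the two literals is reported as a string. With the lookbehind version the two matches are "test1\"" and "test 3".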
Here is the code I’m currently using.
import java.util.regex.*;

String test="\"test1\\\"\" test2 \"test 3\"";

void setup() {
  println(find(test, "(?!\\\\)\".*?(?!\\\\)\""));
}

// Prints every match and returns the start index of each one.
// The input is padded with one space on each side before matching.
int[] find(String inp, String regex) {
  Pattern p=Pattern.compile(regex);
  Matcher matcher=p.matcher(" "+inp+" ");
  ArrayList<Integer> matchesBeg=new ArrayList<Integer>();
  if (matcher.find()) do {
    matchesBeg.add(matcher.start());
    println((" "+inp+" ").substring(matcher.start(), matcher.end()));
  } while (matcher.find(matcher.end()+1));
  int ret[]=new int[matchesBeg.size()];
  for (int i=0; i<ret.length; i++) ret[i]=matchesBeg.get(i);
  return ret;
}
And this is a minimal example in which the problem occurs:
String sample="\"\\\"\"";
String regex="(?!\\\\)\".*?(?!\\\\)\"";
println(sample+"\n"+regex);
println(sample.matches(regex));
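One thing to keep in mind with this minimal example: String.matches() only returns true when the whole input matches the pattern, so it behaves differently from Matcher.find(), which looks for a match anywhere in the input. The sketch below (plain Java; the class name MatchesVsFind and the helper firstMatch are made up for illustration) runs both on the same sample.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFind {
    // Hypothetical helper: first substring matched by regex, or null if there is none.
    static String firstMatch(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String sample = "\"\\\"\"";
        String regex = "(?!\\\\)\".*?(?!\\\\)\"";
        // matches() must consume the whole sample, so .*? is forced to expand.
        System.out.println(sample.matches(regex));
        // find() stops at the first place the pattern is satisfied.
        System.out.println(firstMatch(sample, regex));
    }
}
```

Here matches() returns true, because the whole sample can be matched, while find() stops early and returns only the first three characters ("\"), which is where the ignored end condition shows up.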
Does anyone know how to fix the regex?
(Edit: OK, I managed to fix it myself. The correct regex is: String regex="(?<!\\\\).*(?<!\\\\)\"";)
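A caveat with lookbehind-based patterns in general: they only look at the single character before the quote, so a literal that ends in an escaped backslash, such as "a\\", is not closed where it should be. A commonly used alternative is to consume each escape sequence as a unit. The sketch below is one possible way to do that in plain Java, not the only one; the class name EscapeAwareFinder and the method findLiterals are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeAwareFinder {
    // Opening quote, then any number of either
    //   - a character that is neither a quote nor a backslash, or
    //   - a backslash followed by any one character (an escape sequence),
    // then the closing quote.
    static final Pattern LITERAL = Pattern.compile("\"(?:[^\"\\\\]|\\\\.)*\"");

    // Hypothetical helper: returns every string literal found in code.
    static List<String> findLiterals(String code) {
        List<String> out = new ArrayList<>();
        Matcher m = LITERAL.matcher(code);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // The input text is: "a\\" b "c\"d"
        System.out.println(findLiterals("\"a\\\\\" b \"c\\\"d\""));
    }
}
```

Because every backslash is consumed together with the character after it, the pattern does not need a lookbehind, and an escaped backslash right before the closing quote is handled as well.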