The first problem we noticed is that variable substitution strings
at the beginning of the line are ignored. The cause is that the
regular expression requires at least one non-`\` character before the
variable substitution string. When the variable substitution string is
located at the beginning of the line, no such character precedes it.
As a result, there is no match.

To solve this problem, we can make use of the `^` symbol, which
indicates the beginning of line, and the concept of alternation.
Consider the regular expression
`(^(?:\\\\)*|.*?[^\\](?:\\\\)*)\$\{([a-zA-Z0-9.]+)\}`.
The first capturing group is modified to match either an even number
of `\` characters at the beginning of the line (through the
subexpression `^(?:\\\\)*`), or a string of any characters, followed
by at least one non-`\` character and optionally an even number of
`\` characters.
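The following is a minimal sketch of how this regular expression might be applied in Java. It is not the article's TestRegex03.java: the class name `Take3Sketch`, the `substitute` helper, and the variable table (`abc` mapped to `ABC`, `defg` mapped to `DEFG`, inferred from the sample output) are assumptions made for illustration, and the sketch assumes Java 9 or later. Note that every backslash of the regular expression is doubled inside the Java string literal.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Take3Sketch {
    // Hypothetical variable table, inferred from the sample output.
    static final Map<String, String> VARS = Map.of("abc", "ABC", "defg", "DEFG");

    // The regular expression above as a Java string literal:
    // every backslash in the regex is doubled.
    static final Pattern VAR = Pattern.compile(
        "(^(?:\\\\\\\\)*|.*?[^\\\\](?:\\\\\\\\)*)\\$\\{([a-zA-Z0-9.]+)\\}");

    // Replaces each matched ${name} with its value; group 1 (the prefix) is kept.
    static String substitute(String input) {
        Matcher m = VAR.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(2);
            // Unknown names are left untouched by re-emitting the whole match.
            String replacement = VARS.containsKey(name)
                ? m.group(1) + VARS.get(name)
                : m.group();
            // quoteReplacement keeps '\' and '$' in the replacement literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(substitute("${abc}"));      // ABC
        System.out.println(substitute("a${abc}"));     // aABC
        System.out.println(substitute("\\\\${abc}"));  // \\ABC (two backslashes kept)
        System.out.println(substitute("\\${abc}"));    // \${abc} (escaped, unchanged)
    }
}
```

The first call exercises the new `^(?:\\\\)*` alternative, since nothing precedes the variable substitution string.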

`substituteString`, Take 3

If we modify the test case to use the above regular expression, we
end up with TestRegex03.java.
Running the `main` method, we have the following output:

```
"abc" becomes "abc"
"a${abc}" becomes "aABC"
"${abc}" becomes "ABC"
"${abc}${defg}" becomes "ABC${defg}"
"${abc}z${defg}" becomes "ABCzDEFG"
"abc${abc}${defg}xyz" becomes "abcABC${defg}xyz"
"abc${abc}xy${defg}xyz" becomes "abcABCxyDEFGxyz"
"abc\${abc}xy\\${defg}xyz\\\${abc}xy" becomes "abc\${abc}xy\\DEFGxyz\\\${abc}xy"
"\${abc}xy\\${defg}xyz\\\${abc}xy" becomes "\${abc}xy\\DEFGxyz\\\${abc}xy"
"\\${abc}xy\\${defg}xyz\\\${abc}xy" becomes "\\ABCxy\\DEFGxyz\\\${abc}xy"
"\\${abc}\\${defg}xyz\\\${abc}xy" becomes "\\ABC\\${defg}xyz\\\${abc}xy"
"\\\${abc}xy\\${defg}xyz\\\${abc}xy" becomes "\\\${abc}xy\\DEFGxyz\\\${abc}xy"
"\a\${abc}xy\\${defg}xyz\\\${abc}xy\\\lmn" becomes "\a\${abc}xy\\DEFGxyz\\\${abc}xy\\\lmn"
```

This is better, but there are still a few variable substitution strings that were ignored erroneously. We shall now look at how we can remedy that.

If we examine the output from `substituteString`, Take 3, we notice
that a variable substitution string that immediately follows another
does not get substituted. The problem is that the regular expression
we have thus far expects each variable substitution string it matches
to be preceded by either the beginning of the line or at least one
character other than `\`. When a variable substitution string is
followed immediately by another, the previous match has already
consumed everything up to the closing `}`, so the only way to satisfy
that requirement is for `.*?[^\\]` to consume the beginning of the
second variable substitution string, which then can no longer be
matched by `\$\{([a-zA-Z0-9.]+)\}` as desired.

Lookbehind assertions can come to the rescue in these cases. Lookahead and lookbehind assertions are "zero-width", in the sense that they do not "consume" the substring that they match. The matched substring either constitutes a prefix or a suffix that lies outside of the string matched by the regular expression, or it must also be matched by another part of the regular expression. Lookahead and lookbehind assertions can be used to specify constraints on what must precede or follow a particular regular expression construct.
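To make the zero-width behaviour concrete, here is a small sketch that counts matches of two simplified patterns against three adjacent variable substitution strings. The class name `LookbehindSketch`, the helper `countMatches`, and the simplified patterns (no backslash handling) are assumptions for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindSketch {
    // Counts how many times the regex is found in the input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String input = "${a}${b}${c}";

        // '}' is consumed as part of the match, so the '}' that ends one
        // match cannot also serve as the prefix of the next one.
        System.out.println(countMatches("(?:^|\\})\\$\\{[a-z]+\\}", input));       // 2

        // '}' is only asserted, not consumed: it may belong to the
        // previous match and still satisfy the lookbehind.
        System.out.println(countMatches("(?:^|(?<=\\}))\\$\\{[a-z]+\\}", input));  // 3
    }
}
```

With the consuming pattern, `${b}` is skipped because the `}` before it was already used up by the match for `${a}`. The lookbehind version only inspects that `}`, so all three are found.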

Consider the regular expression
`(^(?:\\\\)*|.*?[^\\](?:\\\\)*|(?<=})(?:\\\\)*)\$\{([a-zA-Z0-9.]+)\}`.
Here we added the subexpression `(?<=})(?:\\\\)*` to the alternation.
The new subexpression begins with a positive lookbehind assertion,
which enforces the constraint that the character `}` precedes whatever
follows. Taken together, the regular expression matches any variable
substitution string that is preceded by either

- The beginning of the line (optionally followed by an even number of `\` characters),
- Any character sequence (matched reluctantly), followed by any character except `\` (optionally followed by an even number of `\` characters), or
- The character `}`, which may already have been consumed by the regular expression (as part of the previous variable substitution string). This may be followed (optionally) by an even number of `\` characters.
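The three alternatives can be exercised with a short sketch that runs the previous regular expression and the new one side by side. This is not the article's TestRegex04.java: the class name `TakeCompareSketch`, the constants `TAKE3` and `TAKE4`, the `substitute` helper, and the variable table are assumptions for illustration, and the sketch assumes Java 9 or later.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TakeCompareSketch {
    // Hypothetical variable table, inferred from the sample output.
    static final Map<String, String> VARS = Map.of("abc", "ABC", "defg", "DEFG");

    // Regex without the lookbehind alternative (backslashes doubled for Java).
    static final String TAKE3 =
        "(^(?:\\\\\\\\)*|.*?[^\\\\](?:\\\\\\\\)*)\\$\\{([a-zA-Z0-9.]+)\\}";

    // Same regex with the added alternative (?<=})(?:\\\\)*.
    static final String TAKE4 =
        "(^(?:\\\\\\\\)*|.*?[^\\\\](?:\\\\\\\\)*|(?<=})(?:\\\\\\\\)*)"
        + "\\$\\{([a-zA-Z0-9.]+)\\}";

    // Replaces each matched ${name}; group 1 (the prefix) is preserved.
    static String substitute(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(2);
            // Unknown names are left untouched by re-emitting the whole match.
            String replacement = VARS.containsKey(name)
                ? m.group(1) + VARS.get(name)
                : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String adjacent = "${abc}${defg}";
        System.out.println(substitute(TAKE3, adjacent));  // ABC${defg}
        System.out.println(substitute(TAKE4, adjacent));  // ABCDEFG
        // A single backslash still escapes the second variable.
        System.out.println(substitute(TAKE4, "${abc}\\${defg}"));  // ABC\${defg}
    }
}
```

The lookbehind works across successive `find()` calls because each call resumes within the same input, so the `}` consumed by the previous match is still visible to the assertion.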

`substituteString`, Take 4

If we modify the test case to use the above regular expression, we
end up with TestRegex04.java.
Running the `main` method, we have the following output:

```
"abc" becomes "abc"
"a${abc}" becomes "aABC"
"${abc}" becomes "ABC"
"${abc}${defg}" becomes "ABCDEFG"
"${abc}z${defg}" becomes "ABCzDEFG"
"abc${abc}${defg}xyz" becomes "abcABCDEFGxyz"
"abc${abc}xy${defg}xyz" becomes "abcABCxyDEFGxyz"
"abc\${abc}xy\\${defg}xyz\\\${abc}xy" becomes "abc\${abc}xy\\DEFGxyz\\\${abc}xy"
"\${abc}xy\\${defg}xyz\\\${abc}xy" becomes "\${abc}xy\\DEFGxyz\\\${abc}xy"
"\\${abc}xy\\${defg}xyz\\\${abc}xy" becomes "\\ABCxy\\DEFGxyz\\\${abc}xy"
"\\${abc}\\${defg}xyz\\\${abc}xy" becomes "\\ABC\\DEFGxyz\\\${abc}xy"
"\\\${abc}xy\\${defg}xyz\\\${abc}xy" becomes "\\\${abc}xy\\DEFGxyz\\\${abc}xy"
"\a\${abc}xy\\${defg}xyz\\\${abc}xy\\\lmn" becomes "\a\${abc}xy\\DEFGxyz\\\${abc}xy\\\lmn"
```

Written by Mike Kwong