Here is another draft. Chapter 10 for Techniques.txt.
10a Loops -- Limiting expression scopes.
You can use "+" loops to isolate subexpressions, removing their
ability to look ahead.
Example:
Say we want to match <foo ... >, but only if the following tag isn't </foo >
<foo*>(^*</foo >)
... wouldn't work, because "*>" doesn't stop at the first ">" but keeps
looking ahead for a later one.
<foo[^>]+>(^[^<]+</foo >)
... would work, but [^...] forces inspection of each character.
<foo(*>)+{1}(^(*<)+{1}/foo >)
... does what we want, quickly. Inside the loops, "*>" and "*<" no longer look ahead.
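For readers more used to conventional regular expressions, the same effect can be sketched with Python's re module. This is only an analogy, not Proxomitron syntax: it assumes "*" behaves roughly like a backtracking ".*?", and it imitates the "+{1}" loop with a lookahead-plus-backreference idiom that cannot be backtracked into. The names NAIVE, CHAR_CLASS, ISOLATED and matched are made up for illustration.

```python
import re

# Rough Python analogues of the three expressions (assumption: an analogy
# only, Proxomitron's matcher is a different engine).

# NAIVE: the wildcard is free to backtrack past the first ">".
NAIVE = re.compile(r"<foo.*?>(?![^<]*</foo >)")

# CHAR_CLASS: cannot pass ">", but checks every character against the class.
CHAR_CLASS = re.compile(r"<foo[^>]*>(?![^<]*</foo >)")

# ISOLATED: (?=(...))\1 commits to the first ">" and never revisits it.
ISOLATED = re.compile(r"<foo(?=(.*?>))\1(?![^<]*</foo >)")

followed_by_close = "<foo a>text</foo > <b>"  # next tag is </foo >: should fail
followed_by_other = "<foo a>text<b>"          # next tag is <b>: should match


def matched(pattern, text):
    """Return the matched text, or None if the pattern does not match."""
    m = pattern.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    for name, pattern in (("naive", NAIVE),
                          ("char class", CHAR_CLASS),
                          ("isolated", ISOLATED)):
        print(name,
              repr(matched(pattern, followed_by_close)),
              repr(matched(pattern, followed_by_other)))
```

In this sketch the naive pattern wrongly matches the first sample by stretching its wildcard over "</foo >", while the other two reject it. All three accept the second sample.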
10b Avoiding superfluous tests in OR conditions.
Example:
Say we want to match "prefix-possible_suffix ... some_string" and capture
"-possible_suffix" if present.
prefix(-possible_suffix|)\1*some_string
... would make the filter attempt the match twice on input such as:
"prefix-possible_suffix ... no_match"
prefix((-possible_suffix)+)\1*some_string
... does what we want.
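The difference between the two expressions can be modelled with a toy matcher in Python. This is a simplified model written for illustration, not Proxomitron's actual engine: it assumes an OR retries its empty branch after the tail fails, while a loop commits to whatever it consumed. The function names and the "scans" counter are hypothetical.

```python
def match_with_or(text, prefix, suffix, tail):
    """Model of prefix(suffix|)\\1*tail: each OR branch reruns the tail scan."""
    scans = 0
    if not text.startswith(prefix):
        return None, scans
    rest = text[len(prefix):]
    for branch in (suffix, ""):          # try the suffix first, then nothing
        if rest.startswith(branch):
            scans += 1                   # one scan of the tail per branch
            if tail in rest[len(branch):]:
                return branch, scans
    return None, scans


def match_with_loop(text, prefix, suffix, tail):
    """Model of prefix((suffix)+)\\1*tail: the optional part is decided once."""
    if not text.startswith(prefix):
        return None, 0
    rest = text[len(prefix):]
    captured = suffix if rest.startswith(suffix) else ""
    found = tail in rest[len(captured):]  # the tail is scanned exactly once
    return (captured if found else None), 1


if __name__ == "__main__":
    sample = "prefix-possible_suffix ... no_match"
    args = ("prefix", "-possible_suffix", "some_string")
    print(match_with_or(sample, *args))
    print(match_with_loop(sample, *args))
```

On the failing sample the OR model scans for the tail twice before giving up and the loop model scans once. On matching input both capture the same text.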
10c However, +/++ loops remove the uniqueness of the string under test, even if
followed by {1,*}.
If possible, and unless you are only testing the very beginning of a
document or of a bounds match, try to start your test string with at least
one unique character (more is better).
Example:
To test for 100 asterisk symbols anywhere in a document:
\*\*\*+{98}
Suggestions welcome.