P5RE - a COM object unleashing the power of Perl 5 regular expressions in Visual Basic
By Daniele Mezzetti
Wraps the PCRE library by Philip Hazel

Hosted on SourceForge.net

The Regexp object encapsulates the regular expression.

Example

Dim r As Regexp

Set r = New Regexp

r = "(\w+)"

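'options passed for this Match call (no parentheses when the result is discarded)
r.Match "aaa", Caseless:=True

'options set on the object itself
r.SetOptions Caseless:=False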

r.Match("aaa")

Debug.Print r.Atoms.Count

'print main pattern match

Debug.Print r.Atoms(0)

'loop over all matches

For i = 0 To r.Atoms.Count

    Debug.Print r.Atoms(i)

Next

'loop over subpatterns

For Each s In r.Atoms
   
    Debug.Print s

Next

r.Optimize = True

r.Locale "ITA"

r.Pattern = "(\w+)"

Debug.Print r.Match("ààà")

Pattern is the default property of Regexp, so
r.Pattern = "pattern" and r = "pattern" are equivalent.

Atoms is the collection of all matched items. You can iterate over it with either a For..Next or a For Each loop. Atoms(0) is the whole pattern match; Atoms(1) to Atoms(Atoms.Count) are the subpattern matches.
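As a rough illustration of that indexing, here is a minimal sketch with two subpatterns. It uses only the members shown in the example above (Pattern, Match, Atoms, Atoms.Count); the procedure name ShowAtoms and the sample pattern and input are made up for the illustration.

```vb
Sub ShowAtoms()
    Dim r As Regexp
    Dim i As Long

    Set r = New Regexp
    r.Pattern = "(\d+)-(\d+)"   ' two capturing subpatterns
    r.Match "10-20"

    ' Atoms(0) is the whole pattern match
    Debug.Print "whole: "; r.Atoms(0)

    ' Atoms(1) to Atoms(Atoms.Count) are the subpattern matches
    For i = 1 To r.Atoms.Count
        Debug.Print "sub"; i; ": "; r.Atoms(i)
    Next i
End Sub
```

Starting the loop at 1 skips the whole match; start it at 0 to include it.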

Setting Optimize = True allows for faster execution when the same Regexp is used for many matches.

Regexp supports locales via the .Locale method.

Options can be set with the SetOptions method or passed directly in the Match call.

The available options are:

(these force recompilation if necessary)
  • Anchored
  • Caseless
  • DollarAtEnd
  • DotAll
  • Extended
  • Multiline
  • Nogreedy
(these do not force recompilation)
  • Notbol
  • Noteol
  • Notempty

For their meaning, see the accompanying pcre.txt document. Options of the first group must be set before the regexp is compiled. If such an option is set on an already-compiled regexp, P5RE recompiles it automatically.
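The difference between the two groups can be sketched as follows. This is only an illustration: the example above shows Caseless as a named argument, and the sketch assumes that Multiline and Notbol can be passed the same way. The procedure name ShowOptions and the sample text are hypothetical.

```vb
Sub ShowOptions()
    Dim r As Regexp
    Dim s As String

    s = "first" & vbLf & "second"

    Set r = New Regexp
    r.Pattern = "^\w+"
    r.Match s

    ' First-group option (assumed named argument): the pattern has
    ' already been used, so P5RE is expected to recompile it here.
    r.SetOptions Multiline:=True
    r.Match s

    ' Second-group option (assumed named argument): given at match
    ' time, it should not require a recompilation.
    r.Match s, Notbol:=True
    Debug.Print r.Atoms.Count
End Sub
```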

P5RE reports errors VB-style, through the Err object.
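A minimal sketch of what this could look like with the usual On Error pattern. Which conditions P5RE actually raises, and with which error numbers, is not stated here; the sketch simply assumes that an unbalanced pattern raises an error. The procedure name MatchWithErrorHandling is hypothetical.

```vb
Sub MatchWithErrorHandling()
    Dim r As Regexp

    On Error GoTo Failed

    Set r = New Regexp
    r.Pattern = "(\w+"      ' unbalanced parenthesis: assumed to raise an error
    r.Match "aaa"
    Debug.Print r.Atoms(0)
    Exit Sub

Failed:
    ' Err.Number and Err.Description are standard members of the VB Err object
    Debug.Print "P5RE error"; Err.Number; ": "; Err.Description
End Sub
```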

P5RE is built with '\n' (char 0x0A) as line separator. Beware that MS apps do not behave consistently: text coming from a multi-line Excel spreadsheet cell uses '\n', while text coming from a textbox control has the infamous "\r\n" pair (chars 0x0D+0x0A). I'll fix this somehow, someday.
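In the meantime, one possible workaround is to normalize the text before matching. The helper below is not part of P5RE; the name NormalizeNewlines is made up, and it relies only on the built-in VB6 Replace function and the vbCrLf and vbLf constants.

```vb
Function NormalizeNewlines(ByVal s As String) As String
    ' Turn every CR+LF pair (0x0D+0x0A) into a single LF (0x0A)
    NormalizeNewlines = Replace(s, vbCrLf, vbLf)
End Function
```

Text from a textbox control could then be passed as r.Match NormalizeNewlines(Text1.Text), where Text1 stands for whatever control holds the text.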


Release beta 0.2