Search multiple substrings with the RegExp object


The RegExp object in the Microsoft VBScript Regular Expressions type library supports patterns that contain the | (or) operator, which lets you search for several substrings in a single pass. For example, the following piece of code searches for month names in a source text:

' NOTE: this code requires a reference to the
'    Microsoft VBScript Regular Expressions type library

Dim re As New RegExp
Dim ma As Match

re.Pattern = "january|february|march|april|may|june|july|august|september|" _
  & "october|november|december"

' case isn't significant
re.IgnoreCase = True
' we want all occurrences
re.Global = True

' we assume that the string to be parsed is in the sourceText variable
' (FirstIndex is zero-based: the first character of the text is at index 0)
For Each ma In re.Execute(sourceText)
  Print "Found '" & ma.Value & "' at index " & ma.FirstIndex
Next

The code above doesn't restrict the search to whole words, though, so it also reports false matches such as the "march" in "marches". To make the Execute method match whole words only, enclose the list of words in parentheses and add the \b sequence on both sides, which requires the match to start and end on a word boundary:

re.Pattern = "\b(january|february|march|april|may|june|july|august|september|" _
  & "october|november|december)\b"
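
To see the effect of \b, the two styles of pattern can be compared on the same text. The following is only a sketch: the CountMatches helper, the DemoWordBoundary routine and the sample sentence are made up for this illustration, and a shortened list of two month names keeps the output easy to check.

```vb
' NOTE: requires a reference to the
'    Microsoft VBScript Regular Expressions type library

' Hypothetical helper: returns how many times Pattern matches in Text
Function CountMatches(ByVal Text As String, ByVal Pattern As String) As Long
  Dim re As New RegExp
  re.Pattern = Pattern
  re.IgnoreCase = True
  re.Global = True
  CountMatches = re.Execute(Text).Count
End Function

Sub DemoWordBoundary()
  Const SAMPLE As String = "The band marches in March and may return in May."
  ' substring search: the "march" inside "marches" is counted too
  Debug.Print CountMatches(SAMPLE, "march|may")        ' prints 4
  ' whole-word search: "marches" is no longer a hit
  Debug.Print CountMatches(SAMPLE, "\b(march|may)\b")  ' prints 3
End Sub
```

Note that the verb "may" is still reported in both cases: a word boundary only checks the characters around the match, not its meaning.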

Thanks to the Join function, it is easy to create a generic function that searches for any word in an array:

' Searches Text for all the words in the array passed as the second argument.
' Returns a two-dimensional array of Variants, where arr(0,n) is the N-th
' matched word, and arr(1,n) is the zero-based index where the word has been
' found; N runs from 1 to the number of matches (column 0 is not used)

' NOTE: requires a reference to the
'    Microsoft VBScript Regular Expressions type library


Function InstrAllWords(ByVal Text As String, words() As String, _
  Optional IgnoreCase As Boolean) As Variant
  Dim re As New RegExp
  Dim ma As Match
  Dim maCol As MatchCollection
  Dim index As Long
  
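  ' create the pattern in the form "\b(word1|word2|....|wordN)\b"
  ' (words are inserted as they are, so any regular expression
  ' metacharacters they contain are interpreted, not matched literally)
  re.Pattern = "\b(" & Join(words, "|") & ")\b"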
  ' we want all occurrences
  re.Global = True
  ' case insensitive?
  re.IgnoreCase = IgnoreCase
  
  ' get the result
  Set maCol = re.Execute(Text)
  
  ' now we can dimension the result array; column 0 is left unused,
  ' so the matches occupy columns 1 to maCol.Count
  Dim res() As Variant
  ReDim res(1, maCol.Count)
  
  ' move results into the array
  For Each ma In maCol
    index = index + 1
    res(0, index) = ma.Value
    res(1, index) = ma.FirstIndex
  Next
  
  ' return to caller
  InstrAllWords = res
End Function
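
Because the pattern is built by simply joining the words, a word that contains a regular expression metacharacter (for example the dot in "VB.NET") changes the meaning of the pattern. One possible way to guard against this, shown here only as a sketch, is to escape each word before joining. EscapeRegExp is a hypothetical helper written for this illustration; it is not part of the RegExp library.

```vb
' Hypothetical helper: returns Text with a backslash in front of every
' character that has a special meaning in a regular expression pattern
Function EscapeRegExp(ByVal Text As String) As String
  Const META As String = "\^$.|?*+()[]{}"
  Dim i As Long
  Dim ch As String
  Dim result As String

  For i = 1 To Len(Text)
    ch = Mid$(Text, i, 1)
    ' prefix metacharacters with a backslash, copy anything else as is
    If InStr(META, ch) > 0 Then result = result & "\"
    result = result & ch
  Next

  EscapeRegExp = result
End Function

Sub DemoEscape()
  Debug.Print EscapeRegExp("VB.NET")   ' prints VB\.NET
  Debug.Print EscapeRegExp("Basic")    ' prints Basic
End Sub
```

Keep in mind that \b only matches between a word character and a non-word character, so a word that begins or ends with punctuation may still not be found by the whole-word pattern even after escaping.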

Here is an example of how you can use the function above:

' fill an array with desired words
Dim words(2) As String
words(0) = "Visual": words(1) = "Basic": words(2) = "Windows"

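Dim arr() As Variant
Dim i As Long
' IgnoreCase is omitted here, so the search is case sensitive
arr = InstrAllWords(txtSource.Text, words())
' matches are stored starting at column 1
For i = 1 To UBound(arr, 2)
  Print "'" & arr(0, i) & "' at index " & arr(1, i)
Next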
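
The index reported for each match comes from the FirstIndex property, which counts from zero, whereas VB string functions such as Mid$ count from one. The short sketch below shows the conversion; the MatchList function and the sample string are made up for this illustration.

```vb
' NOTE: requires a reference to the
'    Microsoft VBScript Regular Expressions type library

' Hypothetical helper: returns each match as "index:word", separated by
' semicolons, re-reading every word from Text with Mid$
Function MatchList(ByVal Text As String, ByVal Pattern As String) As String
  Dim re As New RegExp
  Dim ma As Match
  Dim result As String

  re.Pattern = Pattern
  re.Global = True

  For Each ma In re.Execute(Text)
    If Len(result) > 0 Then result = result & ";"
    ' FirstIndex is zero-based, Mid$ is one-based: add 1 to convert
    result = result & ma.FirstIndex & ":" _
      & Mid$(Text, ma.FirstIndex + 1, ma.Length)
  Next

  MatchList = result
End Function

Sub DemoFirstIndex()
  ' prints 7:Basic;16:Windows
  Debug.Print MatchList("Visual Basic on Windows", "\b(Basic|Windows)\b")
End Sub
```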


Submitted By : Nayan Patel  (Member Since : 5/26/2004 12:23:06 PM)

Job Description : He is the moderator of this site and currently works as an independent consultant. He works with VB.net/ASP.net, SQL Server and other MS technologies. He is MCSD.net, MCDBA and MCSE. In his free time he likes to watch funny movies and do oil painting.



© 2008 BinaryWorld LLC. All rights reserved.