Regular Expressions

Advanced search and replace using regular expressions

Basic find and replace operations are the simplest form of regular expressions, where every character in the pattern matches itself. Regular expressions are also referred to as regex or regexp. Their advanced operations allow more control over matching patterns.
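
As a rough illustration of the difference, the sketch below compares a literal find and replace with a regular expression replace. It uses Python's built-in re module, which is an assumption made for this example (the tutorial itself targets text editors), and the sample text is made up.

```python
import re

text = "color colour colr"

# Basic find and replace: only the exact literal string "color" is replaced.
literal = text.replace("color", "X")

# Regular expression: "colou?r" matches "color" and "colour"
# because the "?" makes the preceding "u" optional.
pattern = re.sub(r"colou?r", "X", text)

print(literal)  # X colour colr
print(pattern)  # X X colr
```

The literal replace misses "colour", while the single pattern covers both spellings.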

A pattern can be set to match every instance throughout the entire document, only the first instance, or part of a word.
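
A minimal sketch of these three kinds of matching, assuming Python's re module and a made-up sample text. In Python the choice between the first instance and every instance is made by the function called (re.search or re.findall) rather than by a modifier.

```python
import re

text = "cat concat cats category"

# First instance only: re.search stops at the first match.
first = re.search(r"cat", text)

# Entire document: re.findall returns every match, including
# "cat" found as part of a longer word.
everywhere = re.findall(r"cat", text)

# Whole word only: \b marks a word boundary on both sides.
whole_words = re.findall(r"\bcat\b", text)

# Part of a word: words that start with "cat".
starts_with = re.findall(r"\bcat\w*", text)

print(first.start())   # 0
print(len(everywhere)) # 4
print(whole_words)     # ['cat']
print(starts_with)     # ['cat', 'cats', 'category']
```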

This tutorial uses Perl-derivative regular expressions. They can be used on any text file in any application that supports regular expressions. Many programming languages support similar regular expressions.

    Tools required:

  • Text editor with support for regular expressions.
  • Word Processor with support for regular expressions.
  • Optional Perl-like language such as Awk, Perl or PHP.

Optional Download and install suitable text editor

Any text editor with regular expression support will work. To select a reviewed lightweight programming editor, read the Ojambo.com Lightweight Programming Editors.

Optional Download and install word processor

Regular Expressions Cheat Sheet

Anchors              Description
^                    Beginning position
$                    Ending position
\b                   Word boundary

Character Classes    Description
\s                   White space (for not white space use “\S”)
\d                   Digit (for not digit use “\D”)
\w                   Word character (for not word character use “\W”)

Quantifiers          Description
*                    Zero (0) or more of preceding
+                    One (1) or more of preceding
?                    Zero (0) or one (1) of preceding
{min,max}            Min/max repetitions of preceding eg {4,10}

Logic                Description
.                    Any character eg .{6,9} for password
[]                   Set of characters eg [abc]
[^]                  Not eg [^abc]
|                    Or eg gr(a|e)y matches gray or grey
[-]                  Range between eg [a-c]
()                   Group eg ([a-z]{4,10})

Special Characters   Description
\                    Escape the following character eg \. for a literal dot
\r                   Carriage return
#                    Comment (with the x modifier) eg # This is a comment

Pattern Modifiers    Description
g                    Global match eg /every/g
i                    Ignore case eg /every/i
m                    Multiline eg /every/m
x                    Extended eg /every/gxm
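
The cheat sheet entries can be tried out in a short script. The sketch below assumes Python's re module and a made-up sample text. Python has no /pattern/flags notation, so the modifiers are mapped to their closest equivalents: re.IGNORECASE for i, re.MULTILINE for m, re.VERBOSE for x, and re.findall or re.finditer in place of g. Quantifier ranges are written with braces, as in {4,10}.

```python
import re

text = "Grey cat\ngray dog\nGRAY fox 42"

# Anchor ^ with the "m" modifier: the first word of every line.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# Logic | with the "i" modifier: gray or grey in any letter case.
greys = [m.group(0) for m in re.finditer(r"gr(a|e)y", text, flags=re.IGNORECASE)]

# Character class \d with quantifier +: runs of one or more digits.
numbers = re.findall(r"\d+", text)

# Range [a-z] with quantifier {4,10} between word boundaries \b:
# whole words made of 4 to 10 lowercase letters.
lowercase_words = re.findall(r"\b[a-z]{4,10}\b", text)

# The "x" modifier (re.VERBOSE in Python) ignores whitespace inside
# the pattern and allows # comments.
trailing_number = re.search(
    r"""
    \d+   # one or more digits
    $     # ending position
    """,
    text,
    flags=re.VERBOSE,
)

print(line_starts)               # ['Grey', 'gray', 'GRAY']
print(greys)                     # ['Grey', 'gray', 'GRAY']
print(numbers)                   # ['42']
print(lowercase_words)           # ['gray']
print(trailing_number.group(0))  # 42
```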

Regular Expression Examples

Regular Expression                                    Description

Numbers
[0-9]+                                                First occurrence of a number

Multiple line matching
/^.*NEED.*$/gxm                                       Matching all lines with “NEED”
/^.*NEED.*$\r/gxm                                     Matching lines with “NEED” together with the trailing carriage return

Email
/^[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.([a-zA-Z]{2,4})/   Email@domain.com

Grab tags
(<img).+?>                                            Grab img tag
(<img).+?('|")>                                       Grab img tag that ends with a quoted attribute

Strip tags
/(<iframe|<object).+?(<\/iframe>|<\/object>)/gi       Strip out object and iframe tags
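
The examples can be checked with a short script. This is a sketch that assumes Python's re module and made-up sample strings. The patterns are adapted for the script: quantifier ranges use braces ({2,4}), the hyphen is placed last inside the character class so it is read literally, and the closing tags are written out in full. The email pattern is a simplified check and does not validate every possible address.

```python
import re

# Numbers: re.search returns the first occurrence only.
first_number = re.search(r"[0-9]+", "Order 66 shipped in 3 days").group(0)

# Multiple line matching: every line that contains "NEED".
notes = "We NEED milk\nAll good\nNEED eggs too"
need_lines = re.findall(r"^.*NEED.*$", notes, flags=re.MULTILINE)

# Email: group 1 captures the part after the last dot.
email = re.compile(r"^[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.([a-zA-Z]{2,4})")
valid = email.match("Email@domain.com")
invalid = email.match("not-an-email")

html = '<p><img src="a.png" alt="A"> text <iframe src="x"></iframe></p>'

# Grab tags: the lazy quantifier +? stops at the first ">".
img_tag = re.search(r"(<img).+?>", html).group(0)

# Strip tags: replace each iframe or object element with nothing.
stripped = re.sub(
    r"(<iframe|<object).+?(</iframe>|</object>)",
    "",
    html,
    flags=re.IGNORECASE,
)

print(first_number)    # 66
print(need_lines)      # ['We NEED milk', 'NEED eggs too']
print(valid.group(1))  # com
print(invalid)         # None
print(img_tag)         # <img src="a.png" alt="A">
print(stripped)        # <p><img src="a.png" alt="A"> text </p>
```

Because "." does not match a newline by default, the tag patterns in this sketch only handle tags written on a single line.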

Text Editor Regular Expression

Geany – Regular Expressions

Conclusion:

The regular expressions cheat sheet can be used in future projects. Keep the expressions simple and use them as needed. The Perl-like syntax is widely supported and can be applied in text editors, word processors and programming languages.

    Recommendations:

  1. This tutorial uses an optional lightweight programming editor with regular expression support.
  2. The regular expression concepts can be applied to other programming languages.
  3. Make your own cheat-sheet.