Advanced search and replace using regular expressions
Basic find and replace operations match literal text, which is the simplest form of a regular expression. Regular expressions are also referred to as regex or regexp. Advanced regular expression operations allow finer control over what a pattern matches.
A pattern can be set to match every occurrence in the document, only the first instance, or part of a word.
This tutorial uses Perl-derivative regular expressions. The regular expressions can be used on any text file in any application that supports regular expressions. Many programming languages support regular expressions that are similar.
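The three matching scopes described above can be sketched in Python, whose `re` module follows a Perl-like syntax. This is only an illustration: the sample text and variable names are made up for the example, and other tools expose the same choices through different options.

```python
import re

text = "cat concat cat"  # hypothetical sample text

# Replace every occurrence, including the one inside "concat".
all_hits = re.sub(r"cat", "dog", text)

# Replace only the first occurrence by limiting the count.
first_only = re.sub(r"cat", "dog", text, count=1)

# Replace whole words only, using word boundaries (\b).
whole_words = re.sub(r"\bcat\b", "dog", text)

print(all_hits)     # dog condog dog
print(first_only)   # dog concat cat
print(whole_words)  # dog concat dog
```

In a text editor or word processor the same choices usually appear as "Replace All", "Replace" and a "whole word" checkbox.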
- Tools are required:
- Text editor with support for regular expressions.
- Word Processor with support for regular expressions.
- Optional Perl-like language such as Awk, Perl or PHP.
Optional: Download and install a suitable text editor
Any text editor with regular expression support will work. To select a reviewed lightweight programming editor, read the Ojambo.com Lightweight Programming Editors.
Optional: Download and install a word processor
- Word processors with regular expression support:
- AbiWord.
- LibreOffice Writer.
- OpenOffice Writer.
- KOffice KWord.
- Siag Office XedPlus.
- LyX.
- Google Documents.
Regular Expressions Cheat Sheet
Anchors | Description |
---|---|
^ | Beginning position |
$ | Ending position |
\b | Word boundary |
Character Classes | Description |
\s | Whitespace (for non-whitespace use “\S”) |
\d | Digit (for non-digit use “\D”) |
\w | Word character (for non-word character use “\W”) |
Quantifiers | Description |
* | Zero (0) or more of preceding |
+ | One (1) or more of preceding |
? | Zero (0) or one (1) of preceding |
{min,max} | Between min and max of preceding, e.g. {4,10} |
Logic | Description |
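. | Any character, e.g. .{6,9} for a password of 6 to 9 characters |
[] | Character set, e.g. [abc] matches a, b or c |
[^] | Not, e.g. [^abc] matches any character except a, b or c |
| | Or, e.g. gr(a|e)y matches gray or grey |
[-] | Between, e.g. [a-c] |
() | Grouping, e.g. ([a-z]{4,10}) |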
Special Characters | Description |
\ | Escape the following character, e.g. \. matches a literal dot |
\r | Carriage return |
# | Comment (extended mode only), e.g. # This is a comment |
Pattern Modifiers | Description |
g | Global match eg /every/g |
i | Ignore case eg /every/i |
m | Multiline eg /every/m |
x | Extended (whitespace and # comments in the pattern are ignored), e.g. /every/gxm |
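
A few cheat sheet entries can be tried out in a short Python sketch. Python is used here only as one Perl-like option; the sample text and variable names are invented for the example. Python has no `g` modifier because `re.findall` and `re.sub` are global by default, and the `i`, `m` and `x` modifiers are passed as flags.

```python
import re

# Hypothetical sample text for the demonstration.
sample = "Order 66 shipped to Gray St, then grey skies.\nEvery order counts."

# \d+ : one or more digits.
digits = re.findall(r"\d+", sample)

# gr(a|e)y with the ignore-case flag (the "i" modifier).
# (?:...) groups without capturing so findall returns the whole match.
colours = re.findall(r"gr(?:a|e)y", sample, flags=re.IGNORECASE)

# ^ with the multiline flag (the "m" modifier) anchors at each line start.
line_starts = re.findall(r"^\w+", sample, flags=re.MULTILINE)

# {min,max} quantifier: whole words of 6 to 9 word characters.
mid_words = re.findall(r"\b\w{6,9}\b", sample)

# Extended mode (the "x" modifier) allows whitespace and # comments.
password = re.compile(r"""
    ^ .{6,9} $   # any 6 to 9 characters, whole string
""", flags=re.VERBOSE)

print(digits)       # ['66']
print(colours)      # ['Gray', 'grey']
print(line_starts)  # ['Order', 'Every']
print(mid_words)    # ['shipped', 'counts']
print(bool(password.match("secret1")), bool(password.match("abc")))  # True False
```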
Regular Expression Examples
Regular Expression | Description |
---|---|
Numbers | |
[0-9]+ | One or more digits; without the g modifier only the first number is matched |
Multiple line matching | |
/^.*NEED.*$/gxm | Matching all lines with “NEED” |
/^.*NEED.*$\r/gxm | Matching multiple lines with “NEED” |
Email | |
/^[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.([a-zA-Z]{2,4})/ | Email@domain.com |
Grab tags | |
(<img).+?> | Grab img tag |
(<img).+?('|")> | Grab img tag ending in a quoted attribute |
Strip tags | |
/(<iframe|<object).+?(<\/iframe>|<\/object>)/gi | Strip out object and iframe tags |
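
The examples in the table can be run as a rough Python sketch. The sample strings and variable names are made up. The email pattern is anchored with `$` here, and the tag-stripping pattern is a variant that uses a backreference rather than the exact expression from the table, so treat both as assumptions.

```python
import re

# Hypothetical sample inputs.
notes = "TODO later\nWe NEED milk\nnothing here\nNEED eggs too"
html = '<p>Hi</p><img src="a.png"><iframe src="x"></iframe><p>Bye</p>'

# Lines containing "NEED"; multiline mode makes ^ and $ work per line.
need_lines = re.findall(r"^.*NEED.*$", notes, flags=re.MULTILINE)

# Simplified email check; the hyphen sits last in each class so it is literal.
email_re = re.compile(r"^[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.([a-zA-Z]{2,4})$")

# Grab img tags with a lazy quantifier (+?) so the match stops at the first >.
img_tags = re.findall(r"<img.+?>", html)

# Strip iframe and object elements, ignoring case.
# \1 repeats whichever tag name the first group matched.
stripped = re.sub(r"<(iframe|object)\b.*?</\1>", "", html,
                  flags=re.IGNORECASE | re.DOTALL)

print(need_lines)  # ['We NEED milk', 'NEED eggs too']
print(email_re.match("Email@domain.com").group(1))  # com
print(img_tags)    # ['<img src="a.png">']
print(stripped)    # <p>Hi</p><img src="a.png"><p>Bye</p>
```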
Text Editor Regular Expression
Conclusion:
The regular expressions cheat sheet can be used in future projects. Keep the expressions simple and use as needed. The Perl-like syntax is the most common and can be applied in text editors, word processors and programming languages.
- Recommendations:
- This tutorial uses an optional lightweight programming editor with regular expression support.
- The regular expression concepts can be applied to other programming languages.
- Make your own cheat-sheet.