Regular Expressions in Linux Explained with Examples
Regular expressions (regexps) are one of the more advanced concepts needed to write efficient shell scripts and to administer systems effectively.
- For easier understanding, regular expressions are divided into 3 types.
- Basic Regular expressions
- Interval Regular expressions (Use option -E for grep and -r for sed)
- Extended Regular expressions (Use option -E for grep and -r for sed)
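As a rough illustration of why the -E option matters, the sketch below runs the same interval pattern with and without it. The sample words and variable names are made up for this example, and the behaviour described assumes GNU grep and GNU sed.

```shell
# Made-up sample words, one per line.
words=$(printf '%s\n' ale aple apple appple)

# Without -E the braces are ordinary characters, so none of the words match.
# "|| true" keeps the script going when grep finds nothing.
basic=$(printf '%s\n' "$words" | grep 'ap{2}le' || true)

# With -E, p{2} means "exactly two p characters in a row".
extended=$(printf '%s\n' "$words" | grep -E 'ap{2}le')

# GNU sed accepts -E as well as the -r spelling mentioned above.
replaced=$(printf 'apple\n' | sed -E 's/p{2}/PP/')

printf 'basic: [%s]\nextended: [%s]\nsed: [%s]\n' "$basic" "$extended" "$replaced"
```

Only the -E runs treat the braces as an interval; the plain grep call looks for a literal {2} in the text.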
- What is a Regular expression?
A regular expression is a pattern that describes a set of strings; tools use it to find matching text in a given string.
- Which commands/programming languages support regular expressions?
vi, tr (character sets and ranges only), rename, grep, sed, awk, Perl, Python, etc.
BASIC REGULAR EXPRESSIONS
- This set contains the most basic regular expressions, which do not require any options to use.
- These regular expressions were developed a long time ago.
^ – Caret (power symbol): matches at the beginning of a line.
$ – Matches at the end of a line.
* – Matches 0 or more occurrences of the previous character.
. – Matches any single character.
[] – Matches any one character from a set or range.
[^char] – Negates a character set: matches any one character that is not listed.
\<word\> – Matches a whole word.
\ – Escape character.
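The ^ and . operators from the list above can be tried out quickly on a few lines of text. This is a minimal sketch; the sample lines and variable names are invented for the illustration.

```shell
# Made-up sample lines for illustration.
lines=$(printf '%s\n' 'cat sat' 'the cat' 'cot' 'ct')

# ^ anchors the match to the start of the line.
starts_with_cat=$(printf '%s\n' "$lines" | grep '^cat')

# . stands for exactly one character, so c.t matches cat and cot but not ct.
one_char_between=$(printf '%s\n' "$lines" | grep 'c.t')

printf 'Lines starting with cat:\n%s\n' "$starts_with_cat"
printf 'Lines matching c.t:\n%s\n' "$one_char_between"
```

Here only "cat sat" starts with cat, while c.t matches every line except "ct", which has nothing between the c and the t.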
$ REGULAR EXPRESSION
- Match all the files whose names end with sh:
    ls -l | grep 'sh$'
As $ indicates the end of the line, the above command lists all the files whose names end with sh.
How about finding lines in a file that end with dead?
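One way to do this is to put $ right after the word. The sketch below assumes a small throwaway file whose contents are made up for the example.

```shell
# Create a throwaway sample file (contents are made up for this sketch).
sample=$(mktemp)
printf '%s\n' 'the battery is dead' 'dead end street' 'deadline' 'it is dead' > "$sample"

# dead$ matches only when "dead" is the last thing on the line.
ends_with_dead=$(grep 'dead$' "$sample")
printf '%s\n' "$ends_with_dead"

# Clean up the temporary file.
rm -f "$sample"
```

Lines where dead appears at the start or in the middle, such as "dead end street" and "deadline", are skipped.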
How about finding empty lines in a file?
    grep '^$' filename
* REGULAR EXPRESSION
Example: Match all files that have twt, twet, tweet, etc. in the file name.
    ls -l | grep 'twe*t'
How about searching a given file for the word apple where it has been misspelled as ale, aple, appple, apppple, apppppple, etc.? To find all these patterns:
    grep 'ap*le' filename
Note that the above pattern matches even the word ale, as * indicates 0 or more occurrences of the previous character.
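The effect of * can be checked on a short word list. This is only a sketch: the words are made up, and the second pattern (app*le) is one possible way to require at least one p if ale should be left out.

```shell
# Made-up word list, one word per line.
words=$(printf '%s\n' ale aple apple appple angle able)

# p* allows zero or more p characters, so ale is matched too.
zero_or_more=$(printf '%s\n' "$words" | grep 'ap*le')

# pp* means one p followed by zero or more p characters, so ale is skipped.
one_or_more=$(printf '%s\n' "$words" | grep 'app*le')

printf 'ap*le matches:\n%s\n' "$zero_or_more"
printf 'app*le matches:\n%s\n' "$one_or_more"
```

Neither pattern matches angle or able, because the a in those words is not followed by only p characters and then le.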
[^CHAR] REGULAR EXPRESSION
Example: List all the file names that do not contain a, b, or c
    ls | grep '^[^abc]*$'
This outputs only the file names that contain none of a, b, or c. Note that grep '[^abc]' without the anchors matches any name containing at least one character other than a, b, or c, so a name such as abc.txt would still be listed.
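The difference between a bare negated set and a whole-line match is easy to miss, so here is a small sketch comparing them. The file names are invented, and grep -v is shown only as an alternative way to get the same result.

```shell
# Made-up file names, one per line.
names=$(printf '%s\n' abc.txt cab notes.md xyz bbb)

# [^abc] needs just ONE character outside the set anywhere in the name.
any_other_char=$(printf '%s\n' "$names" | grep '[^abc]')

# ^[^abc]*$ requires EVERY character in the name to be outside the set.
none_of_abc=$(printf '%s\n' "$names" | grep '^[^abc]*$')

# Alternative: keep the lines that do NOT contain a, b or c.
inverted=$(printf '%s\n' "$names" | grep -v '[abc]')

printf 'Bare [^abc]:\n%s\n' "$any_other_char"
printf 'Anchored:\n%s\n' "$none_of_abc"
```

The bare pattern still lists abc.txt because of the dot, t and x, while the anchored pattern and the -v form both leave only notes.md and xyz.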
\<WORD\> REGULAR EXPRESSION
Example: Search for the whole word abc; for example, abcxyz or readabc should not appear in the output
    grep '\<abc\>' filename
ESCAPE REGULAR EXPRESSION
Example: Find lines in a file that contain [; as [ is a special character, we have to escape it
    grep "\[" filename
or
    grep '[[]' filename
[] SQUARE BRACES/BRACKETS REGULAR EXPRESSION
Example: Find all the files that have a single digit between a and x in the file name
    ls -l | grep 'a[0-9]x'
This will find files such as
    a0xsdf
    asda1xsdfas
    ..
    ..
    asdfdsara9xsdf
etc.
- So wherever it finds a digit between a and x, it matches that name.
- Some examples of the range operator:
- [a-z] – Matches any single character from a to z.
- [A-Z] – Matches any single character from A to Z.
- [0-9] – Matches any single character from 0 to 9.
- [a-zA-Z0-9] – Matches any single character from a to z, A to Z, or 0 to 9.
- [!@#$%^] – Matches any one of the characters !, @, #, $, %, or ^.
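The bracket expressions above can be tested on a handful of names. The following is a minimal sketch with made-up names; it shows that a[0-9]x needs exactly one digit between a and x, and that the listed symbols are matched literally inside brackets.

```shell
# Made-up names, one per line.
names=$(printf '%s\n' a0xsdf asda1xsdfas ax a12x abx 'price$')

# Exactly one digit must sit between a and x, so ax, a12x and abx are skipped.
digit_between=$(printf '%s\n' "$names" | grep 'a[0-9]x')

# Inside brackets, $ is an ordinary character, and so is ^ when it is not first.
has_symbol=$(printf '%s\n' "$names" | grep '[!@#$%^]')

printf 'a[0-9]x matches:\n%s\n' "$digit_between"
printf '[!@#$%%^] matches:\n%s\n' "$has_symbol"
```

Single quotes around the patterns keep the shell from interpreting characters such as ! and $ before grep sees them.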