Regular Expressions in Linux Explained with Examples

Regular expressions (regexps) are one of the more advanced concepts we need in order to write efficient shell scripts and to administer systems effectively.

  • For easier understanding, regular expressions can be divided into 3 types.
  1. Basic Regular expressions
  2. Interval Regular expressions (Use option -E for grep and -r for sed)
  3. Extended Regular expressions (Use option -E for grep and -r for sed)
  • What is a Regular expression?

A regular expression is a pattern that describes a set of strings; a tool uses it to check whether a given string matches.

  • Which commands/programming languages support regular expressions?

vi, tr, rename, grep, sed, awk, perl, python etc.
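To make this concrete, here is a minimal sketch that uses the same kind of pattern in grep and in sed, and writes an interval once in basic form and once with -E. The sample file and its contents are made up for illustration.

```shell
# Hypothetical sample data, made up for this illustration.
sample=$(mktemp)
printf '%s\n' 'alpha 1' 'beta' 'gamma 22' > "$sample"

# grep: count the lines that contain a digit.
grep_hits=$(grep -c '[0-9]' "$sample")

# sed: replace the first run of digits on each line with N.
sed_out=$(sed 's/[0-9][0-9]*/N/' "$sample")

# Interval: two digits in a row, in basic form and in extended (-E) form.
bre_hits=$(grep -c '[0-9]\{2\}' "$sample")
ere_hits=$(grep -cE '[0-9]{2}' "$sample")

echo "grep_hits=$grep_hits bre_hits=$bre_hits ere_hits=$ere_hits"
echo "$sed_out"
rm -f "$sample"
```

Both interval forms select the same line here; the only difference is whether the braces need a backslash.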

BASIC REGULAR EXPRESSIONS

  • This set contains the most basic regular expressions, which do not require any extra command options to work.
  • These regular expressions were developed a long time ago.

^ – Caret symbol: matches at the beginning of a line.

$ – Matches at the end of a line.

* – Matches 0 or more occurrences of the previous character.

. – Matches any single character.

[] – Matches any one character from a set or range of characters.

[^char] – Negation: matches any one character that is not in the set.

\<word\> – Matches a whole word.

\ – Escape character: removes the special meaning of the character that follows.
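As a small illustration of the anchors in this list, the sketch below counts how many lines match the same text with and without ^ and $. The sample lines are hypothetical.

```shell
# Hypothetical sample data, made up for this illustration.
sample=$(mktemp)
printf '%s\n' 'error: disk full' 'no error here' 'error' > "$sample"

# ^error matches only lines that start with "error".
starts=$(grep -c '^error' "$sample")

# ^error$ matches only lines that consist of exactly "error".
exact=$(grep -c '^error$' "$sample")

# Without anchors, "error" may appear anywhere in the line.
anywhere=$(grep -c 'error' "$sample")

echo "starts=$starts exact=$exact anywhere=$anywhere"
rm -f "$sample"
```

With this data the unanchored pattern selects all three lines, ^error selects two, and ^error$ selects one.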


$ REGULAR EXPRESSION

  • Match all the files whose names end with sh:

        ls -l | grep 'sh$'

    As $ indicates the end of the line, the above command lists all the files whose names end with sh.

    How about finding the lines in a file which end with dead?

        grep 'dead$' filename

How about finding empty lines in a file?

    grep '^$' filename
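The two $ patterns can be tried on a throwaway file. This is only a sketch; the file contents are invented for the example.

```shell
# Hypothetical sample data, made up for this illustration.
sample=$(mktemp)
printf '%s\n' 'the battery is dead' '' 'dead on arrival' '' 'not dead yet' > "$sample"

# dead$ matches only when "dead" is the last thing on the line.
ends_with_dead=$(grep -c 'dead$' "$sample")

# ^$ matches lines with nothing between the start and the end.
empty_lines=$(grep -c '^$' "$sample")

# -v inverts the match, which is a common way to drop empty lines.
non_empty=$(grep -vc '^$' "$sample")

echo "ends_with_dead=$ends_with_dead empty_lines=$empty_lines non_empty=$non_empty"
rm -f "$sample"
```

Only the first line ends with dead; the other two lines that mention dead do not count because more text follows the word.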

* REGULAR EXPRESSION

Example: Match all the files which have twt, twet, tweet etc. in the file name.

    ls -l | grep 'twe*t'

How about searching a given file for the word apple where it has been misspelled as ale, aple, appple, apppple, apppppple etc.? To find all of these patterns:

    grep 'ap*le' filename

Note that the above pattern matches even the word ale, as * means 0 or more occurrences of the previous character.
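The effect of "0 or more" is easy to check on a short word list. The sketch below uses made-up words and also shows one way to require at least one p, by writing the character once before the starred copy.

```shell
# Hypothetical word list, made up for this illustration.
words=$(mktemp)
printf '%s\n' 'ale' 'aple' 'apple' 'appple' 'able' 'angle' > "$words"

# ap*le: "a", then zero or more "p", then "le".
zero_or_more=$(grep -c 'ap*le' "$words")

# app*le: "a", one "p", then zero or more further "p", then "le".
one_or_more=$(grep -c 'app*le' "$words")

echo "zero_or_more=$zero_or_more one_or_more=$one_or_more"
rm -f "$words"
```

Here ap*le selects ale, aple, apple and appple, while app*le drops ale; able and angle match neither pattern.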

[^CHAR] REGULAR EXPRESSION

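Example: List all the file names which do not have a or b or c in them

    ls | grep '^[^abc]*$'

[^abc] matches one character that is not a, b or c. Anchoring it with ^ and $ and repeating it with * requires every character of the name to be outside the set, so the output contains only the file names without a, b or c. An unanchored grep '[^abc]' would instead list every name that contains at least one character other than a, b or c.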
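Because a bracket expression matches a single character, it matters whether [^abc] is anchored. The following sketch compares three commands on a hypothetical list of names.

```shell
# Hypothetical file names, made up for this illustration.
names=$(mktemp)
printf '%s\n' 'abc' 'cab' 'xyz' 'dog' 'bad' > "$names"

# Unanchored: any name with at least one character other than a, b or c.
has_other_char=$(grep -c '[^abc]' "$names")

# Anchored and repeated: every character must be outside the set.
none_of_abc=$(grep -c '^[^abc]*$' "$names")

# Same result by inverting a positive match with -v.
inverted=$(grep -vc '[abc]' "$names")

echo "has_other_char=$has_other_char none_of_abc=$none_of_abc inverted=$inverted"
rm -f "$names"
```

The name bad shows the difference: it contains d, so the unanchored pattern selects it, but it also contains b and a, so the anchored pattern and the -v form leave it out.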

\<WORD\> REGULAR EXPRESSION

Example: Search for the word abc; for example, abcxyz or readabc should not appear in the output

    grep '\<abc\>' filename
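A quick way to see the word-boundary behaviour is to compare it with a plain search. This sketch assumes GNU grep, since \< and \> are an extension rather than part of POSIX basic regular expressions; the -w option is shown as an alternative. The sample lines are invented.

```shell
# Hypothetical sample data, made up for this illustration.
sample=$(mktemp)
printf '%s\n' 'abc' 'abcxyz' 'readabc' 'say abc now' > "$sample"

# Plain pattern: matches abc anywhere, even inside longer words.
plain=$(grep -c 'abc' "$sample")

# \< and \> anchor the match to the start and end of a word.
whole_word=$(grep -c '\<abc\>' "$sample")

# -w asks grep to match whole words only.
with_w=$(grep -cw 'abc' "$sample")

echo "plain=$plain whole_word=$whole_word with_w=$with_w"
rm -f "$sample"
```

The plain pattern selects all four lines, while both whole-word forms select only abc and say abc now.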

ESCAPE REGULAR EXPRESSION

Example: Find the lines in a file which contain [; as [ is a special character, we have to escape it

    grep "\[" filename

or

    grep '[[]' filename
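The ways of searching for a literal [ can be compared side by side. The sketch below uses hypothetical sample lines and adds grep -F, which treats the whole pattern as a fixed string, as a third option.

```shell
# Hypothetical sample data, made up for this illustration.
sample=$(mktemp)
printf '%s\n' 'file[1].txt' 'file1.txt' 'notes[old]' > "$sample"

# Backslash removes the special meaning of [.
escaped=$(grep -c '\[' "$sample")

# Inside a bracket expression, [ is an ordinary character.
bracketed=$(grep -c '[[]' "$sample")

# -F disables regular expressions, so nothing needs escaping.
fixed=$(grep -cF '[' "$sample")

echo "escaped=$escaped bracketed=$bracketed fixed=$fixed"
rm -f "$sample"
```

All three commands select the same two lines; which form to use is mostly a matter of readability.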

[] SQUARE BRACES/BRACKETS REGULAR EXPRESSION

Example: Find all the files which contain a digit between a and x in the file name

    ls -l | grep 'a[0-9]x'

This will find all the files with names such as

    a0xsdf
    asda1xsdfas
    ..
    ..
    asdfdsara9xsdf

etc.
  • So wherever the pattern finds a single digit between a and x, it matches.
  • Some examples of the range operator:
  • [a-z] – Matches any single character from a to z.
  • [A-Z] – Matches any single character from A to Z.
  • [0-9] – Matches any single character from 0 to 9.
  • [a-zA-Z0-9] – Matches any single character from a to z, A to Z or 0 to 9.
  • [!@#$%^] – Matches any one of the characters !, @, #, $, % or ^.
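The sketch below tries a few bracket expressions on made-up file names. LC_ALL=C is set for the [a-z] case as a precaution, on the assumption that ranges may be interpreted differently in other locales; [[:digit:]] is the POSIX character class equivalent of [0-9].

```shell
# Hypothetical file names, made up for this illustration.
names=$(mktemp)
printf '%s\n' 'a0xsdf' 'asda1xsdfas' 'abx' 'a12x' 'report' > "$names"

# Exactly one digit between a and x.
one_digit=$(grep -c 'a[0-9]x' "$names")

# One or more digits between a and x.
many_digits=$(grep -c 'a[0-9][0-9]*x' "$names")

# Names made only of lowercase letters.
letters_only=$(LC_ALL=C grep -c '^[a-z]*$' "$names")

# POSIX character class for digits.
any_digit=$(grep -c '[[:digit:]]' "$names")

echo "one_digit=$one_digit many_digits=$many_digits letters_only=$letters_only any_digit=$any_digit"
rm -f "$names"
```

The name a12x is the interesting case: a[0-9]x skips it because two digits sit between a and x, while a[0-9][0-9]*x accepts it.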

 
