The awk command is a powerful tool for processing or analyzing text files, in particular data files that are organized by lines (rows) and columns.

Simple awk commands can be run from the command line. More complex tasks should be written as awk programs (so-called awk scripts) to a file.
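As a minimal sketch of the second approach, the snippet below stores a one-line awk program in a file and runs it with the -f option. The file name second_field.awk and the sample input are hypothetical choices made for illustration.

```shell
# Write a small awk program to a file (hypothetical file name).
cat > second_field.awk <<'EOF'
# Print the second field of every input line.
{ print $2 }
EOF

# Run the stored program with -f on two sample lines.
out=$(printf 'a b c\nd e f\n' | awk -f second_field.awk)
echo "$out"
```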

The basic format of an awk command looks like this:

awk 'pattern {action}' input-file > output-file

This means: take each line of the input file; if the line contains the pattern, apply the action to the line and write the resulting line to the output file.
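The general form can be tried out with a small sketch like the one below. The file names sample.txt and matched.txt are hypothetical, and the pattern /Jagger/ is simply a string that must occur somewhere in the line.

```shell
# Create a two-line input file (hypothetical name).
printf '%s\n' \
  '2, Taylor Swift, Title 723, Price $7.90' \
  '3, Mick Jagger, Title 610, Price $7.90' > sample.txt

# Pattern: the line contains "Jagger". Action: print the second field.
awk '/Jagger/ { print $2 }' sample.txt > matched.txt

out=$(cat matched.txt)
echo "$out"
```

Only the second input line contains the pattern, so only that line produces output.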

If the pattern is omitted, the action is applied to all lines. For example:

awk '{ print $5 }' table1.txt > output1.txt

This statement takes the element in the fifth column of each line and writes it as a line in the output file “output1.txt”. The variable ‘$5’ refers to the fifth column. Similarly, you can access the first, second, and third columns with $1, $2, $3, and so on. By default, columns are assumed to be separated by spaces or tabs (so-called white space). So, if the input file “table1.txt” contains these lines:

1, Justin Timberlake, Title 545, Price $7.30
2, Taylor Swift, Title 723, Price $7.90
3, Mick Jagger, Title 610, Price $7.90
4, Lady Gaga, Title 118, Price $7.30
5, Johnny Cash, Title 482, Price $6.50
6, Elvis Presley, Title 335, Price $7.30
7, John Lennon, Title 271, Price $7.90
8, Michael Jackson, Title 373, Price $5.50

Then the command would write the following lines to the output file “output1.txt”:

545,
723,
610,
118,
482,
335,
271,
373,
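Several columns can be printed in one action. The sketch below, which assumes the same white-space separated layout, lists two fields separated by a comma inside print; awk then writes them with a single space in between.

```shell
# Print the second and fifth white-space separated fields of each line.
out=$(printf '%s\n' \
  '1, Justin Timberlake, Title 545, Price $7.30' \
  '2, Taylor Swift, Title 723, Price $7.90' \
  | awk '{ print $2, $5 }')
echo "$out"
```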

If the column separator is something other than spaces or tabs, such as a comma, you can specify that in the awk statement as follows:

awk -F, '{ print $3 }' table1.txt > output1.txt

This selects the element from column three of each line when the columns are considered to be separated by a comma. Note that awk splits exactly at the comma, so each selected element keeps the space that follows the comma in the input.

Therefore the output, in this case, would be:

Title 545
Title 723
Title 610
Title 118
Title 482
Title 335
Title 271
Title 373
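One way to see how the separator affects the result is to compare a plain comma with a comma followed by a space. This is only a sketch: the variable names are hypothetical, and the brackets in the output are there to make any leading space visible.

```shell
line='1, Justin Timberlake, Title 545, Price $7.30'

# Separator is a single comma: the field keeps its leading space.
out_comma=$(printf '%s\n' "$line" | awk -F, '{ print $3 }')

# Separator is a comma followed by a space: the field has no leading space.
out_comma_space=$(printf '%s\n' "$line" | awk -F', ' '{ print $3 }')

printf '[%s]\n[%s]\n' "$out_comma" "$out_comma_space"
```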

The list of statements inside the curly brackets (‘{’, ‘}’) is called a block. If you put a conditional expression in front of a block, the statements inside the block are executed only if the condition is true.


awk '$7=="$7.30" { print $3 }' table1.txt

In this case, the condition is $7=="$7.30", which means that the element at column 7 is equal to the string “$7.30”. Because the whole program is wrapped in single quotes, the shell leaves the dollar signs alone, and inside a double-quoted awk string a dollar sign is taken literally, so no backslash is needed in front of it. Some awk implementations accept "\$" as a plain dollar sign, but that escape is not portable.

So this awk statement prints the element in the third column of every line that has “$7.30” in column 7.
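A condition can be combined with a custom separator. The following sketch assumes the columns are split at a comma followed by a space, so that the artist name becomes a single field that can be compared against a string.

```shell
# With ', ' as separator, $2 is the full artist name and $3 the title.
out=$(printf '%s\n' \
  '3, Mick Jagger, Title 610, Price $7.90' \
  '4, Lady Gaga, Title 118, Price $7.30' \
  | awk -F', ' '$2 == "Lady Gaga" { print $3 }')
echo "$out"
```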

You can also use regular expressions as the condition. For example:

awk '/30/ { print $3 }' table1.txt

The string between the two slashes (‘/’) is the regular expression. In this case, it is just the string “30”. This means that if a line contains the string “30”, the system prints the element in the third column of that line. The output in the above example would be:

Timberlake,
Gaga,
Presley,
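A pattern such as /30/ is tested against the whole line, so it would also match a title number containing “30”. As a sketch of a more targeted test, the match operator ~ restricts the regular expression to a single field; here it selects lines whose fifth field starts with a 7.

```shell
# Match the regular expression against the fifth field only.
out=$(printf '%s\n' \
  '1, Justin Timberlake, Title 545, Price $7.30' \
  '2, Taylor Swift, Title 723, Price $7.90' \
  '3, Mick Jagger, Title 610, Price $7.90' \
  | awk '$5 ~ /^7/ { print $3 }')
echo "$out"
```

Every line contains a 7 somewhere in the price, but only the second line has a fifth field that begins with 7.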

If the table elements are numbers, awk can run calculations on them, as in this example:

awk '{ print ($2 * $3) + $7 }'
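To try the calculation, the sketch below feeds the command two made-up rows of seven numeric fields each; the numbers are purely illustrative.

```shell
# For each row, multiply fields 2 and 3 and add field 7.
out=$(printf '%s\n' \
  '1 2 3 4 5 6 7' \
  '2 4 6 8 10 12 14' \
  | awk '{ print ($2 * $3) + $7 }')
echo "$out"
```

The first row gives 2 * 3 + 7 and the second gives 4 * 6 + 14.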

Besides the variables that access elements of the current row ($1, $2, etc.), there is the variable $0, which refers to the complete row (line), and the variable NF, which holds the number of fields.
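A short sketch with made-up input shows both variables at once: each output line starts with the number of fields, followed by the unchanged input line.

```shell
# Print the field count and the complete line for each row.
out=$(printf 'a b c\nd e\n' | awk '{ print NF ": " $0 }')
echo "$out"
```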

You can also define new variables, as in this example:

awk '{ sum=0; for (col=1; col<=NF; col++) sum += $col; print sum; }'

This computes and prints the sum of all the elements of each row.
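Run on a small made-up table, the same statement behaves as sketched below. The rows may have different lengths, because NF is re-evaluated for every line.

```shell
# Sum all fields of each row; sum is reset at the start of every line.
out=$(printf '1 2 3\n10 20\n' \
  | awk '{ sum=0; for (col=1; col<=NF; col++) sum += $col; print sum; }')
echo "$out"
```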

Awk statements are often combined with sed commands.
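One possible combination, shown here only as a sketch, uses sed to delete all commas before awk selects a column. With the sample table layout used earlier, this yields the title number without a trailing comma.

```shell
# sed removes every comma; awk then prints the fifth white-space field.
out=$(printf '%s\n' \
  '1, Justin Timberlake, Title 545, Price $7.30' \
  '2, Taylor Swift, Title 723, Price $7.90' \
  | sed 's/,//g' | awk '{ print $5 }')
echo "$out"
```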
