Monday, 18 August 2008

awk variables

The awk programming language starts to get really useful when you start building some logic into it. Although the examples given here are simple one-line commands, they contain some of the building blocks with which you can start to build complex awk programs.

Take a look at this code:

awk -v x=0 'NF != 6 { ++x } END { print x, NR }' file.txt

Firstly we're presented with a new flag, '-v'. This tells awk that the next parameter on the command line is a variable assignment that we want to pass into the awk program. In this case we're setting x to zero before any input is read.
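To see what setting x up front buys us, here is a small sketch. It assumes a made-up file called sample.txt in which every line has six fields, so the counter is never incremented. Without '-v x=0' the variable is still unset when it is printed and comes out as an empty string; with it, we get a proper 0.

```shell
# sample.txt is a made-up file name; every line in it has six fields.
printf '%s\n' 'a b c d e f' 'g h i j k l' > sample.txt

# x is never touched, so it is still unset at END and prints as an empty string.
without_v=$(awk 'NF != 6 { ++x } END { print x, NR }' sample.txt)

# With -v, x already holds 0 before the first line is read.
with_v=$(awk -v x=0 'NF != 6 { ++x } END { print x, NR }' sample.txt)

# Brackets make the missing value visible.
printf '[%s]\n' "$without_v" "$with_v"
```

The first result is just a space followed by the line count, the second starts with a 0, which is much easier for another script to read.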

The next part of the awk command says that whenever we find a line that doesn't have six fields (NF is the number of fields on the current line), increment x by one (++x).
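If you want to check which lines the NF != 6 test is actually catching, a quick sketch like this one can help. It prints the field count and the text of each offending line instead of counting them. The file name sample.txt and its contents are invented for the illustration.

```shell
# sample.txt is a made-up file: lines two and four do not have six fields.
printf '%s\n' 'a b c d e f' 'a b c' 'a b c d e f' 'a b c d e f g' > sample.txt

# For every line that fails the six-field test, show its field count and the line itself.
out=$(awk 'NF != 6 { print NF ": " $0 }' sample.txt)
printf '%s\n' "$out"
```

Only the three-field and seven-field lines should be listed; the two six-field lines are skipped.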

Then we see END, which we've not come across before. It splits the work awk does into two parts. The rule before END (NF != 6 { ++x }) is tested against every line in the input file. The block after END runs just once, after the last line has been read. So the END block in this command prints the value of x once every line has been checked against our test (NF != 6), and then prints NR. NR is also new to us; it merely means Number-of-Records, that is, the number of records (lines) read so far. Inside END that is the number of the last line, and so the total number of lines in the file.
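Putting it all together, here is the full command run against a small test file. The file name sample.txt and its four lines are made up for the example; two of the lines have six fields and two don't.

```shell
# sample.txt is a made-up file: four lines, two of which do not have six fields.
printf '%s\n' 'a b c d e f' 'a b c' 'a b c d e f' 'a b c d e f g' > sample.txt

# Count the lines without six fields, then report that count and the total line count.
out=$(awk -v x=0 'NF != 6 { ++x } END { print x, NR }' sample.txt)
printf '%s\n' "$out"
```

The first number is the count of lines that failed the test and the second is the total number of lines read.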

So, if you're in need of an awk command that counts the lines in a file that don't have a specific number of fields, and also tells you how many lines (records) there are in that file, then this is the command for you! ;)
