Wednesday 3 September 2008

awk: using the length() function

OK, take the following code snippet as an example:

$ echo this.is.test-for-awk | awk -F'.' '{ printf substr($3,6,length($3)-9)"\nLength of $3: "length($3)"\n" }'

Output:

for
Length of $3: 12

All the above awk command is doing is accepting the input "this.is.test-for-awk" from an echo command and splitting it into its component parts as described within the body of the awk command.

As you can see, the input has two different types of field separator ('.' and '-'), which can be quite common. So I thought I'd start with the '.' separator, and I've defined this by using the -F'.' flag.
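As an aside, here is a rough sketch of another way to cope with both separators at once. It assumes an awk that treats a multi-character -F value as a regular expression (POSIX awk does); the bracket expression '[.-]' is my own addition and not part of the command above. With both characters acting as separators, 'for' simply becomes field 4:

```shell
# Split on either '.' or '-' so the fields are: this, is, test, for, awk
word=$(echo this.is.test-for-awk | awk -F'[.-]' '{ print $4 }')
echo "$word"
```

The rest of this post sticks with the single '.' separator, because that is what lets me show substr() and length() at work.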

I then wanted to break down what has now been defined as $3 (as the separator is '.'). I did this using the substr() and length() functions that are built into awk. Strictly, I didn't need to use the length function, as I could have just put in the number '3', so this component of the command would've looked like this:

substr($3,6,3)

The above substr() command translates to: take 3 characters from $3 (field 3), starting at character number 6. But I wanted to show how the length function could be used, so this component of the command looked like this:

substr($3,6,length($3)-9)

Now, what the above is doing is taking a piece of $3 that starts at the 6th character, and working out how many characters to take by getting the length of the field and taking away 9. As you can see, the length of $3 is 12 characters, including the two hyphens. So by taking away 9 we're left with 3. The result is that we have the output 'for'.
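To make the arithmetic concrete, here is a small sketch that runs both forms side by side. The variable names and the third input (this.is.test-format-awk) are made up purely for illustration:

```shell
# Fixed length: 3 characters starting at character 6
fixed=$(echo this.is.test-for-awk | awk -F'.' '{ print substr($3, 6, 3) }')

# Computed length: 12 - 9 = 3. The 9 is the 5 leading characters ("test-")
# plus the 4 trailing characters ("-awk") that are being left out.
computed=$(echo this.is.test-for-awk | awk -F'.' '{ print substr($3, 6, length($3) - 9) }')

# Hypothetical input with a longer middle word: 15 - 9 = 6 characters
longer=$(echo this.is.test-format-awk | awk -F'.' '{ print substr($3, 6, length($3) - 9) }')

echo "$fixed $computed $longer"
```

The first two give 'for', while the third gives 'format'. That is the practical advantage of the length() version: it keeps working when the middle part changes size, as long as the text either side of it stays the same length.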

The last part of the command proves that the length function is reading the input text correctly by showing us the length of field 3.

"\nLength of $3: "length($3)"\n"

Because printf, unlike print, doesn't add a newline of its own, I've included \n wherever I want a line break, which keeps the output readable.
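One thing to be aware of: in the command above, the text taken from the input ends up as part of printf's format string, so a '%' in the input would be treated as a format specifier. Here is a minimal sketch of a variation (my own, not the original command) that keeps the format string fixed and passes the values as arguments instead:

```shell
out=$(echo this.is.test-for-awk | awk -F'.' '{
    # %s and %d are filled in from the arguments after the format string
    printf "%s\nLength of $3: %d\n", substr($3, 6, length($3) - 9), length($3)
}')
echo "$out"
```

This prints the same two lines as the original command. The "$3" inside the double-quoted awk string is just literal text, because awk doesn't expand fields inside strings.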