OK, so I had a file containing different types of events, and I wanted to count how many occurrences there were of each event type. The file was comma separated, and the events were in $1 (the first field) and looked like this:
TRAF:5
TRAF:8
TRAF:3
Here's the awk command that got the result I was after:
awk -F, '{ te[$1]++ } END { for ( i in te ) print i" : " te[i] }' traf.test
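A quick way to try it out (the sample file contents below are made up for illustration; the original traf.test isn't shown in full):

```shell
# Create a small comma-separated test file; field 1 holds the event type
cat > traf.test <<'EOF'
TRAF:5,10:01
TRAF:8,10:02
TRAF:5,10:03
CONN:2,10:04
TRAF:5,10:05
EOF

# Count occurrences of each unique value in field 1
awk -F, '{ te[$1]++ } END { for ( i in te ) print i" : " te[i] }' traf.test
```

With that input you'd see TRAF:5 counted three times and TRAF:8 and CONN:2 once each (the order of the output lines isn't guaranteed).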
Here's what it's doing:
The field separator flag '-F,' we're already familiar with. This tells awk that the file is comma separated.
Now, in the next bit we're introducing awk arrays for the first time: "te[$1]++"
We're creating an array called 'te'; you can call it anything you like. We're creating an index into the array based on the contents of $1 (the first field), which is our traffic event type. The '++' operator says that the value of te[$1] is to be incremented. So an array element is created for every unique value found in $1, which means we've captured all the different possibilities of $1 without doing much work at all. When awk finds another line with that same index value ($1), it increments that element's count.
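You can watch the increment happen line by line with a sketch like this (the input here is invented; the print inside the main block is just for demonstration):

```shell
# Print the running count for the key each time it is seen:
# te["TRAF:5"] starts at 0 (awk's default for an unset element)
# and ++ bumps it on every matching line.
printf 'TRAF:5,a\nTRAF:5,b\nTRAF:5,c\n' |
  awk -F, '{ te[$1]++; print $1, te[$1] }'
```

This prints "TRAF:5 1", "TRAF:5 2", "TRAF:5 3", showing the count climbing with each repeated key.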
Once we've gone through the whole file we get to the END section of the code. Here we're seeing a for loop for the first time:
for (i in te) print i" : " te[i]
This loop iterates through the array and prints each index (i) followed by the value of that element. We end up with a printout of each unique value found in $1 and a count of the occurrences of that value.
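One thing worth knowing: 'for (i in te)' visits the array indexes in no particular order, so if you want tidy output you can pipe the whole thing through sort (again with made-up sample input):

```shell
# Sort the summary lines alphabetically by event type
printf 'TRAF:5,x\nTRAF:8,y\nTRAF:5,z\n' |
  awk -F, '{ te[$1]++ } END { for ( i in te ) print i" : " te[i] }' |
  sort
```

Adding 'sort -t: -k3 -nr' instead would order the summary by count, highest first.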