agg - aggregate fields and calculate group totals

In report generators you can easily calculate sub-totals. Unix has almost all the tools to do relational algebra (sort, uniq, wc, join, cut, fgrep, ...) but nothing which directly calculates sub-totals.

For instance, suppose you want to know how many times you have visited various sites, using your squid proxy's access.log, which has the following format:

1010927001.993    568 TCP_MISS/200 324 GET - DIRECT/ image/gif
1010927002.050    612 TCP_CLIENT_REFRESH_MISS/200 2409 GET - DIRECT/ image/gif
1010927002.521   1078 TCP_MISS/200 18799 GET - DIRECT/ image/gif
1010927183.773    685 TCP_REFRESH_HIT/200 13456 GET - DIRECT/ text/html
1010927184.359    456 TCP_REFRESH_HIT/200 27351 GET - DIRECT/ image/gif
1010927184.405    448 TCP_REFRESH_HIT/200 5101 GET - DIRECT/ image/gif
1010931268.636   2824 TCP_REFRESH_MISS/200 9683 GET - DIRECT/ text/html
1010931269.170    397 TCP_REFRESH_HIT/200 7386 GET - DIRECT/ image/jpeg

You could process the log with a Perl script or an awk script, dump it into a MySQL database and generate the report there, or do something completely different. All of these need specialized scripts, and all of them are overkill (IMHO).

agg is able to calculate group sub-totals.  You can generate the report with this command-line:

fgrep "/200 " access.log | cut -d/ -f4 | sed -e "s/$/,1/" | sort | agg -d, -g1 -f2

That is: fgrep keeps only the successful requests, cut extracts the domain name, sed appends a field "1" to every line so agg has something to sum, sort orders the lines (agg assumes its input is sorted), and agg then calculates sub-totals of field 2 for every group of field 1 (fields are comma-delimited).
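To make the stages concrete, here is a minimal sketch that runs the same pipeline on a few made-up log lines. The hosts a.example and b.example, the helper sample_log, and the function subtotal are all hypothetical. subtotal is an awk stand-in for "agg -d, -g1 -f2", not agg itself, and the "key,sum" output format is an assumption, so agg's real output may look different.

```shell
#!/bin/sh
# Hypothetical stand-in for "agg -d, -g1 -f2": sum field 2 over each run of
# identical values in field 1. Like agg, it assumes the input is already sorted.
subtotal() {
    awk -F, '
        NR > 1 && $1 != key { print key "," sum; sum = 0 }  # group changed: emit sub-total
        { key = $1; sum += $2 }                             # accumulate the current group
        END { if (NR > 0) print key "," sum }               # emit the last group
    '
}

# Made-up log lines in the shape shown earlier; the URLs are placeholders.
sample_log() {
    cat <<'EOF'
1010927001.993    568 TCP_MISS/200 324 GET http://a.example/x.gif - DIRECT/ image/gif
1010927002.050    612 TCP_MISS/404 0 GET http://a.example/missing - DIRECT/ text/html
1010927002.521   1078 TCP_MISS/200 18799 GET http://b.example/y.gif - DIRECT/ image/gif
1010927183.773    685 TCP_REFRESH_HIT/200 13456 GET http://a.example/ - DIRECT/ text/html
EOF
}

# grep -F behaves like fgrep. The 404 line is dropped, cut keeps the host,
# sed appends ",1", and sort brings equal hosts together before summing.
sample_log | grep -F "/200 " | cut -d/ -f4 | sed -e 's/$/,1/' | sort | subtotal
# prints:
# a.example,2
# b.example,1
```

Because the sum is emitted whenever the key changes, unsorted input would yield several partial sub-totals for the same host, which is why the sort stage comes before the aggregation.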


agg [OPTION]... [FILE]
Use DELIM instead of comma
Define grouping fields. Default is the first field
Sum these fields. Default is the second field
Trim (remove whitespace from fields)
Display help


Source: agg-0.9.tar.gz (5KB)