agg - aggregate fields and calculate group totals

In report generators you can easily calculate sub-totals. Unix has almost all the tools to do relational algebra (sort, uniq, wc, join, cut, fgrep, ...) but nothing which directly calculates sub-totals.

For instance, suppose you want to know how many times you have visited various sites, based on your squid proxy's access.log, which has the following format:

1010927001.993    568 127.0.0.1 TCP_MISS/200 324 GET http://images2.slashdot.org/Slashdot/pc.gif? - DIRECT/64.28.67.114 image/gif
1010927002.050    612 127.0.0.1 TCP_CLIENT_REFRESH_MISS/200 2409 GET http://images.slashdot.org/topics/topicslashdot.gif - DIRECT/64.28.67.57 image/gif
1010927002.521   1078 127.0.0.1 TCP_MISS/200 18799 GET http://images.slashdot.org/banner/tkgk0184en.gif? - DIRECT/64.28.67.57 image/gif
1010927183.773    685 127.0.0.1 TCP_REFRESH_HIT/200 13456 GET http://www.xfree86.org/ - DIRECT/204.152.184.77 text/html
1010927184.359    456 127.0.0.1 TCP_REFRESH_HIT/200 27351 GET http://www.xfree86.org/xfree86.logo.gif - DIRECT/204.152.184.77 image/gif
1010927184.405    448 127.0.0.1 TCP_REFRESH_HIT/200 5101 GET http://www.xfree86.org/news.logo.gif - DIRECT/204.152.184.77 image/gif
1010931268.636   2824 127.0.0.1 TCP_REFRESH_MISS/200 9683 GET http://gcc.gnu.org/ - DIRECT/209.249.29.67 text/html
1010931269.170    397 127.0.0.1 TCP_REFRESH_HIT/200 7386 GET http://gcc.gnu.org/gcc.jpg - DIRECT/209.249.29.67 image/jpeg

You could process the log with a perl script or an awk script, dump it into a MySQL database and generate the report there, or do something completely different. All of these need specialized scripts, and all of them are overkill (IMHO).

agg is able to calculate group sub-totals.  You can generate the report with this command-line:

fgrep "/200 " access.log | cut -d/ -f4 | sed -e "s/$/,1/" | sort | agg -d, -g1 -f2

That is: fgrep keeps only the successful requests (status 200), cut extracts the domain name, sed appends a field "1" to every line so agg has something to sum, sort orders the lines (agg assumes its input is sorted), and agg then calculates sub-totals of field 2 for every group of field 1 (fields are comma-delimited).
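
To see what agg is handed, the stages before it can be run on their own. The sketch below is only an illustration: sample_log and prepare are hypothetical helper names, the here-document stands in for access.log, grep -F is used as the equivalent of fgrep, and LC_ALL=C is set just to make the sort order reproducible.

```shell
# Hypothetical helper: the eight sample lines shown above, in place of access.log.
sample_log() {
    cat <<'EOF'
1010927001.993    568 127.0.0.1 TCP_MISS/200 324 GET http://images2.slashdot.org/Slashdot/pc.gif? - DIRECT/64.28.67.114 image/gif
1010927002.050    612 127.0.0.1 TCP_CLIENT_REFRESH_MISS/200 2409 GET http://images.slashdot.org/topics/topicslashdot.gif - DIRECT/64.28.67.57 image/gif
1010927002.521   1078 127.0.0.1 TCP_MISS/200 18799 GET http://images.slashdot.org/banner/tkgk0184en.gif? - DIRECT/64.28.67.57 image/gif
1010927183.773    685 127.0.0.1 TCP_REFRESH_HIT/200 13456 GET http://www.xfree86.org/ - DIRECT/204.152.184.77 text/html
1010927184.359    456 127.0.0.1 TCP_REFRESH_HIT/200 27351 GET http://www.xfree86.org/xfree86.logo.gif - DIRECT/204.152.184.77 image/gif
1010927184.405    448 127.0.0.1 TCP_REFRESH_HIT/200 5101 GET http://www.xfree86.org/news.logo.gif - DIRECT/204.152.184.77 image/gif
1010931268.636   2824 127.0.0.1 TCP_REFRESH_MISS/200 9683 GET http://gcc.gnu.org/ - DIRECT/209.249.29.67 text/html
1010931269.170    397 127.0.0.1 TCP_REFRESH_HIT/200 7386 GET http://gcc.gnu.org/gcc.jpg - DIRECT/209.249.29.67 image/jpeg
EOF
}

# Hypothetical helper: every stage of the pipeline except the final agg call.
prepare() {
    grep -F "/200 " |      # keep status-200 requests (same as fgrep)
        cut -d/ -f4 |      # 4th "/"-separated field is the host name
        sed -e 's/$/,1/' | # append ",1" so there is a field to sum
        LC_ALL=C sort      # bring equal host names together
}

sample_log | prepare
```

On the sample this prints gcc.gnu.org,1 twice, images.slashdot.org,1 twice, images2.slashdot.org,1 once and www.xfree86.org,1 three times, so the sub-totals of field 2 that agg has to compute are 2, 2, 1 and 3.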

Usage

agg [OPTION]... [FILE]
Options:
-d DELIM
Use DELIM as the field delimiter instead of comma
-g LIST
Define grouping fields. Default is first field
-f LIST
Sum these fields. Default is second field
-t
Trim (remove whitespace from fields)
-h
Display help
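
For readers without agg at hand, the default behaviour described above (comma delimiter, group on field 1, sum field 2) can be approximated as follows. This is only a sketch of the semantics, not agg's implementation: agg_sketch is a hypothetical name, awk is used purely as a stand-in, and the output layout of one "group,total" line per group is an assumption.

```shell
# Hypothetical stand-in for `agg -d, -g1 -f2`; not the real tool.
agg_sketch() {
    awk -F, '
        # Group key changed: emit the finished group and reset the sum.
        NR > 1 && $1 != group { print group "," total; total = 0 }
        # Accumulate field 2 under the current group key.
        { group = $1; total += $2 }
        # Emit the last group, if there was any input at all.
        END { if (NR > 0) print group "," total }
    '
}

# Unsorted input: the key "a" shows up as two separate groups.
printf '%s\n' 'a,1' 'a,1' 'b,1' 'a,1' | agg_sketch

# Sorted input: one sub-total per key.
printf '%s\n' 'a,1' 'a,1' 'b,1' 'a,1' | LC_ALL=C sort | agg_sketch
```

The first call prints a,2, b,1 and a,1; the second prints a,3 and b,1. A streaming aggregator of this kind only compares each line with the previous one, which is presumably why agg expects its input to be sorted.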

Download

Source: agg-0.9.tar.gz (5KB)