Apache Log Grepping

by damonp on September 14, 2009

in Snippets

Nothing beats a full stats package, but with a little grepping and a log file we can figure out some quick statistics.

Get only hits coming from google.com/search (head -1 limits the output to the first match while testing):

grep 'google.com/search' access_log | head -1
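For a quick count rather than a sample line, grep can do the tallying itself. The sketch below is an assumption on my part rather than something from the original article: the function name count_google_hits is made up for illustration, -F treats the pattern as a literal string (so the dot matches only a dot), and -c prints the number of matching lines.

```shell
#!/bin/sh
# count_google_hits [LOGFILE...]: print how many lines contain the literal
# string google.com/search. With no file argument it reads standard input.
# (Hypothetical helper name, used here for illustration only.)
count_google_hits() {
  grep -cF 'google.com/search' "$@"
}

# Two made-up lines: only the first contains the literal string. Without -F
# the second would match too, because "." in a regex matches any character.
printf '%s\n' 'a http://www.google.com/search?q=x' 'b http://googleXcom/search' |
  count_google_hits
```

Against a real log it would be called as count_google_hits access_log.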

Extract only the referrer field:

grep 'google.com/search' access_log | head -1 | awk '{print $11}'
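Why field 11? Assuming the log is in Apache's "combined" format, splitting a line on whitespace puts the quoted referrer in the eleventh position. Here is a minimal sketch using a made-up log line; the function names are hypothetical. The second function shows an alternative that splits on double quotes instead, which may hold up better if the request line contains extra spaces.

```shell
#!/bin/sh
# Made-up sample line in combined log format (not taken from a real log).
sample='127.0.0.1 - - [14/Sep/2009:10:00:00 -0500] "GET /index.html HTTP/1.0" 200 2326 "http://www.google.com/search?q=apache+log&hl=en" "Mozilla/5.0 (X11)"'

# referrer_field: 11th whitespace-separated field, surrounding quotes included.
referrer_field() {
  awk '{print $11}'
}

# referrer_by_quote: 4th field when splitting on double quotes, so the
# surrounding quotes are already stripped.
referrer_by_quote() {
  awk -F'"' '{print $4}'
}

printf '%s\n' "$sample" | referrer_field
printf '%s\n' "$sample" | referrer_by_quote
```

The first call prints the referrer with its quotes, the second prints it bare.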

Get the search terms only:

grep 'google.com/search' access_log | \
  head -1 | \
  awk '{print $11}' | \
  cut -d\? -f2 | cut -d\& -f1 | \
  sed 's/+/ /g;s/%22/"/g;s/q=//'
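To see what each stage of that pipeline does, here is the cut and sed portion run on a single made-up referrer. The function name clean_terms is hypothetical. The first cut keeps everything after the ?, the second keeps the first parameter, and sed turns + into spaces, %22 into a double quote, and drops the q= prefix.

```shell
#!/bin/sh
# clean_terms: the cut/sed stages from the command above, reading referrer
# fields on standard input. (Hypothetical wrapper name.)
clean_terms() {
  cut -d\? -f2 | cut -d\& -f1 | sed 's/+/ /g;s/%22/"/g;s/q=//'
}

# Made-up referrer for a quoted phrase search.
printf '%s\n' '"http://www.google.com/search?q=%22apache+log%22&hl=en"' | clean_terms
# prints: "apache log"
```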

Here’s a little script that puts it all together. It finds the unique search terms, cleans out the trash, sorts them, and lists the top twenty-five in descending order of hit count.

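#!/bin/sh
# Path to the Apache access log; adjust for your server.
LOG="/var/logs/httpd/access_log"
# Take the referrer field of Google search hits, keep the first query
# parameter, decode it, then count and rank the unique terms.
grep 'google.com/search' "$LOG" | \
  awk '{print $11}' | \
  cut -d\? -f2 | cut -d\& -f1 | \
  sed 's/+/ /g;s/%22/"/g;s/q=//' | \
  sed 's/%[0-9a-fA-F][0-9a-fA-F]/ /g;s/"//g' | \
  grep -v '=' | sort | uniq -c | sort -rn | head -25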
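A quick way to try the script without touching a production log is to feed it a tiny hand-made file. The sketch below wraps the same pipeline in a function (top_terms is a hypothetical name) and runs it against four invented log lines in combined format; the IP addresses, paths, and search terms are all made up.

```shell
#!/bin/sh
# top_terms LOGFILE: the pipeline from the script above, wrapped in a function.
top_terms() {
  grep 'google.com/search' "$1" | \
    awk '{print $11}' | \
    cut -d\? -f2 | cut -d\& -f1 | \
    sed 's/+/ /g;s/%22/"/g;s/q=//' | \
    sed 's/%[0-9a-fA-F][0-9a-fA-F]/ /g;s/"//g' | \
    grep -v '=' | sort | uniq -c | sort -rn | head -25
}

# Invented sample log: two hits for one search, one for another, and one
# request with no referrer.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
10.0.0.1 - - [14/Sep/2009:10:00:00 -0500] "GET / HTTP/1.1" 200 512 "http://www.google.com/search?q=apache+log&hl=en" "Mozilla/5.0"
10.0.0.2 - - [14/Sep/2009:10:05:00 -0500] "GET / HTTP/1.1" 200 512 "http://www.google.com/search?q=apache+log&hl=en" "Mozilla/5.0"
10.0.0.3 - - [14/Sep/2009:10:09:00 -0500] "GET /about HTTP/1.1" 200 734 "http://www.google.com/search?q=grep+tutorial&hl=en" "Mozilla/5.0"
10.0.0.4 - - [14/Sep/2009:10:12:00 -0500] "GET /about HTTP/1.1" 200 734 "-" "Mozilla/5.0"
EOF

top_terms "$tmp"
rm -f "$tmp"
```

The output should list "apache log" with a count of 2 ahead of "grep tutorial" with a count of 1; the exact column padding of the counts depends on your uniq implementation.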

To see them all, remove the | head -25 at the end of the command.
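One limitation worth knowing about: cut -d\& -f1 keeps only the first query parameter, so a referrer such as search?hl=en&q=... is thrown away by the grep -v '=' step. A possible variant, not from the original article, is to pull out the q parameter wherever it sits using a single sed expression. The function name extract_q is hypothetical, and the output is still URL-encoded, so the later sed decoding steps would still be needed.

```shell
#!/bin/sh
# extract_q: read referrer URLs on standard input and print the raw value of
# the q parameter, wherever it appears in the query string. Lines with no
# q parameter produce no output (-n suppresses them, p prints matches).
extract_q() {
  sed -n 's/.*[?&]q=\([^&"]*\).*/\1/p'
}

# Made-up referrer with q in second position.
printf '%s\n' '"http://www.google.com/search?hl=en&q=grep+tutorial"' | extract_q
# prints: grep+tutorial
```

In the script, this would stand in for the two cut commands; whether the extra coverage matters depends on how your visitors' referrers are laid out.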

Thanks to an article at Linux Journal for the examples.


Damon Parker is a freelance sysadmin and web developer in Texas. He specializes in server setup, server security and high performance server configurations. Need help setting up a web server or getting a server back online after a crash or hack? Email Damon


Achim November 4, 2009 at 11:09 am

this works very nice! I’m a shell scripting noob, it helped me much.
