logrep is a handy command-line tool for sophisticated, ad-hoc analysis of webserver log files.

Using logrep

logrep [--mode MODE] [--include | --exclude CLASSES] [-H | -R]
[--output FIELDS] [--filter FILTERS] [--last LAST_N]
[--sort LIM:FIELDS:DIRECTION] [--config CFG_FILE] [--quiet]
[LOG_FILE]
-m MODE There are three modes:
--mode - "grep" parses an entire log file (default).
- "tail" reads from the end of the file.
- "top" shows running performance stats.
-i, -e CLASSES Include or exclude the given URL "classes". You can
--include configure logrep to classify URLs by a set of
--exclude regular expressions. See the installation docs and
/etc/wtop.cfg for how to configure your own classes.
--include and --exclude are mutually exclusive.
Examples:
--include "home,search,wiki"
--exclude "img,xml,js"
-f FILTERS -f filters act on named fields.
--filter There is support for strings & numbers, greater
than (>), less than (<), equals (=), not-equals
(!=), and regular expression match (~ and !~).
For example: Filter successful requests that were
over 10kB in size that do not have 'example.com'
in the Referer field:
-f "status=200,bytes>10000,refdom!~example.com"
AVAILABLE FIELDS:
msec millisecond response time
ip The IP address of the client
url The path of the request, eg '/home'
ref 'Referer' header
refdom domain part of the 'Referer' header
bytes Bytes sent
ua User-agent header
uas First 30 characters of ua
class URL class, configurable in wtop.cfg
status HTTP status code, eg 200, 301, 404
proto Protocol version, eg 'HTTP/1.1'
method HTTP method, eg 'GET', 'POST'
bot Is a robot? 1 or 0. Only a guess.
botname eg 'Googlebot', 'Nutch', 'Slurp', etc
ts Unix timestamp of the request
year
month
day
hour
minute
country country name (see Geocoding, below)
cc ISO 3166 country code (see below)
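The filter syntax described above can be illustrated with a short Python sketch. This is not logrep's own parser: the names `parse_filters` and `matches` are hypothetical, the clauses are assumed to be combined with a logical AND, and the sketch assumes no clause value contains a comma.

```python
import re

# One clause is FIELD, then an operator from the help text, then a value.
_CLAUSE = re.compile(r"^(\w+)(!=|!~|=|~|>|<)(.*)$")


def parse_filters(spec):
    """Split a comma-delimited filter string into (field, op, value) tuples."""
    clauses = []
    for part in spec.split(","):
        m = _CLAUSE.match(part.strip())
        if m is None:
            raise ValueError("cannot parse filter clause: %r" % part)
        clauses.append(m.groups())
    return clauses


def matches(record, clauses):
    """Return True if the record (a dict of field values) passes every clause."""
    for field, op, value in clauses:
        actual = record[field]
        if op in ("~", "!~"):
            # Regular expression match anywhere in the field.
            found = re.search(value, str(actual)) is not None
            ok = found if op == "~" else not found
        elif op == ">":
            ok = float(actual) > float(value)
        elif op == "<":
            ok = float(actual) < float(value)
        else:
            # Assumption: = and != compare the string forms of the values.
            same = str(actual) == value
            ok = same if op == "=" else not same
        if not ok:
            return False
    return True


clauses = parse_filters("status=200,bytes>10000,refdom!~example.com")
record = {"status": 200, "bytes": 20480, "refdom": "other.org"}
print(matches(record, clauses))
```

The record in the usage lines is made up for illustration; a request with the same status and size but a Referer domain of www.example.com would be rejected by the last clause.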
-H, -R Shorthand for a useful but incomplete filter of
robot user-agents. Equivalent to --filter 'bot=0'
or --filter 'bot=1', respectively.
-o FIELDS Output only the given fields, tab-delimited. All
--output of the fields listed for --filter are available.
Example:
$ logrep -o 'cc,msec,url'
UK 34 /Madonna.jpg
CA 34 /Padma-Lakshmi.jpg
UK 34 /Shaun-Woo.jpg
US 184 /Ben-Stiller.jpg
...
AGGREGATE FUNCTIONS:
In -m grep mode you can use aggregate functions
on numeric fields such as bytes and msec. Any
non-aggregate fields in the list will be used to
group records together.
count(*)
avg(FIELD) mean average
min(FIELD) lowest seen value
max(FIELD) highest seen value
sum(FIELD) summation of all values
var(FIELD) population variance
dev(FIELD) standard deviation (square root of variance)
Example (grouped by status):
$ logrep -o 'status,count(*),avg(msec)'
200 4196 242.58
302 5 79.75
404 1 9.00
304 798 15.76
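To make the aggregate definitions concrete, here is a minimal Python sketch of grouping records by one field and computing the listed functions over another. The function name `aggregate` and the sample rows are hypothetical, and this is not how logrep itself is implemented; it only shows the arithmetic, with var as the population variance and dev as its square root.

```python
import math
from collections import defaultdict


def aggregate(rows):
    """Group (key, value) pairs by key and compute the aggregate functions."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)

    result = {}
    for key, values in groups.items():
        n = len(values)
        total = sum(values)
        mean = total / n
        # Population variance: divide by n, not n - 1.
        var = sum((v - mean) ** 2 for v in values) / n
        result[key] = {
            "count": n,
            "avg": mean,
            "min": min(values),
            "max": max(values),
            "sum": total,
            "var": var,
            "dev": math.sqrt(var),
        }
    return result


# Made-up (status, msec) pairs, grouped by status.
stats = aggregate([(200, 100), (200, 300), (404, 9)])
for status, s in stats.items():
    print(status, s["count"], "%.2f" % s["avg"])
```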
-s LIM:FIELDS:DIRECTION
--sort Use this option to sort & limit aggregate records.
LIM is the number of records to return, FIELDS
is a comma-delimited list of column positions
starting with 1, and DIRECTION is either
'descending' (default) or 'ascending'.
Example (total bytes sent, by hour & minute)
$ logrep -o 'hour,minute,sum(bytes)' -s '3600:1,2:a'
12 0 1895927
12 1 7418972
12 2 2103828
12 3 7419371
12 4 1680468
...
Example (the 10 most popular URLs):
$ logrep -o 'url,count(*)' -s '10:2'
/home 23718
/wiki 8211
/about 2703
...
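The LIM:FIELDS:DIRECTION format can be sketched in Python as follows. The helper names `parse_sort` and `sort_and_limit` are hypothetical and the rows are made up. Treating any direction that begins with "a" as ascending is an assumption, based only on the ':a' abbreviation used in the example above.

```python
def parse_sort(spec):
    """Parse 'LIM:FIELDS:DIRECTION' into (limit, zero-based columns, descending)."""
    limit, fields, *rest = spec.split(":")
    # Column positions start with 1 on the command line.
    columns = [int(f) - 1 for f in fields.split(",")]
    direction = rest[0] if rest else "descending"
    # Assumption: 'a' or 'ascending' means ascending; anything else is descending.
    descending = not direction.startswith("a")
    return int(limit), columns, descending


def sort_and_limit(rows, spec):
    """Sort rows by the requested columns and keep the first LIM of them."""
    limit, columns, descending = parse_sort(spec)
    ordered = sorted(
        rows,
        key=lambda row: tuple(row[c] for c in columns),
        reverse=descending,
    )
    return ordered[:limit]


rows = [("/home", 23718), ("/about", 2703), ("/wiki", 8211)]
for url, hits in sort_and_limit(rows, "2:2"):
    print(url, hits)
```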
-l LAST_N (grep mode) Only read the last N log lines.
--last
-c CFG_FILE Feed logrep a custom config file. By default it
--config will use:
/etc/wtop.cfg (Linux, BSD, OSX, etc)
Python sys.prefix (Windows)
-q, --quiet Quiet mode. Does not print warnings to stderr.
LOG_FILE The path to a log file. By default logrep will
read from the file path specified in wtop.cfg.
If you specify '-', logrep will read from STDIN.
GEOCODING:
logrep will use the MaxMind GeoIP library if it is installed. This
will enable two extra fields for filtering and output: country
(eg "United Kingdom"), and cc (ISO 3166 country code, eg "UK"). These
are a *guess* at the country the HTTP client is from.
KNOWN BUG:
Some installations of Apache have HostnameLookups defaulted to On.
This means that the %h field will contain the fully-qualified domain
name of the client (xdsl456.foo.example.com) instead of the IP
address (123.1.2.3). Geocoding will work but will require a DNS
lookup to resolve the IP address. Using the 'cc' or 'country'
field in this case will generate a *LOT* of DNS traffic and can
hang the program. It is recommended to explicitly set
HostnameLookups Off in your Apache configuration.
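One way to check whether a log is affected before using the 'cc' or 'country' fields is to test whether the client column holds IP addresses or hostnames. The Python sketch below is a rough illustration only: the function names are hypothetical, and it assumes the client host (%h) is the first whitespace-delimited field on each line, which may not match your log format.

```python
import ipaddress


def is_ip_literal(host):
    """Return True if host is an IPv4 or IPv6 address rather than a hostname."""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def hostname_lines(lines):
    """Count lines whose first field is a hostname (assumed to be %h)."""
    count = 0
    for line in lines:
        fields = line.split()
        if fields and not is_ip_literal(fields[0]):
            count += 1
    return count


sample = [
    '123.1.2.3 - - [01/Sep/2008:12:00:00 +0000] "GET /home HTTP/1.1" 200 512',
    'xdsl456.foo.example.com - - [01/Sep/2008:12:00:01 +0000] "GET /wiki HTTP/1.1" 200 2048',
]
print(hostname_lines(sample))
```

A non-zero count suggests that hostname lookups were enabled when the log was written.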
EXAMPLES:
"wtop" for all human traffic:
$ logrep -m top -f 'bot=0' access.log
Status code & response times for all Googlebot homepage hits:
$ logrep -f 'botname=Googlebot' -i home -o status,msec
Tail for pages about Angelina Jolie or Brad Pitt sent from example.com:
$ logrep -m tail -f 'url~jolie|pitt,ref~example.com' access.log
Get maximum response size and average response time for requests
grouped by URL class:
$ logrep -o 'class,max(bytes),avg(msec)' access.log
0.6.3 1 Sep 2008 carlos@bueno.org http://code.google.com/p/wtop