Updated Aug 29, 2008 by aristus
UsingLogrep  

logrep is a handy command-line tool for sophisticated, ad-hoc analysis of webserver log files.

Using logrep

logrep  [--mode MODE] [--include | --exclude CLASSES] [-H | -R]
        [--output FIELDS] [--filter FILTERS] [--last LAST_N] 
        [--sort LIM:FIELDS:DIRECTION] [--config CFG_FILE] [--quiet] 
        [LOG_FILE]


   -m MODE           There are three modes:
   --mode              - "grep" parses an entire log file (default).
                       - "tail" reads from the end of the file.
                       - "top"  shows running performance stats.

   -i, -e CLASSES    Include or exclude the given URL "classes". You can
   --include         configure logrep to classify URLs by a set of 
   --exclude         regular expressions. See the installation docs and
                     /etc/wtop.cfg for how to configure your own classes.
                     --include and --exclude are mutually exclusive. 

                     Examples: 
                        --include "home,search,wiki"
                        --exclude "img,xml,js"
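To make the idea of URL classes concrete, here is a minimal Python sketch of regex-based classification with an include or exclude list. It is not logrep's own code: the CLASSES patterns, the first-match-wins rule, and the choice to keep unclassified URLs under --exclude are all assumptions made for illustration, and the real definitions live in wtop.cfg.

```python
import re

# Hypothetical class definitions, for illustration only. The real ones are
# read from wtop.cfg, whose syntax is not reproduced here.
CLASSES = [
    ("img", re.compile(r"\.(png|jpg|gif)$")),
    ("home", re.compile(r"^/(home)?$")),
    ("wiki", re.compile(r"^/wiki")),
]


def classify(url):
    """Return the name of the first class whose pattern matches, else None."""
    for name, pattern in CLASSES:
        if pattern.search(url):
            return name
    return None


def keep(url, include=None, exclude=None):
    """Decide whether a URL passes an include list or an exclude list."""
    if include is not None and exclude is not None:
        raise ValueError("include and exclude are mutually exclusive")
    cls = classify(url)
    if include is not None:
        return cls in include
    if exclude is not None:
        # Assumption: URLs that match no class are kept when excluding.
        return cls not in exclude
    return True
```

With these made-up patterns, keep("/home", include={"home", "search"}) is true and keep("/Madonna.jpg", exclude={"img"}) is false.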

   -f FILTERS        Filters act on named fields. Strings and numbers are
   --filter          supported, with these operators: greater than (>),
                     less than (<), equals (=), not-equals (!=), regular
                     expression match (~) and non-match (!~).

                     For example, to select successful requests over
                     10 kB in size whose Referer domain does not match
                     'example.com':

                        -f "status=200,bytes>10000,refdom!~example.com" 

                     AVAILABLE FIELDS:
                        msec       millisecond response time
                        ip         The IP address of the client
                        url        The path of the request, eg '/home'
                        ref        'Referer' header
                        refdom     domain part of the 'Referer' header
                        bytes      Bytes sent 
                        ua         User-agent header
                        uas        First 30 characters of ua
                        class      URL class, configurable in wtop.cfg
                        status     HTTP status code, eg 200, 301, 404
                        proto      Protocol version, eg 'HTTP/1.1'
                        method     HTTP method, eg 'GET', 'POST'
                        bot        Is a robot? 1 or 0. Only a guess.
                        botname    eg 'Googlebot', 'Nutch', 'Slurp', etc
                        ts         Unix timestamp of the request
                        year
                        month
                        day
                        hour
                        minute
                        country    country name (see Geocoding, below)
                        cc         ISO 3166 country code (see below)
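
The filter syntax can be pictured with a short Python sketch that splits a FILTERS string into (field, operator, value) terms and tests a record against them. This is an illustration under assumptions, not logrep's implementation: the function names are hypothetical, all terms are assumed to be ANDed together, and a regular expression containing a comma would not survive the naive split.

```python
import operator
import re

# Two-character operators come first so "!=" and "!~" are not read as "=" or "~".
_TERM_RE = re.compile(r"^(\w+)(!=|!~|=|~|>|<)(.*)$")

_COMPARE = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def _coerce(value):
    """Return value as a float when it looks numeric, else unchanged."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return value


def parse_filters(spec):
    """Split 'a=1,b>2' into a list of (field, operator, value) tuples."""
    terms = []
    for term in spec.split(","):
        match = _TERM_RE.match(term.strip())
        if match is None:
            raise ValueError("bad filter term: %r" % term)
        terms.append(match.groups())
    return terms


def matches(record, filters):
    """True if the record (a dict of fields) satisfies every filter term."""
    for field, op, value in filters:
        actual = record[field]
        if op in ("~", "!~"):
            found = re.search(value, str(actual)) is not None
            ok = found if op == "~" else not found
        else:
            left, right = _coerce(actual), _coerce(value)
            if type(left) is not type(right):
                # Mixed number/string: fall back to comparing as strings.
                left, right = str(actual), str(value)
            ok = _COMPARE[op](left, right)
        if not ok:
            return False
    return True
```

Parsing the example above, "status=200,bytes>10000,refdom!~example.com", gives three terms, and a record only matches when all three hold.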


   -H, -R            Shorthand for a useful but incomplete filter of 
                     robot user-agents. -H is equivalent to
                     --filter 'bot=0' and -R to --filter 'bot=1'.


   -o FIELDS         Output only the given fields, tab-delimited. All
   --output          of the fields listed for --filter are available.

                     Example:
                     $ logrep -o 'cc,msec,url'
                        UK      34      /Madonna.jpg
                        CA      34      /Padma-Lakshmi.jpg
                        UK      34      /Shaun-Woo.jpg
                        US      184     /Ben-Stiller.jpg
                        ...

                     AGGREGATE FUNCTIONS:
                     In -m grep mode you can use aggregate functions
                     on numeric fields such as bytes and msec. Any
                     non-aggregate fields in the list will be used to
                     group records together.
                        count(*)     
                        avg(FIELD)   mean average
                        min(FIELD)   lowest seen value
                        max(FIELD)   highest seen value
                        sum(FIELD)   summation of all values
                        var(FIELD)   population variance
                        dev(FIELD)   standard deviation (square root of variance)

                     Example (grouped by status):
                     $ logrep -o 'status,count(*),avg(msec)'
                        200 4196    242.58
                        302 5       79.75
                        404 1       9.00
                        304 798     15.76
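
As a rough picture of what grouping plus aggregation means, the Python sketch below groups records by one field and computes each of the aggregates listed above over another. It is a sketch only, not logrep's code: the aggregate() helper and its dict-based records are hypothetical, and var and dev are taken to be the population variance and its square root, as the list describes.

```python
from collections import defaultdict
from statistics import mean, pstdev, pvariance


def aggregate(records, group_field, value_field):
    """Group records by group_field and summarise value_field per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_field]].append(record[value_field])

    rows = []
    for key, values in groups.items():
        rows.append({
            group_field: key,
            "count": len(values),
            "avg": mean(values),
            "min": min(values),
            "max": max(values),
            "sum": sum(values),
            "var": pvariance(values),  # population variance
            "dev": pstdev(values),     # square root of the variance
        })
    return rows
```

Called as aggregate(records, "status", "msec"), it returns one row per status code, which is the shape of the grouped example above.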

   -s LIM:FIELDS:DIRECTION
   --sort            Use this option to sort & limit aggregate records. 
                     LIM is the number of records to return, FIELDS 
                     is a comma-delimited list of column positions 
                     starting with 1, and DIRECTION is either 
                     'descending' (default) or 'ascending' (written
                     as 'a' in the example below).

                     Example (total bytes sent, by hour & minute):
                     $ logrep -o 'hour,minute,sum(bytes)' -s '3600:1,2:a'
                        12	0	1895927
                        12	1	7418972
                        12	2	2103828
                        12	3	7419371
                        12	4	1680468
                        ...

                     Example (the 10 most popular URLs):
                     $ logrep -o 'url,count(*)' -s '10:2'
                        /home    23718
                        /wiki    8211
                        /about   2703
                        ...
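
The LIM:FIELDS:DIRECTION format can be sketched in Python as below. This is an illustration rather than logrep's parser: the function names are hypothetical, and treating any prefix of 'ascending' (such as the 'a' used above) as ascending is an assumption based only on that example.

```python
def parse_sort(spec):
    """Split 'LIM:FIELDS:DIRECTION' into (limit, columns, descending)."""
    parts = spec.split(":")
    limit = int(parts[0])
    columns = [int(col) for col in parts[1].split(",")]
    direction = parts[2] if len(parts) > 2 and parts[2] else "descending"
    # Assumption: any prefix of 'ascending' selects ascending order;
    # everything else falls back to the default, descending.
    descending = not "ascending".startswith(direction.lower())
    return limit, columns, descending


def sort_rows(rows, spec):
    """Sort rows (tuples) by the 1-based columns in spec, then truncate."""
    limit, columns, descending = parse_sort(spec)

    def key(row):
        # Column positions start at 1, Python indexes start at 0.
        return tuple(row[col - 1] for col in columns)

    return sorted(rows, key=key, reverse=descending)[:limit]
```

Under these assumptions, '10:2' means "at most 10 rows, ordered by the second column, largest first", which matches the most-popular-URLs example.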

   -l LAST_N         (grep mode) Only read the last N log lines.
   --last


   -c CFG_FILE       Feed logrep a custom config file. By default it 
   --config          will use:
                        /etc/wtop.cfg         (Linux, BSD, OSX, etc)
                        Python sys.prefix     (Windows)


   -q, --quiet       Quiet mode. Does not print warnings to stderr.

   LOG_FILE          The path to a log file. By default logrep will
                     read from the file path specified in wtop.cfg.
                     If you specify '-', logrep will read from STDIN.

 GEOCODING:
    logrep will use the MaxMind GeoIP library if it is installed. This
    will enable two extra fields for filtering and output: country 
    (eg "United Kingdom"), and cc (two-letter ISO 3166 country code). These
    are a *guess* at the country the HTTP client is from.
   
 KNOWN BUG: 
    Some installations of Apache have HostnameLookups defaulted to On.
    This means that the %h field will contain the fully-qualified domain
    name of the client (xdsl456.foo.example.com) instead of the IP
    address (123.1.2.3). Geocoding will work but will require a DNS
    lookup to resolve the IP address. Using the 'cc' or 'country'
    field in this case will generate a *LOT* of DNS traffic and can
    hang the program. It is recommended to explicitly set 
    HostnameLookups Off in your Apache configuration.
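
One way to check whether a log is affected is to count the lines whose client field is a hostname rather than an IP address. The Python sketch below does that with the standard ipaddress module. It is not part of logrep, the helper names are hypothetical, and it assumes %h is the first whitespace-separated field, as in the common and combined log formats.

```python
import ipaddress


def is_ip_address(host):
    """True if host parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def count_hostnames(lines):
    """Count log lines whose first field is a hostname, not an IP address.

    Assumption: the client host (%h) is the first whitespace-separated
    field on each line.
    """
    count = 0
    for line in lines:
        fields = line.split(None, 1)
        if fields and not is_ip_address(fields[0]):
            count += 1
    return count
```

A non-zero count on a sample of the log suggests that HostnameLookups is on and that the 'cc' and 'country' fields will trigger DNS lookups.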


 EXAMPLES:
 
 "wtop" for all human traffic:   
     $ logrep -m top -f 'bot=0' access.log

 Status code & response times for all Googlebot homepage hits:
     $ logrep -f 'botname=Googlebot' -i home -o status,msec

 Tail for pages about Angelina Jolie or Brad Pitt referred from example.com:
     $ logrep -m tail -f 'url~jolie|pitt,ref~example.com' access.log

 Get maximum response size and average response time for requests 
 grouped by URL class:
     $ logrep -o 'class,max(bytes),avg(msec)' access.log


0.6.3   1 Sep 2008    carlos@bueno.org   http://code.google.com/p/wtop
