Splunk Replay: IT Events in Motion
Inspired by glTail.rb and Digg Labs' Stack, Splunk Replay is a Flash-based data visualization tool that “replays” your Splunk'd logfile activity in an animated layout. You can click on the image below to see an early version of Replay running on anonymized event data from our internal wiki system.
Replay generates animated bar charts using two extracted fields from the events it receives from Splunk. For example, if you have Splunk eat wiki data, you can plot each wiki user against the wiki page they are editing, and then animate those relationships over a given time range.
Event particles are emitted from rows on the y-axis and stack up in columns on the x-axis. When a new row value is created, a random color is assigned to it for the duration of the session. These colors are then used in stacked bars to illustrate the amount of activity for a given row value. Older values on both axes are cycled out if more room is needed for newer data.
Not all data sets are suitable for this type of visualization. On the whole, it's better to use two fields that share a many-to-many relationship. If one or both fields have only a handful of unique values, the visualization will be less than interesting.
Installing Splunk
Replay is a Flash application which uses Splunk's REST APIs to fetch and graph event data. To use it, you'll need to download and install Splunk first. The free version of Splunk will index 500MB of data a day. If you need help setting up Splunk or getting it to eat your event data, visit the documentation pages, or hit us up on #splunk on IRC.
Come back here when you are done installing Splunk and setting it up to index one of your web server logfiles. Also, make sure you have your $SPLUNK_HOME variable set to wherever you installed Splunk. For example, if your Splunk directory was /opt/splunk/, you could run:
export SPLUNK_HOME=/opt/splunk/
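Since every later step relies on $SPLUNK_HOME, it can help to sanity-check it first. The following is only a sketch: check_splunk_home is a made-up helper name (not part of Splunk), and it assumes that an executable at $SPLUNK_HOME/bin/splunk is a good enough sign of a working install.

```shell
# Sketch: confirm SPLUNK_HOME points at a Splunk install before continuing.
# check_splunk_home is a hypothetical helper name, not part of Splunk.
check_splunk_home() {
    if [ -z "${SPLUNK_HOME:-}" ]; then
        echo "SPLUNK_HOME is not set" >&2
        return 1
    fi
    # The install steps on this page call $SPLUNK_HOME/bin/splunk,
    # so an executable at that path is a reasonable sanity check.
    if [ ! -x "${SPLUNK_HOME%/}/bin/splunk" ]; then
        echo "no executable found at ${SPLUNK_HOME%/}/bin/splunk" >&2
        return 1
    fi
    echo "SPLUNK_HOME looks good: ${SPLUNK_HOME%/}"
}
```

After defining the function, run check_splunk_home on its own; it returns non-zero and prints a message to stderr if something is off.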
Note: For this particular example, we'll assume you are having Splunk index a combined-access web server logfile. If your web access logs are fairly "normal", Splunk should automatically extract the values for named fields like clientip, uri, status, user-agent, and more.
Check Your Field Extractions
Replay can only run on fields extracted from your indexed event data. For example, if you wanted to graph IP addresses against user agent names, you'd need both the clientip and user_agent fields extracted. Again, in most cases, Splunk will automatically identify web access logfiles and will already be extracting these fields for you.
To check that you have the necessary fields extracted, go into the Splunk interface and click the access_combined sourcetype in the dashboard, then click on the "fields" pulldown at the top left. A long list of fields will appear, and you can see if Splunk knows about the fields you want to graph. Here's a screenshot of the "clientip" field (remote IP address) showing as being extracted:
Note: It's not necessary to put a check mark next to the fields you'll be using in the Splunk UI. Replay uses the REST APIs to fetch its information, and Splunk provides all field extractions in the requests to the API by default. We'll discuss narrowing those fields in a bit.
Installing Replay
Replay needs to be served up by Splunk to work correctly. If you have wget and tar, you can do all this from the command line:
export SPLUNK_HOME=/opt/splunk/
wget http://splunk-flash.googlecode.com/files/splunk-replay.tar.gz
tar xvfz splunk-replay.tar.gz
mv splunk-replay $SPLUNK_HOME/share/splunk/search_oxiclean/static/html/
Note: These examples assume you have Splunk installed in /opt/splunk/ - make sure you use the right path for your install!
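To double-check that the files ended up where SplunkWeb serves them from, something like the sketch below may help. replay_installed is a hypothetical helper, and the replay.html file name is an assumption taken from the Replay URL used later on this page.

```shell
# Sketch: check that Replay landed in SplunkWeb's static html directory.
# replay_installed is a hypothetical helper; the replay.html file name is
# an assumption based on the Replay URL used later on this page.
# $1 = your Splunk install directory
replay_installed() {
    replay_dir="${1%/}/share/splunk/search_oxiclean/static/html/splunk-replay"
    if [ -f "$replay_dir/replay.html" ]; then
        echo "found $replay_dir/replay.html"
    else
        echo "missing $replay_dir/replay.html" >&2
        return 1
    fi
}
```

Call it as replay_installed "$SPLUNK_HOME" once the mv command has finished.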
Installing the Patch
Splunk 3.22 and 3.23 need a small patch applied to be compatible with the Flash Replay application. Start by stopping your Splunk instance, then download and install the patch. This example assumes you are running 3.23:
export SPLUNK_HOME=/opt/splunk/
$SPLUNK_HOME/bin/splunk stop
wget http://splunk-flash.googlecode.com/files/splunk_flash_patch_for_3.23.tar.gz
tar xvfz splunk_flash_patch_for_3.23.tar.gz
cd splunk_flash_patch_for_3.23
mv *.py $SPLUNK_HOME/lib/python2.5/site-packages/splunk/appserver/oxiclean
$SPLUNK_HOME/bin/splunk start
Note: This procedure overwrites one existing file and installs a new one that is used to proxy the requests from splunkd to Replay.
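Because the patch overwrites a file, you may want a copy of the original first. This is a rough sketch rather than part of the official procedure: backup_before_patch is a made-up helper, and the .orig suffix is just a convention chosen here.

```shell
# Sketch: back up any file in the target directory that the patch would
# overwrite. backup_before_patch is a hypothetical helper name.
# $1 = unpacked patch directory, $2 = directory the *.py files are moved into
backup_before_patch() {
    patch_dir=$1
    target_dir=$2
    for patch_file in "$patch_dir"/*.py; do
        # If the glob matched nothing, the literal pattern comes back; skip it.
        [ -e "$patch_file" ] || continue
        name=$(basename "$patch_file")
        if [ -e "$target_dir/$name" ]; then
            # Keep the original next to the patched file with a .orig suffix.
            cp -p "$target_dir/$name" "$target_dir/$name.orig"
            echo "backed up $name"
        fi
    done
}
```

You would run it after stopping Splunk and unpacking the tarball, but before the mv step, for example: backup_before_patch splunk_flash_patch_for_3.23 "$SPLUNK_HOME/lib/python2.5/site-packages/splunk/appserver/oxiclean"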
Configuring Replay
You'll need to tell Replay where it's installed. If you have vi installed, you can just run the following two commands:
cd $SPLUNK_HOME/share/splunk/search_oxiclean/static/html/splunk-replay
vi replay_search_config.xml
Here's a quick run down of the config file properties:
- hostPath - The URL for the server and port that SplunkWeb is listening on. This is usually the default URL for your Splunk instance, and can be copied out of your browser. An example would be 'http://www.zoto.com:8000'. Note: This must be on the same subdomain/domain that your Splunk instance is listening to!
- username - The username for the Splunk account. Use `` if you are running the free version of Splunk.
- password - The password for the Splunk account. Use `` if you are running the free version of Splunk. Sounds familiar.
- outputTruncate - The length at which to truncate the outputField output strings. A value of 0 will not truncate the output strings.
- timeField - The field that contains the time values for each event (number of seconds before or after 0:00:00 GMT January 1, 1970). This is usually the _time field.
- minColor - The minimum color values to use for generating the colors in Replay. Some minimum values will result in dark, hideously colored graphs.
- maxColor - The maximum color values to use for generating the colors in Replay. Don't do it man.
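If the epoch values behind timeField look opaque, a quick conversion can make them readable. The sketch below assumes GNU date (the "-d @N" form); epoch_to_search_time is a hypothetical helper, and the output layout simply mirrors the MM/DD/YYYY:HH:MM:SS style of the starttime and endtime values used in the search example on this page. Note that it prints GMT, not your local time.

```shell
# Sketch: convert an epoch value (seconds since 0:00:00 GMT January 1, 1970)
# into a readable MM/DD/YYYY:HH:MM:SS string in GMT.
# epoch_to_search_time is a hypothetical helper; "-d @N" is GNU date syntax
# (BSD/macOS date uses "-r N" instead).
epoch_to_search_time() {
    date -u -d "@$1" +"%m/%d/%Y:%H:%M:%S"
}

epoch_to_search_time 1208390400   # prints 04/17/2008:00:00:00
```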
Note: Flash is really, really fussy with caching things. If you change your configuration file, you may need to clear your browser's cache, or restart it to get the changes to take effect.
Using Replay
Figure out the URL for Replay
Because SplunkWeb requires things to be in certain directories, and does page versioning, Replay's URL is a bit long and complicated:
http://yourhost.yourserver.com:8000/static_search_oxiclean_35555/html/splunk-replay/replay.html
If this doesn't work for you, or you are using 3.22, you'll want to build a new URL that uses the 'static_search_oxiclean_xxxxx' path segment from your own server. To find it, look at the HTML source of one of the Splunk pages on your server. You should see something like this at the top:
<link rel="stylesheet" type="text/css" media="all" href="/static_search_oxiclean_35555/css/default.css" />
<link rel="stylesheet" type="text/css" media="print" href="/static_search_oxiclean_35555/css/print.css" />
Notice the version number on the end of the oxiclean path. That's the same path segment you'll want to use in your Replay URL.
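If you'd rather not pick the path segment out by eye, the lookup can be sketched in a few lines of shell. replay_url is a hypothetical helper, and it assumes the version suffix is purely numeric, as in the example above. It reads the HTML source on stdin, so you can feed it a page you saved from your browser.

```shell
# Sketch: pull the versioned static path segment out of a Splunk page's
# HTML source and build the Replay URL from it.
# replay_url is a hypothetical helper; it assumes a numeric version suffix.
# $1 = base URL of SplunkWeb, stdin = HTML source of any Splunk page
replay_url() {
    segment=$(grep -o 'static_search_oxiclean_[0-9][0-9]*' | head -n 1)
    if [ -z "$segment" ]; then
        echo "no static_search_oxiclean_ path found" >&2
        return 1
    fi
    echo "${1%/}/$segment/html/splunk-replay/replay.html"
}

html='<link rel="stylesheet" type="text/css" media="all" href="/static_search_oxiclean_35555/css/default.css" />'
printf '%s\n' "$html" | replay_url "http://yourhost.yourserver.com:8000"
```

With the sample line above, this prints the same URL shown at the start of this section.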
Doing Searches
Before you enter the search into Replay, you should execute the search in the Splunk interface to make certain you get enough results, and that there aren't too many results for Replay to handle.
Start by narrowing your search to a given time range:
The next step is optional, but you'll probably want to do it to speed up Replay's load times. If you don't narrow the fields returned to Replay, it can take upwards of 5 minutes to load ~10K events. To narrow to a particular set of fields, use the "fields" command in your search. For example, if you want to graph clientip against uri for your access_log logfile on 4/17/08, you could do the following search:
source="access_log" starttime="04/17/2008:00:00:00" endtime="04/17/2008:23:59:59" | fields _time, _raw, clientip, uri
Note: The "_time" and "_raw" fields are used by Splunk, and are needed by Replay to parse the events correctly. We've included the "starttime" and "endtime" entries here, but you won't see these in the Splunk UI because they are parsed out and handled by the calendar widget.
Once you've built the search you want to pass to Replay, click on the search pulldown and select save search. The UI will display a modal popup with the entire search string you need to copy and paste.
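If you find yourself rebuilding this search for different days or field pairs, a small helper can assemble the string for you. This is only a sketch: build_replay_search is a made-up function, not a Splunk command, and it just reproduces the layout of the example search above.

```shell
# Sketch: assemble the search string for one day of data.
# build_replay_search is a hypothetical helper, not a Splunk command.
# $1 = source name, $2 = day as MM/DD/YYYY, remaining args = fields to plot
build_replay_search() {
    src=$1
    day=$2
    shift 2
    # _time and _raw are always requested because Replay needs them.
    fields="_time, _raw"
    for field in "$@"; do
        fields="$fields, $field"
    done
    printf 'source="%s" starttime="%s:00:00:00" endtime="%s:23:59:59" | fields %s\n' \
        "$src" "$day" "$day" "$fields"
}

build_replay_search access_log 04/17/2008 clientip uri
```

The call at the bottom prints the same search string as the example above. It is still worth running the result in the Splunk interface first to confirm it returns events.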
Once you've copied the string, go into Replay and paste the string into the search box. Click on the > button to start the search. If you mess up the search, just reload the page to clear the search box. We're working on improving the interface, so for now editing and deleting in the search box is a bit rough.
You should see a spinner appear and a status on the number of events being loaded. If the spinner doesn't go away, try narrowing your search a bit, and verifying it brings back results in the Splunk UI.
Once the search is done, you can use the field pulldowns to select which fields will be plotted against each other. Use the field pulldown at the bottom to choose which event field (or the _raw field) is shown when an event is displayed.
We'll post a screencast on doing all this in a few days. If you need assistance with setting up Replay, please post to the mailing list.

