My favorites | Sign in
Project Home Downloads Wiki Issues Source
Search
for
UserGuide  
This is the User Guide for jdbcGrabber
Updated May 6, 2010 by trcull%t...@gtempaccount.com

Overview: what am I getting myself into?

At its heart, jdbcGrabber is just a super-thin wrapper around your existing JDBC driver. Conceptually, all it does is notify a "sink" about SQL that just came into the JDBC layer and then pass that same SQL directly through to your existing JDBC driver without any modification whatsoever. Stripped down to its bare essentials, jdbcGrabber doesn't need any external libraries, doesn't start any threads, doesn't touch any files, and doesn't store anything in memory.

In its very simplest configuration with a StdoutSink, all it does is print SQL to standard out (i.e. System.out.println()). In its most complicated configuration with a LimitedQueueSink wrapping a SummarizingSink that writes to a FileSummaryStore it passes your SQL off to a separate thread, remembers what's been executed from which class, and keeps a count of how many times this all happens, writing the results periodically to a file.

How jdbcGrabber behaves depends entirely on what Sinks you tell it to use. To get started, you replace exactly two things in the configuration of your existing application: your JDBC url and your JDBC driver class.

A sink is anything that implements this interface:

public interface SQLSink {
	public void consume(String sql);

	public void consume(TraceElement trace);
}

Where TraceElement is simply a two-part data structure containing a stack trace and a SQL string.

Out of the box, jdbcGrabber has these sinks:

Sink Name Description
StdOutSink Just prints what it gets straight to standard out
SummarizingSink Keeps track of what SQL and what stack traces it has seen and periodically writes them to a file store or prints them to standard out
LimitedQueueSink Wraps either of the other two sinks in a limited-size buffer that's drained on a separate thread, to help prevent jdbcGrabber from falling behind and/or slowing down your app

Take me through step-by-step

Here's what you've got to do if you want to use jdbcGrabber in its very simplest configuration that just writes to standard out:

  1. Add jdbcGrabber.jar to your classpath
  2. Create a text file to hold your jdbcGrabber configuration (let's call it "myconfig.properties" and put it in the /usr/local directory for this example)
  3. Find all the configuration files in your application with the old JDBC url (i.e. "jdbc:mysql://localhost/mydatabase?user=&password=") and replace it with this url: "jdbc:jdbcGrabber:config=/usr/local/myconfig.properties"
  4. Find all the configuration files in your application with the old JDBC driver class (i.e. "org.gjt.mm.mysql.Driver") and replace it with this: "com.thedwick.jdbcGrabber.gbDriver"
  5. Add the following entries to /usr/local/myconfig.properties:
  6.             sink.name=myfirstsink 
                sink.class=com.thedwick.jdbcGrabber.sink.StdOutSink
                url=<the old url from step 3, i.e. jdbc:mysql://localhost/mydatabase?user=&password=>
                driver=<the old driver class from step 4, i.e. org.gjt.mm.mysql.Driver>

I get it, now tell me everything I can put in the config file

You don't have to use just the simplest configuration. There are a lot more options to put in your configuration file to get things started.

Config Used By Description Example
sink.name all Used to uniquely identify this sink for this database connection. If you are using jdbcGrabber on more than one database connection, make sure this is unique across them sink.name=myfirstsink
sink.class all The full java name of the very top level sink sink.class=com.thedwick.jdbcGrabber.sink.StdOutSink
driver all The Java class of the underlying, "real" JDBC driver driver=org.gjt.mm.mysql.Driver
url all The JDBC url of the underlying, "real" JDBC connection url=jdbc:mysql://localhost/mydatabase?user=&password=
LimitedQueueSink.underlying LimitedQueueSink The full java class of the sink that the LimitedQueueSink is wrapping LimitedQueueSink.underlying=com.thedwick.jdbcGrabber.sink.SummarizingSink
LimitedQueueSink.threads LimitedQueueSink Number of worker threads to start to drain the queue LimitedQueueSink.theads=2
LimitedQueueSink.queuesize LimitedQueueSink Maximum number of SQL statements that can be in the queue before the LimitedQueueSink should consider itself full and just start dropping statements to conserve memory and processing power LimitedQueueSink.queuesize=200
summarizingSink.store SummarizingSink The full Java class name of the store to write results to. Out of the box, com.thedwick.jdbcGrabber.sink.FileSummaryStore and com.thedwick.jdbcGrabber.sink.StdOutSummaryStore are supported summarizingSink.store=com.thedwick.jdbcGrabber.sink.StdOutSummaryStore
fileSummaryStore.filePath FileSummaryStore The path where the store should keep its files fileSummaryStore.filePath=/var/log

There are so many options, what do you recommend I start with?

I recommend starting with this configuration, which will give you a limited queue sink that summarizes data and writes the results out to a set of files in /var/log:

sink.name=myfirstsink
sink.class=com.thedwick.jdbcGrabber.sink.LimitedQueueSink
driver=<your driver, i.e. "org.gjt.mm.mysql.Driver">
url=<your url, i.e. jdbc:mysql://localhost/nydatabase?user=&password=">


LimitedQueueSink.underlying=com.thedwick.jdbcGrabber.sink.SummarizingSink
LimitedQueueSink.threads=2
LimitedQueueSink.queuesize=200

summarizingSink.store=com.thedwick.jdbcGrabber.sink.FileSummaryStore
fileSummaryStore.filePath=/var/log

Ok, I've captured my data, now what?

Presumably you didn't capture your SQL just for giggles. If you want to understand what your application is doing, then you have to read the files it saved data to and convert them to a format the lends itself to analysis. jdbcGrabber has been optimized to capture SQL and stack traces as fast as possible without hindering your running application or using too much memory, but unfortunately this means its file format isn't terribly readable without further processing.

Luckily, jdbcGrabber comes with two post-processors that will convert the capture files to something more sensible: BasicSummary and GraphvizDotGraph.

Just taking a peek

BasicSummary is exactly what it sounds like: it reads the capture files, collates them all together, and outputs summary information in a nicer format to standard out like this:

+++++++++++++++++++++++++++++++

For sink: myfirstsink

=====================================

 SELECT p.id, c.name, p.price FROM products p, customers c WHERE c.id = ? AND p.bought_by = c.id

...is called 500 times from...

500 times
	com.myorg.dao.SqlBox.executeQuery(83)
	com.myorg.business.CRMManager.doSomething(555)
	com.myorg.service.ExternalFacade.execute(212)

=====================================

 SELECT signed_date, id, name FROM customers

...is called 10000 times from...

10000 times
	com.myorg.dao.SqlBox.executeQuery(83)
	com.myorg.business.CRMManager.findStaleCustomers(555)
	com.myorg.service.ExternalFacade.execute(212)

=====================================

 SELECT title, paygrade FROM people WHERE title IS NOT NULL

...is called 151 times from...

149 times
	com.myorg.dao.SqlBox.executeQuery(83)
	com.myorg.business.LegitLogic.storeSalaries(332)
	com.myorg.service.ExternalFacade.execute(212)
2 times
	com.myorg.dao.SqlBox.executeQuery(83)
	com.myorg.business.TimTrojan.doINeedARaise(351)
	com.myorg.service.ExternalFacade.execute(212)

when called with this from the jdbcGrabber distribution directory:

java -classpath ./jdbcGrabber.jar:libs/Zql.jar com.thedwick.jdbcGrabber.analysis.BasicSummary /var/log com.myorg full myfirstsink

Turning it all into a pretty graph

Here's where things get cool. You can also use the venerable GraphViz library to create a graph that looks like this:

When called with this:

java -classpath ./jdbcGrabber.jar:libs/Zql.jar com.thedwick.jdbcGrabber.analysis.GraphvizDotGraph ../.. com.myorg myfirstsink > ../../example.dot

Which yields the following GraphViz DOT format file:

digraph G {
	subgraph cluster_1{
		2 [label="products"];
		3 [label="customers"];
		4 [label=" SELECT p.id..."];
                  4 -> 2 [label=""];
                  4 -> 3 [label=""];

		3 [label="customers"];
		8 [label=" SELECT sign..."];
                  8 -> 3 [label=""];

		9 [label="people"];
		10 [label=" SELECT titl..."];
                  10 -> 9 [label=""];

	label="myfirstsink";}
	subgraph cluster_13{
		5 [label="com.myorg.dao.SqlBox"];
                  5 -> 4 [label="500 times"];
		6 [label="com.myorg.business.CRMManager"];
                  6 -> 5 [label="executeQuery"];
		7 [label="com.myorg.service.ExternalFacade"];
                  7 -> 6 [label="doSomething"];
                  5 -> 8 [label="10000 times"];
                  7 -> 6 [label="findStaleCustomers"];
                  5 -> 10 [label="2 times"];
		11 [label="com.myorg.business.TimTrojan"];
                  11 -> 5 [label="executeQuery"];
                  7 -> 11 [label="doINeedARaise"];
                  5 -> 10 [label="149 times"];
		12 [label="com.myorg.business.LegitLogic"];
                  12 -> 5 [label="executeQuery"];
                  7 -> 12 [label="storeSalaries"];

	label="Java";}
}

Understanding the capture files

Now that you're familiar with the tools for pretty-printing your output, let's talk about the raw capture files. The data captured by jdbcGrabber is captured in five different files, each with its own purpose. We use five different files because doing so allows us to tightly optimize the memory usage and I/O usage of jdbcGrabber. For example, by having a "sqlTable" file, we can count the number of calls to a particular SQL statement without having to actually keep that SQL in memory. Any of you who have seen the lengthy SQL that Hibernate creates will appreciate that capability.

So, if you want to do your own analysis of the results, you'll likely have to write a little script to parse the five files and collate them together in memory to do something with them. Below is a table describing the files and their formats.

File Format Description
sqlCounts.<sink name>.dat <sql id> : <count>\n Keeps track of the total number of times a given piece of SQL shows up
sqlTable.<sink name>.dat <sql id> : <sql>\n Maps SQL id to the actual SQL
traceCounts.<sink name>.dat <trace id> : <count>\n Keeps track of the total number of times a given stack trace shows up
traceMapping.<sink name>.dat <sql id> : <trace id>,<trace id>,<trace id>,<...>,\n Maps one or more stack traces to a given piece of SQL
traceTable.<sink name>.dat <trace id> : \n <stack trace, likely multi line>\n|\n Maps trace id to the actual stack trace

I'm feeling pretty comfortable with all this, how do I write my own custom sink?

Ok, so now you understand all the capability available out of the box for jdbcGrabber. But maybe you want to do something different and custom when SQL gets executed. Well, you can.

Because of the pluggable architecture of jdbcGrabber, you can write your own sink and tell jdbcGrabber to call it when it sees a piece of SQL. For this example, let's say you want to print the SQL and stack trace out to standard error instead of standard out. No problem!

Writing the sink

First you have to write your own sink class. A sink class is anything that implements the interface "SQLSink", like this one below:

package com.myorg.analysis;

import java.util.Properties;
import com.thedwick.jdbcGrabber.util.ThreadUtil;

public class StdErrSink implements SQLSink {

	public StdErrSink(Properties props){
		//props is everything that you put in your myconfig.properties 
                //  file.  In this case, we don't care about it too much, but 
                //  we still need a constructor to take it because that's what
                //  jdbcGrabber expects.
	}
	
	@Override
	public void consume(String sql) {
		consume(new TraceElement(ThreadUtil.getStackTrace(), sql));
	}

	@Override
	public void consume(TraceElement trace) {
		System.err.println(trace.getSql());
		for (StackTraceElement element : trace.getStackTrace()){
			System.err.println(element.toString());
		}
	}
}

That wasn't so bad, was it? Now compile that class into a jar somewhere in your application and make sure that jar is in the classpath of your application.

Configuring jdbcGrabber to use my sink

Now you just have to tell JDBCGrabber to use your new sink. For that, you simply have to edit the configuration file you already have. Specifically, change the "sink.class" entry to say this:

sink.class=com.myorg.analysis.StdErrSink

and keep everything else the same. Presto! You've got your own sink running.

Comment by dyer.k...@gmail.com, Aug 10, 2010

good work on the section titles.


Sign in to add a comment
Powered by Google Project Hosting