How Data is Collected

In my experience as a performance tester, the collection and analysis of data varies from client to client. It depends on client requirements and, more often than not, on security or access restrictions on the hosts being monitored. As such, I've had to script and re-script different solutions for different clients. Sometimes I had the luxury of using commercial toolsets such as HP LoadRunner, and at other times I had access to existing performance monitoring solutions such as HP OpenView or IBM Tivoli. However, as most would appreciate, these tools are often expensive and cumbersome to maintain. So I took the path of least resistance and in many cases rolled my own tools, which you can find in the tools/monitors folder.

These scripts typically rely on ssh access or typeperf access to extract performance metrics from the target hosts. I find this to be the least intrusive way of sampling data, requiring minimal software to get up and running quickly. I've also experimented with UDP client/servers to send and retrieve data from monitored hosts.

The key point I want to make for numbrcrunchr is that it doesn't really matter how you sample the data; all that matters is the way in which you subsequently store it in the numbrcrunchr MySQL schema.

Schema

Each numbrcrunchr table follows a layout like this example:

CREATE TABLE `perf_jvm` (
`rowid` int(11) NOT NULL auto_increment,
`date` date default NULL,
`time` time default NULL,
`groupid` varchar(255) default NULL,
`subgroupid` varchar(255) default NULL,
`used_physical_memory` int(11) default NULL,
`heap_size_max` int(11) default NULL,
`free_physical_memory` int(11) default NULL,
`total_garbage_collection_time` int(11) default NULL,
`total_garbage_collection_count` int(11) default NULL,
`total_nursery_size` int(11) default NULL,
`heap_free_percent` int(11) default NULL,
PRIMARY KEY (`rowid`),
KEY `groupid_idx` (`groupid`),
KEY `subgroupid_idx` (`subgroupid`)
) ENGINE=InnoDB AUTO_INCREMENT=125277 DEFAULT CHARSET=latin1;

There are some essential fields that all numbrcrunchr tables must contain. They are:

- rowid - an auto-incrementing key that is useful for ordering and sorting the data
- date - this should be in the MySQL date format, i.e. 'YYYY-MM-DD'
- time - this should be in the MySQL time format, i.e. 'hh:mm:ss'
- groupid - at a minimum, this field must be included. It is typically populated with the name of the target host being monitored.
- subgroupid - this is optional, but is useful for hosts that have subgroups of information, e.g. a JMS server might have a number of managed instances between which you'd like to differentiate.
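
As a rough illustration of the essential fields listed above, here is a minimal Python sketch of how a home-grown monitor might prepare one sample row for a numbrcrunchr table. The helper name build_insert, the host name apphost01 and the subgroup managed1 are hypothetical and not part of numbrcrunchr. The %s placeholders assume a Python MySQL driver that uses that parameter style (PyMySQL is one example); the sketch itself only builds the statement and does not connect to a database.

```python
import re
from datetime import datetime

# Only plain identifiers are allowed, because table and column names
# are interpolated into the SQL text rather than passed as parameters.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def build_insert(table, groupid, metrics, subgroupid=None, when=None):
    """Return (sql, params) for one sample row.

    Hypothetical helper for illustration; rowid is omitted because
    MySQL fills it in via auto_increment.
    """
    if not table.startswith("perf_") or not _IDENTIFIER.fullmatch(table):
        raise ValueError("table name must look like perf_<tablename>")
    if not groupid:
        raise ValueError("groupid is required")

    when = when or datetime.now()
    row = {
        "date": when.strftime("%Y-%m-%d"),   # MySQL date format
        "time": when.strftime("%H:%M:%S"),   # MySQL time format
        "groupid": groupid,
        "subgroupid": subgroupid,            # optional, may stay NULL
    }
    row.update(metrics)                      # one column per sampled metric

    for name in row:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"invalid column name: {name!r}")

    columns = ", ".join(f"`{name}`" for name in row)
    placeholders = ", ".join(["%s"] * len(row))
    sql = f"INSERT INTO `{table}` ({columns}) VALUES ({placeholders})"
    return sql, tuple(row.values())


# Example: one JVM sample for a hypothetical host and managed instance.
sql, params = build_insert(
    "perf_jvm",
    "apphost01",
    {"heap_free_percent": 42, "total_garbage_collection_count": 17},
    subgroupid="managed1",
)
```

The resulting sql and params pair could then be handed to a driver cursor's execute call. How the metric values were sampled (ssh, typeperf, UDP) makes no difference to this step.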
The remaining columns should simply map to the metrics being collected for that table. For example, my perf_jvm table has a number of metrics being sampled, including garbage collection count, heap free percentage and so on. You can have as many fields as you like (within the limits of your MySQL table settings). No further configuration is required within numbrcrunchr, as the available fields are enumerated automatically by the web application.

Note: all numbrcrunchr tables must be named with the perf_ prefix, i.e. perf_<tablename>

Future

It is my intent to keep developing and providing new monitors for the numbrcrunchr toolset. What I've included to date is a set of monitors I know currently work. In future, hopefully more developers or testers can contribute to this toolset with their own tools. I am currently experimenting with other RRD toolsets, such as client monitors developed in Python, which I hope to include soon.