MeanTimeBeforeFailure
mtbf: The mean time before failure
Introduction
MTBF, not to be confused with MFBT, plots the mean time before failure of builds for the first 60 days of their release. This report is useful for identifying releases that are crashing more often than expected, day over day, through the release.

Details
For the initial pages of MTBF, we want to compare specific release types against other specific release types (see Release Types below), e.g., Firefox 3.0.7 vs. Firefox 3.0.6 and earlier. To do this, we want to record the following data from all crash reports for a given release:
This is recorded for day 0 through day 59 of a release and then plotted at day intervals against the number of seconds. One would expect a gentle up-and-to-the-right trend for a healthy release. The data can be drilled down to reveal OS-specific MTBF information.

Ideal Output
Ideally, we want to be able to do the following things with an MTBF report, in order of importance:
The reporting itself should happen using a 14-day floating window. Specifically:
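As a rough illustration of the numbers involved, the sketch below computes a mean number of seconds before failure for each day of a release (day 0 through day 59) and then selects the days that fall inside a 14-day window. The record shape (a crash date paired with an uptime in seconds), the function names, and the reading of "floating window" as "the 14 days ending on a chosen day" are all assumptions made for this example, not a description of the actual report code.

```python
from collections import defaultdict
from datetime import date


def daily_mtbf(crashes, release_date, days=60):
    """Return {day_of_release: mean seconds before failure}.

    `crashes` is an iterable of (crash_date, uptime_seconds) pairs.
    This record shape is hypothetical, chosen only for illustration.
    """
    totals = defaultdict(lambda: [0, 0])  # day -> [sum of seconds, crash count]
    for crash_date, uptime_seconds in crashes:
        day = (crash_date - release_date).days
        if 0 <= day < days:  # keep only day 0 through day 59 by default
            totals[day][0] += uptime_seconds
            totals[day][1] += 1
    return {day: total / count for day, (total, count) in sorted(totals.items())}


def window_days(daily, end_day, window=14):
    """Return the entries of `daily` for the `window` days ending on `end_day`.

    Treating the floating window as a trailing window is an assumption.
    """
    first_day = end_day - window + 1
    return {day: value for day, value in daily.items() if first_day <= day <= end_day}


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    release = date(2009, 1, 1)
    sample = [(date(2009, 1, 1), 100), (date(2009, 1, 1), 300), (date(2009, 1, 2), 400)]
    print(window_days(daily_mtbf(sample, release), end_day=13))
```

Days with no crash reports are simply absent from the result; how the real report treats such days is not stated here.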
Release Types
MTBF groups product releases into three types:
For Mozilla, valid examples are:
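The Mozilla-specific matching rules given under Administration (versions containing pre are milestone, versions containing a or b are development) can be sketched as a small classifier. The function name is hypothetical. Two points are assumptions rather than stated rules: that pre is checked first when a version contains both markers, and that every other version is major (suggested by the 3.0.4 configuration example).

```python
def release_type(version):
    """Classify a Mozilla version string by the rules stated on this page."""
    # Assumption: "pre" takes precedence when both markers are present.
    if "pre" in version:
        return "milestone"
    if "a" in version or "b" in version:
        return "development"
    # Assumption: anything else is a major release, as with Firefox 3.0.4.
    return "major"
```

A helper like this could be used to double-check the release value before inserting a row into productdims.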
TODO
A couple of enhancements are planned.
Administration
Configuring new products
The MTBF report is powered by the mtbfconfig and productdims tables.
INSERT INTO productdims (product, version, os_name, release) VALUES ('Firefox', '3.0.4', 'ALL','major');
INSERT INTO productdims (product, version, os_name, release) VALUES ('Firefox', '3.0.4', 'Win','major');
INSERT INTO productdims (product, version, os_name, release) VALUES ('Firefox', '3.0.4', 'Mac','major');
INSERT INTO mtbfconfig (productdims_id, start_dt, end_dt) SELECT id, '2008-12-16', '2009-02-16' FROM productdims WHERE product = 'Firefox' AND version = '3.0.4';

Mozilla Specific
Make sure to match up the release type. Versions with pre are milestone. Versions with a or b in them are development.

Operations
This report is populated by a Python script that cron runs at 8:00pm PST. The run is controlled by configuration data from a table in the database. Each product is given a start and end date, which define a window of 60 days or less during which we want to record statistics. When cron runs, it checks for valid products based on the date and generates the daily report. In the future this will be managed via an admin page, but currently it is managed via SQL.

Bug: populating old data, or data for only one product, isn't possible without deleting all data in the timeframe and re-running the cron for that day.

Hack
from datetime import date
from datetime import timedelta

# Print one startMtbf.py command per day, for 60 consecutive days.
curDay = date(2008, 12, 10)
offset = timedelta(days=1)
for i in range(0, 60):
    print("python breakpad/socorro/scripts/startMtbf.py -d %s" % (curDay.isoformat()))
    curDay += offset

Development
Details about the database design are in ReportDatabaseDesign.
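The date check described under Operations, where each cron run picks up only the products whose configured window covers the run date, might look roughly like the sketch below. The function name and the in-memory row format are hypothetical; the keys mirror the mtbfconfig columns from the INSERT example, and treating both start_dt and end_dt as inclusive is an assumption.

```python
from datetime import date


def active_configs(configs, run_date):
    """Return the config rows whose recording window contains run_date.

    `configs` is an iterable of dicts with the keys productdims_id,
    start_dt and end_dt (names taken from the mtbfconfig INSERT example).
    Assumption: both ends of the window are inclusive.
    """
    return [c for c in configs if c["start_dt"] <= run_date <= c["end_dt"]]


if __name__ == "__main__":
    rows = [
        {"productdims_id": 1, "start_dt": date(2008, 12, 16), "end_dt": date(2009, 2, 16)},
    ]
    print(active_configs(rows, date(2009, 1, 1)))
```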