Updated Jun 15, 2009 by baron.schwartz
ProjectRoadmap  
The vision for Maatkit's future

The vision for Maatkit is to work on several things in parallel:

  1. Important bug fixes, new features, and sponsored work. This is ongoing.
  2. Get out of "code debt." Code cleanup, tool consistency, and better test coverage. Also ongoing.
  3. Approximately Q3 to Q4 2009, revisit some stalled projects and begin new ones.
  4. Approximately Q4 2009 to Q1 2010, begin tackling some of the harder challenges, such as the scaffolding for wrapping the tools into a higher-level service.
  5. Late 2009 to 2010, Baron might finish a book, Using Maatkit.
  6. Late 2009 to 2010, we might begin some work on the long-term plans below.

This work requires that we balance immediate needs against continued progress on long-term goals. The idea is to weigh urgent work against important work, in the Seven Habits sense.

Urgent Work

These things can interrupt ongoing work. The idea is that the short-term gain is worth putting off work on situations that are livable, even if resolving them will help a lot in the long run. We prioritize critical bug fixes, sponsored work, and easy wins.

Getting Out Of Debt

Certain things damage productivity, quality, and usability in the long term. These include code that needs cleanup, inconsistencies between tools, and gaps in test coverage.

See the CodeCleanup label.

There are certain things we can do for the project as a whole, such as adding init scripts (issue 238). These are pretty much the same priority as the "debt" items. We plan to work on these things at a steady pace.

Improve Existing Tools, Build New Ones

We're going to improve existing tools and build new ones, and also design them so they can be combined into more than the sum of their parts. An example of how this is currently done is the --replicate integration between mk-table-checksum and mk-table-sync. We'll be working on similar integration among the audit tools, for example.

Each tool has its own page to describe its roadmap. See the links in the sidebar at left.

A rough priority list:

Audit Tools

We plan to build a suite of audit tools that can do both static and dynamic analysis. These include mk_query_audit, mk_schema_audit, and mk_status_audit. The idea is to be able to do analysis with or without connecting to a MySQL server. For example, suppose a customer sends a file containing CREATE TABLE statements to a Percona consultant. We want to be able to read that file and say intelligent things about the tables it will create. If we have access to the server, we might also inspect the tables themselves and look at cardinality, etc. We can do similar things with query text vs. query execution plans, and variable/status output. These three sister tools will together replace most of mk_audit and some other tools, and we will see what's left of mk_audit and the other tools afterwards.
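
As a rough illustration of the static half of this idea, the sketch below reads CREATE TABLE text with no server connection and flags tables that declare no primary key. It is not code from any Maatkit tool: the function name tables_without_primary_key, the regular expressions, and the sample schema are all assumptions made for the example, and a real audit tool would need a proper SQL parser rather than this naive scan.

```python
import re

# Hypothetical helper, not part of Maatkit: a naive regex-based scan.
NAME_RE = re.compile(
    r"CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?`?(\w+)`?", re.IGNORECASE
)
PK_RE = re.compile(r"PRIMARY\s+KEY", re.IGNORECASE)


def tables_without_primary_key(sql_text):
    """Return names of tables whose CREATE TABLE statement has no PRIMARY KEY."""
    flagged = []
    # Naive split: assumes no semicolons inside comments or string literals.
    for statement in sql_text.split(";"):
        match = NAME_RE.search(statement)
        if match and not PK_RE.search(statement):
            flagged.append(match.group(1))
    return flagged


# Made-up sample input standing in for a file a customer might send.
SAMPLE_SCHEMA = """
CREATE TABLE `users` (
  id INT(11) NOT NULL AUTO_INCREMENT,
  name VARCHAR(64),
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE `audit_log` (
  created DATETIME,
  message TEXT
) ENGINE=MyISAM;
"""

if __name__ == "__main__":
    print(tables_without_primary_key(SAMPLE_SCHEMA))
```

With server access, the same report could presumably be enriched with dynamic facts such as cardinality, which is the split between static and dynamic analysis described above.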

Long-Term Plans

These are project-scoped, administrative things that would be nice to have or do someday.

  1. Add (TM) to the logo. Register the Maatkit trademark, then add (R).
  2. Create a Maatkit foundation to own the trademark and employ people.
  3. Assign copyright to the foundation, or at least clean it up and make copyright and license enforceable.

Part of a Bigger Plan

The 50,000-foot view of Maatkit's future includes our ideas for building useful services on top of and around the Maatkit tools. This is all beyond the horizon. We know it's there, but we don't know much about it, and we don't know when or if this effort will really start.

The core need this is intended to address is the lack of a really good monitoring and advisory system for MySQL. However, there is a real problem with monitoring, graphing and advisory systems: there are too many, and they all fail miserably in very basic ways! This happens for a very simple reason: such systems are much harder to build than people believe they are. I (Baron) haven't seen anyone even come close to getting it right yet, though I hold out hope for Reconnoiter.

Maatkit will not fall into that trap. Instead of building yet another graphing system or yet another system that alerts the user when the key buffer miss ratio grows too high, or embarking on yet another doomed attempt to boil the ocean, we are going to be far more intelligent about it.

Key points we will avoid:

  • Simplistic, useless alarms built around ratios of status variables.
  • Re-inventing the wheel.
  • A system that's tied to a particular type of user.

Key points we intend to address:

  • Different use cases. The system will not be integrated or pre-packaged. Instead, it will consist of a set of tools, and instructions on how to stack them together for different uses. The tools should scale flexibly to meet these uses:
    1. One-off jobs: A single user downloads the tools and runs them one time for a performance audit.
    2. Periodic collection, local or no storage: A sysadmin installs a cron job or logrotate action to run one or more tools periodically and process the results (perhaps by emailing them out).
    3. Continual collection, local storage: The tools run as daemons and collect information continuously, and store it in the database they monitor.
    4. Continual collection, central storage: The tools run as daemons, and are either a) polled by a central system or b) push information to a central system.
    5. Useful functionality for shared hosting providers, who might want to run the system and provide each user access to their information, but not see information about other users who are running on the same server.
  • Different types of analysis and presentation. Because the data collection is decoupled from the storage, we can let people store their cake and then eat it any way they want:
    1. Visual inspection of pretty-printed reports.
    2. Helper tools that work with pretty-printed reports and/or information from the database.
    3. Query the database manually with SQL.
    4. Build an HTML front-end.
  • Technology agnostic. There should be no dependency on particular server patches or versions of MySQL. We should be able to collect as much information as is available. If we're running against a stock MySQL server, we can sniff the processlist or use tcpdump to gather query events, for example; if we're using a Percona build we will have significantly more information in the slow query log, and we should smoothly upgrade the functionality accordingly.
  • Choice of polling or agent-based monitoring. Some people like one or the other. Each has advantages -- agents can survive past network blips, polling keeps configuration centralized; some people want zero-install on the monitored system, some people want zero-configuration on the central system. We should not preclude any approach.
    1. There's nothing wrong with "ssh user@host tail -f /var/lib/mysql/slow.log | mk-query-digest".
    2. There's nothing wrong with "mk-query-digest --processlist host --review-history localhost" either.
    3. Centralizing the work of aggregating and reporting the events is fine; distributing it with agents and saving bandwidth is fine, too.
  • Choice of how to get results from an agent/daemon. You could give it a SIGHUP, for example; or it could listen on a socket and empty its ring buffer into the socket in some structured markup. Or it could just store its results into a MySQL server, which avoids the need for a protocol and an open socket.
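
To make the SIGHUP option a little more concrete, here is a minimal sketch of a daemon-style collector that keeps recent events in a ring buffer and empties it as structured output when signalled. Nothing here is an existing Maatkit interface: the Agent class, its method names, and the choice of one JSON object per line are assumptions for illustration only. The signal handler only sets a flag, and the actual writing happens from the main loop, which keeps I/O out of the handler.

```python
import collections
import json
import os
import signal
import sys


class Agent:
    """Hypothetical collector: a ring buffer that is dumped on request."""

    def __init__(self, capacity, out=sys.stdout):
        # deque with maxlen drops the oldest event once the buffer is full.
        self.events = collections.deque(maxlen=capacity)
        self.out = out
        self.dump_requested = False

    def record(self, event):
        self.events.append(event)

    def request_dump(self, signum=None, frame=None):
        # Usable as a signal handler: only set a flag, do no I/O here.
        self.dump_requested = True

    def dump_if_requested(self):
        """Write buffered events as JSON lines; return how many were written."""
        if not self.dump_requested:
            return 0
        self.dump_requested = False
        count = 0
        while self.events:
            self.out.write(json.dumps(self.events.popleft()) + "\n")
            count += 1
        self.out.flush()
        return count


if __name__ == "__main__":
    agent = Agent(capacity=2)
    signal.signal(signal.SIGHUP, agent.request_dump)  # SIGHUP is Unix-only
    for n in range(3):
        agent.record({"query_id": n})  # made-up event payload
    os.kill(os.getpid(), signal.SIGHUP)  # stand-in for an operator's kill -HUP
    agent.dump_if_requested()
```

Storing results in a MySQL server instead would replace the write to `out` with an INSERT, which is why that option needs neither a protocol nor an open socket.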

Up until now we've been a bit vague about exactly what kinds of information this system will deal with. That's because this should also be pretty loosely defined. Defining it too well makes the system suitable only for some users, and we want it to be suitable for all kinds of users. We are also not so arrogant as to pretend to tell users what they need. Here are some types of information a user might want:

  • Detailed information about query execution, including worst-time examples, EXPLAIN plans and the ability to see when they change.
  • Table and index sizes (data and row counts), plus index. Basically SHOW TABLE STATUS.
  • Configuration variables and records of their changes over time.
  • A signature of the server (IP address, master host, MySQL version) and changes over time.
  • User, table and index statistics (from the Google patches) if available.
  • Snapshots of SHOW FULL PROCESSLIST at intervals. It is interesting to do this as well as store the aggregated information from the log parser, because the log parser can't show you the interaction of queries (for example, one thread updates a MyISAM table and everything else piles up behind it).
  • Snapshots of SHOW STATUS and SHOW INNODB STATUS (both the text, and extracted values) at intervals.
  • Functionality to permit adding meta-data to anything: a server, a query, a user, what have you.
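
Several of these items boil down to taking snapshots and recording what changed between them. As a small sketch of that idea for configuration variables, the code below compares two snapshots and reports the differences. The function names parse_snapshot and diff_variables are hypothetical, and the input format is an assumption: tab-separated name and value lines with a Variable_name header row, as the mysql client prints SHOW VARIABLES in batch mode. The sample values are made up.

```python
def parse_snapshot(text):
    """Parse tab-separated 'name<TAB>value' lines into a dict (assumed format)."""
    snapshot = {}
    for line in text.splitlines():
        name, sep, value = line.partition("\t")
        # Skip blank or malformed lines and the header row.
        if sep and name != "Variable_name":
            snapshot[name] = value
    return snapshot


def diff_variables(old, new):
    """Return {name: (before, after)} for variables that differ.

    A value of None means the variable was absent from that snapshot.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


if __name__ == "__main__":
    # Made-up snapshots standing in for two collection runs.
    earlier = parse_snapshot(
        "Variable_name\tValue\nkey_buffer_size\t8388608\nmax_connections\t100\n"
    )
    later = parse_snapshot(
        "Variable_name\tValue\nkey_buffer_size\t16777216\nmax_connections\t100\n"
    )
    for name, (before, after) in diff_variables(earlier, later).items():
        print(f"{name}: {before} -> {after}")
```

The same compare-two-snapshots shape would apply to a server signature or to EXPLAIN plans, with only the parsing step changing.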

As a reminder, this system will not build and deliver the above-mentioned functionality. Rather, it will provide basic services that can be used in the above-mentioned ways, and not preclude any of the mentioned use cases. Most monitoring/advisory/graphing systems say "there are lots of ways to do this; we pick this one and you're stuck with it." The Maatkit tools will support those who say "I can't find anything that does what I need, so I will glue these tools together with a few cron jobs and a web front-end to suit my preferences."

Hosted by Google Code