Exercise for the Reader

June 27, 2009

The Ugly Truth Behind Pretty Pictures

Filed under: Uncategorized — Seth Porter @ 12:06 am

Or maybe not that ugly, but it makes a better headline.

Last post I promised the backstory behind the pretty pictures. As a fair warning, there are no pictures this time around, just lots of discussion of Linux monitoring tools and their integration.

As a second caveat, I’m not always positive whether a given tool is vital for the monitoring per se, or whether it was just something I used to set the stage and figure out what I could monitor. That being said, I’ll do my best to reconstruct the history, with the help of Aptitude logs and FSVS diffs.

A very brief plug for a package I like: fsvs is basically Subversion-for-config-files in my usage, though I understand it can be used as a bare-metal deployment system. Well worth looking into. (And I just discovered that apparently it stands for “Fast System VerSioning”. Huh. I always assumed the “FS” was for “File System”, but I admit I never thought much about the “VS”… shame they had to force the acronym with intercaps.)

There are three distinct layers to the software stack, if you follow the data from original collection all the way to final presentation. I’ll start with the data providers.

Sources for Health Data

For some of these I’ll link to the project’s home page, but in all cases these were installed from the Debian package repositories. I’m using the most recent versions for Debian Lenny, aka stable. If it’s relevant, you can check exactly which version I’m using via the Debian search page.

cpufrequtils
I used these to set up CPU frequency scaling policy; can’t remember if this is also used directly in data gathering. The key point here is using kernel freq management instead of a userspace daemon. I don’t think it necessarily matters from a monitoring point of view, but it’s much tidier and should be more efficient.
hddtemp
Monitors hard drive temperatures as reported by SMART-capable drives (otherwise known as “just about any SATA or IDE hard drive these days”). I can’t find a good homepage for this tool (it’s not the one at hddtemp.com, which appears to be Windows-based and a bit full of itself). As I recall, the main purpose (for this use) is to provide a network daemon (localhost only, thank you) which the data gathering can query on demand; there’s a quick example of querying it by hand below. I explicitly disable the periodic monitoring option, which would be redundant (and rolls my logs).
smartmontools
I use this package to enable SMART on all drives and monitor for catastrophic changes — in that case, it sends an e-mail directly to my phone. I also use it to schedule automatic self-tests of the drives (scattered across typical idle time). I don’t remember if this tool is required to make the pictures, but it’s a key part of the peace of mind story.
lm-sensors
(A confession is in order here. I accidentally lied to a co-worker about this one; I don’t actually use sensord at all. I did at one point, but it was just logging to syslog, which kept the drives awake at all times but wasn’t easy to skim for trends. Instead, I use a common monitoring package for all values I’m tracking, as described below. Sorry if I threw you off-track, Eric!)
This very useful package reads motherboard sensors for things like CPU and VRM temperatures, voltages, and fan speeds. There’s a command line tool for detecting and configuring your sensors (some config file editing is required, but it’s a pretty straightforward process). The sensord companion package can monitor for thresholds and log sensor values, but as noted above I prefer to use a common tool for that.
RRDWeather
This is a neat little tool for fetching weather data from weather.com’s web service and storing it in an RRD database. This tool is a little unlike the others, in that it’s directly storing samples, rather than providing a service which can be queried on demand.

The rest of the monitored values are either built into the kernel, or provided directly by the system being monitored (such as Apache’s server stats).
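
If you want to poke at these data sources by hand before wiring up any monitoring, here is roughly what that looks like from a shell. Treat it as a sketch rather than gospel: the device names and e-mail address are placeholders, and the hddtemp port and smartd.conf line are from memory, so check the man pages before copying anything.

    # Query the hddtemp daemon (by default it listens on localhost port 7634)
    nc localhost 7634

    # Or ask a single drive directly, bypassing the daemon
    hddtemp /dev/sda

    # One-shot SMART health check; smartd does the continuous monitoring
    smartctl -H /dev/sda

    # A smartd.conf line roughly in the spirit of mine: monitor everything,
    # mail me on trouble, and schedule short/long self-tests for idle hours
    #   /dev/sda -a -m me@example.com -s (S/../.././02|L/../../6/03)

    # Read whatever motherboard sensors sensors-detect configured
    sensors

    # Check the current CPU frequency scaling governor and speed
    cpufreq-info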

Data Storage

The key part here, and the one that got me started on all of this in the first place, is RRDTool. The RRD stands for “Round Robin Database”. It has no built-in data collection apparatus, nor any GUI or web-based support for graph creation and rendering (though I believe there’s a small example CGI script). Rather, it simply provides a command-line mechanism for creating databases, a command-line or programmatic interface for adding data samples, and a command-line tool to generate charts given a chart specification.

The neat thing about RRD is that the databases can cover long periods of time, with fine near-term temporal resolution, while only consuming a fixed amount of disk. In rough terms, the trick is to progressively degrade the resolution: store the last day or so at 5 second intervals, say, but the past week only at 1 minute intervals, and so on until, far enough back, you might save only one value per day. This is a very clever trick, and solves one pernicious problem of long-baseline logging: eventually the ever-growing logs themselves become the disk-space problem you were trying to detect.
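
To make that concrete, here is a minimal sketch of the creation step. The data source name, bounds, and row counts are invented for illustration, so consult the rrdcreate documentation before trusting the details.

    # One GAUGE data source sampled every 5 seconds, kept at three resolutions:
    #   raw 5-second samples for about a day    (17280 rows)
    #   1-minute averages for about a week      (10080 rows)
    #   daily averages for a bit over a year    (400 rows)
    rrdtool create cpu_temp.rrd --step 5 \
        DS:temp:GAUGE:15:0:120 \
        RRA:AVERAGE:0.5:1:17280 \
        RRA:AVERAGE:0.5:12:10080 \
        RRA:AVERAGE:0.5:17280:400

    # Feeding it a sample later looks like this ("N" means "now")
    rrdtool update cpu_temp.rrd N:42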

There are some interesting consequences of this scheme. The most important to the user is that you have to define, at database creation time, exactly how you want to combine values (to come up with a single value to represent a minute, starting with samples every five seconds). Put another way, you need to figure out up front what charts you want to draw, and make sure you’re saving the data they need. Fortunately, you can specify multiple aggregation functions, but you pay for it in increased disk usage for a given time period. For example, you might decide that you only care about the maximum recorded temperature in a period; in that case the database will only store maximums once it starts coalescing. For something like voltages, where any deviation up or down is bad, you might go further and record average, min and max. (I imagine this could also make a nice “envelope” chart, showing all three at once.) The only real problem is that you can’t decide retroactively that you wish you’d been recording some other aggregate function, since the database has already thrown that information away.
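
Here’s a sketch along those lines for a hypothetical voltage database: three RRAs over the same samples (average, min and max), and then a graph drawing the min/max band with the average on top. The file name, data source name and colors are invented, and the invisible-AREA-plus-STACK trick is just a common RRDTool idiom, not the only way to do it.

    # Keep three aggregates of the same samples (three RRAs' worth of disk)
    rrdtool create vcore.rrd --step 5 \
        DS:volts:GAUGE:15:0:3 \
        RRA:AVERAGE:0.5:60:2016 \
        RRA:MIN:0.5:60:2016 \
        RRA:MAX:0.5:60:2016

    # Later: min/max as a shaded band, average drawn on top of it
    rrdtool graph vcore.png --start -1w --title "Vcore envelope" \
        DEF:vmin=vcore.rrd:volts:MIN \
        DEF:vmax=vcore.rrd:volts:MAX \
        DEF:vavg=vcore.rrd:volts:AVERAGE \
        CDEF:vband=vmax,vmin,- \
        AREA:vmin \
        AREA:vband#ccccee:"min/max band":STACK \
        LINE1:vavg#0000cc:"average"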

A second consequence of this scheme is more subtle, and didn’t immediately occur to me. Simple graphs of these values will behave as expected. However, when you start getting cute and graphing calculated values (such as my stacked CPU utilization chart), you can see anomalies where the line breaks out of the bounds of the actual physical measurement. For example, the stacked line for idle plus system plus user time hit a whopping 300%, despite the theoretical limit of 200% (one hundred per “virtual core”). The problem is that I’m gathering maximums over a period of time, then stacking them together. If, at one point during the five minute period, idle was at 100% (er, I mean 200%), and at another time the kernel was using 200%, then each max would be 200% and the stacked line would be twice as high as it logically could be. Makes perfect sense when you think about it, but it took me longer to catch on to than I’d care to admit. I suspect that this effect would only show up for certain aggregation functions and certain combinations of them (maybe average would always sum to unity?), but I wouldn’t care to bet on it without doing more math than I’m prepared to do for this post.

Oh, one other thing: I posted exclusively PNGs in the “pretty pictures” post, to increase the odds that everyone would be able to see them correctly. The tool, however, is perfectly happy rendering to SVG, PDF, or EPS as well. This is good news if you want to annotate or print the resulting charts. I haven’t played with the other formats much, so I can’t speak to details like how the SVGs look under significant resizing.
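
For reference, the output format is just another graph option; reusing the hypothetical temperature database sketched earlier:

    # Same kind of chart definition, different output format
    rrdtool graph cpu_temp.svg --imgformat SVG --start -1d \
        DEF:t=cpu_temp.rrd:temp:AVERAGE \
        LINE1:t#cc0000:"CPU temperature"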

Data Collection

As noted above, RRDTool provides data storage and rendering, but defers to others to actually provide the data. There’s a remarkably rich ecosystem here — of course you can code your own with the scripting language of your choice, but there are also tools for gathering data from a wide variety of sources. You can find a partial list at the “RRDWorld” page.

Most of the full systems (as opposed to task-specific utilities) are focused on large-scale monitoring; the kind of task where one computer is dedicated almost exclusively to data gathering and presentation, watching an enterprise LAN. Very nice, and presumably useful if you’re into that sort of thing, but muchly overkill for my needs. Also, these suites seem to mostly assume SNMP as a common monitoring language, so I’d need to bridge all those disparate data sources to that format.

A second obvious option would be writing a script to gather exactly the data I want. This would be a cool project, would let me talk to any data source with a reasonable textual interface, and would give me the satisfaction of doing it all myself (er, except for the data storage and rendering). The trouble is I’m just not up to it. I mean, I could write the script perfectly well, but I’d be cutting corners — I probably wouldn’t properly daemonize it, making startup and shutdown tricky, and the time I’d save by working in Perl would be returned with interest in the time I’d spend trying to get CPU utilization down. After all, this isn’t a run-once data-munging script; this would be polling multiple times a minute, potentially spawning quite a few processes each time to gather reports. What I really need is a dedicated program, ideally written in an efficient language (trading programmer time for CPU cycles), and even more ideally able to talk to most of these data sources in their native APIs (to avoid repeatedly launching the command line interfaces).
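
To make the objection concrete, here is roughly the sort of throwaway poller I was talking myself out of: every pass through the loop forks several processes just to read a couple of numbers. The sensors parsing and the RRD layout are invented for illustration.

    #!/bin/sh
    # The naive poller I decided not to write
    while true; do
        disk=$(hddtemp -n /dev/sda)                                      # one process
        core=$(sensors | awk '/Core 0/ {print $3; exit}' | tr -d '+°C')  # three more
        rrdtool update temps.rrd "N:${disk}:${core}"                     # and another
        sleep 10
    done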

Fortunately for me, this program has already been written: collectd. At least at the time I made the choice, this was a dedicated, narrow-focus app, written in C, and emphasizing low impact for high frequency monitoring. (Since then they seem to have added an embedded Perl interpreter and gone to a more modular architecture, but so far the core values seem to have remained intact.) In the newer version in Lenny, all data gathering (and even the RRD backend) is plugin based. However, there are native plugins for all the sources listed above (as well as quite a few kernel and application-native sources), so I’m not running a boatload of Perl fragments every few seconds.

Configuration is slightly tricky: collectd.conf uses an Apache-style, semi-XML format, which took some getting used to. A few of the counters still don’t seem to do exactly what I want; in particular I haven’t found a setup for disk traffic monitoring that deals gracefully with my RAID5 and LVM partitioning scheme. However, on the whole I’ve been very happy with the tool, and I’m quite willing to believe that my remaining problems are user error.
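
For flavor, here is an abbreviated sketch of the sort of thing my collectd.conf contains. The plugin names are real, but the particular options below are trimmed and partly from memory, so collectd.conf(5) is the authority, not this post.

    # /etc/collectd/collectd.conf (abbreviated)
    Interval 10

    LoadPlugin cpu
    LoadPlugin memory
    LoadPlugin sensors
    LoadPlugin hddtemp
    LoadPlugin rrdtool

    <Plugin hddtemp>
        Host "127.0.0.1"
        Port "7634"
    </Plugin>

    <Plugin rrdtool>
        DataDir "/var/lib/collectd/rrd"
    </Plugin>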

Oh, one word of warning: collectd’s preferred RRD setup changed significantly in the upgrade from Debian Etch to Lenny. Unfortunately I missed this at the time, and when I finally caught on to it there wasn’t an easy way to merge my newly collected data and the historical baseline. I still have the old files archived, but this pointed up a particular risk in upgrading data collection tools when you’re trying to maintain a baseline. I’d probably be a lot more upset if I’d been looking at the loss of a year or so of data, so I’m grateful I learned the lesson at the cost of January through March. They were boring months anyway, at least from a server health point of view.

Graph Definition and Rendering

As may have been obvious from the discussion of calculated values, RRDTool has a pretty robust scheme for defining composite values. It’s a sort of funky RPN syntax, but that’s a nice reminder of old HP calculators. This actually happens inside a chart definition, where you can load data from one or more files, then combine the values as desired, before finally drawing various styles of stacked or filled lines. In my limited experience, the RPN syntax is manageable, but the overall file format is a bit of a handful (and begs for a templating engine for related charts). Since I needed a tool anyway to bridge graph requests into Apache (and ideally to manage little things like caching and updating them), I decided to try some web-based front ends.
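
To give a flavor of the RPN business, here is a trimmed-down chart definition of the kind drraw ends up managing for me: system and user CPU time stacked, with their sum overlaid. The paths follow roughly how collectd lays out its RRD files (one value per file, with a data source named “value”), but treat them as illustrative rather than exact.

    # System + user CPU time for one core, with the total computed in RPN
    rrdtool graph cpu0.png --start -1d \
        DEF:sys=cpu-0/cpu-system.rrd:value:AVERAGE \
        DEF:usr=cpu-0/cpu-user.rrd:value:AVERAGE \
        CDEF:busy=sys,usr,+ \
        AREA:sys#cc4444:"system" \
        AREA:usr#4444cc:"user":STACK \
        LINE1:busy#000000:"total busy"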

I started out using Torrus (linked from the RRDWorld page cited above). The version in Debian Etch was SNMP-centric, and offered its own data gathering services, but I could disable all that and just point it at a directory tree of RRD files (which is what collectd produces). There was some Perl wrangling to produce the charts I wanted (since collectd tends to store one datapoint per file, and I wanted both CPUs on the same chart, for example), but the end result was quite workable. I got a hierarchical view of all available data sources, with auto-generated charts for each, as well as the couple of hand-tuned ones. Configuration was a little heavyweight (modifying Perl files as well as XML config files), but it did the job.

As you may guess from the past tense, we have since parted ways. When I upgraded to Debian Lenny, the new collectd hierarchy no longer lined up as well with what Torrus expected. Torrus had also changed; the “plain old directory tree” option was now barely supported, in favor of a streamlined workflow for directly querying your SNMP sources. Wonderful if that’s what you’re doing, but in my case it’s not. So, faced with the prospect of proliferating scripts to try to glue these tools together, I looked for another solution.

That’s when I started using drraw. It follows the lead of RRDTool in being completely ignorant of the semantic domain: it’s simply (and wonderfully) a tool for creating and displaying charts generated from RRD datasets. I briefly missed the auto-generated charts from Torrus, but I found myself making much more data-dense charts as I set out to recreate them. Every single chart on my “dashboard” homepage is at least displaying multiple datasets, and all but one of them are using more or less complex calculated values as well. (The exception that comes to mind is the memory usage chart, which instead uses the RRD-native stacking functionality, as well as using base-1024 instead of base-1000 for scaling.)

So far, several months in, I’ve been very happy with drraw. Its URLs are a little ugly, and if it weren’t for FSVS I’d need a solution for backing up any important chart definitions, but the flexibility is great: it’s trivial to make ad hoc modifications to a chart to test a theory or reveal a trend, then abandon them or update the saved definition if they work out.

In Closing

I have the utmost respect for anyone who’s made it this far. I’d be happy to share configuration details with anyone who wants to try this approach (or a modified version); I’d also love to hear from anyone with better ideas. I certainly haven’t done an exhaustive survey of the available software; this is pretty much satisficing rather than optimizing. My standards are pretty high in terms of data fidelity, and at least moderate in terms of system overhead, but beyond that I wanted to get this monitoring up and running, then get back to my programming projects.

Next time I’ll try to introduce my latest long-running hobby project, a pure .Net (and Mono) implementation of the UPnP MediaServer protocol stack, and maybe sketch some of the twists and turns between having an interesting idea and actually having working software. Thanks for reading.
