Exercise for the Reader

November 26, 2008

Justifying the Library Habit

Filed under: Development Infrastructure — Seth Porter @ 6:24 pm

I think we still tend to underestimate the importance of (code) libraries and how our code interacts with them. The only time I’ve written fan mail was to Steve McConnell after reading (and being blown away by) Code Complete; I asked him for his thoughts on how to write code which fully leverages libraries, but still retains some independence. In conversations at work and elsewhere, I’ve realized that I have an internal categorization of libraries which isn’t widely shared (perhaps for good reason). Be that as it may, it’s my blog, so I’m going to lay it out.

There are several different reasons for using a library; each needs to be accompanied by a different style of use. The different reasons also imply different levels of coupling between your code and the library. At one extreme, your code can be completely oblivious to the existence of a particular library (some pluggable image loader models, for example, which transparently expand the file handling capabilities of all applications). At the other limit, you explicitly assume throughout your application that a particular library is available. The first is clearly preferable, but there are many times when you’re forced to accept a degree of coupling for one reason or another. The key is to be honest about how dependent you are on a library, and make sure that there’s some inverse correlation between degree of dependency and the likelihood that you’ll need to replace that library.
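To make the "oblivious" end of that spectrum concrete, here is a minimal sketch in Python of a pluggable loader registry. Every name in it (register_loader, load_image, the fake PNG loader) is made up for illustration and is not taken from any real imaging library; the point is only that the application code never names the library doing the work.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a file extension to a loader function.
_LOADERS: Dict[str, Callable[[bytes], str]] = {}


def register_loader(extension: str, loader: Callable[[bytes], str]) -> None:
    """Called by a plug-in to announce that it can handle an extension."""
    _LOADERS[extension.lower()] = loader


def load_image(filename: str, data: bytes) -> str:
    """Application entry point: it never names a specific image library."""
    extension = filename.rsplit(".", 1)[-1].lower()
    try:
        loader = _LOADERS[extension]
    except KeyError:
        raise ValueError(f"no loader registered for .{extension}") from None
    return loader(data)


# A stand-in plug-in. A real one would wrap an actual decoding library.
def _fake_png_loader(data: bytes) -> str:
    return f"PNG image, {len(data)} bytes"


register_loader("png", _fake_png_loader)
```

Swapping the PNG implementation means changing one register_loader call; nothing that calls load_image has to know. The other extreme would be importing a specific decoder directly wherever an image is opened.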

Using a Library to Do Something You Can’t

This is the simplest and most common reason for breaking out a library, and there is certainly nothing wrong with it. In some extreme examples, it may be literally impossible to do without the library; for example, low-level I/O functions in the more protective languages may be completely inaccessible without using the core runtime libraries. More commonly, libraries provide levels of performance (matrix transposition tweaked for different processors, say) or correctness (full implementations of the TIFF standards, for instance) which you simply can’t afford to match.

This is especially true when you don’t have a lot of people or time, or when the functionality is unrelated to the core focus of your work. Compilers are a great example of this sort of thing: many a company has innocently written a DSL (domain specific language) for some task, and found themselves a few years later working full time on scoping issues, implicit numeric promotions, and other fun language tasks completely unrelated to the original problem. This can be a fun way to make a living, I imagine, but it’s a rare organization that can afford to support it.

In this case, the make or buy choice is obvious — you are unable to create and maintain the code yourself, so you have to find it elsewhere.

Sometimes, you may be able to maintain some independence from a particular implementation (standardized libraries fall into this camp, as do implementations of specified interfaces such as video drivers or JDBC adapters). In other cases, there is only one reasonable implementation, and you might as well assume it throughout your code. Using the language runtime libraries is a typical case here (where you commit to assuming the standardized library interface, and quite possibly to the quirks of your local implementation). Another might be the hosting application, for plug-ins or scripted tools; if your application is written in WordBasic, it might be reasonable to assume the existence of Word.
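As a rough illustration of coding against a specified interface rather than one implementation, here is a sketch using Python's DB-API 2.0, which plays a role loosely comparable to JDBC. The function and table names are hypothetical, and sqlite3 is simply the driver that ships with Python, not a recommendation.

```python
import sqlite3


def count_rows(connection, table_name: str) -> int:
    """Count rows using only calls defined by the DB-API 2.0 interface.

    Nothing here names sqlite3, so another conforming driver could be
    passed in instead. The table name is assumed to come from trusted code.
    """
    cursor = connection.cursor()
    try:
        cursor.execute(f"SELECT COUNT(*) FROM {table_name}")
        return cursor.fetchone()[0]
    finally:
        cursor.close()


def demo_count() -> int:
    # The only place that commits to one implementation.
    connection = sqlite3.connect(":memory:")
    # Connection.execute is an sqlite3 shortcut, not part of the standard.
    connection.execute("CREATE TABLE books (title TEXT)")
    connection.executemany(
        "INSERT INTO books (title) VALUES (?)",  # '?' is sqlite3's paramstyle
        [("Book A",), ("Book B",)],
    )
    result = count_rows(connection, "books")
    connection.close()
    return result
```

Note how the "quirks of your local implementation" creep in even here: the '?' placeholder and the Connection.execute shortcut are sqlite3 choices, so demo_count is coupled to that driver while count_rows is not.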

Using a Library Because Everyone Else Is

Again, this is not a bad thing. In the absence of compelling reasons otherwise, you may well want to use the same logging library as some other components you’re using, for instance. Your and their logging can be controlled from the same configuration file, use the same formatters, and so forth.
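A small sketch of that shared-configuration benefit, using Python's standard logging module. The logger names "myapp.orders" and "vendorlib.http" are invented stand-ins for your code and a third-party component; the assumption is that both simply ask the standard library for a logger.

```python
import io
import logging

# Two hypothetical components, each asking the standard library for a logger.
app_log = logging.getLogger("myapp.orders")
vendor_log = logging.getLogger("vendorlib.http")


def configure_logging(stream) -> None:
    """One configuration point: a single handler and format on the root logger."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    root = logging.getLogger()
    root.handlers[:] = [handler]  # replace whatever handlers were there before
    root.setLevel(logging.INFO)


def run_demo() -> str:
    buffer = io.StringIO()
    configure_logging(buffer)
    app_log.info("order accepted")
    vendor_log.warning("retrying request")
    return buffer.getvalue()
```

Both components end up in the same stream with the same format, and neither had to know about the other. If the vendor component used its own private logging scheme, you would be maintaining two configurations instead of one.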

If the implementation is sufficiently broadly used, you can even get away with tolerating some bugs: everyone else has the same behavior, and other people will just have to work around it. An example here might be using the Internet Explorer engine for HTML rendering; it’s known to have strange issues and to lack standards compliance, but pretty much everyone just works around it. I don’t encourage this approach in general, but you have to be able to narrow your focus somehow, and de facto standards can help. A more optimistic reasoning here is that if all parts of your application use the same libraries, at least they’ll all have the same bugs. You don’t want your syntax checker to have a different interpretation of whitespace than your actual parser, unless you want to spend a lot of time meeting new people with very pointed opinions about your code.

This reason alone may be enough to determine your choice of libraries, if your project is sufficiently resource-constrained. If you need to download as part of a web page, or run on a mobile device, or even just live within a given installation budget, then you may simply not have room for redundant XML parsers.

This is also a case where the make or buy decision may be forced. If you can’t or don’t want to use a different third-party implementation, it’s hard to justify reimplementing it yourself (in developer time or code size or internal consistency). Your level of independence from these libraries will vary.

Using a Library As a Convenience

This is perhaps the easiest reason to justify; the danger is that it’s easy to slip into one of the other cases without realizing it. In the other cases, the answer to “Could I (reasonably) write this code myself?” was probably a resounding “No!” In this case, you can easily imagine getting the results you’re currently getting without using the library. That doesn’t mean that you’re getting no value from the library; typically it will provide a great deal of flexibility, while your imagined replacement code would be hard-wired for one purpose. However, from the point of view of your code, there’s nothing happening that you couldn’t reproduce. Some of my favorite libraries fall into this category. They tend to neatly handle cross-cutting tasks of the “I could write that, but I don’t want to have to maintain it” variety.

One example is the Spring Framework for inversion of control / dependency injection. (It does a lot of other things, but that’s what I think of as its primary mission; most of the rest are more “now that we have a good inversion of control abstraction, we can do this neat thing”: applications of the core function, not the core itself.) For any given Spring document (XML files listing objects, primitive property values, or links to other objects), it would be easy to write straight-line Java or C# or whatever to construct the same object graph. The advantage is that it moves that knowledge out of the cooperating classes and into some externalized blueprint for assembly.
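To show what "straight-line code versus an externalized blueprint" means in miniature, here is a sketch in Python. It is not Spring's API or file format; the classes, the BLUEPRINT dictionary, and the {"ref": ...} convention are all invented for illustration, and the assembler does no cycle detection.

```python
from typing import Optional


class Database:
    def __init__(self, url: str) -> None:
        self.url = url


class OrderService:
    def __init__(self, database: Database, retries: int) -> None:
        self.database = database
        self.retries = retries


def build_by_hand() -> OrderService:
    """Straight-line assembly: the wiring knowledge lives in code."""
    return OrderService(Database("db://localhost/orders"), retries=3)


# The same graph as data. A value of the form {"ref": name} links to another object.
BLUEPRINT = {
    "database": {"class": Database, "args": {"url": "db://localhost/orders"}},
    "orders": {
        "class": OrderService,
        "args": {"database": {"ref": "database"}, "retries": 3},
    },
}


def build_from_blueprint(blueprint: dict, name: str, cache: Optional[dict] = None):
    """Build the named object, resolving {"ref": ...} links recursively."""
    cache = {} if cache is None else cache
    if name not in cache:
        spec = blueprint[name]
        args = {
            key: build_from_blueprint(blueprint, value["ref"], cache)
            if isinstance(value, dict) and "ref" in value
            else value
            for key, value in spec["args"].items()
        }
        cache[name] = spec["class"](**args)
    return cache[name]
```

Both build_by_hand() and build_from_blueprint(BLUEPRINT, "orders") produce an equivalent object graph. Neither Database nor OrderService knows how it was assembled, which is the property that makes the library replaceable by hand-written code if it ever comes to that.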

Another example here would be Hibernate. Again, reading rows from a database result set and reconstituting them into plain old objects is not an impossible task. It’s just very, very tedious, hard to maintain, and easy to get wrong (especially over the many repetitions that you’ll need).
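For a sense of the tedium, here is a hand-written version of that mapping for a single table, sketched in Python with the built-in sqlite3 driver. The Customer class and the customers table are hypothetical; an object-relational mapper would generate the equivalent of load_customers from a mapping description.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    id: int
    name: str
    active: bool


def load_customers(connection: sqlite3.Connection) -> List[Customer]:
    """Hand-written mapping: column order, names and types are all repeated here."""
    rows = connection.execute(
        "SELECT id, name, active FROM customers ORDER BY id"
    ).fetchall()
    return [
        Customer(id=row[0], name=row[1], active=bool(row[2]))  # 0/1 -> bool by hand
        for row in rows
    ]


def demo_customers() -> List[Customer]:
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, active INTEGER)"
    )
    connection.executemany(
        "INSERT INTO customers (id, name, active) VALUES (?, ?, ?)",
        [(1, "Ada", 1), (2, "Grace", 0)],
    )
    customers = load_customers(connection)
    connection.close()
    return customers
```

Nothing here is hard, but the column list appears in the SQL, in the positional indexes, and in the class definition. Add a column and all three must change together, for every table, which is exactly where the mistakes accumulate.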

I think what I like about these libraries is that I don’t have to feel guilty for using them. There’s no magic smoke in there; if there’s a bug they won’t fix, I can just work around it. In the meantime, I get to dive right into my problem space, with instantiation and serialization as solved problems.
