The Canvas for CAR


Derek Willis


January 13, 2007

Computer-assisted reporting (CAR) and newspapers have always been a slightly imperfect match. In general, newspapers have provided more support for CAR than have radio or television stations, but because newspaper space is at a premium (and getting more expensive), stories involving a significant amount of data usually end up leaving much of it behind. Accompanying graphics help in many ways, but they can only do so much. Even when story length isn’t as much of a concern – at some magazines, for instance – the instinct is not to put a bunch of data where carefully produced words might otherwise go.

Still, newspapers have been the best venue for CAR for many years, mostly because that’s where CAR practitioners have gone to work. CAR can be expensive: instead of a bare-bones PC and notebooks, papers need to spend money on more powerful machines and expensive software, and pay for the time that doing CAR can take. And then the paper prints a summary of the CAR work; most of the rest of it is left behind.

The Web began to change that as soon as it became popular. Instead of relying on a paper’s database analysis alone, readers could actually search the data themselves. Organizations like the Center for Public Integrity (where I used to work) have made data a central part of their offerings; the Center’s telecom project puts its Media Tracker database front and center. Some newspapers have begun collecting their databases in a single location; the Louisville Courier-Journal is one example of this.

On the other hand, too many papers (my own included) take the printed paper’s restrictions and apply them to the Web. For example, the Fort Worth Star-Telegram’s site has what it calls a “Databank” of building permits and other municipal records. It’s a long list, not searchable, not related to anything. Other than the fact that we print this stuff in the paper, there’s no reason for doing this online. It’s as if we, as an industry, simply seek to do the same thing regardless of platform.

It shouldn’t be this way. The Web is the canvas for CAR, better than any other platform we’ve come up with as an industry. It has every advantage that CAR practitioners should have available, including unlimited depth, the ability to customize or personalize and the luxury of designing a database so that it will truly be useful to readers. Some papers get this, or are beginning to realize it. Think of USA Today, whose sports salaries databases not only produce stories for the paper but also help cement its reputation as a premier destination for national sports information. When bloggers and other publishers start using your site as the “standard” for a topic or piece of information, that increases your influence and reach. Go ahead: search Google for “baseball salaries”. What’s the first result?

That’s the sort of reach that we’ve been trying to achieve with the Congressional Votes Database (and indeed, if you search for “congressional votes,” that’s usually the top result). CEO Rich Skrenta writes about “appropriate discoverability” on the Web, saying that while newspapers produce restaurant reviews and movie reviews, almost none of those are easy to find on the Web. That’s because we’re not, as an industry, taking advantage of the opportunity to collect them in a database and index them properly.

But there are so many Web uses for CAR – not just for things like movie reviews, but for investigations and other “hard news” stories. Tracking the activities of government is something that many papers do better than most others (save lobbyists), but most of that expertise is trapped within a few reporters and editors. There might be a dozen stories in a huge government database, but usually papers have time and room for finding and reporting only a few. Opening up access to those datasets for our readers would provide them (and us) with the ability to answer the follow-up questions prompted by an investigative report. It could open new avenues of inquiry and connect us to new ideas.

The Web is the most natural home that CAR probably ever will have, and it’s certainly a better platform than print. While there’s much to be said for the ability to distill loads of information and data into a digestible story, why would we leave it at just that when we could keep readers coming back to our sites or staying there for longer periods of time? On many levels – economically, culturally – journalists have had a tough time adjusting to the Web. Some of those difficulties are natural and must still be worked out, and news Web operations themselves haven’t helped by failing to embrace CAR as they should. Yet I can think of few other areas in which Web sites get more for less than by employing people who know how to use databases. A great deal of Web data runs on open-source applications that cost less and, when properly automated, require less time and effort to maintain than a collection of text that needs to be copied and pasted into a CMS.

The question of how database content should be presented on the Web still has no single answer and remains a work in progress. But the more that we get the CAR people and the Web people together – indeed, the more that the CAR people and the Web people are the same people – the better we’ll get at making data both more accessible and less separate from other Web content. And CAR will finally get the home it deserves.