Outsourcing Database Development, or the Caspio Issue


Derek Willis


September 7, 2007

Updated: Caspio’s David Milliron responds in the comments.

The good news is that there are plenty more databases served up on newspaper Web sites than there used to be. Some papers are organizing entire desks around data. The bad news - and people can disagree on this - is that in some cases, the papers aren’t really doing much in the way of Web database development. They’re outsourcing much of the work to Caspio and its Bridge application.

This can’t be such a bad thing, right? I mean, more databases online is a good thing, and of all people, I should be encouraging any way to get the stuff up. Unfortunately, it’s not that simple. By leaving the work to Caspio and reducing database development to the safety of point and click, news organizations are far more likely to end up with a bunch of cookie-cutter apps that go just far enough to satisfy the “hey, we need some databases” crowd but not nearly far enough to hold the attention of readers and provide a real service.

Give Caspio this much: it spotted an opportunity in an industry that’s trying to do more on the Web with less, and the company has managed to sign up clients including the Indianapolis Star, the Arizona Daily Star, the Palm Beach Post and the Atlanta Journal-Constitution. The pitch is attractive: Web databases in hours, not weeks, and all you have to do is load the data and decide how to display it. Indeed, Caspio boasts “No programming skills required” to use it. (I should say that I have not seen Caspio Bridge in person and have only had it described to me.)

So what’s the harm? I see several. First, Caspio’s product is an abstraction, built atop a database server (SQL Server, in this case) without giving users all of the power of that DBMS. It can’t, because in order to make things simple, it has to limit the ability of users. That translates into, for example, two choices for storing text: a maximum 255-character VARCHAR field and a maximum 4000-character VARCHAR field, even though SQL Server supports more. When you have a text field that always will be, say, 5 characters or less, it makes no sense to use a 255-character field. A small app won’t be affected in most cases, but larger ones could see a performance hit, especially on searches using wildcards. In addition, Caspio says it supports importing from text, MS Access databases and something called Caspio Bridge XML, which means that “only XML files previously generated by Caspio Bridge can be imported. XML files from other sources are not supported.” It also says, with no apparent irony, that “Text files are the least-desired import file formats because they cannot contain field data type information. If possible, import your data in one of the other formats. Caspio Bridge assigns a Text (4000) data type to all fields in tables imported as text files, unless you choose to appended (sic) or replace an existing table.” So it creates excessively large fields when importing text files, can’t deal with most XML and encourages the use of Access, which in practice means that very large datasets are going to be fun to deal with. Oh, and your organization (or Web site) doesn’t run on Windows? You may be out of luck.
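To make the field-size point concrete, here is a minimal sketch of what a developer with direct access to the database might do before loading a text file: scan the data and size each text column to fit what is actually there, rather than accepting a blanket Text (4000). This is an illustration only, not anything Caspio Bridge does or exposes. The function name `suggest_varchar_widths` and the sample rows are made up for the example, and the input is assumed to be delimited text with a header row.

```python
import csv
import io


def suggest_varchar_widths(text, delimiter=","):
    """Return {column name: length of its longest value}.

    Hypothetical helper for illustration; assumes the first row is a header.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    widths = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in widths:
            value = row[name] or ""  # short rows yield None for missing cells
            widths[name] = max(widths[name], len(value))
    return widths


# Made-up sample data, not taken from any real newspaper database.
sample = "zip,city\n44114,Cleveland\n33401,West Palm Beach\n"

widths = suggest_varchar_widths(sample)
for name, width in widths.items():
    # Never declare a zero-width column, even if every value is empty.
    print(f"{name} VARCHAR({max(width, 1)})")
```

In practice you would pad these widths to leave room for future data, but even a rough pass like this yields a five-character ZIP code column instead of a 255- or 4000-character one.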

Second, the data isn’t on your servers. I’m not sure what news organization would put a critical and exclusive data application on servers not under its control. That leaves room for plenty of potentially interesting data apps, like one charting Cleveland Browns’ games since 1946 from The Plain Dealer, but the lack of flexibility and control involved is ultimately going to be frustrating. If the PD wants to expand the Browns app, it has to add data to its database on Caspio’s servers. Can one Caspio app use data from another? It’s not clear from the docs. What about serving data to other types of apps like Flash? Nope. Want to use your Caspio app to build a Google Maps mashup? That’ll be an extra charge.

Third, Caspio makes it sound like programmers and developers only make things worse. It plays on the ignorance of people who don’t know what good programmers can do. An example, from the online help: “A scripted function written by a programmer for you is handcrafted, and may not meet the quality and reliability standards that your application or web site demands.” Um, sure, if you have your neighbor’s 13-year-old write your database apps. Or this: “Hand-coded scripts are difficult to edit and maintain because they contain hundreds of lines of code that may or may not be properly documented. The original programmer may no longer be available, or the logic behind the script may be forgotten.” Apparently Caspio hasn’t heard of version control or documentation or, gasp, even Web frameworks that abstract many of the simple functions that it describes as potentially being convoluted. Caspio’s sales pitch seems to be: “Hey, you don’t know anything about Web database development, and you shouldn’t trust anybody who does.”

Lastly, I believe that anybody who has done any kind of work with data realizes that you learn more about the data by being more involved in its use. Some of the best features on the Congress Votes Database came to us only after we had spent time doing the Web development. So while speed-to-deployment is a very attractive feature, many times it results in a one-and-done approach: just slap it up and move on. This would be less risky if there weren’t smart people outside the media who can and will get their hands on useful data and do a better job with it than we are doing. News organizations can’t afford to rely on an approach that limits their choices and encourages quick but shallow development.

Caspio may have an upside: news organizations may come to realize that getting real value out of a database requires that they invest more time and effort, not less, and that it works better if they have the highest degree of control and flexibility. So maybe Caspio is a stepping-stone, an actual bridge rather than a crutch that people lean on instead of expanding their skills. But right now it sure doesn’t feel that way to me. News executives seeking to be able to tell the bosses that “we have some databases on the site” will find lots to love about a product like Caspio Bridge. The rest of us should take no comfort in that.