Projects

Here is a list of data and software projects that I’m working on or have worked on in the past. Consider these side projects.

OpenElections

OpenElections is a non-profit effort to collect, standardize and publish certified election results from all 50 states and the District of Columbia. This isn’t a live results project; it focuses on elections from 2000 onward. Started with my friend and AP developer Serdar Tumgoren, OpenElections is powered by volunteers from all over. You can help! Code: Python, Tabula, MongoDB, some JavaScript.

Cricket Data

Along with my friend Gaurav Sood, I’ve collected cricket match data for research and journalism analysis, including an article on ESPNCricInfo and a white paper, both covering the impact of winning the pre-match coin toss (full code here). Another product of these efforts is the Python library python-espncricinfo, which wraps the undocumented JSON API provided by the site. I’ve also written a Ruby wrapper, cricketer.

TweetRewrite

A Ruby on Rails app that, given the URL of a news story from The New York Times, The Washington Post or ProPublica, will find tweets containing the URL but not the title of the story.
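
The core filter can be sketched in a few lines. This is only an illustration in Python, not the app's actual Ruby code: the function name `rewritten_tweets` and the sample data are hypothetical, and it assumes tweet text carries the full story URL rather than a shortened link.

```python
def rewritten_tweets(tweets, url, title):
    """Return the tweets that contain the story URL but not its headline."""
    # Compare case-insensitively so a re-capitalized headline still counts.
    needle = title.casefold()
    return [
        text for text in tweets
        if url in text and needle not in text.casefold()
    ]


# Hypothetical sample data for illustration only.
story_url = "https://example.com/story"
story_title = "City Council Approves Budget"
sample = [
    "City Council Approves Budget https://example.com/story",
    "Big news for local taxpayers: https://example.com/story",
    "Unrelated tweet with no link",
]

print(rewritten_tweets(sample, story_url, story_title))
```

Only the second sample tweet passes the filter: it links to the story but describes it in the tweeter's own words.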

Post Haste

I’ve developed a Ruby wrapper for washingtonpost.com articles and blog posts, including their comments. Potentially suitable for building custom feeds of Washington Post content, in the event that you don’t want to actually visit washingtonpost.com. Like ESPNCricInfo, the Post site has an undocumented API.

Web Apps

  • Extractor - A proof-of-concept Python app for extracting text from URLs.
  • LinkChecker - A web app to find links to Wikipedia in URLs.
  • Paper of Record - A tiny JS app for tracking mentions of The New York Times in the Congressional Record.
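
As a rough idea of what something like LinkChecker involves, here is a minimal Python sketch that pulls Wikipedia links out of a page's HTML. It is an assumption about the general approach, not the app's own code; `WikipediaLinkFinder` and `find_wikipedia_links` are hypothetical names, and fetching the page is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class WikipediaLinkFinder(HTMLParser):
    """Collect href values of <a> tags that point at a wikipedia.org host."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Relative links have no hostname, so they are skipped.
        host = urlparse(href).hostname or ""
        if host == "wikipedia.org" or host.endswith(".wikipedia.org"):
            self.links.append(href)


def find_wikipedia_links(html):
    """Return every Wikipedia link found in the given HTML string."""
    finder = WikipediaLinkFinder()
    finder.feed(html)
    return finder.links
```

Checking the parsed hostname rather than searching the raw URL string avoids false matches on links that merely mention "wikipedia" in a path or query string.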

API Wrappers

  • Binya - A Ruby wrapper for the Federal Reserve.
  • USA Today Census - A Ruby wrapper for the USA Today Census API.

Utilities

  • NCAA API - A Python application to turn the NCAA’s Web-based statistics into an API.
  • exempt_orgs - A basic utility to convert IRS exempt org files to CSVs.