Talkin’ Transit: Data, Data, Data

Photo: ‘Horton #9’ courtesy of ‘Chris Rief aka Spodie Odie’

Metro is hoping that releasing its real-time data to developers will help get information about trains and buses to riders faster and in the ways they want. On July 8, Metro announced that a public application programming interface (API) aimed at the developer community would be made available in August.

The agency told us it is looking to the community of developers to help solve some of the issues they currently face, including making live data available to the disabled, and helping to make regional transit information easier to access. Metro spokesman Ron Holzer says they would also be “delighted to be surprised with applications that are totally unexpected.”

Metro is also looking to “foster a better culture of transparency, customer service and performance accountability,” with the release of this data. So how will this work? What would this data look like for developers, and what should riders expect to see in the not too distant future?

Photo courtesy of ‘(afm)’

For many developers, the first question is what format the data will be in, followed quickly by how easy it will be to access and what data will be available. The agency tells us that the API will be available as a web service, which it describes as supporting REST and JSON. Developers will need an API key, and registration and data flow will be handled by Mashery (which handles similar services for a host of big-name companies).

Metro hasn’t revealed exactly what data will be available, but expect real-time train arrival predictions, service issues, and elevator and escalator outages to be available by line and station. The API will allow for granular data retrieval, so you shouldn’t need to grab more information than your application needs. Metro expects to have more information on what data will be available closer to the August 11th launch date.

The good news is that Metro is not planning on charging for this service. They say that all developers will be welcome to use the data, including in for-profit applications. The terms of use haven’t been finalized, but I wouldn’t expect them to be substantially different than the current developer agreement, which you need to accept in order to use the data currently released in GTFS.

In June, Transport for London released a similar API for London’s Tube. Unfortunately, they had to turn it off a few weeks later because of its popularity — 10 million requests in the first week. Metro says they hope to avoid a similar problem by using cloud computing and Mashery’s traffic control features. They are planning on having reasonable restrictions (“no more than two calls per second and five thousand calls per day”), but anticipate having to make adjustments. “Our goal is to encourage the development of applications that our riders will be excited about,” says Ron. “We do not want to introduce any limitations that would work counter to that goal.”
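For developers wondering how to stay inside limits like those, here is a minimal client-side sketch in Python. It assumes the quoted numbers (two calls per second, five thousand per day) and treats both as sliding windows, which is a guess; Metro hasn't said how the limits will be counted or enforced. The class and method names are made up for illustration and are not part of any Metro or Mashery API.

```python
from collections import deque
import time


class CallBudget:
    """Hypothetical client-side guard for a per-second and per-day call quota."""

    def __init__(self, per_second=2, per_day=5000, clock=time.monotonic):
        self.per_second = per_second
        self.per_day = per_day
        self.clock = clock      # injectable so the logic can be tested without waiting
        self.calls = deque()    # timestamps of calls allowed in the last 24 hours

    def try_acquire(self):
        """Return True and record the call if both quotas allow it, else False."""
        now = self.clock()
        # Forget calls that are more than a day old (assumed sliding window).
        while self.calls and now - self.calls[0] >= 86400:
            self.calls.popleft()
        if len(self.calls) >= self.per_day:
            return False
        # Timestamps are in order, so if the Nth most recent call happened
        # less than a second ago, N calls already fall inside the last second.
        if (len(self.calls) >= self.per_second
                and now - self.calls[-self.per_second] < 1.0):
            return False
        self.calls.append(now)
        return True
```

An application would call `try_acquire()` before each request and back off (or serve cached data) whenever it returns `False`.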

Metro tells us that they’ve put in work over the last three years to make it possible to deliver real-time data from different sources across the organization. They believe the next logical step was to get that data to customers by opening it up to developers. And while much of the train data has been moved to this new system, they are still working on enhancing the bus data for this service. They expect to have bus information available in the fall.

Photo: ‘good morning.’ courtesy of ‘volcanojw’

So what can riders expect? Depending on the data available, we could see applications that show you the nearest station to go to in order to catch a particular train, applications that warn you when there are problems on the line you normally ride, and cool applications that show where all the trains are (like the one created by Matthew Somerville before the TfL data went dark).

Metro adds that they expect to hear feedback from the community about data that people want. They cannot commit to making all data available, but will listen to most reasonable requests. I’m hoping that one of the data sets available is live entry and exit information from each station. Knowing train schedules, average load per car, and the number of users that have entered a station over a certain time, one could create a heat map of the system showing the most congested stations in real time. Riders could then figure out how crowded their commute is going to be and maybe wait at the bar for a little while.
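To make that heat map idea a little more concrete, here is a rough Python sketch of the arithmetic behind it. Everything in it is an assumption: Metro hasn't said it will publish entry counts or car loads, the field names are invented, and the 120-rider car capacity is a placeholder rather than a real figure. The score is simply riders entering a station divided by the spare room on the trains due there in the same window.

```python
from dataclasses import dataclass


@dataclass
class StationWindow:
    """Hypothetical per-station snapshot for one time window."""
    name: str
    entries: int             # riders who entered during the window
    trains_due: int          # trains scheduled to stop during the window
    cars_per_train: int
    avg_load_per_car: float  # riders already aboard each arriving car


def crowding_ratio(s, car_capacity=120):
    """Entries divided by spare seats/standing room; higher means more crowded."""
    spare_per_car = max(car_capacity - s.avg_load_per_car, 0)
    spare = s.trains_due * s.cars_per_train * spare_per_car
    if spare == 0:
        # No room at all: infinitely crowded if anyone is waiting.
        return float("inf") if s.entries else 0.0
    return s.entries / spare


def rank_stations(windows, car_capacity=120):
    """Most congested stations first; these scores would color the map."""
    return sorted(windows, key=lambda s: crowding_ratio(s, car_capacity), reverse=True)
```

A ratio near or above 1.0 would mean more people walked in than the arriving trains have room for, which is probably your cue to order another round.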

Will all these bits and bytes help get Metro’s information into Google Transit? The answer is: not directly. Metro believes that the API release will provide developers with the same information that Google would have, but they assure us that they have not abandoned a deal with Google.

Born in Lebanon, Samer moved to DC to go to college. A lot of good that did him. Twenty-two years later, he still lives in the area. When he’s not writing for a blog or tweeting incessantly, he wanders the streets (and the globe) photographing whatever gets in his way.

4 thoughts on “Talkin’ Transit: Data, Data, Data”

  2. Unfortunately it seems that “have not abandoned a deal with Google” means “continuing to try to get money out of Google for something that would be good for our consumers.” If there’s another explanation that holds water I’d be delighted to hear it but so far nothing of the sort has been offered.

    The upside for Google, if they ever decide to take WMATA’s offer of data for pay, is that WMATA seems to have a history of sticking up for its paid partners. Several developers who have tried to make the NextBus service more usable for consumers via their own applications have reported issues ranging from legal threats to simple obfuscation and application-review rigging. Being in bed with folks like that doesn’t seem to be bothering WMATA.

    This Mashery partnership seems like a divergence from this history of lying down with dogs, so I’m hopeful. But I’ll celebrate when I see it in action.

  3. I don’t own a mobile phone, so getting info on the go would be a challenge, but I support ANY effort by Metro to increase transparency. And this is definitely something I’d use at work before I head home, to check out how the system’s moving (or not).

    Tapping into these types of technology is absolutely essential for the continued health of the system and could even lend a certain “cool factor” to public transit in DC. And I agree with Dan’s comments on Mashery, which has the potential to be great.

    Now if we could just get Metro to stop using so many dang PDFs on their website…

  4. @EmilyHaHa Real-time train data is available now on our website by clicking “Maps” under “Rail” and then clicking on the station of your choice. A bubble will pop up with the station name, address and next three trains. If you click the station name in the bubble, there is another link about halfway down the station page that says “Real Time Data,” which will show all trains just as if you were reading the displays on the platforms of that station.