The State of Open Data 2011

David Eaves penned an excellent post of the same name this morning (http://eaves.ca/2011/10/21/the-state-of-open-data-2011/). I offer the following as another, complementary perspective on the state of the emerging Open Data field.

2011 has been a huge success for open data... there have been lots of awesome hackathons (events where everyday people and programmers alike get together to use and extend government data and applications) hosted in BC. In fact, there's one tomorrow at the Bard and Banker pub in Victoria.

Open Data truly has come a long way this year, but its path has also been damaged by certain events which I'll get to shortly.

So what's gone right? Well, first, governments in BC (both local and provincial) have embraced open data in a big way. They know that citizen-centric, bi-directional government is the future. This is awesome, and the Premier and Mayors from around BC must be applauded for their vision. This is space-program-type thinking, and it shows a commitment not only to the needs of today but to building a truly participatory democracy in the future. It's the stuff people /should/ vote for and that I hope becomes the stuff of elections. Amazingly, it's already happening in Victoria, BC with the launch of OpenVictoria.ca.

But it's not all roses and smart thinking -- while some governments have dived in head first and fully embraced the philosophy of open, others, like the province itself, are taking baby steps towards openness and making some critical mistakes along the way.

So what is the "Open" philosophy? How is it different from the FOI of the past, and what are governments not getting?

"Open", contrary to popular belief, isn't the act of sharing information. It's a philosophy, and it's closer to a religion than it is to a concrete set of rules. Open can be traced back to the early days of the Free Software Movement (F/LOSS) -- years ago, guys like Richard Stallman ran into the same problems with releasing information to the world. They said: I want to share, to cooperate, but the law doesn't understand me. So they set out to develop a legal framework for innovation. Stallman and others introduced the GPL and created the concept of Copyleft, while others introduced the BSD license, the MIT license and later licenses like the MPL and the Apache License, which all have different degrees of freedom and ecosystem obligations.

So what is 'Open'? To Stallman it was about access to source code, but to Open Data today it is about primary sources... what I'll call the source code of 'data'.

The Open Source Definition (from the OSI) includes a list of specific freedoms, and today those principles have been adapted and defined for Open Data through the Open Definition (http://opendefinition.org/okd/).

So what is the state of "Open" in government open data? Well, some municipalities, like Surrey and Langley, have embraced full openness and have released a significant amount of the data they hold to the public domain. No complex or revocable license terms, just public domain. On the other hand, the province has put forward the Open Government License -- which, while a solid first step, has made some critical mistakes in meeting the Open Definition.

First, "Open" is about creating market effects, competition and unintended consequences. If you can predict the outcome, it's not open -- and this is where the province has made its first failure.

Failure 1: The province has not embraced license diversity and is declaring that a single license must be used for provincial open data. This restriction isn't found in the OGL license itself, but in the accompanying policy 'stack' which you can find at http://blog.data.gov.bc.ca/2011/10/the-databc-policy-stack/ .

The BC government has failed for decades by trying to define common standards and eschewing the open principle of ecosystem diversity. I still remember the days when BC government standard web development was on Microsoft FrontPage and FrontPage Extensions while the market was innovating with PHP, Python, Ruby and ColdFusion. The government found itself wasting ungodly amounts of money on portal projects that were doomed to failure-by-standard before they even began. There were a few successes, sure, but the ones that won awards were the unauthorized skunkworks projects completed by ignoring the standards.

So what does diversity mean to the BC government? I think nearly everyone outside government recognises diversity as a critical element for long-term success. Whether that's racial diversity, bio-diversity or conceptual diversity -- rational thinkers know that a marketplace of diverse ideas is better than commandments from on high. For the PHB types, it's the difference between a Market Economy and a Planned Economy. In Open Data, this means we need publisher diversity (not portals), we need license diversity (not standardized policy stacks) and we need data diversity (open, but not standardized, formats).

It is what I often describe as the Sauron's Ring fallacy. Governments are searching for that one ring that will rule them all: the one perfect piece of policy that can handle all circumstances. I call it policy alchemy, and to me it is just as ridiculous as trying to turn lead into gold.

The BC government is searching for that ring. They have a monoculture license (which has issues that have been raised by the community), they've put extensive resources into a new portal, and they are moving towards standardized formats for data publication. If achieved, it would be revolutionary: we would have a perfect license, a portal overflowing with data, and everything would be in perfectly useful formats. But it's a pipe dream. It won't happen. It will fail. Sorry, the ring you are searching for does not exist.

In my opinion, this is the wrong path and it won't lead to "Open". Instead, the BC government and other jurisdictions attempting Open Data must truly embrace "Open" and stop trying to control everything. Be a participant, not a provider. Create policy frameworks that handle the minimum legal requirements, like Intellectual Property, Privacy and Policy clearance, and then let Ministries apply those rules in whatever diverse way makes sense for their business case. Data like Hansard can never be Open in the same way that a Map can be -- and while some will try to categorize data here or there as exceptions to a rule, it is the exceptions that are key and the rule that is exceptional.

Failure 2: Data Publishing Architecture.

Civil servants around the province are passionate about publishing data and they know what free can achieve. So it's time to publish, but for one reason or another data is not materializing at the needed rate. The BC data portal is adding about 1 dataset per day, and that's pretty amazing for a single team trying to seek out and claw data from ministries. By the end of any year they'll have put out ~365 new datasets -- which is -awesome-. However, how many new datasets were created by government this year? This architecture, despite how well everyone does their job, simply doesn't scale. The government could hire 100 people to clear data, and they might get up to 36,500 sets released a year -- but I'm told there are potentially hundreds of thousands of datasets held by the BC government. So how do we get them all out there? Only via Open frameworks and policies that encourage direct participation between Ministry staff and citizens, and that encourage bi-directional communication of data at the civil servant layer. Think it can't be done? It can. The BC Social Media Guidelines are an example of this hands-off policy development going exactly right -- and they were received with great acclaim for their innovation. I'm sure people said that there would be all kinds of trouble if social media comments didn't go through official communications lines... but on the contrary, the sky never fell.
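The scaling arithmetic above can be sketched in a few lines of Python. This is only a back-of-envelope illustration: the backlog of 200,000 is a hypothetical stand-in for "hundreds of thousands", the function names are mine, and it assumes (as the paragraph does) about one cleared dataset per clearer per day while ignoring any datasets created in the meantime.

```python
DAYS_PER_YEAR = 365

# Hypothetical stand-in for "hundreds of thousands" of datasets; not a real figure.
HYPOTHETICAL_BACKLOG = 200_000


def datasets_per_year(clearers: int, per_clearer_per_day: float = 1.0) -> int:
    """Datasets released per year under a central clearing model."""
    return round(clearers * per_clearer_per_day * DAYS_PER_YEAR)


def years_to_clear(backlog: int, clearers: int, per_clearer_per_day: float = 1.0) -> float:
    """Years needed to publish a fixed backlog, assuming no new datasets appear."""
    return backlog / datasets_per_year(clearers, per_clearer_per_day)


if __name__ == "__main__":
    # 1 = today's single team, 100 = the hypothetical hiring spree
    for clearers in (1, 100):
        rate = datasets_per_year(clearers)
        years = years_to_clear(HYPOTHETICAL_BACKLOG, clearers)
        print(f"{clearers:>3} clearer(s): {rate:>6} datasets/year, "
              f"{years:.1f} years to clear {HYPOTHETICAL_BACKLOG:,}")
```

Under these assumptions even the hundred-person version takes years to work through the backlog, and a real backlog would keep growing while it did.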

Failure 3: Ignoring failures and not repeating success.

So if you believe, as I do, that one-size-fits-all government standards have failed for years and that the new open social media guidelines are a huge success, then why are we not learning from our failures and successes -- from the skunkworks projects that won awards, or the guidelines that made national news for their progressive thinking? Why are we repeating the architecture of our failures and not that of our successes?

Failure 4: Getting over Cost recovery.

This failure is shared by pretty much every open-data publisher we've seen so far. If the data is being charged for, no one is stepping up to release it. At the pinnacle of insult, we can't even get the postal code database made open. From BC Laws to DFO's Hydrographic Charts, data is extremely valuable, but charging for it is extremely costly to the government.

Rather than recovering costs, the government is losing more money by not releasing data than it makes by charging for it. Take the DFO Hydrographic Charts (something known in GIS terms as S-57 ENCs). These files are massively expensive to create -- you literally need ships in the oceans and research teams, cartographers and GIS techs to make them. No doubt about it. But if we look to our US neighbours, they release all the data to the public domain. Totally free -- and the effect has been a marketplace of companies developing mapping and GIS applications for S-57 in the US. I had similar ambitions in Canada, and actually sought this data out. The agreement presented required notaries, royalty payments and access fees beyond anything that could be economically viable for a small organization such as mine. I killed the product line after investing significant time into prototype development. Had the data been open, I may have been successful in creating a Canadian competitor to the US and other international firms; I would have made some money and the government would have got more tax dollars. Instead, they got $0 from me and Canada simply isn't an innovation leader in this area.

Beyond that, some agencies have shown what no-cost data can achieve. The BC Map Place (mapplace.ca) helps mining companies prospect, stake claims and do the research that is required for new mine development. These are typically huge companies with billions of dollars... one might think, who better to charge for access to data? Wrong. Mapplace.ca has been a wild success and in the process has certainly created more mining opportunities than would otherwise have been found. Small prospecting companies thrive in BC and new mines pay a lot of taxes. If even one mine opened because of Mapplace, then it brought in more than it could ever have charged to users. Progressive thinking.

So let's stop cutting off our nose to spite our face. Drop the cost for data and reap the economic rewards that come from broadly accessible data.

The State of Open Data in 2011

Despite the failures noted above, we've also had major successes. The community around open data grows stronger every day. Hackathons see participants from diverse backgrounds and have created some amazing apps with government data. The VanShelter app prototype is truly amazing and is a window into what can be achieved with open data. WhereDidItGo.ca, Waterly.ca, and the BC Property Tax Comparatron all stand out as amazing uses of Open Data that have emerged from OpenDataBC hackathons.

We've seen the data catalogs grow and agencies move towards greater transparency. The BC government has released several high value datasets for proactive financial disclosure and those sets now underpin proactivedisclosure.ca. This type of data release will result in unparalleled transparency and accountability and must be commended.

The province has also begun proactively publishing FOI information. While it could be better (the data is accessible but not open licensed), it has been an amazing turn towards transparency of data and the democratization of media. However, this transition isn't without controversy, and the change calls into question the legitimacy of charging journalists high fees for FOI access when the results will be widely published. Charging one requesting user for government transparency seems grossly unfair to many folks and may actually result in fewer FOI requests being filed. The point? Traditional cost models will have to adapt to the new proactive transparency culture.

In the end, the marketplace is developing largely as expected, with successes and setbacks. Hopefully 2012 will see Open Data truly come to BC and the province correct its course towards a more diverse, market-economy model for open data.

Submitted for $0.02 as always.