An introduction to open data

  • AN INTRODUCTION TO OPEN DATA Sally Jenkinson - Fronteers - Amsterdam - 09.10.2015 @sjenkinson
  • Sally Jenkinson, digital solutions architect & consultant, Records Sound the Same Ltd
  • DATA
  • OPEN DATA
  • “Big data”
  • 90% of the world’s total data has been created within the last 2 years (IBM, 2014)
  • I ♡ DATA
  • sallyjenkinson.co.uk/labs/teatracker
  • BUT…
  • “You agree to maintain your apps and your systems in accordance with industry standard quality levels…”
  • DATA SHARING
  • WHAT IS OPEN DATA?
  • Open data and content can be freely used, modified, and shared by anyone for any purpose. opendefinition.org
  • Re-publish; derive new content or data; make money by selling products; charge a fee for access
  • Make money by selling products; charge a fee for access
  • “We observed that often people think of open data as a specific ‘kind’ of data – something separate and distinct from the data they use day-to-day in their organisation or team – rather than a choice about how people publish data.” theodi.org/blog/closed-shared-open-data-whats-in-a-name
  • OPEN LICENCES FOR CREATIVE CONTENT: Public domain (CC0); Attribution (CC-BY); Attribution & share-alike (CC-BY-SA) theodi.org/guides/publishers-guide-open-data-licensing | theodi.org/guides/reusers-guide-open-data-licensing
  • OPEN LICENCES FOR DATABASES: Public domain (PDDL); Attribution (ODC-By); Attribution & share-alike (ODbL) theodi.org/guides/publishers-guide-open-data-licensing | theodi.org/guides/reusers-guide-open-data-licensing
  • OTHER OPEN LICENCES: Open Government Licence; OS Open Licence; etc. theodi.org/guides/publishers-guide-open-data-licensing | theodi.org/guides/reusers-guide-open-data-licensing
  • WHERE CAN I GET IT FROM?
  • wiki.dbpedia.org
  • musicbrainz.org
  • earthquake.usgs.gov/earthquakes/search/
  • plaidplug.com
  • data.id/dataset/daftar-titik-reklame-di-dki-jakarta/resource/361ce01f-34ed-4e00-a204-6062c7b9ad64
  • web.archive.org/web/20150520175645/http://137.189.35.203/WebUI/CatDatabase/catData.html
  • vision.stanford.edu/aditya86/ImageNetDogs/
  • {"gilded":0,"author_flair_text":"Male","author_flair_css_class":"male","retrieved_on":1425124228,"ups":3,"subreddit_id":"t5_2s30g","edited":false,"controversiality":0,"parent_id":"t1_cnapn0k","subreddit":"AskMen","body":"I can't agree with passing the blame, but I'm glad to hear it's at least helping you with the anxiety. I went the other direction and started taking responsibility for everything. I had to realize that people make mistakes including myself and it's gonna be alright. I don't have to be shackled to my mistakes and I don't have to be afraid of making them. ","created_utc":"1420070668","downs":0,"score":3,"author":"TheDukeofEtown","archived":false,"distinguished":null,"id":"cnasd6x","score_hidden":false,"name":"t1_cnasd6x","link_id":"t3_2qyhmp"} x ~1.7 billion reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/
  • ♥ github.com/caesar0301/awesome-public-datasets
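The Reddit comment record shown earlier can be read with a few lines of standard-library Python. This is a minimal sketch, assuming the dump stores one JSON object per line (the slide only shows a single record, so the file layout is an assumption); the sample line is trimmed to a handful of the fields from the slide, and the function names are hypothetical.

```python
import json
from datetime import datetime, timezone

# One record in the shape of the comment on the slide, trimmed to a few fields.
SAMPLE_LINE = (
    '{"author": "TheDukeofEtown", "subreddit": "AskMen", "score": 3, '
    '"created_utc": "1420070668", "id": "cnasd6x", "link_id": "t3_2qyhmp"}'
)


def parse_comment(line: str) -> dict:
    """Parse one JSON line and convert the timestamp to an aware datetime."""
    record = json.loads(line)
    # In the sample, created_utc is a string of epoch seconds, not a number.
    created = datetime.fromtimestamp(int(record["created_utc"]), tz=timezone.utc)
    return {
        "id": record["id"],
        "author": record["author"],
        "subreddit": record["subreddit"],
        "score": record["score"],
        "created": created,
    }


def count_by_subreddit(lines) -> dict:
    """Count comments per subreddit, reading one line at a time."""
    counts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        subreddit = parse_comment(line)["subreddit"]
        counts[subreddit] = counts.get(subreddit, 0) + 1
    return counts


if __name__ == "__main__":
    print(parse_comment(SAMPLE_LINE))
    print(count_by_subreddit([SAMPLE_LINE, SAMPLE_LINE]))
```

Because `count_by_subreddit` accepts any iterable of lines, an open file handle can be passed directly, so a very large dump would not need to fit in memory.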
  • CONSUMING OPEN DATA
  • d3js.org
  • MORE THAN WEBSITES
  • iquantny.tumblr.com/post/92116352544/mapping-nyc-hydrant-revenue-upper-easts-19th
  • Generating value & making savings
  • Open data: +$3 trillion / year mckinsey.com/insights/business_technology/open_data_unlocking_innovation_and_performance_with_liquid_information
  • Transparency
  • “…within two years chemical emissions nationwide (at least as reported, and presumably also in fact) had decreased by 40 percent. Some companies were launching policies to bring their emissions down by 90 percent, just because of the release of previously sequestered information.” maban.co.uk/80
  • DATA & USER EXPERIENCES
  • “How far do you live from your workplace? Chances are, you'd answer that question in minutes rather than miles. An hour on the bus tells us a lot more than 47 miles. That's why we made Mapumental. Given any start point or destination, it'll show everywhere within the chosen commute time, by public transport.” mapumental.com/services/travel-time
  • “How accessible is your nearest school, post office, or GP’s surgery? In Wales, that’s not always a simple question: the country’s mountainous landscapes, rural populations, and sometimes infrequent bus services can mean that those without cars are rather cut off from public service provision.” mapumental.com/services/accessibility
  • “Just how quickly could fire engines reach a given postcode in case of a fire? It’s a question that’s pivotal to decisions made by both the emergency services and the insurance industry.” mysociety.org/2013/04/22/fire-fire-mapumental-and-fire-engine-journey-times
  • Improved efficiency; improved effectiveness; impact measurement
  • Improved or new private products or services & innovation
  • NOT JUST DIGITAL
  • opensensors.io
  • DOUG MCCUNE dougmccune.com
  • STEFANIE POSAVEC stefanieposavec.co.uk
  • “Air Transformed is a series of wearable data objects that communicate this physical burden in different ways. Though seemingly decorative, they are based entirely on open air quality data from Sheffield, UK, a former steelmaking city and notorious for its bad air.” stefanieposavec.co.uk/data/#/airtransformed
  • Participation & self-empowerment
  • LINKED DATA
  • New knowledge from combined data sources and patterns in large data volumes
  • Misrepresentation
  • tylervigen.com/spurious-correlations
  • Combining data sets & licences clipol.org/tools/compatibility
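The misrepresentation risk when combining data sets can be made concrete with a correlation coefficient. The sketch below computes Pearson's r by hand for two series that have nothing to do with each other; the numbers are made up purely for illustration and do not come from the talk or from any real data set.

```python
from math import sqrt


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical yearly figures from two unrelated sources (illustrative values only).
series_a = [10, 12, 15, 19, 24]
series_b = [100, 130, 150, 200, 240]

if __name__ == "__main__":
    r = pearson(series_a, series_b)
    # Both series simply grow over time, so r comes out close to 1
    # even though neither one explains the other.
    print(f"r = {r:.3f}")
```

A high r here only says that both series trend the same way; any two sources that rise over the same period would score similarly.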
  • PUBLISHING OPEN DATA
  • “There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don't know we don't know.” en.wikipedia.org/wiki/There_are_known_knowns
  • STEP ONE Identification & planning
  • Clear licensing & usage information; structure & quality; a plan for support
  • Accuracy
  • STEP TWO Extracting & cleaning
  • Data privacy & the individual
  • openrefine.org
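Step two can also be scripted for simple cases. The following is a minimal sketch, not a replacement for a tool like OpenRefine: it normalises header names, trims stray whitespace, and drops columns that identify individuals before anything is published. The column names and the `PERSONAL_FIELDS` set are hypothetical and would need to be chosen per data set.

```python
import csv
import io

# Hypothetical list of columns that identify an individual.
PERSONAL_FIELDS = {"name", "email"}


def clean_rows(raw_csv: str, personal_fields=PERSONAL_FIELDS) -> list:
    """Return cleaned rows with personal columns removed."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    cleaned = []
    for row in reader:
        out = {
            key.strip().lower(): (value or "").strip()
            for key, value in row.items()
            # key is None when a row has more cells than the header
            if key is not None and key.strip().lower() not in personal_fields
        }
        if any(out.values()):  # skip rows with no remaining content
            cleaned.append(out)
    return cleaned


if __name__ == "__main__":
    raw = "Name, Email ,City,Cups\nAda,ada@example.org, Amsterdam ,3\n"
    print(clean_rows(raw))
```

Dropping obvious columns is only a first pass: combinations of the remaining fields can still identify a person, so the output needs a privacy review before release.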
  • STEP THREE Sharing
  • FIVE STAR DATA 5stardata.info
  • ★ Make your data available on the web (in whatever format) under an open license.
    ★★ Make it available as structured data (e.g., Excel instead of image scan of a table).
    ★★★ Use non-proprietary formats (e.g., CSV instead of Excel).
    ★★★★ Use URIs to denote things, so that people can point at your data.
    ★★★★★ Link your data to other data to provide context.
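The jump from three stars to four and five can be sketched in code: start from a CSV file, give each row its own URI, and link it to a matching resource elsewhere. This is an illustrative sketch only. The `data.example.org` base URI, the `stations` column, and its values are hypothetical, and the output is a JSON-LD-style structure without a full `@context`, not a complete linked-data publication.

```python
import csv
import io
import json

BASE_URI = "https://data.example.org/city/"  # hypothetical publisher namespace
DBPEDIA = "http://dbpedia.org/resource/"     # DBpedia resource URI prefix


def to_linked_records(csv_text: str) -> list:
    """Turn CSV rows into records that carry their own URI and an outbound link."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["city"].strip()
        slug = name.replace(" ", "_")
        records.append({
            "@id": BASE_URI + slug,       # four stars: a URI that denotes the thing
            "name": name,
            "stations": int(row["stations"]),
            "sameAs": DBPEDIA + slug,     # five stars: a link to other data
        })
    return records


if __name__ == "__main__":
    # Made-up station counts, for illustration only.
    sample = "city,stations\nAmsterdam,4\nSheffield,2\n"
    print(json.dumps(to_linked_records(sample), indent=2))
```

Building the DBpedia link from the city name is a shortcut that works for simple names; a real data set would need each link checked against the target resource.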
  • OPEN DATA CERTIFICATES certificates.theodi.org
  • IN CONCLUSION…
  • 1. Choose open data
    2. Publish your data
    3. Link it
    4. Use standards
    5. Promote freedom
    6. Do some good
    7. Be creative
  • THANK YOU. @sjenkinson | recordssoundthesame.com
    Thank you to these lovely people for making their content open:
    Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak - lod-cloud.net
    The Data Spectrum - theodi.org/data-spectrum
    Doug McCune - dougmccune.com
    Stefanie Posavec - stefanieposavec.co.uk
    Data abstract painting - flickr.com/photos/rachubarama/2709346242
    IE Market Share vs Murder Rate - imgur.com/47D7zGq
    Troy Marusek - flickr.com/photos/troymars/9113025616
    The Roof of Wales - flickr.com/photos/stray_croc/4743302841
    Fire Wall - flickr.com/photos/epleitez/1714341218
    Money - flickr.com/photos/mikephotoart/12839909303
    cc - flickr.com/photos/kalexanderson/7175627336
    RDF - flickr.com/photos/gertcha/8292978031
    Small Parts - flickr.com/photos/oskay/2156889157/
    Hydrant - flickr.com/photos/pamhule/4677109732/
    Upsala Glacier Retreat - flickr.com/photos/nasamarshall/10726540434/