data science @NYT; inaugural Data Science Initiative Lecture

Data & Analytics

chris-wiggins
  • data science @ The New York Times [email protected] [email protected] @chrishwiggins references: bit.ly/brown-refs
  • data science @ The New York Times
  • “data science” jobs, jobs, jobs
  • data science: mindset & toolset (Drew Conway, 2010)
  • modern history: 2009
  • “data science” ancient history: 2001
  • data science context
  • home schooled
  • B.A. & M.Sc. from Brown
  • PhD in topology
  • “By the end of late 1945, I was a statistician rather than a topologist”
  • invented: “bit”
  • invented: “software”
  • invented: “FFT”
  • “the progenitor of data science.” - @mshron
  • “The Future of Data Analysis,” 1962 John W. Tukey
  • introduces: “Exploratory data analysis”
  • Tukey 1965, via John Chambers
  • TUKEY BEGAT S WHICH BEGAT R
  • Tukey 1972
  • Tukey 1975: In 1975, while at Princeton, Tufte was asked to teach a statistics course to a group of journalists who were visiting the school to study economics. He developed a set of readings and lectures on statistical graphics, which he further developed in joint seminars he subsequently taught with renowned statistician John Tukey (a pioneer in the field of information design). These course materials became the foundation for his first book on information design, The Visual Display of Quantitative Information.
  • TUKEY BEGAT VDQI
  • Tukey 1977
  • TUKEY BEGAT EDA
  • fast forward -> 2001
  • “The primary agents for change should be university departments themselves.”
  • data science @ The New York Times: histories. 1. slow burn @Bell: as heretical statistics (see also Breiman); 2. caught fire 2009-now: as job description. historical rant: bit.ly/data-rant
  • biology: 1892 vs. 1995
  • biology: 1892 vs. 1995 biology changed for good.
  • biology: 1892 vs. 1995 new toolset, new mindset
  • genetics: 1837 vs. 2012 ML toolset; data science mindset
  • genetics: 1837 vs. 2012
  • genetics: 1837 vs. 2012 ML toolset; data science mindset arxiv.org/abs/1105.5821 ; github.com/rajanil/mkboost
  • data science: mindset & toolset
  • 1851
  • news: 20th century church state
  • church
  • news: 20th century church state
  • news: 21st century church state engineering
  • newspapering: 1851 vs. 1996
  • example: millions of views per hour, 2015
  • “...social activities generate large quantities of potentially valuable data...The data were not generated for the purpose of learning; however, the potential for learning is great” - J. Chambers, Bell Labs, 1993
  • data science: the web
  • data science: the web is your “online presence”
  • data science: the web is a microscope
  • data science: the web is an experimental tool
  • newspapering: 1851 vs. 1996 vs. 2008
  • “a startup is a temporary organization in search of a repeatable and scalable business model” —Steve Blank
  • every publisher is now a startup
  • news: 21st century church state engineering
  • learnings
  • learnings: predictive modeling, descriptive modeling, prescriptive modeling
  • (actually ML, shhhh…): (supervised learning), (unsupervised learning), (reinforcement learning)
  • learnings: predictive modeling, descriptive modeling, prescriptive modeling (cf. modelingsocialdata.org)
  • predictive modeling, e.g., cf. modelingsocialdata.org
  • predictive modeling, e.g., “the funnel” cf. modelingsocialdata.org
  • interpretable predictive modeling: super cool stuff (cf. modelingsocialdata.org)
  • interpretable predictive modeling: super cool stuff (cf. modelingsocialdata.org; arxiv.org/abs/q-bio/0701021)
  • optimization & learning, e.g., “How The New York Times Works,” Popular Mechanics, 2015
  • optimization & prediction, e.g., “How The New York Times Works,” Popular Mechanics, 2015 (some models) (some moneys)
  • recommendation as predictive modeling
  • recommendation as predictive modeling bit.ly/AlexCTM
  • descriptive modeling, e.g., cf. daeilkim.com; import bnpy
  • modeling your audience bit.ly/Hughes-Kim-Sudderth-AISTATS15
  • modeling your audience (optimization, ultimately)
  • modeling your audience: also allows insight + targeting as inference
  • prescriptive modeling
  • prescriptive modeling cf. modelingsocialdata.org
  • prescriptive modeling aka “A/B testing”; RCT cf. modelingsocialdata.org
  • prescriptive modeling, e.g.,
  • Reporting, Learning, Test, Optimizing, Explore (descriptive / predictive / prescriptive)
  • common requirements in data science:
  • common requirements in data science: 1. people; 2. ideas; 3. things (cf. John Boyd, USAF)
  • data science: ideas
  • data skills: data science and… data engineering, data embeds, data product, data multiliteracies (cf. “Data Scientists at Work”, ch. 1)
  • data science: ideas - new mindset > new toolset
  • data science: people
  • thanks to the data science team!
  • data science @ The New York Times [email protected] [email protected] @chrishwiggins references: bit.ly/brown-refs
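
The slides name prescriptive modeling as “A/B testing” (a randomized controlled trial) without showing what the comparison looks like. A minimal sketch, assuming two arms with a binary outcome such as a click or a subscription: a pooled two-proportion z-test using only the Python standard library. The function name, argument names, and counts below are hypothetical and do not come from the talk; this is one common way to analyze such a test, not necessarily the approach used at The New York Times.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of arm A (control) and arm B (treatment).

    Returns (lift, z, p_value), where lift = rate_b - rate_a and p_value is
    the two-sided p-value under the null hypothesis of equal rates.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms share one rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; no variation in outcomes")
    z = (rate_b - rate_a) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value


# Hypothetical counts, invented for illustration only.
lift, z, p_value = two_proportion_z_test(conv_a=200, n_a=10_000,
                                         conv_b=250, n_b=10_000)
print(f"lift={lift:.4f} z={z:.2f} p={p_value:.4f}")
```

Random assignment of readers to arms is what lets the measured lift be read as the effect of the change rather than a correlation, which is what separates this from the predictive modeling slides earlier in the deck.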